@marcboulle marcboulle commented Oct 10, 2025

This PR addresses the following three issues:
- Expand dictionary documentation with reference to hierarchical structures
  #81
- Be more explicit on the impact of keys in multi-table schema
  #87
- Improve documentation on external tables
  #146

Fairly delicate: please review carefully to assess the relevance, the form, and the substance.
Ideally, have it reviewed by people with different profiles (@lucaurelien, @alexisbondu, @popescu-v, ...), or even organize a working meeting.

### Khiops Hierarchical Schemas

While traditional databases are designed for **efficient, reliable data storage and retrieval**
across various technologies, Khiops **extends the single-table data schema** used in data mining by a
Contributor

I would say "typically used in data mining".

**hierarchical schema** that supports **domain knowledge encoding, automated feature engineering and predictive modeling**.
This approach bridges the gap between raw relational data and the analytical needs of machine learning workflows.

Database technologies cover a wide range, each suited to specific needs: simple storage, hierarchical, relational,
Contributor

"cover a wide range of data storage schemata"?


In memory, this hierarchical structure closely resembles objects in programming languages,
which can be composed of sub-objects or arrays of sub-objects.
Khiops dictionaries serve as the language that allows us to describe and formalize this structure effectively.
Contributor

s/serve as a language/provide a language/

Contributor

s/effectively/concisely and in an expressive manner/
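
To make the object analogy in the passage above concrete, here is a minimal Python sketch of an in-memory hierarchical instance. It is only an illustration: the `Customer`, `Address`, and `Usage` classes and their fields are hypothetical names chosen for this example, and nothing here is Khiops dictionary syntax or a Khiops API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Address:
    # Hypothetical sub-object: at most one per customer.
    city: str


@dataclass
class Usage:
    # Hypothetical repeated sub-object: zero or more per customer.
    product: str
    amount: float


@dataclass
class Customer:
    # Hypothetical main entity at the top of the hierarchy.
    customer_id: str
    address: Optional[Address] = None  # a single sub-object
    usages: List[Usage] = field(default_factory=list)  # an array of sub-objects


customer = Customer(
    customer_id="c1",
    address=Address(city="Paris"),
    usages=[Usage("mobile", 12.5), Usage("fiber", 30.0)],
)

# An aggregate over the array of sub-objects, computed on the parent instance.
total_amount = sum(u.amount for u in customer.usages)
```

The point of the sketch is the shape: one parent object holding an optional sub-object and a list of sub-objects, which is the kind of structure the documentation says a dictionary describes.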


In the context of Khiops, **keys** are introduced within each dictionary solely to facilitate reading data from files
and constructing hierarchical in-memory instances.
These keys are organized hierarchically in accordance with the Khiops dictionary schema: the key fields of a parent entity are
Contributor

s/in accordance with/according to/
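
As a rough illustration of how keys can drive the construction of hierarchical instances from flat files, the sketch below groups child records under their parent. It rests on an assumption that is not confirmed by the excerpt above (the quoted sentence is cut off): that each child record carries the parent's key fields as the leading fields of its own key. The record layout and the `build_instances` helper are hypothetical and are not part of Khiops.

```python
from collections import defaultdict

# Hypothetical flat records, as they might be read from two separate files.
customers = [
    {"customer_id": "c1", "name": "Ada"},
    {"customer_id": "c2", "name": "Bob"},
]
usages = [
    {"customer_id": "c1", "usage_id": "u1", "amount": 12.5},
    {"customer_id": "c1", "usage_id": "u2", "amount": 30.0},
    {"customer_id": "c2", "usage_id": "u3", "amount": 5.0},
]


def build_instances(parents, children, parent_key):
    """Attach each child record to the parent that shares its leading key fields.

    Assumption for this sketch: every child carries all fields of parent_key.
    """
    by_parent = defaultdict(list)
    for child in children:
        by_parent[tuple(child[k] for k in parent_key)].append(child)
    return [
        {**parent, "usages": by_parent.get(tuple(parent[k] for k in parent_key), [])}
        for parent in parents
    ]


instances = build_instances(customers, usages, parent_key=("customer_id",))
```

In this sketch the keys serve no purpose beyond linking records while the nested instances are assembled, which matches the role the text gives them.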


- **Shared Data and Computations:**
Loading external tables into memory allows for shared access to data and derived variables, which are computed once and reused within each process.
- However, processing external tables is resource-intensive: it is not parallelized and must be performed separately for each process, unlike main instances where each process handles a subset.
Contributor

s/unlike main instances where each process handles a subset/unlike standard tables, which are processed in parallel and each process handles a subset of the table/
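
The trade-off described in this hunk can be sketched with a small simulation. This is plain sequential Python that only mimics several processes; it does not use Khiops or real parallelism, and the `run` function and its data are hypothetical. It assumes, as the text states, that every process loads the full external table while the main records are split among processes.

```python
def run(main_records, external_table, n_processes):
    """Simulate n_processes workers; return (results, external_rows_loaded)."""
    external_rows_loaded = 0
    results = []
    for rank in range(n_processes):
        # Each simulated process loads the whole external table once...
        lookup = dict(external_table)
        external_rows_loaded += len(lookup)
        # ...but handles only its own slice of the main records.
        for key in main_records[rank::n_processes]:
            results.append((key, lookup.get(key)))
    return sorted(results), external_rows_loaded


external = [("p1", "mobile"), ("p2", "fiber")]
main = ["p1", "p2", "p1", "p2"]

results_1, loaded_1 = run(main, external, n_processes=1)
results_4, loaded_4 = run(main, external, n_processes=4)
```

Under these assumptions the results are identical for one and four simulated processes, but the number of external rows loaded grows with the process count, which is the cost the documentation warns about.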

Contributor

@popescu-v popescu-v left a comment

A few rephrasings as per the comments.

Development

Successfully merging this pull request may close these issues.

Expand dictionary documentation with reference to hierarchical structures

3 participants