Document deeper insights of hierarchical schema, keys and external tables #150
This branch addresses the following three issues:
- Expand dictionary documentation with reference to hierarchical structures #81
- Be more explicit on the impact of keys in multi-table schema #87
- Improve documentation on external tables #146

This is fairly delicate: please review carefully to assess the relevance, the form and the substance.
Ideally, have it reviewed by people with different profiles (Luc-Aurelien, Alexis, Vladimir...), or even organize a working meeting.
### Khiops Hierarchical Schemas

While traditional databases are designed for **efficient, reliable data storage and retrieval**
across various technologies, Khiops **extends the single-table data schema** used in data mining by a
I would say "typically used in data mining".
**hierarchical schema** that supports **domain knowledge encoding, automated feature engineering and predictive modeling**.
This approach bridges the gap between raw relational data and the analytical needs of machine learning workflows.

Database technologies cover a wide range, each suited to specific needs: simple storage, hierarchical, relational,
"cover a wide range of data storage schemata"?
In memory, this hierarchical structure closely resembles objects in programming languages,
which can be composed of sub-objects or arrays of sub-objects.
Khiops dictionaries serve as the language that allows us to describe and formalize this structure effectively.
s/serve as the language/provide a language/
s/effectively/concisely and in an expressive manner/
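To make the object analogy in the hunk above concrete, here is a minimal Python sketch of an entity composed of one sub-object and an array of sub-objects. The class and field names (`Customer`, `Address`, `Usage`) are hypothetical and do not come from the PR; this only illustrates the in-memory shape, not how Khiops represents it internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    city: str


@dataclass
class Usage:
    date: str
    duration: int


@dataclass
class Customer:
    customer_id: str
    address: Address                                    # a single sub-object
    usages: List[Usage] = field(default_factory=list)   # an array of sub-objects


# One hierarchical instance: a customer with its address and its usages.
customer = Customer(
    customer_id="C1",
    address=Address(city="Paris"),
    usages=[Usage("2024-01-01", 12), Usage("2024-01-02", 30)],
)

# Sub-objects are reached by plain attribute access from the parent.
total_duration = sum(usage.duration for usage in customer.usages)
```

A dictionary language would describe the same shape declaratively, which is the role the quoted paragraph assigns to Khiops dictionaries.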
In the context of Khiops, **keys** are introduced within each dictionary solely to facilitate reading data from files
and constructing hierarchical in-memory instances.
These keys are organized hierarchically in accordance with the Khiops dictionary schema: the key fields of a parent entity are
s/in accordance with/according to/
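As an illustration of keys being used only to assemble hierarchical instances from flat files, here is a rough Python sketch. It assumes, since the quoted sentence is cut off in this hunk, that each child record carries its parent's key field; the table contents and the `build_instances` helper are hypothetical and this is not Khiops' actual loader.

```python
from collections import defaultdict

# Flat records as they might be read from two files (hypothetical data).
customers = [("C1", "Alice"), ("C2", "Bob")]        # (customer_id, name)
usages = [
    ("C1", "U1", 12),                               # (customer_id, usage_id, duration)
    ("C1", "U2", 30),
    ("C2", "U3", 5),
]


def build_instances(parents, children):
    """Attach each child record to its parent using the shared key field."""
    children_by_parent = defaultdict(list)
    for customer_id, usage_id, duration in children:
        children_by_parent[customer_id].append(
            {"usage_id": usage_id, "duration": duration}
        )
    return [
        {"customer_id": cid, "name": name, "usages": children_by_parent.get(cid, [])}
        for cid, name in parents
    ]


instances = build_instances(customers, usages)
```

Once the instances are built, the key plays no further role in this sketch: the children are reached through the parent object, not through a join.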
- **Shared Data and Computations:**
  Loading external tables into memory allows for shared access to data and derived variables, which are computed once and reused within each process.
- However, processing external tables is resource-intensive: it is not parallelized and must be performed separately for each process, unlike main instances where each process handles a subset.
s/unlike main instances where each process handles a subset/unlike standard tables, which are processed in parallel, with each process handling a subset of the table/
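The trade-off described in the two bullets above can be sketched as follows. This is a sequential simulation with hypothetical names (`load_external_table`, `run_worker`), not Khiops code: each simulated worker loads the whole external table itself, computes a derived value once, and reuses it for every main instance in its own chunk.

```python
def load_external_table():
    """Hypothetical loader returning the full external table (product -> price)."""
    return {"P1": 10.0, "P2": 2.5}


def run_worker(main_chunk, load_log):
    # The external table is not split: every worker loads all of it.
    products = load_external_table()
    load_log.append(len(products))
    # Derived value computed once per worker, then reused for each instance.
    max_price = max(products.values())
    return [products[product_id] / max_price for product_id in main_chunk]


# The main table is split into chunks, one per simulated worker.
main_table = ["P1", "P2", "P2", "P1"]
chunks = [main_table[:2], main_table[2:]]

load_log = []
results = [run_worker(chunk, load_log) for chunk in chunks]
```

With two workers the external table is loaded twice in full, while each main instance is handled exactly once, which is the asymmetry the reviewed sentence tries to express.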
A few rephrasings as per the comments.
This PR addresses the following three issues:
This is fairly delicate: please review carefully to assess the relevance, the form and the substance.
Ideally, have it reviewed by people with different profiles (@lucaurelien, @alexisbondu, @popescu-v, ...), or even organize a working meeting.