Documentation for toDataFrame #1479

@koperagen

I put together a draft, but it needs further improvements:

  1. Expand on the use case with multiple `properties` calls in the toDataFrame lambda:

         list.toDataFrame {
           properties(A::prop1, A::prop2) {
              /* subgraph settings go here */
           }
           properties(A::prop3, A::prop4) {
              /* subgraph settings go here */
           }
         }

  2. Verify: values and lists of values of a type annotated with [DataSchema] are expanded ignoring maxDepth, but can they be preserved?
  3. Provide an example with multiple exclude/preserve calls and maxDepth
  4. Merge with the info from the existing documentation
  5. Note that exclude is similar to remove, but exclude might be useful if computing the property is expensive
  6. Note that a GRAPH of objects is being converted, which can result in stack overflow or OOM exceptions
  7. Update the Iterable<MyValues> sample with a class hierarchy. I propose to use Konsist as a reference: they represent Kotlin functions as objects with many fields, most of them nested. We could take this hierarchy, simplify it a bit for our purpose and inline some data
  8. Maybe extensive examples should go to the website and be linked to from KDocs
  9. `toDataFrame` should be in "Interop with collections"; now it's only under https://kotlin.github.io/dataframe/createdataframe.html#todataframe
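
As a starting point for the exclude/preserve/maxDepth example requested in the list above, here is a minimal sketch. The classes `Person`, `Contact` and `Address` and the helper `buildPeopleFrame` are hypothetical and exist only for this illustration; the DSL calls (`properties`, `exclude`, `preserve`) are the ones already shown on the createdataframe page. The comments describe the behaviour I expect from the rules in the draft, so it should be checked against the actual output before it goes into the docs.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical classes, invented for this illustration only.
data class Address(val city: String, val zip: String)
data class Contact(val email: String, val address: Address)
data class Person(val name: String, val age: Int, val contact: Contact)

fun buildPeopleFrame(): DataFrame<Person> {
    val people = listOf(
        Person("Ann", 30, Contact("ann@example.com", Address("Oslo", "0150"))),
        Person("Bob", 41, Contact("bob@example.com", Address("Bergen", "5003"))),
    )
    return people.toDataFrame {
        // maxDepth = 2 would be enough to reach Address, but preserve stops the expansion there.
        properties(maxDepth = 2) {
            exclude(Contact::email)   // left out of the traversal, so no `email` column is created
            preserve(Address::class)  // Address values are kept as objects instead of a ColumnGroup
        }
    }
}

fun main() {
    // Expected shape (to be verified): name, age, and a `contact` group holding only `address`.
    println(buildPeopleFrame().schema())
}
```

The same result could be reached by converting everything and calling `remove` afterwards; the point of `exclude` here is that the excluded property is skipped during conversion.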

/**
 * Converts `Iterable<T>` to a DataFrame; each element becomes a row.
 *
 * Each value of type `T` is expanded into columns according to its properties and getter-like functions,
 * either recursively or according to `props`.
 *
 * Additional columns can be added with [CreateDataFrameDsl].
 *
 * If the value of `property3: NestedObject` can be expanded recursively, it becomes a ColumnGroup.
 *
 * If the value of `property4: List<NestedObject>` can be expanded recursively, it becomes a FrameColumn.
 *
 * Rules for expansion:
 * 1. Values and lists of values of a type annotated with [DataSchema] are expanded ignoring maxDepth, but can be preserved?..
 * 2. Value types are always preserved, see [org.jetbrains.kotlinx.dataframe.impl.api.isValueType].
 * 3. Given `Iterable<T>`, if `T` itself is a value type, a DataFrame with a single column `value: T` is created.
 * 4. maxDepth is checked. In the sample below, property1, property2, property3 correspond to maxDepth = 1;
 *    nestedProperty1, nestedProperty2, nestedProperty3 correspond to maxDepth = 2.
 * 5. Additional classes and properties can be preserved with [preserve] and [TraversePropertiesDsl.preserve].
 * 6. Classes and properties can be excluded from the conversion altogether with [TraversePropertiesDsl.exclude].
 *
 *
 * ```
 * Iterable<MyValues>
 *   property1: Int
 *   property2: String?
 *   property3: NestedObject
 *       nestedProperty1: String
 *       nestedProperty2: List<Int>
 *       nestedProperty3: Map<String, String>
 *       nestedProperty4: List<AnotherNestedObject>
 *   property4: List<NestedObject>
 * ```
 *
 * To put it all together:
 * ```
 * list.toDataFrame {
 *   properties(maxDepth = 0)
 * }
 * ```
 * ```
 * property1: Int
 * property2: String?
 * property3: NestedObject // preserved because of maxDepth
 * property4: List<NestedObject>
 * ```
 *
 *
 */
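
To make the `Iterable<MyValues>` sample above concrete until a Konsist-based hierarchy replaces it, here is a minimal runnable sketch. The class names mirror the tree in the KDoc; the field of `AnotherNestedObject`, the sample data and the helpers `sampleValues` and `shallowFrame` are assumptions made up for this illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Mirrors the tree in the KDoc; `id` is a made-up field.
data class AnotherNestedObject(val id: Int)

data class NestedObject(
    val nestedProperty1: String,
    val nestedProperty2: List<Int>,
    val nestedProperty3: Map<String, String>,
    val nestedProperty4: List<AnotherNestedObject>,
)

data class MyValues(
    val property1: Int,
    val property2: String?,
    val property3: NestedObject,
    val property4: List<NestedObject>,
)

// Made-up sample data: a single element, so the resulting frame has one row.
fun sampleValues(): List<MyValues> {
    val nested = NestedObject("a", listOf(1, 2), mapOf("k" to "v"), listOf(AnotherNestedObject(1)))
    return listOf(MyValues(1, null, nested, listOf(nested)))
}

// Same call as in the KDoc: only top-level properties become columns.
fun shallowFrame(): DataFrame<MyValues> =
    sampleValues().toDataFrame {
        properties(maxDepth = 0)
    }

fun main() {
    // property3 and property4 are expected to stay plain value columns here.
    println(shallowFrame().schema())

    // Raising maxDepth lets the traversal go one level further; compare the two schemas.
    println(sampleValues().toDataFrame { properties(maxDepth = 1) }.schema())
}
```

Printing both schemas side by side is a quick way to check the depth numbering used in rule 4 of the draft.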
