diff --git a/book/src/table_collection_row_access.md b/book/src/table_collection_row_access.md
index e2f06c8e4..afb781d04 100644
--- a/book/src/table_collection_row_access.md
+++ b/book/src/table_collection_row_access.md
@@ -1,6 +1,53 @@
 ## Accessing table rows
 
-We may also access entire table rows by a row id:
+The rows of a table contain two types of data:
+
+* Numerical data consisting of a single value.
+* Ragged data. Examples include metadata for all tables,
+  ancestral states for sites, derived states for mutations,
+  parent and location information for individuals, etc.
+
+`tskit` provides two ways to access row data.
+The first is by a "view", which contains non-owning references
+to the ragged column data.
+The second is by row objects containing *copies* of the ragged column data.
+
+The former will be more efficient when the ragged columns are populated.
+The latter will be more convenient to work with because the API is a standard
+Rust iterator.
+
+By holding references, row views have the usual implications for borrowing.
+The row objects, however, own their data and are thus independent of their parent
+objects.
+
+### Row views
+
+To generate a row view using a row id:
+
+```rust, noplayground, ignore
+{{#include ../../tests/book_table_collection.rs:get_edge_table_row_view_by_id}}
+```
+
+To iterate over all views we use *lending* iterators:
+
+```rust, noplayground, ignore
+{{#include ../../tests/book_table_collection.rs:get_edge_table_rows_by_lending_iterator}}
+```
+
+#### Lending iterators
+
+The lending iterators are implemented using the [`streaming_iterator`](https://docs.rs/streaming-iterator/latest/streaming_iterator/) crate.
+(The community now prefers the term "lending" over "streaming" for this concept.)
+The `tskit` prelude includes the trait declarations that allow the code shown above to compile.
+
+Rust 1.65.0 stabilized Generic Associated Types, or GATs.
+GATs allow lending iterators to be implemented directly without the workarounds used in the `streaming_iterator` crate.
+We have decided not to implement our own lending iterator using GATs.
+Rather, we will see what the community settles on and will decide in the future whether or not to adopt it.
+
+### Row objects
+
+We may access entire table rows by a row id:
 
 ```rust, noplaygound, ignore
 {{#include ../../tests/book_table_collection.rs:get_edge_table_row_by_id}}
diff --git a/tests/book_table_collection.rs b/tests/book_table_collection.rs
index f180e7e02..98ed1fbcf 100644
--- a/tests/book_table_collection.rs
+++ b/tests/book_table_collection.rs
@@ -70,6 +70,7 @@ fn add_node_handle_error() {
 #[test]
 fn get_data_from_edge_table() {
     use rand::distributions::Distribution;
+    use tskit::prelude::*;
     let sequence_length = tskit::Position::from(100.0);
     let mut rng = rand::thread_rng();
     let random_pos = rand::distributions::Uniform::new::<f64, f64>(0., sequence_length.into());
@@ -120,6 +121,35 @@ fn get_data_from_edge_table() {
     }
     // ANCHOR_END: get_edge_table_row_by_id
 
+    // ANCHOR: get_edge_table_row_view_by_id
+    if let Some(row_view) = tables.edges().row_view(edge_id) {
+        assert_eq!(row_view.id, 0);
+        assert_eq!(row_view.left, left);
+        assert_eq!(row_view.right, right);
+        assert_eq!(row_view.parent, parent);
+        assert_eq!(row_view.child, child);
+    } else {
+        panic!("that should have worked...");
+    }
+    // ANCHOR_END: get_edge_table_row_view_by_id
+
+    // ANCHOR: get_edge_table_rows_by_lending_iterator
+    let mut edge_table_lending_iter = tables.edges().lending_iter();
+    while let Some(row_view) = edge_table_lending_iter.next() {
+        // there is only one row!
+        assert_eq!(row_view.id, 0);
+        assert_eq!(row_view.left, left);
+        assert_eq!(row_view.right, right);
+        assert_eq!(row_view.parent, parent);
+        assert_eq!(row_view.child, child);
+        assert!(row_view.metadata.is_none()); // no metadata in our table
+    }
+    // ANCHOR_END: get_edge_table_rows_by_lending_iterator
+
+    assert!(tables
+        .check_integrity(tskit::TableIntegrityCheckFlags::default())
+        .is_ok());
+
     // ANCHOR: get_edge_table_rows_by_iterator
     for row in tables.edges_iter() {
         // there is only one row!