How do you tell the number of rows in a data view?
There is this very interesting method here.
`long? GetRowCount(bool lazy = true);`
The semantics of this are somewhat odd (though logical, in their way), and boil down to this: under the default value of lazy=true, return the row count only if doing so is basically an O(1) operation. But what if this returns null? Then you have the lazy=false call. That is a hint that the implementation may expend more effort, but only if doing so entails less work than iterating over every row and counting directly.
Indeed, this is what this utility function does:
`public static long ComputeRowCount(IDataView view)`
First it asks (with lazy=false!) for the row count, and failing that, it actually opens a cursor (with no columns active) and counts the rows directly.
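The ask-then-count pattern described above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `IToyView`, `EnumerateRows`, `ToyRowCounter`, and `UnknownCountView` are hypothetical stand-ins invented here, and the real `IDataView` works with cursors rather than a plain enumerable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a data view; NOT the real IDataView interface.
public interface IToyView
{
    // Returns the row count if it is known, or null if it is not.
    long? GetRowCount(bool lazy = true);

    // Hypothetical stand-in for opening a cursor and stepping through rows.
    IEnumerable<int> EnumerateRows();
}

public static class ToyRowCounter
{
    public static long ComputeRowCount(IToyView view)
    {
        // Ask with lazy: false, signalling that moderate effort is acceptable.
        long? count = view.GetRowCount(lazy: false);
        if (count.HasValue)
            return count.Value;

        // The view does not know its count, so iterate and count directly.
        long n = 0;
        foreach (var _ in view.EnumerateRows())
            n++;
        return n;
    }
}

// Toy view that never reports a count, forcing the fallback path.
public sealed class UnknownCountView : IToyView
{
    private readonly int _rows;

    public UnknownCountView(int rows) { _rows = rows; }

    public long? GetRowCount(bool lazy = true) => null;

    public IEnumerable<int> EnumerateRows()
    {
        for (int i = 0; i < _rows; i++)
            yield return i;
    }
}
```

Calling `ToyRowCounter.ComputeRowCount(new UnknownCountView(5))` takes the fallback path, since the toy view always answers null.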
This is all fairly logical. However, as a practical matter, no one ever bothered to implement a different code path for lazy=false, as far as I am aware. This is not to say they couldn't have: you might imagine some text loader that, without trying to parse anything, merely counts the number of newline characters in a file, which would be much faster than an iteration. But again, as a practical matter, no one did.
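For illustration only, the newline-counting idea might look something like the sketch below. `NewlineCounter` and `CountNewlines` are hypothetical names, and nothing like this exists in the codebase as far as the discussion above indicates. It also assumes one row per line, no header row, no quoted fields containing line breaks, a trailing newline after the last row, and an encoding such as ASCII or UTF-8 in which the byte 0x0A appears only as a line feed.

```csharp
using System;
using System.IO;

public static class NewlineCounter
{
    // Counts '\n' bytes in a stream without parsing any fields.
    // Assumption: each row ends with exactly one '\n' (see caveats above).
    public static long CountNewlines(Stream stream)
    {
        var buffer = new byte[64 * 1024];
        long count = 0;
        int read;

        // Read fixed-size chunks until the stream is exhausted.
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == (byte)'\n')
                    count++;
            }
        }
        return count;
    }
}
```

This still reads the whole file, so it is not O(1), but it skips tokenizing and type conversion, which is the sort of "more effort, but less than full iteration" case that lazy=false was meant to allow.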
This suggests removing the lazy parameter to simplify the interface. The method would still have the same semantics, just without all the complication of explaining lazy. (Though we'd still need to make clear in the implementation notes that implementations should be lazy.)
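A rough sketch of what the simplified shape could look like, using hypothetical toy types (`IToyCountable`, `ArrayBackedView`, `FilteredView`) rather than the real interface. The contract moves into the documentation comment: return a value only when it is cheap to obtain, otherwise return null.

```csharp
using System;

// Hypothetical simplified interface; NOT the actual proposed API surface.
public interface IToyCountable
{
    // Implementation note: return the count only if it is available in
    // roughly O(1) time; otherwise return null and let the caller iterate.
    long? GetRowCount();
}

// The count is trivially known, so it is returned.
public sealed class ArrayBackedView : IToyCountable
{
    private readonly int[] _values;

    public ArrayBackedView(int[] values) { _values = values; }

    public long? GetRowCount() => _values.Length;
}

// The count would require a full pass over the data, so it reports unknown.
public sealed class FilteredView : IToyCountable
{
    public long? GetRowCount() => null;
}
```

Callers that need an exact number would keep the same fallback as before: ask once, and count by iteration if the answer is null.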