perf: Replace expensive len() call with PandasBatches.total_rows in anywidget TableWidget #1937
base: main
Please also update the benchmarks to use the
Let's use a separate PR for this request. #1949
execute_result = dataframe._block.session._executor.execute(
    dataframe._block.expr,
    ordered=True,
    use_explicit_destination=True,
I don't think we want to use an explicit destination here. It would create a BigQuery job on every call, which is not desirable. We want to keep the faster, job-optional code paths available.
# The query issued by `to_pandas_batches()` already contains metadata
# about how many results there were. Use that to avoid doing an extra
# COUNT(*) query that `len(...)` would do.
self.row_count = execute_result.total_rows or 0
Why did you switch to using the execute result instead of the PandasBatches object returned by `to_pandas_batches()`?
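For illustration only, here is a minimal sketch of the pattern this question points at: reading the row count from the batches object rather than calling `len(...)`. It does not use the real library. `StubDataFrame`, `StubBatches`, and `row_count_from_batches` are hypothetical stand-ins, and the only detail taken from this PR is that the batches object exposes a `total_rows` attribute that may be empty.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class StubBatches:
    """Hypothetical stand-in for the PandasBatches object named in the PR title."""

    pages: List[List[dict]]
    total_rows: Optional[int]  # assumed to be None when no count is reported

    def __iter__(self) -> Iterator[List[dict]]:
        return iter(self.pages)


class StubDataFrame:
    """Hypothetical stand-in for a remote dataframe that counts len() calls."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows
        self.count_queries = 0  # how many COUNT(*)-style lookups were made

    def __len__(self) -> int:
        # In the real setting this is the extra query the PR wants to avoid.
        self.count_queries += 1
        return len(self._rows)

    def to_pandas_batches(self, page_size: int) -> StubBatches:
        pages = [
            self._rows[i:i + page_size]
            for i in range(0, len(self._rows), page_size)
        ]
        # The paging call already knows the total, so it carries it along.
        return StubBatches(pages=pages, total_rows=len(self._rows))


def row_count_from_batches(dataframe: StubDataFrame, page_size: int = 2) -> int:
    """Read the row count from the batches metadata, falling back to 0."""
    batches = dataframe.to_pandas_batches(page_size=page_size)
    return batches.total_rows or 0
```

In this sketch the widget would call `row_count_from_batches(df)` once when it sets up paging, and `df.count_queries` stays at zero because `len(df)` is never invoked.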