shuoweil
Contributor

perf: Replace expensive len() call with PandasBatches.total_rows in anywidget TableWidget

@shuoweil shuoweil self-assigned this Jul 24, 2025
@shuoweil shuoweil requested review from a team as code owners July 24, 2025 23:22
@shuoweil shuoweil requested a review from GarrettWu July 24, 2025 23:22
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jul 24, 2025
@shuoweil shuoweil requested review from tswast and removed request for GarrettWu July 24, 2025 23:23
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from aee37a7 to 303c4af Compare July 24, 2025 23:23
@tswast
Collaborator

tswast commented Jul 29, 2025

Please also update the benchmarks to use the total_rows parameter.

@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from fc38cf3 to f643cfb Compare July 30, 2025 03:28
@shuoweil
Contributor Author

shuoweil commented Jul 30, 2025

> Please also update the benchmarks to use the total_rows parameter.

Let's use a separate PR for this request. #1949

@shuoweil shuoweil requested a review from tswast July 30, 2025 03:29
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch 2 times, most recently from e12c8ff to f8ab27b Compare July 30, 2025 22:07
@shuoweil shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jul 31, 2025
@bigframes-bot bigframes-bot removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jul 31, 2025
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from f8ab27b to df85824 Compare July 31, 2025 04:32
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from df85824 to 2756968 Compare August 1, 2025 08:06
@shuoweil shuoweil requested a review from tswast August 1, 2025 08:07
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch 7 times, most recently from 5bc65ce to 26eb25e Compare August 9, 2025 00:18
@bigframes-bot bigframes-bot removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 14, 2025
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from 0d6a300 to 00f203e Compare August 14, 2025 21:56
@shuoweil shuoweil requested a review from tswast August 14, 2025 22:13
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from cc7add6 to 8e9f8ee Compare August 19, 2025 05:57
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from 8e9f8ee to 536345c Compare August 19, 2025 21:11
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Aug 19, 2025
@shuoweil shuoweil requested a review from tswast August 19, 2025 21:27
@shuoweil shuoweil force-pushed the shuowei-anywidget-remove-len-call branch from 79a10b8 to 0caaa52 Compare August 21, 2025 21:07
@shuoweil shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 21, 2025
@bigframes-bot bigframes-bot removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 21, 2025
execute_result = dataframe._block.session._executor.execute(
    dataframe._block.expr,
    ordered=True,
    use_explicit_destination=True,
Collaborator

I don't think we want to use an explicit destination here. It would create a BigQuery job on every call, which is not desirable. We want to keep the faster, job-optional code paths available.

# The query issued by `to_pandas_batches()` already contains metadata
# about how many results there were. Use that to avoid doing an extra
# COUNT(*) query that `len(...)` would do.
self.row_count = execute_result.total_rows or 0
Collaborator

Why did you switch it to use the execute result instead of the PandasBatches object returned by to_pandas_batches?
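
For illustration only, here is a minimal sketch of the pattern this question points at: reading the row count from the batches object itself rather than from a separate execute call. `StubBatches` and `row_count_from_batches` are hypothetical names invented for this sketch and are not part of bigframes. The only thing taken from the PR is the `total_rows` attribute named in the title, and it is assumed here that it may be `None`.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class StubBatches:
    """Hypothetical stand-in for the object returned by to_pandas_batches()."""

    pages: List[List[dict]]
    total_rows: Optional[int]  # assumed: may be None when no count is reported

    def __iter__(self) -> Iterator[List[dict]]:
        return iter(self.pages)


def row_count_from_batches(batches) -> int:
    """Read the row count from batch metadata instead of calling len()."""
    # Same `total_rows or 0` fallback as the diff line under review.
    return getattr(batches, "total_rows", None) or 0


batches = StubBatches(pages=[[{"a": 1}, {"a": 2}], [{"a": 3}]], total_rows=3)
row_count = row_count_from_batches(batches)
```

With this shape the widget would hold a single object for both paging and the count, so no extra `COUNT(*)` query and no explicit destination would be needed. Whether that holds for the real `PandasBatches` is exactly what the question asks the author to confirm.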

Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
3 participants