chore: add benchmarks for read_gbq_colab #1860

Merged
tswast merged 7 commits into main from b420984164-bench-methods on Jun 27, 2025

@tswast (Collaborator) commented Jun 26, 2025

Follow-up to #1846

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Towards internal issue b/420984164 🦕

@tswast tswast requested review from a team as code owners June 26, 2025 18:13
@tswast tswast requested a review from jialuoo June 26, 2025 18:13
@product-auto-label product-auto-label bot added the size: l and api: bigquery labels Jun 26, 2025
@tswast tswast added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jun 26, 2025
@tswast (Collaborator, Author) commented Jun 26, 2025

Marking as do not merge until the percentile_99 table finishes writing, but I think it's ready to review at least. Ran locally.

@tswast tswast requested review from TrevorBergeron and removed request for jialuoo June 26, 2025 18:16
@tswast tswast assigned TrevorBergeron and unassigned jiaxunwu Jun 26, 2025
TrevorBergeron previously approved these changes Jun 26, 2025
Comment on lines +36 to +38
group_column = "col_int64_1"
if group_column not in df.columns:
    group_column = "col_bool_0"
Contributor

not sure I follow what is going on here?

Collaborator Author

We need some column to group by, and some tables with tiny rows can only fit a boolean. I can add a comment.
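The fallback described above can be sketched as a small helper. This is only an illustration under assumptions: the function name `pick_group_column` and its parameters are hypothetical and not part of the PR; only the two column names come from the snippet under review.

```python
from typing import Iterable


def pick_group_column(
    columns: Iterable[str],
    preferred: str = "col_int64_1",
    fallback: str = "col_bool_0",
) -> str:
    """Return the preferred grouping column, or the fallback if it is absent.

    Hypothetical helper: tables with very narrow rows may hold only a
    boolean column, so the integer column cannot be assumed to exist.
    """
    available = set(columns)
    if preferred in available:
        return preferred
    if fallback in available:
        return fallback
    raise KeyError(f"neither {preferred!r} nor {fallback!r} is present")


# A wide table has the integer column; a tiny-row table has only the boolean.
wide_choice = pick_group_column(["col_int64_0", "col_int64_1", "col_bool_0"])
tiny_choice = pick_group_column(["col_bool_0"])
```

Raising when neither column exists is a design choice of this sketch; the reviewed snippet simply assumes the boolean column is always there.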


# Simulate the user filtering by a column and visualizing those results
df_filtered = df[df["col_bool_0"]]
df_filtered.shape
Contributor

These .shape calls are going to be pretty brutal; they are going to double-execute.
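One possible reading of this concern, sketched with a toy class rather than the real library: if reading the shape runs the underlying query and a later fetch of the same frame runs it again, the work is paid for twice. `ToyLazyFrame`, `executions`, and `to_rows` are hypothetical names invented for this illustration and say nothing about how BigQuery DataFrames is implemented.

```python
class ToyLazyFrame:
    """Hypothetical stand-in for a lazily evaluated frame (not a real API)."""

    def __init__(self, rows):
        self._rows = rows
        self.executions = 0  # counts how many times the "query" ran

    def _execute(self):
        # Every call simulates running the full query again.
        self.executions += 1
        return list(self._rows)

    @property
    def shape(self):
        data = self._execute()
        n_cols = len(data[0]) if data else 0
        return (len(data), n_cols)

    def to_rows(self):
        return self._execute()


# Asking for the shape and then fetching the rows runs the query twice.
frame = ToyLazyFrame([(True, 1), (True, 2)])
_ = frame.shape
_ = frame.to_rows()
double_run_count = frame.executions

# Fetching once and deriving the shape locally runs it only once.
frame2 = ToyLazyFrame([(True, 1), (True, 2)])
rows = frame2.to_rows()
shape = (len(rows), len(rows[0]) if rows else 0)
single_run_count = frame2.executions
```

Whether a benchmark should avoid the extra execution depends on what it is meant to measure; a notebook user who inspects the shape and then views the data may well trigger both.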

@tswast tswast removed the do not merge label Jun 27, 2025
@tswast tswast merged commit ed75cd9 into main Jun 27, 2025
24 of 25 checks passed
@tswast tswast deleted the b420984164-bench-methods branch June 27, 2025 21:24
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
3 participants