Skip to content

feat(bigframes): Gcs bucket read will bind to gcs region if region unset#16855

Draft
TrevorBergeron wants to merge 1 commit intomainfrom
tbergeron_bind_gcs_region
Draft

feat(bigframes): Gcs bucket read will bind to gcs region if region unset#16855
TrevorBergeron wants to merge 1 commit intomainfrom
tbergeron_bind_gcs_region

Conversation

@TrevorBergeron
Copy link
Copy Markdown
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements automatic BigQuery location detection based on GCS bucket locations for various read functions and from_glob_path. It introduces a helper function to retrieve a storage client and logic to set the default session location if it hasn't been explicitly configured. Review feedback highlights a missing threading import and lock definition in the API module, as well as a case-sensitivity error in the new unit tests' assertions.

Comment on lines +761 to +763
global _default_location_lock

with _default_location_lock:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable _default_location_lock is used but not defined at the module level, and the threading module is not imported in this file. This will result in a NameError at runtime when _set_default_session_location_if_possible_gcs is called. Please define the lock at the module level (e.g., _default_location_lock = threading.Lock()) and ensure threading is imported.


bf_io_api.read_csv("gs://test-bucket/file.csv")

assert config.options.bigquery.location == "EU"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assertion expects "EU", but the location was set to "eu" at line 161. Since _set_default_session_location_if_possible_gcs returns early when a location is already set, the value will remain "eu", causing this test to fail. Based on the previous test case (line 148), locations appear to be case-sensitive in these assertions.

Suggested change
assert config.options.bigquery.location == "EU"
assert config.options.bigquery.location == "eu"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant