-
Notifications
You must be signed in to change notification settings - Fork 50
feat: create deploy_remote_function
and deploy_udf
functions to immediately deploy functions to BigQuery
#1832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit refactors the implementation of immediate deployment for remote functions and UDFs to eliminate code duplication introduced in a previous commit. Changes: - The `remote_function` and `udf` methods in `bigframes.functions._function_session.FunctionSession` now accept an optional `deploy_immediately: bool` parameter (defaulting to `False`). The previous `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` have been removed, and their logic is now incorporated into the unified methods. - The public API functions `bigframes.pandas.deploy_remote_function` and `bigframes.pandas.deploy_udf` now call the corresponding `FunctionSession` methods with `deploy_immediately=True`. - The public API functions `bigframes.pandas.remote_function` and `bigframes.pandas.udf` call the `FunctionSession` methods with `deploy_immediately=False` (relying on the default). - Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to patch the unified `FunctionSession` methods and verify the correct `deploy_immediately` boolean is passed based on which public API function is called. Note: The underlying provisioning logic in `FunctionSession` currently deploys functions immediately regardless of the `deploy_immediately` flag. This flag serves as an indicator of intent and allows for future enhancements to support true lazy deployment if desired, without further API changes.
This commit corrects a previous refactoring attempt to eliminate code duplication and properly separates immediate-deployment functions from standard (potentially lazy) functions. Changes: - `bigframes.functions._function_session.FunctionSession` now has distinct methods: `remote_function`, `udf`, `deploy_remote_function`, and `deploy_udf`. The `deploy_immediately` flag has been removed from this class. - `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` are responsible for ensuring immediate deployment by calling the underlying provisioning logic directly. The standard `remote_function` and `udf` methods in `FunctionSession` also currently call this provisioning logic, meaning all functions are deployed immediately as of now, but the structure allows for future lazy evaluation for standard functions without changing the deploy variants' contract. - Public API functions in `bigframes.pandas` (`remote_function`, `udf`, `deploy_remote_function`, `deploy_udf`) now correctly delegate to their corresponding distinct methods in `FunctionSession` (via the `Session` object). - Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to mock and verify calls to the correct distinct methods on `bigframes.session.Session`. This resolves the issue of using a boolean flag to control deployment type and instead relies on calling specific, dedicated methods for immediate deployment, aligning with your request.
This commit simplifies the implementation of `deploy_remote_function` and `deploy_udf` within `bigframes.functions._function_session.FunctionSession`. Given that the standard `remote_function` and `udf` methods in `FunctionSession` already perform immediate deployment of resources (as the underlying provisioning logic they call is immediate), the `deploy_remote_function` and `deploy_udf` methods in the same class are simplified to directly call `self.remote_function(...)` and `self.udf(...)` respectively. This change makes the distinction between the `deploy_` variants and the standard variants in `FunctionSession` primarily a matter of semantic clarity and intent at that level; both paths currently result in immediate deployment. The public API in `bigframes.pandas` continues to offer distinct `deploy_` functions that call these `FunctionSession.deploy_` methods, preserving your user-facing API and its documented behavior of immediate deployment. No changes were needed for the public API in `bigframes.pandas` or the unit tests, as they were already aligned with calling distinct methods on the `Session` object, which in turn calls the now-simplified `FunctionSession` methods.
CC @ivansmf -- this PR should add those aliases for your sample. |
We already have two ways for each - the decorator |
@ivansmf for additional thoughts. |
According to Jing Jing our APIs have to be a verb. That is the totality of the reason for the request. |
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕