Add Databricks Serverless Compute Support #3392
Open
rohitrsh wants to merge 1 commit into flyteorg:master
Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>
Tracking issue
flyteorg/flyte#6911
Why are the changes needed?
Databricks Serverless Compute offers faster startup times (seconds vs. minutes), automatic scaling, and zero infrastructure management. However, the existing flytekit-spark connector only supports classic compute (clusters). This PR enables teams to use serverless without changing their task code.
Users switch between classic and serverless by changing only the `databricks_conf`; task code stays identical.
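For illustration, a minimal sketch of that switch (the workspace hostname, cluster spec, and environment key are placeholder values, not defaults from this PR):

```python
from flytekit import task
from flytekitplugins.spark import DatabricksV2

# Classic compute: databricks_conf carries a cluster spec.
classic_conf = {
    "run_name": "flyte-spark-task",
    "new_cluster": {
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2,
    },
}

# Serverless compute: no cluster keys, just an environment reference.
serverless_conf = {
    "run_name": "flyte-spark-task",
    "environment_key": "default",
}

@task(
    task_config=DatabricksV2(
        databricks_instance="myworkspace.cloud.databricks.com",  # placeholder
        databricks_conf=serverless_conf,  # swap in classic_conf for clusters
    )
)
def aggregate_sales() -> int:
    ...
```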
What changes were proposed in this pull request?
Adds first-class support for running Flyte Spark tasks on Databricks Serverless Compute, alongside the existing classic compute (clusters) support.
- Compute mode is auto-detected from the `databricks_conf` contents (no new task type needed)
- Serverless jobs are submitted in the multi-task Jobs API format (with an `environments` array)
- `flytekit.current_context().spark_session` works for both compute modes

Files modified
- `flytekitplugins/spark/connector.py`: serverless detection and job spec construction
- `flytekitplugins/spark/task.py`: `pre_execute()`, `DatabricksV2` config additions (credential provider, notebook support), docstring updates
- `tests/test_connector.py`
- `tests/test_spark_task.py`

No new files in the plugin. The serverless entrypoint (`entrypoint_serverless.py`) lives in the flytetools repository, following the same pattern as the classic entrypoint.
Technical details
1. Auto-detection of compute mode
New function `_is_serverless_config()` detects serverless based on `databricks_conf` keys:
- `existing_cluster_id` or `new_cluster` present: classic compute
- `environment_key` or `environments` present (no cluster keys): serverless
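A minimal sketch of that detection rule (the exact implementation in `connector.py` may differ):

```python
from typing import Any, Dict

def _is_serverless_config(databricks_conf: Dict[str, Any]) -> bool:
    """Return True when databricks_conf targets serverless compute."""
    # Any cluster key means classic compute.
    if "existing_cluster_id" in databricks_conf or "new_cluster" in databricks_conf:
        return False
    # Serverless markers: an environment reference or inline environments.
    return "environment_key" in databricks_conf or "environments" in databricks_conf
```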
2. Serverless job spec format
Databricks Serverless requires a different Jobs API payload: a multi-task format with a `tasks` array and an `environments` array. New function `_configure_serverless()` handles creation of the environments array and env var injection.
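A rough sketch of the payload shape (field names follow the Databricks multi-task Jobs API; the task key, entrypoint path, and dependency list are placeholders, and the exact spec built by `_configure_serverless()` may differ):

```python
serverless_job_spec = {
    "run_name": "flyte-spark-task",
    "tasks": [
        {
            "task_key": "flyte-task",
            "environment_key": "default",
            "spark_python_task": {
                "python_file": "flytekit/entrypoint_serverless.py",  # placeholder path
                "parameters": ["pyflyte-execute", "..."],
            },
        }
    ],
    "environments": [
        {
            "environment_key": "default",
            "spec": {"client": "1", "dependencies": ["flytekitplugins-spark"]},
        }
    ],
}
```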
3. Entrypoint resolution
Both classic and serverless default to the same `flytetools` repository; only the `python_file` path differs (serverless points at `entrypoint_serverless.py`). Users can override both `git_source` and `python_file` via `databricks_conf` for custom entrypoints.
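A sketch of such an override (the repository URL and file path are hypothetical; `git_source` fields follow the Databricks Jobs API, though the exact nesting expected by the connector may differ):

```python
custom_conf = {
    "environment_key": "default",
    "git_source": {
        "git_url": "https://github.com/myorg/my-entrypoints",  # hypothetical repo
        "git_provider": "gitHub",
        "git_branch": "main",
    },
    "spark_python_task": {"python_file": "spark/my_entrypoint.py"},
}
```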
4. SparkSession in serverless
The serverless entrypoint (in flytetools) pre-creates the SparkSession and stores it in `sys.modules` and `builtins`. New method `_get_databricks_serverless_spark_session()` in `task.py` retrieves it and exposes it via `flytekit.current_context().spark_session`, the same API as classic compute.
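A sketch of that retrieval, assuming the entrypoint stashes the session under a well-known name (the module key and attribute below are hypothetical; the actual names used by `entrypoint_serverless.py` may differ):

```python
import builtins
import sys

def _get_databricks_serverless_spark_session():
    """Return the SparkSession pre-created by the serverless entrypoint."""
    # Prefer the module-level stash, then fall back to builtins.
    module = sys.modules.get("flyte_serverless_session")  # hypothetical key
    if module is not None and getattr(module, "spark", None) is not None:
        return module.spark
    return getattr(builtins, "spark", None)
```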
5. AWS credential provider
New `DatabricksV2` config field: `databricks_service_credential_provider`. Resolution order: task config first, then the connector env var (`FLYTE_DATABRICKS_SERVICE_CREDENTIAL_PROVIDER`).
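The stated resolution order, as a sketch (the helper name is illustrative):

```python
import os
from typing import Optional

def _resolve_credential_provider(task_config_value: Optional[str]) -> Optional[str]:
    # Task-level config wins; otherwise fall back to the connector env var.
    if task_config_value:
        return task_config_value
    return os.environ.get("FLYTE_DATABRICKS_SERVICE_CREDENTIAL_PROVIDER")
```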
6. Notebook task support
New `DatabricksV2` config fields: `notebook_path` and `notebook_base_parameters`. Works with both classic and serverless compute.
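For example, a notebook-based configuration might look like this (the workspace path and parameters are placeholders):

```python
from flytekitplugins.spark import DatabricksV2

notebook_config = DatabricksV2(
    databricks_instance="myworkspace.cloud.databricks.com",  # placeholder
    databricks_conf={"environment_key": "default"},
    notebook_path="/Workspace/Users/me@example.com/etl_notebook",
    notebook_base_parameters={"run_date": "2024-01-01"},
)
```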
Backward compatibility
- Existing `databricks_conf` configs with `new_cluster`/`existing_cluster_id` work as before
- The new `DatabricksV2` fields are optional additions
- `entrypoint.py` is unchanged; the new serverless entrypoint is added alongside it
How was this patch tested?
Unit tests (14 connector tests, all passing)
Existing tests (unchanged):
- `test_databricks_agent`: classic compute full agent flow
- `test_agent_create_with_no_instance`: missing instance error
- `test_agent_create_with_default_instance`: instance from env var

New serverless tests:
- `test_is_serverless_config_detection`: 7 scenarios for compute mode detection
- `test_configure_serverless_with_env_key_only`: auto-creates environments array
- `test_configure_serverless_with_inline_env`: preserves user's environment spec
- `test_configure_serverless_creates_default_env`: default env when none specified
- `test_get_databricks_job_spec_serverless_with_env_key`: full spec for env_key config
- `test_get_databricks_job_spec_serverless_with_inline_env`: full spec for inline env config
- `test_get_databricks_job_spec_error_no_compute`: error when no compute config
- `test_databricks_agent_serverless`: full agent create/get flow for serverless
- `test_serverless_default_entrypoint_from_flytetools`: default flytetools entrypoint
- `test_serverless_task_git_source_overrides_default`: task-level override works
- `test_classic_and_serverless_use_same_repo`: same flytetools repo, different python_file

Task tests (`test_spark_task.py`):
- SparkSession retrieval from `sys.modules` and `builtins`
- `DatabricksV2` credential provider and notebook configuration
Manual testing
- Serverless run with `environment_key`
- Serverless run with an inline `environments` spec
- Verified `flytekit.current_context().spark_session` works in both modes
Setup process
No additional setup needed; run the plugin's unit tests with the standard `pytest` invocation.
Related PRs
Docs link