Add Databricks Serverless Compute Support #3392

rohitrsh wants to merge 1 commit into flyteorg:master from rohitrsh:feat/databricks-serverless-support

Conversation

rohitrsh commented Feb 17, 2026

Tracking issue

flyteorg/flyte#6911

Why are the changes needed?

Databricks Serverless Compute offers faster startup times (seconds vs. minutes), automatic scaling, and zero infrastructure management. However, the existing flytekit-spark connector only supports classic compute (clusters). This PR enables teams to use serverless without changing their task code.

Users switch between classic and serverless by changing only the databricks_conf; the task code stays identical:

import flytekit
from flytekit import task
from flytekitplugins.spark import DatabricksV2

# Classic compute
@task(task_config=DatabricksV2(
    databricks_conf={
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",
            "node_type_id": "m5.xlarge",
            "num_workers": 2,
        },
    },
    databricks_instance="my-workspace.cloud.databricks.com",
))
def classic_task() -> float:
    spark = flytekit.current_context().spark_session
    return spark.range(100).count()

# Serverless compute: same task code, different config
@task(task_config=DatabricksV2(
    databricks_conf={
        "environment_key": "default",
        "environments": [{
            "environment_key": "default",
            "spec": {"client": "1"},
        }],
    },
    databricks_instance="my-workspace.cloud.databricks.com",
    databricks_service_credential_provider="my-s3-credential",
))
def serverless_task() -> float:
    spark = flytekit.current_context().spark_session  # same API
    return spark.range(100).count()

What changes were proposed in this pull request?

Adds first-class support for running Flyte Spark tasks on Databricks Serverless Compute, alongside the existing classic compute (clusters) support.

  • Auto-detect serverless vs. classic based on databricks_conf contents (no new task type needed)
  • Generate correct Databricks Jobs API payload for serverless (multi-task format with environments array)
  • SparkSession available via flytekit.current_context().spark_session for both compute modes
  • AWS credential forwarding via Databricks Service Credentials for S3 access in serverless
  • Notebook task support for both classic and serverless compute
  • Default entrypoint from flytetools (same pattern as classic); no user configuration needed

Files modified

  • flytekitplugins/spark/connector.py: Serverless detection, multi-task job format, env injection, credential forwarding, entrypoint resolution, notebook tasks
  • flytekitplugins/spark/task.py: Serverless SparkSession retrieval in pre_execute(), DatabricksV2 config additions (credential provider, notebook support), docstring updates
  • tests/test_connector.py: 11 new tests for serverless detection, configuration, job spec generation, entrypoint defaults
  • tests/test_spark_task.py: Tests for serverless detection, SparkSession retrieval, credential provider, notebook config

No new files are added to the plugin. The serverless entrypoint (entrypoint_serverless.py) lives in the flytetools repository, following the same pattern as the classic entrypoint.

Technical details

1. Auto-detection of compute mode

New function _is_serverless_config() detects serverless based on databricks_conf keys:

  • existing_cluster_id: Classic (existing cluster)
  • new_cluster: Classic (new cluster)
  • environment_key or environments (no cluster keys): Serverless
  • None of the above: Error
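
The detection logic can be sketched roughly as follows (illustrative only; the actual implementation in connector.py may structure the checks and error handling differently):

def _is_serverless_config(databricks_conf: dict) -> bool:
    # Cluster keys always mean classic compute, even if environment keys are also present.
    if "existing_cluster_id" in databricks_conf or "new_cluster" in databricks_conf:
        return False
    # Serverless is signalled by an environment key or an inline environments list.
    if "environment_key" in databricks_conf or "environments" in databricks_conf:
        return True
    # Neither cluster nor environment keys: the connector reports an error.
    raise ValueError(
        "databricks_conf must contain either cluster settings "
        "(new_cluster / existing_cluster_id) or serverless settings "
        "(environment_key / environments)"
    )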

2. Serverless job spec format

Databricks Serverless requires a different Jobs API payload (multi-task format with tasks array and environments array). New function _configure_serverless() handles the environments array creation and env var injection.
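
For illustration, the serverless payload has roughly the following shape (a sketch based on the Databricks Jobs API multi-task format; the exact fields the connector emits may differ):

# Illustrative shape of a serverless job spec; values are placeholders.
serverless_job_spec = {
    "run_name": "flyte-serverless-example",
    "tasks": [
        {
            "task_key": "flyte-task",
            # Links this task to an entry in the environments array below.
            "environment_key": "default",
            "spark_python_task": {
                "python_file": "flytekitplugins/databricks/entrypoint_serverless.py",
                "parameters": [],  # Flyte task command, elided here
            },
            # Note: no new_cluster / existing_cluster_id for serverless.
        }
    ],
    "environments": [
        {"environment_key": "default", "spec": {"client": "1"}},
    ],
}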

3. Entrypoint resolution

Both classic and serverless default to the same flytetools repository. Only the python_file path differs:

default_classic_python_file = "flytekitplugins/databricks/entrypoint.py"
default_serverless_python_file = "flytekitplugins/databricks/entrypoint_serverless.py"

Users can override both git_source and python_file via databricks_conf for custom entrypoints.
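
For example, a custom entrypoint might be configured roughly like this (a sketch using the imports from the earlier example; the git_source and python_file key shapes mirror the Databricks Jobs API and are assumptions here):

@task(task_config=DatabricksV2(
    databricks_conf={
        "environment_key": "default",
        # Hypothetical override: use your own repository and entrypoint file.
        "git_source": {
            "git_url": "https://github.com/my-org/my-entrypoints",
            "git_provider": "gitHub",
            "git_branch": "main",
        },
        "python_file": "path/to/custom_entrypoint.py",
    },
    databricks_instance="my-workspace.cloud.databricks.com",
))
def custom_entrypoint_task() -> float:
    spark = flytekit.current_context().spark_session
    return spark.range(100).count()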

4. SparkSession in serverless

The serverless entrypoint (in flytetools) pre-creates the SparkSession and stores it in sys.modules and builtins. A new method, _get_databricks_serverless_spark_session() in task.py, retrieves it and exposes it via flytekit.current_context().spark_session, the same API as classic compute.
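
A minimal sketch of the retrieval, assuming the entrypoint stores the session under a well-known attribute (the exact storage keys are an implementation detail of the entrypoint and are assumptions here):

import builtins
import sys

def _get_databricks_serverless_spark_session():
    # The serverless entrypoint pre-creates the SparkSession; look for it in
    # the places it is stored. The attribute names below are assumptions.
    main_module = sys.modules.get("__main__")
    spark = getattr(main_module, "spark", None) if main_module else None
    if spark is None:
        spark = getattr(builtins, "spark", None)
    if spark is None:
        raise RuntimeError(
            "Serverless SparkSession not found; was the serverless entrypoint used?"
        )
    return spark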

5. AWS credential provider

New DatabricksV2 config field: databricks_service_credential_provider. Resolution order: task config → connector env var (FLYTE_DATABRICKS_SERVICE_CREDENTIAL_PROVIDER).
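
The resolution order can be expressed roughly as follows (illustrative sketch; only the field name and env var name come from the description above):

import os
from typing import Optional

def _resolve_service_credential_provider(task_config) -> Optional[str]:
    # Task-level config takes precedence; otherwise fall back to the connector env var.
    provider = getattr(task_config, "databricks_service_credential_provider", None)
    if provider:
        return provider
    return os.environ.get("FLYTE_DATABRICKS_SERVICE_CREDENTIAL_PROVIDER")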

6. Notebook task support

New DatabricksV2 config fields: notebook_path, notebook_base_parameters. Works with both classic and serverless compute.
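
For illustration, a notebook task could be configured roughly like this (paths and parameters are placeholders; only the field names come from the description above):

@task(task_config=DatabricksV2(
    databricks_conf={"environment_key": "default"},  # or new_cluster for classic compute
    databricks_instance="my-workspace.cloud.databricks.com",
    notebook_path="/Workspace/Users/me@example.com/my_notebook",
    notebook_base_parameters={"input_table": "sales", "run_date": "2026-02-17"},
))
def notebook_task() -> None:
    ...  # body intentionally empty in this sketch; the notebook runs on Databricks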

Backward compatibility

  • Existing classic tasks: No change; detection is additive and the classic path is unchanged
  • Existing databricks_conf: No change; configs with new_cluster/existing_cluster_id work as before
  • API surface: Additive only; new optional fields on DatabricksV2
  • flytetools entrypoint: Classic entrypoint.py unchanged; the new file is added alongside

How was this patch tested?

Unit tests (14 connector tests, all passing)

Existing tests (unchanged):

  • test_databricks_agent: Classic compute full agent flow
  • test_agent_create_with_no_instance: Missing instance error
  • test_agent_create_with_default_instance: Instance from env var

New serverless tests:

  • test_is_serverless_config_detection: 7 scenarios for compute mode detection
  • test_configure_serverless_with_env_key_only: Auto-creates the environments array
  • test_configure_serverless_with_inline_env: Preserves the user's environment spec
  • test_configure_serverless_creates_default_env: Default env when none is specified
  • test_get_databricks_job_spec_serverless_with_env_key: Full spec for an env_key config
  • test_get_databricks_job_spec_serverless_with_inline_env: Full spec for an inline env config
  • test_get_databricks_job_spec_error_no_compute: Error when no compute config is given
  • test_databricks_agent_serverless: Full agent create/get flow for serverless
  • test_serverless_default_entrypoint_from_flytetools: Default flytetools entrypoint
  • test_serverless_task_git_source_overrides_default: Task-level override works
  • test_classic_and_serverless_use_same_repo: Same flytetools repo, different python_file

Task tests (test_spark_task.py):

  • Serverless environment detection
  • SparkSession retrieval from sys.modules and builtins
  • DatabricksV2 credential provider and notebook configuration

Manual testing

  • Classic compute tasks continue to work (no regression)
  • Serverless compute with pre-configured environment_key
  • Serverless compute with inline environments spec
  • AWS credentials from Databricks service credentials
  • SparkSession available via flytekit.current_context().spark_session
  • Complex Spark workloads (DataFrame operations, UDFs, aggregations)

Setup process

No additional setup needed. Run tests with:

pytest plugins/flytekit-spark/tests/test_connector.py -v
pytest plugins/flytekit-spark/tests/test_spark_task.py -v

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>
