Conversation

@lorentzenchr (Contributor) commented on Mar 9, 2025

As discussed in scikit-learn/scikit-learn#28901 (comment), this PR adds eval_X and eval_y to the scikit-learn estimators' fit() methods, in order to make LGBM estimators compatible with scikit-learn's Pipeline(..., transform_input=["eval_X"]) (available as of scikit-learn 1.6).

See also scikit-learn/scikit-learn#27124.
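
For context, a minimal usage sketch of what this enables (hypothetical: it assumes this PR's eval_X/eval_y API, scikit-learn >= 1.6, and that the estimator requests the new fit parameters via metadata routing):

import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

import lightgbm as lgb

sklearn.set_config(enable_metadata_routing=True)

X, y = np.random.rand(500, 5), np.random.rand(500)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

# set_fit_request(eval_X=..., eval_y=...) assumes the new parameters added by this PR
reg = lgb.LGBMRegressor().set_fit_request(eval_X=True, eval_y=True)
pipe = Pipeline(
    [("scaler", StandardScaler()), ("lgbm", reg)],
    # transform eval_X with the already-fitted scaler before it reaches LGBMRegressor.fit()
    transform_input=["eval_X"],
)
pipe.fit(X_train, y_train, eval_X=X_valid, eval_y=y_valid)

This is the motivation discussed in the linked issues: transform_input operates on array-like fit parameters such as eval_X, not on eval_set's list of (X, y) tuples.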

@jameslamb (Collaborator) left a comment

Thanks! It's looking like you're struggling to get this passing CI, so I'm going to put this into "draft" for now. @ me any time here if you need help with development, and we can open this back up for review once CI is passing.

I saw you had multiple commits responding to linting errors... here's how to run those checks locally for faster feedback:

# (or conda, whatever you want)
pip install pre-commit
pre-commit run --all-files

And here's how to build locally and run the tests:

# step 1: compile lib_lightgbm
# (only need to do this once, because you're not making any C/C++ changes)
cmake -B build -S .
cmake --build build --target _lightgbm -j4

# step 2: install the Python package, re-using the compiled library
# (do this every time you change Python code in the library)
sh build-python.sh install --precompile

# step 3: run the scikit-learn tests
pytest tests/python_package_test/test_sklearn.py

@jameslamb marked this pull request as draft on March 9, 2025 23:15
@lorentzenchr (Contributor, Author):

@jameslamb Thanks for your suggestions.
Could you already comment on the deprecation strategy, i.e. raising a warning?
Then, should I adapt all the (scikit-learn API) Python tests and replace eval_set with the new eval_X, eval_y (thereby avoiding the warnings in the tests)?

@jameslamb (Collaborator):

Could you already comment on the deprecation strategy, i.e. raising a warning?

Making both options available for a time and raising a deprecation warning when eval_set is non-empty seems fine to me, if we decide to move forward with this. I'd also support a runtime error when both eval_set and eval_X are non-empty, to avoid taking on the complexity of merging those two inputs.
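
To make that concrete, the resolution logic could look roughly like this (a hedged sketch only; the helper name _resolve_eval_set and the exact warning/error types and messages are assumptions, not this PR's actual code):

import warnings

def _resolve_eval_set(eval_set, eval_X, eval_y):
    # raise instead of trying to merge the two ways of passing validation data
    if eval_set is not None and (eval_X is not None or eval_y is not None):
        raise ValueError("Pass either eval_set (deprecated) or eval_X/eval_y, not both.")
    if eval_set is not None:
        warnings.warn(
            "'eval_set' is deprecated, use 'eval_X' and 'eval_y' instead.",
            FutureWarning,
        )
        return eval_set
    if eval_X is None:
        return None
    if isinstance(eval_X, tuple):
        # several validation sets: pair the i-th features with the i-th labels
        return list(zip(eval_X, eval_y))
    # a single validation set
    return [(eval_X, eval_y)]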

I'm sorry but I cannot invest much time in this right now (for example, looking into whether this would introduce inconsistencies with HistGradientBoostingClassifier, XGBoost, or CatBoost). If you want to see this change, getting CI working and then opening it up for others to review is probably the best path.

should I adapt all the (scikit-learn API) Python tests and replace eval_set with the new eval_X, eval_y (thereby avoiding the warnings in the tests)?

No, please don't. As I said in scikit-learn/scikit-learn#28901 (comment), removing eval_set from LightGBM's scikit-learn estimators would be highly disruptive and would require a long deprecation cycle (more than a year, in my opinion). Throughout that time, we need to continue to test it at least as thoroughly as we have been.

@lorentzenchr (Contributor, Author):

@jameslamb I'm sorry, I really need a maintainer's help. The tests in tests/python_package_test/test_dask.py fail even on the master branch, locally on my computer. I tried different versions of dask, numpy, scipy, and scikit-learn, with no success.
TL;DR: the failure seems to be present on master and is not caused by this PR.

Details
pytest -x tests/python_package_test/test_dask.py::test_ranker
...
>           dask_ranker = dask_ranker.fit(dX, dy, sample_weight=dw, group=dg)

tests/python_package_test/test_dask.py:714: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:1566: in fit
    self._lgb_dask_fit(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:1068: in _lgb_dask_fit
    model = _train(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:811: in _train
    results = client.gather(futures_classifiers)
../python3_lgbm/lib/python3.11/site-packages/distributed/client.py:2565: in gather
    return self.sync(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:205: in _train_part
    data = _concat([x["data"] for x in list_of_parts])

...

 def _concat(seq: List[_DaskPart]) -> _DaskPart:
        if isinstance(seq[0], np.ndarray):
            return np.concatenate(seq, axis=0)
        elif isinstance(seq[0], (pd_DataFrame, pd_Series)):
            return concat(seq, axis=0)
        elif isinstance(seq[0], ss.spmatrix):
            return ss.vstack(seq, format="csr")
        else:
>           raise TypeError(
                f"Data must be one of: numpy arrays, pandas dataframes, sparse matrices (from scipy). Got {type(seq[0]).__name__}."
            )
E           TypeError: Data must be one of: numpy arrays, pandas dataframes, sparse matrices (from scipy). Got tuple.

../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:159: TypeError

@jameslamb (Collaborator):

What versions of dask / distributed do you have installed?

Searching for that error in the issue tracker here turns up a match: #6739.

I suspect you need to pin to dask<2024.12 in your environment, as we do in CI here (#6742).
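
For example (the exact pins used in CI are in #6742; treat this as an approximation):

# pin dask (and distributed, which is released in lockstep with dask) below 2024.12
pip install 'dask<2024.12' 'distributed<2024.12'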

@lorentzenchr (Contributor, Author):

@jameslamb Thank you so much. Pinning dask<2024.12 worked fine.

@lorentzenchr marked this pull request as ready for review on March 10, 2025 21:34
@lorentzenchr (Contributor, Author):

The remaining CI failures seem unrelated.
TODO for myself: Improve test coverage a bit.

@jameslamb changed the title from "ENH add eval_X, eval_y, deprecate eval_set" to "[python-package] scikit-learn fit() methods: add eval_X, eval_y, deprecate eval_set" on Mar 10, 2025
@jameslamb (Collaborator):

TODO for myself: Improve test coverage a bit.

@lorentzenchr if you are interested in continuing this I'd be happy to help with reviews. I'm supportive of adding this, for better compatibility with newer versions of scikit-learn.

@lorentzenchr (Contributor, Author):

@jameslamb Yes, I'd like to finish this. Your review would be great. Is there anything you need from me before you can start reviewing?

@jameslamb (Collaborator):

Great! I'd been waiting to review until you were done adding whatever tests you wanted.

If you'd like a review before then, update this to latest master and get CI passing (especially the check that LightGBM works with its oldest supported scikit-learn version), then @ me and I'll be happy to provide some feedback.

@lorentzenchr (Contributor, Author):

@jameslamb I rebased onto master. The CI failures seem unrelated (only R-package jobs, which this PR does not touch). So it is ready for review from my side.

@jameslamb (Collaborator) left a comment

Thanks very much! Overall this is looking good. I left a couple of suggestions I'd like to see implemented.

After that I'd be happy to merge this; I'd really like to see it in the next lightgbm release.

cc @jmoralez I'd love a review from you too if you have time

eval_set: Optional[List[_LGBM_ScikitValidSet]] = None,
eval_names: Optional[List[str]] = None,
eval_X: Optional[Union[_LGBM_ScikitMatrixLike, Tuple[_LGBM_ScikitMatrixLike]]] = None,
eval_y: Optional[Union[_LGBM_LabelType, Tuple[_LGBM_LabelType]]] = None,
@jameslamb (Collaborator):

Putting new arguments in the middle of the signature like this is a breaking change from the perspective of anyone calling these functions with positional arguments.

Could you please move these new arguments to the end of the signature and make them keyword-only?

Like this:

def fit(
    self,
    X,
    # ... other arguments that are already there ...
    *,
    eval_X: Optional[Union[_LGBM_ScikitMatrixLike, Tuple[_LGBM_ScikitMatrixLike]]] = None,
    eval_y: Optional[Union[_LGBM_LabelType, Tuple[_LGBM_LabelType]]] = None,

Making all new arguments added here keyword-only would allow us to change the ordering in the future without breaking anyone because we'll know there cannot be any code calling them positionally.

    eval_set = list(zip(eval_X, eval_y))
else:
    eval_set = [(eval_X, eval_y)]
return eval_set
@jameslamb (Collaborator):

Can you please add test cases covering all of these if-else branches, e.g. using pytest.raises() for the error cases and checking that the expected error type and message are raised?

There are notes in #7031 on how to generate a coverage report with pytest-cov for LightGBM's python tests if you want to use that to double-check.
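
For example, the mutually-exclusive case could be covered along these lines (a sketch that re-uses test_sklearn.py's existing imports and the make_synthetic_regression helper; the error type and message here are assumptions and must match whatever the implementation actually raises):

def test_fit_raises_when_both_eval_set_and_eval_X_are_passed():
    X, y = make_synthetic_regression()
    gbm = lgb.LGBMRegressor()
    with pytest.raises(ValueError, match="eval_set"):
        gbm.fit(X, y, eval_set=[(X, y)], eval_X=X, eval_y=y)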

This is deprecated, use `eval_X` and `eval_y` instead.
eval_names : list of str, or None, optional (default=None)
Names of eval_set.
eval_sample_weight : {eval_sample_weight_shape}
@jameslamb (Collaborator):

Was this change a mistake? I don't see any code changes to the handling of eval_sample_weight in this PR.

Right now the doc for this says the following:

eval_sample_weight_shape="list of array (same types as ``sample_weight`` supports), or None, optional (default=None)",
eval_init_score_shape="list of array (same types as ``init_score`` supports), or None, optional (default=None)",
eval_group_shape="list of array (same types as ``group`` supports), or None, optional (default=None)",

If you're changing this because you found that "list" is overly specific and that other iterables like tuple could be provided, then I'm ok with this change but please:

  • make it in the formatting variable eval_sample_weight_shape instead of tacking on "or tuple" here (see the sketch after this list)
  • double-check if eval_init_score and eval_group could also be modified that way
  • add tests in test_sklearn.py covering tuples being provided for those inputs
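
For the first bullet, that change might just be a one-line edit to the shared docstring variable (wording here is a suggestion only, based on the current text quoted above):

eval_sample_weight_shape="list or tuple of array (same types as ``sample_weight`` supports), or None, optional (default=None)",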

Comment on lines +2066 to +2084
X, y = make_synthetic_regression()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
cbs = [lgb.early_stopping(2)]
gbm1 = lgb.LGBMRegressor()
gbm1.fit(X_train, y_train, eval_set=[(X_test, y_test)], callbacks=cbs)
gbm2 = lgb.LGBMRegressor()
gbm2.fit(X_train, y_train, eval_X=X_test, eval_y=y_test, callbacks=cbs)
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))

# 2 evaluation sets
n = X_test.shape[0]
X_test1, X_test2 = X_test[: n // 2], X_test[n // 2 :]
y_test1, y_test2 = y_test[: n // 2], y_test[n // 2 :]
gbm1 = lgb.LGBMRegressor()
gbm1.fit(X_train, y_train, eval_set=[(X_test1, y_test1), (X_test2, y_test2)], callbacks=cbs)
gbm2 = lgb.LGBMRegressor()
gbm2.fit(X_train, y_train, eval_X=(X_test1, X_test2), eval_y=(y_test1, y_test2), callbacks=cbs)
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])
@jameslamb (Collaborator):

Suggested change
X, y = make_synthetic_regression()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
params = {
    "deterministic": True,
    "force_row_wise": True,
    "n_jobs": 1,
    "seed": 708,
}
cbs = [lgb.early_stopping(2)]
gbm1 = lgb.LGBMRegressor(**params)
gbm1.fit(X_train, y_train, eval_set=[(X_test, y_test)], callbacks=cbs)
gbm2 = lgb.LGBMRegressor(**params)
gbm2.fit(X_train, y_train, eval_X=X_test, eval_y=y_test, callbacks=cbs)
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))
# 2 evaluation sets
n = X_test.shape[0]
X_test1, X_test2 = X_test[: n // 2], X_test[n // 2 :]
y_test1, y_test2 = y_test[: n // 2], y_test[n // 2 :]
gbm1 = lgb.LGBMRegressor(**params)
gbm1.fit(X_train, y_train, eval_set=[(X_test1, y_test1), (X_test2, y_test2)], callbacks=cbs)
gbm2 = lgb.LGBMRegressor(**params)
gbm2.fit(X_train, y_train, eval_X=(X_test1, X_test2), eval_y=(y_test1, y_test2), callbacks=cbs)
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])

Let's please pass some parameter values to guarantee a deterministic fit, so the test won't fail due to random differences.
