FEA D2 Brier Score #28971
Conversation
#22046 seems like a blocker
@lorentzenchr @ogrisel I updated this PR
Thanks for the PR. Here is a first pass of feedback.
Note: once this PR is merged, feel free to follow up with a PR to add array API support.
```python
>>> y_true = [0, 1, 1, 0]
>>> y_pred = [0.15, 0.9, 0.85, 0.25]
>>> d2_brier_score(y_true, y_pred)
0.882...
```
I think it would be interesting to see the D2 Brier score values for the exact same three toy data cases used in the D2 log loss code snippets above.
I just saw that Christian said the opposite above. For educational reasons, I find it interesting to show the three cases (good, bad, and very bad models) to demonstrate that the value can be negative.
It's very often the case that readers just read the code snippets in the doc and ignore the text. If they see a negative value, I suspect that will pique their interest and nudge them into reading why the score can be negative.
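Something like the following sketch (assuming `d2_brier_score` computes `1 - Brier(y_pred) / Brier(null)` with the null model predicting the empirical positive rate; the bad and very bad probability vectors are made up for illustration):

```python
>>> y_true = [0, 1, 1, 0]
>>> # Good model: confident and mostly correct.
>>> d2_brier_score(y_true, [0.15, 0.9, 0.85, 0.25])
0.882...
>>> # Bad model: constant prediction equal to the prevalence, i.e. the null model.
>>> d2_brier_score(y_true, [0.5, 0.5, 0.5, 0.5])
0.0
>>> # Very bad model: confidently wrong, hence worse than the null model.
>>> d2_brier_score(y_true, [0.9, 0.1, 0.2, 0.8])
-1.9...
```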
```python
if _num_samples(y_proba) < 2:
    msg = "D^2 score is not well-defined with less than two samples."
    warnings.warn(msg, UndefinedMetricWarning)
    return float("nan")
```
I think it could be helpful to expand the warning message to give the user a code snippet to either turn this warning into an exception or to silence it:
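For example, something along these lines (just a sketch of what the expanded message could point users to, using the public `warnings` filters and `sklearn.exceptions.UndefinedMetricWarning`):

```python
import warnings

from sklearn.exceptions import UndefinedMetricWarning

# Turn the warning into an exception:
warnings.filterwarnings("error", category=UndefinedMetricWarning)

# Or silence it entirely:
warnings.filterwarnings("ignore", category=UndefinedMetricWarning)
```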
We could also expose a `replace_undefined_by` argument, but this can be done in a follow-up PR for all D2 scores.
I think this is the same warning we are raising for all `D^2` metrics and even the `R^2` score. We could change it here, but wouldn't it be better to just leave it as is for consistency? Also, if we do want to update it, how about adding an additional statement instead of a code snippet, mentioning that `y_proba` should have at least two samples for the `D^2` score to provide anything meaningful?
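For instance (the wording is only a sketch, reusing the `warnings` import and `UndefinedMetricWarning` already in scope in the snippet above):

```python
msg = (
    "D^2 score is not well-defined with less than two samples. "
    "y_proba needs at least two samples for the score to be meaningful."
)
warnings.warn(msg, UndefinedMetricWarning)
```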
```python
def test_d2_brier_score():
    sample_weight = [2, 2, 3, 1, 1, 1]
```
Maybe we could parametrize this test with `"use_sample_weight", [True, False]` and consistently pass the `sample_weight` argument everywhere, either with `None` or with `[2, 2, 3, 1, 1, 1]`, as sketched below.
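A minimal sketch of that parametrization (the test body is elided; `d2_brier_score` is the function added in this PR):

```python
import pytest

@pytest.mark.parametrize("use_sample_weight", [True, False])
def test_d2_brier_score(use_sample_weight):
    sample_weight = [2, 2, 3, 1, 1, 1] if use_sample_weight else None
    # ... build y_true / y_proba, then pass sample_weight=sample_weight
    # consistently in every d2_brier_score call and expected-value check.
```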
This will be a bit difficult because the hardcoded null model scores, and the hardcoded `y_proba` that equals the null model, are different with and without sample weights. If we have two parameter values in the test, we would need to add a bunch of if/else statements, which would get awkward. Should we separate this out into binary and multiclass cases?
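To illustrate the difficulty (the labels below are made up; only the sample weights come from the test above): the null model predicts the weighted class prevalence, so any hardcoded expectation changes as soon as weights are involved.

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1, 0])         # hypothetical labels
sample_weight = np.array([2, 2, 3, 1, 1, 1])  # weights from the test above

np.average(y_true)                         # unweighted null model: 0.5
np.average(y_true, weights=sample_weight)  # weighted null model: 0.6
```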
```python
# Also confirm that the order of the labels does not affect the d2_score
labels = [2, 0, 1]
new_d2_score = d2_brier_score(y_true=y_true, y_proba=y_proba, labels=labels)
assert new_d2_score == pytest.approx(d2_score)
```
I think this test function is a bit long. We could split the edge cases where we expect `assert d2_score == 0` into a dedicated function, along the lines of the sketch below.
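A hypothetical split could look like this (the function name and data are made up; it assumes the `1 - Brier(y_pred) / Brier(null)` definition, under which a constant prediction equal to the prevalence scores exactly 0):

```python
import pytest

from sklearn.metrics import d2_brier_score  # added in this PR

def test_d2_brier_score_null_model():
    # A constant prediction equal to the positive-class prevalence
    # matches the null model, so the D^2 score should be 0.
    y_true = [0, 1, 1, 0]
    y_proba = [0.5, 0.5, 0.5, 0.5]
    assert d2_brier_score(y_true=y_true, y_proba=y_proba) == pytest.approx(0.0)
```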
Reference Issues/PRs
Closes #20943
What does this implement/fix? Explain your changes.
Any other comments?
CC: @lorentzenchr