
Conversation

@srowen
Member

@srowen srowen commented Sep 3, 2016

What changes were proposed in this pull request?

See related discussion at #14643

This actually changes more than what the original JIRA encompassed, but does propose a more reasonable (?) and deterministic result in this and other corner cases.

Revise semantics of ProbabilisticClassifierModel thresholds so that classes can only be predicted if they exceed their threshold (meaning no class may be predicted), and otherwise ordering by highest probability, then lowest threshold, then by class index.
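The proposed semantics can be sketched in plain Scala. This is a hypothetical standalone illustration, not the actual patch code; the object and method names, and the use of NaN for "no class predicted", are assumptions made for this sketch:

```scala
// Minimal sketch of the proposed semantics (names are made up for
// illustration). A class can be predicted only if its probability
// exceeds its threshold; among qualifying classes, the winner has the
// highest probability, then the lowest threshold, then the lowest
// class index. If no class qualifies, NaN is returned.
object ProposedThresholds {
  def predict(probabilities: Array[Double], thresholds: Array[Double]): Double = {
    require(probabilities.length == thresholds.length,
      "probabilities and thresholds must have the same length")
    val qualifying = probabilities.indices.filter(i => probabilities(i) > thresholds(i))
    if (qualifying.isEmpty) {
      Double.NaN // no class exceeded its threshold
    } else {
      // Sort key: highest probability first, then lowest threshold, then index.
      qualifying.minBy(i => (-probabilities(i), thresholds(i), i)).toDouble
    }
  }
}
```

For example, with probabilities (0.5, 0.5) and thresholds (0.4, 0.3), both classes qualify with equal probability, so the lower threshold breaks the tie in favor of class 1.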

How was this patch tested?

Existing and new unit tests.

@SparkQA

SparkQA commented Sep 3, 2016

Test build #64900 has finished for PR 14949 at commit 08dbe43.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 3, 2016

Test build #64903 has finished for PR 14949 at commit 2fa331e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Sep 7, 2016

@jkbradley @zhengruifeng @MLnick I wonder if I could ask you for comments on this change? It's a behavior change, so not something I'd do lightly, but I do think it improves the semantics here.

@srowen
Member Author

srowen commented Sep 12, 2016

Trying @holdenk or @mengxr maybe. I think this behavior should be changed because it doesn't match the common meaning of 'threshold', but I feel like I'm missing context about why it was done this way.

@zhengruifeng
Contributor

I think both this change and the current design are reasonable. I personally prefer the current one, which treats the threshold as a kind of weight.

@srowen
Member Author

srowen commented Sep 12, 2016

The problem is that it's called 'threshold' and not 'weight', and 'threshold' means something different. Is anyone suggesting that it was always meant as a 'weight', and/or does anyone have a reference for this type of weighting? I've never seen it, and I'm not sure what multiplying a probability by 1/weight would mean theoretically.

Note this change would also make threshold matter as a tie-breaker.
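For contrast, the behavior being questioned amounts to dividing each probability by its threshold and taking the argmax. A sketch under that reading (names are hypothetical; this is not the actual Spark implementation):

```scala
// Sketch of the "weight"-style reading being discussed: each probability
// is divided by its threshold and the largest scaled value wins, so a
// threshold acts as an inverse weight rather than a cutoff.
// Names are hypothetical; this is not the actual Spark code.
object WeightedReading {
  def predict(probabilities: Array[Double], thresholds: Array[Double]): Int = {
    val scaled = probabilities.zip(thresholds).map { case (p, t) => p / t }
    scaled.indexOf(scaled.max) // argmax; ties go to the lowest class index
  }
}
```

Note that with equal thresholds this reduces to a plain argmax over probabilities, so the two readings only diverge when the thresholds are uneven.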

@MLnick
Contributor

MLnick commented Sep 12, 2016

The original JIRA SPARK-8069 refers to https://cran.r-project.org/web/packages/randomForest/randomForest.pdf.

That R package calls it "cutoff". Though it does indeed seem to act more like a "weight" or "scaling". I can't say I've come across it before, and it appears this is the only package that does it like this (at least that I've been able to find from some quick searching). I haven't found any theoretical background for it either.

In any case, now that we have it, I think it is probably best to keep it as is. However, it appears that our implementation here is flawed, since in the original R code the cutoff vector sum must be in (0, 1) (and also be > 0 everywhere) - see https://github.com/cran/randomForest/blob/9208176df98d561aba6dae239472be8b124e2631/R/predict.randomForest.R#L47. If we're going to base something on another impl, probably best to actually follow it.

So:

  • If sum(thresholds) > 1 or < 0, throw an error
  • If any entry in thresholds is not > 0, throw an error

I believe this takes care of the edge cases, since no thresholds can be 0 or 1. The tie-breaker element is taken care of by Vector.argmax (if p/t is the same for 2 or more classes, ties will effectively be broken by class index order).

I don't like returning NaN. Since the R impl is actually scaling things rather than "cutting off" or "thresholding", it should always return a prediction, and I think we should too.
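The validation suggested above, modeled on the R cutoff checks, might look like the following. This is a sketch of the suggestion under hypothetical names, not the eventual patch:

```scala
// Hypothetical sketch of the suggested validation, modeled on the R
// randomForest cutoff checks: every entry must be strictly positive and
// the vector must sum to no more than 1. With that in place, prediction
// always returns a class via argmax of p/t, ties going to the lowest index.
object CutoffValidation {
  def validate(thresholds: Array[Double]): Unit = {
    require(thresholds.forall(_ > 0.0), "each threshold must be > 0")
    require(thresholds.sum <= 1.0, "thresholds must sum to at most 1")
  }

  def predict(probabilities: Array[Double], thresholds: Array[Double]): Int = {
    validate(thresholds)
    val scaled = probabilities.zip(thresholds).map { case (p, t) => p / t }
    scaled.indexOf(scaled.max) // always returns a class; no NaN case
  }
}
```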

@srowen
Member Author

srowen commented Sep 12, 2016

Oh, I get it now. If this were being applied only to decision trees, that interpretation would hold, and we could fix this up and document the meaning. I agree it only makes sense to return "no class" if actually thresholding.

The only problem here is that this is not being applied just to a random forest implementation but to all classifiers that output a probability. That's a little more of a stretch. I suppose the result here can be thought of as a likelihood ratio of class probability vs prior, not some hacky heuristic specific to the CRAN package. I think the name is unfortunate because I would not have guessed that's the meaning given the name (though to be fair the scaladoc does say what it means).

I'll close this but what's the best way forward?

Option 1.
Keep current behavior. Modify #14643 to include Nick's suggestions above, and add a bunch of documentation about what 'thresholds' really means here.

Option 2.
As above but deprecate threshold and rename to 'cutoff' to be a little clearer.

Option 3.
As in Option 2 but also go back and actually implement thresholds.
