Skip to content

AUROC metric should throw an error when used for multi-class problems #3303

@NumesSanguis

Description

@NumesSanguis

🐛 Bug

AUROC accepts multi-class input without throwing an error. Instead, it gives a random value, which gives the illusion that it is working.

Background: https://forums.pytorchlightning.ai/t/pytorch-lightning-auroc-value-for-multi-class-seems-to-be-completely-off-compared-to-sklearn-using-it-wrong/61/7

To Reproduce

Steps to reproduce the behavior:

  1. Manually create some multi-class arrays
  2. Use PyTorch Lightning's AUROC() metric
  3. Use sklearn's AUROC metric
  4. Observe values not matching

Code sample

import torch
import sklearn.metrics
import pytorch_lightning as pl
from pytorch_lightning.metrics.classification import AUROC

pl.seed_everything(0)
auroc = AUROC()

def test_auroc_sk_multiclass():
    for i in range(100):
        target = torch.randint(0, 3, size=(10,))  # 2 --> 3
        pred = torch.rand(10, 3).softmax(dim=1)  # torch.randint(0, 2, size=(10, ))
        score_sk = sklearn.metrics.roc_auc_score(target.numpy(), pred.numpy(), multi_class='ovo', labels=[0, 1, 2])
        score_pl = auroc(pred, target)
        print(score_sk, score_pl)
        assert torch.allclose(torch.tensor(score_pl).float(), torch.tensor(score_sk).float())

test_auroc_sk_multiclass()

Expected behavior

  • Throw error that multi-class AUROC has not been implemented (yet).
  • Note in documentation that the AUROC metric does not support multi-class yet ()

Actual behavior

Giving a random value, giving a false sense that it is working.

Environment

* CUDA:
	- GPU:
		- GeForce GTX 1080 Ti
	- available:         True
	- version:           10.2
* Packages:
	- numpy:             1.19.1
	- pyTorch_debug:     False
	- pyTorch_version:   1.6.0
	- pytorch-lightning: 0.9.0
	- tensorboard:       2.2.0
	- tqdm:              4.48.2
* System:
	- OS:                Linux
	- architecture:
		- 64bit
		- 
	- processor:         x86_64
	- python:            3.7.7
	- version:           #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020

Additional context

The output value is nonsense in the multi-class classification, instead of an error. That's why I thought it was appropriate to file it as a bug instead of a feature request / documentation improvement. I'm aware that the AUROC implementation is not intended to be for multi-class after the discussion on the PyTorch Lightning forum.

I used the AUROC value and noticed it was wrong after training a few models, but it will take many people off-guard in it's current form.

Feature request for MulticlassAUROC: #3304

Metadata

Metadata

Assignees

Labels

bugSomething isn't workinghelp wantedOpen to be worked on

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions