AUROC metric should throw an error when used for multi-class problems

## 🐛 Bug

`AUROC` accepts multi-class input without raising an error. Instead, it returns a meaningless value, which gives the illusion that the metric is working.

Background: https://forums.pytorchlightning.ai/t/pytorch-lightning-auroc-value-for-multi-class-seems-to-be-completely-off-compared-to-sklearn-using-it-wrong/61/7

### To Reproduce

Steps to reproduce the behavior:

1. Manually create some multi-class arrays
2. Use PyTorch Lightning's `AUROC()` metric
3. Use sklearn's AUROC metric
4. Observe that the two values do not match


#### Code sample
```python
import torch
import sklearn.metrics
import pytorch_lightning as pl
from pytorch_lightning.metrics.classification import AUROC

pl.seed_everything(0)
auroc = AUROC()

def test_auroc_sk_multiclass():
    for _ in range(100):
        # Three classes instead of two: integer targets, per-class probabilities
        target = torch.randint(0, 3, size=(10,))
        pred = torch.rand(10, 3).softmax(dim=1)
        score_sk = sklearn.metrics.roc_auc_score(target.numpy(), pred.numpy(), multi_class='ovo', labels=[0, 1, 2])
        score_pl = auroc(pred, target)
        print(score_sk, score_pl)
        # Fails: the Lightning value does not match the sklearn reference
        assert torch.allclose(torch.as_tensor(score_pl).float(), torch.tensor(score_sk).float())

test_auroc_sk_multiclass()
```

### Expected behavior

- Raise an error stating that multi-class AUROC has not been implemented (yet).
- Note in the documentation that the AUROC metric does not support multi-class input yet.
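
To illustrate the first point, here is a minimal sketch of the kind of input check I have in mind. It is not Lightning's code: `assert_binary_auroc_input` is a hypothetical helper, and both the choice of `ValueError` and the exact conditions (more than one score column, or labels outside `{0, 1}`) are my assumptions about what counts as multi-class input.

```python
from typing import Iterable, Sequence


def assert_binary_auroc_input(pred_shape: Sequence[int], target_labels: Iterable[int]) -> None:
    """Hypothetical guard: raise if the input looks multi-class.

    pred_shape:    shape of the prediction tensor, e.g. tuple(pred.shape)
    target_labels: the integer target labels, e.g. target.tolist()
    """
    # A trailing dimension larger than 1 means one score per class.
    if len(pred_shape) > 1 and pred_shape[-1] > 1:
        raise ValueError(
            f"AUROC got predictions of shape {tuple(pred_shape)}; "
            "multi-class AUROC is not implemented."
        )
    # Binary targets may only contain the labels 0 and 1.
    unexpected = sorted({int(t) for t in target_labels} - {0, 1})
    if unexpected:
        raise ValueError(
            f"AUROC got target labels {unexpected} outside {{0, 1}}; "
            "multi-class AUROC is not implemented."
        )
```

With tensors this would be called as `assert_binary_auroc_input(pred.shape, target.tolist())` before the score is computed. For the code sample above, it would raise on the `(10, 3)` prediction shape.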

### Actual behavior
The metric returns a meaningless value, giving a false sense that it is working.

### Environment

```
* CUDA:
	- GPU:
		- GeForce GTX 1080 Ti
	- available:         True
	- version:           10.2
* Packages:
	- numpy:             1.19.1
	- pyTorch_debug:     False
	- pyTorch_version:   1.6.0
	- pytorch-lightning: 0.9.0
	- tensorboard:       2.2.0
	- tqdm:              4.48.2
* System:
	- OS:                Linux
	- architecture:
		- 64bit
		- 
	- processor:         x86_64
	- python:            3.7.7
	- version:           #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020
```

### Additional context

In the multi-class case the metric returns a nonsense value instead of raising an error. That's why I thought it was appropriate to file this as a bug rather than a feature request or documentation improvement. After the discussion on the PyTorch Lightning forum, I'm aware that the AUROC implementation is not intended for multi-class input.

I used the `AUROC` value and only noticed it was wrong after training a few models. In its current form it will catch many people off guard.

Feature request for MulticlassAUROC: https://github.com/PyTorchLightning/pytorch-lightning/issues/3304
