Closed
Description
Describe the bug
In InstanceHardnessThreshold, the "estimator" parameter does not support a string value. However, the documentation states that "If str, the choices using a string are the following: 'knn', 'decision-tree', 'random-forest', 'adaboost', 'gradient-boosting' and 'linear-svm'."
The source code does not appear to handle a string value either: the validation method referenced below raises a ValueError when "estimator" is a str.
imbalanced-learn/imblearn/under_sampling/_prototype_selection/_instance_hardness_threshold.py, line 129 in e802a19:
def _validate_estimator(self, random_state):
Steps/Code to Reproduce
The code is modified from the example in the documentation.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.under_sampling import InstanceHardnessThreshold
X, y = make_classification(n_classes=2, class_sep=2,
    weights=[0.1, 0.9], n_informative=3, n_redundant=1, flip_y=0,
    n_features=20, n_clusters_per_class=1, n_samples=1000, random_state=10)
print('Original dataset shape %s' % Counter(y))
iht = InstanceHardnessThreshold(random_state=42, estimator="knn")
X_res, y_res = iht.fit_resample(X, y)
print('Resampled dataset shape %s' % Counter(y_res))
Expected Results
No error is thrown.
Actual Results
Original dataset shape Counter({1: 900, 0: 100})
Traceback (most recent call last):
  File "...\IHT.py", line 11, in <module>
    X_res, y_res = iht.fit_resample(X, y)
  File "...\site-packages\imblearn\base.py", line 83, in fit_resample
    output = self._fit_resample(X, y)
  File "...\site-packages\imblearn\under_sampling\_prototype_selection\_instance_hardness_threshold.py", line 153, in _fit_resample
    self._validate_estimator(random_state)
  File "...\site-packages\imblearn\under_sampling\_prototype_selection\_instance_hardness_threshold.py", line 147, in _validate_estimator
    raise ValueError(
ValueError: Invalid parameter `estimator`. Got <class 'str'>.
Versions
Python 3.9.2
Scikit-Learn 1.0.2
Imbalanced-Learn 0.9.0