[WIP] Fix a bug in ClusterCentroids when using the hard voting strategy #741

lixiilu · 2020-08-10T15:12:29Z

Related Issue: #738

pep8speaks · 2020-08-10T15:12:34Z

Hello @tangxi1227! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file imblearn/under_sampling/_prototype_generation/_cluster_centroids.py:

Line 174:89: E501 line too long (110 > 88 characters)

chkoar · 2020-08-11T11:54:29Z

Thanks for the PR. Please fix the PEP 8 issue and add a test.

lixiilu · 2020-08-11T12:09:36Z

file: imbalanced-learn/imblearn/under_sampling/_prototype_generation/_cluster_centroids.py
function: _fit_resample
line: 174

def _fit_resample(self, X, y):
    self._validate_estimator()

    # Resolve the voting strategy: sparse input falls back to "hard".
    if self.voting == "auto":
        if sparse.issparse(X):
            self.voting_ = "hard"
        else:
            self.voting_ = "soft"
    else:
        if self.voting in VOTING_KIND:
            self.voting_ = self.voting
        else:
            raise ValueError(
                "'voting' needs to be one of {}. Got {}"
                " instead.".format(VOTING_KIND, self.voting)
            )

    X_resampled, y_resampled = [], []
    for target_class in np.unique(y):
        if target_class in self.sampling_strategy_.keys():
            n_samples = self.sampling_strategy_[target_class]
            self.estimator_.set_params(**{"n_clusters": n_samples})
            self.estimator_.fit(X[y == target_class])
            # Pass only the samples of the target class (the change on line 174).
            X_new, y_new = self._generate_sample(
                X[y == target_class],
                y[y == target_class],
                self.estimator_.cluster_centers_,
                target_class,
            )
            X_resampled.append(X_new)
            y_resampled.append(y_new)
        else:
            # Classes that are not resampled are kept unchanged.
            target_class_indices = np.flatnonzero(y == target_class)
            X_resampled.append(_safe_indexing(X, target_class_indices))
            y_resampled.append(_safe_indexing(y, target_class_indices))

    if sparse.issparse(X):
        X_resampled = sparse.vstack(X_resampled)
    else:
        X_resampled = np.vstack(X_resampled)
    y_resampled = np.hstack(y_resampled)

    return X_resampled, np.array(y_resampled, dtype=y.dtype)

glemaitre · 2020-08-11T12:40:55Z

A potential test would be to check that no minority-class sample ends up among the samples selected when resampling the majority class. This should be quite easy to implement. We would also need an entry in what's new, since the fix affects end users.
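
A rough sketch of that check follows. It uses plain NumPy rather than the real ClusterCentroids class. It assumes that hard voting replaces each cluster centre with its nearest original sample, and that the bug came from searching all of X instead of only the target class. The helper nearest_sample_indices and the toy data are hypothetical and not part of imbalanced-learn.

```python
import numpy as np


def nearest_sample_indices(candidates, centroids):
    """For each centroid, return the index of the closest row in candidates."""
    # Pairwise squared Euclidean distances, shape (n_centroids, n_candidates).
    diffs = centroids[:, None, :] - candidates[None, :, :]
    dist = (diffs ** 2).sum(axis=2)
    return dist.argmin(axis=1)


# Toy data: four majority samples (class 0) surrounding one minority sample (class 1).
X = np.array([[0.0, 0.0], [0.0, 4.0], [4.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
y = np.array([0, 0, 0, 0, 1])

target_class = 0
mask = y == target_class

# Stand-in for a fitted cluster centre of the majority class: the class mean.
centroids = X[mask].mean(axis=0, keepdims=True)

# Searching all of X lets the minority sample win the nearest-neighbour lookup.
leaked_labels = y[nearest_sample_indices(X, centroids)]

# Restricting the search to the target class keeps the selection inside that class.
selected_labels = y[mask][nearest_sample_indices(X[mask], centroids)]

assert (leaked_labels != target_class).any()
assert (selected_labels == target_class).all()
```

A test in the library itself would presumably run ClusterCentroids with voting="hard" on a similar dataset and assert that every resampled majority sample is a row of the original majority class.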

modify cluster_centroids, fix a bug

b9147df

chkoar changed the title ~~modify cluster_centroids, fix a bug~~ [WIP] Fix a bug in ClusterCentroids when using the hard voting strategy Aug 11, 2020

chkoar requested a review from glemaitre August 11, 2020 11:59

glemaitre mentioned this pull request Nov 1, 2020

FIX select sample from the targeted class in ClusterCentroids #769

Merged

glemaitre closed this in #769 Nov 1, 2020
