@@ -274,6 +274,9 @@ The parameter ``n_neighbors`` allows to give a classifier subclassed from
274274``KNeighborsMixin`` from scikit-learn to find the nearest neighbors and make
275275the decision to keep a given sample or not.
276276
277+ Repeated Edited Nearest Neighbours
278+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
279+
277280:class:`RepeatedEditedNearestNeighbours` extends
278281:class:`EditedNearestNeighbours` by repeating the algorithm multiple times
279282:cite:`tomek1976experiment`. Generally, repeating the algorithm will delete
@@ -285,9 +288,23 @@ more data::
285288 >>> print(sorted(Counter(y_resampled).items()))
286289 [(0, 64), (1, 208), (2, 4551)]
287290
288- :class:`AllKNN` differs from the previous
289- :class:`RepeatedEditedNearestNeighbours` since the number of neighbors of the
290- internal nearest neighbors algorithm is increased at each iteration
291+ The maximum number of times the edited nearest neighbours method is repeated can be
292+ set through the parameter ``max_iter``.
293+
294+ The repetitions will stop when:
295+
296+ 1. the maximum number of iterations is reached, or
297+ 2. no more observations are removed, or
298+ 3. one of the majority classes becomes a minority class, or
299+ 4. one of the majority classes disappears during the undersampling.
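
The four stopping rules above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the imbalanced-learn implementation: the helper names ``knn_labels``, ``enn_pass`` and ``repeated_enn`` are hypothetical, only the minority class is assumed to be left untouched, a sample is assumed to be kept only when all of its neighbours share its label, and the sketch assumes that the result of the previous iteration is returned when rule 3 or 4 triggers.

```python
from collections import Counter


def knn_labels(X, y, i, k):
    """Return the labels of the k nearest neighbours of sample i (itself excluded)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
        for j in range(len(X))
        if j != i
    )
    return [y[j] for _, j in dists[:k]]


def enn_pass(X, y, k, protected):
    """One edited-nearest-neighbours pass.

    Samples of the protected class are always kept; any other sample is
    kept only if all of its k nearest neighbours share its label.
    """
    keep = [
        i
        for i in range(len(X))
        if y[i] == protected or all(lab == y[i] for lab in knn_labels(X, y, i, k))
    ]
    return [X[i] for i in keep], [y[i] for i in keep]


def repeated_enn(X, y, k=3, max_iter=100):
    """Repeat enn_pass until one of the four stopping rules applies."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    n_minority, n_classes = counts[minority], len(counts)
    for _ in range(max_iter):  # rule 1: maximum number of iterations
        X_new, y_new = enn_pass(X, y, k, minority)
        if len(y_new) == len(y):  # rule 2: no more observations removed
            break
        new_counts = Counter(y_new)
        if len(new_counts) < n_classes:  # rule 4: a class disappeared
            break
        if any(
            n < n_minority for label, n in new_counts.items() if label != minority
        ):  # rule 3: a majority class became a minority class
            break
        X, y = X_new, y_new
    return X, y


# Toy 1-D data: one "maj" sample (20.1) sits inside the "min" cluster.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (20.1,),
     (20.0,), (20.2,), (20.4,)]
y = ["maj"] * 7 + ["min"] * 3
X_res, y_res = repeated_enn(X, y)
print(sorted(Counter(y_res).items()))  # [('maj', 6), ('min', 3)]
```

On this toy data the first pass removes the isolated majority sample and the second pass removes nothing, so the loop stops on rule 2.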
300+
301+ All KNN
302+ ~~~~~~~
303+
304+ :class:`AllKNN` is a variation of
305+ :class:`RepeatedEditedNearestNeighbours` in which the number of neighbours evaluated at
306+ each round of :class:`EditedNearestNeighbours` increases. It starts by editing based on
307+ the 1-nearest neighbour and enlarges the neighbourhood by 1 at each iteration
291308:cite:`tomek1976experiment`::
292309
293310 >>> from imblearn.under_sampling import AllKNN
@@ -296,8 +313,13 @@ internal nearest neighbors algorithm is increased at each iteration
296313 >>> print(sorted(Counter(y_resampled).items()))
297314 [(0, 64), (1, 220), (2, 4601)]
298315
299- In the example below, it can be seen that the three algorithms have similar
300- impact by cleaning noisy samples next to the boundaries of the classes.
316+ :class:`AllKNN` stops cleaning when the maximum number of neighbours to examine, which
317+ is set by the user through the parameter ``n_neighbors``, is reached, or when the
318+ majority class becomes the minority class.
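
The growing neighbourhood and the two stopping conditions can be sketched in plain Python. As before, this is only an illustration under stated assumptions and not the library code: ``enn_pass`` and ``all_knn`` are hypothetical helpers, only the minority class is assumed to be left untouched, a sample is kept only when all of its neighbours agree with its label, and the sketch assumes that a pass which would shrink a majority class below the minority class is discarded.

```python
from collections import Counter


def enn_pass(X, y, k, protected):
    """One edited-nearest-neighbours pass with k neighbours.

    Samples of the protected class are always kept; any other sample is
    kept only if all of its k nearest neighbours share its label.
    """
    keep = []
    for i in range(len(X)):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
            for j in range(len(X))
            if j != i
        )
        neighbours = [y[j] for _, j in dists[:k]]
        if y[i] == protected or all(lab == y[i] for lab in neighbours):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def all_knn(X, y, n_neighbors=3):
    """Apply enn_pass with k = 1, 2, ..., n_neighbors."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    n_minority = counts[minority]
    for k in range(1, n_neighbors + 1):  # stop once n_neighbors is reached
        X_new, y_new = enn_pass(X, y, k, minority)
        new_counts = Counter(y_new)
        if any(
            new_counts.get(label, 0) < n_minority
            for label in counts
            if label != minority
        ):  # stop if a majority class would become the minority
            break
        X, y = X_new, y_new
    return X, y


# Toy 1-D data: two "maj" samples (20.29, 20.31) sit inside the "min" cluster.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (20.29,), (20.31,),
     (20.0,), (20.2,), (20.4,), (20.6,)]
y = ["maj"] * 8 + ["min"] * 4
X_res, y_res = all_knn(X, y)
print(sorted(Counter(y_res).items()))  # [('maj', 6), ('min', 4)]
```

With a single neighbour the two stray majority samples protect each other, since each is the other's nearest neighbour. They are only removed once the neighbourhood grows to two, which is the behaviour that distinguishes this scheme from repeating the same fixed-size pass.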
319+
320+ In the example below, we see that :class:`EditedNearestNeighbours`,
321+ :class:`RepeatedEditedNearestNeighbours` and :class:`AllKNN` have a similar impact when
322+ cleaning "noisy" samples at the boundaries between classes.
301323
302324.. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_comparison_under_sampling_004.png
303325 :target: ./auto_examples/under-sampling/plot_comparison_under_sampling.html