This repository was archived by the owner on May 4, 2019. It is now read-only.
Sort NAs to last position for PooledDataArrays as well #106
As @garborg noticed, since #102, sorting of NAs in PDAs is inconsistent with DataArrays. This fixes PDAs to sort NAs to the end.
I put a `nalast` parameter on `groupsort_indexer` to control whether NAs are sorted first or last, as currently the grouping functions and tests in DataFrames seem to expect NAs in the first position. This should be revisited, but I need to go through that code to understand how it works (or maybe @powerdistribution can say what's best?).

I also moved the sorting tests from DataFrames here, and I now run the same sorting tests on PooledDataArrays that I run on DataArrays. Sorting PooledDataArrays in reverse is broken, but that was a pre-existing bug.
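
To illustrate the idea behind the `nalast` switch, here is a minimal sketch in Python (not the Julia code from this PR). It assumes a pooled array stores one integer reference per element, with `0` standing for NA and `1..ngroups` pointing into the pool; the function name and parameter are borrowed from the description above, but the signature, return value, and 0-based indices are assumptions made for the example.

```python
def groupsort_indexer(refs, ngroups, nalast=False):
    """Counting sort over pool references (illustrative sketch).

    refs    -- list of ints; 0 marks NA, 1..ngroups are pool references
    ngroups -- number of non-NA groups in the pool
    nalast  -- if True, place the NA bucket after all other groups
    Returns a list of 0-based positions that orders refs by group.
    """
    # counts[k] = number of elements whose reference is k (k == 0 is NA).
    counts = [0] * (ngroups + 1)
    for r in refs:
        counts[r] += 1

    # Decide where the NA bucket goes relative to the real groups.
    if nalast:
        order = list(range(1, ngroups + 1)) + [0]
    else:
        order = list(range(ngroups + 1))

    # starts[k] = first output slot reserved for bucket k.
    starts = [0] * (ngroups + 1)
    pos = 0
    for k in order:
        starts[k] = pos
        pos += counts[k]

    # Fill each bucket in input order, which keeps the sort stable.
    indexer = [0] * len(refs)
    for i, r in enumerate(refs):
        indexer[starts[r]] = i
        starts[r] += 1
    return indexer


refs = [2, 0, 1, 2, 0, 1]
na_first = groupsort_indexer(refs, 2)               # NA positions come first
na_last = groupsort_indexer(refs, 2, nalast=True)   # NA positions come last
```

Under these assumptions, only the bucket order changes between the two modes; the counting pass and the fill pass are identical, which is why a single flag is enough to serve both the grouping code (NAs first) and sorting (NAs last).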