Commit 1f6ded6

bllchmbrs authored and rxin committed
[SPARK-19127][DOCS] Update Rank Function Documentation
## What changes were proposed in this pull request?

- [X] Fix inconsistencies in function reference for dense rank and dense
- [X] Make all languages equivalent in their reference to `dense_rank` and `rank`.

## How was this patch tested?

N/A for docs.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: anabranch <[email protected]>

Closes #16505 from anabranch/SPARK-19127.
1 parent 4351e62 commit 1f6ded6

3 files changed: 26 additions & 16 deletions
R/pkg/R/functions.R

Lines changed: 6 additions & 4 deletions
@@ -3150,7 +3150,8 @@ setMethod("cume_dist",
 #' The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
 #' sequence when there are ties. That is, if you were ranking a competition using dense_rank
 #' and had three people tie for second place, you would say that all three were in second
-#' place and that the next person came in third.
+#' place and that the next person came in third. Rank would give me sequential numbers, making
+#' the person that came in third place (after the ties) would register as coming in fifth.
 #'
 #' This is equivalent to the \code{DENSE_RANK} function in SQL.
 #'
@@ -3321,10 +3322,11 @@ setMethod("percent_rank",
 #'
 #' Window function: returns the rank of rows within a window partition.
 #'
-#' The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-#' sequence when there are ties. That is, if you were ranking a competition using denseRank
+#' The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+#' sequence when there are ties. That is, if you were ranking a competition using dense_rank
 #' and had three people tie for second place, you would say that all three were in second
-#' place and that the next person came in third.
+#' place and that the next person came in third. Rank would give me sequential numbers, making
+#' the person that came in third place (after the ties) would register as coming in fifth.
 #'
 #' This is equivalent to the RANK function in SQL.
 #'

python/pyspark/sql/functions.py

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -157,17 +157,21 @@ def _():
 'dense_rank':
 """returns the rank of rows within a window partition, without any gaps.
 
-The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-sequence when there are ties. That is, if you were ranking a competition using denseRank
+The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+sequence when there are ties. That is, if you were ranking a competition using dense_rank
 and had three people tie for second place, you would say that all three were in second
-place and that the next person came in third.""",
+place and that the next person came in third. Rank would give me sequential numbers, making
+the person that came in third place (after the ties) would register as coming in fifth.
+
+This is equivalent to the DENSE_RANK function in SQL.""",
 'rank':
 """returns the rank of rows within a window partition.
 
-The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-sequence when there are ties. That is, if you were ranking a competition using denseRank
+The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+sequence when there are ties. That is, if you were ranking a competition using dense_rank
 and had three people tie for second place, you would say that all three were in second
-place and that the next person came in third.
+place and that the next person came in third. Rank would give me sequential numbers, making
+the person that came in third place (after the ties) would register as coming in fifth.
 
 This is equivalent to the RANK function in SQL.""",
 'cume_dist':

sql/core/src/main/scala/org/apache/spark/sql/functions.scala

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -785,10 +785,13 @@ object functions {
 /**
 * Window function: returns the rank of rows within a window partition, without any gaps.
 *
-* The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-* sequence when there are ties. That is, if you were ranking a competition using denseRank
+* The difference between rank and dense_rank is that denseRank leaves no gaps in ranking
+* sequence when there are ties. That is, if you were ranking a competition using dense_rank
 * and had three people tie for second place, you would say that all three were in second
-* place and that the next person came in third.
+* place and that the next person came in third. Rank would give me sequential numbers, making
+* the person that came in third place (after the ties) would register as coming in fifth.
+*
+* This is equivalent to the DENSE_RANK function in SQL.
 *
 * @group window_funcs
 * @since 1.6.0
@@ -929,10 +932,11 @@ object functions {
 /**
 * Window function: returns the rank of rows within a window partition.
 *
-* The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-* sequence when there are ties. That is, if you were ranking a competition using denseRank
+* The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+* sequence when there are ties. That is, if you were ranking a competition using dense_rank
 * and had three people tie for second place, you would say that all three were in second
-* place and that the next person came in third.
+* place and that the next person came in third. Rank would give me sequential numbers, making
+* the person that came in third place (after the ties) would register as coming in fifth.
 *
 * This is equivalent to the RANK function in SQL.
 *
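
The tie behaviour described in the updated docstrings can be checked with a small sketch. This is plain Python that mimics the semantics of `rank` and `dense_rank`. It is not Spark code and not part of this commit, and the function names and sample scores are made up for illustration.

```python
from typing import Dict, List


def rank(scores: List[int]) -> List[int]:
    """Ties share a rank; the next rank skips ahead by the number of tied rows."""
    ordered = sorted(scores, reverse=True)
    first_pos: Dict[int, int] = {}
    for pos, score in enumerate(ordered, start=1):
        # Remember only the first (best) 1-based position of each score.
        first_pos.setdefault(score, pos)
    return [first_pos[score] for score in scores]


def dense_rank(scores: List[int]) -> List[int]:
    """Ties share a rank; the next rank follows immediately, leaving no gaps."""
    distinct = sorted(set(scores), reverse=True)
    position = {score: i for i, score in enumerate(distinct, start=1)}
    return [position[score] for score in scores]


# Hypothetical competition: one winner, three people tied for second, one more.
scores = [95, 90, 90, 90, 80]
print(rank(scores))        # [1, 2, 2, 2, 5]
print(dense_rank(scores))  # [1, 2, 2, 2, 3]
```

With three people tied for second place, the last competitor is fifth under `rank` and third under `dense_rank`, which matches the wording added in this change.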
