19 changes: 11 additions & 8 deletions R/pkg/R/DataFrame.R
@@ -2313,9 +2313,9 @@ setMethod("dropDuplicates",
#' @param joinExpr (Optional) The expression used to perform the join. joinExpr must be a
#' Column expression. If joinExpr is omitted, the default, inner join is attempted and an error is
#' thrown if it would be a Cartesian Product. For Cartesian join, use crossJoin instead.
#' @param joinType The type of join to perform. The following join types are available:
#' 'inner', 'outer', 'full', 'fullouter', 'leftouter', 'left_outer', 'left',
#' 'right_outer', 'rightouter', 'right', and 'leftsemi'. The default joinType is "inner".
#' @param joinType The type of join to perform, default 'inner'.
#' Must be one of: 'inner', 'cross', 'outer', 'full', 'full_outer',
#' 'left', 'left_outer', 'right', 'right_outer', 'left_semi', or 'left_anti'.
#' @return A SparkDataFrame containing the result of the join operation.
#' @family SparkDataFrame functions
#' @aliases join,SparkDataFrame,SparkDataFrame-method
@@ -2344,15 +2344,18 @@ setMethod("join",
if (is.null(joinType)) {
sdf <- callJMethod(x@sdf, "join", y@sdf, joinExpr@jc)
} else {
if (joinType %in% c("inner", "outer", "full", "fullouter",
"leftouter", "left_outer", "left",
"rightouter", "right_outer", "right", "leftsemi")) {
if (joinType %in% c("inner", "cross",
"outer", "full", "fullouter", "full_outer",
"left", "leftouter", "left_outer",
"right", "rightouter", "right_outer",
"left_semi", "leftsemi", "left_anti", "leftanti")) {
joinType <- gsub("_", "", joinType)
sdf <- callJMethod(x@sdf, "join", y@sdf, joinExpr@jc, joinType)
} else {
stop("joinType must be one of the following types: ",
"'inner', 'outer', 'full', 'fullouter', 'leftouter', 'left_outer', 'left',
'rightouter', 'right_outer', 'right', 'leftsemi'")
"'inner', 'cross', 'outer', 'full', 'full_outer',",
"'left', 'left_outer', 'right', 'right_outer',",
"'left_semi', or 'left_anti'.")
}
}
}
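As a rough illustration of the R branch above: user-facing names such as 'left_outer' are collapsed by dropping underscores (the gsub("_", "", joinType) call) before being handed to the JVM. The following is a minimal, standalone Scala sketch of that normalization; the object name and the accepted set are assumptions made for this sketch, not part of the patch.

// Sketch only: mirrors the R-side normalization, not Spark's internal parser.
object JoinTypeNormalization {
  // The user-facing names documented in the patch (listed here for illustration).
  val accepted = Set(
    "inner", "cross",
    "outer", "full", "fullouter", "full_outer",
    "left", "leftouter", "left_outer",
    "right", "rightouter", "right_outer",
    "left_semi", "leftsemi", "left_anti", "leftanti")

  def normalize(joinType: String): String = {
    require(accepted.contains(joinType),
      s"joinType must be one of: ${accepted.mkString(", ")}")
    joinType.replace("_", "")          // e.g. "left_outer" -> "leftouter"
  }

  def main(args: Array[String]): Unit = {
    println(normalize("left_outer"))   // leftouter
    println(normalize("left_anti"))    // leftanti
  }
}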
5 changes: 3 additions & 2 deletions python/pyspark/sql/dataframe.py
@@ -730,8 +730,9 @@ def join(self, other, on=None, how=None):
a join expression (Column), or a list of Columns.
If `on` is a string or a list of strings indicating the name of the join column(s),
the column(s) must exist on both sides, and this performs an equi-join.
:param how: str, default 'inner'.
One of `inner`, `outer`, `left_outer`, `right_outer`, `leftsemi`.
:param how: str, default ``inner``. Must be one of: ``inner``, ``cross``, ``outer``,
``full``, ``full_outer``, ``left``, ``left_outer``, ``right``, ``right_outer``,
``left_semi``, and ``left_anti``.

The following performs a full outer join between ``df1`` and ``df2``.

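The two values newly documented for ``how``, ``left_semi`` and ``left_anti``, differ only in which left-side rows survive: semi keeps rows with a match, anti keeps rows without one, and in both cases only left-side columns appear in the output. A small hedged sketch follows (in Scala, using the public join(right, usingColumns, joinType) overload); the DataFrames, column names, and values are invented for illustration.

import org.apache.spark.sql.SparkSession

object SemiAntiJoinExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("semi-anti-join-sketch")
      .getOrCreate()
    import spark.implicits._

    val customers = Seq((1, "alice"), (2, "bob"), (3, "carol")).toDF("id", "name")
    val orders    = Seq((1, 9.99), (3, 24.50)).toDF("id", "amount")

    // left_semi: customers that have at least one order; only left-side columns survive.
    customers.join(orders, Seq("id"), "left_semi").show()

    // left_anti: customers with no matching order.
    customers.join(orders, Seq("id"), "left_anti").show()

    spark.stop()
  }
}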
16 changes: 12 additions & 4 deletions sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -750,14 +750,18 @@ class Dataset[T] private[sql](
}

/**
* Equi-join with another `DataFrame` using the given columns.
* Equi-join with another `DataFrame` using the given columns. A cross join with a predicate
* is specified as an inner join. If you would explicitly like to perform a cross join, use the
* `crossJoin` method.
*
* Different from other join functions, the join columns will only appear once in the output,
* i.e. similar to SQL's `JOIN USING` syntax.
*
* @param right Right side of the join operation.
* @param usingColumns Names of the columns to join on. These columns must exist on both sides.
* @param joinType One of: `inner`, `outer`, `left_outer`, `right_outer`, `leftsemi`.
* @param joinType Type of join to perform. Default `inner`. Must be one of:
* `inner`, `cross`, `outer`, `full`, `full_outer`, `left`, `left_outer`,
* `right`, `right_outer`, `left_semi`, `left_anti`.
*
* @note If you perform a self-join using this function without aliasing the input
* `DataFrame`s, you will NOT be able to reference any columns after the join, since
@@ -812,7 +816,9 @@ class Dataset[T] private[sql](
*
* @param right Right side of the join.
* @param joinExprs Join expression.
* @param joinType One of: `inner`, `outer`, `left_outer`, `right_outer`, `leftsemi`.
* @param joinType Type of join to perform. Default `inner`. Must be one of:
* `inner`, `cross`, `outer`, `full`, `full_outer`, `left`, `left_outer`,
* `right`, `right_outer`, `left_semi`, `left_anti`.
*
* @group untypedrel
* @since 2.0.0
@@ -889,7 +895,9 @@ class Dataset[T] private[sql](
*
* @param other Right side of the join.
* @param condition Join expression.
* @param joinType One of: `inner`, `outer`, `left_outer`, `right_outer`, `leftsemi`.
* @param joinType Type of join to perform. Default `inner`. Must be one of:
* `inner`, `cross`, `outer`, `full`, `full_outer`, `left`, `left_outer`,
* `right`, `right_outer`, `left_semi`, `left_anti`.
*
* @group typedrel
* @since 1.6.0
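To make the new Scaladoc note concrete: a join that carries a predicate is planned as an inner join, while an explicit Cartesian product goes through `crossJoin`. The sketch below contrasts the two; the session setup, data, and column names are assumptions for illustration only, not part of this patch.

import org.apache.spark.sql.SparkSession

object CrossVsInnerExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cross-vs-inner-sketch")
      .getOrCreate()
    import spark.implicits._

    val left  = Seq((1, "a"), (2, "b")).toDF("id", "l")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "r")

    // Explicit Cartesian product: 2 x 2 = 4 rows.
    left.crossJoin(right).show()

    // Inner join with a predicate: only rows with matching ids are kept.
    left.join(right, left("id") === right("id"), "inner").show()

    spark.stop()
  }
}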