# [SPARK-31636][SQL][DOCS] Remove HTML syntax in SQL reference (#28451)
````diff
@@ -41,10 +41,10 @@ This means that in case an operation causes overflows, the result is the same wi
 On the other hand, Spark SQL returns null for decimal overflows.
 When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric and interval arithmetic operations, it throws an arithmetic exception at runtime.

-{% highlight sql %}
+```sql
 -- `spark.sql.ansi.enabled=true`
````
> **Contributor:** @huaxingao I know that it's not related to the format change that you're doing in this PR, but shouldn't we have a `SET` statement here, so users can cut and paste the command into their shell to see the behavior? Perhaps we discussed it in the PR that added this clause. Just a question :-)
>
> **Author:** I don't have a strong opinion on this. Keeping it as a comment seems OK to me.
````diff
 SELECT 2147483647 + 1;
 java.lang.ArithmeticException: integer overflow

 -- `spark.sql.ansi.enabled=false`
 SELECT 2147483647 + 1;
@@ -53,7 +53,7 @@ SELECT 2147483647 + 1;
 +----------------+
 |     -2147483648|
 +----------------+
-{% endhighlight %}
+```
````
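The two overflow behaviors shown in the diff above can be reproduced on a plain JVM, without Spark. This is an illustrative sketch (the class and method names are made up for the example): Java's `+` operator wraps on `int` overflow, just like the non-ANSI result, while `Math.addExact` throws the same `java.lang.ArithmeticException: integer overflow` that ANSI mode surfaces.

```java
public class OverflowDemo {
    // Non-ANSI style: plain int addition wraps around on overflow,
    // which is why SELECT 2147483647 + 1 yields -2147483648.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // ANSI style: Math.addExact throws java.lang.ArithmeticException
    // with the message "integer overflow" on int overflow.
    static int ansiStyleAdd(int a, int b) {
        return Math.addExact(a, b);
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(2147483647, 1)); // -2147483648
        try {
            ansiStyleAdd(2147483647, 1);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // integer overflow
        }
    }
}
```

Note that the exception text in the Spark docs is the JVM's own message, which is why it appears verbatim in the SQL reference.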
### Type Conversion

````diff
@@ -64,15 +64,15 @@ On the other hand, `INSERT INTO` syntax throws an analysis exception when the AN
 Currently, the ANSI mode affects explicit casting and assignment casting only.
 In future releases, the behaviour of type coercion might change along with the other two type conversion rules.

-{% highlight sql %}
+```sql
````
> **Member:** Seems OK; is there any behavior difference?
````diff
 -- Examples of explicit casting

 -- `spark.sql.ansi.enabled=true`
 SELECT CAST('a' AS INT);
 java.lang.NumberFormatException: invalid input syntax for type numeric: a

 SELECT CAST(2147483648L AS INT);
 java.lang.ArithmeticException: Casting 2147483648 to int causes overflow

 -- `spark.sql.ansi.enabled=false` (This is a default behaviour)
 SELECT CAST('a' AS INT);
````
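The casting contrast in the hunk above can likewise be sketched in plain Java (hypothetical names, not Spark code): with ANSI mode on, an unparseable string fails with `java.lang.NumberFormatException`, while the default behavior returns NULL instead of failing.

```java
public class CastDemo {
    // ANSI-on style: an unparseable string raises NumberFormatException,
    // as in SELECT CAST('a' AS INT) with spark.sql.ansi.enabled=true.
    static int ansiCast(String s) {
        return Integer.parseInt(s); // throws NumberFormatException for "a"
    }

    // ANSI-off style (Spark's default): the cast yields NULL on bad input,
    // modeled here as a nullable Integer.
    static Integer legacyCast(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return null; // CAST('a' AS INT) -> NULL
        }
    }
}
```

Valid numeric strings behave identically in both modes; only error handling differs.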
````diff
@@ -94,8 +94,8 @@ CREATE TABLE t (v INT);
 -- `spark.sql.storeAssignmentPolicy=ANSI`
 INSERT INTO t VALUES ('1');
 org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`default`.`t`':
 - Cannot safely cast 'v': StringType to IntegerType;

 -- `spark.sql.storeAssignmentPolicy=LEGACY` (This is a legacy behaviour until Spark 2.x)
 INSERT INTO t VALUES ('1');
````
````diff
@@ -105,7 +105,7 @@ SELECT * FROM t;
 +---+
 |  1|
 +---+
-{% endhighlight %}
+```
````

### SQL Functions
---
````diff
@@ -27,46 +27,35 @@ User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act
 A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.

-* IN - The input type for the aggregation.
-* BUF - The type of the intermediate value of the reduction.
-* OUT - The type of the final output result.
+***IN*** - The input type for the aggregation.
+
+***BUF*** - The type of the intermediate value of the reduction.
+
+***OUT*** - The type of the final output result.

-<dl>
-  <dt><code><em>bufferEncoder: Encoder[BUF]</em></code></dt>
-  <dd>
-    Specifies the Encoder for the intermediate value type.
-  </dd>
-</dl>
+* **bufferEncoder: Encoder[BUF]**
+
+    Specifies the Encoder for the intermediate value type.

-<dl>
-  <dt><code><em>finish(reduction: BUF): OUT</em></code></dt>
-  <dd>
-    Transform the output of the reduction.
-  </dd>
-</dl>
+* **finish(reduction: BUF): OUT**
+
+    Transform the output of the reduction.

-<dl>
-  <dt><code><em>merge(b1: BUF, b2: BUF): BUF</em></code></dt>
-  <dd>
-    Merge two intermediate values.
-  </dd>
-</dl>
+* **merge(b1: BUF, b2: BUF): BUF**
+
+    Merge two intermediate values.

-<dl>
-  <dt><code><em>outputEncoder: Encoder[OUT]</em></code></dt>
-  <dd>
-    Specifies the Encoder for the final output value type.
-  </dd>
-</dl>
+* **outputEncoder: Encoder[OUT]**
+
+    Specifies the Encoder for the final output value type.

-<dl>
-  <dt><code><em>reduce(b: BUF, a: IN): BUF</em></code></dt>
-  <dd>
-    Aggregate input value <code>a</code> into current intermediate value. For performance, the function may modify <code>b</code> and return it instead of constructing new object for <code>b</code>.
-  </dd>
-</dl>
+* **reduce(b: BUF, a: IN): BUF**
+
+    Aggregate input value `a` into current intermediate value. For performance, the function may modify `b` and return it instead of constructing new object for `b`.

-<dl>
-  <dt><code><em>zero: BUF</em></code></dt>
-  <dd>
-    The initial value of the intermediate result for this aggregation.
-  </dd>
-</dl>
+* **zero: BUF**
+
+    The initial value of the intermediate result for this aggregation.
````
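The `zero`/`reduce`/`merge`/`finish` contract described in the diff above can be sketched outside Spark as a plain Java class. All names here are hypothetical; the real base class, `org.apache.spark.sql.expressions.Aggregator`, additionally requires `bufferEncoder` and `outputEncoder` so Spark can serialize the intermediate and output values.

```java
import java.util.List;

// A minimal, Spark-free sketch of the Aggregator contract: an average
// where IN = long, BUF = (sum, count), OUT = double.
public class AverageSketch {
    // BUF: the intermediate value of the reduction.
    static final class Buf {
        long sum;
        long count;
        Buf(long sum, long count) { this.sum = sum; this.count = count; }
    }

    // zero: the initial value of the intermediate result.
    static Buf zero() { return new Buf(0L, 0L); }

    // reduce(b, a): fold one input value into the buffer; may mutate
    // and return b for performance, as the docs note.
    static Buf reduce(Buf b, long a) { b.sum += a; b.count += 1; return b; }

    // merge(b1, b2): combine two partial buffers, e.g. from two partitions.
    static Buf merge(Buf b1, Buf b2) { b1.sum += b2.sum; b1.count += b2.count; return b1; }

    // finish: transform the final buffer into the output value.
    static double finish(Buf r) { return (double) r.sum / r.count; }

    // Drive the contract over a list, the way a group's rows would be fed in.
    static double average(List<Long> values) {
        Buf b = zero();
        for (long v : values) b = reduce(b, v);
        return finish(b);
    }
}
```

Fed the salaries from the docs example (3000, 4500, 3500, 4000), `average` returns 3750.0, matching the `myAverage` output shown below.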
### Examples
````diff
@@ -95,16 +84,16 @@ For example, a user-defined average for untyped DataFrames can look like:
 {% include_example untyped_custom_aggregation java/org/apache/spark/examples/sql/JavaUserDefinedUntypedAggregation.java%}
 </div>
 <div data-lang="SQL" markdown="1">
-{% highlight sql %}
+```sql
 -- Compile and place UDAF MyAverage in a JAR file called `MyAverage.jar` in /tmp.
 CREATE FUNCTION myAverage AS 'MyAverage' USING JAR '/tmp/MyAverage.jar';

 SHOW USER FUNCTIONS;
--- +------------------+
--- |          function|
--- +------------------+
--- | default.myAverage|
--- +------------------+
++------------------+
+|          function|
++------------------+
+| default.myAverage|
++------------------+

 CREATE TEMPORARY VIEW employees
 USING org.apache.spark.sql.json
````
````diff
@@ -113,26 +102,26 @@ OPTIONS (
 );

 SELECT * FROM employees;
--- +-------+------+
--- |   name|salary|
--- +-------+------+
--- |Michael|  3000|
--- |   Andy|  4500|
--- | Justin|  3500|
--- |  Berta|  4000|
--- +-------+------+
++-------+------+
+|   name|salary|
++-------+------+
+|Michael|  3000|
+|   Andy|  4500|
+| Justin|  3500|
+|  Berta|  4000|
++-------+------+
````
````diff
 SELECT myAverage(salary) as average_salary FROM employees;
--- +--------------+
--- |average_salary|
--- +--------------+
--- |        3750.0|
--- +--------------+
-{% endhighlight %}
++--------------+
+|average_salary|
++--------------+
+|        3750.0|
++--------------+
+```
 </div>
````
> **Member:** Can't we avoid this tag, too?
>
> **Author:** I will take a look at this.
>
> **Author:** This tag is for the examples.
````diff
 </div>

 ### Related Statements

 * [Scalar User Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html)
 * [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)
````
> **Reviewer:** Why do we need the changes in this file?
> **Author:** I didn't change the order of the first 8 clauses; I think those should stay grouped together. But I changed the rest to be in alphabetical order.