Commit dcc0902

huaxingao authored and srowen committed
[SPARK-29458][SQL][DOCS] Add a paragraph for scalar function in sql getting started
### What changes were proposed in this pull request?
Add a paragraph for scalar function in sql getting started

### Why are the changes needed?
To make 3.0 doc complete.

### Does this PR introduce any user-facing change?
before:
<img width="870" alt="Screen Shot 2020-04-21 at 10 11 12 PM" src="https://user-images.githubusercontent.com/13592258/79943182-16d1fd00-841d-11ea-9744-9cdd58d83f81.png">

after:
<img width="865" alt="Screen Shot 2020-04-22 at 11 49 59 PM" src="https://user-images.githubusercontent.com/13592258/80068256-26704500-84f4-11ea-9845-c835927c027e.png">
<img width="1033" alt="Screen Shot 2020-04-23 at 6 22 53 PM" src="https://user-images.githubusercontent.com/13592258/80165100-82d47280-858f-11ea-8c84-1ef702cc1bff.png">

### How was this patch tested?

Closes #28290 from huaxingao/scalar.

Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
1 parent 54996be commit dcc0902

2 files changed

+10
-10
lines changed

docs/sql-getting-started.md

Lines changed: 5 additions & 8 deletions
@@ -347,16 +347,13 @@ For example:
 </div>
 
 ## Scalar Functions
-(to be filled soon)
 
-## Aggregations
+Scalar functions are functions that return a single value per row, as opposed to aggregation functions, which return a value for a group of rows. Spark SQL supports a variety of [Built-in Scalar Functions](sql-ref-functions.html#scalar-functions). It also supports [User Defined Scalar Functions](sql-ref-functions-udf-scalar.html).
 
-The [built-in DataFrames functions](api/scala/org/apache/spark/sql/functions$.html) provide common
-aggregations such as `count()`, `countDistinct()`, `avg()`, `max()`, `min()`, etc.
-While those functions are designed for DataFrames, Spark SQL also has type-safe versions for some of them in
-[Scala](api/scala/org/apache/spark/sql/expressions/scalalang/typed$.html) and
-[Java](api/java/org/apache/spark/sql/expressions/javalang/typed.html) to work with strongly typed Datasets.
-Moreover, users are not limited to the predefined aggregate functions and can create their own. For more details
+## Aggregate Functions
+
+Aggregate functions are functions that return a single value on a group of rows. The [Built-in Aggregation Functions](sql-ref-functions-builtin.html#aggregate-functions) provide common aggregations such as `count()`, `countDistinct()`, `avg()`, `max()`, `min()`, etc.
+Users are not limited to the predefined aggregate functions and can create their own. For more details
 about user defined aggregate functions, please refer to the documentation of
 [User Defined Aggregate Functions](sql-ref-functions-udf-aggregate.html).
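
The scalar/aggregate distinction introduced by this hunk can be illustrated with a small sketch. It is not part of the commit and does not use Spark: it assumes Python's built-in `sqlite3` module as a stand-in SQL engine so that it runs without a Spark installation. The `people` table and its rows are hypothetical; `upper`, `length`, `count`, `avg` and `max` are used only because functions with these names exist in both engines.

```python
import sqlite3

# Hypothetical sample data, not taken from the Spark documentation.
rows = [("alice", "eng", 100), ("bob", "eng", 80), ("carol", "ops", 90)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, dept TEXT, score INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# Scalar functions: one output value for every input row.
scalar_result = conn.execute(
    "SELECT name, upper(name), length(name) FROM people ORDER BY name"
).fetchall()

# Aggregate functions: one output value for every group of rows.
aggregate_result = conn.execute(
    "SELECT dept, count(*), avg(score), max(score) "
    "FROM people GROUP BY dept ORDER BY dept"
).fetchall()

print(scalar_result)     # three rows in, three rows out
print(aggregate_result)  # three rows in, one row per department out
```

The scalar query returns as many rows as the table holds, while the aggregate query collapses each department to a single row.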

docs/sql-ref-functions.md

Lines changed: 5 additions & 2 deletions
@@ -27,13 +27,16 @@ Built-in functions are commonly used routines that Spark SQL predefines and a co
 Spark SQL has some categories of frequently-used built-in functions for aggregtion, arrays/maps, date/timestamp, and JSON data.
 This subsection presents the usages and descriptions of these functions.
 
-* [Aggregate Functions](sql-ref-functions-builtin.html#aggregate-functions)
-* [Window Functions](sql-ref-functions-builtin.html#window-functions)
+#### Scalar Functions
 * [Array Functions](sql-ref-functions-builtin.html#array-functions)
 * [Map Functions](sql-ref-functions-builtin.html#map-functions)
 * [Date and Timestamp Functions](sql-ref-functions-builtin.html#date-and-timestamp-functions)
 * [JSON Functions](sql-ref-functions-builtin.html#json-functions)
 
+#### Aggregate-like Functions
+* [Aggregate Functions](sql-ref-functions-builtin.html#aggregate-functions)
+* [Window Functions](sql-ref-functions-builtin.html#window-functions)
+
 ### UDFs (User-Defined Functions)
 
 User-Defined Functions (UDFs) are a feature of Spark SQL that allows users to define their own functions when the system's built-in functions are not enough to perform the desired task. To use UDFs in Spark SQL, users must first define the function, then register the function with Spark, and finally call the registered function. The User-Defined Functions can act on a single row or act on multiple rows at once. Spark SQL also supports integration of existing Hive implementations of UDFs, UDAFs and UDTFs.
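
The define, register, call sequence described in the UDF paragraph can be sketched as follows. This is an analogy, not Spark code and not part of the commit: it assumes Python's built-in `sqlite3` module, whose `create_function` and `create_aggregate` methods play the role of the registration step (in PySpark that step is typically `spark.udf.register`). The names `add_one`, `Spread`, `spread` and the table `t` are hypothetical.

```python
import sqlite3

# Step 1 (define): an ordinary Python function that acts on a single row.
def add_one(x):
    return x + 1

# Step 1 (define): a class that acts on multiple rows at once.
# sqlite3 calls step() once per row and finalize() once per group.
class Spread:
    def __init__(self):
        self.lo = None
        self.hi = None

    def step(self, value):
        self.lo = value if self.lo is None else min(self.lo, value)
        self.hi = value if self.hi is None else max(self.hi, value)

    def finalize(self):
        return None if self.lo is None else self.hi - self.lo

conn = sqlite3.connect(":memory:")

# Step 2 (register): bind each definition to a SQL-visible name and arity.
conn.create_function("add_one", 1, add_one)
conn.create_aggregate("spread", 1, Spread)

conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (4,), (9,)])

# Step 3 (call): use the registered names from SQL.
per_row = [r[0] for r in conn.execute("SELECT add_one(v) FROM t ORDER BY v")]
per_group = conn.execute("SELECT spread(v) FROM t").fetchone()[0]

print(per_row)    # one result per input row
print(per_group)  # one result for the whole group
```

The row-wise function mirrors a scalar UDF and the class mirrors a user-defined aggregate; the registration API differs between engines, but the three-step workflow is the same.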
