Commit b2970d9

dongjinleekr authored and srowen committed
[MINOR][DOCS] Fix spacings in Structured Streaming Programming Guide
## What changes were proposed in this pull request?

1. Omitted space between the sentences: `... on static data.The Spark SQL engine will ...` -> `... on static data. The Spark SQL engine will ...`
2. Omitted colon in Output Model section.

## How was this patch tested?

None.

Author: Lee Dongjin <[email protected]>

Closes #17564 from dongjinleekr/feature/fix-programming-guide.

(cherry picked from commit b938438)
Signed-off-by: Sean Owen <[email protected]>
1 parent 46e212d commit b2970d9

1 file changed: +2 -2 lines changed


docs/structured-streaming-programming-guide.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ title: Structured Streaming Programming Guide
 {:toc}
 
 # Overview
-Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data.The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the [Dataset/DataFrame API](sql-programming-guide.html) in Scala, Java or Python to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write Ahead Logs. In short, *Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.*
+Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data. The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the [Dataset/DataFrame API](sql-programming-guide.html) in Scala, Java or Python to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write Ahead Logs. In short, *Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.*
 
 **Structured Streaming is still ALPHA in Spark 2.1** and the APIs are still experimental. In this guide, we are going to walk you through the programming model and the APIs. First, let's start with a simple example - a streaming word count.
 
@@ -368,7 +368,7 @@ A query on the input will generate the "Result Table". Every trigger interval (s
 
 ![Model](img/structured-streaming-model.png)
 
-The "Output" is defined as what gets written out to the external storage. The output can be defined in different modes
+The "Output" is defined as what gets written out to the external storage. The output can be defined in a different mode:
 
 - *Complete Mode* - The entire updated Result Table will be written to the external storage. It is up to the storage connector to decide how to handle writing of the entire table.
 