[SPARK-6128][Streaming][Documentation] Updates to Spark Streaming Programming Guide #4956

tdas · 2015-03-09T23:00:21Z

Updates to the documentation are as follows:

- Added information on Kafka Direct API and Kafka Python API
- Added joins to the main streaming guide
- Improved details on the fault-tolerance semantics

Generated docs located here
http://people.apache.org/~tdas/spark-1.3.0-temp-docs/streaming-programming-guide.html#fault-tolerance-semantics

More things to add:

- Configuration for Kafka receive rate
- Maybe add concurrentJobs

tdas · 2015-03-09T23:33:34Z

@JoshRosen

JoshRosen · 2015-03-09T23:34:28Z

docs/streaming-kafka-integration.md

"loose" -> "lose".

"zero-data" probably shouldn't be hyphenated. There's an extra space before the period at the end of this sentence, too.

Typo: "Ssee".

SparkQA · 2015-03-10T00:25:46Z

Test build #28411 has finished for PR 4956 at commit 86c4c2a.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.


SparkQA · 2015-03-10T04:14:26Z

Test build #28417 has finished for PR 4956 at commit 04167a6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-03-10T04:28:24Z

Test build #28418 has finished for PR 4956 at commit 380cf8d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-03-11T19:49:20Z

Test build #28477 has finished for PR 4956 at commit debe484.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Row(word: String)
- class JavaSQLContextSingleton
- public class JavaRow implements java.io.Serializable
- You can also easily use machine learning algorithms provided by [MLlib](mllib-guide.html). First of all, there are streaming machine learning algorithms (e.g. (Streaming Linear Regression](mllib-linear-methods.html#streaming-linear-regression), [Streaming KMeans](file:///Users/tdas/Projects/Spark/spark/docs/_site/mllib-clustering.html#streaming-k-means), etc.) which can simultaneously learn from the streaming data as well as apply the model on the streaming data. Beyond these, for a much larger class of machine learning algorithms, you can learn a learning model offline (i.e. using historical data) and then apply the model online on streaming data. See the [MLlib](mllib-guide.html) guide for more details.

JoshRosen · 2015-03-11T22:22:51Z

docs/streaming-programming-guide.md

"and then queried it using" -> drop the 'it'

JoshRosen · 2015-03-11T22:25:01Z

docs/streaming-programming-guide.md

The file:// link here should be updated. Also, it looks like the link to Streaming Linear Regression starts with an opening paren rather than a square bracket, causing it to be misformatted in Markdown.

tdas · 2015-03-12T01:47:32Z

Thank you so much @JoshRosen. I am merging this to unblock the release!

…gramming Guide

Updates to the documentation are as follows:
- Added information on Kafka Direct API and Kafka Python API
- Added joins to the main streaming guide
- Improved details on the fault-tolerance semantics

Generated docs located here
http://people.apache.org/~tdas/spark-1.3.0-temp-docs/streaming-programming-guide.html#fault-tolerance-semantics

More things to add:
- Configuration for Kafka receive rate
- May be add concurrentJobs

Author: Tathagata Das <[email protected]>

Closes #4956 from tdas/streaming-guide-update-1.3 and squashes the following commits:

819408c [Tathagata Das] Minor fixes.
debe484 [Tathagata Das] Added DataFrames and MLlib
380cf8d [Tathagata Das] Fix link
04167a6 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-guide-update-1.3
0b77486 [Tathagata Das] Updates based on Josh's comments.
86c4c2a [Tathagata Das] Updated streaming guides
82de92a [Tathagata Das] Add Kafka to Python api docs

(cherry picked from commit cd3b68d)
Signed-off-by: Tathagata Das <[email protected]>

SparkQA · 2015-03-12T03:03:30Z

Test build #28490 has finished for PR 4956 at commit 819408c.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Row(word: String)
- class JavaSQLContextSingleton
- public class JavaRow implements java.io.Serializable
- You can also easily use machine learning algorithms provided by [MLlib](mllib-guide.html). First of all, there are streaming machine learning algorithms (e.g. (Streaming Linear Regression](mllib-linear-methods.html#streaming-linear-regression), [Streaming KMeans](mllib-clustering.html#streaming-k-means), etc.) which can simultaneously learn from the streaming data as well as apply the model on the streaming data. Beyond these, for a much larger class of machine learning algorithms, you can learn a learning model offline (i.e. using historical data) and then apply the model online on streaming data. See the [MLlib](mllib-guide.html) guide for more details.

tdas added 2 commits March 2, 2015 17:07

Add Kafka to Python api docs

82de92a

Updated streaming guides

86c4c2a

tdas changed the title ~~[SPARK-6128][Streaming][Documentation] Streaming guide update 1.3~~ [SPARK-6128][Streaming][Documentation] Updates to Spark Streaming Programming Guide Mar 9, 2015

JoshRosen reviewed Mar 9, 2015

tdas added 3 commits March 9, 2015 19:42

Updates based on Josh's comments.

0b77486

Merge remote-tracking branch 'apache-github/master' into streaming-guide-update-1.3

04167a6

Fix link

380cf8d

Added DataFrames and MLlib

debe484

JoshRosen reviewed Mar 11, 2015

JoshRosen reviewed Mar 11, 2015

Minor fixes.

819408c

asfgit closed this in cd3b68d Mar 12, 2015
