Fixed streaming examples docs to use run-example instead of spark-submit #722

tdas · 2014-05-10T10:33:02Z

Pretty self-explanatory

AmplabJenkins · 2014-05-10T10:37:58Z

Merged build triggered.

AmplabJenkins · 2014-05-10T10:38:06Z

Merged build started.

AmplabJenkins · 2014-05-10T10:39:24Z

Merged build finished.

AmplabJenkins · 2014-05-10T10:39:25Z

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14869/

AmplabJenkins · 2014-05-10T10:42:58Z

Merged build triggered.

AmplabJenkins · 2014-05-10T10:43:06Z

Merged build started.

AmplabJenkins · 2014-05-10T11:22:05Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-05-10T11:22:06Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14870/

pwendell · 2014-05-10T19:23:12Z

Looks good @tdas - are you still changing things or can I merge this?

andrewor14 · 2014-05-12T04:54:56Z

examples/src/main/scala/org/apache/spark/examples/streaming/KafkaWordCount.scala

this is outdated. KafkaWordCount no longer takes in <master>

AmplabJenkins · 2014-05-14T10:52:57Z

Merged build triggered.

AmplabJenkins · 2014-05-14T10:53:07Z

Merged build started.

AmplabJenkins · 2014-05-14T10:57:58Z

Merged build triggered.

AmplabJenkins · 2014-05-14T10:58:07Z

Merged build started.

Pretty self-explanatory

Author: Tathagata Das <[email protected]>

Closes #722 from tdas/example-fix and squashes the following commits:

7839979 [Tathagata Das] Minor changes.
0673441 [Tathagata Das] Fixed java docs of java streaming example
e687123 [Tathagata Das] Fixed scala style errors.
9b8d112 [Tathagata Das] Fixed streaming examples docs to use run-example instead of spark-submit.

AmplabJenkins · 2014-05-14T11:33:31Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-05-14T11:33:32Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14977/

AmplabJenkins · 2014-05-14T11:37:54Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-05-14T11:37:54Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14978/

pwendell · 2014-05-14T16:28:20Z

@tdas mind closing this? Didn't close properly for some reason.

tdas · 2014-05-14T16:30:21Z

Closing.

pwendell · 2014-05-28T07:09:24Z

@tdas mind closing this?

AmplabJenkins · 2014-07-02T03:35:44Z

Merged build triggered.

AmplabJenkins · 2014-07-02T03:35:52Z

Merged build started.

AmplabJenkins · 2014-07-02T04:18:51Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-07-02T04:18:51Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16291/

…dsToMetricType (apache#722)

### What changes were proposed in this pull request?

This PR aims to reduce the memory consumption of `LiveStageMetrics.accumIdsToMetricType`, which should help to reduce driver memory usage when running complex SQL queries that contain many operators and run many jobs.

In `SQLAppStatusListener`, the `LiveStageMetrics.accumIdsToMetricType` field holds a map which is used to look up the type of accumulators in order to perform conditional processing of a stage's metrics.

Currently, that field is derived from `LiveExecutionData.metrics`, which contains metrics for _all_ operators used anywhere in the query. Whenever a job is submitted, we construct a fresh map containing all metrics that have ever been registered for that SQL query. If a query runs a single job, this isn't an issue: in that case, all `LiveStageMetrics` instances will hold the same immutable `accumIdsToMetricType`.

The problem arises if we have a query that runs many jobs (e.g. a complex query with many joins which gets divided into many jobs due to AQE): in that case, each job submission results in a new `accumIdsToMetricType` map being created.

This PR fixes this by changing `accumIdsToMetricType` to be a mutable `mutable.HashMap` which is shared across all `LiveStageMetrics` instances belonging to the same `LiveExecutionData`.

The modified classes are `private` and are used only in `SQLAppStatusListener`, so I don't think this change poses any realistic risk of binary incompatibility to third party code.

### Why are the changes needed?

Addresses one contributing factor behind high driver memory / OOMs when executing complex queries.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit tests.

To demonstrate memory reduction, I performed manual benchmarking and heap dump inspection using a benchmark that ran copies of a complex query: each test query launches ~200 jobs (so at least 200 stages) and contains ~3800 total operators, resulting in a huge number of metric accumulators. Prior to this PR's fix, ~3700 `LiveStageMetrics` instances (from multiple concurrent runs of the query) consumed a combined ~3.3 GB of heap. After this PR's fix, I observed negligible memory usage from `LiveStageMetrics`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#43250 from JoshRosen/reduce-accum-ids-to-metric-type-mem-overhead.

Authored-by: Josh Rosen <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 2f6cca5)
Co-authored-by: Josh Rosen <[email protected]>

Fixed streaming examples docs to use run-example instead of spark-submit.

9b8d112

Fixed scala style errors.

e687123

andrewor14 reviewed May 12, 2014

Fixed java docs of java streaming example

0673441

Minor changes.

7839979

tdas closed this Jul 8, 2014

agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022

MapR [SPARK-791] Spark does not work on non-secure cluster (apache#722)

64d44cf

udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024

MapR [SPARK-791] Spark does not work on non-secure cluster (apache#722)

982c72f

mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025

MapR [SPARK-791] Spark does not work on non-secure cluster (apache#722)

4680e93

4 participants