
Conversation

@AnthonyTruchet

This is a backport of #14270. Because the spark.internal.config system
does not exist in branch 1.6, a simpler substitution scheme for `${}`
references in the `spark.metrics.namespace` value, relying only on the Spark
configuration, had to be added to preserve the behaviour discussed and
tested in the tickets.

This backport is contributed by Criteo SA under the Apache v2 licence.
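The simple substitution scheme mentioned above can be sketched roughly as follows. This is a minimal, hypothetical Python illustration of resolving `${key}` references against a plain configuration map; the actual branch-1.6 change is Scala code inside Spark and may differ in details:

```python
import re

def substitute(value, conf):
    """Replace each ${key} reference in `value` with the corresponding
    entry from the `conf` dictionary, leaving unknown keys untouched."""
    def lookup(match):
        key = match.group(1)
        # Fall back to the original ${key} text when the key is unset.
        return conf.get(key, match.group(0))
    return re.sub(r"\$\{([^}]+)\}", lookup, value)

conf = {"spark.app.name": "MyJob", "spark.app.id": "app-20160909-0001"}
print(substitute("${spark.app.name}", conf))  # MyJob
print(substitute("${unset.key}", conf))       # ${unset.key}
```

Leaving unresolved references untouched (rather than failing) is one possible design choice for such a scheme; the sketch only shows the general shape of the substitution.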

What changes were proposed in this pull request?

Adding a new property to SparkConf called `spark.metrics.namespace` that allows users to
set a custom namespace for driver and executor metrics in the metrics system.

By default, the root namespace used for driver and executor metrics is
the value of `spark.app.id`. However, users often want to track these metrics
across applications, which is hard to do with the application ID
(`spark.app.id`) since it changes with every invocation of the app. For such use cases,
users can set the `spark.metrics.namespace` property either to a fixed value
or to a reference to another Spark configuration key, such as `${spark.app.name}`,
whose resolved value is then used as the root namespace of the metrics system
(the app name, in this example). Metrics other than driver and executor metrics are
never prefixed with `spark.app.id`, nor does the `spark.metrics.namespace` property
have any effect on them.

How was this patch tested?

Added new unit tests, modified existing unit tests.

@HyukjinKwon
Member

HyukjinKwon commented Sep 9, 2016

Please ignore the AppVeyor test here (please refer to #15022 (comment)). It seems this happens only in other branches, as they don't have appveyor.yml and try to refer to the configuration on the web (although I have to test this) (tested - before/after). It'd be great if someone authorized set the branch to test to the master branch only, as described in https://github.com/apache/spark/blob/master/dev/appveyor-guide.md#specifying-the-branch-for-building-and-setting-the-build-schedule

@vanzin
Contributor

vanzin commented Sep 9, 2016

We don't generally backport features to maintenance releases. Pinging @marmbrus who I believe is currently doing RM for the 1.6 line.

@AnthonyTruchet
Author

I'm aware that features are not generally back-ported. The point is that, for us, this is a bug preventing a deployment in production. We therefore back-ported the fix internally and now propose to share it with the community, as the work is already done.

@marmbrus
Contributor

Thanks for spending the time to backport this, but it does seem a little risky to include changes to the configuration system in a maintenance release. As such, I'd probably err on the side of caution and close this PR unless there are a lot of 1.6 users clamoring for this functionality.

@AnthonyTruchet
Author

This is perfectly understandable, just as we err on the side of caution by not switching to the 2.0 branch for prod just now :-)

@SparkQA

SparkQA commented Sep 13, 2016

Test build #3262 has finished for PR 15023 at commit c1c57fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

Thanks for understanding! I do hope you guys upgrade eventually, there's a lot of good stuff and 2.0.1 should be out in the near future. Please do report any issues you see :)

@asfgit asfgit closed this in 5c5396c Sep 23, 2016
@Willymontaz Willymontaz deleted the backport-SPARK-5847 branch April 2, 2019 15:06
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close some stale PRs and ones suggested to be closed by committer(s)

Closes apache#12415
Closes apache#14765
Closes apache#15118
Closes apache#15184
Closes apache#15183
Closes apache#9440
Closes apache#15023
Closes apache#14643
Closes apache#14827

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#15198 from HyukjinKwon/stale-prs.
