Backport [SPARK-5847] Allow for configuring MetricsSystem's prefix #15023

Conversation
…use of app ID to namespace all metrics

This is a backport of apache#14270. Because the spark.internal.config system does not exist in branch 1.6, a simpler substitution scheme for ${} in the spark.metrics.namespace value, using only the Spark configuration, had to be added to preserve the behaviour discussed in the tickets and tested. This backport is contributed by Criteo SA under the Apache v2 licence.

What changes were proposed in this pull request?

This PR adds a new property to SparkConf called `spark.metrics.namespace` that allows users to set a custom namespace for executor and driver metrics in the metrics system. By default, the root namespace used for driver or executor metrics is the value of `spark.app.id`. However, users often want to track driver and executor metrics across applications, which is hard to do with the application ID (i.e. `spark.app.id`) since it changes with every invocation of the app. For such use cases, users can set the `spark.metrics.namespace` property to any fixed value, or to a reference to another Spark configuration key such as `${spark.app.name}`, which is then resolved to populate the root namespace of the metrics system (the app name, in this example). In other words, `spark.metrics.namespace` can reference any arbitrary Spark property key, whose value is then used as the root namespace of the metrics system. Metrics that do not belong to the driver or executors are never prefixed with `spark.app.id`, nor does the `spark.metrics.namespace` property have any effect on such metrics.

How was this patch tested?

Added new unit tests; modified existing unit tests.
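The `${...}` substitution scheme described above can be illustrated with a short standalone sketch. This is not the actual Spark implementation (which is in Scala inside MetricsSystem); the function name `resolve_namespace` and the use of a plain dict for the configuration are illustrative assumptions. It only shows the behaviour the PR describes: fall back to `spark.app.id` when no namespace is set, and resolve `${key}` references against the Spark configuration.

```python
import re

# Matches a ${key} reference inside the spark.metrics.namespace value.
_REF = re.compile(r"\$\{([^}]+)\}")

def resolve_namespace(conf: dict) -> str:
    """Sketch of the substitution scheme described in the PR (hypothetical
    helper, not Spark's actual API). `conf` stands in for the Spark
    configuration as a plain key -> value dict."""
    # Default behaviour: use spark.app.id as the root namespace.
    raw = conf.get("spark.metrics.namespace", conf.get("spark.app.id", ""))
    # Replace each ${key} with that key's value from the configuration,
    # leaving the reference untouched if the key is absent.
    return _REF.sub(lambda m: conf.get(m.group(1), m.group(0)), raw)

conf = {
    "spark.app.id": "app-20160912-0001",
    "spark.app.name": "my-job",
    "spark.metrics.namespace": "${spark.app.name}",
}
print(resolve_namespace(conf))  # -> my-job
```

With `spark.metrics.namespace` unset, the same call would return `app-20160912-0001`, reproducing the default per-application prefix that the PR lets users override.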
Please ignore the AppVeyor test here (please refer to #15022 (comment)); it seems it happens only in other branches as they don't have
We don't generally backport features to maintenance releases. Pinging @marmbrus, who I believe is currently doing RM for the 1.6 line.
I'm aware that features are not generally backported. The point is, for us this is a bug preventing a deployment in production. We thus backported the fix internally and now propose to share it with the community, as the work is already done.
Thanks for spending the time to backport this, but it does seem a little risky to include changes to the configuration system in a maintenance release. As such, I'd probably err on the side of caution and close this PR unless there are a lot of 1.6 users clamoring for this functionality.
This is perfectly understandable, just in the same way that we err on the side of caution by not switching to the 2.0 branch for prod just now :-)
Test build #3262 has finished for PR 15023 at commit
Thanks for understanding! I do hope you guys upgrade eventually; there's a lot of good stuff, and 2.0.1 should be out in the near future. Please do report any issues you see :)
## What changes were proposed in this pull request?

This PR proposes to close some stale PRs and ones suggested to be closed by committer(s):

Closes apache#12415
Closes apache#14765
Closes apache#15118
Closes apache#15184
Closes apache#15183
Closes apache#9440
Closes apache#15023
Closes apache#14643
Closes apache#14827

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#15198 from HyukjinKwon/stale-prs.