[SPARK-1537] [YARN] Add history provider for YARN Application Timeline Server #10545
Test build #48568 has finished for PR 10545 at commit
I should add that I'm thinking of moving the core …
21e0c76 to 8d781db
Test build #48767 has finished for PR 10545 at commit
8d781db to 690b686
Test build #49241 has finished for PR 10545 at commit
690b686 to 83560db
Test build #50201 has finished for PR 10545 at commit
…ter (the one with the service API merged in)
…ew SPARK-11315 publisher branch
…request pushed. This is for more reliable polling for changes during integration with YARN-7889
…d track attempt versions. This is for more reliable polling for changes during integration with YARN-7889
…f improvement in test running in the process. Tests can register "failureActions" for execution on a test failure; closures to dump the state of things & so have better diagnostics
…s this. In production even 10s is probably too short, so it doesn't make things much worse
…ATS URL (as info level wasn't giving any details on whether/when entities were published, or under what); downgrade event drop to info & not warning
83560db to b6f3b99
Test build #52326 has finished for PR 10545 at commit
This is the successor to PR #5423; it incorporates SPARK-11315 (PR #8744), which was split out for easier review.
It adds a history provider which uses the YARN timeline server for histories, reading the events published in the application by way of the #8744 publisher. It's very efficient at getting attempt summary data, as that is server
It also contains preparatory support for history server metrics (SPARK-11373 / #9571), i.e. it collects metrics but does not publish them, and the cache updating of incomplete work of SPARK-7889 / #6935: the #8744 publisher includes an incrementing counter, which the history server uses to determine updates to histories.
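To illustrate the incrementing-counter idea, here is a minimal Python sketch, not the PR's Scala code. It assumes the provider polls a per-attempt counter value and compares it with the last value it cached; the class name `AttemptVersionCache`, the `observe` method, and the `(app_id, attempt_id)` key shape are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical key shape: (application id, attempt id).
AttemptKey = Tuple[str, str]


@dataclass
class AttemptVersionCache:
    """Remembers the highest publisher counter value seen for each attempt."""

    _seen: Dict[AttemptKey, int] = field(default_factory=dict)

    def observe(self, app_id: str, attempt_id: str, version: int) -> bool:
        """Record a polled counter value.

        Returns True when the attempt is new or its counter has advanced,
        i.e. when any cached history for it should be treated as stale.
        """
        key = (app_id, attempt_id)
        previous = self._seen.get(key)
        if previous is None or version > previous:
            self._seen[key] = version
            return True
        return False


cache = AttemptVersionCache()
first = cache.observe("application_0001", "1", 3)      # never seen before
unchanged = cache.observe("application_0001", "1", 3)  # same counter value
advanced = cache.observe("application_0001", "1", 4)   # counter moved on
```

Comparing a monotonically increasing counter avoids relying on timestamps, which can be misleading when clocks differ between the application and the history server.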
In comparison to the FS history provider, bootstrap is fast, as there is no need to replay histories to extract that metadata. It does place load on the timeline server, hence the various options to configure how often it probes for updates, including disabling background refreshes until users actually reload pages. Because the YARN ATS service has different failure modes from HDFS, there is some extra startup checking of service availability, with failure information collected and reported, as well as noted in metrics. (More succinctly, the FS history provider assumes HDFS doesn't fail.)
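The trade-off between background refreshes and refresh-on-page-load can be sketched as a small policy object. This is an illustrative Python sketch only: the `RefreshPolicy` class, its field names, and the minimum on-demand interval are assumptions, not the PR's actual configuration options.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefreshPolicy:
    """Decides when to probe the timeline server for updates (hypothetical)."""

    # None means background refreshes are disabled entirely.
    background_interval_s: Optional[float]
    # Assumed lower bound between probes triggered by page loads.
    min_on_demand_interval_s: float
    last_probe_s: float = float("-inf")

    def should_probe(self, now_s: float, page_requested: bool) -> bool:
        """Return True, and record the probe time, if a probe is due now."""
        elapsed = now_s - self.last_probe_s
        if page_requested:
            due = elapsed >= self.min_on_demand_interval_s
        else:
            due = (
                self.background_interval_s is not None
                and elapsed >= self.background_interval_s
            )
        if due:
            self.last_probe_s = now_s
        return due


# Background refresh disabled: only page loads trigger probes.
policy = RefreshPolicy(background_interval_s=None, min_on_demand_interval_s=10.0)
background_tick = policy.should_probe(now_s=0.0, page_requested=False)
first_page_load = policy.should_probe(now_s=0.0, page_requested=True)
quick_reload = policy.should_probe(now_s=5.0, page_requested=True)
later_reload = policy.should_probe(now_s=12.0, page_requested=True)
```

With background refreshes disabled, the timeline server sees no traffic from an idle history server, at the cost of the first page load after a quiet period paying for the probe.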
The new history server provider is added in yarn/src/history, along with its various tests. The code is only included in compilation, tests and scalastyle checks on Hadoop 2.6+, so it does not cause any compatibility issues when Spark is built against earlier Hadoop versions.