@sarutak
Member

@sarutak sarutak commented Oct 9, 2014

@tgravescs reported this issue.

The following is quoted from @tgravescs's report.

YarnRMClientImpl.registerApplicationMaster can throw a NullPointerException when setting the tracking URL if it's empty:

appMasterRequest.setTrackingUrl(new URI(uiAddress).getAuthority())

I hit this just by starting spark-shell without the tracking URL set.

14/09/23 16:18:34 INFO yarn.YarnRMClientImpl: Connecting to ResourceManager at kryptonitered-jt1.red.ygrid.yahoo.com/98.139.154.99:8030
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.yarn.proto.YarnServiceProtos$RegisterApplicationMasterRequestProto$Builder.setTrackingUrl(YarnServiceProtos.java:710)
at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.RegisterApplicationMasterRequestPBImpl.setTrackingUrl(RegisterApplicationMasterRequestPBImpl.java:132)
at org.apache.spark.deploy.yarn.YarnRMClientImpl.registerApplicationMaster(YarnRMClientImpl.scala:102)
at org.apache.spark.deploy.yarn.YarnRMClientImpl.register(YarnRMClientImpl.scala:55)
at org.apache.spark.deploy.yarn.YarnRMClientImpl.register(YarnRMClientImpl.scala:38)
at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:168)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:206)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:120)

@sarutak
Member Author

sarutak commented Oct 9, 2014

CC @tgravescs

@SparkQA

SparkQA commented Oct 9, 2014

QA tests have started for PR 2728 at commit 8b5a96e.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 9, 2014

QA tests have finished for PR 2728 at commit 8b5a96e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

@sarutak thanks for working on this. Did you by chance test this to see whether the tracking URL is really not being set in the RM UI for all the modes (including spark-shell)? If it's not, there might be a bug somewhere else, as that used to work.

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have started for PR 2728 at commit 592b5d7.

  • This patch merges cleanly.

@sarutak
Member Author

sarutak commented Oct 10, 2014

@tgravescs I didn't fully understand the root cause at first, but I have now tracked it down.
In YARN cluster mode, uiAddress starts with a scheme, but in YARN client mode uiAddress is only the authority, with no scheme, so new URI(uiAddress).getAuthority returns null, right?

I applied this patch and ran both client and cluster mode, including Spark Shell.
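
The behaviour described above can be reproduced with java.net.URI alone. This is a minimal sketch, not code from the patch: the object name `UriAuthorityDemo`, the helper `authorityOf`, and the host `driver.example.com` are made up for illustration.

```scala
import java.net.URI

object UriAuthorityDemo {
  // Hypothetical helper: returns the authority component of the parsed address, or null if it has none.
  def authorityOf(address: String): String = new URI(address).getAuthority

  def main(args: Array[String]): Unit = {
    // With a scheme and "//", the URI is hierarchical and has an authority.
    println(authorityOf("http://driver.example.com:4040")) // prints driver.example.com:4040

    // Without a scheme, java.net.URI reads "driver.example.com" as the scheme
    // and "4040" as an opaque scheme-specific part, so there is no authority.
    println(authorityOf("driver.example.com:4040")) // prints null
  }
}
```

Passing that null on to setTrackingUrl is what would produce the NullPointerException in the stack trace quoted at the top of this thread.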

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have finished for PR 2728 at commit 592b5d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 12, 2014

QA tests have started for PR 2728 at commit 2d43c64.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 12, 2014

QA tests have finished for PR 2728 at commit 2d43c64.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Member Author

sarutak commented Oct 12, 2014

retest this please.

@SparkQA

SparkQA commented Oct 12, 2014

QA tests have started for PR 2728 at commit 2d43c64.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 12, 2014

Tests timed out for PR 2728 at commit 2d43c64 after a configured wait of 120m.

@sarutak
Member Author

sarutak commented Oct 12, 2014

retest this please.

@SparkQA

SparkQA commented Oct 12, 2014

QA tests have started for PR 2728 at commit 2d43c64.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 12, 2014

QA tests have finished for PR 2728 at commit 2d43c64.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Member Author

sarutak commented Oct 17, 2014

test this please.

@tgravescs
Contributor

So one issue is that the scheme was added to properly handle the case where YARN is using https (SPARK-3286). If client mode isn't passing the scheme, then that is probably broken. If it were passing the scheme, you wouldn't hit this issue. I think changing the YarnClientSchedulerBackend.start routine, where it sets spark.driver.appUIAddress, would be the equivalent change. And then we would need to test.

With the above change it would have the scheme included and wouldn't hit the null. If we want to add the check anyway, to handle the case where it is null in case something else comes up, that's fine, but I'm not really fond of pattern matching here. How about just checking URI.getScheme, and if it is null we pass the address in as is, otherwise we call getAuthority()?
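
A defensive check along these lines could look like the sketch below. It is an illustration only, not the change that was merged: `TrackingUrlGuard` and `trackingUrlFor` are hypothetical names. Note one deliberate deviation from the suggestion above: for a bare `host:port` string, java.net.URI treats the host as the scheme, so getScheme is not null there. The sketch therefore tests getAuthority for null instead and falls back to the address as given.

```scala
import java.net.URI

object TrackingUrlGuard {
  // Hypothetical helper, not part of Spark: use the authority when the URI
  // has one, otherwise pass the address through unchanged.
  def trackingUrlFor(uiAddress: String): String =
    Option(new URI(uiAddress).getAuthority).getOrElse(uiAddress)

  def main(args: Array[String]): Unit = {
    // Address with a scheme: the authority is extracted.
    println(trackingUrlFor("http://driver.example.com:4040"))
    // Address without a scheme: getAuthority is null, so the input is kept.
    println(trackingUrlFor("driver.example.com:4040"))
  }
}
```

Both calls yield `driver.example.com:4040`, so the value handed to setTrackingUrl is never null for these inputs. An address that java.net.URI cannot parse at all would still throw URISyntaxException, which this sketch does not handle.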

@tgravescs
Contributor

@sarutak could you submit a different PR for this so Jenkins picks it up? Then close this one.

@sarutak
Member Author

sarutak commented Oct 28, 2014

@tgravescs I opened a new PR, #2981, and Jenkins picked it up there. Thanks!

@sarutak sarutak closed this Oct 28, 2014
@sarutak sarutak deleted the SPARK-3657 branch April 11, 2015 05:23