Fail submission if submitter-local files are provided without resourc… #447
…e staging server URI
|
This is not necessarily the case with #437 |
|
Hi @sahilprasad, thanks for the contribution to this project! Just earlier today we merged a PR that supports sending "small files" from the submitter to drivers/executors via k8s secrets, for some definition of small. If the files aren't small enough and no RSS is specified, it throws this exception: https://github.com/apache-spark-on-k8s/spark/pull/437/files#diff-5fd183129559d8c0a34135c347be647bR40 For jars there's no small jar vs. large jar distinction, so any time there's a local jar, an RSS must be specified. Were you looking at this primarily because of files or because of jars? |
|
Yeah sorry, to clarify, this change is still valid for jars, but we need to be more careful with local files. |
|
@mccheah Got it. If I were to change this just to accommodate jars, is there anything else in the way of jar-to-RSS validation that would be good to have? |
|
Jars probably don't need any special case. I think local files are actually already handled by the small files bootstrap: we make a best effort to mount them as a secret, and when the files are too large we print an informative error message. |
|
Right, is this change applicable, if modified, to just jars then? |
|
I think so - should be able to write a unit and/or integration test that covers this. |
|
@mccheah I can't get this test to pass when I build locally: https://github.com/apache-spark-on-k8s/spark/blob/branch-2.2-kubernetes/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/submit/submitsteps/initcontainer/InitContainerConfigurationStepsOrchestratorSuite.scala#L72 Since the |
|
This indicates to me that the validation is being added at the incorrect location, because the init container configuration steps orchestrator should always be built with arguments that have already been validated. In other words, the orchestrator itself shouldn't be doing the validation, but some component above it. |
|
Either that or the test should be patched to make it such that the orchestrator is created with "compatible" arguments - that is, if we're giving the orchestrator local files, it should also be given a resource staging server URI. |
|
@mccheah Where would the validation take place if not One way that I see to patch the test in question would be to include only the first element of |
|
The orchestrator can do the validation, but we should then change the test such that the arguments provided to the orchestrator line up with what the expectations are. We should be testing that
|
|
@mccheah can you review? |
| .getOrElse(false) | ||
|
|
||
| OptionRequirements.requireSecondIfFirstIsDefined( | ||
| KubernetesFileUtils.getOnlySubmitterLocalFiles(sparkJars).nonEmpty match { |
Don't match on true or false here - use the functional APIs like map, filter, getOrElse, etc.
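A minimal sketch of what that could look like, assuming a simplified validator. `onlySubmitterLocalFiles` is a hypothetical stand-in for `KubernetesFileUtils.getOnlySubmitterLocalFiles`, and the error text is illustrative rather than the message used in this PR:

```scala
import java.net.URI

object LocalJarValidation {
  // Hypothetical stand-in for KubernetesFileUtils.getOnlySubmitterLocalFiles:
  // assumes a URI with no scheme or a "file" scheme lives on the submitter.
  def onlySubmitterLocalFiles(uris: Seq[String]): Seq[String] =
    uris.filter { uri =>
      Option(URI.create(uri).getScheme).getOrElse("file") == "file"
    }

  def validate(sparkJars: Seq[String], stagingServerUri: Option[String]): Unit = {
    // Lift the possibly-empty list into an Option instead of matching on a Boolean.
    val localJars: Option[Seq[String]] =
      Some(onlySubmitterLocalFiles(sparkJars)).filter(_.nonEmpty)

    // The requirement is only evaluated when local jars are present.
    localJars.foreach { jars =>
      require(
        stagingServerUri.isDefined,
        s"Submitter-local jars ${jars.mkString(",")} require a resource staging server URI.")
    }
  }
}
```

With this shape, `Some(...).filter(_.nonEmpty)` carries the "are there any local jars" decision, so no `match` on `true`/`false` is needed.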
|
|
||
| assert(sparkConf.get(RESOURCE_STAGING_SERVER_URI).isEmpty) | ||
|
|
||
| intercept[IllegalArgumentException] { |
Can we inspect the correctness of the error message as well? We wouldn't want an IllegalArgumentException to come from some other part of the constructor.
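One way to do that, sketched in plain Scala so it runs without the test framework. `validate` and `MissingRssMessage` are hypothetical names, not the ones in this PR:

```scala
import scala.util.{Failure, Try}

object ErrorMessageCheck {
  // Hypothetical message; the real text lives in the code under review.
  val MissingRssMessage: String =
    "Local jars were provided, but no resource staging server URI was found."

  // Hypothetical validation: local jars are only allowed when an RSS URI is set.
  def validate(localJars: Seq[String], stagingServerUri: Option[String]): Unit =
    require(localJars.isEmpty || stagingServerUri.isDefined, MissingRssMessage)

  // True only if the failure is an IllegalArgumentException carrying our message.
  def failsWithExpectedMessage(): Boolean =
    Try(validate(Seq("file:///opt/app.jar"), None)) match {
      case Failure(e: IllegalArgumentException) =>
        // require() prefixes the message with "requirement failed: ", so use contains.
        e.getMessage.contains(MissingRssMessage)
      case _ => false
    }
}
```

In the suite itself, ScalaTest's `intercept[IllegalArgumentException] { ... }` returns the caught exception, so the same `getMessage.contains(...)` assertion can be made on its result.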
|
Ok to merge once the build passes. |
|
@mccheah should I squash my commits, or is that done automatically? |
|
@mccheah this good to merge? |
| NAMESPACE, | ||
| APP_RESOURCE_PREFIX, | ||
| SPARK_JARS, | ||
| SPARK_JARS.take(1), |
this is a bit brittle in case SPARK_JARS changes in the future -- we should create a new SPARK_JARS_REMOTE that has only hdfs:// paths
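Roughly something like the following, where the jar paths are hypothetical placeholders and not the values currently in the suite:

```scala
object JarFixtures {
  // Hypothetical fixture values, for illustration only.
  val SPARK_JARS: Seq[String] = Seq(
    "hdfs://localhost:9000/app/jars/jar1.jar",
    "file:///app/jars/jar2.jar")

  // Remote-only fixture: safe to pass to the orchestrator without an RSS URI,
  // and unaffected if the ordering or contents of SPARK_JARS change later.
  val SPARK_JARS_REMOTE: Seq[String] = Seq(
    "hdfs://localhost:9000/app/jars/jar1.jar")
}
```

The test would then pass `SPARK_JARS_REMOTE` instead of relying on `SPARK_JARS.take(1)` happening to be a remote path.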
|
Good to merge though, and Matt +1'd several weeks ago |
apache-spark-on-k8s#447)
* Fail submission if submitter-local files are provided without resource staging server URI
* Modified logic to validate only submitted jars; added orchestrator tests
* Incorporated feedback
* Fix failing test case
Helpful validation to inhibit users from submitting local dependencies without specifying a URI for the RSS. Closes #339.
How was this patch tested?
Ran provided test suite and manually tested with local and non-local application dependency submissions.