[SPARK-23529][K8s] Support mounting hostPath volumes #21095
Conversation
@liyinan926 @foxish Please take a look

jenkins, ok to test

@madanadit, thanks for following up with the PR and for your design doc. I had a few comments on the doc for further discussion, but in general, I think this is a good direction.

Kubernetes integration test starting

Test build #89529 has finished for PR 21095 at commit

Kubernetes integration test status success

Thanks @foxish for your feedback. As a first-time contributor to Spark, I would like to limit the scope of the changes in this PR. Let me know when you're ready to review again.
My general comment is the code is centered around parsing and setting up

@liyinan926 Thanks for the comment. I refactored it to address your concern. The code is simple enough to be easily refactored, if need be, by whoever implements the next volume type. Note that the goal of this PR is limited to hostPath volumes. Even though the implementation here is general enough to be extended, I will not attempt to add other volume types in this PR.
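For context, a rough sketch of what such prefix-based parsing could look like is below. The property layout, the KubernetesVolumeSpec shape, and the parseVolumesWithPrefix helper are illustrative assumptions rather than the exact code in this PR; only the mountPath and optionsSpec names come from the diff under review.

// Illustrative sketch only. Assumes properties of the form
//   <prefix><volumeName>.mount.path and <prefix><volumeName>.options.path,
// where prefix includes its trailing dot.
case class KubernetesVolumeSpec(
    mountPath: Option[String],
    optionsSpec: Map[String, String])

def parseVolumesWithPrefix(
    properties: Map[String, String],
    prefix: String): Map[String, KubernetesVolumeSpec] = {
  properties
    .filterKeys(_.startsWith(prefix))
    .map { case (key, value) => (key.stripPrefix(prefix), value) }
    .groupBy { case (key, _) => key.split('.').head }  // group by volume name
    .map { case (volumeName, props) =>
      val subProps = props.map { case (k, v) => (k.stripPrefix(s"$volumeName."), v) }
      volumeName -> KubernetesVolumeSpec(
        mountPath = subProps.get("mount.path"),
        optionsSpec = subProps.collect {
          case (k, v) if k.startsWith("options.") => (k.stripPrefix("options."), v)
        })
    }
}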
Test build #89601 has finished for PR 21095 at commit

Kubernetes integration test starting

Kubernetes integration test status success

Thanks @madanadit! I will take a look over the weekend or early next week.

Hey @liyinan926, do you think you'll have time to take a look this week?
@madanadit, sorry about the long turnaround time here. I made a cursory pass over it, and it looks good. Before getting deeper into the code, I think we need more exhaustive unit testing for non-hostPath type volumes, and a couple of e2e tests. Once those are done, I'm also happy to do a more detailed review here.

@madanadit sorry I didn't get a chance to look into this. I will take a detailed look once @foxish's comments on testing are addressed.

Hi @foxish, I don't see why the two testing concerns should block reviewing this PR.

The testing is not a blocker for the review. When I said "tests for non-hostPath type volumes", I meant to say that we want to cover more than just hostPath mounts with the initial PR, because we might end up with something too specific. Sorry if that wasn't clear from my comment. I think doing
* @param prefix the prefix for volume configuration
* @return a tuple of (pod with the volume(s) added, container with mount(s) added)
*/
def addVolumes(
The Scaladoc should not mention hostPath, as this function is not specific to hostPath.
}
// Populate spec
volumes.foreach {
  case (name, spec) =>
The case line should be merged into the previous line according to the Spark code convention, e.g., volumes.foreach { case (name, spec) =>.
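For illustration, the style the reviewer is asking for looks roughly like this (identifiers taken from the quoted diff, body elided):

// Preferred: open the case clause on the same line as foreach.
volumes.foreach { case (name, spec) =>
  // ... build the volume and mount for this (name, spec) ...
}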
volumes.foreach {
  case (name, spec) =>
    properties.foreach {
      k =>
Ditto.
val podBuilder = new PodBuilder(pod).editOrNewSpec()
val containerBuilder = new ContainerBuilder(container)
volumes foreach {
  case (name, spec) =>
Ditto.
if (spec.optionsSpec.contains(KUBERNETES_VOLUMES_PATH_KEY)) {
  hostPath = Some(spec.optionsSpec(KUBERNETES_VOLUMES_PATH_KEY))
}
if (hostPath.isDefined && spec.mountPath.isDefined) {
Looks like this if block can be combined with the previous if block, e.g., if (spec.mountPath.isDefined && spec.optionsSpec.contains(KUBERNETES_VOLUMES_PATH_KEY)) {...}.
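A hedged sketch of the combined form the reviewer suggests, reusing the identifiers from the quoted diff (the body is elided):

// Sketch: a single condition instead of two consecutive if blocks.
if (spec.mountPath.isDefined && spec.optionsSpec.contains(KUBERNETES_VOLUMES_PATH_KEY)) {
  val hostPath = spec.optionsSpec(KUBERNETES_VOLUMES_PATH_KEY)
  // ... add the hostPath volume named `name` and mount it at spec.mountPath.get ...
}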
.withHostPath(new HostPathVolumeSource(hostPath.get))
.withName(name)
.build())
val volumeBuilder = new VolumeMountBuilder()
s/volumeBuilder/mountBuilder.
* @param prefix the given property name prefix
* @return a Map storing with volume name as key and spec as value
*/
def parseHostPathVolumesWithPrefix(
Looks like you don't really need this function, as it's just a wrapper of parseVolumesWithPrefix.
.build()
SparkPod(driverPod, driverContainer)

val (driverPodWithVolumes, driverContainerVolumes) =
Instead of putting the logic of volume mounting in BasicDriverFeatureStep and BasicExecutorFeatureStep, we should add a new step for mounting volumes, similar to how we handle secrets, e.g., a MountVolumesFeatureStep that contains the logic of addVolumes. This feature step can be used for both the driver and executors.
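To make the suggestion concrete, a minimal sketch of such a step is below, modeled on how secret mounting is done as a feature step. The trait methods and the SparkPod shape follow the feature-step refactor, but treat all names as assumptions rather than the code that was ultimately merged; in particular, addVolumes is shown here taking already-parsed volumes instead of the property prefix used in the diff above.

import io.fabric8.kubernetes.api.model.HasMetadata

// Sketch: one step that mounts the configured volumes into both driver and executor pods.
private[spark] class MountVolumesFeatureStep(
    volumes: Map[String, KubernetesVolumeSpec])  // parsed from the Spark conf elsewhere
  extends KubernetesFeatureConfigStep {

  override def configurePod(pod: SparkPod): SparkPod = {
    // addVolumes returns (pod with the volume(s) added, container with the mount(s) added),
    // matching the Scaladoc in the quoted diff.
    val (podWithVolumes, containerWithMounts) = addVolumes(pod.pod, pod.container, volumes)
    SparkPod(podWithVolumes, containerWithMounts)
  }

  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty

  override def getAdditionalKubernetesResources(): Seq[HasMetadata] = Seq.empty
}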
@madanadit @liyinan926 @foxish I addressed the comments here and added PVC support in #21260. However, I'm unsure if this is the right way to go; please check the PR.
@liyinan926 that's the idea. We should have a single step to mount all kinds of volumes including hostPath and emptyDir. The only problem is the configuration awkwardness.

Thanks @liyinan926 for the review and @andrusha for addressing the comments and moving the work along. I'll close this PR once #21260 merges.

Can one of the admins verify this patch?
This PR continues #21095 and intersects with #21238. I've added volume mounts as a separate step and added PersistentVolumeClaim support.

There is a fundamental problem with how we pass the options through Spark conf to fabric8: for each volume type and all possible volume options, we would have to implement custom code to map config values to fabric8 calls. This will result in a big body of code we would have to support and means that Spark will always be somewhat out of sync with k8s. I think there needs to be a discussion on how to proceed correctly (e.g., use PodPreset instead).

Due to the complications of provisioning and managing actual resources, this PR addresses only volume mounting of already present resources.

- [x] emptyDir support
- [x] Testing
- [x] Documentation
- [x] KubernetesVolumeUtils tests

Author: Andrew Korzhuev <[email protected]>
Author: madanadit <[email protected]>

Closes #21260 from andrusha/k8s-vol.
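To illustrate the mapping problem described above: every volume type needs its own branch translating conf options into the corresponding fabric8 builder calls, roughly like the sketch below. The option keys ("path", "medium", "claimName") and the shape of the function are assumptions for illustration, not the merged implementation.

import io.fabric8.kubernetes.api.model.{Volume, VolumeBuilder}

// Sketch: per-volume-type translation from conf options to fabric8 builder calls.
def buildVolume(name: String, volumeType: String, options: Map[String, String]): Volume = {
  val builder = new VolumeBuilder().withName(name)
  volumeType match {
    case "hostPath" =>
      builder.withNewHostPath().withPath(options("path")).endHostPath().build()
    case "emptyDir" =>
      builder.withNewEmptyDir().withMedium(options.getOrElse("medium", "")).endEmptyDir().build()
    case "persistentVolumeClaim" =>
      builder.withNewPersistentVolumeClaim()
        .withClaimName(options("claimName"))
        .endPersistentVolumeClaim()
        .build()
    case other =>
      throw new IllegalArgumentException(s"Unsupported volume type: $other")
  }
}

Every new volume type or option Kubernetes adds means another branch here, which is the maintenance burden being pointed at.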
@madanadit can you close this as #21260 has been merged?
What changes were proposed in this pull request?
This PR introduces a new config, spark.kubernetes.driver/executor.volumes, taking values of the format documented here. The use case is to enable short-circuit writes to distributed storage on k8s. The Alluxio File System uses domain sockets to enable short-circuit writes from the client to worker memory when co-located on the same host machine. A directory, let's say /tmp/domain on the host, is mounted on the Alluxio worker container as well as the Alluxio client (= Spark executor) container. The worker creates a domain socket /tmp/domain/d, and if the client container mounts the same directory, it can write directly to the Alluxio worker without passing through the network stack. The end result is faster data access when data is local.
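For the Alluxio use case above, the executor-side mount would look roughly like the following. This is written with the per-volume property style that the follow-up PR #21260 eventually merged, since the exact value format proposed in this PR is only in the linked doc; the volume name alluxio-domain is arbitrary.

import org.apache.spark.SparkConf

// Sketch: mount the host directory /tmp/domain (shared with the Alluxio worker)
// into the executor container so the client can reach the worker's domain socket.
val conf = new SparkConf()
  .set("spark.kubernetes.executor.volumes.hostPath.alluxio-domain.mount.path", "/tmp/domain")
  .set("spark.kubernetes.executor.volumes.hostPath.alluxio-domain.mount.readOnly", "false")
  .set("spark.kubernetes.executor.volumes.hostPath.alluxio-domain.options.path", "/tmp/domain")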
How was this patch tested?
Manual testing on a k8s v1.8 cluster. Unit tests added to Driver/ExecutorPodFactorySuite.

This PR replaces #21032.