This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Copy added Spark files to the working directory of the driver and executors #292

@mccheah

Description

Spark's contract is that files sent to a Spark job via spark.files are placed in the working directory of the driver and executor processes. The first version of the driver fulfilled this: it took the files sent to the submission server and wrote them into the working directory before starting the driver JVM. Submission v2 does not do this.
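For concreteness, here is a minimal submission that relies on this contract (the application class, jar, and file path are illustrative, not taken from this issue):

```shell
# Ship local-file.txt to the cluster via spark.files; the contract says the
# driver and each executor can then open it by its bare relative name,
# because it lands in each process's working directory.
spark-submit \
  --conf spark.files=/local/path/local-file.txt \
  --class com.example.MyApp \
  my-app.jar
# Inside the job, opening "local-file.txt" (a relative path) is expected to work.
```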

There are a few ways to accomplish this in the new architecture. The tricky part is that the init-container should, strictly speaking, be unaware of the driver's working directory. For example, users who change the working directory of the driver with custom Docker images will inadvertently give the driver a different working directory from the init-container's. The chosen design should therefore be resilient to this.
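One design that satisfies this constraint can be sketched as follows (a sketch only, not the approach this issue settled on; the directory layout and helper name are assumptions): the init-container downloads files into a fixed, well-known shared volume path, and the driver's own entrypoint, which does know its working directory, copies or links the files from that path at startup. The init-container never needs to know where the driver process will actually run.

```python
import os
import shutil
import tempfile

def link_staged_files(shared_dir, workdir):
    """Hypothetical driver-entrypoint step: pull every file the
    init-container staged at a fixed path into the current working
    directory, wherever the custom image happened to place it."""
    staged = []
    for name in os.listdir(shared_dir):
        src = os.path.join(shared_dir, name)
        dst = os.path.join(workdir, name)
        shutil.copy(src, dst)  # a symlink would also satisfy the contract
        staged.append(dst)
    return staged

# Simulated demonstration with temporary directories standing in for the
# shared volume and the driver's working directory:
shared = tempfile.mkdtemp()
work = tempfile.mkdtemp()
with open(os.path.join(shared, "local-file.txt"), "w") as f:
    f.write("payload")

link_staged_files(shared, work)
os.chdir(work)
with open("local-file.txt") as f:  # bare relative name, per the contract
    content = f.read()
```

Because the copy happens inside the driver's own entrypoint, a user who changes the image's working directory gets the files staged into the new location automatically.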
