This repository was archived by the owner on Jan 9, 2020. It is now read-only.

@ash211 ash211 commented Feb 8, 2017

Target state:

  • k8s releases are purely an Apache release plus the additional k8s features (no early release of things on branch-2.1 before they're in a 2.1.x release)
  • they are versioned like v2.1.0-kubernetes-0.1.0
    • first three numbers are the Apache release this is based on
    • last three numbers are the "kubernetes release" versioning
    • we aim to follow semver on the k8s version
  • day-to-day k8s work happens on branch-2.1-kubernetes
    • when we are ready for a new release, we cut versions like v2.X.Y-kubernetes-0.1.1 or v2.X.Y-kubernetes-0.2.0 depending on the magnitude of the release
    • at some point in the future we might even consider v2.X.Y-kubernetes-1.0.0
  • when a new Apache patch release is announced (e.g. v2.1.1) we merge the v2.1.1 tag into branch-2.1-kubernetes in a PR (with code review) and continue work on that branch
  • when a new Apache minor release is announced (e.g. 2.2.0) we create a new branch-2.2-kubernetes branch off the v2.2.0 tag (initially with no k8s changes) and cherry-pick (with git rebase) the k8s patchset onto the new branch
    • the "k8s patchset" is everything in branch-2.1-kubernetes not in branch-2.1
    • having both branch-2.2-kubernetes and branch-2.1-kubernetes branches allows us to support both Spark 2.1.x and 2.2.x if we choose (we're not yet committing to support multiple Spark versions)
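
As an illustration of the versioning scheme above, here is a minimal POSIX shell sketch that splits a release tag into its Apache and k8s parts and computes the next tag. The function names (parse_release_tag, bump_k8s_version, next_release_tag) are hypothetical and not part of this repository; the sketch assumes only the tag shape v<apache>-kubernetes-<k8s> described in the list.

```shell
#!/bin/sh
set -eu

# Split a tag such as v2.1.0-kubernetes-0.1.0 into "<apache> <k8s>".
# Hypothetical helper; fails if the tag does not match the scheme.
parse_release_tag() (
  parsed="$(printf '%s\n' "$1" | sed -n -E \
    's/^v([0-9]+\.[0-9]+\.[0-9]+)-kubernetes-([0-9]+\.[0-9]+\.[0-9]+)$/\1 \2/p')"
  if [ -z "$parsed" ]; then
    echo "not a k8s release tag: $1" >&2
    exit 1
  fi
  echo "$parsed"
)

# Bump the k8s part following semver: patch, minor, or major.
bump_k8s_version() (
  version="$1"; level="$2"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  case "$level" in
    patch) patch=$((patch + 1)) ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    major) major=$((major + 1)); minor=0; patch=0 ;;
    *) echo "unknown level: $level" >&2; exit 1 ;;
  esac
  echo "$major.$minor.$patch"
)

# Keep the Apache part unchanged and bump only the k8s part.
next_release_tag() (
  parsed="$(parse_release_tag "$1")" || exit 1
  apache="${parsed% *}"
  k8s="${parsed#* }"
  next="$(bump_k8s_version "$k8s" "$2")" || exit 1
  echo "v${apache}-kubernetes-${next}"
)

next_release_tag v2.1.0-kubernetes-0.1.0 patch   # v2.1.0-kubernetes-0.1.1
next_release_tag v2.1.0-kubernetes-0.1.0 minor   # v2.1.0-kubernetes-0.2.0
```

Rebasing onto a new Apache release changes only the first three numbers, so the two parts are kept separate here on purpose.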

This PR was created via:

  • make branch-2.1-kubernetes off of v2.1.0
    • git checkout v2.1.0
    • git checkout -b branch-2.1-kubernetes
  • create prep-for-alpha-release off the latest k8s-support-alternate-incremental
    • git checkout k8s-support-alternate-incremental
    • git checkout -b prep-for-alpha-release
  • move the commits that are on k8s-support-alternate-incremental but not on master over onto branch-2.1-kubernetes
    • git rebase --onto branch-2.1-kubernetes origin/master prep-for-alpha-release
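
To make the effect of the rebase step concrete, the following sketch replays the same commands in a throwaway repository. It is a toy reproduction under stated assumptions, not the actual history of this PR: the commit names, the commit helper, and the use of a local master in place of origin/master are all made up for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real Spark checkout.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one file per commit so replayed commits never conflict.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base                   # shared history
git tag v2.1.0                # stand-in for the Apache release tag
commit upstream-after-tag     # master moves on past the release
git checkout -q -b k8s-support-alternate-incremental
commit k8s-1                  # stand-ins for the k8s patchset
commit k8s-2

# Steps from the PR description (local master replaces origin/master).
git checkout -q v2.1.0
git checkout -q -b branch-2.1-kubernetes
git checkout -q k8s-support-alternate-incremental
git checkout -q -b prep-for-alpha-release
git rebase -q --onto branch-2.1-kubernetes master prep-for-alpha-release

# Newest first: the two k8s commits, then the tagged base commit.
git log --format=%s prep-for-alpha-release
```

The final log lists only k8s-2, k8s-1 and base: the upstream commit made after the tag is left behind, which matches the target state of having no unreleased upstream changes in a k8s release. The same --onto form would presumably serve for the minor-release step described above, with the new release tag as the target.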

mccheah and others added 30 commits February 7, 2017 17:59
- Don't hold the raw secret bytes
- Add CPU limits and requests
The build process fails ScalaStyle checks otherwise.
* Use tar and gzip to archive shipped jars.

* Address comments

* Move files to resolve merge
* Use alpine and java 8 for docker images.

* Remove installation of vim and redundant comment
* Error messages when the driver container fails to start.

* Fix messages a bit

* Use timeout constant

* Delete the pod if it fails for any reason (not just timeout)

* Actually set submit succeeded

* Fix typo
* Documentation for the current state of the world.

* Adding navigation links from other pages

* Address comments, add TODO for things that should be fixed

* Address comments, mostly making images section clearer

* Virtual runtime -> container runtime
#20)

* Development workflow documentation for the current state of the world.

* Address comments.

* Clarified code change and added ticket link
* Added service name as prefix to executor pods to be able to tell them apart from kubectl output

* Addressed comments
* Add kubernetes profile to travis yml file

* Fix long lines in CompressionUtils.scala
* Improved the example commands in running-on-k8s document.

* Fixed more example commands.

* Fixed typo.
* Support custom labels on the driver pod.

* Add integration test and fix logic.

* Fix tests

* Fix minor formatting mistake

* Reduce unnecessary diff
* A number of small tweaks to the MVP.

- Master protocol defaults to https if not specified
- Removed upload driver extra classpath functionality
- Added ability to specify main app resource with container:// URI
- Updated docs to reflect all of the above
- Add examples to Docker images, mostly for integration testing but
could be useful for easily getting started without shipping anything

* Add example to documentation.
* Support setting the driver pod launching timeout.

And increase the default value from 30s to 60s. The current value of
30s is kind of short for pulling the image from public docker registry
plus the container/JVM start time.

* Use a better name for the default timeout.
* Use "extraTestArgLine" to pass extra options to scalatest.

Because the "argLine" option of scalatest is set in pom.xml and we can't
overwrite it from the command line.

Ref #37

* Added a default value for extraTestArgLine

* Use a better name.

* Added a tip for this in the dev docs.
mccheah and others added 11 commits February 7, 2017 17:59
* Fixed k8s integration test

- Enable spark ui explicitly for in-process submit
- Fixed some broken assertions in integration tests
- Fixed a scalastyle error in SparkDockerImageBuilder.scala
- Log into target/integration-tests.log like other modules

* Fixed line length.

* CR
* Create README to better describe project purpose

* Add links to usage guide and dev docs

* Minor changes
…it jars (#30)

* Revamp ports and service setup for the driver.

- Expose the driver-submission service on NodePort and contact that as
opposed to going through the API server proxy
- Restrict the ports that are exposed on the service to only the driver
submission service when uploading content and then only the Spark UI
after the job has started

* Move service creation down and more thorough error handling

* Fix missed merge conflict

* Add braces

* Fix bad merge

* Address comments and refactor run() more.

Method nesting was getting confusing so pulled out the inner class and
removed the extra method indirection from createDriverPod()

* Remove unused method

* Support SSL configuration for the driver application submission (#49)

* Support SSL when setting up the driver.

The user can provide a keyStore to load onto the driver pod and the
driver pod will use that keyStore to set up SSL on its server.

* Clean up SSL secrets after finishing submission.

We don't need to persist these after the pod has them mounted and is
running already.

* Fix compilation error

* Revert image change

* Address comments

* Programmatically generate certificates for integration tests.

* Address comments

* Resolve merge conflicts

* Fix bad merge

* Remove unnecessary braces

* Fix compiler error
* Extract constants and config into separate file. Launch => Submit.

* Address comments

* A small shorthand

* Refactor more ThreadUtils

* Fix scalastyle, use cached thread pool

* Tiny Scala style change
* Retry the submit-application request to multiple nodes.

* Fix doc style comment

* Check node unschedulable, log retry failures
* Allow adding arbitrary files

* Address comments and add documentation
* Introduce blocking submit to kubernetes by default

Two new configuration settings:
- spark.kubernetes.submit.waitAppCompletion
- spark.kubernetes.report.interval

* Minor touchups

* More succinct logging for pod state

* Fix import order

* Switch to watch-based logging

* Spaces in comma-joined volumes, labels, and containers

* Use CountDownLatch instead of SettableFuture

* Match parallel ConfigBuilder style

* Disable logging in fire-and-forget mode

Which is enabled with spark.kubernetes.submit.waitAppCompletion=false
(default: true)

* Additional log line for when application is launched

* Minor wording changes

* More logging

* Drop log to DEBUG
Since the example jobs are patched to never finish.
@ash211 ash211 mentioned this pull request Feb 8, 2017
11 tasks

ash211 commented Mar 2, 2017

Closing -- will redo this after code freeze. Note that changing the branch name and its base will require all in-progress PRs to adjust their destination branch.

@ash211 ash211 closed this Mar 2, 2017
@ash211 ash211 deleted the prep-for-alpha-release branch March 2, 2017 22:56

ash211 commented Mar 2, 2017

One more update on this: I think there's a use case for continuing development while a release is being stabilized (the time between code freeze and the release of a new version). We should probably have a -develop branch in addition to the release branch.

@ash211 ash211 mentioned this pull request Mar 8, 2017

6 participants