This repository was archived by the owner on Jan 9, 2020. It is now read-only.
Commits (158)
2ffed59
[SPARK-18278] Minimal support for submitting to Kubernetes.
mccheah Dec 6, 2016
00e545f
Fix style
mccheah Dec 6, 2016
cdbd9bb
Make naming more consistent
mccheah Dec 7, 2016
8f69fc0
Fix building assembly with Kubernetes.
mccheah Dec 9, 2016
75c6086
Service account support, use constants from fabric8 library.
mccheah Dec 10, 2016
93b75ce
Some small changes
mccheah Jan 7, 2017
e7397e8
Use k8s:// formatted URL instead of separate setting.
mccheah Jan 9, 2017
ed65428
Reindent comment to conform to JavaDoc style
foxish Jan 9, 2017
f9ddb63
Move kubernetes under resource-managers folder.
mccheah Jan 9, 2017
178abc1
Use tar and gzip to compress+archive shipped jars (#2)
mccheah Jan 11, 2017
e2787e8
Use alpine and java 8 for docker images. (#10)
mccheah Jan 12, 2017
acceb72
Copy the Dockerfiles from docker-minimal-bundle into the distribution…
mccheah Jan 12, 2017
24f4bf0
inherit IO (#13)
foxish Jan 12, 2017
adcc906
Error messages when the driver container fails to start. (#11)
mccheah Jan 13, 2017
0b81dbf
Fix linter error to make CI happy (#18)
foxish Jan 13, 2017
e70f427
Documentation for the current state of the world (#16)
mccheah Jan 13, 2017
b25bc8b
Development workflow documentation for the current state of the world…
mccheah Jan 13, 2017
761b317
Added service name as prefix to executor pods (#14)
foxish Jan 13, 2017
8739b41
Add kubernetes profile to travis CI yml file (#21)
kimoonkim Jan 14, 2017
928e00e
Improved the example commands in running-on-k8s document. (#25)
lins05 Jan 17, 2017
3e3c4d4
Fix spacing for command highlighting (#31)
foxish Jan 18, 2017
36c4e94
Support custom labels on the driver pod. (#27)
mccheah Jan 19, 2017
b6c57c7
Make pod name unique using the submission timestamp (#32)
foxish Jan 19, 2017
3fd9c62
A number of small tweaks to the MVP. (#23)
mccheah Jan 24, 2017
81875a6
Correct hadoop profile: hadoop2.7 -> hadoop-2.7 (#41)
ash211 Jan 25, 2017
2a26ebd
Support setting the driver pod launching timeout. (#36)
lins05 Jan 25, 2017
b98c852
Sanitize kubernetesAppId for use in secret, service, and pod names (#45)
ash211 Jan 25, 2017
27f3005
Support spark.driver.extraJavaOptions (#48)
kimoonkim Jan 26, 2017
48f5884
Use "extraScalaTestArgs" to pass extra options to scalatest. (#52)
lins05 Jan 26, 2017
81bd355
Use OpenJDK8's official Alpine image. (#51)
mccheah Jan 26, 2017
86bd589
Remove unused driver extra classpath upload code (#54)
mccheah Jan 26, 2017
e6f35d2
Fix k8s integration tests (#44)
lins05 Jan 27, 2017
6cceb59
Added GC to components (#56)
foxish Jan 27, 2017
3b5901a
Create README to better describe project purpose (#50)
ash211 Jan 28, 2017
2e992be
Access the Driver Launcher Server over NodePort for app launch + subm…
mccheah Jan 30, 2017
b2e6877
Extract constants and config into separate file. Launch => Submit. (#65)
mccheah Jan 31, 2017
6ee3be5
Retry the submit-application request to multiple nodes (#69)
mccheah Feb 2, 2017
d0f95db
Allow adding arbitrary files (#71)
mccheah Feb 2, 2017
de9a82e
Fix NPE around unschedulable pod specs (#79)
ash211 Feb 2, 2017
fae76a0
Introduce blocking submit to kubernetes by default (#53)
ash211 Feb 3, 2017
4bc7c52
Do not wait for pod finishing in integration tests. (#84)
lins05 Feb 3, 2017
52a7ab2
Check for user jars/files existence before creating the driver pod. (…
lins05 Feb 8, 2017
487d1e1
Use readiness probe instead of client-side ping. (#75)
mccheah Feb 9, 2017
bdfc4e1
Note integration tests require Java 8 (#99)
ash211 Feb 10, 2017
fe8b45c
Bumping up kubernetes-client version to fix GKE and local proxy (#105)
foxish Feb 10, 2017
7a4075f
Truncate k8s hostnames to be no longer than 63 characters (#102)
ash211 Feb 11, 2017
3d80fff
Fixed loading the executors page through the kubectl proxy. (#95)
lins05 Feb 13, 2017
a34a114
Filter nodes to only try and send files to external IPs (#106)
foxish Feb 13, 2017
ac4dd91
Parse results of minikube status more rigorously (#97)
ash211 Feb 13, 2017
2112c4a
Adding legacyHostIP to the list of IPs we look at (#114)
foxish Feb 14, 2017
043cdd9
Add -DskipTests to dev docs (#115)
ash211 Feb 15, 2017
0e6df11
Shutdown the thread scheduler in LoggingPodStatusWatcher on receiving…
varunkatta Feb 16, 2017
a800e20
Trigger scalatest plugin in the integration-test phase (#93)
kimoonkim Feb 16, 2017
2773b77
Fix issue with DNS resolution (#118)
foxish Feb 16, 2017
6a999ca
Change the API contract for uploading local files (#107)
mccheah Feb 16, 2017
cad5dd3
Optionally expose the driver UI port as NodePort (#131)
kimoonkim Feb 22, 2017
68a83a2
Set the REST service's exit code to the exit code of its driver subpr…
ash211 Feb 23, 2017
1ab6dbc
Pass the actual iterable from the option to get files (#139)
mccheah Feb 23, 2017
bb5cb21
Use a separate class to track components that need to be cleaned up (…
mccheah Feb 23, 2017
04a555e
Enable unit tests in Travis CI build (#132)
kimoonkim Feb 23, 2017
d7f41c5
Change driver pod's restart policy from OnFailure to Never (#145)
ash211 Feb 23, 2017
b4b1bdd
Extract SSL configuration handling to a separate class (#123)
mccheah Feb 24, 2017
39c2cf2
Exclude known flaky tests (#156)
kimoonkim Feb 24, 2017
2303aad
Richer logging and better error handling in driver pod watch (#154)
foxish Feb 24, 2017
e7f78cb
Document blocking submit calls (#152)
ash211 Feb 25, 2017
fd24f23
Allow custom annotations on the driver pod. (#163)
mccheah Mar 2, 2017
7132f5d
Update client version & minikube version (#142)
foxish Mar 2, 2017
a51dcc8
Allow customizing external URI provision + External URI can be set vi…
mccheah Mar 3, 2017
a14dc1e
Remove okhttp from top-level pom (#166)
foxish Mar 3, 2017
015f18d
Allow setting memory on the driver submission server. (#161)
mccheah Mar 3, 2017
f414355
Add a section for prerequisites (#171)
foxish Mar 4, 2017
6cf635d
Add instructions to find master URL (#169)
foxish Mar 4, 2017
191dd51
Propagate exceptions (#172)
mccheah Mar 6, 2017
dc4e3d2
Logging for resource deletion (#170)
ash211 Mar 6, 2017
3636939
Fix pom versions (#178)
foxish Mar 14, 2017
2382ea6
Exclude flaky ExternalShuffleServiceSuite from Travis (#185)
kimoonkim Mar 15, 2017
b139b46
Fix lint-check failures and javadoc8 break (#187)
ash211 Mar 16, 2017
8c08189
Docs improvements (#176)
foxish Mar 8, 2017
8756494
Add Apache license to a few files (#175)
ash211 Mar 8, 2017
fece639
Adding clarification pre-alpha (#181)
foxish Mar 8, 2017
35724a3
Allow providing an OAuth token for authenticating against k8s (#180)
mccheah Mar 13, 2017
f9f5af4
Merge pull request #177 from apache-spark-on-k8s/prep-for-alpha-release
ash211 Mar 16, 2017
d5502ed
Allow the driver pod's credentials to be shipped from the submission …
ash211 Mar 17, 2017
078697f
Support using PEM files to configure SSL for driver submission (#173)
mccheah Mar 20, 2017
7039934
Update tags on docker images. (#196)
foxish Mar 21, 2017
3254246
Add additional instructions to use release tarball (#198)
foxish Mar 22, 2017
35a5e32
Support specifying CPU cores for driver pod (#207)
hustcat Mar 30, 2017
0a13206
Register executors using pod IPs instead of pod host names (#215)
kimoonkim Apr 5, 2017
13f16d5
Upgrade bouncycastle, force bcprov version (#223)
mccheah Apr 10, 2017
c6a5c6e
Stop executors cleanly before deleting their pods (#231)
ash211 Apr 13, 2017
0b0fb6f
Upgrade Kubernetes client to 2.2.13. (#230)
mccheah Apr 14, 2017
1388e0a
Respect JVM http proxy settings when using Feign. (#228)
mccheah Apr 17, 2017
3f6e5ea
Staging server for receiving application dependencies. (#212)
mccheah Apr 21, 2017
e24c4af
Reorganize packages between v1 work and v2 work (#220)
mccheah Apr 21, 2017
4940eae
Support SSL on the file staging server (#221)
mccheah Apr 21, 2017
04afcf8
Driver submission with mounting dependencies from the staging server …
mccheah Apr 25, 2017
6b489c2
Enable testing against GCE clusters (#243)
foxish May 2, 2017
0e1cb40
Update running-on-kubernetes.md (#259)
erikerlandson May 2, 2017
ba151c0
Build with sbt and fix scalastyle checks. (#241)
lins05 May 3, 2017
4ac0de1
Updating images in doc (#219)
foxish May 3, 2017
8ccb305
Correct readme links (#266)
johscheuer May 5, 2017
0a8080a
edit readme with a working build example command (#254)
erikerlandson May 9, 2017
26f747e
Fix watcher conditional logic (#269)
erikerlandson May 10, 2017
546f09c
Dispatch tasks to right executors that have tasks' input HDFS data (#…
kimoonkim May 10, 2017
eb45ae5
Add parameter for driver pod name (#258)
hustcat May 16, 2017
e9da549
Dynamic allocation (#272)
foxish May 17, 2017
f005268
Download remotely-located resources on driver and executor startup vi…
mccheah May 17, 2017
e071ad9
Scalastyle fixes (#278)
ash211 May 17, 2017
6882a1b
Exit properly when the k8s cluster is not available. (#256)
lins05 May 18, 2017
9d6665c
Support driver pod kubernetes credentials mounting in V2 submission (…
mccheah May 18, 2017
88306b2
Allow client certificate PEM for resource staging server. (#257)
mccheah May 19, 2017
8f6f0a0
Differentiate between URI and SSL settings for in-cluster vs. submiss…
mccheah May 19, 2017
408c65f
Monitor pod status in submission v2. (#283)
mccheah May 22, 2017
8f3d965
Replace submission v1 with submission v2. (#286)
mccheah May 23, 2017
56414f9
Added files should be in the working directories. (#294)
mccheah May 23, 2017
fe03c7c
Add missing license (#296)
mccheah May 24, 2017
3881404
Remove some leftover code and fix a constant. (#297)
mccheah May 24, 2017
b84cb66
Adding restart policy fix for v2 (#303)
foxish May 25, 2017
dbf7a39
Add all dockerfiles to distributions. (#307)
mccheah May 26, 2017
2a2cfb6
Add proxy configuration to retrofit clients. (#301)
mccheah May 26, 2017
d31d81a
Fix an HDFS data locality bug in case cluster node names are short ho…
kimoonkim May 26, 2017
0702e18
Remove leading slash from Retrofit interface. (#308)
mccheah May 30, 2017
9be8f20
Use tini in Docker images (#320)
mccheah May 31, 2017
e5623b7
Allow custom executor labels and annotations (#321)
mccheah Jun 1, 2017
5e2b205
Dynamic allocation, cleanup in case of driver death (#319)
foxish Jun 2, 2017
bb1b234
Fix client to await the driver pod (#325)
kimoonkim Jun 2, 2017
e37b0cf
Clean up resources that are not used by pods. (#305)
mccheah Jun 3, 2017
c325691
Copy yaml files when making distribution (#327)
tnachen Jun 4, 2017
d835b6a
Allow docker image pull policy to be configurable (#328)
tnachen Jun 5, 2017
4751371
POM update 0.2.0 (#329)
foxish Jun 5, 2017
5470366
Update tags (#332)
foxish Jun 6, 2017
ca4309f
nicer readme (#333)
foxish Jun 6, 2017
0dd146c
Support specifying CPU cores and memory limits for the driver (#340)
duyanghao Jun 8, 2017
bcf57cf
Generate the application ID label irrespective of app name. (#331)
mccheah Jun 8, 2017
78baf9b
Create base-image and minimize layer count (#324)
johscheuer Jun 8, 2017
2f80b1d
Added log4j config for k8s unit tests. (#314)
lins05 Jun 9, 2017
d4ec136
Use node affinity to launch executors on preferred nodes benefitting …
kimoonkim Jun 14, 2017
d6a3111
Fix sbt build. (#344)
mccheah Jun 14, 2017
fdd50f1
New API for custom labels and annotations. (#346)
mccheah Jun 14, 2017
a6291c6
Allow the Spark driver to find shuffle pods in a specified namespace (#357)
Jun 22, 2017
08fe944
Bypass init-containers when possible (#348)
chenchun Jun 23, 2017
8b3248f
Config for hard cpu limit on pods; default unlimited (#356)
Jun 23, 2017
6f6cfd6
Allow number of executor cores to have fractional values (#361)
liyinan926 Jun 29, 2017
befcf0a
Python Bindings for launching PySpark Jobs from the JVM (#364)
ifilonenko Jul 3, 2017
0f4368f
Submission client redesign to use a step-based builder pattern (#365)
mccheah Jul 14, 2017
8c35d81
Add implicit conversions to imports. (#374)
mccheah Jul 17, 2017
db5f5be
Fix import order and scalastyle (#375)
ash211 Jul 17, 2017
8751a9a
fix submit job errors (#376)
Jul 18, 2017
6dbd32e
Add node selectors for driver and executor pods (#355)
Jul 18, 2017
3ec9410
Retry binding server to random port in the resource staging server te…
mccheah Jul 19, 2017
e1ff2f0
set RestartPolicy=Never for executor (#367)
Jul 19, 2017
b1c48f9
Read classpath entries from SPARK_EXTRA_CLASSPATH on executors. (#383)
mccheah Jul 20, 2017
4dfb184
Changes to support executor recovery behavior during static allocatio…
varunkatta Jul 21, 2017
37f9943
Update pom to v0.3.0 of spark-kubernetes (#385)
foxish Jul 22, 2017
af446e6
Fix bug with null arguments (#415)
foxish Aug 4, 2017
96a1d8c
Flag-guard expensive DNS lookup of cluster node full names, part of H…
liyinan926 Aug 8, 2017
84d4336
Updated pom version to 0.3.1 for the new bug fix 2.1 release
liyinan926 Aug 8, 2017
d2c2e9b
Merge pull request #422 from liyinan926/branch-2.1-kubernetes
liyinan926 Aug 8, 2017
22 changes: 17 additions & 5 deletions .travis.yml
@@ -25,11 +25,22 @@
sudo: required
dist: trusty

# 2. Choose language and target JDKs for parallel builds.
# 2. Choose language, target JDK, and envs for parallel builds.
language: java
jdk:
- oraclejdk7
- oraclejdk8
env: # Used by the install section below.
# Configure the unit test build for spark core and kubernetes modules,
# while excluding some flaky unit tests using a regex pattern.
- PHASE=test \
PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes" \
MODULES="-pl core,resource-managers/kubernetes/core -am" \
ARGS="-Dtest=none -Dsuffixes='^org\.apache\.spark\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$|rdd\.LocalCheckpointSuite$|deploy\.SparkSubmitSuite$|deploy\.StandaloneDynamicAllocationSuite$).*'"
# Configure the full build.
- PHASE=install \
PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver" \
MODULES="" \
ARGS="-T 4 -q -DskipTests"

# 3. Setup cache directory for SBT and Maven.
cache:
@@ -41,11 +52,12 @@ cache:
notifications:
email: false

# 5. Run maven install before running lint-java.
# 5. Run maven build before running lints.
install:
- export MAVEN_SKIP_RC=1
- build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive -Phive-thriftserver install
- build/mvn ${PHASE} ${PROFILES} ${MODULES} ${ARGS}

# 6. Run lint-java.
# 6. Run lints.
script:
- dev/lint-java
- dev/lint-scala
114 changes: 22 additions & 92 deletions README.md
@@ -1,104 +1,34 @@
# Apache Spark
# Apache Spark On Kubernetes

Spark is a fast and general cluster computing system for Big Data. It provides
high-level APIs in Scala, Java, Python, and R, and an optimized engine that
supports general computation graphs for data analysis. It also supports a
rich set of higher-level tools including Spark SQL for SQL and DataFrames,
MLlib for machine learning, GraphX for graph processing,
and Spark Streaming for stream processing.
This repository, located at https://github.com/apache-spark-on-k8s/spark, contains a fork of Apache Spark that enables running Spark jobs natively on a Kubernetes cluster.

<http://spark.apache.org/>
## What is this?

This is a collaboratively maintained project working on [SPARK-18278](https://issues.apache.org/jira/browse/SPARK-18278). The goal is to bring native support for Spark to use Kubernetes as a cluster manager, in a fully supported way on par with the Spark Standalone, Mesos, and Apache YARN cluster managers.

## Online Documentation
## Getting Started

You can find the latest Spark documentation, including a programming
guide, on the [project web page](http://spark.apache.org/documentation.html).
This README file only contains basic setup instructions.
- [Usage guide](https://apache-spark-on-k8s.github.io/userdocs/) shows how to run the code
- [Development docs](resource-managers/kubernetes/README.md) shows how to get set up for development
- Code is primarily located in the [resource-managers/kubernetes](resource-managers/kubernetes) folder

## Building Spark
## Why does this fork exist?

Spark is built using [Apache Maven](http://maven.apache.org/).
To build Spark and its example programs, run:
Adding native integration for a new cluster manager is a large undertaking. If poorly executed, it could introduce bugs into Spark when run on other cluster managers, cause release blockers slowing down the overall Spark project, or require hotfixes which divert attention away from development towards managing additional releases. Any work this deep inside Spark needs to be done carefully to minimize the risk of those negative externalities.

build/mvn -DskipTests clean package
At the same time, an increasing number of people from various companies and organizations desire to work together to natively run Spark on Kubernetes. The group needs a code repository, communication forum, issue tracking, and continuous integration, all in order to work together effectively on an open source product.

(You do not need to do this if you downloaded a pre-built package.)
We've been asked by an Apache Spark Committer to work outside of the Apache infrastructure for a short period of time to allow this feature to be hardened and improved without creating risk for Apache Spark. The aim is to rapidly bring it to the point where it can be brought into the mainline Apache Spark repository for continued development within the Apache umbrella. If all goes well, this should be a short-lived fork rather than a long-lived one.

You can build Spark using more than one thread by using the -T option with Maven, see ["Parallel builds in Maven 3"](https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3).
More detailed documentation is available from the project site, at
["Building Spark"](http://spark.apache.org/docs/latest/building-spark.html).
## Who are we?

For general development tips, including info on developing Spark using an IDE, see
[http://spark.apache.org/developer-tools.html](the Useful Developer Tools page).
This is a collaborative effort by several folks from different companies who are interested in seeing this feature be successful. Companies active in this project include (alphabetically):

## Interactive Scala Shell

The easiest way to start using Spark is through the Scala shell:

./bin/spark-shell

Try the following command, which should return 1000:

scala> sc.parallelize(1 to 1000).count()

## Interactive Python Shell

Alternatively, if you prefer Python, you can use the Python shell:

./bin/pyspark

And run the following command, which should also return 1000:

>>> sc.parallelize(range(1000)).count()

## Example Programs

Spark also comes with several sample programs in the `examples` directory.
To run one of them, use `./bin/run-example <class> [params]`. For example:

./bin/run-example SparkPi

will run the Pi example locally.

You can set the MASTER environment variable when running examples to submit
examples to a cluster. This can be a mesos:// or spark:// URL,
"yarn" to run on YARN, and "local" to run
locally with one thread, or "local[N]" to run locally with N threads. You
can also use an abbreviated class name if the class is in the `examples`
package. For instance:

MASTER=spark://host:7077 ./bin/run-example SparkPi

Many of the example programs print usage help if no params are given.

## Running Tests

Testing first requires [building Spark](#building-spark). Once Spark is built, tests
can be run using:

./dev/run-tests

Please see the guidance on how to
[run tests for a module, or individual tests](http://spark.apache.org/developer-tools.html#individual-tests).

## A Note About Hadoop Versions

Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
storage systems. Because the protocols have changed in different versions of
Hadoop, you must build Spark against the same version that your cluster runs.

Please refer to the build documentation at
["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version)
for detailed guidance on building for a particular distribution of Hadoop, including
building for particular Hive and Hive Thriftserver distributions.

## Configuration

Please refer to the [Configuration Guide](http://spark.apache.org/docs/latest/configuration.html)
in the online documentation for an overview on how to configure Spark.

## Contributing

Please review the [Contribution to Spark guide](http://spark.apache.org/contributing.html)
for information on how to get started contributing to the project.
- Bloomberg
- Google
- Haiwen
- Hyperpilot
- Intel
- Palantir
- Pepperdata
- Red Hat
12 changes: 11 additions & 1 deletion assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -148,6 +148,16 @@
</dependency>
</dependencies>
</profile>
<profile>
<id>kubernetes</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-kubernetes_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>hive</id>
<dependencies>
2 changes: 1 addition & 1 deletion common/network-common/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/kubernetes/KubernetesExternalShuffleClient.java (new file)
@@ -0,0 +1,79 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.network.shuffle.kubernetes;

import org.apache.spark.network.client.RpcResponseCallback;
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.nio.ByteBuffer;

/**
* A client for talking to the external shuffle service in Kubernetes cluster mode.
*
* This is used by each Spark executor to register with a corresponding external
* shuffle service on the cluster. The purpose is to clean up shuffle files
* reliably if the application exits unexpectedly.
*/
public class KubernetesExternalShuffleClient extends ExternalShuffleClient {
private static final Logger logger = LoggerFactory
.getLogger(KubernetesExternalShuffleClient.class);

/**
* Creates a Kubernetes external shuffle client that wraps the {@link ExternalShuffleClient}.
* Please refer to docs on {@link ExternalShuffleClient} for more information.
*/
public KubernetesExternalShuffleClient(
TransportConf conf,
SecretKeyHolder secretKeyHolder,
boolean saslEnabled,
boolean saslEncryptionEnabled) {
super(conf, secretKeyHolder, saslEnabled, saslEncryptionEnabled);
}

public void registerDriverWithShuffleService(String host, int port) throws IOException {
checkInit();
ByteBuffer registerDriver = new RegisterDriver(appId, 0).toByteBuffer();
TransportClient client = clientFactory.createClient(host, port);
client.sendRpc(registerDriver, new RegisterDriverCallback());
}

private class RegisterDriverCallback implements RpcResponseCallback {
@Override
public void onSuccess(ByteBuffer response) {
logger.info("Successfully registered app " + appId + " with external shuffle service.");
}

@Override
public void onFailure(Throwable e) {
logger.warn("Unable to register app " + appId + " with external shuffle service. " +
"Please manually remove shuffle data after driver exit. Error: " + e);
}
}

@Override
public void close() {
super.close();
}
}
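For orientation, here is a minimal, hypothetical driver-side sketch of how this client is exercised; the app id, host, and port are placeholders, and `init` and `close` come from the parent `ExternalShuffleClient`:

```java
import java.io.IOException;

import org.apache.spark.network.shuffle.kubernetes.KubernetesExternalShuffleClient;
import org.apache.spark.network.util.MapConfigProvider;
import org.apache.spark.network.util.TransportConf;

public class ShuffleRegistrationSketch {
  public static void main(String[] args) throws IOException {
    // Transport settings for the "shuffle" module; the empty provider
    // falls back to defaults for every option.
    TransportConf conf = new TransportConf("shuffle", MapConfigProvider.EMPTY);

    // SASL is disabled in this sketch, so no SecretKeyHolder is supplied.
    KubernetesExternalShuffleClient client =
        new KubernetesExternalShuffleClient(conf, null, false, false);

    client.init("spark-application-1234");  // sets appId; must precede any RPC

    // Placeholder endpoint of a shuffle-service pod reachable from the driver.
    client.registerDriverWithShuffleService("10.0.0.5", 7337);

    client.close();
  }
}
```

Note that registration is fire-and-forget: `RegisterDriverCallback` only logs the outcome, so a failed registration degrades to a warning about manual shuffle-file cleanup rather than failing the application.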
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/mesos/MesosExternalShuffleClient.java
@@ -32,7 +32,7 @@
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;

/**
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/BlockTransferMessage.java
@@ -23,7 +23,6 @@
import io.netty.buffer.Unpooled;

import org.apache.spark.network.protocol.Encodable;
import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
import org.apache.spark.network.shuffle.protocol.mesos.ShuffleServiceHeartbeat;

/**
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/RegisterDriver.java
@@ -15,19 +15,18 @@
* limitations under the License.
*/

package org.apache.spark.network.shuffle.protocol.mesos;
package org.apache.spark.network.shuffle.protocol;

import com.google.common.base.Objects;
import io.netty.buffer.ByteBuf;

import org.apache.spark.network.protocol.Encoders;
import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;

// Needed by ScalaDoc. See SPARK-7726
import static org.apache.spark.network.shuffle.protocol.BlockTransferMessage.Type;

/**
* A message sent from the driver to register with the MesosExternalShuffleService.
* A message sent from the driver to register with an ExternalShuffleService.
*/
public class RegisterDriver extends BlockTransferMessage {
private final String appId;
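Relocating `RegisterDriver` from the `mesos` package into the shared `protocol` package lets the Mesos and Kubernetes shuffle clients speak the same wire message. A small round-trip sketch (placeholder app id; `getAppId` is assumed from the message's accessors) of how the message is encoded for `sendRpc` and decoded on the service side:

```java
import java.nio.ByteBuffer;

import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;

public class RegisterDriverRoundTrip {
  public static void main(String[] args) {
    // Encode as KubernetesExternalShuffleClient does above, passing 0
    // for the message's second field.
    ByteBuffer wire = new RegisterDriver("spark-application-1234", 0L).toByteBuffer();

    // The shuffle service decodes through the shared protocol decoder.
    RegisterDriver decoded =
        (RegisterDriver) BlockTransferMessage.Decoder.fromByteBuffer(wire);

    System.out.println(decoded.getAppId());  // spark-application-1234
  }
}
```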
2 changes: 1 addition & 1 deletion common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/sketch/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/tags/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/unsafe/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.1.0</version>
<version>2.1.0-k8s-0.3.1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

common/unsafe/src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java
@@ -252,7 +252,7 @@ public static long parseSecondNano(String secondNano) throws IllegalArgumentExce
public final int months;
public final long microseconds;

public final long milliseconds() {
public long milliseconds() {
return this.microseconds / MICROS_PER_MILLI;
}

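The `CalendarInterval` diff only drops the `final` modifier from `milliseconds()`; the conversion itself is unchanged. For reference, a tiny sketch of that conversion, assuming `MICROS_PER_MILLI` is 1000 as defined elsewhere in `CalendarInterval`:

```java
import org.apache.spark.unsafe.types.CalendarInterval;

public class MillisecondsSketch {
  public static void main(String[] args) {
    // 0 months, 1,500,000 microseconds -> 1,500,000 / 1000 = 1500 milliseconds.
    CalendarInterval interval = new CalendarInterval(0, 1500000L);
    System.out.println(interval.milliseconds());  // 1500
  }
}
```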