This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

[SPARK-18278][NOSUBMIT] Ongoing diff for Spark on Kubernetes (branch-2.2) #450

Open
wants to merge 231 commits into
base: branch-2.2
a91f660
[SPARK-18278] Minimal support for submitting to Kubernetes.
mccheah Dec 6, 2016
9d71348
Fix style
mccheah Dec 6, 2016
f1baed2
Make naming more consistent
mccheah Dec 7, 2016
5a45654
Fix building assembly with Kubernetes.
mccheah Dec 9, 2016
5694a8a
Service account support, use constants from fabric8 library.
mccheah Dec 10, 2016
dbfb87d
Some small changes
mccheah Jan 7, 2017
acb8b14
Use k8s:// formatted URL instead of separate setting.
mccheah Jan 9, 2017
f9ae918
Reindent comment to conform to JavaDoc style
foxish Jan 9, 2017
f20397b
Move kubernetes under resource-managers folder.
mccheah Jan 9, 2017
728be0e
Use tar and gzip to compress+archive shipped jars (#2)
mccheah Jan 11, 2017
793143d
Use alpine and java 8 for docker images. (#10)
mccheah Jan 12, 2017
2b1a99d
Copy the Dockerfiles from docker-minimal-bundle into the distribution…
mccheah Jan 12, 2017
457ebd8
inherit IO (#13)
foxish Jan 12, 2017
94ab8dd
Error messages when the driver container fails to start. (#11)
mccheah Jan 13, 2017
7afadb3
Fix linter error to make CI happy (#18)
foxish Jan 13, 2017
909b281
Documentation for the current state of the world (#16)
mccheah Jan 13, 2017
77b287e
Development workflow documentation for the current state of the world…
mccheah Jan 13, 2017
0bcc391
Added service name as prefix to executor pods (#14)
foxish Jan 13, 2017
979fa92
Add kubernetes profile to travis CI yml file (#21)
kimoonkim Jan 14, 2017
087555a
Improved the example commands in running-on-k8s document. (#25)
lins05 Jan 17, 2017
a89b4b0
Fix spacing for command highlighting (#31)
foxish Jan 18, 2017
85f02bf
Support custom labels on the driver pod. (#27)
mccheah Jan 19, 2017
f71abc1
Make pod name unique using the submission timestamp (#32)
foxish Jan 19, 2017
95747bc
A number of small tweaks to the MVP. (#23)
mccheah Jan 24, 2017
2f02444
Correct hadoop profile: hadoop2.7 -> hadoop-2.7 (#41)
ash211 Jan 25, 2017
92b2b52
Support setting the driver pod launching timeout. (#36)
lins05 Jan 25, 2017
3c6eed8
Sanitize kubernetesAppId for use in secret, service, and pod names (#45)
ash211 Jan 25, 2017
c47ce5c
Support spark.driver.extraJavaOptions (#48)
kimoonkim Jan 26, 2017
6bd7240
Use "extraScalaTestArgs" to pass extra options to scalatest. (#52)
lins05 Jan 26, 2017
25abc4d
Use OpenJDK8's official Alpine image. (#51)
mccheah Jan 26, 2017
ee01986
Remove unused driver extra classpath upload code (#54)
mccheah Jan 26, 2017
b986484
Fix k8s integration tests (#44)
lins05 Jan 27, 2017
f2b7346
Added GC to components (#56)
foxish Jan 27, 2017
9124aac
Create README to better describe project purpose (#50)
ash211 Jan 28, 2017
4ff44d3
Access the Driver Launcher Server over NodePort for app launch + subm…
mccheah Jan 30, 2017
c57ccdc
Extract constants and config into separate file. Launch => Submit. (#65)
mccheah Jan 31, 2017
261a624
Retry the submit-application request to multiple nodes (#69)
mccheah Feb 2, 2017
ab731f1
Allow adding arbitrary files (#71)
mccheah Feb 2, 2017
0cf0d02
Fix NPE around unschedulable pod specs (#79)
ash211 Feb 2, 2017
efd803d
Introduce blocking submit to kubernetes by default (#53)
ash211 Feb 3, 2017
381b69a
Do not wait for pod finishing in integration tests. (#84)
lins05 Feb 3, 2017
15a8292
Check for user jars/files existence before creating the driver pod. (…
lins05 Feb 8, 2017
a62c20f
Use readiness probe instead of client-side ping. (#75)
mccheah Feb 9, 2017
1a43957
Note integration tests require Java 8 (#99)
ash211 Feb 10, 2017
3aba68a
Bumping up kubernetes-client version to fix GKE and local proxy (#105)
foxish Feb 10, 2017
1f2fd80
Truncate k8s hostnames to be no longer than 63 characters (#102)
ash211 Feb 11, 2017
e239ac7
Fixed loading the executors page through the kubectl proxy. (#95)
lins05 Feb 13, 2017
3a51dbe
Filter nodes to only try and send files to external IPs (#106)
foxish Feb 13, 2017
ba6a9e5
Parse results of minikube status more rigorously (#97)
ash211 Feb 13, 2017
bab88e0
Adding legacyHostIP to the list of IPs we look at (#114)
foxish Feb 14, 2017
be4330f
Add -DskipTests to dev docs (#115)
ash211 Feb 15, 2017
b1d7706
Shutdown the thread scheduler in LoggingPodStatusWatcher on receiving…
varunkatta Feb 16, 2017
6ea3047
Trigger scalatest plugin in the integration-test phase (#93)
kimoonkim Feb 16, 2017
de5a105
Fix issue with DNS resolution (#118)
foxish Feb 16, 2017
81c6968
Change the API contract for uploading local files (#107)
mccheah Feb 16, 2017
e7b3569
Optionally expose the driver UI port as NodePort (#131)
kimoonkim Feb 22, 2017
6f27fb3
Set the REST service's exit code to the exit code of its driver subpr…
ash211 Feb 23, 2017
6d179a6
Pass the actual iterable from the option to get files (#139)
mccheah Feb 23, 2017
e8359ca
Use a separate class to track components that need to be cleaned up (…
mccheah Feb 23, 2017
a9dced2
Enable unit tests in Travis CI build (#132)
kimoonkim Feb 23, 2017
a124814
Change driver pod's restart policy from OnFailure to Never (#145)
ash211 Feb 23, 2017
06f78b4
Extract SSL configuration handling to a separate class (#123)
mccheah Feb 24, 2017
eb25262
Exclude known flaky tests (#156)
kimoonkim Feb 24, 2017
5587588
Richer logging and better error handling in driver pod watch (#154)
foxish Feb 24, 2017
4029718
Document blocking submit calls (#152)
ash211 Feb 25, 2017
d2c181b
Allow custom annotations on the driver pod. (#163)
mccheah Mar 2, 2017
5bbd6bb
Update client version & minikube version (#142)
foxish Mar 2, 2017
a4092cd
Allow customizing external URI provision + External URI can be set vi…
mccheah Mar 3, 2017
6c42d4b
Remove okhttp from top-level pom (#166)
foxish Mar 3, 2017
bd3deca
Allow setting memory on the driver submission server. (#161)
mccheah Mar 3, 2017
53bf7a1
Add a section for prerequisites (#171)
foxish Mar 4, 2017
b2a5d3d
Add instructions to find master URL (#169)
foxish Mar 4, 2017
387aefb
Propagate exceptions (#172)
mccheah Mar 6, 2017
ca76fbe
Logging for resource deletion (#170)
ash211 Mar 6, 2017
a9f1d6e
Fix pom versions (#178)
foxish Mar 14, 2017
0a5c4d5
Exclude flaky ExternalShuffleServiceSuite from Travis (#185)
kimoonkim Mar 15, 2017
73a0de3
Docs improvements (#176)
foxish Mar 8, 2017
face1f4
Add Apache license to a few files (#175)
ash211 Mar 8, 2017
804d0f8
Adding clarification pre-alpha (#181)
foxish Mar 8, 2017
c5ab210
Allow providing an OAuth token for authenticating against k8s (#180)
mccheah Mar 13, 2017
ffacd1f
Allow the driver pod's credentials to be shipped from the submission …
ash211 Mar 17, 2017
64f3a69
Support using PEM files to configure SSL for driver submission (#173)
mccheah Mar 20, 2017
d6b3234
Update tags on docker images. (#196)
foxish Mar 21, 2017
368664f
Add additional instructions to use release tarball (#198)
foxish Mar 22, 2017
37880e2
Support specify CPU cores for driver pod (#207)
hustcat Mar 30, 2017
3a0b770
Register executors using pod IPs instead of pod host names (#215)
kimoonkim Apr 5, 2017
02ab18e
Upgrade bouncycastle, force bcprov version (#223)
mccheah Apr 10, 2017
d0e27b1
Stop executors cleanly before deleting their pods (#231)
ash211 Apr 13, 2017
9a895a8
Upgrade Kubernetes client to 2.2.13. (#230)
mccheah Apr 14, 2017
88ec1c5
Respect JVM http proxy settings when using Feign. (#228)
mccheah Apr 17, 2017
275510a
Staging server for receiving application dependencies. (#212)
mccheah Apr 21, 2017
b196426
Reorganize packages between v1 work and v2 work (#220)
mccheah Apr 21, 2017
d432dba
Support SSL on the file staging server (#221)
mccheah Apr 21, 2017
7c29732
Driver submission with mounting dependencies from the staging server …
mccheah Apr 25, 2017
2c753de
Enable testing against GCE clusters (#243)
foxish May 2, 2017
c902d69
Update running-on-kubernetes.md (#259)
erikerlandson May 2, 2017
f09bf4a
Build with sbt and fix scalastyle checks. (#241)
lins05 May 3, 2017
68ddcd5
Updating images in doc (#219)
foxish May 3, 2017
c4f17b7
Correct readme links (#266)
johscheuer May 5, 2017
da94d91
edit readme with a working build example command (#254)
erikerlandson May 9, 2017
ecf248c
Fix watcher conditional logic (#269)
erikerlandson May 10, 2017
085fcd1
Dispatch tasks to right executors that have tasks' input HDFS data (#…
kimoonkim May 10, 2017
2af7f05
Add parameter for driver pod name (#258)
hustcat May 16, 2017
20956e7
Dynamic allocation (#272)
foxish May 17, 2017
30597f6
Download remotely-located resources on driver and executor startup vi…
mccheah May 17, 2017
636dbda
Scalastyle fixes (#278)
ash211 May 17, 2017
76c865d
Exit properly when the k8s cluster is not available. (#256)
lins05 May 18, 2017
a6cebcb
Support driver pod kubernetes credentials mounting in V2 submission (…
mccheah May 18, 2017
2458b81
Allow client certificate PEM for resource staging server. (#257)
mccheah May 19, 2017
910865f
Differentiate between URI and SSL settings for in-cluster vs. submiss…
mccheah May 19, 2017
2e5f2cd
Monitor pod status in submission v2. (#283)
mccheah May 22, 2017
cc5eb85
Replace submission v1 with submission v2. (#286)
mccheah May 23, 2017
27b79a2
Added files should be in the working directories. (#294)
mccheah May 23, 2017
4d4819c
Add missing license (#296)
mccheah May 24, 2017
1311de1
Remove some leftover code and fix a constant. (#297)
mccheah May 24, 2017
e9f0a37
Adding restart policy fix for v2 (#303)
foxish May 25, 2017
bd8f6da
Add all dockerfiles to distributions. (#307)
mccheah May 26, 2017
fc5d9c5
Add proxy configuration to retrofit clients. (#301)
mccheah May 26, 2017
51a325c
Fix an HDFS data locality bug in case cluster node names are short ho…
kimoonkim May 26, 2017
b8dc23d
Remove leading slash from Retrofit interface. (#308)
mccheah May 30, 2017
1c8bf38
Use tini in Docker images (#320)
mccheah May 31, 2017
2cbd6fc
Allow custom executor labels and annotations (#321)
mccheah Jun 1, 2017
5be5938
Dynamic allocation, cleanup in case of driver death (#319)
foxish Jun 2, 2017
6610cd3
Fix client to await the driver pod (#325)
kimoonkim Jun 2, 2017
04ff1d8
Clean up resources that are not used by pods. (#305)
mccheah Jun 3, 2017
c312567
Copy yaml files when making distribution (#327)
tnachen Jun 4, 2017
4f6a4d7
Allow docker image pull policy to be configurable (#328)
tnachen Jun 5, 2017
f208d68
POM update 0.2.0 (#329)
foxish Jun 5, 2017
9cdccbe
Update tags (#332)
foxish Jun 6, 2017
5a41e1e
nicer readme (#333)
foxish Jun 6, 2017
069bd04
Support specify CPU cores and Memory restricts for driver (#340)
duyanghao Jun 8, 2017
4a01baf
Generate the application ID label irrespective of app name. (#331)
mccheah Jun 8, 2017
e763252
Create base-image and minimize layer count (#324)
johscheuer Jun 8, 2017
9f2ce8e
Added log4j config for k8s unit tests. (#314)
lins05 Jun 9, 2017
0010a57
Use node affinity to launch executors on preferred nodes benefitting …
kimoonkim Jun 14, 2017
efb5081
Fix sbt build. (#344)
mccheah Jun 14, 2017
af7297e
New API for custom labels and annotations. (#346)
mccheah Jun 14, 2017
38287f6
Allow spark driver find shuffle pods in specified namespace (#357)
Jun 22, 2017
168ef0a
Bypass init-containers when possible (#348)
chenchun Jun 23, 2017
cdf6c36
Config for hard cpu limit on pods; default unlimited (#356)
Jun 23, 2017
9dc5eed
Allow number of executor cores to have fractional values (#361)
liyinan926 Jun 29, 2017
442490a
Python Bindings for launching PySpark Jobs from the JVM (#364)
ifilonenko Jul 3, 2017
fd30c5d
Submission client redesign to use a step-based builder pattern (#365)
mccheah Jul 14, 2017
f46443e
Add implicit conversions to imports. (#374)
mccheah Jul 17, 2017
42f578f
Fix import order and scalastyle (#375)
ash211 Jul 17, 2017
2c00103
fix submit job errors (#376)
Jul 18, 2017
e086f4d
Add node selectors for driver and executor pods (#355)
Jul 18, 2017
7d0fa56
Retry binding server to random port in the resource staging server te…
mccheah Jul 19, 2017
4ffb4d6
set RestartPolicy=Never for executor (#367)
Jul 19, 2017
e3b2360
Read classpath entries from SPARK_EXTRA_CLASSPATH on executors. (#383)
mccheah Jul 20, 2017
15e13f4
Changes to support executor recovery behavior during static allocatio…
varunkatta Jul 21, 2017
823bf0e
Update pom to v0.3.0 of spark-kubernetes
foxish Jul 21, 2017
436482e
Fix: changed signature of ExternalShuffleClient
foxish Jul 24, 2017
beb1361
Updated poms
foxish Jul 24, 2017
a8330eb
Merge pull request #388 from apache-spark-on-k8s/branch-2.2-kubernetes-g
foxish Jul 25, 2017
64f3ddd
Add missing code blocks (#403)
erikerlandson Jul 28, 2017
bce9b77
Add an entrypoint.sh script to add a passwd entry if one does not exi…
erikerlandson Jul 28, 2017
8ecff61
revert my COPY mods
erikerlandson Jul 29, 2017
702a8f6
Fix bug with null arguments
ifilonenko Aug 1, 2017
2c5d784
Merge pull request #407 from bloomberg/python-testing
foxish Aug 3, 2017
fa67455
Merge pull request #404 from erikerlandson/anonymous-uids
foxish Aug 3, 2017
5fdaa7f
Exclude com.sun.jersey from docker-minimal-bundle. (#420)
mccheah Aug 8, 2017
e3cfaa4
Flag-guard expensive DNS lookup of cluster node full names, part of H…
kimoonkim Aug 8, 2017
bd50627
fixes #389 - increase SparkReadinessWatcher wait time (#419)
erikerlandson Aug 8, 2017
24cd9ee
Initial architecture documentation. (#401)
mccheah Aug 8, 2017
372ae41
Allow configuration to set environment variables on driver and execut…
Aug 9, 2017
410dc9c
version 2.2.0-k8s-0.3.0
erikerlandson Aug 9, 2017
737abdc
bump to 2.2.0-k8s-0.4.0-SNAPSHOT
erikerlandson Aug 9, 2017
a46b4a3
Revert "bump to 2.2.0-k8s-0.4.0-SNAPSHOT"
erikerlandson Aug 10, 2017
ff601a3
Revert "version 2.2.0-k8s-0.3.0"
erikerlandson Aug 10, 2017
19f49d0
version 2.2.0-k8s-0.3.0
erikerlandson Aug 10, 2017
982760c
bump to 2.2.0-k8s-0.4.0-SNAPSHOT
erikerlandson Aug 10, 2017
cb645ca
Update external shuffle service docs
foxish Aug 14, 2017
437eb89
Updated with documentation (#430)
foxish Aug 14, 2017
6ab02e2
Merge pull request #431 from apache-spark-on-k8s/foxish-patch-2
foxish Aug 14, 2017
3b3aeb7
Link to architecture docs (#432)
foxish Aug 14, 2017
6e1d69e
Removed deprecated option from pom (#433)
foxish Aug 14, 2017
c457f10
Support HDFS rack locality (#350)
kimoonkim Aug 17, 2017
4a322ad
Fix license check (#442)
ash211 Aug 18, 2017
f8cf9db
Scalastyle (#446)
ash211 Aug 21, 2017
455317d
Use a secret to mount small files in driver and executors. (#437)
mccheah Aug 21, 2017
58cebd1
Updated developer doc to include an install step for first time compila…
liyinan926 Aug 21, 2017
e44d81a
Support service account override
kimoonkim Aug 22, 2017
f7b5820
Use a list of environment variables for JVM options. (#444)
mccheah Aug 22, 2017
7959fc5
Fix indentation
kimoonkim Aug 22, 2017
2cb2074
Support executor java options. (#445)
mccheah Aug 23, 2017
0c160f5
Bumping versions to v2.2.0-kubernetes-0.3.0
Aug 24, 2017
d6e922d
Properly wrap getOrElse in a tuple (#458)
mccheah Aug 24, 2017
dca9b04
Merge pull request #460 from sahilprasad/bump-shuffle-version
liyinan926 Aug 24, 2017
e600a07
Merge pull request #451 from kimoonkim/override-service-account
foxish Aug 24, 2017
6177bf8
Add command echoing for better command debugging (#462)
erikerlandson Aug 25, 2017
c6bc19d
Fix conversion from GB to MiB (#470)
ash211 Aug 30, 2017
1e63a60
spark-examples jar filename misses k8s-0.3.0 (#476)
silenceshell Aug 31, 2017
d710563
typo (#474)
silenceshell Aug 31, 2017
728ba0a
Set ENV_DRIVER_MEMORY to memory instead of memory+overhead (#475)
duyanghao Aug 31, 2017
bc845c3
Use paths to read small local files instead of URIs (#477)
mccheah Sep 4, 2017
fa02fb1
Move executor pod construction to a separate class. (#452)
mccheah Sep 6, 2017
8b63aad
Added configuration properties to inject arbitrary secrets into the d…
liyinan926 Sep 7, 2017
6d7d798
Extract more of the shuffle management to a different class. (#454)
mccheah Sep 7, 2017
6053455
Unit Tests for KubernetesClusterSchedulerBackend (#459)
mccheah Sep 8, 2017
6cebfb6
Use a headless service to give a hostname to the driver. (#483)
mccheah Sep 8, 2017
e5838c1
Code enhancement: Replaced explicit synchronized access to a hashmap …
varunkatta Sep 15, 2017
2cb0b04
docs: Fix path to spark-base Dockerfile (#493)
somcsel Sep 15, 2017
52fe7f5
Improve the image building workflow (#488)
foxish Sep 16, 2017
8a0f485
Fail submission if submitter-local files are provided without resourc…
Sep 16, 2017
7477cbe
Rename package to k8s (#497)
foxish Sep 21, 2017
3eb04bb
Added reference YAML files for RBAC configs for driver and shuffle se…
liyinan926 Sep 22, 2017
3c1a16a
Removing deprecated configuration (#503)
liyinan926 Sep 22, 2017
562f301
Update poms for 2.2 release 0.4.0 (#508)
foxish Sep 25, 2017
3c7dec5
Update POM to 0.5.0-SNAPSHOT (#512)
foxish Sep 26, 2017
887fdce
Add unit-testing for executorpodfactory (#491)
foxish Oct 10, 2017
49932d6
Mount emptyDir volumes for temporary directories on executors in stat…
mccheah Oct 16, 2017
f94499b
SparkR Support (#507)
ifilonenko Oct 18, 2017
0abf0b9
Use the new initContainers field instead of the deprecated annotation…
liyinan926 Oct 20, 2017
6b1caca
Use the driver pod IP address for spark.driver.bindAddress (#533)
liyinan926 Oct 26, 2017
b008be3
Updated poms to 0.5.0 for the new 2.2 release (#536)
liyinan926 Oct 27, 2017
8f73508
Add quotes around $SPARK_CLASSPATH in Dockerfile java commands (#541)
tmckayus Nov 7, 2017
3ff2cbb
Spark Submit changes and test (#542)
foxish Nov 8, 2017
5fd1304
Update docker images for rss and shuffle service (#553)
liyinan926 Nov 17, 2017
15a333c
Do not add the MountSmallLocalFilesStep when there's no submitter loc…
liyinan926 Nov 28, 2017
0612195
Allow user-specified environment variables and secrets in the init-co…
liyinan926 Dec 4, 2017
246b885
Basic Secure HDFS Support [514] (#540)
ifilonenko Dec 12, 2017
6428bb9
Append HADOOP_CONF_DIR to SPARK_CLASS in driver/executor Dockerfiles …
chenchun Dec 20, 2017
6d724a9
Missed an && (#587)
coderanger Jan 2, 2018
d7dd259
Refactor for Hadoop conf related code (#596)
hex108 Jan 9, 2018
12d590c
Avoids adding duplicated secret volumes when init-container is used (…
liyinan926 Jan 12, 2018
90a204c
Create ISSUE_TEMPLATE.md (#608)
erikerlandson Feb 3, 2018
a0117ea
Add message to redirect PRs upstream if possible (#607)
erikerlandson Feb 4, 2018
7b8c9f5
remove camel case naming in kerberos secret names (#612)
Feb 9, 2018
a218f3d
Add deprecation notice to README.md (#631)
mccheah Jun 8, 2018
72cf35d
Add an archival notice to README
erikerlandson Jan 8, 2020
13 changes: 13 additions & 0 deletions .github/ISSUE_TEMPLATE.md
@@ -0,0 +1,13 @@
## Before you submit an issue
The Kubernetes scheduler backend is currently being upstreamed to the main [Apache Spark project](https://github.com/apache/spark).
We are attempting to re-direct as much new development as possible to the upstream.
Please consider whether your issue can be submitted against the Apache Spark project,
and submit there as an Apache Spark JIRA, if possible.

If you have any questions about whether an issue should be submitted upstream or against this fork,
please feel free to reach out on the following channels:

* Apache Spark developer mailing list: dev@spark.apache.org
* Apache Spark [JIRA](https://issues.apache.org/jira/)
* Big Data SIG [slack channel](https://kubernetes.slack.com/)
* Regular Big Data SIG [meetings](https://github.com/kubernetes/community/tree/master/sig-big-data)
13 changes: 13 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE
@@ -1,3 +1,16 @@
## Before you submit a Pull Request
The Kubernetes scheduler backend is currently being upstreamed to the main [Apache Spark project](https://github.com/apache/spark).
We are attempting to re-direct as much new development as possible to the upstream.
Please consider whether your pull request can be submitted against the Apache Spark project, and submit there if possible.

If you have any questions about whether a PR should be submitted upstream or against this fork,
please feel free to reach out on the following channels:

* Apache Spark developer mailing list: dev@spark.apache.org
* Apache Spark [JIRA](https://issues.apache.org/jira/)
* Big Data SIG [slack channel](https://kubernetes.slack.com/)
* Regular Big Data SIG [meetings](https://github.com/kubernetes/community/tree/master/sig-big-data)

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)
21 changes: 17 additions & 4 deletions .travis.yml
@@ -25,10 +25,22 @@
sudo: required
dist: trusty

# 2. Choose language and target JDKs for parallel builds.
# 2. Choose language, target JDK and env's for parallel builds.
language: java
jdk:
- oraclejdk8
env: # Used by the install section below.
# Configure the unit test build for spark core and kubernetes modules,
# while excluding some flaky unit tests using a regex pattern.
- PHASE=test \
PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes" \
MODULES="-pl core,resource-managers/kubernetes/core -am" \
ARGS="-Dtest=none -Dsuffixes='^org\.apache\.spark\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$|rdd\.LocalCheckpointSuite$|deploy\.SparkSubmitSuite$|deploy\.StandaloneDynamicAllocationSuite$).*'"
# Configure the full build.
- PHASE=install \
PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver" \
MODULES="" \
ARGS="-T 4 -q -DskipTests"

# 3. Setup cache directory for SBT and Maven.
cache:
@@ -40,11 +52,12 @@ cache:
notifications:
email: false

# 5. Run maven install before running lint-java.
# 5. Run maven build before running lints.
install:
- export MAVEN_SKIP_RC=1
- build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install
- build/mvn ${PHASE} ${PROFILES} ${MODULES} ${ARGS}

# 6. Run lint-java.
# 6. Run lints.
script:
- dev/lint-java
- dev/lint-scala
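
The `-Dsuffixes` expression in the first `env` entry uses a negative lookahead to leave out the listed flaky suites. The following is a minimal sketch of how that expression behaves under `java.util.regex`. The class and method names are hypothetical, and it is an assumption (not confirmed by this diff) that ScalaTest applies the pattern to fully qualified suite names in the same way.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that mirrors the -Dsuffixes expression from .travis.yml. */
final class FlakySuiteFilter {
    // Same expression as in the Travis env entry, with backslashes escaped for Java.
    static final Pattern SUITES = Pattern.compile(
        "^org\\.apache\\.spark\\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$"
            + "|rdd\\.LocalCheckpointSuite$|deploy\\.SparkSubmitSuite$"
            + "|deploy\\.StandaloneDynamicAllocationSuite$).*");

    /** Returns true when the fully qualified suite name is selected by the pattern. */
    static boolean shouldRun(String suiteName) {
        return SUITES.matcher(suiteName).find();
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.spark.SortShuffleSuite",
            "org.apache.spark.deploy.SparkSubmitSuite",
            "org.apache.spark.rdd.RDDSuite",
        };
        for (String name : names) {
            System.out.println(name + " -> " + (shouldRun(name) ? "run" : "skip"));
        }
    }
}
```

Note that the first alternative, `ExternalShuffleServiceSuite`, has no trailing `$`, so under these semantics any name directly under `org.apache.spark.` that merely starts with that text is excluded as well.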
48 changes: 48 additions & 0 deletions README.md
@@ -1,3 +1,51 @@
**ARCHIVED** This repository is being archived, to prevent any future confusion: All development on the Kubernetes scheduler back-end for Apache Spark is now upstream at https://spark.apache.org/ and https://github.com/apache/spark/

**DEPRECATED**. Work on this fork is discontinued. Further development is continuing on the mainline implementation of Apache Spark: https://github.com/apache/spark.

You can run Spark on Kubernetes using Spark 2.3. Some features from this work need to be ported to mainline. If a feature is missing, please check https://issues.apache.org/jira/projects/SPARK/issues to see if we're tracking that work, and if we are not, please file a JIRA ticket indicating the missing behavior.

All other bugs and feature requests should either be proposed through JIRA or sent to [email protected] or [email protected].

# Apache Spark On Kubernetes

This repository, located at https://github.com/apache-spark-on-k8s/spark, contains a fork of Apache Spark that enables running Spark jobs natively on a Kubernetes cluster.

## What is this?

This is a collaboratively maintained project working on [SPARK-18278](https://issues.apache.org/jira/browse/SPARK-18278). The goal is to bring native support for Spark to use Kubernetes as a cluster manager, in a fully supported way on par with the Spark Standalone, Mesos, and Apache YARN cluster managers.

## Getting Started

- [Usage guide](https://apache-spark-on-k8s.github.io/userdocs/) shows how to run the code
- [Development docs](resource-managers/kubernetes/README.md) shows how to get set up for development
- [Architecture docs](resource-managers/kubernetes/architecture-docs/) shows the high level architecture of Spark on Kubernetes
- Code is primarily located in the [resource-managers/kubernetes](resource-managers/kubernetes) folder

## Why does this fork exist?

Adding native integration for a new cluster manager is a large undertaking. If poorly executed, it could introduce bugs into Spark when run on other cluster managers, cause release blockers slowing down the overall Spark project, or require hotfixes which divert attention away from development towards managing additional releases. Any work this deep inside Spark needs to be done carefully to minimize the risk of those negative externalities.

At the same time, an increasing number of people from various companies and organizations desire to work together to natively run Spark on Kubernetes. The group needs a code repository, communication forum, issue tracking, and continuous integration, all in order to work together effectively on an open source product.

We've been asked by an Apache Spark Committer to work outside of the Apache infrastructure for a short period of time to allow this feature to be hardened and improved without creating risk for Apache Spark. The aim is to rapidly bring it to the point where it can be brought into the mainline Apache Spark repository for continued development within the Apache umbrella. If all goes well, this should be a short-lived fork rather than a long-lived one.

## Who are we?

This is a collaborative effort by several folks from different companies who are interested in seeing this feature be successful. Companies active in this project include (alphabetically):

- Bloomberg
- Google
- Haiwen
- Hyperpilot
- Intel
- Palantir
- Pepperdata
- Red Hat

--------------------

(original README below)

# Apache Spark

Spark is a fast and general cluster computing system for Big Data. It provides
12 changes: 11 additions & 1 deletion assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -148,6 +148,16 @@
</dependency>
</dependencies>
</profile>
<profile>
<id>kubernetes</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-kubernetes_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>hive</id>
<dependencies>
2 changes: 1 addition & 1 deletion common/network-common/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>

@@ -0,0 +1,29 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.network.shuffle.kubernetes;

import java.io.Closeable;
import java.io.IOException;

public interface KubernetesExternalShuffleClient extends Closeable {

void init(String appId);

void registerDriverWithShuffleService(String host, int port)
throws IOException, InterruptedException;
}
@@ -0,0 +1,77 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.network.shuffle.kubernetes;

import org.apache.spark.network.client.RpcResponseCallback;
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.nio.ByteBuffer;

/**
* A client for talking to the external shuffle service in Kubernetes cluster mode.
*
* This is used by each Spark executor to register with a corresponding external
* shuffle service on the cluster. The purpose is to clean up shuffle files
* reliably if the application exits unexpectedly.
*/
public class KubernetesExternalShuffleClientImpl
extends ExternalShuffleClient implements KubernetesExternalShuffleClient {

private static final Logger logger = LoggerFactory
.getLogger(KubernetesExternalShuffleClientImpl.class);

/**
* Creates a Kubernetes external shuffle client that wraps the {@link ExternalShuffleClient}.
* Please refer to docs on {@link ExternalShuffleClient} for more information.
*/
public KubernetesExternalShuffleClientImpl(
TransportConf conf,
SecretKeyHolder secretKeyHolder,
boolean saslEnabled) {
super(conf, secretKeyHolder, saslEnabled);
}

@Override
public void registerDriverWithShuffleService(String host, int port)
throws IOException, InterruptedException {
checkInit();
ByteBuffer registerDriver = new RegisterDriver(appId, 0).toByteBuffer();
TransportClient client = clientFactory.createClient(host, port);
client.sendRpc(registerDriver, new RegisterDriverCallback());
}

private class RegisterDriverCallback implements RpcResponseCallback {
@Override
public void onSuccess(ByteBuffer response) {
logger.info("Successfully registered app " + appId + " with external shuffle service.");
}

@Override
public void onFailure(Throwable e) {
logger.warn("Unable to register app " + appId + " with external shuffle service. " +
"Please manually remove shuffle data after driver exit. Error: " + e);
}
}
}
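
The registration in this class is fire-and-forget: `registerDriverWithShuffleService` sends the RPC and returns, and the outcome is only reported through the callback. Below is a self-contained sketch of that pattern using stand-in types. `ResponseCallback`, `FakeTransport`, and `ShuffleRegistrar` are hypothetical and are not Spark classes, and the init guard is only assumed to resemble what `checkInit()` does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an RPC response callback. */
interface ResponseCallback {
    void onSuccess(ByteBuffer response);
    void onFailure(Throwable e);
}

/** Hypothetical transport that completes the callback immediately. */
final class FakeTransport {
    private final boolean healthy;

    FakeTransport(boolean healthy) { this.healthy = healthy; }

    void sendRpc(ByteBuffer message, ResponseCallback callback) {
        if (healthy) {
            callback.onSuccess(ByteBuffer.allocate(0));
        } else {
            callback.onFailure(new java.io.IOException("shuffle service unreachable"));
        }
    }
}

/** Mirrors the shape of the client: guard on init, send, report the result via callback. */
final class ShuffleRegistrar {
    final List<String> log = new ArrayList<>();
    private String appId;

    void init(String appId) { this.appId = appId; }

    void registerDriver(FakeTransport transport) {
        if (appId == null) {
            throw new IllegalStateException("init(appId) must be called first");
        }
        ByteBuffer message = ByteBuffer.wrap(appId.getBytes(StandardCharsets.UTF_8));
        transport.sendRpc(message, new ResponseCallback() {
            @Override
            public void onSuccess(ByteBuffer response) {
                log.add("registered " + appId);
            }

            @Override
            public void onFailure(Throwable e) {
                // As in the class above, a failure is recorded rather than rethrown.
                log.add("failed " + appId + ": " + e.getMessage());
            }
        });
    }
}
```

Because failures surface only in the callback, a caller of this kind of API cannot rely on an exception from the registration call to detect an unreachable service; the warning in `onFailure` above is the only signal.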
@@ -32,7 +32,7 @@
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;

/**
@@ -23,7 +23,6 @@
import io.netty.buffer.Unpooled;

import org.apache.spark.network.protocol.Encodable;
import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
import org.apache.spark.network.shuffle.protocol.mesos.ShuffleServiceHeartbeat;

/**
@@ -15,19 +15,18 @@
* limitations under the License.
*/

package org.apache.spark.network.shuffle.protocol.mesos;
package org.apache.spark.network.shuffle.protocol;

import com.google.common.base.Objects;
import io.netty.buffer.ByteBuf;

import org.apache.spark.network.protocol.Encoders;
import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;

// Needed by ScalaDoc. See SPARK-7726
import static org.apache.spark.network.shuffle.protocol.BlockTransferMessage.Type;

/**
* A message sent from the driver to register with the MesosExternalShuffleService.
* A message sent from the driver to register with an ExternalShuffleService.
*/
public class RegisterDriver extends BlockTransferMessage {
private final String appId;
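
`RegisterDriver` is converted to bytes with `toByteBuffer()` before the client sends it. Here is a rough sketch of what a length-prefixed encoding of such a message could look like. The `DriverRegistration` class, the tag value, the field layout, and the reading of the second constructor argument as a heartbeat timeout are all assumptions for illustration; none of them is taken from Spark's actual wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical message carrying an app id and a timeout, encoded with a length prefix. */
final class DriverRegistration {
    static final byte TYPE_TAG = 42; // arbitrary illustrative tag, not Spark's

    final String appId;
    final long heartbeatTimeoutMs;

    DriverRegistration(String appId, long heartbeatTimeoutMs) {
        this.appId = appId;
        this.heartbeatTimeoutMs = heartbeatTimeoutMs;
    }

    /** Layout: 1-byte tag, 4-byte length, UTF-8 app id, 8-byte timeout. */
    ByteBuffer toByteBuffer() {
        byte[] id = appId.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + id.length + 8);
        buf.put(TYPE_TAG).putInt(id.length).put(id).putLong(heartbeatTimeoutMs);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    /** Reads the fields back in the same order they were written. */
    static DriverRegistration fromByteBuffer(ByteBuffer buf) {
        byte tag = buf.get();
        if (tag != TYPE_TAG) {
            throw new IllegalArgumentException("unexpected message tag: " + tag);
        }
        byte[] id = new byte[buf.getInt()];
        buf.get(id);
        return new DriverRegistration(new String(id, StandardCharsets.UTF_8), buf.getLong());
    }
}
```

A leading type tag of this kind is what lets one decoder serve several message classes, which fits with this diff moving `RegisterDriver` out of the `mesos` package so that more than one cluster manager can share it.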
2 changes: 1 addition & 1 deletion common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/sketch/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/tags/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/unsafe/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.2.0</version>
<version>2.2.0-k8s-0.5.0</version>
<relativePath>../../pom.xml</relativePath>
</parent>
