Closed

Changes from all commits (1172 commits)
daf54f0
[SPARK-11860][PYSAPRK][DOCUMENTATION] Invalid argument specification …
zjffdu Nov 25, 2015
0132b47
[SPARK-10666][SPARK-6880][CORE] Use properties from ActiveJob associa…
markhamstra Nov 25, 2015
c319c4a
[SPARK-11956][CORE] Fix a few bugs in network lib-based file transfer.
Nov 25, 2015
a54c1b6
[SPARK-11984][SQL][PYTHON] Fix typos in doc for pivot for scala and p…
felixcheung Nov 25, 2015
82aefee
[SPARK-11974][CORE] Not all the temp dirs had been deleted when the J…
pzzs Nov 25, 2015
720a5d1
[SPARK-11969] [SQL] [PYSPARK] visualization of SQL query for pyspark
Nov 25, 2015
3aa7ebb
[MINOR] Remove unnecessary spaces in `include_example.rb`
yu-iskw Nov 25, 2015
d441911
[DOCUMENTATION] Fix minor doc error
zjffdu Nov 25, 2015
6d45ed8
[SPARK-10864][WEB UI] app name is hidden if window is resized
ajbozarth Nov 25, 2015
29556bf
[SPARK-11880][WINDOWS][SPARK SUBMIT] bin/load-spark-env.cmd loads spa…
wangt Nov 25, 2015
fd0d48b
[SPARK-10558][CORE] Fix wrong executor state in Master
jerryshao Nov 25, 2015
a7b1fce
[SPARK-11935][PYSPARK] Send the Python exceptions in TransformFunctio…
zsxwing Nov 25, 2015
94164a2
[SPARK-11866][NETWORK][CORE] Make sure timed out RPCs are cleaned up.
Nov 25, 2015
5e58288
Fix Aggregator documentation (rename present to finish).
rxin Nov 25, 2015
7d88f53
[SPARK-11983][SQL] remove all unused codegen fallback trait
adrian-wang Nov 25, 2015
01551c5
[SPARK-11206] Support SQL UI on the history server
carsonwang Nov 25, 2015
dd50d76
[SPARK-12003] [SQL] remove the prefix for name after expanded star
Nov 26, 2015
54b1c89
[SPARK-11980][SPARK-10621][SQL] Fix json_tuple and add test cases for
gatorsmile Nov 26, 2015
eb846ce
[SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThr…
zsxwing Nov 26, 2015
9799433
[SPARK-11973] [SQL] push filter through aggregation with alias and li…
Nov 26, 2015
fd196d9
[SPARK-12005][SQL] Work around VerifyError in HyperLogLogPlusPlus.
Nov 26, 2015
557167d
[SPARK-11863][SQL] Unable to resolve order by if it contains mixture …
dilipbiswal Nov 26, 2015
3e0102c
[SPARK-11998][SQL][TEST-HADOOP2.0] When downloading Hadoop artifacts …
yhuai Nov 27, 2015
343f9d7
[SPARK-11973][SQL] Improve optimizer code readability.
rxin Nov 27, 2015
f3c3dd3
doc typo: "classificaion" -> "classification"
muxator Nov 27, 2015
4b7a494
[SPARK-11996][CORE] Make the executor thread dump work again
zsxwing Nov 27, 2015
26bc2ee
[SPARK-12011][SQL] Stddev/Variance etc should support columnName as a…
yanboliang Nov 27, 2015
2534c7c
[SPARK-11881][SQL] Fix for postgresql fetchsize > 0
mariusvniekerk Nov 27, 2015
c88c760
[SPARK-11917][PYSPARK] Add SQLContext#dropTempTable to PySpark
zjffdu Nov 27, 2015
6ed3313
[SPARK-11778][SQL] add regression test
Nov 27, 2015
c342ef1
[SPARK-11991] fixes
Nov 27, 2015
a84c0ef
Fix style violation for b63938a8b04
rxin Nov 27, 2015
8b9e7c0
[SPARK-11997] [SQL] NPE when save a DataFrame as parquet and partitio…
dilipbiswal Nov 27, 2015
fe0e447
[SPARK-12025][SPARKR] Rename some window rank function names for SparkR
yanboliang Nov 27, 2015
a6ee8a0
[SPARK-12021][STREAMING][TESTS] Fix the potential dead-lock in Stream…
zsxwing Nov 27, 2015
67b921e
[SPARK-12020][TESTS][TEST-HADOOP2.0] PR builder cannot trigger hadoop…
yhuai Nov 27, 2015
131352b
[SPARK-12028] [SQL] get_json_object returns an incorrect result when …
gatorsmile Nov 28, 2015
5e4c4a8
[SPARK-12029][SPARKR] Improve column functions signature, param check…
felixcheung Nov 29, 2015
3392386
[SPARK-9319][SPARKR] Add support for setting column names, types
felixcheung Nov 29, 2015
187e403
[SPARK-11781][SPARKR] SparkR has problem in inferring type of raw type.
Nov 29, 2015
6ac2a75
[SPARK-12024][SQL] More efficient multi-column counting.
hvanhovell Nov 29, 2015
ce0edf9
[SPARK-12039] [SQL] Ignore HiveSparkSubmitSuite's "SPARK-9757 Persist…
yhuai Nov 30, 2015
ec7e453
[SPARK-11859][MESOS] SparkContext accepts invalid Master URLs in the …
toddwan Nov 30, 2015
5b38164
[MINOR][BUILD] Changed the comment to reflect the plugin project is t…
ScrapCodes Nov 30, 2015
68ff33c
[DOC] Explicitly state that top maintains the order of elements
mineo Nov 30, 2015
abe08fb
[SPARK-12023][BUILD] Fix warnings while packaging spark with maven.
ScrapCodes Nov 30, 2015
919153c
[SPARK-11989][SQL] Only use commit in JDBC data source if the underly…
CK50 Nov 30, 2015
e54151b
[SPARK-11700] [SQL] Remove thread local SQLContext in SparkPlan
Nov 30, 2015
b6860b9
[SPARK-11982] [SQL] improve performance of cartesian product
Nov 30, 2015
710e445
[MINOR][DOCS] fixed list display in ml-ensembles
BenFradet Nov 30, 2015
3de5d82
Revert "[SPARK-11206] Support SQL UI on the history server"
JoshRosen Nov 30, 2015
fdfd9a1
[SPARK-12053][CORE] EventLoggingListener.getLogPath needs 4 parameters
chutium Nov 30, 2015
db5f114
[SPARK-11689][ML] Add user guide and example code for LDA under spark.ml
hhbyyh Nov 30, 2015
46d4bdb
[SPARK-11975][ML] Remove duplicate mllib example (DT/RF/GBT in Java/P…
yanboliang Nov 30, 2015
ca56fe8
[SPARK-11960][MLLIB][DOC] User guide for streaming tests
feynmanliang Nov 30, 2015
7d9a876
fix Maven build
davies Nov 30, 2015
95aa0ee
[SPARK-12058][HOTFIX] Disable KinesisStreamTests
zsxwing Dec 1, 2015
4445bf0
[SPARK-12000] Fix API doc generation issues
JoshRosen Dec 1, 2015
776d6cd
[SPARK-12035] Add more debug information in include_example tag of Je…
yinxusen Dec 1, 2015
378f137
[SPARK-12037][CORE] initialize heartbeatReceiverRef before calling st…
CodingCat Dec 1, 2015
24bfd58
[SPARK-12007][NETWORK] Avoid copies in the network lib's RPC layer.
Dec 1, 2015
c4d1ae0
[SPARK-12049][CORE] User JVM shutdown hook can cause deadlock at shut…
srowen Dec 1, 2015
39b3359
[HOTFIX][SPARK-12000] Add missing quotes in Jekyll API docs plugin.
JoshRosen Dec 1, 2015
c461877
[SPARK-12018][SQL] Refactor common subexpression elimination code
viirya Dec 1, 2015
f693dae
[SPARK-11898][MLLIB] Use broadcast for the global tables in Word2Vec
hhbyyh Dec 1, 2015
e686df1
[SPARK-11949][SQL] Set field nullable property for GroupingSets to ge…
viirya Dec 1, 2015
948c6f1
[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.seria…
zsxwing Dec 1, 2015
2e9a943
[SPARK-12046][DOC] Fixes various ScalaDoc/JavaDoc issues
liancheng Dec 1, 2015
163e38e
[SPARK-12068][SQL] use a single column in Dataset.groupBy and count w…
cloud-fan Dec 1, 2015
9cf3e2e
[SPARK-11856][SQL] add type cast if the real type is different but co…
cloud-fan Dec 1, 2015
01ab37d
[SPARK-11954][SQL] Encoder for JavaBeans
cloud-fan Dec 1, 2015
efd9661
[SPARK-11905][SQL] Support Persist/Cache and Unpersist in Dataset APIs
gatorsmile Dec 1, 2015
6f8fb6f
[SPARK-11821] Propagate Kerberos keytab for all environments
woj-i Dec 1, 2015
a46c68b
[SPARK-12065] Upgrade Tachyon from 0.8.1 to 0.8.2
JoshRosen Dec 1, 2015
7f4be9f
[SPARK-12030] Fix Platform.copyMemory to handle overlapping regions.
nongli Dec 1, 2015
3ff48ba
[SPARK-12004] Preserve the RDD partitioner through RDD checkpointing
tdas Dec 1, 2015
e61fe19
Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstan…
zsxwing Dec 1, 2015
cd6f330
[SPARK-11961][DOC] Add docs of ChiSqSelector
yinxusen Dec 1, 2015
09fd767
[SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint…
jerryshao Dec 1, 2015
8f0ce8b
[SPARK-12075][SQL] Speed up HiveComparisionTest by avoiding / speedin…
JoshRosen Dec 1, 2015
eb2d13a
[SPARK-11328][SQL] Improve error message when hitting this issue
nongli Dec 1, 2015
90bcb15
[SPARK-11788][SQL] surround timestamp/date value with quotes in JDBC …
Dec 1, 2015
7606bb6
[SPARK-11352][SQL] Escape */ in the generated comments.
yhuai Dec 2, 2015
8871b24
[SPARK-11596][SQL] In TreeNode's argString, if a TreeNode is not a ch…
yhuai Dec 2, 2015
f4f27a5
[SPARK-8414] Ensure context cleaner periodic cleanups
Dec 2, 2015
170a8d5
[SPARK-12081] Make unified memory manager work with small heaps
Dec 2, 2015
b70c651
[SPARK-12077][SQL] change the default plan for single distinct
Dec 2, 2015
041df50
[SPARK-12087][STREAMING] Create new JobConf for every batch in saveAs…
tdas Dec 2, 2015
f5ef4d1
[SPARK-11949][SQL] Check bitmasks to set nullable property
viirya Dec 2, 2015
1bca918
[SPARK-12090] [PYSPARK] consider shuffle in coalesce()
Dec 2, 2015
6527885
[SPARK-3580][CORE] Add Consistent Method To Get Number of RDD Partiti…
Dec 2, 2015
1117281
[SPARK-12094][SQL] Prettier tree string for TreeNode
liancheng Dec 2, 2015
6876c50
[SPARK-12001] Allow partially-stopped StreamingContext to be complete…
JoshRosen Dec 2, 2015
2e3c994
[SPARK-10266][DOCUMENTATION, ML] Fixed @Since annotation for ml.tunning
yu-iskw Dec 2, 2015
7148fb9
[SPARK-12093][SQL] Fix the error of comment in DDLParser
watermen Dec 3, 2015
7f4868d
[SPARK-12000] do not specify arg types when reference a method in Sca…
mengxr Dec 3, 2015
73b7a6a
[SPARK-12082][FLAKY-TEST] Increase timeouts in NettyBlockTransferSecu…
JoshRosen Dec 3, 2015
be50460
[SPARK-12109][SQL] Expressions's simpleString should delegate to its …
yhuai Dec 3, 2015
4eefdf6
[SPARK-12088][SQL] check connection.isClosed before calling connection…
Dec 3, 2015
6d5a6f9
[DOCUMENTATION][MLLIB] typo in mllib doc
zjffdu Dec 3, 2015
c1c5f56
[DOCUMENTATION][KAFKA] fix typo in kafka/OffsetRange.scala
microwishing Dec 3, 2015
a67fbee
[SPARK-12116][SPARKR][DOCS] document how to workaround function name …
felixcheung Dec 3, 2015
3ed25b4
[SPARK-11314][YARN] add service API and test service for Yarn Cluster…
steveloughran Dec 3, 2015
2bb1336
[SPARK-12059][CORE] Avoid assertion error when unexpected state trans…
jerryshao Dec 3, 2015
582cc93
[SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worke…
zsxwing Dec 3, 2015
c3c92f8
[SPARK-12108] Make event logs smaller
Dec 3, 2015
6d6be99
[MINOR][ML] Use coefficients replace weights
yanboliang Dec 3, 2015
1c86f0f
[SPARK-12107][EC2] Update spark-ec2 versions
nchammas Dec 3, 2015
64d6254
[FLAKY-TEST-FIX][STREAMING][TEST] Make sure StreamingContexts are shu…
tdas Dec 3, 2015
e69bb9f
[SPARK-12019][SPARKR] Support character vector for sparkR.init(), che…
felixcheung Dec 3, 2015
2d3b873
[SPARK-12056][CORE] Create a TaskAttemptContext only after calling se…
Dec 4, 2015
854bff1
[SPARK-11206] Support SQL UI on the history server (resubmit)
carsonwang Dec 4, 2015
99d89f7
[SPARK-12104][SPARKR] collect() does not handle multiple columns with…
Dec 4, 2015
ff3cb64
[SPARK-12122][STREAMING] Prevent batches from being submitted twice a…
tdas Dec 4, 2015
a12d099
Add links howto to setup IDEs for developing spark
kaklakariada Dec 4, 2015
9a45fb6
[SPARK-12089] [SQL] Fix memory corrupt due to freeing a page being re…
Dec 4, 2015
5c5776d
[SPARK-6990][BUILD] Add Java linting script; fix minor warnings
dskrvk Dec 4, 2015
ea01d53
[SPARK-12058][STREAMING][KINESIS][TESTS] fix Kinesis python tests
brkyvz Dec 4, 2015
def672c
[SPARK-11314][BUILD][HOTFIX] Add exclusion for moved YARN classes.
Dec 4, 2015
dc12745
[SPARK-12112][BUILD] Upgrade to SBT 0.13.9
JoshRosen Dec 5, 2015
5c49b17
[SPARK-12142][CORE]Reply false when container allocator is not ready …
XuTingjun Dec 5, 2015
0a7f722
[SPARK-12080][CORE] Kryo - Support multiple user registrators
Dec 5, 2015
40d8a51
[SPARK-12084][CORE] Fix codes that uses ByteBuffer.array incorrectly
zsxwing Dec 5, 2015
29c481b
[SPARK-12096][MLLIB] remove the old constraint in word2vec
hhbyyh Dec 5, 2015
f3aa57a
[SPARK-11994][MLLIB] Word2VecModel load and save cause SparkException…
tmnd1991 Dec 5, 2015
9982fd2
[SPARK-11988][ML][MLLIB] Update JPMML to 1.2.7
srowen Dec 5, 2015
b33e47b
[SPARK-11774][SPARKR] Implement struct(), encode(), decode() function…
Dec 5, 2015
4d3ad94
[SPARK-11715][SPARKR] Add R support corr for Column Aggregration
felixcheung Dec 6, 2015
83f77bf
[SPARK-12115][SPARKR] Change numPartitions() to getNumPartitions() to…
yanboliang Dec 6, 2015
4102584
[SPARK-12044][SPARKR] Fix usage of isnan, isNaN
yanboliang Dec 6, 2015
680ab41
[SPARK-12048][SQL] Prevent to close JDBC resources twice
Dec 6, 2015
55c2ae6
[SPARK-12138][SQL] Escape \u in the generated comments of codegen
gatorsmile Dec 6, 2015
8e03d8f
[SPARK-12152][PROJECT-INFRA] Speed up Scalastyle checks by only invok…
JoshRosen Dec 7, 2015
ef02e1d
[SPARK-12106][STREAMING][FLAKY-TEST] BatchedWAL test transiently flak…
brkyvz Dec 7, 2015
0e51b47
[SPARK-12032] [SQL] Re-order inner joins to do join with conditions f…
Dec 7, 2015
9e0cb2c
[SPARK-12034][SPARKR] Eliminate warnings in SparkR test cases.
Dec 7, 2015
3af9596
[SPARK-12132] [PYSPARK] raise KeyboardInterrupt inside SIGINT handler
Dec 7, 2015
e084743
[SPARK-11932][STREAMING] Partition previous TrackStateRDD if partitio…
tdas Dec 7, 2015
c7bb3b2
[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.seria…
zsxwing Dec 7, 2015
bb017dd
[SPARK-11963][DOC] Add docs for QuantileDiscretizer
yinxusen Dec 7, 2015
db2eca8
[SPARK-11884] Drop multiple columns in the DataFrame API
tedyu Dec 7, 2015
8658058
[SPARK-12184][PYTHON] Make python api doc for pivot consistant with s…
aray Dec 7, 2015
6211930
[SPARK-12160][MLLIB] Use SQLContext.getOrCreate in MLlib
jkbradley Dec 8, 2015
cfd3bf4
[SPARK-11551][DOC][EXAMPLE] Replace example code in ml-features.md us…
somideshmukh Dec 8, 2015
b11245a
[SPARK-10259][ML] Add @since annotation to ml.classification
Dec 8, 2015
c0e42e5
[SPARK-11958][SPARK-11957][ML][DOC] SQLTransformer user guide and exa…
yanboliang Dec 8, 2015
48f2914
[SPARK-12103][STREAMING][KAFKA][DOC] document that K means Key and V …
koeninger Dec 8, 2015
675b92a
[SPARK-12166][TEST] Unset hadoop related environment in testing
zjffdu Dec 8, 2015
7d60877
[SPARK-11439][ML] Optimization of creating sparse feature without den…
Dec 8, 2015
3dc7ca1
[SPARK-11551][DOC][EXAMPLE] Revert PR #10002
liancheng Dec 8, 2015
f7fc52a
[SPARK-11652][CORE] Remote code execution with InvokerTransformer
srowen Dec 8, 2015
1ee1b4f
[SPARK-11155][WEB UI] Stage summary json should include stage duration
keypointt Dec 8, 2015
303e6f2
[SPARK-12074] Avoid memory copy involving ByteBuffer.wrap(ByteArrayOu…
tedyu Dec 8, 2015
a551f53
[SPARK-12201][SQL] add type coercion rule for greatest/least
cloud-fan Dec 8, 2015
e0accf0
[SPARK-12195][SQL] Adding BigDecimal, Date and Timestamp into Encoder
gatorsmile Dec 8, 2015
4426d95
[SPARK-12188][SQL] Code refactoring and comment correction in Dataset…
gatorsmile Dec 8, 2015
526862c
[SPARK-10393] use ML pipeline in LDA example
hhbyyh Dec 8, 2015
bf0176b
[SPARK-12205][SQL] Pivot fails Analysis when aggregate is UnresolvedF…
aray Dec 8, 2015
b82de20
[SPARK-11605][MLLIB] ML 1.6 QA: API: Java compatibility, docs
hhbyyh Dec 8, 2015
add793d
[SPARK-12159][ML] Add user guide section for IndexToString transformer
BenFradet Dec 8, 2015
3fd7988
[SPARK-3873][BUILD] Add style checker to enforce import ordering.
Dec 8, 2015
1bcc18b
[SPARK-12187] *MemoryPool classes should not be fully public
Dec 8, 2015
a384868
[SPARK-12069][SQL] Update documentation with Datasets
marmbrus Dec 8, 2015
a4aaed0
[SPARK-8517][ML][DOC] Reorganizes the spark.ml user guide
thunterdb Dec 9, 2015
c6a00f4
[SPARK-11343][ML] Documentation of float and double prediction/label …
dahlem Dec 9, 2015
1258f45
[SPARK-12222] [CORE] Deserialize RoaringBitmap using Kryo serializer …
scwf Dec 9, 2015
7298d57
[SPARK-11676][SQL] Parquet filter tests all pass if filters are not r…
HyukjinKwon Dec 9, 2015
81a02af
[SPARK-12031][CORE][BUG] Integer overflow when do sampling
uncleGen Dec 9, 2015
49c3d5d
[SPARK-12012][SQL] Show more comprehensive PhysicalRDD metadata when …
liancheng Dec 9, 2015
ab069c0
[SPARK-10299][ML] word2vec should allow users to specify the window size
holdenk Dec 9, 2015
2f2b101
[SPARK-10582][YARN][CORE] Fix AM failure situation for dynamic alloca…
jerryshao Dec 9, 2015
d525dd8
[SPARK-12241][YARN] Improve failure reporting in Yarn client obtainTo…
steveloughran Dec 9, 2015
a012ce5
[SPARK-12165][SPARK-12189] Fix bugs in eviction of storage memory by …
JoshRosen Dec 9, 2015
874635e
[SPARK-11824][WEBUI] WebUI does not render descriptions with 'bad' HT…
srowen Dec 9, 2015
6ae8e1e
[SPARK-11551][DOC] Replace example code in ml-features.md using inclu…
yinxusen Dec 9, 2015
c3f9969
[SPARK-12211][DOC][GRAPHX] Fix version number in graphx doc for migra…
aray Dec 10, 2015
0e97f52
[SPARK-12165][ADDENDUM] Fix outdated comments on unroll test
Dec 10, 2015
c4187c6
[SPARK-11678][SQL][DOCS] Document basePath in the programming guide.
yhuai Dec 10, 2015
c006bba
[SPARK-11796] Fix httpclient and httpcore depedency issues related to…
Dec 10, 2015
72dbfc1
[SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWi…
tdas Dec 10, 2015
041c0a1
[SPARK-12252][SPARK-12131][SQL] refactor MapObjects to make it less h…
cloud-fan Dec 10, 2015
06844cc
[SPARK-12136][STREAMING] rddToFileName does not properly handle prefi…
Dec 10, 2015
e4b8401
[SPARK-11530][MLLIB] Return eigenvalues with PCA model
srowen Dec 10, 2015
f5cdabf
[SPARK-12242][SQL] Add DataFrame.transform method
rxin Dec 10, 2015
c1bf494
[SPARK-11832][CORE] Process arguments in spark-shell for Scala 2.11
jodersky Dec 10, 2015
a08155b
[SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parqu…
yanboliang Dec 10, 2015
de86e36
[SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit
hhbyyh Dec 10, 2015
67b0607
[SPARK-12234][SPARKR] Fix ```subset``` function error when only set `…
yanboliang Dec 10, 2015
f9b12f8
[SPARK-12250][SQL] Allow users to define a UDAF without providing det…
yhuai Dec 10, 2015
6aef5a5
[SPARK-12228][SQL] Try to run execution hive's derby in memory.
yhuai Dec 10, 2015
1e53e1b
[SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spa…
thunterdb Dec 10, 2015
34a9bc1
[SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes.
Dec 10, 2015
4983e4d
[SPARK-11713] [PYSPARK] [STREAMING] Initial RDD updateStateByKey for …
BryanCutler Dec 10, 2015
8ccbe0e
[SPARK-12251] Document and improve off-heap memory configurations
JoshRosen Dec 10, 2015
fb5b0f8
[SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management
Dec 10, 2015
7413ec8
[STREAMING][DOC][MINOR] Update the description of direct Kafka stream…
jerryshao Dec 10, 2015
c415627
[SPARK-12258][SQL] passing null into ScalaUDF
Dec 11, 2015
eb564db
[SPARK-10991][ML] logistic regression training summary handle empty p…
holdenk Dec 11, 2015
c50c7d0
[SPARK-12258] [SQL] passing null into ScalaUDF (follow-up)
Dec 11, 2015
a45d1fe
[SPARK-12146][SPARKR] SparkR jsonFile should support multiple input f…
yanboliang Dec 11, 2015
3bff1d2
[SPARK-11964][DOCS][ML] Add in Pipeline Import/Export Documentation
bllchmbrs Dec 11, 2015
a2376c3
[SPARK-12273][STREAMING] Make Spark Streaming web UI list Receivers i…
lw-lin Dec 11, 2015
24352b7
[SPARK-11497][MLLIB][PYTHON] PySpark RowMatrix Constructor Has Type E…
dusenberrymw Dec 11, 2015
74bc858
[SPARK-12217][ML] Document invalid handling for StringIndexer
BenFradet Dec 11, 2015
9f25f85
[SPARK-11978][ML] Move dataset_example.py to examples/ml and rename t…
yanboliang Dec 12, 2015
1c5cd9e
[SPARK-12298][SQL] Fix infinite loop in DataFrame.sortWithinPartitions
ankurdave Dec 12, 2015
5b39f6c
[SPARK-12158][SPARKR][SQL] Fix 'sample' functions that break R unit t…
gatorsmile Dec 12, 2015
a82431f
[SPARK-11193] Use Java ConcurrentHashMap instead of SynchronizedMap t…
jbonofre Dec 12, 2015
53389bd
[SPARK-12199][DOC] Follow-up: Refine example code in ml-features.md
yinxusen Dec 13, 2015
6659b4e
[SPARK-12267][CORE] Store the remote RpcEnv address to send the corre…
zsxwing Dec 13, 2015
af3e071
[SPARK-12281][CORE] Fix a race condition when reporting ExecutorState…
zsxwing Dec 14, 2015
cb0b872
[SPARK-12213][SQL] use multiple partitions for single distinct query
Dec 14, 2015
0acee0c
[SPARK-12275][SQL] No plan for BroadcastHint in some condition
Dec 14, 2015
bbdb3e7
[MINOR][DOC] Fix broken word2vec link
BenFradet Dec 14, 2015
50a33d8
[SPARK-12016] [MLLIB] [PYSPARK] Wrap Word2VecModel when loading it in…
viirya Dec 14, 2015
490e0bc
[SPARK-12327] Disable commented code lintr temporarily
shivaram Dec 15, 2015
4f28c60
[SPARK-12274][SQL] WrapOption should not have type constraint for child
cloud-fan Dec 15, 2015
8a8d663
[SPARK-12188][SQL][FOLLOW-UP] Code refactoring and comment correction…
gatorsmile Dec 15, 2015
c8a084a
[SPARK-12288] [SQL] Support UnsafeRow in Coalesce/Except/Intersect.
gatorsmile Dec 15, 2015
d9608be
[SPARK-12332][TRIVIAL][TEST] Fix minor typo in ResetSystemProperties
holdenk Dec 15, 2015
cd4ce8b
[STREAMING][MINOR] Fix typo in function name of StateImpl
jerryshao Dec 15, 2015
e05f455
[MINOR][ML] Rename weights to coefficients for examples/DeveloperApiE…
yanboliang Dec 16, 2015
9602389
[SPARK-12271][SQL] Improve error message when Dataset.as[ ] has incom…
nongli Dec 16, 2015
e809ee0
[SPARK-12236][SQL] JDBC filter tests all pass if filters are not real…
HyukjinKwon Dec 16, 2015
01dc131
[SPARK-12105] [SQL] add convenient show functions
jbonofre Dec 16, 2015
8f55388
[HOTFIX] Compile error from commit 31b3910
Dec 16, 2015
e45f4c2
[SPARK-12056][CORE] Part 2 Create a TaskAttemptContext only after cal…
tedyu Dec 16, 2015
1ba6d14
[SPARK-12130] Replace shuffleManagerClass with shortShuffleMgrNames i…
lianhuiwang Dec 16, 2015
8195550
[SPARK-12351][MESOS] Add documentation about submitting Spark with me…
tnachen Dec 16, 2015
e7046b5
[SPARK-9516][UI] Improvement of Thread Dump Page
CodingCat Dec 16, 2015
9c36948
[SPARK-9026][SPARK-4514] Modifications to JobWaiter, FutureAction, an…
reggert Dec 16, 2015
442df7c
[SPARK-10123][DEPLOY] Support specifying deploy mode from configuration
jerryshao Dec 16, 2015
f4aee47
[SPARK-9886][CORE] Fix to use ShutdownHookManager in
naveenminchu Dec 16, 2015
64cf457
[SPARK-12062][CORE] Change Master to asyc rebuild UI when application…
BryanCutler Dec 16, 2015
84198a7
[SPARK-10477][SQL] using DSL in ColumnPruningSuite to improve readabi…
cloud-fan Dec 16, 2015
98904fd
[SPARK-4117][YARN] Spark on Yarn handle AM being told command from RM
Dec 16, 2015
11bb8c1
[SPARK-12304][STREAMING] Make Spark Streaming web UI display more fri…
lw-lin Dec 16, 2015
d690e1c
[SPARK-12249][SQL] JDBC non-equality comparison operator not pushed d…
HyukjinKwon Dec 16, 2015
f354717
[SPARK-12314][SQL] isnull operator not pushed down for JDBC datasource.
HyukjinKwon Dec 16, 2015
1c533ab
[SPARK-12315][SQL] isnotnull operator not pushed down for JDBC dataso…
HyukjinKwon Dec 16, 2015
140a0b8
Style fix for the previous 3 JDBC filter push down commits.
rxin Dec 16, 2015
29d744d
Revert "[HOTFIX] Compile error from commit 31b3910"
rxin Dec 16, 2015
8b64134
Revert "[SPARK-12105] [SQL] add convenient show functions"
rxin Dec 16, 2015
95385ea
[SPARK-10618] [Mesos] Refactoring coarsed-grained scheduling conditio…
SleepyThread Sep 15, 2015
3b01c2a
[SPARK-10618] [Mesos] Killing space and removing duplication, also ha…
SleepyThread Sep 16, 2015
cc67b7d
[SPARK-10514] [CORE] [Mesos] Refactoring fine-grained scheduling cond…
SleepyThread Sep 17, 2015
fde2994
[SPARK-10618] [Mesos] Adressing comments on PR.
SleepyThread Dec 15, 2015
66f1750
[SPARK-10618] [Mesos] removing meetconstraints check as it is already…
SleepyThread Dec 16, 2015
c10af66
[SPARK-10618] [Mesos] checking in missed Style check fix
SleepyThread Dec 16, 2015
412f56d
Merge branch 'scheduling-refactor' of github.com:SleepyThread/spark i…
SleepyThread Feb 23, 2016
Diff view
@@ -254,12 +254,7 @@ private[spark] class CoarseMesosSchedulerBackend(
val cpus = getResource(offer.getResourcesList, "cpus").toInt
val id = offer.getId.getValue
if (meetsConstraints) {
if (taskIdToSlaveId.size < executorLimit &&
totalCoresAcquired < maxCores &&
mem >= calculateTotalMemory(sc) &&
cpus >= 1 &&
failuresBySlaveId.getOrElse(slaveId, 0) < MAX_SLAVE_FAILURES &&
!slaveIdsWithExecutors.contains(slaveId)) {
if (isOfferSatisfiesRequirements(slaveId, mem, cpus, sc)) {
// Launch an executor on the slave
val cpusToUse = math.min(cpus, maxCores - totalCoresAcquired)
totalCoresAcquired += cpusToUse
@@ -308,6 +303,35 @@ private[spark] class CoarseMesosSchedulerBackend(
}
}

// ToDo: Abstract out each condition and log them.
def isOfferSatisfiesRequirements(slaveId: String, mem: Double, cpusOffered: Int,
Review comment (Contributor):
The name sounds a bit funny to me. Maybe just offerSatisfiesRequirements?

Also, this might be private, or private[spark] to allow testing.

Ideally the TODO would be implemented as well.

Review comment (Contributor):
Also, a short scaladoc explaining the difference between meetsConstraints and satisfiesRequirements. To the casual reader they are the same.

sc: SparkContext): Boolean = {
Review comment (Contributor):
style:

def offerSatisfiesRequirements(
    slaveId: String,
    mem: Double,
    cpusOffered: Int,
    sc: SparkContext): Boolean = {
  ...
}

Review comment (Contributor):
also, the method signature is shared across both fine-grained and coarse-grained modes, so I would put this in the helper trait MesosSchedulerUtils. Then this can be protected or private[mesos] or something.

val meetsMemoryRequirements = mem >= calculateTotalMemory(sc)
val meetsCPURequirements = cpusOffered >= 1
val needMoreCores = totalCoresAcquired < maxCores
val healthySlave = failuresBySlaveId.getOrElse(slaveId, 0) < MAX_SLAVE_FAILURES
val taskOnEachSlaveLessThanExecutorLimit = taskIdToSlaveId.size < executorLimit
val executorNotRunningOnSlave = !slaveIdsWithExecutors.contains(slaveId)

executorNotRunningOnSlave &&
taskOnEachSlaveLessThanExecutorLimit &&
needMoreCores &&
meetsMemoryRequirements &&
meetsCPURequirements &&
healthySlave
}
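
A sketch of how the review comments above might be folded together: the rename, the multi-line parameter style, restricted visibility, and a ScalaDoc contrasting this predicate with meetsConstraints. Illustrative only, not the PR's code; it reuses the fields this method already reads and assumes they stay visible from wherever the helper ends up.

/**
 * Unlike `meetsConstraints`, which only matches the offer's attributes
 * against `spark.mesos.constraints`, this checks the offer's resources
 * and the scheduler's own state (core budget, executor limit, slave health).
 */
private[mesos] def offerSatisfiesRequirements(
    slaveId: String,
    mem: Double,
    cpusOffered: Int,
    sc: SparkContext): Boolean = {
  val meetsMemoryRequirements = mem >= calculateTotalMemory(sc)
  val meetsCPURequirements = cpusOffered >= 1
  val needMoreCores = totalCoresAcquired < maxCores
  val healthySlave = failuresBySlaveId.getOrElse(slaveId, 0) < MAX_SLAVE_FAILURES
  val underExecutorLimit = taskIdToSlaveId.size < executorLimit
  val noExecutorOnSlave = !slaveIdsWithExecutors.contains(slaveId)

  noExecutorOnSlave && underExecutorLimit && needMoreCores &&
    meetsMemoryRequirements && meetsCPURequirements && healthySlave
}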

def isOfferValidForScheduling(meetsConstraints: Boolean,
slaveId: String, mem: Double,
cpus: Int, sc: SparkContext): Boolean = {
taskIdToSlaveId.size < executorLimit &&
totalCoresAcquired < maxCores &&
meetsConstraints &&
mem >= calculateTotalMemory(sc) &&
cpus >= 1 &&
failuresBySlaveId.getOrElse(slaveId, 0) < MAX_SLAVE_FAILURES &&
!slaveIdsWithExecutors.contains(slaveId)
}

override def statusUpdate(d: SchedulerDriver, status: TaskStatus) {
val taskId = status.getTaskId.getValue.toInt
@@ -246,14 +246,13 @@ private[spark] class MesosSchedulerBackend(
val slaveId = o.getSlaveId.getValue
val offerAttributes = toAttributeMap(o.getAttributesList)

// check offers for
// 1. Memory requirements
// 2. CPU requirements - need at least 1 for executor, 1 for task
val meetsMemoryRequirements = mem >= calculateTotalMemory(sc)
val meetsCPURequirements = cpus >= (mesosExecutorCores + scheduler.CPUS_PER_TASK)
// check if Attribute constraints is satisfied
val meetsConstraints = matchesAttributeRequirements(slaveOfferConstraints, offerAttributes)

val meetsRequirements =
(meetsMemoryRequirements && meetsCPURequirements) ||
(slaveIdToExecutorInfo.contains(slaveId) && cpus >= scheduler.CPUS_PER_TASK)
isOfferSatisfiesRequirements(cpus, mem, slaveId, sc)

// add some debug messaging
val debugstr = if (meetsRequirements) "Accepting" else "Declining"
logDebug(s"$debugstr offer: ${o.getId.getValue} with attributes: "
+ s"$offerAttributes mem: $mem cpu: $cpus")
Expand Down Expand Up @@ -328,6 +327,18 @@ private[spark] class MesosSchedulerBackend(
}
}

// check if all constraints are satisfied
// 1. Memory requirements
// 2. CPU requirements - need at least 1 for executor, 1 for task
def isOfferSatisfiesRequirements(cpusOffered: Double, memory : Double,
Review comment (Contributor):
Turn this into a ScalaDoc.

Would it be possible to not duplicate this method, but put the common logic between coarse and fine-grained somewhere else?

Same comment regarding visibility as for the other one.

slaveId: String, sc : SparkContext): Boolean = {
Review comment (Contributor):
Please use consistent style (and similar to the other places in the code). Don't put a space between identifier and :. It should be sc: SparkContext, not sc : SparkContext. There are several occurrences of this style in this PR.

val meetsMemoryRequirements = memory >= calculateTotalMemory(sc)
val meetsCPURequirements = cpusOffered >= (mesosExecutorCores + scheduler.CPUS_PER_TASK)

(meetsMemoryRequirements && meetsCPURequirements) ||
(slaveIdToExecutorInfo.contains(slaveId) && cpusOffered >= scheduler.CPUS_PER_TASK)
}
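
A possible ScalaDoc'd form of this fine-grained predicate, per the comments above (a sketch with the parameter style normalized; the logic itself is unchanged):

/**
 * An offer is usable if it can fit a fresh executor plus one task
 * (mesosExecutorCores + CPUS_PER_TASK cores and enough memory), or if this
 * slave already runs our executor and the offer still fits one more task.
 */
private[mesos] def offerSatisfiesRequirements(
    cpusOffered: Double,
    memory: Double,
    slaveId: String,
    sc: SparkContext): Boolean = {
  val meetsMemoryRequirements = memory >= calculateTotalMemory(sc)
  val meetsCPURequirements =
    cpusOffered >= (mesosExecutorCores + scheduler.CPUS_PER_TASK)

  (meetsMemoryRequirements && meetsCPURequirements) ||
    (slaveIdToExecutorInfo.contains(slaveId) &&
      cpusOffered >= scheduler.CPUS_PER_TASK)
}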

/** Turn a Spark TaskDescription into a Mesos task and also resources unused by the task */
def createMesosTask(
task: TaskDescription,
@@ -58,7 +58,7 @@ class CoarseMesosSchedulerBackendSuite extends SparkFunSuite

private def createSchedulerBackend(
taskScheduler: TaskSchedulerImpl,
driver: SchedulerDriver): CoarseMesosSchedulerBackend = {
driver: SchedulerDriver, sc: SparkContext): CoarseMesosSchedulerBackend = {
Review comment (Contributor):
style: this needs to go on the next line

val securityManager = mock[SecurityManager]
val backend = new CoarseMesosSchedulerBackend(taskScheduler, sc, "master", securityManager) {
override protected def createSchedulerDriver(
@@ -77,16 +77,25 @@ class CoarseMesosSchedulerBackendSuite extends SparkFunSuite
backend
}

private def createSchedulerBackendForGivenSparkConf(sc : SparkContext) = {
Review comment (Contributor):
style: no space before colon, and need return type

private def createXXXConf(sc: SparkContext): Unit = {
}

val driver = mock[SchedulerDriver]
when(driver.start()).thenReturn(Protos.Status.DRIVER_RUNNING)
val taskScheduler = mock[TaskSchedulerImpl]
when(taskScheduler.sc).thenReturn(sc)
createSchedulerBackend(taskScheduler, driver, sc)
}

var sparkConf: SparkConf = _

before {
sparkConf = (new SparkConf)
.setMaster("local[*]")
.setAppName("test-mesos-dynamic-alloc")
.setSparkHome("/path")
.set("spark.cores.max", "10")
Review comment (Contributor):
why do this? Add a comment to explain?


sc = new SparkContext(sparkConf)
}
}
Review comment (Contributor):
please revert this indentation change


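One way to answer the "why do this?" comment on spark.cores.max above is an inline note on the setting. The rationale given here is an assumption, not confirmed by the author:

    // Cap the app at 10 cores so the offer-evaluation tests can exercise
    // the totalCoresAcquired < maxCores branch (assumed rationale).
    .set("spark.cores.max", "10")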
test("mesos supports killing and limiting executors") {
val driver = mock[SchedulerDriver]
@@ -97,7 +106,7 @@ class CoarseMesosSchedulerBackendSuite extends SparkFunSuite
sparkConf.set("spark.driver.host", "driverHost")
sparkConf.set("spark.driver.port", "1234")

val backend = createSchedulerBackend(taskScheduler, driver)
val backend = createSchedulerBackend(taskScheduler, driver, sc)
val minMem = backend.calculateTotalMemory(sc)
val minCpu = 4

@@ -145,15 +154,15 @@ class CoarseMesosSchedulerBackendSuite extends SparkFunSuite
val taskScheduler = mock[TaskSchedulerImpl]
when(taskScheduler.sc).thenReturn(sc)

val backend = createSchedulerBackend(taskScheduler, driver)
val backend = createSchedulerBackend(taskScheduler, driver, sc)
val minMem = backend.calculateTotalMemory(sc) + 1024
val minCpu = 4

val mesosOffers = new java.util.ArrayList[Offer]
val offer1 = createOffer("o1", "s1", minMem, minCpu)
mesosOffers.add(offer1)

val offer2 = createOffer("o2", "s1", minMem, 1);
val offer2 = createOffer("o2", "s1", minMem, 1)

backend.resourceOffers(driver, mesosOffers)

@@ -184,4 +193,47 @@ class CoarseMesosSchedulerBackendSuite extends SparkFunSuite

verify(driver, times(1)).reviveOffers()
}

test("isOfferSatisfiesRequirements return true when there is a valid offer") {
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 5, sc))
}


test("isOfferSatisfiesRequirements return false when memory in offer is less" +
" than required memory") {
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave1", 1, 5, sc) === false)
}

test("isOfferSatisfiesRequirements return false when cpu in offer is less than required cpu") {
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 0, sc) === false)
}

test("isOfferSatisfiesRequirements return false when offer is from slave already running" +
" an executor") {
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
schedulerBackend.slaveIdsWithExecutors += "Slave2"

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave2", 10000, 5, sc) === false)
}

test("isOfferSatisfiesRequirements return false when task is failed more than " +
"MAX_SLAVE_FAILURES times on the given slave") {
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
schedulerBackend.failuresBySlaveId("Slave3") = 2

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave3", 10000, 5, sc) === false)
}

test("isOfferSatisfiesRequirements return false when max core is already acquired") {
Review comment (Contributor):
these tests can probably all be grouped in one test called offerSatisfiesRequirements.

val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
schedulerBackend.totalCoresAcquired = 10

assert(schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 5, sc) === false)
Review comment (Contributor):
instead of assert(x === false), just do assert(!x)

}
}
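
Taken together, the grouping and assert(!x) suggestions above might look like this single test (a sketch against the same backend helper, not part of the PR):

test("offerSatisfiesRequirements") {
  val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
  // valid offer
  assert(schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 5, sc))
  // too little memory
  assert(!schedulerBackend.isOfferSatisfiesRequirements("Slave1", 1, 5, sc))
  // too few cpus
  assert(!schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 0, sc))
  // slave already runs an executor
  schedulerBackend.slaveIdsWithExecutors += "Slave2"
  assert(!schedulerBackend.isOfferSatisfiesRequirements("Slave2", 10000, 5, sc))
  // slave failed too often
  schedulerBackend.failuresBySlaveId("Slave3") = 2
  assert(!schedulerBackend.isOfferSatisfiesRequirements("Slave3", 10000, 5, sc))
  // core budget exhausted
  schedulerBackend.totalCoresAcquired = 10
  assert(!schedulerBackend.isOfferSatisfiesRequirements("Slave1", 10000, 5, sc))
}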
@@ -28,7 +28,7 @@ import scala.collection.mutable.ArrayBuffer

import org.apache.mesos.Protos.Value.Scalar
import org.apache.mesos.Protos._
import org.apache.mesos.SchedulerDriver
import org.apache.mesos.{Protos, SchedulerDriver}
import org.mockito.Matchers._
import org.mockito.Mockito._
import org.mockito.{ArgumentCaptor, Matchers}
@@ -344,4 +344,66 @@ class MesosSchedulerBackendSuite extends SparkFunSuite with LocalSparkContext wi
r.getName.equals("cpus") && r.getScalar.getValue.equals(1.0) && r.getRole.equals("prod")
})
}

private def createSchedulerBackendForGivenSparkConf(sc : SparkContext) : MesosSchedulerBackend = {
val conf = new SparkConf

val listenerBus = mock[LiveListenerBus]
listenerBus.post(
SparkListenerExecutorAdded(anyLong, "s1", new ExecutorInfo("host1", 2, Map.empty)))

when(sc.getSparkHome()).thenReturn(Option("/spark-home"))

when(sc.conf).thenReturn(conf)
when(sc.executorEnvs).thenReturn(new mutable.HashMap[String, String])
when(sc.executorMemory).thenReturn(100)
when(sc.listenerBus).thenReturn(listenerBus)

val taskScheduler = mock[TaskSchedulerImpl]
when(taskScheduler.CPUS_PER_TASK).thenReturn(2)

new MesosSchedulerBackend(taskScheduler, sc, "master")
}

test("isOfferSatisfiesRequirements return true when there offer meet cpu and" +
" memory requirement") {
val sc = mock[SparkContext]
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements( 5, 10000, "Slave1", sc))
Review comment (Contributor):
style, space after (. I won't point out each occurrence of extra spaces. :)

}

test("isOfferSatisfiesRequirements return false when memory in offer is less " +
"than required memory") {
val sc = mock[SparkContext]
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements(5, 10, "Slave1", sc) === false)
}

test("isOfferSatisfiesRequirements return false when cpu in offer is less than required cpu") {
val sc = mock[SparkContext]
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)

assert(schedulerBackend.isOfferSatisfiesRequirements(0, 10000, "Slave1", sc) === false)
}

test("isOfferSatisfiesRequirements return true when offer is from slave already running and" +
" cpu is less than minimum cpu per task an executor") {
val sc = mock[SparkContext]
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
schedulerBackend.slaveIdToExecutorInfo("Slave2") = null

assert(schedulerBackend.isOfferSatisfiesRequirements(2, 10000, "Slave2", sc) === true)
}

test("isOfferSatisfiesRequirements return false when offer is from slave already running but" +
" cpu is less than minimum cpu per task an executor") {
Review comment (Contributor):
same here, can group these tests

val sc = mock[SparkContext]
val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
schedulerBackend.slaveIdToExecutorInfo("Slave2") = null

assert(schedulerBackend.isOfferSatisfiesRequirements(1, 10000, "Slave2", sc) === false)
}

}
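
The same grouping applied to this fine-grained suite might look like the following (again only a sketch):

test("offerSatisfiesRequirements") {
  val sc = mock[SparkContext]
  val schedulerBackend = createSchedulerBackendForGivenSparkConf(sc)
  // enough cpu and memory for a new executor
  assert(schedulerBackend.isOfferSatisfiesRequirements(5, 10000, "Slave1", sc))
  // memory too low
  assert(!schedulerBackend.isOfferSatisfiesRequirements(5, 10, "Slave1", sc))
  // cpu too low
  assert(!schedulerBackend.isOfferSatisfiesRequirements(0, 10000, "Slave1", sc))
  // slave already has an executor: enough vs. not enough cpu for one task
  schedulerBackend.slaveIdToExecutorInfo("Slave2") = null
  assert(schedulerBackend.isOfferSatisfiesRequirements(2, 10000, "Slave2", sc))
  assert(!schedulerBackend.isOfferSatisfiesRequirements(1, 10000, "Slave2", sc))
}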