Changes from all commits
514 commits
b14bfc3
[SPARK-19993][SQL] Caching logical plans containing subquery expressi…
dilipbiswal Apr 12, 2017
b938438
[MINOR][DOCS] Fix spacings in Structured Streaming Programming Guide
dongjinleekr Apr 12, 2017
bca4259
[MINOR][DOCS] JSON APIs related documentation fixes
HyukjinKwon Apr 12, 2017
044f7ec
[SPARK-20298][SPARKR][MINOR] fixed spelling mistake "charactor"
bdwyer2 Apr 12, 2017
ffc57b0
[SPARK-20302][SQL] Short circuit cast when from and to types are stru…
rxin Apr 12, 2017
2e1fd46
[SPARK-20296][TRIVIAL][DOCS] Count distinct error message for streaming
jtoka Apr 12, 2017
ceaf77a
[SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on Jenkins
HyukjinKwon Apr 12, 2017
504e62e
[SPARK-20303][SQL] Rename createTempFunction to registerFunction
gatorsmile Apr 12, 2017
5408553
[SPARK-20304][SQL] AssertNotNull should not include path in string re…
rxin Apr 12, 2017
99a9473
[SPARK-19570][PYSPARK] Allow to disable hive in pyspark shell
zjffdu Apr 12, 2017
924c424
[SPARK-20301][FLAKY-TEST] Fix Hadoop Shell.runCommand flakiness in St…
brkyvz Apr 12, 2017
a7b430b
[SPARK-15354][FLAKY-TEST] TopologyAwareBlockReplicationPolicyBehavior…
cloud-fan Apr 13, 2017
c5f1cc3
[SPARK-20131][CORE] Don't use `this` lock in StandaloneSchedulerBacke…
zsxwing Apr 13, 2017
ec68d8f
[SPARK-20189][DSTREAM] Fix spark kinesis testcases to remove deprecat…
yashs360 Apr 13, 2017
095d1cb
[SPARK-20265][MLLIB] Improve Prefix'span pre-processing efficiency
Syrux Apr 13, 2017
a4293c2
[SPARK-20284][CORE] Make {Des,S}erializationStream extend Closeable
Apr 13, 2017
fbe4216
[SPARK-20233][SQL] Apply star-join filter heuristics to dynamic progr…
ioana-delaney Apr 13, 2017
8ddf0d2
[SPARK-20232][PYTHON] Improve combineByKey docs
Apr 13, 2017
7536e28
[SPARK-20038][SQL] FileFormatWriter.ExecuteWriteTask.releaseResources…
steveloughran Apr 13, 2017
fb036c4
[SPARK-20318][SQL] Use Catalyst type for min/max in ColumnStat for ea…
Apr 14, 2017
98b41ec
[SPARK-20316][SQL] Val and Var should strictly follow the Scala syntax
Apr 15, 2017
35e5ae4
[SPARK-19716][SQL][FOLLOW-UP] UnresolvedMapObjects should always be s…
cloud-fan Apr 16, 2017
e090f3c
[SPARK-20335][SQL] Children expressions of Hive UDF impacts the deter…
gatorsmile Apr 16, 2017
a888fed
[SPARK-19740][MESOS] Add support in Spark to pass arbitrary parameter…
Apr 16, 2017
ad935f5
[SPARK-20343][BUILD] Add avro dependency in core POM to resolve build…
HyukjinKwon Apr 16, 2017
86d251c
[SPARK-20278][R] Disable 'multiple_dots_linter' lint rule that is aga…
HyukjinKwon Apr 16, 2017
24f09b3
[SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in fro…
HyukjinKwon Apr 17, 2017
01ff035
[SPARK-20349][SQL] ListFunctions returns duplicate functions after us…
gatorsmile Apr 17, 2017
e5fee3e
[SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns.
jodersky Apr 17, 2017
0075562
Typo fix: distitrbuted -> distributed
ash211 Apr 18, 2017
33ea908
[TEST][MINOR] Replace repartitionBy with distribute in CollapseRepart…
jaceklaskowski Apr 18, 2017
b0a1e93
[SPARK-17647][SQL][FOLLOWUP][MINOR] fix typo
felixcheung Apr 18, 2017
07fd94e
[SPARK-20344][SCHEDULER] Duplicate call in FairSchedulableBuilder.add…
snazy Apr 18, 2017
d4f10cb
[SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to resolve build f…
HyukjinKwon Apr 18, 2017
321b4f0
[SPARK-20366][SQL] Fix recursive join reordering: inside joins are no…
Apr 18, 2017
1f81dda
[SPARK-20354][CORE][REST-API] When I request access to the 'http: //i…
Apr 18, 2017
f654b39
[SPARK-20360][PYTHON] reprs for interpreters
rgbkrk Apr 18, 2017
74aa0df
[SPARK-20377][SS] Fix JavaStructuredSessionization example
tdas Apr 18, 2017
e468a96
[SPARK-20254][SQL] Remove unnecessary data conversion for Dataset wit…
kiszk Apr 19, 2017
702d85a
[SPARK-20208][R][DOCS] Document R fpGrowth support
zero323 Apr 19, 2017
608bf30
[SPARK-20359][SQL] Avoid unnecessary execution in EliminateOuterJoin …
koertkuipers Apr 19, 2017
773754b
[SPARK-20356][SQL] Pruned InMemoryTableScanExec should have correct o…
viirya Apr 19, 2017
3537876
[SPARK-20343][BUILD] Avoid Unidoc build only if Hadoop 2.6 is explici…
HyukjinKwon Apr 19, 2017
71a8e9d
[SPARK-20036][DOC] Note incompatible dependencies on org.apache.kafka…
koeninger Apr 19, 2017
4fea784
[SPARK-20397][SPARKR][SS] Fix flaky test: test_streaming.R.Terminated…
zsxwing Apr 19, 2017
63824b2
[SPARK-20350] Add optimization rules to apply Complementation Laws.
ptkool Apr 20, 2017
39e303a
[MINOR][SS] Fix a missing space in UnsupportedOperationChecker error …
zsxwing Apr 20, 2017
dd6d55d
[SPARK-20398][SQL] range() operator should include cancellation reaso…
ericl Apr 20, 2017
bdc6056
Fixed typos in docs
Apr 20, 2017
46c5749
[SPARK-20375][R] R wrappers for array and map
zero323 Apr 20, 2017
55bea56
[SPARK-20156][SQL][FOLLOW-UP] Java String toLowerCase "Turkish locale…
gatorsmile Apr 20, 2017
c6f62c5
[SPARK-20405][SQL] Dataset.withNewExecutionId should be private
rxin Apr 20, 2017
b91873d
[SPARK-20409][SQL] fail early if aggregate function in GROUP BY
cloud-fan Apr 20, 2017
c5a31d1
[SPARK-20407][TESTS] ParquetQuerySuite 'Enabling/disabling ignoreCorr…
bogdanrdc Apr 20, 2017
b2ebadf
[SPARK-20358][CORE] Executors failing stage on interrupted exception …
ericl Apr 20, 2017
d95e4d9
[SPARK-20334][SQL] Return a better error message when correlated pred…
dilipbiswal Apr 20, 2017
0332063
[SPARK-20410][SQL] Make sparkConf a def in SharedSQLContext
hvanhovell Apr 20, 2017
592f5c8
[SPARK-20172][CORE] Add file permission check when listing files in F…
jerryshao Apr 20, 2017
0368eb9
[SPARK-20367] Properly unescape column names of partitioning columns …
juliuszsompolski Apr 21, 2017
760c8d0
[SPARK-20329][SQL] Make timezone aware expression without timezone un…
hvanhovell Apr 21, 2017
48d760d
[SPARK-20281][SQL] Print the identical Range parameters of SparkConte…
maropu Apr 21, 2017
e2b3d23
[SPARK-20420][SQL] Add events to the external catalog
hvanhovell Apr 21, 2017
3476799
Small rewording about history server use case
dud225 Apr 21, 2017
c9e6035
[SPARK-20412] Throw ParseException from visitNonOptionalPartitionSpec…
juliuszsompolski Apr 21, 2017
a750a59
[SPARK-20341][SQL] Support BigInt's value that does not fit in long v…
kiszk Apr 21, 2017
eb00378
[SPARK-20423][ML] fix MLOR coeffs centering when reg == 0
WeichenXu123 Apr 21, 2017
fd648bf
[SPARK-20371][R] Add wrappers for collect_list and collect_set
zero323 Apr 21, 2017
ad29040
[SPARK-20401][DOC] In the spark official configuration document, the …
Apr 21, 2017
05a4514
[SPARK-20386][SPARK CORE] modify the log info if the block exists on …
eatoncys Apr 22, 2017
b3c572a
[SPARK-20430][SQL] Initialise RangeExec parameters in a driver side
maropu Apr 22, 2017
8765bc1
[SPARK-20132][DOCS] Add documentation for column string functions
map222 Apr 23, 2017
2eaf4f3
[SPARK-20385][WEB-UI] Submitted Time' field, the date format needs to…
Apr 23, 2017
e9f9715
[BUILD] Close stale PRs
maropu Apr 24, 2017
776a2c0
[SPARK-20439][SQL] Fix Catalog API listTables and getTable when faile…
gatorsmile Apr 24, 2017
90264ac
[SPARK-18901][ML] Require in LR LogisticAggregator is redundant
wangmiao1981 Apr 24, 2017
8a272dd
[SPARK-20438][R] SparkR wrappers for split and repeat
zero323 Apr 24, 2017
5280d93
[SPARK-20239][CORE] Improve HistoryServer's ACL mechanism
jerryshao Apr 25, 2017
f44c8a8
[SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT
JoshRosen Apr 25, 2017
31345fd
[SPARK-20451] Filter out nested mapType datatypes from sort order in …
sameeragarwal Apr 25, 2017
c8f1219
[SPARK-20455][DOCS] Fix Broken Docker IT Docs
original-brownbear Apr 25, 2017
0bc7a90
[SPARK-20404][CORE] Using Option(name) instead of Some(name)
szhem Apr 25, 2017
387565c
[SPARK-18901][FOLLOWUP][ML] Require in LR LogisticAggregator is redun…
wangmiao1981 Apr 25, 2017
67eef47
[SPARK-20449][ML] Upgrade breeze version to 0.13.1
yanboliang Apr 25, 2017
0a7f5f2
[SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel
Apr 25, 2017
caf3920
[SPARK-18127] Add hooks and extension points to Spark
sameeragarwal Apr 26, 2017
57e1da3
[SPARK-16548][SQL] Inconsistent error handling in JSON parsing SQL fu…
Apr 26, 2017
df58a95
[SPARK-20437][R] R wrappers for rollup and cube
zero323 Apr 26, 2017
7a36525
[SPARK-20400][DOCS] Remove References to 3rd Party Vendor Tools
Apr 26, 2017
7fecf51
[SPARK-19812] YARN shuffle service fails to relocate recovery DB acro…
tgravescs Apr 26, 2017
dbb06c6
[MINOR][ML] Fix some PySpark & SparkR flaky tests
yanboliang Apr 26, 2017
66dd5b8
[SPARK-20391][CORE] Rename memory related fields in ExecutorSummay
jerryshao Apr 26, 2017
99c6cf9
[SPARK-20473] Enabling missing types in ColumnVector.Array
michal-databricks Apr 26, 2017
a277ae8
[SPARK-20474] Fixing OnHeapColumnVector reallocation
michal-databricks Apr 26, 2017
2ba1eba
[SPARK-12868][SQL] Allow adding jars from hdfs
weiqingy Apr 26, 2017
66636ef
[SPARK-20435][CORE] More thorough redaction of sensitive information
markgrover Apr 27, 2017
b4724db
[SPARK-20425][SQL] Support a vertical display mode for Dataset.show
maropu Apr 27, 2017
b58cf77
[DOCS][MINOR] Add missing since to SparkR repeat_string note.
zero323 Apr 27, 2017
ba76662
[SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide
zero323 Apr 27, 2017
7633933
[SPARK-20483] Mesos Coarse mode may starve other Mesos frameworks
dgshep Apr 27, 2017
561e9cc
[SPARK-20421][CORE] Mark internal listeners as deprecated.
Apr 27, 2017
85c6ce6
[SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shu…
Apr 27, 2017
26ac2ce
[SPARK-20482][SQL] Resolving Casts is too strict on having time zone set
rednaxelafx Apr 27, 2017
a4aa466
[SPARK-20487][SQL] `HiveTableScan` node is quite verbose in explained…
tejasapatil Apr 27, 2017
039e32c
[SPARK-20483][MINOR] Test for Mesos Coarse mode may starve other Meso…
dgshep Apr 27, 2017
606432a
[SPARK-20047][ML] Constrained Logistic Regression
yanboliang Apr 27, 2017
01c999e
[SPARK-20461][CORE][SS] Use UninterruptibleThread for Executor and fi…
zsxwing Apr 27, 2017
823baca
[SPARK-20452][SS][KAFKA] Fix a potential ConcurrentModificationExcept…
zsxwing Apr 27, 2017
b90bf52
[SPARK-12837][CORE] Do not send the name of internal accumulator to e…
cloud-fan Apr 28, 2017
7fe8249
[SPARKR][DOC] Document LinearSVC in R programming guide
wangmiao1981 Apr 28, 2017
e3c8160
[SPARK-20476][SQL] Block users to create a table that use commas in t…
gatorsmile Apr 28, 2017
59e3a56
[SPARK-14471][SQL] Aliases in SELECT could be used in GROUP BY
maropu Apr 28, 2017
8c911ad
[SPARK-20465][CORE] Throws a proper exception when any temp directory…
HyukjinKwon Apr 28, 2017
733b81b
[SPARK-20496][SS] Bug in KafkaWriter Looks at Unanalyzed Plans
Apr 28, 2017
5d71f3d
[SPARK-20514][CORE] Upgrade Jetty to 9.3.11.v20160721
markgrover Apr 28, 2017
ebff519
[SPARK-20471] Remove AggregateBenchmark testsuite warning: Two level …
heary-cao Apr 28, 2017
77bcd77
[SPARK-19525][CORE] Add RDD checkpoint compression support
Apr 28, 2017
814a61a
[SPARK-20487][SQL] Display `serde` for `HiveTableScan` node in explai…
tejasapatil Apr 29, 2017
b28c3bc
[SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programm…
wangmiao1981 Apr 29, 2017
add9d1b
[SPARK-19791][ML] Add doc and example for fpgrowth
YY-OnCall Apr 29, 2017
ee694cd
[SPARK-20533][SPARKR] SparkR Wrappers Model should be private and val…
wangmiao1981 Apr 29, 2017
70f1bcd
[SPARK-20493][R] De-duplicate parse logics for DDL-like type strings …
HyukjinKwon Apr 29, 2017
d228cd0
[SPARK-20442][PYTHON][DOCS] Fill up documentations for functions in C…
HyukjinKwon Apr 29, 2017
4d99b95
[SPARK-20521][DOC][CORE] The default of 'spark.worker.cleanup.appData…
Apr 30, 2017
1ee494d
[SPARK-20492][SQL] Do not print empty parentheses for invalid primiti…
HyukjinKwon Apr 30, 2017
ae3df4e
[SPARK-20535][SPARKR] R wrappers for explode_outer and posexplode_outer
zero323 Apr 30, 2017
6613046
[MINOR][DOCS][PYTHON] Adding missing boolean type for replacement val…
May 1, 2017
80e9cf1
[SPARK-20490][SPARKR] Add R wrappers for eqNullSafe and ! / not
zero323 May 1, 2017
a355b66
[SPARK-20541][SPARKR][SS] support awaitTermination without timeout
felixcheung May 1, 2017
f0169a1
[SPARK-20290][MINOR][PYTHON][SQL] Add PySpark wrapper for eqNullSafe
zero323 May 1, 2017
6b44c4d
[SPARK-20534][SQL] Make outer generate exec return empty rows
hvanhovell May 1, 2017
ab30590
[SPARK-20517][UI] Fix broken history UI download link
jerryshao May 1, 2017
6fc6cf8
[SPARK-20464][SS] Add a job group and description for streaming queri…
kunalkhamar May 1, 2017
2b2dd08
[SPARK-20540][CORE] Fix unstable executor requests.
rdblue May 1, 2017
af726cd
[SPARK-20459][SQL] JdbcUtils throws IllegalStateException: Cause alre…
srowen May 2, 2017
259860d
[SPARK-20463] Add support for IS [NOT] DISTINCT FROM.
ptkool May 2, 2017
943a684
[SPARK-20548] Disable ReplSuite.newProductSeqEncoder with REPL define…
sameeragarwal May 2, 2017
d20a976
[SPARK-20192][SPARKR][DOC] SparkR migration guide to 2.2.0
felixcheung May 2, 2017
90d77e9
[SPARK-20532][SPARKR] Implement grouping and grouping_id
zero323 May 2, 2017
afb21bf
[SPARK-20537][CORE] Fixing OffHeapColumnVector reallocation
kiszk May 2, 2017
86174ea
[SPARK-20549] java.io.CharConversionException: Invalid UTF-32' in Jso…
brkyvz May 2, 2017
e300a5a
[SPARK-20300][ML][PYSPARK] Python API for ALSModel.recommendForAllUse…
May 2, 2017
b1e639a
[SPARK-19235][SQL][TEST][FOLLOW-UP] Enable Test Cases in DDLSuite wit…
gatorsmile May 2, 2017
13f47dc
[SPARK-20490][SPARKR][DOC] add family tag for not function
felixcheung May 2, 2017
ef3df91
[SPARK-20421][CORE] Add a missing deprecation tag.
May 2, 2017
b946f31
[SPARK-20558][CORE] clear InheritableThreadLocal variables in SparkCo…
cloud-fan May 3, 2017
6235132
[SPARK-20567] Lazily bind in GenerateExec
marmbrus May 3, 2017
db2fb84
[SPARK-6227][MLLIB][PYSPARK] Implement PySpark wrappers for SVD and P…
MechCoder May 3, 2017
16fab6b
[SPARK-20523][BUILD] Clean up build warnings for 2.2.0 release
srowen May 3, 2017
7f96f2d
[SPARK-16957][MLLIB] Use midpoints for split values.
facaiy May 3, 2017
27f543b
[SPARK-20441][SPARK-20432][SS] Within the same streaming query, one S…
lw-lin May 3, 2017
527fc5d
[SPARK-20576][SQL] Support generic hint function in Dataset/DataFrame
rxin May 3, 2017
6b9e49d
[SPARK-19965][SS] DataFrame batch reader may fail to infer partitions…
lw-lin May 3, 2017
13eb37c
[MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated …
HyukjinKwon May 3, 2017
02bbe73
[SPARK-20584][PYSPARK][SQL] Python generic hint support
zero323 May 4, 2017
fc472bd
[SPARK-20543][SPARKR] skip tests when running on CRAN
felixcheung May 4, 2017
b8302cc
[SPARK-20015][SPARKR][SS][DOC][EXAMPLE] Document R Structured Streami…
felixcheung May 4, 2017
9c36aa2
[SPARK-20585][SPARKR] R generic hint support
zero323 May 4, 2017
f21897f
[SPARK-20544][SPARKR] R wrapper for input_file_name
zero323 May 4, 2017
57b6470
[SPARK-20571][SPARKR][SS] Flaky Structured Streaming tests
felixcheung May 4, 2017
c5dceb8
[SPARK-20047][FOLLOWUP][ML] Constrained Logistic Regression follow up
yanboliang May 4, 2017
bfc8c79
[SPARK-20566][SQL] ColumnVector should support `appendFloats` for array
dongjoon-hyun May 4, 2017
0d16faa
[SPARK-20574][ML] Allow Bucketizer to handle non-Double numeric column
May 5, 2017
4411ac7
[INFRA] Close stale PRs
HyukjinKwon May 5, 2017
37cdf07
[SPARK-19660][SQL] Replace the deprecated property name fs.default.na…
wangyum May 5, 2017
5773ab1
[SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode
jyu00 May 5, 2017
9064f1b
[SPARK-20495][SQL][CORE] Add StorageLevel to cacheTable API
phatak-dev May 5, 2017
b9ad2d1
[SPARK-20613] Remove excess quotes in Windows executable
jarrettmeyer May 5, 2017
41439fd
[SPARK-20381][SQL] Add SQL metrics of numOutputRows for ObjectHashAgg…
May 5, 2017
bd57882
[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 t…
zsxwing May 5, 2017
b31648c
[SPARK-20557][SQL] Support for db column type TIMESTAMP WITH TIME ZONE
JannikArndt May 5, 2017
5d75b14
[SPARK-20616] RuleExecutor logDebug of batch results should show diff…
juliuszsompolski May 5, 2017
b433aca
[SPARK-20614][PROJECT INFRA] Use the same log4j configuration with Je…
HyukjinKwon May 6, 2017
cafca54
[SPARK-20557][SQL] Support JDBC data type Time with Time Zone
gatorsmile May 7, 2017
63d90e7
[SPARK-18777][PYTHON][SQL] Return UDF from udf.register
zero323 May 7, 2017
37f963a
[SPARK-20518][CORE] Supplement the new blockidsuite unit tests
heary-cao May 7, 2017
88e6d75
[SPARK-20484][MLLIB] Add documentation to ALS code
danielyli May 7, 2017
2cf83c4
[SPARK-7481][BUILD] Add spark-hadoop-cloud module to pull in object s…
steveloughran May 7, 2017
7087e01
[SPARK-20543][SPARKR][FOLLOWUP] Don't skip tests on AppVeyor
felixcheung May 7, 2017
500436b
[MINOR][SQL][DOCS] Improve unix_timestamp's scaladoc (and typo hunting)
jaceklaskowski May 7, 2017
1f73d35
[SPARK-20550][SPARKR] R wrapper for Dataset.alias
zero323 May 7, 2017
f53a820
[SPARK-16931][PYTHON][SQL] Add Python wrapper for bucketBy
zero323 May 8, 2017
2269155
[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps
squito May 8, 2017
c24bdaa
[SPARK-20626][SPARKR] address date test warning with timezone on windows
felixcheung May 8, 2017
42cc6d1
[SPARK-20380][SQL] Unable to set/unset table comment property using A…
sujith71955 May 8, 2017
2fdaeb5
[SPARKR][DOC] fix typo in vignettes
May 8, 2017
0f820e2
[SPARK-20519][SQL][CORE] Modify to prevent some possible runtime exce…
10110346 May 8, 2017
1552665
[SPARK-19956][CORE] Optimize a location order of blocks with topology…
ConeyLiu May 8, 2017
58518d0
[SPARK-20596][ML][TEST] Consolidate and improve ALS recommendAll test…
May 8, 2017
aeb2ecc
[SPARK-20621][DEPLOY] Delete deprecated config parameter in 'spark-en…
ConeyLiu May 8, 2017
829cd7b
[SPARK-20605][CORE][YARN][MESOS] Deprecate not used AM and executor p…
jerryshao May 8, 2017
2abfee1
[SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails
falaki May 8, 2017
b952b44
[SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames() test fails
felixcheung May 9, 2017
8079424
[SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll
May 9, 2017
10b00ab
[SPARK-20587][ML] Improve performance of ML ALS recommendForAll
May 9, 2017
be53a78
[SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsEx…
May 9, 2017
b8733e0
[SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML
yanboliang May 9, 2017
0d00c76
[SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after comple…
gatorsmile May 9, 2017
714811d
[SPARK-20311][SQL] Support aliases for table value functions
maropu May 9, 2017
181261a
[SPARK-20355] Add per application spark version on the history server…
May 9, 2017
f561a76
[SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases
cloud-fan May 9, 2017
d099f41
[SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF
rxin May 9, 2017
25ee816
[SPARK-19876][BUILD] Move Trigger.java to java source hierarchy
srowen May 9, 2017
1b85bcd
[SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Pyt…
holdenk May 9, 2017
ac1ab6b
Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps"
rxin May 9, 2017
f79aa28
Revert "[SPARK-20311][SQL] Support aliases for table value functions"
yhuai May 9, 2017
c0189ab
[SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWate…
uncleGen May 9, 2017
771abeb
[SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when …
wangyum May 10, 2017
3d2131a
[SPARK-20590][SQL] Use Spark internal datasource if multiples are fou…
HyukjinKwon May 10, 2017
a90c5cd
[SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggrega…
JoshRosen May 10, 2017
a819dab
[SPARK-20670][ML] Simplify FPGrowth transform
YY-OnCall May 10, 2017
0ef16bd
[SPARK-20668][SQL] Modify ScalaUDF to handle nullability.
ueshin May 10, 2017
804949c
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsisten…
zero323 May 10, 2017
ca4625e
[SPARK-20630][WEB UI] Fixed column visibility in Executor Tab
ajbozarth May 10, 2017
a4cbf26
[SPARK-20637][CORE] Remove mention of old RDD classes from comments
michaelmior May 10, 2017
b512233
[SPARK-20393][WEBU UI] Strengthen Spark to prevent XSS vulnerabilities
n-marion May 10, 2017
789bdbe
[SPARK-20688][SQL] correctly check analysis for scalar sub-queries
cloud-fan May 10, 2017
76e4a55
[SPARK-20678][SQL] Ndv for columns not in filter condition should als…
May 10, 2017
fcb88f9
[MINOR][BUILD] Fix lint-java breaks.
ConeyLiu May 10, 2017
5c2c4dc
[SPARK-19447] Remove remaining references to generated rows metric
ala May 10, 2017
af8b6cc
[SPARK-20689][PYSPARK] python doctest leaking bucketed table
felixcheung May 10, 2017
8ddbc43
[SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ …
JoshRosen May 10, 2017
0698e6c
[SPARK-20606][ML] Revert "[] ML 2.2 QA: Remove deprecated methods for…
yanboliang May 11, 2017
65accb8
[SPARK-17029] make toJSON not go through rdd form but operate on data…
May 11, 2017
b4c99f4
[SPARK-20569][SQL] RuntimeReplaceable functions should not take extra…
cloud-fan May 11, 2017
8c67aa7
[SPARK-20311][SQL] Support aliases for table value functions
maropu May 11, 2017
3aa4e46
[SPARK-20416][SQL] Print UDF names in EXPLAIN
maropu May 11, 2017
7144b51
[SPARK-20600][SS] KafkaRelation should be pretty printed in web UI
jaceklaskowski May 11, 2017
04901dd
[SPARK-20431][SQL] Specify a schema by using a DDL-formatted string
maropu May 11, 2017
609ba5f
[SPARK-20399][SQL] Add a config to fallback string literal parsing co…
viirya May 12, 2017
2b36eb6
[SPARK-20665][SQL] Bround" and "Round" function return NULL
10110346 May 12, 2017
c8da535
[SPARK-20718][SQL] FileSourceScanExec with different filter orders sh…
May 12, 2017
888b84a
[SPARK-20704][SPARKR] change CRAN test to run single thread
felixcheung May 12, 2017
af40bb1
[SPARK-20619][ML] StringIndexer supports multiple ways to order label
May 12, 2017
720708c
[SPARK-20639][SQL] Add single argument support for to_timestamp in SQ…
HyukjinKwon May 12, 2017
fc8a2b6
[SPARK-20554][BUILD] Remove usage of scala.language.reflectiveCalls
srowen May 12, 2017
b236933
[SPARK-17424] Fix unsound substitution bug in ScalaReflection.
rdblue May 12, 2017
54b4f2a
[SPARK-20718][SQL][FOLLOWUP] Fix canonicalization for HiveTableScanExec
May 12, 2017
92ea7fd
[SPARK-20710][SQL] Support aliases in CUBE/ROLLUP/GROUPING SETS
maropu May 12, 2017
b526f70
[SPARK-19951][SQL] Add string concatenate operator || to Spark SQL
maropu May 12, 2017
7d6ff39
[SPARK-20702][CORE] TaskContextImpl.markTaskCompleted should not hide…
zsxwing May 12, 2017
0d3a631
[SPARK-20714][SS] Fix match error when watermark is set with timeout …
tdas May 12, 2017
e3d2022
[SPARK-20594][SQL] The staging directory should be a child directory …
May 12, 2017
b84ff7e
[SPARK-20719][SQL] Support LIMIT ALL
gatorsmile May 12, 2017
3f98375
[SPARK-18772][SQL] Avoid unnecessary conversion try for special float…
HyukjinKwon May 13, 2017
c2c1c5b
respect both gpu and maxgpu
Mar 10, 2017
c5c5c37
Merge branch 'ji/hard_limit_on_gpu' of https://github.com/yanji84/spa…
May 13, 2017
ba87b35
fix syntax
May 13, 2017
5ef2881
fix gpu offer
May 14, 2017
c301f3d
syntax fix
May 14, 2017
7a07742
pass all tests
May 15, 2017
1 change: 1 addition & 0 deletions LICENSE
@@ -297,3 +297,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
(MIT License) machinist (https://github.com/typelevel/machinist)
20 changes: 10 additions & 10 deletions R/check-cran.sh
@@ -20,18 +20,18 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null

. $FWDIR/find-r.sh
. "$FWDIR/find-r.sh"

# Install the package (this is required for code in vignettes to run when building it later)
# Build the latest docs, but not vignettes, which is built with the package next
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Build source package with vignettes
SPARK_HOME="$(cd "${FWDIR}"/..; pwd)"
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"
if [ -f "${SPARK_HOME}/RELEASE" ]; then
SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
@@ -40,16 +40,16 @@ fi

if [ -d "$SPARK_JARS_DIR" ]; then
# Build a zip file containing the source package with vignettes
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD build "$FWDIR/pkg"

find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
else
echo "Error Spark JARs not found in $SPARK_HOME"
echo "Error Spark JARs not found in '$SPARK_HOME'"
exit 1
fi

# Run check as-cran.
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`

CRAN_CHECK_OPTIONS="--as-cran"

@@ -67,10 +67,10 @@ echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"

if [ -n "$NO_TESTS" ] && [ -n "$NO_MANUAL" ]
then
"$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
"$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
else
# This will run tests and/or build vignettes, and require SPARK_HOME
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
fi

popd > /dev/null
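
The check-cran.sh edits above, like the analogous changes to create-docs.sh, create-rd.sh, install-dev.sh and install-source-package.sh further down, all apply the same fix: every path expansion ($FWDIR, $SPARK_HOME, $LIB_DIR, the SparkR tarball name) is wrapped in double quotes, which keeps the scripts working when the checkout path contains spaces or other shell metacharacters. A minimal sketch of the behaviour the quoting guards against (illustrative only; the path with a space in it is hypothetical, not taken from the diff):

# Assume the repository sits at "/tmp/spark checkout/R" (hypothetical path).
FWDIR="$(cd "`dirname "$0"`"; pwd)"

# Unquoted: the expansion word-splits on the space, so pushd receives two arguments and fails.
pushd $FWDIR > /dev/null

# Quoted: the whole path is passed as a single argument.
pushd "$FWDIR" > /dev/null
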
10 changes: 5 additions & 5 deletions R/create-docs.sh
@@ -33,23 +33,23 @@ export FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
export SPARK_HOME="$(cd "`dirname "${BASH_SOURCE[0]}"`"/..; pwd)"

# Required for setting SPARK_SCALA_VERSION
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"

echo "Using Scala $SPARK_SCALA_VERSION"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Install the package (this will also generate the Rd files)
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Now create HTML files

# knit_rd puts html in current working directory
mkdir -p pkg/html
pushd pkg/html

"$R_SCRIPT_PATH/"Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'
"$R_SCRIPT_PATH/Rscript" -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'

popd

8 changes: 4 additions & 4 deletions R/create-rd.sh
@@ -29,9 +29,9 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Generate Rd files if devtools is installed
"$R_SCRIPT_PATH/"Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
"$R_SCRIPT_PATH/Rscript" -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
14 changes: 7 additions & 7 deletions R/install-dev.sh
@@ -29,21 +29,21 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
LIB_DIR="$FWDIR/lib"

mkdir -p $LIB_DIR
mkdir -p "$LIB_DIR"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

. $FWDIR/create-rd.sh
. "$FWDIR/create-rd.sh"

# Install SparkR to $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/
"$R_SCRIPT_PATH/R" CMD INSTALL --library="$LIB_DIR" "$FWDIR/pkg/"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
cd $LIB_DIR
cd "$LIB_DIR"
jar cfM "$LIB_DIR/sparkr.zip" SparkR

popd > /dev/null
20 changes: 10 additions & 10 deletions R/install-source-package.sh
@@ -29,28 +29,28 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

if [ -z "$VERSION" ]; then
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`
fi

if [ ! -f "$FWDIR"/SparkR_"$VERSION".tar.gz ]; then
echo -e "R source package file $FWDIR/SparkR_$VERSION.tar.gz is not found."
if [ ! -f "$FWDIR/SparkR_$VERSION.tar.gz" ]; then
echo -e "R source package file '$FWDIR/SparkR_$VERSION.tar.gz' is not found."
echo -e "Please build R source package with check-cran.sh"
exit -1;
fi

echo "Removing lib path and installing from source package"
LIB_DIR="$FWDIR/lib"
rm -rf $LIB_DIR
mkdir -p $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL SparkR_"$VERSION".tar.gz --library=$LIB_DIR
rm -rf "$LIB_DIR"
mkdir -p "$LIB_DIR"
"$R_SCRIPT_PATH/R" CMD INSTALL "SparkR_$VERSION.tar.gz" --library="$LIB_DIR"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
pushd $LIB_DIR > /dev/null
pushd "$LIB_DIR" > /dev/null
jar cfM "$LIB_DIR/sparkr.zip" SparkR
popd > /dev/null

2 changes: 1 addition & 1 deletion R/pkg/.lintr
@@ -1,2 +1,2 @@
linters: with_defaults(line_length_linter(100), camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE))
linters: with_defaults(line_length_linter(100), multiple_dots_linter = NULL, camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE))
exclusions: list("inst/profile/general.R" = 1, "inst/profile/shell.R")
3 changes: 3 additions & 0 deletions R/pkg/DESCRIPTION
@@ -35,6 +35,7 @@ Collate:
'WindowSpec.R'
'backend.R'
'broadcast.R'
'catalog.R'
'client.R'
'context.R'
'deserialize.R'
@@ -43,6 +44,7 @@ Collate:
'jvm.R'
'mllib_classification.R'
'mllib_clustering.R'
'mllib_fpm.R'
'mllib_recommendation.R'
'mllib_regression.R'
'mllib_stat.R'
@@ -51,6 +53,7 @@
'serialize.R'
'sparkR.R'
'stats.R'
'streaming.R'
'types.R'
'utils.R'
'window.R'
48 changes: 46 additions & 2 deletions R/pkg/NAMESPACE
@@ -66,7 +66,10 @@ exportMethods("glm",
"spark.randomForest",
"spark.gbt",
"spark.bisectingKmeans",
"spark.svmLinear")
"spark.svmLinear",
"spark.fpGrowth",
"spark.freqItemsets",
"spark.associationRules")

# Job group lifecycle management methods
export("setJobGroup",
@@ -82,6 +85,7 @@ exportMethods("arrange",
"as.data.frame",
"attach",
"cache",
"checkpoint",
"coalesce",
"collect",
"colnames",
@@ -97,6 +101,7 @@
"createOrReplaceTempView",
"crossJoin",
"crosstab",
"cube",
"dapply",
"dapplyCollect",
"describe",
@@ -118,9 +123,11 @@
"group_by",
"groupBy",
"head",
"hint",
"insertInto",
"intersect",
"isLocal",
"isStreaming",
"join",
"limit",
"merge",
@@ -138,6 +145,7 @@
"registerTempTable",
"rename",
"repartition",
"rollup",
"sample",
"sample_frac",
"sampleBy",
@@ -169,12 +177,14 @@
"write.json",
"write.orc",
"write.parquet",
"write.stream",
"write.text",
"write.ml")

exportClasses("Column")

exportMethods("%in%",
exportMethods("%<=>%",
"%in%",
"abs",
"acos",
"add_months",
@@ -197,6 +207,8 @@ exportMethods("%in%",
"cbrt",
"ceil",
"ceiling",
"collect_list",
"collect_set",
"column",
"concat",
"concat_ws",
@@ -207,6 +219,8 @@ exportMethods("%in%",
"count",
"countDistinct",
"crc32",
"create_array",
"create_map",
"hash",
"cume_dist",
"date_add",
@@ -222,6 +236,7 @@ exportMethods("%in%",
"endsWith",
"exp",
"explode",
"explode_outer",
"expm1",
"expr",
"factorial",
@@ -235,12 +250,15 @@ exportMethods("%in%",
"getField",
"getItem",
"greatest",
"grouping_bit",
"grouping_id",
"hex",
"histogram",
"hour",
"hypot",
"ifelse",
"initcap",
"input_file_name",
"instr",
"isNaN",
"isNotNull",
@@ -278,18 +296,21 @@ exportMethods("%in%",
"nanvl",
"negate",
"next_day",
"not",
"ntile",
"otherwise",
"over",
"percent_rank",
"pmod",
"posexplode",
"posexplode_outer",
"quarter",
"rand",
"randn",
"rank",
"regexp_extract",
"regexp_replace",
"repeat_string",
"reverse",
"rint",
"rlike",
@@ -313,6 +334,7 @@ exportMethods("%in%",
"sort_array",
"soundex",
"spark_partition_id",
"split_string",
"stddev",
"stddev_pop",
"stddev_samp",
@@ -355,17 +377,29 @@ export("as.DataFrame",
"clearCache",
"createDataFrame",
"createExternalTable",
"createTable",
"currentDatabase",
"dropTempTable",
"dropTempView",
"jsonFile",
"listColumns",
"listDatabases",
"listFunctions",
"listTables",
"loadDF",
"parquetFile",
"read.df",
"read.jdbc",
"read.json",
"read.orc",
"read.parquet",
"read.stream",
"read.text",
"recoverPartitions",
"refreshByPath",
"refreshTable",
"setCheckpointDir",
"setCurrentDatabase",
"spark.lapply",
"spark.addFile",
"spark.getSparkFilesRootDirectory",
@@ -402,6 +436,16 @@ export("partitionBy",
export("windowPartitionBy",
"windowOrderBy")

exportClasses("StreamingQuery")

export("awaitTermination",
"isActive",
"lastProgress",
"queryName",
"status",
"stopQuery")


S3method(print, jobj)
S3method(print, structField)
S3method(print, structType)
Expand Down