Changes from all commits (716 commits)
068c4ae
[SPARK-24364][SS] Prevent InMemoryFileIndex from failing if file path…
HyukjinKwon May 24, 2018
f48d624
[SPARK-24230][SQL] Fix SpecificParquetRecordReaderBase with dictionar…
rdblue May 24, 2018
d0f30e3
[SPARK-24378][SQL] Fix date_trunc function incorrect examples
wangyum May 24, 2018
54aeae7
[MINOR] Add port SSL config in toString and scaladoc
mgaido91 May 25, 2018
a06fc45
[SPARK-19112][CORE][FOLLOW-UP] Add missing shortCompressionCodecNames…
wangyum May 26, 2018
9b0f6f5
[SPARK-24334] Fix race condition in ArrowPythonRunner causes unclean …
icexelloss May 28, 2018
8bb6c22
[SPARK-24392][PYTHON] Label pandas_udf as Experimental
BryanCutler May 28, 2018
a9700cb
[SPARK-24373][SQL] Add AnalysisBarrier to RelationalGroupedDataset's …
mgaido91 May 28, 2018
fec43fe
[SPARK-19613][SS][TEST] Random.nextString is not safe for directory n…
dongjoon-hyun May 29, 2018
49a6c2b
[SPARK-23991][DSTREAMS] Fix data loss when WAL write fails in allocat…
gaborgsomogyi May 29, 2018
66289a3
[SPARK-24369][SQL] Correct handling for multiple distinct aggregation…
maropu May 30, 2018
e1c0ab1
[SPARK-23754][BRANCH-2.3][PYTHON] Re-raising StopIteration in client …
e-dorigatti May 30, 2018
3a024a4
[SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correctly into Pyth…
HyukjinKwon May 30, 2018
dc24da2
[WEBUI] Avoid possibility of script in query param keys
srowen May 31, 2018
b37e76f
[SPARK-24414][UI] Calculate the correct number of tasks for a stage.
May 31, 2018
e56266a
[SPARK-24444][DOCS][PYTHON][BRANCH-2.3] Improve Pandas UDF docs to ex…
BryanCutler Jun 1, 2018
1cc5f68
Preparing Spark release v2.3.1-rc3
vanzin Jun 1, 2018
2e0c346
Preparing development version 2.3.2-SNAPSHOT
vanzin Jun 1, 2018
e4e96f9
Revert "[SPARK-24369][SQL] Correct handling for multiple distinct agg…
gatorsmile Jun 1, 2018
30aaa5a
Preparing Spark release v2.3.1-rc4
vanzin Jun 1, 2018
21800b8
Preparing development version 2.3.2-SNAPSHOT
vanzin Jun 1, 2018
1819454
[SPARK-24455][CORE] fix typo in TaskSchedulerImpl comment
Jun 4, 2018
36f1d5e
[SPARK-24369][SQL] Correct handling for multiple distinct aggregation…
cloud-fan Jun 4, 2018
1582945
[SPARK-24468][SQL] Handle negative scale when adjusting precision for…
mgaido91 Jun 9, 2018
4d4548a
[SPARK-23732][DOCS] Fix source links in generated scaladoc.
Jun 12, 2018
bf58687
[SPARK-24502][SQL] flaky test: UnsafeRowSerializerSuite
cloud-fan Jun 12, 2018
63e1da1
[SPARK-24531][TESTS] Remove version 2.2.0 from testing versions in Hi…
mgaido91 Jun 12, 2018
a55de38
[SPARK-24506][UI] Add UI filters to tabs added after binding
mgaido91 Jun 12, 2018
470cacd
[SPARK-23754][PYTHON][FOLLOWUP][BACKPORT-2.3] Move UDF stop iteration…
e-dorigatti Jun 13, 2018
a2f65eb
[MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInMemorySorterSuite
jiangxb1987 Jun 14, 2018
e6bf325
[SPARK-24495][SQL] EnsureRequirement returns wrong plan when reorderi…
mgaido91 Jun 14, 2018
7f1708a
[PYTHON] Fix typo in serializer exception
rberenguel Jun 15, 2018
d3255a5
revert [SPARK-21743][SQL] top-most limit should not cause memory leak
cloud-fan Jun 15, 2018
a7d378e
[SPARK-24531][TESTS] Replace 2.3.0 version with 2.3.1
mgaido91 Jun 13, 2018
d426104
[SPARK-24452][SQL][CORE] Avoid possible overflow in int add or multiple
kiszk Jun 15, 2018
9d63e54
[SPARK-24216][SQL] Spark TypedAggregateExpression uses getSimpleName …
Jun 12, 2018
b8dbfcc
Fix issue in 'docker-image-tool.sh'
fabriziocucci Jun 18, 2018
50cdb41
[SPARK-24542][SQL] UDF series UDFXPathXXXX allow users to pass carefu…
gatorsmile Jun 19, 2018
d687d97
[SPARK-24583][SQL] Wrong schema type in InsertIntoDataSourceCommand
maryannxue Jun 19, 2018
8928de3
[SPARK-24578][CORE] Cap sub-region's size of returned nio buffer
WenboZhao Jun 20, 2018
3a4b6f3
[SPARK-24589][CORE] Correctly identify tasks in output commit coordin…
Jun 21, 2018
a1e9640
[SPARK-24588][SS] streaming join should require HashClusteredPartitio…
cloud-fan Jun 21, 2018
db538b2
[SPARK-24552][CORE][SQL][BRANCH-2.3] Use unique id instead of attempt…
Jun 25, 2018
6e1f5e0
[SPARK-24613][SQL] Cache with UDF could not be matched with subsequen…
maryannxue Jun 21, 2018
0f534d3
[SPARK-24603][SQL] Fix findTightestCommonType reference in comments
Jun 28, 2018
8ff4b97
simplify rand in dsl/package.scala
gatorsmile Jun 30, 2018
3c0af79
[SPARK-24696][SQL] ColumnPruning rule fails to remove extra Project
maryannxue Jun 30, 2018
1cba050
[SPARK-24507][DOCUMENTATION] Update streaming guide
rekhajoshm Jul 2, 2018
bc7ee75
[SPARK-24385][SQL] Resolve self-join condition ambiguity for EqualNul…
mgaido91 Jul 3, 2018
e5cc5f6
[SPARK-24535][SPARKR] fix tests on java check error
felixcheung Jul 6, 2018
64c72b4
[SPARK-24739][PYTHON] Make PySpark compatible with Python 3.7
HyukjinKwon Jul 7, 2018
4df06b4
Preparing Spark release v2.3.2-rc1
jerryshao Jul 8, 2018
72eb97c
Preparing development version 2.3.3-SNAPSHOT
jerryshao Jul 8, 2018
19542f5
[SPARK-24530][PYTHON] Add a control to force Python version in Sphinx…
HyukjinKwon Jul 11, 2018
307499e
Preparing Spark release v2.3.2-rc2
jerryshao Jul 11, 2018
86457a1
Preparing development version 2.3.3-SNAPSHOT
jerryshao Jul 11, 2018
3242925
[SPARK-24208][SQL] Fix attribute deduplication for FlatMapGroupsInPandas
mgaido91 Jul 11, 2018
9cf375f
[SPARK-24781][SQL] Using a reference from Dataset in Filter/Sort migh…
viirya Jul 13, 2018
b3726da
Preparing Spark release v2.3.2-rc3
jerryshao Jul 15, 2018
f9a2b0a
Preparing development version 2.3.3-SNAPSHOT
jerryshao Jul 15, 2018
dae352a
[SPARK-24813][TESTS][HIVE][HOTFIX] HiveExternalCatalogVersionsSuite s…
srowen Jul 16, 2018
e31b476
[SPARK-24813][BUILD][FOLLOW-UP][HOTFIX] HiveExternalCatalogVersionsSu…
srowen Jul 17, 2018
7be70e2
[SPARK-24677][CORE] Avoid NoSuchElementException from MedianHeap
cxzl25 Jul 18, 2018
d0280ab
[SPARK-24755][CORE] Executor loss can cause task to not be resubmitted
Jul 19, 2018
db1f3cc
[SPARK-23731][SQL] Make FileSourceScanExec canonicalizable after bein…
HyukjinKwon Jul 20, 2018
bd6bfac
[SPARK-24879][SQL] Fix NPE in Hive partition pruning filter pushdown
PenguinToast Jul 21, 2018
f5bc948
[SQL][HIVE] Correct an assert message in function makeRDDForTable
SongYadong Jul 23, 2018
740a23d
[SPARK-22499][FOLLOWUP][SQL] Reduce input string expressions for Leas…
HyukjinKwon Jul 24, 2018
6a59992
[SPARK-24891][SQL] Fix HandleNullInputsForUDF rule
maryannxue Jul 25, 2018
740606e
[SPARK-24891][FOLLOWUP][HOT-FIX][2.3] Fix the Compilation Errors
gatorsmile Jul 25, 2018
fa552c3
[SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter
gatorsmile Jul 26, 2018
d5f340f
[SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-java cannot be "…
liancheng Jul 27, 2018
71eb7d4
[SPARK-24809][SQL] Serializing LongToUnsafeRowMap in executor may res…
liutang123 Jul 29, 2018
bad56bb
[MINOR][CORE][TEST] Fix afterEach() in TastSetManagerSuite and TaskSc…
jiangxb1987 Jul 30, 2018
aa51c07
[SPARK-24934][SQL] Explicitly whitelist supported types in upper/lowe…
HyukjinKwon Jul 30, 2018
25ea27b
[SPARK-24957][SQL] Average with decimal followed by aggregation retur…
mgaido91 Jul 30, 2018
fc3df45
[SPARK-24536] Validate that an evaluated limit clause cannot be null
mauropalsgraaf Jul 31, 2018
5b187a8
[SPARK-24976][PYTHON] Allow None for Decimal type conversion (specifi…
HyukjinKwon Aug 1, 2018
8080c93
[PYSPARK] Updates to Accumulators
LucaCanali Jul 18, 2018
14b50d7
[SPARK-24987][SS] - Fix Kafka consumer leak when no new offsets for T…
Aug 4, 2018
136588e
[SPARK-25015][BUILD] Update Hadoop 2.7 to 2.7.7
srowen Aug 4, 2018
9fb70f4
[SPARK-24948][SHS][BACKPORT-2.3] Delegate check access permissions to…
mgaido91 Aug 8, 2018
7d465d8
[MINOR][BUILD] Update Jetty to 9.3.24.v20180605
srowen Aug 9, 2018
9bfc55b
[SPARK-25076][SQL] SQLConf should not be retrieved from a stopped Spa…
cloud-fan Aug 9, 2018
b426ec5
[SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays f…
d80tb7 Jul 28, 2018
6930f48
Preparing Spark release v2.3.2-rc4
jerryshao Aug 10, 2018
e66f3f9
Preparing development version 2.3.3-SNAPSHOT
jerryshao Aug 10, 2018
7306ac7
[MINOR][BUILD] Add ECCN notice required by http://www.apache.org/dev/…
srowen Aug 10, 2018
04c6520
[SPARK-25081][CORE] Nested spill in ShuffleExternalSorter should not …
zsxwing Aug 10, 2018
a0a7e41
[SPARK-24908][R][STYLE] removing spaces to make lintr happy
shaneknapp Jul 24, 2018
b9b35b9
[SPARK-25084][SQL][BACKPORT-2.3] "distribute by" on multiple columns (…
LantaoJin Aug 13, 2018
787790b
[SPARK-25028][SQL] Avoid NPE when analyzing partition with NULL values
mgaido91 Aug 13, 2018
4dc8225
Preparing Spark release v2.3.2-rc5
jerryshao Aug 14, 2018
29a0403
Preparing development version 2.3.3-SNAPSHOT
jerryshao Aug 14, 2018
0856b82
[MINOR][SQL][DOC] Fix `to_json` example in function description and doc
dongjoon-hyun Aug 14, 2018
34191e6
[SPARK-25051][SQL] FixNullability should not stop on AnalysisBarrier
mgaido91 Aug 14, 2018
032f6d9
[MINOR][DOC][SQL] use one line for annotation arg value
mengxr Aug 18, 2018
ea01e36
[SPARK-25144][SQL][TEST][BRANCH-2.3] Free aggregate map when task ends
cloud-fan Aug 20, 2018
9702bb6
[DOCS] Fixed NDCG formula issues
yueguoguo Aug 20, 2018
8bde467
[SPARK-25114][CORE] Fix RecordBinaryComparator when subtraction betwe…
jiangxb1987 Aug 21, 2018
9cb9d72
[SPARK-25114][2.3][CORE][FOLLOWUP] Fix RecordBinaryComparatorSuite bu…
jiangxb1987 Aug 21, 2018
fcc9bd6
[SPARK-25205][CORE] Fix typo in spark.network.crypto.keyFactoryIterat…
squito Aug 24, 2018
42c1fdd
[SPARK-25234][SPARKR] avoid integer overflow in parallelize
mengxr Aug 24, 2018
f598382
[SPARK-25124][ML] VectorSizeHint setSize and getSize don't return val…
huaxingao Aug 24, 2018
8db935f
[SPARK-25164][SQL] Avoid rebuilding column and path list for each col…
bersprockets Aug 23, 2018
306e881
[SPARK-24704][WEBUI] Fix the order of stages in the DAG graph
stanzhai Jul 4, 2018
b072717
[SPARK-25273][DOC] How to install testthat 1.0.2
MaxGekk Aug 30, 2018
dbf0b93
[SPARK-24909][CORE] Always unregister pending partition on task compl…
Aug 29, 2018
31e46ec
[SPARK-25231] Fix synchronization of executor heartbeat receiver in T…
Sep 5, 2018
9db81fd
[SPARK-25313][BRANCH-2.3][SQL] Fix regression in FileFormatWriter out…
gengliangwang Sep 6, 2018
31dab71
[SPARK-25072][PYSPARK] Forbid extra value for custom Row
Sep 6, 2018
d22379e
[SPARK-23243][CORE][2.3] Fix RDD.repartition() data correctness issue
cloud-fan Sep 7, 2018
84922e5
[SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3
wangyum Sep 7, 2018
5b8b6b4
[SPARK-24415][CORE] Fixed the aggregated stage metrics by retaining s…
ankuriitg Sep 5, 2018
5ad644a
[SPARK-25368][SQL] Incorrect predicate pushdown returns wrong result
wangyum Sep 9, 2018
4b57818
Revert "[SPARK-25072][PYSPARK] Forbid extra value for custom Row"
gatorsmile Sep 10, 2018
60e56bc
[SPARK-25313][SQL][FOLLOW-UP][BACKPORT-2.3] Fix InsertIntoHiveDirComm…
wangyum Sep 11, 2018
18688d3
[SPARK-24889][CORE] Update block info when unpersist rdds
viirya Sep 11, 2018
d8ec5ff
[SPARK-25371][SQL][BACKPORT-2.3] struct() should allow being called w…
mgaido91 Sep 12, 2018
db9c041
[SPARK-25402][SQL] Null handling in BooleanSimplification
gatorsmile Sep 12, 2018
9ac9f36
[SPARK-25357][SQL] Add metadata to SparkPlanInfo to dump more informa…
LantaoJin Sep 13, 2018
a2a54a5
[SPARK-25253][PYSPARK] Refactor local connection & auth code
squito Aug 29, 2018
09dd34c
[PYSPARK] Updates to pyspark broadcast
squito Aug 14, 2018
6d742d1
[PYSPARK][SQL] Updates to RowQueue
squito Sep 6, 2018
575fea1
[CORE] Updates to remote cache reads
squito Aug 22, 2018
f3bbb7c
[HOTFIX] fix lint-java
squito Sep 13, 2018
0c1e3d1
[SPARK-25400][CORE][TEST] Increase test timeouts
squito Sep 13, 2018
02b5107
Preparing Spark release v2.3.2-rc6
jerryshao Sep 16, 2018
7b5da37
Preparing development version 2.3.3-SNAPSHOT
jerryshao Sep 16, 2018
e319a62
[SPARK-25471][PYTHON][TEST] Fix pyspark-sql test error when using Pyt…
BryanCutler Sep 20, 2018
dad5c48
[MINOR][PYTHON] Use a helper in `PythonUtils` instead of direct acces…
HyukjinKwon Sep 20, 2018
7edfdfc
[SPARK-25450][SQL] PushProjectThroughUnion rule uses the same exprId …
maryannxue Sep 20, 2018
8ccc478
[SPARK-25502][CORE][WEBUI] Empty Page when page number exceeds the re…
shahidki31 Sep 24, 2018
12717ba
[SPARKR] Match pyspark features in SparkR communication protocol
HyukjinKwon Sep 24, 2018
9674d08
[SPARK-25503][CORE][WEBUI] Total task message in stage page is ambiguous
shahidki31 Sep 25, 2018
cbb228e
[SPARK-25425][SQL][BACKPORT-2.3] Extra options should override sessio…
MaxGekk Sep 26, 2018
2381d60
[SPARK-25509][CORE] Windows doesn't support POSIX permissions
Sep 26, 2018
26d893a
[SPARK-25454][SQL] add a new config for picking minimum precision for…
cloud-fan Sep 27, 2018
f40e4c7
[SPARK-25536][CORE] metric value for METRIC_OUTPUT_RECORDS_WRITTEN is…
shahidki31 Sep 27, 2018
f13565b
[SPARK-25533][CORE][WEBUI] AppSummary should hold the information abo…
shahidki31 Sep 26, 2018
eb78380
[SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in HiveExternalCata…
dongjoon-hyun Sep 29, 2018
73408f0
[SPARK-25568][CORE] Continue to update the remaining accumulators whe…
zsxwing Sep 30, 2018
8d7723f
[CORE][MINOR] Fix obvious error and compiling for Scala 2.12.7
da-liii Sep 30, 2018
7102aee
[SPARK-25583][DOC][BRANCH-2.3] Add history-server related configurati…
shahidki31 Oct 3, 2018
5324a85
[SPARK-25674][SQL] If the records are incremented by more than 1 at a…
10110346 Oct 11, 2018
182bc85
[SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplifi…
gatorsmile Oct 13, 2018
b3d1b1b
Revert "[SPARK-25714] Fix Null Handling in the Optimizer rule Boolean…
gatorsmile Oct 13, 2018
1e15998
[SPARK-25726][SQL][TEST] Fix flaky test in SaveIntoDataSourceCommandS…
dongjoon-hyun Oct 14, 2018
d87896b
[SPARK-25714][BACKPORT-2.3] Fix Null Handling in the Optimizer rule B…
gatorsmile Oct 16, 2018
0726bc5
[SPARK-25674][FOLLOW-UP] Update the stats for each ColumnarBatch
gatorsmile Oct 16, 2018
61b301c
[SPARK-21402][SQL][BACKPORT-2.3] Fix java array of structs deserializ…
Oct 18, 2018
353d328
[SPARK-25768][SQL] fix constant argument expecting UDAFs
peter-toth Oct 19, 2018
5cef11a
fix security issue of zinc
cloud-fan Oct 19, 2018
719ff7a
[DOC][MINOR] Fix minor error in the code of graphx guide
WeichenXu123 Oct 20, 2018
d7a3587
fix security issue of zinc(simplier version)
cloud-fan Oct 19, 2018
8fbf3ee
[SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Example
dongjoon-hyun Oct 22, 2018
0a05cf9
[SPARK-25822][PYSPARK] Fix a race condition when releasing a Python w…
zsxwing Oct 26, 2018
3afb3a2
[SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server s…
shaneknapp Oct 26, 2018
53aeb3d
[SPARK-25816][SQL] Fix attribute resolution in nested extractors
peter-toth Oct 29, 2018
3e0160b
[SPARK-25797][SQL][DOCS][BACKPORT-2.3] Add migration doc for solving …
seancxmao Oct 29, 2018
632c0d9
[DOC] Fix doc for spark.sql.parquet.recordLevelFilter.enabled
bersprockets Oct 29, 2018
49e1eb8
[SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when …
patrickbrownsync Nov 1, 2018
0c7d82b
[SPARK-25933][DOCUMENTATION] Fix pstats.Stats() reference in configur…
Nov 3, 2018
7a59618
[SPARK-26011][SPARK-SUBMIT] Yarn mode pyspark app without python main…
shanyu Nov 15, 2018
550408e
[SPARK-25934][MESOS] Don't propagate SPARK_CONF_DIR from spark submit
mpmolek Nov 16, 2018
90e4dd1
[MINOR][SQL] Fix typo in CTAS plan database string
dongjoon-hyun Nov 17, 2018
0fb830c
[SPARK-26084][SQL] Fixes unresolved AggregateExpression.references ex…
ssimeonov Nov 20, 2018
8b6504e
[SPARK-26109][WEBUI] Duration in the task summary metrics table and t…
shahidki31 Nov 21, 2018
62010d6
[SPARK-26118][BACKPORT-2.3][WEB UI] Introducing spark.ui.requestHeade…
attilapiros Nov 22, 2018
de5f489
[SPARK-25786][CORE] If the ByteBuffer.hasArray is false , it will thr…
10110346 Nov 24, 2018
96a5a12
[SPARK-26137][CORE] Use Java system property "file.separator" inste…
Nov 28, 2018
e96ba84
[SPARK-26211][SQL] Fix InSet for binary, and struct and array with null.
ueshin Nov 29, 2018
4ee463a
[SPARK-26201] Fix python broadcast with encryption
Nov 30, 2018
0058986
[MINOR][DOC] Correct some document description errors
10110346 Dec 1, 2018
8236f64
[SPARK-26198][SQL] Fix Metadata serialize null values throw NPE
wangyum Dec 2, 2018
1899dd2
[SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when encoding a decima…
mgaido91 Dec 5, 2018
3772d93
[SPARK-26307][SQL] Fix CTAS when INSERT a partitioned table using Hiv…
gatorsmile Dec 10, 2018
7930fbd
[SPARK-26327][SQL][BACKPORT-2.3] Bug fix for `FileSourceScanExec` met…
xuanyuanking Dec 14, 2018
20558f7
[SPARK-26315][PYSPARK] auto cast threshold from Integer to Float in a…
Dec 15, 2018
1576bd7
[SPARK-26352][SQL] join reorder should not change the order of output…
rednaxelafx Dec 17, 2018
bccefa5
[SPARK-26352][SQL][FOLLOWUP-2.3] Fix missing sameOutput in branch-2.3
rednaxelafx Dec 17, 2018
35c4235
[SPARK-26316][SPARK-21052][BRANCH-2.3] Revert hash join metrics in th…
JkSelf Dec 18, 2018
832812e
[SPARK-26394][CORE] Fix annotation error for Utils.timeStringAsMs
Dec 18, 2018
a22a11b
[SPARK-24687][CORE] Avoid job hanging when generate task binary cause…
caneGuy Dec 20, 2018
b4aeb81
[SPARK-26422][R] Support to disable Hive support in SparkR even for H…
HyukjinKwon Dec 21, 2018
a7d50ae
[SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter should consi…
mgaido91 Dec 21, 2018
d9d3bea
Revert "[SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter shou…
dongjoon-hyun Dec 22, 2018
acf20d2
[SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter should consi…
mgaido91 Dec 23, 2018
acbfb31
[SPARK-26444][WEBUI] Stage color doesn't change with it's status
seancxmao Dec 28, 2018
c3d759f
[SPARK-26496][SS][TEST] Avoid to use Random.nextString in StreamingIn…
HyukjinKwon Dec 29, 2018
70a99ba
[SPARK-25591][PYSPARK][SQL][BRANCH-2.3] Avoid overwriting deserialize…
viirya Jan 3, 2019
30a811b
[SPARK-26019][PYSPARK] Allow insecure py4j gateways
squito Jan 3, 2019
30b82a3
[MINOR][NETWORK][TEST] Fix TransportFrameDecoderSuite to use ByteBuf …
dongjoon-hyun Jan 4, 2019
d618d27
[SPARK-26078][SQL][BACKPORT-2.3] Dedup self-join attributes on IN sub…
mgaido91 Jan 5, 2019
64fce5c
[SPARK-26545] Fix typo in EqualNullSafe's truth table comment
rednaxelafx Jan 5, 2019
bb52170
[SPARK-26537][BUILD][BRANCH-2.3] change git-wip-us to gitbox
shaneknapp Jan 6, 2019
38fe12b
[SPARK-25253][PYSPARK][FOLLOWUP] Undefined name: from pyspark.util im…
Aug 30, 2018
9052a5e
[MINOR][BUILD] Fix script name in `release-tag.sh` usage message
dongjoon-hyun Jan 7, 2019
87c2c11
[SPARK-26576][SQL] Broadcast hint not applied to partitioned table
jzhuge Jan 11, 2019
b6c4649
[SPARK-26607][SQL][TEST] Remove Spark 2.2.x testing from HiveExternal…
dongjoon-hyun Jan 12, 2019
6d063ee
[SPARK-26538][SQL] Set default precision and scale for elements of po…
a-shkarupin Jan 12, 2019
20b7490
[SPARK-26010][R] fix vignette eval with Java 11
felixcheung Nov 13, 2018
01511e4
[SPARK-25572][SPARKR] test only if not cran
felixcheung Sep 29, 2018
2a82295
[SPARK-26120][TESTS][SS][SPARKR] Fix a streaming query leak in Struct…
zsxwing Nov 21, 2018
18c138b
Revert "[SPARK-26576][SQL] Broadcast hint not applied to partitioned …
maropu Jan 16, 2019
b5ea933
Preparing Spark release v2.3.3-rc1
maropu Jan 16, 2019
8319ba7
Preparing development version 2.3.4-SNAPSHOT
maropu Jan 16, 2019
5a50ae3
[SPARK-26629][SS] Fixed error with multiple file stream in a query + …
tdas Jan 16, 2019
c0fc6d0
Revert "[SPARK-26629][SS] Fixed error with multiple file stream in a …
zsxwing Jan 16, 2019
8debdbd
[SPARK-26638][PYSPARK][ML] Pyspark vector classes always return error…
srowen Jan 17, 2019
bf3cdea
[SPARK-24740][PYTHON][ML][BACKPORT-2.3] Make PySpark's tests compatib…
HyukjinKwon Jan 19, 2019
ae64e5b
[SPARK-26351][MLLIB] Update doc and minor correction in the mllib eva…
shahidki31 Jan 21, 2019
b88067b
[SPARK-26665][CORE] Fix a bug that BlockTransferService.fetchBlockSyn…
zsxwing Jan 22, 2019
98d48c7
[SPARK-26228][MLLIB] OOM issue encountered when computing Gramian matrix
srowen Jan 23, 2019
de3b5c4
[SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType
aokolnychyi Jan 24, 2019
23e35d4
[SPARK-26706][SQL][FOLLOWUP] Fix `illegalNumericPrecedence` for ByteType
dbtsai Jan 24, 2019
ded902c
[SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Had…
rdblue Jan 24, 2019
373a627
[SPARK-26680][SPARK-25767][SQL][BACKPORT-2.3] Eagerly create inputVar…
bersprockets Jan 25, 2019
f98aee4
[SPARK-26709][SQL][BRANCH-2.3] OptimizeMetadataOnlyQuery does not han…
gengliangwang Jan 26, 2019
a89f601
[SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid Unresolve…
HeartSaVioR Jan 28, 2019
f6391e1
[SPARK-26732][CORE][TEST] Wait for listener bus to process events in …
Jan 30, 2019
ad18faa
[SPARK-26718][SS][BRANCH-2.3] Fixed integer overflow in SS kafka rate…
Jan 30, 2019
94a4b46
[SPARK-26726] Synchronize the amount of memory used by the broadcast …
httfighter Jan 31, 2019
537d15c
[SPARK-26757][GRAPHX] Return 0 for `count` on empty Edge/Vertex RDDs
huonw Jan 31, 2019
a5d22da
[SPARK-26806][SS] EventTimeStats.merge should handle zeros correctly
zsxwing Feb 1, 2019
4d6ea2c
[SPARK-26751][SQL] Fix memory leak when statement run in background a…
caneGuy Feb 3, 2019
66fd9c3
Preparing Spark release v2.3.3-rc2
maropu Feb 4, 2019
7845807
Preparing development version 2.3.4-SNAPSHOT
maropu Feb 4, 2019
9c78669
[SPARK-26758][CORE] Idle Executors are not getting killed after spark…
sandeep-katta Feb 5, 2019
38ade42
[SPARK-26734][STREAMING] Fix StackOverflowError with large block queue
rlodge Feb 6, 2019
bb6dbd2
[SPARK-26082][MESOS] Fix mesos fetch cache config name
Feb 7, 2019
3abf45d
[SPARK-26082][MESOS][FOLLOWUP] Add UT on fetcher cache option on Meso…
HeartSaVioR Feb 7, 2019
97f8ed4
Revert "[SPARK-26082][MESOS][FOLLOWUP] Add UT on fetcher cache option…
dongjoon-hyun Feb 9, 2019
02e9890
[SPARK-26082][MESOS][FOLLOWUP][BRANCH-2.3] Add UT on fetcher cache op…
HeartSaVioR Feb 10, 2019
abce846
[SPARK-23408][SS][BRANCH-2.3] Synchronize successive AddData actions …
HeartSaVioR Feb 12, 2019
7f13fd0
[SPARK-23491][SS] Remove explicit job cancellation from ContinuousExe…
jose-torres Feb 26, 2018
55d5a19
[SPARK-23416][SS] Add a specific stop method for ContinuousExecution.
jose-torres May 24, 2018
0d0c9ff
[SPARK-26572][SQL] fix aggregate codegen result evaluation
peter-toth Feb 14, 2019
d38a113
[SPARK-26897][SQL][TEST][FOLLOW-UP] Remove workaround for 2.2.0 and 2…
maropu Feb 18, 2019
214b6b2
[SPARK-26897][SQL][TEST][BRANCH-2.3] Update Spark 2.3.x testing from …
maropu Feb 19, 2019
41df43f
[SPARK-26873][SQL] Use a consistent timestamp to build Hadoop Job IDs.
rdblue Feb 19, 2019
6691c04
[R][BACKPORT-2.4] update package description
felixcheung Feb 21, 2019
36db45d
[R][BACKPORT-2.3] update package description
felixcheung Feb 22, 2019
ae1b44c
[SPARK-26950][SQL][TEST] Make RandomDataGenerator use Float.NaN or Do…
dongjoon-hyun Feb 22, 2019
3ece965
[MINOR][BUILD] Update all checkstyle dtd to use "https://checkstyle.org"
HeartSaVioR Feb 25, 2019
c326628
[MINOR][DOCS] Clarify that Spark apps should mark Spark as a 'provide…
srowen Mar 5, 2019
8b70980
[SPARK-24669][SQL] Invalidate tables in case of DROP DATABASE CASCADE
Udbhav30 Mar 6, 2019
877b8db
[SPARK-27065][CORE] avoid more than one active task set managers for …
cloud-fan Mar 6, 2019
dfde0c6
[SPARK-25863][SPARK-21871][SQL] Check if code size statistics is empt…
maropu Mar 7, 2019
b013c57
add stageIdToFinishedPartitions
Ngone51 Mar 6, 2019
2 changes: 1 addition & 1 deletion LICENSE
@@ -263,7 +263,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.6 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.7 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
25 changes: 25 additions & 0 deletions NOTICE
@@ -5,6 +5,31 @@ This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).


Export Control Notice
---------------------

This distribution includes cryptographic software. The country in which you currently reside may have
restrictions on the import, possession, use, and/or re-export to another country, of encryption software.
BEFORE using any encryption software, please check your country's laws, regulations and policies concerning
the import, possession, or use, and re-export of encryption software, to see if this is permitted. See
<http://www.wassenaar.org/> for more information.

The U.S. Government Department of Commerce, Bureau of Industry and Security (BIS), has classified this
software as Export Commodity Control Number (ECCN) 5D002.C.1, which includes information security software
using or performing cryptographic functions with asymmetric algorithms. The form and manner of this Apache
Software Foundation distribution makes it eligible for export under the License Exception ENC Technology
Software Unrestricted (TSU) exception (see the BIS Export Administration Regulations, Section 740.13) for
both object code and source code.

The following provides more details on the included cryptographic software:

This software uses Apache Commons Crypto (https://commons.apache.org/proper/commons-crypto/) to
support authentication, and encryption and decryption of data sent across the network between
services.

This software includes Bouncy Castle (http://bouncycastle.org/) to support the jets3t library.


========================================================================
Common Development and Distribution License 1.0
========================================================================
11 changes: 6 additions & 5 deletions R/pkg/DESCRIPTION
@@ -1,8 +1,8 @@
Package: SparkR
Type: Package
Version: 2.3.0
Title: R Frontend for Apache Spark
Description: Provides an R Frontend for Apache Spark.
Version: 2.3.4
Title: R Front end for 'Apache Spark'
Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>.
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "[email protected]"),
person("Xiangrui", "Meng", role = "aut",
@@ -11,8 +11,9 @@ Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "[email protected]"),
person(family = "The Apache Software Foundation", role = c("aut", "cph")))
License: Apache License (== 2.0)
URL: http://www.apache.org/ http://spark.apache.org/
BugReports: http://spark.apache.org/contributing.html
URL: https://www.apache.org/ https://spark.apache.org/
BugReports: https://spark.apache.org/contributing.html
SystemRequirements: Java (== 8)
Depends:
R (>= 3.0),
methods
1 change: 1 addition & 0 deletions R/pkg/NAMESPACE
@@ -179,6 +179,7 @@ exportMethods("arrange",
"with",
"withColumn",
"withColumnRenamed",
"withWatermark",
"write.df",
"write.jdbc",
"write.json",
105 changes: 97 additions & 8 deletions R/pkg/R/DataFrame.R
@@ -2090,7 +2090,8 @@ setMethod("selectExpr",
#'
#' @param x a SparkDataFrame.
#' @param colName a column name.
#' @param col a Column expression, or an atomic vector in the length of 1 as literal value.
#' @param col a Column expression (which must refer only to this SparkDataFrame), or an atomic
#' vector in the length of 1 as literal value.
#' @return A SparkDataFrame with the new column added or the existing column replaced.
#' @family SparkDataFrame functions
#' @aliases withColumn,SparkDataFrame,character-method
@@ -2853,7 +2854,7 @@ setMethod("intersect",
#' except
#'
#' Return a new SparkDataFrame containing rows in this SparkDataFrame
#' but not in another SparkDataFrame. This is equivalent to \code{EXCEPT} in SQL.
#' but not in another SparkDataFrame. This is equivalent to \code{EXCEPT DISTINCT} in SQL.
#'
#' @param x a SparkDataFrame.
#' @param y a SparkDataFrame.
@@ -3054,10 +3055,10 @@ setMethod("describe",
#' \item stddev
#' \item min
#' \item max
#' \item arbitrary approximate percentiles specified as a percentage (eg, "75%")
#' \item arbitrary approximate percentiles specified as a percentage (eg, "75\%")
#' }
#' If no statistics are given, this function computes count, mean, stddev, min,
#' approximate quartiles (percentiles at 25%, 50%, and 75%), and max.
#' approximate quartiles (percentiles at 25\%, 50\%, and 75\%), and max.
#' This function is meant for exploratory data analysis, as we make no guarantee about the
#' backward compatibility of the schema of the resulting Dataset. If you want to
#' programmatically compute summary statistics, use the \code{agg} function instead.
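A hypothetical illustration of the statistics described above (the SparkDataFrame `df` is assumed to already exist):

# default: count, mean, stddev, min, approximate 25%, 50%, 75% percentiles, and max
collect(summary(df))
# request specific statistics, including an arbitrary approximate percentile
collect(summary(df, "min", "max", "75%"))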
@@ -3661,7 +3662,8 @@ setMethod("getNumPartitions",
#' isStreaming
#'
#' Returns TRUE if this SparkDataFrame contains one or more sources that continuously return data
#' as it arrives.
#' as it arrives. A dataset that reads data from a streaming source must be executed as a
#' \code{StreamingQuery} using \code{write.stream}.
#'
#' @param x A SparkDataFrame
#' @return TRUE if this SparkDataFrame is from a streaming source
@@ -3707,7 +3709,17 @@ setMethod("isStreaming",
#' @param df a streaming SparkDataFrame.
#' @param source a name for external data source.
#' @param outputMode one of 'append', 'complete', 'update'.
#' @param ... additional argument(s) passed to the method.
#' @param partitionBy a name or a list of names of columns to partition the output by on the file
#' system. If specified, the output is laid out on the file system similar to Hive's
#' partitioning scheme.
#' @param trigger.processingTime a processing time interval as a string, e.g. '5 seconds',
#' '1 minute'. This is a trigger that runs a query periodically based on the processing
#' time. If value is '0 seconds', the query will run as fast as possible, this is the
#' default. Only one trigger can be set.
#' @param trigger.once a logical, must be set to \code{TRUE}. This is a trigger that processes only
#' one batch of data in a streaming query then terminates the query. Only one trigger can be
#' set.
#' @param ... additional external data source specific named options.
#'
#' @family SparkDataFrame functions
#' @seealso \link{read.stream}
@@ -3725,7 +3737,8 @@ setMethod("isStreaming",
#' # console
#' q <- write.stream(wordCounts, "console", outputMode = "complete")
#' # text stream
#' q <- write.stream(df, "text", path = "/home/user/out", checkpointLocation = "/home/user/cp")
#' q <- write.stream(df, "text", path = "/home/user/out", checkpointLocation = "/home/user/cp"
#' partitionBy = c("year", "month"), trigger.processingTime = "30 seconds")
#' # memory stream
#' q <- write.stream(wordCounts, "memory", queryName = "outs", outputMode = "complete")
#' head(sql("SELECT * from outs"))
@@ -3737,7 +3750,8 @@ setMethod("isStreaming",
#' @note experimental
setMethod("write.stream",
signature(df = "SparkDataFrame"),
function(df, source = NULL, outputMode = NULL, ...) {
function(df, source = NULL, outputMode = NULL, partitionBy = NULL,
trigger.processingTime = NULL, trigger.once = NULL, ...) {
if (!is.null(source) && !is.character(source)) {
stop("source should be character, NULL or omitted. It is the data source specified ",
"in 'spark.sql.sources.default' configuration by default.")
@@ -3748,12 +3762,43 @@ setMethod("write.stream",
if (is.null(source)) {
source <- getDefaultSqlSource()
}
cols <- NULL
if (!is.null(partitionBy)) {
if (!all(sapply(partitionBy, function(c) { is.character(c) }))) {
stop("All partitionBy column names should be characters.")
}
cols <- as.list(partitionBy)
}
jtrigger <- NULL
if (!is.null(trigger.processingTime) && !is.na(trigger.processingTime)) {
if (!is.null(trigger.once)) {
stop("Multiple triggers not allowed.")
}
interval <- as.character(trigger.processingTime)
if (nchar(interval) == 0) {
stop("Value for trigger.processingTime must be a non-empty string.")
}
jtrigger <- handledCallJStatic("org.apache.spark.sql.streaming.Trigger",
"ProcessingTime",
interval)
} else if (!is.null(trigger.once) && !is.na(trigger.once)) {
if (!is.logical(trigger.once) || !trigger.once) {
stop("Value for trigger.once must be TRUE.")
}
jtrigger <- callJStatic("org.apache.spark.sql.streaming.Trigger", "Once")
}
options <- varargsToStrEnv(...)
write <- handledCallJMethod(df@sdf, "writeStream")
write <- callJMethod(write, "format", source)
if (!is.null(outputMode)) {
write <- callJMethod(write, "outputMode", outputMode)
}
if (!is.null(cols)) {
write <- callJMethod(write, "partitionBy", cols)
}
if (!is.null(jtrigger)) {
write <- callJMethod(write, "trigger", jtrigger)
}
write <- callJMethod(write, "options", options)
ssq <- handledCallJMethod(write, "start")
streamingQuery(ssq)
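A minimal hypothetical sketch combining the new writer options handled above (the sink, paths, and column names are invented; only one trigger may be set per query):

# periodic trigger with partitioned output
q <- write.stream(df, "parquet", path = "/tmp/out", checkpointLocation = "/tmp/cp",
                  partitionBy = c("year", "month"), trigger.processingTime = "30 seconds")
# one-shot trigger: process a single batch of data, then stop the query
q <- write.stream(df, "parquet", path = "/tmp/out", checkpointLocation = "/tmp/cp",
                  trigger.once = TRUE)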
@@ -3967,3 +4012,47 @@ setMethod("broadcast",
sdf <- callJStatic("org.apache.spark.sql.functions", "broadcast", x@sdf)
dataFrame(sdf)
})

#' withWatermark
#'
#' Defines an event time watermark for this streaming SparkDataFrame. A watermark tracks a point in
#' time before which we assume no more late data is going to arrive.
#'
#' Spark will use this watermark for several purposes:
#' \itemize{
#' \item To know when a given time window aggregation can be finalized and thus can be emitted
#' when using output modes that do not allow updates.
#' \item To minimize the amount of state that we need to keep for on-going aggregations.
#' }
#' The current watermark is computed by looking at the \code{MAX(eventTime)} seen across
#' all of the partitions in the query minus a user specified \code{delayThreshold}. Due to the cost
#' of coordinating this value across partitions, the actual watermark used is only guaranteed
#' to be at least \code{delayThreshold} behind the actual event time. In some cases we may still
#' process records that arrive more than \code{delayThreshold} late.
#'
#' @param x a streaming SparkDataFrame
#' @param eventTime a string specifying the name of the Column that contains the event time of the
#' row.
#' @param delayThreshold a string specifying the minimum delay to wait to data to arrive late,
#' relative to the latest record that has been processed in the form of an
#' interval (e.g. "1 minute" or "5 hours"). NOTE: This should not be negative.
#' @return a SparkDataFrame.
#' @aliases withWatermark,SparkDataFrame,character,character-method
#' @family SparkDataFrame functions
#' @rdname withWatermark
#' @name withWatermark
#' @export
#' @examples
#' \dontrun{
#' sparkR.session()
#' schema <- structType(structField("time", "timestamp"), structField("value", "double"))
#' df <- read.stream("json", path = jsonDir, schema = schema, maxFilesPerTrigger = 1)
#' df <- withWatermark(df, "time", "10 minutes")
#' }
#' @note withWatermark since 2.3.0
setMethod("withWatermark",
signature(x = "SparkDataFrame", eventTime = "character", delayThreshold = "character"),
function(x, eventTime, delayThreshold) {
sdf <- callJMethod(x@sdf, "withWatermark", eventTime, delayThreshold)
dataFrame(sdf)
})
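A rough worked example of the watermark arithmetic described above, with hypothetical values: given delayThreshold = "10 minutes" and a maximum observed event time of 12:10, the watermark is approximately 12:00, so windows ending at or before 12:00 can be finalized, while records with event times earlier than 12:00 that arrive afterwards may be dropped. A minimal streaming aggregation using it might look like:

# assumes df is the streaming SparkDataFrame from the example above
df <- withWatermark(df, "time", "10 minutes")
counts <- count(groupBy(df, window(df$time, "5 minutes")))
q <- write.stream(counts, "console", outputMode = "append")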
4 changes: 3 additions & 1 deletion R/pkg/R/SQLContext.R
@@ -727,7 +727,9 @@ read.jdbc <- function(url, tableName,
#' @param schema The data schema defined in structType or a DDL-formatted string, this is
#' required for file-based streaming data source
#' @param ... additional external data source specific named options, for instance \code{path} for
#' file-based streaming data source
#' file-based streaming data source. \code{timeZone} to indicate a timezone to be used to
#' parse timestamps in the JSON/CSV data sources or partition values; If it isn't set, it
#' uses the default value, session local timezone.
#' @return SparkDataFrame
#' @rdname read.stream
#' @name read.stream
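A hypothetical example of supplying such options to read.stream, including the timeZone option mentioned above (the path and schema are invented):

schema <- structType(structField("ts", "timestamp"), structField("value", "double"))
df <- read.stream("csv", path = "/tmp/input", schema = schema, timeZone = "UTC")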
46 changes: 44 additions & 2 deletions R/pkg/R/client.R
@@ -19,7 +19,7 @@

# Creates a SparkR client connection object
# if one doesn't already exist
connectBackend <- function(hostname, port, timeout) {
connectBackend <- function(hostname, port, timeout, authSecret) {
if (exists(".sparkRcon", envir = .sparkREnv)) {
if (isOpen(.sparkREnv[[".sparkRCon"]])) {
cat("SparkRBackend client connection already exists\n")
@@ -29,7 +29,7 @@ connectBackend <- function(hostname, port, timeout) {

con <- socketConnection(host = hostname, port = port, server = FALSE,
blocking = TRUE, open = "wb", timeout = timeout)

doServerAuth(con, authSecret)
assign(".sparkRCon", con, envir = .sparkREnv)
con
}
@@ -60,13 +60,55 @@ generateSparkSubmitArgs <- function(args, sparkHome, jars, sparkSubmitOpts, pack
combinedArgs
}

checkJavaVersion <- function() {
javaBin <- "java"
javaHome <- Sys.getenv("JAVA_HOME")
javaReqs <- utils::packageDescription(utils::packageName(), fields = c("SystemRequirements"))
sparkJavaVersion <- as.numeric(tail(strsplit(javaReqs, "[(=)]")[[1]], n = 1L))
if (javaHome != "") {
javaBin <- file.path(javaHome, "bin", javaBin)
}

# If java is missing from PATH, we get an error in Unix and a warning in Windows
javaVersionOut <- tryCatch(
if (is_windows()) {
# See SPARK-24535
system2(javaBin, "-version", wait = TRUE, stdout = TRUE, stderr = TRUE)
} else {
launchScript(javaBin, "-version", wait = TRUE, stdout = TRUE, stderr = TRUE)
},
error = function(e) {
stop("Java version check failed. Please make sure Java is installed",
" and set JAVA_HOME to point to the installation directory.", e)
},
warning = function(w) {
stop("Java version check failed. Please make sure Java is installed",
" and set JAVA_HOME to point to the installation directory.", w)
})
javaVersionFilter <- Filter(
function(x) {
grepl(" version", x)
}, javaVersionOut)

javaVersionStr <- strsplit(javaVersionFilter[[1]], "[\"]")[[1L]][2]
# javaVersionStr is of the form 1.8.0_92.
# Extract 8 from it to compare to sparkJavaVersion
javaVersionNum <- as.integer(strsplit(javaVersionStr, "[.]")[[1L]][2])
if (javaVersionNum != sparkJavaVersion) {
stop(paste("Java version", sparkJavaVersion, "is required for this package; found version:",
javaVersionStr))
}
return(javaVersionNum)
}

launchBackend <- function(args, sparkHome, jars, sparkSubmitOpts, packages) {
sparkSubmitBinName <- determineSparkSubmitBin()
if (sparkHome != "") {
sparkSubmitBin <- file.path(sparkHome, "bin", sparkSubmitBinName)
} else {
sparkSubmitBin <- sparkSubmitBinName
}

combinedArgs <- generateSparkSubmitArgs(args, sparkHome, jars, sparkSubmitOpts, packages)
cat("Launching java with spark-submit command", sparkSubmitBin, combinedArgs, "\n")
invisible(launchScript(sparkSubmitBin, combinedArgs))
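For reference, a rough sketch of the version parsing performed by checkJavaVersion above, using a hypothetical `java -version` output line:

# `java -version` typically reports something like: java version "1.8.0_92"
javaVersionStr <- strsplit('java version "1.8.0_92"', "[\"]")[[1L]][2]  # "1.8.0_92"
javaVersionNum <- as.integer(strsplit(javaVersionStr, "[.]")[[1L]][2])  # 8
# this major version is then compared with the value declared in DESCRIPTION
# ("SystemRequirements: Java (== 8)")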
10 changes: 8 additions & 2 deletions R/pkg/R/column.R
@@ -164,12 +164,18 @@ setMethod("alias",
#' @aliases substr,Column-method
#'
#' @param x a Column.
#' @param start starting position.
#' @param start starting position. It should be 1-base.
#' @param stop ending position.
#' @examples
#' \dontrun{
#' df <- createDataFrame(list(list(a="abcdef")))
#' collect(select(df, substr(df$a, 1, 4))) # the result is `abcd`.
#' collect(select(df, substr(df$a, 2, 4))) # the result is `bcd`.
#' }
#' @note substr since 1.4.0
setMethod("substr", signature(x = "Column"),
function(x, start, stop) {
jc <- callJMethod(x@jc, "substr", as.integer(start - 1), as.integer(stop - start + 1))
jc <- callJMethod(x@jc, "substr", as.integer(start), as.integer(stop - start + 1))
column(jc)
})
