
Commit d7b5187

Merge branch 'master' into issues/SPARK-18623

2 parents: c83919e + 18066f2

2,541 files changed (+133,610 / -44,799 lines)


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ dependency-reduced-pom.xml
 derby.log
 dev/create-release/*final
 dev/create-release/*txt
+dev/pr-deps/
 dist/
 docs/_site
 docs/api

.travis.yml

Lines changed: 1 addition & 2 deletions
@@ -28,7 +28,6 @@ dist: trusty
 # 2. Choose language and target JDKs for parallel builds.
 language: java
 jdk:
-  - oraclejdk7
   - oraclejdk8

 # 3. Setup cache directory for SBT and Maven.
@@ -44,7 +43,7 @@ notifications:
 # 5. Run maven install before running lint-java.
 install:
   - export MAVEN_SKIP_RC=1
-  - build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive -Phive-thriftserver install
+  - build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install

 # 6. Run lint-java.
 script:

LICENSE

Lines changed: 6 additions & 5 deletions
@@ -249,11 +249,11 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
      (Interpreter classes (all .scala files in repl/src/main/scala
        except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
        and for SerializableMapWrapper in JavaUtils.scala)
-     (BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
-     (BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
-     (BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
-     (BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
-     (BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
+     (BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)
+     (BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)
+     (BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)
+     (BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)
+     (BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)
      (BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
      (BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
      (BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
@@ -297,3 +297,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
      (MIT License) RowsGroup (http://datatables.net/license/mit)
      (MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
      (MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
+     (MIT License) machinist (https://github.com/typelevel/machinist)

R/CRAN_RELEASE.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ To release SparkR as a package to CRAN, we would use the `devtools` package. Ple

 First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.

-Note that while `check-cran.sh` is running `R CMD check`, it is doing so with `--no-manual --no-vignettes`, which skips a few vignettes or PDF checks - therefore it will be preferred to run `R CMD check` on the source package built manually before uploading a release.
+Note that while `run-tests.sh` runs `check-cran.sh` (which runs `R CMD check`), it is doing so with `--no-manual --no-vignettes`, which skips a few vignettes or PDF checks - therefore it will be preferred to run `R CMD check` on the source package built manually before uploading a release. Also note that for CRAN checks for pdf vignettes to success, `qpdf` tool must be there (to install it, eg. `yum -q -y install qpdf`).

 To upload a release, we would need to update the `cran-comments.md`. This should generally contain the results from running the `check-cran.sh` script along with comments on status of all `WARNING` (should not be any) or `NOTE`. As a part of `check-cran.sh` and the release process, the vignettes is build - make sure `SPARK_HOME` is set and Spark jars are accessible.
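
The note above makes `qpdf` a prerequisite for the full PDF/vignette checks. As a small illustrative preflight (not part of the commit; the `yum` command is distro-specific and just one example), the presence check could be scripted like this:

```shell
# Illustrative sketch: verify the qpdf prerequisite before running a full
# R CMD check with manual/vignette PDF checks enabled.
check_qpdf() {
  if command -v qpdf > /dev/null 2>&1; then
    echo "qpdf found"
  else
    echo "qpdf missing; install it first (e.g. yum -q -y install qpdf)"
  fi
}
check_qpdf
```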

R/README.md

Lines changed: 1 addition & 5 deletions
@@ -66,11 +66,7 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
 ```bash
 ./bin/spark-submit examples/src/main/r/dataframe.R
 ```
-You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
-```bash
-R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
-./R/run-tests.sh
-```
+You can run R unit tests by following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests).

 ### Running on YARN

R/WINDOWS.md

Lines changed: 3 additions & 4 deletions
@@ -6,7 +6,7 @@ To build SparkR on Windows, the following steps are required
    include Rtools and R in `PATH`.

 2. Install
-   [JDK7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) and set
+   [JDK8](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and set
    `JAVA_HOME` in the system environment variables.

 3. Download and install [Maven](http://maven.apache.org/download.html). Also include the `bin`
@@ -34,10 +34,9 @@ To run the SparkR unit tests on Windows, the following steps are required —ass

 4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

-5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
+5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:

     ```
-    R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
-    .\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
+    .\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
     ```

R/check-cran.sh

Lines changed: 12 additions & 21 deletions
@@ -20,28 +20,18 @@
 set -o pipefail
 set -e

-FWDIR="$(cd `dirname $0`; pwd)"
-pushd $FWDIR > /dev/null
+FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
+pushd "$FWDIR" > /dev/null

-if [ ! -z "$R_HOME" ]
-then
-  R_SCRIPT_PATH="$R_HOME/bin"
-else
-  # if system wide R_HOME is not found, then exit
-  if [ ! `command -v R` ]; then
-    echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
-    exit 1
-  fi
-  R_SCRIPT_PATH="$(dirname $(which R))"
-fi
-echo "USING R_HOME = $R_HOME"
+. "$FWDIR/find-r.sh"

+# Install the package (this is required for code in vignettes to run when building it later)
 # Build the latest docs, but not vignettes, which is built with the package next
-$FWDIR/create-docs.sh
+. "$FWDIR/install-dev.sh"

 # Build source package with vignettes
 SPARK_HOME="$(cd "${FWDIR}"/..; pwd)"
-. "${SPARK_HOME}"/bin/load-spark-env.sh
+. "${SPARK_HOME}/bin/load-spark-env.sh"
 if [ -f "${SPARK_HOME}/RELEASE" ]; then
   SPARK_JARS_DIR="${SPARK_HOME}/jars"
 else
@@ -50,16 +40,16 @@ fi

 if [ -d "$SPARK_JARS_DIR" ]; then
   # Build a zip file containing the source package with vignettes
-  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
+  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD build "$FWDIR/pkg"

   find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
 else
-  echo "Error Spark JARs not found in $SPARK_HOME"
+  echo "Error Spark JARs not found in '$SPARK_HOME'"
   exit 1
 fi

 # Run check as-cran.
-VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
+VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`

 CRAN_CHECK_OPTIONS="--as-cran"

@@ -77,9 +67,10 @@ echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"

 if [ -n "$NO_TESTS" ] && [ -n "$NO_MANUAL" ]
 then
-  "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
+  "$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
 else
   # This will run tests and/or build vignettes, and require SPARK_HOME
-  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
+  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
 fi
+
 popd > /dev/null
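
The hunks above keep `CRAN_CHECK_OPTIONS="--as-cran"` and branch on `NO_TESTS`/`NO_MANUAL`, but the lines that extend the option string are not part of this diff. A hedged sketch of how such toggles typically build the string (`--no-tests`, `--no-manual`, and `--no-vignettes` are real `R CMD check` flags; the exact wiring here is an assumption):

```shell
# Hypothetical sketch: assemble an R CMD check option string from env toggles.
# The flag-to-variable mapping is assumed, not shown in this diff.
build_cran_check_options() {
  opts="--as-cran"
  if [ -n "$NO_TESTS" ]; then
    opts="$opts --no-tests"
  fi
  if [ -n "$NO_MANUAL" ]; then
    opts="$opts --no-manual --no-vignettes"
  fi
  echo "$opts"
}

NO_TESTS=1 NO_MANUAL=1 build_cran_check_options
```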

R/create-docs.sh

Lines changed: 7 additions & 6 deletions
@@ -29,26 +29,27 @@ set -o pipefail
 set -e

 # Figure out where the script is
-export FWDIR="$(cd "`dirname "$0"`"; pwd)"
-export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
+export FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
+export SPARK_HOME="$(cd "`dirname "${BASH_SOURCE[0]}"`"/..; pwd)"

 # Required for setting SPARK_SCALA_VERSION
-. "${SPARK_HOME}"/bin/load-spark-env.sh
+. "${SPARK_HOME}/bin/load-spark-env.sh"

 echo "Using Scala $SPARK_SCALA_VERSION"

-pushd $FWDIR
+pushd "$FWDIR" > /dev/null
+. "$FWDIR/find-r.sh"

 # Install the package (this will also generate the Rd files)
-./install-dev.sh
+. "$FWDIR/install-dev.sh"

 # Now create HTML files

 # knit_rd puts html in current working directory
 mkdir -p pkg/html
 pushd pkg/html

-Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'
+"$R_SCRIPT_PATH/Rscript" -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'

 popd
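
Note that the script sources its helpers with `.` (e.g. `. "$FWDIR/find-r.sh"`) rather than executing them: variables set in a sourced file, such as `R_SCRIPT_PATH`, land in the caller's shell, whereas a child process cannot modify its parent's environment. A self-contained illustration (the temp file stands in for `find-r.sh`):

```shell
# Illustrative only: why sourcing matters for find-r.sh.
tmp=$(mktemp)
echo 'R_SCRIPT_PATH="/opt/R/bin"' > "$tmp"

sh "$tmp"    # runs in a child shell; R_SCRIPT_PATH is NOT set here afterwards
. "$tmp"     # sourced into the current shell; R_SCRIPT_PATH is now visible

echo "$R_SCRIPT_PATH"    # prints /opt/R/bin
rm -f "$tmp"
```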

R/create-rd.sh

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This scripts packages the SparkR source files (R and C files) and
+# creates a package that can be loaded in R. The package is by default installed to
+# $FWDIR/lib and the package can be loaded by using the following command in R:
+#
+#   library(SparkR, lib.loc="$FWDIR/lib")
+#
+# NOTE(shivaram): Right now we use $SPARK_HOME/R/lib to be the installation directory
+# to load the SparkR package on the worker nodes.
+
+set -o pipefail
+set -e
+
+FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
+pushd "$FWDIR" > /dev/null
+. "$FWDIR/find-r.sh"
+
+# Generate Rd files if devtools is installed
+"$R_SCRIPT_PATH/Rscript" -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'

R/find-r.sh

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+if [ -z "$R_SCRIPT_PATH" ]
+then
+  if [ ! -z "$R_HOME" ]
+  then
+    R_SCRIPT_PATH="$R_HOME/bin"
+  else
+    # if system wide R_HOME is not found, then exit
+    if [ ! `command -v R` ]; then
+      echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
+      exit 1
+    fi
+    R_SCRIPT_PATH="$(dirname $(which R))"
+  fi
+  echo "Using R_SCRIPT_PATH = ${R_SCRIPT_PATH}"
+fi
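
The new `find-r.sh` centralizes the lookup that `check-cran.sh` previously inlined: an already-set `R_SCRIPT_PATH` wins, then `$R_HOME/bin`, then the directory of `R` on `PATH`. That resolution order can be sketched as a standalone function (illustrative; the paths below are made up for the example):

```shell
# Sketch of find-r.sh's resolution order, written as a function so it can be
# exercised with fake environments.
resolve_r_script_path() {
  if [ -n "$R_SCRIPT_PATH" ]; then
    echo "$R_SCRIPT_PATH"
  elif [ -n "$R_HOME" ]; then
    echo "$R_HOME/bin"
  elif command -v R > /dev/null 2>&1; then
    dirname "$(command -v R)"
  else
    echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed." >&2
    return 1
  fi
}

# Simulate a host where only R_HOME is set:
R_SCRIPT_PATH="" R_HOME="/opt/R" resolve_r_script_path    # prints /opt/R/bin
```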
