Commit 403bf55

[SPARK-33927][BUILD] Fix Dockerfile for Spark release to work
### What changes were proposed in this pull request?

This PR proposes to fix the `Dockerfile` for Spark release.

- Port b135db3 to `Dockerfile`
- Upgrade Ubuntu 18.04 -> 20.04 (because of porting b135db3)
- Remove Python 2 (because of Ubuntu upgrade)
- Use built-in Python 3.8.5 (because of Ubuntu upgrade)
- Node.js 11 -> 12 (because of Ubuntu upgrade)
- Ruby 2.5 -> 2.7 (because of Ubuntu upgrade)
- Python dependencies and Jekyll + plugins upgrade to the latest as it's used in GitHub Actions build (unrelated to the issue itself)

### Why are the changes needed?

To make a Spark release :-).

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested via:

```bash
cd dev/create-release/spark-rm
docker build -t spark-rm --build-arg UID=$UID .
```

```
...
Successfully built 516d7943634f
Successfully tagged spark-rm:latest
```

Closes #30971 from HyukjinKwon/SPARK-33927.

Lead-authored-by: Hyukjin Kwon <[email protected]>
Co-authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
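The version bumps listed in the commit message can be spot-checked once the image builds. The following is a minimal sketch, not part of the commit: `version_matches` is a hypothetical helper, and the `docker run --entrypoint` call shown in the comments assumes the `spark-rm` tag from the manual test and that overriding the image's entrypoint is acceptable for a quick check.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): succeeds when the reported
# version equals the expected prefix or extends it with further dotted parts.
version_matches() {
  reported="$1"
  expected="$2"
  case "$reported" in
    "$expected" | "$expected".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage against the built image (assumes the tag spark-rm exists locally):
#   py=$(docker run --rm --entrypoint python3 spark-rm --version | awk '{print $2}')
#   version_matches "$py" 3.8 || echo "unexpected Python version: $py"
```

Matching on a dotted prefix rather than an exact string keeps the check stable across patch releases of the distribution packages, while still rejecting a different minor version.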
1 parent 4a669f5 commit 403bf55

1 file changed: +16 -16 lines changed

dev/create-release/spark-rm/Dockerfile

Lines changed: 16 additions & 16 deletions
@@ -15,16 +15,20 @@
 # limitations under the License.
 #

-# Image for building Spark releases. Based on Ubuntu 18.04.
+# Image for building Spark releases. Based on Ubuntu 20.04.
 #
 # Includes:
 # * Java 8
 # * Ivy
-# * Python (2.7.15/3.6.7)
-# * R-base/R-base-dev (4.0.2)
-# * Ruby 2.3 build utilities
+# * Python (3.8.5)
+# * R-base/R-base-dev (4.0.3)
+# * Ruby (2.7.0)
+#
+# You can test it as below:
+# cd dev/create-release/spark-rm
+# docker build -t spark-rm --build-arg UID=$UID .

-FROM ubuntu:18.04
+FROM ubuntu:20.04

 # For apt to be noninteractive
 ENV DEBIAN_FRONTEND noninteractive
@@ -36,8 +40,8 @@ ARG APT_INSTALL="apt-get install --no-install-recommends -y"
 # TODO(SPARK-32407): Sphinx 3.1+ does not correctly index nested classes.
 # See also https://github.com/sphinx-doc/sphinx/issues/7551.
 # We should use the latest Sphinx version once this is fixed.
-ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.0.4 numpy==1.18.1 pydata_sphinx_theme==0.3.1 ipython==7.16.1 nbsphinx==0.7.1 numpydoc==1.1.0"
-ARG GEM_PKGS="jekyll:4.0.0 jekyll-redirect-from:0.16.0 rouge:3.15.0"
+ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0"
+ARG GEM_PKGS="jekyll:4.2.0 jekyll-redirect-from:0.16.0 rouge:3.26.0"

 # Install extra needed repos and refresh.
 # - CRAN repo
@@ -46,42 +50,38 @@ ARG GEM_PKGS="jekyll:4.0.0 jekyll-redirect-from:0.16.0 rouge:3.15.0"
 # This is all in a single "RUN" command so that if anything changes, "apt update" is run to fetch
 # the most current package versions (instead of potentially using old versions cached by docker).
 RUN apt-get clean && apt-get update && $APT_INSTALL gnupg ca-certificates && \
-  echo 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/' >> /etc/apt/sources.list && \
+  echo 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/' >> /etc/apt/sources.list && \
   gpg --keyserver keyserver.ubuntu.com --recv-key E298A3A825C0D65DFD57CBB651716619E084DAB9 && \
   gpg -a --export E084DAB9 | apt-key add - && \
   apt-get clean && \
   rm -rf /var/lib/apt/lists/* && \
   apt-get clean && \
   apt-get update && \
   $APT_INSTALL software-properties-common && \
-  apt-add-repository -y ppa:brightbox/ruby-ng && \
   apt-get update && \
   # Install openjdk 8.
   $APT_INSTALL openjdk-8-jdk && \
   update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java && \
   # Install build / source control tools
   $APT_INSTALL curl wget git maven ivy subversion make gcc lsof libffi-dev \
     pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev && \
-  curl -sL https://deb.nodesource.com/setup_11.x | bash && \
+  curl -sL https://deb.nodesource.com/setup_12.x | bash && \
   $APT_INSTALL nodejs && \
   # Install needed python packages. Use pip for installing packages (for consistency).
-  $APT_INSTALL libpython3-dev python3-pip python3-setuptools && \
+  $APT_INSTALL python3-pip python3-setuptools && \
   # qpdf is required for CRAN checks to pass.
   $APT_INSTALL qpdf jq && \
-  # Change default python version to python3.
-  update-alternatives --install /usr/bin/python python /usr/bin/python2.7 1 && \
-  update-alternatives --install /usr/bin/python python /usr/bin/python3.6 2 && \
-  update-alternatives --set python /usr/bin/python3.6 && \
   pip3 install $PIP_PKGS && \
   # Install R packages and dependencies used when building.
   # R depends on pandoc*, libssl (which are installed above).
   # Note that PySpark doc generation also needs pandoc due to nbsphinx
   $APT_INSTALL r-base r-base-dev && \
+  $APT_INSTALL libcurl4-openssl-dev libgit2-dev libssl-dev libxml2-dev && \
   $APT_INSTALL texlive-latex-base texlive texlive-fonts-extra texinfo qpdf && \
   Rscript -e "install.packages(c('curl', 'xml2', 'httr', 'devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2', 'e1071', 'survival'), repos='https://cloud.r-project.org/')" && \
   Rscript -e "devtools::install_github('jimhester/lintr')" && \
   # Install tools needed to build the documentation.
-  $APT_INSTALL ruby2.5 ruby2.5-dev && \
+  $APT_INSTALL ruby2.7 ruby2.7-dev && \
   gem install --no-document $GEM_PKGS

 WORKDIR /opt/spark-rm/output
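
The diff swaps `bionic-cran40` for `focal-cran40` by hand, because the CRAN source line has to track the Ubuntu codename of the base image. As a sketch only (the commit itself hard-codes the line), a small helper could build that line from a codename. `cran_repo_line` is a hypothetical name, and reading `VERSION_CODENAME` from `/etc/os-release` in the usage comment is an assumption about the base image.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): print the CRAN apt source line
# for a given Ubuntu codename, using the cran40 repository from the Dockerfile.
cran_repo_line() {
  codename="$1"
  printf 'deb https://cloud.r-project.org/bin/linux/ubuntu %s-cran40/\n' "$codename"
}

# Inside the image the codename could be read instead of hard-coded, e.g.:
#   . /etc/os-release && cran_repo_line "$VERSION_CODENAME" >> /etc/apt/sources.list
```

Deriving the line this way would leave one fewer place to edit on the next base-image upgrade, at the cost of a slightly less obvious Dockerfile.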
