[SPARK-40805] Use spark username in official image
#11
cc @HyukjinKwon @zhengruifeng @dongjoon-hyun @holdenk @attilapiros. Also cc @dcoliversun, maybe you could start to help with SPARK-40570 and SPARK-40569 based on this patch. Also cc @pan3793 @LuciferYang, we had an offline discussion about the username.
I'm a little worried about this change. Previously, we could achieve a dynamic login user with the following patch, but this PR makes the login user static ("spark").
spark-docker/3.3.0/scala2.12-java11-ubuntu/entrypoint.sh Lines 27 to 34 in 3037f75
 # If there is no passwd entry for the container UID, attempt to create one
 if [ -z "$uidentry" ] ; then
     if [ -w /etc/passwd ] ; then
-        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
+        echo "${SPARK_USER_NAME:-$myuid}:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
     else
         echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
     fi
 fi
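For readers following the diff above, here is a minimal sketch of how the login name in the appended passwd line is chosen. `passwd_entry` is a hypothetical helper introduced only for illustration; it is not part of entrypoint.sh, and it only builds the string rather than writing to /etc/passwd.

```shell
# Hypothetical helper (not in entrypoint.sh): prints the passwd line that the
# patched entrypoint would append. Arguments: uid, gid, home directory.
# SPARK_USER_NAME is read from the environment, as in the diff above.
passwd_entry() {
    myuid="$1"
    mygid="$2"
    home="$3"
    # The login name falls back to the numeric uid when SPARK_USER_NAME is
    # unset or empty; the comment field falls back to "anonymous uid".
    echo "${SPARK_USER_NAME:-$myuid}:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$home:/bin/false"
}
```

Calling `SPARK_USER_NAME=alice passwd_entry 185 0 /opt/spark` would yield an entry whose login name is `alice`, whereas the pre-patch line always used the numeric uid as the login name. The uid, gid, and path in this call are example values, not taken from the PR.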
@pan3793 The DOI recommends using a specific account/gid/uid, as I mentioned in the PR description. Don't worry, the ability to dynamically create users is still available when the user specifies a USER_ID based on the official image.
FROM spark
# Specify the User that the actual main process will run as
USER ${spark_uid}
This is the best way I can think of at the moment, taking into account both the DOI requirement and your scenario. Let's have an offline discussion tomorrow. Time to sleep, lol, happy weekend night!
Thanks @Yikun for the explanation. The idea of keeping the default login user as 'spark' while allowing it to be extended to a dynamic login user makes sense to me. Let me explain a little about why we need the dynamic login user ability. In Spark on Yarn mode, when we launch a Spark application via To reduce the difference between Spark on Yarn and Spark on K8s, we hope Spark on K8s keeps the same ability to dynamically change the login user when submitting a Spark application.
Thanks for the feedback! Good to know this change also meets your requirements.
I personally think it's a good approach; it also benefits users who want to migrate from Yarn to K8s. But as we discussed, let's do it in a separate PR.
esac

switch_spark_if_root() {
  if [ $(id -u) -ne 0 ]; then
IMO it would be simpler as:
if [ $(id -u) -eq 0 ]; then
  echo gosu spark
fi
+1
Yeah, good suggestion!
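A minimal sketch of the function with the suggested condition, for reference. The optional uid argument is a hypothetical addition so that the branch can be exercised without actually running as root; the function in the patch takes no arguments.

```shell
# Sketch of switch_spark_if_root using the simplified check suggested above.
# The optional first argument is an assumption for illustration only; when it
# is omitted, the real uid from `id -u` is used.
switch_spark_if_root() {
  uid="${1:-$(id -u)}"
  # Only when running as root, print the prefix that drops privileges to spark
  if [ "$uid" -eq 0 ]; then
    echo gosu spark
  fi
}
```

The printed prefix is meant to be placed in front of the final command in an entrypoint, for example `exec $(switch_spark_if_root) "$@"`. That call site is an assumption here, not a quote from the patch. When the container is started as a non-root user, the function prints nothing and the command runs under that uid unchanged.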
Thanks for pinging me, @Yikun. I'm working on it.
zhengruifeng
left a comment
LGTM
@HyukjinKwon @zhengruifeng @martin-g @dcoliversun @pan3793 Thanks all! If no more comments, I will merge this today.
Fine with me; it is consistent with my usage habits.
@HyukjinKwon @zhengruifeng @martin-g @dcoliversun @pan3793 @LuciferYang Merged. Thanks all.
What changes were proposed in this pull request?
This patch:
- Use the spark user in entrypoint.sh rather than in the Dockerfile. (makes sure the spark process is executed as a non-root user)
- Remove the USER setting in the Dockerfile. (makes sure images based on this one have permission to extend the Dockerfile, such as executing apt update)
- Use spark:spark instead of root:root. (avoids permission issues such as in standalone mode)
- Add gosu deps, a sudo replacement recommended by Docker and Docker official images, and also used by other DOI images.
This change also follows the rules of Docker official images; see also consistency and Dockerfile best practices about user.
Why are the changes needed?
These are the issues I have found so far:
The Docker image's username is not very standard; docker run with the
185 username is a little bit weird.
There are also some permission issues when running some Spark scripts, such as in standalone mode:
This is due to the static USER set in the Dockerfile.
Does this PR introduce any user-facing change?
Yes.
How was this patch tested?
CI passed: all K8s tests
Regression test: