Conversation

@yongjiaw
Contributor

Expose custom log4j files in the Spark driver UI's executor page. This only works for standalone clusters (screenshot attached).

[screenshot: executor page showing links to custom log4j files]

@andrewor14
Contributor

@yongjiaw I believe the changes here actually have security implications. Previously, viewers could only read the stderr and stdout files on the executor machines, but now they can read anything. If you can limit the accessible files to the log4j ones, then we can proceed safely.

@andrewor14
Contributor

ok to test

@SparkQA

SparkQA commented Dec 15, 2015

Test build #47707 has finished for PR 9321 at commit 48be221.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yongjiaw
Contributor Author

@andrewor14 Regarding the security concern: yes, inside LogPage.scala, which runs in the worker process, reads are hardcoded to the workerDir, and one can already tamper with appId and executorId to potentially read logs from other apps. However, only the files "stdout" and "stderr" are allowed to be read.
I agree it's best to check which files are being read, but it's not easy to access the executor's loggers from the worker process, and one cannot assume the worker has the same log4j config as the executors, even though that's probably the case most of the time. So an executor would have to register its loggers with the worker, and I could not find anything in the existing code base to piggyback on, which probably means a lot of new code. Do you have any suggestions?
On the other hand, I don't see serious issues with allowing users to read any file under workerDir. Reading a jar file will display bytecode, but users already have full control access to all files under workerDir, except perhaps other Spark apps' jar files. So a security measure is probably better enforced at the application level, not by file name (even though checking the file name is still best practice).
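The existing restriction described above (hardcoded workerDir plus a fixed set of file names) can be sketched roughly like this. This is a hypothetical simplification for illustration, not the actual LogPage.scala code; the real method names and signatures differ:

```scala
import java.io.File

// Hypothetical sketch of the check described above: only "stdout" and
// "stderr" are served, and the resolved path must stay inside workerDir.
object LogAccess {
  private val allowedLogTypes = Set("stdout", "stderr")

  def isReadable(workerDir: File, appId: String,
                 executorId: String, logType: String): Boolean = {
    if (!allowedLogTypes.contains(logType)) return false
    val candidate = new File(workerDir, s"$appId/$executorId/$logType")
    // Canonicalize to defeat "../" tricks smuggled in via appId / executorId.
    candidate.getCanonicalPath
      .startsWith(workerDir.getCanonicalPath + File.separator)
  }
}
```

Note that the file-name check alone doesn't stop cross-app reads: a tampered appId still resolves to another app's stdout/stderr, which is exactly the limitation yongjiaw describes.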

@andrewor14
Contributor

On the other hand, I don't see serious issues with allowing users to read any file under workerDir. Reading a jar file will display bytecode, but users already have full control access to all files under workerDir, except perhaps other Spark apps' jar files.

What if I have some plain text secret (e.g. private key) in some file? Anyone with access to the UI will be able to read it.

@JoshRosen
Contributor

I agree that it's not okay to allow arbitrary file reads inside of the worker directory. What might be okay is a configuration mechanism that lets users register specific custom log files to be displayed in the UI (e.g. an explicit whitelist).
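A configuration-driven whitelist along the lines suggested here might look roughly like the sketch below. The property name spark.worker.ui.extraLogFiles is invented purely for illustration; no such setting exists in Spark:

```scala
// Hypothetical whitelist mechanism; the config key
// "spark.worker.ui.extraLogFiles" is made up for this sketch.
class LogFileWhitelist(conf: Map[String, String]) {
  private val defaults = Set("stdout", "stderr")

  // Parse a comma-separated list of extra permitted file names.
  private val extras: Set[String] =
    conf.get("spark.worker.ui.extraLogFiles")
      .map(_.split(",").map(_.trim).filter(_.nonEmpty).toSet)
      .getOrElse(Set.empty)

  val allowed: Set[String] = defaults ++ extras

  def isAllowed(fileName: String): Boolean = allowed.contains(fileName)
}
```

With, say, `spark.worker.ui.extraLogFiles=gc.log,app.log`, the UI would serve those two names in addition to stdout and stderr, but nothing else under workerDir, which addresses the arbitrary-read concern without the worker having to introspect executor log4j configs.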

@JoshRosen
Contributor

Ping @yongjiaw, could you please either close this pull request or address our comments? What do you think about adding an explicit whitelisting mechanism?

@andrewor14
Contributor

I agree, a whitelisting mechanism is preferable. Let's close this PR for now since it's inactive, and re-open it later with a different approach if there is still interest.

@asfgit asfgit closed this in 085f510 Feb 4, 2016
@yongjiaw
Contributor Author

yongjiaw commented Feb 4, 2016

Yeah, explicitly registering custom log files makes sense. It's fine to close this for now.


4 participants