[SPARK-26753][CORE] Fixed custom log levels for spark-shell by using Filter instead of Threshold #23675
Conversation
…lasses work for spark-shell

This fix replaces the Threshold with a Filter for ConsoleAppender which checks that either the logLevel is greater than thresholdLevel (the shell log level) or the log originated from a custom-defined logger. In those cases it lets the log event through; otherwise it drops the event.

Testing done:
1. Ensured that a custom log level works when set by default (via log4j.properties)
2. Ensured that logs are not printed twice when the log level is changed by setLogLevel
3. Ensured that custom logs are printed when the log level is changed back by setLogLevel
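The accept/drop decision described above can be sketched as a small standalone function. This is a minimal sketch, not the actual Spark filter: plain integers stand in for log4j's `Level` objects, and `hasCustomConfig` is a hypothetical stand-in for the check that a logger was explicitly configured by the user.

```scala
// Sketch of the filter decision: let an event through if its level meets the
// shell threshold, or if it came from an explicitly configured (custom) logger.
object FilterSketch {
  // Higher number = more severe (e.g. DEBUG = 1, INFO = 2, WARN = 3, ERROR = 4).
  def accept(eventLevel: Int, thresholdLevel: Int, hasCustomConfig: Boolean): Boolean =
    eventLevel >= thresholdLevel || hasCustomConfig
}
```

For example, an INFO event against a WARN threshold is dropped when it comes from an unconfigured logger, but passes when the logger has its own log4j.properties entry.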
cc @vanzin
Test build #101767 has finished for PR 23675 at commit
vanzin left a comment:
Please explain the fix, not the problem, in the PR.
1. Moved SparkShellLoggingFilter from Utils.scala to Logging.scala
2. Moved thresholdLevel from Filter to Logging.scala
3. Added unit test cases
1. Changed the uninitialize() method of Logging.scala
2. Other minor modifications
Test build #101836 has finished for PR 23675 at commit
Test build #101843 has finished for PR 23675 at commit
vanzin left a comment:
Small nit. I'll fix during merge.
Merging to master.
`private class SparkShellLoggingFilter() extends Filter {`
remove ()
…Filter instead of Threshold

This fix replaces the Threshold with a Filter for ConsoleAppender which checks that either the logLevel is greater than thresholdLevel (the shell log level) or the log originated from a custom-defined logger. In those cases it lets the log event through; otherwise it drops the event.

1. Ensured that a custom log level works when set by default (via log4j.properties)
2. Ensured that logs are not printed twice when the log level is changed by setLogLevel
3. Ensured that custom logs are printed when the log level is changed back by setLogLevel

Closes apache#23675 from ankuriitg/ankurgupta/SPARK-26753.

Authored-by: ankurgupta <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
…oot logger properly

### What changes were proposed in this pull request?

In the current implementation of `SparkShellLoggingFilter`, if the log level of the root logger and the log level of a message differ, whether a message should be logged is decided based on log4j's configuration, but whether the message should be output to the REPL's console is not considered. So, if the log level of the root logger is `DEBUG`, the log level of the REPL's logger is `WARN`, and the log level of a message is `INFO`, the message is output to the REPL's console even though `INFO < WARN`.

https://github.com/apache/spark/pull/26798/files#diff-bfd5810d8aa78ad90150e806d830bb78L237

The ideal behavior should be as follows, and this change implements it:

1. If the log level of a message is greater than or equal to the log level of the root logger, the message should be logged, but whether the message is output to the REPL's console should be decided based on whether the log level of the message is greater than or equal to the log level of the REPL's logger.
2. If a log level or custom appenders are explicitly defined for a category, whether a log message via the logger corresponding to the category is logged and output to the REPL's console should be decided based on the log level of the category. We can confirm whether a log level or appenders are explicitly set on a logger for a category via `Logger#getLevel` and `Logger#getAllAppenders.hasMoreElements`.

### Why are the changes needed?

This is a bug breaking compatibility. #9816 enabled the REPL's log4j configuration to override the root logger, but #23675 seems to have broken that feature. You can see one example if you modify the default log4j configuration as follows:

```
# Change the log level for rootCategory to DEBUG
log4j.rootCategory=DEBUG, console
...
# The log level for repl.Main remains WARN
log4j.logger.org.apache.spark.repl.Main=WARN
```

If you launch the REPL with this configuration, INFO level logs appear even though the log level for the REPL is WARN:

```
...
19/12/08 23:31:38 INFO Utils: Successfully started service 'sparkDriver' on port 33083.
19/12/08 23:31:38 INFO SparkEnv: Registering MapOutputTracker
19/12/08 23:31:38 INFO SparkEnv: Registering BlockManagerMaster
19/12/08 23:31:38 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/12/08 23:31:38 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/12/08 23:31:38 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
...
```

Before #23675 was applied, those INFO level logs were not shown with the same log4j.properties.

### Does this PR introduce any user-facing change?

Yes. The logging behavior for the REPL is fixed.

### How was this patch tested?

Manual test and a newly added unit test.

Closes #26798 from sarutak/fix-spark-shell-loglevel.

Authored-by: Kousuke Saruta <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
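The two rules above can be modeled as a small decision function. This is a sketch under simplifying assumptions, not the code in the PR: integer values stand in for log4j's `Level` objects, and `explicitCategoryLevel` is a hypothetical stand-in for the `Logger#getLevel` / `Logger#getAllAppenders.hasMoreElements` check described above.

```scala
// Sketch of the intended console-output decision for the REPL.
object ReplFilterSketch {
  // Levels: higher = more severe (DEBUG = 1, INFO = 2, WARN = 3, ERROR = 4).
  def showOnConsole(msgLevel: Int, rootLevel: Int, replLevel: Int,
                    explicitCategoryLevel: Option[Int]): Boolean =
    explicitCategoryLevel match {
      // Rule 2: a category with its own explicit configuration decides for itself.
      case Some(catLevel) => msgLevel >= catLevel
      // Rule 1: the message must pass the root logger AND the REPL's own level.
      case None           => msgLevel >= rootLevel && msgLevel >= replLevel
    }
}
```

With root at `DEBUG`, the REPL at `WARN`, and no explicit category configuration, an `INFO` message is logged by log4j but hidden from the console, which is exactly the case the bug report exercises.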
What changes were proposed in this pull request?
This fix replaces the Threshold with a Filter for ConsoleAppender which checks that either the logLevel is greater than thresholdLevel (the shell log level) or the log originated from a custom-defined logger. In those cases it lets the log event through; otherwise it drops the event.
How was this patch tested?