@frankliee
Contributor

What changes were proposed in this pull request?

Update the log4j1 syntax in `running-on-yarn.md` to log4j2, and use `${sys:spark.yarn.app.container.log.dir}` to refer to the container log directory.

See https://issues.apache.org/jira/browse/SPARK-42880

Why are the changes needed?

Spark 3.3 switched from log4j1 to log4j2, so the affected documentation should be updated as well.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Documentation-only change.

@github-actions github-actions bot added the DOCS label Mar 21, 2023
@frankliee frankliee changed the title from "[SPARK-42880] Update running-on-yarn.md for log4j2" to "[SPARK-42880] Update running-on-yarn.md to log4j2 syntax" Mar 21, 2023
@HyukjinKwon
Member

cc @viirya FYI

@frankliee
Contributor Author

The YARN NM injects `spark.yarn.app.container.log.dir` as a system property, so we use `${sys:xxx}` to refer to it during logging initialization.

https://logging.apache.org/log4j/2.x/manual/lookups.html#system-properties-lookup
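
For illustration, here is a minimal sketch of how the lookup differs between the two syntaxes. The appender name `file_appender` follows the documentation example; the layout pattern and the root logger level are assumptions chosen only for this sketch.

```properties
# log4j1 style (before Spark 3.3): plain ${...} substitution
# log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log

# log4j2 style: system properties are read through the "sys:" lookup
appender.file_appender.type = File
appender.file_appender.name = file_appender
appender.file_appender.fileName = ${sys:spark.yarn.app.container.log.dir}/spark.log
appender.file_appender.layout.type = PatternLayout
# Illustrative pattern, not mandated by the docs
appender.file_appender.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Illustrative root logger wiring
rootLogger.level = info
rootLogger.appenderRef.file_appender.ref = file_appender
```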

to the same log file).

- If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
+ If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `appender.spark.fileName=${sys:spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
Contributor

@LuciferYang LuciferYang Mar 21, 2023

`sys:` is OK, but I don't think we need to change `file_appender` to `spark`. I also think `log4j.properties` should be changed to `log4j2.properties`.

Member

Maybe just `appender.file.fileName`, since that is what we have in some `log4j2.properties` files in the codebase (e.g. in tests).

Contributor Author

I have changed it back to `file_appender`.

@viirya viirya changed the title from "[SPARK-42880] Update running-on-yarn.md to log4j2 syntax" to "[SPARK-42880][DOCS] Update running-on-yarn.md to log4j2 syntax" Mar 21, 2023
Member

Suggested change
- If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `appender.spark.fileName=${sys:spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
+ If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j2.properties`. For example, `appender.spark.fileName=${sys:spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
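
For the streaming case mentioned in the paragraph, a rolling configuration in `log4j2.properties` might look roughly like the sketch below. The appender name `rolling`, the 100MB size limit, the retention count of 5, and the layout pattern are all assumptions for illustration; only the `${sys:spark.yarn.app.container.log.dir}` lookup comes from the change under review.

```properties
# Sketch of a size-based rolling appender writing into YARN's container log dir.
# All names and limits below are illustrative assumptions.
appender.rolling.type = RollingFile
appender.rolling.name = rolling
appender.rolling.fileName = ${sys:spark.yarn.app.container.log.dir}/spark.log
# %i is the rollover index appended to archived files
appender.rolling.filePattern = ${sys:spark.yarn.app.container.log.dir}/spark.log.%i
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Roll over once the active file reaches the size limit
appender.rolling.policies.type = Policies
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 100MB
# Keep at most 5 archived files so disk usage stays bounded
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.max = 5

rootLogger.level = info
rootLogger.appenderRef.rolling.ref = rolling
```

Capping both the file size and the number of archives is what bounds the disk usage of a long-running application; the exact limits would depend on the cluster.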

Contributor Author

Updated.

@srowen srowen closed this in 7ad1c80 Mar 22, 2023
@srowen
Member

srowen commented Mar 22, 2023

Merged to master.
