[SPARK-42157][CORE] `spark.scheduler.mode=FAIR` should provide FAIR scheduler #39703

dongjoon-hyun · 2023-01-22T22:47:11Z

What changes were proposed in this pull request?

As our documentation states, spark.scheduler.mode=FAIR should provide FAIR scheduling within an application.

https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application

This bug is hidden in our CI because fairscheduler.xml is always present as one of the test resources.

https://github.com/apache/spark/blob/master/core/src/test/resources/fairscheduler.xml

Why are the changes needed?

Currently, when spark.scheduler.mode=FAIR is given without a scheduler allocation file, Spark creates the Fair Scheduler pools with a FIFO scheduler, which is wrong. We need to switch the mode from FIFO to FAIR.

BEFORE

$ bin/spark-shell -c spark.scheduler.mode=FAIR
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/01/22 14:47:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/01/22 14:47:38 WARN FairSchedulableBuilder: Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in fairscheduler.xml or set spark.scheduler.allocation.file to a file that contains the configuration.
Spark context Web UI available at http://localhost:4040

AFTER

$ bin/spark-shell -c spark.scheduler.mode=FAIR
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/01/22 14:48:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://localhost:4040
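
The reason CI never hit this path can be illustrated with a small, self-contained model of the lookup order. This is only a sketch under assumptions: the names `AllocationLookupSketch`, `PoolSource`, and `resolve` are hypothetical and are not Spark APIs, and the order shown (explicit allocation file first, then `fairscheduler.xml` on the classpath, then a pool built in code) is inferred from the warning message above rather than copied from `SchedulableBuilder.scala`.

```scala
// Hypothetical model of where the pool definition comes from; not Spark's code.
object AllocationLookupSketch {
  sealed trait PoolSource
  final case class ExplicitFile(path: String) extends PoolSource
  case object ClasspathDefault extends PoolSource // fairscheduler.xml found on the classpath
  case object BuiltInFallback extends PoolSource  // nothing found, pool is built in code

  /**
   * @param allocationFile      value of spark.scheduler.allocation.file, if set
   * @param classpathHasDefault whether fairscheduler.xml is on the classpath
   */
  def resolve(allocationFile: Option[String], classpathHasDefault: Boolean): PoolSource =
    allocationFile match {
      case Some(path)                  => ExplicitFile(path)
      case None if classpathHasDefault => ClasspathDefault
      case None                        => BuiltInFallback
    }

  def main(args: Array[String]): Unit = {
    // CI: core/src/test/resources/fairscheduler.xml is always present.
    println(resolve(None, classpathHasDefault = true))
    // A plain spark-shell started without any allocation file.
    println(resolve(None, classpathHasDefault = false))
  }
}
```

Only the last branch is affected by this bug, and the test classpath always takes the second branch, so unit tests cannot observe the FIFO fallback.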

Does this PR introduce any user-facing change?

Yes, but this is a bug fix to match with Apache Spark official documentation.

How was this patch tested?

Pass the CIs.

dongjoon-hyun · 2023-01-22T22:56:51Z

conf/fairscheduler-default.xml.template

+-->
+
+<allocations>
+  <pool name="default">


spark/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

Line 65 in 0c3f4cf

val DEFAULT_POOL_NAME = "default"

dongjoon-hyun · 2023-01-22T22:57:12Z

conf/fairscheduler-default.xml.template

+<allocations>
+  <pool name="default">
+    <schedulingMode>FAIR</schedulingMode>
+    <weight>1</weight>


spark/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

Line 73 in 0c3f4cf

val DEFAULT_WEIGHT = 1

dongjoon-hyun · 2023-01-22T22:57:24Z

conf/fairscheduler-default.xml.template

+  <pool name="default">
+    <schedulingMode>FAIR</schedulingMode>
+    <weight>1</weight>
+    <minShare>0</minShare>


spark/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

Line 72 in 0c3f4cf

val DEFAULT_MINIMUM_SHARE = 0

dongjoon-hyun · 2023-01-22T22:58:13Z

core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala


  val schedulerAllocFile = sc.conf.get(SCHEDULER_ALLOCATION_FILE)
  val DEFAULT_SCHEDULER_FILE = "fairscheduler.xml"
+  val DEFAULT_SCHEDULER_TEMPLATE_FILE = "fairscheduler-default.xml.template"


To avoid any conflicts with existing production jobs, this PR provides and uses a new file named with the .xml.template suffix.

dongjoon-hyun

Could you review this, @mridulm ? This bug was hidden and difficult to test in the unit test environment because we have fairscheduler.xml test resource.

https://github.com/apache/spark/blob/master/core/src/test/resources/fairscheduler.xml

mridulm · 2023-01-23T05:28:40Z

conf/fairscheduler-default.xml.template

+    <weight>1</weight>
+    <minShare>0</minShare>
+  </pool>
+</allocations>


There is a conf/fairscheduler.xml.template - why do we need this ?
If it is for testing, move it as a resource there instead of in conf ?

This is not for testing, @mridulm . As mentioned in #39703 (review), we already have a testing resource, fairscheduler.xml, not a template.

In addition, the content of conf/fairscheduler.xml.template does not match the expected default behavior.

mridulm · 2023-01-23T05:32:44Z

core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

+              s"FIFO order. To use fair scheduling, configure pools in $DEFAULT_SCHEDULER_FILE " +
+              s"or set ${SCHEDULER_ALLOCATION_FILE.key} to a file that contains the configuration.")
+            None
+          }


We should not be relying on the template file - in deployments, the template file can be invalid - admins are not expecting it to be read by Spark.

Instead, why not simply rely on returning None here ?

Note - if this is only for testing, we can special case it that way via spark.testing

> Note - if this is only for testing, we can special case it that way via spark.testing

First of all, this is not a testing issue. As I wrote in the PR description, our documentation is wrong. It says spark.scheduler.mode=FAIR will return a FAIR scheduler. However, we are getting a FIFO scheduler now.

> Instead, why not simply rely on returning None here ?

None is the previous behavior, which ends up with a FIFO scheduler and the WARNING message, 23/01/22 14:47:38 WARN FairSchedulableBuilder: Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in fairscheduler.xml or set spark.scheduler.allocation.file to a file that contains the configuration.

> We should not be relying on template file - in deployments, template file can be invalid - admins are not expecting it to be read by spark.

Got it. I understand your point about the template file. I tried to use a template file because I cannot ship a real fairscheduler.xml file: a file with that name may already be in use in production.

mridulm · 2023-01-23T06:54:43Z

Looks like I misunderstood the PR, I see what you mean @dongjoon-hyun.
I am not sure what is a good way to make progress here ... let me think about it more.

+CC @tgravescs, @Ngone51 in case you have thoughts.

dongjoon-hyun · 2023-01-23T06:57:00Z

No problem. I totally understand your concern on the usage of template file. I'll also think about a new way. Thank you for your thoughtful review, @mridulm .

tgravescs · 2023-01-23T15:29:26Z

I haven't used FAIR Scheduler much, but was wondering can we just have the defaults be in the code vs having to read a separate template file?

ie if no file
rootPool.addSchedulable(new Pool("default", DEFAULT_SCHEDULING_MODE, DEFAULT_MINIMUM_SHARE, DEFAULT_WEIGHT))
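
To make the suggestion concrete, here is a minimal, self-contained sketch of building the default pool in code. It is not the merged change. `Pool` and `SchedulingMode` below are simplified stand-ins for Spark's internal classes, `defaultPool` is a hypothetical helper, and the three constants carry the values quoted earlier in this thread. One caveat, stated as an assumption because the thread does not show the value: if `DEFAULT_SCHEDULING_MODE` is FIFO, passing it as written above would still yield a FIFO pool, so this sketch takes the mode from the `spark.scheduler.mode` setting instead.

```scala
// Simplified stand-ins for Spark internals; names mirror the thread, bodies are assumptions.
object DefaultPoolSketch {
  object SchedulingMode extends Enumeration {
    val FAIR, FIFO = Value
  }

  final case class Pool(name: String, mode: SchedulingMode.Value, minShare: Int, weight: Int)

  // Values quoted in this thread from SchedulableBuilder.scala.
  val DEFAULT_POOL_NAME = "default"
  val DEFAULT_MINIMUM_SHARE = 0
  val DEFAULT_WEIGHT = 1

  /** Builds the fallback pool from the configured spark.scheduler.mode value. */
  def defaultPool(configuredMode: String): Pool =
    Pool(
      DEFAULT_POOL_NAME,
      SchedulingMode.withName(configuredMode), // throws NoSuchElementException on unknown names
      DEFAULT_MINIMUM_SHARE,
      DEFAULT_WEIGHT)

  def main(args: Array[String]): Unit = {
    // With spark.scheduler.mode=FAIR and no allocation file, the pool should be FAIR.
    println(defaultPool("FAIR"))
  }
}
```

Keeping the defaults in code avoids reading any file that an administrator may have edited or removed, which was the main objection to the template approach.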

dongjoon-hyun · 2023-01-23T17:04:19Z

Thank you, @tgravescs . Yes, I agree with you to have the defaults in the code.

dongjoon-hyun · 2023-01-23T20:30:59Z

I addressed the comments. Could you review this once more, @mridulm and @tgravescs ?

dongjoon-hyun · 2023-01-24T04:05:03Z

All tests passed.

dongjoon-hyun · 2023-01-24T05:52:07Z

Could you review this, @Ngone51 ?

core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

mridulm

Looks good to me, I was essentially toying with the same idea Tom had - but wanted to explore alternatives.
Unfortunately, could not come up with anything better

dongjoon-hyun · 2023-01-24T07:27:38Z

Thank you, @mridulm !

Closes #39703 from dongjoon-hyun/SPARK-42157.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 4d51bfa)
Signed-off-by: Dongjoon Hyun <[email protected]>

dongjoon-hyun · 2023-01-24T07:49:06Z

Merged to master/3.3/3.2.

dongjoon-hyun · 2023-01-24T07:49:22Z

Thank you again, @mridulm and @tgravescs .

dongjoon-hyun · 2023-01-25T22:23:38Z

cc @kazuyukitanimura since this lands at branch-3.2

[SPARK-42157][CORE] spark.scheduler.mode=FAIR should provide FAIR scheduler

273a3a6

github-actions bot added the CORE label Jan 22, 2023

dongjoon-hyun changed the title ~~[SPARK-42157][CORE] spark.scheduler.mode=FAIR should provide FAIR scheduler~~ [SPARK-42157][CORE] `spark.scheduler.mode=FAIR` should provide FAIR scheduler Jan 22, 2023

Fix info log message

17db356

mridulm reviewed Jan 23, 2023

dongjoon-hyun added 3 commits January 23, 2023 10:30

Revert "Fix info log message"

7fc9906

This reverts commit 17db356.

Revert "[SPARK-42157][CORE] spark.scheduler.mode=FAIR should provide FAIR scheduler"

664e4f7

This reverts commit 273a3a6.

Address comments

e0d22d6

mridulm reviewed Jan 24, 2023

mridulm approved these changes Jan 24, 2023

Update core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala

58082c9

Co-authored-by: Mridul Muralidharan <[email protected]>

Indentation

5124297

dongjoon-hyun closed this in 4d51bfa Jan 24, 2023

dongjoon-hyun deleted the SPARK-42157 branch January 24, 2023 07:49

dongjoon-hyun mentioned this pull request Jan 24, 2023

[MINOR][K8S][DOCS] Add all resource managers in Scheduling Within an Application section #39704

Closed
