@EnricoMi
What changes were proposed in this pull request?

Allow disabling shuffle data migration to other executors, so that shuffle data is migrated only to the fallback storage.

Why are the changes needed?

Currently, even when fallback storage is enabled, shuffle data is migrated to other executors first, and to the fallback storage only when no other executor is available for migration. As a result, the same shuffle data can be migrated multiple times. There should be a mode of operation in which executors migrate their shuffle data to the fallback storage only, so that the data is migrated exactly once.
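
To make the "multiple times" versus "exactly once" difference concrete, here is a minimal toy model in Python. It is not Spark code: the function name `count_migrations`, the `peers_enabled` parameter, and the executor names are hypothetical and exist only for illustration. It assumes a single shuffle block, a non-empty list of executors, and that every executor is decommissioned in list order.

```python
# Toy model only; none of these names come from Spark.
FALLBACK = "fallback-storage"


def count_migrations(executors, peers_enabled):
    """Count how often one shuffle block is copied while every executor
    in `executors` is decommissioned in list order.

    The block starts on executors[0] (the list must not be empty).
    """
    live = list(executors)
    location = live[0]
    copies = 0
    while location != FALLBACK:
        live.remove(location)      # the current holder is decommissioned
        if peers_enabled and live:
            location = live[0]     # current behaviour: prefer another executor
        else:
            location = FALLBACK    # no peer allowed, or none left
        copies += 1
    return copies


executors = ["exec-1", "exec-2", "exec-3"]
print(count_migrations(executors, peers_enabled=True))   # 3
print(count_migrations(executors, peers_enabled=False))  # 1
```

In this model, with peer migration enabled the block is copied once per decommissioned executor before it finally reaches the fallback storage. With peer migration disabled it is copied once.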

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests and a manual test using a reproduction example.

Was this patch authored or co-authored using generative AI tooling?

No
