Skip to content

Conversation

@Cyrilvallez
Copy link
Member

@Cyrilvallez Cyrilvallez commented Nov 18, 2025

What does this PR do?

The device_map computation is currently broken, after #42043 switched to using an integration of the accelerate functions to simplify them and use the already available all_tied_weights_keys instead of computing them again and again. But it does not compute the double list of tied_parameters correctly, in the format that the accelerate functions are used to, i.e. groups of all similar parameters.
Also, now that we don't tie weights before device_map computation, we need to exclude them explicitly from the loop in compute_model_sizes, as they will be present in the iterated params.

This PR fixes it. This is needed in #42242, which is where I noticed it.
Also fixes a more long-standing issue, where compute_model_sizes was not taking non-persistent buffers into account, but they actually take some space!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@SunMarc SunMarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this !

@Cyrilvallez Cyrilvallez merged commit a74be0d into main Nov 19, 2025
24 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-accelerate-integration branch November 19, 2025 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants