add accelerate to load models with smaller memory footprint #361
Conversation
The documentation is not available anymore as the PR was closed or merged.
src/diffusers/configuration_utils.py
Outdated
```python
init_dict, unused_kwargs = cls.extract_init_dict(config_dict, **kwargs)

model = cls(**init_dict)
device_map = kwargs.pop("low_cpu_mem_usage", None)
```
Ideally we would like to try to keep configuration_utils.py framework- and component-independent. Could we maybe try to set:

```python
with accelerate.init_empty_weights():
    model, unused_kwargs = cls.from_config(
        config_path,
        cache_dir=cache_dir,
        return_unused_kwargs=True,
        force_download=force_download,
        resume_download=resume_download,
        proxies=proxies,
        local_files_only=local_files_only,
        use_auth_token=use_auth_token,
        revision=revision,
        subfolder=subfolder,
        device_map=device_map,
        **kwargs,
    )
```

in modeling_utils.py instead?
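(For context: `init_empty_weights()` is accelerate's context manager that creates model parameters on PyTorch's "meta" device, so instantiating the model under it allocates essentially no RAM; memory is only consumed once the actual checkpoint weights are loaded in.)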
Just did it.
src/diffusers/modeling_utils.py
Outdated
```python
# Set model in evaluation mode to deactivate DropOut modules by default
model.eval()
if device_map is not None:
```
If possible, it would be very nice if all accelerate logic were added only here.
Just did it. I had to put the model creation and checkpoint loading after grabbing the weight and config files to avoid splitting the accelerate logic into two places.
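(For reference, the resulting order looks roughly like the sketch below. The repo id, file names, and the exact diffusers API calls are illustrative assumptions; the real code in the PR may differ in detail.)

```python
import json

from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from huggingface_hub import hf_hub_download

from diffusers import UNet2DConditionModel

# 1. Grab the config and weight files first, so that all accelerate-specific
#    logic can live in one contiguous block below.
config_file = hf_hub_download("CompVis/stable-diffusion-v1-3", "config.json", subfolder="unet")
weights_file = hf_hub_download(
    "CompVis/stable-diffusion-v1-3", "diffusion_pytorch_model.bin", subfolder="unet"
)
with open(config_file) as f:
    config = json.load(f)

# 2. Instantiate the model on the "meta" device: no real weight memory is allocated yet.
with init_empty_weights():
    model = UNet2DConditionModel.from_config(config)

# 3. Materialize the checkpoint weights directly onto the target device(s).
model = load_checkpoint_and_dispatch(model, weights_file, device_map="auto")
```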
patrickvonplaten
left a comment
Hey @piEsposito,
Sorry for replying only now :-/
Thanks a lot for the PR! It already looks really nice. One important thing that (if possible) would be good to change is to add functionality only to modeling_utils.py and not to configuration_utils.py, because configuration_utils is also used for the schedulers.
Could you maybe give this a try?
move accelerate logic to single snippet under modelling utils and remove it from configuration utils
@patrickvonplaten I've addressed your comments and moved the accelerate logic to modeling utils. I also created some tests to ensure memory usage gets lower and results stay the same. Thank you for taking the time to carefully review this PR.
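(A rough sketch of what such tests can look like is below. The model id, the 32x32 sample shape, and the tolerance are illustrative assumptions, not the exact tests from the PR.)

```python
import gc

import psutil
import torch

from diffusers import UNet2DModel

MODEL_ID = "fusing/unet-ldm-dummy-update"  # illustrative small test checkpoint


def ram_used_mb() -> float:
    # Resident set size of the current process, in MiB.
    return psutil.Process().memory_info().rss / 1024**2


# 1. Results should stay the same: outputs from both loading paths must match.
model_normal = UNet2DModel.from_pretrained(MODEL_ID)
model_accel = UNet2DModel.from_pretrained(MODEL_ID, device_map="auto")

sample = torch.randn(1, model_normal.config.in_channels, 32, 32)
timestep = torch.tensor([10])
with torch.no_grad():
    out_normal = model_normal(sample, timestep).sample
    out_accel = model_accel(sample, timestep).sample
assert torch.allclose(out_normal, out_accel, atol=1e-5)

# 2. Memory usage should get lower: loading with device_map="auto" should grow
#    the process RSS less than a plain from_pretrained call does.
del model_normal, model_accel
gc.collect()
before = ram_used_mb()
_ = UNet2DModel.from_pretrained(MODEL_ID, device_map="auto")
print(f"RAM increase with accelerate: {ram_used_mb() - before:.1f} MiB")
```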
src/diffusers/modeling_utils.py
Outdated
```python
import torch
from torch import Tensor, device

import accelerate
```
We should not make accelerate a hard dependency here. Could you wrap it into a:

```python
if accelerate_is_available():
    import accelerate
else:
    raise ImportError("Please install accelerate via `pip install accelerate`")
```

below the `if device_map == "auto"` check?
@patrickvonplaten I've just done it, thank you for the suggestion.
src/diffusers/modeling_utils.py
Outdated
```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import EntryNotFoundError, RepositoryNotFoundError, RevisionNotFoundError
from requests import HTTPError
from transformers.utils import is_accelerate_available
```
We cannot do this either, because transformers is not a hard requirement 😅
I should have given you more details in my previous comments; I'm very sorry that the feedback cycle takes so much time. I will try hard to reply faster here now.
In short, can you copy this code https://github.com/huggingface/transformers/blob/e5b7cff5fe65eac9e54ba88fa3935b3270db0207/src/transformers/utils/import_utils.py#L528 into https://github.com/huggingface/diffusers/blob/main/src/diffusers/utils/import_utils.py?
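(For reference, the transformers helper linked above boils down to a module-availability check along these lines; the exact diffusers copy may differ slightly:)

```python
import importlib.util

# True if the accelerate package can be imported in this environment.
_accelerate_available = importlib.util.find_spec("accelerate") is not None


def is_accelerate_available() -> bool:
    return _accelerate_available
```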
Hey, thank you for clarifying that. I've just fixed it. And no problem: I'm sure you are very busy, and I'm very thankful for the time you took to guide me through this PR.
patrickvonplaten
left a comment
Just need to clean up the is_accelerate_available() comment and then it should be good to go :-)
@patrickvonplaten finished implementing all the requested changes. Please let me know if anything else comes to mind.
This looks good to me!
Hey folks, any update here?
```python
from_auto_class = kwargs.pop("_from_auto", False)
torch_dtype = kwargs.pop("torch_dtype", None)
subfolder = kwargs.pop("subfolder", None)
device_map = kwargs.pop("device_map", None)
```
Sorry, I overlooked this the first time.
@piEsposito could you also add a docstring here?
E.g. 3-4 lines under line 264.
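(For instance, something along these lines in the usual kwargs docstring style; the exact wording here is just a suggestion:)

```python
device_map (`str`, *optional*):
    A map that specifies where each submodule should go. If set to `"auto"`,
    `accelerate` computes the most optimized `device_map` automatically.
    Requires `accelerate` to be installed (`pip install accelerate`).
```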
PR is good to merge for me! Played around with:

```python
#!/usr/bin/env python3
from diffusers import UNet2DConditionModel

model = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-3", device_map="auto", subfolder="unet"
)

import ipdb; ipdb.set_trace()
```

on 1, >1, and no-GPU machines and it works as expected.
patrickvonplaten
left a comment
@patil-suraj @anton-l would be nice if one of you could take a look :-)
@piEsposito in a follow-up PR it would be nice if you could then also implement this for the more global:

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-3", device_map="auto")
```

functionality :-) Think this could then have very widespread adoption. At the moment 99% of users load models via the pipeline interface.
patil-suraj
left a comment
LGTM, thanks a lot for working on this @piEsposito !
@patrickvonplaten great idea. I can start working on that today. Thanks!
add accelerate to load models with smaller memory footprint (huggingface#361)

* add accelerate to load models with smaller memory footprint
* remove low_cpu_mem_usage as it is redundant
* move accelerate init weights context to modelling utils
* add test to ensure results are the same when loading with accelerate
* add tests to ensure ram usage gets lower when using accelerate
* move accelerate logic to single snippet under modelling utils and remove it from configuration utils
* format code to pass quality check
* fix imports with isort
* add accelerate to test extra deps
* only import accelerate if device_map is set to auto
* move accelerate availability check to diffusers import utils
* format code

Co-authored-by: Patrick von Platen <[email protected]>
Closes #281