[Accelerate model loading] Fix meta device and super low memory usage #1016
Conversation
The tests are currently failing on main.

Also, this PR renames `cuda_with_minimal_gpu_usage` to `enable_sequential_cpu_offload`, as it's a more fitting name, and disentangles `enable_attention_slicing` from `cpu_offload`.

Related original PR: #850

@piEsposito does this work for you?
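As a hedged illustration of the renamed API (a minimal sketch, assuming `accelerate` is installed and a CUDA device is available; the model id is taken from the test below, and the exact call signature may differ by diffusers version):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
)
# Renamed in this PR from cuda_with_minimal_gpu_usage: submodules stay on the
# CPU and are moved to the GPU one at a time, minimizing peak GPU memory.
pipe.enable_sequential_cpu_offload()
image = pipe("an astronaut riding a horse").images[0]
```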
```python
assert np.abs(ddpm_images - ddim_images).max() < 1e-1


@require_torch_gpu
def test_stable_diffusion_accelerate_load_works(self):
```
this test doesn't do anything so let's delete it
The documentation is not available anymore as the PR was closed or merged.
patil-suraj left a comment:
thanks for fixing this, looks good to me!
```diff
     self.enable_attention_slicing(None)

-    def cuda_with_minimal_gpu_usage(self):
+    def enable_sequential_cpu_offload(self):
```
Great name choice!
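A small sketch of what the disentangling means for callers (illustrative; `pipe` is assumed to be a loaded `StableDiffusionPipeline`):

```python
# After this PR the two memory savers are independent, opt-in toggles:
pipe.enable_sequential_cpu_offload()  # CPU offload no longer implies slicing
pipe.enable_attention_slicing()       # can still be enabled separately
```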
```python
pipeline_id = "CompVis/stable-diffusion-v1-4"

# Baseline: regular loading, then moving the pipeline to the GPU.
start_time = time.time()
pipeline_normal_load = StableDiffusionPipeline.from_pretrained(
    pipeline_id, revision="fp16", torch_dtype=torch.float16, use_auth_token=True
)
pipeline_normal_load.to(torch_device)
normal_load_time = time.time() - start_time

# Accelerate path: device_map="auto" loads via the meta device.
start_time = time.time()
_ = StableDiffusionPipeline.from_pretrained(
    pipeline_id, revision="fp16", torch_dtype=torch.float16, use_auth_token=True, device_map="auto"
)
meta_device_load_time = time.time() - start_time

# Meta-device loading should be at least twice as fast.
assert 2 * meta_device_load_time < normal_load_time
```
Very cool!
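For context, a rough sketch of the meta-device trick that `device_map="auto"` builds on via `accelerate` (illustrative, not the actual diffusers implementation):

```python
import torch.nn as nn
from accelerate import init_empty_weights

# Instantiating under init_empty_weights puts parameters on the "meta" device:
# no memory is allocated and no random initialization runs, which is why the
# device_map="auto" load above is roughly twice as fast.
with init_empty_weights():
    model = nn.Linear(4096, 4096)

print(model.weight.device)  # -> meta

# Real weights are later materialized directly onto their target devices,
# e.g. with accelerate.load_checkpoint_and_dispatch(...).
```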
@patrickvonplaten great naming choice, I love it!
…huggingface#1016) * [Accelerate model loading] Fix meta device and super low memory usage * better naming