Improve pipeline callback tests #697
Conversation
This is a continuation of #521 by @jamestiotio, as they didn't have access to a GPU with enough memory. I've just changed the tests slightly:

- Use default weights.
- Do not use attention slicing or autocast.
- Compare first and last slices.
- In ONNX, use the CUDAExecutionProvider.

In ONNX, however, the results at the last step of the iteration were erratic (sometimes they get very close to 0), so I'm just comparing the first step for now. I need to check with @anton-l what the reason could be.

@jamestiotio could you please take a quick look and let me know if you agree with this? Thanks!
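As a rough sketch of what these callback tests look like, assuming the `callback` / `callback_steps` API introduced in #521 (the prompt, seed, and printed values below are illustrative, not the pinned ones):

```python
import torch
from diffusers import StableDiffusionPipeline


def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> None:
    test_callback_fn.has_been_called = True
    # Only the first and last steps are compared against recorded slices.
    if step in (0, 50):
        latents = latents.detach().cpu().numpy()
        latents_slice = latents[0, -3:, -3:, -1]
        # The real tests assert that this slice matches values recorded from a
        # reference run, e.g.:
        #   assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
        print(step, latents_slice.flatten())


test_callback_fn.has_been_called = False

# Default (full-precision) weights, no attention slicing or autocast.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe.to("cuda")
pipe.set_progress_bar_config(disable=None)

generator = torch.Generator(device="cuda").manual_seed(0)
pipe(
    prompt="Andromeda galaxy in a bottle",  # illustrative prompt and seed
    num_inference_steps=50,
    guidance_scale=7.5,
    generator=generator,
    callback=test_callback_fn,
    callback_steps=1,
)
assert test_callback_fn.has_been_called
```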
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
jamestiotio left a comment
Overall, LGTM! The checks for the last slices are also nice.
Just suggested a little nitpick for explicit clarity.
Thank you!
```python
        [1.8279, 1.2858, -0.1022, 1.2406, -2.3068, 1.0748, -0.0819, -0.6522, -2.9496]
    )
    assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
if step == 50:
```
It might be better to make it more explicit that this block is mutually exclusive with the `step == 0` block. Something like:
```diff
- if step == 50:
+ elif step == 50:
```
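With the suggestion applied, the callback body would read along these lines (a sketch: the recorded slice values are replaced by zero placeholders, and the conversion to NumPy is assumed):

```python
import numpy as np
import torch


def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> None:
    latents = latents.detach().cpu().numpy()
    latents_slice = latents[0, -3:, -3:, -1]
    if step == 0:
        expected_slice = np.zeros(9)  # placeholder for the recorded first-step slice
        assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
    elif step == 50:  # mutually exclusive with the step == 0 branch above
        expected_slice = np.zeros(9)  # placeholder for the recorded last-step slice
        assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
```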
```diff
- expected_slice = np.array([0.9052, -0.0184, 0.4810, 0.2898, 0.5851, 1.4920, 0.5362, 1.9838, 0.0530])
+ expected_slice = np.array([0.9052, -0.0187, 0.4808, 0.2900, 0.5852, 1.4922, 0.5364, 1.9840, 0.0534])
  assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
  if step == 37:
```
Same as above:
```diff
- if step == 37:
+ elif step == 37:
```
```python
        [-0.5472, 1.1218, -0.5504, -0.9391, -1.0795, 0.4064, 0.5158, 0.6427, -1.5245]
    )
    assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
if step == 37:
```
Same as above:
```diff
- if step == 37:
+ elif step == 37:
```
```python
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe.to(torch_device)
pipe.set_progress_bar_config(disable=None)
pipe.enable_attention_slicing()
```
Not sure if we have enough memory for this
Also, why disable it? It gives the same results.
It was added by @jamestiotio because they didn't have a GPU with enough memory, and I thought our tests were performed with attention slicing disabled. But now I see they enable it, so I'll restore those lines.
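For reference, attention slicing is toggled directly on the pipeline object; a minimal sketch of the two configurations under discussion, reusing the model id from the hunk above:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe.to("cuda")

# Compute attention in smaller slices rather than all at once:
# lower peak VRAM for a small speed cost, with the same results
# up to numerical noise.
pipe.enable_attention_slicing()

# The configuration originally proposed in this PR would leave slicing off:
# pipe.disable_attention_slicing()
```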
Sorry, I don't fully understand this PR - how does it improve the tests? It's also good for our test suite to keep memory usage low, so I don't see any advantage in disabling attention slicing.
@jamestiotio wrote the initial tests for their PR using fp16, autocast, and the CPU execution provider for ONNX. I changed them to use full precision and the CUDA execution provider, just to mimic the rest of the tests. However, the main ONNX tests were later migrated to use the CPU too, so that change no longer makes sense. And I don't think it's very important to require the callback tests to run in full precision anyway, so I'm closing this. Feel free to reopen if we still need some adjustment.
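For context on the provider switch: with the ONNX pipeline, the execution provider is chosen at load time. A minimal sketch, assuming the `OnnxStableDiffusionPipeline` class and `provider` argument as used in the diffusers ONNX tests of that period:

```python
from diffusers import OnnxStableDiffusionPipeline

# "CPUExecutionProvider" runs through ONNX Runtime on the CPU;
# "CUDAExecutionProvider" targets the GPU (requires onnxruntime-gpu).
pipe = OnnxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="onnx",
    provider="CPUExecutionProvider",
    use_auth_token=True,
)
```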
@pcuenca Adding the checks for the last slices would be nice to ensure that the latents stay somewhat consistent whenever new changes are introduced into the library. I feel it's something I should have included in my PR. Perhaps we can consider at least merging that part?
@jamestiotio this makes sense to me - do you mind opening a new PR, since this one is under @pcuenca and is more or less outdated now? This would be very nice!
@patrickvonplaten I have opened a new PR to add the checks for the final latent slices at #731, as requested. |