
Conversation

@pcuenca (Member) commented Oct 2, 2022

This is a continuation of #521 by @jamestiotio as they didn't have access to a GPU with enough memory. I've just changed the tests slightly:

  • Use default weights.
  • Do not use attention slicing or autocast.
  • Compare first and last slices.
  • In ONNX, use the CUDAExecutionProvider.

In ONNX, however, the results of the last iteration step were erratic (sometimes they got very close to 0). I need to check with @anton-l what the reason could be. I'm only comparing the first step for now.
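For context, this is roughly the test pattern under discussion: a callback receives the intermediate latents at every denoising step, and slices of the first and last steps are compared against reference values. A minimal sketch (the prompt, seed, and pipeline kwargs are assumptions, not this PR's code; the step indices and expected values are the ones quoted in the review context below):

```python
import numpy as np
import torch
from diffusers import StableDiffusionPipeline

def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> None:
    test_callback_fn.has_been_called = True
    latents_slice = latents[0, -3:, -3:, -1].flatten().cpu().numpy()
    if step == 0:
        # First intermediate step: compare a 3x3 corner slice to reference values.
        expected_slice = np.array(
            [1.8279, 1.2858, -0.1022, 1.2406, -2.3068, 1.0748, -0.0819, -0.6522, -2.9496]
        )
        assert np.abs(latents_slice - expected_slice).max() < 1e-3
    elif step == 50:  # `elif`, per the review suggestion below
        # Last step: ensures the final latents stay consistent as the library evolves.
        expected_slice = np.array(
            [0.9052, -0.0184, 0.4810, 0.2898, 0.5851, 1.4920, 0.5362, 1.9838, 0.0530]
        )
        assert np.abs(latents_slice - expected_slice).max() < 1e-3

test_callback_fn.has_been_called = False
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = pipe.to("cuda")
pipe.set_progress_bar_config(disable=None)
generator = torch.Generator(device="cuda").manual_seed(0)
pipe(
    "Andromeda galaxy in a bottle",  # hypothetical prompt
    num_inference_steps=50,
    generator=generator,
    callback=test_callback_fn,
    callback_steps=1,
)
assert test_callback_fn.has_been_called
```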

@jamestiotio could you please take a quick look and let me know if you agree with this? Thanks!

@pcuenca requested a review from anton-l on October 2, 2022 18:01
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@jamestiotio (Contributor) left a comment

Overall, LGTM! The checks for the last slices are also nice.

Just suggested a little nitpick for explicit clarity.

Thank you!

[1.8279, 1.2858, -0.1022, 1.2406, -2.3068, 1.0748, -0.0819, -0.6522, -2.9496]
)
assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
if step == 50:
Contributor

It might be better to make it more explicit that this block is mutually exclusive with the step == 0 block. Something like:

Suggested change:
-   if step == 50:
+   elif step == 50:

expected_slice = np.array([0.9052, -0.0184, 0.4810, 0.2898, 0.5851, 1.4920, 0.5362, 1.9838, 0.0530])
expected_slice = np.array([0.9052, -0.0187, 0.4808, 0.2900, 0.5852, 1.4922, 0.5364, 1.9840, 0.0534])
assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
if step == 37:
Contributor

Same as above:

Suggested change:
-   if step == 37:
+   elif step == 37:

[-0.5472, 1.1218, -0.5504, -0.9391, -1.0795, 0.4064, 0.5158, 0.6427, -1.5245]
)
assert np.abs(latents_slice.flatten() - expected_slice).max() < 1e-3
if step == 37:
Contributor

Same as above:

Suggested change:
-   if step == 37:
+   elif step == 37:

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe.to(torch_device)
pipe.set_progress_bar_config(disable=None)
pipe.enable_attention_slicing()
Contributor

Not sure if we have enough memory for this

Contributor

Also, why disable it? It gives the same results.

Member Author

It was added by @jamestiotio because they didn't have a GPU with enough memory, and I thought our tests were performed with attention slicing disabled. But now I see they enable it, so I'll restore those lines.
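For reference, attention slicing is a one-line toggle on the pipeline; a minimal, illustrative sketch:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = pipe.to("cuda")

# Compute attention in slices: lower peak GPU memory at a small speed cost.
pipe.enable_attention_slicing()

# Full attention: faster, but needs a GPU with more memory.
pipe.disable_attention_slicing()
```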

@patrickvonplaten (Contributor)

Sorry, I don't fully understand this PR - how does this improve the tests? It's also good for our test suite to keep memory usage low, so I don't see any advantage in disabling attention_slicing.

@pcuenca (Member Author) commented Oct 4, 2022

> Sorry, I don't fully understand this PR - how does this improve the tests? It's also good for our test suite to keep memory usage low, so I don't see any advantage in disabling attention_slicing.

@jamestiotio wrote the initial tests for their PR using fp16, autocast, and the CPU execution provider for ONNX. I changed them to use full precision and the CUDA execution provider, just to mimic the rest of the tests. However, the main ONNX tests were later migrated to use CPU too, so that change no longer makes sense. And I don't think it's very important to require the callback tests to be computed in full precision anyway, so I'm closing this. Feel free to reopen if we still need some adjustment.
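For context, the ONNX execution provider is selected when loading the pipeline. A sketch, assuming the ONNX pipeline class and `onnx` weights revision of that era:

```python
from diffusers import StableDiffusionOnnxPipeline

# CUDAExecutionProvider was this PR's choice; the main ONNX tests were
# later migrated to CPUExecutionProvider.
pipe = StableDiffusionOnnxPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="onnx",
    provider="CUDAExecutionProvider",  # or "CPUExecutionProvider"
    use_auth_token=True,
)
```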

@pcuenca closed this on Oct 4, 2022
@jamestiotio (Contributor)

@pcuenca Adding the checks for the last slices would be nice to ensure that the latents stay somewhat consistent whenever new changes are introduced into the library. I feel it is something I should have included in my PR. Perhaps we can consider at least merging that part?

@patrickvonplaten (Contributor)

@jamestiotio this makes sense to me - do you mind opening a new PR, since this one is under @pcuenca and it's more or less outdated now?

This would be very nice!

@jamestiotio (Contributor)

> @jamestiotio this makes sense to me - do you mind opening a new PR, since this one is under @pcuenca and it's more or less outdated now?
>
> This would be very nice!

@patrickvonplaten I have opened a new PR to add the checks for the final latent slices at #731, as requested.

@pcuenca deleted the sd-callbacks-tests branch on October 12, 2022 11:11