[Docs] Adds a documentation page for evaluating diffusion models #2516
The documentation is not available anymore as the PR was closed or merged.
patrickvonplaten
left a comment
Very cool! Left some suggestions
Co-authored-by: Patrick von Platen <[email protected]> Co-authored-by: Kashif Rasul <[email protected]>
pcuenca
left a comment
Very cool!


One strategy to evaluate such a model is to measure the consistency of the change between the two images (in [CLIP](https://huggingface.co/docs/transformers/model_doc/clip) space) with the change between the two image captions (as shown in [CLIP-Guided Domain Adaptation of Image Generators](https://arxiv.org/abs/2108.00946)). This is referred to as the "**CLIP directional similarity**".
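
As a rough illustration of the measure described above, here is a minimal sketch that assumes the four CLIP embeddings (two images, two captions) have already been computed. The function name `directional_similarity` and the toy vectors are hypothetical and only meant to show the arithmetic: normalize each embedding, take the image-space and text-space differences, and compare the two directions with cosine similarity.

```python
import numpy as np


def directional_similarity(img_feat_one, img_feat_two, text_feat_one, text_feat_two, eps=1e-8):
    """Cosine similarity between the image edit direction and the caption edit direction.

    All inputs are assumed to be embeddings from the same CLIP model,
    with the feature dimension on the last axis.
    """

    def normalize(x):
        x = np.asarray(x, dtype=np.float64)
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

    # Direction of change in image space and in text space.
    img_dir = normalize(img_feat_two) - normalize(img_feat_one)
    txt_dir = normalize(text_feat_two) - normalize(text_feat_one)

    # Cosine similarity of the two directions.
    num = np.sum(img_dir * txt_dir, axis=-1)
    den = np.linalg.norm(img_dir, axis=-1) * np.linalg.norm(txt_dir, axis=-1) + eps
    return num / den


# Toy 2-D "embeddings" (not real CLIP features): the image change and the
# caption change point the same way in the first call, opposite ways in the second.
aligned = directional_similarity([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
opposed = directional_similarity([1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0])
print(aligned, opposed)
```

A higher value suggests the edit moved the image in the direction the caption change asked for. In practice the embeddings would come from a CLIP image encoder and text encoder rather than hand-written vectors.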
Nice! Maybe we can draw parallels with "guidance scale" here?
Like the following?
> One could consider this measure to be orthogonal to the use of `guidance_scale` in the `DiffusionPipeline`. The higher the `guidance_scale`, the more constrained the generation becomes on the input text prompt.
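
For readers unfamiliar with `guidance_scale`, the sketch below shows the classifier-free guidance combination it usually controls. This is an assumption about how the scale is applied, not code taken from this PR, and `apply_guidance` is a hypothetical helper name: the text-conditioned noise prediction is extrapolated away from the unconditional one, so larger scales push harder toward the prompt.

```python
import numpy as np


def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions.

    guidance_scale = 1 returns the text-conditioned prediction unchanged;
    larger values extrapolate further in the direction of the prompt.
    """
    noise_uncond = np.asarray(noise_uncond, dtype=np.float64)
    noise_text = np.asarray(noise_text, dtype=np.float64)
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)


# Toy predictions (not real model outputs) to show the effect of the scale.
uncond = np.zeros(3)
text = np.ones(3)
for scale in (1.0, 3.0, 7.5):
    print(scale, apply_guidance(uncond, text, scale))
```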
I meant that the method looks similar to the use of guidance scale to encourage the model to go in a direction that improves caption-image similarity. But reading the full post again is going to be more confusing than clarifying, so I'd just leave it as it is now :)
Maybe also mention that our training scripts have built-in TensorBoard and W&B logging, so we recommend you check both qualitative and quantitative measures while training.
Co-authored-by: Pedro <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Pedro <[email protected]>
@pcuenca thanks so much for your comments! I addressed all of them except for #2516 (comment). Let me know your thoughts on the latest changes.
yiyixuxu
left a comment
Awesome! Thank you so much for adding this! I learnt a lot from it.
Co-authored-by: Will Berman <[email protected]> Co-authored-by: YiYi Xu <[email protected]>
Thanks for all the reviews. I have addressed all of them. I figured that readers of the doc might want to try the content hands-on, so I worked on huggingface/notebooks#336 as well. @williamberman @pcuenca could you do one final pass and comment?
williamberman
left a comment
nice!
pcuenca
left a comment
Very cool! I like it a lot, I think it's a great introduction to evaluation methods.
Co-authored-by: Pedro Cuenca <[email protected]>
…#336) * Add files via upload * update notebook as per PR comments huggingface/diffusers#2516
…gingface#2516)
* add a documentation page for evaluating diffusion models.
* fix: checkpoint link.
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <[email protected]>
  Co-authored-by: Kashif Rasul <[email protected]>
* formatting fixes.
* formatting fixes.
* link to partiprompts dataset on hub.
* reflect on Pedro's comments.
  Co-authored-by: Pedro <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>
* reflect on Pedro's comments.
  Co-authored-by: Pedro <[email protected]>
* update mention of FID.
* Apply suggestions from code review
  Co-authored-by: Will Berman <[email protected]>
  Co-authored-by: YiYi Xu <[email protected]>
* minor nit.
* finish edges and add colab notebook.
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>
* run formatting.
* additional feedback.

---------

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Kashif Rasul <[email protected]>
Co-authored-by: Pedro <[email protected]>
Co-authored-by: Will Berman <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
Should be useful to the community.