First of all, thanks for all the work. I appreciate that all these pretrained models are freely available as open source.
I'm a bit confused by the relationship between sampler implementations and model checkpoints. If someone could clarify the following, that'd be much appreciated (and perhaps some of these can become part of the user documentation) -- especially Q2.
Q1. Where can I find the exact noise schedule used during training for each checkpoint? From what I can tell, some of the released Stable Diffusion U-Nets use the original DDPM schedule (specific betas with 1000 steps), but I couldn't find any official documentation for this.
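For concreteness, here is a minimal sketch of the kind of schedule I mean in Q1. It assumes the linear betas from the DDPM paper (1e-4 to 0.02 over 1000 steps); the function names are my own and not from the library, and I don't know whether any released checkpoint actually uses these values.

```python
def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced betas (assumed values from the DDPM paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

# The forward process at discrete step t would then be
#   x_t = sqrt(alpha_bars[t]) * x_0 + sqrt(1 - alpha_bars[t]) * noise
```

A sampler needs exactly these `alpha_bars` (or something equivalent) to be consistent with training, which is why I am asking where they are documented per checkpoint.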
Q2. Continuing from Q1, how are we supposed to use different samplers without the explicit specification of the noise schedule used during training? Depending on the schedule,
Q3. I may be mistaken here, but it seems that the library assumes epsilon prediction with discrete times. Why were these assumptions made? (somewhat related: #1308) A concrete example would be VDM [1], with the U-Net conditioned on log(SNR) instead of
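To illustrate the log(SNR) parameterization I mention in Q3, here is a minimal sketch. It assumes a variance-preserving process, where the signal-to-noise ratio is alpha_bar / (1 - alpha_bar); the helper names are hypothetical and not part of the library.

```python
import math


def log_snr(alpha_bar):
    """log(SNR) for a variance-preserving process: log(alpha_bar / (1 - alpha_bar))."""
    return math.log(alpha_bar) - math.log1p(-alpha_bar)


def alpha_bar_from_log_snr(lam):
    """Inverse mapping: alpha_bar = sigmoid(log SNR), written to avoid overflow."""
    if lam >= 0:
        return 1.0 / (1.0 + math.exp(-lam))
    e = math.exp(lam)
    return e / (1.0 + e)


# Equal signal and noise power corresponds to log(SNR) = 0.
print(log_snr(0.5))
```

Under this assumption, log(SNR) is a continuous quantity that does not depend on how many discrete steps the schedule has, which is what makes the discrete-time assumption feel restrictive to me.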
I'm not very familiar with the codebase, so please correct me if I have misunderstood anything.
[1] Kingma, Diederik, et al. "Variational diffusion models." Advances in neural information processing systems 34 (2021): 21696-21707.