[RFC] Proposal to change API of ConfigMixin and SchedulerMixin. #1268

@patrickvonplaten

There are currently a couple of API problems related to the from_config method. This is a proposal to change the behavior of from_config and to add a new load_config method.

1. Loading schedulers

Currently one has to load schedulers as follows:

from diffusers import DDIMScheduler

scheduler = DDIMScheduler.from_config("CompVis/stable-diffusion-v1-4", subfolder="scheduler")

We've received a lot of feedback (e.g. from @thomwolf, @keturn, @Narsil) that this is weird: why do we pass a model_id to the from_config method? Both transformers and diffusers make heavy use of the from_pretrained(...) method, so it would make much more sense to have the following API instead:

from diffusers import DDIMScheduler

scheduler = DDIMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler")

2. Switching between schedulers.

It's quite obvious that we need functionality that allows the user to easily switch between schedulers.
Based on the feedback on the set_scheduler PR (#1247) as well as feedback from @Narsil, we should probably revisit the design of from_config: instead of accepting file names, it should accept only a config dictionary, namely the one that every object inheriting from ConfigMixin exposes via its .config property. This would also solve #1247 in a pretty clean way, since we could simply recommend that users do the following to switch between schedulers:

# 1. Switch to DDIM
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# 2. Switch to Euler
pipe.scheduler = EulerScheduler.from_config(pipe.scheduler.config)

...

Changing the API of from_config would be a pretty clean and intuitive solution to the set_scheduler problem.
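To make the idea concrete, here is a minimal, self-contained sketch of how a dictionary-only from_config could let one scheduler be built from another scheduler's config. Everything in it is an assumption for illustration: ToyConfigMixin, ToyDDIMScheduler and ToyEulerScheduler are hypothetical stand-ins, not diffusers classes, and filtering the dictionary by the target class's __init__ signature is just one plausible way to handle keys that the target scheduler does not know about.

```python
import inspect


class ToyConfigMixin:
    """Hypothetical stand-in for ConfigMixin (not the diffusers implementation)."""

    def __init__(self, **kwargs):
        # Remember the constructor arguments so they can be exposed via `.config`.
        self._config = dict(kwargs)

    @property
    def config(self):
        # Return a copy so callers cannot mutate the stored config.
        return dict(self._config)

    @classmethod
    def from_config(cls, config):
        # Assumption: keys the target class does not accept are dropped,
        # and keys missing from `config` fall back to the __init__ defaults.
        accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
        return cls(**{k: v for k, v in config.items() if k in accepted})


class ToyDDIMScheduler(ToyConfigMixin):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, eta=0.0):
        super().__init__(
            num_train_timesteps=num_train_timesteps, beta_start=beta_start, eta=eta
        )


class ToyEulerScheduler(ToyConfigMixin):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001):
        super().__init__(num_train_timesteps=num_train_timesteps, beta_start=beta_start)


# Switch from the toy DDIM scheduler to the toy Euler scheduler.
ddim = ToyDDIMScheduler(num_train_timesteps=500, eta=0.5)
euler = ToyEulerScheduler.from_config(ddim.config)
```

In this sketch the shared settings (num_train_timesteps, beta_start) carry over, while the DDIM-only eta is silently ignored. Whether unknown keys should be ignored, warned about, or rejected is a design choice this RFC leaves open.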

3. Loading just the config of models and schedulers

Instead of instantiating the whole model or whole scheduler, users might want to just get the config of the components. This is currently a niche case, but might become more important going forward (e.g. if people want to train diffusion systems from scratch).

Right now people can do the following for models:

random_unet = UNet2DModel.from_config("CompVis/stable-diffusion-v1-4", subfolder="unet")

If we follow the API changes above, this would no longer be possible in the long term, because from_config(...) would accept only a configuration dictionary. I would instead propose adding a new static method called load_config to ConfigMixin. It would allow all classes inheriting from ConfigMixin to load just the config, which can then be passed to from_config. E.g. we could allow the following API:

config = UNet2DModel.load_config("CompVis/stable-diffusion-v1-4", subfolder="unet")

model = UNet2DModel.from_config(config)

This API has the following advantage: it clearly separates the UNet object (with all its random weights) from the configuration dictionary. Calling load_config would only load the config.json of the model and allow the user to quickly inspect a given config without having to initialize any random weights. The config can then be passed to the from_config(...) method to randomly initialize the model.
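The split between load_config and from_config can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyConfigMixin and ToyScheduler are hypothetical classes, the sketch reads config.json from a local directory only (downloading from the Hub is left out), and composing from_pretrained out of the two other methods is my reading of the proposal rather than a settled implementation.

```python
import json
import os
import tempfile

CONFIG_NAME = "config.json"  # file name mentioned above; assumed to be fixed here


class ToyConfigMixin:
    """Hypothetical stand-in for ConfigMixin (local files only, no Hub access)."""

    @staticmethod
    def load_config(directory, subfolder=None):
        # Only read and return the config dictionary; nothing is instantiated.
        folder = os.path.join(directory, subfolder) if subfolder else directory
        with open(os.path.join(folder, CONFIG_NAME), encoding="utf-8") as f:
            return json.load(f)

    @classmethod
    def from_config(cls, config):
        # Accepts a dictionary only, never a path or model id.
        return cls(**config)

    @classmethod
    def from_pretrained(cls, directory, subfolder=None):
        # Assumed composition: load the config, then build the object from it.
        return cls.from_config(cls.load_config(directory, subfolder=subfolder))


class ToyScheduler(ToyConfigMixin):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001):
        self.num_train_timesteps = num_train_timesteps
        self.beta_start = beta_start


# Write a small config to a temporary directory that mimics a repo layout.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "scheduler"))
    with open(os.path.join(root, "scheduler", CONFIG_NAME), "w", encoding="utf-8") as f:
        json.dump({"num_train_timesteps": 250, "beta_start": 0.001}, f)

    # Inspect the config without creating any object.
    config = ToyScheduler.load_config(root, subfolder="scheduler")
    # Build the object from the dictionary.
    scheduler = ToyScheduler.from_config(config)
    # One-step equivalent of the two calls above.
    same_scheduler = ToyScheduler.from_pretrained(root, subfolder="scheduler")
```

For schedulers, which carry no weights, from_pretrained reduces to exactly this composition in the sketch. For models, from_pretrained would additionally load the weights, which is why keeping load_config separate is useful for inspecting configs cheaply.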

This RFC is heavily influenced by the very nice feedback from @keturn and also copies @keturn's suggestions here: #1247 (comment)

The more I think about it, the more I'm in favor of changing the from_config(...) API, adding a new load_config(...) method to ConfigMixin, and adding a from_pretrained(...) method to SchedulerMixin to nicely cover all three use cases.

What are your thoughts on this? @patil-suraj @anton-l @pcuenca @Narsil @keturn
