
Conversation

@realimposter

What does this PR do?

Adds ControlNet reference-only functionality to the ControlNet Img2Img pipeline. I created this by combining stable_diffusion_controlnet_reference.py into pipeline_controlnet_img2img.py.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
Contributor

Hey @realimposter,

What is a common use case of this? Also how is this different from #3435?

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Jul 26, 2023
@patrickvonplaten
Contributor

Also related to: #4257

@yamkz

yamkz commented Aug 13, 2023

I really desire the SDXL-supported 'img2img+ControlNet+reference only pipeline' and 'inpainting+ControlNet+reference only pipeline.' I would be very happy if they were available.

#4589

@stilletto

stilletto commented Sep 9, 2023

Hey @realimposter,

What is a common use case of this? Also how is this different from #3435?

It can be useful for face photo editing at a low noise strength.

@breengles

breengles commented Oct 16, 2023

Hey @realimposter,

What is a common use case of this? Also how is this different from #3435?

Hey @patrickvonplaten!
I am also trying to get it working (i2i adapts more or less straightforwardly). The main difference is that the pipeline you mentioned does not support image input (though, as I said, it can be readily adapted).

Currently, I am struggling to adapt reference guidance to the inpaint pipelines for the same reason -- it would allow more ways to edit images. By the way, these combinations of inpainting, reference, and ControlNets work perfectly fine in A1111, so it would be useful to have them in diffusers as well (perhaps in the community examples). So I tried the following, but haven't gotten any reasonable results:

        # 10. Denoising loop
        num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order
        with self.progress_bar(total=num_inference_steps) as progress_bar:
            for i, t in enumerate(timesteps):
                # expand the latents if we are doing classifier-free guidance
                latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
                latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)

                # reference-only part: re-noise the reference latents to the current timestep
                noise = randn_tensor(
                    ref_image_latents.shape,
                    generator=generator,
                    device=device,
                    dtype=ref_image_latents.dtype,
                )
                ref_xt = self.scheduler.add_noise(ref_image_latents, noise, t.reshape(1))
                ref_xt = torch.cat([ref_xt] * 2) if do_classifier_free_guidance else ref_xt
                ref_xt = self.scheduler.scale_model_input(ref_xt, t)

                if num_channels_unet == 9:
                    # concat latents, mask, masked_image_latents in the channel dimension
                    latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)

                    if do_classifier_free_guidance:
                        ref_image_latents_inject = torch.cat([ref_image_latents] * 2)
                    else:
                        ref_image_latents_inject = ref_image_latents

                    ref_xt = torch.cat([ref_xt, empty_mask, ref_image_latents_inject], dim=1)

                # reference pass: the hacked attention/norm layers bank the hidden states
                MODE = "write"
                self.unet(
                    ref_xt,
                    t,
                    encoder_hidden_states=prompt_embeds,
                    cross_attention_kwargs=cross_attention_kwargs,
                    return_dict=False,
                )

                # main pass: predict the noise residual using the banked states
                MODE = "read"
                noise_pred = self.unet(
                    latent_model_input,
                    t,
                    encoder_hidden_states=prompt_embeds,
                    cross_attention_kwargs=cross_attention_kwargs,
                    return_dict=False,
                )[0]
...
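For context, the MODE switch above only has an effect because the reference pipelines patch the UNet's attention forwards so that the "write" pass banks hidden states and the "read" pass attends over them. A toy, self-contained sketch of that idea -- names and shapes are illustrative, not the actual diffusers implementation:

```python
import torch

MODE = "write"  # module-level switch, as in the reference pipelines

class ToyRefAttention(torch.nn.Module):
    """Toy stand-in for a hacked self-attention block."""

    def __init__(self, dim):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.bank = []

    def forward(self, x):
        if MODE == "write":
            # reference pass: stash hidden states for later reuse
            self.bank.append(x.detach().clone())
            out, _ = self.attn(x, x, x)
        else:
            # main pass: attend over current + banked reference states
            kv = torch.cat([x] + self.bank, dim=1)
            out, _ = self.attn(x, kv, kv)
            self.bank.clear()
        return out

block = ToyRefAttention(dim=8)
ref = torch.randn(1, 16, 8)  # reference hidden states (batch, seq, dim)
lat = torch.randn(1, 16, 8)  # main hidden states

block(ref)        # MODE == "write": banks the reference states
MODE = "read"
out = block(lat)  # attends over 32 key/value tokens, output stays (1, 16, 8)
assert out.shape == (1, 16, 8)
```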

[Attached images: closeup-pexels-abed-albaset-alhasan-7814598 (input/reference) and generated result]

I must say that everything works fine with non-inpainting weights (4-channel input), so I think something is wrong with how I am preparing ref_xt (though it should follow A1111's flow).
I would greatly appreciate any advice 🤗

upd: it seems the artefacts appear when AdaIN guidance is enabled (with and without the attention part)
upd x2: I used empty_mask as a zeros tensor (as in A1111), but the results seem much better if a tensor of ones is used instead
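For concreteness, a minimal shape-only sketch of the 9-channel assembly described above, with empty_mask as ones per the upd x2 observation (latent sizes are illustrative):

```python
import torch

B, H, W = 1, 64, 64  # illustrative latent resolution (512x512 image / 8)

ref_xt = torch.randn(B, 4, H, W)             # noised reference latents
ref_image_latents = torch.randn(B, 4, H, W)  # clean reference latents

# A1111 uses zeros here; ones seemed to give much better results (upd x2)
empty_mask = torch.ones(B, 1, H, W)

# 4 (latents) + 1 (mask) + 4 (masked-image latents) = 9 channels,
# matching the inpainting UNet's conv_in
ref_xt_9ch = torch.cat([ref_xt, empty_mask, ref_image_latents], dim=1)
assert ref_xt_9ch.shape == (B, 9, H, W)
```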

@patrickvonplaten
Contributor

@DN6 can you take a look here?

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@patrickvonplaten
Contributor

@DN6 gentle ping here

@DN6
Collaborator

DN6 commented Nov 22, 2023

@realimposter Could we move this into the community pipelines please?

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Jan 9, 2024
