@rockerBOO rockerBOO commented Jan 2, 2023

Mostly a copy of #22, though I walked through the HF implementation while porting it. Requires the same changes to the Stable Diffusion example to run.

  • Also skipped thresholding.
  • Moved the scheduler in the example into the num-samples loop, because with more than one sample it would overflow. Each sample iteration seems to need a new scheduler. Commented this scheduler into the example.
  • Had to shallow_clone a tensor twice due to ownership.
PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128 RUST_BACKTRACE=1 cargo run --example stable-diffusion --features clap \
     -- --final-image "out/ghosts/ghosts.png" --n-steps 10 --num-samples 4 --cpu clip --sd-version v1-5 --unet-weights ./data/unet-fp16.ot --prompt "ghost in a forest at night, glowing, radiant, angelic, cute dali trending on artstation"
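
The scheduler change in the list above can be illustrated with a toy model. This is only a sketch under an assumption: that the overflow comes from per-run state (here a step cursor) carried over from one sample to the next. `ToyScheduler` and both helper functions are hypothetical names for illustration and are not part of the diffusers-rs API.

```rust
// Hypothetical stand-in for a stateful scheduler; not the diffusers-rs API.
struct ToyScheduler {
    orders: Vec<usize>, // one entry per inference step
    cursor: usize,      // advances on every call to `step`
}

impl ToyScheduler {
    fn new(n_steps: usize) -> Self {
        ToyScheduler { orders: vec![1; n_steps], cursor: 0 }
    }

    // Returns the order for the current step, or None once the schedule is used up.
    fn step(&mut self) -> Option<usize> {
        let order = self.orders.get(self.cursor).copied()?;
        self.cursor += 1;
        Some(order)
    }
}

// Counts successful steps when one scheduler is shared across all samples.
fn steps_with_shared_scheduler(num_samples: usize, n_steps: usize) -> usize {
    let mut scheduler = ToyScheduler::new(n_steps);
    let mut ok = 0;
    for _ in 0..num_samples {
        for _ in 0..n_steps {
            if scheduler.step().is_some() {
                ok += 1;
            }
        }
    }
    ok
}

// Counts successful steps when each sample builds its own scheduler.
fn steps_with_fresh_scheduler(num_samples: usize, n_steps: usize) -> usize {
    let mut ok = 0;
    for _ in 0..num_samples {
        let mut scheduler = ToyScheduler::new(n_steps);
        for _ in 0..n_steps {
            if scheduler.step().is_some() {
                ok += 1;
            }
        }
    }
    ok
}

fn main() {
    println!("shared: {}", steps_with_shared_scheduler(4, 10));
    println!("fresh:  {}", steps_with_fresh_scheduler(4, 10));
}
```

In this toy model the shared scheduler runs out of schedule entries after the first sample, while a scheduler created inside the sample loop completes every step of every sample.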

[image: ghosts 1]

@LaurentMazare
Owner

Looks nice, thanks. How much of the code is actually duplicated from the multi-step bit? If it's almost all of it, maybe it could be an option of the multi-step version so as to avoid the duplication?

@rockerBOO rockerBOO marked this pull request as draft January 2, 2023 19:53
@rockerBOO
Author

The main differences are in

  • *_dpm_solver_third_order_update
  • step

HF keeps them separate, and other implementations seem to do the same. I am not confident I could merge them correctly. There are some types and enums that could be moved out and unified, though.

I am working on some further changes in the step function that I didn't port over fully.

@rockerBOO
Author

Updated to use the configured solver_order as the order.

It works, except for a bug where get_order_list doesn't produce the correct number of results for 15 steps (it is fine for 10 and 16). I triple-checked my code and double-checked it against the HF version, but I'm not sure why the list doesn't line up with the number of steps. I added a bunch of tests to cover the various conditions.

Using 16 steps does seem to work correctly.

@rockerBOO
Author

Refactored out the common bits into src/schedulers/dpmsolver.rs. Still have the previous bug with get_order_list.

@rockerBOO rockerBOO force-pushed the rockerBOO/dpm-solver-singlestep-scheduler branch from 880290b to 223c21f Compare January 8, 2023 04:43
@rockerBOO
Author

OK, good to go. The bug was that lower_order_final should default to true in the initializer. Without that, the order list did not contain enough items and the run failed on the last step. I also ran the included tests against the Python version to check the get_order_list functionality, and it behaves the same.
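
To make the effect of lower_order_final concrete, here is a minimal sketch of an order-list builder. It is modeled on how the HF single-step scheduler appears to construct its list and is not the code from this PR; the free-function signature and the choice of solver_order = 2 in the usage are assumptions picked to match the symptoms reported in this thread (10 and 16 steps fine, 15 short).

```rust
// Sketch only: builds one solver order per inference step.
// The signature is hypothetical; in the PR this logic lives on the scheduler.
fn get_order_list(steps: usize, solver_order: usize, lower_order_final: bool) -> Vec<usize> {
    match (solver_order, lower_order_final) {
        (1, _) => vec![1; steps],
        // Without lower_order_final, an odd step count leaves the list one item short.
        (2, false) => [1usize, 2].repeat(steps / 2),
        (2, true) => {
            let mut orders = [1usize, 2].repeat(steps / 2);
            if steps % 2 == 1 {
                orders.push(1); // finish with a first-order step
            }
            orders
        }
        (3, false) => [1usize, 2, 3].repeat(steps / 3),
        (3, true) => match steps % 3 {
            0 => {
                // Swap the last full [1, 2, 3] group for lower-order steps.
                let mut orders = [1usize, 2, 3].repeat((steps / 3).saturating_sub(1));
                orders.extend_from_slice(&[1, 2, 1]);
                orders
            }
            1 => {
                let mut orders = [1usize, 2, 3].repeat(steps / 3);
                orders.push(1);
                orders
            }
            _ => {
                let mut orders = [1usize, 2, 3].repeat(steps / 3);
                orders.extend_from_slice(&[1, 2]);
                orders
            }
        },
        _ => panic!("solver_order must be 1, 2, or 3"),
    }
}

fn main() {
    for steps in [10, 15, 16] {
        let without = get_order_list(steps, 2, false).len();
        let with = get_order_list(steps, 2, true).len();
        println!("steps={steps}: lower_order_final=false -> {without}, true -> {with}");
    }
}
```

Under these assumptions, 15 steps with lower_order_final disabled yields only 14 entries, so the final step has no order to look up, while even step counts such as 10 and 16 are unaffected.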

cargo run --example stable-diffusion --features clap \
     -- --final-image "out/ghosts/ghosts.png" --seed 2948294 --n-steps 15 --num-samples 4 --cpu clip --sd-version v1-5 --unet-weights ./data/unet-fp16.ot --prompt "ghost in a forest at night, glowing, radiant, angelic, cute dali trending on artstation"

[image: ghosts 4]

cargo run --example stable-diffusion --features clap \
     -- --final-image "out/ghosts/ghosts.png" --seed 29293 --n-steps 11 --num-samples 4 --cpu clip --sd-version v1-5 --unet-weights ./data/unet-fp16.ot --prompt "ghost in a forest at night, glowing, radiant, angelic, cute. trending on artstation illustation beautiful amazing boo cute"

[image: ghosts 3]

@rockerBOO rockerBOO marked this pull request as ready for review January 8, 2023 06:15
@rockerBOO rockerBOO marked this pull request as draft January 9, 2023 19:48
@rockerBOO
Author

I think I need to roll back the trait changes, as we may be replacing the trait with a more generic Scheduler trait.

@rockerBOO rockerBOO force-pushed the rockerBOO/dpm-solver-singlestep-scheduler branch from b8271d9 to 41e0286 Compare January 14, 2023 21:51
@rockerBOO
Author

Reverted the trait changes, so this should be good to go now.

@rockerBOO rockerBOO marked this pull request as ready for review January 14, 2023 21:55