Combining Evolutionary Computing with Diffusion Models
- 🎨 Aesthetics Maximization/Minimization using the LAION Aesthetics Predictor V2
- 📊 Multi-Objective Optimization with CLIP-IQA metrics
- 🛡️ Evading AI-Image Detection by optimizing against a fine-tuned SDXL AI-Image Detector
- 🧭 Navigating the CLIP-Score Landscape for Prompt-Matching
- 🔊 Optimizing Audiobox Aesthetics from Meta
| Notebook | Link |
|---|---|
| Genetic Algorithm | |
| Island Genetic Algorithm | |
| NSGA | |
Image results will be saved in your Google Drive in the folder `evolutionary`. Each generation creates a new subfolder where its images are saved. You can change the folders in the notebook.
Sometimes Google Colab causes dependency problems that break the notebook. If you have any issues executing this in a Colab environment, please do not hesitate to create a new issue.
Using a venv is optional but recommended. Clone the repo or download the .zip, then install the dependencies via:

`pip install -e ".[all]"`
Now you are ready to go with the notebooks or custom code. CUDA and MPS are supported.
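For custom code, a typical device check could look like the following; this is a minimal sketch using plain PyTorch, independent of this library:

```python
import torch

# Pick the best available accelerator, falling back to the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")
```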
Optimizing the aesthetics predictor as a maximization problem, the algorithm reached a maximum aesthetics score of 8.67. This score is higher than that of the examples in the real LAION English Subset dataset, whose limit is marked by the red line. A wide variety of prompts (inspired by the Parti prompts) was used for the initial population.
ga_200gen_100pop_aesthetic.mp4
Parameters:

```python
population_size = 100
num_generations = 200
batch_size = 1
elitism = 1

creator = SDXLPromptEmbeddingImageCreator(pipeline_factory=setup_pipeline, batch_size=batch_size, inference_steps=3)
evaluator = AestheticsImageEvaluator()
crossover = PooledArithmeticCrossover(0.5, 0.5)
# Separate mutation settings for the prompt embeddings and the pooled
# embeddings, since the two spaces have very different value ranges.
mutation_arguments = UniformGaussianMutatorArguments(mutation_rate=0.1, mutation_strength=2, clamp_range=(-900, 900))
mutation_arguments_pooled = UniformGaussianMutatorArguments(mutation_rate=0.1, mutation_strength=0.3, clamp_range=(-8, 8))
mutator = PooledUniformGaussianMutator(mutation_arguments, mutation_arguments_pooled)
selector = TournamentSelector(tournament_size=3)
```
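For orientation, the components above could be wired together roughly as follows. This is a hypothetical sketch: the `GeneticAlgorithm` class name, its parameter names, and the `run()` method are assumptions for illustration; see the notebooks for the actual API.

```python
# Hypothetical wiring of the components defined above; names and
# signatures are assumptions, not the library's confirmed interface.
ga = GeneticAlgorithm(
    population_size=population_size,
    num_generations=num_generations,
    solution_creator=creator,
    evaluator=evaluator,
    selector=selector,
    crossover=crossover,
    mutator=mutator,
    elitism_count=elitism,
)
best_solution = ga.run()  # fittest prompt embedding found across all generations
```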
An Island GA is performed by creating random embeddings and mixing them with artist embeddings to obtain mixtures of styles and new ideas.
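Conceptually, each island evolves its own sub-population, with individuals occasionally migrating between islands, and the initial population blends random embeddings with artist embeddings. A minimal sketch of such a blend in plain torch; the simple linear interpolation and its default weight are illustrative assumptions, not the library's exact initialization:

```python
import torch

def blend_embeddings(random_emb: torch.Tensor,
                     artist_emb: torch.Tensor,
                     alpha: float = 0.5) -> torch.Tensor:
    # Linear interpolation between the two embeddings: alpha=1.0 keeps
    # only the random embedding, alpha=0.0 only the artist embedding.
    return alpha * random_emb + (1.0 - alpha) * artist_emb
```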
Starting from noisy random samples, the sounds evolve toward better fitness, using the sum of all fitness criteria that Audiobox Aesthetics offers.
example_fitness_14.mp4
example_fitness_31.mp4
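Summing the criteria could look like the sketch below. Audiobox Aesthetics predicts four axes: Content Enjoyment (CE), Content Usefulness (CU), Production Complexity (PC), and Production Quality (PQ); the dictionary key names here are assumptions about the predictor's output format:

```python
def summed_audiobox_fitness(scores: dict) -> float:
    # Sum the four Audiobox Aesthetics axes into one fitness value.
    # The key names are assumptions about the predictor's output format.
    return sum(scores[axis] for axis in ("CE", "CU", "PC", "PQ"))
```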
More detailed results can be found in a separate repository dedicated to the results of the experiments: https://github.com/malthee/evolutionary-diffusion-results
- AestheticsImageEvaluator: Uses the LAION Aesthetics Predictor V2. Blog: https://laion.ai/blog/laion-aesthetics/
- AestheticsPredictorV25ImageEvaluator: Uses the Aesthetic Predictor V2.5 from discus0434
- CLIPScoreEvaluator: Uses the torchmetrics implementation of CLIP-Score
- (Single/Multi)CLIPIQAEvaluator: Uses the torchmetrics implementation of CLIP Image Quality Assessment
- AIDetectionImageEvaluator: Uses the original version from HuggingFace, or the one fine-tuned for SDXL-generated images
- AudioboxAestheticsEvaluator: Uses Audiobox Aesthetics from Meta
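As an illustration, an evaluator from the list above could be used like this. The `evaluate` method name and the PIL-image argument are assumptions about the interface, not its confirmed form:

```python
from PIL import Image

evaluator = AestheticsImageEvaluator()  # imported from this package
image = Image.open("example.png")       # any generated image

# Assumed call for illustration; the predictor scores roughly 0-10,
# where higher means "more aesthetic".
score = evaluator.evaluate(image)
print(f"Aesthetics score: {score:.2f}")
```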
Currently supported creators working in the prompt-embedding space:
- SDXLPromptEmbeddingImageCreator: Supports the SDXL pipeline and creates both prompt- and pooled-prompt-embeddings.
- SDPromptEmbeddingImageCreator: Only uses prompt-embeddings; it is faster but produces lower-quality results than SDXL.
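To make the two embedding spaces concrete: with diffusers, SDXL's dual embeddings can be obtained as below. This sketch assumes a recent diffusers version and is independent of the creators above:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)

# SDXL uses token-level prompt embeddings plus a pooled embedding.
prompt_embeds, _, pooled_prompt_embeds, _ = pipe.encode_prompt(
    prompt="a misty forest at dawn"
)
print(prompt_embeds.shape)         # (1, 77, 2048)
print(pooled_prompt_embeds.shape)  # (1, 1280)
```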
Only AudioLDM is supported, because it operates directly on the CLAP embedding space, which is well suited for this kind of operation. Other embedding spaces (the T5 encoder, for example) have been shown not to work well with evolutionary operations.
- AudioLDMSoundCreator: Works with any AudioLDMPipeline; the default is audioldm-l-full
There are multiple notebooks exploring the speed and quality of models for generation and fitness evaluation. These notebooks also allow for simple inference so that any model can be tried out easily.
- diffusion_model_comparison: tries out different diffusion models with varying arguments (inference steps, batch size) to find the optimal model for image generation in an evolutionary context (generation speed & quality)
- clip_evaluators: uses the torchmetrics CLIPScore and CLIP-IQA metrics; CLIPScore could define the fitness for "prompt fulfillment" or "image alignment", while CLIP-IQA offers many possible metrics like "quality", "brightness", "happiness"... (see the sketch after this list)
- ai_detection_evaluator: uses a pre-trained model for AI-image detection; this could be a fitness criterion to minimize the "AI-likeness" of images
- aesthetics_evaluator: uses a pre-trained model from the maintainers of the LAION image dataset, which scores an image from 0 to 10 depending on how "aesthetic" it is; could be used as a maximization criterion for image fitness
- clamp_range: tests the usual prompt-embedding min and max values for different models so that a clamp range can be set in the mutator, for example; uses the Parti prompts
- crossover_mutation_experiments: tests different crossover and mutation strategies to see how they behave in the prompt-embedding space
- embedding_relations: experiments with TensorBoard, integrating it into our embedding model
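For the clip_evaluators notebook, the underlying torchmetrics calls look roughly like this; a runnable sketch (model weights are downloaded on first use), with a random tensor standing in for a generated image:

```python
import torch
from torchmetrics.multimodal import CLIPScore, CLIPImageQualityAssessment

# Random stand-in for a generated image; CLIPScore expects pixel
# values in [0, 255].
image = torch.randint(0, 255, (3, 224, 224))

clip_score = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")
print(clip_score(image, "a photo of a cat"))  # prompt-fulfillment fitness

# CLIP-IQA scores images against built-in prompts such as "quality"
# or "brightness"; inputs are floats in [0, 1] by default.
clip_iqa = CLIPImageQualityAssessment(prompts=("quality", "brightness"))
print(clip_iqa(image.unsqueeze(0).float() / 255.0))
```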