Describe the bug
Hello!
I was trying to fine-tune Anything 3.0, and while the script was generating the class images I got a precision mismatch error (see the log below). I have already had float16 issues on my GPU, so I set --mixed_precision to 'no'. That worked with pre-generated class images, but not when the pipeline generates them inside the train_dreambooth.py script.
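For context, pre-generating the class images myself in full precision worked. It looked roughly like this sketch (the paths and prompt match my command below; the file naming is made up):

import torch
from pathlib import Path
from diffusers import StableDiffusionPipeline

MODEL_DIR = "/home/USER/kml/models1"                      # local Anything-3.0 (diffusers branch)
CLASS_DIR = "/home/USER/kml/datasets/objects/alhaitham1"  # class image dir passed to train_dreambooth.py
Path(CLASS_DIR).mkdir(parents=True, exist_ok=True)

# Load everything in full precision, since my GPU misbehaves in half precision.
pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_DIR,
    torch_dtype=torch.float32,
    safety_checker=None,
).to("cuda")

prompt = (
    "1boy, medium hair, grey hair, green eyes, bishounen, colorful, "
    "autumn, green leaves, detailed fantasy clothes, lighting, blue sky"
)
for i in range(200):  # matches --num_class_images=200
    pipe(prompt).images[0].save(f"{CLASS_DIR}/class_{i:04d}.png")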
I have already found the place in the code that causes the issue and I will open a pull request today.
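As far as I can tell, when train_dreambooth.py builds the pipeline for class-image generation it chooses the dtype from the device type alone (effectively fp16 on any CUDA device), so --mixed_precision 'no' is ignored at that step. Below is a minimal sketch of the direction my fix will take (the helper name is mine; the actual PR may be structured differently):

import torch

def prior_generation_dtype(device_type: str, mixed_precision: str) -> torch.dtype:
    # Choose the class-image pipeline dtype from --mixed_precision
    # instead of hard-coding fp16 on every CUDA device.
    if device_type == "cuda" and mixed_precision == "fp16":
        return torch.float16
    if device_type == "cuda" and mixed_precision == "bf16":
        return torch.bfloat16
    # --mixed_precision 'no' (or CPU): stay in fp32 so GPUs that
    # break in half precision can still generate class images.
    return torch.float32

assert prior_generation_dtype("cuda", "no") is torch.float32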
Reproduction
- Have a GPU that has trouble with half precision
- Download the diffusers branch of Anything-3.0 so that the diffusers DreamBooth pipeline can load it:
  git clone --depth=1 -b diffusers https://huggingface.co/Linaqruf/anything-v3.0
- Start the DreamBooth script with --with_prior_preservation
- Get "RuntimeError: expected scalar type Half but found Float" when generating class images
The exact launch command:
export MODEL_NAME="/home/{USER}/kml/models1"
export INSTANCE_DIR="/home/{USER}/kml/datasets/objects/alhaitham"
export CLASS_DIR="/home/{USER}/kml/datasets/objects/alhaitham1"
export OUTPUT_DIR="/home/{USER}/kml/models2"
accelerate launch train_dreambooth.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--instance_data_dir=$INSTANCE_DIR \
--class_data_dir=$CLASS_DIR \
--output_dir=$OUTPUT_DIR \
--with_prior_preservation --prior_loss_weight=1.0 \
--instance_prompt="character portrait of alhaitham" \
--class_prompt="1boy, medium hair, grey hair, green eyes, bishounen, colorful, autumn, green leaves, detailed fantasy clothes, lighting, blue sky" \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=1e-6 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--num_class_images=200 \
--max_train_steps=800 \
--mixed_precision 'no' \
--train_text_encoder
Logs
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Generating class images: 0%| | 0/50 [00:00<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/{USER}/kml/diffusers/examples/dreambooth/train_dreambooth.py:779 in <module> │
│ │
│ 776 │
│ 777 if __name__ == "__main__": │
│ 778 │ args = parse_args() │
│ ❱ 779 │ main(args) │
│ 780 │
│ │
│ /home/{USER}/kml/diffusers/examples/dreambooth/train_dreambooth.py:456 in main │
│ │
│ 453 │ │ │ for example in tqdm( │
│ 454 │ │ │ │ sample_dataloader, desc="Generating class images", disable=not accelerat │
│ 455 │ │ │ ): │
│ ❱ 456 │ │ │ │ images = pipeline(example["prompt"]).images │
│ 457 │ │ │ │ │
│ 458 │ │ │ │ for i, image in enumerate(images): │
│ 459 │ │ │ │ │ hash_image = hashlib.sha1(image.tobytes()).hexdigest() │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py:27 in │
│ decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ ❱ 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ /home/{USER}/kml/src/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusi │
│ on.py:496 in __call__ │
│ │
│ 493 │ │ do_classifier_free_guidance = guidance_scale > 1.0 │
│ 494 │ │ │
│ 495 │ │ # 3. Encode input prompt │
│ ❱ 496 │ │ text_embeddings = self._encode_prompt( │
│ 497 │ │ │ prompt, device, num_images_per_prompt, do_classifier_free_guidance, negative │
│ 498 │ │ ) │
│ 499 │
│ │
│ /home/{USER}/kml/src/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusi │
│ on.py:265 in _encode_prompt │
│ │
│ 262 │ │ else: │
│ 263 │ │ │ attention_mask = None │
│ 264 │ │ │
│ ❱ 265 │ │ text_embeddings = self.text_encoder( │
│ 266 │ │ │ text_input_ids.to(device), │
│ 267 │ │ │ attention_mask=attention_mask, │
│ 268 │ │ ) │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py:722 │
│ in forward │
│ │
│ 719 │ │ >>> last_hidden_state = outputs.last_hidden_state │
│ 720 │ │ >>> pooled_output = outputs.pooler_output # pooled (EOS token) states │
│ 721 │ │ """ │
│ ❱ 722 │ │ return self.text_model( │
│ 723 │ │ │ input_ids=input_ids, │
│ 724 │ │ │ attention_mask=attention_mask, │
│ 725 │ │ │ position_ids=position_ids, │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py:643 │
│ in forward │
│ │
│ 640 │ │ │ # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len] │
│ 641 │ │ │ attention_mask = _expand_mask(attention_mask, hidden_states.dtype) │
│ 642 │ │ │
│ ❱ 643 │ │ encoder_outputs = self.encoder( │
│ 644 │ │ │ inputs_embeds=hidden_states, │
│ 645 │ │ │ attention_mask=attention_mask, │
│ 646 │ │ │ causal_attention_mask=causal_attention_mask, │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py:574 │
│ in forward │
│ │
│ 571 │ │ │ │ │ causal_attention_mask, │
│ 572 │ │ │ │ ) │
│ 573 │ │ │ else: │
│ ❱ 574 │ │ │ │ layer_outputs = encoder_layer( │
│ 575 │ │ │ │ │ hidden_states, │
│ 576 │ │ │ │ │ attention_mask, │
│ 577 │ │ │ │ │ causal_attention_mask, │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py:317 │
│ in forward │
│ │
│ 314 │ │ residual = hidden_states │
│ 315 │ │ │
│ 316 │ │ hidden_states = self.layer_norm1(hidden_states) │
│ ❱ 317 │ │ hidden_states, attn_weights = self.self_attn( │
│ 318 │ │ │ hidden_states=hidden_states, │
│ 319 │ │ │ attention_mask=attention_mask, │
│ 320 │ │ │ causal_attention_mask=causal_attention_mask, │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/{USER}/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py:257 │
│ in forward │
│ │
│ 254 │ │ │
│ 255 │ │ attn_probs = nn.functional.dropout(attn_weights, p=self.dropout, training=self.t │
│ 256 │ │ │
│ ❱ 257 │ │ attn_output = torch.bmm(attn_probs, value_states) │
│ 258 │ │ │
│ 259 │ │ if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim): │
│ 260 │ │ │ raise ValueError( │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: expected scalar type Half but found Float
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/{USER}/.local/bin/accelerate:33 in <module> │
│ │
│ 30 │
│ 31 if __name__ == '__main__': │
│ 32 │ sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0]) │
│ ❱ 33 │ sys.exit(load_entry_point('accelerate', 'console_scripts', 'accelerate')()) │
│ 34 │
│ │
│ /home/{USER}/kml/src/accelerate/src/accelerate/commands/accelerate_cli.py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /home/{USER}/kml/src/accelerate/src/accelerate/commands/launch.py:1071 in launch_command │
│ │
│ 1068 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 1069 │ │ sagemaker_launcher(defaults, args) │
│ 1070 │ else: │
│ ❱ 1071 │ │ simple_launcher(args) │
│ 1072 │
│ 1073 │
│ 1074 def main(): │
│ │
│ /home/{USER}/kml/src/accelerate/src/accelerate/commands/launch.py:547 in simple_launcher │
│ │
│ 544 │ process.wait() │
│ 545 │ if process.returncode != 0: │
│ 546 │ │ if not args.quiet: │
│ ❱ 547 │ │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 548 │ │ else: │
│ 549 │ │ │ sys.exit(1) │
│ 550 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py',
'--pretrained_model_name_or_path=/home/{USER}/kml/models1',
'--instance_data_dir=/home/{USER}/kml/datasets/objects/alhaitham',
'--class_data_dir=/home/{USER}/kml/datasets/objects/alhaitham1', '--output_dir=/home/{USER}/kml/models2',
'--with_prior_preservation', '--prior_loss_weight=1.0', '--instance_prompt=character portrait of alhaitham',
'--class_prompt=1boy, medium hair, grey hair, green eyes, bishounen, colorful, autumn, green leaves, detailed fantasy
clothes, lighting, blue sky', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1',
'--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200',
'--max_train_steps=800', '--mixed_precision', 'no', '--train_text_encoder']' returned non-zero exit status 1.
System Info
Ubuntu 20.04.5 LTS, diffusers installed yesterday from commit b6d4702, Python 3.8.10