
Wan2.1-T2V-14B: LoRA training with images raises an error #610

Open
@hgx

Description

metadata.csv

video,prompt
test.png,"dog"
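
For context, the metadata file is a plain two-column CSV with a single image row. A quick sanity check that it parses as intended (a sketch using the standard csv module, not necessarily how train.py reads it):

    import csv

    # Expecting one row mapping the image file to its prompt.
    with open("data/wan/train_datav6/metadata.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    print(rows)    # [{'video': 'test.png', 'prompt': 'dog'}]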

accelerate launch examples/wanvideo/model_training/train.py \
  --dataset_base_path data/wan/train_datav6 \
  --dataset_metadata_path data/wan/train_datav6/metadata.csv \
  --height 512 \
  --width 512 \
  --num_frames 1 \
  --dataset_repeat 100 \
  --model_paths '[
    [
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00001-of-00006.safetensors",
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00002-of-00006.safetensors",
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00003-of-00006.safetensors",
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00004-of-00006.safetensors",
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00005-of-00006.safetensors",
        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00006-of-00006.safetensors"
    ],
    "models/Wan-AI/Wan2.1-T2V-14B/models_t5_umt5-xxl-enc-bf16.pth",
    "models/Wan-AI/Wan2.1-T2V-14B/Wan2.1_VAE.pth"
]' \
  --learning_rate 1e-4 \
  --num_epochs 10 \
  --remove_prefix_in_ckpt "pipe.dit." \
  --output_path "./models/train/Wan2.1-T2V-14B_lora" \
  --lora_base_model "dit" \
  --lora_target_modules "q,k,v,o,ffn.0,ffn.2" \
  --lora_rank 32
(wan) root@iv-ydxqk42iv4qbxys524xd:~/DiffSynth-Studio# ./Wan2.1-T2V-14B_train.sh 
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Height and width are fixed. Setting `dynamic_resolution` to False.
Loading models from: ['models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00001-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00002-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00003-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00004-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00005-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00006-of-00006.safetensors']
    model_name: wan_video_dit model_class: WanModel
        This model is initialized with extra kwargs: {'has_image_input': False, 'patch_size': [1, 2, 2], 'in_dim': 16, 'dim': 5120, 'ffn_dim': 13824, 'freq_dim': 256, 'text_dim': 4096, 'out_dim': 16, 'num_heads': 40, 'num_layers': 40, 'eps': 1e-06}
    The following models are loaded: ['wan_video_dit'].
Loading models from: models/Wan-AI/Wan2.1-T2V-14B/models_t5_umt5-xxl-enc-bf16.pth
    model_name: wan_video_text_encoder model_class: WanTextEncoder
    The following models are loaded: ['wan_video_text_encoder'].
Loading models from: models/Wan-AI/Wan2.1-T2V-14B/Wan2.1_VAE.pth
    model_name: wan_video_vae model_class: WanVideoVAE
    The following models are loaded: ['wan_video_vae'].
Using wan_video_text_encoder from models/Wan-AI/Wan2.1-T2V-14B/models_t5_umt5-xxl-enc-bf16.pth.
Using wan_video_dit from ['models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00001-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00002-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00003-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00004-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00005-of-00006.safetensors', 'models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00006-of-00006.safetensors'].
Using wan_video_vae from models/Wan-AI/Wan2.1-T2V-14B/Wan2.1_VAE.pth.
No wan_video_image_encoder models available.
No wan_video_motion_controller models available.
No wan_video_vace models available.
Downloading Model from https://www.modelscope.cn to directory: /root/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B
2025-06-13 17:37:23,743 - modelscope - INFO - Target directory already exists, skipping creation.
  0%|                                                                    | 0/1600 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/DiffSynth-Studio/examples/wanvideo/model_training/train.py", line 113, in <module>
    launch_training_task(model, dataset, args=args)
  File "/root/DiffSynth-Studio/diffsynth/trainers/utils.py", line 202, in launch_training_task
    loss = model(data)
  File "/root/Wan2.1/wan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/Wan2.1/wan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/DiffSynth-Studio/examples/wanvideo/model_training/train.py", line 93, in forward
    if inputs is None: inputs = self.forward_preprocess(data)
  File "/root/DiffSynth-Studio/examples/wanvideo/model_training/train.py", line 63, in forward_preprocess
    "height": data["video"][0].size[1],
TypeError: 'Image' object is not subscriptable
Traceback (most recent call last):
  File "/root/Wan2.1/wan/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/root/Wan2.1/wan/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
    args.func(args)
  File "/root/Wan2.1/wan/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1198, in launch_command
    simple_launcher(args)
  File "/root/Wan2.1/wan/lib/python3.10/site-packages/accelerate/commands/launch.py", line 785, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/Wan2.1/wan/bin/python3', 'examples/wanvideo/model_training/train.py', '--dataset_base_path', 'data/wan/train_datav6', '--dataset_metadata_path', 'data/wan/train_datav6/metadata.csv', '--height', '512', '--width', '512', '--num_frames', '1', '--dataset_repeat', '100', '--model_paths', '[\n    [\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00001-of-00006.safetensors",\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00002-of-00006.safetensors",\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00003-of-00006.safetensors",\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00004-of-00006.safetensors",\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00005-of-00006.safetensors",\n        "models/Wan-AI/Wan2.1-T2V-14B/diffusion_pytorch_model-00006-of-00006.safetensors"\n    ],\n    "models/Wan-AI/Wan2.1-T2V-14B/models_t5_umt5-xxl-enc-bf16.pth",\n    "models/Wan-AI/Wan2.1-T2V-14B/Wan2.1_VAE.pth"\n]', '--learning_rate', '1e-4', '--num_epochs', '10', '--remove_prefix_in_ckpt', 'pipe.dit.', '--output_path', './models/train/Wan2.1-T2V-14B_lora', '--lora_base_model', 'dit', '--lora_target_modules', 'q,k,v,o,ffn.0,ffn.2', '--lora_rank', '32']' returned non-zero exit status 1.
(wan) root@iv-ydxqk42iv4qbxys524xd:~/DiffSynth-Studio# 
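
For reference, the failing line in the traceback is "height": data["video"][0].size[1] in forward_preprocess, and the TypeError names the PIL Image class. Presumably a video sample arrives as a list of frames (which supports [0]), while the test.png row arrives as a bare PIL Image (which does not). Below is a minimal sketch of that mechanism plus one possible, untested workaround, using only PIL rather than DiffSynth-Studio code:

    from PIL import Image

    # A list of frames (how a video sample presumably arrives) can be indexed:
    frames = [Image.new("RGB", (512, 512))]
    print(frames[0].size[1])    # 512

    # A bare PIL Image (how the single .png appears to arrive) cannot,
    # which reproduces the TypeError from the traceback:
    image = Image.new("RGB", (512, 512))
    try:
        image[0]
    except TypeError as e:
        print(e)                # 'Image' object is not subscriptable

    # Possible (untested) workaround: normalize the sample to a one-frame list
    # before it is indexed in forward_preprocess:
    video = image
    if isinstance(video, Image.Image):
        video = [video]
    print(video[0].size[1])     # 512

If the dataset loader can be made to hand image rows over as one-frame videos, that would be the cleaner place to apply the same normalization.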
