@haotianxu216
The umt5_encoder.h and clip_vision_model.h modules have been registered as VLM models and tested. Both currently fail during testing:
umt5_encoder.h fails with: engine.h:47 kv_cache_manager_ is not BlockManagerPool type!
clip_vision_model.h fails with: utils.cpp:32 Check failed: weight.sizes() == tensor.sizes() ([0, 1280] vs. [1280, 1280]) weight size mismatch for vision_model.encoder.layers.0.self_attn.q_proj.weight
The other three modules (autoencoder_kl_wan.h, dit_wan.h, and unipc_multistep_scheduler.h) have not been tested yet. The end-to-end pipeline for the WAN model has also not been implemented.
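
To make the clip_vision_model.h failure easier to discuss, here is a minimal, standalone sketch of the kind of shape comparison that the quoted check appears to perform. It is not the code in utils.cpp: `shape_to_string` and `check_weight_shape` are hypothetical helpers, and shapes are modeled as plain `std::vector<int64_t>` rather than tensor objects. Assuming the first shape in the log belongs to the module's parameter and the second to the loaded checkpoint tensor, the sketch shows that a parameter created with a zero-sized first dimension would be rejected against a [1280, 1280] checkpoint tensor.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the project): formats a shape as "[a, b]".
std::string shape_to_string(const std::vector<int64_t>& shape) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (i > 0) os << ", ";
    os << shape[i];
  }
  os << "]";
  return os.str();
}

// Hypothetical helper (not from the project): returns an empty string when
// the two shapes match, otherwise a diagnostic in the style of the log above.
std::string check_weight_shape(const std::string& name,
                               const std::vector<int64_t>& weight_shape,
                               const std::vector<int64_t>& tensor_shape) {
  if (weight_shape == tensor_shape) return "";
  return "weight size mismatch for " + name + " (" +
         shape_to_string(weight_shape) + " vs. " +
         shape_to_string(tensor_shape) + ")";
}
```

If that reading of the log is right, the thing to inspect is how the q_proj parameter's output dimension is computed when the module is constructed, since a value of 0 there would produce exactly this mismatch.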