Conversation

CISC (Collaborator)

@CISC CISC commented Mar 31, 2025

The config items lora_rank_tokenshift and lora_rank_decay were introduced in a new release, see:
https://huggingface.co/featherless-ai/Qwerky-72B/blob/main/modeling_rwkv6qwen2.py#L268-L279

Fixes #12662
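For context, a minimal sketch of how a conversion script might pick up the two new keys. The key names come from the linked modeling code; the fallback defaults below are illustrative placeholders only, not the model's actual defaults:

```python
def read_lora_ranks(hparams: dict) -> tuple[int, int]:
    """Read the new optional config keys, falling back to a
    size-based default when a key is absent (placeholder logic)."""
    hidden_size = hparams["hidden_size"]
    default = 64 if hidden_size < 4096 else 128  # illustrative value, not the real default
    rank_tokenshift = hparams.get("lora_rank_tokenshift", default)
    rank_decay = hparams.get("lora_rank_decay", default)
    return rank_tokenshift, rank_decay
```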

@github-actions github-actions bot added the python python script changes label Mar 31, 2025
@CISC CISC requested a review from MollySophia March 31, 2025 08:22
@MollySophia (Collaborator)

The change looks good to me. I haven't had the chance to download the full 72B model yet. Have you tested this?

@CISC (Collaborator, Author)

CISC commented Mar 31, 2025

@MollySophia No, purely based on diffing the modeling code.

@MollySophia (Collaborator)

> @MollySophia No, purely based on diffing the modeling code.

I see. Then let's wait for feedback from #12662 :)

@CISC (Collaborator, Author)

CISC commented Mar 31, 2025

BTW, the QwQ-32B modeling code uses lora_rank_decay incorrectly, but since it is identical to lora_rank_tokenshift in this model, it has no practical implications.

ref:
https://huggingface.co/featherless-ai/Qwerky-QwQ-32B/blob/main/modeling_rwkv6qwen2.py#L268-L279
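To illustrate why the misuse is harmless here, a hypothetical helper (not the actual modeling code): if the decay projection is mistakenly sized with the tokenshift rank, the resulting shape only differs when the two ranks differ, so the bug is masked whenever they are equal, as in Qwerky-QwQ-32B.

```python
def decay_proj_shape(hidden_size: int, rank_tokenshift: int,
                     rank_decay: int, buggy: bool = False) -> tuple[int, int]:
    # The decay low-rank projection should use lora_rank_decay;
    # the buggy variant mistakenly uses lora_rank_tokenshift instead.
    rank = rank_tokenshift if buggy else rank_decay
    return (hidden_size, rank)

# When the two ranks are equal, the bug changes nothing:
assert decay_proj_shape(5120, 64, 64, buggy=True) == decay_proj_shape(5120, 64, 64)
```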

@CISC (Collaborator, Author)

CISC commented Mar 31, 2025

@MollySophia Looks like it's working. Even though @kanttouchthis only tested the 32B, I think it's safe to assume this change works for the 72B too.

@CISC CISC merged commit 403fbac into ggml-org:master Mar 31, 2025
5 checks passed
@CISC CISC deleted the qwerky-lora-rank-decay branch March 31, 2025 14:36
Successfully merging this pull request may close these issues.

Eval bug: Qwerky QwQ 32B (rwkv6qwen2) failed to load