Add T5Gemma support #14940 #15123
base: master
Conversation
convert_hf_to_gguf.py (Outdated)
# Don't call super().__init__() because it tries to find standard layer count parameters
# that don't exist in T5Gemma models (they have encoder.num_hidden_layers instead)
If this is the only reason, maybe instead call super().__init__() with a modified hparams?
# Lift the encoder's layer count to the top level so the base class can find it
hparams = kwargs.get("hparams") or ModelBase.load_hparams(args[0] if args else kwargs["dir_model"])
encoder_config = hparams.get("encoder", {})
hparams["num_hidden_layers"] = encoder_config.get("num_hidden_layers")
kwargs["hparams"] = hparams
super().__init__(*args, **kwargs)
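For context, this suggestion assumes T5Gemma nests its layer counts under per-stack sub-configs rather than at the top level; a minimal illustrative sketch of that shape (field names follow the Hugging Face T5Gemma config, values are made up):

# Illustrative excerpt of a T5Gemma config.json, loaded as a Python dict.
# The layer count lives under "encoder"/"decoder", not at the top level.
hparams = {
    "model_type": "t5gemma",
    "encoder": {"num_hidden_layers": 12, "hidden_size": 768},
    "decoder": {"num_hidden_layers": 12, "hidden_size": 768},
}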
Already changed in the latest commits. This is my first time converting a model to GGUF; I only converted t5gemma-s-s-prefixlm-it, so I'm not sure this will work for all T5Gemma variants.
convert_hf_to_gguf.py (Outdated)
for i in range(self.block_count):
    # Encoder relative attention bias - shape should be (n_rel_attn_bkts, n_head)
    rel_bias_enc = torch.zeros(n_rel_attn_bkts, n_head_enc, dtype=torch.float16)
    yield f"enc.blk.{i}.attn_rel_b.weight", rel_bias_enc
Use self.format_tensor_name if possible.
yield f"enc.blk.{i}.attn_rel_b.weight", rel_bias_enc | |
yield self.format_tensor_name(gguf.MODEL_TENSOR.ENC_ATTN_REL_B, i), rel_bias_enc |
(I did not test this, but should probably work)
This also applies to the other places in this function where the output tensor names are hardcoded.
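For reference, a simplified sketch of what format_tensor_name resolves to here, based on the helper in convert_hf_to_gguf.py and the name table in gguf-py (treat the exact strings as assumptions):

# Simplified sketch: look up the canonical base name for the tensor type,
# substitute the block index, and append the ".weight" suffix.
base = gguf.TENSOR_NAMES[gguf.MODEL_TENSOR.ENC_ATTN_REL_B]  # e.g. "enc.blk.{bid}.attn_rel_b"
yield base.format(bid=i) + ".weight", rel_bias_enc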
convert_hf_to_gguf.py (Outdated)
# Dynamically set encoder's other parameters
for key, value in encoder_config.items():
    if key not in ["max_position_embeddings", "hidden_size", "num_hidden_layers", "intermediate_size",
                   "num_attention_heads", "num_key_value_heads", "head_dim", "rms_norm_eps",
                   "sliding_window", "attn_logit_softcapping", "final_logit_softcapping",
                   "rope_theta", "attention_bias", "attention_dropout", "query_pre_attn_scalar", "vocab_size"]:
Instead of excluding keys, maybe enumerating the included keys could be more robust.
It would also avoid adding unexpected metadata which won't necessarily be used.
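A minimal sketch of that include-list approach, with a hypothetical set of extra keys (the actual list would be derived from what the model graph consumes):

# Forward only explicitly listed extra encoder keys; anything else is
# ignored instead of being written out as unexpected GGUF metadata.
included_keys = ("hidden_activation", "layer_norm_eps")  # illustrative subset
for key in included_keys:
    value = encoder_config.get(key)
    if value is not None:
        ...  # write the matching gguf_writer field for this key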
…ents (ggml-org#14940)

- Add T5Gemma model support with proper encoder-decoder architecture
- Use super().__init__() instead of manual initialization for better inheritance
- Use format_tensor_name() for consistent tensor naming
- Explicitly enumerate included keys instead of excluding keys
- Add proper type annotations for better type safety
- Fix all trailing whitespace issues
- Support relative attention bias tensor generation
- Handle T5Gemma-specific post-layer normalization tensors
- Implement proper handling for the BPE tokenizer
- Add comprehensive tensor mapping for all T5Gemma components
Implement encoder-decoder architecture with relative attention bias, tensor mapping, and model conversion.

Only tested t5gemma-s-s-prefixlm-it due to memory limitations. Please correct us if there are any mistakes.
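For anyone reproducing the test, the conversion script is typically invoked along these lines (paths and output name are placeholders):

python convert_hf_to_gguf.py /path/to/t5gemma-s-s-prefixlm-it --outfile t5gemma-s-s-prefixlm-it.gguf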