Merged
2 changes: 1 addition & 1 deletion .github/workflows/night_build_memo.txt
@@ -1 +1 @@
-finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, huggyllama/llama-7b
+finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, huggyllama/llama-7b
6 changes: 3 additions & 3 deletions .github/workflows/workflow_finetune.yml
@@ -34,7 +34,7 @@ jobs:
name: finetune
strategy:
matrix:
-model: [ EleutherAI/gpt-j-6b, meta-llama/Llama-2-7b-chat-hf, gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, meta-llama/Llama-2-7b-hf, mistralai/Mistral-7B-v0.1, google/gemma-2b]
+model: [ EleutherAI/gpt-j-6b, meta-llama/Llama-2-7b-chat-hf, gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, meta-llama/Llama-2-7b-hf, mistralai/Mistral-7B-v0.1, google/gemma-2b]
isPR:
- ${{inputs.ci_type == 'pr'}}

@@ -92,7 +92,7 @@ jobs:
with open(conf_path, encoding="utf-8") as reader:
    result = yaml.load(reader, Loader=yaml.FullLoader)
result['General']['base_model'] = "${{ matrix.model }}"
-if "${{ matrix.model }}" == "mosaicml/mpt-7b-chat":
+if "${{ matrix.model }}" == "mosaicml/mpt-7b":
    result['General']['config']['trust_remote_code'] = True
else:
    result['General']['config']['trust_remote_code'] = False
@@ -147,7 +147,7 @@ jobs:

- name: Run Deltatuner Test on DENAS-LoRA Model
run: |
-if [[ ${{ matrix.model }} =~ ^(mosaicml\/mpt-7b-chat|huggyllama\/llama-7b|meta-llama\/Llama-2-7b-chat-hf|mistralai\/Mistral-7B-v0.1|google\/gemma-2b)$ ]]; then
+if [[ ${{ matrix.model }} =~ ^(mosaicml\/mpt-7b|huggyllama\/llama-7b|meta-llama\/Llama-2-7b-chat-hf|mistralai\/Mistral-7B-v0.1|google\/gemma-2b)$ ]]; then
echo ${{ matrix.model }} is not supported!
else
docker exec "finetune" bash -c "rm -rf /tmp/llm-ray/*"
1 change: 1 addition & 0 deletions docs/finetune_parameters.md
@@ -7,6 +7,7 @@ The following are the parameters supported in the finetuning workflow.
|Configuration Name| Default|Meaning|
|-|-|-|
|base_model| EleutherAI/gpt-j-6b|Path to pretrained model or model identifier from huggingface.co/models|
+|tokenizer_name|None|Path to pretrained tokenizer from huggingface.co/models. If not provided, the tokenizer will be loaded from the `base_model`.|
|gpt_base_model|True|This parameter is for [Transformers#22482](https://github.com/huggingface/transformers/issues/22482). It needs to be set to True when the pretrained model is related to gpt; otherwise it is False.|
|output_dir|/tmp/llm-ray/output|The output directory to store the finetuned model|
|checkpoint_dir|/tmp/llm-ray/checkpoint|The directory to store checkpoint|
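A minimal sketch of the `tokenizer_name` fallback documented above (a hypothetical illustration: `AutoTokenizer` stands in for the project's `HuggingFaceTokenizer` wrapper, and the dict mirrors the `General` section of a finetune config):

```python
from transformers import AutoTokenizer

# Mirrors the General section of the example config changed later in this PR.
general = {"base_model": "mosaicml/mpt-7b", "tokenizer_name": "EleutherAI/gpt-neox-20b"}

# Use tokenizer_name when provided; otherwise fall back to base_model.
name = general.get("tokenizer_name") or general["base_model"]
tokenizer = AutoTokenizer.from_pretrained(name)
```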
6 changes: 5 additions & 1 deletion llm_on_ray/finetune/finetune.py
@@ -155,6 +155,10 @@ def train_func(config: Dict[str, Any]):

gradient_accumulation_steps = config["Training"].get("gradient_accumulation_steps", 1)
base_model = config["General"]["base_model"]
if config["General"].get("tokenizer_name") is not None:
tokenizer_name = config["General"].get("tokenizer_name")
else:
tokenizer_name = base_model
Comment on lines +158 to +161

Contributor suggested change:

-if config["General"].get("tokenizer_name") is not None:
-    tokenizer_name = config["General"].get("tokenizer_name")
-else:
-    tokenizer_name = base_model
+tokenizer_name = config["General"].get("tokenizer_name", base_model)

Contributor (author) replied:

If 'tokenizer_name' is not specified in the config YAML, the validated config still contains the key with the value None, so config["General"].get("tokenizer_name", base_model) would return None rather than falling back to base_model. I plan to submit a pull request to address this issue.
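A standalone sketch of the behavior described in this reply (assumption: the config dict is produced from the pydantic `General` model shown further below, which defaults `tokenizer_name` to `None`):

```python
# After pydantic fills in defaults, the key is present with the value None,
# so dict.get's second argument is never consulted.
general = {"base_model": "gpt2", "tokenizer_name": None}

print(general.get("tokenizer_name", general["base_model"]))  # prints: None

# The explicit check from the diff produces the intended fallback:
tokenizer_name = general.get("tokenizer_name")
if tokenizer_name is None:
    tokenizer_name = general["base_model"]
print(tokenizer_name)  # prints: gpt2
```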

dataset_file = config["Dataset"]["train_file"]

seed = config["Training"].get("seed")
@@ -171,7 +175,7 @@ def train_func(config: Dict[str, Any]):

tokenizer = common.tokenizer.Tokenizer.registory.get("HuggingFaceTokenizer")()(
    config={
-        "name": base_model,
+        "name": tokenizer_name,
        "config": config["General"]["config"],
    }
)
1 change: 1 addition & 0 deletions llm_on_ray/finetune/finetune_config.py
@@ -52,6 +52,7 @@ class DeltatunerConfig(BaseModel):

class General(BaseModel):
    base_model: str
+    tokenizer_name: Optional[str] = None
    gpt_base_model: bool
    output_dir: str
    checkpoint_dir: Optional[str]
3 changes: 2 additions & 1 deletion (file path not shown; a finetune config YAML)
@@ -1,5 +1,6 @@
General:
-  base_model: mosaicml/mpt-7b-chat
+  base_model: mosaicml/mpt-7b
+  tokenizer_name: EleutherAI/gpt-neox-20b
  gpt_base_model: false
  output_dir: /tmp/llm-ray/output
  checkpoint_dir: /tmp/llm-ray/checkpoint