
Conversation

@neuropilot-captain (Collaborator)

This PR contains the following modifications/updates:

  • Add Llama 3.2 related files
  • Fix a bug in the tied word embeddings weight naming (see the sketch after this list)
  • Exclude the embedding .bin file during weight checking in the sanity checks
  • Get the embedding layer before loading the main model when a calibration dataset is provided, so the tied word embedding weight is not removed from the state dict during main-model weight loading
  • Add keep_in_memory to the dataset mapping function to prevent disk out-of-space issues
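
For context on the tied-embedding fix, here is a minimal sketch (assuming Hugging Face-style key names such as `model.embed_tokens.weight` and `lm_head.weight`, not the PR's exact code) of reusing the shared embedding weight when `tie_word_embeddings` is set:

```python
import torch

def resolve_tied_lm_head(state_dict, tie_word_embeddings):
    """If the checkpoint ties lm_head to the input embedding, expose that weight
    under the lm_head key instead of expecting a separate entry."""
    if tie_word_embeddings and "lm_head.weight" not in state_dict:
        state_dict["lm_head.weight"] = state_dict["model.embed_tokens.weight"]
    return state_dict

# Example: a tied checkpoint ships only the embedding weight.
sd = {"model.embed_tokens.weight": torch.randn(8, 4)}
sd = resolve_tied_lm_head(sd, tie_word_embeddings=True)
assert torch.equal(sd["lm_head.weight"], sd["model.embed_tokens.weight"])
```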

@pytorch-bot bot commented Nov 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6726

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fc75330 with merge base 97a4600:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Nov 8, 2024
@cccclai (Contributor) left a comment

Mostly nit comments, thank you for adding support for 3.2.

"eos_token_id_tensor": torch.tensor(tokenizer.eos_token_id),
"response_cap": args.response_cap,
},
keep_in_memory=True
@cccclai (Contributor)

Mind sharing what keep_in_memory is?

@neuropilot-captain (Collaborator, Author)

Hi @cccclai, keep_in_memory stores the dataset in RAM instead of caching it to disk. It was a temporary workaround for an OSError I encountered on my end. I have since resolved the issue and will remove this argument in the next commit.
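
For reference, a minimal sketch (hypothetical file name and mapping function, not the PR's code) of passing keep_in_memory to datasets' map so the mapped dataset stays in RAM rather than being written to the on-disk cache:

```python
from datasets import load_dataset

# Hypothetical calibration file for illustration.
cal_dataset = load_dataset("text", data_files="calibration.txt", split="train")

def add_length(example):
    # Placeholder preprocessing; the real code tokenizes and passes fn_kwargs.
    return {"num_chars": len(example["text"])}

cal_dataset = cal_dataset.map(
    add_length,
    keep_in_memory=True,  # keep results in RAM; skips the Arrow cache on disk
)
```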

if get_dym_shape:
    nt = Dim("num_token", max=num_token)
    cache_dims = tuple(({} for _ in range(2 * self.num_blocks)))
    dynamic_shapes = (
@cccclai (Contributor)

I think I may have asked this in the previous PR - what does dynamic_shape do? It would probably be good to add a comment explaining it.

@neuropilot-captain (Collaborator, Author)

Hi @cccclai, dynamic_shapes is passed to torch.export.export_for_training to indicate that the shapes of some inputs may vary during calibration. After calibration, the dynamic_shapes argument is no longer needed.
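
As a rough illustration, a minimal sketch (toy module and assumed dimension names, not the PR's model) of marking the token dimension as dynamic when exporting for training:

```python
import torch
from torch.export import Dim, export_for_training

class TinyBlock(torch.nn.Module):
    def forward(self, hidden):  # hidden: [batch, num_token, dim]
        return hidden * 2.0

num_token = 128
nt = Dim("num_token", max=num_token)  # the token count may vary up to num_token

example_inputs = (torch.randn(1, 32, 64),)
exported = export_for_training(
    TinyBlock(),
    example_inputs,
    dynamic_shapes=({1: nt},),  # dim 1 of the first input is dynamic
)
```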

cal_dataset = None
if args.dataset is not None:
    cal_dataset = load_dataset("text", data_files=args.dataset, split="train")
    embedding_layer = get_embedding_layer(config, weight_dir, state_dict)
@cccclai (Contributor)

Any specific reason to remove this line?

@neuropilot-captain (Collaborator, Author)

Oh, I didn't remove it; I shifted it up to line 422. During chunk.load_weights (line 437), the weights are popped from the original state_dict, so in the tied-word-embedding case the embedding weights would no longer be present in the state_dict by the time we get the embedding layer. Hence I shifted it up.
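
To illustrate the ordering issue, a minimal sketch (hypothetical helper names, not the PR's code) of why the embedding layer has to be built before the main-model weights are loaded when load_weights pops entries from the state dict:

```python
import torch

state_dict = {"model.embed_tokens.weight": torch.randn(8, 4)}

def get_embedding_layer(state_dict):
    weight = state_dict["model.embed_tokens.weight"]  # read without popping
    emb = torch.nn.Embedding(*weight.shape)
    emb.weight.data.copy_(weight)
    return emb

def load_chunk_weights(state_dict):
    # Stand-in for chunk.load_weights, which removes weights as it consumes them.
    state_dict.pop("model.embed_tokens.weight", None)

embedding_layer = get_embedding_layer(state_dict)  # must happen first
load_chunk_weights(state_dict)  # the tied embedding key is now gone from state_dict
```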

@cmodi-meta (Contributor)

Thanks @neuropilot-captain! Do we also need to make changes to the examples/mediatek/shell_scripts/export_llama.sh file to include the Llama 3.2 1B and 3B models in the if-else conditions?

@cccclai (Contributor) commented Nov 9, 2024

There are some lint errors. Could you send a fix?

@neuropilot-captain (Collaborator, Author)

> Thanks @neuropilot-captain! Do we also need to make changes to the examples/mediatek/shell_scripts/export_llama.sh file to include the Llama 3.2 1B and 3B models in the if-else conditions?

Hi @cmodi-meta, yes, the script would need to be modified to include Llama 3.2 1B and 3B in the if-else. I can modify the script and update it in the next commit.

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot merged commit 5663e3c into pytorch:main on Nov 15, 2024 (41 checks passed).