[DSV3] Forward and backward pass for single GPU #1320

wwwjn · 2025-06-19T15:26:15Z

Command to run: NGPU=1 CONFIG_FILE="./torchtitan/models/deepseek_v3/train_configs/debug_model.toml" ./run_train.sh

Context

Added model args for 4 model settings, and a training config for the debug model.
Debugged the forward pass; the backward pass works out of the box.
Reused the c4-test dataset and the tiktokenizer from the llama3 model for current testing.
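
For readers unfamiliar with the "model args per setting" pattern mentioned above, here is a minimal sketch of how a set of named configurations can be registered and looked up. It is not the code from this PR: the class name `SketchModelArgs`, the flavor names, the field names, and every numeric value are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchModelArgs:
    # Hypothetical fields and defaults; not the PR's real arguments.
    dim: int = 64
    n_layers: int = 2
    n_heads: int = 4
    vocab_size: int = 1000


# Hypothetical registry with four settings, mirroring the "4 model settings"
# in the description. Names and sizes are placeholders, not real configs.
MODEL_CONFIGS = {
    "debug": SketchModelArgs(),
    "small": SketchModelArgs(dim=128, n_layers=4, n_heads=8),
    "medium": SketchModelArgs(dim=256, n_layers=8, n_heads=8),
    "large": SketchModelArgs(dim=512, n_layers=16, n_heads=16),
}


def get_model_args(flavor: str) -> SketchModelArgs:
    """Return the args for a named setting, failing loudly on unknown names."""
    try:
        return MODEL_CONFIGS[flavor]
    except KeyError:
        known = ", ".join(sorted(MODEL_CONFIGS))
        raise ValueError(f"unknown flavor {flavor!r}; expected one of: {known}")


if __name__ == "__main__":
    print(get_model_args("debug"))
```

Keeping the smallest setting as the default makes a single-GPU smoke test cheap, while the larger entries only override the fields that differ.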

tianyu-l

looks quite good! left some comments

torchtitan/models/deepseek_v3/model/model.py

torchtitan/models/deepseek_v3/model/args.py

torchtitan/models/deepseek_v3/__init__.py

torchtitan/models/deepseek_v3/model/model.py

H-Huang

Looks good to me! Thanks for getting this working so quickly!

![Screenshot 2025-06-20 at 11 52 49 AM](https://github.com/user-attachments/assets/81d938a2-9a85-4e8c-b8e1-7f9510d785c2)

wwwjn added 2 commits June 18, 2025 19:19

rename to register model

fcdb4e3

forward and backward

b6ed02c

facebook-github-bot added the CLA Signed label Jun 19, 2025

wwwjn requested review from H-Huang and tianyu-l June 19, 2025 15:28

lint

c8e302f

tianyu-l reviewed Jun 19, 2025

H-Huang reviewed Jun 20, 2025

torchtitan/models/deepseek_v3/__init__.py (resolved)

torchtitan/models/deepseek_v3/model/model.py (outdated, resolved)

remove useless comments

9201098

wwwjn requested review from H-Huang and tianyu-l June 20, 2025 19:09

H-Huang approved these changes Jun 20, 2025

wwwjn merged commit 968a889 into deepseek-v3 Jun 23, 2025
5 checks passed

tianyu-l deleted the dsv3-configs branch June 24, 2025 03:59

5 participants