From 225acbf6f656e122b0a212d2279f665e435b6d75 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Fri, 12 Jul 2024 17:38:37 -0700
Subject: [PATCH 1/4] Update

[ghstack-poisoned]
---
 README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/README.md b/README.md
index 18364d8f89..2bf49b0275 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,14 @@ Our guiding principles when building `torchtitan`:
 
 [![Welcome to torchtitan!](assets/images/titan_play_video.png)](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE "Welcome to torchtitan!")
 
+### Dive Into the code
+
+You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
+* [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
+* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying TP/DP/PP parallelisms to the model
+* [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
+* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
+
 ## Pre-Release Updates:
 #### (4/25/2024): `torchtitan` is now public but in a pre-release state and under development.
 Currently we showcase pre-training **Llama 3 and Llama 2** LLMs of various sizes from scratch. `torchtitan` is tested and verified with the PyTorch nightly version `torch-2.4.0.dev20240412`. (We recommend latest PyTorch nightly).

From b548fac3df648e6b812c5d807d9bf68a92913151 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:39 -0700
Subject: [PATCH 2/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2bf49b0275..aa64efbb9c 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ Our guiding principles when building `torchtitan`:
 
 You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
-* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying TP/DP/PP parallelisms to the model
+* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying Data / Tensor / Pipeline Parallelisms to the model
 * [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
 * [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
 

From 9694ef340c410c149194f7da089c98d9bedb99d0 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:43 -0700
Subject: [PATCH 3/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index aa64efbb9c..59541f3dff 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,7 @@ You may want to see how the model is defined or how parallelism techniques are a
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
 * [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying Data / Tensor / Pipeline Parallelisms to the model
 * [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
-* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
+* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the Llama model definition (shared for Llama2 and Llama3 variants)
 
 ## Pre-Release Updates:
 #### (4/25/2024): `torchtitan` is now public but in a pre-release state and under development.

From 675f3f3aa319e8dba78a789838d8712ebb168b53 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:48 -0700
Subject: [PATCH 4/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 59541f3dff..dde75e2085 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ Our guiding principles when building `torchtitan`:
 
 [![Welcome to torchtitan!](assets/images/titan_play_video.png)](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE "Welcome to torchtitan!")
 
-### Dive Into the code
+### Dive into the code
 
 You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
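
For readers following the guided tour that this patch series adds to the README, the snippet below is a minimal sketch of the kind of work `parallelize_llama.py` performs: applying Tensor Parallelism to a model block with PyTorch's `torch.distributed.tensor.parallel` API. It is not part of the patch and not torchtitan's actual code; the `FeedForward` module, its dimensions, and the sharding plan are hypothetical stand-ins, and the script assumes a distributed launch (e.g. via `torchrun`).

```python
# Illustrative sketch only: a hypothetical MLP block sharded with PyTorch's
# tensor-parallel API. Assumes torch.distributed is already initialized
# (e.g. launched with torchrun); this is not torchtitan's actual plan.
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class FeedForward(nn.Module):
    """Stand-in for a transformer MLP block."""

    def __init__(self, dim: int, hidden_dim: int) -> None:
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(torch.relu(self.w1(x)))


# 1-D device mesh over all ranks, used as the tensor-parallel group.
tp_mesh = init_device_mesh("cuda", (torch.distributed.get_world_size(),))

block = FeedForward(dim=4096, hidden_dim=11008).cuda()

# Shard w1 column-wise and w2 row-wise so the block's output stays replicated
# across the tensor-parallel ranks.
block = parallelize_module(
    block,
    tp_mesh,
    {"w1": ColwiseParallel(), "w2": RowwiseParallel()},
)
```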