From 225acbf6f656e122b0a212d2279f665e435b6d75 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Fri, 12 Jul 2024 17:38:37 -0700
Subject: [PATCH 1/4] Update

[ghstack-poisoned]
---
 README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/README.md b/README.md
index 18364d8f89..2bf49b0275 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,14 @@ Our guiding principles when building `torchtitan`:
 
 [![Welcome to torchtitan!](assets/images/titan_play_video.png)](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE "Welcome to torchtitan!")
 
+### Dive Into the code
+
+You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
+* [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
+* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying TP/DP/PP parallelisms to the model
+* [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
+* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
+
 ## Pre-Release Updates:
 #### (4/25/2024): `torchtitan` is now public but in a pre-release state and under development.
 Currently we showcase pre-training **Llama 3 and Llama 2** LLMs of various sizes from scratch. `torchtitan` is tested and verified with the PyTorch nightly version `torch-2.4.0.dev20240412`. (We recommend latest PyTorch nightly).

From b548fac3df648e6b812c5d807d9bf68a92913151 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:39 -0700
Subject: [PATCH 2/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2bf49b0275..aa64efbb9c 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ Our guiding principles when building `torchtitan`:
 
 You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
-* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying TP/DP/PP parallelisms to the model
+* [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying Data / Tensor / Pipeline Parallelisms to the model
 * [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
 * [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
 

From 9694ef340c410c149194f7da089c98d9bedb99d0 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:43 -0700
Subject: [PATCH 3/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index aa64efbb9c..59541f3dff 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,7 @@ You may want to see how the model is defined or how parallelism techniques are a
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
 * [torchtitan/parallelisms/parallelize_llama.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py) - helpers for applying Data / Tensor / Pipeline Parallelisms to the model
 * [torchtitan/checkpoint.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/checkpoint.py) - utils for saving/loading distributed checkpoints
-* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the LLaMa model definition (shared for llama2/llama3 variants)
+* [torchtitan/models/llama/model.py](https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/model.py) - the Llama model definition (shared for Llama2 and Llama3 variants)
 
 ## Pre-Release Updates:
 #### (4/25/2024): `torchtitan` is now public but in a pre-release state and under development.

From 675f3f3aa319e8dba78a789838d8712ebb168b53 Mon Sep 17 00:00:00 2001
From: Will Constable
Date: Sun, 14 Jul 2024 21:21:48 -0700
Subject: [PATCH 4/4] Update README.md

Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 59541f3dff..dde75e2085 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ Our guiding principles when building `torchtitan`:
 
 [![Welcome to torchtitan!](assets/images/titan_play_video.png)](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE "Welcome to torchtitan!")
 
-### Dive Into the code
+### Dive into the code
 
 You may want to see how the model is defined or how parallelism techniques are applied. For a guided tour, see these files first:
 * [train.py](https://github.com/pytorch/torchtitan/blob/main/train.py) - the main training loop and high-level setup code
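
For readers following the guided tour that this patch series adds to the README, the snippet below is a minimal sketch of the kind of work `parallelize_llama.py` performs: applying Tensor Parallelism to a model block with PyTorch's `torch.distributed.tensor.parallel` API. It is not part of the patch and not torchtitan's actual code; the `FeedForward` module, its dimensions, and the sharding plan are hypothetical stand-ins, and the script assumes a distributed launch (e.g. via `torchrun`).

```python
# Illustrative sketch only: a hypothetical MLP block sharded with PyTorch's
# tensor-parallel API. Assumes torch.distributed is already initialized
# (e.g. launched with torchrun); this is not torchtitan's actual plan.
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class FeedForward(nn.Module):
    """Stand-in for a transformer MLP block."""

    def __init__(self, dim: int, hidden_dim: int) -> None:
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(torch.relu(self.w1(x)))


# 1-D device mesh over all ranks, used as the tensor-parallel group.
tp_mesh = init_device_mesh("cuda", (torch.distributed.get_world_size(),))

block = FeedForward(dim=4096, hidden_dim=11008).cuda()

# Shard w1 column-wise and w2 row-wise so the block's output stays replicated
# across the tensor-parallel ranks.
block = parallelize_module(
    block,
    tp_mesh,
    {"w1": ColwiseParallel(), "w2": RowwiseParallel()},
)
```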