This repository was archived by the owner on Aug 28, 2025. It is now read-only.

Commit 5c6023b

change underscores to hyphens to prevent sphinx substitution reference interpretation
Parent: dcf7793

1 file changed (+5, -5 lines)


lightning_examples/finetuning-scheduler/finetuning-scheduler.py

Lines changed: 5 additions & 5 deletions
@@ -79,7 +79,7 @@
 #
 # 2. Alter the schedule as desired.
 #
-# ![side_by_side_yaml](side_by_side_yaml.png){height="327px" width="800px"}
+# ![side-by-side-yaml](side_by_side_yaml.png){height="327px" width="800px"}
 #
 # 3. Once the finetuning schedule has been altered as desired, pass it to
 # [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) to commence scheduled training:
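
For context on the step described in this hunk (passing the altered schedule to FinetuningScheduler), the sketch below shows roughly how that looks; it is not part of this commit. The toy module, data, and schedule contents are hypothetical placeholders (the tutorial itself finetunes a DeBERTa-v3 model), and depending on installed versions the Lightning import may be `lightning.pytorch` rather than `pytorch_lightning`.

```python
from pathlib import Path

import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

from finetuning_scheduler import FinetuningScheduler


class ToyModule(pl.LightningModule):
    """Hypothetical stand-in for the tutorial's transformer LightningModule."""

    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Linear(32, 16)
        self.classifier = torch.nn.Linear(16, 2)

    def _loss(self, batch):
        x, y = batch
        return torch.nn.functional.cross_entropy(self.classifier(torch.relu(self.backbone(x))), y)

    def training_step(self, batch, batch_idx):
        return self._loss(batch)

    def validation_step(self, batch, batch_idx):
        # FinetuningScheduler's default early-stopping/checkpoint callbacks monitor "val_loss".
        self.log("val_loss", self._loss(batch))

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-3)


def toy_loader():
    return DataLoader(TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))), batch_size=8)


# A hand-written two-phase schedule (normally you would edit the YAML the callback generates,
# as in the yaml screenshot above): phase 0 thaws the classifier, phase 1 thaws the backbone.
schedule_path = Path("ToyModule_ft_schedule.yaml")
schedule_path.write_text(
    "0:\n"
    "  params:\n"
    "    - classifier.*\n"
    "1:\n"
    "  params:\n"
    "    - backbone.*\n"
)

# Passing the schedule to the callback commences scheduled finetuning.
trainer = pl.Trainer(
    callbacks=[FinetuningScheduler(ft_schedule=str(schedule_path))],
    max_epochs=4,
)
trainer.fit(ToyModule(), train_dataloaders=toy_loader(), val_dataloaders=toy_loader())
```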
@@ -105,7 +105,7 @@
 #
 # **Tip:** Use of regex expressions can be convenient for specifying more complex schedules. Also, a per-phase base maximum lr can be specified:
 #
-# ![emphasized_yaml](emphasized_yaml.png){height="380px" width="800px"}
+# ![emphasized-yaml](emphasized_yaml.png){height="380px" width="800px"}
 #
 # </div>
 #
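
Illustrating the **Tip** in this hunk: a schedule phase can combine a regex parameter pattern with the optional per-phase `lr` key, which (as I understand the fts schedule format) sets that phase's base maximum learning rate. The fragment below is a hypothetical sketch, not the tutorial's actual emphasized_yaml.

```python
# Hypothetical schedule fragment (placeholder parameter names): a regex selects the
# attention parameters of encoder layers 0-5, and the optional per-phase "lr" key
# caps the base maximum learning rate applied when that phase is thawed.
regex_lr_schedule = """\
0:
  params:
    - model.classifier.*
1:
  params:
    - model.deberta.encoder.layer.[0-5].attention.*
  lr: 1.0e-05
"""
```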
@@ -645,8 +645,8 @@ def train() -> None:
 # produced in the scenarios [here](https://drive.google.com/file/d/1t7myBgcqcZ9ax_IT9QVk-vFH_l_o5UXB/view?usp=sharing)
 # (caution, ~3.5GB).
 #
-# [![fts_explicit_accuracy](fts_explicit_accuracy.png){height="315px" width="492px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOnRydWUsIm5vZnRzX2Jhc2VsaW5lIjpmYWxzZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)
-# [![nofts_baseline](nofts_baseline_accuracy.png){height="316px" width="505px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6dHJ1ZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)
+# [![fts-explicit-accuracy](fts_explicit_accuracy.png){height="315px" width="492px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOnRydWUsIm5vZnRzX2Jhc2VsaW5lIjpmYWxzZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)
+# [![nofts-baseline](nofts_baseline_accuracy.png){height="316px" width="505px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6dHJ1ZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)
 #
 # Note there could be around ~1% variation in performance from the tensorboard summaries generated by this notebook
 # which uses DP and 1 GPU.
@@ -656,7 +656,7 @@ def train() -> None:
 # greater finetuning flexibility for model exploration in research. For example, glancing at DeBERTa-v3's implicit training
 # run, a critical tuning transition point is immediately apparent:
 #
-# [![implicit_training_transition](implicit_training_transition.png){height="272px" width="494px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6ZmFsc2UsImZ0c19pbXBsaWNpdCI6dHJ1ZX0%3D)
+# [![implicit-training-transition](implicit_training_transition.png){height="272px" width="494px"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6ZmFsc2UsImZ0c19pbXBsaWNpdCI6dHJ1ZX0%3D)
 #
 # Our `val_loss` begins a precipitous decline at step 3119 which corresponds to phase 17 in the schedule. Referring to our
 # schedule, in phase 17 we're beginning tuning the attention parameters of our 10th encoder layer (of 11). Interesting!
