Commit 0d2067e

Dynamic Transformer (LAT) (#139)
1 parent 4743273 commit 0d2067e

File tree

18 files changed: +8849 -5 lines changed

docs/tutorials/pytorch/question-answering/Dynamic_MiniLM_SQuAD.ipynb

Lines changed: 2305 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
(308, 247, 198, 159, 128, 103) 2484739968 87.33049546363097 0 None
(269, 253, 252, 202, 104, 34) 2485456896 87.76366630183897 1 ((300, 263, 248, 202, 104, 38),)
(300, 243, 214, 197, 129, 36) 2492206848 87.94776585579947 1 ((284, 252, 237, 197, 129, 36),)
(284, 262, 248, 200, 95, 36) 2510150400 88.12499528628362 2 ((284, 252, 237, 197, 129, 36), (283, 272, 258, 203, 60, 36))
(315, 251, 242, 159, 142, 33) 2546934912 88.33566781744294 1 ((315, 278, 231, 169, 105, 33),)
(303, 268, 256, 182, 118, 29) 2581745280 88.51327545611849 2 ((322, 284, 275, 166, 107, 22), (284, 252, 237, 197, 129, 36))
(346, 284, 275, 166, 107, 24) 2695914240 88.75378439863312 1 ((346, 284, 275, 166, 107, 50),)
(365, 280, 273, 176, 112, 46) 2782709760 88.8968270345175 2 ((348, 282, 268, 186, 111, 42), (381, 278, 278, 166, 113, 50))
(375, 331, 283, 213, 127, 18) 3015100416 89.11164300013466 1 ((374, 331, 283, 184, 125, 48),)
(374, 331, 283, 230, 126, 51) 3085609344 89.15126282829618 1 ((374, 331, 283, 227, 126, 52),)
(377, 361, 345, 245, 132, 38) 3329854464 89.28344658586016 2 ((377, 355, 350, 242, 130, 33), (377, 367, 340, 247, 134, 43))
(383, 363, 360, 242, 134, 42) 3385542144 89.30703301646996 1 ((382, 379, 360, 242, 134, 42),)
(383, 380, 376, 358, 202, 123) 3955011456 89.31783846266845 1 ((383, 380, 378, 358, 210, 142),)
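
Each line above appears to record one candidate kept by the evolutionary search (Step 3 of the README added in this commit): a per-layer length configuration, its estimated MACs, its SQuAD F1, the number of parents, and the parent configurations it was derived from. That column reading is an assumption, not documented in the commit; a minimal parsing sketch under it:

```
# Hypothetical parser for the result lines above (column meaning assumed:
# per-layer length config, MACs, F1, number of parents, parent configs).
import ast

def parse_result_line(line):
    end = line.index(")") + 1                    # first field is a parenthesized tuple
    length_config = ast.literal_eval(line[:end])
    rest = line[end:].split(None, 3)
    macs, f1, num_parents = int(rest[0]), float(rest[1]), int(rest[2])
    parents = None if rest[3] == "None" else ast.literal_eval(rest[3])
    return length_config, macs, f1, num_parents, parents

example = "(308, 247, 198, 159, 128, 103) 2484739968 87.33049546363097 0 None"
print(parse_result_line(example))
```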
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
# Dynamic-Length Transformer

This implementation is based on the [Length Adaptive Transformer](https://github.com/clovaai/length-adaptive-transformer) work.
Currently, it supports BERT- and RoBERTa-based transformers.

## Training

### Step 1: Finetuning Pretrained Transformer

```
python run_qa.py \
    --model_name_or_path bert-base-uncased \
    --dataset_name squad \
    --do_train \
    --do_eval \
    --learning_rate 3e-5 \
    --num_train_epochs 2 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --per_device_train_batch_size 8 \
    --output_dir output/finetuning
```

### Step 2: Training with LengthDrop

```
python run_qa.py \
    --model_name_or_path output/finetuning \
    --dataset_name squad \
    --do_train \
    --do_eval \
    --learning_rate 3e-5 \
    --num_train_epochs 5 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --per_device_train_batch_size 8 \
    --length_adaptive \
    --num_sandwich 2 \
    --length_drop_ratio_bound 0.2 \
    --layer_dropout_prob 0.2 \
    --output_dir output/dynamic
```
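
During LengthDrop training, each step samples a monotonically shrinking per-layer sequence-length configuration, so a single set of weights stays accurate across many inference lengths; `--num_sandwich` adds sandwich-rule sub-models on top of the full and smallest configurations, and `--length_drop_ratio_bound` caps how much of the sequence a layer may drop. The snippet below is only an illustrative sketch of that sampling idea under those assumptions, not the actual code behind `--length_adaptive`:

```
# Illustrative sketch of LengthDrop-style sampling (assumed behaviour,
# not the exact implementation used by run_qa.py --length_adaptive).
import random

def sample_length_config(max_seq_length, num_layers, length_drop_ratio_bound=0.2):
    """Sample a monotonically non-increasing sequence length per layer."""
    lengths, current = [], max_seq_length
    for _ in range(num_layers):
        # Each layer keeps between (1 - bound) and 100% of the previous length.
        keep_ratio = 1.0 - random.uniform(0.0, length_drop_ratio_bound)
        current = max(1, int(current * keep_ratio))
        lengths.append(current)
    return tuple(lengths)

print(sample_length_config(max_seq_length=384, num_layers=12))
```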

### Step 3: Evolutionary Search

Run the search to find optimized length configurations for any target computational budget.

```
python run_qa.py \
    --model_name_or_path output/dynamic \
    --dataset_name squad \
    --max_seq_length 384 \
    --doc_stride 128 \
    --do_eval \
    --per_device_eval_batch_size 32 \
    --do_search \
    --output_dir output/search
```
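
The search treats a length configuration as a gene: it mutates and crosses over configurations from the current population, evaluates each candidate's accuracy and cost, and keeps the Pareto-best ones for every compute budget. The loop below is a rough sketch of that idea, assuming a hypothetical `evaluate(config)` helper that returns the candidate's score; it is not the search implemented in run_qa.py:

```
# Rough sketch of an evolutionary search over length configurations.
# `evaluate` is a hypothetical callback (e.g. SQuAD F1 on the dev set).
import random

def mutate(config, max_seq_length=384, prob=0.5):
    new, upper = [], max_seq_length
    for length in config:
        if random.random() < prob:
            length = random.randint(1, upper)
        length = min(length, upper)      # keep lengths non-increasing
        new.append(length)
        upper = length
    return tuple(new)

def crossover(a, b):
    child, upper = [], max(a[0], b[0])
    for x, y in zip(a, b):
        length = min(random.choice((x, y)), upper)
        child.append(length)
        upper = length
    return tuple(child)

def evolve(seed_config, evaluate, iterations=100):
    population = {seed_config: evaluate(seed_config)}
    for _ in range(iterations):
        parents = random.sample(list(population), k=min(2, len(population)))
        child = crossover(*parents) if len(parents) == 2 else mutate(parents[0])
        if child not in population:
            population[child] = evaluate(child)
    return population   # prune to the accuracy/cost Pareto frontier afterwards
```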
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
transformers
datasets
torchprofiler
