# <https://gluebenchmark.com/>`_. The MRPC (Dolan and Brockett, 2005) is
# a corpus of sentence pairs automatically extracted from online news
# sources, with human annotations of whether the sentences in the pair
- # are semantically equivalent. Because the classes are imbalanced (68%
+ # are semantically equivalent. As the classes are imbalanced (68%
# positive, 32% negative), we follow the common practice and report
# `F1 score <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html>`_.
# MRPC is a common NLP task for sentence pair classification, as shown

######################################################################
# 1. Setup
- # -------
+ # --------
#
- # Install PyTorch and HuggingFace Transformers
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # 1.1 Install PyTorch and HuggingFace Transformers
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# To start this tutorial, let’s first follow the installation instructions
# for PyTorch `here <https://github.com/pytorch/pytorch/#installation>`_ and for the HuggingFace transformers library `here <https://github.com/huggingface/transformers#installation>`_.
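#
# Once both packages are installed, a quick sanity check confirms the
# imports resolve (a minimal sketch; no particular version is assumed
# beyond what the installation instructions above produce):

import torch
import transformers

print(torch.__version__)
print(transformers.__version__)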


######################################################################
- # 2. Import the necessary modules
- # ----------------------------
+ # 1.2 Import the necessary modules
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# In this step we import the necessary Python modules for the tutorial.
#
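# The import block itself is elided from this diff; a representative set,
# sketched from what the later steps rely on (not the verbatim tutorial
# code), might look like:

import logging
import os
import random
import time

import numpy as np
import torch
from torch.utils.data import DataLoader, SequentialSampler, TensorDataset
from transformers import BertForSequenceClassification, BertTokenizer
from transformers import glue_convert_examples_to_features as convert_examples_to_features
from transformers import glue_processors as processors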


######################################################################
- # 3. Download the dataset
- # --------------------
+ # 1.3 Download the dataset
+ # ^^^^^^^^^^^^^^^^^^^^^^^^
#
# Before running MRPC tasks, we download the `GLUE data
# <https://gluebenchmark.com/tasks>`_ by running `this script
# <https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e>`_
- # and unpack it to a directory `glue_data`.
+ # and unpack it to a directory ``glue_data``.
#
#
# .. code:: shell

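# The shell commands in the block above are elided from this diff. After
# the script finishes, a quick check confirms the MRPC files are in place
# (a sketch; ``glue_data/MRPC`` is assumed to be the layout the download
# script produces by default, so adjust the path if you unpacked
# elsewhere):

import os

data_dir = "./glue_data/MRPC"
for fname in ("train.tsv", "dev.tsv"):
    path = os.path.join(data_dir, fname)
    print(path, "found" if os.path.exists(path) else "MISSING")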
######################################################################
- # 4. Helper functions
- # ----------------
+ # 1.4 Learn about helper functions
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# The helper functions are built into the transformers library. We mainly use
# the following helper functions: one for converting the text examples
# The `glue_convert_examples_to_features <https://github.com/huggingface/transformers/blob/master/transformers/data/processors/glue.py>`_ function converts the texts into input features (see the sketch after the list below):
#
# - Tokenize the input sequences;
# - Insert [CLS] at the beginning;
# - Insert [SEP] between the first sentence and the second sentence, and
#   at the end;
# - Generate token type ids to indicate whether a token belongs to the
#   first sequence or the second sequence.
#
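# A small illustration of what the conversion produces for one
# MRPC-style pair (a sketch; the exact ids depend on the checkpoint's
# vocabulary, and ``bert-base-uncased`` is used here only for
# demonstration):

from transformers import BertTokenizer

demo_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# encode_plus builds [CLS] sent1 [SEP] sent2 [SEP] plus the token type ids
encoded = demo_tokenizer.encode_plus("The cat sat.", "A cat was sitting.")
print(encoded["input_ids"])       # wordpiece ids; [CLS] first, [SEP] twice
print(encoded["token_type_ids"])  # 0 for the first sentence, 1 for the second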
# The `F1 score <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html>`_
# can be interpreted as a weighted average of the precision and recall,
# where an F1 score reaches its best value at 1 and worst score at 0. The
# relative contribution of precision and recall to the F1 score is equal.
- # The equation for the F1 score is:
- #
- # - F1 = 2 \* (precision \* recall) / (precision + recall)
+ # The equation for the F1 score is:
+ #
+ # .. math:: F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}
#
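# A tiny numeric check with scikit-learn (which this tutorial already
# uses for its metrics) makes the formula concrete:

from sklearn.metrics import f1_score

y_true = [1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
# precision = 3/3 = 1.0, recall = 3/4 = 0.75, so F1 = 1.5 / 1.75 ≈ 0.857
print(f1_score(y_true, y_pred))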


######################################################################
- # 5 . Fine-tune the BERT model
- # --------------------------
+ # 2. Fine-tune the BERT model
+ # ----------------------------
#


# To save time, you can download the model file (~400 MB) directly into your local folder ``$OUT_DIR``.

######################################################################
- # 6. Set global configurations
- # -------------------------
+ # 2.1 Set global configurations
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#


@@ -264,12 +264,9 @@ def set_seed(seed):
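# The configuration code around this hunk is elided. For orientation, a
# typical ``set_seed`` helper seeds every RNG the run touches (a sketch,
# not necessarily the tutorial's exact body):

import random
import numpy as np
import torch

def set_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # harmless no-op on CPU-only builds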


######################################################################
- # 7. Load the fine-tuned BERT model
- # ------------------------------
+ # 2.2 Load the fine-tuned BERT model
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
-
-
- ######################################################################
# We load the tokenizer and fine-tuned BERT sequence classifier model
# (FP32) from the ``configs.output_dir``.
#
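# A sketch of the two loading calls (``configs`` is the global
# configuration object set up in the elided step 2.1, so its
# ``output_dir``, ``do_lower_case``, and ``device`` fields are assumed
# here):

tokenizer = BertTokenizer.from_pretrained(
    configs.output_dir, do_lower_case=configs.do_lower_case)
model = BertForSequenceClassification.from_pretrained(configs.output_dir)
model.to(configs.device)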
@@ -282,8 +279,8 @@ def set_seed(seed):


######################################################################
- # 8. Define the tokenize and evaluation function
- # -------------------------------------------
+ # 2.3 Define the tokenize and evaluation function
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# We reuse the tokenize and evaluation function from `Huggingface <https://github.com/huggingface/transformers/blob/master/examples/run_glue.py>`_.
#
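# Those functions are elided here. Stripped to its core, the evaluation
# they implement reduces to the loop below (a hypothetical sketch, not
# the verbatim ``run_glue.py`` code; the batch layout is assumed to be
# input_ids, attention_mask, token_type_ids, labels):

import numpy as np
import torch
from sklearn.metrics import f1_score

def evaluate_sketch(model, eval_dataloader, device):
    model.eval()
    all_preds, all_labels = [], []
    for batch in eval_dataloader:
        batch = tuple(t.to(device) for t in batch)
        with torch.no_grad():
            logits = model(input_ids=batch[0], attention_mask=batch[1],
                           token_type_ids=batch[2])[0]
        all_preds.append(logits.argmax(dim=1).cpu().numpy())
        all_labels.append(batch[3].cpu().numpy())
    return f1_score(np.concatenate(all_labels), np.concatenate(all_preds))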
@@ -426,7 +423,7 @@ def load_and_cache_examples(args, task, tokenizer, evaluate=False):


######################################################################
- # 9 . Apply the dynamic quantization
+ # 3. Apply dynamic quantization
# -------------------------------
#
# We call ``torch.quantization.quantize_dynamic`` on the model to apply
@@ -445,8 +442,8 @@ def load_and_cache_examples(args, task, tokenizer, evaluate=False):
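# The call itself is a one-liner. This sketch quantizes only the
# ``torch.nn.Linear`` modules, which hold the bulk of BERT's weights, to
# INT8:

quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized_model)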


######################################################################
- # 10. Check the model size
- # --------------------
+ # 3.1 Check the model size
+ # ^^^^^^^^^^^^^^^^^^^^^^^^
#
# Let’s first check the model size. We can observe a significant reduction
# in model size (FP32 total size: 438 MB; INT8 total size: 181 MB):
@@ -472,8 +469,8 @@ def print_size_of_model(model):
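# The helper named in the hunk above is elided; a sketch of a
# ``print_size_of_model`` that measures the serialized size on disk
# (assuming the working directory is writable):

import os
import torch

def print_size_of_model(model):
    # serialize the state dict and report the file size, then clean up
    torch.save(model.state_dict(), "temp.p")
    print("Size (MB):", os.path.getsize("temp.p") / 1e6)
    os.remove("temp.p")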


######################################################################
- # 11. Evaluate the inference accuracy and time
- # ----------------------------------------
+ # 3.2 Evaluate the inference accuracy and time
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Next, let’s compare the inference time as well as the evaluation
# accuracy between the original FP32 model and the INT8 model after the
@@ -513,7 +510,7 @@ def time_model_evaluation(model, configs, tokenizer):
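# The body of ``time_model_evaluation`` is elided; its essence is a
# wall-clock timer around the evaluation call (a sketch; ``evaluate_fn``
# stands in for the elided run_glue-style evaluation function and is
# passed in explicitly to keep the sketch self-contained):

import time

def time_model_evaluation(model, configs, tokenizer, evaluate_fn):
    eval_start_time = time.time()
    result = evaluate_fn(configs, model, tokenizer, prefix="")
    eval_end_time = time.time()
    print(result)
    print("Evaluate total time (seconds): {0:.1f}".format(
        eval_end_time - eval_start_time))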
# comparison, in a `recent paper <https://arxiv.org/pdf/1910.06188.pdf>`_ (Table 1),
# it achieved 0.8788 by
# applying the post-training dynamic quantization and 0.8956 by applying
- # the quantization-aware training. The main reason is that we support the
+ # the quantization-aware training. The main difference is that we support the
# asymmetric quantization in PyTorch while that paper supports the
# symmetric quantization only.
#
@@ -533,8 +530,8 @@ def time_model_evaluation(model, configs, tokenizer):


######################################################################
- # 12. Serialize the quantized model
- # -----------------------------
+ # 3.3 Serialize the quantized model
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# We can serialize and save the quantized model for future use.
#
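# One way to do this (a sketch saving the quantized state dict with
# ``torch.save``; ``configs`` and ``quantized_model`` come from the
# earlier steps, and the tutorial's own serialization path may differ):

import os
import torch

quantized_output_dir = configs.output_dir + "quantized/"
if not os.path.exists(quantized_output_dir):
    os.makedirs(quantized_output_dir)
torch.save(quantized_model.state_dict(),
           os.path.join(quantized_output_dir, "pytorch_model.bin"))

# To reload it later, apply ``quantize_dynamic`` to a freshly loaded FP32
# model first, then load this state dict into the result.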