"""
Introduction to TorchScript
===========================

*James Reed ([email protected]), Michael Suo ([email protected])*, rev2

In this tutorial we will cover:

1. The basics of model authoring in PyTorch, including:

- Modules
- Defining ``forward`` functions
- Composing modules into a hierarchy of modules

2. Methods for converting PyTorch modules to TorchScript, our
   high-performance deployment runtime

- Tracing an existing module
- Using scripting to directly compile a module
- How to compose both approaches
- Saving and loading TorchScript modules

"""

import torch  # This is all you need to use both PyTorch and TorchScript!
print(torch.__version__)


######################################################################
# Basics of PyTorch Model Authoring
# ---------------------------------
#
# Let’s start out by defining a simple ``Module``. A ``Module`` is the
# basic unit of composition in PyTorch. It contains:
#
# 1. A constructor, which prepares the module for invocation
# 2. A set of ``Parameters`` and sub-\ ``Modules``. These are initialized
#    by the constructor and can be used by the module during invocation.
# 3. A ``forward`` function. This is the code that is run when the module
#    is invoked.
#
# Let’s examine a small example:
#

class MyCell(torch.nn.Module):
    def __init__(self):
        super(MyCell, self).__init__()

    def forward(self, x, h):
        new_h = torch.tanh(x + h)
        return new_h, new_h

my_cell = MyCell()
x = torch.rand(3, 4)
h = torch.rand(3, 4)
print(my_cell(x, h))


######################################################################
# So we’ve:
#
# 1. Created a class that subclasses ``torch.nn.Module``.
# 2. Defined a constructor. The constructor doesn’t do much, just calls
#    the constructor for ``super``.
# 3. Defined a ``forward`` function, which takes two inputs and returns
#    two outputs. The actual contents of the ``forward`` function are not
#    really important, but it’s sort of a fake `RNN
#    cell <https://colah.github.io/posts/2015-08-Understanding-LSTMs/>`__,
#    that is, a function that is applied in a loop.
#
# We instantiated the module and made ``x`` and ``h``, which are just 3x4
# matrices of random values. Then we invoked the cell with
# ``my_cell(x, h)``. This in turn calls our ``forward`` function.
#
# Let’s do something a little more interesting:
#

class MyCell(torch.nn.Module):
    def __init__(self):
        super(MyCell, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h

my_cell = MyCell()
print(my_cell)
print(my_cell(x, h))


######################################################################
# We’ve redefined our module ``MyCell``, but this time we’ve added a
# ``self.linear`` attribute, and we invoke ``self.linear`` in the forward
# function.
#
# What exactly is happening here? ``torch.nn.Linear`` is a ``Module`` from
# the PyTorch standard library. Just like ``MyCell``, it can be invoked
# using the call syntax. We are building a hierarchy of ``Module``\ s.
#
# ``print`` on a ``Module`` will give a visual representation of the
# ``Module``\ ’s submodule hierarchy. In our example, we can see our
# ``Linear`` submodule and its parameters.
#
# By composing ``Module``\ s in this way, we can succinctly and readably
# author models with reusable components.
#
# You may have noticed ``grad_fn`` on the outputs. This is a detail of
# PyTorch’s method of automatic differentiation, called
# `autograd <https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html>`__.
# In short, this system allows us to compute derivatives through
# potentially complex programs. The design allows for a massive amount of
# flexibility in model authoring.
#
# Now let’s examine said flexibility:
#

class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x

class MyCell(torch.nn.Module):
    def __init__(self):
        super(MyCell, self).__init__()
        self.dg = MyDecisionGate()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.dg(self.linear(x)) + h)
        return new_h, new_h

my_cell = MyCell()
print(my_cell)
print(my_cell(x, h))


######################################################################
# We’ve once again redefined our ``MyCell`` class, but here we’ve defined
# ``MyDecisionGate``. This module utilizes **control flow**. Control flow
# consists of things like loops and ``if``-statements.
#
# Many frameworks take the approach of computing symbolic derivatives
# given a full program representation. However, in PyTorch, we use a
# gradient tape. We record operations as they occur, and replay them
# backwards in computing derivatives. In this way, the framework does not
# have to explicitly define derivatives for all constructs in the
# language.
#
# .. figure:: https://github.com/pytorch/pytorch/raw/master/docs/source/_static/img/dynamic_graph.gif
#    :alt: How autograd works
#
#    How autograd works
#

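# A minimal sketch (not part of the original tutorial) of the gradient tape at
# work: operations on tensors that require gradients are recorded as they run,
# and backward() replays them in reverse to compute derivatives.
out, _ = my_cell(torch.rand(3, 4), torch.rand(3, 4))
out.sum().backward()                      # replay the recorded operations backwards
print(my_cell.linear.weight.grad.shape)   # parameter gradients are now populated

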
######################################################################
# Basics of TorchScript
# ---------------------
#
# Now let’s take our running example and see how we can apply TorchScript.
#
# In short, TorchScript provides tools to capture the definition of your
# model, even in light of the flexible and dynamic nature of PyTorch.
# Let’s begin by examining what we call **tracing**.
#
# Tracing ``Modules``
# ~~~~~~~~~~~~~~~~~~~
#

class MyCell(torch.nn.Module):
    def __init__(self):
        super(MyCell, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h

my_cell = MyCell()
x, h = torch.rand(3, 4), torch.rand(3, 4)
traced_cell = torch.jit.trace(my_cell, (x, h))
print(traced_cell)
traced_cell(x, h)


######################################################################
# We’ve rewound a bit and taken the second version of our ``MyCell``
# class. As before, we’ve instantiated it, but this time, we’ve called
# ``torch.jit.trace``, passed in the ``Module``, and passed in *example
# inputs* the network might see.
#
# What exactly has this done? It has invoked the ``Module``, recorded the
# operations that occurred when the ``Module`` was run, and created an
# instance of ``torch.jit.ScriptModule`` (of which ``TracedModule`` is an
# instance).
#
# TorchScript records its definitions in an Intermediate Representation
# (or IR), commonly referred to in deep learning as a *graph*. We can
# examine the graph with the ``.graph`` property:
#

print(traced_cell.graph)


######################################################################
# However, this is a very low-level representation and most of the
# information contained in the graph is not useful for end users. Instead,
# we can use the ``.code`` property to give a Python-syntax interpretation
# of the code:
#

print(traced_cell.code)


######################################################################
# So **why** did we do all this? There are several reasons:
#
# 1. TorchScript code can be invoked in its own interpreter, which is
#    basically a restricted Python interpreter. This interpreter does not
#    acquire the Global Interpreter Lock, and so many requests can be
#    processed on the same instance simultaneously.
# 2. This format allows us to save the whole model to disk and load it
#    into another environment, such as in a server written in a language
#    other than Python.
# 3. TorchScript gives us a representation in which we can do compiler
#    optimizations on the code to provide more efficient execution.
# 4. TorchScript allows us to interface with many backend/device runtimes
#    that require a broader view of the program than individual operators.
#
# We can see that invoking ``traced_cell`` produces the same results as
# the Python module:
#

print(my_cell(x, h))
print(traced_cell(x, h))
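# A quick programmatic check (an addition for clarity, not in the original
# tutorial) that the traced module reproduces the eager module's output:
print(torch.allclose(my_cell(x, h)[0], traced_cell(x, h)[0]))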


######################################################################
# Using Scripting to Convert Modules
# ----------------------------------
#
# There’s a reason we used version two of our module, and not the one with
# the control-flow-laden submodule. Let’s examine that now:
#

class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x

class MyCell(torch.nn.Module):
    def __init__(self, dg):
        super(MyCell, self).__init__()
        self.dg = dg
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.dg(self.linear(x)) + h)
        return new_h, new_h

my_cell = MyCell(MyDecisionGate())
traced_cell = torch.jit.trace(my_cell, (x, h))
print(traced_cell.code)


######################################################################
# Looking at the ``.code`` output, we can see that the ``if-else`` branch
# is nowhere to be found! Why? Tracing does exactly what we said it would:
# run the code, record the operations *that happen* and construct a
# ``ScriptModule`` that does exactly that. Unfortunately, things like control
# flow are erased.
#
# How can we faithfully represent this module in TorchScript? We provide a
# **script compiler**, which does direct analysis of your Python source
# code to transform it into TorchScript. Let’s convert ``MyDecisionGate``
# using the script compiler:
#

scripted_gate = torch.jit.script(MyDecisionGate())

my_cell = MyCell(scripted_gate)
scripted_cell = torch.jit.script(my_cell)
print(scripted_cell.code)
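
# For comparison (a small addition, not in the original tutorial), we can also
# inspect the code captured for the scripted gate itself; note that the
# if/else branch is preserved:
print(scripted_gate.code)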


######################################################################
# Hooray! We’ve now faithfully captured the behavior of our program in
# TorchScript. Let’s now try running the program:
#

# New inputs
x, h = torch.rand(3, 4), torch.rand(3, 4)
print(scripted_cell(x, h))


######################################################################
# Mixing Scripting and Tracing
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Some situations call for using tracing rather than scripting (e.g. a
# module has many architectural decisions that are made based on constant
# Python values that we would prefer not to appear in TorchScript). In this
# case, scripting can be composed with tracing: ``torch.jit.script`` will
# inline the code for a traced module, and tracing will inline the code
# for a scripted module.
#
# An example of the first case:
#

class MyRNNLoop(torch.nn.Module):
    def __init__(self):
        super(MyRNNLoop, self).__init__()
        self.cell = torch.jit.trace(MyCell(scripted_gate), (x, h))

    def forward(self, xs):
        h, y = torch.zeros(3, 4), torch.zeros(3, 4)
        for i in range(xs.size(0)):
            y, h = self.cell(xs[i], h)
        return y, h

rnn_loop = torch.jit.script(MyRNNLoop())
print(rnn_loop.code)


######################################################################
# And an example of the second case:
#

class WrapRNN(torch.nn.Module):
    def __init__(self):
        super(WrapRNN, self).__init__()
        self.loop = torch.jit.script(MyRNNLoop())

    def forward(self, xs):
        y, h = self.loop(xs)
        return torch.relu(y)

traced = torch.jit.trace(WrapRNN(), (torch.rand(10, 3, 4)))
print(traced.code)


######################################################################
# In this way, scripting and tracing can each be used when the situation
# calls for them, and the two approaches can be used together.
#
# Saving and Loading models
# -------------------------
#
# We provide APIs to save and load TorchScript modules to/from disk in an
# archive format. This format includes code, parameters, attributes, and
# debug information, meaning that the archive is a freestanding
# representation of the model that can be loaded in an entirely separate
# process. Let’s save and load our wrapped RNN module:
#

traced.save('wrapped_rnn.zip')

loaded = torch.jit.load('wrapped_rnn.zip')

print(loaded)
print(loaded.code)

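# The loaded archive is self-contained: it can be run on new inputs without
# the original Python class definitions (a small sketch, assuming the same
# (10, 3, 4) input shape used when tracing above):
print(loaded(torch.rand(10, 3, 4)))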

######################################################################
# As you can see, serialization preserves the module hierarchy and the
# code we’ve been examining throughout. The model can also be loaded, for
# example, `into
# C++ <https://pytorch.org/tutorials/advanced/cpp_export.html>`__ for
# Python-free execution.
#
# Further Reading
# ~~~~~~~~~~~~~~~
#
# We’ve completed our tutorial! For a more involved demonstration, check
# out the NeurIPS demo for converting machine translation models using
# TorchScript:
# https://colab.research.google.com/drive/1HiICg6jRkBnr5hvK2-VnMi88Vi9pUzEJ
#