|
76 | 76 | }, |
77 | 77 | "cell_type": "markdown", |
78 | 78 | "source": [ |
79 | | - "## Overview\n", |
| 79 | + "# Overview\n", |
80 | 80 | "\n", |
81 | 81 | "This notebook demonstrates how to use the LazyAdam optimizer from the Addons package.\n"
82 | 82 | ] |
83 | 83 | }, |
| 84 | + { |
| 85 | + "metadata": { |
| 86 | + "id": "bQwBbFVAyHJ_", |
| 87 | + "colab_type": "text" |
| 88 | + }, |
| 89 | + "cell_type": "markdown", |
| 90 | + "source": [ |
| 91 | + "# LazyAdam\n", |
| 92 | + "\n", |
| 93 | + "> LazyAdam is a variant of the Adam optimizer that handles sparse updates more efficiently.\n",
| 94 | + " The original Adam algorithm maintains two moving-average accumulators for\n", |
| 95 | + " each trainable variable; the accumulators are updated at every step.\n", |
| 96 | + " This class provides lazier handling of gradient updates for sparse\n", |
| 97 | + " variables. It only updates moving-average accumulators for sparse variable\n", |
| 98 | + " indices that appear in the current batch, rather than updating the\n", |
| 99 | + " accumulators for all indices. Compared with the original Adam optimizer,\n", |
| 100 | + " it can provide large improvements in model training throughput for some\n", |
| 101 | + " applications. However, it provides slightly different semantics than the\n", |
| 102 | + " original Adam algorithm, and may lead to different empirical results." |
| 103 | + ] |
| 104 | + }, |
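The added cell describes LazyAdam's behavior in prose: accumulators are updated only for the sparse-variable indices that appear in the current batch. As a minimal plain-Python sketch of that idea (illustrative only, not the tensorflow_addons implementation; all names below are made up for the example):

```python
# Illustrative sketch of the lazy-update idea behind LazyAdam.
# A sparse gradient touches only a few rows of an embedding-like variable;
# the lazy variant updates the m/v accumulators only for those rows,
# leaving every other row's accumulators and value untouched.

def lazy_adam_step(var, m, v, sparse_grad, t, lr=0.001,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam step only to the rows named in sparse_grad.

    var, m, v: dicts mapping row index -> float (a 1-D stand-in for rows).
    sparse_grad: dict mapping row index -> gradient for that row.
    t: 1-based step count, used for bias correction.
    """
    for i, g in sparse_grad.items():
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        m_hat = m[i] / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = v[i] / (1 - beta2 ** t)   # bias-corrected second moment
        var[i] -= lr * m_hat / (v_hat ** 0.5 + eps)
    # Rows absent from sparse_grad get no accumulator decay and no update.

# Toy usage: a 4-row variable where only rows 0 and 2 appear in the batch.
var = {i: 1.0 for i in range(4)}
m = {i: 0.0 for i in range(4)}
v = {i: 0.0 for i in range(4)}
lazy_adam_step(var, m, v, {0: 0.5, 2: -0.5}, t=1)
print(var)  # rows 1 and 3 keep their initial value of 1.0
```

Dense Adam would instead apply the moment decay to all four rows each step; skipping untouched rows is what buys the throughput gain, and is also why the semantics (and results) can differ slightly from standard Adam.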
84 | 105 | { |
85 | 106 | "metadata": { |
86 | 107 | "colab_type": "text", |
|
145 | 166 | }, |
146 | 167 | "cell_type": "code", |
147 | 168 | "source": [ |
148 | | - "model = tf.keras.Sequential()\n", |
149 | | - "model.add(tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'))\n", |
150 | | - "model.add(tf.keras.layers.Dense(64, activation='relu', name='dense_2'))\n", |
151 | | - "model.add(tf.keras.layers.Dense(10, activation='softmax', name='predictions'))" |
| 169 | + "model = tf.keras.Sequential([\n", |
| 170 | + " tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),\n", |
| 171 | + " tf.keras.layers.Dense(64, activation='relu', name='dense_2'),\n", |
| 172 | + " tf.keras.layers.Dense(10, activation='softmax', name='predictions'),\n", |
| 173 | + "])" |
152 | 174 | ], |
153 | 175 | "execution_count": 0, |
154 | 176 | "outputs": [] |
|
223 | 245 | "metadata": { |
224 | 246 | "id": "1Y--0tK69SXf", |
225 | 247 | "colab_type": "code", |
| 248 | + "outputId": "163a7751-e35b-4d9f-cc07-1f8580bdf6bf", |
226 | 249 | "colab": { |
227 | 250 | "base_uri": "https://localhost:8080/", |
228 | | - "height": 67 |
229 | | - }, |
230 | | - "outputId": "4a33a1c1-da98-4da9-b226-ee9af7b903d3" |
| 251 | + "height": 68 |
| 252 | + } |
231 | 253 | }, |
232 | 254 | "cell_type": "code", |
233 | 255 | "source": [ |
|
236 | 258 | "results = model.evaluate(x_test, y_test, batch_size=128)\n", |
237 | 259 | "print('Test loss = {0}, Test acc: {1}'.format(results[0], results[1]))" |
238 | 260 | ], |
239 | | - "execution_count": 19, |
| 261 | + "execution_count": 9, |
240 | 262 | "outputs": [ |
241 | 263 | { |
242 | 264 | "output_type": "stream", |
243 | 265 | "text": [ |
244 | 266 | "Evaluate on test data:\n", |
245 | | - "10000/10000 [==============================] - 0s 28us/sample - loss: 0.1149 - accuracy: 0.9762\n", |
246 | | - "Test loss = 0.11493133022264228, Test acc: 0.9761999845504761\n" |
| 267 | + "10000/10000 [==============================] - 0s 21us/sample - loss: 0.0884 - accuracy: 0.9752\n", |
| 268 | + "Test loss = 0.08840992146739736, Test acc: 0.9751999974250793\n" |
247 | 269 | ], |
248 | 270 | "name": "stdout" |
249 | 271 | } |
|