" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/addons/blob/master/tensorflow_addons/examples/notebooks/optimizers_lazyadam.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/addons/blob/master/tensorflow_addons/examples/notebooks/optimizers_lazyadam.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
"</table>"
]
},
{
"metadata": {
"colab_type": "text",
"id": "xHxb-dlhMIzW"
},
"cell_type": "markdown",
"source": [
"# Overview\n",
"\n",
"This notebook will demonstrate how to use the LazyAdam optimizer from the Addons package.\n"
]
},
{
"metadata": {
"id": "bQwBbFVAyHJ_",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"# LazyAdam\n",
"\n",
"> LazyAdam is a variant of the Adam optimizer that handles sparse updates moreefficiently.\n",
94
+
" The original Adam algorithm maintains two moving-average accumulators for\n",
95
+
" each trainable variable; the accumulators are updated at every step.\n",
96
+
" This class provides lazier handling of gradient updates for sparse\n",
97
+
" variables. It only updates moving-average accumulators for sparse variable\n",
98
+
" indices that appear in the current batch, rather than updating the\n",
99
+
" accumulators for all indices. Compared with the original Adam optimizer,\n",
100
+
" it can provide large improvements in model training throughput for some\n",
101
+
" applications. However, it provides slightly different semantics than the\n",
102
+
" original Adam algorithm, and may lead to different empirical results."
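"\n",
"As a minimal sketch of what this looks like in practice (assuming TensorFlow 2.x and the `tensorflow-addons` package are installed, and using a toy embedding model purely for illustration), `tfa.optimizers.LazyAdam` can be dropped in wherever `tf.keras.optimizers.Adam` would be used:\n",
"\n",
"```python\n",
"import tensorflow as tf\n",
"import tensorflow_addons as tfa\n",
"\n",
"# An embedding layer produces sparse gradients: each batch touches only a\n",
"# few rows, which is the case LazyAdam is designed to handle efficiently.\n",
"model = tf.keras.Sequential([\n",
"    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),\n",
"    tf.keras.layers.GlobalAveragePooling1D(),\n",
"    tf.keras.layers.Dense(1),\n",
"])\n",
"\n",
"# LazyAdam only updates the moving-average accumulators for the embedding\n",
"# rows that appear in the current batch.\n",
"model.compile(\n",
"    optimizer=tfa.optimizers.LazyAdam(learning_rate=1e-3),\n",
"    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),\n",
")\n",
"```"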