Skip to content

Conversation

ulf1
Copy link
Contributor

@ulf1 ulf1 commented Jan 8, 2020

Code moved from this PR tensorflow/tensorflow#35469

Copy link
Member

@qlzh727 qlzh727 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the code with the assumption of layernorm=True, and I will take a look again.

@ulf1
Copy link
Contributor Author

ulf1 commented Jan 9, 2020

The unit tests fail with this message

//tensorflow_addons/rnn:cell_test                                        FAILED in 2.8s
  /root/.cache/bazel/_bazel_root/e051f2f195071208ea7f081f88730f4f/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_addons/rnn/cell_test/test.log

Executed 61 out of 61 tests: 60 tests pass and 1 fails locally.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
+ exit 3
make: *** [unit-test] Error 3
makefile:29: recipe for target 'unit-test' failed

and

//tensorflow_addons/rnn:cell_test                                        FAILED in 3.7s
  /root/.cache/bazel/_bazel_root/e051f2f195071208ea7f081f88730f4f/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_addons/rnn/cell_test/test.log

Executed 61 out of 61 tests: 60 tests pass and 1 fails locally.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
INFO: Build completed, 1 test FAILED, 1522 total actions
+ exit 3
make: *** [gpu-unit-test] Error 3
makefile:32: recipe for target 'gpu-unit-test' failed

Is this error about this in BUILD?

py_test(
    name = "cell_test",
    size = "small",
    srcs = ["cell_test.py"],
    deps = [
        ":rnn",
    ],
)

@ulf1
Copy link
Contributor Author

ulf1 commented Jan 11, 2020

@seanpmorgan can you run kokoro:force-run again, please?

@ulf1
Copy link
Contributor Author

ulf1 commented Jan 11, 2020

Python2 kokoro tests (GPU, CPU) are still failing with

There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

In BUILD the timeout is currently at size = "large", what works for the Python3 tests in kokoro. Should I just delete all unit tests? Are the Python2 tests still relevant?

bias_regularizer: Regularizer function applied to the bias vector
(`use_bias=True`) or for the beta vector of the layer normalization
layer (`use_layernorm=True`). Default: `None`.
gamma_regularizer: Regularizer function applied to the gamma vector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where the gamma regularizer and constrant are useful or not. I chose to not expose those two in the layernorm lstm, so that we don't have a big number of params that confuse users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for training of the scaling parameter in LayerNormalization.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/layers/normalization.py
line 1033 - 1040

Why to set self.gamma_constraint?

  • When your scaling parameters shrinks towards zero (e.g., other weights increase, too unscaled inputs), or becomes too large
  • You know from experience with the problem, that the scaling parameter always was in a certain range

Why to set self.gamma_regularizer?

  • When your scaling parameter explodes (e.g., input values and hidden state values are too small to generate new hidden values that are big enough)

The gamma constraint and regularizer are hyperparameters a user could try when model training terminates with weird scaling parameters.

@ulf1 ulf1 requested a review from qlzh727 January 16, 2020 14:41
Copy link
Member

@qlzh727 qlzh727 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good. Thanks for the change.

epsilon=layernorm_epsilon,
center=False,
scale=True,
beta_initializer=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None seems to be a weird default. Should it be zeros?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if center=False then beta is is not used at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove all three inputs that are like beta_initializer=None?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I think let layernorm to use its default value should be fine.

Copy link
Member

@qlzh727 qlzh727 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply, and thanks for the contribution.

@qlzh727 qlzh727 merged commit cdb43ff into tensorflow:master Feb 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants