
Gradient accumulation optimizer #2260

@dathudeptrai

Description


Describe the feature and the current behavior/state.

Hi, I think it would be great if someone could add a gradient-accumulation optimizer to this repo. This feature is really helpful for anyone training a large model, such as BERT, on limited resources. The usage should be similar to tfa.optimizers.SWA:

opt = ...
accumulate_opt = tfa.optimizers.AccumulationOptimizer(opt, accumulate_steps=5)

There is an implementation of a gradient accumulator (link), but it is written for a custom training loop rather than Keras model.fit.
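
For reference, here is a minimal sketch of what that custom-training-loop approach looks like. Assumptions: an already-built Keras model, plus loss_fn, optimizer, and a dataset yielding (x, y) batches; all names here are illustrative and not part of any tfa API:

import tensorflow as tf

# Sum gradients from `accum_steps` micro-batches into non-trainable
# buffer variables, then apply them once and reset the buffers.
# Assumes `model` is already built so its variables exist.
accum_steps = 5
accum_grads = [tf.Variable(tf.zeros_like(v), trainable=False)
               for v in model.trainable_variables]

def train_step(x, y, apply_now):
    with tf.GradientTape() as tape:
        # Scale the loss so the accumulated gradient matches the
        # mean gradient of one large batch of accum_steps micro-batches.
        loss = loss_fn(y, model(x, training=True)) / accum_steps
    grads = tape.gradient(loss, model.trainable_variables)
    for buf, g in zip(accum_grads, grads):
        buf.assign_add(g)
    if apply_now:
        optimizer.apply_gradients(zip(accum_grads, model.trainable_variables))
        for buf in accum_grads:
            buf.assign(tf.zeros_like(buf))

for step, (x, y) in enumerate(dataset):
    train_step(x, y, apply_now=(step + 1) % accum_steps == 0)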

Relevant information

  • Are you willing to contribute it (yes/no): no
  • Are you willing to maintain it going forward? (yes/no): no
  • Is there a relevant academic paper? (if so, where):
  • Is there already an implementation in another framework? (if so, where): here (link), but only for a custom training loop.
  • Was it part of tf.contrib? (if so, where): no

Which API type would this fall under (layer, metric, optimizer, etc.)?
Optimizer.
Who will benefit from this feature?
All TensorFlow users.
Any other info.
