Bug in TransformerModel for word_language_model initialization #783

In `TransformerModel`'s `__init__` function we have

`self.decoder = nn.Linear(ninp, ntoken)`
and then
`self.init_weights()`

where

 
    def init_weights(self):
        initrange = 0.1
        nn.init.uniform_(self.encoder.weight, -initrange, initrange)
        nn.init.zeros_(self.decoder)
        nn.init.uniform_(self.decoder.weight, -initrange, initrange)


The `nn.init.zeros_(self.decoder)` line gives the error

> AttributeError: 'Linear' object has no attribute 'zero_'

As a workaround I simply commented out the `nn.init.zeros_(self.decoder)` line, but I don't know how much skipping the zero initialization affects the model's performance.
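For what it's worth, my guess is that the line was meant to zero the decoder's bias tensor rather than the `nn.Linear` module itself, since `nn.init.zeros_` expects a tensor. Below is a minimal sketch of that assumed fix. `TinyModel` is a hypothetical stand-in that contains only the two layers involved, and I am assuming `self.encoder` is an `nn.Embedding`; neither is taken from the actual example code.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for TransformerModel: only the encoder/decoder pair."""

    def __init__(self, ntoken: int, ninp: int):
        super().__init__()
        # Assumption: the encoder is an embedding layer, as in a typical language model.
        self.encoder = nn.Embedding(ntoken, ninp)
        self.decoder = nn.Linear(ninp, ntoken)
        self.init_weights()

    def init_weights(self):
        initrange = 0.1
        nn.init.uniform_(self.encoder.weight, -initrange, initrange)
        # Assumed intent: zero the bias tensor, not the Linear module.
        nn.init.zeros_(self.decoder.bias)
        nn.init.uniform_(self.decoder.weight, -initrange, initrange)


model = TinyModel(ntoken=10, ninp=4)
print(model.decoder.bias)
```

With the line commented out instead, `nn.Linear` keeps its default bias initialization, which draws small values from a uniform range scaled by `1 / sqrt(in_features)`, so the bias starts small but non-zero. I have not measured whether that difference matters for training here.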
