Conversation

@thomwolf (Member) commented Sep 4, 2019

Torch JIT (cf. #1010) and TF 2.0 (cf. #1104) are stricter than plain PyTorch in requiring a specific order of arguments for easy use.

This PR refactors the order of the keyword arguments to make it as natural as possible.

This will be a breaking change for anyone relying on positional order to pass keyword arguments to the models' forward pass, so it is delayed to the 2.0 release.
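
For illustration, a hypothetical call site (names model, input_ids, token_type_ids and input_mask are assumed; the signatures are taken from the diffs below) showing what breaks and what keeps working:

# Before this PR the BERT signature was roughly:
#     forward(self, input_ids, token_type_ids=None, attention_mask=None, ...)
# and after:
#     forward(self, input_ids, attention_mask=None, token_type_ids=None, ...)

# A positional call like this used to put token_type_ids second; after
# the reordering, the same tensor silently lands in attention_mask:
outputs = model(input_ids, token_type_ids, input_mask)

# Keyword calls are unaffected and work on both sides of the change:
outputs = model(input_ids, attention_mask=input_mask, token_type_ids=token_type_ids)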

@codecov-io

Codecov Report

Merging #1195 into master will decrease coverage by 0.41%.
The diff coverage is 86.59%.


@@            Coverage Diff             @@
##           master    #1195      +/-   ##
==========================================
- Coverage   80.83%   80.42%   -0.41%     
==========================================
  Files          46       46              
  Lines        7878     7892      +14     
==========================================
- Hits         6368     6347      -21     
- Misses       1510     1545      +35
Impacted Files Coverage Δ
pytorch_transformers/modeling_xlnet.py 78.83% <100%> (ø) ⬆️
pytorch_transformers/tests/modeling_bert_test.py 96.29% <100%> (ø) ⬆️
...rch_transformers/tests/modeling_distilbert_test.py 99.08% <100%> (ø) ⬆️
pytorch_transformers/modeling_distilbert.py 96.77% <100%> (ø) ⬆️
pytorch_transformers/modeling_xlm.py 87.08% <100%> (ø) ⬆️
pytorch_transformers/modeling_bert.py 88.03% <100%> (ø) ⬆️
...ytorch_transformers/tests/modeling_roberta_test.py 78.81% <100%> (ø) ⬆️
pytorch_transformers/modeling_transfo_xl.py 56.9% <42.85%> (-0.1%) ⬇️
pytorch_transformers/modeling_openai.py 81.08% <76.47%> (-0.88%) ⬇️
pytorch_transformers/modeling_gpt2.py 83.13% <76.47%> (-0.91%) ⬇️
... and 8 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0b52642...7fba47b.

@thomwolf changed the title Reordering arguments for torch jit #1010 and future TF2.0 compatibility → [2.0] Reordering arguments for torch jit #1010 and future TF2.0 compatibility Sep 4, 2019
@thomwolf requested a review from LysandreJik September 4, 2019 22:30
outputs = self.bert(input_ids, position_ids=position_ids, token_type_ids=token_type_ids,
                    attention_mask=attention_mask, head_mask=head_mask)

def forward(self, input_ids, attention_mask=None, token_type_ids=None,
            position_ids=None, head_mask=None, labels=None):

This seems to solve #1010
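
For reference, a minimal runnable sketch (a toy module, not the library's code) of why the order matters for tracing: torch.jit.trace binds its example inputs positionally, so with attention_mask promoted to the second parameter it can be supplied without a token_type_ids placeholder.

import torch

class TinyModel(torch.nn.Module):
    # Hypothetical stand-in mirroring the new argument order.
    def forward(self, input_ids, attention_mask=None):
        h = input_ids.float().unsqueeze(-1)
        if attention_mask is not None:
            h = h * attention_mask.float().unsqueeze(-1)
        return h

model = TinyModel().eval()
input_ids = torch.tensor([[101, 2009, 102, 0]])
attention_mask = torch.tensor([[1, 1, 1, 0]])

# trace() feeds the example tuple positionally: (input_ids, attention_mask)
traced = torch.jit.trace(model, (input_ids, attention_mask))
out = traced(input_ids, attention_mask)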

if attention_mask is not None:
    # Apply the attention mask
    w = w + attention_mask


Nice addition!
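
A quick numeric sketch of the additive-mask idiom (toy tensors, not from this diff; the -10000.0 value mirrors the convention used elsewhere in the library): the mask holds 0.0 for positions to keep and a large negative value for positions to ignore, so after the softmax the ignored positions get near-zero weight.

import torch

w = torch.randn(1, 1, 4, 4)                # raw attention scores
keep = torch.tensor([1.0, 1.0, 1.0, 0.0])  # 1 = real token, 0 = padding
attention_mask = (1.0 - keep) * -10000.0   # 0.0 where kept, -10000.0 where padded
w = w + attention_mask                     # the line added in this diff
probs = torch.softmax(w, dim=-1)           # padded column is ~0 after softmax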


if attention_mask is not None:
    # Apply the attention mask
    w = w + attention_mask

Nice addition here too

model.eval()
sequence_output, pooled_output = model(input_ids, token_type_ids, input_mask)
sequence_output, pooled_output = model(input_ids, token_type_ids)
sequence_output, pooled_output = model(input_ids, attention_mask=input_mask, token_type_ids=token_type_ids)

I feel like this is clearer than before! Looks great.

@LysandreJik (Member) left a comment

This seems to solve the problems related to JIT. I feel it's a great addition as the order of inputs now accurately depicts the order of importance.

@thomwolf (Member, Author) commented Sep 9, 2019

Ok merging
