- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)\] (see the sketch after this checklist).
- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone_test.py)\].
- [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source \[[Example](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)\].
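
To make the expected structure concrete, here is a minimal, hypothetical sketch of a backbone graph assembled from standard KerasHub layers. All names and sizes are placeholders, and a real submission should mirror the linked DistilBERT example, which wraps a graph like this in a backbone class:

```python
import keras
import keras_hub

# Placeholder hyperparameters, purely illustrative.
VOCAB_SIZE, SEQ_LEN, HIDDEN_DIM, NUM_LAYERS, NUM_HEADS = 30522, 512, 256, 4, 4

# Standard inputs for a text backbone: token ids plus a padding mask.
token_ids = keras.Input(shape=(None,), dtype="int32", name="token_ids")
padding_mask = keras.Input(shape=(None,), dtype="int32", name="padding_mask")

# Embedding section, followed by a stack of standard encoder layers.
x = keras_hub.layers.TokenAndPositionEmbedding(
    vocabulary_size=VOCAB_SIZE,
    sequence_length=SEQ_LEN,
    embedding_dim=HIDDEN_DIM,
)(token_ids)
for _ in range(NUM_LAYERS):
    x = keras_hub.layers.TransformerEncoder(
        intermediate_dim=4 * HIDDEN_DIM,
        num_heads=NUM_HEADS,
    )(x, padding_mask=padding_mask)

backbone = keras.Model(
    inputs={"token_ids": token_ids, "padding_mask": padding_mask},
    outputs=x,
)
```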
### Step 3: PR #2 - Add XXTokenizer
- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py)\].
- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer_test.py)\].
- [ ] A Colab notebook link in the PR description, demonstrating that the output of the tokenizer matches the output of the original tokenizer \[[Example](https://colab.research.google.com/drive/1MH_rpuFB1Nz_NkKIAvVtVae2HFLjXZDA?usp=sharing)\].
### Step 4: PR #3 - Add XX Presets
- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_presets.py)\] (a sketch of the file layout follows this checklist).
- [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is a reusable script for converting checkpoints \[[Example](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)\].
- [ ] A Colab notebook link in the PR description, showing an end-to-end task such as text classification. The task model can be built using the backbone model, with the task head on top \[[Example](https://gist.github.com/mattdangerw/bf0ca07fb66b6738150c8b56ee5bab4e)\].
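
To give a feel for `xx/xx_presets.py`, here is a hypothetical sketch. Every field name and URL below is a placeholder assumption, so mirror an existing KerasHub presets file rather than copying this layout:

```python
# Hypothetical preset registry for an "Xx" model; all keys are placeholders.
backbone_presets = {
    "xx_base_en": {
        "metadata": {
            "description": "Base Xx model trained on an English corpus.",
            "params": 66_000_000,  # placeholder parameter count
        },
        # During review, weights/vocab can live in a personal GCP bucket or
        # Google Drive; the team re-hosts them after the PR is merged.
        "weights_url": "https://storage.googleapis.com/<your-bucket>/xx_base_en.h5",
        "vocabulary_url": "https://storage.googleapis.com/<your-bucket>/vocab.txt",
    },
}
```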
### Step 5: PR #4 and Beyond - Add XX Tasks and Preprocessors
This PR is optional.
- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier.py)\] (see the classifier sketch after this checklist).
- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor.py)\].
- [ ] `xx/xx_<task>_test.py` and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor_test.py)\].
- [ ] A Colab notebook link in the PR description, demonstrating that the output of the preprocessor matches the output of the original preprocessor \[[Example](https://colab.research.google.com/drive/1GFFC7Y1I_2PtYlWDToqKvzYhHWv1b3nC?usp=sharing)\].
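
For orientation, here is a minimal, hypothetical sketch of a task model. Real task models subclass the KerasHub task base class and bundle a matching preprocessor, so treat the `Xx` names and the pooling choice as placeholders:

```python
import keras

# A hypothetical `xx/xx_classifier.py` sketch: pool the backbone's sequence
# output and attach a classification head on top.
class XxClassifier(keras.Model):
    def __init__(self, backbone, num_classes, dropout=0.1, **kwargs):
        inputs = backbone.input
        x = backbone(inputs)
        # BERT-style pooling: take the hidden state at the [CLS] position.
        x = x[:, 0, :]
        x = keras.layers.Dropout(dropout)(x)
        outputs = keras.layers.Dense(num_classes)(x)
        super().__init__(inputs=inputs, outputs=outputs, **kwargs)
        self.backbone = backbone
```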
## Detailed Instructions
A model is typically split into three/four sections. We would recommend you to
compare your implementation side-by-side with the [DistilBERT backbone](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py).

**Encoder layers**

Standard layers used: `keras_hub.layers.TransformerEncoder`, `keras_hub.layers.FNetEncoder`.

**Decoder layers (possibly)**

Standard layers used: `keras_hub.layers.TransformerDecoder`.

**Other layers which might be used**

`keras.layers.LayerNormalization`, `keras.layers.Dropout`, `keras.layers.Conv1D`, etc.
<br/>
The standard layers provided in Keras and KerasHub are generally enough for
most use cases, and it is recommended to do a thorough search
[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_hub/layers/)
before writing anything custom. However, sometimes, models have small tweaks or
paradigm changes in their architecture. This is when things might get slightly
complicated.

If the model introduces a paradigm shift, such as using relative attention instead
of vanilla attention, the contributor will have to implement complete custom layers. A case
in point is `keras_hub.models.DebertaV3Backbone`, where we had to [implement layers
from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_hub/models/deberta_v3).

On the other hand, if the model has a small tweak, something simpler can be done.
For instance, in the Whisper model, the self-attention and cross-attention mechanism
is essentially the same as vanilla attention, except that the key projection layer
does not use a bias term; in such cases, we can subclass the closest standard layer
and modify only the part that differs.
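
As a hedged illustration of that subclass-and-tweak pattern (not the actual Whisper implementation, which lives in the KerasHub source), one might pin a constructor argument in a tiny subclass of a standard layer:

```python
import keras

# Illustrative only: a MultiHeadAttention variant whose projection layers
# are built without bias terms. The real Whisper layers differ; this just
# shows how little code a "small tweak" can need.
class NoBiasMultiHeadAttention(keras.layers.MultiHeadAttention):
    def __init__(self, num_heads, key_dim, **kwargs):
        kwargs["use_bias"] = False  # the single tweak
        super().__init__(num_heads=num_heads, key_dim=key_dim, **kwargs)
```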
#### Tokenizer
Most text models nowadays use subword tokenizers such as WordPiece, SentencePiece
and BPE. Since KerasHub has implementations of most of the popular
subword tokenizers, the model tokenizer layer typically inherits from a base
tokenizer class.

For example, DistilBERT uses the WordPiece tokenizer. So, we can introduce a new
class, `DistilBertTokenizer`, which inherits from `keras_hub.tokenizers.WordPieceTokenizer`.
All the actual tokenization will be taken care of by the superclass.

The important thing here is adding "special tokens". Most models have
special tokens such as beginning-of-sequence token, end-of-sequence token,
mask token, pad token, etc. These have to be
[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
to the tokenizer class. These member attributes are then accessed by the
preprocessor layers.
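
As a hedged sketch of this pattern (placeholder token strings and a simplified constructor; the real DistilBERT tokenizer also validates its inputs), it looks roughly like:

```python
import keras_hub

# Hypothetical "Xx" tokenizer: inherit a standard subword tokenizer and
# expose the special tokens as member attributes.
class XxTokenizer(keras_hub.tokenizers.WordPieceTokenizer):
    def __init__(self, vocabulary, lowercase=False, **kwargs):
        super().__init__(vocabulary=vocabulary, lowercase=lowercase, **kwargs)

        # Special tokens, looked up once so preprocessors can read the ids.
        # This assumes the vocabulary actually contains these tokens.
        self.cls_token = "[CLS]"
        self.sep_token = "[SEP]"
        self.pad_token = "[PAD]"
        self.cls_token_id = self.token_to_id(self.cls_token)
        self.sep_token_id = self.token_to_id(self.sep_token)
        self.pad_token_id = self.token_to_id(self.pad_token)
```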
For a full list of the tokenizers KerasHub offers, please visit
[this link](https://keras.io/api/keras_hub/tokenizers/) and make use of the
tokenizer your model uses!
#### Unit Tests
After wrapping up the preset configuration file, you need to
add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
`DistilBertTokenizer`, and `DistilBertPreprocessor`.

The testing for presets is divided into two: "large" and "extra large".
For "large" tests, we pick the smallest preset (in terms of number of parameters)
and verify it thoroughly, while the "extra large" tests loop over every preset.
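
For instance, preset tests are typically gated behind pytest markers; the marker names below follow the keras-nlp convention and may differ in the current repository:

```python
import pytest

@pytest.mark.large
def test_smallest_preset():
    # Load the smallest preset and check its outputs carefully.
    ...

@pytest.mark.extra_large
def test_all_presets():
    # Loop over every preset; verify each loads and runs a forward pass.
    ...
```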
#### Preprocessor

The preprocessor layer converts raw inputs into the dictionary of tensors expected
by the model. The preprocessor class might have a few intricacies depending on the model. For example,
the DeBERTaV3 tokenizer does not have the `[MASK]` token in the provided sentencepiece
proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
a separate preprocessor class for every task. This is because different tasks
might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
for masked language modeling (MLM) for DistilBERT.
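
Here is a simplified, hypothetical sketch of the pattern; real preprocessors inherit from a KerasHub preprocessor base class and also handle truncation, sentence pairs, and serialization:

```python
import keras
import keras_hub

# Hypothetical "Xx" preprocessor: tokenize, add [CLS]/[SEP], pad, and
# return the dictionary of tensors the backbone expects.
class XxPreprocessor(keras.layers.Layer):
    def __init__(self, tokenizer, sequence_length=512, **kwargs):
        super().__init__(**kwargs)
        self.tokenizer = tokenizer
        # StartEndPacker adds the start/end tokens and pads to a fixed length.
        self.packer = keras_hub.layers.StartEndPacker(
            start_value=tokenizer.cls_token_id,
            end_value=tokenizer.sep_token_id,
            pad_value=tokenizer.pad_token_id,
            sequence_length=sequence_length,
        )

    def call(self, x):
        token_ids = self.packer(self.tokenizer(x))
        return {
            "token_ids": token_ids,
            "padding_mask": token_ids != self.tokenizer.pad_token_id,
        }
```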
## Conclusion
Once all three PRs (and optionally, the fourth PR) have been merged, you have
successfully contributed a model to KerasHub. Congratulations! 🔥