diff --git a/README.md b/README.md
index 53166456a3..bd03704c5f 100644
--- a/README.md
+++ b/README.md
@@ -248,7 +248,7 @@ Add a set of new very well trained ResNet & ResNet-V2 18/34 (basic block) weight
 ### April 11, 2024
 * Prepping for a long overdue 1.0 release, things have been stable for a while now.
 * Significant feature that's been missing for a while, `features_only=True` support for ViT models with flat hidden states or non-std module layouts (so far covering `'vit_*', 'twins_*', 'deit*', 'beit*', 'mvitv2*', 'eva*', 'samvit_*', 'flexivit*'`)
-* Above feature support achieved through a new `forward_intermediates()` API that can be used with a feature wrapping module or direclty.
+* Above feature support achieved through a new `forward_intermediates()` API that can be used with a feature wrapping module or directly.
 ```python
 model = timm.create_model('vit_base_patch16_224')
 final_feat, intermediates = model.forward_intermediates(input)
@@ -486,7 +486,7 @@ Included optimizers available via `timm.optim.create_optimizer_v2` factory metho
 * `madgrad` an implementation of MADGRAD adapted from https://github.com/facebookresearch/madgrad - https://arxiv.org/abs/2101.11075
 * `mars` MARS optimizer from https://github.com/AGI-Arena/MARS - https://arxiv.org/abs/2411.10438
 * `nadam` an implementation of Adam w/ Nesterov momentum
-* `nadamw` an impementation of AdamW (Adam w/ decoupled weight-decay) w/ Nesterov momentum. A simplified impl based on https://github.com/mlcommons/algorithmic-efficiency
+* `nadamw` an implementation of AdamW (Adam w/ decoupled weight-decay) w/ Nesterov momentum. A simplified impl based on https://github.com/mlcommons/algorithmic-efficiency
 * `novograd` by [Masashi Kimura](https://github.com/convergence-lab/novograd) - https://arxiv.org/abs/1905.11286
 * `radam` by [Liyuan Liu](https://github.com/LiyuanLucasLiu/RAdam) - https://arxiv.org/abs/1908.03265
 * `rmsprop_tf` adapted from PyTorch RMSProp by myself. Reproduces much improved Tensorflow RMSProp behaviour
diff --git a/UPGRADING.md b/UPGRADING.md
index e177c7c9b3..87c86f2915 100644
--- a/UPGRADING.md
+++ b/UPGRADING.md
@@ -1,10 +1,10 @@
 # Upgrading from previous versions
 
-I generally try to maintain code interface and especially model weight compability across many `timm` versions. Sometimes there are exceptions.
+I generally try to maintain code interface and especially model weight compatibility across many `timm` versions. Sometimes there are exceptions.
 
 ## Checkpoint remapping
 
-Pretrained weight remapping is handled by `checkpoint_filter_fn` in a model implementation module. This remaps old pretrained checkpoints to new, and also 3rd party (original) checkpoints to `timm` format if the model was modified when brough into `timm`.
+Pretrained weight remapping is handled by `checkpoint_filter_fn` in a model implementation module. This remaps old pretrained checkpoints to new, and also 3rd party (original) checkpoints to `timm` format if the model was modified when brought into `timm`.
 
 The `checkpoint_filter_fn` is automatically called when loading pretrained weights via `pretrained=True`, but they can be called manually if you call the fn directly with the current model instance and old state dict.
 
@@ -19,6 +19,6 @@ Many changes were made since the 0.6.x stable releases. They were previewed in 0
 * The pretrained_tag is the specific weight variant (different head) for the architecture.
 * Using only `architecture` defaults to the first weights in the default_cfgs for that model architecture.
 * In adding pretrained tags, many model names that existed to differentiate were renamed to use the tag (ex: `vit_base_patch16_224_in21k` -> `vit_base_patch16_224.augreg_in21k`). There are deprecation mappings for these.
-* A number of models had their checkpoints remaped to match architecture changes needed to better support `features_only=True`, there are `checkpoint_filter_fn` methods in any model module that was remapped. These can be passed to `timm.models.load_checkpoint(..., filter_fn=timm.models.swin_transformer_v2.checkpoint_filter_fn)` to remap your existing checkpoint.
+* A number of models had their checkpoints remapped to match architecture changes needed to better support `features_only=True`; there are `checkpoint_filter_fn` methods in any model module that was remapped. These can be passed to `timm.models.load_checkpoint(..., filter_fn=timm.models.swin_transformer_v2.checkpoint_filter_fn)` to remap your existing checkpoint.
 * The Hugging Face Hub (https://huggingface.co/timm) is now the primary source for `timm` weights. Model cards include link to papers, original source, license.
 * Previous 0.6.x can be cloned from [0.6.x](https://github.com/rwightman/pytorch-image-models/tree/0.6.x) branch or installed via pip with version.
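
For anyone trying the `forward_intermediates()` API touched by the README hunk above, a minimal sketch of both call styles follows. The input shape and the `features_only=True` wrapper usage are assumptions drawn from the surrounding README context, not part of this patch.

```python
import torch
import timm

# Direct use of the new API, following the README snippet above.
model = timm.create_model('vit_base_patch16_224')
x = torch.randn(2, 3, 224, 224)  # assumed input size for this model variant
final_feat, intermediates = model.forward_intermediates(x)

# `intermediates` holds per-block feature tensors; the exact count and
# layout depend on the model, so inspect rather than assume.
for feat in intermediates:
    print(feat.shape)

# The same ViT models can now also go through the feature wrapping
# module route via `features_only=True`.
feature_model = timm.create_model('vit_base_patch16_224', features_only=True)
features = feature_model(x)  # list of feature maps
```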
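Likewise, a hedged sketch of the checkpoint remapping flow the UPGRADING.md hunk describes, using the `swin_transformer_v2` filter named in the patched text; the model variant and checkpoint path below are placeholders.

```python
import timm
from timm.models import load_checkpoint
from timm.models.swin_transformer_v2 import checkpoint_filter_fn

# Placeholder variant name and path; substitute your own.
model = timm.create_model('swinv2_tiny_window8_256')
load_checkpoint(model, 'old_checkpoint.pth', filter_fn=checkpoint_filter_fn)

# Or call the filter fn directly on an old state dict, per UPGRADING.md:
# new_state_dict = checkpoint_filter_fn(old_state_dict, model)
```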