@@ -41,6 +41,7 @@ architectures for image classification:
 - `EfficientNet`_
 - `RegNet`_
 - `VisionTransformer`_
+- `ConvNeXt`_
 
 You can construct a model with random weights by calling its constructor:
 
@@ -88,7 +89,6 @@ You can construct a model with random weights by calling its constructor:
     vit_b_32 = models.vit_b_32()
     vit_l_16 = models.vit_l_16()
     vit_l_32 = models.vit_l_32()
-    vit_h_14 = models.vit_h_14()
 
 We provide pre-trained models, using the PyTorch :mod:`torch.utils.model_zoo`.
 These can be constructed by passing ``pretrained=True``:
@@ -248,6 +248,7 @@ vit_b_16 81.072 95.318
 vit_b_32                         75.912        92.466
 vit_l_16                         79.662        94.638
 vit_l_32                         76.972        93.070
+convnext_tiny (prototype)        82.520        96.146
 ================================ ============= =============
 
 
@@ -266,6 +267,7 @@ vit_l_32 76.972 93.070
 .. _EfficientNet: https://arxiv.org/abs/1905.11946
 .. _RegNet: https://arxiv.org/abs/2003.13678
 .. _VisionTransformer: https://arxiv.org/abs/2010.11929
+.. _ConvNeXt: https://arxiv.org/abs/2201.03545
 
 .. currentmodule:: torchvision.models
 
@@ -461,7 +463,6 @@ VisionTransformer
     vit_b_32
     vit_l_16
     vit_l_32
-    vit_h_14
 
 Quantized Models
 ----------------
@@ -594,6 +595,7 @@ The models subpackage contains definitions for the following model
 architectures for detection:
 
 - `Faster R-CNN <https://arxiv.org/abs/1506.01497>`_
+- `FCOS <https://arxiv.org/abs/1904.01355>`_
 - `Mask R-CNN <https://arxiv.org/abs/1703.06870>`_
 - `RetinaNet <https://arxiv.org/abs/1708.02002>`_
 - `SSD <https://arxiv.org/abs/1512.02325>`_
@@ -639,6 +641,7 @@ Network box AP mask AP keypoint AP
 Faster R-CNN ResNet-50 FPN              37.0    -       -
 Faster R-CNN MobileNetV3-Large FPN      32.8    -       -
 Faster R-CNN MobileNetV3-Large 320 FPN  22.8    -       -
+FCOS ResNet-50 FPN                      39.2    -       -
 RetinaNet ResNet-50 FPN                 36.4    -       -
 SSD300 VGG16                            25.1    -       -
 SSDlite320 MobileNetV3-Large            21.3    -       -
@@ -699,6 +702,7 @@ Network train time (s / it) test time (s / it)
 Faster R-CNN ResNet-50 FPN              0.2288    0.0590    5.2
 Faster R-CNN MobileNetV3-Large FPN      0.1020    0.0415    1.0
 Faster R-CNN MobileNetV3-Large 320 FPN  0.0978    0.0376    0.6
+FCOS ResNet-50 FPN                      0.1450    0.0539    3.3
 RetinaNet ResNet-50 FPN                 0.2514    0.0939    4.1
 SSD300 VGG16                            0.2093    0.0744    1.5
 SSDlite320 MobileNetV3-Large            0.1773    0.0906    1.5
@@ -718,6 +722,15 @@ Faster R-CNN
     torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn
     torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn
 
+FCOS
+----
+
+.. autosummary::
+    :toctree: generated/
+    :template: function.rst
+
+    torchvision.models.detection.fcos_resnet50_fpn
+
 
 RetinaNet
 ---------
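
To put the new FCOS timing row in context, the per-iteration times can be inverted into throughput and compared against the neighbouring rows. The sketch below is illustrative only: the figures are copied from the train/test time columns of the timing hunk above, while `TIMINGS`, `iterations_per_second`, and `speedup` are hypothetical names introduced here and are not part of torchvision.

```python
# Train/test timings in seconds per iteration, copied from the rows of the
# timing table touched by this diff. The dictionary name is hypothetical.
TIMINGS = {
    "Faster R-CNN ResNet-50 FPN": (0.2288, 0.0590),
    "FCOS ResNet-50 FPN": (0.1450, 0.0539),
    "RetinaNet ResNet-50 FPN": (0.2514, 0.0939),
}


def iterations_per_second(seconds_per_iteration: float) -> float:
    """Invert a seconds-per-iteration timing into iterations per second."""
    return 1.0 / seconds_per_iteration


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline` (both in s / it)."""
    return baseline / candidate


fcos_train, fcos_test = TIMINGS["FCOS ResNet-50 FPN"]

for name, (train, test) in TIMINGS.items():
    print(
        f"{name}: "
        f"{iterations_per_second(train):.2f} train it/s, "
        f"{iterations_per_second(test):.2f} test it/s, "
        f"FCOS train speedup x{speedup(train, fcos_train):.2f}, "
        f"FCOS test speedup x{speedup(test, fcos_test):.2f}"
    )
```

A ratio above 1 means FCOS needs less time per iteration than the row it is compared with. These numbers only restate the table and say nothing about accuracy or memory use.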