
Conversation

@fengyuentau (Member) commented Nov 23, 2023

opencv/opencv#24476

Script to generate the vit_b_32 ONNX model:

import torch # version: 1.13.1
import torchvision # version: 0.14.1

dummy_input = torch.randn(1, 3, 224, 224, device="cpu")
input_names = ["input"]
output_names = ["output"]

opset = 11

# Export the pretrained ViT-B/32 to ONNX
model = torchvision.models.vit_b_32(weights=torchvision.models.ViT_B_32_Weights.DEFAULT)
torch.onnx.export(model, dummy_input, "vit_b_32.opset{}.onnx".format(opset), verbose=False, export_params=True,
                  opset_version=opset, input_names=input_names, output_names=output_names)

from onnxsim import simplify # version: 0.4.33
import onnx # version: 1.14.1

# Simplify the exported graph and save the final model
model_sim, check = simplify("vit_b_32.opset{}.onnx".format(opset))
assert check, "Simplified model could not be validated"
onnx.save(model_sim, "vit_b_32.onnx")
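
Not part of the original script, but a quick hedged check that the simplified model loads and runs in OpenCV DNN (assumes an OpenCV 4.x build recent enough to handle the exported ViT graph):

```python
import cv2
import numpy as np

net = cv2.dnn.readNet("vit_b_32.onnx")
blob = np.random.randn(1, 3, 224, 224).astype(np.float32)
net.setInput(blob)
out = net.forward()
print(out.shape)  # expected: (1, 1000), ImageNet logits
```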

@fengyuentau added the DNN (dnn related tests and data) label Nov 23, 2023
@dkurt (Member) commented Nov 24, 2023

Can you please specify the source of the test data? Or are the changes to the Python generation script missing?

@fengyuentau (Member, Author) commented:

Oh, yes, sure, I forgot that. Will do later.

@fengyuentau (Member, Author) commented:

The code has been added.

@dkurt (Member) commented Dec 15, 2023

@fengyuentau, please squash all the commits into one to make total patch size smaller

@fengyuentau (Member, Author) commented:

> @fengyuentau, please squash all the commits into one to make total patch size smaller

Done.

@asmorkalov merged commit 800a8c4 into opencv:4.x Dec 20, 2023
asmorkalov pushed a commit to opencv/opencv that referenced this pull request Dec 20, 2023
dnn: add attention layer #24476

Resolves #24609

Merge with: opencv/opencv_extra#1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.
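
For reference, a hedged numpy sketch of what a fused attention layer computes under that spec: a packed QKV projection followed by per-head scaled dot-product attention with a softmax over the keys. Names and shapes (`x`, `w_qkv`, `b_qkv`, `num_heads`) are illustrative, not OpenCV internals:

```python
import numpy as np

def attention(x, w_qkv, b_qkv, num_heads):
    # x: (batch, seq, hidden); w_qkv: (hidden, 3 * hidden); b_qkv: (3 * hidden,)
    batch, seq, hidden = x.shape
    head_dim = hidden // num_heads
    qkv = x @ w_qkv + b_qkv                      # packed Q, K, V projection
    q, k, v = np.split(qkv, 3, axis=-1)
    def heads(t):
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        return t.reshape(batch, seq, num_heads, head_dim).transpose(0, 2, 1, 3)
    q, k, v = heads(q), heads(k), heads(v)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)   # softmax over keys
    out = probs @ v                              # (batch, heads, seq, head_dim)
    return out.transpose(0, 2, 1, 3).reshape(batch, seq, hidden)
```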

TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: take care of Slice with end=INT64_MAX (see the sketch after this list).
- [x] Layer fusion: match more potential attention (ViT) patterns.
    - [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add accuracy tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack
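
On the Slice item above: ONNX exporters commonly emit Slice with end = INT64_MAX to mean "to the end of the axis", so a fusion pattern matcher has to treat that sentinel as the concrete dimension size. A minimal sketch of that normalization (names are illustrative, not OpenCV internals):

```python
INT64_MAX = 2**63 - 1

def normalize_slice_end(end, dim_size):
    # Clamp the exporter's "unbounded" sentinel so that
    # Slice(start=0, end=INT64_MAX) matches Slice(start=0, end=dim_size).
    return dim_size if end >= INT64_MAX else min(end, dim_size)
```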

## Benchmarks

Platform: MacBook Air M1.

### Attention Subgraph

Input shape: [1, 197, 768].

|                        | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75      | 3.68        | 3.22     |
| w/o Attention          | 9.06      | 9.01        | 8.24     |
| ORT (python)           | 4.32      | 2.63        | 2.50     |

### ViTs

All data in milliseconds (ms).

| ViTs     | With Attention | Without Attention | ORT    |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77         | 365.35            | 109.70 |
| vit_b_32 | 89.92          | 116.22            | 30.36  |
| vit_l_16 | 1593.32        | 1730.74           | 419.92 |
| vit_l_32 | 468.11         | 577.41            | 134.12 |
| VitTrack | 3.80           | 3.87              | 2.25   |
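
As a rough guide to reproducing timings like these, a minimal sketch (assuming a local `vit_b_32.onnx` and an OpenCV build that includes this PR) using OpenCV's TickMeter:

```python
import cv2
import numpy as np

net = cv2.dnn.readNet("vit_b_32.onnx")
blob = np.random.randn(1, 3, 224, 224).astype(np.float32)

net.setInput(blob)
net.forward()                        # warmup run, excluded from timing

tm = cv2.TickMeter()
for _ in range(10):                  # timed forward passes
    net.setInput(blob)
    tm.start()
    net.forward()
    tm.stop()
print("mean forward time: {:.2f} ms".format(tm.getAvgTimeMilli()))
```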

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
@fengyuentau deleted the attention_layer branch December 3, 2024 03:51