Open
Labels: P3 (Doc bugs, questions, minor issues, etc.), perf (Performance and Benchmarking related), up-for-grabs (A good issue to fix if you are trying to contribute to the project)
Description
Our profiles show that AddScaleSU, which should be faster with AVX, is actually slower than both the SSE and native implementations. This can be confirmed by running our CpuMathBenchmarks:
..\..\Tools\dotnetcli\dotnet.exe run -c Release-Intrinsics -- -f *.AddScaleSU --join
| Type | Method | Mean |
|---|---|---|
| AvxPerformanceTests | AddScaleSU | 4.012 ms |
| NativePerformanceTests | AddScaleSU | 2.966 ms |
| SsePerformanceTests | AddScaleSU | 2.916 ms |
This issue was spotted by @eerhardt in August: #691 (comment)
@helloguo suggested in #691 (comment) that the GatherVector256 intrinsic should be used.
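
For anyone picking this up, here is a rough sketch of what a gather-based loop might look like. It is not the existing CpuMath code. It assumes AddScaleSU computes `dst[indices[i]] += scale * src[i]`, that indices are in range, and that they are distinct within each group of 8. The class and method names are hypothetical. AVX2 has a gather but no scatter instruction, so the write-back stays scalar, and whether this actually beats the SSE path would need to be measured with the benchmark above. It requires unsafe code to be enabled.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper, for illustration only.
public static class AddScaleSUSketch
{
    // Scalar reference (assumed semantics): dst[indices[i]] += scale * src[i].
    public static void Scalar(float scale, float[] src, int[] indices, float[] dst, int count)
    {
        for (int i = 0; i < count; i++)
            dst[indices[i]] += scale * src[i];
    }

    // AVX2 variant. Assumes indices are in range and distinct within each group of 8.
    public static unsafe void Gather(float scale, float[] src, int[] indices, float[] dst, int count)
    {
        if (!Avx2.IsSupported)
        {
            // Fall back when the hardware (or runtime) lacks AVX2.
            Scalar(scale, src, indices, dst, count);
            return;
        }

        fixed (float* pSrc = src)
        fixed (int* pIdx = indices)
        fixed (float* pDst = dst)
        {
            Vector256<float> scaleVec = Vector256.Create(scale);
            float* lanes = stackalloc float[8];
            int i = 0;

            for (; i <= count - 8; i += 8)
            {
                // Load 8 indices, then gather the 8 matching dst elements (4-byte stride).
                Vector256<int> idx = Avx.LoadVector256(pIdx + i);
                Vector256<float> d = Avx2.GatherVector256(pDst, idx, 4);
                Vector256<float> s = Avx.LoadVector256(pSrc + i);
                Vector256<float> r = Avx.Add(d, Avx.Multiply(scaleVec, s));

                // No scatter in AVX2: spill the result and write back lane by lane.
                Avx.Store(lanes, r);
                for (int j = 0; j < 8; j++)
                    pDst[pIdx[i + j]] = lanes[j];
            }

            // Scalar tail for the remaining elements.
            for (; i < count; i++)
                pDst[pIdx[i]] += scale * pSrc[i];
        }
    }
}
```

Calling `AddScaleSUSketch.Gather(scale, src, indices, dst, count)` should give the same result as `Scalar` whenever the assumptions above hold. If indices can repeat inside a group of 8, the vector path would lose updates and the scalar loop should be used instead.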