QFT CPU benchmark performance
ARCHER2 single node performance
Previous performance (v3.5.0), QFT time:
| Qubits | Nodes | Standard | High freq | High freq % of standard |
|---|---|---|---|---|
| 33 | 1 | 267.69 | 256.35 | 95.76339884 |
Current performance v4.1.0:
Final state:
QFT run time: 819.105s
Total run time: 1021.17
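Compare the QFT run times: 819.105 s in v4.1.0 against 267.69 in v3.5.0 (standard frequency), which is roughly a 3x slowdown.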
Function that dominates the run in v4.1.0:
QuEST/quest/src/cpu/cpu_subroutines.cpp, line 582 in 4d44ec8
`void cpu_statevec_anyCtrlOneTargDiagMatr_sub(Qureg qureg, vector<int> ctrls, vector<int> ctrlStates, int targ, DiagMatr1 matr) {`
Function that dominates the run in v3.5.0:
QuEST/QuEST/src/CPU/QuEST_cpu.c, line 3208 in 23a0a08
`void statevec_controlledPhaseShift (Qureg qureg, int idQubit1, int idQubit2, qreal angle)`
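For readers unfamiliar with the operation both functions apply during the QFT, here is a minimal sketch of a controlled phase shift on a state vector. It is not QuEST's code from either version: the name `controlledPhaseShiftSketch`, the `Amp` alias, and the choice of a plain `schedule(static)` OpenMP loop are all assumptions made for illustration. The gate is diagonal, so each amplitude is touched independently and the loop should be limited by memory bandwidth rather than by synchronisation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstdint>
#include <vector>

using Amp = std::complex<double>;

// Hypothetical helper, not QuEST's implementation: multiplies by e^{i*angle}
// every amplitude whose basis-state index has both bit q1 and bit q2 set.
void controlledPhaseShiftSketch(std::vector<Amp>& amps, int q1, int q2, double angle) {
    assert(q1 >= 0 && q2 >= 0 && q1 != q2);

    const std::uint64_t mask = (std::uint64_t{1} << q1) | (std::uint64_t{1} << q2);
    const Amp phase(std::cos(angle), std::sin(angle));
    const std::int64_t n = static_cast<std::int64_t>(amps.size());

    // Each iteration writes a distinct element, so no synchronisation is needed.
    // The pragma has no effect when the file is compiled without OpenMP support.
    #pragma omp parallel for schedule(static)
    for (std::int64_t i = 0; i < n; ++i) {
        if ((static_cast<std::uint64_t>(i) & mask) == mask) {
            amps[i] *= phase;
        }
    }
}
```

Timing a loop of this shape on the same node could give a rough baseline to compare against the two library routines, under the assumption that thread count and pinning match the benchmark runs.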
I suspect this is due to changes in some of the OpenMP directives between the old and new versions. I have profiler results that I can share directly, but the main features of the profile are that voluntary context switches have increased by a factor of 3 and the percentage of time spent in CPU floating-point instructions has dropped by a factor of 3.
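The context-switch observation can be cross-checked without a profiler. The sketch below assumes a Linux system, where `getrusage` fills in the `ru_nvcsw` and `ru_nivcsw` fields. The struct `ContextSwitches` and the helpers `readContextSwitches` and `voluntarySwitchesDuring` are hypothetical names introduced here, not part of QuEST.

```cpp
#include <sys/resource.h>

// Hypothetical container for the two counters reported by getrusage.
struct ContextSwitches {
    long voluntary;
    long involuntary;
};

// Reads the cumulative context-switch counts for the calling process
// (all threads), as reported by the kernel.
ContextSwitches readContextSwitches() {
    struct rusage usage {};
    getrusage(RUSAGE_SELF, &usage);
    return {usage.ru_nvcsw, usage.ru_nivcsw};
}

// Runs `work` and returns how many voluntary context switches
// the process accumulated while it ran.
template <typename Work>
long voluntarySwitchesDuring(Work&& work) {
    const ContextSwitches before = readContextSwitches();
    work();
    const ContextSwitches after = readContextSwitches();
    return after.voluntary - before.voluntary;
}
```

Wrapping the QFT call in `voluntarySwitchesDuring` for both versions should show whether the factor-of-3 increase reproduces outside the profiler. A high voluntary count usually means threads are blocking or sleeping, for example at barriers, which would be consistent with the OpenMP suspicion but does not prove it.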