Closed
Description
Hi, I have found another potential performance issue, though not as severe as #5402.
I'm computing a full SVD with dgesdd on a moderately sized matrix while increasing the number of rows (n = 1 to 299 rows, m = 1000 columns).
For convenience I wrote the test case in Python, but it closely follows what I observe from C++ when calling dgesdd directly through the C/Fortran interface.
import numpy as np
rng = np.random.default_rng()
m = 1000
for n in range(1, 300):
    print(n, flush=True)
    a = rng.normal(size=(n, m))
    U, S, Vh = np.linalg.svd(a, full_matrices=True)
On my machine (Intel Xeon) it takes ~1 min with the default number of threads (40), but only ~35 s with OMP_NUM_THREADS=1.
This is Fedora rawhide again, with OpenBLAS version 0.3.29.
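
To make the comparison easier to reproduce, here is a minimal timing sketch of the same loop. It is not part of the original test case: the helper name `time_svd` and the fixed seed are my own additions for illustration, and it only times the `np.linalg.svd` call itself, not the random matrix generation.

```python
import time

import numpy as np


def time_svd(n_values, m=1000, seed=0):
    """Return {n: seconds} for a full SVD of an (n, m) random normal matrix.

    Hypothetical helper for illustration; the seed is fixed so that runs
    with different thread settings factor the same matrices.
    """
    rng = np.random.default_rng(seed)
    timings = {}
    for n in n_values:
        a = rng.normal(size=(n, m))
        start = time.perf_counter()
        np.linalg.svd(a, full_matrices=True)
        timings[n] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    timings = time_svd(range(1, 300))
    slowest = max(timings, key=timings.get)
    print(f"total SVD time: {sum(timings.values()):.2f} s")
    print(f"slowest n: {slowest} ({timings[slowest]:.3f} s)")
```

Running the script once with the default thread count and once with OMP_NUM_THREADS=1 set in the environment should show whether the slowdown is spread over all n or concentrated in a particular range.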