Per the thread started here (http://www.open-mpi.org/community/lists/devel/2014/10/16130.php), critical OPAL classes have become more expensive on master now that we have set OPAL_ENABLE_MULTI_THREADS to 1 (this was not unexpected).
However, we now have a definite metric for it -- the PSM message rate falls quite a bit (sidenote: this may also be what inspired Mellanox to write the yalla PML; they may have noticed that their message rate was artificially low in the MXM MTL).
So -- let's figure out how to recover the performance (when possible). Obvious options:
- Only enable the more expensive forms of the classes when THREAD_MULTIPLE and/or async progress threads are being used. I mention this option for completeness; it does not seem like the right solution because it somewhat defeats the point of setting OPAL_ENABLE_MULTI_THREADS to 1 by default.
- Have multiple flavors of OPAL classes that are performance-critical (e.g., freelists): thread-safe and non-thread-safe. Code can then choose which flavor to use, depending on the specific situation (e.g., whether it's data structures that are potentially shared among multiple threads or data structures that are guaranteed to always be used by a single thread).
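To make the second option concrete, here is a rough sketch of what two flavors of a LIFO-style free list could look like. Every name in it (`sketch_item_t`, `sketch_lifo_st_*`, `sketch_lifo_mt_*`) is hypothetical and is not the existing OPAL API. The thread-safe flavor uses a plain pthread mutex only to keep the example short; a real implementation would more likely use atomics.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical list item; NOT the real OPAL free list item type. */
typedef struct sketch_item {
    struct sketch_item *next;
} sketch_item_t;

/* Flavor 1: single-thread LIFO. No locking and no atomics, so it is only
 * safe when the list is guaranteed to be used by one thread. */
typedef struct {
    sketch_item_t *head;
} sketch_lifo_st_t;

static inline void sketch_lifo_st_init(sketch_lifo_st_t *lifo)
{
    lifo->head = NULL;
}

static inline void sketch_lifo_st_push(sketch_lifo_st_t *lifo,
                                       sketch_item_t *item)
{
    assert(item != NULL);
    item->next = lifo->head;
    lifo->head = item;
}

static inline sketch_item_t *sketch_lifo_st_pop(sketch_lifo_st_t *lifo)
{
    sketch_item_t *item = lifo->head;
    if (item != NULL) {
        lifo->head = item->next;
    }
    return item;  /* NULL when the list is empty */
}

/* Flavor 2: thread-safe LIFO. Same operations, but every access to the
 * head pointer is protected by a mutex (an assumption for brevity). */
typedef struct {
    sketch_item_t *head;
    pthread_mutex_t lock;
} sketch_lifo_mt_t;

static inline int sketch_lifo_mt_init(sketch_lifo_mt_t *lifo)
{
    lifo->head = NULL;
    return pthread_mutex_init(&lifo->lock, NULL);  /* 0 on success */
}

static inline void sketch_lifo_mt_push(sketch_lifo_mt_t *lifo,
                                       sketch_item_t *item)
{
    assert(item != NULL);
    pthread_mutex_lock(&lifo->lock);
    item->next = lifo->head;
    lifo->head = item;
    pthread_mutex_unlock(&lifo->lock);
}

static inline sketch_item_t *sketch_lifo_mt_pop(sketch_lifo_mt_t *lifo)
{
    pthread_mutex_lock(&lifo->lock);
    sketch_item_t *item = lifo->head;
    if (item != NULL) {
        lifo->head = item->next;
    }
    pthread_mutex_unlock(&lifo->lock);
    return item;  /* NULL when the list is empty */
}
```

With something along these lines, a component whose free list is only ever touched by one thread would call the `_st_` functions and pay no synchronization cost, while shared structures would use the `_mt_` functions. The choice is made per data structure at the call site rather than globally at configure time.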
Adding those who will likely care / have opinions here: @bosilca @hjelmn @miked-mellanox @jladd-mlnx @ggouaillardet