Bring the more flexible AVX* support in 4.1 #8361
1. Consistent march flag order between configure and make.

2. op/avx: give the option to skip some tests
   It is possible to skip some intrinsic tests by setting some environment variables to "no" before invoking configure:
   - ompi_cv_op_avx_check_avx512
   - ompi_cv_op_avx_check_avx2
   - ompi_cv_op_avx_check_avx
   - ompi_cv_op_avx_check_sse41
   - ompi_cv_op_avx_check_sse3

3. op/avx: update AVX512 flags
   Try -mavx512f -mavx512bw -mavx512vl -mavx512dq instead of -march=skylake-avx512, since the former is less likely to conflict with user-provided CFLAGS (e.g. -march=...).
   Thanks Bart Oldeman for pointing this out.

4. op/avx: have the op/avx library depend on libmpi.so

Refs. open-mpi#8323
Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
1. Allow fallback to a lesser AVX support during make
   Because some distros restrict the compile architecture during make (while not setting any restrictions during configure), we need to detect the target architecture during make as well, in order to restrict the code we generate.

2. Add comments and better protect the arch-specific code.
   Identify all the vector functions used and classify them according to the necessary hardware capabilities. Use these requirements to protect the code for loads and stores (the rest of the code is automatically generated, so it is more difficult to protect).

3. Correctly check for AVX* support.

Signed-off-by: George Bosilca <[email protected]>
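The make-time fallback described in this commit can be illustrated with a minimal sketch. It is not the code from the patch: the `isa_level` enum and both helper names are hypothetical. It only assumes that GCC and Clang predefine macros such as `__AVX512F__`, `__AVX2__`, `__AVX__`, `__SSE4_1__` and `__SSE3__` according to the `-m*` / `-march` flags actually in effect when a file is compiled.

```c
/* Hypothetical capability levels, ordered from least to most capable. */
enum isa_level { ISA_NONE = 0, ISA_SSE3, ISA_SSE41, ISA_AVX, ISA_AVX2, ISA_AVX512 };

/* Highest level the compiler was allowed to target for this translation
 * unit. The macros tested here are predefined by GCC and Clang from the
 * flags used at compile time, so a build that drops -mavx512f during make
 * falls back to a lower branch even if configure found AVX512 support. */
static enum isa_level compiled_isa_level(void)
{
#if defined(__AVX512F__) && defined(__AVX512BW__) && \
    defined(__AVX512VL__) && defined(__AVX512DQ__)
    return ISA_AVX512;
#elif defined(__AVX2__)
    return ISA_AVX2;
#elif defined(__AVX__)
    return ISA_AVX;
#elif defined(__SSE4_1__)
    return ISA_SSE41;
#elif defined(__SSE3__)
    return ISA_SSE3;
#else
    return ISA_NONE;
#endif
}

/* Human-readable name for a level; the table order matches the enum. */
static const char *isa_level_name(enum isa_level level)
{
    static const char *const names[] = {
        "none", "SSE3", "SSE4.1", "AVX", "AVX2", "AVX512"
    };
    return names[level];
}
```

Printing `isa_level_name(compiled_isa_level())` from a small `main` compiled once with `-mavx2` and once without any `-m` flag is a quick way to see which branch the build flags select.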
The test now has the ability to add a shift to all or to any of the input and output buffers to assess the impact of unaligned operations.

Signed-off-by: George Bosilca <[email protected]>
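The idea of shifting buffers can be sketched as follows. This is an illustration under stated assumptions, not the test from the patch: `sum_int32`, `align_up`, `run_shifted_sum` and `VEC_ALIGN` are hypothetical names, and the shift is expressed in elements, so the buffers lose their 64-byte vector alignment while staying naturally aligned for `int32_t`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define VEC_ALIGN 64  /* assumed vector alignment, in bytes */

/* Element-wise inout[i] += in[i]; a stand-in for a reduction operation. */
static void sum_int32(const int32_t *in, int32_t *inout, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        inout[i] += in[i];
    }
}

/* Round a pointer up to the next multiple of alignment (a power of two). */
static int32_t *align_up(void *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    addr = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);
    return (int32_t *)addr;
}

/* Run the sum on buffers shifted by the given number of elements from a
 * VEC_ALIGN-aligned base. Returns 1 if every result is correct, 0 otherwise
 * (including allocation failure). */
static int run_shifted_sum(size_t count, size_t in_shift, size_t inout_shift)
{
    void *raw_in = malloc((count + in_shift) * sizeof(int32_t) + VEC_ALIGN);
    void *raw_inout = malloc((count + inout_shift) * sizeof(int32_t) + VEC_ALIGN);
    int ok = 0;

    if (raw_in != NULL && raw_inout != NULL) {
        int32_t *in = align_up(raw_in, VEC_ALIGN) + in_shift;
        int32_t *inout = align_up(raw_inout, VEC_ALIGN) + inout_shift;

        for (size_t i = 0; i < count; i++) {
            in[i] = (int32_t)i;
            inout[i] = 1;
        }
        sum_int32(in, inout, count);

        ok = 1;
        for (size_t i = 0; i < count; i++) {
            if (inout[i] != (int32_t)i + 1) {
                ok = 0;
            }
        }
    }
    free(raw_in);
    free(raw_inout);
    return ok;
}
```

Timing `run_shifted_sum` with zero shifts against non-zero shifts on either buffer would give a rough view of the cost of unaligned loads and stores; the sketch itself only checks correctness.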
Hi @bosilca Any chance this change would break See https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=261970&view=logs&j=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&t=841356e0-85bb-57d8-dbbc-852e683d1642&l=22345. All tests in conda-forge/openmpi-feedstock#71 currently fail due to this.
No, this should have nothing to do with I notice in your Azure run output: What's
Hi @jsquyres It's just a little script wrapping over
and this is where it gets called:
It looks like
@jsquyres This is a clean/minimal environment so it's very likely impossible. We ensure conda's stuff is first found in $PATH. Tested locally on the same image: That said, I'm happy to print the command out 🙂 Let me wait for the current run to finish...(I'm reverting this patch to see if the
Hi @jsquyres Looks like a simple revert fixes the error: conda-forge/openmpi-feedstock@4dac571 Here's the CI output: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=262072&view=logs&j=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&t=841356e0-85bb-57d8-dbbc-852e683d1642 Any thoughts for what else I can check?
Wow, that's... weird. You've got me stumped on that, because this is what I get from a v4.1.x branch HEAD build: The patch itself has zero to do with the mpirun command line parsing stuff. I don't know how applying that patch would disable the
@jsquyres Question: in which file(s) is this option implemented? Is it in one of the 4 files touched here?
Apologies, @jsquyres. I reran our CI with the patch applied, and everything is green (so far). Looks like it was a transient error in Conda's build infrastructure that messed up text patches, among other things...😞 Thanks a lot (as always) for your help!
The `op-avx` flag disables building features that take advantage of AVX
instructions.
Since `open-mpi` is able to detect CPU features at runtime, disabling
this is not necessary, and simply leads to lower performance for users
with newer machines.
See also:
open-mpi/ompi#8306
open-mpi/ompi#8361
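As a rough illustration of what runtime CPU feature detection looks like, here is a minimal sketch. It is not Open MPI's detection code: `best_runtime_isa` is a hypothetical helper, and it assumes the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins that GCC and Clang provide on x86 targets. On other compilers or architectures it simply reports "unknown".

```c
/* Name of the most capable vector extension reported by the running CPU,
 * from the set relevant to this discussion. */
static const char *best_runtime_isa(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  /* populate the CPU feature data before querying */
    if (__builtin_cpu_supports("avx512f")) return "AVX512";
    if (__builtin_cpu_supports("avx2"))    return "AVX2";
    if (__builtin_cpu_supports("avx"))     return "AVX";
    if (__builtin_cpu_supports("sse4.1"))  return "SSE4.1";
    if (__builtin_cpu_supports("sse3"))    return "SSE3";
    return "none";
#else
    return "unknown";  /* builtins unavailable on this compiler or target */
#endif
}
```

A library that selects its code path this way can ship AVX-enabled routines in the same binary and use them only on machines that report support for them.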
Closes #76619.
Signed-off-by: Sean Molenaar <[email protected]>
Signed-off-by: BrewTestBot <[email protected]>
Allow more flexibility on the support of AVX* extensions.
Use cached variables to prevent OMPI from checking specific architecture features. For example, this makes it possible to disable AVX512 while still allowing AVX2.
Protect the code against mismatched flags between configure and make.
Document the list of vector functions we need and the corresponding level of AVX or SSE required.
This is #8322 for 4.1
Fixes #8306