Skip to content

Conversation

@bosilca
Copy link
Member

@bosilca bosilca commented Jan 13, 2021

Refs #8334

icc does not define the __AVX*__ macros if the corresponding -m architecture
flag was not provided. Thus, make sure we always provide it for icc (not not
necessarily for gcc).

Signed-off-by: George Bosilca <[email protected]>
Copy link
Contributor

@ggouaillardet ggouaillardet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor typo in the first commit message ("not not"), otherwise looks good to me

@jsquyres
Copy link
Member

@rajachan Can you try this patch with your app performance test?

@jsquyres
Copy link
Member

jsquyres commented Jan 13, 2021

Corresponding 4.1.x PR: #8373

Copy link
Member

@rajachan rajachan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the PR, looks good to me. I no longer observe the performance degradation when op/avx component is used.

@jsquyres
Copy link
Member

@rajachan Just curious -- is AVX2 being used? And if so, do you see any performance benefit?

@rajachan
Copy link
Member

This PR disables AVX2 as well. Assuming you were asking about AVX, I see a marginal (~1%) improvement in the loop time for the specific test case I was running. I have not evaluated microbenchmarks yet.

@bosilca
Copy link
Member Author

bosilca commented Jan 13, 2021

This PR only fixes the support for icc. The restrictions in AVX support are now in #8376.

@rajachan
Copy link
Member

The icc fix is good to go, thanks for splitting out the restrictions change to a separate PR.

@rajachan rajachan merged commit a6cf2cd into open-mpi:master Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants