Skip to content

[OpenMP][clang] 6.0: num_threads strict (part 1: host runtime) #146403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 7, 2025

Conversation

ro-i
Copy link
Contributor

@ro-i ro-i commented Jun 30, 2025

OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the num_threads clause on parallel directives, along with the message and severity clauses. This commit implements necessary host runtime changes.

OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary host runtime changes.
Copy link
Contributor

@TerryLWilmarth TerryLWilmarth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ro-i ro-i merged commit 424abcb into main Jul 7, 2025
9 checks passed
@ro-i ro-i deleted the users/ro-i/omp-runtime-nt-strict_1 branch July 7, 2025 19:07
@ro-i
Copy link
Contributor Author

ro-i commented Jul 7, 2025

merged, thanks for the review!

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 7, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building openmp at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/8870

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libarcher :: races/parallel-simple.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 13
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp  -gdwarf-4 -O1 -fsanitize=thread  -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src   /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp 2>&1 | tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp -gdwarf-4 -O1 -fsanitize=thread -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic
# note: command had no output on stdout or stderr
# executed command: env TSAN_OPTIONS=ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp
# note: command had no output on stdout or stderr
# executed command: tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log
# note: command had no output on stdout or stderr
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# .---command stderr------------
# | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c:36:11: error: CHECK: expected string not found in input
# | // CHECK: ThreadSanitizer: reported {{[1-7]}} warnings
# |           ^
# | <stdin>:26:5: note: scanning from here
# | DONE
# |     ^
# | <stdin>:27:1: note: possible intended match here
# | ThreadSanitizer: thread T4 finished with ignores enabled, created at:
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |            21:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1090:3 (parallel-simple.c.tmp+0xa3c0a) 
# |            22:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcb252) 
# |            23:  
# |            24: SUMMARY: ThreadSanitizer: data race /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c:23:8 in main.omp_outlined_debug__ 
# |            25: ================== 
# |            26: DONE 
# | check:36'0         X error: no match found
# |            27: ThreadSanitizer: thread T4 finished with ignores enabled, created at: 
# | check:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | check:36'1     ?                                                                      possible intended match
# |            28:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1090:3 (parallel-simple.c.tmp+0xa3c0a) 
# | check:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            29:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcb252) 
# | check:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            30:  
...

@ro-i
Copy link
Contributor Author

ro-i commented Jul 7, 2025

I don't see how this could possibly be caused by this PR, since num_threads in openmp/tools/archer/tests/races/parallel-simple.c doesn't even use strict. But since the test still seems to be closely related, I'm reverting for further investigation.

ro-i added a commit that referenced this pull request Jul 7, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 7, 2025
@jprotze
Copy link
Collaborator

jprotze commented Jul 7, 2025

I'm wondering whether the runtime code is sufficient to implement 6.0 runtime error behavior for the strict modifier. As I understand the runtime change in this PR, it only catches the case for serialized parallel regions.
I think, runtime error termination should catch whenever requested nthreads < effective nthreads. It could also be the case due to max-active-levels-var, or exceeding the thread-limit. At the end, the reason doesn't really matter.

The error from the build bot is confusing without having more output available from the error case.

@shiltian
Copy link
Contributor

shiltian commented Jul 7, 2025

#85466 should be the one that supports other cases?

ro-i added a commit that referenced this pull request Jul 8, 2025
#147532)

OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary host runtime changes.

Reland #146403. After manual
testing on a gfx90a machine, I could not reproduce the failing test,
which makes it even more likely that the test has just been flaky. (Or
at least that it's not an issue related to this patch.)
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 8, 2025
…ost runtime) (#147532)

OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary host runtime changes.

Reland llvm/llvm-project#146403. After manual
testing on a gfx90a machine, I could not reproduce the failing test,
which makes it even more likely that the test has just been flaky. (Or
at least that it's not an issue related to this patch.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openmp:libomp OpenMP host runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants