[MLIR][NVVM] Update mbarrier Ops to use AnyTypeOf[] (2/n)
This is a follow-up to PR #165558 (1/n).
This patch updates the following mbarrier Ops to use
the AnyTypeOf[] construct:
* mbarrier.arrive
* mbarrier.arrive.noComplete
* mbarrier.test.wait
* cp.async.mbarrier.arrive
* Updated existing tests accordingly.
* Verified locally that there are no new regressions
in the `integration` tests.
* TODO: A few more Ops remain and will be
migrated in a subsequent PR.
Signed-off-by: Durgadoss R <[email protected]>
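As a rough illustration of what this migration means for IR authors: with `AnyTypeOf[]` on `addr`, a single Op is expected to accept either pointer type, where a `.shared` variant was previously needed for shared-memory pointers. The sketch below assumes the `$addr : type -> result` assembly syntax and the `i64` result of `nvvm.mbarrier.arrive`, neither of which is shown in this patch; the function and value names are hypothetical.

```mlir
// Hypothetical function; names are illustrative only.
llvm.func @mbarrier_arrive_sketch(%generic: !llvm.ptr, %shared: !llvm.ptr<3>) {
  // Generic pointer: the underlying address must still lie in shared::cta memory,
  // otherwise the behavior is undefined.
  %0 = nvvm.mbarrier.arrive %generic : !llvm.ptr -> i64
  // Shared-memory pointer (NVPTX address space 3), no separate `.shared` Op.
  %1 = nvvm.mbarrier.arrive %shared : !llvm.ptr<3> -> i64
  llvm.return
}
```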
- - `addr`: A pointer to the memory location of the *mbarrier object*. Uses generic
-   addressing, but the address must still be in the shared memory space.
+ - `addr`: A pointer to the memory location of the *mbarrier object*. The `addr`
+   must be a pointer to generic or shared::cta memory. When it is generic, the
+   underlying address must be within the shared::cta memory space; otherwise
+   the behavior is undefined.

  [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
- This Op is the same as `nvvm.mbarrier.arrive` except that the *mbarrier object*
- should be accessed using a shared-memory pointer instead of a generic-memory pointer.
-
- [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
  captures the phase of the *mbarrier object* prior to the arrive-on operation.

  The operation takes the following operands:
- - `addr`: A pointer to the memory location of the *mbarrier object*. Uses generic
-   addressing, but the address must still be in the shared memory space.
+ - `addr`: A pointer to the memory location of the *mbarrier object*. The `addr`
+   must be a pointer to generic or shared::cta memory. When it is generic, the
+   underlying address must be within the shared::cta memory space; otherwise
+   the behavior is undefined.
  - `count`: Integer specifying the count argument to the arrive-on operation.
    Must be in the valid range as specified in the *mbarrier object* contents.

  [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
  The `cp.async.mbarrier.arrive` Op makes the *mbarrier object* track
  all prior cp.async operations initiated by the executing thread.
  The `addr` operand specifies the address of the *mbarrier object*
- in generic address space. The `noinc` attr impacts how the
- mbarrier's state is updated.
+ in generic or shared::cta address space. When it is generic, the
+ underlying memory should fall within the shared::cta space;
+ otherwise the behavior is undefined. The `noinc` attr impacts
+ how the mbarrier's state is updated.

  [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)
  }];
- let assemblyFormat = "$addr attr-dict `:` type(operands)";
let summary = "NVVM Dialect Op for cp.async.mbarrier.arrive.shared";
- let description = [{
- The `cp.async.mbarrier.arrive.shared` Op makes the *mbarrier object*
- track all prior cp.async operations initiated by the executing thread.
- The `addr` operand specifies the address of the *mbarrier object* in
- shared memory. The `noinc` attr impacts how the mbarrier's state
- is updated.
-
- [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)
- }];
let assemblyFormat = "$addr attr-dict `:` type(operands)";
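Given the assembly format `$addr attr-dict` followed by `:` and the operand types, uses of the unified `cp.async.mbarrier.arrive` Op might look like the sketch below. It assumes that `noinc` is a boolean attribute printed through the attribute dictionary and that shared memory is NVPTX address space 3; the function and value names are hypothetical.

```mlir
// Hypothetical function; names are illustrative only.
llvm.func @cp_async_mbarrier_arrive_sketch(%generic: !llvm.ptr, %shared: !llvm.ptr<3>) {
  // Generic pointer: the underlying memory should fall within shared::cta,
  // otherwise the behavior is undefined.
  nvvm.cp.async.mbarrier.arrive %generic : !llvm.ptr
  // Shared-memory pointer with the same Op, replacing the removed `.shared` variant.
  nvvm.cp.async.mbarrier.arrive %shared : !llvm.ptr<3>
  // `noinc` is carried in the attribute dictionary (assumed spelling).
  nvvm.cp.async.mbarrier.arrive %shared {noinc = true} : !llvm.ptr<3>
  llvm.return
}
```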