[MLIR][NVVM] Update mbarrier Ops to use AnyTypeOf[] (2/n)
This is a follow-up to PR #165558 (1/n).
This patch updates the following mbarrier Ops to use the
AnyTypeOf[] construct:
* mbarrier.arrive
* mbarrier.arrive.noComplete
* mbarrier.test.wait
* cp.async.mbarrier.arrive
* Updated existing tests accordingly.
* Verified locally that there are no new regressions
in the `integration` tests.
* TODO: A few more Ops are remaining and will be
migrated in a subsequent PR.
Signed-off-by: Durgadoss R <[email protected]>
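As an illustration of what the relaxed operand type permits, here is a minimal MLIR sketch. It assumes the textual form `nvvm.mbarrier.arrive %addr : <pointer type> -> i64`, which is an assumption because the assembly format of this Op is not part of the excerpt shown here. The function and value names are hypothetical, and `!llvm.ptr<3>` denotes a pointer in the NVPTX shared address space.

```mlir
// Hypothetical function: both pointer kinds go through the same Op.
llvm.func @mbarrier_arrive_sketch(%generic: !llvm.ptr, %shared: !llvm.ptr<3>) {
  // Generic pointer: the underlying address must still lie in shared::cta
  // memory, otherwise the behavior is undefined.
  %state0 = nvvm.mbarrier.arrive %generic : !llvm.ptr -> i64
  // Shared-memory pointer: previously this needed a separate `.shared` Op.
  %state1 = nvvm.mbarrier.arrive %shared : !llvm.ptr<3> -> i64
  llvm.return
}
```

If the assumption about the syntax holds, callers no longer need to pick between two Ops based on the address space of `addr`.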
-    - `addr`: A pointer to the memory location of the *mbarrier object*. Uses generic
-      addressing, but the address must still be in the shared memory space.
+    - `addr`: A pointer to the memory location of the *mbarrier object*. The `addr`
+      must be a pointer to generic or shared::cta memory. When it is generic, the
+      underlying address must be within the shared::cta memory space; otherwise
+      the behavior is undefined.
[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
-    This Op is the same as `nvvm.mbarrier.arrive` except that the *mbarrier object*
-    should be accessed using a shared-memory pointer instead of a generic-memory pointer.
-
-    [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
captures the phase of the *mbarrier object* prior to the arrive-on operation.

The operation takes the following operands:
-    - `addr`: A pointer to the memory location of the *mbarrier object*. Uses generic
-      addressing, but the address must still be in the shared memory space.
+    - `addr`: A pointer to the memory location of the *mbarrier object*. The `addr`
+      must be a pointer to generic or shared::cta memory. When it is generic, the
+      underlying address must be within the shared::cta memory space; otherwise
+      the behavior is undefined.
- `count`: Integer specifying the count argument to the arrive-on operation.
  Must be in the valid range as specified in the *mbarrier object* contents.
[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
The `cp.async.mbarrier.arrive` Op makes the *mbarrier object* track
all prior cp.async operations initiated by the executing thread.
The `addr` operand specifies the address of the *mbarrier object*
-    in generic address space. The `noinc` attr impacts how the
-    mbarrier's state is updated.
+    in generic or shared::cta address space. When it is generic, the
+    underlying memory should fall within the shared::cta space;
+    otherwise the behavior is undefined. The `noinc` attr impacts
+    how the mbarrier's state is updated.

[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)
}];
-    let assemblyFormat = "$addr attr-dict `:` type(operands)";

-    let summary = "NVVM Dialect Op for cp.async.mbarrier.arrive.shared";
-    let description = [{
-    The `cp.async.mbarrier.arrive.shared` Op makes the *mbarrier object*
-    track all prior cp.async operations initiated by the executing thread.
-    The `addr` operand specifies the address of the *mbarrier object* in
-    shared memory. The `noinc` attr impacts how the mbarrier's state
-    is updated.
-
-    [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)
-    }];
let assemblyFormat = "$addr attr-dict `:` type(operands)";
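
For illustration, the assembly format above (`$addr attr-dict : type(operands)`) suggests that the merged Op can be written as in the following minimal MLIR sketch. The function and value names are hypothetical, and spelling the `noinc` attribute as `{noinc = true}` in the attribute dictionary is an assumption, since the attribute's definition is not shown in this diff.

```mlir
// Hypothetical function: one Op now covers both address spaces.
llvm.func @cp_async_mbarrier_arrive_sketch(%generic: !llvm.ptr, %shared: !llvm.ptr<3>) {
  // Generic pointer: the underlying memory should fall within shared::cta.
  nvvm.cp.async.mbarrier.arrive %generic : !llvm.ptr
  // Shared-memory pointer, formerly handled by the removed `.shared` Op.
  nvvm.cp.async.mbarrier.arrive %shared : !llvm.ptr<3>
  // Assumed spelling of the `noinc` attribute.
  nvvm.cp.async.mbarrier.arrive %shared {noinc = true} : !llvm.ptr<3>
  llvm.return
}
```

The address space is carried by the operand type, so the Op name stays the same for generic and shared pointers.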