[CIR][ThroughMLIR] Lower structs and GetMemberOp #1565

Open
wants to merge 2,472 commits into
base: main

terapines-osc-cir
Contributor

Structs are implemented as memref<size x i8>. It is not feasible to represent them as tuples, for tuples can't be put in memref (i.e. pointers to structs would break if we did).

We use memref::ViewOp for this. Unlike PtrStrideOp, the reinterpret cast operation doesn't work here, as the result type is potentially different from i8.
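
As a rough host-side analogy only (this is not the MLIR lowering code from the PR), the idea of "a struct is a flat byte buffer, and a member is a typed view at some byte offset" can be sketched in plain C++. The `Point` struct and the `loadMember`/`storeMember` helpers are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical source-level struct whose layout the byte buffer mirrors.
struct Point {
  char tag;
  std::int32_t x;
};

// Read a member of type T located `offset` bytes into a raw byte buffer.
template <typename T>
T loadMember(const unsigned char *storage, std::size_t size, std::size_t offset) {
  assert(offset + sizeof(T) <= size && "member must lie inside the struct");
  T value;
  std::memcpy(&value, storage + offset, sizeof(T));
  return value;
}

// Write a member of type T located `offset` bytes into a raw byte buffer.
template <typename T>
void storeMember(unsigned char *storage, std::size_t size, std::size_t offset,
                 const T &value) {
  assert(offset + sizeof(T) <= size && "member must lie inside the struct");
  std::memcpy(storage + offset, &value, sizeof(T));
}

// Store `x` through the byte-buffer view, then read it back.
std::int32_t roundTripX(std::int32_t v) {
  alignas(Point) unsigned char storage[sizeof(Point)] = {};
  storeMember<std::int32_t>(storage, sizeof(storage), offsetof(Point, x), v);
  return loadMember<std::int32_t>(storage, sizeof(storage), offsetof(Point, x));
}
```

The element type of the member view (`std::int32_t` here) differs from the byte element type of the buffer, which is the same reason the description gives for needing a view rather than a reinterpret cast.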

AmrDeveloper and others added 30 commits April 9, 2025 11:23
…llvm#1265)

[Neon
definition](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vmaxv_s8)
[OG
implementation](https://github.com/llvm/clangir/blob/04d7dcfb2582753f3eccbf01ec900d60297cbf4b/clang/lib/CodeGen/CGBuiltin.cpp#L13202)
The implementation in this PR differs from OG in two ways:
1. it avoids code duplication by extracting the common pattern
2. it avoids using i32 as the return type of the intrinsic call, which
eliminates the need to cast the result of the intrinsic call. OG's
approach seems quite unnecessary IMHO: this is MAX, not ADD or MUL.
After all, using the expected type as the return type of the intrinsic call
produces [the same ASM code](https://godbolt.org/z/3nKG7fxPb).
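
The "MAX, not ADD or MUL" argument can be illustrated with a scalar model, assuming eight signed 8-bit lanes as in `vmaxv_s8`. This is a portable sketch, not the Neon intrinsic or the CIR lowering; `horizontalMax` and `horizontalSum` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a horizontal max over eight int8 lanes.
// The result is always one of the inputs, so it fits in int8_t.
std::int8_t horizontalMax(const std::int8_t (&lanes)[8]) {
  std::int8_t best = lanes[0];
  for (std::size_t i = 1; i < 8; ++i)
    if (lanes[i] > best) best = lanes[i];
  return best;
}

// A horizontal sum, by contrast, can exceed the int8_t range,
// so a wider result type is needed there.
std::int32_t horizontalSum(const std::int8_t (&lanes)[8]) {
  std::int32_t sum = 0;
  for (std::int8_t v : lanes) sum += v;
  return sum;
}
```

With all lanes equal to 127, the max is still 127 while the sum is 1016, which is why a widened return type matters for a sum but not for a max.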
I continue to use `csmith` and catch run-time bugs. Now it's time to fix
the layout for the const structs.

There is a divergence between const structs generated by CIR and the
original codegen. And this PR makes one more step to eliminate it. There
are cases where the extra padding is required - and here is a fix for
some of them. I did not write extra tests, since the fixes to the
existing tests already cover the code I added. The point is that now the
layout for all of these structs in the LLVM IR with and without CIR is
the same.
Class `CIRGenFunction` contained three identical functions that
converted from a Clang AST type (`clang::QualType`) to a ClangIR type
(`mlir::Type`): `convertType`, `ConvertType`, and `getCIRType`. This
embarrassment of duplication needed to be fixed, along with cleaning up
other functions that convert from Clang types to ClangIR types.

The three functions `CIRGenFunction::ConvertType`,
`CIRGenFunction::convertType`, and `CIRGenFunction::getCIRType` were
combined into a single function `CIRGenFunction::convertType`. Other
functions were renamed as follows:
- `CIRGenTypes::ConvertType` to `CIRGenTypes::convertType`
- `CIRGenTypes::ConvertFunctionTypeInternal` to
`CIRGenTypes::convertFunctionTypeInternal`
- `CIRGenModule::getCIRType` to `CIRGenModule::convertType`
- `ConstExprEmitter::ConvertType` to `ConstExprEmitter::convertType`
- `ScalarExprEmitter::ConvertType` to `ScalarExprEmitter::convertType`

Many cases of `getTypes().convertType(t)` and
`getTypes().convertTypeForMem(t)` were changed to just `convertType(t)`
and `convertTypeForMem(t)`, respectively, because the forwarding
functions in `CIRGenModule` and `CIRGenFunction` make the explicit call
to `getTypes()` unnecessary.
Reland previously reverted attempt now that this passes ASANified `ninja
check-clang-cir`.

Original message:
We are missing cleanups all around, more incremental progress towards fixing
that. This is supposed to be NFC intended, but we have to start changing some
bits in order to properly match cleanup bits in OG.

Start tagging places with more MissingFeatures to allow us to incrementally
improve the situation.
…#1262)

This patch adds support for the following GCC function attributes:

  - `__attribute__((const))`
  - `__attribute__((pure))`

The side effect information is attached to the call operations during
CIRGen. During LLVM lowering, this information is consumed to further
emit appropriate LLVM metadata on LLVM call instructions.
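
For readers unfamiliar with the two attributes, here is a minimal source-level sketch, assuming a GCC- or Clang-compatible compiler. The function names are hypothetical and the example says nothing about how CIR represents the information internally.

```cpp
#include <cassert>

// `const`: the result depends only on the argument values;
// the function neither reads nor writes memory.
__attribute__((const)) int square(int x) { return x * x; }

// `pure`: the function may read memory (here, through the pointer)
// but has no observable side effects.
__attribute__((pure)) int sumOf(const int *values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) total += values[i];
  return total;
}

// Because `square` is `const`, a compiler is allowed to evaluate it once
// and reuse the result for both operands.
int twiceSquare(int x) { return square(x) + square(x); }
```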
…m#1249)

C/C++ functions returning void had an explicit !cir.void return type while not
having any returned value, which was breaking a lot of MLIR invariants when the
CIR dialect is used in a greater context, for example with the inliner.

Now, a C/C++ function returning void has no return type and no return values,
which does not break the MLIR invariant about the same number of return types
and returned values.

This change does not keep the same parsing/pretty-printed syntax as before for
compatibility like in llvm#1203 because it requires some new features from the
MLIR parser infrastructure itself, which is not great.

This uses an optional type for function return type.

The default MLIR parser for optional parameters requires an optional anchor we
do not have in the syntax, so use a custom FuncType parser to handle the
optional return type.
Some passes not declared with TableGen did not have descriptions.
This PR adds support for default arguments.
…#1283)

Corresponding [OG
code](https://github.com/llvm/clangir/blob/ef20d053b3d78c9d4c135e2811b303b7e5016d30/clang/lib/CodeGen/CGExprConstant.cpp#L846).
[OG generated code here](https://godbolt.org/z/x6q333dMn), one notable
diff is we're missing `inrange`, which is reported in [issue 886](llvm#886).
For now, I'm still using GlobalViewAttr to implement it so we can move
things fast.
But it might be worth considering the approach in [Comments in issue
258](llvm#258), especially since we could
incorporate [inrange info](llvm#886) into
the attribute suggested there.
This PR adds CIRGen and LLVM lowering support for the following language
features related to pointers to data members:

  - Comparisons between pointers to data members.
  - Casting from pointers to data members to boolean.
  - Reinterpret casts between pointers to data members.
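
The three language features listed above look like this at the C++ source level. This is a small sketch with hypothetical types (`Pair`, `Other`) and helper names; it shows the source constructs only, not the CIR that is generated for them.

```cpp
#include <cassert>

struct Pair {
  int first;
  int second;
};

// Hypothetical second class, used only as a reinterpret_cast target.
struct Other {
  float value;
};

using IntMember = int Pair::*;
using FloatMember = float Other::*;

// Comparison between two pointers to data members of the same class.
bool sameMember(IntMember a, IntMember b) { return a == b; }

// Conversion to bool: a null member pointer is false, any other is true.
bool isSet(IntMember p) { return static_cast<bool>(p); }

// reinterpret_cast to another member-pointer type and back again;
// the round trip yields the original member pointer.
IntMember roundTrip(IntMember p) {
  FloatMember q = reinterpret_cast<FloatMember>(p);
  return reinterpret_cast<IntMember>(q);
}
```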
This PR updates the `#cir.global_view` attribute and makes it accept
integer types as its result type.
…vm#1280)

Resolves llvm#1266

After change:

```llvm
%1 = alloca ptr, i64 1, align 8
  store i32 1, ptr @g_arr, align 4
  store i32 2, ptr getelementptr (i32, ptr @g_arr, i64 1), align 4
  store i32 3, ptr getelementptr (i32, ptr @g_arr, i64 2), align 4
  %2 = load i32, ptr @g, align 4
  store i32 %2, ptr getelementptr (i32, ptr @g_arr, i64 3), align 4
  store ptr getelementptr (i32, ptr getelementptr (i32, ptr @g_arr, i64 3), i64 1), ptr %1, align 8
```
This does not change anything in practice, work in that direction should come
next. We also want this to not affect existing tests to isolate upcoming
changes.
This change adds initial support for array new expressions where the
array size is constant and the element does not require a cookie.
- After abba01a, 'is' and 'get'
interfaces are deprecated even though not removed yet. However, using
them causes warnings and triggers build failures if those warnings are
treated as errors.
Clean up the cir scope if it contains only a yield operation

Fixes: llvm#455
This patch adds the minimal support for array cookies needed to enable
ClangIR generation for an array new expression that requires cookies but
does not require an explicit initializer.

This only provides the cookie support for the base Itanium CXXABI.
Different cookie calculations are required for AppleARM64, which will be
added in a subsequent patch.
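
To make the cookie distinction concrete, here is a source-level sketch of the two kinds of array new discussed in the messages above. It assumes the usual Itanium C++ ABI rule that an element type with a non-trivial destructor needs a cookie holding the element count; the `Tracked` type and function names are hypothetical.

```cpp
#include <cassert>

// Counts destructor calls so the effect of delete[] is observable.
static int destroyedCount = 0;

// Non-trivial destructor: delete[] must know how many elements to destroy,
// so the element count is stored in a cookie in front of the array.
struct Tracked {
  ~Tracked() { ++destroyedCount; }
};

// Array new for a type that requires a cookie, with no explicit initializer.
int destroyTrackedArray() {
  destroyedCount = 0;
  Tracked *items = new Tracked[4];
  delete[] items;
  return destroyedCount;
}

// Array new with a constant size for a trivially destructible element type;
// no cookie is required here.
int sumPlainArray() {
  int *values = new int[4];
  for (int i = 0; i < 4; ++i) values[i] = i + 1;
  int total = values[0] + values[1] + values[2] + values[3];
  delete[] values;
  return total;
}
```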
…#1298)

Change `parseVisibilityAttr` to use enum parser helper
`parseOptionalCIRKeyword`

Fixes: llvm#770
This PR adds padding for union types, which is necessary in some cases
(e.g. proper offset computation for an element of an array).

The previous discussion is here llvm#1281

The idea is to add a notion of padding to the `StructType` in the
same fashion as it's done for packed structures - as a bool argument in
the constructor.

Now we can compute the proper union type size as the size of the largest
element + size of padding type.
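
The size rule stated above (largest element plus padding) can be checked against the host compiler with a small sketch. `PaddedUnion` and the helper functions are hypothetical names; the example only illustrates the arithmetic and is not the `StructType` code.

```cpp
#include <cassert>
#include <cstddef>

// The largest member is usually the 5-byte array, but the union is
// aligned like `int`, so tail padding is required.
union PaddedUnion {
  char bytes[5];
  int number;
};

// Round `size` up to the next multiple of `alignment`.
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) / alignment * alignment;
}

// Size of the largest member of PaddedUnion.
constexpr std::size_t largestMemberSize() {
  return sizeof(int) > sizeof(char[5]) ? sizeof(int) : sizeof(char[5]);
}

// Tail padding needed so that consecutive array elements stay aligned.
constexpr std::size_t tailPadding() {
  return alignUp(largestMemberSize(), alignof(PaddedUnion)) - largestMemberSize();
}

// Largest element plus padding gives the size of the union.
constexpr std::size_t expectedUnionSize() {
  return largestMemberSize() + tailPadding();
}
```

The padding matters for arrays because element `i` of a `PaddedUnion` array lives at byte offset `i * sizeof(PaddedUnion)`, not `i * largestMemberSize()`.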

There are some downsides though - I had to add this `padded` word in
many places. So take a look please!
Many tests are fixed and one new test is added - `union-padding`
…m_ldaex (llvm#1293)

Lowering clang::AArch64::BI__builtin_arm_ldaex
Fixing Lit test after rebasing

`CIRToLLVMPtrStrideOpLowering` will not emit casting in this case
because the width is equal to *layoutWidth


https://github.com/llvm/clangir/blob/d329c96a56b41ad99ddffe7bd037ac4ab7476ce6/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp#L967-L999

Fixes: llvm#1295
…lvm#1304)

This handles initialization of array new allocations in the simple case
where the entire allocated memory block can be initialized with a memset
to zero.
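
At the source level, the simple case described above corresponds to a value-initialized array new, sketched below. The function name is hypothetical, and whether a compiler actually emits a memset for it is an implementation choice.

```cpp
#include <cassert>
#include <cstddef>

// Value-initialized array new: every element starts at zero, which a
// compiler may implement as one memset over the allocated block.
bool allZeroAfterNew(std::size_t count) {
  int *values = new int[count]();
  bool allZero = true;
  for (std::size_t i = 0; i < count; ++i)
    if (values[i] != 0) allZero = false;
  delete[] values;
  return allZero;
}
```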
@bcardosolopes
Member

now that I landed your other PR, gh might automatically run the tests, gave it another click!

Currently, the following code snippet fails with a crash during CodeGen
```
class C {
public:
  ~C();
  void operator=(C);
};

void d() {
  C a, b;
  a = b;
}
```
with error: 
```
mlir::Block* clang::CIRGen::CIRGenFunction::getEHResumeBlock(bool, cir::TryOp): Assertion `tryOp && "expected available cir.try"' failed.
```
In CIRGenCleanup, [these
lines](https://github.com/llvm/clangir/blob/204c03efbe898c9f64e477937d869767fdfb1310/clang/lib/CIR/CodeGen/CIRGenCleanup.cpp#L615C1-L617C6)
don't check whether there is a TryOp at the end of the scope chain
before
[getEHResumeBlock](https://github.com/llvm/clangir/blob/204c03efbe898c9f64e477937d869767fdfb1310/clang/lib/CIR/CodeGen/CIRGenException.cpp#L764)
is called. Because getEHResumeBlock contains an assertion, this causes the crash.

This PR fixes this and adds a simple test for a case like this.
@keryell
Collaborator

keryell commented Apr 17, 2025

Yes you can do that.
But there are some philosophical questions about what lowering to the MLIR standard dialects means, along the lines of some points discussed in #1219.
There is a trade-off between lowering to the MLIR standard dialects as they are today, however low-level the generated code is, and waiting for the MLIR standard dialects to improve so that the code can be high-level.
Your approach of having something upstream today is probably good compared to what I tried in #1334 with "what if we had a better tuple in MLIR standard dialect in the future".

@keryell
Collaborator

keryell commented Apr 17, 2025

In #1334 I use a tuple-like type to keep the high-level data structure, which makes it possible to have, for example, a struct of structs or a struct of arrays with value semantics. That cannot be represented with memref, but I use the same trick as you to emulate GetMemberOp.

@bcardosolopes
Member

Fails on windows for some reason!

gitoleg and others added 4 commits April 18, 2025 10:39
…lvm#1567)

This is just a small fix that covers the case when a global union is
declared with the `static` keyword and one of its users is an array.
Add `TypeBuilderWithInferredContext` to each CIR type that supports MLIR
context inference from its parameters.
We have been using RecordLayoutAttr to "cache" data layout information
calculated for records. Unfortunately, it wasn't actually caching the
information, and because each call was calculating more information than
it needed, it was doing extra work.

This replaces the previous implementation with a set of functions that
compute only the information needed. Ideally, we would like to have a
mechanism to properly cache this information, but until such a mechanism
is implemented, these new functions should be a small step forward.
@terapines-osc-cir
Contributor Author

It seems #1569 calculates wrong offsets for struct types, so the test cases will fail. For a non-packed struct { char a; int b; }, it thinks the offset of b is 1, but it should be 4. Currently CIR doesn't have test cases for these, so it went unnoticed.
I'll try to solve it in another PR and leave this open for a while.
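
The offset rule behind the numbers in this comment can be checked on the host compiler. The sketch below assumes a typical ABI where `int` is 4 bytes with 4-byte alignment, and uses the GCC/Clang `packed` attribute for contrast; the struct and helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// The non-packed struct from the comment: one char followed by one int.
struct CharThenInt {
  char a;
  int b;
};

// Packed variant (GCC/Clang extension): members are laid out back to back.
struct __attribute__((packed)) PackedCharThenInt {
  char a;
  int b;
};

// Round `offset` up to the next multiple of `alignment`.
constexpr std::size_t roundUpTo(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) / alignment * alignment;
}

// Non-packed layout: b starts at the end of a, rounded up to b's alignment.
constexpr std::size_t expectedOffsetOfB() {
  return roundUpTo(offsetof(CharThenInt, a) + sizeof(char), alignof(int));
}
```

An offset of 1 for `b` is what the packed layout gives; the non-packed layout needs the extra round-up step, which yields 4 when `int` has 4-byte alignment.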

Still don't know why there's failure on windows though.

@bcardosolopes
Member

> I'll try to solve it in another PR and leave this open for a while.

Great!

> Still don't know why there's failure on windows though.

Seems to be failing all across the board

@keryell
Collaborator

keryell commented Apr 21, 2025

To rebase?

xlauko and others added 6 commits April 21, 2025 22:43
This changes the alias prefix for record types to make it less general.
Introduce common base class for attributes with single type parameter.
@terapines-osc-cir
Copy link
Contributor Author

Now the struct offset issue is fixed.
As it's hard to write an independent test case for struct types, I chose to include the change here (it's small anyway).

@bcardosolopes
Member

One alternative is to build your clang with ASAN enabled by the host compiler (using CMake's -DLLVM_USE_SANITIZER="Address"), which usually helps spot Windows-only crashes.
