Skip to content

Commit 8853555

Browse files
sitio-coutolanza
authored andcommitted
[CIR][CIRGen] Support mutable and recursive named records
This allows a named StructType to be mutated after it has been created, if it is identified and incomplete. The motivation for this is to improve the codegen of CIR in certain scenarios where an incomplete type is used and later completed. These usually leave the IR in an inconsistent state, where there are two records types with the same identifier but different definitions (one complete the other incomplete). For example: ```c++ struct Node { Node *next; }; void test(struct Node n) {} ``` Generates: ```mlir !temp_struct = !cir.struct<struct "Node" incomplete> !full_struct = !cir.struct<struct "Node" {!cir.ptr<!temp_struct>}> ``` To generate the `Node` struct type, its members must be created first. However, the `next` member is a recursive reference, so it can only be completed after its parent. This generates a temporary incomplete definition of the `Node` type that remains in the code even after the type to which it refers is completed. As a consequence, accessing the `next` member of a `Node` value fetches the old incomplete version of the type which affects CIR's type-checking capabilities. This patch ensures that, once the parent is fully visited, the `next` member can be completed in place, automatically updating any references to it at a low cost. To represent recursive types, the StructType now is equipped with self-references. These are represented by a `cir.struct` type with just the name of the parent struct that it refers to. The same snippet of code will not generate the following CIR IR: ```mlir !full_struct = !cir.struct<struct "Node" {!cir.ptr<!cir.struct<struct "Node">>}> ``` Summary of the changes made: - Named records are now uniquely identified by their name. An attempt to create a new record with the same will fail. - Anonymous records are uniquely identified by members and other relevant attributes. - StructType has a new `mutate` method that allows it to be mutated after it has been created. Each type can only be mutated if it is identified and incomplete, rendering further changes impossible. - When building a new name StructType, the builder will try to first create, then complete the type, ensuring that: - Inexistent types are created - Existing incomplete types are completed - Existing complete types with matching attributes are reused - Existing complete types with different attributes raise errors - StructType now uses the CyclicParser/Printer guard to avoid infinite recursion and identify when it should print/parse a self-reference. ghstack-source-id: a6d4f65 Pull Request resolved: #303
1 parent 6b8eef0 commit 8853555

24 files changed

+330
-65
lines changed

clang/include/clang/CIR/Dialect/IR/CIRTypes.h

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,9 +41,43 @@ struct StructTypeStorage;
4141

4242
/// Each unique clang::RecordDecl is mapped to a `cir.struct` and any object in
4343
/// C/C++ that has a struct type will have a `cir.struct` in CIR.
44+
///
45+
/// There are three possible formats for this type:
46+
///
47+
/// - Identified and complete structs: unique name and a known body.
48+
/// - Identified and incomplete structs: unique name and unkonwn body.
49+
/// - Anonymous structs: no name and a known body.
50+
///
51+
/// Identified structs are uniqued by their name, and anonymous structs are
52+
/// uniqued by their body. This means that two anonymous structs with the same
53+
/// body will be the same type, and two identified structs with the same name
54+
/// will be the same type. Attempting to build a struct with a existing name,
55+
/// but a different body will result in an error.
56+
///
57+
/// A few examples:
58+
///
59+
/// ```mlir
60+
/// !complete = !cir.struct<struct "complete" {!cir.int<u, 8>}>
61+
/// !incomplete = !cir.struct<struct "incomplete" incomplete>
62+
/// !anonymous = !cir.struct<struct {!cir.int<u, 8>}>
63+
/// ```
64+
///
65+
/// Incomplete structs are mutable, meaning the can be later completed with a
66+
/// body automatically updating in place every type in the code that uses the
67+
/// incomplete struct. Mutability allows for recursive types to be represented,
68+
/// meaning the struct can have members that refer to itself. This is useful for
69+
/// representing recursive records and is implemented through a special syntax.
70+
/// In the example below, the `Node` struct has a member that is a pointer to a
71+
/// `Node` struct:
72+
///
73+
/// ```mlir
74+
/// !struct = !cir.struct<struct "Node" {!cir.ptr<!cir.struct<struct
75+
/// "Node">>}>
76+
/// ```
4477
class StructType
4578
: public Type::TypeBase<StructType, Type, detail::StructTypeStorage,
46-
DataLayoutTypeInterface::Trait> {
79+
DataLayoutTypeInterface::Trait,
80+
TypeTrait::IsMutable> {
4781
// FIXME(cir): migrate this type to Tablegen once mutable types are supported.
4882
public:
4983
using Base::Base;
@@ -123,6 +157,10 @@ class StructType
123157
return getKindAsStr() + "." + getName().getValue().str();
124158
}
125159

160+
/// Complete the struct type by mutating its members and attributes.
161+
void complete(ArrayRef<Type> members, bool packed,
162+
ASTRecordDeclInterface ast = {});
163+
126164
/// DataLayoutTypeInterface methods.
127165
llvm::TypeSize getTypeSizeInBits(const DataLayout &dataLayout,
128166
DataLayoutEntryListRef params) const;

clang/include/clang/CIR/Dialect/IR/CIRTypesDetails.h

Lines changed: 34 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,9 @@
1414
#define CIR_DIALECT_IR_CIRTYPESDETAILS_H
1515

1616
#include "mlir/IR/BuiltinAttributes.h"
17+
#include "mlir/Support/LogicalResult.h"
1718
#include "clang/CIR/Dialect/IR/CIRTypes.h"
19+
#include "llvm/ADT/Hashing.h"
1820

1921
namespace mlir {
2022
namespace cir {
@@ -58,14 +60,18 @@ struct StructTypeStorage : public TypeStorage {
5860
}
5961

6062
bool operator==(const KeyTy &key) const {
63+
if (name)
64+
return (name == key.name) && (kind == key.kind);
6165
return (members == key.members) && (name == key.name) &&
6266
(incomplete == key.incomplete) && (packed == key.packed) &&
6367
(kind == key.kind) && (ast == key.ast);
6468
}
6569

6670
static llvm::hash_code hashKey(const KeyTy &key) {
67-
return hash_combine(key.members, key.name, key.incomplete, key.packed,
68-
key.kind, key.ast);
71+
if (key.name)
72+
return llvm::hash_combine(key.name, key.kind);
73+
return llvm::hash_combine(key.members, key.incomplete, key.packed, key.kind,
74+
key.ast);
6975
}
7076

7177
static StructTypeStorage *construct(TypeStorageAllocator &allocator,
@@ -74,6 +80,32 @@ struct StructTypeStorage : public TypeStorage {
7480
StructTypeStorage(allocator.copyInto(key.members), key.name,
7581
key.incomplete, key.packed, key.kind, key.ast);
7682
}
83+
84+
/// Mutates the members and attributes an identified struct.
85+
///
86+
/// Once a record is mutated, it is marked as complete, preventing further
87+
/// mutations. Anonymous structs are always complete and cannot be mutated.
88+
/// This method does not fail if a mutation of a complete struct does not
89+
/// change the struct.
90+
LogicalResult mutate(TypeStorageAllocator &allocator, ArrayRef<Type> members,
91+
bool packed, ASTRecordDeclInterface ast) {
92+
// Anonymous structs cannot mutate.
93+
if (!name)
94+
return failure();
95+
96+
// Mutation of complete structs are allowed if they change nothing.
97+
if (!incomplete)
98+
return mlir::success((this->members == members) &&
99+
(this->packed == packed) && (this->ast == ast));
100+
101+
// Mutate incomplete struct.
102+
this->members = allocator.copyInto(members);
103+
this->packed = packed;
104+
this->ast = ast;
105+
106+
incomplete = false;
107+
return success();
108+
}
77109
};
78110

79111
} // namespace detail

clang/lib/CIR/CodeGen/CIRGenBuilder.h

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -435,6 +435,9 @@ class CIRGenBuilderTy : public CIRBaseBuilderTy {
435435
}
436436

437437
/// Get a CIR named struct type.
438+
///
439+
/// If a struct already exists and is complete, but the client tries to fetch
440+
/// it with a different set of attributes, this method will crash.
438441
mlir::cir::StructType getCompleteStructTy(llvm::ArrayRef<mlir::Type> members,
439442
llvm::StringRef name, bool packed,
440443
const clang::RecordDecl *ast) {
@@ -445,8 +448,16 @@ class CIRGenBuilderTy : public CIRBaseBuilderTy {
445448
astAttr = getAttr<mlir::cir::ASTRecordDeclAttr>(ast);
446449
kind = getRecordKind(ast->getTagKind());
447450
}
448-
return getType<mlir::cir::StructType>(members, nameAttr, packed, kind,
449-
astAttr);
451+
452+
// Create or get the struct.
453+
auto type = getType<mlir::cir::StructType>(members, nameAttr, packed, kind,
454+
astAttr);
455+
456+
// Complete an incomplete struct or ensure the existing complete struct
457+
// matches the requested attributes.
458+
type.complete(members, packed, astAttr);
459+
460+
return type;
450461
}
451462

452463
//
@@ -689,7 +700,6 @@ class CIRGenBuilderTy : public CIRBaseBuilderTy {
689700
auto flag = getBool(val, loc);
690701
return create<mlir::cir::StoreOp>(loc, flag, dst);
691702
}
692-
693703
};
694704

695705
} // namespace cir

clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -596,7 +596,7 @@ std::unique_ptr<CIRGenRecordLayout>
596596
CIRGenTypes::computeRecordLayout(const RecordDecl *D,
597597
mlir::cir::StructType *Ty) {
598598
CIRRecordLowering builder(*this, D, /*packed=*/false);
599-
599+
assert(Ty->isIncomplete() && "recomputing record layout?");
600600
builder.lower(/*nonVirtualBaseType=*/false);
601601

602602
// If we're in C++, compute the base subobject type.

clang/lib/CIR/Dialect/IR/CIRTypes.cpp

Lines changed: 46 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,9 @@ Type StructType::getLargestMember(const ::mlir::DataLayout &dataLayout) const {
128128
}
129129

130130
Type StructType::parse(mlir::AsmParser &parser) {
131+
FailureOr<AsmParser::CyclicParseReset> cyclicParseGuard;
131132
const auto loc = parser.getCurrentLocation();
133+
const auto eLoc = parser.getEncodedSourceLoc(loc);
132134
bool packed = false;
133135
RecordKind kind;
134136
auto *context = parser.getContext();
@@ -152,6 +154,26 @@ Type StructType::parse(mlir::AsmParser &parser) {
152154
mlir::StringAttr name;
153155
parser.parseOptionalAttribute(name);
154156

157+
// Is a self reference: ensure referenced type was parsed.
158+
if (name && parser.parseOptionalGreater().succeeded()) {
159+
auto type = getChecked(eLoc, context, name, kind);
160+
if (succeeded(parser.tryStartCyclicParse(type))) {
161+
parser.emitError(loc, "invalid self-reference within record");
162+
return {};
163+
}
164+
return type;
165+
}
166+
167+
// Is a named record definition: ensure name has not been parsed yet.
168+
if (name) {
169+
auto type = getChecked(eLoc, context, name, kind);
170+
cyclicParseGuard = parser.tryStartCyclicParse(type);
171+
if (failed(cyclicParseGuard)) {
172+
parser.emitError(loc, "record already defined");
173+
return {};
174+
}
175+
}
176+
155177
if (parser.parseOptionalKeyword("packed").succeeded())
156178
packed = true;
157179

@@ -176,14 +198,17 @@ Type StructType::parse(mlir::AsmParser &parser) {
176198
if (parser.parseGreater())
177199
return {};
178200

179-
// Try to create the proper type.
180-
mlir::Type type = {};
201+
// Try to create the proper record type.
181202
ArrayRef<mlir::Type> membersRef(members); // Needed for template deduction.
182-
const auto eLoc = parser.getEncodedSourceLoc(loc);
203+
mlir::Type type = {};
183204
if (name && incomplete) { // Identified & incomplete
184205
type = getChecked(eLoc, context, name, kind);
185206
} else if (name && !incomplete) { // Identified & complete
186207
type = getChecked(eLoc, context, membersRef, name, packed, kind);
208+
// If the record has a self-reference, its type already exists in a
209+
// incomplete state. In this case, we must complete it.
210+
if (type.cast<StructType>().isIncomplete())
211+
type.cast<StructType>().complete(membersRef, packed, ast);
187212
} else if (!name && !incomplete) { // anonymous & complete
188213
type = getChecked(eLoc, context, membersRef, packed, kind);
189214
} else { // anonymous & incomplete
@@ -195,6 +220,7 @@ Type StructType::parse(mlir::AsmParser &parser) {
195220
}
196221

197222
void StructType::print(mlir::AsmPrinter &printer) const {
223+
FailureOr<AsmPrinter::CyclicPrintReset> cyclicPrintGuard;
198224
printer << '<';
199225

200226
switch (getKind()) {
@@ -210,7 +236,17 @@ void StructType::print(mlir::AsmPrinter &printer) const {
210236
}
211237

212238
if (getName())
213-
printer << getName() << " ";
239+
printer << getName();
240+
241+
// Current type has already been printed: print as self reference.
242+
cyclicPrintGuard = printer.tryStartCyclicPrint(*this);
243+
if (failed(cyclicPrintGuard)) {
244+
printer << '>';
245+
return;
246+
}
247+
248+
// Type not yet printed: continue printing the entire record.
249+
printer << ' ';
214250

215251
if (getPacked())
216252
printer << "packed ";
@@ -308,6 +344,12 @@ mlir::cir::StructType::RecordKind StructType::getKind() const {
308344

309345
ASTRecordDeclInterface StructType::getAst() const { return getImpl()->ast; }
310346

347+
void StructType::complete(ArrayRef<Type> members, bool packed,
348+
ASTRecordDeclInterface ast) {
349+
if (mutate(members, packed, ast).failed())
350+
llvm_unreachable("failed to complete struct");
351+
}
352+
311353
//===----------------------------------------------------------------------===//
312354
// Data Layout information for types
313355
//===----------------------------------------------------------------------===//

clang/test/CIR/CodeGen/agg-init.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++17 -fclangir-enable -Wno-unused-value -emit-cir %s -o %t.cir
22
// RUN: FileCheck --input-file=%t.cir %s
33

4-
// CHECK: !ty_22yep_22 = !cir.struct<struct "yep_" {!u32i, !u32i}>
4+
// CHECK: !ty_22yep_22 = !cir.struct<struct "yep_" {!cir.int<u, 32>, !cir.int<u, 32>}>
55

66
typedef enum xxy_ {
77
xxy_Low = 0,

clang/test/CIR/CodeGen/atomic.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,4 +7,4 @@ typedef struct _a {
77

88
void m() { at y; }
99

10-
// CHECK: !ty_22_a22 = !cir.struct<struct "_a" {!s32i}>
10+
// CHECK: !ty_22_a22 = !cir.struct<struct "_a" {!cir.int<s, 32>}>

clang/test/CIR/CodeGen/bitfields.c

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -27,10 +27,10 @@ typedef struct {
2727
int a : 3; // one bitfield with size < 8
2828
unsigned b;
2929
} T;
30-
// CHECK: !ty_22S22 = !cir.struct<struct "S" {!u32i, !u32i, !u16i, !u32i}>
31-
// CHECK: !ty_22T22 = !cir.struct<struct "T" {!u8i, !u32i} #cir.record.decl.ast>
32-
// CHECK: !ty_22anon2E122 = !cir.struct<struct "anon.1" {!u32i} #cir.record.decl.ast>
33-
// CHECK: !ty_22__long22 = !cir.struct<struct "__long" {!ty_22anon2E122, !u32i, !cir.ptr<!u32i>}>
30+
// CHECK: !ty_22S22 = !cir.struct<struct "S" {!cir.int<u, 32>, !cir.int<u, 32>, !cir.int<u, 16>, !cir.int<u, 32>}>
31+
// CHECK: !ty_22T22 = !cir.struct<struct "T" {!cir.int<u, 8>, !cir.int<u, 32>} #cir.record.decl.ast>
32+
// CHECK: !ty_22anon2E122 = !cir.struct<struct "anon.1" {!cir.int<u, 32>} #cir.record.decl.ast>
33+
// CHECK: !ty_22__long22 = !cir.struct<struct "__long" {!cir.struct<struct "anon.1" {!cir.int<u, 32>} #cir.record.decl.ast>, !cir.int<u, 32>, !cir.ptr<!cir.int<u, 32>>}>
3434

3535
// CHECK: cir.func {{.*@store_field}}
3636
// CHECK: [[TMP0:%.*]] = cir.alloca !ty_22S22, cir.ptr <!ty_22S22>

clang/test/CIR/CodeGen/bitfields.cpp

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -27,11 +27,10 @@ typedef struct {
2727
int a : 3; // one bitfield with size < 8
2828
unsigned b;
2929
} T;
30-
31-
// CHECK: !ty_22S22 = !cir.struct<struct "S" {!u32i, !u32i, !u16i, !u32i}>
32-
// CHECK: !ty_22T22 = !cir.struct<struct "T" {!u8i, !u32i} #cir.record.decl.ast>
33-
// CHECK: !ty_22anon2E122 = !cir.struct<struct "anon.1" {!u32i} #cir.record.decl.ast>
34-
// CHECK: !ty_22__long22 = !cir.struct<struct "__long" {!ty_22anon2E122, !u32i, !cir.ptr<!u32i>}>
30+
// CHECK: !ty_22S22 = !cir.struct<struct "S" {!cir.int<u, 32>, !cir.int<u, 32>, !cir.int<u, 16>, !cir.int<u, 32>}>
31+
// CHECK: !ty_22T22 = !cir.struct<struct "T" {!cir.int<u, 8>, !cir.int<u, 32>} #cir.record.decl.ast>
32+
// CHECK: !ty_22anon2E122 = !cir.struct<struct "anon.1" {!cir.int<u, 32>} #cir.record.decl.ast>
33+
// CHECK: !ty_22__long22 = !cir.struct<struct "__long" {!cir.struct<struct "anon.1" {!cir.int<u, 32>} #cir.record.decl.ast>, !cir.int<u, 32>, !cir.ptr<!cir.int<u, 32>>}>
3534

3635
// CHECK: cir.func @_Z11store_field
3736
// CHECK: [[TMP0:%.*]] = cir.alloca !ty_22S22, cir.ptr <!ty_22S22>,

clang/test/CIR/CodeGen/coro-task.cpp

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -126,13 +126,13 @@ co_invoke_fn co_invoke;
126126

127127
}} // namespace folly::coro
128128

129-
// CHECK-DAG: ![[IntTask:.*]] = !cir.struct<struct "folly::coro::Task<int>" {!u8i}>
130-
// CHECK-DAG: ![[VoidTask:.*]] = !cir.struct<struct "folly::coro::Task<void>" {!u8i}>
131-
// CHECK-DAG: ![[VoidPromisse:.*]] = !cir.struct<struct "folly::coro::Task<void>::promise_type" {!u8i}>
132-
// CHECK-DAG: ![[CoroHandleVoid:.*]] = !cir.struct<struct "std::coroutine_handle<void>" {!u8i}>
133-
// CHECK-DAG: ![[CoroHandlePromise:ty_.*]] = !cir.struct<struct "std::coroutine_handle<folly::coro::Task<void>::promise_type>" {!u8i}>
134-
// CHECK-DAG: ![[StdString:.*]] = !cir.struct<struct "std::string" {!u8i}>
135-
// CHECK-DAG: ![[SuspendAlways:.*]] = !cir.struct<struct "std::suspend_always" {!u8i}>
129+
// CHECK-DAG: ![[IntTask:.*]] = !cir.struct<struct "folly::coro::Task<int>" {!cir.int<u, 8>}>
130+
// CHECK-DAG: ![[VoidTask:.*]] = !cir.struct<struct "folly::coro::Task<void>" {!cir.int<u, 8>}>
131+
// CHECK-DAG: ![[VoidPromisse:.*]] = !cir.struct<struct "folly::coro::Task<void>::promise_type" {!cir.int<u, 8>}>
132+
// CHECK-DAG: ![[CoroHandleVoid:.*]] = !cir.struct<struct "std::coroutine_handle<void>" {!cir.int<u, 8>}>
133+
// CHECK-DAG: ![[CoroHandlePromise:ty_.*]] = !cir.struct<struct "std::coroutine_handle<folly::coro::Task<void>::promise_type>" {!cir.int<u, 8>}>
134+
// CHECK-DAG: ![[StdString:.*]] = !cir.struct<struct "std::string" {!cir.int<u, 8>}>
135+
// CHECK-DAG: ![[SuspendAlways:.*]] = !cir.struct<struct "std::suspend_always" {!cir.int<u, 8>}>
136136

137137
// CHECK: module {{.*}} {
138138
// CHECK-NEXT: cir.global external @_ZN5folly4coro9co_invokeE = #cir.zero : !ty_22folly3A3Acoro3A3Aco_invoke_fn22

0 commit comments

Comments
 (0)