@LekkalaSravya3

@LekkalaSravya3 LekkalaSravya3 commented Sep 22, 2025

This PR addresses issue #159746

This patch extends the MemRef-to-EmitC TypeConverter to handle pointer types, enabling proper conversion of MemRef types that are represented as ptr<array<>>.
The update ensures that operations involving allocations or memory accesses are correctly emitted in C as pointers to arrays.

Key changes:

  • Added pointer-array type handling logic in the TypeConverter.
  • Updated MemRef-to-EmitC conversion patterns to generate appropriate C code for pointer-array types.
  • Enhanced the C emitter to correctly print and initialize pointer-array types during code emission.

The example file demonstrates how MemRef types represented as pointers to arrays are now emitted as C code, addressing the issue described in #159746.
Example.txt
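
For readers unfamiliar with the pointer-to-array form, here is a minimal C++ sketch of what accessing a buffer through such a pointer looks like. It is not the content of Example.txt; the function name `sum_row`, the 2x3 shape, and the use of `reinterpret_cast` on a flat buffer are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: view a flat buffer of 2*3 floats as a 2x3 array through
// a pointer-to-array, loosely mirroring a ptr<array<2x3xf32>> value.
float sum_row(float *flat, std::size_t row) {
  assert(row < 2 && "row out of range");
  // Pointer to the whole 2x3 array; the parentheses bind `*` to the name.
  float (*view)[2][3] = reinterpret_cast<float (*)[2][3]>(flat);
  float total = 0.0f;
  for (std::size_t col = 0; col < 3; ++col)
    total += (*view)[row][col]; // dereference first, then subscript
  return total;
}
```

Calling `sum_row` on a six-element buffer with `row` set to 1 adds up the last three elements, since the rows are laid out contiguously.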

@github-actions

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Member

llvmbot commented Sep 22, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-emitc

Author: Lekkala_Sravya-mcw (LekkalaSravya3)

Changes

This PR addresses issue #159746

  • Implemented proper handling of UnrealizedConversionCastOp when lowering from EmitC IR to the target C++ code using the EmitC emitter.
  • Added tests to ensure the generated C++ code is valid.

Full diff: https://github.com/llvm/llvm-project/pull/160159.diff

2 Files Affected:

  • (modified) mlir/lib/Target/Cpp/TranslateToCpp.cpp (+83-1)
  • (added) mlir/test/Target/Cpp/unrealized_conversion_cast.mlir (+15)
diff --git a/mlir/lib/Target/Cpp/TranslateToCpp.cpp b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
index a5bd80e9d6b8b..80d0c83049ac7 100644
--- a/mlir/lib/Target/Cpp/TranslateToCpp.cpp
+++ b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
@@ -782,6 +782,64 @@ static LogicalResult printOperation(CppEmitter &emitter,
   return success();
 }
 
+static LogicalResult printOperation(CppEmitter &emitter,
+                                    mlir::UnrealizedConversionCastOp castOp) {
+  raw_ostream &os = emitter.ostream();
+  Operation &op = *castOp.getOperation();
+
+  if (castOp.getResults().size() != 1 || castOp.getOperands().size() != 1) {
+    return castOp.emitOpError(
+        "expected single result and single operand for conversion cast");
+  }
+
+  Type destType = castOp.getResult(0).getType();
+
+  auto srcPtrType =
+      mlir::dyn_cast<emitc::PointerType>(castOp.getOperand(0).getType());
+  auto destArrayType = mlir::dyn_cast<emitc::ArrayType>(destType);
+
+  if (srcPtrType && destArrayType) {
+
+    // Emit declaration: (*v13)[dims] =
+    if (failed(emitter.emitType(op.getLoc(), destArrayType.getElementType())))
+      return failure();
+    os << " (*" << emitter.getOrCreateName(op.getResult(0)) << ")";
+    for (int64_t dim : destArrayType.getShape())
+      os << "[" << dim << "]";
+    os << " = ";
+
+    os << "(";
+
+    // Emit the C++ type for "datatype (*)[dim1][dim2]..."
+    if (failed(emitter.emitType(op.getLoc(), destArrayType.getElementType())))
+      return failure();
+
+    os << "(*)"; // Pointer to array
+
+    for (int64_t dim : destArrayType.getShape()) {
+      os << "[" << dim << "]";
+    }
+    os << ")";
+    if (failed(emitter.emitOperand(castOp.getOperand(0))))
+      return failure();
+
+    return success();
+  }
+
+  // Fallback to generic C-style cast for other cases
+  if (failed(emitter.emitAssignPrefix(op)))
+    return failure();
+
+  os << "(";
+  if (failed(emitter.emitType(op.getLoc(), destType)))
+    return failure();
+  os << ")";
+  if (failed(emitter.emitOperand(castOp.getOperand(0))))
+    return failure();
+
+  return success();
+}
+
 static LogicalResult printOperation(CppEmitter &emitter,
                                     emitc::ApplyOp applyOp) {
   raw_ostream &os = emitter.ostream();
@@ -1291,7 +1349,29 @@ CppEmitter::CppEmitter(raw_ostream &os, bool declareVariablesAtTop,
 std::string CppEmitter::getSubscriptName(emitc::SubscriptOp op) {
   std::string out;
   llvm::raw_string_ostream ss(out);
-  ss << getOrCreateName(op.getValue());
+  Value baseValue = op.getValue();
+
+  // Check if the baseValue (%arg1) is a result of UnrealizedConversionCastOp
+  // that converts a pointer to an array type.
+  if (auto castOp = dyn_cast_or_null<mlir::UnrealizedConversionCastOp>(
+          baseValue.getDefiningOp())) {
+    auto destArrayType =
+        mlir::dyn_cast<emitc::ArrayType>(castOp.getResult(0).getType());
+    auto srcPtrType =
+        mlir::dyn_cast<emitc::PointerType>(castOp.getOperand(0).getType());
+
+    // If it's a pointer being cast to an array, emit (*varName)
+    if (srcPtrType && destArrayType) {
+      ss << "(*" << getOrCreateName(baseValue) << ")";
+    } else {
+      // Fallback if the cast is not our specific pointer-to-array case
+      ss << getOrCreateName(baseValue);
+    }
+  } else {
+    // Default behavior for a regular array or other base types
+    ss << getOrCreateName(baseValue);
+  }
+
   for (auto index : op.getIndices()) {
     ss << "[" << getOrCreateName(index) << "]";
   }
@@ -1747,6 +1827,8 @@ LogicalResult CppEmitter::emitOperation(Operation &op, bool trailingSemicolon) {
             cacheDeferredOpResult(op.getResult(), getSubscriptName(op));
             return success();
           })
+          .Case<mlir::UnrealizedConversionCastOp>(
+              [&](auto op) { return printOperation(*this, op); })
           .Default([&](Operation *) {
             return op.emitOpError("unable to find printer for op");
           });
diff --git a/mlir/test/Target/Cpp/unrealized_conversion_cast.mlir b/mlir/test/Target/Cpp/unrealized_conversion_cast.mlir
new file mode 100644
index 0000000000000..075a268821fa1
--- /dev/null
+++ b/mlir/test/Target/Cpp/unrealized_conversion_cast.mlir
@@ -0,0 +1,15 @@
+// RUN: mlir-translate -mlir-to-cpp %s | FileCheck %s
+
+// CHECK-LABEL: void builtin_cast
+func.func @builtin_cast(%arg0: !emitc.ptr<f32>){
+    // CHECK: float (*v2)[1][3][4][4] = (float(*)[1][3][4][4])v1
+  %1 = builtin.unrealized_conversion_cast %arg0 : !emitc.ptr<f32> to !emitc.array<1x3x4x4xf32>
+return
+}
+
+// CHECK-LABEL: void builtin_cast_index
+func.func @builtin_cast_index(%arg0:  !emitc.size_t){
+    // CHECK: size_t v2 = (size_t)v1
+  %1 = builtin.unrealized_conversion_cast %arg0 : !emitc.size_t to index
+return
+}
\ No newline at end of file
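
The text that the pointer-to-array branch of `printOperation` writes can be reproduced with plain string building, which may make the diff easier to follow. This is only a sketch: `formatPtrToArrayCast` is a hypothetical helper, not part of MLIR, and it takes already-printed type and value names as strings instead of going through `CppEmitter`. The non-empty shape precondition is an assumption of the sketch, not something the patch checks.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of the MLIR API: builds the same text the
// pointer-to-array branch of the patch streams to the output.
std::string formatPtrToArrayCast(const std::string &elemType,
                                 const std::string &resultName,
                                 const std::string &operandName,
                                 const std::vector<int64_t> &shape) {
  // Assumption of this sketch: the array has at least one dimension.
  assert(!shape.empty() && "expected at least one dimension");
  std::string dims;
  for (int64_t dim : shape)
    dims += "[" + std::to_string(dim) + "]";
  // Declaration part: elemType (*name)[d0][d1]... =
  std::string out = elemType + " (*" + resultName + ")" + dims + " = ";
  // Cast part: (elemType(*)[d0][d1]...)operand
  out += "(" + elemType + "(*)" + dims + ")" + operandName;
  return out;
}
```

With `float`, `v2`, `v1` and the shape 1x3x4x4, the helper yields the line that the first test case in the diff looks for.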

@LekkalaSravya3 LekkalaSravya3 changed the title Handled UnrealizedConversionCast for C code generation and validated … Handled UnrealizedConversionCast for C code generation and validated tests Sep 22, 2025
@LekkalaSravya3
Author

LekkalaSravya3 commented Sep 24, 2025

@marbre , @ilovepi — when you have some time, could you please review this PR?

Member

@marbre marbre left a comment

Drive-by comment. Also assigned reviewers; will try to allocate time for a proper review later if the others don't get to it before I have a chance.

Contributor

@ilovepi ilovepi left a comment

LGTM.

@LekkalaSravya3 LekkalaSravya3 force-pushed the cpp-emitter_unrealized_conversion_cast branch from 354a234 to ec47d01 Compare September 25, 2025 10:19
@aniragil
Contributor

Thanks for reporting and addressing this issue.
I'm not entirely sure though that the solution should be adding support for unrealized casts in the translator. As we end up with the unrealized type conversion, it seems our type conversion in MemRefToEmitC may not cover this case.
I wonder if the solution you propose (generating a cast to a pointer to an array and supporting it in emitc.subscript) can be implemented as part of lowering so that we don't get the unrealized cast in the first place?

@simon-camp
Contributor

Hi @LekkalaSravya3, thanks for the contribution. Wouldn't we need to decay the outermost array dimension to a pointer, i.e. array<2x3xi32> -> ptr<array<3xi32>>?
Additionally I also think this should be fixed in the lowering of memref.alloc and the type conversion. Though I haven't thought about all details.
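
In C++ terms, the decay mentioned here corresponds to `int[2][3]` becoming `int (*)[3]`. The following sketch checks that at compile time; the alias names and the accessor `elementAt` are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// array<2x3xi32> corresponds to int[2][3] in C/C++.
using Array2x3 = int[2][3];
// Decaying the outermost dimension gives ptr<array<3xi32>>, i.e. int (*)[3].
using DecayedPtr = int (*)[3];

static_assert(std::is_same<std::decay<Array2x3>::type, DecayedPtr>::value,
              "int[2][3] decays to int (*)[3]");

// Reads element [i][j] through the decayed pointer; the inner extent (3) is
// still part of the type, the outer extent (2) is not.
int elementAt(DecayedPtr rows, int i, int j) {
  assert(j >= 0 && j < 3 && "inner index out of range");
  return rows[i][j];
}
```

Passing an `int[2][3]` variable to `elementAt` compiles without a cast because the argument decays implicitly.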

Thanks for reporting and addressing this issue. I'm not entirely sure though that the solution should be adding support for unrealized casts in the translator. As we end up with the unrealized type conversion, it seems our type conversion in MemRefToEmitC may not cover this case. I wonder if the solution you propose (generating a cast to a pointer to an array and supporting it in emitc.subscript) can be implemented as part of lowering so that we don't get the unrealized cast in the first place?

I think that is the correct place to fix this. Adding support for ptr(array) mostly involves updating the emitter. IIRC this type cannot simply be emitted left to right, so we would need a bit of bookkeeping in emitType. I think I have a Python prototype lying around for when/where to add parentheses to the emitted type.

Do you suggest that subscript will support emitc.subscript %a[%c3, %c1] : ptr<array<2xi32>> -> i32 directly? That vastly simplifies the problem then I think.
If we would need to split between pointer and array indexing it will get complicated, as the outer subscript won't be able to get materialized (it's emitted as an assignment to array type). Otherwise we would need to wrap all this into an expression with a combination of subscripts and loads.

@marbre marbre requested a review from simon-camp September 29, 2025 11:54
@aniragil
Contributor

aniragil commented Oct 5, 2025

I think that is the correct place to fix this. Adding support for ptr(array) mostly involves updating the emitter. IIRC this type cannot simply be emitted left to right, so we would need a bit of bookkeeping in emitType. I think I have a Python prototype lying around for when/where to add parentheses to the emitted type.

Not sure I follow: operators * and [] have the same (top) precedence and left-to-right associativity, right?
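
As a reference point for the parenthesization question, here is a small C++ sketch of how `*` and `[]` group in declarators and in expressions. The alias and function names are illustrative assumptions; the snippet only restates standard C++ behavior and does not reflect any emitter code.

```cpp
#include <cassert>
#include <type_traits>

// Without parentheses `[]` binds to the name first: array of 3 pointers to int.
using ArrayOfPtrs = int *[3];
// With parentheses `*` binds first: pointer to an array of 3 ints.
using PtrToArray = int (*)[3];

static_assert(!std::is_same<ArrayOfPtrs, PtrToArray>::value, "distinct types");
static_assert(std::is_array<ArrayOfPtrs>::value, "array of pointers");
static_assert(std::is_pointer<PtrToArray>::value, "pointer to array");

// The same grouping shows up in expressions: postfix [] applies before
// unary *, so *p[1] parses as *(p[1]).
int secondOfFirstRow(PtrToArray p) { return (*p)[1]; }
int firstOfSecondRow(PtrToArray p) { return *p[1]; }
```

The two accessors return different elements for the same pointer, which is why a printer for this type has to decide where the parentheses go.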

Do you suggest that subscript will support emitc.subscript %a[%c3, %c1] : ptr<array<2xi32>> -> i32 directly? That vastly simplifies the problem then I think.

Yes, as emitc.subscript already supports pointers but forces a single index in verify() we could arguably generalize it to support (element-type-rank + 1) indices.
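
A C++ analogue of the generalized subscript may help here: indexing a pointer to `array<2xi32>` with two indices is already valid C, so the emitted code would need no extra dereference. The function name `subscript2` and the test data are assumptions for illustration.

```cpp
#include <cassert>

// C++ analogue of `emitc.subscript %a[%i, %j] : ptr<array<2xi32>> -> i32`.
// The first index steps the pointer by whole arrays; the second indexes
// within the selected array (element-type rank 1, so 1 + 1 indices).
int subscript2(int (*a)[2], int i, int j) {
  assert(j >= 0 && j < 2 && "inner index out of range");
  return a[i][j];
}
```

With indices 3 and 1, as in the quoted example, this reads the second element of the fourth array that `a` points into.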

If we would need to split between pointer and array indexing it will get complicated, as the outer subscript won't be able to get materialized (it's emitted as an assignment to array type).

Right, the emitc.apply * op predates the lvalue changes: it doesn't return an lvalue but also loads the value (and is therefore marked as having a side effect). Since the pointer dereference would be done first, yielding an array, we'll need to add the emitc.apply op to the "deferred emission" mechanism when the dereferenced type is an array to solve the materialization problem, and mark the op as not having a side effect in that case.

Otherwise we would need to wrap all this into an expression with a combination of subscripts and loads.

Right, as an alternative to adding emitc.apply to "deferred emission" (by load you mean the user of the subscript, right?).
The emitc.apply * side effect may still need to be refined so that the expression can be inlined.

@LekkalaSravya3
Author

Thank you for the detailed suggestion @simon-camp , @aniragil

From my understanding, I should modify the conversion to directly lower the input memref type to the !emitc.ptr<!emitc.array<...>> format and handle conversions using emitc::ApplyOp, ensuring that emitc::SubscriptOp supports the necessary index handling.

I just wanted to clarify one point — for cases like:

builtin.unrealized_conversion_cast %arg0 : !emitc.size_t to index

Would it make sense to handle the !emitc.size_t ↔ index conversion in a similar way within the TypeConverter? Specifically, by mapping index directly to !emitc.size_t and handling it uniformly during lowering, so we can avoid introducing UnrealizedConversionCastOp for these cases?

Would that approach be consistent with your suggestion?

@aniragil
Contributor

From my understanding, I should modify the conversion to directly lower the input memref type to the !emitc.ptr<!emitc.array<...>> format and handle conversions using emitc::ApplyOp, ensuring that emitc::SubscriptOp supports the necessary index handling.

Right, except we were hoping you could use only emitc.subscript (by extending its existing support of emitc.ptr to multiple indices), thus avoiding the need to use emitc.apply as it would complicate things (right @simon-camp ?)

I just wanted to clarify one point — for cases like:

builtin.unrealized_conversion_cast %arg0 : !emitc.size_t to index

Would it make sense to handle the !emitc.size_t ↔ index conversion in a similar way within the TypeConverter? Specifically, by mapping index directly to !emitc.size_t and handling it uniformly during lowering, so we can avoid introducing UnrealizedConversionCastOp for these cases?

Would that approach be consistent with your suggestion?

@simon-camp is probably more qualified than me to answer concretely, but in general I think the answer is yes - we should avoid generating unrealized casts where possible (and AFAIK when they do get created they are usually created in pairs such that they cancel each other later). Is that related to the same problem you're fixing in this PR? If not, best to do that in a separate PR.

@simon-camp
Contributor

I would try to get this running https://godbolt.org/z/b7Ko1a6dv, by replacing %6 = cast %5 : !emitc.ptr<!emitc.opaque<"void">> to !emitc.ptr<i32> with %6 = cast %5 : !emitc.ptr<!emitc.opaque<"void">> to !emitc.ptr<!emitc.array<3xi32>>. The cast is generated in the AllocOp Conversion.

But as I think about this now, you would get another unrealized conversion cast from !emitc.ptr<!emitc.array<3xi32>> to !emitc.array<2x3xi32> unless every array type undergoes array-to-pointer decay.
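
To make the suggested cast concrete, this is roughly what casting the `void *` result of an allocation to a pointer to `array<3xi32>` looks like in C++. It is a sketch under assumptions: `Row3` and `allocRows` are invented names, and the actual code generated by the AllocOp conversion may differ (alignment handling, for instance, is ignored here).

```cpp
#include <cassert>
#include <cstdlib>

// One row of the 2x3 buffer; `Row3 *` is the C++ spelling of int (*)[3].
using Row3 = int[3];

// Allocates `rows` rows of 3 ints and returns the buffer as a pointer to
// rows, i.e. void* -> ptr<array<3xi32>> instead of void* -> ptr<i32>.
Row3 *allocRows(std::size_t rows) {
  void *raw = std::malloc(rows * sizeof(Row3));
  assert(raw && "allocation failed");
  return static_cast<Row3 *>(raw);
}
```

A caller can then write `m[1][2]` on the result and must release it with `std::free`.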

@LekkalaSravya3
Author

@simon-camp Yes, when I converted it to !emitc.ptr<!emitc.array<...>>, I encountered an unrealized_conversion_cast from !emitc.ptr<!emitc.array<...>> to !emitc.array<...>. This issue should be resolved once we update the type conversion to !emitc.ptr<!emitc.array<...>> and make the corresponding adjustments in the load and store implementations. Since those operations currently expect !emitc.array<...>, the cast gets inserted automatically. If that’s the case, I assume we’ll need to update all related operations to consistently use the !emitc.ptr<!emitc.array<...>> representation. I believe this will also require modifying the emitter to properly handle the updated pointer-to-array representation.
Would it be fine to proceed with these corresponding changes, or do you have any other suggestions?

@simon-camp
Contributor

Yes I think so. We need to be careful with allocas and globals though (and maybe function signatures, not sure), they need to be kept as Arrays so that the memory is allocated.

Alloca can be converted to an Array variable and a Cast that decays the outer Dimension.
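
The point about keeping allocas as arrays can be sketched in C++ as follows: the array variable is what reserves the storage, and the decayed pointer is only a view onto it. The function name `fillAndSum` and the 2x3 shape are assumptions for illustration.

```cpp
#include <cassert>

// Stack array (alloca analogue) plus a decayed pointer used for all accesses.
int fillAndSum() {
  int storage[2][3];        // array variable: reserves space for 6 ints
  int (*rows)[3] = storage; // decays the outer dimension; no new storage
  static_assert(sizeof(storage) == 6 * sizeof(int), "array owns the storage");

  int sum = 0;
  for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 3; ++j) {
      rows[i][j] = i * 3 + j; // write through the pointer
      sum += storage[i][j];   // read back through the array
    }
  }
  assert(rows[1] == storage[1] && "both name the same second row");
  return sum;
}
```

If only the pointer were declared, nothing would back the six elements, which is the concern raised about allocas and globals.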

@LekkalaSravya3 LekkalaSravya3 marked this pull request as draft October 23, 2025 05:13
@LekkalaSravya3 LekkalaSravya3 force-pushed the cpp-emitter_unrealized_conversion_cast branch from eff40b7 to a76c5a4 Compare October 23, 2025 08:57
@LekkalaSravya3 LekkalaSravya3 changed the title Handled UnrealizedConversionCast for C code generation and validated tests [MLIR][EmitC] Add support for pointer-array types in the TypeConverter and related MemRef-to-EmitC operations, and update the C emitter. Oct 23, 2025
@LekkalaSravya3 LekkalaSravya3 force-pushed the cpp-emitter_unrealized_conversion_cast branch from 63812c9 to 23acdce Compare October 28, 2025 18:05
@LekkalaSravya3
Author

Hi @simon-camp ,
I have updated the implementation to support pointer-to-array types in the MemRef-to-EmitC conversion and attached an updated test file in the description that demonstrates the expected behavior, based on the issue raised earlier.
Kindly review the changes and let me know if further modifications are needed.
Thanks!

@LekkalaSravya3
Author

Hi @simon-camp, just checking in on this.

Before I proceed further with updates on the PR, could you please clarify the intended scope?
For the array-decay-based approach you described (loosening cast/subscript verifiers, implicit decay in calls, emitter updates), would you prefer that it be included in this PR as well, or should it be handled in a follow-up PR after this one lands?
Once I have your confirmation, I’ll update accordingly.
Thanks again for your earlier feedback!

@simon-camp
Contributor

I don't have a strong opinion on this, this can be revisited later.

I had a look at your changes and it seems to me that you make the emission of the apply op deferred? Doesn't this change the semantics of the apply op and affect downstream users? @aniragil what's your opinion on this? (This currently also conflicts with #159975.)

@aniragil
Contributor

I had a look at your changes and it seems to me that you make the emission of the apply op deferred? Doesn't this change the semantics of the apply op and affect downstream users? @aniragil what's your opinion on this? (This currently also conflicts with #159975.)

That's actually in the spirit of what I proposed. I was thinking this would only apply when the resulting type is an array so semantics wouldn't change, but this still makes the emitc.apply semantics even more complex than it already is. So this might actually be a good opportunity to introduce new, lvalue-based ops for "*" and "&" that will facilitate retiring emitc.apply (something I've been hoping to do for a long time). WDYT @LekkalaSravya3, @simon-camp , @marbre ?

The "*" op will need to be introduced as hasDeferredEmission rather than a CExpressionInterface as #159975 is indeed still under review, but rebasing #159975 to include this new op should be fairly simple.

I'm not sure we should introduce these ops as part of this patch, though. If everyone is OK with adding these ops I can update my old patch to use lvalues (and keep emitc.apply intact until we retire it).


7 participants