Skip to content

Conversation

@erichkeane
Copy link
Collaborator

@erichkeane erichkeane commented Sep 22, 2025

After previous implementation, I discovered that we were both doing arrays incorrectly for recipes, plus didn't get the pointer allocations done correctly. This patch is the first of a few in a series that attempts to make sure we get all pointers/arrays correct.

This patch is limited to just 'private' and destructors, which simplifies the review significantly. Destructors are simply looped through and called at each level.

The 'recipe-decl' is the 'least bounded' (that is, the type of the
expression, in the type of int[5] i; #pragma acc parallel private(i[1]), the type of the recipe-decl is int. This allows
us to do init/destruction at the element level.

This patch also adds infrastructure for the rest of the series of private (for the init section), as well as extensive testing for 'private', with a lot of 'TODO' locations.

Future patches will fill these in, but at the moment, there is an NYI warning for bounds, so a number of tests are updated to handle that.

After previous implementation, I discovered that we were both doing
arrays incorrectly for recipes, plus didn't get the pointer allocations
done correctly.  This patch is the first of a few in a series that
attempts to make sure we get all pointers/arrays correct.

This patch is limited to just 'private' and destructors, which
simplifies the review significantly. Destructors are simply looped
through and called at each level.

The 'recipe-decl' is the 'least bounded' (that is, the type of the
    expression, in the type of `int[5] i; #pragma acc parallel
    private(i[1]), the type of the `recipe-decl` is `int`.  This allows
    us to do init/destruction at the element level.

This patch also adds infrastructure for the rest of the series of
private (for the init section), as well as extensive testing for
'private', with a lot of 'TODO' locations.

Future patches will fill these in, but at the moment, there is an NYI
warning for bounds, so a number of tests are updated to handle that.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" ClangIR Anything related to the ClangIR project labels Sep 22, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2025

@llvm/pr-subscribers-clangir

@llvm/pr-subscribers-clang

Author: Erich Keane (erichkeane)

Changes

After previous implementation, I discovered that we were both doing arrays incorrectly for recipes, plus didn't get the pointer allocations done correctly. This patch is the first of a few in a series that attempts to make sure we get all pointers/arrays correct.

This patch is limited to just 'private' and destructors, which simplifies the review significantly. Destructors are simply looped through and called at each level.

The 'recipe-decl' is the 'least bounded' (that is, the type of the
expression, in the type of int[5] i; #pragma acc parallel private(i[1]), the type of the recipe-declisint`. This allows
us to do init/destruction at the element level.

This patch also adds infrastructure for the rest of the series of private (for the init section), as well as extensive testing for 'private', with a lot of 'TODO' locations.

Future patches will fill these in, but at the moment, there is an NYI warning for bounds, so a number of tests are updated to handle that.


Patch is 271.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/160189.diff

41 Files Affected:

  • (modified) clang/include/clang/AST/OpenACCClause.h (+1-3)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+7)
  • (modified) clang/lib/CIR/CodeGen/CIRGenOpenACC.cpp (+32-4)
  • (modified) clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp (+21-22)
  • (modified) clang/lib/CIR/CodeGen/CIRGenOpenACCRecipe.h (+255-38)
  • (modified) clang/lib/Sema/SemaOpenACC.cpp (+15-4)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-firstprivate-clause.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-private-clause.cpp (+68-76)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-default-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-float.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-inline-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-int.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-outline-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-firstprivate-clause.c (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-firstprivate-clause.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-private-clause.c (+22-22)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-private-clause.cpp (+68-76)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-default-ops.c (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-default-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-float.c (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-float.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-inline-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-int.c (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-int.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-outline-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-unsigned-int.c (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-private-clause.cpp (+68-76)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-default-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-float.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-inline-ops.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-int.cpp (+1-1)
  • (modified) clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-outline-ops.cpp (+1-1)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-array-recipes-CtorDtor.cpp (+448)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-array-recipes-NoOps.cpp (+142)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-array-recipes-int.cpp (+87)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-array-recipes-CtorDtor.cpp (+845)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-array-recipes-NoOps.cpp (+253)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-array-recipes-int.cpp (+251)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-recipes-CtorDtor.cpp (+291)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-recipes-NoOps.cpp (+117)
  • (added) clang/test/CIR/CodeGenOpenACC/private-clause-pointer-recipes-int.cpp (+116)
diff --git a/clang/include/clang/AST/OpenACCClause.h b/clang/include/clang/AST/OpenACCClause.h
index 081244fe0efb6..5f06117d65a47 100644
--- a/clang/include/clang/AST/OpenACCClause.h
+++ b/clang/include/clang/AST/OpenACCClause.h
@@ -842,9 +842,7 @@ struct OpenACCPrivateRecipe {
   VarDecl *AllocaDecl;
   Expr *InitExpr;
 
-  OpenACCPrivateRecipe(VarDecl *A, Expr *I) : AllocaDecl(A), InitExpr(I) {
-    assert(!AllocaDecl || AllocaDecl->getInit() == nullptr);
-  }
+  OpenACCPrivateRecipe(VarDecl *A, Expr *I) : AllocaDecl(A), InitExpr(I) {}
 
   bool isSet() const { return AllocaDecl; }
 
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index a0c571a544322..7413f5c8b2f79 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -1715,8 +1715,15 @@ class CIRGenFunction : public CIRGenTypeCache {
     mlir::Location beginLoc;
     mlir::Value varValue;
     std::string name;
+    // The type of the original variable reference: that is, after 'bounds' have
+    // removed pointers/array types/etc. So in the case of int arr[5], and a
+    // private(arr[1]), 'origType' is 'int', but 'baseType' is 'int[5]'.
+    QualType origType;
     QualType baseType;
     llvm::SmallVector<mlir::Value> bounds;
+    // The list of types that we found when going through the bounds, which we
+    // can use to properly set the alloca section.
+    llvm::SmallVector<QualType> boundTypes;
   };
   // Gets the collection of info required to lower and OpenACC clause or cache
   // construct variable reference.
diff --git a/clang/lib/CIR/CodeGen/CIRGenOpenACC.cpp b/clang/lib/CIR/CodeGen/CIRGenOpenACC.cpp
index 907cb5fa11401..7f9350a9e4173 100644
--- a/clang/lib/CIR/CodeGen/CIRGenOpenACC.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenOpenACC.cpp
@@ -68,14 +68,33 @@ mlir::Value CIRGenFunction::createOpenACCConstantInt(mlir::Location loc,
 CIRGenFunction::OpenACCDataOperandInfo
 CIRGenFunction::getOpenACCDataOperandInfo(const Expr *e) {
   const Expr *curVarExpr = e->IgnoreParenImpCasts();
+  QualType origType =
+      curVarExpr->getType().getNonReferenceType().getUnqualifiedType();
+  // Array sections are special, and we have to treat them that way.
+  if (const auto *section =
+          dyn_cast<ArraySectionExpr>(curVarExpr->IgnoreParenImpCasts()))
+    origType = ArraySectionExpr::getBaseOriginalType(section);
 
   mlir::Location exprLoc = cgm.getLoc(curVarExpr->getBeginLoc());
   llvm::SmallVector<mlir::Value> bounds;
+  llvm::SmallVector<QualType> boundTypes;
 
   std::string exprString;
   llvm::raw_string_ostream os(exprString);
   e->printPretty(os, nullptr, getContext().getPrintingPolicy());
 
+  auto addBoundType = [&](const Expr *e) {
+    if (const auto *section = dyn_cast<ArraySectionExpr>(curVarExpr)) {
+      QualType baseTy = ArraySectionExpr::getBaseOriginalType(
+          section->getBase()->IgnoreParenImpCasts());
+      boundTypes.push_back(QualType(baseTy->getPointeeOrArrayElementType(), 0));
+    } else {
+      boundTypes.push_back(curVarExpr->getType());
+    }
+  };
+
+  addBoundType(curVarExpr);
+
   while (isa<ArraySectionExpr, ArraySubscriptExpr>(curVarExpr)) {
     mlir::Location boundLoc = cgm.getLoc(curVarExpr->getBeginLoc());
     mlir::Value lowerBound;
@@ -115,19 +134,28 @@ CIRGenFunction::getOpenACCDataOperandInfo(const Expr *e) {
 
     bounds.push_back(createBound(*this, this->builder, boundLoc, lowerBound,
                                  upperBound, extent));
+    addBoundType(curVarExpr);
   }
 
   if (const auto *memExpr = dyn_cast<MemberExpr>(curVarExpr))
-    return {exprLoc, emitMemberExpr(memExpr).getPointer(), exprString,
+    return {exprLoc,
+            emitMemberExpr(memExpr).getPointer(),
+            exprString,
+            origType,
             curVarExpr->getType().getNonReferenceType().getUnqualifiedType(),
-            std::move(bounds)};
+            std::move(bounds),
+            std::move(boundTypes)};
 
   // Sema has made sure that only 4 types of things can get here, array
   // subscript, array section, member expr, or DRE to a var decl (or the
   // former 3 wrapping a var-decl), so we should be able to assume this is
   // right.
   const auto *dre = cast<DeclRefExpr>(curVarExpr);
-  return {exprLoc, emitDeclRefLValue(dre).getPointer(), exprString,
+  return {exprLoc,
+          emitDeclRefLValue(dre).getPointer(),
+          exprString,
+          origType,
           curVarExpr->getType().getNonReferenceType().getUnqualifiedType(),
-          std::move(bounds)};
+          std::move(bounds),
+          std::move(boundTypes)};
 }
diff --git a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
index 9959cf6c15792..dd37b101e9735 100644
--- a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
@@ -988,20 +988,16 @@ class OpenACCClauseCIREmitter final
 
         {
           mlir::OpBuilder::InsertionGuard guardCase(builder);
-          // TODO: OpenACC: At the moment this is a bit of a hacky way of doing
-          // this, and won't work when we get to bounds/etc. Do this for now to
-          // limit the scope of this refactor.
-          VarDecl *allocaDecl = varRecipe.AllocaDecl;
-          allocaDecl->setInit(varRecipe.InitExpr);
-          allocaDecl->setInitStyle(VarDecl::CallInit);
 
           auto recipe =
               OpenACCRecipeBuilder<mlir::acc::PrivateRecipeOp>(cgf, builder)
-                  .getOrCreateRecipe(cgf.getContext(), varExpr, allocaDecl,
-                                     /*temporary=*/nullptr,
-                                     OpenACCReductionOperator::Invalid,
-                                     Decl::castToDeclContext(cgf.curFuncDecl),
-                                     opInfo.baseType, privateOp.getResult());
+                  .getOrCreateRecipe(
+                      cgf.getContext(), varExpr, varRecipe.AllocaDecl,
+                      varRecipe.InitExpr,
+                      /*temporary=*/nullptr, OpenACCReductionOperator::Invalid,
+                      Decl::castToDeclContext(cgf.curFuncDecl), opInfo.origType,
+                      opInfo.bounds.size(), opInfo.boundTypes, opInfo.baseType,
+                      privateOp.getResult());
           // TODO: OpenACC: The dialect is going to change in the near future to
           // have these be on a different operation, so when that changes, we
           // probably need to change these here.
@@ -1042,12 +1038,13 @@ class OpenACCClauseCIREmitter final
           auto recipe =
               OpenACCRecipeBuilder<mlir::acc::FirstprivateRecipeOp>(cgf,
                                                                     builder)
-                  .getOrCreateRecipe(cgf.getContext(), varExpr, allocaDecl,
-                                     varRecipe.InitFromTemporary,
-                                     OpenACCReductionOperator::Invalid,
-                                     Decl::castToDeclContext(cgf.curFuncDecl),
-                                     opInfo.baseType,
-                                     firstPrivateOp.getResult());
+                  .getOrCreateRecipe(
+                      cgf.getContext(), varExpr, varRecipe.AllocaDecl,
+                      varRecipe.InitExpr, varRecipe.InitFromTemporary,
+                      OpenACCReductionOperator::Invalid,
+                      Decl::castToDeclContext(cgf.curFuncDecl), opInfo.origType,
+                      opInfo.bounds.size(), opInfo.boundTypes, opInfo.baseType,
+                      firstPrivateOp.getResult());
 
           // TODO: OpenACC: The dialect is going to change in the near future to
           // have these be on a different operation, so when that changes, we
@@ -1089,11 +1086,13 @@ class OpenACCClauseCIREmitter final
 
           auto recipe =
               OpenACCRecipeBuilder<mlir::acc::ReductionRecipeOp>(cgf, builder)
-                  .getOrCreateRecipe(cgf.getContext(), varExpr, allocaDecl,
-                                     /*temporary=*/nullptr,
-                                     clause.getReductionOp(),
-                                     Decl::castToDeclContext(cgf.curFuncDecl),
-                                     opInfo.baseType, reductionOp.getResult());
+                  .getOrCreateRecipe(
+                      cgf.getContext(), varExpr, varRecipe.AllocaDecl,
+                      varRecipe.InitExpr,
+                      /*temporary=*/nullptr, clause.getReductionOp(),
+                      Decl::castToDeclContext(cgf.curFuncDecl), opInfo.origType,
+                      opInfo.bounds.size(), opInfo.boundTypes, opInfo.baseType,
+                      reductionOp.getResult());
 
           operation.addReduction(builder.getContext(), reductionOp, recipe);
         }
diff --git a/clang/lib/CIR/CodeGen/CIRGenOpenACCRecipe.h b/clang/lib/CIR/CodeGen/CIRGenOpenACCRecipe.h
index 102fd890e5579..4e60f892e7ab2 100644
--- a/clang/lib/CIR/CodeGen/CIRGenOpenACCRecipe.h
+++ b/clang/lib/CIR/CodeGen/CIRGenOpenACCRecipe.h
@@ -22,10 +22,140 @@
 #include "mlir/Dialect/OpenACC/OpenACC.h"
 
 namespace clang::CIRGen {
+
 template <typename RecipeTy> class OpenACCRecipeBuilder {
   CIRGen::CIRGenFunction &cgf;
   CIRGen::CIRGenBuilderTy &builder;
 
+mlir::Block *createRecipeBlock(mlir::Region &region, mlir::Type opTy,
+                               mlir::Location loc, size_t numBounds,
+                               bool isInit) {
+  llvm::SmallVector<mlir::Type> types;
+  types.reserve(numBounds + 2);
+  types.push_back(opTy);
+  // The init section is the only one that doesn't have TWO copies of the
+  // operation-type.  Copy has a to/from, and destroy has a
+  // 'reference'/'privatized' copy version.
+  if (!isInit)
+    types.push_back(opTy);
+
+  auto boundsTy = mlir::acc::DataBoundsType::get(&cgf.getMLIRContext());
+  for (size_t i = 0; i < numBounds; ++i)
+    types.push_back(boundsTy);
+
+  llvm::SmallVector<mlir::Location> locs{types.size(), loc};
+  return builder.createBlock(&region, region.end(), types, locs);
+}
+// Creates a loop through an 'acc.bounds', leaving the 'insertion' point to be
+// the inside of the loop body. Traverses LB->UB UNLESS `inverse` is set.
+// Returns the 'subscriptedValue' changed with the new bounds subscript.
+mlir::Value createBoundsLoop(mlir::Value subscriptedValue, mlir::Value bound,
+                             mlir::Location loc, bool inverse) {
+  mlir::Operation *bodyInsertLoc;
+
+  mlir::Type itrTy = cgf.cgm.convertType(cgf.getContext().UnsignedLongLongTy);
+  auto itrPtrTy = cir::PointerType::get(itrTy);
+  mlir::IntegerAttr itrAlign =
+      cgf.cgm.getSize(cgf.getContext().getTypeAlignInChars(
+          cgf.getContext().UnsignedLongLongTy));
+  auto idxType = mlir::IndexType::get(&cgf.getMLIRContext());
+
+  auto doSubscriptOp = [&](mlir::Value subVal,
+                           cir::LoadOp idxLoad) -> mlir::Value {
+    auto eltTy = cast<cir::PointerType>(subVal.getType()).getPointee();
+
+    if (auto arrayTy = dyn_cast<cir::ArrayType>(eltTy))
+      return builder.getArrayElement(loc, loc, subVal, arrayTy.getElementType(),
+                                     idxLoad.getResult(),
+                                     /*shouldDecay=*/true);
+
+    assert(isa<cir::PointerType>(eltTy));
+
+    auto eltLoad = cir::LoadOp::create(builder, loc, {subVal});
+
+    return cir::PtrStrideOp::create(builder, loc, eltLoad.getType(), eltLoad,
+                                    idxLoad.getResult())
+        .getResult();
+  };
+
+  auto forStmtBuilder = [&]() {
+    // get the lower and upper bound for iterating over.
+    auto lowerBoundVal =
+        mlir::acc::GetLowerboundOp::create(builder, loc, idxType, bound);
+    auto lbConversion = mlir::UnrealizedConversionCastOp::create(
+        builder, loc, itrTy, lowerBoundVal.getResult());
+    auto upperBoundVal =
+        mlir::acc::GetUpperboundOp::create(builder, loc, idxType, bound);
+    auto ubConversion = mlir::UnrealizedConversionCastOp::create(
+        builder, loc, itrTy, upperBoundVal.getResult());
+
+    // Create a memory location for the iterator.
+    auto itr =
+        cir::AllocaOp::create(builder, loc, itrPtrTy, itrTy, "iter", itrAlign);
+    // Store to the iterator: either lower bound, or if inverse loop, upper
+    // bound.
+    if (inverse) {
+      cir::ConstantOp constOne = builder.getConstInt(loc, itrTy, 1);
+
+      auto sub =
+          cir::BinOp::create(builder, loc, itrTy, cir::BinOpKind::Sub,
+                             ubConversion.getResult(0), constOne.getResult());
+
+      // Upperbound is exclusive, so subtract 1.
+      builder.CIRBaseBuilderTy::createStore(loc, sub.getResult(), itr);
+    } else {
+      // Lowerbound is inclusive, so we can include it.
+      builder.CIRBaseBuilderTy::createStore(loc, lbConversion.getResult(0),
+                                            itr);
+    }
+    // Save the 'end' iterator based on whether we are inverted or not. This
+    // end iterator never changes, so we can just get it and convert it, so no
+    // need to store/load/etc.
+    auto endItr = inverse ? lbConversion : ubConversion;
+
+    builder.createFor(
+        loc,
+        /*condBuilder=*/
+        [&](mlir::OpBuilder &b, mlir::Location loc) {
+          auto loadCur = cir::LoadOp::create(builder, loc, {itr});
+          // Use 'not equal' since we are just doing an increment/decrement.
+          auto cmp = builder.createCompare(
+              loc, inverse ? cir::CmpOpKind::ge : cir::CmpOpKind::lt,
+              loadCur.getResult(), endItr.getResult(0));
+          builder.createCondition(cmp);
+        },
+        /*bodyBuilder=*/
+        [&](mlir::OpBuilder &b, mlir::Location loc) {
+          auto load = cir::LoadOp::create(builder, loc, {itr});
+
+          if (subscriptedValue)
+            subscriptedValue = doSubscriptOp(subscriptedValue, load);
+          bodyInsertLoc = builder.createYield(loc);
+        },
+        /*stepBuilder=*/
+        [&](mlir::OpBuilder &b, mlir::Location loc) {
+          auto load = cir::LoadOp::create(builder, loc, {itr});
+          auto unary = cir::UnaryOp::create(builder, loc, load.getType(),
+                                            inverse ? cir::UnaryOpKind::Dec
+                                                    : cir::UnaryOpKind::Inc,
+                                            load.getResult());
+          builder.CIRBaseBuilderTy::createStore(loc, unary.getResult(), itr);
+          builder.createYield(loc);
+        });
+  };
+
+  cir::ScopeOp::create(builder, loc,
+                       [&](mlir::OpBuilder &b, mlir::Location loc) {
+                         forStmtBuilder();
+                         builder.createYield(loc);
+                       });
+
+  // Leave the insertion point to be inside the body, so we can loop over
+  // these things.
+  builder.setInsertionPoint(bodyInsertLoc);
+  return subscriptedValue;
+}
+
   mlir::acc::ReductionOperator convertReductionOp(OpenACCReductionOperator op) {
     switch (op) {
     case OpenACCReductionOperator::Addition:
@@ -54,6 +184,7 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
   }
 
   std::string getRecipeName(SourceRange loc, QualType baseType,
+                            unsigned numBounds,
                             OpenACCReductionOperator reductionOp) {
     std::string recipeName;
     {
@@ -106,6 +237,11 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
         static_assert(!sizeof(RecipeTy), "Unknown Recipe op kind");
       }
 
+      //  The naming convention from Flang with bounds doesn't map to C++ types
+      //  very well, so we're just going to choose our own here.  
+      if (numBounds)
+        stream << "_Bcnt" << numBounds << '_';
+
       MangleContext &mc = cgf.cgm.getCXXABI().getMangleContext();
       mc.mangleCanonicalTypeName(baseType, stream);
     }
@@ -117,9 +253,9 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
       CIRGenFunction::AutoVarEmission tempDeclEmission,
       mlir::acc::FirstprivateRecipeOp recipe, const VarDecl *varRecipe,
       const VarDecl *temporary) {
-    mlir::Block *block = builder.createBlock(
-        &recipe.getCopyRegion(), recipe.getCopyRegion().end(),
-        {mainOp.getType(), mainOp.getType()}, {loc, loc});
+    mlir::Block *block =
+        createRecipeBlock(recipe.getCopyRegion(), mainOp.getType(), loc,
+                          /*numBounds=*/0, /*isInit=*/false);
     builder.setInsertionPointToEnd(&recipe.getCopyRegion().back());
     CIRGenFunction::LexicalScope ls(cgf, loc, block);
 
@@ -143,6 +279,54 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
     cgf.emitAutoVarInit(tempDeclEmission);
     mlir::acc::YieldOp::create(builder, locEnd);
   }
+  // TODO: OpenACC: When we get this implemented for the reduction/firstprivate,
+  // this might end up re-merging with createRecipeInitCopy.  For now, keep it
+  // separate until we're sure what everything looks like to keep this as clean
+  // as possible.
+  void createPrivateInitRecipe(mlir::Location loc, mlir::Location locEnd,
+                               SourceRange exprRange, mlir::Value mainOp,
+                               mlir::acc::PrivateRecipeOp recipe,
+                               size_t numBounds,
+                               llvm::ArrayRef<QualType> boundTypes,
+                               const VarDecl *allocaDecl, QualType origType,
+                               const Expr *initExpr) {
+    assert(allocaDecl && "Required recipe variable not set?");
+    CIRGenFunction::DeclMapRevertingRAII declMapRAII{cgf, allocaDecl};
+
+    mlir::Block *block =
+        createRecipeBlock(recipe.getInitRegion(), mainOp.getType(), loc,
+                          numBounds, /*isInit=*/true);
+    builder.setInsertionPointToEnd(&recipe.getInitRegion().back());
+    CIRGenFunction::LexicalScope ls(cgf, loc, block);
+
+    const Type *allocaPointeeType =
+        allocaDecl->getType()->getPointeeOrArrayElementType();
+    // We are OK with no init for builtins, arrays of builtins, or pointers,
+    // else we should NYI so we know to go look for these.
+    if (cgf.getContext().getLangOpts().CPlusPlus && !allocaDecl->getInit() &&
+        !allocaDecl->getType()->isPointerType() &&
+        !allocaPointeeType->isBuiltinType() &&
+        !allocaPointeeType->isPointerType()) {
+      // If we don't have any initialization recipe, we failed during Sema to
+      // initialize this correctly. If we disable the
+      // Sema::TentativeAnalysisScopes in SemaOpenACC::CreateInitRecipe, it'll
+      // emit an error to tell us.  However, emitting those errors during
+      // production is a violation of the standard, so we cannot do them.
+      cgf.cgm.errorNYI(exprRange, "private default-init recipe");
+    }
+
+    if (!numBounds) {
+      // This is an 'easy' case, we just have to use the builtin init stuff to
+      // initialize this variable correctly.
+      CIRGenFunction::AutoVarEmission tempDeclEmission =
+          cgf.emitAutoVarAlloca(*allocaDecl, builder.saveInsertionPoint());
+      cgf.emitAutoVarInit(tempDeclEmission);
+    } else {
+      cgf.cgm.errorNYI(exprRange, "private-init with bounds");
+    }
+
+    mlir::acc::YieldOp::create(builder, locEnd);
+  }
 
   // Create the 'init' section of the recipe, including the 'copy' section for
   // 'firstprivate'.  Note that this function is not 'insertion point' clean, in
@@ -160,9 +344,9 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
 
     // Do the 'init' section of the recipe IR, which does an alloca, then the
     // initialization (except for firstprivate).
-    mlir::Block *block = builder.createBlock(&recipe.getInitRegion(),
-                                             recipe.getInitRegion().end(),
-                                             {mainOp.getType()}, {loc});
+    mlir::Block *block =
+        createRecipeBlock(recipe.getInitRegion(), mainOp.getType(), loc,
+                          /*numBounds=*/0, /*isInit=*/true);
     builder.setInsertionPointToEnd(&recipe.getInitRegion().back());
     CIRGenFunction::LexicalScope ls(cgf, loc, block);
 
@@ -241,22 +425,42 @@ template <typename RecipeTy> class OpenACCRecipeBuilder {
   // doesn't restore it aftewards.
   void createRecipeDestroySection(mlir::Location loc, mlir::Location locEnd,
               ...
[truncated]

@erichkeane
Copy link
Collaborator Author

I tried to make this as small as possible, but a LOT of stuff ended up in tests since I wanted to make sure we got good coverage. I would VERY MUCH appreciate overall review, but in particular want to make sure that I got the IR for the destroy section correct, both from a CIR and an ACC IR perspective.

@github-actions
Copy link

github-actions bot commented Sep 22, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, minor question/concern below.

mlir::Block *createRecipeBlock(mlir::Region &region, mlir::Type opTy,
mlir::Location loc, size_t numBounds,
bool isInit) {
llvm::SmallVector<mlir::Type> types;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a bit worried with the amount of code in this header file, can it be moved out to impl files? (I'm fine if this comes as a follow up PR)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I definitely get that, this ends up being pretty big! It is mostly inside a template, so it can't really move too far, though I can split up helpers, but I question that value.

I actually just split all of this out of the CIRGenOpenACCClause.cpp file, as it was part of THAT implementation, and won't be included from anywhere else. The idea is that this was all basically the 'top' of the Clause.cpp file as it is only used there, but would be more readable in its own file like this. It DOES need to be included (since it is a template, thus .h).

I guess I can simplify the implementation, as much of it is not template dependent besides a handful of decisions, and turn this recipe creator into a much smaller file. Its probably a good idea overall anyway.... let me think about it, and I'll do it as a followup once I come up with something.

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Just a very minor nit about a test var name and a question to make sure I'm correctly understanding where the while loops are coming from.

// CHECK-NEXT: %[[BITCAST:.*]] = cir.cast(bitcast, %[[TL_ALLOCA]] : !cir.ptr<!cir.array<!cir.array<!cir.array<!rec_CtorDtor x 5> x 5> x 5>>), !cir.ptr<!cir.array<!rec_CtorDtor x 125>>
// CHECK-NEXT: %[[ARR_SIZE:.*]] = cir.const #cir.int<125> : !u64i
// CHECK-NEXT: %[[DECAY:.*]] = cir.cast(array_to_ptrdecay, %[[BITCAST]] : !cir.ptr<!cir.array<!rec_CtorDtor x 125>>), !cir.ptr<!rec_CtorDtor>
// CHECK-NEXT: %[[LAST_ELT:.*]] = cir.ptr_stride(%[[DECAY]] : !cir.ptr<!rec_CtorDtor>, %[[ARR_SIZE]] : !u64i), !cir.ptr<!rec_CtorDtor>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't suppose it matters, but this will actually be a pointer just past the last element, right? The CIR looks correct, but the test var name is wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah... thats true. I'll correct that.

// CHECK-NEXT: %[[LAST_ELT:.*]] = cir.ptr_stride(%[[DECAY]] : !cir.ptr<!rec_CtorDtor>, %[[LAST_IDX]] : !u64i), !cir.ptr<!rec_CtorDtor>
// CHECK-NEXT: %[[ARR_IDX:.*]] = cir.alloca !cir.ptr<!rec_CtorDtor>, !cir.ptr<!cir.ptr<!rec_CtorDtor>>, ["__array_idx"] {alignment = 1 : i64}
// CHECK-NEXT: cir.store %[[LAST_ELT]], %[[ARR_IDX]] : !cir.ptr<!rec_CtorDtor>, !cir.ptr<!cir.ptr<!rec_CtorDtor>>
// CHECK-NEXT: cir.do {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this loop generated by our normal array destructor handling? I couldn't find anything in this PR that creates a do-while loop.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the do-while loop is caused by cgf.emitDestroy.

@erichkeane
Copy link
Collaborator Author

Also hoping @razvanlupusoru could do a quick look at the tests to make sure I'm not doing anything silly with the bounds/misinterpreting what I'm supposed to be doing with it!

@razvanlupusoru
Copy link
Contributor

Also hoping @razvanlupusoru could do a quick look at the tests to make sure I'm not doing anything silly with the bounds/misinterpreting what I'm supposed to be doing with it!

Would it be possible to have the C++ code and the CHECK lines for recipes intertwined? With the mangling I cannot tell which ones correspond together!

// The type of the original variable reference: that is, after 'bounds' have
// removed pointers/array types/etc. So in the case of int arr[5], and a
// private(arr[1]), 'origType' is 'int', but 'baseType' is 'int[5]'.
QualType origType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the "element type"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't QUITE element type (since you don't have to be 'fully decended' with bounds), but yes, it is the original type of the expression that the user typed, after accessing the bounds they typed.

@erichkeane
Copy link
Collaborator Author

Also hoping @razvanlupusoru could do a quick look at the tests to make sure I'm not doing anything silly with the bounds/misinterpreting what I'm supposed to be doing with it!

Would it be possible to have the C++ code and the CHECK lines for recipes intertwined? With the mangling I cannot tell which ones correspond together!

Open for suggestions how how to do so! One of the problems I ran into with this is that the 'recipe' is always added at the 'top' when we create it, so it is inverse order. AND FileCheck doesn't let you go 'backwards', so I was unable to put the recipes inline.

@erichkeane
Copy link
Collaborator Author

Also hoping @razvanlupusoru could do a quick look at the tests to make sure I'm not doing anything silly with the bounds/misinterpreting what I'm supposed to be doing with it!

Would it be possible to have the C++ code and the CHECK lines for recipes intertwined? With the mangling I cannot tell which ones correspond together!

Open for suggestions how how to do so! One of the problems I ran into with this is that the 'recipe' is always added at the 'top' when we create it, so it is inverse order. AND FileCheck doesn't let you go 'backwards', so I was unable to put the recipes inline.

I made an attempt by putting the applicable C/C++ code in a comment above each recipe. Let me know if this is helpful!

@razvanlupusoru
Copy link
Contributor

Open for suggestions how how to do so! One of the problems I ran into with this is that the 'recipe' is always added at the 'top' when we create it, so it is inverse order. AND FileCheck doesn't let you go 'backwards', so I was unable to put the recipes inline.

I agree that it is wonky to use FileCheck with CHECK-DAG especially if it is backwards. One suggestion to alleviate the FileCheck issue is to actually insert in order - one possible way is to keep last recipe insertion point.

@erichkeane
Copy link
Collaborator Author

Open for suggestions how how to do so! One of the problems I ran into with this is that the 'recipe' is always added at the 'top' when we create it, so it is inverse order. AND FileCheck doesn't let you go 'backwards', so I was unable to put the recipes inline.

I agree that it is wonky to use FileCheck with CHECK-DAG especially if it is backwards. One suggestion to alleviate the FileCheck issue is to actually insert in order - one possible way is to keep last recipe insertion point.

Andy poked at it a bit lately, CHECK-DAG doesn't let you do 'next' unfortunately. As far as keeping the last-recipe-insertion point, that is probably a good idea. It'll take a while and perhaps mean i have to modify a bunch of other tests here, but perhaps i cna do that in a separate patch then rebase this on top of it.

Copy link
Contributor

@razvanlupusoru razvanlupusoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Erich for the comments showing type of array. That helped! The changes look great to me!

A few semi-relevant comments:

  • I noticed you use the upperbound instead of extent in the recipes - despite the fact that only extent is filled in in the bounds used in the acc.private operations. I am guessing you will be relying on the promise I made (but not yet implemented) - https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#accget_upperbound-accgetupperboundop
  • It is expected that you are not using stride from acc.bounds (since OpenACC spec does not actually allow stride). We currently do use this in flang because the language can express non-contiguous array views. I am imagining that if ever mdspan was natively represented in compiler as some sort of built-in type, then using stride would be necessary.
  • I noticed you collapse multi-dimension arrays into a single loop. I imagine normal declarations do not result in non-contiguous arrays. However, does your semantics checking verify for contiguous sections in the case when acc bounds are used?

@erichkeane
Copy link
Collaborator Author

Thank you Erich for the comments showing type of array. That helped! The changes look great to me!

Awesome, thanks! I'll still be doing a patch to do the recipe-ordering cleanup, but that should be effectively NFC, but I'll do that after this gets merged.

A few semi-relevant comments:

* I noticed you use the upperbound instead of extent in the recipes - despite the fact that only extent is filled in in the bounds used in the `acc.private` operations. I am guessing you will be relying on the promise I made (but not yet implemented) - https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#accget_upperbound-accgetupperboundop

Yep :) Thats why i was so excited about it! :) the loops worked out way better doing LB->UB

* It is expected that you are not using stride from `acc.bounds` (since OpenACC spec does not actually allow stride). We currently do use this in flang because the language can express non-contiguous array views. I am imagining that if ever mdspan was natively represented in compiler as some sort of built-in type, then using stride would be necessary.

Right, the standard doesn't allow stride, and C/C++ doesn't really have a non-1 stride available in its syntax. Yes, a 'native' mdspan would require some sort of stride operation, but I don't see us able ot do that anytime soon.

* I noticed you collapse multi-dimension arrays into a single loop. I imagine normal declarations do not result in non-contiguous arrays. However, does your semantics checking verify for contiguous sections in the case when acc bounds are used?

Multi-dimension arrays in C/C++ are guaranteed contiguous. Where that doesn't happen is when there are pointers involved, but we have to 'set that up' with allocas, which I'll do in a subsequent patch. In those cases, we end up having to emit the values better, but you'll see that in upcoming patches. The ptr_stride operations here are 'pointer aware' so they should do the indirections as necessary.

@erichkeane erichkeane merged commit 41387ab into llvm:main Sep 24, 2025
9 checks passed
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
After previous implementation, I discovered that we were both doing
arrays incorrectly for recipes, plus didn't get the pointer allocations
done correctly. This patch is the first of a few in a series that
attempts to make sure we get all pointers/arrays correct.

This patch is limited to just 'private' and destructors, which
simplifies the review significantly. Destructors are simply looped
through and called at each level.

The 'recipe-decl' is the 'least bounded' (that is, the type of the
    expression, in the type of `int[5] i; #pragma acc parallel
    private(i[1])`, the type of the `recipe-decl` is `int`.  This allows
    us to do init/destruction at the element level.

This patch also adds infrastructure for the rest of the series of
private (for the init section), as well as extensive testing for
'private', with a lot of 'TODO' locations.

Future patches will fill these in, but at the moment, there is an NYI
warning for bounds, so a number of tests are updated to handle that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants