[HLSL] Implement the `fwidth` intrinsic for DXIL and SPIR-V target #161378

Alexander-Johnston · 2025-09-30T13:41:07Z

Adds the fwidth intrinsic for HLSL.
The DXIL path only requires changes to the HLSL headers.
The SPIR-V path implements the OpFwidth builtin in Clang and instruction selection for the OpFwidth instruction in LLVM.
Also adds shader stage tests for the ddx_coarse and ddy_coarse instructions used by fwidth.

Closes #99120
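
As a rough illustration of what the header fallback path computes (`fwidth_impl` when `__builtin_spirv_fwidth` is unavailable), the sketch below models `fwidth` on the CPU for a single 2x2 pixel quad. The `Quad` struct, the helper names, and the choice of which pixel pair feeds each difference are assumptions made for this example; real hardware produces the derivatives itself, and the pixels sampled by coarse derivatives are implementation-dependent.

```cpp
// CPU-side model of fwidth over one 2x2 pixel quad.
// Quad, absScalar, ddxCoarseModel, ddyCoarseModel and fwidthModel are
// hypothetical names for this sketch; none of them exist in the patch.
struct Quad {
  float topLeft, topRight, bottomLeft, bottomRight;
};

constexpr float absScalar(float v) { return v < 0.0f ? -v : v; }

// Assumption: one horizontal difference stands in for the whole quad.
constexpr float ddxCoarseModel(const Quad &q) {
  return q.topRight - q.topLeft;
}

// Assumption: one vertical difference stands in for the whole quad.
constexpr float ddyCoarseModel(const Quad &q) {
  return q.bottomLeft - q.topLeft;
}

// fwidth(x) = abs(ddx_coarse(x)) + abs(ddy_coarse(x)), as in fwidth_impl.
constexpr float fwidthModel(const Quad &q) {
  return absScalar(ddxCoarseModel(q)) + absScalar(ddyCoarseModel(q));
}
```

For a quad holding 1.0, 3.0, 0.5 and 2.0 (top-left, top-right, bottom-left, bottom-right), this model yields |3.0 - 1.0| + |0.5 - 1.0| = 2.5.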

llvmbot · 2025-09-30T13:41:52Z

@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-spir-v
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-directx

@llvm/pr-subscribers-clang-codegen

Author: Alexander Johnston (Alexander-Johnston)

Changes

Closes #99120

Patch is 25.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/161378.diff

20 Files Affected:

(modified) clang/include/clang/Basic/Builtins.td (+12)
(modified) clang/include/clang/Basic/BuiltinsSPIRVCommon.td (+1)
(modified) clang/lib/CodeGen/CGHLSLBuiltins.cpp (+18)
(modified) clang/lib/CodeGen/TargetBuiltins/SPIR.cpp (+5)
(modified) clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h (+25)
(modified) clang/lib/Headers/hlsl/hlsl_intrinsics.h (+25)
(modified) clang/lib/Sema/SemaHLSL.cpp (+3-1)
(modified) clang/lib/Sema/SemaSPIRV.cpp (+18)
(added) clang/test/CodeGenSPIRV/Builtins/fwidth.c (+41)
(added) clang/test/SemaSPIRV/BuiltIns/fwidth-errors.c (+24)
(modified) llvm/include/llvm/IR/IntrinsicsDirectX.td (+2)
(modified) llvm/include/llvm/IR/IntrinsicsSPIRV.td (+1)
(modified) llvm/lib/Target/DirectX/DXIL.td (+18)
(modified) llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp (+2)
(modified) llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp (+16)
(added) llvm/test/CodeGen/DirectX/deriv_coarse_x.ll (+43)
(added) llvm/test/CodeGen/DirectX/deriv_coarse_x_error.ll (+15)
(added) llvm/test/CodeGen/DirectX/deriv_coarse_y.ll (+43)
(added) llvm/test/CodeGen/DirectX/deriv_coarse_y_error.ll (+15)
(added) llvm/test/CodeGen/SPIRV/hlsl-intrinsics/fwidth.ll (+44)

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 468121f7d20ab..71fe555d6f689 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -5204,6 +5204,18 @@ def HLSLGetSpirvSpecConstant : LangBuiltin<"HLSL_LANG">, HLSLScalarTemplate {
   let Prototype = "T(unsigned int, T)";
 }
 
+def HLSLDerivCoarseX: LangBuiltin<"HLSL_LANG"> {
+  let Spellings = ["__builtin_hlsl_elementwise_deriv_coarse_x"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
+def HLSLDerivCoarseY: LangBuiltin<"HLSL_LANG"> {
+  let Spellings = ["__builtin_hlsl_elementwise_deriv_coarse_y"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 // Builtins for XRay.
 def XRayCustomEvent : Builtin {
   let Spellings = ["__xray_customevent"];
diff --git a/clang/include/clang/Basic/BuiltinsSPIRVCommon.td b/clang/include/clang/Basic/BuiltinsSPIRVCommon.td
index d2ef6f99a0502..95f73cf4effbc 100644
--- a/clang/include/clang/Basic/BuiltinsSPIRVCommon.td
+++ b/clang/include/clang/Basic/BuiltinsSPIRVCommon.td
@@ -21,3 +21,4 @@ def subgroup_local_invocation_id : SPIRVBuiltin<"uint32_t()", [NoThrow, Const]>;
 def distance : SPIRVBuiltin<"void(...)", [NoThrow, Const]>;
 def length : SPIRVBuiltin<"void(...)", [NoThrow, Const]>;
 def smoothstep : SPIRVBuiltin<"void(...)", [NoThrow, Const, CustomTypeChecking]>;
+def fwidth : SPIRVBuiltin<"void(...)", [NoThrow, Const, CustomTypeChecking]>;
diff --git a/clang/lib/CodeGen/CGHLSLBuiltins.cpp b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
index 6c0fc8d7f07be..5ed77adbb1d16 100644
--- a/clang/lib/CodeGen/CGHLSLBuiltins.cpp
+++ b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
@@ -532,6 +532,24 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
         /*ReturnType=*/Op0->getType(), CGM.getHLSLRuntime().getFracIntrinsic(),
         ArrayRef<Value *>{Op0}, nullptr, "hlsl.frac");
   }
+  case Builtin::BI__builtin_hlsl_elementwise_deriv_coarse_x: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    if (!E->getArg(0)->getType()->hasFloatingRepresentation())
+      llvm_unreachable(
+          "deriv coarse x operand must have a float representation");
+    return Builder.CreateIntrinsic(
+        /*ReturnType=*/Op0->getType(), llvm::Intrinsic::dx_deriv_coarse_x,
+        ArrayRef<Value *>{Op0}, nullptr, "hlsl.deriv.coarse.x");
+  }
+  case Builtin::BI__builtin_hlsl_elementwise_deriv_coarse_y: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    if (!E->getArg(0)->getType()->hasFloatingRepresentation())
+      llvm_unreachable(
+          "deriv coarse y operand must have a float representation");
+    return Builder.CreateIntrinsic(
+        /*ReturnType=*/Op0->getType(), llvm::Intrinsic::dx_deriv_coarse_y,
+        ArrayRef<Value *>{Op0}, nullptr, "hlsl.deriv.coarse.y");
+  }
   case Builtin::BI__builtin_hlsl_elementwise_isinf: {
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     llvm::Type *Xty = Op0->getType();
diff --git a/clang/lib/CodeGen/TargetBuiltins/SPIR.cpp b/clang/lib/CodeGen/TargetBuiltins/SPIR.cpp
index 243aad8bf7083..43b05a128e876 100644
--- a/clang/lib/CodeGen/TargetBuiltins/SPIR.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/SPIR.cpp
@@ -151,6 +151,11 @@ Value *CodeGenFunction::EmitSPIRVBuiltinExpr(unsigned BuiltinID,
         Intrinsic::spv_global_offset,
         ArrayRef<Value *>{EmitScalarExpr(E->getArg(0))}, nullptr,
         "spv.global.offset");
+  case SPIRV::BI__builtin_spirv_fwidth:
+    return Builder.CreateIntrinsic(
+        /*ReturnType=*/getTypes().ConvertType(E->getType()),
+        Intrinsic::spv_fwidth, ArrayRef<Value *>{EmitScalarExpr(E->getArg(0))},
+        nullptr, "spv.fwidth");
   }
   return nullptr;
 }
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
index c877234479ad1..01f32596ad554 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
@@ -148,6 +148,31 @@ template <typename T> constexpr T ldexp_impl(T X, T Exp) {
   return exp2(Exp) * X;
 }
 
+template <typename T> constexpr T fwidth_impl(T input) {
+#if (__has_builtin(__builtin_spirv_fwidth))
+  return __builtin_spirv_fwidth(input);
+#else
+  T derivCoarseX = __builtin_hlsl_elementwise_deriv_coarse_x(input);
+  derivCoarseX = abs(derivCoarseX);
+  T derivCoarseY = __builtin_hlsl_elementwise_deriv_coarse_y(input);
+  derivCoarseY = abs(derivCoarseY);
+  return derivCoarseX + derivCoarseY;
+#endif
+}
+
+template <typename T, int N>
+constexpr vector<T, N> fwidth_vec_impl(vector<T, N> input) {
+#if (__has_builtin(__builtin_spirv_fwidth))
+  return __builtin_spirv_fwidth(input);
+#else
+  vector<T, N> derivCoarseX = __builtin_hlsl_elementwise_deriv_coarse_x(input);
+  derivCoarseX = abs(derivCoarseX);
+  vector<T, N> derivCoarseY = __builtin_hlsl_elementwise_deriv_coarse_y(input);
+  derivCoarseY = abs(derivCoarseY);
+  return derivCoarseX + derivCoarseY;
+#endif
+}
+
 } // namespace __detail
 } // namespace hlsl
 
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 5ba5bfb9abde0..1e01828fd3ba1 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -605,5 +605,30 @@ smoothstep(__detail::HLSL_FIXED_VECTOR<float, N> Min,
   return __detail::smoothstep_vec_impl(Min, Max, X);
 }
 
+//===----------------------------------------------------------------------===//
+// fwidth builtin
+//===----------------------------------------------------------------------===//
+
+/// \fn T fwidth(T x)
+/// \brief Computes the sum of the absolute values of the partial derivatives
+/// with regard to the x and y screen space coordinates.
+/// \param x [in] The floating-point scalar or vector to process.
+///
+/// The return value is a floating-point scalar or vector where each element
+/// holds the computation of the matching element in the input.
+
+template <typename T>
+const inline __detail::enable_if_t<
+    __detail::is_arithmetic<T>::Value && __detail::is_same<float, T>::value, T>
+fwidth(T input) {
+  return __detail::fwidth_impl(input);
+}
+
+template <int N>
+const inline __detail::HLSL_FIXED_VECTOR<float, N>
+fwidth(__detail::HLSL_FIXED_VECTOR<float, N> input) {
+  return __detail::fwidth_vec_impl(input);
+}
+
 } // namespace hlsl
 #endif //_HLSL_HLSL_INTRINSICS_H_
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 940d510b4cc02..fe621d62988fe 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -3080,7 +3080,9 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
   case Builtin::BI__builtin_hlsl_elementwise_degrees:
   case Builtin::BI__builtin_hlsl_elementwise_radians:
   case Builtin::BI__builtin_hlsl_elementwise_rsqrt:
-  case Builtin::BI__builtin_hlsl_elementwise_frac: {
+  case Builtin::BI__builtin_hlsl_elementwise_frac:
+  case Builtin::BI__builtin_hlsl_elementwise_deriv_coarse_x:
+  case Builtin::BI__builtin_hlsl_elementwise_deriv_coarse_y: {
     if (SemaRef.checkArgCount(TheCall, 1))
       return true;
     if (CheckAllArgTypesAreCorrect(&SemaRef, TheCall,
diff --git a/clang/lib/Sema/SemaSPIRV.cpp b/clang/lib/Sema/SemaSPIRV.cpp
index c8ea0d09c4081..0e78cff9c1774 100644
--- a/clang/lib/Sema/SemaSPIRV.cpp
+++ b/clang/lib/Sema/SemaSPIRV.cpp
@@ -360,6 +360,24 @@ bool SemaSPIRV::CheckSPIRVBuiltinFunctionCall(const TargetInfo &TI,
   case SPIRV::BI__builtin_spirv_generic_cast_to_ptr_explicit: {
     return checkGenericCastToPtr(SemaRef, TheCall);
   }
+  case SPIRV::BI__builtin_spirv_fwidth: {
+    if (SemaRef.checkArgCount(TheCall, 1))
+      return true;
+
+    // Check if first argument has floating representation
+    ExprResult A = TheCall->getArg(0);
+    QualType ArgTyA = A.get()->getType();
+    if (!ArgTyA->hasFloatingRepresentation()) {
+      SemaRef.Diag(A.get()->getBeginLoc(), diag::err_builtin_invalid_arg_type)
+          << /* ordinal */ 1 << /* scalar or vector */ 5 << /* no int */ 0
+          << /* fp */ 1 << ArgTyA;
+      return true;
+    }
+
+    QualType RetTy = ArgTyA;
+    TheCall->setType(RetTy);
+    break;
+  }
   }
   return false;
 }
diff --git a/clang/test/CodeGenSPIRV/Builtins/fwidth.c b/clang/test/CodeGenSPIRV/Builtins/fwidth.c
new file mode 100644
index 0000000000000..027b80500904d
--- /dev/null
+++ b/clang/test/CodeGenSPIRV/Builtins/fwidth.c
@@ -0,0 +1,41 @@
+// RUN: %clang_cc1 -O1 -triple spirv-pc-vulkan-compute %s -emit-llvm -o - | FileCheck %s
+
+typedef _Float16 half;
+typedef half half2 __attribute__((ext_vector_type(2)));
+typedef half half3 __attribute__((ext_vector_type(3)));
+typedef half half4 __attribute__((ext_vector_type(4)));
+typedef float float2 __attribute__((ext_vector_type(2)));
+typedef float float3 __attribute__((ext_vector_type(3)));
+typedef float float4 __attribute__((ext_vector_type(4)));
+
+// CHECK: [[fwidth0:%.*]] = tail call half @llvm.spv.fwidth.f16(half {{%.*}})
+// CHECK: ret half [[fwidth0]] 
+half test_fwidth_half(half X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth0:%.*]] = tail call <2 x half> @llvm.spv.fwidth.v2f16(<2 x half>  {{%.*}})
+// CHECK: ret <2 x half> [[fwidth0]] 
+half2 test_fwidth_half2(half2 X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth0:%.*]] = tail call <3 x half> @llvm.spv.fwidth.v3f16(<3 x half> {{%.*}})
+// CHECK: ret <3 x half> [[fwidth0]] 
+half3 test_fwidth_half3(half3 X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth0:%.*]] = tail call <4 x half> @llvm.spv.fwidth.v4f16(<4 x half> {{%.*}})
+// CHECK: ret <4 x half> [[fwidth0]] 
+half4 test_fwidth_half4(half4 X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth0:%.*]] = tail call float @llvm.spv.fwidth.f32(float {{%.*}})
+// CHECK: ret float [[fwidth0]] 
+float test_fwidth_float(float X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth1:%.*]] = tail call <2 x float> @llvm.spv.fwidth.v2f32(<2 x float> {{%.*}})
+// CHECK: ret <2 x float> [[fwidth1]]
+float2 test_fwidth_float2(float2 X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth2:%.*]] = tail call <3 x float> @llvm.spv.fwidth.v3f32(<3 x float> {{%.*}})
+// CHECK: ret <3 x float> [[fwidth2]]
+float3 test_fwidth_float3(float3 X) { return __builtin_spirv_fwidth(X); }
+
+// CHECK: [[fwidth3:%.*]] = tail call <4 x float> @llvm.spv.fwidth.v4f32(<4 x float> {{%.*}})
+// CHECK: ret <4 x float> [[fwidth3]]
+float4 test_fwidth_float4(float4 X) { return __builtin_spirv_fwidth(X); }
diff --git a/clang/test/SemaSPIRV/BuiltIns/fwidth-errors.c b/clang/test/SemaSPIRV/BuiltIns/fwidth-errors.c
new file mode 100644
index 0000000000000..44cdd819e4332
--- /dev/null
+++ b/clang/test/SemaSPIRV/BuiltIns/fwidth-errors.c
@@ -0,0 +1,24 @@
+// RUN: %clang_cc1 %s -triple spirv-pc-vulkan-compute -verify
+
+typedef float float2 __attribute__((ext_vector_type(2)));
+
+void test_too_few_arg()
+{
+  return __builtin_spirv_fwidth();
+  // expected-error@-1 {{too few arguments to function call, expected 1, have 0}}
+}
+
+float test_too_many_arg(float p0) {
+  return __builtin_spirv_fwidth(p0, p0);
+  // expected-error@-1 {{too many arguments to function call, expected 1, have 2}}
+}
+
+float test_int_scalar_inputs(int p0) {
+  return __builtin_spirv_fwidth(p0);
+  //  expected-error@-1 {{1st argument must be a scalar or vector of floating-point types (was 'int')}}
+}
+
+float test_mismatched_return(float2 p0) {
+  return __builtin_spirv_fwidth(p0);
+  // expected-error@-1 {{returning 'float2' (vector of 2 'float' values) from a function with incompatible result type 'float'}}
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 570d6bc35cbd0..1a4a0fc2364bd 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -162,6 +162,8 @@ def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, LLVMMatchType<0>
     [LLVMScalarOrSameVectorWidth<0, llvm_double_ty>], [IntrNoMem]>;
 def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
 def int_dx_discard : DefaultAttrsIntrinsic<[], [llvm_i1_ty], []>;
+def int_dx_deriv_coarse_x : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
+def int_dx_deriv_coarse_y : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
 def int_dx_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
 def int_dx_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
 def int_dx_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
diff --git a/llvm/include/llvm/IR/IntrinsicsSPIRV.td b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
index 823c491e1bfee..235568f4b20eb 100644
--- a/llvm/include/llvm/IR/IntrinsicsSPIRV.td
+++ b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
@@ -132,6 +132,7 @@ def int_spv_rsqrt : DefaultAttrsIntrinsic<[LLVMMatchType<0>], [llvm_anyfloat_ty]
   def int_spv_group_memory_barrier_with_group_sync
       : DefaultAttrsIntrinsic<[], [], [IntrConvergent]>;
   def int_spv_discard : DefaultAttrsIntrinsic<[], [], []>;
+  def int_spv_fwidth : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
   def int_spv_uclamp : DefaultAttrsIntrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem]>;
   def int_spv_sclamp : DefaultAttrsIntrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem]>;
   def int_spv_nclamp : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem]>;
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 228114c5c24b2..02360cdc859fc 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -922,6 +922,24 @@ def Discard : DXILOp<82, discard> {
   let stages = [Stages<DXIL1_0, [pixel]>];
 }
 
+def DerivCoarseX : DXILOp<83, unary> {
+  let Doc = "computes the rate of change per stamp in x direction";
+  let intrinsics = [IntrinSelect<int_dx_deriv_coarse_x>];
+  let arguments = [OverloadTy];
+  let result = OverloadTy;
+  let overloads = [Overloads<DXIL1_0, [HalfTy, FloatTy]>];
+  let stages = [Stages<DXIL1_0, [library, pixel, compute, amplification, mesh, node]>];
+}
+
+def DerivCoarseY : DXILOp<84, unary> {
+  let Doc = "computes the rate of change per stamp in y direction";
+  let intrinsics = [IntrinSelect<int_dx_deriv_coarse_y>];
+  let arguments = [OverloadTy];
+  let result = OverloadTy;
+  let overloads = [Overloads<DXIL1_0, [HalfTy, FloatTy]>];
+  let stages = [Stages<DXIL1_0, [library, pixel, compute, amplification, mesh, node]>];
+}
+
 def ThreadId : DXILOp<93, threadId> {
   let Doc = "Reads the thread ID";
   let intrinsics = [IntrinSelect<int_dx_thread_id>];
diff --git a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
index 68fd3e0bc74c7..4854a3e676918 100644
--- a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
+++ b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
@@ -48,6 +48,8 @@ bool DirectXTTIImpl::isTargetIntrinsicTriviallyScalarizable(
   case Intrinsic::dx_firstbitshigh:
   case Intrinsic::dx_firstbituhigh:
   case Intrinsic::dx_frac:
+  case Intrinsic::dx_deriv_coarse_x:
+  case Intrinsic::dx_deriv_coarse_y:
   case Intrinsic::dx_isinf:
   case Intrinsic::dx_isnan:
   case Intrinsic::dx_rsqrt:
diff --git a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
index 1aadd9df189a8..d94c62972e43c 100644
--- a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
@@ -179,6 +179,9 @@ class SPIRVInstructionSelector : public InstructionSelector {
   bool selectSplatVector(Register ResVReg, const SPIRVType *ResType,
                          MachineInstr &I) const;
 
+  bool selectFwidth(Register ResVReg, const SPIRVType *ResType,
+                    MachineInstr &I) const;
+
   bool selectCmp(Register ResVReg, const SPIRVType *ResType,
                  unsigned comparisonOpcode, MachineInstr &I) const;
   bool selectDiscard(Register ResVReg, const SPIRVType *ResType,
@@ -2615,6 +2618,16 @@ bool SPIRVInstructionSelector::selectDiscard(Register ResVReg,
       .constrainAllUses(TII, TRI, RBI);
 }
 
+bool SPIRVInstructionSelector::selectFwidth(Register ResVReg,
+                                            const SPIRVType *ResType,
+                                            MachineInstr &I) const {
+  // TODO: The rest of this? Go in debugger
+  return BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(SPIRV::OpFwidth))
+      .addDef(ResVReg)
+      .addUse(GR.getSPIRVTypeID(ResType))
+      .addUse(I.getOperand(2).getReg());
+}
+
 bool SPIRVInstructionSelector::selectCmp(Register ResVReg,
                                          const SPIRVType *ResType,
                                          unsigned CmpOpc,
@@ -3451,6 +3464,9 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register ResVReg,
   case Intrinsic::spv_discard: {
     return selectDiscard(ResVReg, ResType, I);
   }
+  case Intrinsic::spv_fwidth: {
+    return selectFwidth(ResVReg, ResType, I);
+  }
   case Intrinsic::modf: {
     return selectModf(ResVReg, ResType, I);
   }
diff --git a/llvm/test/CodeGen/DirectX/deriv_coarse_x.ll b/llvm/test/CodeGen/DirectX/deriv_coarse_x.ll
new file mode 100644
index 0000000000000..49e584c5c158e
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/deriv_coarse_x.ll
@@ -0,0 +1,43 @@
+; RUN: opt -S -scalarizer -dxil-op-lower -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s
+
+; Make sure dxil operation function calls for deriv_coarse_x are generated for
+; half, float, and float vector overloads.
+
+
+define noundef half @deriv_coarse_x_half(half noundef %a) {
+; CHECK: call half @dx.op.unary.f16(i32 83, half %{{.*}})
+entry:
+  %dx.deriv.coarse.x = call half @llvm.dx.deriv.coarse.x.f16(half %a)
+  ret half %dx.deriv.coarse.x
+}
+
+define noundef float @deriv_coarse_x_float(float noundef %a) {
+; CHECK: call float @dx.op.unary.f32(i32 83, float %{{.*}})
+entry:
+  %dx.deriv.coarse.x = call float @llvm.dx.deriv.coarse.x.f32(float %a)
+  ret float %dx.deriv.coarse.x
+}
+
+define noundef <4 x float> @deriv_coarse_x_float4(<4 x float> noundef %a) {
+; CHECK: [[ee0:%.*]] = extractelement <4 x float> %a, i64 0
+; CHECK: [[ie0:%.*]] = call float @dx.op.unary.f32(i32 83, float [[ee0]])
+; CHECK: [[ee1:%.*]] = extractelement <4 x float> %a, i64 1
+; CHECK: [[ie1:%.*]] = call float @dx.op.unary.f32(i32 83, float [[ee1]])
+; CHECK: [[ee2:%.*]] = extractelement <4 x float> %a, i64 2
+; CHECK: [[ie2:%.*]] = call float @dx.op.unary.f32(i32 83, float [[ee2]])
+; CHECK: [[ee3:%.*]] = extractelement <4 x float> %a, i64 3
+; CHECK: [[ie3:%.*]] = call float @dx.op.unary.f32(i32 83, float [[ee3]])
+; CHECK: insertelement <4 x float> poison, float [[ie0]], i64 0
+; CHECK: insertelement <4 x float> %{{.*}}, float [[ie1]], i64 1
+; CHECK: insertelement <4 x float> %{{.*}}, float [[ie2]], i64 2
+; CHECK: insertelement <4 x float> %{{.*}}, float [[ie3]], i64 3
+; CHECK: ret <4 x float> %{{.*}}
+entry:
+  %dx.deriv.coarse.x = call <4 x float> @llvm.dx.deriv.coarse.x.v4f32(<4 x float> %a)
+  ret <4 x float> %dx.deriv.coarse.x
+}
+
+declare half @llvm.dx.deriv.coarse.x.f16(half)
+declare float @llvm.dx.deriv.coarse.x.f32(float)
+declare <4 x float> @llvm.dx.deriv.coarse.x.v4f32(<4 x float>)
+
diff --git a/llvm/test/CodeGen/DirectX/deriv_coarse_x_error.ll b/llvm/test/CodeGen/DirectX/deriv_coarse_x_error.ll
new file mode 100644
index 0000000000000..eab495c5f7c6a
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/deriv_coarse_x_error.ll
@@ -0,0 +1,15 @@
+; RUN: not opt -S -dxil-op-lower -mtriple=dxil-pc-shadermodel6.3-library %s 2>&1 | FileCheck %s
+
+; DXIL operation deriv_coarse_x does not support double overload type
+; CHECK: in function deriv_coarse_x
+; CHECK-SAME: Cannot create DerivCoarseX operation: Invalid overload type
+
+; Function Attrs: noinline nounwind optnone
+define noundef double @deriv_coarse_x_double(double noundef ...
[truncated]

s-perron

The SPIR-V code looks good.

farzonl · 2025-10-15T17:52:38Z

llvm/include/llvm/IR/IntrinsicsDirectX.td

+def int_dx_deriv_coarse_x : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
+def int_dx_deriv_coarse_y : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;


I have concerns with how we are going about this because these need spirv implementations and we are half implementing other tickets to make progress on fwidth. I kind of feel like fwidth should be blocked until we do these two tickets

Implement the ddy_coarse HLSL Function #99100

Implement the ddx_coarse HLSL Function #99097

A fair point! I'll finish implementing them fully and put them in a new PR, then come back to finish up fwidth.

I've implemented both of these here #164831

farzonl · 2025-10-15T17:55:35Z

clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h

+  vector<T, N> derivCoarseX = __builtin_hlsl_elementwise_deriv_coarse_x(input);
+  derivCoarseX = abs(derivCoarseX);
+  vector<T, N> derivCoarseY = __builtin_hlsl_elementwise_deriv_coarse_y(input);


This would be better if we could just call ddy_coarse and ddx_coarse instead of a builtin, so we can take advantage of HLSL language semantic rules, like what you are doing with abs.

farzonl · 2025-10-15T17:59:11Z

clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h

+template <typename T> constexpr T fwidth_impl(T input) {
+#if (__has_builtin(__builtin_spirv_fwidth))
+  return __builtin_spirv_fwidth(input);
+#else
+  T derivCoarseX = __builtin_hlsl_elementwise_deriv_coarse_x(input);
+  derivCoarseX = abs(derivCoarseX);
+  T derivCoarseY = __builtin_hlsl_elementwise_deriv_coarse_y(input);
+  derivCoarseY = abs(derivCoarseY);
+  return derivCoarseX + derivCoarseY;
+#endif
+}


I feel like fwidth_impl and fwidth_vec_impl could be merged into one function; see how we did refract:
056f0a1#diff-f5dc7c12e2bb511ab1504961855fdd1e81dbeda43b8e3ee174819fb4a61c4acb

The split scalar/vector pattern emerged because there was previously a bug with using select, but that was fixed, so the new pattern, which you can see in ldexp_impl, faceforward_impl, and lit_impl, is to have just one implementation.
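
A minimal C++ sketch of the single-implementation pattern described above: one function template serves both scalar and vector arguments because every operation it calls is overloaded for both. `Float2` and all `*Model` functions are hypothetical stand-ins so the example compiles outside of HLSL, and the derivative stubs return made-up slopes rather than real screen-space derivatives.

```cpp
// Stand-in for an HLSL two-component vector (hypothetical type).
struct Float2 {
  float x, y;
};

constexpr Float2 operator+(Float2 a, Float2 b) {
  return Float2{a.x + b.x, a.y + b.y};
}

// Overloaded helpers, standing in for HLSL's overloaded abs().
constexpr float absModel(float v) { return v < 0.0f ? -v : v; }
constexpr Float2 absModel(Float2 v) {
  return Float2{absModel(v.x), absModel(v.y)};
}

// Placeholder derivatives: made-up slopes, NOT real screen-space derivatives.
constexpr float ddxCoarseModel(float v) { return -v; }
constexpr float ddyCoarseModel(float v) { return 0.5f * v; }
constexpr Float2 ddxCoarseModel(Float2 v) {
  return Float2{ddxCoarseModel(v.x), ddxCoarseModel(v.y)};
}
constexpr Float2 ddyCoarseModel(Float2 v) {
  return Float2{ddyCoarseModel(v.x), ddyCoarseModel(v.y)};
}

// One template covers scalar and vector inputs via overload resolution,
// so no separate fwidth_vec_impl-style function is needed.
template <typename T> constexpr T fwidthImplModel(T input) {
  return absModel(ddxCoarseModel(input)) + absModel(ddyCoarseModel(input));
}
```

With these placeholder slopes, `fwidthImplModel(2.0f)` evaluates to 3.0, and the same template applied to `Float2{2.0f, -4.0f}` produces the components 3.0 and 6.0.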

farzonl · 2025-10-15T18:05:42Z

clang/include/clang/Basic/BuiltinsSPIRVCommon.td

 def distance : SPIRVBuiltin<"void(...)", [NoThrow, Const]>;
 def length : SPIRVBuiltin<"void(...)", [NoThrow, Const]>;
 def smoothstep : SPIRVBuiltin<"void(...)", [NoThrow, Const, CustomTypeChecking]>;
+def fwidth : SPIRVBuiltin<"void(...)", [NoThrow, Const, CustomTypeChecking]>;


we only put things here if this builtin should work for both CL and VK contexts. If this is just a VK feature please move to
clang/include/clang/Basic/BuiltinsSPIRVVK.td

As it stands I don't see fwidth in opencl https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#math-functions so I think this will likely need to move.

I only see it in opengl/vk context: https://registry.khronos.org/OpenGL-Refpages/gl4/html/fwidth.xhtml

Moved to BuiltinsSPIRVVK.td

farzonl · 2025-10-15T18:20:08Z

llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp

+bool SPIRVInstructionSelector::selectFwidth(Register ResVReg,
+                                            const SPIRVType *ResType,
+                                            MachineInstr &I) const {
+  return BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(SPIRV::OpFwidth))
+      .addDef(ResVReg)
+      .addUse(GR.getSPIRVTypeID(ResType))
+      .addUse(I.getOperand(2).getReg());
+}
+


This doesn't seem special enough to need its own selector function, but I guess that's how we have been doing all the non-ext opcodes like SPIRV::OpFwidth. Maybe we should create a generic selector for the SPIR-V opcodes, so we don't need so many custom select functions when the only interesting part is SPIRV::OpFwidth. Something close to what we did for selectExtInst.

Renamed the selector for the DpdCoarse insts to selectDerivativeInst and sent the fwidth to it as well, as they all have the same pattern and require conversion to and from float when they are used with a half. This rule follows for all the other derivative instructions so far as I can tell.

farzonl · 2025-10-15T18:34:44Z

clang/lib/Headers/hlsl/hlsl_intrinsics.h

+template <typename T>
+const inline __detail::enable_if_t<
+    __detail::is_arithmetic<T>::Value && __detail::is_same<float, T>::value, T>
+fwidth(T input) {
+  return __detail::fwidth_impl(input);
+}
+
+template <int N>
+const inline __detail::HLSL_FIXED_VECTOR<float, N>
+fwidth(__detail::HLSL_FIXED_VECTOR<float, N> input) {
+  return __detail::fwidth_vec_impl(input);
+}


Doesn't fwidth have a half overload?

Yes, my mistake. Added it

farzonl · 2025-10-15T18:39:17Z

There are many HLSL codegen tests and SemaHLSL tests missing that are needed to cover:

fwidth.hlsl
ddy_coarse.hlsl
ddx_coarse.hlsl
I'm expecting 6 more test files.

farzonl · 2025-10-15T18:42:45Z

clang/lib/CodeGen/CGHLSLBuiltins.cpp

+        /*ReturnType=*/Op0->getType(), llvm::Intrinsic::dx_deriv_coarse_x,
+        ArrayRef<Value *>{Op0}, nullptr, "hlsl.deriv.coarse.x");
+  }
+  case Builtin::BI__builtin_hlsl_elementwise_deriv_coarse_y: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    if (!E->getArg(0)->getType()->hasFloatingRepresentation())
+      llvm_unreachable(
+          "deriv coarse y operand must have a float representation");
+    return Builder.CreateIntrinsic(
+        /*ReturnType=*/Op0->getType(), llvm::Intrinsic::dx_deriv_coarse_y,


We should really be using CGM.getHLSLRuntime().get...() instead of referencing llvm::Intrinsic::dx_deriv_coarse_x and llvm::Intrinsic::dx_deriv_coarse_y directly. These two intrinsics should map to the SPIR-V opcodes OpDPdxCoarse and OpDPdyCoarse.
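
The idea behind routing through a runtime object can be sketched in plain C++: the call site asks for "the coarse x-derivative intrinsic" and the runtime picks the target-specific one. Everything below (`TargetArch`, `IntrinsicModel`, `HLSLRuntimeModel` and its getters) is a hypothetical model written for this illustration, not Clang's actual `CGHLSLRuntime` interface.

```cpp
// Hypothetical model of target-dependent intrinsic selection.
enum class TargetArch { DXIL, SPIRV };

enum class IntrinsicModel {
  DxDdxCoarse,  // stand-in for a DirectX intrinsic
  DxDdyCoarse,  // stand-in for a DirectX intrinsic
  SpvDdxCoarse, // stand-in for a SPIR-V intrinsic
  SpvDdyCoarse  // stand-in for a SPIR-V intrinsic
};

struct HLSLRuntimeModel {
  TargetArch Arch;

  // Callers never name a dx_* or spv_* intrinsic themselves; the getter
  // resolves the right one from the target architecture.
  constexpr IntrinsicModel getDdxCoarseIntrinsic() const {
    return Arch == TargetArch::SPIRV ? IntrinsicModel::SpvDdxCoarse
                                     : IntrinsicModel::DxDdxCoarse;
  }

  constexpr IntrinsicModel getDdyCoarseIntrinsic() const {
    return Arch == TargetArch::SPIRV ? IntrinsicModel::SpvDdyCoarse
                                     : IntrinsicModel::DxDdyCoarse;
  }
};
```

The design point is that the builtin's codegen case stays identical for both backends, and only the getter knows about the target.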

llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp

Closes #99097 Closes #99100 As ddx and ddy are near identical implementations I've combined them in this PR. This aims to unblock #161378 --------- Co-authored-by: Alexander Johnston <[email protected]>

Closes llvm#99120

github-actions · 2025-11-19T03:28:07Z

🐧 Linux x64 Test Results

192819 tests passed
6176 tests skipped

farzonl · 2025-11-19T12:51:58Z

Forgot to mention this on the ddx_coarse/ddy_coarse PR, but in DXIL.td you defined those two opcodes as pixel and library shader stages, and we didn't write a test to confirm they would error on, say, a use in a compute shader. Can you add those two tests to this PR?

farzonl · 2025-11-19T12:56:43Z

clang/test/CodeGenHLSL/builtins/fwidth.hlsl

+// RUN: %clang_cc1 -finclude-default-header  -x hlsl  -triple dxil-pc-shadermodel6.3-library %s \
+// RUN:  -emit-llvm -disable-llvm-passes -fnative-half-type -o - | \
+// RUN:  FileCheck %s --check-prefixes=CHECK
+// RUN: %clang_cc1 -finclude-default-header  -x hlsl  -triple spirv-pc-vulkan-compute  %s \


It is a bit odd that this uses a compute triple (-triple spirv-pc-vulkan-compute), but I suppose this is fine since we are only doing -emit-llvm.

farzonl · 2025-11-19T13:04:38Z

llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp

-    return selectDpdCoarse(ResVReg, ResType, I, SPIRV::OpDPdxCoarse);
+    return selectDerivativeInst(ResVReg, ResType, I, SPIRV::OpDPdxCoarse);
  }
  case Intrinsic::spv_ddy_coarse: {
-    return selectDpdCoarse(ResVReg, ResType, I, SPIRV::OpDPdyCoarse);
+    return selectDerivativeInst(ResVReg, ResType, I, SPIRV::OpDPdyCoarse);
+  }
+  case Intrinsic::spv_fwidth: {
+    return selectDerivativeInst(ResVReg, ResType, I, SPIRV::OpFwidth);


This file is inconsistent, but some of these single-return case statements don't have curly braces {}. This might be an ambiguous part of the style guide since switch isn't mentioned: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements. I do believe this counts as a single-statement body, so if possible drop the curly braces.
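
For reference, a brace-free switch in the suggested style might look like the following. The enums are hypothetical stand-ins for the `Intrinsic::spv_*` IDs and `SPIRV::Op*` opcodes in the quoted diff, the function only models the ID-to-opcode mapping, and the constexpr switch assumes C++14 or later.

```cpp
// Hypothetical stand-ins for the intrinsic IDs and SPIR-V opcodes.
enum class DerivIntrinsic { DdxCoarse, DdyCoarse, Fwidth };
enum class DerivOpcode { OpDPdxCoarse, OpDPdyCoarse, OpFwidth, Unknown };

// Each case body is a single return statement, so no braces are used.
constexpr DerivOpcode selectDerivativeOpcode(DerivIntrinsic ID) {
  switch (ID) {
  case DerivIntrinsic::DdxCoarse:
    return DerivOpcode::OpDPdxCoarse;
  case DerivIntrinsic::DdyCoarse:
    return DerivOpcode::OpDPdyCoarse;
  case DerivIntrinsic::Fwidth:
    return DerivOpcode::OpFwidth;
  }
  return DerivOpcode::Unknown; // reached only for out-of-range values
}
```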

farzonl

lgtm. A few nits and test requests.

farzonl · 2025-11-19T13:08:19Z

Your pr message body needs a description of what this change does

Alexander-Johnston · 2025-11-19T14:02:48Z

I believe that's the nits fixed. I updated the ddx_coarse-errors.ll and ddy_coarse-errors.ll tests to include a run with a compute triple and check for the appropriate error.

Alexander-Johnston · 2025-11-20T12:25:36Z

I don't have permissions to merge this. If you're happy with the nit fixes and test changes can you merge it @farzonl ? Thanks!

github-actions · 2025-11-20T12:38:51Z

@Alexander-Johnston Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

Alexander-Johnston force-pushed the fwidth branch 2 times, most recently from 445fcb1 to a141971 on October 1, 2025 23:10

s-perron approved these changes Oct 2, 2025

farzonl reviewed Oct 15, 2025

farzonl reviewed Oct 16, 2025

Alexander-Johnston mentioned this pull request Oct 23, 2025

[HLSL] Implement ddx/ddy_coarse intrinsics #164831

Merged

[HLSL] Implement the fwidth intrinsic

5af43f8

Closes llvm#99120

Alexander-Johnston force-pushed the fwidth branch from a141971 to 5af43f8 on November 19, 2025 02:26

farzonl reviewed Nov 19, 2025

farzonl approved these changes Nov 19, 2025

Alexander-Johnston added 2 commits November 19, 2025 13:56

Fixup nits: Code style and fwidth test triple

449b422

Add Invalid stage test to ddx/y_coarse tests

6a7ef68

Alexander-Johnston mentioned this pull request Nov 19, 2025

Add fwidth graphics test llvm/offload-test-suite#508

Open

farzonl merged commit 76f1949 into llvm:main Nov 20, 2025
12 checks passed

		def int_dx_deriv_coarse_x : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
		def int_dx_deriv_coarse_y : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
