Skip to content

Conversation

@heiher
Copy link
Member

@heiher heiher commented Sep 10, 2025

This patch introduces the LASX and LSX conversion intrinsics:

  • <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
  • <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
  • <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>)
  • <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
  • <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>)
  • <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
  • <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
  • <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>)
  • <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
  • <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
  • <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>)
  • <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
  • <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>)
  • <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
  • <4 x i64> @llvm.loongarch.lasx.insert.128.hi(<4 x i64>, <2 x i64>)

@llvmbot
Copy link
Member

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-loongarch

Author: hev (heiher)

Changes

This patch introduces the LASX and LSX conversion intrinsics:

  • <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
  • <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
  • <32 x i8> @llvm.loongarch.lasx.cast.128(<16 x i8>)
  • <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
  • <32 x i8> @llvm.loongarch.lasx.concat.128(<16 x i8>, <16 x i8>)
  • <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
  • <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
  • <16 x i8> @llvm.loongarch.lasx.extract.128.lo(<32 x i8>)
  • <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
  • <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
  • <16 x i8> @llvm.loongarch.lasx.extract.128.hi(<32 x i8>)
  • <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
  • <32 x i8> @llvm.loongarch.lasx.insert.128.lo(<32 x i8>, <16 x i8>)
  • <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
  • <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
  • <32 x i8> @llvm.loongarch.lasx.insert.128.hi(<32 x i8>, <16 x i8>)

Full diff: https://github.com/llvm/llvm-project/pull/157818.diff

4 Files Affected:

  • (modified) llvm/include/llvm/IR/IntrinsicsLoongArch.td (+38)
  • (modified) llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td (+31)
  • (added) llvm/test/CodeGen/LoongArch/lasx/intrinsic-conversion.ll (+234)
diff --git a/llvm/include/llvm/IR/IntrinsicsLoongArch.td b/llvm/include/llvm/IR/IntrinsicsLoongArch.td
index 84026aa9d3624..2466aeec4474a 100644
--- a/llvm/include/llvm/IR/IntrinsicsLoongArch.td
+++ b/llvm/include/llvm/IR/IntrinsicsLoongArch.td
@@ -1192,4 +1192,42 @@ def int_loongarch_lasx_xvstelm_w
 def int_loongarch_lasx_xvstelm_d
   : VecInt<[], [llvm_v4i64_ty, llvm_ptr_ty, llvm_i32_ty, llvm_i32_ty],
            [IntrWriteMem, IntrArgMemOnly, ImmArg<ArgIndex<2>>, ImmArg<ArgIndex<3>>]>;
+
+// LASX and LSX conversion
+def int_loongarch_lasx_cast_128_s
+  : VecInt<[llvm_v8f32_ty], [llvm_v4f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_cast_128_d
+  : VecInt<[llvm_v4f64_ty], [llvm_v2f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_cast_128
+  : VecInt<[llvm_v32i8_ty], [llvm_v16i8_ty], [IntrNoMem]>;
+def int_loongarch_lasx_concat_128_s
+  : VecInt<[llvm_v8f32_ty], [llvm_v4f32_ty, llvm_v4f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_concat_128_d
+  : VecInt<[llvm_v4f64_ty], [llvm_v2f64_ty, llvm_v2f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_concat_128
+  : VecInt<[llvm_v32i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_lo_s
+  : VecInt<[llvm_v4f32_ty], [llvm_v8f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_lo_d
+  : VecInt<[llvm_v2f64_ty], [llvm_v4f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_lo
+  : VecInt<[llvm_v16i8_ty], [llvm_v32i8_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_hi_s
+  : VecInt<[llvm_v4f32_ty], [llvm_v8f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_hi_d
+  : VecInt<[llvm_v2f64_ty], [llvm_v4f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_extract_128_hi
+  : VecInt<[llvm_v16i8_ty], [llvm_v32i8_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_lo_s
+  : VecInt<[llvm_v8f32_ty], [llvm_v8f32_ty, llvm_v4f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_lo_d
+  : VecInt<[llvm_v4f64_ty], [llvm_v4f64_ty, llvm_v2f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_lo
+  : VecInt<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v16i8_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_hi_s
+  : VecInt<[llvm_v8f32_ty], [llvm_v8f32_ty, llvm_v4f32_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_hi_d
+  : VecInt<[llvm_v4f64_ty], [llvm_v4f64_ty, llvm_v2f64_ty], [IntrNoMem]>;
+def int_loongarch_lasx_insert_128_hi
+  : VecInt<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v16i8_ty], [IntrNoMem]>;
 } // TargetPrefix = "loongarch"
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index 634914d3b3fd0..1ae5fc8bcfdbb 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -6168,6 +6168,11 @@ performINTRINSIC_WO_CHAINCombine(SDNode *N, SelectionDAG &DAG,
                        N->getOperand(1),
                        DAG.getNode(ISD::ANY_EXTEND, DL, Subtarget.getGRLenVT(),
                                    N->getOperand(2)));
+  case Intrinsic::loongarch_lasx_concat_128_s:
+  case Intrinsic::loongarch_lasx_concat_128_d:
+  case Intrinsic::loongarch_lasx_concat_128:
+    return DAG.getNode(ISD::CONCAT_VECTORS, DL, N->getValueType(0),
+                       N->getOperand(1), N->getOperand(2));
   }
   return SDValue();
 }
diff --git a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
index a79c01cbe577a..d070717a8c8ea 100644
--- a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
@@ -2028,6 +2028,37 @@ defm : subvector_subreg_lowering<LSX128, v2f64, LASX256, v4f64,  2,  sub_128>;
 defm : subvector_subreg_lowering<LSX128, v8i16, LASX256, v16i16, 8,  sub_128>;
 defm : subvector_subreg_lowering<LSX128, v16i8, LASX256, v32i8,  16, sub_128>;
 
+// LASX and LSX conversion
+def : Pat<(int_loongarch_lasx_cast_128_s (v4f32 LSX128:$src)),
+          (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_cast_128_d (v2f64 LSX128:$src)),
+          (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_cast_128 (v16i8 LSX128:$src)),
+          (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_lo_s (v8f32 LASX256:$src)),
+          (EXTRACT_SUBREG LASX256:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_lo_d (v4f64 LASX256:$src)),
+          (EXTRACT_SUBREG LASX256:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_lo (v32i8 LASX256:$src)),
+          (EXTRACT_SUBREG LASX256:$src, sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_hi_s (v8f32 LASX256:$src)),
+          (EXTRACT_SUBREG (XVPERMI_Q (IMPLICIT_DEF), LASX256:$src, 1), sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_hi_d (v4f64 LASX256:$src)),
+          (EXTRACT_SUBREG (XVPERMI_Q (IMPLICIT_DEF), LASX256:$src, 1), sub_128)>;
+def : Pat<(int_loongarch_lasx_extract_128_hi (v32i8 LASX256:$src)),
+          (EXTRACT_SUBREG (XVPERMI_Q (IMPLICIT_DEF), LASX256:$src, 1), sub_128)>;
+def : Pat<(int_loongarch_lasx_insert_128_lo_s (v8f32 LASX256:$src), (v4f32 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 48)>;
+def : Pat<(int_loongarch_lasx_insert_128_lo_d (v4f64 LASX256:$src), (v2f64 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 48)>;
+def : Pat<(int_loongarch_lasx_insert_128_lo (v32i8 LASX256:$src), (v16i8 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 48)>;
+def : Pat<(int_loongarch_lasx_insert_128_hi_s (v8f32 LASX256:$src), (v4f32 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 2)>;
+def : Pat<(int_loongarch_lasx_insert_128_hi_d (v4f64 LASX256:$src), (v2f64 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 2)>;
+def : Pat<(int_loongarch_lasx_insert_128_hi (v32i8 LASX256:$src), (v16i8 LSX128:$lo)),
+          (XVPERMI_Q LASX256:$src, (INSERT_SUBREG (IMPLICIT_DEF), LSX128:$lo, sub_128), 2)>;
 } // Predicates = [HasExtLASX]
 
 /// Intrinsic pattern
diff --git a/llvm/test/CodeGen/LoongArch/lasx/intrinsic-conversion.ll b/llvm/test/CodeGen/LoongArch/lasx/intrinsic-conversion.ll
new file mode 100644
index 0000000000000..ede421015f626
--- /dev/null
+++ b/llvm/test/CodeGen/LoongArch/lasx/intrinsic-conversion.ll
@@ -0,0 +1,234 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+lasx < %s | FileCheck %s
+; RUN: llc --mtriple=loongarch64 --mattr=+lasx < %s | FileCheck %s
+
+declare <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
+
+define <8 x float> @lasx_cast_128_s(<4 x float> %va) {
+; CHECK-LABEL: lasx_cast_128_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float> %va)
+  ret <8 x float> %res
+}
+
+declare <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
+
+define <4 x double> @lasx_cast_128_d(<2 x double> %va) {
+; CHECK-LABEL: lasx_cast_128_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double> %va)
+  ret <4 x double> %res
+}
+
+declare <32 x i8> @llvm.loongarch.lasx.cast.128(<16 x i8>)
+
+define <32 x i8> @lasx_cast_128(<16 x i8> %va) {
+; CHECK-LABEL: lasx_cast_128:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.cast.128(<16 x i8> %va)
+  ret <32 x i8> %res
+}
+
+declare <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
+
+define <8 x float> @lasx_concat_128_s(<4 x float> %va, <4 x float> %vb) {
+; CHECK-LABEL: lasx_concat_128_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float> %va, <4 x float> %vb)
+  ret <8 x float> %res
+}
+
+declare <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
+
+define <4 x double> @lasx_concat_128_d(<2 x double> %va, <2 x double> %vb) {
+; CHECK-LABEL: lasx_concat_128_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double> %va, <2 x double> %vb)
+  ret <4 x double> %res
+}
+
+declare <32 x i8> @llvm.loongarch.lasx.concat.128(<16 x i8>, <16 x i8>)
+
+define <32 x i8> @lasx_concat_128(<16 x i8> %va, <16 x i8> %vb) {
+; CHECK-LABEL: lasx_concat_128:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 def $xr0
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.concat.128(<16 x i8> %va, <16 x i8> %vb)
+  ret <32 x i8> %res
+}
+
+declare <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
+
+define <4 x float> @lasx_extract_128_lo_s(<8 x float> %va) {
+; CHECK-LABEL: lasx_extract_128_lo_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float> %va)
+  ret <4 x float> %res
+}
+
+declare <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
+
+define <2 x double> @lasx_extract_128_lo_d(<4 x double> %va) {
+; CHECK-LABEL: lasx_extract_128_lo_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double> %va)
+  ret <2 x double> %res
+}
+
+declare <16 x i8> @llvm.loongarch.lasx.extract.128.lo(<32 x i8>)
+
+define <16 x i8> @lasx_extract_128_lo(<32 x i8> %va) {
+; CHECK-LABEL: lasx_extract_128_lo:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <16 x i8> @llvm.loongarch.lasx.extract.128.lo(<32 x i8> %va)
+  ret <16 x i8> %res
+}
+
+declare <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
+
+define <4 x float> @lasx_extract_128_hi_s(<8 x float> %va) {
+; CHECK-LABEL: lasx_extract_128_hi_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr0, 1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float> %va)
+  ret <4 x float> %res
+}
+
+declare <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
+
+define <2 x double> @lasx_extract_128_hi_d(<4 x double> %va) {
+; CHECK-LABEL: lasx_extract_128_hi_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr0, 1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double> %va)
+  ret <2 x double> %res
+}
+
+declare <16 x i8> @llvm.loongarch.lasx.extract.128.hi(<32 x i8>)
+
+define <16 x i8> @lasx_extract_128_hi(<32 x i8> %va) {
+; CHECK-LABEL: lasx_extract_128_hi:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr0, 1
+; CHECK-NEXT:    # kill: def $vr0 killed $vr0 killed $xr0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <16 x i8> @llvm.loongarch.lasx.extract.128.hi(<32 x i8> %va)
+  ret <16 x i8> %res
+}
+
+declare <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
+
+define <8 x float> @lasx_insert_128_lo_s(<8 x float> %va, <4 x float> %vb) {
+; CHECK-LABEL: lasx_insert_128_lo_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 48
+; CHECK-NEXT:    ret
+entry:
+  %res = call <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float> %va, <4 x float> %vb)
+  ret <8 x float> %res
+}
+
+declare <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
+
+define <4 x double> @lasx_insert_128_lo_d(<4 x double> %va, <2 x double> %vb) {
+; CHECK-LABEL: lasx_insert_128_lo_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 48
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double> %va, <2 x double> %vb)
+  ret <4 x double> %res
+}
+
+declare <32 x i8> @llvm.loongarch.lasx.insert.128.lo(<32 x i8>, <16 x i8>)
+
+define <32 x i8> @lasx_insert_128_lo(<32 x i8> %va, <16 x i8> %vb) {
+; CHECK-LABEL: lasx_insert_128_lo:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 48
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.insert.128.lo(<32 x i8> %va, <16 x i8> %vb)
+  ret <32 x i8> %res
+}
+
+declare <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
+
+define <8 x float> @lasx_insert_128_hi_s(<8 x float> %va, <4 x float> %vb) {
+; CHECK-LABEL: lasx_insert_128_hi_s:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float> %va, <4 x float> %vb)
+  ret <8 x float> %res
+}
+
+declare <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
+
+define <4 x double> @lasx_insert_128_hi_d(<4 x double> %va, <2 x double> %vb) {
+; CHECK-LABEL: lasx_insert_128_hi_d:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double> %va, <2 x double> %vb)
+  ret <4 x double> %res
+}
+
+declare <32 x i8> @llvm.loongarch.lasx.insert.128.hi(<32 x i8>, <16 x i8>)
+
+define <32 x i8> @lasx_insert_128_hi(<32 x i8> %va, <16 x i8> %vb) {
+; CHECK-LABEL: lasx_insert_128_hi:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    # kill: def $vr1 killed $vr1 def $xr1
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 2
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.insert.128.hi(<32 x i8> %va, <16 x i8> %vb)
+  ret <32 x i8> %res
+}

@heiher heiher force-pushed the users/hev/llvm-lasx-lsx-conversion branch from a088c6f to aceb318 Compare September 18, 2025 04:59
This patch introduces the LASX and LSX conversion intrinsics:

- <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
- <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
- <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>)
- <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.hi(<4 x i64>, <2 x i64>)
@heiher heiher force-pushed the users/hev/llvm-lasx-lsx-conversion branch from aceb318 to 279874f Compare October 30, 2025 12:20
@heiher heiher merged commit a410570 into main Nov 5, 2025
10 checks passed
@heiher heiher deleted the users/hev/llvm-lasx-lsx-conversion branch November 5, 2025 12:36
@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 5, 2025

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/24436

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/X86/sse2-intrinsics-fast-isel.ll' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/buildbot/worker/arc-folder/build/bin/llc < /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2 | /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll --check-prefixes=CHECK,X86,SSE,X86-SSE
# executed command: /buildbot/worker/arc-folder/build/bin/llc -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2
# .---command stderr------------
# | LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse2.clflush
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.	Program arguments: /buildbot/worker/arc-folder/build/bin/llc -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2
# | 1.	Running pass 'Function Pass Manager' on module '<stdin>'.
# | 2.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@test_mm_clflush'
# |  #0 0x00000000023a9188 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/buildbot/worker/arc-folder/build/bin/llc+0x23a9188)
# |  #1 0x00000000023a6095 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |  #2 0x00007f5530ba1630 __restore_rt sigaction.c:0:0
# |  #3 0x00007f552f8f13d7 raise (/usr/lib64/libc.so.6+0x363d7)
# |  #4 0x00007f552f8f2ac8 abort (/usr/lib64/libc.so.6+0x37ac8)
# |  #5 0x000000000072a8f7 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) JSON.cpp:0:0
# |  #6 0x0000000002124459 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/buildbot/worker/arc-folder/build/bin/llc+0x2124459)
# |  #7 0x0000000002128fca llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/buildbot/worker/arc-folder/build/bin/llc+0x2128fca)
# |  #8 0x0000000000970b77 (anonymous namespace)::X86DAGToDAGISel::Select(llvm::SDNode*) X86ISelDAGToDAG.cpp:0:0
# |  #9 0x000000000211fc9f llvm::SelectionDAGISel::DoInstructionSelection() (/buildbot/worker/arc-folder/build/bin/llc+0x211fc9f)
# | #10 0x000000000212fb78 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/buildbot/worker/arc-folder/build/bin/llc+0x212fb78)
# | #11 0x0000000002133e23 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/buildbot/worker/arc-folder/build/bin/llc+0x2133e23)
# | #12 0x0000000002134915 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/buildbot/worker/arc-folder/build/bin/llc+0x2134915)
# | #13 0x000000000211f4af llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/buildbot/worker/arc-folder/build/bin/llc+0x211f4af)
# | #14 0x0000000001227997 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
# | #15 0x00000000018a89eb llvm::FPPassManager::runOnFunction(llvm::Function&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a89eb)
# | #16 0x00000000018a8d91 llvm::FPPassManager::runOnModule(llvm::Module&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a8d91)
# | #17 0x00000000018a99a5 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a99a5)
# | #18 0x000000000080f5a8 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
# | #19 0x0000000000733286 main (/buildbot/worker/arc-folder/build/bin/llc+0x733286)
# | #20 0x00007f552f8dd555 __libc_start_main (/usr/lib64/libc.so.6+0x22555)
# | #21 0x00000000008047e6 _start (/buildbot/worker/arc-folder/build/bin/llc+0x8047e6)
# `-----------------------------
# error: command failed with exit status: -6
# executed command: /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll --check-prefixes=CHECK,X86,SSE,X86-SSE
# .---command stderr------------
# | /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll:399:14: error: SSE-LABEL: expected string not found in input
# | ; SSE-LABEL: test_mm_bsrli_si128:
# |              ^
# | <stdin>:170:21: note: scanning from here
# | test_mm_bslli_si128: # @test_mm_bslli_si128
# |                     ^
# | <stdin>:178:9: note: possible intended match here
# |  .globl test_mm_bsrli_si128 # 
# |         ^
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants