Skip to content

Conversation

@omern1
Copy link
Member

@omern1 omern1 commented Nov 20, 2024

The relevant bit from the Intel SDM for vinsertps semantics:

IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0

This is now taken into account.

The relevant bit from the Intel SDM for vinsertps semantics:
```
IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0
```

This is now taken into account.
@omern1 omern1 requested a review from RKSimon November 20, 2024 16:59
@llvmbot llvmbot added backend:X86 llvm:mc Machine (object) code labels Nov 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-x86

Author: Nabeel Omer (omern1)

Changes

The relevant bit from the Intel SDM for vinsertps semantics:

IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0

This is now taken into account.


Full diff: https://github.com/llvm/llvm-project/pull/117009.diff

5 Files Affected:

  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp (+8-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h (+1-1)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+1-1)
  • (added) llvm/test/MC/X86/vinsertps_decode.s (+7)
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
index 70c71273d270f9..a952ba3e3b0ccd 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
@@ -1122,7 +1122,13 @@ bool llvm::EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,
   case X86::VINSERTPSrri:
   case X86::VINSERTPSZrri:
     Src2Name = getRegName(MI->getOperand(2).getReg());
-    [[fallthrough]];
+    DestName = getRegName(MI->getOperand(0).getReg());
+    Src1Name = getRegName(MI->getOperand(1).getReg());
+    if (MI->getOperand(NumOperands - 1).isImm())
+      DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
+                         ShuffleMask, false);
+    break;
+
   case X86::INSERTPSrmi:
   case X86::VINSERTPSrmi:
   case X86::VINSERTPSZrmi:
@@ -1130,7 +1136,7 @@ bool llvm::EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,
     Src1Name = getRegName(MI->getOperand(1).getReg());
     if (MI->getOperand(NumOperands - 1).isImm())
       DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
-                         ShuffleMask);
+                         ShuffleMask, true);
     break;
 
   case X86::MOVLHPSrr:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
index 82f4460a42e70a..57d0a734bf6ef7 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
@@ -23,7 +23,7 @@
 
 namespace llvm {
 
-void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
+void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem) {
   // Defaults the copying the dest value.
   ShuffleMask.push_back(0);
   ShuffleMask.push_back(1);
@@ -33,7 +33,7 @@ void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
   // Decode the immediate.
   unsigned ZMask = Imm & 15;
   unsigned CountD = (Imm >> 4) & 3;
-  unsigned CountS = (Imm >> 6) & 3;
+  unsigned CountS = SrcIsMem ? 0 : (Imm >> 6) & 3;
 
   // CountS selects which input element to use.
   unsigned InVal = 4 + CountS;
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
index 4ef9959f7a278c..cb079648e7be92 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
@@ -28,7 +28,7 @@ template <typename T> class SmallVectorImpl;
 enum { SM_SentinelUndef = -1, SM_SentinelZero = -2 };
 
 /// Decode a 128-bit INSERTPS instruction as a v4f32 shuffle mask.
-void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask);
+void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem);
 
 // Insert the bottom Len elements from a second source into a vector starting at
 // element Idx.
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index fea66e9582cfba..65c8d53d0d42d5 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -5362,7 +5362,7 @@ static bool getTargetShuffleMask(SDValue N, bool AllowSentinelZero,
     assert(N.getOperand(0).getValueType() == VT && "Unexpected value type");
     assert(N.getOperand(1).getValueType() == VT && "Unexpected value type");
     ImmN = N.getConstantOperandVal(N.getNumOperands() - 1);
-    DecodeINSERTPSMask(ImmN, Mask);
+    DecodeINSERTPSMask(ImmN, Mask, false);
     IsUnary = IsFakeUnary = N.getOperand(0) == N.getOperand(1);
     break;
   case X86ISD::EXTRQI:
diff --git a/llvm/test/MC/X86/vinsertps_decode.s b/llvm/test/MC/X86/vinsertps_decode.s
new file mode 100644
index 00000000000000..644f34e4f6d144
--- /dev/null
+++ b/llvm/test/MC/X86/vinsertps_decode.s
@@ -0,0 +1,7 @@
+# RUN: llvm-mc -triple x86_64-unknown-unknown %s | FileCheck %s
+
+.intel_syntax
+
+# CHECK:  vinsertps {{.*}} # xmm2 = xmm2[0,1,2],mem[0]
+
+vinsertps xmm2,xmm2,dword ptr [r14+rdi*8+0x4C],0x0B0

@github-actions
Copy link

github-actions bot commented Nov 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reported issue for this?

Src1Name = getRegName(MI->getOperand(1).getReg());
if (MI->getOperand(NumOperands - 1).isImm())
DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
ShuffleMask, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*SrcIsMem=*/false

if (MI->getOperand(NumOperands - 1).isImm())
DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
ShuffleMask);
ShuffleMask, true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*SrcIsMem=*/true

namespace llvm {

void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format


/// Decode a 128-bit INSERTPS instruction as a v4f32 shuffle mask.
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask);
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format

@omern1
Copy link
Member Author

omern1 commented Nov 20, 2024

Is there a reported issue for this?

No there isn't, I'll create one now.


# CHECK: vinsertps $176, 76(%r14,%rdi,8), %xmm2, %xmm2 # xmm2 = xmm2[0,1,2],mem[0]

vinsertps xmm2,xmm2,dword ptr [r14+rdi*8+0x4C],0x0B0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test insertps and {evex} vinsertps variants as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: you can simplify the address computation. For the purpose of this test, you don't need index/scale/offset. It is all about testing that you still get mem[0] even if some of the top bits of the immediate are set.

@RKSimon RKSimon changed the title [X86] Fix decoding for vinsertps immediate operand [X86] Fix shuffle comment decoding for vinsertps immediate operand Nov 20, 2024
@omern1 omern1 linked an issue Nov 21, 2024 that may be closed by this pull request
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@omern1 omern1 merged commit 0b06301 into llvm:main Nov 21, 2024
5 of 8 checks passed
@omern1 omern1 deleted the vinsertps branch November 21, 2024 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:X86 llvm:mc Machine (object) code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[X86] Incorrect shuffle comment decoding for vinsertps

4 participants