Skip to content

Conversation

@tcwzxx
Copy link
Member

@tcwzxx tcwzxx commented Aug 3, 2024

When store chains have the same value type ID and pointer type ID, they may mix different sizes of values, such as i8 and i64. This can lead to missed vectorization opportunities.

@llvmbot
Copy link
Member

llvmbot commented Aug 3, 2024

@llvm/pr-subscribers-llvm-transforms

Author: tcwzxx (tcwzxx)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/101810.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+8)
  • (added) llvm/test/Transforms/SLPVectorizer/X86/stores_mix_sizes.ll (+34)
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 8d2ce6bad6af7..97fb6798adcfc 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -19125,6 +19125,14 @@ bool SLPVectorizerPass::vectorizeStoreChains(BoUpSLP &R) {
     if (V->getPointerOperandType()->getTypeID() >
         V2->getPointerOperandType()->getTypeID())
       return false;
+    if (V->getValueOperand()->getType()->getScalarSizeInBits() <
+        V2->getValueOperand()->getType()->getScalarSizeInBits()) {
+      return true;
+    }
+    if (V->getValueOperand()->getType()->getScalarSizeInBits() >
+        V2->getValueOperand()->getType()->getScalarSizeInBits()) {
+      return false;
+    }
     // UndefValues are compatible with all other values.
     if (isa<UndefValue>(V->getValueOperand()) ||
         isa<UndefValue>(V2->getValueOperand()))
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/stores_mix_sizes.ll b/llvm/test/Transforms/SLPVectorizer/X86/stores_mix_sizes.ll
new file mode 100644
index 0000000000000..c571372e40d16
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/stores_mix_sizes.ll
@@ -0,0 +1,34 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s | FileCheck %s
+
+define void @test(ptr %p) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[IDX1:%.*]] = getelementptr i8, ptr [[P]], i64 1
+; CHECK-NEXT:    [[IDX_64_9:%.*]] = getelementptr i64, ptr [[P]], i64 9
+; CHECK-NEXT:    store i64 1, ptr [[IDX_64_9]], align 8
+; CHECK-NEXT:    store <8 x i8> zeroinitializer, ptr [[IDX1]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %idx1 = getelementptr i8, ptr %p, i64 1
+  store i8 0, ptr %idx1, align 4
+  %idx.64.9 = getelementptr i64, ptr %p, i64 9
+  store i64 1, ptr %idx.64.9, align 8
+  %idx2 = getelementptr i8, ptr %p, i64 2
+  store i8 0, ptr %idx2, align 4
+  %idx3 = getelementptr i8, ptr %p, i64 3
+  store i8 0, ptr %idx3, align 4
+  %idx4 = getelementptr i8, ptr %p, i64 4
+  store i8 0, ptr %idx4, align 4
+  %idx5 = getelementptr i8, ptr %p, i64 5
+  store i8 0, ptr %idx5, align 4
+  %idx6 = getelementptr i8, ptr %p, i64 6
+  store i8 0, ptr %idx6, align 4
+  %idx7 = getelementptr i8, ptr %p, i64 7
+  store i8 0, ptr %idx7, align 4
+  %idx8 = getelementptr i8, ptr %p, i64 8
+  store i8 0, ptr %idx8, align 4
+  ret void
+}

@alexey-bataev
Copy link
Member

Add a description of the patch

@alexey-bataev
Copy link
Member

Add a test in a separate patch

@tcwzxx
Copy link
Member Author

tcwzxx commented Aug 3, 2024

Add a description of the patch

OK, thanks

@alexey-bataev
Copy link
Member

I don't see test update

@tcwzxx
Copy link
Member Author

tcwzxx commented Aug 5, 2024

I don't see test update

The existing test cases are fine.

There is a new pattern :

; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s | FileCheck %s
define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store i8 0, ptr %idx2, align 4
    %idx3 = getelementptr i8, ptr %p, i64 3
    store i8 0, ptr %idx3, align 4
    %idx4 = getelementptr i8, ptr %p, i64 4
    store i8 0, ptr %idx4, align 4
    %idx5 = getelementptr i8, ptr %p, i64 5
    store i8 0, ptr %idx5, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

Before the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store <4 x i8> zeroinitializer, ptr %idx2, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

After the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    store <8 x i8> zeroinitializer, ptr %idx1, align 4
    ret void
}

Add a test in a separate patch

I will commit the above test case in a different PR.

@alexey-bataev
Copy link
Member

I don't see test update

The existing test cases are fine.

There is a new pattern :

; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s | FileCheck %s
define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store i8 0, ptr %idx2, align 4
    %idx3 = getelementptr i8, ptr %p, i64 3
    store i8 0, ptr %idx3, align 4
    %idx4 = getelementptr i8, ptr %p, i64 4
    store i8 0, ptr %idx4, align 4
    %idx5 = getelementptr i8, ptr %p, i64 5
    store i8 0, ptr %idx5, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

Before the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store <4 x i8> zeroinitializer, ptr %idx2, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

After the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    store <8 x i8> zeroinitializer, ptr %idx1, align 4
    ret void
}

Add a test in a separate patch

I will commit the above test case in a different PR.

I don't see test update

The existing test cases are fine.

There is a new pattern :

; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s | FileCheck %s
define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store i8 0, ptr %idx2, align 4
    %idx3 = getelementptr i8, ptr %p, i64 3
    store i8 0, ptr %idx3, align 4
    %idx4 = getelementptr i8, ptr %p, i64 4
    store i8 0, ptr %idx4, align 4
    %idx5 = getelementptr i8, ptr %p, i64 5
    store i8 0, ptr %idx5, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

Before the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    store i8 0, ptr %idx1, align 4
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    %idx2 = getelementptr i8, ptr %p, i64 2
    store <4 x i8> zeroinitializer, ptr %idx2, align 4
    %idx6 = getelementptr i8, ptr %p, i64 6
    store i8 0, ptr %idx6, align 4
    %idx7 = getelementptr i8, ptr %p, i64 7
    store i8 0, ptr %idx7, align 4
    %idx8 = getelementptr i8, ptr %p, i64 8
    store i8 0, ptr %idx8, align 4
    ret void
}

After the patch:

define void @test(ptr %p) {
entry:
    %idx1 = getelementptr i8, ptr %p, i64 1
    %idx.64.9 = getelementptr i64, ptr %p, i64 9
    store i64 1, ptr %idx.64.9, align 8
    store <8 x i8> zeroinitializer, ptr %idx1, align 4
    ret void
}

Add a test in a separate patch

I will commit the above test case in a different PR.

Ok, fine, but these changes must be reflected in the tests. There are no test changes in this patch

@tcwzxx
Copy link
Member Author

tcwzxx commented Aug 5, 2024

Add a test in a separate patch

Ok, fine, but these changes must be reflected in the tests. There are no test changes in this patch

I feel confused. So, do you mean I should add the above test case in this patch?

The existing test cases do not change after the patch.

@alexey-bataev
Copy link
Member

You need to add the test(s) in the separate NFC patch, commit it, and then show in this patch that it updates/affects these new tests

@tcwzxx
Copy link
Member Author

tcwzxx commented Aug 5, 2024

You need to add the test(s) in the separate NFC patch, commit it, and then show in this patch that it updates/affects these new tests

Thank you for your patience. Is everything OK now?

if (V->getPointerOperandType()->getTypeID() >
V2->getPointerOperandType()->getTypeID())
return false;
if (V->getValueOperand()->getType()->getScalarSizeInBits() <
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need to update the checks in AreCompatibleStores lambda

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to compare getTypeID() rather than size here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getTypeID has been compared above, but with the same TypeID, such as IntegerTyID, which can have different sizes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, getTypeID() above compared for pointer operands, which worked for typed pointers. You need to use getTypeID for value operand types

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you miss lines 19114 to 19122, which compare the value operand types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. Then there is a question, of why it did not work, because it should work correctly for compatible types

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i8 and i64 are not compatible types, but they have the same Type ID, IntegerTyID

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, missed it.

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@tcwzxx tcwzxx merged commit b64ec3c into llvm:main Aug 7, 2024
@tcwzxx tcwzxx deleted the slp branch August 17, 2024 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants