Skip to content

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Oct 7, 2025

The vector intrinsics in question have no undefined behavior, and have no other effect besides returning the result: they should hence be marked speculatable.

@llvmbot
Copy link
Member

llvmbot commented Oct 7, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-risc-v

Author: Ramkumar Ramachandra (artagnon)

Changes

The vector intrinsics in questions have no undefined behavior, and have no other effect besides returning the result, and do return a result: they should hence be marked speculatable and willreturn.

At the moment, there are no optimizations that are enabled by these attributes, and the patch is mainly just documentation.


Full diff: https://github.com/llvm/llvm-project/pull/162334.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/Intrinsics.td (+34-12)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll (+1-1)
  • (modified) llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll (-1)
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 96da698538314..fef80a88bfbc2 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -2422,12 +2422,12 @@ def int_loop_dependence_war_mask:
 def int_get_active_lane_mask:
   DefaultAttrsIntrinsic<[llvm_anyvector_ty],
             [llvm_anyint_ty, LLVMMatchType<1>],
-            [IntrNoMem]>;
+            [IntrNoMem, IntrSpeculatable, IntrWillReturn]>;
 
 def int_experimental_get_vector_length:
   DefaultAttrsIntrinsic<[llvm_i32_ty],
                         [llvm_anyint_ty, llvm_i32_ty, llvm_i1_ty],
-                        [IntrNoMem,
+                        [IntrNoMem, IntrSpeculatable, IntrWillReturn,
                          ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
 
 def int_experimental_cttz_elts:
@@ -2614,7 +2614,7 @@ def int_memset_element_unordered_atomic
 
 //===------------------------ Reduction Intrinsics ------------------------===//
 //
-let IntrProperties = [IntrNoMem, IntrSpeculatable] in {
+let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
 
   def int_vector_reduce_fadd : DefaultAttrsIntrinsic<[LLVMVectorElementType<0>],
                                          [LLVMVectorElementType<0>,
@@ -2753,41 +2753,63 @@ def int_preserve_static_offset : DefaultAttrsIntrinsic<[llvm_ptr_ty],
 
 def int_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                [LLVMMatchType<0>],
-                                               [IntrNoMem]>;
+                                               [IntrNoMem,
+                                                IntrSpeculatable,
+                                                IntrWillReturn]>;
 
 def int_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                               [LLVMMatchType<0>,
                                                LLVMMatchType<0>,
                                                llvm_i32_ty],
-                                              [IntrNoMem, ImmArg<ArgIndex<2>>]>;
+                                              [IntrNoMem,
+                                               IntrSpeculatable,
+                                               IntrWillReturn,
+                                               ImmArg<ArgIndex<2>>]>;
 
 //===---------- Intrinsics to query properties of scalable vectors --------===//
-def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;
+def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty],
+                                       [],
+                                       [IntrNoMem,
+                                        IntrSpeculatable,
+                                        IntrWillReturn]>;
 
 //===---------- Intrinsics to perform subvector insertion/extraction ------===//
 def int_vector_insert : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                               [LLVMMatchType<0>, llvm_anyvector_ty, llvm_i64_ty],
-                                              [IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<2>>]>;
+                                              [IntrNoMem,
+                                               IntrSpeculatable,
+                                               IntrWillReturn,
+                                               ImmArg<ArgIndex<2>>]>;
 
 def int_vector_extract : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                [llvm_anyvector_ty, llvm_i64_ty],
-                                               [IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<1>>]>;
+                                               [IntrNoMem,
+                                                IntrSpeculatable,
+                                                IntrWillReturn,
+                                                ImmArg<ArgIndex<1>>]>;
 
 foreach n = 2...8 in {
   def int_vector_interleave#n   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                        !listsplat(LLVMOneNthElementsVectorType<0, n>, n),
-                                                       [IntrNoMem]>;
+                                                       [IntrNoMem,
+                                                        IntrSpeculatable,
+                                                        IntrWillReturn]>;
 
   def int_vector_deinterleave#n : DefaultAttrsIntrinsic<!listsplat(LLVMOneNthElementsVectorType<0, n>, n),
                                                        [llvm_anyvector_ty],
-                                                       [IntrNoMem]>;
+                                                       [IntrNoMem,
+                                                        IntrSpeculatable,
+                                                        IntrWillReturn]>;
 }
 
 //===-------------- Intrinsics to perform partial reduction ---------------===//
 
 def int_vector_partial_reduce_add : DefaultAttrsIntrinsic<[LLVMMatchType<0>],
-                                                          [llvm_anyvector_ty, llvm_anyvector_ty],
-                                                          [IntrNoMem]>;
+                                                          [llvm_anyvector_ty,
+                                                           llvm_anyvector_ty],
+                                                          [IntrNoMem,
+                                                           IntrSpeculatable,
+                                                           IntrWillReturn]>;
 
 //===----------------- Pointer Authentication Intrinsics ------------------===//
 //
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll b/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
index d73900dcd0bea..83b494a1be0d3 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
@@ -2288,7 +2288,7 @@ define void @tgamma_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
 }
 ;.
 ; CHECK: attributes #[[ATTR0]] = { "target-features"="+v" }
-; CHECK: attributes #[[ATTR1:[0-9]+]] = { nocallback nofree nosync nounwind willreturn memory(none) }
+; CHECK: attributes #[[ATTR1:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
 ; CHECK: attributes #[[ATTR2]] = { "vector-function-abi-variant"="_ZGVrNxv_acos(Sleef_acosdx_u10rvvm2)" }
 ; CHECK: attributes #[[ATTR3]] = { "vector-function-abi-variant"="_ZGVrNxv_acosf(Sleef_acosfx_u10rvvm2)" }
 ; CHECK: attributes #[[ATTR4]] = { "vector-function-abi-variant"="_ZGVrNxv_acosh(Sleef_acoshdx_u10rvvm2)" }
diff --git a/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll b/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
index 9acc6d6601292..09f583f9242d5 100644
--- a/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
+++ b/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
@@ -39,5 +39,4 @@ declare <4 x float> @llvm.exp.v4f32(<4 x float>) #0
 declare <vscale x 4 x float> @llvm.exp.nxv4f32(<vscale x 4 x float>) #0
 
 ; CHECK: attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
-; CHECK-NEXT: attributes #1 = { nocallback nofree nosync nounwind willreturn memory(none) }
 attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

@llvmbot
Copy link
Member

llvmbot commented Oct 7, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

The vector intrinsics in questions have no undefined behavior, and have no other effect besides returning the result, and do return a result: they should hence be marked speculatable and willreturn.

At the moment, there are no optimizations that are enabled by these attributes, and the patch is mainly just documentation.


Full diff: https://github.com/llvm/llvm-project/pull/162334.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/Intrinsics.td (+34-12)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll (+1-1)
  • (modified) llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll (-1)
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 96da698538314..fef80a88bfbc2 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -2422,12 +2422,12 @@ def int_loop_dependence_war_mask:
 def int_get_active_lane_mask:
   DefaultAttrsIntrinsic<[llvm_anyvector_ty],
             [llvm_anyint_ty, LLVMMatchType<1>],
-            [IntrNoMem]>;
+            [IntrNoMem, IntrSpeculatable, IntrWillReturn]>;
 
 def int_experimental_get_vector_length:
   DefaultAttrsIntrinsic<[llvm_i32_ty],
                         [llvm_anyint_ty, llvm_i32_ty, llvm_i1_ty],
-                        [IntrNoMem,
+                        [IntrNoMem, IntrSpeculatable, IntrWillReturn,
                          ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
 
 def int_experimental_cttz_elts:
@@ -2614,7 +2614,7 @@ def int_memset_element_unordered_atomic
 
 //===------------------------ Reduction Intrinsics ------------------------===//
 //
-let IntrProperties = [IntrNoMem, IntrSpeculatable] in {
+let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
 
   def int_vector_reduce_fadd : DefaultAttrsIntrinsic<[LLVMVectorElementType<0>],
                                          [LLVMVectorElementType<0>,
@@ -2753,41 +2753,63 @@ def int_preserve_static_offset : DefaultAttrsIntrinsic<[llvm_ptr_ty],
 
 def int_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                [LLVMMatchType<0>],
-                                               [IntrNoMem]>;
+                                               [IntrNoMem,
+                                                IntrSpeculatable,
+                                                IntrWillReturn]>;
 
 def int_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                               [LLVMMatchType<0>,
                                                LLVMMatchType<0>,
                                                llvm_i32_ty],
-                                              [IntrNoMem, ImmArg<ArgIndex<2>>]>;
+                                              [IntrNoMem,
+                                               IntrSpeculatable,
+                                               IntrWillReturn,
+                                               ImmArg<ArgIndex<2>>]>;
 
 //===---------- Intrinsics to query properties of scalable vectors --------===//
-def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;
+def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty],
+                                       [],
+                                       [IntrNoMem,
+                                        IntrSpeculatable,
+                                        IntrWillReturn]>;
 
 //===---------- Intrinsics to perform subvector insertion/extraction ------===//
 def int_vector_insert : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                               [LLVMMatchType<0>, llvm_anyvector_ty, llvm_i64_ty],
-                                              [IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<2>>]>;
+                                              [IntrNoMem,
+                                               IntrSpeculatable,
+                                               IntrWillReturn,
+                                               ImmArg<ArgIndex<2>>]>;
 
 def int_vector_extract : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                [llvm_anyvector_ty, llvm_i64_ty],
-                                               [IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<1>>]>;
+                                               [IntrNoMem,
+                                                IntrSpeculatable,
+                                                IntrWillReturn,
+                                                ImmArg<ArgIndex<1>>]>;
 
 foreach n = 2...8 in {
   def int_vector_interleave#n   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                                                        !listsplat(LLVMOneNthElementsVectorType<0, n>, n),
-                                                       [IntrNoMem]>;
+                                                       [IntrNoMem,
+                                                        IntrSpeculatable,
+                                                        IntrWillReturn]>;
 
   def int_vector_deinterleave#n : DefaultAttrsIntrinsic<!listsplat(LLVMOneNthElementsVectorType<0, n>, n),
                                                        [llvm_anyvector_ty],
-                                                       [IntrNoMem]>;
+                                                       [IntrNoMem,
+                                                        IntrSpeculatable,
+                                                        IntrWillReturn]>;
 }
 
 //===-------------- Intrinsics to perform partial reduction ---------------===//
 
 def int_vector_partial_reduce_add : DefaultAttrsIntrinsic<[LLVMMatchType<0>],
-                                                          [llvm_anyvector_ty, llvm_anyvector_ty],
-                                                          [IntrNoMem]>;
+                                                          [llvm_anyvector_ty,
+                                                           llvm_anyvector_ty],
+                                                          [IntrNoMem,
+                                                           IntrSpeculatable,
+                                                           IntrWillReturn]>;
 
 //===----------------- Pointer Authentication Intrinsics ------------------===//
 //
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll b/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
index d73900dcd0bea..83b494a1be0d3 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/veclib-function-calls.ll
@@ -2288,7 +2288,7 @@ define void @tgamma_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
 }
 ;.
 ; CHECK: attributes #[[ATTR0]] = { "target-features"="+v" }
-; CHECK: attributes #[[ATTR1:[0-9]+]] = { nocallback nofree nosync nounwind willreturn memory(none) }
+; CHECK: attributes #[[ATTR1:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
 ; CHECK: attributes #[[ATTR2]] = { "vector-function-abi-variant"="_ZGVrNxv_acos(Sleef_acosdx_u10rvvm2)" }
 ; CHECK: attributes #[[ATTR3]] = { "vector-function-abi-variant"="_ZGVrNxv_acosf(Sleef_acosfx_u10rvvm2)" }
 ; CHECK: attributes #[[ATTR4]] = { "vector-function-abi-variant"="_ZGVrNxv_acosh(Sleef_acoshdx_u10rvvm2)" }
diff --git a/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll b/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
index 9acc6d6601292..09f583f9242d5 100644
--- a/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
+++ b/llvm/test/Transforms/PreISelIntrinsicLowering/AArch64/expand-exp.ll
@@ -39,5 +39,4 @@ declare <4 x float> @llvm.exp.v4f32(<4 x float>) #0
 declare <vscale x 4 x float> @llvm.exp.nxv4f32(<vscale x 4 x float>) #0
 
 ; CHECK: attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
-; CHECK-NEXT: attributes #1 = { nocallback nofree nosync nounwind willreturn memory(none) }
 attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

@artagnon artagnon changed the title [IR] Mark vector intrinsics speculatable, willreturn [IR] Mark vector intrinsics speculatable Oct 7, 2025
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 7, 2025
@artagnon artagnon removed the request for review from fhahn October 8, 2025 08:50
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to test this with a case where we hoist a speculative vector intrinsic (and not hoist the remainder/division

@artagnon
Copy link
Contributor Author

artagnon commented Oct 8, 2025

Would be good to test this with a case where we hoist a speculative vector intrinsic (and not hoist the remainder/division

I think we hoist based on Instruction::mayHaveSideEffects, which is unrelated to speculatable in LICM? Before the change, I have the following LICM test: not sure what I'm missing?

define i32 @vp_udiv(<2 x i32> %inv.q, <2 x i32> %inv.d) {
; CHECK-LABEL: define i32 @vp_udiv(
; CHECK-SAME: <2 x i32> [[INV_Q:%.*]], <2 x i32> [[INV_D:%.*]]) {
; CHECK-NEXT:  [[ENTRY:.*]]:
; CHECK-NEXT:    [[VP_UDIV:%.*]] = call <2 x i32> @llvm.vp.udiv.v2i32(<2 x i32> [[INV_Q]], <2 x i32> [[INV_D]], <2 x i1> splat (i1 true), i32 2)
; CHECK-NEXT:    [[EXTRACT:%.*]] = extractelement <2 x i32> [[VP_UDIV]], i32 0
; CHECK-NEXT:    br label %[[LOOP:.*]]
; CHECK:       [[LOOP]]:
; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
; CHECK-NEXT:    [[LOOP_COND:%.*]] = icmp ult i32 [[IV]], [[EXTRACT]]
; CHECK-NEXT:    [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT:    br i1 [[LOOP_COND]], label %[[LOOP]], label %[[EXIT:.*]]
; CHECK:       [[EXIT]]:
; CHECK-NEXT:    [[IV_LCSSA:%.*]] = phi i32 [ [[IV]], %[[LOOP]] ]
; CHECK-NEXT:    ret i32 [[IV_LCSSA]]
;
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
  %vp.udiv = call <2 x i32> @llvm.vp.udiv.v2i32(<2 x i32> %inv.q, <2 x i32> %inv.d, <2 x i1> splat (i1 1), i32 2)
  %extract = extractelement <2 x i32> %vp.udiv, i32 0
  %loop.cond = icmp ult i32 %iv, %extract
  %iv.next = add i32 %iv, 1
  br i1 %loop.cond, label %loop, label %exit

exit:
  ret i32 %iv
}

@nikic
Copy link
Contributor

nikic commented Oct 8, 2025

@artagnon You need to place the intrinsic in a code path that is not unconditionally executed on loop entry. (E.g. after early exit or abnormal exit.)

The vector intrinsics in questions have no undefined behavior, and have
no other effect besides returning the result, and do return a result:
they should hence be marked speculatable and willreturn.

At the moment, there are no optimizations that are enabled by these
attributes, and the patch is mainly just documentation.
@artagnon artagnon force-pushed the ir-vector-intrinsic-attr branch from f233c12 to 1dcf6ff Compare October 8, 2025 10:37
// CHECK-C-NEXT: [[ADD:%.*]] = add i32 [[SLICE_BASE]], [[TMP3]]
// CHECK-C-NEXT: [[TMP5:%.*]] = add i32 [[ADD]], 15
// CHECK-C-NEXT: tail call void @llvm.aarch64.sme.ld1b.horiz(<vscale x 16 x i1> [[PG]], ptr [[TMP2]], i32 0, i32 [[TMP5]])
// CHECK-C-NEXT: [[SVL:%.*]] = shl nuw nsw i64 [[TMP0]], 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the changes here. How does marking intrinsics speculatable result in more nowrap flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InstCombine is doing this:

*** IR Dump After InstCombinePass on test_svstnt1_vnum_u8_x2 ***

 entry:
-  %0 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } poison, <vscale x 16 x i8> %v.coerce0, 0
-  %1 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %0, <vscale x 16 x i8> %v.coerce1, 1
-  %2 = getelementptr <vscale x 16 x i8>, ptr %base, i64 %vnum
+  %0 = call i64 @llvm.vscale.i64()
+  %1 = shl nuw nsw i64 %0, 4
+  %.idx = mul i64 %vnum, %1
+  %2 = getelementptr i8, ptr %base, i64 %.idx
   call void @llvm.aarch64.sve.stnt1.pn.x2.nxv16i8(<vscale x 16 x i8> %v.coerce0, <vscale x 16 x i8> %v.coerce1, target("aarch64.svcount") %pn, ptr %2)
   ret void

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not able to figure out the exact code snippet in InstCombine that's responsible for this :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like ptradd canonicalization. But I still don't understand why this patch changes anything about the nowrap handling here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the variable names I wonder if it's a case of no longer loosing the wrap flags rather than gaining them? From CodeGenFunction::EmitSMELd1St1:

    llvm::Value *StreamingVectorLengthCall =
        Builder.CreateMul(Builder.CreateCall(StreamingVectorLength),
                          llvm::ConstantInt::get(Int64Ty, 8), "svl",
                          /* HasNUW */ true, /* HasNSW */ true);
    llvm::Value *Mulvl =
        Builder.CreateMul(StreamingVectorLengthCall, Ops[4], "mulvl");

Copy link
Contributor Author

@artagnon artagnon Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I finally found the piece of code that sets nuw/nsw. First, we do PtrAdd canonicalization like Nikita said:

Value *llvm::emitGEPOffset(IRBuilderBase *Builder, const DataLayout &DL,
                           User *GEP, bool NoAssumptions) {
  // ...
  // nusw implies nsw for the offset arithmetic.
  bool NSW = GEPOp->hasNoUnsignedSignedWrap() && !NoAssumptions;
  bool NUW = GEPOp->hasNoUnsignedWrap() && !NoAssumptions;

Here, nuw/nsw doesn't get set because the original GEP doesn't have nuw/nusw. However, this creates a shufflevector, which is later handled by:

// Splatting the first element of the result of a BinOp, where any of the
// BinOp's operands are the result of a first element splat can be simplified to
// splatting the first element of the result of the BinOp
Instruction *InstCombinerImpl::simplifyBinOpSplats(ShuffleVectorInst &SVI) {
  // ...
  auto *BinOp = cast<BinaryOperator>(Op0);
  if (!isSafeToSpeculativelyExecuteWithVariableReplaced(BinOp))
    return nullptr;

  Value *NewBO = Builder.CreateBinOp(BinOp->getOpcode(), X, Y);
  if (auto NewBOI = dyn_cast<Instruction>(NewBO))
    NewBOI->copyIRFlags(BinOp);

So yes, I think it is a case of preserving existing nowrap flags, as Paul remarked.

EDIT: isSafeToSpeculativelyExecute has to be called on the vscale (or a CallInst, which is not a BinOp), for there to be changes. Even after hours of digging, it is still mysterious to me.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@artagnon artagnon merged commit 2a02d57 into llvm:main Oct 9, 2025
9 checks passed
@artagnon artagnon deleted the ir-vector-intrinsic-attr branch October 9, 2025 08:42
svkeerthy pushed a commit that referenced this pull request Oct 9, 2025
The vector intrinsics in question have no undefined behavior, and have
no other effect besides returning the result: they should hence be
marked speculatable.
clingfei pushed a commit to clingfei/llvm-project that referenced this pull request Oct 10, 2025
The vector intrinsics in question have no undefined behavior, and have
no other effect besides returning the result: they should hence be
marked speculatable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang Clang issues not falling into any other category llvm:ir llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants