@llvm.abs.i64 optimizes differently versus target-specific @llvm.aarch64.neon.abs.i64 #148388

Open
@folkertdev

I'm not sure either lowering is really better than the other, but the difference itself is odd. My reading of https://llvm.org/docs/LangRef.html#llvm-abs-intrinsic is that when the second argument (is_int_min_poison) is false, an INT_MIN input returns INT_MIN, so the behavior is in fact wrapping, like the NEON abs instruction.
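To make that reading concrete, here is a minimal Rust sketch of the wrapping semantics. The function names are hypothetical and only illustrate the arithmetic; this is not how LLVM implements either intrinsic. The first function models the compare-and-conditionally-negate shape, the second uses the standard library's `i64::wrapping_abs`, and under this reading both should agree on every input, including `i64::MIN`.

```rust
/// Hypothetical model of `@llvm.abs.i64(x, i1 false)` under the reading above:
/// negative inputs are negated with two's-complement wraparound, so
/// `i64::MIN` maps to itself instead of being poison.
pub fn abs_int_min_wraps(x: i64) -> i64 {
    if x < 0 {
        // wrapping_neg(i64::MIN) == i64::MIN, because +2^63 is not representable.
        x.wrapping_neg()
    } else {
        x
    }
}

/// The same operation through the standard library, for comparison.
pub fn abs_via_std(x: i64) -> i64 {
    x.wrapping_abs()
}
```

If the NEON instruction wraps in the same way, then the two functions in the IR below compute the same value for every input, and the question is only about which instruction sequence gets selected.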

https://godbolt.org/z/fTvxc4z3z

target triple = "aarch64-unknown-linux-gnu"

define noundef i64 @foo(i64 noundef %a) unnamed_addr {
start:
  %_0.sroa.0.0 = tail call i64 @llvm.abs.i64(i64 %a, i1 false)
  ret i64 %_0.sroa.0.0
}

define noundef i64 @bar(i64 noundef %a) unnamed_addr {
start:
  %_0.i = tail call noundef i64 @llvm.aarch64.neon.abs.i64(i64 noundef %a) #3
  ret i64 %_0.i
}

declare i64 @llvm.aarch64.neon.abs.i64(i64) unnamed_addr #1

declare i64 @llvm.abs.i64(i64, i1 immarg) #2

At -O0 both functions produce the same instructions, but at -O3 they do not:

foo:                                    // @foo
        cmp     x0, #0
        cneg    x0, x0, mi
        ret
bar:                                    // @bar
        fmov    d0, x0
        abs     d0, d0
        fmov    x0, d0
        ret
