Closed
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-codegen: Area: Code generation
C-bug: Category: This is a bug.
I-heavy: Issue: Problems and improvements with respect to binary size of generated code.
O-riscv: Target: RISC-V architecture
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
I tried this code:
pub fn from_be_slice_manual(bytes: &[u8; 4]) -> u32 {
    (bytes[0] as u32) << 24
    | ((bytes[1] as u32) << 16)
    | ((bytes[2] as u32) << 8)
    | (bytes[3] as u32)
}
pub fn from_be_slice_intrinsic(bytes: &[u8; 4]) -> u32 {
    u32::from_be_bytes(*bytes)
}
I expected to see this happen: Both functions should compile to the same assembly, or at the very least the one using the intrinsic should be acceptable. On x86 the two produce identical output, and on arm-unknown-linux-gnueabi the output differs but is not terrible. On riscv64gc-unknown-linux-gnu, however, the asm generated for the intrinsic version is massive (26 instructions versus 11 for the manual version).
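For reference, a quick sanity check that the two functions are semantically identical, so any difference in the output is purely a codegen matter. This is only a sketch: the sample inputs and the `main` wrapper are my own additions and not part of the original reproduction.

```rust
// The two functions from the report, unchanged.
pub fn from_be_slice_manual(bytes: &[u8; 4]) -> u32 {
    (bytes[0] as u32) << 24
        | ((bytes[1] as u32) << 16)
        | ((bytes[2] as u32) << 8)
        | (bytes[3] as u32)
}

pub fn from_be_slice_intrinsic(bytes: &[u8; 4]) -> u32 {
    u32::from_be_bytes(*bytes)
}

fn main() {
    // Hypothetical sample inputs, chosen to cover high bits in every byte.
    let samples: [[u8; 4]; 4] = [
        [0x00, 0x00, 0x00, 0x00],
        [0x12, 0x34, 0x56, 0x78],
        [0xFF, 0x80, 0x7F, 0x01],
        [0xFF, 0xFF, 0xFF, 0xFF],
    ];
    for bytes in &samples {
        // Both implementations must agree on every input.
        assert_eq!(from_be_slice_manual(bytes), from_be_slice_intrinsic(bytes));
    }
    println!("{:#010x}", from_be_slice_manual(&[0x12, 0x34, 0x56, 0x78]));
}
```

Since both return the same value for every input, there should be nothing preventing the backend from emitting the shorter sequence for either one.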
Instead, this happened:
example::from_be_slice_manual:
        lb      a1, 0(a0)
        lbu     a2, 1(a0)
        slli    a1, a1, 24
        lbu     a3, 2(a0)
        slli    a2, a2, 16
        lbu     a0, 3(a0)
        or      a1, a1, a2
        slli    a2, a3, 8
        or      a1, a1, a2
        or      a0, a0, a1
        ret
example::from_be_slice_intrinsic:
        lbu     a1, 1(a0)
        lbu     a2, 0(a0)
        lb      a3, 3(a0)
        lbu     a0, 2(a0)
        slli    a1, a1, 8
        or      a1, a1, a2
        slli    a2, a3, 8
        or      a0, a0, a2
        slli    a0, a0, 16
        or      a0, a0, a1
        slli    a1, a0, 8
        addi    a2, zero, 255
        slli    a3, a2, 32
        and     a1, a1, a3
        slli    a3, a0, 24
        slli    a4, a2, 40
        and     a3, a3, a4
        or      a1, a1, a3
        slli    a3, a0, 40
        slli    a2, a2, 48
        and     a2, a2, a3
        slli    a0, a0, 56
        or      a0, a0, a2
        or      a0, a0, a1
        srli    a0, a0, 32
        ret
Meta
rustc --version --verbose:
rustc 1.55.0 (c8dfcfe04 2021-09-06)
binary: rustc
commit-hash: c8dfcfe046a7680554bf4eb612bad840e7631c4b
commit-date: 2021-09-06
host: x86_64-unknown-linux-gnu
release: 1.55.0
LLVM version: 12.0.1
Godbolt link: https://godbolt.org/z/aPPdnond5
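My reading of the `from_be_slice_intrinsic` listing above, written out as Rust: the code assembles the four bytes as a little-endian, sign-extended 32-bit value, then computes the upper half of a 64-bit byte swap with shift-and-mask steps, and finally shifts right by 32. This is only a model of what the instruction sequence appears to do, not a statement about how LLVM arrives at it; the function name `from_be_slice_rv64_model` is made up for illustration.

```rust
// Hypothetical model of the RV64 instruction sequence shown above.
pub fn from_be_slice_rv64_model(bytes: &[u8; 4]) -> u32 {
    // lbu/lb + slli/or: little-endian load; the top byte is loaded with
    // `lb`, so the 32-bit value ends up sign-extended to 64 bits.
    let x = i32::from_le_bytes(*bytes) as i64 as u64;

    // Upper four bytes of a 64-bit byte swap. The lower four bytes are
    // never computed because the final shift discards them.
    let swapped_hi = ((x << 8) & (0xFF_u64 << 32))
        | ((x << 24) & (0xFF_u64 << 40))
        | ((x << 40) & (0xFF_u64 << 48))
        | (x << 56);

    // srli a0, a0, 32
    (swapped_hi >> 32) as u32
}

fn main() {
    let bytes = [0x12, 0x34, 0x56, 0x78];
    // The model should agree with the standard library conversion.
    assert_eq!(from_be_slice_rv64_model(&bytes), u32::from_be_bytes(bytes));
    println!("{:#010x}", from_be_slice_rv64_model(&bytes));
}
```

If this reading is right, the extra instructions come from the byte swap being expanded at 64-bit width after the load, whereas the manual version places each byte directly at its final position.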