Commit 01ad4c8

[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS.
In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4 KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4 KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build.

Abstractly the problem case is:

    .text : {
      *(.text) ;
      ...
      . = ALIGN(SZ_4K);
      __idmap_text_start = .;
      *(.idmap.text)
      __idmap_text_end = .;
      ...
    }

The assertion checks that __idmap_text_end - __idmap_text_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection, so we can't just restrict the fix to OutputSections smaller than 4 KiB.

The fix presented here limits the D71281 size roundup to InputSectionDescriptions that meet the following conditions:
1.) The OutputSection is bigger than the thunkSectionSpacing, so adding thunks will affect the addresses of following code.
2.) The InputSectionDescription is larger than 4 KiB. This avoids tripping any assertion that an InputSectionDescription is < 4 KiB in size.

We do this at ThunkSection creation time, as at this point we know that the addresses are stable and up to date prior to adding the thunks: assignAddresses() will have been called immediately prior to thunk generation.

The fix reverts the two tests affected by D71281 to their original state, as they no longer need the 4 KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing.

Fixes ClangBuiltLinux/linux#812

Differential Revision: https://reviews.llvm.org/D72344
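To see why a single thunk is enough to trip the kernel's check, it may help to work through the arithmetic. The sketch below is illustrative only: `alignUp`, `regionSizeWithThunks` and `idmapAssertHolds` are hypothetical names (not LLD or kernel code), and the byte counts are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo, restricted to power-of-two
// alignments.
inline uint64_t alignUp(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "power of two expected");
  return (value + align - 1) & ~(align - 1);
}

// Size of a region holding `codeBytes` of input sections plus one thunk
// section containing `thunkBytes` of thunks.
inline uint64_t regionSizeWithThunks(uint64_t codeBytes, uint64_t thunkBytes,
                                     bool roundUpThunkSection) {
  uint64_t thunkSectionSize =
      roundUpThunkSection ? alignUp(thunkBytes, 4096) : thunkBytes;
  return codeBytes + thunkSectionSize;
}

// Models the linker-script assertion from the commit message: the region
// between __idmap_text_start and __idmap_text_end must be smaller than 4 KiB.
inline bool idmapAssertHolds(uint64_t regionSize) { return regionSize < 4096; }
```

With 512 bytes of code and a 16-byte thunk, the unrounded region is 528 bytes and the check passes. Rounding the thunk section up to 4096 bytes makes the region 4608 bytes, so the check fails however little code the region contains.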
1 parent c3ab790 commit 01ad4c8

7 files changed: 165 additions, 23 deletions

lld/ELF/Relocations.cpp

Lines changed: 31 additions & 0 deletions
@@ -1744,6 +1744,37 @@ ThunkSection *ThunkCreator::addThunkSection(OutputSection *os,
                                             uint64_t off) {
   auto *ts = make<ThunkSection>(os, off);
   ts->partition = os->partition;
+  if ((config->fixCortexA53Errata843419 || config->fixCortexA8) &&
+      !isd->sections.empty()) {
+    // The errata fixes are sensitive to addresses modulo 4 KiB. When we add
+    // thunks we disturb the base addresses of sections placed after the thunks
+    // this makes patches we have generated redundant, and may cause us to
+    // generate more patches as different instructions are now in sensitive
+    // locations. When we generate more patches we may force more branches to
+    // go out of range, causing more thunks to be generated. In pathological
+    // cases this can cause the address dependent content pass not to converge.
+    // We fix this by rounding up the size of the ThunkSection to 4KiB, this
+    // limits the insertion of a ThunkSection on the addresses modulo 4 KiB,
+    // which means that adding Thunks to the section does not invalidate
+    // errata patches for following code.
+    // Rounding up the size to 4KiB has consequences for code-size and can
+    // trip up linker script defined assertions. For example the linux kernel
+    // has an assertion that what LLD represents as an InputSectionDescription
+    // does not exceed 4 KiB even if the overall OutputSection is > 128 Mib.
+    // We use the heuristic of rounding up the size when both of the following
+    // conditions are true:
+    // 1.) The OutputSection is larger than the ThunkSectionSpacing. This
+    // accounts for the case where no single InputSectionDescription is
+    // larger than the OutputSection size. This is conservative but simple.
+    // 2.) The InputSectionDescription is larger than 4 KiB. This will prevent
+    // any assertion failures that an InputSectionDescription is < 4 KiB
+    // in size.
+    uint64_t isdSize = isd->sections.back()->outSecOff +
+                       isd->sections.back()->getSize() -
+                       isd->sections.front()->outSecOff;
+    if (os->size > target->getThunkSectionSpacing() && isdSize > 4096)
+      ts->roundUpSizeForErrata = true;
+  }
   isd->thunkSections.push_back({ts, pass});
   return ts;
 }
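The heuristic in this hunk can be exercised in isolation. The following is a minimal sketch, not LLD code: `SectionExtent`, `inputSectionDescriptionSize` and `shouldRoundUpForErrata` are hypothetical stand-ins for the `InputSection`, `isd` and `os` objects used above, and the thunk section spacing is passed in as a plain number rather than queried from the target.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of an input section: its offset within the
// OutputSection and its size in bytes.
struct SectionExtent {
  uint64_t outSecOff;
  uint64_t size;
};

// Extent of an InputSectionDescription, computed as in the hunk above: from
// the start of the first section to the end of the last one.
inline uint64_t
inputSectionDescriptionSize(const std::vector<SectionExtent> &sections) {
  assert(!sections.empty() && "caller must check for an empty description");
  return sections.back().outSecOff + sections.back().size -
         sections.front().outSecOff;
}

// Round up only when an errata fix is enabled, the OutputSection is larger
// than the thunk section spacing, and the description exceeds 4 KiB.
inline bool shouldRoundUpForErrata(bool errataFixEnabled, uint64_t osSize,
                                   uint64_t thunkSectionSpacing,
                                   const std::vector<SectionExtent> &sections) {
  if (!errataFixEnabled || sections.empty())
    return false;
  return osSize > thunkSectionSpacing &&
         inputSectionDescriptionSize(sections) > 4096;
}
```

Keeping the decision in one predicate makes both conditions easy to test separately: a description of 512 bytes is never rounded up, whatever the size of the enclosing OutputSection.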

lld/ELF/SyntheticSections.cpp

Lines changed: 1 addition & 6 deletions
@@ -3457,13 +3457,8 @@ ThunkSection::ThunkSection(OutputSection *os, uint64_t off)
   this->outSecOff = off;
 }

-// When the errata patching is on, we round the size up to a 4 KiB
-// boundary. This limits the effect that adding Thunks has on the addresses
-// of the program modulo 4 KiB. As the errata patching is sensitive to address
-// modulo 4 KiB this can prevent further patches from being needed due to
-// Thunk insertion.
 size_t ThunkSection::getSize() const {
-  if (config->fixCortexA53Errata843419 || config->fixCortexA8)
+  if (roundUpSizeForErrata)
     return alignTo(size, 4096);
   return size;
 }
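The reason rounding matters is that a 4 KiB-sized ThunkSection shifts all following code by a whole number of pages, so each instruction keeps its offset within its page. A small sketch of that arithmetic, using hypothetical helper names and made-up addresses rather than anything taken from LLD:

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the shape of ThunkSection::getSize() after this change: the size is
// reported rounded up to 4 KiB only when the flag is set.
inline uint64_t reportedThunkSectionSize(uint64_t size,
                                         bool roundUpSizeForErrata) {
  if (roundUpSizeForErrata)
    return (size + 4095) / 4096 * 4096;
  return size;
}

// New address of code that used to live at `oldAddr` once a thunk section
// holding `thunkBytes` of thunks is inserted somewhere before it.
inline uint64_t addressAfterInsertion(uint64_t oldAddr, uint64_t thunkBytes,
                                      bool roundUpSizeForErrata) {
  uint64_t newAddr =
      oldAddr + reportedThunkSectionSize(thunkBytes, roundUpSizeForErrata);
  // With rounding, the offset within the 4 KiB page is always preserved.
  assert(!roundUpSizeForErrata || newAddr % 4096 == oldAddr % 4096);
  return newAddr;
}
```

For example, code at 0x10ff8 followed by the insertion of a 16-byte thunk moves to 0x11ff8 with rounding (page offset still 0xff8) but to 0x11008 without it (page offset 0x008), which is the kind of shift that can create or remove an erratum sequence.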

lld/ELF/SyntheticSections.h

Lines changed: 4 additions & 0 deletions
@@ -1069,6 +1069,10 @@ class ThunkSection : public SyntheticSection {
   InputSection *getTargetInputSection() const;
   bool assignOffsets();

+  // When true, round up reported size of section to 4 KiB. See comment
+  // in addThunkSection() for more details.
+  bool roundUpSizeForErrata = false;
+
 private:
   std::vector<Thunk *> thunks;
   size_t size = 0;
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+// REQUIRES: aarch64
+// RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux %s -o %t.o
+// RUN: echo "SECTIONS { \
+// RUN: .text 0x10000 : { \
+// RUN: *(.text.01) ; \
+// RUN: . += 0x8000000 ; \
+// RUN: *(.text.02) } \
+// RUN: .foo : { *(.foo_sec) } } " > %t.script
+// RUN: ld.lld -pie --fix-cortex-a53-843419 --script=%t.script %t.o -o %t2
+// RUN: llvm-objdump --no-show-raw-insn -triple=aarch64-linux-gnu -d %t2
+
+
+/// %t2 is > 128 Megabytes, so delete it early.
+// RUN: rm %t2
+
+/// Test case that for an OutputSection larger than the ThunkSectionSpacing
+/// --fix-cortex-a53-843419 will cause the size of the ThunkSection to be
+/// rounded up to the nearest 4KiB
+
+.section .text.01, "ax", %progbits
+.balign 4096
+.globl _start
+.type _start, %function
+_start:
+/// Range extension thunk needed, due to linker script
+bl far_away
+.space 4096 - 12
+
+/// Erratum sequence
+.globl t3_ff8_ldr
+.type t3_ff8_ldr, %function
+t3_ff8_ldr:
+adrp x0, dat
+ldr x1, [x1, #0]
+ldr x0, [x0, :lo12:dat]
+ret
+
+/// Expect thunk and patch to be inserted here
+// CHECK: 0000000000011008 __AArch64ADRPThunk_far_away:
+// CHECK-NEXT: 11008: adrp x16, #134221824
+// CHECK-NEXT: add x16, x16, #16
+// CHECK-NEXT: br x16
+// CHECK: 0000000000012008 __CortexA53843419_11000:
+// CHECK-NEXT: 12008: ldr x0, [x0, #168]
+// CHECK-NEXT: b #-4104 <t3_ff8_ldr+0xc>
+
+.section .text.02, "ax", %progbits
+.globl far_away
+.type far_away, function
+far_away:
+bl _start
+ret
+/// Expect thunk for _start not to have size rounded up to 4KiB as it is at
+/// the end of the OutputSection
+// CHECK: 0000000008012010 far_away:
+// CHECK-NEXT: 8012010: bl #8
+// CHECK-NEXT: ret
+// CHECK: 0000000008012018 __AArch64ADRPThunk__start:
+// CHECK-NEXT: 8012018: adrp x16, #-134225920
+// CHECK-NEXT: add x16, x16, #0
+// CHECK-NEXT: br x16
+// CHECK: 0000000008012024 foo:
+// CHECK-NEXT: 8012024: ret
+.section .foo_sec, "ax", %progbits
+.globl foo
+.type foo, function
+foo:
+ret
+
+
+.section .data
+.balign 8
+.globl dat
+dat: .quad 0

lld/test/ELF/aarch64-cortex-a53-843419-thunk.s

Lines changed: 5 additions & 8 deletions
@@ -5,7 +5,6 @@
 // RUN: .text2 0x8010000 : { *(.text.04) } } " > %t.script
 // RUN: ld.lld --script %t.script -fix-cortex-a53-843419 -verbose %t.o -o %t2 \
 // RUN: 2>&1 | FileCheck -check-prefix=CHECK-PRINT %s
-
 // RUN: llvm-objdump --no-show-raw-insn -triple=aarch64-linux-gnu -d %t2 | FileCheck %s

 /// %t2 is 128 Megabytes, so delete it early.
@@ -23,11 +22,9 @@
 _start:
 bl far_away
 /// Thunk to far_away, size 16-bytes goes here.
-/// Thunk Section with patch enabled has its size rounded up to 4KiB
-/// this leaves the address of following sections the same modulo 4 KiB

 .section .text.02, "ax", %progbits
-.space 4096 - 12
+.space 4096 - 28

 /// Erratum sequence will only line up at address 0 modulo 0xffc when
 /// Thunk is inserted.
@@ -40,13 +37,13 @@ t3_ff8_ldr:
 ldr x0, [x0, :got_lo12:dat]
 ret

-// CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 11FF8 in unpatched output.
-// CHECK: 0000000000011ff8 t3_ff8_ldr:
-// CHECK-NEXT: adrp x0, #134213632
+// CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 10FF8 in unpatched output.
+// CHECK: 0000000000010ff8 t3_ff8_ldr:
+// CHECK-NEXT: adrp x0, #134217728
 // CHECK-NEXT: ldr x1, [x1]
 // CHECK-NEXT: b #8
 // CHECK-NEXT: ret
-// CHECK: 0000000000012008 __CortexA53843419_12000:
+// CHECK: 0000000000011008 __CortexA53843419_11000:
 // CHECK-NEXT: ldr x0, [x0, #8]
 // CHECK-NEXT: b #-8
 .section .text.04, "ax", %progbits
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+// REQUIRES: arm
+// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
+// RUN: ld.lld --fix-cortex-a8 --shared %t.o -o %t2
+// RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s
+
+/// Test case that for an OutputSection larger than the ThunkSectionSpacing
+/// --fix-cortex-a8 will cause the size of the ThunkSection to be rounded up to
+/// the nearest 4KiB
+.thumb
+
+.section .text.01, "ax", %progbits
+.balign 4096
+.globl _start
+.type _start, %function
+_start:
+/// state change thunk required
+b.w arm_func
+thumb_target:
+.space 4096 - 10
+/// erratum patch needed
+nop.w
+b.w thumb_target
+
+/// Expect thunk and patch to be inserted here
+// CHECK: 00003004 __ThumbV7PILongThunk_arm_func:
+// CHECK-NEXT: 3004: movw r12, #4088
+// CHECK-NEXT: movt r12, #256
+// CHECK-NEXT: add r12, pc
+// CHECK-NEXT: bx r12
+// CHECK: 00004004 __CortexA8657417_2FFE:
+// CHECK-NEXT: 4004: b.w #-8196
+.section .text.02
+/// Take us over thunk section spacing
+.space 16 * 1024 * 1024
+
+.section .text.03, "ax", %progbits
+.arm
+.balign 4
+.type arm_func, %function
+arm_func:
+bx lr

lld/test/ELF/arm-fix-cortex-a8-thunk.s

Lines changed: 9 additions & 9 deletions
@@ -1,7 +1,7 @@
 // REQUIRES: arm
 // RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
 // RUN: echo "SECTIONS { \
-// RUN: .text0 0x01200a : { *(.text.00) } \
+// RUN: .text0 0x011006 : { *(.text.00) } \
 // RUN: .text1 0x110000 : { *(.text.01) *(.text.02) *(.text.03) \
 // RUN: *(.text.04) } \
 // RUN: .text2 0x210000 : { *(.text.05) } } " > %t.script
@@ -32,7 +32,7 @@ _start:
 // CHECK-NEXT: bx r12

 .section .text.02, "ax", %progbits
-.space 4096 - 10
+.space 4096 - 22

 .section .text.03, "ax", %progbits
 .thumb_func
@@ -43,21 +43,21 @@ target:
 bl target

 /// Expect erratum patch inserted here
-// CHECK: 00111ffa target:
-// CHECK-NEXT: 111ffa: nop.w
+// CHECK: 00110ffa target:
+// CHECK-NEXT: 110ffa: nop.w
 // CHECK-NEXT: bl #2
-// CHECK: 00112004 __CortexA8657417_111FFE:
-// CHECK-NEXT: 112004: b.w #-14
+// CHECK: 00111004 __CortexA8657417_110FFE:
+// CHECK-NEXT: 111004: b.w #-14

 /// Expect range extension thunk here.
-// CHECK: 00112008 __ThumbV7PILongThunk_early:
-// CHECK-NEXT: 112008: b.w #-1048578
+// CHECK: 00111008 __ThumbV7PILongThunk_early:
+// CHECK-NEXT: 111008: b.w #-1048582

 .section .text.04, "ax", %progbits
 /// The erratum patch will push this branch out of range, so another
 /// range extension thunk will be needed.
 beq.w early
-// CHECK: 113008: beq.w #-4100
+// CHECK: 11100c: beq.w #-8

 .section .text.05, "ax", %progbits
 .arm
