linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* The trouble with __weak and objtool got worse
@ 2022-04-15 11:19 Peter Zijlstra
  2022-04-15 14:12 ` Steven Rostedt
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-15 11:19 UTC (permalink / raw)
  To: x86, Josh Poimboeuf
  Cc: hjl.tools, ndesaulniers, mbenes, rostedt, linux-toolchains

Hi,

After chasing my tail for a good few hours this morning I finally
figured out why code patching is exploding for me in weird and wonderful
ways... :-(

This problem affects everything where we have external tools (mostly
objtool but possible also recordmcount) generate location sections for
us, notably things like:

  .static_call_sites
  .retpoline_sites
  __mcount_loc
  .orc_unwind
  .orc_unwind_ip

from individual translation-unit runs.

Consider:

foo-weak.c:

extern void __SCT__foo(void);

__attribute__((weak)) void foo(void)
{
        return __SCT__foo();
}

foo.c:

extern void __SCT__foo(void);
extern void my_foo(void);

void foo(void)
{
        my_foo();
        return __SCT__foo();
}

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4


Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, sometimes it generates things like this (using new enough
binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

Now, if we do IBT/LTO builds, and run objtool on linked objects only,
this obvious doesn't happen, because the weak stuff has been resolved by
then. But ideally we'd not force that since it has a build time
penalty..

The other option seems to be to have objtool add section symbols it
needs, however due to ELF being a total PITA and requiring all LOCAL
symbols to be before GLOBAL symbols, this would mean re-ordering the
whole symbol table (I have the code somewhere :-/).

Alternatively:

  https://sourceware.org/pipermail/binutils/2020-December/114671.html

seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
work, except I'm getting:

$ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
as: unrecognized option '--generate-unused-section-symbols=yes'
as: unrecognized option '--generate-unused-section-symbols=yes'

$ as --version
GNU assembler (GNU Binutils for Debian) 2.38


Opinions?

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-04-17 15:46 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
2022-04-15 14:12 ` Steven Rostedt
2022-04-15 15:31   ` Peter Zijlstra
2022-04-15 15:10 ` Josh Poimboeuf
2022-04-15 15:15   ` Josh Poimboeuf
2022-04-15 15:26 ` Peter Zijlstra
2022-04-15 17:40   ` Nick Desaulniers
2022-04-15 18:21     ` Josh Poimboeuf
2022-04-15 18:23       ` Josh Poimboeuf
2022-04-15 20:36       ` Nick Desaulniers
2022-04-16 10:49         ` Peter Zijlstra
2022-04-16 10:48       ` Peter Zijlstra
2022-04-16 16:07         ` Josh Poimboeuf
2022-04-16 16:32           ` H.J. Lu
2022-04-17 15:44           ` Peter Zijlstra
2022-04-17 15:46             ` Peter Zijlstra
2022-04-15 18:22 ` Segher Boessenkool
2022-04-15 18:36   ` Nick Desaulniers
2022-04-15 20:07     ` Segher Boessenkool
2022-04-15 20:31       ` Nick Desaulniers
2022-04-15 21:17         ` Fangrui Song
2022-04-15 21:41           ` Segher Boessenkool
2022-04-16 11:09       ` Peter Zijlstra
2022-04-16 10:59   ` Peter Zijlstra
2022-04-16 13:20     ` Segher Boessenkool
2022-04-16 17:59       ` Segher Boessenkool
2022-04-15 21:04 ` H.J. Lu
2022-04-16 11:25   ` Peter Zijlstra
2022-04-16 16:27     ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).