linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "the arch/x86 maintainers" <x86@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Miroslav Benes <mbenes@suse.cz>,
	rostedt@goodmis.org, linux-toolchains@vger.kernel.org
Subject: Re: The trouble with __weak and objtool got worse
Date: Fri, 15 Apr 2022 14:04:32 -0700	[thread overview]
Message-ID: <CAMe9rOrj7Hdkg_TtABAgAuLsr=g2riRs9vicV79ybfr5y7pmYQ@mail.gmail.com> (raw)
In-Reply-To: <YllUqPK4CWZeHku8@hirez.programming.kicks-ass.net>

On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Hi,
>
> After chasing my tail for a good few hours this morning I finally
> figured out why code patching is exploding for me in weird and wonderful
> ways... :-(
>
> This problem affects everything where we have external tools (mostly
> objtool but possible also recordmcount) generate location sections for
> us, notably things like:
>
>   .static_call_sites
>   .retpoline_sites
>   __mcount_loc
>   .orc_unwind
>   .orc_unwind_ip
>
> from individual translation-unit runs.
>
> Consider:
>
> foo-weak.c:
>
> extern void __SCT__foo(void);
>
> __attribute__((weak)) void foo(void)
> {
>         return __SCT__foo();
> }
>
> foo.c:
>
> extern void __SCT__foo(void);
> extern void my_foo(void);
>
> void foo(void)
> {
>         my_foo();
>         return __SCT__foo();
> }
>
> These generate the obvious code
> (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):
>
> foo-weak.o:
> 0000000000000000 <foo>:
>    0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4
>
> foo.o:
> 0000000000000000 <foo>:
>    0:   48 83 ec 08             sub    $0x8,%rsp
>    4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
>    9:   48 83 c4 08             add    $0x8,%rsp
>    d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4
>
>
> Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):
>
> foos.o:
> 0000000000000000 <foo-0x10>:
>    0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
>    5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
>    f:   90                      nop
>
> 0000000000000010 <foo>:
>   10:   48 83 ec 08             sub    $0x8,%rsp
>   14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
>   19:   48 83 c4 08             add    $0x8,%rsp
>   1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4
>
> Noting that ld preserves the weak function text, but strips the symbol
> off of it (hence objdump doing that funny negative offset thing). This
> does lead to 'interesting' unused code issues with objtool when ran on
> linked objects, but that seems to be working (fingers crossed).
>
> So far so good.. Now lets consider the objtool static_call output
> section (readelf output, old binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> So we have two patch sites, one in the dead code of the weak foo and one
> in the real foo. All is well.
>
> *HOWEVER*, sometimes it generates things like this (using new enough
> binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> And now we can see how that foos.o .static_call_sites goes side-ways, we
> now have _two_ patch sites in foo. One for the weak symbol at foo+0
> (which is no longer a static_call site!) and one at foo+d which is in
> fact the right location.

This can't be right.  "foo" must point to the non-weak function.
Please file a linker bug report.

> This seems to happen when objtool cannot find a section symbol, in which
> case it falls back to any other symbol to key off of, however in this
> case that goes terribly wrong!
>
> Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> this obvious doesn't happen, because the weak stuff has been resolved by
> then. But ideally we'd not force that since it has a build time
> penalty..
>
> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).
>
> Alternatively:
>
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
>
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
>
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'
>
> $ as --version
> GNU assembler (GNU Binutils for Debian) 2.38
>
>
> Opinions?



-- 
H.J.

  parent reply	other threads:[~2022-04-15 21:06 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
2022-04-15 14:12 ` Steven Rostedt
2022-04-15 15:31   ` Peter Zijlstra
2022-04-15 15:10 ` Josh Poimboeuf
2022-04-15 15:15   ` Josh Poimboeuf
2022-04-15 15:26 ` Peter Zijlstra
2022-04-15 17:40   ` Nick Desaulniers
2022-04-15 18:21     ` Josh Poimboeuf
2022-04-15 18:23       ` Josh Poimboeuf
2022-04-15 20:36       ` Nick Desaulniers
2022-04-16 10:49         ` Peter Zijlstra
2022-04-16 10:48       ` Peter Zijlstra
2022-04-16 16:07         ` Josh Poimboeuf
2022-04-16 16:32           ` H.J. Lu
2022-04-17 15:44           ` Peter Zijlstra
2022-04-17 15:46             ` Peter Zijlstra
2022-04-15 18:22 ` Segher Boessenkool
2022-04-15 18:36   ` Nick Desaulniers
2022-04-15 20:07     ` Segher Boessenkool
2022-04-15 20:31       ` Nick Desaulniers
2022-04-15 21:17         ` Fangrui Song
2022-04-15 21:41           ` Segher Boessenkool
2022-04-16 11:09       ` Peter Zijlstra
2022-04-16 10:59   ` Peter Zijlstra
2022-04-16 13:20     ` Segher Boessenkool
2022-04-16 17:59       ` Segher Boessenkool
2022-04-15 21:04 ` H.J. Lu [this message]
2022-04-16 11:25   ` Peter Zijlstra
2022-04-16 16:27     ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMe9rOrj7Hdkg_TtABAgAuLsr=g2riRs9vicV79ybfr5y7pmYQ@mail.gmail.com' \
    --to=hjl.tools@gmail.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=mbenes@suse.cz \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).