From: "H.J. Lu" <hjl.tools@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "the arch/x86 maintainers" <x86@kernel.org>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Nick Desaulniers <ndesaulniers@google.com>,
Miroslav Benes <mbenes@suse.cz>,
rostedt@goodmis.org, linux-toolchains@vger.kernel.org
Subject: Re: The trouble with __weak and objtool got worse
Date: Fri, 15 Apr 2022 14:04:32 -0700
Message-ID: <CAMe9rOrj7Hdkg_TtABAgAuLsr=g2riRs9vicV79ybfr5y7pmYQ@mail.gmail.com>
In-Reply-To: <YllUqPK4CWZeHku8@hirez.programming.kicks-ass.net>
On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Hi,
>
> After chasing my tail for a good few hours this morning I finally
> figured out why code patching is exploding for me in weird and wonderful
> ways... :-(
>
> This problem affects everything where we have external tools (mostly
> objtool but possibly also recordmcount) generate location sections for
> us, notably things like:
>
> .static_call_sites
> .retpoline_sites
> __mcount_loc
> .orc_unwind
> .orc_unwind_ip
>
> from individual translation-unit runs.
>
> Consider:
>
> foo-weak.c:
>
> extern void __SCT__foo(void);
>
> __attribute__((weak)) void foo(void)
> {
> 	return __SCT__foo();
> }
>
> foo.c:
>
> extern void __SCT__foo(void);
> extern void my_foo(void);
>
> void foo(void)
> {
> 	my_foo();
> 	return __SCT__foo();
> }
>
> These generate the obvious code
> (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):
>
> foo-weak.o:
> 0000000000000000 <foo>:
> 0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4
>
> foo.o:
> 0000000000000000 <foo>:
> 0: 48 83 ec 08 sub $0x8,%rsp
> 4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4
> 9: 48 83 c4 08 add $0x8,%rsp
> d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4
>
>
> Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):
>
> foos.o:
> 0000000000000000 <foo-0x10>:
> 0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4
> 5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1)
> f: 90 nop
>
> 0000000000000010 <foo>:
> 10: 48 83 ec 08 sub $0x8,%rsp
> 14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4
> 19: 48 83 c4 08 add $0x8,%rsp
> 1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4
>
> Note that ld preserves the weak function text, but strips the symbol
> off of it (hence objdump doing that funny negative offset thing). This
> does lead to 'interesting' unused code issues with objtool when run on
> linked objects, but that seems to be working (fingers crossed).
>
> So far so good.. Now let's consider the objtool static_call output
> section (readelf output, old binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
> 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d
> 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> So we have two patch sites, one in the dead code of the weak foo and one
> in the real foo. All is well.
>
> *HOWEVER*, sometimes it generates things like this (using new enough
> binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0
> 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
> 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d
> 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
>
> And now we can see how that foos.o .static_call_sites goes sideways: we
> now have _two_ patch sites in foo. One for the weak symbol at foo+0
> (which is no longer a static_call site!) and one at foo+d which is in
> fact the right location.

This can't be right. "foo" must point to the non-weak function.
Please file a linker bug report.

> This seems to happen when objtool cannot find a section symbol, in which
> case it falls back to any other symbol to key off of, however in this
> case that goes terribly wrong!
>
> Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> this obviously doesn't happen, because the weak stuff has been resolved by
> then. But ideally we'd not force that since it has a build time
> penalty..
>
> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).
>
> Alternatively:
>
> https://sourceware.org/pipermail/binutils/2020-December/114671.html
>
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
>
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'
>
> $ as --version
> GNU assembler (GNU Binutils for Debian) 2.38
>
>
> Opinions?
--
H.J.