From: "H.J. Lu"
Date: Fri, 15 Apr 2022 14:04:32 -0700
Subject: Re: The trouble with __weak and objtool got worse
To: Peter Zijlstra
Cc: "the arch/x86 maintainers", Josh Poimboeuf, Nick Desaulniers, Miroslav Benes, rostedt@goodmis.org, linux-toolchains@vger.kernel.org

On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra wrote:
>
> Hi,
>
> After chasing my tail for a good few hours this morning I finally
> figured out why code patching is exploding for me in weird and wonderful
> ways... :-(
>
> This problem affects everything where we have external tools (mostly
> objtool but possibly also recordmcount) generate location sections for
> us, notably things like:
>
>   .static_call_sites
>   .retpoline_sites
>   __mcount_loc
>   .orc_unwind
>   .orc_unwind_ip
>
> from individual translation-unit runs.
>
> Consider:
>
> foo-weak.c:
>
>   extern void __SCT__foo(void);
>
>   __attribute__((weak)) void foo(void)
>   {
>           return __SCT__foo();
>   }
>
> foo.c:
>
>   extern void __SCT__foo(void);
>   extern void my_foo(void);
>
>   void foo(void)
>   {
>           my_foo();
>           return __SCT__foo();
>   }
>
> These generate the obvious code
> (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):
>
> foo-weak.o:
> 0000000000000000 <foo>:
>    0:   e9 00 00 00 00          jmpq   5
>                         1: R_X86_64_PLT32       __SCT__foo-0x4
>
> foo.o:
> 0000000000000000 <foo>:
>    0:   48 83 ec 08             sub    $0x8,%rsp
>    4:   e8 00 00 00 00          callq  9
>                         5: R_X86_64_PLT32       my_foo-0x4
>    9:   48 83 c4 08             add    $0x8,%rsp
>    d:   e9 00 00 00 00          jmpq   12
>                         e: R_X86_64_PLT32       __SCT__foo-0x4
>
>
> Now, when we link these two files together, you get something like
> (ld -r -o foos.o foo-weak.o foo.o):
>
> foos.o:
> 0000000000000000 <foo-0x10>:
>    0:   e9 00 00 00 00          jmpq   5
>                         1: R_X86_64_PLT32       __SCT__foo-0x4
>    5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
>    f:   90                      nop
>
> 0000000000000010 <foo>:
>   10:   48 83 ec 08             sub    $0x8,%rsp
>   14:   e8 00 00 00 00          callq  19
>                         15: R_X86_64_PLT32      my_foo-0x4
>   19:   48 83 c4 08             add    $0x8,%rsp
>   1d:   e9 00 00 00 00          jmpq   22
>                         1e: R_X86_64_PLT32      __SCT__foo-0x4
>
> Noting that ld preserves the weak function text, but strips the symbol
> off of it (hence objdump doing that funny negative offset thing). This
> does lead to 'interesting' unused code issues with objtool when run on
> linked objects, but that seems to be working (fingers crossed).
>
> So far so good..
> Now let's consider the objtool static_call output
> section (readelf output, old binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> So we have two patch sites, one in the dead code of the weak foo and one
> in the real foo. All is well.
>
> *HOWEVER*, sometimes it generates things like this (using new enough
> binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> And now we can see how that foos.o .static_call_sites goes sideways, we
> now have _two_ patch sites in foo. One for the weak symbol at foo+0
> (which is no longer a static_call site!) and one at foo+d which is in
> fact the right location.

This can't be right.  "foo" must point to the non-weak function.
Please file a linker bug report.

> This seems to happen when objtool cannot find a section symbol, in which
> case it falls back to any other symbol to key off of, however in this
> case that goes terribly wrong!
>
> Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> this obviously doesn't happen, because the weak stuff has been resolved by
> then. But ideally we'd not force that since it has a build time
> penalty..
>
> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).
>
> Alternatively:
>
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
>
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
>
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'
>
> $ as --version
> GNU assembler (GNU Binutils for Debian) 2.38
>
>
> Opinions?

--
H.J.