From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [GIT PULL] objtool fixes
Date: Thu, 3 Apr 2025 16:52:57 +0200 [thread overview]
Message-ID: <Z-6gyQk2WlHc4DNw@gmail.com> (raw)
In-Reply-To: <n7p2rtrq6vvfteu5szlubng4wj6akgn45suekjdxojrpuxr6dp@oxjfxawkv3xs>
* Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> On Wed, Apr 02, 2025 at 10:58:15AM -0700, Linus Torvalds wrote:
> > Apparently nobody else ever looks at generated code, but dammit, the
> > clac/stac code generation has turned the ugly up to 11.
BTW., I do look at generated code, and I know others do too, but at the
.o most of the time, not at the .s code, via two ways:
1) objdump --disassemble-all .o
2) perf top's live kernel function annotation and disassembler
feature that uses /dev/mem. (While turning off KASLR,
ptr-obfuscation, turning perf_event_paranoid up to 11^H -1,
etc.)
Such output is more readable to me:
# [ __memmove_avx_unaligned_erms annotated screen: ]
0.00 1ea: vzeroupper
← retq
nop
1f0: testq %rcx,%rcx
↑ je 1ea
1f5: vmovdqu 0x20(%rsi), %ymm5
vmovdqu 0x40(%rsi), %ymm6
leaq -0x81(%rdi,%rdx),%rcx
vmovdqu 0x60(%rsi), %ymm7
vmovdqu -0x20(%rsi,%rdx), %ymm8
subq %rdi,%rsi
andq $-0x20,%rcx
addq %rcx,%rsi
nop
5.87 220: vmovdqu 0x60(%rsi), %ymm1
8.09 vmovdqu 0x40(%rsi), %ymm2
10.34 vmovdqu 0x20(%rsi), %ymm3
11.29 vmovdqu (%rsi), %ymm4
13.45 addq $-0x80,%rsi
13.47 vmovdqa %ymm1,0x60(%rcx)
11.26 vmovdqa %ymm2,0x40(%rcx)
9.16 vmovdqa %ymm3,0x20(%rcx)
7.45 vmovdqa %ymm4,(%rcx)
4.98 addq $-0x80,%rcx
4.57 cmpq %rcx,%rdi
↑ jb 220
vmovdqu %ymm0, (%rdi)
vmovdqu %ymm5, 0x20(%rdi)
vmovdqu %ymm6, 0x40(%rdi)
vmovdqu %ymm7, 0x60(%rdi)
vmovdqu %ymm8, -0x20(%rdx,%rdi)
vzeroupper
← retq
nop
nop
( and the 'P' key within perf-top will print this out into the separate
__memmove_avx_unaligned_erms.annotation for editor based inspection. )
because:
- this kind of disassembler output is more standardized,
regardless of compiler used,
- it's generally less messy looking,
- it gives ground-truth instead of being some intermediate layer
in the toolchain that might or might not be the real deal,
- it shows alignment and the various other effects linkers may
apply,
- there's profiling data overlaid if it's a perf top run,
- and on a live kernel it also sees through the kernel's various
layers of runtime patching code obfuscation facilities, also
known as: alternative-instructions, tracepoints and jump labels.
The compiler's .s is really a last-ditch way to look at generated
machine code, for me at least.
> >
> > Yes, the altinstruction replacement thing was already making the
> > generated asm hard to read, but now it's *also* adding this garbage to
> > it:
> >
> > 911:
> > .pushsection .discard.annotate_insn,"M",@progbits,8
> > .long 911b - .
> > .long 6
> > .popsection
> >
> > which is just pure unadulterated pointless noise.
> >
> > That "annotation #6" is WORTHLESS.
> >
> > Dammit, objtool could have figured that annotation out ON ITS OWN
> > without generating shit in our code. It's not like it doesn't already
> > look at alternatives, and it's not like it couldn't just have seen
> > "oh, look, it's a nop instruction with a clac/stac instruction as an
> > alternative".
>
> Ugh, fragile hard-coded special cases like that are exactly what we're
> trying to get away from. They make the code unmaintainable and they end
> up triggering false positives, just like the one fixed by that patch in
> the first place.
So the problem is, by reducing objtool complexity a bit:
# 1154bbd326de objtool: Fix X86_FEATURE_SMAP alternative handling
8 files changed, 37 insertions(+), 89 deletions(-)
and fixing these two false positive warnings where the 'objtool looks
at the alternatives code' heuristics failed:
arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8ec: __stack_chk_fail() missing __noreturn in .c/.h or NORETURN() in noreturns.h
arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8f1: unreachable instruction
... we have now uglified the kernel's .s asm output space for all
affected stac()/clac() using functions:
starship:~/tip> diff -up usercopy.s.before usercopy.s.after
--- usercopy.s.before 2025-04-03 16:33:07.286944538 +0200
+++ usercopy.s.after 2025-04-03 16:32:41.770041092 +0200
@@ -78,11 +78,16 @@ copy_from_user_nmi:
# ./include/linux/uaccess.h:244: current->pagefault_disabled++;
addl $1, 2748(%rdx) #, _25->pagefault_disabled
# ./include/linux/uaccess.h:266: barrier();
-# ./arch/x86/include/asm/smap.h:35: alternative("", "stac", X86_FEATURE_SMAP);
+# ./arch/x86/include/asm/smap.h:35: alternative(ANNOTATE_IGNORE_ALTERNATIVE "", "stac", X86_FEATURE_SMAP);
#APP
# 35 "./arch/x86/include/asm/smap.h" 1
# ALT: oldinstr
771:
+ 911:
+ .pushsection .discard.annotate_insn,"M",@progbits,8
+ .long 911b - .
+ .long 6
+ .popsection
772:
# ALT: padding
@@ -140,10 +145,15 @@ copy_from_user_nmi:
.popsection
# 0 "" 2
-# ./arch/x86/include/asm/smap.h:29: alternative("", "clac", X86_FEATURE_SMAP);
+# ./arch/x86/include/asm/smap.h:29: alternative(ANNOTATE_IGNORE_ALTERNATIVE "", "clac", X86_FEATURE_SMAP);
# 29 "./arch/x86/include/asm/smap.h" 1
# ALT: oldinstr
771:
+ 911:
+ .pushsection .discard.annotate_insn,"M",@progbits,8
+ .long 911b - .
+ .long 6
+ .popsection
772:
# ALT: padding
Is there a way to make this more compact, more readable?
Or if not, we just have to do the more fragile thing I suspect?
It's a tool after all.
Thanks,
Ingo
next prev parent reply other threads:[~2025-04-03 14:53 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-01 19:57 [GIT PULL] objtool fixes Ingo Molnar
2025-04-02 17:48 ` pr-tracker-bot
2025-04-02 17:58 ` Linus Torvalds
2025-04-02 20:30 ` Ingo Molnar
2025-04-03 8:31 ` Josh Poimboeuf
2025-04-03 14:52 ` Ingo Molnar [this message]
2025-04-03 15:23 ` Josh Poimboeuf
2025-04-03 15:33 ` Ingo Molnar
2025-04-03 15:43 ` Josh Poimboeuf
2025-04-03 16:06 ` Linus Torvalds
2025-04-03 18:24 ` Josh Poimboeuf
2025-04-03 19:12 ` Linus Torvalds
2025-04-03 19:42 ` Josh Poimboeuf
2025-04-03 15:58 ` Linus Torvalds
2025-04-03 15:39 ` Linus Torvalds
-- strict thread matches above, loose matches on Subject: below --
2025-12-06 11:37 Ingo Molnar
2025-12-06 20:43 ` pr-tracker-bot
2025-04-10 21:03 Ingo Molnar
2025-04-10 22:52 ` pr-tracker-bot
2025-02-28 19:08 Ingo Molnar
2025-03-01 1:41 ` pr-tracker-bot
2021-06-24 6:54 Ingo Molnar
2021-06-24 16:34 ` pr-tracker-bot
2021-06-12 12:37 Ingo Molnar
2021-06-12 19:09 ` pr-tracker-bot
2021-05-15 7:46 Ingo Molnar
2021-05-15 17:55 ` pr-tracker-bot
2020-04-25 9:04 Ingo Molnar
2020-04-25 19:30 ` pr-tracker-bot
2018-11-30 6:18 Ingo Molnar
2018-11-30 21:00 ` pr-tracker-bot
2017-11-26 12:34 Ingo Molnar
2017-03-01 22:01 Ingo Molnar
2017-02-28 7:53 Ingo Molnar
2017-03-01 1:55 ` Linus Torvalds
2017-03-01 6:05 ` Josh Poimboeuf
2016-04-23 11:10 Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z-6gyQk2WlHc4DNw@gmail.com \
--to=mingo@kernel.org \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=jpoimboe@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).