linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [GIT PULL] objtool fixes
Date: Thu, 3 Apr 2025 16:52:57 +0200	[thread overview]
Message-ID: <Z-6gyQk2WlHc4DNw@gmail.com> (raw)
In-Reply-To: <n7p2rtrq6vvfteu5szlubng4wj6akgn45suekjdxojrpuxr6dp@oxjfxawkv3xs>


* Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> On Wed, Apr 02, 2025 at 10:58:15AM -0700, Linus Torvalds wrote:

> > Apparently nobody else ever looks at generated code, but dammit, the
> > clac/stac code generation has turned the ugly up to 11.

BTW., I do look at generated code, and I know others do too, but at the 
.o most of the time, not at the .s code, via two ways:

      1) objdump --disassemble-all .o
    
      2) perf top's live kernel function annotation and disassembler 
         feature that uses /dev/mem. (While turning off KASLR, 
         ptr-obfuscation, turning perf_event_paranoid up to 11^H -1, 
         etc.)

Such output is more readable to me:

   # [ __memmove_avx_unaligned_erms annotated screen: ]

   0.00  1ea:   vzeroupper      
              ← retq            
                nop             
         1f0:   testq      %rcx,%rcx
              ↑ je         1ea  
         1f5:   vmovdqu    0x20(%rsi), %ymm5
                vmovdqu    0x40(%rsi), %ymm6
                leaq       -0x81(%rdi,%rdx),%rcx
                vmovdqu    0x60(%rsi), %ymm7
                vmovdqu    -0x20(%rsi,%rdx), %ymm8
                subq       %rdi,%rsi
                andq       $-0x20,%rcx
                addq       %rcx,%rsi
                nop             
   5.87  220:   vmovdqu    0x60(%rsi), %ymm1
   8.09         vmovdqu    0x40(%rsi), %ymm2
  10.34         vmovdqu    0x20(%rsi), %ymm3
  11.29         vmovdqu    (%rsi), %ymm4
  13.45         addq       $-0x80,%rsi
  13.47         vmovdqa    %ymm1,0x60(%rcx)
  11.26         vmovdqa    %ymm2,0x40(%rcx)
   9.16         vmovdqa    %ymm3,0x20(%rcx)
   7.45         vmovdqa    %ymm4,(%rcx)
   4.98         addq       $-0x80,%rcx
   4.57         cmpq       %rcx,%rdi
              ↑ jb         220  
                vmovdqu    %ymm0, (%rdi)
                vmovdqu    %ymm5, 0x20(%rdi)
                vmovdqu    %ymm6, 0x40(%rdi)
                vmovdqu    %ymm7, 0x60(%rdi)
                vmovdqu    %ymm8, -0x20(%rdx,%rdi)
                vzeroupper      
              ← retq            
                nop             
                nop             

( and the 'P' key within perf-top will print this out into the separate
  __memmove_avx_unaligned_erms.annotation for editor based inspection.   )

because:
    
     - this kind of disassembler output is more standardized, 
       regardless of compiler used,
    
     - it's generally less messy looking,
    
     - it gives ground-truth instead of being some intermediate layer
       in the toolchain that might or might not be the real deal,

     - it shows alignment and the various other effects linkers may 
       apply,

     - there's profiling data overlaid if it's a perf top run,

     - and on a live kernel it also sees through the kernel's various
       layers of runtime patching code obfuscation facilities, also
       known as: alternative-instructions, tracepoints and jump labels.

The compiler's .s is really a last-ditch way to look at generated 
machine code, for me at least.

> >
> > Yes, the altinstruction replacement thing was already making the
> > generated asm hard to read, but now it's *also* adding this garbage to
> > it:
> >
> >         911:
> >         .pushsection .discard.annotate_insn,"M",@progbits,8
> >         .long 911b - .
> >         .long 6
> >         .popsection
> > 
> > which is just pure unadulterated pointless noise.
> > 
> > That "annotation #6" is WORTHLESS.
> > 
> > Dammit, objtool could have figured that annotation out ON ITS OWN
> > without generating shit in our code. It's not like it doesn't already
> > look at alternatives, and it's not like it couldn't just have seen
> > "oh, look, it's a nop instruction with a clac/stac instruction as an
> > alternative".
> 
> Ugh, fragile hard-coded special cases like that are exactly what we're
> trying to get away from.  They make the code unmaintainable and they end
> up triggering false positives, just like the one fixed by that patch in
> the first place.

So the problem is, by reducing objtool complexity a bit:

   # 1154bbd326de objtool: Fix X86_FEATURE_SMAP alternative handling

   8 files changed, 37 insertions(+), 89 deletions(-)

and fixing these two false positive warnings where the 'objtool looks 
at the alternatives code' heuristics failed:

  arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8ec: __stack_chk_fail() missing __noreturn in .c/.h or NORETURN() in noreturns.h
  arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8f1: unreachable instruction

... we have now uglified the kernel's .s asm output space for all 
affected stac()/clac() using functions:

    starship:~/tip> diff -up usercopy.s.before usercopy.s.after
    --- usercopy.s.before	2025-04-03 16:33:07.286944538 +0200
    +++ usercopy.s.after	2025-04-03 16:32:41.770041092 +0200
    @@ -78,11 +78,16 @@ copy_from_user_nmi:
     # ./include/linux/uaccess.h:244: 	current->pagefault_disabled++;
     	addl	$1, 2748(%rdx)	#, _25->pagefault_disabled
     # ./include/linux/uaccess.h:266: 	barrier();
    -# ./arch/x86/include/asm/smap.h:35: 	alternative("", "stac", X86_FEATURE_SMAP);
    +# ./arch/x86/include/asm/smap.h:35: 	alternative(ANNOTATE_IGNORE_ALTERNATIVE "", "stac", X86_FEATURE_SMAP);
     #APP
     # 35 "./arch/x86/include/asm/smap.h" 1
     	# ALT: oldinstr
     771:
    +	911:
    +	.pushsection .discard.annotate_insn,"M",@progbits,8
    +	.long 911b - .
    +	.long 6
    +	.popsection
     	
     772:
     # ALT: padding
    @@ -140,10 +145,15 @@ copy_from_user_nmi:
      .popsection
     
     # 0 "" 2
    -# ./arch/x86/include/asm/smap.h:29: 	alternative("", "clac", X86_FEATURE_SMAP);
    +# ./arch/x86/include/asm/smap.h:29: 	alternative(ANNOTATE_IGNORE_ALTERNATIVE "", "clac", X86_FEATURE_SMAP);
     # 29 "./arch/x86/include/asm/smap.h" 1
     	# ALT: oldinstr
     771:
    +	911:
    +	.pushsection .discard.annotate_insn,"M",@progbits,8
    +	.long 911b - .
    +	.long 6
    +	.popsection
     	
     772:
     # ALT: padding

Is there a way to make this more compact, more readable?

Or if not, we just have to do the more fragile thing I suspect?
It's a tool after all.

Thanks,

	Ingo

  reply	other threads:[~2025-04-03 14:53 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-01 19:57 [GIT PULL] objtool fixes Ingo Molnar
2025-04-02 17:48 ` pr-tracker-bot
2025-04-02 17:58 ` Linus Torvalds
2025-04-02 20:30   ` Ingo Molnar
2025-04-03  8:31   ` Josh Poimboeuf
2025-04-03 14:52     ` Ingo Molnar [this message]
2025-04-03 15:23       ` Josh Poimboeuf
2025-04-03 15:33         ` Ingo Molnar
2025-04-03 15:43         ` Josh Poimboeuf
2025-04-03 16:06           ` Linus Torvalds
2025-04-03 18:24             ` Josh Poimboeuf
2025-04-03 19:12               ` Linus Torvalds
2025-04-03 19:42                 ` Josh Poimboeuf
2025-04-03 15:58       ` Linus Torvalds
2025-04-03 15:39     ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2025-12-06 11:37 Ingo Molnar
2025-12-06 20:43 ` pr-tracker-bot
2025-04-10 21:03 Ingo Molnar
2025-04-10 22:52 ` pr-tracker-bot
2025-02-28 19:08 Ingo Molnar
2025-03-01  1:41 ` pr-tracker-bot
2021-06-24  6:54 Ingo Molnar
2021-06-24 16:34 ` pr-tracker-bot
2021-06-12 12:37 Ingo Molnar
2021-06-12 19:09 ` pr-tracker-bot
2021-05-15  7:46 Ingo Molnar
2021-05-15 17:55 ` pr-tracker-bot
2020-04-25  9:04 Ingo Molnar
2020-04-25 19:30 ` pr-tracker-bot
2018-11-30  6:18 Ingo Molnar
2018-11-30 21:00 ` pr-tracker-bot
2017-11-26 12:34 Ingo Molnar
2017-03-01 22:01 Ingo Molnar
2017-02-28  7:53 Ingo Molnar
2017-03-01  1:55 ` Linus Torvalds
2017-03-01  6:05   ` Josh Poimboeuf
2016-04-23 11:10 Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-6gyQk2WlHc4DNw@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=jpoimboe@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).