public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns
Date: Thu, 6 Mar 2025 10:57:44 +0100	[thread overview]
Message-ID: <Z8lxmPmnJhBmPRvl@gmail.com> (raw)
In-Reply-To: <20250228123825.2729925-1-ubizjak@gmail.com>


* Uros Bizjak <ubizjak@gmail.com> wrote:

> According to:
> 
>   https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html
> 
> the usage of asm pseudo directives in the asm template can confuse
> the compiler to wrongly estimate the size of the generated
> code.
> 
> The LOCK_PREFIX macro expands to several asm pseudo directives, so
> its usage in atomic locking insns causes instruction length estimate
> to fail significantly (the specially instrumented compiler reports
> the estimated length of these asm templates to be 6 instructions long).
> 
> This incorrect estimate further causes unoptimal inlining decisions,
> unoptimal instruction scheduling and unoptimal code block alignments
> for functions that use these locking primitives.
> 
> Use asm_inline instead:
> 
>   https://gcc.gnu.org/pipermail/gcc-patches/2018-December/512349.html
> 
> which is a feature that makes GCC pretend some inline assembler code
> is tiny (while it would think it is huge), instead of just asm.
> 
> For code size estimation, the size of the asm is then taken as
> the minimum size of one instruction, ignoring how many instructions
> compiler thinks it is.
> 
> The code size of the resulting x86_64 defconfig object file increases
> for 33.264 kbytes, representing 1.2% code size increase:
> 
>    text    data     bss     dec     hex filename
> 27450107        4633332  814148 32897587        1f5fa33 vmlinux-old.o
> 27483371        4633784  814148 32931303        1f67de7 vmlinux-new.o
> 
> mainly due to different inlining decisions of -O2 build.

So my request here would be not more benchmark figures (I don't think 
it's a realistic expectation for contributors to be able to measure 
much of an effect with such a type of change, let alone be certain
what a macro or micro-benchmark measures is causally connected with
the patch), but I'd like to ask for some qualitative analysis on the
code generation side:

 - +1.2% code size increase is a lot, especially if it's under the 
   default build flags of the kernel. Where does the extra code come 
   from?

 - Is there any effect on Clang? Are its inlining decisions around 
   these asm() statements comparable, worse/better?

A couple of concrete examples would go a long way:

 - "Function XXX was inlined 3 times before the patch, and it was 
    inlined 30 times after the patch. I have reviewed two such inlining 
    locations, and they have added more code to unlikely or 
    failure-handling branches collected near the function epilogue, 
    while the fast-path of the function was more optimal."

Or you might end up finding:

 - "Function YYY was inlined 3x more frequently after the patch, but 
    the inlining decision increased register pressure and created less 
    optimal code in the fast-path, increasing both code size and likely 
    decreasing fast-path performance."

Obviously we'd be sad about the second case, but it's well within the 
spectrum of possibilities when we look at "+1.2% object code size 
increase".

What we cannot do is to throw up our hands and claim "-O2 trades 
performance for size, and thus this patch improves performance".
We don't know that for sure and 30 years of kernel development
created a love-and-hate relationship and a fair level of distrust
between kernel developers and compiler inlining decisions,
especially around x86 asm() statements ...

So these are roughly the high level requirements around such patches.
Does this make sense?

Thanks,

	Ingo

  parent reply	other threads:[~2025-03-06  9:57 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-28 12:35 [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns Uros Bizjak
2025-02-28 13:13 ` Uros Bizjak
2025-02-28 16:48 ` Dave Hansen
2025-02-28 22:31   ` Uros Bizjak
2025-02-28 22:58     ` Dave Hansen
2025-03-01  9:05       ` Uros Bizjak
2025-03-01 12:38         ` Borislav Petkov
2025-03-05  8:54           ` Uros Bizjak
2025-03-05 17:04             ` Linus Torvalds
2025-03-05 19:40               ` Peter Zijlstra
2025-03-05 19:47               ` Uros Bizjak
2025-03-05 22:18                 ` David Laight
2025-03-05 20:14               ` David Laight
2025-03-06 10:45                 ` Uros Bizjak
2025-03-06 13:07                   ` Uros Bizjak
2025-03-06 22:19                     ` Ingo Molnar
2025-03-08  7:22                       ` Uros Bizjak
2025-03-08 19:15               ` H. Peter Anvin
2025-03-05 19:55             ` Ingo Molnar
2025-03-05 20:13               ` Uros Bizjak
2025-03-05 20:21                 ` Ingo Molnar
2025-03-06  9:38                   ` Uros Bizjak
2025-03-05 20:20               ` Ingo Molnar
2025-03-06 10:52                 ` Dirk Gouders
2025-03-06 10:59                   ` Ingo Molnar
2025-03-05 20:36             ` Borislav Petkov
2025-03-05 21:26               ` Peter Zijlstra
2025-03-06  9:01                 ` Uros Bizjak
2025-03-06  9:43                   ` kernel: Current status of CONFIG_CC_OPTIMIZE_FOR_SIZE=y (was: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns) Ingo Molnar
2025-03-06 10:37                     ` Arnd Bergmann
2025-03-06 20:37                     ` David Laight
2025-03-03 13:12       ` [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns David Laight
2025-03-02 20:56   ` Uros Bizjak
2025-03-03 12:23     ` Uros Bizjak
2025-03-08 19:08   ` H. Peter Anvin
2025-03-09  7:50     ` Uros Bizjak
2025-03-09  9:46       ` David Laight
2025-03-09  9:57         ` Uros Bizjak
2025-03-06  9:57 ` Ingo Molnar [this message]
2025-03-06 10:26   ` Uros Bizjak
2025-03-06 10:38     ` Ingo Molnar
2025-03-06 10:50       ` Ingo Molnar
2025-03-06 13:56   ` Uros Bizjak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z8lxmPmnJhBmPRvl@gmail.com \
    --to=mingo@kernel.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=ubizjak@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox