From: David Laight <david.laight.linux@gmail.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Uros Bizjak <ubizjak@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@intel.com>,
x86@kernel.org, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Linus Torvalds <torvalds@linuxfoundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Arnd Bergmann <arnd@arndb.de>
Subject: Re: kernel: Current status of CONFIG_CC_OPTIMIZE_FOR_SIZE=y (was: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns)
Date: Thu, 6 Mar 2025 20:37:14 +0000
Message-ID: <20250306203714.118ead69@pumpkin>
In-Reply-To: <Z8luPgXr9hcO7jDz@gmail.com>
On Thu, 6 Mar 2025 10:43:26 +0100
Ingo Molnar <mingo@kernel.org> wrote:
> * Uros Bizjak <ubizjak@gmail.com> wrote:
...
> And this one by Linus, 14 years ago:
>
> =================>
> 281dc5c5ec0f ("Give up on pushing CC_OPTIMIZE_FOR_SIZE")
> =================>
>
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Sun, 22 May 2011 14:30:36 -0700
> Subject: [PATCH] Give up on pushing CC_OPTIMIZE_FOR_SIZE
>
> I still happen to believe that I$ miss costs are a major thing, but
> sadly, -Os doesn't seem to be the solution. With or without it, gcc
> will miss some obvious code size improvements, and with it enabled gcc
> will sometimes make choices that aren't good even with high I$ miss
> ratios.
>
> For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy
> into a "rep movsl". While I sincerely hope that x86 CPU's will some day
> do a good job at that, they certainly don't do it yet, and the cost is
> higher than a L1 I$ miss would be.
Well 'rep movsb' is a lot better than it was then.
Even on Sandy Bridge (IIRC) it is ~20 clocks for short transfers (of any length),
unlike the P4 with its 140-clock overhead!
Still slower for short fixed sizes, but probably good for anything variable
because of the costs of the function call and the conditionals to select the
'best' algorithm.
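For illustration only (a sketch, not code from the kernel tree): a variable-length
copy done with a bare 'rep movsb' might look like the block below. The name
copy_rep_movsb() is hypothetical, the constraints assume GCC-style inline asm on
x86 with the direction flag clear (as the ABI requires), and other architectures
simply fall back to memcpy().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes with a single 'rep movsb'.
 * No size-based dispatch and no out-of-line call on x86; the cost is
 * whatever fixed startup overhead the CPU has for the string instruction.
 */
static inline void copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
	/* rdi/edi = destination, rsi/esi = source, rcx/ecx = byte count */
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(n)
			 :
			 : "memory");
#else
	/* Assumption: on non-x86 targets just use the library routine. */
	memcpy(dst, src, n);
#endif
}
```

Whether this beats an out-of-line memcpy() for a given length depends on the
CPU, so it would need measuring rather than assuming.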
OTOH, if you know it is only a few bytes, a coded loop may be best - and gcc will
convert it to a memcpy() call for you!
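A minimal sketch of the kind of loop meant here (copy_few() is a hypothetical
name, not something from the kernel). With optimisation enabled gcc's loop-idiom
recognition may replace the loop body with a call to memcpy(); my understanding
is that -fno-tree-loop-distribute-patterns turns that transformation off, but
treat that as an assumption and check the generated code for your compiler
version.

```c
#include <stddef.h>

/*
 * Hypothetical helper: byte-at-a-time copy for a length the caller
 * knows to be small. 'restrict' tells the compiler the buffers do not
 * overlap, which is also what lets it treat the loop as a memcpy().
 */
void copy_few(unsigned char *restrict dst,
	      const unsigned char *restrict src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = src[i];
}
```

Comparing the output of 'gcc -O2 -S' with and without the flag above shows
whether the loop survived or became a library call.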
The really silly one was 'push immd_byte; pop reg' to get a sign-extended value.
But I do remember -O2 being smaller than -Oz!
Just changing the inlining thresholds and code replication on loops
(and never unrolling loops) would probably be a good start.
David