From: Dave Hansen <dave.hansen@intel.com>
To: Uros Bizjak <ubizjak@gmail.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>
Subject: Re: [PATCH] x86/smp: Use atomic_try_cmpxchg() to micro-optimize native_stop_other_cpus()
Date: Fri, 17 Nov 2023 08:18:19 -0800 [thread overview]
Message-ID: <367bc727-3f26-4e78-8e58-af959760b3fc@intel.com> (raw)
In-Reply-To: <20231114164416.208285-1-ubizjak@gmail.com>
On 11/14/23 08:43, Uros Bizjak wrote:
> Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> in native_stop_other_cpus(). On x86 the CMPXCHG instruction returns success
> in the ZF flag, so this change saves a compare after CMPXCHG. Together
> with a small code reorder, the generated asm code improves from:
>
> 74: 8b 05 00 00 00 00 mov 0x0(%rip),%eax
> 7a: 41 54 push %r12
> 7c: 55 push %rbp
> 7d: 65 8b 2d 00 00 00 00 mov %gs:0x0(%rip),%ebp
> 84: 53 push %rbx
> 85: 85 c0 test %eax,%eax
> 87: 75 71 jne fa <native_stop_other_cpus+0x8a>
> 89: b8 ff ff ff ff mov $0xffffffff,%eax
> 8e: f0 0f b1 2d 00 00 00 lock cmpxchg %ebp,0x0(%rip)
> 95: 00
> 96: 83 f8 ff cmp $0xffffffff,%eax
> 99: 75 5f jne fa <native_stop_other_cpus+0x8a>
>
> to:
>
> 74: 8b 05 00 00 00 00 mov 0x0(%rip),%eax
> 7a: 85 c0 test %eax,%eax
> 7c: 0f 85 84 00 00 00 jne 106 <native_stop_other_cpus+0x96>
> 82: 41 54 push %r12
> 84: b8 ff ff ff ff mov $0xffffffff,%eax
> 89: 55 push %rbp
> 8a: 53 push %rbx
> 8b: 65 8b 1d 00 00 00 00 mov %gs:0x0(%rip),%ebx
> 92: f0 0f b1 1d 00 00 00 lock cmpxchg %ebx,0x0(%rip)
> 99: 00
> 9a: 75 5e jne fa <native_stop_other_cpus+0x8a>
>
> Please note early exit and lack of CMP after CMPXCHG.
Uros, I really do appreciate that you are trying to optimize these
paths. But the thing we have to balance is the _need_ for optimization
with the chance that this will break something.
This is about as much of a slow path as we have in the kernel. It's
only used at shutdown, right? That means this is one of the places in
the kernel that least needs optimization. It can only possibly get run
once per boot.
So, the benefit is that it might make this code a few cycles faster. In
practice, it might not even be measurably faster.
On the other hand, this is relatively untested and also makes the C code
more complicated.
Is there some other side benefit that I'm missing here? Applying this
patch doesn't seem to have a great risk/reward ratio.
next prev parent reply other threads:[~2023-11-17 16:18 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-14 16:43 [PATCH] x86/smp: Use atomic_try_cmpxchg() to micro-optimize native_stop_other_cpus() Uros Bizjak
2023-11-17 16:18 ` Dave Hansen [this message]
2023-11-17 16:37 ` Uros Bizjak
2023-11-17 17:22 ` Dave Hansen
2023-11-22 20:18 ` Uros Bizjak
2023-11-22 21:32 ` Dave Hansen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=367bc727-3f26-4e78-8e58-af959760b3fc@intel.com \
--to=dave.hansen@intel.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=ubizjak@gmail.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox