From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <edumazet@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H . Peter Anvin" <hpa@zytor.com>,
Steven Rostedt <rostedt@goodmis.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Masami Hiramatsu <mhiramat@kernel.org>,
x86@kernel.org, bpf@vger.kernel.org,
Eric Dumazet <eric.dumazet@gmail.com>,
Greg Thelen <gthelen@google.com>,
Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH] x86/alternatives: remove false sharing in poke_int3_handler()
Date: Tue, 25 Mar 2025 12:26:36 +0100
Message-ID: <Z-KS7H6666PZ3eKv@gmail.com>
In-Reply-To: <20250325103047.GH36322@noisy.programming.kicks-ass.net>

* Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Mar 25, 2025 at 09:41:10AM +0100, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > On Mon, Mar 24, 2025 at 08:53:31AM +0100, Eric Dumazet wrote:
> > >
> > > > BTW the atomic_cond_read_acquire() part is never called even during my
> > > > stress test.
> > >
> > > Yes, IIRC this is due to text_poke_sync() serializing the state, as that
> > > does a synchronous IPI broadcast, which by necessity requires all
> > > previous INT3 handlers to complete.
> > >
> > > You can only hit that case if the INT3 remains after step-3 (IOW you're
> > > actively writing INT3 into the text). This is exceedingly rare.
> >
> > Might make sense to add a comment for that.
>
> Sure, find below.
>
> > Also, any strong objections against doing this in the namespace:
> >
> > s/bp_/int3_
> >
> > ?
> >
> > Half of the code already calls it a variant of 'int3', half of it 'bp',
> > which I had to think for a couple of seconds goes for breakpoint, not
> > base pointer ... ;-)
>
> It actually is breakpoint, as in INT3 raises #BP. For complete confusion
> the things that are commonly known as debug breakpoints, those things in
> DR7, they raise #DB or debug exceptions.

Yeah, it's a software breakpoint, swbp, that raises the #BP trap.

'bp' is confusingly aliased (in my brain at least) with 'base pointer'
register naming and assembler syntax: as in bp, ebp, rbp.

So I'd prefer if it was named consistently:

  text_poke_int3_batch()
  text_poke_int3_handler()
  ...

Not the current mishmash of:

  text_poke_bp_batch()
  poke_int3_handler()
  ...

Does this make more sense?

> > Might as well standardize on int3_ and call it a day?
>
> Yeah, perhaps. At some point you've got to know that INT3->#BP and
> DR7->#DB and it all sorta makes sense, but *shrug* :-)

Yeah, so I do know what #BP is, but what the heck disambiguates the two
meanings of _bp and why do we have the above jungle of an inconsistent
namespace? :-)

Picking _int3 would neatly solve all of that.

> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index bf82c6f7d690..01e94603e767 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -2749,6 +2749,13 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
>
>  	/*
>  	 * Remove and wait for refs to be zero.
> +	 *
> +	 * Notably, if after step-3 above the INT3 got removed, then the
> +	 * text_poke_sync() will have serialized against any running INT3
> +	 * handlers and the below spin-wait will not happen.
> +	 *
> +	 * IOW. unless the replacement instruction is INT3, this case goes
> +	 * unused.
>  	 */
>  	if (!atomic_dec_and_test(&bp_desc.refs))
>  		atomic_cond_read_acquire(&bp_desc.refs, !VAL);

Thanks! I stuck this into tip:x86/alternatives, with your SOB.
Ingo
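
For readers following the refs discussion above, here is a rough userspace
sketch of the "drop the base reference, spin only if someone is still
inside" pattern that the quoted hunk documents. It is not the kernel code:
struct poke_desc, handler_enter(), handler_exit() and patcher_finish() are
hypothetical names, C11 atomics stand in for the kernel's atomic_t API, and
the handler side is only assumed to follow an "increment unless zero"
scheme.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for bp_desc: refs == 1 means "armed, no handler
 * inside"; every handler currently running adds one more reference. */
struct poke_desc {
	atomic_int refs;
};

/* Handler side (assumed scheme): take a reference only while the
 * descriptor is still armed, i.e. increment unless the count is zero. */
static bool handler_enter(struct poke_desc *d)
{
	int old = atomic_load_explicit(&d->refs, memory_order_relaxed);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(&d->refs, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Handler side: drop the reference taken by handler_enter(). */
static void handler_exit(struct poke_desc *d)
{
	atomic_fetch_sub_explicit(&d->refs, 1, memory_order_release);
}

/* Patcher side: drop the base reference. If that was the last one we are
 * done; otherwise spin until the remaining handlers have left. Returns
 * true if the spin-wait path was taken. */
static bool patcher_finish(struct poke_desc *d)
{
	if (atomic_fetch_sub_explicit(&d->refs, 1, memory_order_acq_rel) == 1)
		return false;	/* common case: nobody was inside */

	/* Rough analogue of atomic_cond_read_acquire(&refs, !VAL). */
	while (atomic_load_explicit(&d->refs, memory_order_acquire) != 0)
		;
	return true;
}
```

In this sketch the spin loop is only reached when a handler still holds a
reference at the moment the patcher drops its own, which mirrors why the
thread describes the slow path as exceedingly rare once all handlers have
been serialized beforehand.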
Thread overview: 16+ messages
2025-03-23 7:25 [PATCH] x86/alternatives: remove false sharing in poke_int3_handler() Eric Dumazet
2025-03-23 21:38 ` Ingo Molnar
2025-03-24 3:59 ` Eric Dumazet
2025-03-24 7:16 ` Ingo Molnar
2025-03-24 7:47 ` Eric Dumazet
2025-03-24 7:53 ` Eric Dumazet
2025-03-24 8:04 ` Ingo Molnar
2025-03-24 11:33 ` Peter Zijlstra
2025-03-25 8:41 ` Ingo Molnar
2025-03-25 10:30 ` Peter Zijlstra
2025-03-25 11:26 ` Ingo Molnar [this message]
2025-03-25 12:31 ` Peter Zijlstra
2025-03-27 20:56 ` Ingo Molnar
2025-03-25 11:36 ` [tip: x86/alternatives] x86/alternatives: Document the text_poke_bp_batch() synchronization rules a bit more tip-bot2 for Peter Zijlstra
2025-03-24 8:02 ` [PATCH] x86/alternatives: remove false sharing in poke_int3_handler() Ingo Molnar
2025-03-24 8:20 ` Eric Dumazet