From: Jiri Olsa <olsajiri@gmail.com>
To: David Laight <David.Laight@aculab.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	lkml <linux-kernel@vger.kernel.org>,
	"linux-trace-kernel@vger.kernel.org"
	<linux-trace-kernel@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [RFC] x86/alternatives: Merge first and second step in text_poke_bp_batch
Date: Thu, 16 Jan 2025 12:48:32 +0100
Message-ID: <Z4jyEBf5WIvygWYh@krava>
In-Reply-To: <c88cf8951a0d4f73901ba97a81ba3a12@AcuMS.aculab.com>

On Tue, Jan 14, 2025 at 02:38:42PM +0000, David Laight wrote:
> From: Jiri Olsa
> > Sent: 14 January 2025 14:03
> > 
> > hi,
> > while checking on similar code for uprobes I was wondering if we
> > can merge the first 2 steps of the instruction update in the
> > text_poke_bp_batch function.
> > 
> > Basically the first step now would be to write int3 byte together
> > with the rest of the bytes of the new instruction instead of doing
> > that separately. And the second step would be to overwrite int3
> > byte with first byte of the new instruction.
> > 
> > Would that work or am I missing some x86 detail that could lead to a crash?
> 
> I suspect it will 'crash and burn'.
> 
> Consider what happens if there is a cache-line boundary in the
> middle of an instruction.
> (Actually an instruction fetch boundary will do.)
> 
> cpu0: reads the old instructions from the old cache line.
> cpu0: pipeline busy (or similar) so doesn't read the next cache line.
> cpu1: writes the new instructions.
> cpu0: reads the second cache line.
> 
> cpu0 now has a mix of the old and new instruction bytes.
> 
> Writing the int3 is safe - provided they don't return until
> all the patching is over.
> 
> But between writing the int3 (over the first opcode byte) and
> updating anything else I suspect you need something that does
> a complete synchronise between the cpus, one that discards any bytes
> in the decode pipeline as well as flushing the I-cache (etc).
> I suspect that requires an acked IPI.
> 
> Very long cpu stalls are easy to generate.
> Any read from PCIe will be slow (I've an fpga target that takes ~1us).
> You'd need to be unlucky to be patching an instruction while one
> was pending, but a DMA access might just be enough to cause grief.

ok, thanks for all the details,

jirka
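
For readers of the archive, the split-fetch scenario David describes above
can be modelled in a few lines of userspace C. This is only a toy sketch,
not kernel code: the byte values, the helper names (classify_fetch,
merged_first_step, int3_only_first_step) and the idea of a single "split"
offset standing in for a cache-line or fetch boundary are all assumptions
made for illustration. It assumes a 5-byte nop being replaced by a 5-byte
call with an arbitrary displacement.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INSN_LEN 5
#define INT3     0xcc

/* Illustrative bytes only: a 5-byte nop and a call with a made-up rel32. */
static const uint8_t old_insn[INSN_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const uint8_t new_insn[INSN_LEN] = { 0xe8, 0x10, 0x20, 0x30, 0x40 };

enum fetch_result { FETCH_OLD, FETCH_NEW, FETCH_INT3, FETCH_TORN };

/*
 * Model cpu0 fetching bytes [0, split) before cpu1's write and bytes
 * [split, INSN_LEN) after it, then classify what cpu0 ends up decoding.
 */
static enum fetch_result classify_fetch(const uint8_t *before,
					const uint8_t *after, size_t split)
{
	uint8_t seen[INSN_LEN];

	memcpy(seen, before, split);
	memcpy(seen + split, after + split, INSN_LEN - split);

	if (seen[0] == INT3)
		return FETCH_INT3;
	if (!memcmp(seen, old_insn, INSN_LEN))
		return FETCH_OLD;
	if (!memcmp(seen, new_insn, INSN_LEN))
		return FETCH_NEW;
	return FETCH_TORN;	/* mix of old and new bytes */
}

/* Proposed merged step: int3 and the new tail bytes land in one write. */
enum fetch_result merged_first_step(size_t split)
{
	uint8_t after[INSN_LEN];

	memcpy(after, new_insn, INSN_LEN);
	after[0] = INT3;
	return classify_fetch(old_insn, after, split);
}

/* Separate first step: only the first byte is replaced by int3. */
enum fetch_result int3_only_first_step(size_t split)
{
	uint8_t after[INSN_LEN];

	memcpy(after, old_insn, INSN_LEN);
	after[0] = INT3;
	return classify_fetch(old_insn, after, split);
}
```

In this model, any split strictly inside the instruction makes the merged
write return FETCH_TORN, while the int3-only write can only ever yield the
old instruction or the int3. That matches the argument above for why the
tail bytes should not change until every cpu is known to have seen the int3.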
