From: Oleg Nesterov <oleg@redhat.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
'Jiri Olsa' <olsajiri@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrii Nakryiko <andrii@kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
Hao Luo <haoluo@google.com>, Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Alan Maguire <alan.maguire@oracle.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-trace-kernel@vger.kernel.org"
<linux-trace-kernel@vger.kernel.org>
Subject: Re: [PATCH bpf-next 08/13] uprobes/x86: Add support to optimize uprobes
Date: Mon, 16 Dec 2024 13:22:05 +0100 [thread overview]
Message-ID: <20241216122204.GB374@redhat.com> (raw)
In-Reply-To: <0916e24539ba4bae9fb729198b033bd7@AcuMS.aculab.com>
OK, thanks, I am starting to share your concerns...
Oleg.
On 12/16, David Laight wrote:
>
> From: Oleg Nesterov
> > Sent: 16 December 2024 10:13
> >
> > David,
> >
> > let me say first that my understanding of this magic is very limited,
> > please correct me.
>
> I only (half) understand what the 'magic' has to accomplish and
> some of the pitfalls.
>
> I've copied linux-mm - someone there might know more.
>
> > On 12/16, David Laight wrote:
> > >
> > > It all depends on how hard __replace_page() tries to be atomic.
> > > The page has to change from one backed by the executable to a private
> > > one backed by swap - otherwise you can't write to it.
> >
> > This is what uprobe_write_opcode() does,
>
> And will be enough for single byte changes - they'll be picked up
> at some point after the change.
>
> > > But the problems arise when the instruction prefetch unit has read
> > > part of the 5-byte instruction (it might even only read half a cache
> > > line at a time).
> > > I'm not sure how long the pipeline can sit in that state - but I
> > > can do a memory read of a PCIe address that takes ~3000 clocks.
> > > (And a misaligned AVX-512 read is probably eight 8-byte transfers.)
> > >
> > > So I think you need to force an interrupt while the PTE is invalid.
> > > And that need to be simultaneous on all cpu running that process.
> >
> > __replace_page() does ptep_get_and_clear(old_pte) + flush_tlb_page().
> >
> > That's not enough?
>
> I doubt it. As I understand it.
> The hardware page tables will be shared by all the threads of a process.
> So unless you hard synchronise all the cpu (and flush the TLB) while the
> PTE is being changed there is always the possibility of a cpu picking up
> the new PTE before the IPI that (I presume) flush_tlb_page() generates
> is processed.
> If that happens when the instruction you are patching is part-read into
> the instruction decode buffer then you'll execute a mismatch of the two
> instructions.
>
> I can't remember the outcome of discussions about live-patching kernel
> code - and I'm sure that was aligned 32bit writes.
>
> >
> > > Stopping the process using ptrace would do it.
> >
> > Not an option :/
>
> Thought you'd say that.
>
> David
>
> >
> > Oleg.
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
next prev parent reply other threads:[~2024-12-16 12:22 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-11 13:33 [PATCH bpf-next 00/13] uprobes: Add support to optimize usdt probes on x86_64 Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 01/13] uprobes: Rename arch_uretprobe_trampoline function Jiri Olsa
2024-12-13 0:42 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 02/13] uprobes: Make copy_from_page global Jiri Olsa
2024-12-13 0:43 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 03/13] uprobes: Add nbytes argument to uprobe_write_opcode Jiri Olsa
2024-12-13 0:45 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 04/13] uprobes: Add arch_uprobe_verify_opcode function Jiri Olsa
2024-12-13 0:48 ` Andrii Nakryiko
2024-12-13 13:21 ` Jiri Olsa
2024-12-13 21:11 ` Andrii Nakryiko
2024-12-13 21:52 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 05/13] uprobes: Add mapping for optimized uprobe trampolines Jiri Olsa
2024-12-13 1:01 ` Andrii Nakryiko
2024-12-13 13:42 ` Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 06/13] uprobes/x86: Add uprobe syscall to speed up uprobe Jiri Olsa
2024-12-13 13:48 ` Thomas Weißschuh
2024-12-13 14:51 ` Jiri Olsa
2024-12-13 15:12 ` Thomas Weißschuh
2024-12-13 21:52 ` Jiri Olsa
2024-12-14 13:21 ` Thomas Weißschuh
2024-12-16 8:03 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 07/13] uprobes/x86: Add support to emulate nop5 instruction Jiri Olsa
2024-12-13 10:45 ` Peter Zijlstra
2024-12-13 13:02 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 08/13] uprobes/x86: Add support to optimize uprobes Jiri Olsa
2024-12-13 10:49 ` Peter Zijlstra
2024-12-13 13:06 ` Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-15 12:06 ` David Laight
2024-12-15 14:14 ` Oleg Nesterov
2024-12-16 8:08 ` Jiri Olsa
2024-12-16 9:18 ` David Laight
2024-12-16 10:12 ` Oleg Nesterov
2024-12-16 11:10 ` David Laight
2024-12-16 12:22 ` Oleg Nesterov [this message]
2024-12-16 12:50 ` Jiri Olsa
2024-12-16 15:08 ` David Laight
2024-12-16 16:06 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 09/13] selftests/bpf: Use 5-byte nop for x86 usdt probes Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 8:32 ` Jiri Olsa
2024-12-16 23:06 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 10/13] selftests/bpf: Add uprobe/usdt optimized test Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 7:58 ` Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 11/13] selftests/bpf: Add hit/attach/detach race optimized uprobe test Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 7:59 ` Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 12/13] selftests/bpf: Add uprobe syscall sigill signal test Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 13/13] selftests/bpf: Add 5-byte nop uprobe trigger bench Jiri Olsa
2024-12-13 21:57 ` Andrii Nakryiko
2024-12-16 7:56 ` Jiri Olsa
2024-12-13 0:43 ` [PATCH bpf-next 00/13] uprobes: Add support to optimize usdt probes on x86_64 Andrii Nakryiko
2024-12-13 9:46 ` Jiri Olsa
2024-12-13 10:51 ` Peter Zijlstra
2024-12-13 13:07 ` Jiri Olsa
2024-12-13 13:54 ` Peter Zijlstra
2024-12-13 14:05 ` Jiri Olsa
2024-12-13 18:39 ` Peter Zijlstra
2024-12-13 21:52 ` Jiri Olsa
2024-12-13 21:59 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241216122204.GB374@redhat.com \
--to=oleg@redhat.com \
--cc=David.Laight@ACULAB.COM \
--cc=alan.maguire@oracle.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=olsajiri@gmail.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.