All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <olsajiri@gmail.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCHv3 04/12] uprobes/x86: Move optimized uprobe from nop5 to nop10
Date: Tue, 26 May 2026 12:19:11 +0200	[thread overview]
Message-ID: <ahVzn3lyY38Il8c2@krava> (raw)
In-Reply-To: <20260526091944.GB4149641@noisy.programming.kicks-ass.net>

On Tue, May 26, 2026 at 11:19:44AM +0200, Peter Zijlstra wrote:
> On Fri, May 22, 2026 at 11:19:17PM +0200, Jiri Olsa wrote:
> > On Fri, May 22, 2026 at 11:50:44AM -0700, Andrii Nakryiko wrote:
> > > On Thu, May 21, 2026 at 5:44 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > >
> > > > Andrii reported an issue with optimized uprobes [1] that can clobber
> > > > redzone area with call instruction storing return address on stack
> > > > where user code may keep temporary data without adjusting rsp.
> > > >
> > > > Fixing this by moving the optimized uprobes on top of 10-bytes nop
> > > > instruction, so we can squeeze another instruction to escape the
> > > > redzone area before doing the call, like:
> > > >
> > > >   lea -0x80(%rsp), %rsp
> > > >   call tramp
> > > >
> > > > Note the lea instruction is used to adjust the rsp register without
> > > > changing the flags.
> > > >
> > > > We use nop10 and following transofrmation to optimized instructions
> > > > above and back as suggested by Peterz [2].
> > > >
> > > > Optimize path (int3_update_optimize):
> > > >
> > > >   1) Initial state after set_swbp() installed the uprobe:
> > > >       cc 2e 0f 1f 84 00 00 00 00 00
> > > >
> > > >      From offset 0 this is INT3 followed by the tail of the original
> > > >      10-byte NOP.
> > > >
> > > >   2) Trap the call slot before rewriting the NOP tail:
> > > >       cc 2e 0f 1f 84 [cc] 00 00 00 00
> > > >
> > > >      From offset 0 this traps on the uprobe INT3.  A thread reaching
> > > >      offset 5 traps on the temporary INT3 instead of seeing a partially
> > > >      patched call.
> > > >
> > > >   3) Rewrite the LEA tail and call displacement, keeping both INT3 bytes:
> > > >       cc [8d 64 24 80] cc [d0 d1 d2 d3]
> > > >
> > > >      From offset 0 and offset 5 this still traps.  The bytes between
> > > >      them are not executable entry points while both traps are in place.
> > > >
> > > >   4) Restore the call opcode at offset 5:
> > > >       cc 8d 64 24 80 [e8] d0 d1 d2 d3
> > > >
> > > >      From offset 0 this still traps.  From offset 5 the instruction is
> > > >      the final CALL to the uprobe trampoline.
> > > >
> > > 
> > > I'm sorry if I'm slow, but I don't understand why we need that second
> > > cc at offset 5? Isn't original nop10 processed by CPU as single
> > > instruction? So it will either be at ip of nop10, or at ip+10, no? If
> > > we trap at ip and in int3 handler +10 from there while we are
> > > installing lea+call, why do we need cc on byte 5?
> > > 
> > > I.e., I don't understand how CPU can end up being at ip+5 until we
> > > finalize lea+call sequence? Can it?
> > 
> > hum, so I though it's for the case when you do unoptimize+optimize,
> > then you can have cpu executing the previous lea and hitting the int3
> > on +5 offset.. but as pointed by Peter (and you) the call instruction
> > never changes, so now I'm not sure why we need it
> 
> So I missed you did the second INT3 in my initial reading.
> 
> That second INT3 is absolutely required *IF* the CALL can ever change.
> However Andrii pointed out that once the CALL is written, it will always
> be the same CALL -- there is but the one trampoline, it doesn't move.
> 
> Therefore, the second INT3 is not strictly required.
> 
> Does this clarify?

yes, will change that in next version

thanks,
jirka

  reply	other threads:[~2026-05-26 10:19 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-21 12:43 [PATCHv3 00/12] uprobes/x86: Fix red zone issue for optimized uprobes Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 01/12] uprobes/x86: Use proper mm_struct in __in_uprobe_trampoline Jiri Olsa
2026-05-22 18:50   ` Andrii Nakryiko
2026-05-21 12:44 ` [PATCHv3 02/12] uprobes/x86: Remove struct uprobe_trampoline object Jiri Olsa
2026-05-21 13:26   ` bot+bpf-ci
2026-05-24 22:13     ` Jiri Olsa
2026-05-22 18:50   ` Andrii Nakryiko
2026-05-21 12:44 ` [PATCHv3 03/12] uprobes/x86: Allow to copy uprobe trampolines on fork Jiri Olsa
2026-05-22 18:50   ` Andrii Nakryiko
2026-05-24 21:54     ` Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 04/12] uprobes/x86: Move optimized uprobe from nop5 to nop10 Jiri Olsa
2026-05-21 13:35   ` Peter Zijlstra
2026-05-22 21:19     ` Jiri Olsa
2026-05-22 18:50   ` Andrii Nakryiko
2026-05-22 21:19     ` Jiri Olsa
2026-05-26  9:19       ` Peter Zijlstra
2026-05-26 10:19         ` Jiri Olsa [this message]
2026-05-21 12:44 ` [PATCHv3 05/12] libbpf: Change has_nop_combo to work on top of nop10 Jiri Olsa
2026-05-21 13:01   ` sashiko-bot
2026-05-22 18:52   ` Andrii Nakryiko
2026-05-22 21:28     ` Jiri Olsa
2026-05-26 14:26       ` Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 06/12] libbpf: Detect uprobe syscall with new error Jiri Olsa
2026-05-21 13:26   ` bot+bpf-ci
2026-05-21 13:29   ` sashiko-bot
2026-05-25 15:44     ` Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 07/12] selftests/bpf: Emit nop,nop10 instructions combo for x86_64 arch Jiri Olsa
2026-05-21 13:21   ` sashiko-bot
2026-05-25 15:43     ` Jiri Olsa
2026-05-26 10:58     ` Jakub Sitnicki
2026-05-26 14:30       ` Jiri Olsa
2026-05-21 13:26   ` bot+bpf-ci
2026-05-26 10:58   ` Jakub Sitnicki
2026-05-21 12:44 ` [PATCHv3 08/12] selftests/bpf: Change uprobe syscall tests to use nop10 Jiri Olsa
2026-05-21 13:05   ` sashiko-bot
2026-05-21 13:26   ` bot+bpf-ci
2026-05-22 18:57   ` Andrii Nakryiko
2026-05-25 15:44     ` Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 09/12] selftests/bpf: Change uprobe/usdt trigger bench code " Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 10/12] selftests/bpf: Add reattach tests for uprobe syscall Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 11/12] selftests/bpf: Add tests for uprobe nop10 red zone clobbering Jiri Olsa
2026-05-21 13:26   ` bot+bpf-ci
2026-05-25 15:44     ` Jiri Olsa
2026-05-21 12:44 ` [PATCHv3 12/12] selftests/bpf: Add tests for forked/cloned optimized uprobes Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahVzn3lyY38Il8c2@krava \
    --to=olsajiri@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.