From: Andrii Nakryiko <andrii@kernel.org>
To: linux-trace-kernel@vger.kernel.org, peterz@infradead.org,
oleg@redhat.com
Cc: rostedt@goodmis.org, mhiramat@kernel.org, mingo@kernel.org,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
jolsa@kernel.org, paulmck@kernel.org,
Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH v3 tip/perf/core 0/2] SRCU-protected uretprobes hot path
Date: Wed, 23 Oct 2024 21:41:57 -0700
Message-ID: <20241024044159.3156646-1-andrii@kernel.org>
Recently landed changes make the uprobe entry hot code path use RCU Tasks
Trace to avoid touching the uprobe refcount, which at high frequency of uprobe
triggering leads to excessive cache line bouncing and limited scalability as
the number of CPUs that simultaneously execute uprobe handlers increases.
This patch set adds the return uprobe (uretprobe) side of this, this time
utilizing SRCU for the same reasons. Given that the time between entry uprobe
activation (at which point uretprobe code hijacks the user-space stack to get
activated on user function return) and uretprobe activation can be arbitrarily
long and is completely under the control of user code, we need to protect
ourselves from too long or unbounded SRCU grace periods.
To that end we keep SRCU protection only for a limited time, and if user space
code takes longer to return, pending uretprobe instances are "downgraded" to
refcounted ones. This gives us the best scalability and performance for
high-frequency uretprobes, and keeps an upper bound on SRCU grace period
duration for low-frequency uretprobes.
There are a bunch of synchronization issues between the timer callback running
in IRQ handler and the current thread executing uretprobe handlers, which are
abstracted away behind a "hybrid lifetime uprobe" (hprobe) wrapper around the
uprobe instance itself.
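
To make the lease/downgrade idea concrete, here is a minimal userspace sketch
in C11 atomics. It is not the kernel implementation: every model_* name is
hypothetical, lease_held merely stands in for an open SRCU read-side section,
and the CONSUMED state is an assumption of this sketch (the letter itself only
names LEASED and GONE, and implies a stable, refcounted state).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted uprobe; ref == 0 means "being freed". */
struct model_uprobe {
	atomic_int ref;
};

/* CONSUMED is an assumption of this sketch, used to mark a claimed hprobe. */
enum model_state { M_LEASED, M_STABLE, M_GONE, M_CONSUMED };

struct model_hprobe {
	_Atomic int state;
	struct model_uprobe *uprobe;
	atomic_int lease_held;	/* 1 while the modeled SRCU read section is open */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool model_try_get(struct model_uprobe *u)
{
	int old = atomic_load(&u->ref);

	while (old > 0) {
		if (atomic_compare_exchange_weak(&u->ref, &old, old + 1))
			return true;
	}
	return false;
}

static void model_put(struct model_uprobe *u)
{
	int old = atomic_fetch_sub(&u->ref, 1);

	assert(old > 0);
	(void)old;
}

static void model_init_leased(struct model_hprobe *h, struct model_uprobe *u)
{
	h->uprobe = u;
	atomic_store(&h->lease_held, 1);
	atomic_store(&h->state, M_LEASED);
}

/* Timer side: end the lease, downgrading to a refcount when possible. */
static void model_expire(struct model_hprobe *h)
{
	int expected = M_LEASED;
	bool got;

	if (atomic_load(&h->state) != M_LEASED)
		return;

	got = model_try_get(h->uprobe);	/* speculative reference */
	if (atomic_compare_exchange_strong(&h->state, &expected,
					   got ? M_STABLE : M_GONE)) {
		/* We won the transition, so we end the lease, even on GONE. */
		atomic_store(&h->lease_held, 0);
	} else if (got) {
		model_put(h->uprobe);	/* lost the race: compensating put */
	}
}

/*
 * Consumer side: claim the hprobe exactly once. Returns NULL if the uprobe
 * is gone. If the claim raced ahead of the timer (state was LEASED), the
 * lease is still held and the caller is responsible for ending it; if the
 * state was STABLE, the caller now owns one reference.
 */
static struct model_uprobe *model_consume(struct model_hprobe *h)
{
	int st = atomic_load(&h->state);

	while (st != M_CONSUMED) {
		if (atomic_compare_exchange_weak(&h->state, &st, M_CONSUMED))
			return st == M_GONE ? NULL : h->uprobe;
	}
	return NULL;
}
```

In this model, whichever side wins the compare-and-swap out of LEASED becomes
responsible for ending the lease, which is the property the v1->v2 fix below
is about: the lease must be released even when taking the refcount fails.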
There is now a speculative try_get_uprobe() and, possibly, a compensating
put_uprobe() being done from the timer thread (softirq), so we need to make
sure that put_uprobe() is working well from any context. This is what patch #1
does, employing a deferred work callback, and shifting all the locking to it.
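
The general shape of "drop the last reference anywhere, do the locked teardown
later" can be sketched in userspace as follows. This is only an illustration
under stated assumptions: all dw_* names are hypothetical, the work queue is a
single-threaded list rather than the kernel's workqueue, and in_tree merely
stands in for membership in whatever lock-protected structure the real code
has to update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deferred-work item, loosely modeled on a kernel work item. */
struct dw_work {
	struct dw_work *next;
	void (*fn)(struct dw_work *w);
};

struct dw_uprobe {
	atomic_int ref;
	bool in_tree;		/* stands in for a lock-protected lookup structure */
	struct dw_work work;
};

static struct dw_work *dw_pending;	/* single-threaded model of a work queue */

static void dw_schedule(struct dw_work *w)
{
	w->next = dw_pending;
	dw_pending = w;
}

/* Runs later, in a context where taking sleeping locks would be allowed. */
static void dw_free_deferred(struct dw_work *w)
{
	struct dw_uprobe *u =
		(struct dw_uprobe *)((char *)w - offsetof(struct dw_uprobe, work));

	u->in_tree = false;	/* the locked removal happens here, not in dw_put() */
}

/* Callable from any context in this model: atomic decrement plus queuing. */
static void dw_put(struct dw_uprobe *u)
{
	int old = atomic_fetch_sub(&u->ref, 1);

	assert(old > 0);
	if (old == 1) {
		u->work.fn = dw_free_deferred;
		dw_schedule(&u->work);
	}
}

/* Drain the queue; returns how many callbacks ran. */
static int dw_run_pending(void)
{
	int n = 0;

	while (dw_pending) {
		struct dw_work *w = dw_pending;

		dw_pending = w->next;
		w->fn(w);
		n++;
	}
	return n;
}
```

The point of the split is that dw_put() itself never touches in_tree, so in
this model it needs no lock and is safe to call from a timer callback; only
the deferred callback performs the teardown.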
v2->v3:
- rebased onto peterz/queue.git's perf/core on top of Jiri's changes;
- simplify hprobe states by utilizing HPROBE_GONE for NULL uprobe (Peter);
- hprobe_expire() can return uprobe with refcount, if requested (Peter);
- keep hprobe_init_leased() and hprobe_init_stable() to a) avoid srcu_idx
bikeshedding dependency and b) leased constructor shouldn't accept NULL
uprobe, so it's nice to be able to easily express and enforce that;
- patch #1 stays the same, we'll work on uprobe_delayed_lock separately;
v1->v2:
- dropped single-stepped uprobes changes to make this change a bit more
palatable to Oleg and get some good will from him :)
- fixed the bug with not calling __srcu_read_unlock when "expiring" leased
uprobe, but failing to get refcount;
- switched hprobe implementation to an explicit state machine, which seems
to make logic more straightforward, evidenced by this allowing me to spot
the above subtle LEASED -> GONE transition bug;
- re-ran uprobe-stress many-many times, it was instrumental for getting
confidence in implementation and spotting subtle bugs (including the above
one, once I modified timer logic to run at a fixed interval to increase the
probability of races with the normal uretprobe consumer code);
rfc->v1:
- made put_uprobe() work in any context, not just user context (Oleg);
- changed to unconditional mod_timer() usage to avoid races (Oleg);
- I kept single-stepped uprobe changes, as they have a simple use of all the
hprobe functionality developed in patch #1.
Andrii Nakryiko (2):
uprobes: allow put_uprobe() from non-sleepable softirq context
uprobes: SRCU-protect uretprobe lifetime (with timeout)
include/linux/uprobes.h | 54 ++++++-
kernel/events/uprobes.c | 309 +++++++++++++++++++++++++++++++++++-----
2 files changed, 322 insertions(+), 41 deletions(-)
--
2.43.5