From: Peter Zijlstra <peterz@infradead.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
linux-trace-kernel@vger.kernel.org, rostedt@goodmis.org,
mhiramat@kernel.org, oleg@redhat.com, mingo@redhat.com,
bpf@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
clm@meta.com, open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 00/12] uprobes: add batched register/unregister APIs and per-CPU RW semaphore
Date: Thu, 4 Jul 2024 11:15:59 +0200 [thread overview]
Message-ID: <20240704091559.GS11386@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEf4BzaZhi+_MZ0M4Pz-1qmej6rrJeLO9x1+nR5QH9pnQXzwdw@mail.gmail.com>
On Wed, Jul 03, 2024 at 02:33:06PM -0700, Andrii Nakryiko wrote:
> 2. More tactically, RCU protection seems like the best way forward. We
> got hung up on SRCU vs RCU Tasks Trace. Thanks to Paul, we also
> clarified that RCU Tasks Trace has nothing to do with Tasks Rude
> flavor (whatever that is, I have no idea).
>
> Now, RCU Tasks Trace were specifically designed for least overhead
> hotpath (reader side) performance, at the expense of slowing down much
> rarer writers. My microbenchmarking does show at least 5% difference.
> Both flavors can handle sleepable uprobes waiting for page faults.
> Tasks Trace flavor is already used for tracing in the BPF realm,
> including for sleepable uprobes and works well. It's not going away.
I need to look into this new RCU flavour and why it exists -- for
example, why can't SRCU be improved to gain the same benefits. This is
what we've always done, improve SRCU.
> Now, you keep pushing for SRCU instead of RCU Tasks Trace, but I
> haven't seen a single argument why. Please provide that, or let's
> stick to RCU Tasks Trace, because uprobe's use case is an ideal case
> of what Tasks Trace flavor was designed for.
Because I actually know SRCU, and because it provides a local scope.
It isolates the unregister waiters from other random users. I'm not
going to use this funky new flavour until I truly understand it.
Also, we actually want two scopes here, there is no reason for the
consumer unreg to wait for the retprobe stuff.
> 3. Regardless of RCU flavor, due to RCU protection, we have to add
> batched register/unregister APIs, so we can amortize sync_rcu cost
> during deregistration. Can we please agree on that as well? This is
> the main goal of this patch set and I'd like to land it before working
> further on changing and improving the rest of the locking schema.
See my patch here:
https://lkml.kernel.org/r/20240704084524.GC28838@noisy.programming.kicks-ass.net
I don't think it needs to be more complicated than that.
> I won't be happy about it, but just to move things forward, I can drop
> a) custom refcounting and/or b) percpu RW semaphore. Both are
> beneficial but not essential for batched APIs work. But if you force
> me to do that, please state clearly your reasons/arguments.
The reason I'm pushing RCU here is because AFAICT uprobes doesn't
actually need the stronger serialisation that rwlock (any flavour)
provide. It is a prime candidate for RCU, and I think you'll find plenty
papers / articles (by both Paul and others) that show that RCU scales
better.
As a bonus, you avoid that horrific write side cost that per-cpu rwsem
has.
The reason I'm not keen on that refcount thing was initially because I
did not understand the justification for it, but worse, once I did read
your justification, your very own numbers convinced me that the refcount
is fundamentally problematic, in any way shape or form.
> No one had yet pointed out why refcounting is broken
Your very own numbers point out that refcounting is a problem here.
> and why percpu RW semaphore is bad.
Literature and history show us that RCU -- where possible -- is
always better than any reader-writer locking scheme.
> 4. Another tactical thing, but an important one. Refcounting schema
> for uprobes. I've replied already, but I think refcounting is
> unavoidable for uretprobes,
I think we can fix that, I replied here:
https://lkml.kernel.org/r/20240704083152.GQ11386@noisy.programming.kicks-ass.net
> and current refcounting schema is
> problematic for batched APIs due to race between finding uprobe and
> there still being a possibility we'd need to undo all that and retry
> again.
Right, I've not looked too deeply at that, because I've not seen a
reason to actually change that. I can go think about it if you want, but
meh.
next prev parent reply other threads:[~2024-07-04 9:16 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-01 22:39 [PATCH v2 00/12] uprobes: add batched register/unregister APIs and per-CPU RW semaphore Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 01/12] uprobes: update outdated comment Andrii Nakryiko
2024-07-03 11:38 ` Oleg Nesterov
2024-07-03 18:24 ` Andrii Nakryiko
2024-07-03 21:51 ` Andrii Nakryiko
2024-07-10 13:31 ` Oleg Nesterov
2024-07-10 15:14 ` Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 02/12] uprobes: correct mmap_sem locking assumptions in uprobe_write_opcode() Andrii Nakryiko
2024-07-03 11:41 ` Oleg Nesterov
2024-07-03 13:15 ` Masami Hiramatsu
2024-07-03 18:25 ` Andrii Nakryiko
2024-07-03 21:47 ` Masami Hiramatsu
2024-07-01 22:39 ` [PATCH v2 03/12] uprobes: simplify error handling for alloc_uprobe() Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 04/12] uprobes: revamp uprobe refcounting and lifetime management Andrii Nakryiko
2024-07-02 10:22 ` Peter Zijlstra
2024-07-02 17:54 ` Andrii Nakryiko
2024-07-03 13:36 ` Peter Zijlstra
2024-07-03 20:47 ` Andrii Nakryiko
2024-07-04 8:03 ` Peter Zijlstra
2024-07-04 8:45 ` Peter Zijlstra
2024-07-04 14:40 ` Masami Hiramatsu
2024-07-04 8:31 ` Peter Zijlstra
2024-07-05 15:37 ` Oleg Nesterov
2024-07-06 17:00 ` Jiri Olsa
2024-07-06 17:05 ` Jiri Olsa
2024-07-07 14:46 ` Oleg Nesterov
2024-07-08 17:47 ` Andrii Nakryiko
2024-07-09 18:47 ` Oleg Nesterov
2024-07-09 20:59 ` Andrii Nakryiko
2024-07-09 21:31 ` Oleg Nesterov
2024-07-09 21:45 ` Andrii Nakryiko
2024-07-08 17:47 ` Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 05/12] uprobes: move offset and ref_ctr_offset into uprobe_consumer Andrii Nakryiko
2024-07-03 8:13 ` Peter Zijlstra
2024-07-03 10:13 ` Masami Hiramatsu
2024-07-03 18:23 ` Andrii Nakryiko
2024-07-07 12:48 ` Oleg Nesterov
2024-07-08 17:56 ` Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 06/12] uprobes: add batch uprobe register/unregister APIs Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 07/12] uprobes: inline alloc_uprobe() logic into __uprobe_register() Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 08/12] uprobes: split uprobe allocation and uprobes_tree insertion steps Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 09/12] uprobes: batch uprobes_treelock during registration Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 10/12] uprobes: improve lock batching for uprobe_unregister_batch Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 11/12] uprobes,bpf: switch to batch uprobe APIs for BPF multi-uprobes Andrii Nakryiko
2024-07-01 22:39 ` [PATCH v2 12/12] uprobes: switch uprobes_treelock to per-CPU RW semaphore Andrii Nakryiko
2024-07-02 10:23 ` [PATCH v2 00/12] uprobes: add batched register/unregister APIs and " Peter Zijlstra
2024-07-02 11:54 ` Peter Zijlstra
2024-07-02 12:01 ` Peter Zijlstra
2024-07-02 17:54 ` Andrii Nakryiko
2024-07-02 19:18 ` Peter Zijlstra
2024-07-02 23:56 ` Paul E. McKenney
2024-07-03 4:54 ` Andrii Nakryiko
2024-07-03 7:50 ` Peter Zijlstra
2024-07-03 14:08 ` Paul E. McKenney
2024-07-04 8:39 ` Peter Zijlstra
2024-07-04 15:13 ` Paul E. McKenney
2024-07-03 21:57 ` Steven Rostedt
2024-07-03 22:07 ` Paul E. McKenney
2024-07-03 4:47 ` Andrii Nakryiko
2024-07-03 8:07 ` Peter Zijlstra
2024-07-03 20:55 ` Andrii Nakryiko
2024-07-03 21:33 ` Andrii Nakryiko
2024-07-04 9:15 ` Peter Zijlstra [this message]
2024-07-04 13:56 ` Steven Rostedt
2024-07-04 15:44 ` Paul E. McKenney
2024-07-08 17:47 ` Andrii Nakryiko
2024-07-08 17:48 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240704091559.GS11386@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=paulmck@kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.