linux-trace-kernel.vger.kernel.org archive mirror
* [PATCH 00/12] uprobes: add batched register/unregister APIs and per-CPU RW semaphore
@ 2024-06-25  0:21 Andrii Nakryiko
  2024-06-25  0:21 ` [PATCH 01/12] uprobes: update outdated comment Andrii Nakryiko
                   ` (11 more replies)
  0 siblings, 12 replies; 42+ messages in thread
From: Andrii Nakryiko @ 2024-06-25  0:21 UTC (permalink / raw)
  To: linux-trace-kernel, rostedt, mhiramat, oleg
  Cc: peterz, mingo, bpf, jolsa, paulmck, clm, Andrii Nakryiko

Ultimately, this patch set switches the global uprobes_treelock from an RW
spinlock to a per-CPU RW semaphore, which performs and scales better under
contention, when many parallel threads trigger lots of uprobes.

To make this work well when attaching multiple uprobes (through BPF
multi-uprobe), we need to add batched versions of the uprobe
register/unregister APIs. This is what most of the patch set is actually
doing. The actual switch to a per-CPU RW semaphore is trivial after that and
is done in the very last patch, #12. See its commit message for some
comparison numbers.

Patch #4 is probably the most important patch in the series, revamping uprobe
lifetime management and refcounting. See patch description and added code
comments for all the details.

The changes in patch #4 open the way to refactoring the uprobe_register() and
uprobe_unregister() implementations so that we avoid taking uprobes_treelock
many times during a single batched attachment/detachment. This lets us
accommodate the much higher latency of taking a per-CPU RW semaphore for
write. The end result of this patch set is that attaching 50 thousand uprobes
with BPF multi-uprobes doesn't regress and takes about 200ms both before and
after the changes in this patch set.
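
To illustrate why batching matters once write-side acquisition gets
expensive, here is a minimal userspace sketch. It is not kernel code: the toy
spinlock, the counters and the function names (tree_write_lock(),
register_one_by_one(), register_batched()) are all hypothetical stand-ins
invented for this example. It only counts how many times the lock is taken
for write in each scheme.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the global tree lock; NOT the kernel's lock. */
static atomic_flag tree_lock = ATOMIC_FLAG_INIT;
static unsigned long write_lock_count; /* number of write-side acquisitions */
static size_t tree_size;               /* stand-in for entries in the tree */

static void tree_write_lock(void)
{
	/* Toy spin loop; a real per-CPU RW semaphore write lock is far costlier. */
	while (atomic_flag_test_and_set_explicit(&tree_lock, memory_order_acquire))
		;
	write_lock_count++;
}

static void tree_write_unlock(void)
{
	atomic_flag_clear_explicit(&tree_lock, memory_order_release);
}

/* Unbatched: one write-lock round trip per inserted entry. */
static void register_one_by_one(size_t n)
{
	for (size_t i = 0; i < n; i++) {
		tree_write_lock();
		tree_size++;
		tree_write_unlock();
	}
}

/* Batched: a single write-lock round trip covers the whole batch. */
static void register_batched(size_t n)
{
	tree_write_lock();
	for (size_t i = 0; i < n; i++)
		tree_size++;
	tree_write_unlock();
}
```

With n entries, the first variant pays the write-lock cost n times and the
second pays it once, which is the property that makes a slower-to-acquire
lock tolerable for large attachments.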

Patch #5 updates existing uprobe consumers to put all the necessary pieces
into struct uprobe_consumer, without having to pass around
offset/ref_ctr_offset separately. Existing consumers already keep this data
around; we just formalize the interface.
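
As a rough illustration of the idea (not the actual kernel definitions), the
sketch below uses a hypothetical struct demo_consumer that carries its own
offset and ref_ctr_offset, so a batch call needs only the consumers
themselves. The plain array, the "registered" flag, the negative-offset
validity check and the rollback on failure are all assumptions made for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified consumer; NOT the kernel's struct uprobe_consumer. */
struct demo_consumer {
	int64_t offset;         /* where the probe goes within the file */
	int64_t ref_ctr_offset; /* optional reference counter location */
	int registered;         /* demo-only bookkeeping flag */
};

/*
 * Register a batch of consumers. Each consumer is self-describing, so no
 * separate offset arguments are needed. On the first invalid entry, undo
 * the entries registered so far and fail (all-or-nothing is an assumption
 * of this sketch).
 */
static int demo_register_batch(struct demo_consumer *ucs, size_t cnt)
{
	for (size_t i = 0; i < cnt; i++) {
		if (ucs[i].offset < 0) {
			while (i-- > 0)
				ucs[i].registered = 0;
			return -1;
		}
		ucs[i].registered = 1;
	}
	return 0;
}
```

A caller fills in each consumer once and hands over the whole set, instead
of threading per-probe offsets through every call.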

Patches #6 through #10 add batched versions of the register/unregister APIs
and gradually refactor them so that a single (batched) uprobes_treelock
acquisition suffices, splitting the logic into multiple independent phases.

Patch #11 switches BPF multi-uprobes to the batched uprobe APIs.

As mentioned, the very straightforward patch #12 takes advantage of all the
prep work and just switches uprobes_treelock to a per-CPU RW semaphore.

Andrii Nakryiko (12):
  uprobes: update outdated comment
  uprobes: grab write mmap lock in unapply_uprobe()
  uprobes: simplify error handling for alloc_uprobe()
  uprobes: revamp uprobe refcounting and lifetime management
  uprobes: move offset and ref_ctr_offset into uprobe_consumer
  uprobes: add batch uprobe register/unregister APIs
  uprobes: inline alloc_uprobe() logic into __uprobe_register()
  uprobes: split uprobe allocation and uprobes_tree insertion steps
  uprobes: batch uprobes_treelock during registration
  uprobes: improve lock batching for uprobe_unregister_batch
  uprobes,bpf: switch to batch uprobe APIs for BPF multi-uprobes
  uprobes: switch uprobes_treelock to per-CPU RW semaphore

 include/linux/uprobes.h                       |  29 +-
 kernel/events/uprobes.c                       | 522 ++++++++++++------
 kernel/trace/bpf_trace.c                      |  40 +-
 kernel/trace/trace_uprobe.c                   |  53 +-
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  23 +-
 5 files changed, 419 insertions(+), 248 deletions(-)

-- 
2.43.0



end of thread, other threads:[~2024-07-02 23:16 UTC | newest]

Thread overview: 42+ messages
2024-06-25  0:21 [PATCH 00/12] uprobes: add batched register/unregister APIs and per-CPU RW semaphore Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 01/12] uprobes: update outdated comment Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 02/12] uprobes: grab write mmap lock in unapply_uprobe() Andrii Nakryiko
2024-06-25  1:29   ` Masami Hiramatsu
2024-06-25 14:49     ` Oleg Nesterov
2024-06-25 17:37       ` Andrii Nakryiko
2024-06-25 19:07         ` Oleg Nesterov
2024-06-26 16:38           ` Andrii Nakryiko
2024-06-25 10:50   ` Oleg Nesterov
2024-06-25  0:21 ` [PATCH 03/12] uprobes: simplify error handling for alloc_uprobe() Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 04/12] uprobes: revamp uprobe refcounting and lifetime management Andrii Nakryiko
2024-06-25 14:44   ` Oleg Nesterov
2024-06-25 17:30     ` Andrii Nakryiko
2024-06-26  6:02   ` kernel test robot
2024-06-26 16:39     ` Andrii Nakryiko
2024-06-27  2:29   ` Masami Hiramatsu
2024-06-27 16:43     ` Andrii Nakryiko
2024-07-01 21:59       ` Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 05/12] uprobes: move offset and ref_ctr_offset into uprobe_consumer Andrii Nakryiko
2024-06-27  3:06   ` Masami Hiramatsu
2024-06-25  0:21 ` [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs Andrii Nakryiko
2024-06-26 11:27   ` Jiri Olsa
2024-06-26 16:44     ` Andrii Nakryiko
2024-06-27 13:04   ` Masami Hiramatsu
2024-06-27 16:47     ` Andrii Nakryiko
2024-06-28  6:28       ` Masami Hiramatsu
2024-06-28 16:34         ` Andrii Nakryiko
2024-06-29 23:30           ` Masami Hiramatsu
2024-07-01 17:55             ` Andrii Nakryiko
2024-07-01 22:15               ` Andrii Nakryiko
2024-07-02  1:01                 ` Masami Hiramatsu
2024-07-02  1:34                   ` Andrii Nakryiko
2024-07-02 15:19                     ` Masami Hiramatsu
2024-07-02 16:53                       ` Steven Rostedt
2024-07-02 21:23                         ` Andrii Nakryiko
2024-07-02 23:16                         ` Masami Hiramatsu
2024-06-25  0:21 ` [PATCH 07/12] uprobes: inline alloc_uprobe() logic into __uprobe_register() Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 08/12] uprobes: split uprobe allocation and uprobes_tree insertion steps Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 09/12] uprobes: batch uprobes_treelock during registration Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 10/12] uprobes: improve lock batching for uprobe_unregister_batch Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 11/12] uprobes,bpf: switch to batch uprobe APIs for BPF multi-uprobes Andrii Nakryiko
2024-06-25  0:21 ` [PATCH 12/12] uprobes: switch uprobes_treelock to per-CPU RW semaphore Andrii Nakryiko
