BPF List
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	mingo@kernel.org, andrii@kernel.org,
	linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	oleg@redhat.com, jolsa@kernel.org, clm@meta.com,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH 00/10] perf/uprobe: Optimize uprobes
Date: Wed, 10 Jul 2024 11:16:31 +0200	[thread overview]
Message-ID: <20240710091631.GT27299@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <Zo1hBFS7c_J-Yx-7@casper.infradead.org>

On Tue, Jul 09, 2024 at 05:10:44PM +0100, Matthew Wilcox wrote:

> Probably best to start with lock_vma_under_rcu() in mm/memory.c.

So close and yet so far :/ 

> > > > Specifically, how feasible would it be to get a simple RCU based
> > > > find_vma() version sorted these days?
> > > 
> > > Liam's and Willy's Maple Tree work, combined with Suren's per-VMA locking
> > > combined with some of Vlastimil's slab work is pushing in that direction.
> > > I believe that things are getting pretty close.
> > 
> > So I fundamentally do not believe in per-VMA locking. Specifically for
> > this case that would be trading one hot line for another. I tried
> > telling people that, but it doesn't seem to stick :/
> 
> SRCU also had its own performance problems, so we've got problems one
> way or the other.  The per-VMA lock probably doesn't work quite the way
> you think it does, but it absoutely can be a hot cacheline.
> 
> I did propose a store-free variant at LSFMM 2022 and again at 2023,
> but was voted down.  https://lwn.net/Articles/932298/

That seems to be lacking much details. Anyway, page-tables are sorta RCU
freed -- per GUP fast requirements. Making it actually RCU isn't
hard, just increases the delay.

> I don't think the door is completely closed to a migration to that,
> but it's a harder sell than what we've got.  Of course, data helps ...

Current 'problem' is a heavily multi-threaded application using uprobes.
Executable is typically one VMA (per DSO) and all the probes will live
in that one VMA. Then have all the CPUs concurrently hit probes, and
they're all hammering the same VMA in order to translate
{mm,vaddr}->{inode,offset}.

After fixing a ton of uprobe things, Andrii is seeing 45% of CPU time
spend in mmap_rwsem:

  https://lkml.kernel.org/r/CAEf4BzY6tXrDGkW6mkxCY551pZa1G+Sgxeuex==nvHUEp9ynpg@mail.gmail.com

Given it's all typically one VMA, anything per VMA is not going to help
much.

Anyway, back to this PER_VMA_LOCK code, it took me a fair while to
digest because it lacks a coherent comment. It took me extra time
because I assumed (silly me) that _seq would mean an actual sequence
count.

If it were an actual sequence count, I could make it work, but sadly,
not. Also, vma_end_write() seems to be missing :-( If anything it could
be used to lockdep annotate the thing.

Mooo.. I need to stare more at this to see if perhaps it can be made to
work, but so far, no joy :/

  parent reply	other threads:[~2024-07-10  9:16 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240708091241.544262971@infradead.org>
     [not found] ` <20240709075651.122204f1358f9f78d1e64b62@kernel.org>
2024-07-09  0:25   ` [PATCH 00/10] perf/uprobe: Optimize uprobes Andrii Nakryiko
2024-07-09  9:01     ` Peter Zijlstra
2024-07-09 14:11       ` Paul E. McKenney
2024-07-09 14:29         ` Peter Zijlstra
2024-07-09 14:36           ` Paul E. McKenney
2024-07-09 15:31             ` Peter Zijlstra
2024-07-09 15:56               ` Paul E. McKenney
2024-07-09 16:10           ` Matthew Wilcox
2024-07-09 16:30             ` Matthew Wilcox
2024-07-09 16:57               ` Paul E. McKenney
2024-07-10  9:16             ` Peter Zijlstra [this message]
2024-07-10  9:40               ` Peter Zijlstra
2024-07-22 19:09                 ` Suren Baghdasaryan
2024-07-27  0:20                   ` Andrii Nakryiko
2024-07-27  1:29                     ` Suren Baghdasaryan
2024-07-27  3:45                       ` Matthew Wilcox
2024-07-30  3:18                         ` Andrii Nakryiko
2024-07-30 13:10                         ` Peter Zijlstra
2024-07-30 18:10                           ` Suren Baghdasaryan
2024-08-03  5:47                             ` Andrii Nakryiko
2024-08-03  8:53                               ` Peter Zijlstra
2024-08-04 23:22                                 ` Andrii Nakryiko
2024-08-06  4:08                                   ` Andrii Nakryiko
2024-08-06 14:50                                     ` Suren Baghdasaryan
2024-08-06 17:40                                       ` Andrii Nakryiko
2024-08-06 17:44                                         ` Suren Baghdasaryan
2024-08-07  1:36                                     ` Suren Baghdasaryan
2024-08-07  5:13                                       ` Suren Baghdasaryan
2024-08-07 17:49                                         ` Andrii Nakryiko
2024-08-07 18:04                                           ` Suren Baghdasaryan
2024-08-07 18:30                                             ` Andrii Nakryiko
2024-08-07 18:33                                             ` Suren Baghdasaryan
2024-08-08  0:47                                               ` Andrii Nakryiko
2024-07-30 13:46                   ` Peter Zijlstra
2024-07-30 18:16                     ` Suren Baghdasaryan
2024-07-09 16:42         ` Andrii Nakryiko
2024-07-09  9:03     ` Peter Zijlstra
2024-07-09 10:01       ` Jiri Olsa
2024-07-09 10:16         ` Peter Zijlstra
2024-07-09 22:10           ` Masami Hiramatsu
2024-07-10 10:10             ` Peter Zijlstra
2024-07-10 14:56               ` Masami Hiramatsu
2024-07-10 18:40                 ` Andrii Nakryiko
2024-07-11  8:51                   ` Peter Zijlstra
2024-07-11 15:17                     ` Masami Hiramatsu
2024-07-11 15:22                       ` Peter Zijlstra
2024-07-11 17:47                         ` Steven Rostedt
2024-07-11 23:59                           ` Masami Hiramatsu
2024-07-10  0:55       ` Masami Hiramatsu
2024-07-09 21:47     ` Andrii Nakryiko
2024-07-10 10:12       ` Peter Zijlstra
2024-07-10 12:34         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240710091631.GT27299@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox