From: Peter Zijlstra <peterz@infradead.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
Matthew Wilcox <willy@infradead.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
mingo@kernel.org, andrii@kernel.org,
linux-kernel@vger.kernel.org, rostedt@goodmis.org,
oleg@redhat.com, jolsa@kernel.org, clm@meta.com,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH 00/10] perf/uprobe: Optimize uprobes
Date: Sat, 3 Aug 2024 10:53:12 +0200 [thread overview]
Message-ID: <20240803085312.GP39708@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEf4BzavWOgCLQoNdmPyyqHcm7gY5USKU5f1JWfyaCbuc_zVAA@mail.gmail.com>
On Fri, Aug 02, 2024 at 10:47:15PM -0700, Andrii Nakryiko wrote:
> Is there any reason why the approach below won't work?
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8be9e34e786a..e21b68a39f13 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2251,6 +2251,52 @@ static struct uprobe
> *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swb
> struct uprobe *uprobe = NULL;
> struct vm_area_struct *vma;
>
> +#ifdef CONFIG_PER_VMA_LOCK
> + vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE, vm_flags;
> + struct file *vm_file;
> + struct inode *vm_inode;
> + unsigned long vm_pgoff, vm_start, vm_end;
> + int vm_lock_seq;
> + loff_t offset;
> +
> + rcu_read_lock();
> +
> + vma = vma_lookup(mm, bp_vaddr);
> + if (!vma)
> + goto retry_with_lock;
> +
> + vm_lock_seq = READ_ONCE(vma->vm_lock_seq);
So vma->vm_lock_seq is only updated on vma_start_write()
> +
> + vm_file = READ_ONCE(vma->vm_file);
> + vm_flags = READ_ONCE(vma->vm_flags);
> + if (!vm_file || (vm_flags & flags) != VM_MAYEXEC)
> + goto retry_with_lock;
> +
> + vm_inode = READ_ONCE(vm_file->f_inode);
> + vm_pgoff = READ_ONCE(vma->vm_pgoff);
> + vm_start = READ_ONCE(vma->vm_start);
> + vm_end = READ_ONCE(vma->vm_end);
None of those are written with WRITE_ONCE(), so this buys you nothing.
Compiler could be updating them one byte at a time while you load some
franken-update.
Also, if you're in the middle of split_vma() you might not get a
consistent set.
> + if (bp_vaddr < vm_start || bp_vaddr >= vm_end)
> + goto retry_with_lock;
> +
> + offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> + uprobe = find_uprobe_rcu(vm_inode, offset);
> + if (!uprobe)
> + goto retry_with_lock;
> +
> + /* now double check that nothing about VMA changed */
> + if (vm_lock_seq != READ_ONCE(vma->vm_lock_seq))
> + goto retry_with_lock;
Since vma->vma_lock_seq is only ever updated at vma_start_write() you're
checking you're in or after the same modification cycle.
The point of sequence locks is to check you *IN* a modification cycle
and retry if you are. You're now explicitly continuing if you're in a
modification.
You really need:
seq++;
wmb();
... do modification
wmb();
seq++;
vs
do {
s = READ_ONCE(seq) & ~1;
rmb();
... read stuff
} while (rmb(), seq != s);
The thing to note is that seq will be odd while inside a modification
and even outside, further if the pre and post seq are both even but not
identical, you've crossed a modification and also need to retry.
> +
> + /* happy case, we speculated successfully */
> + rcu_read_unlock();
> + return uprobe;
> +
> +retry_with_lock:
> + rcu_read_unlock();
> + uprobe = NULL;
> +#endif
> +
> mmap_read_lock(mm);
> vma = vma_lookup(mm, bp_vaddr);
> if (vma) {
> diff --git a/kernel/fork.c b/kernel/fork.c
> index cc760491f201..211a84ee92b4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -3160,7 +3160,7 @@ void __init proc_caches_init(void)
> NULL);
> files_cachep = kmem_cache_create("files_cache",
> sizeof(struct files_struct), 0,
> - SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
> + SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT|SLAB_TYPESAFE_BY_RCU,
> NULL);
> fs_cachep = kmem_cache_create("fs_cache",
> sizeof(struct fs_struct), 0,
next prev parent reply other threads:[~2024-08-03 8:53 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-08 9:12 [PATCH 00/10] perf/uprobe: Optimize uprobes Peter Zijlstra
2024-07-08 9:12 ` [PATCH 01/10] perf/uprobe: Re-indent labels Peter Zijlstra
2024-07-08 9:12 ` [PATCH 02/10] perf/uprobe: Remove spurious whitespace Peter Zijlstra
2024-07-08 9:12 ` [PATCH 03/10] rbtree: Provide rb_find_rcu() / rb_find_add_rcu() Peter Zijlstra
2024-07-10 1:29 ` Masami Hiramatsu
2024-07-10 7:55 ` Peter Zijlstra
2024-07-10 14:05 ` Masami Hiramatsu
2024-07-08 9:12 ` [PATCH 04/10] perf/uprobe: RCU-ify find_uprobe() Peter Zijlstra
2024-07-08 16:35 ` Oleg Nesterov
2024-07-08 18:08 ` Peter Zijlstra
2024-07-09 14:32 ` Oleg Nesterov
2024-07-09 15:02 ` Peter Zijlstra
2024-07-09 15:23 ` Oleg Nesterov
2024-07-08 9:12 ` [PATCH 05/10] perf/uprobe: SRCU-ify uprobe->consumer list Peter Zijlstra
2024-07-09 12:05 ` Peter Zijlstra
2024-07-09 13:33 ` Oleg Nesterov
2024-07-09 14:32 ` Peter Zijlstra
2024-07-09 15:05 ` Oleg Nesterov
2024-07-09 15:19 ` Peter Zijlstra
2024-07-09 15:34 ` Oleg Nesterov
2024-07-09 14:32 ` Peter Zijlstra
2024-07-08 9:12 ` [PATCH 06/10] perf/uprobe: Split uprobe_unregister() Peter Zijlstra
2024-07-10 1:40 ` Masami Hiramatsu
2024-07-08 9:12 ` [PATCH 07/10] perf/uprobe: Convert (some) uprobe->refcount to SRCU Peter Zijlstra
2024-07-08 9:12 ` [PATCH 08/10] srcu: Add __srcu_clone_read_lock() Peter Zijlstra
2024-07-10 5:48 ` Boqun Feng
2024-07-10 10:02 ` Peter Zijlstra
2024-07-10 12:05 ` Peter Zijlstra
2024-07-08 9:12 ` [PATCH 09/10] perf/uprobe: Convert single-step and uretprobe to SRCU Peter Zijlstra
2024-07-08 9:12 ` [PATCH 10/10] perf/uprobe: Add uretprobe timer Peter Zijlstra
2024-07-08 22:56 ` [PATCH 00/10] perf/uprobe: Optimize uprobes Masami Hiramatsu
2024-07-09 0:25 ` Andrii Nakryiko
2024-07-09 9:01 ` Peter Zijlstra
2024-07-09 14:11 ` Paul E. McKenney
2024-07-09 14:29 ` Peter Zijlstra
2024-07-09 14:36 ` Paul E. McKenney
2024-07-09 15:31 ` Peter Zijlstra
2024-07-09 15:56 ` Paul E. McKenney
2024-07-09 16:10 ` Matthew Wilcox
2024-07-09 16:30 ` Matthew Wilcox
2024-07-09 16:57 ` Paul E. McKenney
2024-07-10 9:16 ` Peter Zijlstra
2024-07-10 9:40 ` Peter Zijlstra
2024-07-22 19:09 ` Suren Baghdasaryan
2024-07-27 0:20 ` Andrii Nakryiko
2024-07-27 1:29 ` Suren Baghdasaryan
2024-07-27 3:45 ` Matthew Wilcox
2024-07-30 3:18 ` Andrii Nakryiko
2024-07-30 13:10 ` Peter Zijlstra
2024-07-30 18:10 ` Suren Baghdasaryan
2024-08-03 5:47 ` Andrii Nakryiko
2024-08-03 8:53 ` Peter Zijlstra [this message]
2024-08-04 23:22 ` Andrii Nakryiko
2024-08-06 4:08 ` Andrii Nakryiko
2024-08-06 14:50 ` Suren Baghdasaryan
2024-08-06 17:40 ` Andrii Nakryiko
2024-08-06 17:44 ` Suren Baghdasaryan
2024-08-07 1:36 ` Suren Baghdasaryan
2024-08-07 5:13 ` Suren Baghdasaryan
2024-08-07 17:49 ` Andrii Nakryiko
2024-08-07 18:04 ` Suren Baghdasaryan
2024-08-07 18:30 ` Andrii Nakryiko
2024-08-07 18:33 ` Suren Baghdasaryan
2024-08-08 0:47 ` Andrii Nakryiko
2024-07-30 13:46 ` Peter Zijlstra
2024-07-30 18:16 ` Suren Baghdasaryan
2024-07-09 16:42 ` Andrii Nakryiko
2024-07-09 9:03 ` Peter Zijlstra
2024-07-09 10:01 ` Jiri Olsa
2024-07-09 10:16 ` Peter Zijlstra
2024-07-09 22:10 ` Masami Hiramatsu
2024-07-10 10:10 ` Peter Zijlstra
2024-07-10 14:56 ` Masami Hiramatsu
2024-07-10 18:40 ` Andrii Nakryiko
2024-07-11 8:51 ` Peter Zijlstra
2024-07-11 15:17 ` Masami Hiramatsu
2024-07-11 15:22 ` Peter Zijlstra
2024-07-11 17:47 ` Steven Rostedt
2024-07-11 23:59 ` Masami Hiramatsu
2024-07-10 0:55 ` Masami Hiramatsu
2024-07-09 21:47 ` Andrii Nakryiko
2024-07-10 10:12 ` Peter Zijlstra
2024-07-10 12:34 ` Peter Zijlstra
2024-07-09 7:11 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240803085312.GP39708@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@kernel.org \
--cc=rostedt@goodmis.org \
--cc=surenb@google.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.