From: Peter Zijlstra <peterz@infradead.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
Matthew Wilcox <willy@infradead.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
mingo@kernel.org, andrii@kernel.org,
linux-kernel@vger.kernel.org, rostedt@goodmis.org,
oleg@redhat.com, jolsa@kernel.org, clm@meta.com,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH 00/10] perf/uprobe: Optimize uprobes
Date: Sat, 3 Aug 2024 10:53:12 +0200 [thread overview]
Message-ID: <20240803085312.GP39708@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEf4BzavWOgCLQoNdmPyyqHcm7gY5USKU5f1JWfyaCbuc_zVAA@mail.gmail.com>
On Fri, Aug 02, 2024 at 10:47:15PM -0700, Andrii Nakryiko wrote:
> Is there any reason why the approach below won't work?
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8be9e34e786a..e21b68a39f13 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2251,6 +2251,52 @@ static struct uprobe
> *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swb
> struct uprobe *uprobe = NULL;
> struct vm_area_struct *vma;
>
> +#ifdef CONFIG_PER_VMA_LOCK
> + vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE, vm_flags;
> + struct file *vm_file;
> + struct inode *vm_inode;
> + unsigned long vm_pgoff, vm_start, vm_end;
> + int vm_lock_seq;
> + loff_t offset;
> +
> + rcu_read_lock();
> +
> + vma = vma_lookup(mm, bp_vaddr);
> + if (!vma)
> + goto retry_with_lock;
> +
> + vm_lock_seq = READ_ONCE(vma->vm_lock_seq);
So vma->vm_lock_seq is only updated on vma_start_write()
> +
> + vm_file = READ_ONCE(vma->vm_file);
> + vm_flags = READ_ONCE(vma->vm_flags);
> + if (!vm_file || (vm_flags & flags) != VM_MAYEXEC)
> + goto retry_with_lock;
> +
> + vm_inode = READ_ONCE(vm_file->f_inode);
> + vm_pgoff = READ_ONCE(vma->vm_pgoff);
> + vm_start = READ_ONCE(vma->vm_start);
> + vm_end = READ_ONCE(vma->vm_end);
None of those are written with WRITE_ONCE(), so this buys you nothing.
Compiler could be updating them one byte at a time while you load some
franken-update.
Also, if you're in the middle of split_vma() you might not get a
consistent set.
> + if (bp_vaddr < vm_start || bp_vaddr >= vm_end)
> + goto retry_with_lock;
> +
> + offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> + uprobe = find_uprobe_rcu(vm_inode, offset);
> + if (!uprobe)
> + goto retry_with_lock;
> +
> + /* now double check that nothing about VMA changed */
> + if (vm_lock_seq != READ_ONCE(vma->vm_lock_seq))
> + goto retry_with_lock;
Since vma->vma_lock_seq is only ever updated at vma_start_write() you're
checking you're in or after the same modification cycle.
The point of sequence locks is to check you *IN* a modification cycle
and retry if you are. You're now explicitly continuing if you're in a
modification.
You really need:
seq++;
wmb();
... do modification
wmb();
seq++;
vs
do {
s = READ_ONCE(seq) & ~1;
rmb();
... read stuff
} while (rmb(), seq != s);
The thing to note is that seq will be odd while inside a modification
and even outside, further if the pre and post seq are both even but not
identical, you've crossed a modification and also need to retry.
> +
> + /* happy case, we speculated successfully */
> + rcu_read_unlock();
> + return uprobe;
> +
> +retry_with_lock:
> + rcu_read_unlock();
> + uprobe = NULL;
> +#endif
> +
> mmap_read_lock(mm);
> vma = vma_lookup(mm, bp_vaddr);
> if (vma) {
> diff --git a/kernel/fork.c b/kernel/fork.c
> index cc760491f201..211a84ee92b4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -3160,7 +3160,7 @@ void __init proc_caches_init(void)
> NULL);
> files_cachep = kmem_cache_create("files_cache",
> sizeof(struct files_struct), 0,
> - SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
> + SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT|SLAB_TYPESAFE_BY_RCU,
> NULL);
> fs_cachep = kmem_cache_create("fs_cache",
> sizeof(struct fs_struct), 0,
next prev parent reply other threads:[~2024-08-03 8:53 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20240708091241.544262971@infradead.org>
[not found] ` <20240709075651.122204f1358f9f78d1e64b62@kernel.org>
2024-07-09 0:25 ` [PATCH 00/10] perf/uprobe: Optimize uprobes Andrii Nakryiko
2024-07-09 9:01 ` Peter Zijlstra
2024-07-09 14:11 ` Paul E. McKenney
2024-07-09 14:29 ` Peter Zijlstra
2024-07-09 14:36 ` Paul E. McKenney
2024-07-09 15:31 ` Peter Zijlstra
2024-07-09 15:56 ` Paul E. McKenney
2024-07-09 16:10 ` Matthew Wilcox
2024-07-09 16:30 ` Matthew Wilcox
2024-07-09 16:57 ` Paul E. McKenney
2024-07-10 9:16 ` Peter Zijlstra
2024-07-10 9:40 ` Peter Zijlstra
2024-07-22 19:09 ` Suren Baghdasaryan
2024-07-27 0:20 ` Andrii Nakryiko
2024-07-27 1:29 ` Suren Baghdasaryan
2024-07-27 3:45 ` Matthew Wilcox
2024-07-30 3:18 ` Andrii Nakryiko
2024-07-30 13:10 ` Peter Zijlstra
2024-07-30 18:10 ` Suren Baghdasaryan
2024-08-03 5:47 ` Andrii Nakryiko
2024-08-03 8:53 ` Peter Zijlstra [this message]
2024-08-04 23:22 ` Andrii Nakryiko
2024-08-06 4:08 ` Andrii Nakryiko
2024-08-06 14:50 ` Suren Baghdasaryan
2024-08-06 17:40 ` Andrii Nakryiko
2024-08-06 17:44 ` Suren Baghdasaryan
2024-08-07 1:36 ` Suren Baghdasaryan
2024-08-07 5:13 ` Suren Baghdasaryan
2024-08-07 17:49 ` Andrii Nakryiko
2024-08-07 18:04 ` Suren Baghdasaryan
2024-08-07 18:30 ` Andrii Nakryiko
2024-08-07 18:33 ` Suren Baghdasaryan
2024-08-08 0:47 ` Andrii Nakryiko
2024-07-30 13:46 ` Peter Zijlstra
2024-07-30 18:16 ` Suren Baghdasaryan
2024-07-09 16:42 ` Andrii Nakryiko
2024-07-09 9:03 ` Peter Zijlstra
2024-07-09 10:01 ` Jiri Olsa
2024-07-09 10:16 ` Peter Zijlstra
2024-07-09 22:10 ` Masami Hiramatsu
2024-07-10 10:10 ` Peter Zijlstra
2024-07-10 14:56 ` Masami Hiramatsu
2024-07-10 18:40 ` Andrii Nakryiko
2024-07-11 8:51 ` Peter Zijlstra
2024-07-11 15:17 ` Masami Hiramatsu
2024-07-11 15:22 ` Peter Zijlstra
2024-07-11 17:47 ` Steven Rostedt
2024-07-11 23:59 ` Masami Hiramatsu
2024-07-10 0:55 ` Masami Hiramatsu
2024-07-09 21:47 ` Andrii Nakryiko
2024-07-10 10:12 ` Peter Zijlstra
2024-07-10 12:34 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240803085312.GP39708@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@kernel.org \
--cc=rostedt@goodmis.org \
--cc=surenb@google.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox