BPF List
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	mingo@kernel.org, andrii@kernel.org,
	linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	oleg@redhat.com, jolsa@kernel.org, clm@meta.com,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH 00/10] perf/uprobe: Optimize uprobes
Date: Sat, 3 Aug 2024 10:53:12 +0200	[thread overview]
Message-ID: <20240803085312.GP39708@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEf4BzavWOgCLQoNdmPyyqHcm7gY5USKU5f1JWfyaCbuc_zVAA@mail.gmail.com>

On Fri, Aug 02, 2024 at 10:47:15PM -0700, Andrii Nakryiko wrote:

> Is there any reason why the approach below won't work?

> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8be9e34e786a..e21b68a39f13 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2251,6 +2251,52 @@ static struct uprobe
> *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swb
>         struct uprobe *uprobe = NULL;
>         struct vm_area_struct *vma;
> 
> +#ifdef CONFIG_PER_VMA_LOCK
> +       vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE, vm_flags;
> +       struct file *vm_file;
> +       struct inode *vm_inode;
> +       unsigned long vm_pgoff, vm_start, vm_end;
> +       int vm_lock_seq;
> +       loff_t offset;
> +
> +       rcu_read_lock();
> +
> +       vma = vma_lookup(mm, bp_vaddr);
> +       if (!vma)
> +               goto retry_with_lock;
> +
> +       vm_lock_seq = READ_ONCE(vma->vm_lock_seq);

So vma->vm_lock_seq is only updated on vma_start_write()

> +
> +       vm_file = READ_ONCE(vma->vm_file);
> +       vm_flags = READ_ONCE(vma->vm_flags);
> +       if (!vm_file || (vm_flags & flags) != VM_MAYEXEC)
> +               goto retry_with_lock;
> +
> +       vm_inode = READ_ONCE(vm_file->f_inode);
> +       vm_pgoff = READ_ONCE(vma->vm_pgoff);
> +       vm_start = READ_ONCE(vma->vm_start);
> +       vm_end = READ_ONCE(vma->vm_end);

None of those are written with WRITE_ONCE(), so this buys you nothing.
Compiler could be updating them one byte at a time while you load some
franken-update.

Also, if you're in the middle of split_vma() you might not get a
consistent set.

> +       if (bp_vaddr < vm_start || bp_vaddr >= vm_end)
> +               goto retry_with_lock;
> +
> +       offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> +       uprobe = find_uprobe_rcu(vm_inode, offset);
> +       if (!uprobe)
> +               goto retry_with_lock;
> +
> +       /* now double check that nothing about VMA changed */
> +       if (vm_lock_seq != READ_ONCE(vma->vm_lock_seq))
> +               goto retry_with_lock;

Since vma->vma_lock_seq is only ever updated at vma_start_write() you're
checking you're in or after the same modification cycle.

The point of sequence locks is to check you *IN* a modification cycle
and retry if you are. You're now explicitly continuing if you're in a
modification.

You really need:

   seq++;
   wmb();

   ... do modification

   wmb();
   seq++;

vs

  do {
	  s = READ_ONCE(seq) & ~1;
	  rmb();

	  ... read stuff

  } while (rmb(), seq != s);
  

The thing to note is that seq will be odd while inside a modification
and even outside, further if the pre and post seq are both even but not
identical, you've crossed a modification and also need to retry.

> +
> +       /* happy case, we speculated successfully */
> +       rcu_read_unlock();
> +       return uprobe;
> +
> +retry_with_lock:
> +       rcu_read_unlock();
> +       uprobe = NULL;
> +#endif
> +
>         mmap_read_lock(mm);
>         vma = vma_lookup(mm, bp_vaddr);
>         if (vma) {
> diff --git a/kernel/fork.c b/kernel/fork.c
> index cc760491f201..211a84ee92b4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -3160,7 +3160,7 @@ void __init proc_caches_init(void)
>                         NULL);
>         files_cachep = kmem_cache_create("files_cache",
>                         sizeof(struct files_struct), 0,
> -                       SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
> + SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT|SLAB_TYPESAFE_BY_RCU,
>                         NULL);
>         fs_cachep = kmem_cache_create("fs_cache",
>                         sizeof(struct fs_struct), 0,

  reply	other threads:[~2024-08-03  8:53 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240708091241.544262971@infradead.org>
     [not found] ` <20240709075651.122204f1358f9f78d1e64b62@kernel.org>
2024-07-09  0:25   ` [PATCH 00/10] perf/uprobe: Optimize uprobes Andrii Nakryiko
2024-07-09  9:01     ` Peter Zijlstra
2024-07-09 14:11       ` Paul E. McKenney
2024-07-09 14:29         ` Peter Zijlstra
2024-07-09 14:36           ` Paul E. McKenney
2024-07-09 15:31             ` Peter Zijlstra
2024-07-09 15:56               ` Paul E. McKenney
2024-07-09 16:10           ` Matthew Wilcox
2024-07-09 16:30             ` Matthew Wilcox
2024-07-09 16:57               ` Paul E. McKenney
2024-07-10  9:16             ` Peter Zijlstra
2024-07-10  9:40               ` Peter Zijlstra
2024-07-22 19:09                 ` Suren Baghdasaryan
2024-07-27  0:20                   ` Andrii Nakryiko
2024-07-27  1:29                     ` Suren Baghdasaryan
2024-07-27  3:45                       ` Matthew Wilcox
2024-07-30  3:18                         ` Andrii Nakryiko
2024-07-30 13:10                         ` Peter Zijlstra
2024-07-30 18:10                           ` Suren Baghdasaryan
2024-08-03  5:47                             ` Andrii Nakryiko
2024-08-03  8:53                               ` Peter Zijlstra [this message]
2024-08-04 23:22                                 ` Andrii Nakryiko
2024-08-06  4:08                                   ` Andrii Nakryiko
2024-08-06 14:50                                     ` Suren Baghdasaryan
2024-08-06 17:40                                       ` Andrii Nakryiko
2024-08-06 17:44                                         ` Suren Baghdasaryan
2024-08-07  1:36                                     ` Suren Baghdasaryan
2024-08-07  5:13                                       ` Suren Baghdasaryan
2024-08-07 17:49                                         ` Andrii Nakryiko
2024-08-07 18:04                                           ` Suren Baghdasaryan
2024-08-07 18:30                                             ` Andrii Nakryiko
2024-08-07 18:33                                             ` Suren Baghdasaryan
2024-08-08  0:47                                               ` Andrii Nakryiko
2024-07-30 13:46                   ` Peter Zijlstra
2024-07-30 18:16                     ` Suren Baghdasaryan
2024-07-09 16:42         ` Andrii Nakryiko
2024-07-09  9:03     ` Peter Zijlstra
2024-07-09 10:01       ` Jiri Olsa
2024-07-09 10:16         ` Peter Zijlstra
2024-07-09 22:10           ` Masami Hiramatsu
2024-07-10 10:10             ` Peter Zijlstra
2024-07-10 14:56               ` Masami Hiramatsu
2024-07-10 18:40                 ` Andrii Nakryiko
2024-07-11  8:51                   ` Peter Zijlstra
2024-07-11 15:17                     ` Masami Hiramatsu
2024-07-11 15:22                       ` Peter Zijlstra
2024-07-11 17:47                         ` Steven Rostedt
2024-07-11 23:59                           ` Masami Hiramatsu
2024-07-10  0:55       ` Masami Hiramatsu
2024-07-09 21:47     ` Andrii Nakryiko
2024-07-10 10:12       ` Peter Zijlstra
2024-07-10 12:34         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240803085312.GP39708@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox