From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Puranjay Mohan <puranjay@kernel.org>, bpf@vger.kernel.org
Cc: Puranjay Mohan <puranjay@kernel.org>,
	Puranjay Mohan <puranjay12@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	kernel-team@meta.com
Subject: Re: [PATCH bpf 3/3] bpf: return VMA snapshot from task_vma iterator
Date: Thu, 05 Mar 2026 18:53:57 +0000
Message-ID: <87h5quxg3e.fsf@gmail.com>
In-Reply-To: <20260304142026.1443666-4-puranjay@kernel.org>

Puranjay Mohan <puranjay@kernel.org> writes:

> Holding the per-VMA lock across the BPF program's loop body creates a
> lock ordering problem when helpers acquire locks with a dependency on
> mmap_lock (e.g., bpf_dynptr_read -> __kernel_read -> i_rwsem):
>
>   vm_lock -> i_rwsem -> mmap_lock -> vm_lock
>
> Snapshot VMA fields into an embedded struct vm_area_struct under the
> per-VMA lock in _next(), then drop the lock before returning. The BPF
> program accesses only the snapshot, so no lock is held during execution.
> For vm_file, get_file() takes a reference under the lock, released via
> fput() on the next iteration or in _destroy(). The snapshot's vm_file is
> set to NULL after fput() so _destroy() does not double-release the
> reference when _next() has already dropped it. For vm_mm, the snapshot
> uses the mm pointer held via mmget().
>
> Fixes: 4ac454682158 ("bpf: Introduce task_vma open-coded iterator kfuncs")
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> ---
>  kernel/bpf/task_iter.c | 31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index ff29d4da0267..4bf93cff69c7 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -798,7 +798,7 @@ const struct bpf_func_proto bpf_find_vma_proto = {
>  struct bpf_iter_task_vma_kern_data {
>  	struct task_struct *task;
>  	struct mm_struct *mm;
> -	struct vm_area_struct *locked_vma;
> +	struct vm_area_struct snapshot;
>  	u64 last_addr;
>  };
>  
> @@ -908,8 +908,8 @@ __bpf_kfunc int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it,
>  		goto err_cleanup_iter;
>  	}
>  
> -	kit->data->locked_vma = NULL;
>  	kit->data->last_addr = addr;
> +	memset(&kit->data->snapshot, 0, sizeof(kit->data->snapshot));
>  	return 0;
>  
>  err_cleanup_iter:
> @@ -923,15 +923,19 @@ __bpf_kfunc int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it,
>  __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_vma *it)
>  {
>  	struct bpf_iter_task_vma_kern *kit = (void *)it;
> -	struct vm_area_struct *vma;
> +	struct vm_area_struct *snap, *vma;
>  	struct vma_iterator vmi;
>  	unsigned long next_addr, next_end;
>  
>  	if (!kit->data) /* bpf_iter_task_vma_new failed */
>  		return NULL;
>  
> -	if (kit->data->locked_vma)
> -		vma_end_read(kit->data->locked_vma);
> +	snap = &kit->data->snapshot;
> +
> +	if (snap->vm_file) {
> +		fput(snap->vm_file);
> +		snap->vm_file = NULL;
> +	}
>  
>  retry:
>  	rcu_read_lock();
> @@ -939,7 +943,6 @@ __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_v
>  	vma = vma_next(&vmi);
>  	if (!vma) {
>  		rcu_read_unlock();
> -		kit->data->locked_vma = NULL;
>  		return NULL;
>  	}
>  	next_addr = vma->vm_start;
> @@ -961,9 +964,17 @@ __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_v
>  		goto retry;
>  	}
>  
> -	kit->data->locked_vma = vma;
> +	snap->vm_start = vma->vm_start;
> +	snap->vm_end = vma->vm_end;
> +	snap->vm_mm = kit->data->mm;
> +	snap->vm_page_prot = vma->vm_page_prot;
> +	snap->flags = vma->flags;
> +	snap->vm_pgoff = vma->vm_pgoff;
> +	snap->vm_file = vma->vm_file ? get_file(vma->vm_file) : NULL;

Are some fields intentionally omitted from the snapshot? How do you
decide which fields are needed and which are not? If the intention is
to copy everything and bump the refcount for the file, why not
memcpy() followed by get_file(vma->vm_file)?

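
For illustration, the memcpy() + get_file() variant could look roughly
like the userspace sketch below. It is not kernel code: struct vma and
struct file are simplified stand-ins, get_file()/fput() here are toy
refcount helpers that only mimic the kernel names, and snapshot_take()
and snapshot_release() are hypothetical names. One caveat the sketch
makes visible: memcpy() copies every pointer by value without pinning
it, so only vm_file would be safe to dereference once the lock is
dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct file (assumption). */
struct file {
	int refcount;
};

/* Simplified stand-in for struct vm_area_struct (assumption). */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
	unsigned long flags;
	struct file *vm_file;
};

/* Toy refcount helpers that only mimic the kernel names. */
static struct file *get_file(struct file *f)
{
	f->refcount++;
	return f;
}

static void fput(struct file *f)
{
	f->refcount--;
}

/*
 * Drop the file reference pinned by the previous snapshot, if any.
 * Clearing vm_file makes a second call a no-op, so a later destroy
 * step cannot release the same reference twice.
 */
static void snapshot_release(struct vma *snap)
{
	if (snap->vm_file) {
		fput(snap->vm_file);
		snap->vm_file = NULL;
	}
}

/*
 * Copy all fields by value, then pin the one pointer handed out.
 * In the real iterator this would run while the per-VMA lock is held.
 */
static void snapshot_take(struct vma *snap, const struct vma *vma)
{
	snapshot_release(snap);
	memcpy(snap, vma, sizeof(*snap));
	if (snap->vm_file)
		get_file(snap->vm_file);
}
```

With this shape, the next step would call snapshot_take() under the
lock and the destroy step would call snapshot_release(), provided the
snapshot is zero-initialized when the iterator is created.
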
> +
>  	kit->data->last_addr = vma->vm_end;
> -	return vma;
> +	vma_end_read(vma);
> +	return snap;
>  }
>  
>  __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it)
> @@ -971,8 +982,8 @@ __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it)
>  	struct bpf_iter_task_vma_kern *kit = (void *)it;
>  
>  	if (kit->data) {
> -		if (kit->data->locked_vma)
> -			vma_end_read(kit->data->locked_vma);
> +		if (kit->data->snapshot.vm_file)
> +			fput(kit->data->snapshot.vm_file);
>  		bpf_iter_mmput(kit->data->mm);
>  		put_task_struct(kit->data->task);
>  		bpf_mem_free(&bpf_global_ma, kit->data);
> -- 
> 2.47.3

