BPF List
 help / color / mirror / Atom feed
From: Puranjay Mohan <puranjay@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bot+bpf-ci@kernel.org, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard <eddyz87@gmail.com>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Kernel Team <kernel-team@meta.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	Chris Mason <clm@meta.com>,
	Ihor Solodrai <ihor.solodrai@linux.dev>
Subject: Re: [PATCH bpf-next v2 1/4] bpf: arena: populate vm_area without allocating memory
Date: Sat, 15 Nov 2025 00:52:21 +0000	[thread overview]
Message-ID: <mb61pbjl43ynu.fsf@kernel.org> (raw)
In-Reply-To: <CAADnVQJBkRCNts86ug+B7H3kFiF4LfBGEw4acVoPyLz7350SkQ@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Fri, Nov 14, 2025 at 6:57 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>>
>> bot+bpf-ci@kernel.org writes:
>>
>> >> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
>> >> index 1074ac445..48b8ffba3 100644
>> >> --- a/kernel/bpf/arena.c
>> >> +++ b/kernel/bpf/arena.c
>> >
>> > [ ... ]
>> >
>> >> @@ -92,6 +93,62 @@ static long compute_pgoff(struct bpf_arena *arena, long uaddr)
>> >>      return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT;
>> >>  }
>> >>
>> >> +struct apply_range_data {
>> >> +    struct page **pages;
>> >> +    int i;
>> >> +};
>> >> +
>> >> +static int apply_range_set_cb(pte_t *pte, unsigned long addr, void *data)
>> >> +{
>> >> +    struct apply_range_data *d = data;
>> >> +    struct page *page;
>> >> +
>> >> +    if (!data)
>> >> +            return 0;
>> >> +    /* sanity check */
>> >> +    if (unlikely(!pte_none(ptep_get(pte))))
>> >> +            return -EBUSY;
>> >> +
>> >> +    page = d->pages[d->i++];
>> >> +    /* paranoia, similar to vmap_pages_pte_range() */
>> >> +    if (WARN_ON_ONCE(!pfn_valid(page_to_pfn(page))))
>> >> +            return -EINVAL;
>> >> +
>> >> +    set_pte_at(&init_mm, addr, pte, mk_pte(page, PAGE_KERNEL));
>> >> +    return 0;
>> >> +}
>> >> +
>> >> +static int apply_range_clear_cb(pte_t *pte, unsigned long addr, void *data)
>> >> +{
>> >> +    pte_t old_pte;
>> >> +    struct page *page;
>> >> +
>> >> +    /* sanity check */
>> >> +    old_pte = ptep_get(pte);
>> >> +    if (pte_none(old_pte) || !pte_present(old_pte))
>> >> +            return 0; /* nothing to do */
>> >> +
>> >> +    /* get page and free it */
>> >> +    page = pte_page(old_pte);
>> >> +    if (WARN_ON_ONCE(!page))
>> >> +            return -EINVAL;
>> >> +
>> >> +    pte_clear(&init_mm, addr, pte);
>> >> +
>> >> +    /* ensure no stale TLB entries */
>> >> +    flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>> >> +
>> >> +    __free_page(page);
>> >> +
>> >> +    return 0;
>> >> +}
>> >> +
>> >> +static int populate_pgtable_except_pte(struct bpf_arena *arena)
>> >> +{
>> >> +    return apply_to_page_range(&init_mm, bpf_arena_get_kern_vm_start(arena),
>> >> +                               KERN_VM_SZ - GUARD_SZ, apply_range_set_cb, NULL);
>> >> +}
>> >> +
>> >>  static struct bpf_map *arena_map_alloc(union bpf_attr *attr)
>> >>  {
>> >>      struct vm_struct *kern_vm;
>> >
>> > [ ... ]
>> >
>> >> @@ -144,6 +201,11 @@ static struct bpf_map *arena_map_alloc(union bpf_attr *attr)
>> >>              goto err;
>> >>      }
>> >>      mutex_init(&arena->lock);
>> >> +    err = populate_pgtable_except_pte(arena);
>> >> +    if (err) {
>> >> +            bpf_map_area_free(arena);
>> >> +            goto err;
>> >> +    }
>> >                      ^^^^
>> >
>> > Can this leak the range tree? Looking at the code, range_tree_set() was
>> > called earlier in this function and can allocate range_node structures
>> > via range_tree_clear()->kmalloc_nolock(). If populate_pgtable_except_pte()
>> > fails here, the error path calls bpf_map_area_free(arena) but doesn't call
>> > range_tree_destroy(&arena->rt) first. Compare with arena_map_free() which
>> > always calls range_tree_destroy() before freeing the arena.
>>
>> As the range tree is empty at this point, we can be sure that
>> range_tree_clear() in range_tree_set() will not allocate anything.
>
> range_tree_clear() won't clear anything, but AI pointed in
> the right direction.
> Look at what range_tree_set() does. It will allocate for sure.

If I am understanding it correctly, range_tree_set() allocates memory
using kmalloc_nolock() and it fails when this allocation fails, so in
the error path we don't need to do anything as no allocation was successful.

  reply	other threads:[~2025-11-15  0:52 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-14 11:16 [PATCH bpf-next v2 0/4] Remove KF_SLEEPABLE from arena kfuncs Puranjay Mohan
2025-11-14 11:16 ` [PATCH bpf-next v2 1/4] bpf: arena: populate vm_area without allocating memory Puranjay Mohan
2025-11-14 11:47   ` bot+bpf-ci
2025-11-14 14:57     ` Puranjay Mohan
2025-11-14 21:21       ` Alexei Starovoitov
2025-11-15  0:52         ` Puranjay Mohan [this message]
2025-11-15  1:26           ` Alexei Starovoitov
2025-11-14 11:16 ` [PATCH bpf-next v2 2/4] bpf: arena: use kmalloc_nolock() in place of kvcalloc() Puranjay Mohan
2025-11-14 11:39   ` bot+bpf-ci
2025-11-14 15:13     ` Puranjay Mohan
2025-11-14 21:25   ` Alexei Starovoitov
2025-11-14 11:16 ` [PATCH bpf-next v2 3/4] bpf: arena: make arena kfuncs any context safe Puranjay Mohan
2025-11-14 11:47   ` bot+bpf-ci
2025-11-14 15:28     ` Puranjay Mohan
2025-11-14 21:27   ` Alexei Starovoitov
2025-11-15  0:56     ` Puranjay Mohan
2025-11-15  1:28       ` Alexei Starovoitov
2025-11-15  8:18   ` kernel test robot
2025-11-16  1:15   ` kernel test robot
2025-11-14 11:16 ` [PATCH bpf-next v2 4/4] selftests: bpf: test non-sleepable arena allocations Puranjay Mohan
2025-11-14 22:18   ` Alexei Starovoitov
2025-11-15  0:58     ` Puranjay Mohan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mb61pbjl43ynu.fsf@kernel.org \
    --to=puranjay@kernel.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=kernel-team@meta.com \
    --cc=martin.lau@kernel.org \
    --cc=memxor@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox