From: Leon Hwang <leon.hwang@linux.dev>
To: Amery Hung <ameryhung@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, martin.lau@linux.dev, eddyz87@gmail.com,
song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, memxor@gmail.com,
linux-kernel@vger.kernel.org, kernel-patches-bot@fb.com
Subject: Re: [PATCH bpf v3 4/4] selftests/bpf: Add tests to verify freeing the special fields when update hash and local storage maps
Date: Thu, 30 Oct 2025 22:29:56 +0800 [thread overview]
Message-ID: <ca00af4c-39bd-455c-889e-044fcf9cf09f@linux.dev> (raw)
In-Reply-To: <a798d47e-4ab8-423a-b8ef-e42ff9760324@linux.dev>
On 2025/10/29 22:58, Leon Hwang wrote:
>
>
> On 2025/10/28 00:34, Amery Hung wrote:
>> On Sun, Oct 26, 2025 at 8:42 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>>
>>> Add tests to verify that updating hash and local storage maps decrements
>>> refcount when BPF_KPTR_REF objects are involved.
>>>
>>> The tests perform the following steps:
>>>
>>> 1. Call update_elem() to insert an initial value.
>>> 2. Use bpf_refcount_acquire() to increment the refcount.
>>> 3. Store the node pointer in the map value.
>>> 4. Add the node to a linked list.
>>> 5. Probe-read the refcount and verify it is *2*.
>>> 6. Call update_elem() again to trigger refcount decrement.
>>> 7. Probe-read the refcount and verify it is *1*.
>>>
>>> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
>>> ---
>>> .../bpf/prog_tests/refcounted_kptr.c | 178 +++++++++++++++++-
>>> .../selftests/bpf/progs/refcounted_kptr.c | 160 ++++++++++++++++
>>> 2 files changed, 337 insertions(+), 1 deletion(-)
>>>
>
> [...]
>
>>> @@ -44,3 +44,179 @@ void test_refcounted_kptr_wrong_owner(void)
>>> ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a2 retval");
>>> refcounted_kptr__destroy(skel);
>>> }
>>> +
>>> +static void test_refcnt_leak(void *values, size_t values_sz, u64 flags, struct bpf_map *map,
>>> + struct bpf_program *prog_leak, struct bpf_program *prog_check)
>>> +{
>
> [...]
>
>>> +}
>>
>> Just use syscall BPF programs across different subtests, and you can
>> share this test_refcnt_leak() across subtests.
>>
>> It also saves you some code setting up bpf_test_run_opts. You can just
>> call bpf_prog_test_run_opts(prog_fd, NULL) as you don't pass any input
>> from ctx.
>>
>>> +
>>> +static void test_percpu_hash_refcount_leak(void)
>>> +{
>
> [...]
>
>>> +out:
>>> + close(cgroup);
>>> + refcounted_kptr__destroy(skel);
>>> + if (client_fd >= 0)
>>> + close(client_fd);
>>> + if (server_fd >= 0)
>>> + close(server_fd);
>>> +}
>>
>> Then, you won't need to set up server, connection.... just to
>> read/write cgroup local storage. Just call test_refcnt_leak() that
>> runs the two BPF syscall programs for cgroup local storage.
>>
>
> [...]
>
>>
>>
>> And in syscall BPF program, you can simply get the cgroup through the
>> current task
>>
>> SEC("syscall")
>> int syscall_prog(void *ctx)
>> {
>> struct task_struct *task = bpf_get_current_task_btf();
>>
>> v = bpf_cgrp_storage_get(&cgrp_strg, task->cgroups->dfl_cgrp, 0,
>> BPF_LOCAL_STORAGE_GET_F_CREATE);
>> ...
>> }
>>
>
> Hi Amery,
>
> Thanks for the suggestion.
>
> I tried your approach, but the verifier rejected it with the following
> error:
>
> 0: R1=ctx() R10=fp0
> ; task = bpf_get_current_task_btf(); @ refcounted_kptr.c:686
> 0: (85) call bpf_get_current_task_btf#158 ; R0=trusted_ptr_task_struct()
> ; v = bpf_cgrp_storage_get(&cgrp_strg, task->cgroups->dfl_cgrp, 0, @
> refcounted_kptr.c:687
> 1: (79) r1 = *(u64 *)(r0 +4856) ; R0=trusted_ptr_task_struct()
> R1=untrusted_ptr_css_set()
> 2: (79) r2 = *(u64 *)(r1 +96) ; R1=untrusted_ptr_css_set()
> R2=untrusted_ptr_cgroup()
> 3: (18) r1 = 0xffffa1b442a4b800 ; R1=map_ptr(map=cgrp_strg,ks=4,vs=16)
> 5: (b7) r3 = 0 ; R3=0
> 6: (b7) r4 = 1 ; R4=1
> 7: (85) call bpf_cgrp_storage_get#210
> R2 type=untrusted_ptr_ expected=ptr_, trusted_ptr_, rcu_ptr_
> processed 7 insns (limit 1000000) max_states_per_insn 0 total_states 0
> peak_states 0 mark_read 0
>
After analyzing the verifier log (with a bit of AI help), it turned out
that 'task->cgroups->dfl_cgrp' wasn't protected by an RCU read lock.
According to verifier.c::btf_ld_kptr_type(), this pointer must be
accessed either within an RCU critical section or in a non-sleepable
program.
Adding RCU protection fixed the issue:
bpf_rcu_read_lock()
v = bpf_cgrp_storage_get(&cgrp_strg, task->cgroups->dfl_cgrp, 0,
BPF_LOCAL_STORAGE_GET_F_CREATE);
bpf_rcu_read_unlock()
With this change, test_refcnt_leak() can now be used across all subtests.
Thanks again for the helpful suggestion.
Thanks,
Leon
next prev parent reply other threads:[~2025-10-30 14:30 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-26 15:39 [PATCH bpf v3 0/4] bpf: Free special fields when update hash and local storage maps Leon Hwang
2025-10-26 15:39 ` [PATCH bpf v3 1/4] bpf: Free special fields when update [lru_,]percpu_hash maps Leon Hwang
2025-10-28 18:03 ` Andrii Nakryiko
2025-10-26 15:39 ` [PATCH bpf v3 2/4] bpf: Free special fields when update hash maps with BPF_F_LOCK Leon Hwang
2025-10-26 15:39 ` [PATCH bpf v3 3/4] bpf: Free special fields when update local storage maps Leon Hwang
2025-10-27 15:44 ` Amery Hung
2025-10-27 16:15 ` Leon Hwang
2025-10-27 17:04 ` Amery Hung
2025-10-28 14:48 ` Leon Hwang
2025-10-28 18:03 ` Andrii Nakryiko
2025-10-26 15:40 ` [PATCH bpf v3 4/4] selftests/bpf: Add tests to verify freeing the special fields when update hash and " Leon Hwang
2025-10-27 16:34 ` Amery Hung
2025-10-29 14:58 ` Leon Hwang
2025-10-30 14:29 ` Leon Hwang [this message]
2025-10-28 18:10 ` [PATCH bpf v3 0/4] bpf: Free " patchwork-bot+netdevbpf
2025-10-28 20:22 ` Andrii Nakryiko
2025-10-29 6:49 ` Leon Hwang
2025-10-29 6:57 ` Menglong Dong
2025-10-29 16:38 ` Alexei Starovoitov
2025-10-30 5:37 ` Leon Hwang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ca00af4c-39bd-455c-889e-044fcf9cf09f@linux.dev \
--to=leon.hwang@linux.dev \
--cc=ameryhung@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kernel-patches-bot@fb.com \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=memxor@gmail.com \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).