bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Hwang <leon.hwang@linux.dev>
To: Yonghong Song <yonghong.song@linux.dev>, bpf@vger.kernel.org
Cc: ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, eddyz87@gmail.com, song@kernel.org,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, memxor@gmail.com,
	ameryhung@gmail.com, linux-kernel@vger.kernel.org,
	kernel-patches-bot@fb.com
Subject: Re: [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing the special fields when update [lru_,]percpu_hash maps
Date: Thu, 13 Nov 2025 21:16:32 +0800	[thread overview]
Message-ID: <17a9444f-1151-44b4-b2ec-5cafd12bf2bd@linux.dev> (raw)
In-Reply-To: <6a2a6f41-f24a-4e87-94d0-8cb147725279@linux.dev>



On 2025/11/12 05:58, Yonghong Song wrote:
>
>
> On 11/11/25 5:52 AM, Leon Hwang wrote:
>>
>> On 2025/11/11 21:38, Leon Hwang wrote:
>>>
>>> On 2025/11/7 10:00, Yonghong Song wrote:
>>>>
>>>> On 11/5/25 7:14 AM, Leon Hwang wrote:
[...]

>>>>> +
>>>>> +static int __insert_in_list(struct bpf_list_head *head, struct
>>>>> bpf_spin_lock *lock,
>>>>> +                struct node_data __kptr **node)
>>>>> +{
>>>>> +    struct node_data *node_new, *node_ref, *node_old;
>>>>> +
>>>>> +    node_new = bpf_obj_new(typeof(*node_new));
>>>>> +    if (!node_new)
>>>>> +        return -1;
>>>>> +
>>>>> +    node_ref = bpf_refcount_acquire(node_new);
>>>>> +    node_old = bpf_kptr_xchg(node, node_new);
>>>> Change the above to node_old = bpf_kptr_xchg(node, node_node_ref);
>>>> might
>>>> be better for reasoning although node_ref/node_new are the same.
>>>>
>>> Nope — node_ref and node_new are different for the verifier.
>> They are the same in theory.
>>
>> The verifier failure was likely caused by something else, but I'm not
>> sure of the exact reason.
>
> I did some analysis and your code works as expected:
>
>     node_ref = bpf_refcount_acquire(node_new);
>     node_old = bpf_kptr_xchg(node, node_new);
>     if (node_old) {
>             bpf_obj_drop(node_old);
>             bpf_obj_drop(node_ref);
>             return -2;
>     }
>
>     bpf_spin_lock(lock);
>     bpf_list_push_front(head, &node_ref->l);
>     ref = (u64)(void *) &node_ref->ref;
>     bpf_spin_unlock(lock);
>
> In the above, after the following insn:
>     node_old = bpf_kptr_xchg(node, node_new);
> the second argument 'node_new' will become a scalar since it
> may be changed by another bpf program accessing the same map.
>
> So your code is okay as node_ref still valid ptr_node_data
> and can be used in following codes.
>
>
> My suggestion to replace
>     node_old = bpf_kptr_xchg(node, node_new);
> with
>     node_old = bpf_kptr_xchg(node, node_ref);
> will not work since node_ref will be a scalar
> so subsequent bpf_obj_drop(node_ref) and bpf_list_push_front(...)
> won't work.
>
> In summary, your change look okay to me. Sorry for noise.
>

Hi Yonghong,

Thanks for your detailed analysis.

Indeed, in the case of
node_old = bpf_kptr_xchg(node, node_ref);,
the verifier marks node_ref as SCALAR_VALUE.

Here's the relevant part in check_helper_call():

static int check_helper_call(struct bpf_verifier_env *env, struct
bpf_insn *insn,
                             int *insn_idx_p)
{
        //...
        if (meta.release_regno) {
                err = -EINVAL;
                if (arg_type_is_dynptr(fn->arg_type[meta.release_regno -
BPF_REG_1])) {
                        err = unmark_stack_slots_dynptr(env,
&regs[meta.release_regno]);
                } else if (func_id == BPF_FUNC_kptr_xchg &&
meta.ref_obj_id) {
                        u32 ref_obj_id = meta.ref_obj_id;
                        bool in_rcu = in_rcu_cs(env);
                        struct bpf_func_state *state;
                        struct bpf_reg_state *reg;

                        err = release_reference_nomark(env->cur_state,
ref_obj_id);
                        if (!err) {

bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
                                        if (reg->ref_obj_id == ref_obj_id) {
                                                if (in_rcu && (reg->type
& MEM_ALLOC) && (reg->type & MEM_PERCPU)) {
                                                        reg->ref_obj_id = 0;
                                                        reg->type &=
~MEM_ALLOC;
                                                        reg->type |=
MEM_RCU;
                                                } else {

mark_reg_invalid(env, reg);
                                                }
                                        }
                                }));
                        }
                } else if (meta.ref_obj_id) {
                        //...
        }
        //...
}

This logic matches your explanation — when the argument passed to
bpf_kptr_xchg() holds a reference (meta.ref_obj_id), that reference is
not a KPTR_PERCPU inside an RCU critical section.
As a result, node_ref is invalidated and becomes a scalar after the call.

So yes, your reasoning is correct, and this behavior explains why using
node_ref as the second argument doesn't work.

Thanks,
Leon

[...]

  reply	other threads:[~2025-11-13 13:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-05 15:14 [PATCH bpf-next v6 0/2] bpf: Free special fields when update [lru_,]percpu_hash maps Leon Hwang
2025-11-05 15:14 ` [PATCH bpf-next v6 1/2] " Leon Hwang
2025-11-07  1:56   ` Yonghong Song
2025-11-05 15:14 ` [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing the " Leon Hwang
2025-11-07  2:00   ` Yonghong Song
2025-11-11 13:38     ` Leon Hwang
2025-11-11 13:52       ` Leon Hwang
2025-11-11 21:58         ` Yonghong Song
2025-11-13 13:16           ` Leon Hwang [this message]
2025-11-13 17:30 ` [PATCH bpf-next v6 0/2] bpf: Free " patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=17a9444f-1151-44b4-b2ec-5cafd12bf2bd@linux.dev \
    --to=leon.hwang@linux.dev \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kernel-patches-bot@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).