All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Hwang <leon.hwang@linux.dev>
To: Yonghong Song <yonghong.song@linux.dev>, bpf@vger.kernel.org
Cc: ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, eddyz87@gmail.com, song@kernel.org,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, memxor@gmail.com,
	ameryhung@gmail.com, linux-kernel@vger.kernel.org,
	kernel-patches-bot@fb.com
Subject: Re: [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing the special fields when update [lru_,]percpu_hash maps
Date: Thu, 13 Nov 2025 21:16:32 +0800	[thread overview]
Message-ID: <17a9444f-1151-44b4-b2ec-5cafd12bf2bd@linux.dev> (raw)
In-Reply-To: <6a2a6f41-f24a-4e87-94d0-8cb147725279@linux.dev>



On 2025/11/12 05:58, Yonghong Song wrote:
>
>
> On 11/11/25 5:52 AM, Leon Hwang wrote:
>>
>> On 2025/11/11 21:38, Leon Hwang wrote:
>>>
>>> On 2025/11/7 10:00, Yonghong Song wrote:
>>>>
>>>> On 11/5/25 7:14 AM, Leon Hwang wrote:
[...]

>>>>> +
>>>>> +static int __insert_in_list(struct bpf_list_head *head, struct
>>>>> bpf_spin_lock *lock,
>>>>> +                struct node_data __kptr **node)
>>>>> +{
>>>>> +    struct node_data *node_new, *node_ref, *node_old;
>>>>> +
>>>>> +    node_new = bpf_obj_new(typeof(*node_new));
>>>>> +    if (!node_new)
>>>>> +        return -1;
>>>>> +
>>>>> +    node_ref = bpf_refcount_acquire(node_new);
>>>>> +    node_old = bpf_kptr_xchg(node, node_new);
>>>> Change the above to node_old = bpf_kptr_xchg(node, node_node_ref);
>>>> might
>>>> be better for reasoning although node_ref/node_new are the same.
>>>>
>>> Nope — node_ref and node_new are different for the verifier.
>> They are the same in theory.
>>
>> The verifier failure was likely caused by something else, but I'm not
>> sure of the exact reason.
>
> I did some analysis and your code works as expected:
>
>     node_ref = bpf_refcount_acquire(node_new);
>     node_old = bpf_kptr_xchg(node, node_new);
>     if (node_old) {
>             bpf_obj_drop(node_old);
>             bpf_obj_drop(node_ref);
>             return -2;
>     }
>
>     bpf_spin_lock(lock);
>     bpf_list_push_front(head, &node_ref->l);
>     ref = (u64)(void *) &node_ref->ref;
>     bpf_spin_unlock(lock);
>
> In the above, after the following insn:
>     node_old = bpf_kptr_xchg(node, node_new);
> the second argument 'node_new' will become a scalar since it
> may be changed by another bpf program accessing the same map.
>
> So your code is okay as node_ref still valid ptr_node_data
> and can be used in following codes.
>
>
> My suggestion to replace
>     node_old = bpf_kptr_xchg(node, node_new);
> with
>     node_old = bpf_kptr_xchg(node, node_ref);
> will not work since node_ref will be a scalar
> so subsequent bpf_obj_drop(node_ref) and bpf_list_push_front(...)
> won't work.
>
> In summary, your change look okay to me. Sorry for noise.
>

Hi Yonghong,

Thanks for your detailed analysis.

Indeed, in the case of
node_old = bpf_kptr_xchg(node, node_ref);,
the verifier marks node_ref as SCALAR_VALUE.

Here's the relevant part in check_helper_call():

static int check_helper_call(struct bpf_verifier_env *env, struct
bpf_insn *insn,
                             int *insn_idx_p)
{
        //...
        if (meta.release_regno) {
                err = -EINVAL;
                if (arg_type_is_dynptr(fn->arg_type[meta.release_regno -
BPF_REG_1])) {
                        err = unmark_stack_slots_dynptr(env,
&regs[meta.release_regno]);
                } else if (func_id == BPF_FUNC_kptr_xchg &&
meta.ref_obj_id) {
                        u32 ref_obj_id = meta.ref_obj_id;
                        bool in_rcu = in_rcu_cs(env);
                        struct bpf_func_state *state;
                        struct bpf_reg_state *reg;

                        err = release_reference_nomark(env->cur_state,
ref_obj_id);
                        if (!err) {

bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
                                        if (reg->ref_obj_id == ref_obj_id) {
                                                if (in_rcu && (reg->type
& MEM_ALLOC) && (reg->type & MEM_PERCPU)) {
                                                        reg->ref_obj_id = 0;
                                                        reg->type &=
~MEM_ALLOC;
                                                        reg->type |=
MEM_RCU;
                                                } else {

mark_reg_invalid(env, reg);
                                                }
                                        }
                                }));
                        }
                } else if (meta.ref_obj_id) {
                        //...
        }
        //...
}

This logic matches your explanation — when the argument passed to
bpf_kptr_xchg() holds a reference (meta.ref_obj_id), that reference is
not a KPTR_PERCPU inside an RCU critical section.
As a result, node_ref is invalidated and becomes a scalar after the call.

So yes, your reasoning is correct, and this behavior explains why using
node_ref as the second argument doesn't work.

Thanks,
Leon

[...]

  reply	other threads:[~2025-11-13 13:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-05 15:14 [PATCH bpf-next v6 0/2] bpf: Free special fields when update [lru_,]percpu_hash maps Leon Hwang
2025-11-05 15:14 ` [PATCH bpf-next v6 1/2] " Leon Hwang
2025-11-07  1:56   ` Yonghong Song
2025-11-05 15:14 ` [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing the " Leon Hwang
2025-11-07  2:00   ` Yonghong Song
2025-11-11 13:38     ` Leon Hwang
2025-11-11 13:52       ` Leon Hwang
2025-11-11 21:58         ` Yonghong Song
2025-11-13 13:16           ` Leon Hwang [this message]
2025-11-13 17:30 ` [PATCH bpf-next v6 0/2] bpf: Free " patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=17a9444f-1151-44b4-b2ec-5cafd12bf2bd@linux.dev \
    --to=leon.hwang@linux.dev \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kernel-patches-bot@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.