All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Xu Kuohai <xukuohai@huaweicloud.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	Kui-Feng Lee <thinker.li@gmail.com>
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release
Date: Mon, 11 Nov 2024 13:30:16 -0800	[thread overview]
Message-ID: <f8f02f5c-acad-4f65-85d3-e20f70fe6b7d@linux.dev> (raw)
In-Reply-To: <e898a2b2-779b-45e6-b2d2-a2a796e322ff@huaweicloud.com>

On 11/9/24 12:40 AM, Xu Kuohai wrote:
> On 11/9/2024 3:39 AM, Martin KaFai Lau wrote:
>> On 11/8/24 12:26 AM, Xu Kuohai wrote:
>>> -static void bpf_testmod_test_2(int a, int b)
>>> +static void bpf_dummy_unreg(void *kdata, struct bpf_link *link)
>>>   {
>>> +    WRITE_ONCE(__bpf_dummy_ops, &__bpf_testmod_ops);
>>>   }
>>
>> [ ... ]
>>
>>> +static int run_struct_ops(const char *val, const struct kernel_param *kp)
>>> +{
>>> +    int ret;
>>> +    unsigned int repeat;
>>> +    struct bpf_testmod_ops *ops;
>>> +
>>> +    ret = kstrtouint(val, 10, &repeat);
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    if (repeat > 10000)
>>> +        return -ERANGE;
>>> +
>>> +    while (repeat-- > 0) {
>>> +        ops = READ_ONCE(__bpf_dummy_ops);
>>
>> I don't think it is the usual bpf_struct_ops implementation which only uses 
>> READ_ONCE and WRITE_ONCE to protect the registered ops. tcp-cc uses a 
>> refcnt+rcu. It seems hid uses synchronize_srcu(). sched_ext seems to also use 
>> kthread_flush_work() to wait for all ops calling finished. Meaning I don't 
>> think the current bpf_struct_ops unreg implementation will run into this issue 
>> for sleepable ops.
>>
> 
> Thanks for the explanation.
> 
> Are you saying that it's not the struct_ops framework's
> responsibility to ensure the struct_ops map is not
> released while it may be still in use? And the "bug" in
> this series should be "fixed" in the test, namely this
> patch?

Yeah, it is what I was trying to say. I don't think there is thing to fix. Think 
about extending a subsystem by a kernel module. The subsystem will also do the 
needed protection itself during the unreg process. There is already a 
bpf_try_module_get() to help the subsystem.

>> The current synchronize_rcu_mult(call_rcu, call_rcu_tasks) is only needed for 
>> the tcp-cc because a tcp-cc's ops (which uses refcnt+rcu) can decrement its 
>> own refcnt. Looking back, this was a mistake (mine). A new tcp-cc ops should 
>> have been introduced instead to return a new tcp-cc-ops to be used.
> 
> Not quite clear, but from the description, it seems that
> the synchronize_rcu_mult(call_rcu, call_rcu_tasks) could

This synchronize_rcu_mult is only need for the tcp_congestion_ops 
(bpf_tcp_ca.c). May be it is cleaner to just make a special case for 
"tcp_congestion_ops" in st_ops->name in map_alloc and only set 
free_after_mult_rcu_gp to TRUE for this one case, then it won't slow down other 
struct_ops map freeing also.

imo, the test in this patch is not needed in its current form also since it is 
not how the kernel subsystem implements unreg in struct_ops.

> be just removed in some way, no need to do a cleanup to
> switch it to call_rcu.
> 
>>
>>> +        if (ops->test_1)
>>> +            ops->test_1();
>>> +        if (ops->test_2)
>>> +            ops->test_2(0, 0);
>>> +    }
>>> +
>>> +    return 0;
>>> +}
> 


  reply	other threads:[~2024-11-11 21:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-08  8:26 [PATCH bpf-next 0/2] Fix release of struct_ops map Xu Kuohai
2024-11-08  8:26 ` [PATCH bpf-next 1/2] bpf: " Xu Kuohai
2024-11-08 17:00   ` Alexei Starovoitov
2024-11-08  8:26 ` [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release Xu Kuohai
2024-11-08 19:39   ` Martin KaFai Lau
2024-11-09  8:40     ` Xu Kuohai
2024-11-11 21:30       ` Martin KaFai Lau [this message]
2024-11-12 12:22         ` Xu Kuohai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f8f02f5c-acad-4f65-85d3-e20f70fe6b7d@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=thinker.li@gmail.com \
    --cc=xukuohai@huaweicloud.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.