netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Xu Kuohai <xukuohai@huaweicloud.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	Kui-Feng Lee <thinker.li@gmail.com>
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release
Date: Mon, 11 Nov 2024 13:30:16 -0800	[thread overview]
Message-ID: <f8f02f5c-acad-4f65-85d3-e20f70fe6b7d@linux.dev> (raw)
In-Reply-To: <e898a2b2-779b-45e6-b2d2-a2a796e322ff@huaweicloud.com>

On 11/9/24 12:40 AM, Xu Kuohai wrote:
> On 11/9/2024 3:39 AM, Martin KaFai Lau wrote:
>> On 11/8/24 12:26 AM, Xu Kuohai wrote:
>>> -static void bpf_testmod_test_2(int a, int b)
>>> +static void bpf_dummy_unreg(void *kdata, struct bpf_link *link)
>>>   {
>>> +    WRITE_ONCE(__bpf_dummy_ops, &__bpf_testmod_ops);
>>>   }
>>
>> [ ... ]
>>
>>> +static int run_struct_ops(const char *val, const struct kernel_param *kp)
>>> +{
>>> +    int ret;
>>> +    unsigned int repeat;
>>> +    struct bpf_testmod_ops *ops;
>>> +
>>> +    ret = kstrtouint(val, 10, &repeat);
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    if (repeat > 10000)
>>> +        return -ERANGE;
>>> +
>>> +    while (repeat-- > 0) {
>>> +        ops = READ_ONCE(__bpf_dummy_ops);
>>
>> I don't think it is the usual bpf_struct_ops implementation which only uses 
>> READ_ONCE and WRITE_ONCE to protect the registered ops. tcp-cc uses a 
>> refcnt+rcu. It seems hid uses synchronize_srcu(). sched_ext seems to also use 
>> kthread_flush_work() to wait for all ops calling finished. Meaning I don't 
>> think the current bpf_struct_ops unreg implementation will run into this issue 
>> for sleepable ops.
>>
> 
> Thanks for the explanation.
> 
> Are you saying that it's not the struct_ops framework's
> responsibility to ensure the struct_ops map is not
> released while it may be still in use? And the "bug" in
> this series should be "fixed" in the test, namely this
> patch?

Yeah, it is what I was trying to say. I don't think there is thing to fix. Think 
about extending a subsystem by a kernel module. The subsystem will also do the 
needed protection itself during the unreg process. There is already a 
bpf_try_module_get() to help the subsystem.

>> The current synchronize_rcu_mult(call_rcu, call_rcu_tasks) is only needed for 
>> the tcp-cc because a tcp-cc's ops (which uses refcnt+rcu) can decrement its 
>> own refcnt. Looking back, this was a mistake (mine). A new tcp-cc ops should 
>> have been introduced instead to return a new tcp-cc-ops to be used.
> 
> Not quite clear, but from the description, it seems that
> the synchronize_rcu_mult(call_rcu, call_rcu_tasks) could

This synchronize_rcu_mult is only need for the tcp_congestion_ops 
(bpf_tcp_ca.c). May be it is cleaner to just make a special case for 
"tcp_congestion_ops" in st_ops->name in map_alloc and only set 
free_after_mult_rcu_gp to TRUE for this one case, then it won't slow down other 
struct_ops map freeing also.

imo, the test in this patch is not needed in its current form also since it is 
not how the kernel subsystem implements unreg in struct_ops.

> be just removed in some way, no need to do a cleanup to
> switch it to call_rcu.
> 
>>
>>> +        if (ops->test_1)
>>> +            ops->test_1();
>>> +        if (ops->test_2)
>>> +            ops->test_2(0, 0);
>>> +    }
>>> +
>>> +    return 0;
>>> +}
> 


  reply	other threads:[~2024-11-11 21:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-08  8:26 [PATCH bpf-next 0/2] Fix release of struct_ops map Xu Kuohai
2024-11-08  8:26 ` [PATCH bpf-next 1/2] bpf: " Xu Kuohai
2024-11-08 17:00   ` Alexei Starovoitov
2024-11-08  8:26 ` [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release Xu Kuohai
2024-11-08 19:39   ` Martin KaFai Lau
2024-11-09  8:40     ` Xu Kuohai
2024-11-11 21:30       ` Martin KaFai Lau [this message]
2024-11-12 12:22         ` Xu Kuohai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f8f02f5c-acad-4f65-85d3-e20f70fe6b7d@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=thinker.li@gmail.com \
    --cc=xukuohai@huaweicloud.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).