From: Martin KaFai Lau <martin.lau@linux.dev>
To: Xu Kuohai <xukuohai@huaweicloud.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Yonghong Song <yonghong.song@linux.dev>,
Kui-Feng Lee <thinker.li@gmail.com>
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release
Date: Mon, 11 Nov 2024 13:30:16 -0800
Message-ID: <f8f02f5c-acad-4f65-85d3-e20f70fe6b7d@linux.dev>
In-Reply-To: <e898a2b2-779b-45e6-b2d2-a2a796e322ff@huaweicloud.com>
On 11/9/24 12:40 AM, Xu Kuohai wrote:
> On 11/9/2024 3:39 AM, Martin KaFai Lau wrote:
>> On 11/8/24 12:26 AM, Xu Kuohai wrote:
>>> -static void bpf_testmod_test_2(int a, int b)
>>> +static void bpf_dummy_unreg(void *kdata, struct bpf_link *link)
>>>  {
>>> +	WRITE_ONCE(__bpf_dummy_ops, &__bpf_testmod_ops);
>>>  }
>>
>> [ ... ]
>>
>>> +static int run_struct_ops(const char *val, const struct kernel_param *kp)
>>> +{
>>> +	int ret;
>>> +	unsigned int repeat;
>>> +	struct bpf_testmod_ops *ops;
>>> +
>>> +	ret = kstrtouint(val, 10, &repeat);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (repeat > 10000)
>>> +		return -ERANGE;
>>> +
>>> +	while (repeat-- > 0) {
>>> +		ops = READ_ONCE(__bpf_dummy_ops);
>>
>> I don't think it is the usual bpf_struct_ops implementation which only uses
>> READ_ONCE and WRITE_ONCE to protect the registered ops. tcp-cc uses a
>> refcnt+rcu. It seems hid uses synchronize_srcu(). sched_ext seems to also use
>> kthread_flush_work() to wait for all ops calling finished. Meaning I don't
>> think the current bpf_struct_ops unreg implementation will run into this issue
>> for sleepable ops.
>>
>
> Thanks for the explanation.
>
> Are you saying that it's not the struct_ops framework's
> responsibility to ensure the struct_ops map is not
> released while it may be still in use? And the "bug" in
> this series should be "fixed" in the test, namely this
> patch?
Yeah, that is what I was trying to say. I don't think there is anything to fix.
Think about extending a subsystem with a kernel module: the subsystem also does
the needed protection itself during the unreg process. There is already
bpf_try_module_get() to help the subsystem.
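[Editor's sketch] To illustrate the "subsystem protects its own ops" point above, here is a minimal userspace model of the refcnt half of a refcnt+rcu scheme. It is only a sketch: every demo_* name is hypothetical, C11 atomics stand in for kernel primitives, and it is not the tcp-cc, hid, or sched_ext code. In the kernel, RCU would additionally cover the window between loading the pointer and taking the reference; that part is not modeled here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table standing in for a struct_ops type. */
struct demo_ops {
	int (*test_1)(void);
	atomic_int refcnt;	/* 0 means "going away, do not use" */
};

/* The currently registered ops, or NULL when nothing is registered. */
static _Atomic(struct demo_ops *) registered_ops;

/* Take a reference only while the ops is still live (refcnt > 0). */
static bool demo_ops_tryget(struct demo_ops *ops)
{
	int old = atomic_load(&ops->refcnt);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ops->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void demo_ops_put(struct demo_ops *ops)
{
	atomic_fetch_sub(&ops->refcnt, 1);
}

/* Registration publishes the ops and holds one reference on it. */
static void demo_reg(struct demo_ops *ops)
{
	atomic_store(&ops->refcnt, 1);
	atomic_store(&registered_ops, ops);
}

/*
 * Unpublish the ops and drop the registration reference.
 * Returns true if no caller holds a reference any more; a real
 * subsystem would wait for that condition before returning.
 */
static bool demo_unreg(struct demo_ops *ops)
{
	atomic_store(&registered_ops, NULL);
	demo_ops_put(ops);
	return atomic_load(&ops->refcnt) == 0;
}

/* Caller side: pin the ops for the duration of the call. */
static int demo_call(void)
{
	struct demo_ops *ops = atomic_load(&registered_ops);
	int ret = -1;

	if (!ops || !demo_ops_tryget(ops))
		return ret;
	if (ops->test_1)
		ret = ops->test_1();
	demo_ops_put(ops);
	return ret;
}

/* Example callback used to exercise the model. */
static int demo_test_1(void)
{
	return 42;
}
```

The point of the model is that the caller never dereferences the ops without holding a reference, so the unreg path can tell when the last user is gone. That is the kind of protection a bare READ_ONCE/WRITE_ONCE pair does not give.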
>> The current synchronize_rcu_mult(call_rcu, call_rcu_tasks) is only needed for
>> the tcp-cc because a tcp-cc's ops (which uses refcnt+rcu) can decrement its
>> own refcnt. Looking back, this was a mistake (mine). A new tcp-cc ops should
>> have been introduced instead to return a new tcp-cc-ops to be used.
>
> Not quite clear, but from the description, it seems that
> the synchronize_rcu_mult(call_rcu, call_rcu_tasks) could
This synchronize_rcu_mult is only needed for tcp_congestion_ops
(bpf_tcp_ca.c). Maybe it is cleaner to just special-case
"tcp_congestion_ops" in st_ops->name in map_alloc and set
free_after_mult_rcu_gp to true only for this one case, so that it does not
slow down the freeing of other struct_ops maps.
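[Editor's sketch] The special case suggested above could look roughly like the following. This is a userspace model under stated assumptions: the demo_* structures and the helper name are hypothetical stand-ins, trimmed down to the two fields mentioned in this mail (st_ops->name and free_after_mult_rcu_gp), and it is not the actual map_alloc code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the struct_ops descriptor; only 'name' is modeled. */
struct demo_struct_ops {
	const char *name;
};

/* Hypothetical stand-in for the map; only the free-policy flag is modeled. */
struct demo_map {
	bool free_after_mult_rcu_gp;
};

/*
 * Pay for the multi-flavor RCU grace period on free only for
 * tcp_congestion_ops; every other struct_ops map leaves the flag clear.
 */
static void demo_map_set_free_policy(struct demo_map *map,
				     const struct demo_struct_ops *st_ops)
{
	map->free_after_mult_rcu_gp =
		strcmp(st_ops->name, "tcp_congestion_ops") == 0;
}
```

Matching on the name keeps the slow path confined to the one struct_ops type that needs it, at the cost of a string comparison per map allocation.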
imo, the test in this patch is also not needed in its current form, since it is
not how a kernel subsystem implements unreg in struct_ops.
> be just removed in some way, no need to do a cleanup to
> switch it to call_rcu.
>
>>
>>> +		if (ops->test_1)
>>> +			ops->test_1();
>>> +		if (ops->test_2)
>>> +			ops->test_2(0, 0);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>
Thread overview: 8+ messages
2024-11-08  8:26 [PATCH bpf-next 0/2] Fix release of struct_ops map Xu Kuohai
2024-11-08  8:26 ` [PATCH bpf-next 1/2] bpf: " Xu Kuohai
2024-11-08 17:00   ` Alexei Starovoitov
2024-11-08  8:26 ` [PATCH bpf-next 2/2] selftests/bpf: Add test for struct_ops map release Xu Kuohai
2024-11-08 19:39   ` Martin KaFai Lau
2024-11-09  8:40     ` Xu Kuohai
2024-11-11 21:30       ` Martin KaFai Lau [this message]
2024-11-12 12:22         ` Xu Kuohai