From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, kafai@meta.com, kernel-team@meta.com,
eddyz87@gmail.com, Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH 1/2] bpf: Migrate bpf_task_work to kmalloc_nolock
Date: Fri, 27 Mar 2026 14:36:38 +0000 [thread overview]
Message-ID: <87fr5libk9.fsf@gmail.com> (raw)
In-Reply-To: <CAP01T74d63iXXxz6UhjLQWgC_FOvJTrUbfrREcf=+qkshbOoag@mail.gmail.com>
Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> nit: I think you need to target bpf-next in next respin, patch subject
> is incorrect.
Thanks, forgot to apply bpf-next prefix.
>
> On Wed, 25 Mar 2026 at 22:12, Mykyta Yatsenko
> <mykyta.yatsenko5@gmail.com> wrote:
>>
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Replace bpf_mem_alloc/bpf_mem_free with
>> kmalloc_nolock/kfree_rcu for bpf_task_work_ctx.
>>
>> Replace guard(rcu_tasks_trace)() with guard(rcu)() in
>> bpf_task_work_irq(). The function only accesses ctx struct members
>> (not map values), so tasks trace protection is not needed - regular
>> RCU is sufficient since ctx is freed via kfree_rcu. The guard in
>> bpf_task_work_callback() remains as tasks trace since it accesses map
>> values from process context.
>>
>
> I think a comment in both places would be useful. Also, this bit can
> (should?) probably be a separate patch preceding the conversion.
Do you mean a separate patch to migrate from rcu TT to rcu? I'm not sure
it's worth it, it's not a problem right now, because bpf_mem_free()
actually frees memory only after both TT and normal rcus. But because we
are moving to kfree_rcu() it should be paired with rcu guard, because
now free does not wait for rcu TT.
>
>> Sleepable BPF programs (e.g. BPF_PROG_TYPE_SYSCALL) hold
>> rcu_read_lock_trace but not regular rcu_read_lock. Since kfree_rcu
>> waits for a regular RCU grace period, the ctx memory can be freed
>> while a sleepable program is still running. Add explicit
>> rcu_read_lock/unlock around the pointer read and refcount tryget in
>> bpf_task_work_acquire_ctx to close this race window.
>>
>> For the lost-cmpxchg path the ctx was never published, so plain kfree
>> is safe.
>> ---
>> [...]
>>
>> @@ -4296,13 +4292,27 @@ static struct bpf_task_work_ctx *bpf_task_work_acquire_ctx(struct bpf_task_work
>> {
>> struct bpf_task_work_ctx *ctx;
>>
>> + /*
>> + * Sleepable BPF programs hold rcu_read_lock_trace but not
>> + * regular rcu_read_lock. Since kfree_rcu waits for regular
>> + * RCU GP, the ctx can be freed while we're between reading
>> + * the pointer and incrementing the refcount. Take regular
>> + * rcu_read_lock to prevent kfree_rcu from freeing the ctx
>> + * before we can tryget it.
>> + */
>> + rcu_read_lock();
>> ctx = bpf_task_work_fetch_ctx(tw, map);
>> - if (IS_ERR(ctx))
>> + if (IS_ERR(ctx)) {
>> + rcu_read_unlock();
>> return ctx;
>> + }
>>
>> /* try to get ref for task_work callback to hold */
>> - if (!bpf_task_work_ctx_tryget(ctx))
>> + if (!bpf_task_work_ctx_tryget(ctx)) {
>> + rcu_read_unlock();
>> return ERR_PTR(-EBUSY);
>> + }
>> + rcu_read_unlock();
>
> nit: This might look cleaner with explicit block {} and guard(rcu)() inside?
>
yeah, I think you are right.
>>
>> if (cmpxchg(&ctx->state, BPF_TW_STANDBY, BPF_TW_PENDING) != BPF_TW_STANDBY) {
>> /* lost acquiring race or map_release_uref() stole it from us, put ref and bail */
>>
>> --
>> 2.52.0
>>
next prev parent reply other threads:[~2026-03-27 14:36 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-25 21:11 [PATCH 0/2] bpf: Migrate bpf_task_work and file dynptr to kmalloc_nolock Mykyta Yatsenko
2026-03-25 21:11 ` [PATCH 1/2] bpf: Migrate bpf_task_work " Mykyta Yatsenko
2026-03-27 3:41 ` Kumar Kartikeya Dwivedi
2026-03-27 14:36 ` Mykyta Yatsenko [this message]
2026-03-25 21:11 ` [PATCH 2/2] bpf: Migrate dynptr file " Mykyta Yatsenko
2026-03-27 3:37 ` Kumar Kartikeya Dwivedi
2026-03-27 14:46 ` Mykyta Yatsenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87fr5libk9.fsf@gmail.com \
--to=mykyta.yatsenko5@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=kafai@meta.com \
--cc=kernel-team@meta.com \
--cc=memxor@gmail.com \
--cc=yatsenko@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox