All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, kafai@meta.com, kernel-team@meta.com,
	eddyz87@gmail.com, Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH 1/2] bpf: Migrate bpf_task_work to kmalloc_nolock
Date: Fri, 27 Mar 2026 14:36:38 +0000	[thread overview]
Message-ID: <87fr5libk9.fsf@gmail.com> (raw)
In-Reply-To: <CAP01T74d63iXXxz6UhjLQWgC_FOvJTrUbfrREcf=+qkshbOoag@mail.gmail.com>

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> nit: I think you need to target bpf-next in next respin, patch subject
> is incorrect.
Thanks, forgot to apply bpf-next prefix.
>
> On Wed, 25 Mar 2026 at 22:12, Mykyta Yatsenko
> <mykyta.yatsenko5@gmail.com> wrote:
>>
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Replace bpf_mem_alloc/bpf_mem_free with
>> kmalloc_nolock/kfree_rcu for bpf_task_work_ctx.
>>
>> Replace guard(rcu_tasks_trace)() with guard(rcu)() in
>> bpf_task_work_irq(). The function only accesses ctx struct members
>> (not map values), so tasks trace protection is not needed - regular
>> RCU is sufficient since ctx is freed via kfree_rcu. The guard in
>> bpf_task_work_callback() remains as tasks trace since it accesses map
>> values from process context.
>>
>
> I think a comment in both places would be useful. Also, this bit can
> (should?) probably be a separate patch preceding the conversion.
Do you mean a separate patch to migrate from rcu TT to rcu? I'm not sure
it's worth it, it's not a problem right now, because bpf_mem_free()
actually frees memory only after both TT and normal rcus. But because we
are moving to kfree_rcu() it should be paired with rcu guard, because
now free does not wait for rcu TT.
>
>> Sleepable BPF programs (e.g. BPF_PROG_TYPE_SYSCALL) hold
>> rcu_read_lock_trace but not regular rcu_read_lock. Since kfree_rcu
>> waits for a regular RCU grace period, the ctx memory can be freed
>> while a sleepable program is still running. Add explicit
>> rcu_read_lock/unlock around the pointer read and refcount tryget in
>> bpf_task_work_acquire_ctx to close this race window.
>>
>> For the lost-cmpxchg path the ctx was never published, so plain kfree
>> is safe.
>> ---
>> [...]
>>
>> @@ -4296,13 +4292,27 @@ static struct bpf_task_work_ctx *bpf_task_work_acquire_ctx(struct bpf_task_work
>>  {
>>         struct bpf_task_work_ctx *ctx;
>>
>> +       /*
>> +        * Sleepable BPF programs hold rcu_read_lock_trace but not
>> +        * regular rcu_read_lock. Since kfree_rcu waits for regular
>> +        * RCU GP, the ctx can be freed while we're between reading
>> +        * the pointer and incrementing the refcount. Take regular
>> +        * rcu_read_lock to prevent kfree_rcu from freeing the ctx
>> +        * before we can tryget it.
>> +        */
>> +       rcu_read_lock();
>>         ctx = bpf_task_work_fetch_ctx(tw, map);
>> -       if (IS_ERR(ctx))
>> +       if (IS_ERR(ctx)) {
>> +               rcu_read_unlock();
>>                 return ctx;
>> +       }
>>
>>         /* try to get ref for task_work callback to hold */
>> -       if (!bpf_task_work_ctx_tryget(ctx))
>> +       if (!bpf_task_work_ctx_tryget(ctx)) {
>> +               rcu_read_unlock();
>>                 return ERR_PTR(-EBUSY);
>> +       }
>> +       rcu_read_unlock();
>
> nit: This might look cleaner with explicit block {} and guard(rcu)() inside?
>
yeah, I think you are right.
>>
>>         if (cmpxchg(&ctx->state, BPF_TW_STANDBY, BPF_TW_PENDING) != BPF_TW_STANDBY) {
>>                 /* lost acquiring race or map_release_uref() stole it from us, put ref and bail */
>>
>> --
>> 2.52.0
>>

  reply	other threads:[~2026-03-27 14:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-25 21:11 [PATCH 0/2] bpf: Migrate bpf_task_work and file dynptr to kmalloc_nolock Mykyta Yatsenko
2026-03-25 21:11 ` [PATCH 1/2] bpf: Migrate bpf_task_work " Mykyta Yatsenko
2026-03-27  3:41   ` Kumar Kartikeya Dwivedi
2026-03-27 14:36     ` Mykyta Yatsenko [this message]
2026-03-25 21:11 ` [PATCH 2/2] bpf: Migrate dynptr file " Mykyta Yatsenko
2026-03-27  3:37   ` Kumar Kartikeya Dwivedi
2026-03-27 14:46     ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fr5libk9.fsf@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=kafai@meta.com \
    --cc=kernel-team@meta.com \
    --cc=memxor@gmail.com \
    --cc=yatsenko@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.