From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>, bot+bpf-ci@kernel.org
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, kafai@meta.com, kernel-team@meta.com,
memxor@gmail.com, eddyz87@gmail.com, yatsenko@meta.com,
martin.lau@kernel.org, yonghong.song@linux.dev, clm@meta.com,
ihor.solodrai@linux.dev
Subject: Re: [PATCH RFC v3 01/10] bpf: Refactor __bpf_async_set_callback()
Date: Tue, 13 Jan 2026 16:14:01 +0000 [thread overview]
Message-ID: <0149734c-91e8-474b-99da-b0f20efb2329@gmail.com> (raw)
In-Reply-To: <CAEf4BzZxZbUjBS2v-SO8Z_g_hxbrZy9JTVYTdryE0OzZx7PSyg@mail.gmail.com>
On 1/9/26 22:18, Andrii Nakryiko wrote:
> On Wed, Jan 7, 2026 at 10:22 AM <bot+bpf-ci@kernel.org> wrote:
>>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>>> index 9eaa4185e0a7..954bd61310a6 100644
>>> --- a/kernel/bpf/helpers.c
>>> +++ b/kernel/bpf/helpers.c
>>> @@ -1355,55 +1355,36 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
>>> };
>>>
>>> static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback_fn,
>>> - struct bpf_prog_aux *aux, unsigned int flags,
>>> - enum bpf_async_type type)
>>> + struct bpf_prog *prog)
>>> {
>>> - struct bpf_prog *prev, *prog = aux->prog;
>>> - struct bpf_async_cb *cb;
>>> - int ret = 0;
>>> + struct bpf_prog *prev;
>>> + struct bpf_async_cb *cb = async->cb;
>>>
>>> - if (in_nmi())
>>> - return -EOPNOTSUPP;
>>> - __bpf_spin_lock_irqsave(&async->lock);
>>> - cb = async->cb;
>>> - if (!cb) {
>>> - ret = -EINVAL;
>>> - goto out;
>>> - }
>>> + if (!cb)
>>> + return -EPERM;
>> A small concern: the error code changed from -EINVAL to -EPERM. The verifier
>> comment at kernel/bpf/verifier.c mentions that non-initialized timers
>> "will return -EINVAL". Is the error code change intentional, and should
>> the verifier comment be updated?
>>
> good AI, I asked the same
>
>>> - if (!atomic64_read(&cb->map->usercnt)) {
>>> - /* maps with timers must be either held by user space
>>> - * or pinned in bpffs. Otherwise timer might still be
>>> - * running even when bpf prog is detached and user space
>>> - * is gone, since map_release_uref won't ever be called.
>>> - */
>>> - ret = -EPERM;
>>> - goto out;
>>> - }
>> The usercnt check is removed here. The check still exists in
>> __bpf_async_init(), so a timer cannot be initialized when usercnt is zero.
>> But can there be issues if set_callback is called after usercnt drops to
>> zero on an already-initialized timer? The old code would fail with -EPERM,
>> while the new code proceeds.
>>
> Wasn't this discussed earlier, Mykyta? Do you remember the conclusion?
I think what I missed is to put rcu lock in this function, because the
last prog reference is put in the rcu callback (the change this patchset
introduces).
This way we make sure that whatever is set here will be put if map's
usercount goes to 0.
>
>>> - prev = cb->prog;
>>> - if (prev != prog) {
>>> - /* Bump prog refcnt once. Every bpf_timer_set_callback()
>>> - * can pick different callback_fn-s within the same prog.
>>> - */
>>> - prog = bpf_prog_inc_not_zero(prog);
>>> - if (IS_ERR(prog)) {
>>> - ret = PTR_ERR(prog);
>>> - goto out;
>>> + do {
>>> + if (prog) {
>>> + prog = bpf_prog_inc_not_zero(prog);
>>> + if (IS_ERR(prog))
>>> + return PTR_ERR(prog);
>>> }
>>> +
>>> + prev = xchg(&cb->prog, prog);
>>> + rcu_assign_pointer(cb->callback_fn, callback_fn);
>> Is there a potential race between these two operations? Consider two
>> concurrent callers (with programs P and Q) racing:
>>
>> 1. Thread A: xchg stores P into cb->prog
>> 2. Thread B: xchg stores Q into cb->prog, gets P back
>> 3. Thread B: rcu_assign_pointer stores FB (Q's callback)
>> 4. Thread B: bpf_prog_put(P), B's check passes, B exits
>> 5. Thread A: rcu_assign_pointer stores FA (P's callback), overwrites FB
>> 6. Thread A: check fails (cb->prog is Q, not P), retries
>>
>> After step 5, cb->prog holds Q (with Q's reference) but cb->callback_fn
>> points to FA (from P). If P's refcount drops to zero elsewhere and P is
>> freed, would the timer callback FA become a dangling pointer?
> AI is not completely wrong here, IMO. No, there is no use-after-free
> just yet because program is RCU protected, so we don't have dangling
> pointer just yet.
>
> But. That bpf_prog_inc_not_zero() on retry will fail if P's program
> refcount dropped to zero already. And then once we exit, we'll have
> Q+FA combo, which is not good.
>
> So I think we need to do one bpf_prog_inc_not_zero() outside of the
> loop once and keep that extra ref until we exit. And then instead of
> bpf_prog_inc_not_zero(prog) inside the loop, we do unconditional
> bpf_prog_inc(), which we *know* has to succeed. Less error handling
> inside the loop as well.
>
> Thoughts?
>
>
>> The bpf_timer_cb() function reads cb->callback_fn and calls it directly
>> without verifying it corresponds to cb->prog. The cb->prog field holds
>> the reference keeping the program alive, but if the callback pointer is
>> from a different program that may have been freed, this could be a problem.
>>
>>> +
>>> if (prev)
>>> - /* Drop prev prog refcnt when swapping with new prog */
>>> bpf_prog_put(prev);
>>> - cb->prog = prog;
>>> - }
>>> - rcu_assign_pointer(cb->callback_fn, callback_fn);
>>> -out:
>>> - __bpf_spin_unlock_irqrestore(&async->lock);
>>> - return ret;
>>> +
>>> + } while (READ_ONCE(cb->prog) != prog || READ_ONCE(cb->callback_fn) != callback_fn);
>>> +
>>> + return 0;
>>> }
>> [ ... ]
>>
>>
>> ---
>> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
>> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>>
>> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20791345842
next prev parent reply other threads:[~2026-01-13 16:14 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-07 17:49 [PATCH RFC v3 00/10] bpf: Avoid locks in bpf_timer and bpf_wq Mykyta Yatsenko
2026-01-07 17:49 ` [PATCH RFC v3 01/10] bpf: Refactor __bpf_async_set_callback() Mykyta Yatsenko
2026-01-07 18:22 ` bot+bpf-ci
2026-01-09 22:18 ` Andrii Nakryiko
2026-01-13 13:58 ` Mykyta Yatsenko
2026-01-13 16:14 ` Mykyta Yatsenko [this message]
2026-01-09 22:18 ` Andrii Nakryiko
2026-01-12 6:47 ` Kumar Kartikeya Dwivedi
2026-01-12 8:12 ` Kumar Kartikeya Dwivedi
2026-01-07 17:49 ` [PATCH RFC v3 02/10] bpf: Factor out timer deletion helper Mykyta Yatsenko
2026-01-12 6:51 ` Kumar Kartikeya Dwivedi
2026-01-07 17:49 ` [PATCH RFC v3 03/10] bpf: Simplify bpf_timer_cancel() Mykyta Yatsenko
2026-01-07 18:22 ` bot+bpf-ci
2026-01-09 22:19 ` Andrii Nakryiko
2026-01-12 7:29 ` Kumar Kartikeya Dwivedi
2026-01-07 17:49 ` [PATCH RFC v3 04/10] bpf: Add lock-free cell for NMI-safe async operations Mykyta Yatsenko
2026-01-07 18:08 ` bot+bpf-ci
2026-01-07 18:30 ` Kumar Kartikeya Dwivedi
2026-01-07 19:05 ` Mykyta Yatsenko
2026-01-09 1:18 ` Andrii Nakryiko
2026-01-09 18:22 ` Mykyta Yatsenko
2026-01-09 18:47 ` Andrii Nakryiko
2026-01-09 23:51 ` Alexei Starovoitov
2026-01-10 0:03 ` Andrii Nakryiko
2026-01-09 22:19 ` Andrii Nakryiko
2026-01-07 17:49 ` [PATCH RFC v3 05/10] bpf: Enable bpf timer and workqueue use in NMI Mykyta Yatsenko
2026-01-07 18:22 ` bot+bpf-ci
2026-01-09 22:19 ` Andrii Nakryiko
2026-01-14 14:53 ` Mykyta Yatsenko
2026-01-15 18:39 ` Andrii Nakryiko
2026-01-15 18:52 ` Mykyta Yatsenko
2026-01-12 8:10 ` Kumar Kartikeya Dwivedi
2026-01-07 17:49 ` [PATCH RFC v3 06/10] bpf: Add verifier support for bpf_timer argument in kfuncs Mykyta Yatsenko
2026-01-09 22:19 ` Andrii Nakryiko
2026-01-07 17:49 ` [PATCH RFC v3 07/10] bpf: Introduce bpf_timer_cancel_async() kfunc Mykyta Yatsenko
2026-01-09 22:19 ` Andrii Nakryiko
2026-01-07 17:49 ` [PATCH RFC v3 08/10] selftests/bpf: Refactor timer selftests Mykyta Yatsenko
2026-01-07 17:49 ` [PATCH RFC v3 09/10] selftests/bpf: Add stress test for timer async cancel Mykyta Yatsenko
2026-01-07 17:49 ` [PATCH RFC v3 10/10] selftests/bpf: Verify bpf_timer_cancel_async works Mykyta Yatsenko
2026-01-09 22:19 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0149734c-91e8-474b-99da-b0f20efb2329@gmail.com \
--to=mykyta.yatsenko5@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bot+bpf-ci@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=ihor.solodrai@linux.dev \
--cc=kafai@meta.com \
--cc=kernel-team@meta.com \
--cc=martin.lau@kernel.org \
--cc=memxor@gmail.com \
--cc=yatsenko@meta.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox