From: "Kumar Kartikeya Dwivedi" <memxor@gmail.com>
To: "Justin Suess" <utilityemal77@gmail.com>,
"Kumar Kartikeya Dwivedi" <memxor@gmail.com>
Cc: <bpf@vger.kernel.org>, "Alexei Starovoitov" <ast@kernel.org>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Emil Tsalapatis" <emil@etsalapatis.com>, <kkd@meta.com>,
<kernel-team@meta.com>
Subject: Re: [PATCH bpf-next v1 3/5] bpf: Cancel special fields on map value recycle
Date: Mon, 08 Jun 2026 20:50:54 +0200 [thread overview]
Message-ID: <DJ3W6VAVBZXA.1I914ZIIXG9K5@gmail.com> (raw)
In-Reply-To: <aicB9SZfVnUV1rz-@zenbox>
On Mon Jun 8, 2026 at 8:01 PM CEST, Justin Suess wrote:
> On Mon, Jun 08, 2026 at 04:48:36PM +0200, Kumar Kartikeya Dwivedi wrote:
>> From: Justin Suess <utilityemal77@gmail.com>
>>
>> Map update and delete paths currently call bpf_obj_free_fields() when a
>> value is being replaced or recycled. That makes field destruction depend
>> on the context of the update/delete operation. For tracing programs this
>> can include NMI context, where referenced kptr destructors, uptr
>> unpinning, and graph root destruction are not generally safe.
>>
>> Introduce bpf_obj_cancel_fields() for the reusable-value path. It only
>> performs NMI-safe cleanup for timer, workqueue, and task_work fields.
>> Fields that need full destruction are left attached to the recycled value
>> and are destroyed by the final cleanup path instead.
>>
>> Switch array and hashtab update/delete/recycle paths to this cancel
>> helper. Keep bpf_obj_free_fields() for final map destruction and for
>> bpf_mem_alloc destructors. Preallocated hashtabs do not have allocator
>> destructors, so teardown continues to walk the normal and extra elements
>> and fully destroy their fields.
>>
>> This deliberately relaxes the eager-free semantics of map update/delete
>> for special fields. Programs that relied on a recycled map slot becoming
>> empty immediately after update/delete were relying on behavior that
>> cannot be implemented safely from every BPF execution context without
>> offloading arbitrary destructors.
>>
>> There is a chance this change breaks programs making assumptions
>> regarding the eager freeing of fields. If so, we can relax semantics to
>> cancellation only when irqs_disabled() is true in the future. However,
>> theoretically, map values that get reused eagerly already have weaker
>> guarantees as parallel users can recreate freed fields before the new
>> element becomes visible again.
>>
>> Fixes: 14a324f6a67e ("bpf: Wire up freeing of referenced kptr")
>> Signed-off-by: Justin Suess <utilityemal77@gmail.com>
>> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>> ---
>> [...]
>> void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
>> {
> Would a WARN_ON_ONCE for in_nmi in bpf_obj_free_fields help spot these
> kind of bugs in the future?
>
> Usually calling bpf_obj_free_fields in_nmi won't result in a deadlock
> unless specific conditions are met (ie last reference to a refcounted
> object, using a workqueue, etc).
>
> Adding a check here would quickly surface bugs of this class.
>
In general, the idea makes sense when errors are impossible, but people
increasingly run with panic-on-warn enabled, which sort of beats the purpose,
hence I'd rather avoid it.
If you think about it, there are several guardrails that protect us here. First,
we maintain a whitelist for stuff that is invoked in the bpf_obj_cancel_fields()
path, so we don't inadvertently start operating on any new, possibly unsafe,
fields. Second, the free path mostly happens in the map destruction path now,
hence we need to worry less about random contexts it could be invoked in anymore.
>> [...]
next prev parent reply other threads:[~2026-06-08 18:50 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-08 14:48 [PATCH bpf-next v1 0/5] Fix kptr dtor deadlock Kumar Kartikeya Dwivedi
2026-06-08 14:48 ` [PATCH bpf-next v1 1/5] bpf: Treat non-iterator tracing progs as tracing Kumar Kartikeya Dwivedi
2026-06-08 14:51 ` Kumar Kartikeya Dwivedi
2026-06-08 15:13 ` sashiko-bot
2026-06-08 15:44 ` bot+bpf-ci
2026-06-08 17:47 ` Justin Suess
2026-06-08 18:53 ` Kumar Kartikeya Dwivedi
2026-06-08 14:48 ` [PATCH bpf-next v1 2/5] bpf: Reject bpf_obj_drop() from tracing progs Kumar Kartikeya Dwivedi
2026-06-08 15:40 ` sashiko-bot
2026-06-08 14:48 ` [PATCH bpf-next v1 3/5] bpf: Cancel special fields on map value recycle Kumar Kartikeya Dwivedi
2026-06-08 15:44 ` bot+bpf-ci
2026-06-08 15:56 ` sashiko-bot
2026-06-08 18:01 ` Justin Suess
2026-06-08 18:50 ` Kumar Kartikeya Dwivedi [this message]
2026-06-08 14:48 ` [PATCH bpf-next v1 4/5] selftests/bpf: Exercise unsafe obj drops from tracing progs Kumar Kartikeya Dwivedi
2026-06-08 16:16 ` sashiko-bot
2026-06-08 14:48 ` [PATCH bpf-next v1 5/5] selftests/bpf: Exercise kptr map update lifetime Kumar Kartikeya Dwivedi
2026-06-08 16:40 ` sashiko-bot
2026-06-08 14:58 ` [PATCH bpf-next v1 0/5] Fix kptr dtor deadlock Kumar Kartikeya Dwivedi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DJ3W6VAVBZXA.1I914ZIIXG9K5@gmail.com \
--to=memxor@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=emil@etsalapatis.com \
--cc=kernel-team@meta.com \
--cc=kkd@meta.com \
--cc=utilityemal77@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox