From: Justin Suess <utilityemal77@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	eddyz87@gmail.com, memxor@gmail.com
Cc: martin.lau@linux.dev, song@kernel.org, yonghong.song@linux.dev,
	jolsa@kernel.org, bpf@vger.kernel.org, mic@digikod.net,
	Justin Suess <utilityemal77@gmail.com>
Subject: [bpf-next v2 0/2] bpf: Fix deadlock in kptr dtor in nmi
Date: Tue,  5 May 2026 11:08:49 -0400
Message-ID: <20260505150851.3090688-1-utilityemal77@gmail.com>

Hello,

While following up on a Sashiko report [1], I found that referenced kptr
destructors can run from NMI context. One way to trigger this is from a
tracing program attached to tp_btf/nmi_handler while a map element is
being torn down.

That is problematic because referenced kptr destructor paths are not
universally NMI-safe. In particular, they may rely on operations such as
call_rcu(), which can deadlock when reached from NMI context.

This is v2 of the series. The approach from v1 was effectively discarded:
instead of teaching the broader teardown path about deferred destruction,
this revision fixes the problem at the referenced-kptr exchange and
teardown points directly.

This revision is also different from the problematic solution I proposed
in [2]. The freelist hijacking approach was too brittle and touched
allocator internals in an unsafe way.

Instead, this revision uses a global per-CPU irq_work queue and llist.

The core change is to offload destructor execution that originates from
NMI context. It preallocates offload jobs from the global BPF memory
allocator, tracks the number of live destructor-backed references so the
pool stays ahead of NMI frees, and runs the actual destructor from
irq_work after NMI exits. Non-destructor kptr exchanges keep the fast
path, while referenced kptr exchanges pay the extra bookkeeping needed to
guarantee that teardown can be deferred safely.
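
As a rough illustration of the flow described above, here is a
single-threaded user-space model in C. It is only a sketch under stated
assumptions, not code from the patch: every name in it (dtor_job,
ref_acquire, ref_drop, irq_work_run, in_nmi_ctx, POOL_SIZE) is
hypothetical, the "NMI" is a plain flag, and the "irq_work" is an
ordinary function call. It shows a job being reserved when a reference
is created, a drop in "NMI" only queueing that job on a lock-free list,
and the destructor running later when the queue is drained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the patch. */
struct dtor_job {
	struct dtor_job *next;
	void (*dtor)(void *obj);
	void *obj;
};

#define POOL_SIZE 8

static struct dtor_job pool_storage[POOL_SIZE];
static _Atomic(struct dtor_job *) free_jobs;    /* preallocated, idle jobs */
static _Atomic(struct dtor_job *) pending_jobs; /* jobs queued from "NMI" */
static int reserved;     /* live references plus queued jobs (plain int: model only) */
static bool in_nmi_ctx;  /* stand-in for the kernel's in_nmi() */
static int released;     /* number of destructor calls observed */

/* Lock-free push onto a singly linked list head. */
static void push(_Atomic(struct dtor_job *) *head, struct dtor_job *job)
{
	struct dtor_job *old = atomic_load(head);

	do {
		job->next = old;
	} while (!atomic_compare_exchange_weak(head, &old, job));
}

/* Pop one job. ABA-prone with concurrent consumers; fine in this model. */
static struct dtor_job *pop(_Atomic(struct dtor_job *) *head)
{
	struct dtor_job *old = atomic_load(head);

	while (old && !atomic_compare_exchange_weak(head, &old, old->next))
		;
	return old;
}

static void pool_init(void)
{
	atomic_store(&free_jobs, (struct dtor_job *)NULL);
	atomic_store(&pending_jobs, (struct dtor_job *)NULL);
	reserved = 0;
	released = 0;
	in_nmi_ctx = false;
	for (int i = 0; i < POOL_SIZE; i++)
		push(&free_jobs, &pool_storage[i]);
}

/* Creating a reference reserves a job so a later "NMI" drop cannot fail. */
static bool ref_acquire(void)
{
	if (reserved == POOL_SIZE)
		return false;
	reserved++;
	return true;
}

/* Dropping a reference: run the destructor now, or queue it if in "NMI". */
static void ref_drop(void (*dtor)(void *), void *obj)
{
	struct dtor_job *job;

	if (!in_nmi_ctx) {
		dtor(obj);
		reserved--;
		return;
	}
	job = pop(&free_jobs);
	assert(job); /* guaranteed by the reservation in ref_acquire() */
	job->dtor = dtor;
	job->obj = obj;
	push(&pending_jobs, job);
}

/* Model of the deferred work: drain the queue and run the destructors. */
static void irq_work_run(void)
{
	struct dtor_job *job = atomic_exchange(&pending_jobs, (struct dtor_job *)NULL);

	while (job) {
		struct dtor_job *next = job->next;

		job->dtor(job->obj);
		push(&free_jobs, job);
		reserved--;
		job = next;
	}
}

static void count_release(void *obj)
{
	(void)obj;
	released++;
}

/* One direct drop, two drops from "NMI", then a drain. Returns total releases. */
int demo_nmi_offload(void)
{
	int objs[3] = { 0 };

	pool_init();
	for (int i = 0; i < 3; i++)
		assert(ref_acquire());
	ref_drop(count_release, &objs[0]); /* normal context: runs immediately */
	in_nmi_ctx = true;
	ref_drop(count_release, &objs[1]); /* deferred */
	ref_drop(count_release, &objs[2]); /* deferred */
	assert(released == 1);             /* nothing ran inside "NMI" */
	in_nmi_ctx = false;
	irq_work_run();
	return released;
}
```

Calling demo_nmi_offload() from a main() exercises all three paths. The
model ignores per-CPU queues, real concurrency, and allocator behaviour,
all of which matter in the kernel.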

The second patch adds a dedicated selftest based on the reproducer from
the report [3]. The selftest uses the cpumask BPF kptr instead, to simplify
the test harness. It exercises both hash and array map cases by creating
cpumask kptrs and dropping them from an NMI handler, then verifies that
objects queued from NMI do get their destructors run.

1. bpf: Offload kptr destructors that run from NMI
2. selftests/bpf: Add kptr destructor NMI exerciser

Kind regards,
Justin Suess

[1] https://lore.kernel.org/bpf/20260421010536.17FB1C19425@smtp.kernel.org/
[2] https://lore.kernel.org/bpf/afYLJAT9brXkWxz2@zenbox/
[3] https://lore.kernel.org/bpf/20260421201035.1729473-1-utilityemal77@gmail.com/

Justin Suess (2):
  bpf: Offload kptr destructors that run from NMI
  selftests/bpf: Add kptr destructor NMI exerciser

 include/linux/bpf.h                           |  16 +
 include/linux/bpf_verifier.h                  |   1 +
 kernel/bpf/fixups.c                           |  33 +-
 kernel/bpf/helpers.c                          |  24 +-
 kernel/bpf/syscall.c                          | 181 ++++++++
 kernel/bpf/verifier.c                         |   2 +
 .../selftests/bpf/prog_tests/kptr_dtor_nmi.c  | 259 +++++++++++
 .../selftests/bpf/progs/kptr_dtor_nmi.c       | 414 ++++++++++++++++++
 8 files changed, 915 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kptr_dtor_nmi.c
 create mode 100644 tools/testing/selftests/bpf/progs/kptr_dtor_nmi.c


base-commit: 2ca6723a5f7b68c739dba47b2639e3eaa7884b09
-- 
2.53.0

