* [PATCH bpf v3 0/5] Close race in freeing special fields and map value
@ 2026-02-27 22:48 Kumar Kartikeya Dwivedi
From: Kumar Kartikeya Dwivedi @ 2026-02-27 22:48 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Eduard Zingerman, Mykyta Yatsenko, kkd,
	kernel-team

There exists a race across various map types where the special fields of a
map value (tw, timer, wq, kptr, etc.) can be freed eagerly when a logical
delete operation is performed on the value, while a program that still has
access to that map value can recreate the fields and cause them to leak.

This set contains fixes for this case. It is a continuation of Mykyta's
previous attempt in [0], but applies to all special fields. A test that
reliably reproduces the bug in the absence of the fixes is included.

Local Storage Benchmarks
------------------------
Evaluation Setup: Benchmarked on a dual-socket Intel Xeon Gold 6348 (Ice
Lake) @ 2.60GHz (56 cores / 112 threads), with the CPU governor set to
performance. The bench was pinned to a single NUMA node throughout the test.

The benchmark comes from [1], using the following command:
./bench -p 1 local-storage-create --storage-type <socket,task> --batch-size <16,32,64>

Before the test, 10 runs of all cases ([socket|task] x 3 batch sizes x 7
iterations per batch size) are done to warm up and prime the machine.

Then, 3 runs of all cases are done (with and without the patch, across
reboots).

For each comparison we have 21 samples: per batch size of a given local
storage type (e.g. socket 16), we have 3 runs x 7 iterations.

The statistics (mean, median, stddev) and a t-test are computed for each
scenario (local storage type and batch size pair) individually (21 samples
on either side). All values are local storage creations in thousands of
creations per second (k/s).

	       Baseline (without patch)               With patch                       Delta
     Case      Median        Mean   Std. Dev.   Median        Mean   Std. Dev.   Median       %
---------------------------------------------------------------------------------------------------
socket 16     432.026     431.941    1.047     431.347     431.953    1.635      -0.679    -0.16%
socket 32     432.641     432.818    1.535     432.488     432.302    1.508      -0.153    -0.04%
socket 64     431.504     431.996    1.337     429.145     430.326    2.469      -2.359    -0.55%
  task 16      38.816      39.382    1.456      39.657      39.337    1.831      +0.841    +2.17%
  task 32      38.815      39.644    2.690      38.721      39.122    1.636      -0.094    -0.24%
  task 64      37.562      38.080    1.701      39.554      38.563    1.689      +1.992    +5.30%

The socket cases are within the range of noise, and the apparent improvements
in task local storage are explained by high variance (CV ~4%-6% across batch
sizes). The only statistically significant case worth mentioning is socket
with batch size 64 (t-test p-value < 0.05), but the absolute difference is
small (~2 k/s).

TL;DR there doesn't appear to be any significant regression or improvement.

  [0]: https://lore.kernel.org/bpf/20260216131341.1285427-1-mykyta.yatsenko5@gmail.com
  [1]: https://lore.kernel.org/bpf/20260205222916.1788211-1-ameryhung@gmail.com

Changelog:
----------
v2 -> v3
v2: https://lore.kernel.org/bpf/20260227052031.3988575-1-memxor@gmail.com

 * Add syzbot Tested-by.
 * Add Amery's Reviewed-by.
 * Fix missing rcu_dereference_check() in __bpf_selem_free_rcu. (BPF CI Bot)
 * Remove migrate_disable() in bpf_selem_free_rcu. (Alexei)

v1 -> v2
v1: https://lore.kernel.org/bpf/20260225185121.2057388-1-memxor@gmail.com

 * Add Paul's Reviewed-by.
 * Fix use-after-free in accessing bpf_mem_alloc embedded in map. (syzbot CI)
 * Add benchmark numbers for local storage.
 * Add extra test case for per-cpu hashmap coverage with up to 16 refcount leaks.
 * Target bpf tree.

Kumar Kartikeya Dwivedi (5):
  bpf: Register dtor for freeing special fields
  bpf: Lose const-ness of map in map_check_btf()
  bpf: Delay freeing fields in local storage
  bpf: Retire rcu_trace_implies_rcu_gp() from local storage
  selftests/bpf: Add tests for special fields races

 include/linux/bpf.h                           |   4 +-
 include/linux/bpf_local_storage.h             |   2 +-
 include/linux/bpf_mem_alloc.h                 |   6 +
 kernel/bpf/arena.c                            |   2 +-
 kernel/bpf/arraymap.c                         |   2 +-
 kernel/bpf/bloom_filter.c                     |   2 +-
 kernel/bpf/bpf_insn_array.c                   |   2 +-
 kernel/bpf/bpf_local_storage.c                |  75 +++---
 kernel/bpf/hashtab.c                          |  86 +++++++
 kernel/bpf/local_storage.c                    |   2 +-
 kernel/bpf/lpm_trie.c                         |   2 +-
 kernel/bpf/memalloc.c                         |  58 ++++-
 kernel/bpf/syscall.c                          |   2 +-
 .../selftests/bpf/prog_tests/map_kptr_race.c  | 218 ++++++++++++++++++
 .../selftests/bpf/progs/map_kptr_race.c       | 197 ++++++++++++++++
 15 files changed, 603 insertions(+), 57 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_kptr_race.c
 create mode 100644 tools/testing/selftests/bpf/progs/map_kptr_race.c


base-commit: 6881af27f9ea0f5ca8f606f573ef5cc25ca31fe4
-- 
2.47.3

