From: Puranjay Mohan <puranjay@kernel.org>
To: Aaron Esau <aaron1esau@gmail.com>, bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org
Subject: Re: [BUG] bpf: use-after-free in hashtab BPF_F_LOCK in-place update path
Date: Thu, 26 Mar 2026 13:39:02 +0000 [thread overview]
Message-ID: <m2se9mg16x.fsf@kernel.org> (raw)
In-Reply-To: <CADucPGRvSRpkneb94dPP08YkOHgNgBnskTK6myUag_Mkjimihg@mail.gmail.com>
Aaron Esau <aaron1esau@gmail.com> writes:
> Reported-by: Aaron Esau <aaron1esau@gmail.com>
>
> htab_map_update_elem() has a use-after-free when BPF_F_LOCK is used
> for in-place updates.
>
> The BPF_F_LOCK path calls lookup_nulls_elem_raw() without holding the
> bucket lock, then dereferences the element via copy_map_value_locked().
> A concurrent htab_map_delete_elem() can delete and free the element
> between these steps.
>
> free_htab_elem() uses bpf_mem_cache_free(), which immediately returns
> the object to the per-CPU free list (not RCU-deferred). The memory may
> be reallocated before copy_map_value_locked() executes, leading to
> writes into a different element.
>
> When lookup succeeds (l_old != NULL), the in-place update path returns
> early, so the “full lookup under lock” path is not taken.
>
> Race:
>
> CPU 0: htab_map_update_elem (BPF_F_LOCK)
> lookup_nulls_elem_raw() → E (no bucket lock)
> ...
> CPU 1: htab_map_delete_elem()
> htab_lock_bucket → hlist_nulls_del_rcu → htab_unlock_bucket
> free_htab_elem → bpf_mem_cache_free (immediate free)
> CPU 1: htab_map_update_elem (new key)
> alloc_htab_elem → reuses E
> CPU 0: copy_map_value_locked(E, ...) → writes into reused object
>
> Reproduction:
>
> 1. Create BPF_MAP_TYPE_HASH with a value containing bpf_spin_lock
> (max_entries=64, 7 u64 fields + lock).
> 2. Threads A: BPF_MAP_UPDATE_ELEM with BPF_F_LOCK (pattern 0xAAAA...)
> 3. Threads B: DELETE + UPDATE (pattern 0xBBBB...) on same keys
> 4. Threads C: same as A (pattern 0xCCCC...)
> 5. Verifier threads: LOOKUP loop, detect mixed-pattern values
> 6. Run 60s on >=4 CPUs
>
> Attached a POC. On 6.19.9 (4 vCPU QEMU, CONFIG_PREEMPT=y),
> I observed ~645 torn values in 2.5M checks (~0.026%).
>
> Fixes: 96049f3afd50 ("bpf: introduce BPF_F_LOCK flag")
Although this is a real issue, your reproducer is not accurate: it will
report torn writes even without the UAF, because the verifier thread
does not take the lock. So a torn pattern like CCCAAAA can simply mean:
1. Thread A finished writing AAAAAAA (while holding the lock)
2. Thread C acquired the lock and started writing: field[0]=C, field[1]=C, field[2]=C...
3. The verifier thread reads (no lock): sees field[0]=C, field[1]=C, field[2]=C, field[3]=A, field[4]=A, field[5]=A, field[6]=A
4. Thread C finishes: field[3]=C, field[4]=C, field[5]=C, field[6]=C, releases lock
This race happens regardless of whether the element is freed/reused. It
would happen even without thread B (the delete+readd thread). The
corruption source is the non-atomic read, not the UAF.
If you change the reproducer like this:
-- >8 --
--- repro.c 2026-03-26 05:22:49.012503218 -0700
+++ repro2.c 2026-03-26 06:24:40.951044279 -0700
@@ -227,6 +227,7 @@
attr.map_fd = fd;
attr.key = (uint64_t)(unsigned long)key;
attr.value = (uint64_t)(unsigned long)val;
+ attr.flags = BPF_F_LOCK;
return bpf_sys(BPF_MAP_LOOKUP_ELEM_CMD, &attr, sizeof(attr));
}
-- 8< --
Now it will detect the actual UAF problem.
I verified that the updated reproducer shows the problem, and the
following kernel diff fixes it:
-- >8 --
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index bc6bc8bb871d..af33f62069f0 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -953,7 +953,7 @@ static void htab_elem_free(struct bpf_htab *htab, struct htab_elem *l)
if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH)
bpf_mem_cache_free(&htab->pcpu_ma, l->ptr_to_pptr);
- bpf_mem_cache_free(&htab->ma, l);
+ bpf_mem_cache_free_rcu(&htab->ma, l);
}
static void htab_put_fd_value(struct bpf_htab *htab, struct htab_elem *l)
-- 8< --
Before:
[root@alarm host0]# ./repro2
Running 10 threads for 60 seconds...
Total checks: 49228421
Torn writes: 5470
Max torn fields: 3 / 7
Corruption rate: 0.011111%
Cross-pattern breakdown:
A in B: 8595
C in B: 7826
Unknown: 1
First 20 events:
[0] check #42061 seq=39070 CCCBBBB
[1] check #65714 seq=60575 CCCBBBB
[2] check #65287 seq=60575 CCCBBBB
[3] check #70474 seq=65793 AAABBBB
[4] check #70907 seq=65793 AAABBBB
[5] check #103389 seq=95745 AAABBBB
[6] check #107208 seq=98672 CCCBBBB
[7] check #108218 seq=100387 CCCBBBB
[8] check #111490 seq=103388 CCCBBBB
[9] check #140942 seq=128894 CCCBBBB
[10] check #164845 seq=151828 CCCBBBB
[11] check #163993 seq=151828 CCCBBBB
[12] check #169184 seq=155453 CCCBBBB
[13] check #171383 seq=158572 AAABBBB
[14] check #179943 seq=165425 CCCBBBB
[15] check #189218 seq=173926 CCCBBBB
[16] check #192119 seq=177892 CCCBBBB
[17] check #194253 seq=180562 AAABBBB
[18] check #202169 seq=187253 CCCBBBB
[19] check #205452 seq=189021 CCCBBBB
CORRUPTION DETECTED
After:
[root@alarm host0]# ./repro2
Running 10 threads for 60 seconds...
Total checks: 108666576
Torn writes: 0
Max torn fields: 0 / 7
No corruption detected (try more CPUs or longer run)
[root@alarm host0]# nproc
16
I will send a patch to fix this soon after validating the above kernel
diff and figuring out how we got to this state in htab_elem_free() by
analyzing the git history.
Thanks for the report.
Puranjay
Thread overview: 12+ messages
2026-03-26 8:49 [BUG] bpf: use-after-free in hashtab BPF_F_LOCK in-place update path Aaron Esau
2026-03-26 13:39 ` Puranjay Mohan [this message]
2026-03-26 14:58 ` Kumar Kartikeya Dwivedi
2026-03-26 15:02 ` Puranjay Mohan
2026-03-26 15:26 ` Mykyta Yatsenko
2026-03-26 15:33 ` Puranjay Mohan
2026-03-26 15:43 ` Mykyta Yatsenko
2026-03-26 15:47 ` Mykyta Yatsenko
2026-03-26 15:57 ` Puranjay Mohan
2026-03-27 2:44 ` Aaron Esau
2026-03-27 3:21 ` Kumar Kartikeya Dwivedi
2026-03-27 16:09 ` Mykyta Yatsenko