* [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs
@ 2026-05-29 17:42 Vlad Poenaru
2026-05-29 19:02 ` sashiko-bot
2026-05-29 19:19 ` Emil Tsalapatis
0 siblings, 2 replies; 5+ messages in thread
From: Vlad Poenaru @ 2026-05-29 17:42 UTC
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf
Cc: Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa,
Toke Høiland-Jørgensen, linux-kernel, stable
trie_lookup_elem() annotates its rcu_dereference_check() walks with
only rcu_read_lock_bh_held(). Because rcu_dereference_check(p, c)
resolves to "c || rcu_read_lock_held()", this passes for XDP/NAPI and
classic RCU readers but fails for sleepable BPF programs, which enter
via __bpf_prog_enter_sleepable() and hold only rcu_read_lock_trace().
A sleepable LSM hook that ends up doing bpf_map_lookup_elem() on an LPM
trie therefore triggers lockdep on debug kernels:
=============================
WARNING: suspicious RCU usage
7.1.0-... Tainted: G E
-----------------------------
kernel/bpf/lpm_trie.c:249 suspicious rcu_dereference_check() usage!
1 lock held by net_tests/540:
#0: (rcu_tasks_trace_srcu_struct){....}-{0:0},
at: __bpf_prog_enter_sleepable+0x26/0x280
Call Trace:
dump_stack_lvl
lockdep_rcu_suspicious
trie_lookup_elem
bpf_prog_..._enforce_security_socket_connect
bpf_trampoline_...
security_socket_connect
__sys_connect
do_syscall_64
This is lockdep-only -- no UAF, since Tasks Trace RCU does serialize
against the trie's reclaim path -- but it spams the console once per
distinct callsite on every debug kernel running a sleepable BPF LSM
that does map lookups on an LPM trie, which is increasingly common.
Other map types already use the bpf_rcu_lock_held() helper, which
accepts all three contexts (classic, BH, Tasks Trace). Use it here as
well, matching the established convention.
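
The reasoning above can be sketched as a small userspace model. This is only an illustration, not kernel code: struct rcu_ctx and the *_check() helpers are hypothetical names, and bpf_rcu_lock_held() is modeled as "any of the three read-side sections is held", as the description above states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of which RCU read-side sections the caller holds. */
struct rcu_ctx {
	bool classic;	/* models rcu_read_lock_held() */
	bool bh;	/* models rcu_read_lock_bh_held() */
	bool trace;	/* models rcu_read_lock_trace_held() */
};

/* rcu_dereference_check(p, c) is satisfied by "c || rcu_read_lock_held()". */
static bool deref_check_passes(const struct rcu_ctx *ctx, bool c)
{
	return c || ctx->classic;
}

/* Old annotation in trie_lookup_elem(): c = rcu_read_lock_bh_held(). */
static bool old_lookup_check(const struct rcu_ctx *ctx)
{
	return deref_check_passes(ctx, ctx->bh);
}

/* New annotation: c = bpf_rcu_lock_held(), modeled as any of the three. */
static bool new_lookup_check(const struct rcu_ctx *ctx)
{
	bool bpf_held = ctx->classic || ctx->bh || ctx->trace;

	return deref_check_passes(ctx, bpf_held);
}

/* A sleepable program holds only the Tasks Trace section. */
void lookup_model_selftest(void)
{
	struct rcu_ctx sleepable = { .trace = true };

	assert(!old_lookup_check(&sleepable));	/* old check would splat */
	assert(new_lookup_check(&sleepable));	/* new check is satisfied */
}
```

In this model, BH-only and classic-only callers pass both checks, so the change only widens the set of accepted contexts.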
Fixes: 694cea395fde ("bpf: Allow RCU-protected lookups to happen from bh context")
Cc: stable@vger.kernel.org
Signed-off-by: Vlad Poenaru <vlad.wing@gmail.com>
---
kernel/bpf/lpm_trie.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 0f57608b385d..ac36063cb7e6 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -246,7 +246,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
 
 	/* Start walking the trie from the root node ... */
 
-	for (node = rcu_dereference_check(trie->root, rcu_read_lock_bh_held());
+	for (node = rcu_dereference_check(trie->root, bpf_rcu_lock_held());
 	     node;) {
 		unsigned int next_bit;
 		size_t matchlen;
@@ -280,7 +280,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
 		 */
 		next_bit = extract_bit(key->data, node->prefixlen);
 		node = rcu_dereference_check(node->child[next_bit],
-					     rcu_read_lock_bh_held());
+					     bpf_rcu_lock_held());
 	}
 
 	if (!found)
--
2.53.0-Meta
* Re: [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs
2026-05-29 17:42 [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs Vlad Poenaru
@ 2026-05-29 19:02 ` sashiko-bot
2026-05-29 19:30 ` Emil Tsalapatis
2026-06-07 9:17 ` Kumar Kartikeya Dwivedi
2026-05-29 19:19 ` Emil Tsalapatis
1 sibling, 2 replies; 5+ messages in thread
From: sashiko-bot @ 2026-05-29 19:02 UTC
To: Vlad Poenaru; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] This problem wasn't introduced by this patch, but the fix is incomplete. The patch resolves an RCU lockdep splat in `trie_lookup_elem` for sleepable BPF programs, but leaves identical core kernel locking rule violations in `trie_update_elem` and `trie_delete_elem`. Sleepable eBPF programs hold `rcu_read_lock_trace()`, not the classic `rcu_read_lock()`. The update and delete operations acquire the map's spinlock (`trie->lock`) and correctly serialize access, but they use `rcu_dereference()` to traverse the trie instead of the update-side API `rcu_dereference_protected()`. `rcu_dereference()` strictly asserts that `rcu_read_lock_held()` is true. Holding a spinlock disables preemption but does not satisfy this check, leading to the exact same 'suspicious rcu_dereference_check() usage' lockdep splat that this patch aimed to fix.
--
commit 1ac6e30b3c7fb4d93d67bfcbb5dc35077814b266
Author: Vlad Poenaru <vlad.wing@gmail.com>
bpf, lpm_trie: Allow lookups from sleepable BPF programs
This commit fixes a lockdep warning caused by sleepable BPF programs
accessing LPM tries. It updates the RCU dereference checks in
trie_lookup_elem() to use bpf_rcu_lock_held() instead of
rcu_read_lock_bh_held().
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 0f57608b385d4..ac36063cb7e62 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
[ ... ]
> @@ -280,7 +280,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
> */
> next_bit = extract_bit(key->data, node->prefixlen);
> node = rcu_dereference_check(node->child[next_bit],
> - rcu_read_lock_bh_held());
> + bpf_rcu_lock_held());
> }
>
> if (!found)
[Severity: High]
This is a pre-existing issue, but does this patch leave the fix incomplete?
While this patch updates the lookup path to avoid lockdep splats from sleepable
BPF programs, sleepable BPF programs can also trigger map updates and deletes.
The functions trie_update_elem() and trie_delete_elem() acquire the map's
spinlock (trie->lock) but traverse the tree using rcu_dereference():
kernel/bpf/lpm_trie.c:trie_update_elem() {
    ...
    while ((node = rcu_dereference(*slot))) {
        matchlen = longest_prefix_match(trie, node, key);
        ...
}

kernel/bpf/lpm_trie.c:trie_delete_elem() {
    ...
    while ((node = rcu_dereference(*trim))) {
        matchlen = longest_prefix_match(trie, node, key);
        ...
}
Because rcu_dereference() strictly asserts that rcu_read_lock_held() is true,
and holding a spinlock does not satisfy the classic RCU read lock check, won't
calling map updates or deletes from sleepable programs still trigger the exact
same suspicious rcu_dereference_check() usage lockdep splats?
Should these paths be updated to use rcu_dereference_protected() since the
trie->lock is held?
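
To make the question concrete, here is a minimal userspace sketch of the two lockdep conditions being compared. It is a model under stated assumptions, not kernel code: struct updater_ctx and both helpers are hypothetical, and passing lockdep_is_held(&trie->lock) as the condition to rcu_dereference_protected() is an assumption about what a respin might use, not something this thread specifies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what an update/delete caller holds. */
struct updater_ctx {
	bool rcu_classic;	/* models rcu_read_lock_held() */
	bool rcu_trace;		/* models rcu_read_lock_trace_held() */
	bool trie_lock_held;	/* models lockdep_is_held(&trie->lock), assumed */
};

/* rcu_dereference(p): satisfied only by a classic RCU read-side section. */
static bool plain_deref_passes(const struct updater_ctx *ctx)
{
	return ctx->rcu_classic;
}

/* rcu_dereference_protected(p, c): satisfied exactly when c is true. */
static bool protected_deref_passes(const struct updater_ctx *ctx)
{
	return ctx->trie_lock_held;
}

/* Sleepable caller: Tasks Trace section plus the trie lock, no classic RCU. */
void updater_model_selftest(void)
{
	struct updater_ctx sleepable = {
		.rcu_trace = true,
		.trie_lock_held = true,
	};

	assert(!plain_deref_passes(&sleepable));	/* would splat */
	assert(protected_deref_passes(&sleepable));	/* lock satisfies check */
}
```

The model only captures which condition lockdep evaluates. Whether the update and delete paths are actually reachable without a classic RCU read-side section would need to be confirmed against the real call paths.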
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260529174233.2954240-1-vlad.wing@gmail.com?part=1
* Re: [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs
2026-05-29 17:42 [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs Vlad Poenaru
2026-05-29 19:02 ` sashiko-bot
@ 2026-05-29 19:19 ` Emil Tsalapatis
1 sibling, 0 replies; 5+ messages in thread
From: Emil Tsalapatis @ 2026-05-29 19:19 UTC
To: Vlad Poenaru, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, bpf
Cc: Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa,
Toke Høiland-Jørgensen, linux-kernel, stable
On Fri May 29, 2026 at 1:42 PM EDT, Vlad Poenaru wrote:
> trie_lookup_elem() annotates its rcu_dereference_check() walks with
> only rcu_read_lock_bh_held(). Because rcu_dereference_check(p, c)
> resolves to "c || rcu_read_lock_held()", this passes for XDP/NAPI and
> classic RCU readers but fails for sleepable BPF programs, which enter
> via __bpf_prog_enter_sleepable() and hold only rcu_read_lock_trace().
>
> A sleepable LSM hook that ends up doing bpf_map_lookup_elem() on an LPM
> trie therefore triggers lockdep on debug kernels:
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-... Tainted: G E
> -----------------------------
> kernel/bpf/lpm_trie.c:249 suspicious rcu_dereference_check() usage!
> 1 lock held by net_tests/540:
> #0: (rcu_tasks_trace_srcu_struct){....}-{0:0},
> at: __bpf_prog_enter_sleepable+0x26/0x280
> Call Trace:
> dump_stack_lvl
> lockdep_rcu_suspicious
> trie_lookup_elem
> bpf_prog_..._enforce_security_socket_connect
> bpf_trampoline_...
> security_socket_connect
> __sys_connect
> do_syscall_64
>
> This is lockdep-only -- no UAF, since Tasks Trace RCU does serialize
> against the trie's reclaim path -- but it spams the console once per
> distinct callsite on every debug kernel running a sleepable BPF LSM
> that does map lookups on an LPM trie, which is increasingly common.
>
> Other map types already use the bpf_rcu_lock_held() helper, which
> accepts all three contexts (classic, BH, Tasks Trace). Use it here as
> well, matching the established convention.
>
> Fixes: 694cea395fde ("bpf: Allow RCU-protected lookups to happen from bh context")
> Cc: stable@vger.kernel.org
> Signed-off-by: Vlad Poenaru <vlad.wing@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
> kernel/bpf/lpm_trie.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 0f57608b385d..ac36063cb7e6 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -246,7 +246,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
>
> /* Start walking the trie from the root node ... */
>
> - for (node = rcu_dereference_check(trie->root, rcu_read_lock_bh_held());
> + for (node = rcu_dereference_check(trie->root, bpf_rcu_lock_held());
> node;) {
> unsigned int next_bit;
> size_t matchlen;
> @@ -280,7 +280,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
> */
> next_bit = extract_bit(key->data, node->prefixlen);
> node = rcu_dereference_check(node->child[next_bit],
> - rcu_read_lock_bh_held());
> + bpf_rcu_lock_held());
> }
>
> if (!found)
* Re: [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs
2026-05-29 19:02 ` sashiko-bot
@ 2026-05-29 19:30 ` Emil Tsalapatis
2026-06-07 9:17 ` Kumar Kartikeya Dwivedi
1 sibling, 0 replies; 5+ messages in thread
From: Emil Tsalapatis @ 2026-05-29 19:30 UTC
To: sashiko-reviews, Vlad Poenaru; +Cc: bpf
On Fri May 29, 2026 at 3:02 PM EDT, sashiko-bot wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] This problem wasn't introduced by this patch, but the fix is incomplete. The patch resolves an RCU lockdep splat in `trie_lookup_elem` for sleepable BPF programs, but leaves identical core kernel locking rule violations in `trie_update_elem` and `trie_delete_elem`. Sleepable eBPF programs hold `rcu_read_lock_trace()`, not the classic `rcu_read_lock()`. The update and delete operations acquire the map's spinlock (`trie->lock`) and correctly serialize access, but they use `rcu_dereference()` to traverse the trie instead of the update-side API `rcu_dereference_protected()`. `rcu_dereference()` strictly asserts that `rcu_read_lock_held()` is true. Holding a spinlock disables preemption but does not satisfy this check, leading to the exact same 'suspicious rcu_dereference_check() usage' lockdep splat that this patch aimed to fix.
> --
>
> commit 1ac6e30b3c7fb4d93d67bfcbb5dc35077814b266
> Author: Vlad Poenaru <vlad.wing@gmail.com>
>
> bpf, lpm_trie: Allow lookups from sleepable BPF programs
>
> This commit fixes a lockdep warning caused by sleepable BPF programs
> accessing LPM tries. It updates the RCU dereference checks in
> trie_lookup_elem() to use bpf_rcu_lock_held() instead of
> rcu_read_lock_bh_held().
>
>> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
>> index 0f57608b385d4..ac36063cb7e62 100644
>> --- a/kernel/bpf/lpm_trie.c
>> +++ b/kernel/bpf/lpm_trie.c
> [ ... ]
>> @@ -280,7 +280,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
>> */
>> next_bit = extract_bit(key->data, node->prefixlen);
>> node = rcu_dereference_check(node->child[next_bit],
>> - rcu_read_lock_bh_held());
>> + bpf_rcu_lock_held());
>> }
>>
>> if (!found)
>
> [Severity: High]
> This is a pre-existing issue, but does this patch leave the fix incomplete?
>
> While this patch updates the lookup path to avoid lockdep splats from sleepable
> BPF programs, sleepable BPF programs can also trigger map updates and deletes.
> The functions trie_update_elem() and trie_delete_elem() acquire the map's
> spinlock (trie->lock) but traverse the tree using rcu_dereference():
>
> kernel/bpf/lpm_trie.c:trie_update_elem() {
> ...
> while ((node = rcu_dereference(*slot))) {
> matchlen = longest_prefix_match(trie, node, key);
> ...
> }
>
> kernel/bpf/lpm_trie.c:trie_delete_elem() {
> ...
> while ((node = rcu_dereference(*trim))) {
> matchlen = longest_prefix_match(trie, node, key);
> ...
> }
>
> Because rcu_dereference() strictly asserts that rcu_read_lock_held() is true,
> and holding a spinlock does not satisfy the classic RCU read lock check, won't
> calling map updates or deletes from sleepable programs still trigger the exact
> same suspicious rcu_dereference_check() usage lockdep splats?
>
> Should these paths be updated to use rcu_dereference_protected() since the
> trie->lock is held?
Fair point by the bot.
* Re: [PATCH bpf] bpf, lpm_trie: Allow lookups from sleepable BPF programs
2026-05-29 19:02 ` sashiko-bot
2026-05-29 19:30 ` Emil Tsalapatis
@ 2026-06-07 9:17 ` Kumar Kartikeya Dwivedi
1 sibling, 0 replies; 5+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-06-07 9:17 UTC
To: sashiko-reviews, Vlad Poenaru; +Cc: bpf
On Fri May 29, 2026 at 9:02 PM CEST, sashiko-bot wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] This problem wasn't introduced by this patch, but the fix is incomplete. The patch resolves an RCU lockdep splat in `trie_lookup_elem` for sleepable BPF programs, but leaves identical core kernel locking rule violations in `trie_update_elem` and `trie_delete_elem`. Sleepable eBPF programs hold `rcu_read_lock_trace()`, not the classic `rcu_read_lock()`. The update and delete operations acquire the map's spinlock (`trie->lock`) and correctly serialize access, but they use `rcu_dereference()` to traverse the trie instead of the update-side API `rcu_dereference_protected()`. `rcu_dereference()` strictly asserts that `rcu_read_lock_held()` is true. Holding a spinlock disables preemption but does not satisfy this check, leading to the exact same 'suspicious rcu_dereference_check() usage' lockdep splat that this patch aimed to fix.
> --
>
> commit 1ac6e30b3c7fb4d93d67bfcbb5dc35077814b266
> Author: Vlad Poenaru <vlad.wing@gmail.com>
>
> bpf, lpm_trie: Allow lookups from sleepable BPF programs
>
> This commit fixes a lockdep warning caused by sleepable BPF programs
> accessing LPM tries. It updates the RCU dereference checks in
> trie_lookup_elem() to use bpf_rcu_lock_held() instead of
> rcu_read_lock_bh_held().
>
>> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
>> index 0f57608b385d4..ac36063cb7e62 100644
>> --- a/kernel/bpf/lpm_trie.c
>> +++ b/kernel/bpf/lpm_trie.c
> [ ... ]
>> @@ -280,7 +280,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key)
>> */
>> next_bit = extract_bit(key->data, node->prefixlen);
>> node = rcu_dereference_check(node->child[next_bit],
>> - rcu_read_lock_bh_held());
>> + bpf_rcu_lock_held());
>> }
>>
>> if (!found)
>
> [Severity: High]
> This is a pre-existing issue, but does this patch leave the fix incomplete?
>
> While this patch updates the lookup path to avoid lockdep splats from sleepable
> BPF programs, sleepable BPF programs can also trigger map updates and deletes.
> The functions trie_update_elem() and trie_delete_elem() acquire the map's
> spinlock (trie->lock) but traverse the tree using rcu_dereference():
>
> kernel/bpf/lpm_trie.c:trie_update_elem() {
> ...
> while ((node = rcu_dereference(*slot))) {
> matchlen = longest_prefix_match(trie, node, key);
> ...
> }
>
> kernel/bpf/lpm_trie.c:trie_delete_elem() {
> ...
> while ((node = rcu_dereference(*trim))) {
> matchlen = longest_prefix_match(trie, node, key);
> ...
> }
>
> Because rcu_dereference() strictly asserts that rcu_read_lock_held() is true,
> and holding a spinlock does not satisfy the classic RCU read lock check, won't
> calling map updates or deletes from sleepable programs still trigger the exact
> same suspicious rcu_dereference_check() usage lockdep splats?
Please address this feedback for other instances and respin.
pw-bot: cr
>
> Should these paths be updated to use rcu_dereference_protected() since the
> trie->lock is held?