* [PATCH bpf 0/2] bpf: Avoid RCU context warning when unpinning htab with internal structs
@ 2025-10-03 8:45 KaFai Wan
2025-10-03 8:45 ` [PATCH bpf 1/2] " KaFai Wan
2025-10-03 8:45 ` [PATCH bpf 2/2] selftests/bpf: Add test for unpinning htab with internal timer struct KaFai Wan
0 siblings, 2 replies; 5+ messages in thread
From: KaFai Wan @ 2025-10-03 8:45 UTC (permalink / raw)
To: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, kafai.wan,
toke, linux-kernel, bpf, linux-kselftest
This small patchset avoids an RCU context warning when unpinning an
htab with internal structs (timer, workqueue, or task_work).
---
KaFai Wan (2):
bpf: Avoid RCU context warning when unpinning htab with internal
structs
selftests/bpf: Add test for unpinning htab with internal timer struct
kernel/bpf/inode.c | 2 +-
.../selftests/bpf/prog_tests/pinning_htab.c | 37 +++++++++++++++++++
.../selftests/bpf/progs/test_pinning_htab.c | 25 +++++++++++++
3 files changed, 63 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/pinning_htab.c
create mode 100644 tools/testing/selftests/bpf/progs/test_pinning_htab.c
--
2.43.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf 1/2] bpf: Avoid RCU context warning when unpinning htab with internal structs
2025-10-03 8:45 [PATCH bpf 0/2] bpf: Avoid RCU context warning when unpinning htab with internal structs KaFai Wan
@ 2025-10-03 8:45 ` KaFai Wan
2025-10-06 23:58 ` Andrii Nakryiko
2025-10-03 8:45 ` [PATCH bpf 2/2] selftests/bpf: Add test for unpinning htab with internal timer struct KaFai Wan
1 sibling, 1 reply; 5+ messages in thread
From: KaFai Wan @ 2025-10-03 8:45 UTC (permalink / raw)
To: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, kafai.wan,
toke, linux-kernel, bpf, linux-kselftest
Cc: Le Chen

When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
 BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
 ...

The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses ->free_inode() which schedules cleanup via
   call_rcu(), deferring the actual freeing to an RCU callback that
   executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
   htab_map_free_internal_structs() is invoked, which includes
   cond_resched() or cond_resched_rcu() calls to yield the CPU during
   potentially long operations.

However, cond_resched() or cond_resched_rcu() cannot be safely called from
atomic RCU softirq context, leading to the BUG warning when attempting
to reschedule.

Fix this by changing from ->free_inode() to ->destroy_inode() for BPF
objects (prog, map, link). This allows direct inode freeing without
RCU callback scheduling, avoiding the invalid context warning.

Reported-by: Le Chen <tom2cat@sjtu.edu.cn>
Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
---
 kernel/bpf/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index f90bdcc0a047..65c2a71d7de1 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -790,7 +790,7 @@ const struct super_operations bpf_super_ops = {
 	.statfs		= simple_statfs,
 	.drop_inode	= inode_just_drop,
 	.show_options	= bpf_show_options,
-	.free_inode	= bpf_free_inode,
+	.destroy_inode	= bpf_free_inode,
 };

 enum {
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
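
The context difference described in the patch above can be sketched as a
small user-space model. This is only an illustration under stated
assumptions, not kernel code: every model_* name, the in_atomic_ctx flag,
and the sleep_violations counter are hypothetical stand-ins. It assumes, as
the commit message says, that a ->free_inode() callback runs later from an
RCU callback in softirq (atomic) context, while a ->destroy_inode() callback
runs directly in the caller's context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "are we currently in atomic context?". */
bool in_atomic_ctx;
/* Counts how often a may-sleep function ran in atomic context. */
int sleep_violations;

/* Models cond_resched(): only legal outside atomic context. */
void model_cond_resched(void)
{
	if (in_atomic_ctx)
		sleep_violations++; /* the kernel would print the BUG splat here */
}

/* Models the inode callback whose map cleanup may yield the CPU. */
void model_bpf_inode_cleanup(void)
{
	model_cond_resched();
}

typedef void (*model_inode_cb)(void);

/* One-slot queue standing in for call_rcu(). */
model_inode_cb pending_rcu_cb;

/* Models the ->free_inode() path: cleanup is deferred to an RCU callback. */
void model_free_inode_path(model_inode_cb cb)
{
	pending_rcu_cb = cb;
}

/* Models RCU callback processing in softirq, i.e. in atomic context. */
void model_run_rcu_softirq(void)
{
	assert(!in_atomic_ctx); /* the model does not nest softirqs */
	in_atomic_ctx = true;
	if (pending_rcu_cb) {
		pending_rcu_cb();
		pending_rcu_cb = NULL;
	}
	in_atomic_ctx = false;
}

/* Models the ->destroy_inode() path: cleanup runs directly in the caller. */
void model_destroy_inode_path(model_inode_cb cb)
{
	cb();
}
```

In this model, queuing model_bpf_inode_cleanup() through
model_free_inode_path() and then running model_run_rcu_softirq() records one
violation, whereas model_destroy_inode_path() records none, which mirrors
the one-line change in the diff.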
* Re: [PATCH bpf 1/2] bpf: Avoid RCU context warning when unpinning htab with internal structs
2025-10-03 8:45 ` [PATCH bpf 1/2] " KaFai Wan
@ 2025-10-06 23:58 ` Andrii Nakryiko
2025-10-07 1:25 ` KaFai Wan
0 siblings, 1 reply; 5+ messages in thread
From: Andrii Nakryiko @ 2025-10-06 23:58 UTC (permalink / raw)
To: KaFai Wan
Cc: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, toke,
linux-kernel, bpf, linux-kselftest, Le Chen

On Fri, Oct 3, 2025 at 1:47 AM KaFai Wan <kafai.wan@linux.dev> wrote:
>
> When unpinning a BPF hash table (htab or htab_lru) that contains internal
> structures (timer, workqueue, or task_work) in its values, a BUG warning
> is triggered:
>  BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
>  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
>  ...
>
> The issue arises from the interaction between BPF object unpinning and
> RCU callback mechanisms:
> 1. BPF object unpinning uses ->free_inode() which schedules cleanup via
>    call_rcu(), deferring the actual freeing to an RCU callback that
>    executes within the RCU_SOFTIRQ context.
> 2. During cleanup of hash tables containing internal structures,
>    htab_map_free_internal_structs() is invoked, which includes
>    cond_resched() or cond_resched_rcu() calls to yield the CPU during
>    potentially long operations.
>
> However, cond_resched() or cond_resched_rcu() cannot be safely called from
> atomic RCU softirq context, leading to the BUG warning when attempting
> to reschedule.
>
> Fix this by changing from ->free_inode() to ->destroy_inode() for BPF
> objects (prog, map, link). This allows direct inode freeing without
> RCU callback scheduling, avoiding the invalid context warning.
>
> Reported-by: Le Chen <tom2cat@sjtu.edu.cn>
> Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
> Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
> ---
>  kernel/bpf/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> index f90bdcc0a047..65c2a71d7de1 100644
> --- a/kernel/bpf/inode.c
> +++ b/kernel/bpf/inode.c
> @@ -790,7 +790,7 @@ const struct super_operations bpf_super_ops = {
>  	.statfs		= simple_statfs,
>  	.drop_inode	= inode_just_drop,
>  	.show_options	= bpf_show_options,
> -	.free_inode	= bpf_free_inode,
> +	.destroy_inode	= bpf_free_inode,

s/bpf_free_inode/bpf_destroy_inode/ then?

>  };
>
>  enum {
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Avoid RCU context warning when unpinning htab with internal structs
2025-10-06 23:58 ` Andrii Nakryiko
@ 2025-10-07 1:25 ` KaFai Wan
0 siblings, 0 replies; 5+ messages in thread
From: KaFai Wan @ 2025-10-07 1:25 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, toke,
linux-kernel, bpf, linux-kselftest, Le Chen

On Mon, 2025-10-06 at 16:58 -0700, Andrii Nakryiko wrote:
> On Fri, Oct 3, 2025 at 1:47 AM KaFai Wan <kafai.wan@linux.dev> wrote:
> >
> > When unpinning a BPF hash table (htab or htab_lru) that contains internal
> > structures (timer, workqueue, or task_work) in its values, a BUG warning
> > is triggered:
> >  BUG: sleeping function called from invalid context at
> > kernel/bpf/hashtab.c:244
> >  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name:
> > ksoftirqd/0
> >  ...
> >
> > The issue arises from the interaction between BPF object unpinning and
> > RCU callback mechanisms:
> > 1. BPF object unpinning uses ->free_inode() which schedules cleanup via
> >    call_rcu(), deferring the actual freeing to an RCU callback that
> >    executes within the RCU_SOFTIRQ context.
> > 2. During cleanup of hash tables containing internal structures,
> >    htab_map_free_internal_structs() is invoked, which includes
> >    cond_resched() or cond_resched_rcu() calls to yield the CPU during
> >    potentially long operations.
> >
> > However, cond_resched() or cond_resched_rcu() cannot be safely called from
> > atomic RCU softirq context, leading to the BUG warning when attempting
> > to reschedule.
> >
> > Fix this by changing from ->free_inode() to ->destroy_inode() for BPF
> > objects (prog, map, link). This allows direct inode freeing without
> > RCU callback scheduling, avoiding the invalid context warning.
> >
> > Reported-by: Le Chen <tom2cat@sjtu.edu.cn>
> > Closes:
> > https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
> > Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
> > Suggested-by: Alexei Starovoitov <ast@kernel.org>
> > Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
> > ---
> >  kernel/bpf/inode.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> > index f90bdcc0a047..65c2a71d7de1 100644
> > --- a/kernel/bpf/inode.c
> > +++ b/kernel/bpf/inode.c
> > @@ -790,7 +790,7 @@ const struct super_operations bpf_super_ops = {
> >  	.statfs		= simple_statfs,
> >  	.drop_inode	= inode_just_drop,
> >  	.show_options	= bpf_show_options,
> > -	.free_inode	= bpf_free_inode,
> > +	.destroy_inode	= bpf_free_inode,
>
> s/bpf_free_inode/bpf_destroy_inode/ then?

ok, done in v2.

>
> >  };
> >
> >  enum {
> > --
> > 2.43.0
> >

--
Thanks,
KaFai
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf 2/2] selftests/bpf: Add test for unpinning htab with internal timer struct
2025-10-03 8:45 [PATCH bpf 0/2] bpf: Avoid RCU context warning when unpinning htab with internal structs KaFai Wan
2025-10-03 8:45 ` [PATCH bpf 1/2] " KaFai Wan
@ 2025-10-03 8:45 ` KaFai Wan
1 sibling, 0 replies; 5+ messages in thread
From: KaFai Wan @ 2025-10-03 8:45 UTC (permalink / raw)
To: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, kafai.wan,
toke, linux-kernel, bpf, linux-kselftest

Add test to verify that unpinning hash tables containing internal timer
structures does not trigger context warnings.

Each subtest (timer_prealloc and timer_no_prealloc) can trigger the
context warning when unpinning, but the warning cannot be triggered
twice within a short time interval (a HZ), which is expected behavior.

Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
---
 .../selftests/bpf/prog_tests/pinning_htab.c | 37 +++++++++++++++++++
 .../selftests/bpf/progs/test_pinning_htab.c | 25 +++++++++++++
 2 files changed, 62 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/pinning_htab.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_pinning_htab.c

diff --git a/tools/testing/selftests/bpf/prog_tests/pinning_htab.c b/tools/testing/selftests/bpf/prog_tests/pinning_htab.c
new file mode 100644
index 000000000000..fc804bb87b26
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/pinning_htab.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "test_pinning_htab.skel.h"
+
+static void unpin_map(const char *map_name, const char *pin_path)
+{
+	struct test_pinning_htab *skel;
+	struct bpf_map *map;
+	int err;
+
+	skel = test_pinning_htab__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel open_and_load"))
+		return;
+
+	map = bpf_object__find_map_by_name(skel->obj, map_name);
+	if (!ASSERT_OK_PTR(map, "bpf_object__find_map_by_name"))
+		goto out;
+
+	err = bpf_map__pin(map, pin_path);
+	if (!ASSERT_OK(err, "bpf_map__pin"))
+		goto out;
+
+	err = bpf_map__unpin(map, pin_path);
+	if (!ASSERT_OK(err, "bpf_map__unpin"))
+		goto out;
+out:
+	test_pinning_htab__destroy(skel);
+}
+
+void test_pinning_htab(void)
+{
+	if (test__start_subtest("timer_prealloc"))
+		unpin_map("timer_prealloc", "/sys/fs/bpf/timer_prealloc");
+	if (test__start_subtest("timer_no_prealloc"))
+		unpin_map("timer_no_prealloc", "/sys/fs/bpf/timer_no_prealloc");
+}
diff --git a/tools/testing/selftests/bpf/progs/test_pinning_htab.c b/tools/testing/selftests/bpf/progs/test_pinning_htab.c
new file mode 100644
index 000000000000..ae227930c73c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_pinning_htab.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct timer_val {
+	struct bpf_timer timer;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u32);
+	__type(value, struct timer_val);
+	__uint(max_entries, 1);
+} timer_prealloc SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u32);
+	__type(value, struct timer_val);
+	__uint(max_entries, 1);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+} timer_no_prealloc SEC(".maps");
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread