* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  [not found] ` <20250828222357.55fab4c2@batman.local.home>
@ 2025-08-29 11:11 ` Paul E. McKenney
  2025-09-01  8:06   ` Masami Hiramatsu
  2025-09-01 10:06   ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Herbert Xu
  0 siblings, 2 replies; 11+ messages in thread
From: Paul E. McKenney @ 2025-08-29 11:11 UTC (permalink / raw)
To: Steven Rostedt
Cc: Menglong Dong, mhiramat, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, kernel test robot, tgraf, herbert, linux-crypto

On Thu, Aug 28, 2025 at 10:23:57PM -0400, Steven Rostedt wrote:
> On Fri, 29 Aug 2025 10:14:36 +0800
> Menglong Dong <dongml2@chinatelecom.cn> wrote:
>
> > rcu_read_lock() is not needed in fprobe_entry, but rcu_dereference_check()
> > is used in rhltable_lookup(), which causes suspicious RCU usage warning:
> >
> > WARNING: suspicious RCU usage
> > 6.17.0-rc1-00001-gdfe0d675df82 #1 Tainted: G S
> > -----------------------------
> > include/linux/rhashtable.h:602 suspicious rcu_dereference_check() usage!
> > ......
> > stack backtrace:
> > CPU: 1 UID: 0 PID: 4652 Comm: ftracetest Tainted: G S
> > Tainted: [S]=CPU_OUT_OF_SPEC, [I]=FIRMWARE_WORKAROUND
> > Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.1.1 10/07/2015
> > Call Trace:
> >  <TASK>
> >  dump_stack_lvl+0x7c/0x90
> >  lockdep_rcu_suspicious+0x14f/0x1c0
> >  __rhashtable_lookup+0x1e0/0x260
> >  ? __pfx_kernel_clone+0x10/0x10
> >  fprobe_entry+0x9a/0x450
> >  ? __lock_acquire+0x6b0/0xca0
> >  ? find_held_lock+0x2b/0x80
> >  ? __pfx_fprobe_entry+0x10/0x10
> >  ? __pfx_kernel_clone+0x10/0x10
> >  ? lock_acquire+0x14c/0x2d0
> >  ? __might_fault+0x74/0xc0
> >  function_graph_enter_regs+0x2a0/0x550
> >  ? __do_sys_clone+0xb5/0x100
> >  ? __pfx_function_graph_enter_regs+0x10/0x10
> >  ? _copy_to_user+0x58/0x70
> >  ? __pfx_kernel_clone+0x10/0x10
> >  ? __x64_sys_rt_sigprocmask+0x114/0x180
> >  ? __pfx___x64_sys_rt_sigprocmask+0x10/0x10
> >  ? __pfx_kernel_clone+0x10/0x10
> >  ftrace_graph_func+0x87/0xb0
> >
> > Fix this by using rcu_read_lock() for rhltable_lookup(). Alternatively, we
> > can use rcu_lock_acquire(&rcu_lock_map) here to obtain better performance.
> > However, it's not a common usage :/
>
> So this is needed even though it's called under preempt_disable().
>
> Paul, do we need to add an rcu_read_lock() because the code in rht
> (rhashtable) requires RCU read lock?
>
> I thought that rcu_read_lock() and preempt_disable() have been merged?

Yes, preempt_disable() does indeed start an RCU read-side critical section,
just as surely as rcu_read_lock() does.

However, this is a lockdep check inside of __rhashtable_lookup():

	rht_dereference_rcu(ht->tbl, ht)

Which is defined as:

	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht));

This is explicitly telling lockdep that rcu_read_lock() is OK and
holding ht->mutex is OK, but nothing else is.

So an alternative way to fix this is to declare it to be a false positive,
and then avoid that false positive by adding a check that preemption
is disabled. Adding the rhashtable maintainers for their perspective.

							Thanx, Paul

> -- Steve
>
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202508281655.54c87330-lkp@intel.com
> > Fixes: dfe0d675df82 ("tracing: fprobe: use rhltable for fprobe_ip_table")
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > ---
> >  kernel/trace/fprobe.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> > index fb127fa95f21..fece0f849c1c 100644
> > --- a/kernel/trace/fprobe.c
> > +++ b/kernel/trace/fprobe.c
> > @@ -269,7 +269,9 @@ static int fprobe_entry(struct ftrace_graph_ent *trace, struct fgraph_ops *gops,
> >  	if (WARN_ON_ONCE(!fregs))
> >  		return 0;
> >
> > +	rcu_read_lock();
> >  	head = rhltable_lookup(&fprobe_ip_table, &func, fprobe_rht_params);
> > +	rcu_read_unlock();
> >  	reserved_words = 0;
> >  	rhl_for_each_entry_rcu(node, pos, head, hlist) {
> >  		if (node->addr != func)

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-08-29 11:11 ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Paul E. McKenney
@ 2025-09-01  8:06   ` Masami Hiramatsu
  2025-09-01 15:00     ` Paul E. McKenney
  2025-09-01 10:06   ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Herbert Xu
  1 sibling, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2025-09-01 8:06 UTC (permalink / raw)
To: paulmck
Cc: Steven Rostedt, Menglong Dong, mhiramat, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, kernel test robot, tgraf, herbert,
	linux-crypto

On Fri, 29 Aug 2025 04:11:02 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Thu, Aug 28, 2025 at 10:23:57PM -0400, Steven Rostedt wrote:
> > On Fri, 29 Aug 2025 10:14:36 +0800
> > Menglong Dong <dongml2@chinatelecom.cn> wrote:
> >
> > > rcu_read_lock() is not needed in fprobe_entry, but rcu_dereference_check()
> > > is used in rhltable_lookup(), which causes suspicious RCU usage warning:
> > >
> > > [... quoted warning, stack trace, and patch trimmed; see the first
> > > message in the thread ...]
> >
> > So this is needed even though it's called under preempt_disable().
> >
> > Paul, do we need to add an rcu_read_lock() because the code in rht
> > (rhashtable) requires RCU read lock?
> >
> > I thought that rcu_read_lock() and preempt_disable() have been merged?
>
> Yes, preempt_disable() does indeed start an RCU read-side critical section,
> just as surely as rcu_read_lock() does.
>
> However, this is a lockdep check inside of __rhashtable_lookup():
>
> 	rht_dereference_rcu(ht->tbl, ht)
>
> Which is defined as:
>
> 	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht));
>
> This is explicitly telling lockdep that rcu_read_lock() is OK and
> holding ht->mutex is OK, but nothing else is.

That is similar to kprobes, which also allows access either in an RCU
critical section or under a mutex.

> So an alternative way to fix this is to declare it to be a false positive,
> and then avoid that false positive by adding a check that preemption
> is disabled. Adding the rhashtable maintainers for their perspective.

What about changing it to also allow the case where preemption is
disabled?

Thank you,

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-09-01  8:06 ` Masami Hiramatsu
@ 2025-09-01 15:00   ` Paul E. McKenney
  2025-09-02  6:59     ` Masami Hiramatsu
  ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Paul E. McKenney @ 2025-09-01 15:00 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Steven Rostedt, Menglong Dong, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, kernel test robot, tgraf, herbert, linux-crypto

On Mon, Sep 01, 2025 at 05:06:55PM +0900, Masami Hiramatsu wrote:
> [... earlier quoting trimmed; see the first message in the thread ...]
>
> That is similar to kprobes, which also allows access either in an RCU
> critical section or under a mutex.
>
> > So an alternative way to fix this is to declare it to be a false positive,
> > and then avoid that false positive by adding a check that preemption
> > is disabled. Adding the rhashtable maintainers for their perspective.
>
> What about changing it to also allow the case where preemption is
> disabled?

I am not sure which "it" you are proposing to change. ;-)

However, another option is for the above rcu_dereference_check() to
become something like this:

	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht) ||
				 rcu_read_lock_any_held());

This would be happy with any RCU reader, including rcu_read_lock(),
preempt_disable(), local_irq_disable(), local_bh_disable(), and various
handler contexts. One downside is that this would *always* be happy in
a kernel built with CONFIG_PREEMPT_{NONE,VOLUNTARY}=y.

If this is happening often enough, it would be easy for me to create an
rcu_dereference_all_check() that allows all forms of vanilla RCU readers
(but not, for example, SRCU readers), but with only two use cases,
it is not clear to me that this is an overall win.

Or am I missing a turn in here somewhere?

							Thanx, Paul

[... quoted patch trimmed ...]
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-09-01 15:00 ` Paul E. McKenney
@ 2025-09-02  6:59   ` Masami Hiramatsu
  2025-09-02 11:58     ` Paul E. McKenney
  2025-09-03  9:43   ` Herbert Xu
  2025-09-04  9:44   ` [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check Herbert Xu
  2 siblings, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2025-09-02 6:59 UTC (permalink / raw)
To: paulmck
Cc: Steven Rostedt, Menglong Dong, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, kernel test robot, tgraf, herbert, linux-crypto

On Mon, 1 Sep 2025 08:00:15 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> [... earlier quoting trimmed; see the first message in the thread ...]
>
> > What about changing it to also allow the case where preemption is
> > disabled?
>
> I am not sure which "it" you are proposing to change. ;-)

Sorry, I meant the rcu_dereference_check().

> However, another option is for the above rcu_dereference_check() to
> become something like this:
>
> 	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht) ||
> 				 rcu_read_lock_any_held());
>
> This would be happy with any RCU reader, including rcu_read_lock(),
> preempt_disable(), local_irq_disable(), local_bh_disable(), and various
> handler contexts. One downside is that this would *always* be happy in
> a kernel built with CONFIG_PREEMPT_{NONE,VOLUNTARY}=y.

Ah, indeed. This means that we lose the ability to explicitly check
whether the RCU-protected pointer is accessed inside a critical section
on such kernels.

> If this is happening often enough, it would be easy for me to create an
> rcu_dereference_all_check() that allows all forms of vanilla RCU readers
> (but not, for example, SRCU readers), but with only two use cases,
> it is not clear to me that this is an overall win.

OK, I think this discussion is important for the patch from Menglong [1],

[1] https://lore.kernel.org/all/20250829021436.19982-1-dongml2@chinatelecom.cn/

because it does not keep an RCU critical section open while using `head`;
it works only because fprobe_entry() runs under preempt_disable(). Is it
better to use `guard(rcu)()` instead of rcu_read_lock() so that it
explicitly covers the `head` usage? I just wonder if there is any
downside to extending the rcu_read_lock() region (still inside the same
preempt_disable() section).

Thank you,

> Or am I missing a turn in here somewhere?

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-09-02  6:59 ` Masami Hiramatsu
@ 2025-09-02 11:58   ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2025-09-02 11:58 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Steven Rostedt, Menglong Dong, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, kernel test robot, tgraf, herbert, linux-crypto

On Tue, Sep 02, 2025 at 03:59:53PM +0900, Masami Hiramatsu wrote:
> [... earlier quoting trimmed; see the first message in the thread ...]
>
> > This would be happy with any RCU reader, including rcu_read_lock(),
> > preempt_disable(), local_irq_disable(), local_bh_disable(), and various
> > handler contexts. One downside is that this would *always* be happy in
> > a kernel built with CONFIG_PREEMPT_{NONE,VOLUNTARY}=y.
>
> Ah, indeed. This means that we lose the ability to explicitly check
> whether the RCU-protected pointer is accessed inside a critical section
> on such kernels.

It is a usability/bug-detection design tradeoff, and as such, the RCU
user's choice. RCU is simply an arms supplier on this one. ;-)

> > If this is happening often enough, it would be easy for me to create an
> > rcu_dereference_all_check() that allows all forms of vanilla RCU readers
> > (but not, for example, SRCU readers), but with only two use cases,
> > it is not clear to me that this is an overall win.
>
> OK, I think this discussion is important for the patch from Menglong [1],
>
> [1] https://lore.kernel.org/all/20250829021436.19982-1-dongml2@chinatelecom.cn/
>
> because it does not keep an RCU critical section open while using `head`;
> it works only because fprobe_entry() runs under preempt_disable().

Agreed, and that appears to be what initiated this discussion.

> Is it better to use `guard(rcu)()` instead of rcu_read_lock() so that
> it explicitly covers the `head` usage? I just wonder if there is any
> downside to extending the rcu_read_lock() region (still inside the same
> preempt_disable() section).

Another option is `scoped_guard(rcu)`, for example, as used in
ftrace_find_callable_addr() in arch/loongarch/kernel/ftrace_dyn.c.
That way you keep the RAII usability while keeping the RCU read-side
critical section small.

							Thanx, Paul

[... quoted patch trimmed ...]
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-09-01 15:00 ` Paul E. McKenney
  2025-09-02  6:59   ` Masami Hiramatsu
@ 2025-09-03  9:43   ` Herbert Xu
  2025-09-04  9:44   ` [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check Herbert Xu
  2 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2025-09-03 9:43 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Masami Hiramatsu, Steven Rostedt, Menglong Dong, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, kernel test robot, tgraf, linux-crypto

On Mon, Sep 01, 2025 at 08:00:15AM -0700, Paul E. McKenney wrote:
>
> If this is happening often enough, it would be easy for me to create an
> rcu_dereference_all_check() that allows all forms of vanilla RCU readers
> (but not, for example, SRCU readers), but with only two use cases,
> it is not clear to me that this is an overall win.

Hi Paul:

Please create such a helper, because the alternative is for me to do
something like this in rhashtable:

	#define rht_dereference_rcu(p, ht) \
		rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht) || \
					 rcu_read_lock_any_held())

This really makes no sense, because rcu_read_lock_any_held is an
internal RCU implementation detail and has nothing to do with
rhashtable.

rhashtable is just a middle-man, like RCU. The actual context used (be
it vanilla, bh, or sched RCU) is entirely up to the user.

Actually, what puzzles me is why we can't just get rid of the bh and
sched variants of rcu_dereference. After all, there is only one
synchronize_rcu/call_rcu, and it supports all three variants.

Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  2025-09-01 15:00   ` Paul E. McKenney
  2025-09-02  6:59     ` Masami Hiramatsu
  2025-09-03  9:43     ` Herbert Xu
@ 2025-09-04  9:44     ` Herbert Xu
  2025-09-08 15:23       ` Paul E. McKenney
  2 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2025-09-04  9:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Masami Hiramatsu, Steven Rostedt, Menglong Dong, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, kernel test robot, tgraf, linux-crypto

On Mon, Sep 01, 2025 at 08:00:15AM -0700, Paul E. McKenney wrote:
>
> However, another option is for the above rcu_dereference_check() to
> become something like this:
>
> 	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht) ||
> 			      rcu_read_lock_any_held());
>
> This would be happy with any RCU reader, including rcu_read_lock(),
> preempt_disable(), local_irq_disable(), local_bh_disable(), and various
> handler contexts.  One downside is that this would *always* be happy in
> a kernel built with CONFIG_PREEMPT_{NONE,VOLUNTARY}=y.

Why is that a problem?  Assuming that it is always safe to perform
an rcu_dereference_all() on such a kernel, not warning would appear
to be the right thing to do.

> If this is happening often enough, it would be easy for me to create an
> rcu_dereference_all_check() that allows all forms of vanilla RCU readers
> (but not, for example, SRCU readers), but with only two use cases,
> it is not clear to me that this is an overall win.
>
> Or am I missing a turn in here somewhere?

As I said, rhashtable is just a middle-man: while it does some
RCU actions internally, in the end the user is free to choose
the RCU variant (vanilla, bh, sched).  So I'm faced with the
choice of using only rcu_dereference_all in rhashtable code,
or splitting the rhashtable API into three variants, just like
RCU.  I'd rather not do this, especially because it appears RCU
itself has essentially morphed into a single variant, given that
call_rcu etc. always waits for all three variants.

So how about this patch?

---8<---
Add rcu_dereference_all and rcu_dereference_all_check so that
library code such as rhashtable can be used with any RCU variant.

As it stands, rcu_dereference is used within rhashtable, which
creates false-positive warnings if the user calls it from another
RCU variant context, such as preempt_disable().

Use the rcu_dereference_all and rcu_dereference_all_check calls
in rhashtable to suppress these warnings.

Also replace the rcu_dereference_raw calls in the list iterators
with rcu_dereference_all to uncover buggy calls.

Reported-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 120536f4c6eb..b4261a0498f0 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -713,6 +713,23 @@ do { \
 				(c) || rcu_read_lock_sched_held(), \
 				__rcu)
 
+/**
+ * rcu_dereference_all_check() - rcu_dereference_all with debug checking
+ * @p: The pointer to read, prior to dereferencing
+ * @c: The conditions under which the dereference will take place
+ *
+ * This is the RCU-all counterpart to rcu_dereference_check().
+ * However, please note that starting in v5.0 kernels, vanilla RCU grace
+ * periods wait for preempt_disable() regions of code in addition to
+ * regions of code demarked by rcu_read_lock() and rcu_read_unlock().
+ * This means that synchronize_rcu(), call_rcu, and friends all take not
+ * only rcu_read_lock() but also rcu_read_lock_*() into account.
+ */
+#define rcu_dereference_all_check(p, c) \
+	__rcu_dereference_check((p), __UNIQUE_ID(rcu), \
+				(c) || rcu_read_lock_any_held(), \
+				__rcu)
+
 /*
  * The tracing infrastructure traces RCU (we want that), but unfortunately
  * some of the RCU checks causes tracing to lock up the system.
@@ -767,6 +784,14 @@ do { \
  */
 #define rcu_dereference_sched(p) rcu_dereference_sched_check(p, 0)
 
+/**
+ * rcu_dereference_all() - fetch RCU-all-protected pointer for dereferencing
+ * @p: The pointer to read, prior to dereferencing
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define rcu_dereference_all(p) rcu_dereference_all_check(p, 0)
+
 /**
  * rcu_pointer_handoff() - Hand off a pointer from RCU to other mechanism
  * @p: The pointer to hand off
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 6c85b28ea30b..e9b5ac26b42d 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -272,13 +272,13 @@ struct rhash_lock_head __rcu **rht_bucket_nested_insert(
 	rcu_dereference_protected(p, lockdep_rht_mutex_is_held(ht))
 
 #define rht_dereference_rcu(p, ht) \
-	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht))
+	rcu_dereference_all_check(p, lockdep_rht_mutex_is_held(ht))
 
 #define rht_dereference_bucket(p, tbl, hash) \
 	rcu_dereference_protected(p, lockdep_rht_bucket_is_held(tbl, hash))
 
 #define rht_dereference_bucket_rcu(p, tbl, hash) \
-	rcu_dereference_check(p, lockdep_rht_bucket_is_held(tbl, hash))
+	rcu_dereference_all_check(p, lockdep_rht_bucket_is_held(tbl, hash))
 
 #define rht_entry(tpos, pos, member) \
 	({ tpos = container_of(pos, typeof(*tpos), member); 1; })
@@ -373,7 +373,7 @@ static inline struct rhash_head *__rht_ptr(
 static inline struct rhash_head *rht_ptr_rcu(
 	struct rhash_lock_head __rcu *const *bkt)
 {
-	return __rht_ptr(rcu_dereference(*bkt), bkt);
+	return __rht_ptr(rcu_dereference_all(*bkt), bkt);
 }
 
 static inline struct rhash_head *rht_ptr(
@@ -497,7 +497,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
 	for (({barrier(); }), \
 	     pos = head; \
 	     !rht_is_a_nulls(pos); \
-	     pos = rcu_dereference_raw(pos->next))
+	     pos = rcu_dereference_all(pos->next))
 
 /**
  * rht_for_each_rcu - iterate over rcu hash chain
@@ -513,7 +513,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
 	for (({barrier(); }), \
 	     pos = rht_ptr_rcu(rht_bucket(tbl, hash)); \
 	     !rht_is_a_nulls(pos); \
-	     pos = rcu_dereference_raw(pos->next))
+	     pos = rcu_dereference_all(pos->next))
 
 /**
  * rht_for_each_entry_rcu_from - iterated over rcu hash chain from given head
@@ -560,7 +560,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
  * list returned by rhltable_lookup.
  */
 #define rhl_for_each_rcu(pos, list) \
-	for (pos = list; pos; pos = rcu_dereference_raw(pos->next))
+	for (pos = list; pos; pos = rcu_dereference_all(pos->next))
 
 /**
  * rhl_for_each_entry_rcu - iterate over rcu hash table list of given type
@@ -574,7 +574,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
  */
 #define rhl_for_each_entry_rcu(tpos, pos, list, member) \
 	for (pos = list; pos && rht_entry(tpos, pos, member); \
-	     pos = rcu_dereference_all(pos->next))
 
 static inline int rhashtable_compare(struct rhashtable_compare_arg *arg,
 				     const void *obj)

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  2025-09-04  9:44     ` [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check Herbert Xu
@ 2025-09-08 15:23       ` Paul E. McKenney
  2025-09-09  9:50         ` [v2 PATCH] " Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2025-09-08 15:23 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Masami Hiramatsu, Steven Rostedt, Menglong Dong, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, kernel test robot, tgraf, linux-crypto

On Thu, Sep 04, 2025 at 05:44:53PM +0800, Herbert Xu wrote:
> On Mon, Sep 01, 2025 at 08:00:15AM -0700, Paul E. McKenney wrote:
> >
> > However, another option is for the above rcu_dereference_check() to
> > become something like this:
> >
> > 	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht) ||
> > 			      rcu_read_lock_any_held());
> >
> > This would be happy with any RCU reader, including rcu_read_lock(),
> > preempt_disable(), local_irq_disable(), local_bh_disable(), and various
> > handler contexts.  One downside is that this would *always* be happy in
> > a kernel built with CONFIG_PREEMPT_{NONE,VOLUNTARY}=y.
>
> Why is that a problem?  Assuming that it is always safe to perform
> an rcu_dereference_all() on such a kernel, not warning would appear
> to be the right thing to do.
>
> > If this is happening often enough, it would be easy for me to create an
> > rcu_dereference_all_check() that allows all forms of vanilla RCU readers
> > (but not, for example, SRCU readers), but with only two use cases,
> > it is not clear to me that this is an overall win.
> >
> > Or am I missing a turn in here somewhere?
>
> As I said, rhashtable is just a middle-man: while it does some
> RCU actions internally, in the end the user is free to choose
> the RCU variant (vanilla, bh, sched).  So I'm faced with the
> choice of using only rcu_dereference_all in rhashtable code,
> or splitting the rhashtable API into three variants, just like
> RCU.  I'd rather not do this, especially because it appears RCU
> itself has essentially morphed into a single variant, given that
> call_rcu etc. always waits for all three variants.
>
> So how about this patch?
>
> ---8<---
> Add rcu_dereference_all and rcu_dereference_all_check so that
> library code such as rhashtable can be used with any RCU variant.
>
> As it stands, rcu_dereference is used within rhashtable, which
> creates false-positive warnings if the user calls it from another
> RCU variant context, such as preempt_disable().
>
> Use the rcu_dereference_all and rcu_dereference_all_check calls
> in rhashtable to suppress these warnings.
>
> Also replace the rcu_dereference_raw calls in the list iterators
> with rcu_dereference_all to uncover buggy calls.
>
> Reported-by: Menglong Dong <dongml2@chinatelecom.cn>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I am guessing that you want to send this up via the rhashtable path.

> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 120536f4c6eb..b4261a0498f0 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -713,6 +713,23 @@ do { \
>  				(c) || rcu_read_lock_sched_held(), \
>  				__rcu)
>  
> +/**
> + * rcu_dereference_all_check() - rcu_dereference_all with debug checking
> + * @p: The pointer to read, prior to dereferencing
> + * @c: The conditions under which the dereference will take place
> + *
> + * This is the RCU-all counterpart to rcu_dereference_check().
> + * However, please note that starting in v5.0 kernels, vanilla RCU grace
> + * periods wait for preempt_disable() regions of code in addition to
> + * regions of code demarked by rcu_read_lock() and rcu_read_unlock().
> + * This means that synchronize_rcu(), call_rcu, and friends all take not
> + * only rcu_read_lock() but also rcu_read_lock_*() into account.

How about something like this?

 * This is similar to rcu_dereference_check(), but allows protection
 * by all forms of vanilla RCU readers, including preemption disabled,
 * bh-disabled, and interrupt-disabled regions of code.  Note that "vanilla
 * RCU" excludes SRCU and the various Tasks RCU flavors.  Please note
 * that this macro should not be backported to any Linux-kernel version
 * preceding v5.0 due to changes in synchronize_rcu() semantics prior
 * to that version.

The "should not" vs. "can not" accounts for the possibility of people
using synchronize_rcu_mult(), but someone wanting to do that best know
what they are doing.  ;-)

> + */
> +#define rcu_dereference_all_check(p, c) \
> +	__rcu_dereference_check((p), __UNIQUE_ID(rcu), \
> +				(c) || rcu_read_lock_any_held(), \
> +				__rcu)
> +
>  /*
>   * The tracing infrastructure traces RCU (we want that), but unfortunately
>   * some of the RCU checks causes tracing to lock up the system.
> @@ -767,6 +784,14 @@ do { \
>   */
>  #define rcu_dereference_sched(p) rcu_dereference_sched_check(p, 0)
>  
> +/**
> + * rcu_dereference_all() - fetch RCU-all-protected pointer for dereferencing
> + * @p: The pointer to read, prior to dereferencing
> + *
> + * Makes rcu_dereference_check() do the dirty work.
> + */
> +#define rcu_dereference_all(p) rcu_dereference_all_check(p, 0)
> +

With some comment-header change similar to the above:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

>  /**
>   * rcu_pointer_handoff() - Hand off a pointer from RCU to other mechanism
>   * @p: The pointer to hand off
> diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
> index 6c85b28ea30b..e9b5ac26b42d 100644
> --- a/include/linux/rhashtable.h
> +++ b/include/linux/rhashtable.h
> @@ -272,13 +272,13 @@ struct rhash_lock_head __rcu **rht_bucket_nested_insert(
>  	rcu_dereference_protected(p, lockdep_rht_mutex_is_held(ht))
>  
>  #define rht_dereference_rcu(p, ht) \
> -	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht))
> +	rcu_dereference_all_check(p, lockdep_rht_mutex_is_held(ht))
>  
>  #define rht_dereference_bucket(p, tbl, hash) \
>  	rcu_dereference_protected(p, lockdep_rht_bucket_is_held(tbl, hash))
>  
>  #define rht_dereference_bucket_rcu(p, tbl, hash) \
> -	rcu_dereference_check(p, lockdep_rht_bucket_is_held(tbl, hash))
> +	rcu_dereference_all_check(p, lockdep_rht_bucket_is_held(tbl, hash))
>  
>  #define rht_entry(tpos, pos, member) \
>  	({ tpos = container_of(pos, typeof(*tpos), member); 1; })
> @@ -373,7 +373,7 @@ static inline struct rhash_head *__rht_ptr(
>  static inline struct rhash_head *rht_ptr_rcu(
>  	struct rhash_lock_head __rcu *const *bkt)
>  {
> -	return __rht_ptr(rcu_dereference(*bkt), bkt);
> +	return __rht_ptr(rcu_dereference_all(*bkt), bkt);
>  }
>  
>  static inline struct rhash_head *rht_ptr(
> @@ -497,7 +497,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>  	for (({barrier(); }), \
>  	     pos = head; \
>  	     !rht_is_a_nulls(pos); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rht_for_each_rcu - iterate over rcu hash chain
> @@ -513,7 +513,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>  	for (({barrier(); }), \
>  	     pos = rht_ptr_rcu(rht_bucket(tbl, hash)); \
>  	     !rht_is_a_nulls(pos); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rht_for_each_entry_rcu_from - iterated over rcu hash chain from given head
> @@ -560,7 +560,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>   * list returned by rhltable_lookup.
>   */
>  #define rhl_for_each_rcu(pos, list) \
> -	for (pos = list; pos; pos = rcu_dereference_raw(pos->next))
> +	for (pos = list; pos; pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rhl_for_each_entry_rcu - iterate over rcu hash table list of given type
> @@ -574,7 +574,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>   */
>  #define rhl_for_each_entry_rcu(tpos, pos, list, member) \
>  	for (pos = list; pos && rht_entry(tpos, pos, member); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  static inline int rhashtable_compare(struct rhashtable_compare_arg *arg,
>  				     const void *obj)
>
> Cheers,
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread
* [v2 PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  2025-09-08 15:23       ` Paul E. McKenney
@ 2025-09-09  9:50         ` Herbert Xu
  2025-09-25 10:17           ` Andrea Righi
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2025-09-09  9:50 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Masami Hiramatsu, Steven Rostedt, Menglong Dong, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, kernel test robot, tgraf, linux-crypto

On Mon, Sep 08, 2025 at 08:23:27AM -0700, Paul E. McKenney wrote:
>
> I am guessing that you want to send this up via the rhashtable path.

Yes, I could push that along.

> * This is similar to rcu_dereference_check(), but allows protection
> * by all forms of vanilla RCU readers, including preemption disabled,
> * bh-disabled, and interrupt-disabled regions of code.  Note that "vanilla
> * RCU" excludes SRCU and the various Tasks RCU flavors.  Please note
> * that this macro should not be backported to any Linux-kernel version
> * preceding v5.0 due to changes in synchronize_rcu() semantics prior
> * to that version.
>
> The "should not" vs. "can not" accounts for the possibility of people
> using synchronize_rcu_mult(), but someone wanting to do that best know
> what they are doing.  ;-)

Thanks!  I've incorporated that into the patch:

---8<---
Add rcu_dereference_all and rcu_dereference_all_check so that
library code such as rhashtable can be used with any RCU variant.

As it stands, rcu_dereference is used within rhashtable, which
creates false-positive warnings if the user calls it from another
RCU context, such as preempt_disable().

Use the rcu_dereference_all and rcu_dereference_all_check calls
in rhashtable to suppress these warnings.

Also replace the rcu_dereference_raw calls in the list iterators
with rcu_dereference_all to uncover buggy calls.

Reported-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 120536f4c6eb..448eb1f0cb48 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -713,6 +713,24 @@ do { \
 				(c) || rcu_read_lock_sched_held(), \
 				__rcu)
 
+/**
+ * rcu_dereference_all_check() - rcu_dereference_all with debug checking
+ * @p: The pointer to read, prior to dereferencing
+ * @c: The conditions under which the dereference will take place
+ *
+ * This is similar to rcu_dereference_check(), but allows protection
+ * by all forms of vanilla RCU readers, including preemption disabled,
+ * bh-disabled, and interrupt-disabled regions of code.  Note that "vanilla
+ * RCU" excludes SRCU and the various Tasks RCU flavors.  Please note
+ * that this macro should not be backported to any Linux-kernel version
+ * preceding v5.0 due to changes in synchronize_rcu() semantics prior
+ * to that version.
+ */
+#define rcu_dereference_all_check(p, c) \
+	__rcu_dereference_check((p), __UNIQUE_ID(rcu), \
+				(c) || rcu_read_lock_any_held(), \
+				__rcu)
+
 /*
  * The tracing infrastructure traces RCU (we want that), but unfortunately
  * some of the RCU checks causes tracing to lock up the system.
@@ -767,6 +785,14 @@ do { \
  */
 #define rcu_dereference_sched(p) rcu_dereference_sched_check(p, 0)
 
+/**
+ * rcu_dereference_all() - fetch RCU-all-protected pointer for dereferencing
+ * @p: The pointer to read, prior to dereferencing
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define rcu_dereference_all(p) rcu_dereference_all_check(p, 0)
+
 /**
  * rcu_pointer_handoff() - Hand off a pointer from RCU to other mechanism
  * @p: The pointer to hand off
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index e740157f3cd7..05a221ce79a6 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -272,13 +272,13 @@ struct rhash_lock_head __rcu **rht_bucket_nested_insert(
 	rcu_dereference_protected(p, lockdep_rht_mutex_is_held(ht))
 
 #define rht_dereference_rcu(p, ht) \
-	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht))
+	rcu_dereference_all_check(p, lockdep_rht_mutex_is_held(ht))
 
 #define rht_dereference_bucket(p, tbl, hash) \
 	rcu_dereference_protected(p, lockdep_rht_bucket_is_held(tbl, hash))
 
 #define rht_dereference_bucket_rcu(p, tbl, hash) \
-	rcu_dereference_check(p, lockdep_rht_bucket_is_held(tbl, hash))
+	rcu_dereference_all_check(p, lockdep_rht_bucket_is_held(tbl, hash))
 
 #define rht_entry(tpos, pos, member) \
 	({ tpos = container_of(pos, typeof(*tpos), member); 1; })
@@ -373,7 +373,7 @@ static inline struct rhash_head *__rht_ptr(
 static inline struct rhash_head *rht_ptr_rcu(
 	struct rhash_lock_head __rcu *const *bkt)
 {
-	return __rht_ptr(rcu_dereference(*bkt), bkt);
+	return __rht_ptr(rcu_dereference_all(*bkt), bkt);
 }
 
 static inline struct rhash_head *rht_ptr(
@@ -497,7 +497,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
 	for (({barrier(); }), \
 	     pos = head; \
 	     !rht_is_a_nulls(pos); \
-	     pos = rcu_dereference_raw(pos->next))
+	     pos = rcu_dereference_all(pos->next))
 
 /**
  * rht_for_each_rcu - iterate over rcu hash chain
@@ -513,7 +513,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
 	for (({barrier(); }), \
 	     pos = rht_ptr_rcu(rht_bucket(tbl, hash)); \
 	     !rht_is_a_nulls(pos); \
-	     pos = rcu_dereference_raw(pos->next))
+	     pos = rcu_dereference_all(pos->next))
 
 /**
  * rht_for_each_entry_rcu_from - iterated over rcu hash chain from given head
@@ -560,7 +560,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
  * list returned by rhltable_lookup.
  */
 #define rhl_for_each_rcu(pos, list) \
-	for (pos = list; pos; pos = rcu_dereference_raw(pos->next))
+	for (pos = list; pos; pos = rcu_dereference_all(pos->next))
 
 /**
  * rhl_for_each_entry_rcu - iterate over rcu hash table list of given type
@@ -574,7 +574,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
  */
 #define rhl_for_each_entry_rcu(tpos, pos, list, member) \
 	for (pos = list; pos && rht_entry(tpos, pos, member); \
-	     pos = rcu_dereference_raw(pos->next))
+	     pos = rcu_dereference_all(pos->next))
 
 static inline int rhashtable_compare(struct rhashtable_compare_arg *arg,
 				     const void *obj)
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [v2 PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  2025-09-09  9:50         ` [v2 PATCH] " Herbert Xu
@ 2025-09-25 10:17           ` Andrea Righi
  0 siblings, 0 replies; 11+ messages in thread
From: Andrea Righi @ 2025-09-25 10:17 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul E. McKenney, Masami Hiramatsu, Steven Rostedt, Menglong Dong,
	mathieu.desnoyers, linux-kernel, linux-trace-kernel,
	kernel test robot, tgraf, linux-crypto

Hi Herbert,

On Tue, Sep 09, 2025 at 05:50:56PM +0800, Herbert Xu wrote:
> On Mon, Sep 08, 2025 at 08:23:27AM -0700, Paul E. McKenney wrote:
> >
> > I am guessing that you want to send this up via the rhashtable path.
>
> Yes, I could push that along.
>
> > * This is similar to rcu_dereference_check(), but allows protection
> > * by all forms of vanilla RCU readers, including preemption disabled,
> > * bh-disabled, and interrupt-disabled regions of code.  Note that "vanilla
> > * RCU" excludes SRCU and the various Tasks RCU flavors.  Please note
> > * that this macro should not be backported to any Linux-kernel version
> > * preceding v5.0 due to changes in synchronize_rcu() semantics prior
> > * to that version.
> >
> > The "should not" vs. "can not" accounts for the possibility of people
> > using synchronize_rcu_mult(), but someone wanting to do that best know
> > what they are doing.  ;-)
>
> Thanks!  I've incorporated that into the patch:
>
> ---8<---
> Add rcu_dereference_all and rcu_dereference_all_check so that
> library code such as rhashtable can be used with any RCU variant.
>
> As it stands, rcu_dereference is used within rhashtable, which
> creates false-positive warnings if the user calls it from another
> RCU context, such as preempt_disable().
>
> Use the rcu_dereference_all and rcu_dereference_all_check calls
> in rhashtable to suppress these warnings.
>
> Also replace the rcu_dereference_raw calls in the list iterators
> with rcu_dereference_all to uncover buggy calls.
>
> Reported-by: Menglong Dong <dongml2@chinatelecom.cn>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

We hit the same issue in sched_ext, and with this applied lockdep
seems happy.  FWIW:

Tested-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea

> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 120536f4c6eb..448eb1f0cb48 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -713,6 +713,24 @@ do { \
>  				(c) || rcu_read_lock_sched_held(), \
>  				__rcu)
>  
> +/**
> + * rcu_dereference_all_check() - rcu_dereference_all with debug checking
> + * @p: The pointer to read, prior to dereferencing
> + * @c: The conditions under which the dereference will take place
> + *
> + * This is similar to rcu_dereference_check(), but allows protection
> + * by all forms of vanilla RCU readers, including preemption disabled,
> + * bh-disabled, and interrupt-disabled regions of code.  Note that "vanilla
> + * RCU" excludes SRCU and the various Tasks RCU flavors.  Please note
> + * that this macro should not be backported to any Linux-kernel version
> + * preceding v5.0 due to changes in synchronize_rcu() semantics prior
> + * to that version.
> + */
> +#define rcu_dereference_all_check(p, c) \
> +	__rcu_dereference_check((p), __UNIQUE_ID(rcu), \
> +				(c) || rcu_read_lock_any_held(), \
> +				__rcu)
> +
>  /*
>   * The tracing infrastructure traces RCU (we want that), but unfortunately
>   * some of the RCU checks causes tracing to lock up the system.
> @@ -767,6 +785,14 @@ do { \
>   */
>  #define rcu_dereference_sched(p) rcu_dereference_sched_check(p, 0)
>  
> +/**
> + * rcu_dereference_all() - fetch RCU-all-protected pointer for dereferencing
> + * @p: The pointer to read, prior to dereferencing
> + *
> + * Makes rcu_dereference_check() do the dirty work.
> + */
> +#define rcu_dereference_all(p) rcu_dereference_all_check(p, 0)
> +
>  /**
>   * rcu_pointer_handoff() - Hand off a pointer from RCU to other mechanism
>   * @p: The pointer to hand off
> diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
> index e740157f3cd7..05a221ce79a6 100644
> --- a/include/linux/rhashtable.h
> +++ b/include/linux/rhashtable.h
> @@ -272,13 +272,13 @@ struct rhash_lock_head __rcu **rht_bucket_nested_insert(
>  	rcu_dereference_protected(p, lockdep_rht_mutex_is_held(ht))
>  
>  #define rht_dereference_rcu(p, ht) \
> -	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht))
> +	rcu_dereference_all_check(p, lockdep_rht_mutex_is_held(ht))
>  
>  #define rht_dereference_bucket(p, tbl, hash) \
>  	rcu_dereference_protected(p, lockdep_rht_bucket_is_held(tbl, hash))
>  
>  #define rht_dereference_bucket_rcu(p, tbl, hash) \
> -	rcu_dereference_check(p, lockdep_rht_bucket_is_held(tbl, hash))
> +	rcu_dereference_all_check(p, lockdep_rht_bucket_is_held(tbl, hash))
>  
>  #define rht_entry(tpos, pos, member) \
>  	({ tpos = container_of(pos, typeof(*tpos), member); 1; })
> @@ -373,7 +373,7 @@ static inline struct rhash_head *__rht_ptr(
>  static inline struct rhash_head *rht_ptr_rcu(
>  	struct rhash_lock_head __rcu *const *bkt)
>  {
> -	return __rht_ptr(rcu_dereference(*bkt), bkt);
> +	return __rht_ptr(rcu_dereference_all(*bkt), bkt);
>  }
>  
>  static inline struct rhash_head *rht_ptr(
> @@ -497,7 +497,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>  	for (({barrier(); }), \
>  	     pos = head; \
>  	     !rht_is_a_nulls(pos); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rht_for_each_rcu - iterate over rcu hash chain
> @@ -513,7 +513,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>  	for (({barrier(); }), \
>  	     pos = rht_ptr_rcu(rht_bucket(tbl, hash)); \
>  	     !rht_is_a_nulls(pos); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rht_for_each_entry_rcu_from - iterated over rcu hash chain from given head
> @@ -560,7 +560,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>   * list returned by rhltable_lookup.
>   */
>  #define rhl_for_each_rcu(pos, list) \
> -	for (pos = list; pos; pos = rcu_dereference_raw(pos->next))
> +	for (pos = list; pos; pos = rcu_dereference_all(pos->next))
>  
>  /**
>   * rhl_for_each_entry_rcu - iterate over rcu hash table list of given type
> @@ -574,7 +574,7 @@ static inline void rht_assign_unlock(struct bucket_table *tbl,
>   */
>  #define rhl_for_each_entry_rcu(tpos, pos, list, member) \
>  	for (pos = list; pos && rht_entry(tpos, pos, member); \
> -	     pos = rcu_dereference_raw(pos->next))
> +	     pos = rcu_dereference_all(pos->next))
>  
>  static inline int rhashtable_compare(struct rhashtable_compare_arg *arg,
>  				     const void *obj)
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry
  2025-08-29 11:11 ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Paul E. McKenney
  2025-09-01  8:06   ` Masami Hiramatsu
@ 2025-09-01 10:06   ` Herbert Xu
  1 sibling, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2025-09-01 10:06 UTC (permalink / raw)
  To: paulmck
  Cc: rostedt, dongml2, mhiramat, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, oliver.sang, tgraf, linux-crypto

Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Yes, preempt_disable() does indeed start an RCU read-side critical section,
> just as surely as rcu_read_lock() does.
>
> However, this is a lockdep check inside of __rhashtable_lookup():
>
> 	rht_dereference_rcu(ht->tbl, ht)
>
> Which is defined as:
>
> 	rcu_dereference_check(p, lockdep_rht_mutex_is_held(ht));
>
> This is explicitly telling lockdep that rcu_read_lock() is OK and
> holding ht->mutex is OK, but nothing else is.

I think that's a deficiency in rcu_dereference_check.  Yes, I could
certainly add a preemption check to rht_dereference_rcu, but that
makes zero sense, because this is an implementation detail of RCU and
there is no reason why this logic should be added to rhashtable.

rhashtable never relies on the fact that turning off preemption is
safe for RCU reads, so it makes no sense to add this logic to
rht_dereference_rcu, since RCU could conceivably (even if it is
unlikely) be changed one day so that turning off preemption is no
longer safe for RCU reads.

My preference would be to add the preemption test to
rcu_dereference_check, or, if that is not possible for some reason, to
create a new RCU helper that includes the preemption test.

Of course, just adding RCU read locks as this patch does is also fine.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-09-25 10:18 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20250829021436.19982-1-dongml2@chinatelecom.cn>
[not found] ` <20250828222357.55fab4c2@batman.local.home>
2025-08-29 11:11 ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Paul E. McKenney
2025-09-01 8:06 ` Masami Hiramatsu
2025-09-01 15:00 ` Paul E. McKenney
2025-09-02 6:59 ` Masami Hiramatsu
2025-09-02 11:58 ` Paul E. McKenney
2025-09-03 9:43 ` Herbert Xu
2025-09-04 9:44 ` [PATCH] rhashtable: Use rcu_dereference_all and rcu_dereference_all_check Herbert Xu
2025-09-08 15:23 ` Paul E. McKenney
2025-09-09 9:50 ` [v2 PATCH] " Herbert Xu
2025-09-25 10:17 ` Andrea Righi
2025-09-01 10:06 ` [PATCH] tracing: fprobe: fix suspicious rcu usage in fprobe_entry Herbert Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).