* [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
@ 2025-04-03 17:18 Andrii Nakryiko
2025-04-03 17:49 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 7+ messages in thread
From: Andrii Nakryiko @ 2025-04-03 17:18 UTC (permalink / raw)
To: linux-trace-kernel, peterz, mingo
Cc: bpf, linux-kernel, kernel-team, rostedt, oleg, mhiramat, bigeasy,
ast, Andrii Nakryiko
Avoid a false-positive lockdep warning in PREEMPT_RT configuration when
using write_seqcount_begin() in uprobe timer callback by using
raw_write_* APIs. Uprobe's use of timer callback is guaranteed to not
race with itself, and as such seqcount's insistence on having hardirqs
disabled on the writer side is irrelevant. So switch to raw_ variants of
seqcount API instead of disabling hardirqs unnecessarily.
Also, point out in the comments more explicitly why we use seqcount
despite our reader side being rather simple and never retrying. We favor
well-maintained kernel primitive in favor of open-coding our own memory
barriers.
Link: https://lore.kernel.org/bpf/CAADnVQLLOHZmPO4X_dQ+cTaSDvzdWHzA0qUqQDhLFYL3D6xPxg@mail.gmail.com/
Reported-by: Alexei Starovoitov <ast@kernel.org>
Suggested-by: Sebastian Sewior <bigeasy@linutronix.de>
Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
kernel/events/uprobes.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 70c84b9d7be3..6d7e7da0fbbc 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1944,6 +1944,9 @@ static void free_ret_instance(struct uprobe_task *utask,
* to-be-reused return instances for future uretprobes. If ri_timer()
* happens to be running right now, though, we fallback to safety and
* just perform RCU-delated freeing of ri.
+ * Admittedly, this is a rather simple use of seqcount, but it nicely
+ * abstracts away all the necessary memory barriers, so we use
+ * a well-supported kernel primitive here.
*/
if (raw_seqcount_try_begin(&utask->ri_seqcount, seq)) {
/* immediate reuse of ri without RCU GP is OK */
@@ -2004,12 +2007,18 @@ static void ri_timer(struct timer_list *timer)
/* RCU protects return_instance from freeing. */
guard(rcu)();
- write_seqcount_begin(&utask->ri_seqcount);
+ /* See free_ret_instance() for notes on seqcount use.
+ * We also employ raw API variants to avoid lockdep false-positive
+ * warning complaining about hardirqs not being disabled. We have
+ * a guarantee that this timer callback won't race with itself, so no
+ * need to disable hardirqs.
+ */
+ raw_write_seqcount_begin(&utask->ri_seqcount);
for_each_ret_instance_rcu(ri, utask->return_instances)
hprobe_expire(&ri->hprobe, false);
- write_seqcount_end(&utask->ri_seqcount);
+ raw_write_seqcount_end(&utask->ri_seqcount);
}
static struct uprobe_task *alloc_utask(void)
--
2.47.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 17:18 [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback Andrii Nakryiko
@ 2025-04-03 17:49 ` Sebastian Andrzej Siewior
2025-04-03 17:53 ` Steven Rostedt
2025-04-03 18:30 ` Andrii Nakryiko
0 siblings, 2 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-03 17:49 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, peterz, mingo, bpf, linux-kernel, kernel-team,
rostedt, oleg, mhiramat, ast
On 2025-04-03 10:18:31 [-0700], Andrii Nakryiko wrote:
> Avoid a false-positive lockdep warning in PREEMPT_RT configuration when
> using write_seqcount_begin() in uprobe timer callback by using
> raw_write_* APIs. Uprobe's use of timer callback is guaranteed to not
> race with itself, and as such seqcount's insistence on having hardirqs
preemption, not hardirqs
> disabled on the writer side is irrelevant. So switch to raw_ variants of
> seqcount API instead of disabling hardirqs unnecessarily.
>
> Also, point out in the comments more explicitly why we use seqcount
> despite our reader side being rather simple and never retrying. We favor
> well-maintained kernel primitive in favor of open-coding our own memory
> barriers.
Thank you.
> Link: https://lore.kernel.org/bpf/CAADnVQLLOHZmPO4X_dQ+cTaSDvzdWHzA0qUqQDhLFYL3D6xPxg@mail.gmail.com/
> Reported-by: Alexei Starovoitov <ast@kernel.org>
> Suggested-by: Sebastian Sewior <bigeasy@linutronix.de>
> Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task")
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> kernel/events/uprobes.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 70c84b9d7be3..6d7e7da0fbbc 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1944,6 +1944,9 @@ static void free_ret_instance(struct uprobe_task *utask,
> * to-be-reused return instances for future uretprobes. If ri_timer()
> * happens to be running right now, though, we fallback to safety and
> * just perform RCU-delated freeing of ri.
> + * Admittedly, this is a rather simple use of seqcount, but it nicely
> + * abstracts away all the necessary memory barriers, so we use
> + * a well-supported kernel primitive here.
> */
> if (raw_seqcount_try_begin(&utask->ri_seqcount, seq)) {
> /* immediate reuse of ri without RCU GP is OK */
> @@ -2004,12 +2007,18 @@ static void ri_timer(struct timer_list *timer)
> /* RCU protects return_instance from freeing. */
> guard(rcu)();
>
> - write_seqcount_begin(&utask->ri_seqcount);
> + /* See free_ret_instance() for notes on seqcount use.
This is not a proper multi line comment.
> + * We also employ raw API variants to avoid lockdep false-positive
> + * warning complaining about hardirqs not being disabled. We have
s/hardirqs/preemption. The warning is about missing disabled preemption.
> + * a guarantee that this timer callback won't race with itself, so no
> + * need to disable hardirqs.
The timer can only be invoked once for a uprobe_task. Therefore there
can only be one writer. The reader does not require an even sequence
count so it is okay to remain preemptible on PREEMPT_RT.
> + */
> + raw_write_seqcount_begin(&utask->ri_seqcount);
>
> for_each_ret_instance_rcu(ri, utask->return_instances)
> hprobe_expire(&ri->hprobe, false);
>
> - write_seqcount_end(&utask->ri_seqcount);
> + raw_write_seqcount_end(&utask->ri_seqcount);
> }
>
> static struct uprobe_task *alloc_utask(void)
Sebastian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 17:49 ` Sebastian Andrzej Siewior
@ 2025-04-03 17:53 ` Steven Rostedt
2025-04-03 17:56 ` Sebastian Andrzej Siewior
2025-04-03 18:30 ` Andrii Nakryiko
1 sibling, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2025-04-03 17:53 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Andrii Nakryiko, linux-trace-kernel, peterz, mingo, bpf,
linux-kernel, kernel-team, oleg, mhiramat, ast
On Thu, 3 Apr 2025 19:49:17 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > + /* See free_ret_instance() for notes on seqcount use.
>
> This is not a proper multi line comment.
It's only proper in the networking code, but not the rest of the kernel.
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 17:53 ` Steven Rostedt
@ 2025-04-03 17:56 ` Sebastian Andrzej Siewior
2025-04-03 18:53 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-03 17:56 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrii Nakryiko, linux-trace-kernel, peterz, mingo, bpf,
linux-kernel, kernel-team, oleg, mhiramat, ast
On 2025-04-03 13:53:31 [-0400], Steven Rostedt wrote:
> On Thu, 3 Apr 2025 19:49:17 +0200
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
>
> > > + /* See free_ret_instance() for notes on seqcount use.
> >
> > This is not a proper multi line comment.
>
> It's only proper in the networking code, but not the rest of the kernel.
I wasn't aware that uprobe is following networking standards here.
> -- Steve
Sebastian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 17:49 ` Sebastian Andrzej Siewior
2025-04-03 17:53 ` Steven Rostedt
@ 2025-04-03 18:30 ` Andrii Nakryiko
2025-04-04 8:36 ` Ingo Molnar
1 sibling, 1 reply; 7+ messages in thread
From: Andrii Nakryiko @ 2025-04-03 18:30 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Andrii Nakryiko, linux-trace-kernel, peterz, mingo, bpf,
linux-kernel, kernel-team, rostedt, oleg, mhiramat, ast
On Thu, Apr 3, 2025 at 10:49 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2025-04-03 10:18:31 [-0700], Andrii Nakryiko wrote:
> > Avoid a false-positive lockdep warning in PREEMPT_RT configuration when
> > using write_seqcount_begin() in uprobe timer callback by using
> > raw_write_* APIs. Uprobe's use of timer callback is guaranteed to not
> > race with itself, and as such seqcount's insistence on having hardirqs
> preemption, not hardirqs
>
> > disabled on the writer side is irrelevant. So switch to raw_ variants of
> > seqcount API instead of disabling hardirqs unnecessarily.
> >
> > Also, point out in the comments more explicitly why we use seqcount
> > despite our reader side being rather simple and never retrying. We favor
> > well-maintained kernel primitive in favor of open-coding our own memory
> > barriers.
>
> Thank you.
>
> > Link: https://lore.kernel.org/bpf/CAADnVQLLOHZmPO4X_dQ+cTaSDvzdWHzA0qUqQDhLFYL3D6xPxg@mail.gmail.com/
> > Reported-by: Alexei Starovoitov <ast@kernel.org>
> > Suggested-by: Sebastian Sewior <bigeasy@linutronix.de>
> > Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task")
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > kernel/events/uprobes.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > index 70c84b9d7be3..6d7e7da0fbbc 100644
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -1944,6 +1944,9 @@ static void free_ret_instance(struct uprobe_task *utask,
> > * to-be-reused return instances for future uretprobes. If ri_timer()
> > * happens to be running right now, though, we fallback to safety and
> > * just perform RCU-delated freeing of ri.
> > + * Admittedly, this is a rather simple use of seqcount, but it nicely
> > + * abstracts away all the necessary memory barriers, so we use
> > + * a well-supported kernel primitive here.
> > */
> > if (raw_seqcount_try_begin(&utask->ri_seqcount, seq)) {
> > /* immediate reuse of ri without RCU GP is OK */
> > @@ -2004,12 +2007,18 @@ static void ri_timer(struct timer_list *timer)
> > /* RCU protects return_instance from freeing. */
> > guard(rcu)();
> >
> > - write_seqcount_begin(&utask->ri_seqcount);
>
> > + /* See free_ret_instance() for notes on seqcount use.
>
> This is not a proper multi line comment.
yep, will fix; no, uprobe is not networking, this style is just
ingrained in my brain from working in BPF code base for a while
>
> > + * We also employ raw API variants to avoid lockdep false-positive
> > + * warning complaining about hardirqs not being disabled. We have
>
> s/hardirqs/preemption. The warning is about missing disabled preemption.
Right, sorry, the `this_cpu_read(hardirqs_enabled)` part of the check
in lockdep_assert_preemption_disabled() made too strong an impression
on me :) Will fix.
>
> > + * a guarantee that this timer callback won't race with itself, so no
> > + * need to disable hardirqs.
>
> The timer can only be invoked once for a uprobe_task. Therefore there
> can only be one writer. The reader does not require an even sequence
> count so it is okay to remain preemptible on PREEMPT_RT.
>
> > + */
> > + raw_write_seqcount_begin(&utask->ri_seqcount);
> >
> > for_each_ret_instance_rcu(ri, utask->return_instances)
> > hprobe_expire(&ri->hprobe, false);
> >
> > - write_seqcount_end(&utask->ri_seqcount);
> > + raw_write_seqcount_end(&utask->ri_seqcount);
> > }
> >
> > static struct uprobe_task *alloc_utask(void)
>
> Sebastian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 17:56 ` Sebastian Andrzej Siewior
@ 2025-04-03 18:53 ` Steven Rostedt
0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2025-04-03 18:53 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Andrii Nakryiko, linux-trace-kernel, peterz, mingo, bpf,
linux-kernel, kernel-team, oleg, mhiramat, ast
On Thu, 3 Apr 2025 19:56:19 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> On 2025-04-03 13:53:31 [-0400], Steven Rostedt wrote:
> > On Thu, 3 Apr 2025 19:49:17 +0200
> > Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> >
> > > > + /* See free_ret_instance() for notes on seqcount use.
> > >
> > > This is not a proper multi line comment.
> >
> > It's only proper in the networking code, but not the rest of the kernel.
>
> I wasn't aware that uprobe is following networking standards here.
It's not, but I know that Andrii works a bit with the networking code.
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback
2025-04-03 18:30 ` Andrii Nakryiko
@ 2025-04-04 8:36 ` Ingo Molnar
0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2025-04-04 8:36 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Sebastian Andrzej Siewior, Andrii Nakryiko, linux-trace-kernel,
peterz, bpf, linux-kernel, kernel-team, rostedt, oleg, mhiramat,
ast
* Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Thu, Apr 3, 2025 at 10:49 AM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2025-04-03 10:18:31 [-0700], Andrii Nakryiko wrote:
> > > Avoid a false-positive lockdep warning in PREEMPT_RT configuration when
> > > using write_seqcount_begin() in uprobe timer callback by using
> > > raw_write_* APIs. Uprobe's use of timer callback is guaranteed to not
> > > race with itself, and as such seqcount's insistence on having hardirqs
> > preemption, not hardirqs
> >
> > > disabled on the writer side is irrelevant. So switch to raw_ variants of
> > > seqcount API instead of disabling hardirqs unnecessarily.
> > >
> > > Also, point out in the comments more explicitly why we use seqcount
> > > despite our reader side being rather simple and never retrying. We favor
> > > well-maintained kernel primitive in favor of open-coding our own memory
> > > barriers.
> >
> > Thank you.
> >
> > > Link: https://lore.kernel.org/bpf/CAADnVQLLOHZmPO4X_dQ+cTaSDvzdWHzA0qUqQDhLFYL3D6xPxg@mail.gmail.com/
> > > Reported-by: Alexei Starovoitov <ast@kernel.org>
> > > Suggested-by: Sebastian Sewior <bigeasy@linutronix.de>
> > > Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task")
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > > kernel/events/uprobes.c | 13 +++++++++++--
> > > 1 file changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > > index 70c84b9d7be3..6d7e7da0fbbc 100644
> > > --- a/kernel/events/uprobes.c
> > > +++ b/kernel/events/uprobes.c
> > > @@ -1944,6 +1944,9 @@ static void free_ret_instance(struct uprobe_task *utask,
> > > * to-be-reused return instances for future uretprobes. If ri_timer()
> > > * happens to be running right now, though, we fallback to safety and
> > > * just perform RCU-delated freeing of ri.
> > > + * Admittedly, this is a rather simple use of seqcount, but it nicely
> > > + * abstracts away all the necessary memory barriers, so we use
> > > + * a well-supported kernel primitive here.
> > > */
> > > if (raw_seqcount_try_begin(&utask->ri_seqcount, seq)) {
> > > /* immediate reuse of ri without RCU GP is OK */
> > > @@ -2004,12 +2007,18 @@ static void ri_timer(struct timer_list *timer)
> > > /* RCU protects return_instance from freeing. */
> > > guard(rcu)();
> > >
> > > - write_seqcount_begin(&utask->ri_seqcount);
> >
> > > + /* See free_ret_instance() for notes on seqcount use.
> >
> > This is not a proper multi line comment.
>
> yep, will fix; no, uprobe is not networking, this style is just
> ingrained in my brain from working in BPF code base for a while
... and this example underlines why we've been asking the networking
folks for years to use the standard Linux kernel coding style for
comments, instead of creating this pointless noise & inconsistency.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-04-04 8:36 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-03 17:18 [PATCH tip/perf] uprobes: avoid false lockdep splat in uprobe timer callback Andrii Nakryiko
2025-04-03 17:49 ` Sebastian Andrzej Siewior
2025-04-03 17:53 ` Steven Rostedt
2025-04-03 17:56 ` Sebastian Andrzej Siewior
2025-04-03 18:53 ` Steven Rostedt
2025-04-03 18:30 ` Andrii Nakryiko
2025-04-04 8:36 ` Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).