Linux Trace Kernel
 help / color / mirror / Atom feed
* [PATCH] rethook: Use tsk->on_cpu to check task execution state
@ 2026-05-25 13:22 Tengda Wu
  2026-05-26  3:37 ` Masami Hiramatsu
  0 siblings, 1 reply; 12+ messages in thread
From: Tengda Wu @ 2026-05-25 13:22 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu
  Cc: Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel,
	linux-kernel, Tengda Wu

When a task calls schedule() to yield the CPU, its state remains
TASK_RUNNING, but its stack is frozen and safe to walk.

Replace task_is_running(tsk) with tsk->on_cpu to avoid overly
conservative rejections.

Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
---
 kernel/trace/rethook.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index 5a8bdf88999a..bd5e5f455e85 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
 	if (WARN_ON_ONCE(!cur))
 		return 0;
 
-	if (tsk != current && task_is_running(tsk))
+	if (tsk != current && tsk->on_cpu)
 		return 0;
 
 	do {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-05-25 13:22 [PATCH] rethook: Use tsk->on_cpu to check task execution state Tengda Wu
@ 2026-05-26  3:37 ` Masami Hiramatsu
  2026-05-29  3:39   ` Tengda Wu
  0 siblings, 1 reply; 12+ messages in thread
From: Masami Hiramatsu @ 2026-05-26  3:37 UTC (permalink / raw)
  To: Tengda Wu
  Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel

On Mon, 25 May 2026 21:22:53 +0800
Tengda Wu <wutengda@huaweicloud.com> wrote:

> When a task calls schedule() to yield the CPU, its state remains
> TASK_RUNNING, but its stack is frozen and safe to walk.
> 
> Replace task_is_running(tsk) with tsk->on_cpu to avoid overly
> conservative rejections.

Please see the Sashiko's comment.

https://sashiko.dev/#/patchset/20260525132253.1889726-1-wutengda%40huaweicloud.com

When calling Unwind on a task other than the current, IMHO, it is
the responsibility of the caller of this function to ensure that the
stack trace of that task is safe.
We also should not use tsk->on_cpu, but should use task_on_cpu(tsk).

BTW, should task_on_cpu() use READ_ONCE() etc?
wait_task_inactive() seems a bit fragile.

Thanks,

> 
> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
> ---
>  kernel/trace/rethook.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
> index 5a8bdf88999a..bd5e5f455e85 100644
> --- a/kernel/trace/rethook.c
> +++ b/kernel/trace/rethook.c
> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
>  	if (WARN_ON_ONCE(!cur))
>  		return 0;
>  
> -	if (tsk != current && task_is_running(tsk))
> +	if (tsk != current && tsk->on_cpu)
>  		return 0;
>  
>  	do {
> -- 
> 2.34.1
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-05-26  3:37 ` Masami Hiramatsu
@ 2026-05-29  3:39   ` Tengda Wu
  2026-05-31 23:40     ` Masami Hiramatsu
  0 siblings, 1 reply; 12+ messages in thread
From: Tengda Wu @ 2026-05-29  3:39 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel

Hi Masami,

thanks for the review and feedback.

On 2026/5/26 11:37, Masami Hiramatsu wrote:
> On Mon, 25 May 2026 21:22:53 +0800
> Tengda Wu <wutengda@huaweicloud.com> wrote:
> 
>> When a task calls schedule() to yield the CPU, its state remains
>> TASK_RUNNING, but its stack is frozen and safe to walk.
>>
>> Replace task_is_running(tsk) with tsk->on_cpu to avoid overly
>> conservative rejections.
> 
> Please see the Sashiko's comment.
> 
> https://sashiko.dev/#/patchset/20260525132253.1889726-1-wutengda%40huaweicloud.com
> 
> When calling Unwind on a task other than the current, IMHO, it is
> the responsibility of the caller of this function to ensure that the
> stack trace of that task is safe.

Agree.

> We also should not use tsk->on_cpu, but should use task_on_cpu(tsk).
> 
> BTW, should task_on_cpu() use READ_ONCE() etc?
> wait_task_inactive() seems a bit fragile.
> 
> Thanks,
> 

It seems that using task_on_cpu() is not necessary here because:

1. It requires an additional 'rq' parameter not available in the rethook context.
2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu.

/* file: kernel/sched/sched.h */
static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
{
	return p->on_cpu;
}

Given these constraints, staying with tsk->on_cpu seems more straightforward
for the rethook context.

Thanks,
Tengda

>>
>> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
>> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
>> ---
>>  kernel/trace/rethook.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
>> index 5a8bdf88999a..bd5e5f455e85 100644
>> --- a/kernel/trace/rethook.c
>> +++ b/kernel/trace/rethook.c
>> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
>>  	if (WARN_ON_ONCE(!cur))
>>  		return 0;
>>  
>> -	if (tsk != current && task_is_running(tsk))
>> +	if (tsk != current && tsk->on_cpu)
>>  		return 0;
>>  
>>  	do {
>> -- 
>> 2.34.1
>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-05-29  3:39   ` Tengda Wu
@ 2026-05-31 23:40     ` Masami Hiramatsu
  2026-06-01  0:58       ` Tengda Wu
  2026-06-04  9:34       ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2026-05-31 23:40 UTC (permalink / raw)
  To: Tengda Wu, Peter Zijlstra
  Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel

On Fri, 29 May 2026 11:39:49 +0800
Tengda Wu <wutengda@huaweicloud.com> wrote:

> > We also should not use tsk->on_cpu, but should use task_on_cpu(tsk).
> > 
> > BTW, should task_on_cpu() use READ_ONCE() etc?
> > wait_task_inactive() seems a bit fragile.
> > 
> > Thanks,
> > 
> 
> It seems that using task_on_cpu() is not necessary here because:
> 
> 1. It requires an additional 'rq' parameter not available in the rethook context.
> 2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu.

OK. We may need to cleanup task_on_cpu() to remove unused rq, which
was historically introduced by commit 029632fbb7b7 ("sched: Make
separate sched*.c translation units") as a parameter for "task_running()"
because it called task_current(), but was cleaned by commit 0b9d46fc5ef7
("sched: Rename task_running() to task_on_cpu()") and now it is not
used.

> 
> /* file: kernel/sched/sched.h */
> static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
> {
> 	return p->on_cpu;
> }
> 
> Given these constraints, staying with tsk->on_cpu seems more straightforward
> for the rethook context.

The reason I asked you to use a wrapper function instead of directly
accessing that on_cpu is to ensure that this part isn't overlooked when
changing the scheduler's behavior in the future.

They may be equivalent at this point in time, but that doesn't necessarily
mean they will remain so in the future.

IMHO, an API interface should represent the behavior and resources
required for that operation, and when accessing it from outside the
subsystem, we should use that API whenever possible. Otherwise, even
if we want to update the internal implementation of one subsystem,
we will have to make changes to other subsystems as well.

Peter, is it OK to drop @rq from task_on_cpu()? Then we can use
it from rethook.

Thank you,

> 
> Thanks,
> Tengda
> 
> >>
> >> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
> >> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
> >> ---
> >>  kernel/trace/rethook.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
> >> index 5a8bdf88999a..bd5e5f455e85 100644
> >> --- a/kernel/trace/rethook.c
> >> +++ b/kernel/trace/rethook.c
> >> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
> >>  	if (WARN_ON_ONCE(!cur))
> >>  		return 0;
> >>  
> >> -	if (tsk != current && task_is_running(tsk))
> >> +	if (tsk != current && tsk->on_cpu)
> >>  		return 0;
> >>  
> >>  	do {
> >> -- 
> >> 2.34.1
> >>
> >>
> > 
> > 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-05-31 23:40     ` Masami Hiramatsu
@ 2026-06-01  0:58       ` Tengda Wu
  2026-06-04  9:34       ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Tengda Wu @ 2026-06-01  0:58 UTC (permalink / raw)
  To: Masami Hiramatsu, Peter Zijlstra
  Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel



On 2026/6/1 7:40, Masami Hiramatsu wrote:
> On Fri, 29 May 2026 11:39:49 +0800
> Tengda Wu <wutengda@huaweicloud.com> wrote:
> 
>>> We also should not use tsk->on_cpu, but should use task_on_cpu(tsk).
>>>
>>> BTW, should task_on_cpu() use READ_ONCE() etc?
>>> wait_task_inactive() seems a bit fragile.
>>>
>>> Thanks,
>>>
>>
>> It seems that using task_on_cpu() is not necessary here because:
>>
>> 1. It requires an additional 'rq' parameter not available in the rethook context.
>> 2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu.
> 
> OK. We may need to cleanup task_on_cpu() to remove unused rq, which
> was historically introduced by commit 029632fbb7b7 ("sched: Make
> separate sched*.c translation units") as a parameter for "task_running()"
> because it called task_current(), but was cleaned by commit 0b9d46fc5ef7
> ("sched: Rename task_running() to task_on_cpu()") and now it is not
> used.
> 
>>
>> /* file: kernel/sched/sched.h */
>> static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
>> {
>> 	return p->on_cpu;
>> }
>>
>> Given these constraints, staying with tsk->on_cpu seems more straightforward
>> for the rethook context.
> 
> The reason I asked you to use a wrapper function instead of directly
> accessing that on_cpu is to ensure that this part isn't overlooked when
> changing the scheduler's behavior in the future.
> 
> They may be equivalent at this point in time, but that doesn't necessarily
> mean they will remain so in the future.
> 
> IMHO, an API interface should represent the behavior and resources
> required for that operation, and when accessing it from outside the
> subsystem, we should use that API whenever possible. Otherwise, even
> if we want to update the internal implementation of one subsystem,
> we will have to make changes to other subsystems as well.
> 
> Peter, is it OK to drop @rq from task_on_cpu()? Then we can use
> it from rethook.
> 
> Thank you,
> 

Thank you for the detailed explanation -- that makes perfect sense.

I looked into task_on_cpu() and saw it has many callers.
Once it's confirmed that the parameter can be safely dropped, I'll
follow up with my patch to use the wrapper instead of directly
accessing on_cpu.

Best,
Tengda

>>>>
>>>> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
>>>> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
>>>> ---
>>>>  kernel/trace/rethook.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
>>>> index 5a8bdf88999a..bd5e5f455e85 100644
>>>> --- a/kernel/trace/rethook.c
>>>> +++ b/kernel/trace/rethook.c
>>>> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
>>>>  	if (WARN_ON_ONCE(!cur))
>>>>  		return 0;
>>>>  
>>>> -	if (tsk != current && task_is_running(tsk))
>>>> +	if (tsk != current && tsk->on_cpu)
>>>>  		return 0;
>>>>  
>>>>  	do {
>>>> -- 
>>>> 2.34.1
>>>>
>>>>
>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-05-31 23:40     ` Masami Hiramatsu
  2026-06-01  0:58       ` Tengda Wu
@ 2026-06-04  9:34       ` Peter Zijlstra
  2026-06-05 13:43         ` Masami Hiramatsu
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-04  9:34 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Tengda Wu, Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel

On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:

> Peter, is it OK to drop @rq from task_on_cpu()? 

Sure.

> Then we can use it from rethook.

Well, it is in sched/sched.h, which is an internal header, and no you
cannot use that header in rethook.

But lets step back first, what is the actual problem here, why are we
looking at ->on_cpu at all?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-04  9:34       ` Peter Zijlstra
@ 2026-06-05 13:43         ` Masami Hiramatsu
  2026-06-08  1:52           ` Tengda Wu
  0 siblings, 1 reply; 12+ messages in thread
From: Masami Hiramatsu @ 2026-06-05 13:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tengda Wu, Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel

On Thu, 4 Jun 2026 11:34:45 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:
> 
> > Peter, is it OK to drop @rq from task_on_cpu()? 
> 
> Sure.
> 
> > Then we can use it from rethook.
> 
> Well, it is in sched/sched.h, which is an internal header, and no you
> cannot use that header in rethook.

Ah, OK. Hmm, then we should not use it. Maybe ->on_cpu is also internal
state?

> 
> But lets step back first, what is the actual problem here, why are we
> looking at ->on_cpu at all?

Tengda, can you explain it?
I think you want to take a stacktrace on !current process, and
rethook_find_ret_addr() is rejected i the task is running state.

But if you can share actual situation what you need, it is
helpful for us to understand.

Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-05 13:43         ` Masami Hiramatsu
@ 2026-06-08  1:52           ` Tengda Wu
  2026-06-08  2:56             ` Masami Hiramatsu
  2026-06-08  9:34             ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Tengda Wu @ 2026-06-08  1:52 UTC (permalink / raw)
  To: Masami Hiramatsu, Peter Zijlstra
  Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov,
	linux-trace-kernel, linux-kernel



On 2026/6/5 21:43, Masami Hiramatsu wrote:
> On Thu, 4 Jun 2026 11:34:45 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
>> On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:
>>
>>> Peter, is it OK to drop @rq from task_on_cpu()? 
>>
>> Sure.
>>
>>> Then we can use it from rethook.
>>
>> Well, it is in sched/sched.h, which is an internal header, and no you
>> cannot use that header in rethook.
> 
> Ah, OK. Hmm, then we should not use it. Maybe ->on_cpu is also internal
> state?
> 
>>
>> But lets step back first, what is the actual problem here, why are we
>> looking at ->on_cpu at all?
> 
> Tengda, can you explain it?
> I think you want to take a stacktrace on !current process, and
> rethook_find_ret_addr() is rejected i the task is running state.
> 
> But if you can share actual situation what you need, it is
> helpful for us to understand.
> 
> Thank you,
> 
> 


Sure.

Background: We are verifying the support of live patches for functions that
have a kretprobe. The specific verification method is as follows:

We construct a function foo() that calls bar():

void bar(void)
{
    for (;;) {
        schedule();
    }
}

void foo(void)
{
    bar();
}

A kretprobe is attached to bar():

echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable

Then foo() is triggered. The expected behavior is that bar() will call
schedule() and yield the CPU.

After that, the live patch is activated to attempt replacing the implementation
of foo(). The expectation is that this should succeed.

However, in reality, because the task that called schedule() is still in the
RUNNING state, the condition task_is_running(tsk) inside rethook_find_ret_addr()
is not satisfied, causing the function to return early. This, in turn,
prevents stack_trace_save_tsk_reliable() from determining the stack as
reliable, leading to a failure in activating the live patch.

**Not sure if this is correct:**

We believe that after a task voluntarily calls schedule(), when the stack
is expected to be reliable, it is a safe time to activate a live patch.
Additionally, a similar tsk->on_cpu check can be found elsewhere in the
kernel (See task_on_another_cpu() in arch/x86/include/asm/unwind.h).
Therefore, we propose changing the task_is_running(tsk) condition to
tsk->on_cpu.

Thanks,
Tengda


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-08  1:52           ` Tengda Wu
@ 2026-06-08  2:56             ` Masami Hiramatsu
  2026-06-08  8:31               ` Tengda Wu
  2026-06-08  9:34             ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Masami Hiramatsu @ 2026-06-08  2:56 UTC (permalink / raw)
  To: Tengda Wu, Josh Poimboeuf
  Cc: Peter Zijlstra, Steven Rostedt, Mathieu Desnoyers,
	Alexei Starovoitov, linux-trace-kernel, linux-kernel

On Mon, 8 Jun 2026 09:52:37 +0800
Tengda Wu <wutengda@huaweicloud.com> wrote:

> 
> 
> On 2026/6/5 21:43, Masami Hiramatsu wrote:
> > On Thu, 4 Jun 2026 11:34:45 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> >> On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:
> >>
> >>> Peter, is it OK to drop @rq from task_on_cpu()? 
> >>
> >> Sure.
> >>
> >>> Then we can use it from rethook.
> >>
> >> Well, it is in sched/sched.h, which is an internal header, and no you
> >> cannot use that header in rethook.
> > 
> > Ah, OK. Hmm, then we should not use it. Maybe ->on_cpu is also internal
> > state?
> > 
> >>
> >> But lets step back first, what is the actual problem here, why are we
> >> looking at ->on_cpu at all?
> > 
> > Tengda, can you explain it?
> > I think you want to take a stacktrace on !current process, and
> > rethook_find_ret_addr() is rejected i the task is running state.
> > 
> > But if you can share actual situation what you need, it is
> > helpful for us to understand.
> > 
> > Thank you,
> > 
> > 
> 
> 
> Sure.
> 
> Background: We are verifying the support of live patches for functions that
> have a kretprobe. The specific verification method is as follows:
> 
> We construct a function foo() that calls bar():
> 
> void bar(void)
> {
>     for (;;) {
>         schedule();
>     }
> }
> 
> void foo(void)
> {
>     bar();
> }
> 
> A kretprobe is attached to bar():
> 
> echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
> echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable
> 
> Then foo() is triggered. The expected behavior is that bar() will call
> schedule() and yield the CPU.
> 
> After that, the live patch is activated to attempt replacing the implementation
> of foo(). The expectation is that this should succeed.
> 
> However, in reality, because the task that called schedule() is still in the
> RUNNING state, the condition task_is_running(tsk) inside rethook_find_ret_addr()
> is not satisfied, causing the function to return early. This, in turn,
> prevents stack_trace_save_tsk_reliable() from determining the stack as
> reliable, leading to a failure in activating the live patch.

Hmm is the bar() doing infinite loop, or limited loop but take a long time
so just yield a while? Anyway, it seems like a non-good design pattern.
Is it possible to avoid busy loops and instead use Workers, or wait for
something to complete or for input within a loop?

> 
> **Not sure if this is correct:**
> 
> We believe that after a task voluntarily calls schedule(), when the stack
> is expected to be reliable, it is a safe time to activate a live patch.

In this case, I don't know how to block the loop inside the bar.
Even if !tsk->on_cpu, the tsk can restart running right after checking
the flag.

> Additionally, a similar tsk->on_cpu check can be found elsewhere in the
> kernel (See task_on_another_cpu() in arch/x86/include/asm/unwind.h).
> Therefore, we propose changing the task_is_running(tsk) condition to
> tsk->on_cpu.

Yes, but the caller said there is another check to ensure the race.

        /*                  
         * Refuse to unwind the stack of a task while it's executing on another
         * CPU.  This check is racy, but that's ok: the unwinder has other
         * checks to prevent it from going off the rails.
         */
        if (task_on_another_cpu(task))
                goto err;

Josh, do you know how this avoid the race case?

Thank you,

> 
> Thanks,
> Tengda
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-08  2:56             ` Masami Hiramatsu
@ 2026-06-08  8:31               ` Tengda Wu
  0 siblings, 0 replies; 12+ messages in thread
From: Tengda Wu @ 2026-06-08  8:31 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Josh Poimboeuf, Steven Rostedt, Mathieu Desnoyers,
	Alexei Starovoitov, linux-trace-kernel, linux-kernel



On 2026/6/8 10:56, Masami Hiramatsu wrote:
> On Mon, 8 Jun 2026 09:52:37 +0800
> Tengda Wu <wutengda@huaweicloud.com> wrote:
> 
>>
>>
>> On 2026/6/5 21:43, Masami Hiramatsu wrote:
>>> On Thu, 4 Jun 2026 11:34:45 +0200
>>> Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>>> On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:
>>>>
>>>>> Peter, is it OK to drop @rq from task_on_cpu()? 
>>>>
>>>> Sure.
>>>>
>>>>> Then we can use it from rethook.
>>>>
>>>> Well, it is in sched/sched.h, which is an internal header, and no you
>>>> cannot use that header in rethook.
>>>
>>> Ah, OK. Hmm, then we should not use it. Maybe ->on_cpu is also internal
>>> state?
>>>
>>>>
>>>> But lets step back first, what is the actual problem here, why are we
>>>> looking at ->on_cpu at all?
>>>
>>> Tengda, can you explain it?
>>> I think you want to take a stacktrace on !current process, and
>>> rethook_find_ret_addr() is rejected i the task is running state.
>>>
>>> But if you can share actual situation what you need, it is
>>> helpful for us to understand.
>>>
>>> Thank you,
>>>
>>>
>>
>>
>> Sure.
>>
>> Background: We are verifying the support of live patches for functions that
>> have a kretprobe. The specific verification method is as follows:
>>
>> We construct a function foo() that calls bar():
>>
>> void bar(void)
>> {
>>     for (;;) {
>>         schedule();
>>     }
>> }
>>
>> void foo(void)
>> {
>>     bar();
>> }
>>
>> A kretprobe is attached to bar():
>>
>> echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
>> echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable
>>
>> Then foo() is triggered. The expected behavior is that bar() will call
>> schedule() and yield the CPU.
>>
>> After that, the live patch is activated to attempt replacing the implementation
>> of foo(). The expectation is that this should succeed.
>>
>> However, in reality, because the task that called schedule() is still in the
>> RUNNING state, the condition task_is_running(tsk) inside rethook_find_ret_addr()
>> is not satisfied, causing the function to return early. This, in turn,
>> prevents stack_trace_save_tsk_reliable() from determining the stack as
>> reliable, leading to a failure in activating the live patch.
> 
> Hmm is the bar() doing infinite loop, or limited loop but take a long time
> so just yield a while? Anyway, it seems like a non-good design pattern.
> Is it possible to avoid busy loops and instead use Workers, or wait for
> something to complete or for input within a loop?
> 
>>
>> **Not sure if this is correct:**
>>
>> We believe that after a task voluntarily calls schedule(), when the stack
>> is expected to be reliable, it is a safe time to activate a live patch.
> 
> In this case, I don't know how to block the loop inside the bar.
> Even if !tsk->on_cpu, the tsk can restart running right after checking
> the flag.
> 

The infinite loop in bar() is indeed a poor design pattern. This test
case is only artificial, not from real-world code. It is merely
intended to verify live patch support for various cases.

However, the point you raised has indeed made me think. I realize that
checking only tsk->on_cpu is not sufficient -- there is also a race
condition where the task could be scheduled back onto a CPU right after
the check. I need to re-examine the validity of this test case and
whether it represents a safe live patch activation scenario.

Thank you again for your patience and for pointing out these
fundamental issues. Your guidance is much appreciated.

Best regards,
Tengda


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-08  1:52           ` Tengda Wu
  2026-06-08  2:56             ` Masami Hiramatsu
@ 2026-06-08  9:34             ` Peter Zijlstra
  2026-06-08 10:23               ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-08  9:34 UTC (permalink / raw)
  To: Tengda Wu
  Cc: Masami Hiramatsu, Steven Rostedt, Mathieu Desnoyers,
	Alexei Starovoitov, linux-trace-kernel, linux-kernel,
	Josh Poimboeuf, jikos, mbenes, pmladek


+Live patching folks

On Mon, Jun 08, 2026 at 09:52:37AM +0800, Tengda Wu wrote:

> Background: We are verifying the support of live patches for functions that
> have a kretprobe. The specific verification method is as follows:
> 
> We construct a function foo() that calls bar():
> 
> void bar(void)
> {
>     for (;;) {
>         schedule();
>     }
> }
> 
> void foo(void)
> {
>     bar();
> }
> 
> A kretprobe is attached to bar():
> 
> echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
> echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable
> 
> Then foo() is triggered. The expected behavior is that bar() will call
> schedule() and yield the CPU.
> 
> After that, the live patch is activated to attempt replacing the implementation
> of foo(). The expectation is that this should succeed.

This wholly depends on how foo() calls bar(), if it is a normal call,
then no, it should not succeed, because foo() is still on the stack.

If it is a tail-call, then yes, because foo() is no longer relevant.

> However, in reality, because the task that called schedule() is still in the
> RUNNING state,

So calling schedule() without setting state is dodgy in the first place.
Who is doing this? All wait primitives will set this to
TASK_UNINTERRUPTIBLE or something along those lines.

> the condition task_is_running(tsk) inside rethook_find_ret_addr()
> is not satisfied, causing the function to return early. This, in turn,
> prevents stack_trace_save_tsk_reliable() from determining the stack as
> reliable, leading to a failure in activating the live patch.
> 
> **Not sure if this is correct:**
> 
> We believe that after a task voluntarily calls schedule(), when the stack
> is expected to be reliable, it is a safe time to activate a live patch.

Calling schedule() without setting state is a no-op and really shouldn't
count much at all.

> Additionally, a similar tsk->on_cpu check can be found elsewhere in the
> kernel (See task_on_another_cpu() in arch/x86/include/asm/unwind.h).
> Therefore, we propose changing the task_is_running(tsk) condition to
> tsk->on_cpu.

Anyway, I'm wondering what the purpose of this check here is, there is
no real comment, and commit 5120d167e21c ("rethook: Remove warning
messages printed for finding return address of a frame.") is just pure
voodoo as well.

Also, note the comment that goes with the usage of
task_on_another_cpu(); that thing is racy as all heck.

So it really comes down to what the purpose of this check is.

I suspect the issue at hand is that tsk->rethook elements, such as
iterated by __rethook_find_ret_addr() are not safe to be accessed for a
running task.

Notably while rethook_recycle() has some RCU thing on, that objpool
thing (and the recycle name itself) seems to strongly suggest iterating
these things is not sound (you could start with things from this task,
hit a recycled entry and continue iterating rethooks from another task).

Also note that the current check is also racy, nothing really prevents a
wakeup from happening right after you observe task_is_running() being
false. The task can then get scheduled in on another CPU and tear down
its rethooks concurrent with __rethook_find_ret_addr().


Now, livepatch itself calls unwind from a proper context, but unwinds in
general are not. This rethook stuff doesn't seem to be sound in general.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state
  2026-06-08  9:34             ` Peter Zijlstra
@ 2026-06-08 10:23               ` Peter Zijlstra
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-08 10:23 UTC (permalink / raw)
  To: Tengda Wu
  Cc: Masami Hiramatsu, Steven Rostedt, Mathieu Desnoyers,
	Alexei Starovoitov, linux-trace-kernel, linux-kernel,
	Josh Poimboeuf, jikos, mbenes, pmladek

On Mon, Jun 08, 2026 at 11:34:49AM +0200, Peter Zijlstra wrote:
> 
> +Live patching folks
> 
> On Mon, Jun 08, 2026 at 09:52:37AM +0800, Tengda Wu wrote:
> 
> > Background: We are verifying the support of live patches for functions that
> > have a kretprobe. The specific verification method is as follows:
> > 
> > We construct a function foo() that calls bar():
> > 
> > void bar(void)
> > {
> >     for (;;) {
> >         schedule();
> >     }
> > }
> > 
> > void foo(void)
> > {
> >     bar();
> > }
> > 
> > A kretprobe is attached to bar():
> > 
> > echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
> > echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable
> > 
> > Then foo() is triggered. The expected behavior is that bar() will call
> > schedule() and yield the CPU.
> > 
> > After that, the live patch is activated to attempt replacing the implementation
> > of foo(). The expectation is that this should succeed.
> 
> This wholly depends on how foo() calls bar(), if it is a normal call,
> then no, it should not succeed, because foo() is still on the stack.
> 
> If it is a tail-call, then yes, because foo() is no longer relevant.
> 
> > However, in reality, because the task that called schedule() is still in the
> > RUNNING state,
> 
> So calling schedule() without setting state is dodgy in the first place.
> Who is doing this? All wait primitives will set this to
> TASK_UNINTERRUPTIBLE or something along those lines.
> 
> > the condition task_is_running(tsk) inside rethook_find_ret_addr()
> > is not satisfied, causing the function to return early. This, in turn,
> > prevents stack_trace_save_tsk_reliable() from determining the stack as
> > reliable, leading to a failure in activating the live patch.
> > 
> > **Not sure if this is correct:**
> > 
> > We believe that after a task voluntarily calls schedule(), when the stack
> > is expected to be reliable, it is a safe time to activate a live patch.
> 
> Calling schedule() without setting state is a no-op and really shouldn't
> count much at all.
> 
> > Additionally, a similar tsk->on_cpu check can be found elsewhere in the
> > kernel (See task_on_another_cpu() in arch/x86/include/asm/unwind.h).
> > Therefore, we propose changing the task_is_running(tsk) condition to
> > tsk->on_cpu.
> 
> Anyway, I'm wondering what the purpose of this check here is, there is
> no real comment, and commit 5120d167e21c ("rethook: Remove warning
> messages printed for finding return address of a frame.") is just pure
> voodoo as well.

FWIW, you should have had this discussion then.

> Also, note the comment that goes with the usage of
> task_on_another_cpu(); that thing is racy as all heck.
> 
> So it really comes down to what the purpose of this check is.
> 
> I suspect the issue at hand is that tsk->rethook elements, such as
> iterated by __rethook_find_ret_addr() are not safe to be accessed for a
> running task.
> 
> Notably while rethook_recycle() has some RCU thing on, that objpool
> thing (and the recycle name itself) seems to strongly suggest iterating
> these things is not sound (you could start with things from this task,
> hit a recycled entry and continue iterating rethooks from another task).
> 
> Also note that the current check is also racy, nothing really prevents a
> wakeup from happening right after you observe task_is_running() being
> false. The task can then get scheduled in on another CPU and tear down
> its rethooks concurrent with __rethook_find_ret_addr().
> 
> 
> Now, livepatch itself calls unwind from a proper context, but unwinds in
> general are not. This rethook stuff doesn't seem to be sound in general.

I suspect just entirely removing the check is the sanest option at this
point. Callers that do it right (livepatch) are guaranteed consistent
data, and the rest gets whatever pieces.

Notably, unwind_next() holds rcu, so the iteration is protected from any
of those rethook_node things getting freed. Its just that the iteration
can go sideways and you might not get a sane answer.

The very worst possible option is getting stuck in an infinite loop when
concurrent with agressive rethook re-use or something daft like that,
but that seems extremely unlikely.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-06-08 10:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-25 13:22 [PATCH] rethook: Use tsk->on_cpu to check task execution state Tengda Wu
2026-05-26  3:37 ` Masami Hiramatsu
2026-05-29  3:39   ` Tengda Wu
2026-05-31 23:40     ` Masami Hiramatsu
2026-06-01  0:58       ` Tengda Wu
2026-06-04  9:34       ` Peter Zijlstra
2026-06-05 13:43         ` Masami Hiramatsu
2026-06-08  1:52           ` Tengda Wu
2026-06-08  2:56             ` Masami Hiramatsu
2026-06-08  8:31               ` Tengda Wu
2026-06-08  9:34             ` Peter Zijlstra
2026-06-08 10:23               ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox