* [PATCH] rethook: Use tsk->on_cpu to check task execution state
@ 2026-05-25 13:22 Tengda Wu
2026-05-26 3:37 ` Masami Hiramatsu
0 siblings, 1 reply; 5+ messages in thread
From: Tengda Wu @ 2026-05-25 13:22 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu
Cc: Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel,
linux-kernel, Tengda Wu
When a task calls schedule() to yield the CPU, its state remains
TASK_RUNNING, but its stack is frozen and safe to walk.
Replace task_is_running(tsk) with tsk->on_cpu to avoid overly
conservative rejections.
Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
---
kernel/trace/rethook.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index 5a8bdf88999a..bd5e5f455e85 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
if (WARN_ON_ONCE(!cur))
return 0;
- if (tsk != current && task_is_running(tsk))
+ if (tsk != current && tsk->on_cpu)
return 0;
do {
--
2.34.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state 2026-05-25 13:22 [PATCH] rethook: Use tsk->on_cpu to check task execution state Tengda Wu @ 2026-05-26 3:37 ` Masami Hiramatsu 2026-05-29 3:39 ` Tengda Wu 0 siblings, 1 reply; 5+ messages in thread From: Masami Hiramatsu @ 2026-05-26 3:37 UTC (permalink / raw) To: Tengda Wu Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel, linux-kernel On Mon, 25 May 2026 21:22:53 +0800 Tengda Wu <wutengda@huaweicloud.com> wrote: > When a task calls schedule() to yield the CPU, its state remains > TASK_RUNNING, but its stack is frozen and safe to walk. > > Replace task_is_running(tsk) with tsk->on_cpu to avoid overly > conservative rejections. Please see the Sashiko's comment. https://sashiko.dev/#/patchset/20260525132253.1889726-1-wutengda%40huaweicloud.com When calling Unwind on a task other than the current, IMHO, it is the responsibility of the caller of this function to ensure that the stack trace of that task is safe. We also should not use tsk->on_cpu, but should use task_on_cpu(tsk). BTW, should task_on_cpu() use READ_ONCE() etc? wait_task_inactive() seems a bit fragile. Thanks, > > Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook") > Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> > --- > kernel/trace/rethook.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c > index 5a8bdf88999a..bd5e5f455e85 100644 > --- a/kernel/trace/rethook.c > +++ b/kernel/trace/rethook.c > @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame > if (WARN_ON_ONCE(!cur)) > return 0; > > - if (tsk != current && task_is_running(tsk)) > + if (tsk != current && tsk->on_cpu) > return 0; > > do { > -- > 2.34.1 > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state 2026-05-26 3:37 ` Masami Hiramatsu @ 2026-05-29 3:39 ` Tengda Wu 2026-05-31 23:40 ` Masami Hiramatsu 0 siblings, 1 reply; 5+ messages in thread From: Tengda Wu @ 2026-05-29 3:39 UTC (permalink / raw) To: Masami Hiramatsu Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel, linux-kernel Hi Masami, thanks for the review and feedback. On 2026/5/26 11:37, Masami Hiramatsu wrote: > On Mon, 25 May 2026 21:22:53 +0800 > Tengda Wu <wutengda@huaweicloud.com> wrote: > >> When a task calls schedule() to yield the CPU, its state remains >> TASK_RUNNING, but its stack is frozen and safe to walk. >> >> Replace task_is_running(tsk) with tsk->on_cpu to avoid overly >> conservative rejections. > > Please see the Sashiko's comment. > > https://sashiko.dev/#/patchset/20260525132253.1889726-1-wutengda%40huaweicloud.com > > When calling Unwind on a task other than the current, IMHO, it is > the responsibility of the caller of this function to ensure that the > stack trace of that task is safe. Agree. > We also should not use tsk->on_cpu, but should use task_on_cpu(tsk). > > BTW, should task_on_cpu() use READ_ONCE() etc? > wait_task_inactive() seems a bit fragile. > > Thanks, > It seems that using task_on_cpu() is not necessary here because: 1. It requires an additional 'rq' parameter not available in the rethook context. 2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu. /* file: kernel/sched/sched.h */ static inline int task_on_cpu(struct rq *rq, struct task_struct *p) { return p->on_cpu; } Given these constraints, staying with tsk->on_cpu seems more straightforward for the rethook context. Thanks, Tengda >> >> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook") >> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> >> --- >> kernel/trace/rethook.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c >> index 5a8bdf88999a..bd5e5f455e85 100644 >> --- a/kernel/trace/rethook.c >> +++ b/kernel/trace/rethook.c >> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame >> if (WARN_ON_ONCE(!cur)) >> return 0; >> >> - if (tsk != current && task_is_running(tsk)) >> + if (tsk != current && tsk->on_cpu) >> return 0; >> >> do { >> -- >> 2.34.1 >> >> > > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state 2026-05-29 3:39 ` Tengda Wu @ 2026-05-31 23:40 ` Masami Hiramatsu 2026-06-01 0:58 ` Tengda Wu 0 siblings, 1 reply; 5+ messages in thread From: Masami Hiramatsu @ 2026-05-31 23:40 UTC (permalink / raw) To: Tengda Wu, Peter Zijlstra Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel, linux-kernel On Fri, 29 May 2026 11:39:49 +0800 Tengda Wu <wutengda@huaweicloud.com> wrote: > > We also should not use tsk->on_cpu, but should use task_on_cpu(tsk). > > > > BTW, should task_on_cpu() use READ_ONCE() etc? > > wait_task_inactive() seems a bit fragile. > > > > Thanks, > > > > It seems that using task_on_cpu() is not necessary here because: > > 1. It requires an additional 'rq' parameter not available in the rethook context. > 2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu. OK. We may need to cleanup task_on_cpu() to remove unused rq, which was historically introduced by commit 029632fbb7b7 ("sched: Make separate sched*.c translation units") as a parameter for "task_running()" because it called task_current(), but was cleaned by commit 0b9d46fc5ef7 ("sched: Rename task_running() to task_on_cpu()") and now it is not used. > > /* file: kernel/sched/sched.h */ > static inline int task_on_cpu(struct rq *rq, struct task_struct *p) > { > return p->on_cpu; > } > > Given these constraints, staying with tsk->on_cpu seems more straightforward > for the rethook context. The reason I asked you to use a wrapper function instead of directly accessing that on_cpu is to ensure that this part isn't overlooked when changing the scheduler's behavior in the future. They may be equivalent at this point in time, but that doesn't necessarily mean they will remain so in the future. IMHO, an API interface should represent the behavior and resources required for that operation, and when accessing it from outside the subsystem, we should use that API whenever possible. Otherwise, even if we want to update the internal implementation of one subsystem, we will have to make changes to other subsystems as well. Peter, is it OK to drop @rq from task_on_cpu()? Then we can use it from rethook. Thank you, > > Thanks, > Tengda > > >> > >> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook") > >> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> > >> --- > >> kernel/trace/rethook.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c > >> index 5a8bdf88999a..bd5e5f455e85 100644 > >> --- a/kernel/trace/rethook.c > >> +++ b/kernel/trace/rethook.c > >> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame > >> if (WARN_ON_ONCE(!cur)) > >> return 0; > >> > >> - if (tsk != current && task_is_running(tsk)) > >> + if (tsk != current && tsk->on_cpu) > >> return 0; > >> > >> do { > >> -- > >> 2.34.1 > >> > >> > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state 2026-05-31 23:40 ` Masami Hiramatsu @ 2026-06-01 0:58 ` Tengda Wu 0 siblings, 0 replies; 5+ messages in thread From: Tengda Wu @ 2026-06-01 0:58 UTC (permalink / raw) To: Masami Hiramatsu, Peter Zijlstra Cc: Steven Rostedt, Mathieu Desnoyers, Alexei Starovoitov, linux-trace-kernel, linux-kernel On 2026/6/1 7:40, Masami Hiramatsu wrote: > On Fri, 29 May 2026 11:39:49 +0800 > Tengda Wu <wutengda@huaweicloud.com> wrote: > >>> We also should not use tsk->on_cpu, but should use task_on_cpu(tsk). >>> >>> BTW, should task_on_cpu() use READ_ONCE() etc? >>> wait_task_inactive() seems a bit fragile. >>> >>> Thanks, >>> >> >> It seems that using task_on_cpu() is not necessary here because: >> >> 1. It requires an additional 'rq' parameter not available in the rethook context. >> 2. It just returns p->on_cpu, which is identical to our current use of tsk->on_cpu. > > OK. We may need to cleanup task_on_cpu() to remove unused rq, which > was historically introduced by commit 029632fbb7b7 ("sched: Make > separate sched*.c translation units") as a parameter for "task_running()" > because it called task_current(), but was cleaned by commit 0b9d46fc5ef7 > ("sched: Rename task_running() to task_on_cpu()") and now it is not > used. > >> >> /* file: kernel/sched/sched.h */ >> static inline int task_on_cpu(struct rq *rq, struct task_struct *p) >> { >> return p->on_cpu; >> } >> >> Given these constraints, staying with tsk->on_cpu seems more straightforward >> for the rethook context. > > The reason I asked you to use a wrapper function instead of directly > accessing that on_cpu is to ensure that this part isn't overlooked when > changing the scheduler's behavior in the future. > > They may be equivalent at this point in time, but that doesn't necessarily > mean they will remain so in the future. > > IMHO, an API interface should represent the behavior and resources > required for that operation, and when accessing it from outside the > subsystem, we should use that API whenever possible. Otherwise, even > if we want to update the internal implementation of one subsystem, > we will have to make changes to other subsystems as well. > > Peter, is it OK to drop @rq from task_on_cpu()? Then we can use > it from rethook. > > Thank you, > Thank you for the detailed explanation -- that makes perfect sense. I looked into task_on_cpu() and saw it has many callers. Once it's confirmed that the parameter can be safely dropped, I'll follow up with my patch to use the wrapper instead of directly accessing on_cpu. Best, Tengda >>>> >>>> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook") >>>> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> >>>> --- >>>> kernel/trace/rethook.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c >>>> index 5a8bdf88999a..bd5e5f455e85 100644 >>>> --- a/kernel/trace/rethook.c >>>> +++ b/kernel/trace/rethook.c >>>> @@ -250,7 +250,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame >>>> if (WARN_ON_ONCE(!cur)) >>>> return 0; >>>> >>>> - if (tsk != current && task_is_running(tsk)) >>>> + if (tsk != current && tsk->on_cpu) >>>> return 0; >>>> >>>> do { >>>> -- >>>> 2.34.1 >>>> >>>> >>> >>> >> > > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-01 0:58 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-25 13:22 [PATCH] rethook: Use tsk->on_cpu to check task execution state Tengda Wu 2026-05-26 3:37 ` Masami Hiramatsu 2026-05-29 3:39 ` Tengda Wu 2026-05-31 23:40 ` Masami Hiramatsu 2026-06-01 0:58 ` Tengda Wu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox