From: Andrea Righi <arighi@nvidia.com>
To: Cheng-Yang Chou <yphbchou0911@gmail.com>
Cc: sched-ext@lists.linux.dev, Tejun Heo <tj@kernel.org>,
David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
rcu@vger.kernel.org, Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
Chia-Ping Tsai <chia7712@gmail.com>
Subject: Re: [PATCH 1/2] sched_ext: Fix exit_cpu accuracy for lockup paths
Date: Tue, 9 Jun 2026 07:10:14 +0200
Message-ID: <aiegNmIT5mZij8KP@gpd4>
In-Reply-To: <20260531152646.1206799-2-yphbchou0911@gmail.com>
Hi Cheng-Yang,
On Sun, May 31, 2026 at 11:25:26PM +0800, Cheng-Yang Chou wrote:
> handle_lockup() uses raw_smp_processor_id() for exit_cpu, which is wrong
> for two paths:
>
> - scx_hardlockup_irq_workfn() has the hung CPU in a local variable but
> irq_work may run elsewhere. Pass the local cpu explicitly.
> - scx_rcu_cpu_stall() records the detector CPU rather than the stalled
> one. Pass -1 for now. The next patch fixes this properly.
>
> Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Small nit below; apart from that, looks good to me.
Reviewed-by: Andrea Righi <arighi@nvidia.com>
> ---
> kernel/sched/ext.c | 12 +++++++-----
> kernel/sched/ext_internal.h | 2 --
> 2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index ffad1a90196f..0c37b5fd58b0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -5205,6 +5205,7 @@ bool scx_allow_ttwu_queue(const struct task_struct *p)
>
> /**
> * handle_lockup - sched_ext common lockup handler
> + * @exit_cpu: CPU to record in exit_info. Pass the stalled/hung CPU, not current.
> * @fmt: format string
> *
> * Called on system stall or lockup condition and initiates abort of sched_ext
> @@ -5214,7 +5215,7 @@ bool scx_allow_ttwu_queue(const struct task_struct *p)
> * resolve the lockup. %false if sched_ext is not enabled or abort was already
> * initiated by someone else.
> */
> -static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
> +static __printf(2, 3) bool handle_lockup(int exit_cpu, const char *fmt, ...)
> {
> struct scx_sched *sch;
> va_list args;
> @@ -5230,7 +5231,7 @@ static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
> case SCX_ENABLING:
> case SCX_ENABLED:
> va_start(args, fmt);
> - ret = scx_verror(sch, fmt, args);
> + ret = scx_vexit(sch, SCX_EXIT_ERROR, 0, exit_cpu, fmt, args);
> va_end(args);
> return ret;
> default:
> @@ -5252,7 +5253,7 @@ static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
> */
> bool scx_rcu_cpu_stall(void)
> {
> - return handle_lockup("RCU CPU stall detected!");
> + return handle_lockup(-1, "RCU CPU stall detected!");
> }
>
> /**
> @@ -5267,7 +5268,8 @@ bool scx_rcu_cpu_stall(void)
> */
> void scx_softlockup(u32 dur_s)
> {
> - if (!handle_lockup("soft lockup - CPU %d stuck for %us", smp_processor_id(), dur_s))
> + if (!handle_lockup(smp_processor_id(), "soft lockup - CPU %d stuck for %us",
> + smp_processor_id(), dur_s))
nit: maybe we can call smp_processor_id() once here and reuse the result, like:
	int cpu = smp_processor_id();

	if (!handle_lockup(cpu, ..., cpu, dur_s))
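To make the nit concrete, here is a minimal userspace sketch of the pattern, not the actual kernel code. The smp_processor_id() stub, the fake_cpu, cpu_reads, recorded_exit_cpu and recorded_msg variables, and the trivial handle_lockup() body are all stand-ins made up for illustration. Only the handle_lockup(int exit_cpu, const char *fmt, ...) signature and the format string come from the patch.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; names are illustrative only. */
static int fake_cpu = 3;            /* pretend we are running on CPU 3 */
static int cpu_reads;               /* how many times the CPU id was sampled */
static int recorded_exit_cpu = -1;  /* what handle_lockup() was told */
static char recorded_msg[128];      /* formatted lockup message */

/* Userspace stub: the real helper reads the current CPU id. */
static int smp_processor_id(void)
{
	cpu_reads++;
	return fake_cpu;
}

/*
 * Same signature as in the patch. The format string is now argument 2 and
 * the variadic arguments start at 3, hence format(printf, 2, 3).
 * The body is a mock that only records its inputs.
 */
static __attribute__((format(printf, 2, 3)))
bool handle_lockup(int exit_cpu, const char *fmt, ...)
{
	va_list args;

	recorded_exit_cpu = exit_cpu;
	va_start(args, fmt);
	vsnprintf(recorded_msg, sizeof(recorded_msg), fmt, args);
	va_end(args);
	return true;
}

/* Suggested shape: sample the CPU once, then pass it in both places. */
static void scx_softlockup(unsigned int dur_s)
{
	int cpu = smp_processor_id();

	if (!handle_lockup(cpu, "soft lockup - CPU %d stuck for %us", cpu, dur_s))
		return;

	printf("%s (exit_cpu=%d)\n", recorded_msg, recorded_exit_cpu);
}
```

With these stubs, calling scx_softlockup(5) from a main() samples the CPU id once, so the recorded exit_cpu and the CPU number in the message come from the same read.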
Thanks,
-Andrea