From: Frederic Weisbecker <frederic@kernel.org>
To: Adam Li <adamli@os.amperecomputing.com>
Cc: anna-maria@linutronix.de, tglx@linutronix.de, mingo@redhat.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, vschneid@redhat.com,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, cl@linux.com,
linux-kernel@vger.kernel.org, patches@amperecomputing.com,
Christoph Lameter <cl@gentwo.org>
Subject: Re: [PATCH RESEND 1/2] tick/nohz: Fix wrong NOHZ idle CPU state
Date: Thu, 4 Sep 2025 18:05:21 +0200 [thread overview]
Message-ID: <aLm4wRwKBMGkekkT@localhost.localdomain> (raw)
In-Reply-To: <20250821042707.62993-2-adamli@os.amperecomputing.com>
Le Thu, Aug 21, 2025 at 04:27:06AM +0000, Adam Li a écrit :
> NOHZ idle load balance is done among CPUs in nohz.idle_cpus_mask.
> A CPU is added to nohz.idle_cpus_mask in:
> do_idle()
> -> tick_nohz_idle_stop_tick()
> -> nohz_balance_enter_idle()
>
> nohz_balance_enter_idle() is called if:
> 1) tick is stopped (TS_FLAG_STOPPED is set)
> 2) and tick was not already stopped before tick_nohz_idle_stop_tick()
> stops the tick (!was_stopped)
>
> When CONFIG_NO_HZ_FULL is set and the CPU is in the nohz_full list
> then 'was_stopped' may always be true.
> The flag 'TS_FLAG_STOPPED' may be already set in
> tick_nohz_full_stop_tick(). So nohz_balance_enter_idle() has no chance
> to be called.
>
> As a result, CPU will stay in a 'wrong' state:
> 1) tick is stopped (TS_FLAG_STOPPED is set)
> 2) and CPU is not in nohz.idle_cpus_mask
> 3) and CPU stays idle
>
> Neither the periodic nor the NOHZ idle load balancing can move task
> to this CPU. Some CPUs keep idle while others busy.
>
> In nohz_balance_enter_idle(), 'rq->nohz_tick_stopped' is checked to avoid
> duplicated nohz.idle_cpus_mask setting. So for nohz_balance_enter_idle()
> there is no need to check the '!was_stopped' condition.
>
> This patch will add the CPU to nohz.idle_cpus_mask as expected.
>
> Signed-off-by: Adam Li <adamli@os.amperecomputing.com>
> Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org>
> ---
> kernel/time/tick-sched.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c527b421c865..b900a120ab54 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1229,8 +1229,9 @@ void tick_nohz_idle_stop_tick(void)
> ts->idle_sleeps++;
> ts->idle_expires = expires;
>
> - if (!was_stopped && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> - ts->idle_jiffies = ts->last_jiffies;
> + if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> + if (!was_stopped)
> + ts->idle_jiffies = ts->last_jiffies;
> nohz_balance_enter_idle(cpu);
The current state is indeed broken and some people have already tried to fix it.
The thing is nohz_full don't want dynamic isolation because it is deemed to run a
single task. Therefore those tasks must be placed manually in order not to break
isolation guarantees by accident.
In fact nohz_full doesn't make much sense without isolcpus (or isolated cpuset
v2 partitions) and I even intend to make nohz_full depend on domain isolation
in the long term.
Thanks.
> }
> } else {
> --
> 2.34.1
>
--
Frederic Weisbecker
SUSE Labs
next prev parent reply other threads:[~2025-09-04 16:05 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-21 4:27 [PATCH RESEND 0/2] tick/nohz: CPU cannot enter NOHZ idle balance state Adam Li
2025-08-21 4:27 ` [PATCH RESEND 1/2] tick/nohz: Fix wrong NOHZ idle CPU state Adam Li
2025-09-04 16:05 ` Frederic Weisbecker [this message]
2025-09-04 16:10 ` Christoph Lameter (Ampere)
2025-09-05 11:47 ` Frederic Weisbecker
2025-09-08 15:25 ` Christoph Lameter (Ampere)
2026-02-11 23:19 ` Shubhang Kaushik
2025-08-21 4:27 ` [PATCH RESEND 2/2] tick/nohz: Trigger warning when CPU in wrong NOHZ idle state Adam Li
2025-09-03 8:01 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aLm4wRwKBMGkekkT@localhost.localdomain \
--to=frederic@kernel.org \
--cc=adamli@os.amperecomputing.com \
--cc=anna-maria@linutronix.de \
--cc=bsegall@google.com \
--cc=cl@gentwo.org \
--cc=cl@linux.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=patches@amperecomputing.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.