From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Linux PM <linux-pm@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Christian Loehle <christian.loehle@arm.com>
Subject: [PATCH v1] cpuidle: governors: teo: Special-case nohz_full CPUs
Date: Thu, 28 Aug 2025 22:16:20 +0200
Message-ID: <5939372.DvuYhMxLoT@rafael.j.wysocki>
In-Reply-To: <2804546.mvXUDI8C0e@rafael.j.wysocki>
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This change follows an analogous modification of the menu governor [1].
Namely, when the governor runs on a nohz_full CPU and there are no user
space timers in the workload on that CPU, it ends up selecting idle
states with target residency values above TICK_NSEC, or the deepest
enabled idle state in the absence of any of those, all the time due to
a tick_nohz_tick_stopped() check designed for running on CPUs where the
tick is not permanently disabled. On such CPUs, the fact that the tick
has been stopped means that the CPU was expected to be idle sufficiently
long previously, so it is not unreasonable to expect it to be idle
sufficiently long again, but this inference does not apply to nohz_full
CPUs.
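The effect described above can be modeled in user space. What follows is
a simplified sketch under stated assumptions, not the governor code:
struct model_state, old_state_ok() and pick_state() are hypothetical
names, the tick state is passed in as a plain boolean, and the 4 ms tick
period assumes HZ=250.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HZ=250, so one tick period is 4 ms. */
#define MODEL_TICK_NSEC 4000000ULL

/* Hypothetical stand-in for the per-state data the governor looks at. */
struct model_state {
	uint64_t target_residency_ns;
	bool disabled;
};

/* Models the check as it was before this patch. */
static bool old_state_ok(const struct model_state *s, bool tick_stopped)
{
	return !tick_stopped || s->target_residency_ns >= MODEL_TICK_NSEC;
}

/*
 * Starting from the candidate idx, return the first enabled state
 * accepted by the check.  If none is accepted, return the deepest
 * enabled state seen, or idx itself when nothing from idx onward is
 * enabled.
 */
static int pick_state(const struct model_state *states, int count, int idx,
		      bool tick_stopped)
{
	int deepest_enabled = idx;

	for (int i = idx; i < count; i++) {
		if (states[i].disabled)
			continue;

		deepest_enabled = i;
		if (old_state_ok(&states[i], tick_stopped))
			return i;
	}

	return deepest_enabled;
}
```

In this model, a CPU whose tick is always reported as stopped never gets
a state with target residency below MODEL_TICK_NSEC as long as a deeper
enabled one exists, regardless of which candidate the rest of the
governor proposed.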
In some cases, latency in the workload grows undesirably as a result of
selecting overly deep idle states, and the workload may also consume
more energy than necessary if the CPU does not spend enough time in the
selected deep idle state.
Address this by amending the tick_nohz_tick_stopped() check in question
with a tick_nohz_full_cpu() one to avoid effectively ignoring all
shallow idle states on nohz_full CPUs. While doing so introduces a risk
of getting stuck in a shallow idle state for a long time, that only
affects energy efficiency, whereas the current behavior potentially
hurts both energy efficiency and performance, which is arguably the
priority for nohz_full CPUs.
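For illustration, the amended check can be compared with the previous
one in a small user-space model. This is a sketch only: struct model_cpu
and the model_state_ok_*() helpers are hypothetical, the two booleans
stand in for the return values of tick_nohz_tick_stopped() and
tick_nohz_full_cpu(dev->cpu), and the tick period again assumes HZ=250.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HZ=250, so one tick period is 4 ms. */
#define MODEL_TICK_NSEC 4000000ULL

/* Hypothetical snapshot of the inputs the real check reads. */
struct model_cpu {
	bool tick_stopped;	/* models tick_nohz_tick_stopped() */
	bool nohz_full;		/* models tick_nohz_full_cpu(dev->cpu) */
};

/* The check before the change: only the tick state is considered. */
static bool model_state_ok_old(const struct model_cpu *cpu,
			       uint64_t target_residency_ns)
{
	return !cpu->tick_stopped ||
		target_residency_ns >= MODEL_TICK_NSEC;
}

/* The check after the change: nohz_full CPUs bypass the restriction. */
static bool model_state_ok_new(const struct model_cpu *cpu,
			       uint64_t target_residency_ns)
{
	return !cpu->tick_stopped || cpu->nohz_full ||
		target_residency_ns >= MODEL_TICK_NSEC;
}
```

With tick_stopped set, the two helpers differ only when nohz_full is
also set and the target residency is below MODEL_TICK_NSEC, which is the
case the change is meant to cover.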
While at it, add a comment explaining the logic in teo_state_ok().
Link: https://lore.kernel.org/linux-pm/2244365.irdbgypaU6@rafael.j.wysocki/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/cpuidle/governors/teo.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
--- a/drivers/cpuidle/governors/teo.c
+++ b/drivers/cpuidle/governors/teo.c
@@ -227,9 +227,17 @@
cpu_data->total += PULSE;
}
-static bool teo_state_ok(int i, struct cpuidle_driver *drv)
+static bool teo_state_ok(int i, struct cpuidle_driver *drv, struct cpuidle_device *dev)
{
- return !tick_nohz_tick_stopped() ||
+ /*
+ * If the scheduler tick has been stopped already, avoid selecting idle
+ * states with target residency below the tick period length under the
+ * assumption that the CPU is likely to be idle sufficiently long for
+ * the tick to be stopped again (or the tick would not have been
+ * stopped previously in the first place). However, do not do that on
+ * nohz_full CPUs where the above assumption does not hold.
+ */
+ return !tick_nohz_tick_stopped() || tick_nohz_full_cpu(dev->cpu) ||
drv->states[i].target_residency_ns >= TICK_NSEC;
}
@@ -379,7 +387,7 @@
* shallow or disabled, in which case take the
* first enabled state that is deep enough.
*/
- if (teo_state_ok(i, drv) &&
+ if (teo_state_ok(i, drv, dev) &&
!dev->states_usage[i].disable) {
idx = i;
break;
@@ -391,7 +399,7 @@
if (dev->states_usage[i].disable)
continue;
- if (teo_state_ok(i, drv)) {
+ if (teo_state_ok(i, drv, dev)) {
/*
* The current state is deep enough, but still
* there may be a better one.
@@ -460,7 +468,7 @@
*/
if (drv->states[idx].target_residency_ns > duration_ns) {
i = teo_find_shallower_state(drv, dev, idx, duration_ns, false);
- if (teo_state_ok(i, drv))
+ if (teo_state_ok(i, drv, dev))
idx = i;
}