* Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions
[not found] ` <87zf4yt90t.ffs@tglx>
@ 2026-02-24 21:31 ` Rafael J. Wysocki
2026-02-24 21:55 ` Thomas Gleixner
0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2026-02-24 21:31 UTC (permalink / raw)
To: Christian Loehle, Thomas Gleixner
Cc: LKML, Peter Zijlstra, Frederic Weisbecker, Linux PM
On Tuesday, February 24, 2026 5:13:06 PM CET Thomas Gleixner wrote:
> On Tue, Feb 24 2026 at 09:35, Christian Loehle wrote:
> > On 2/24/26 08:32, Thomas Gleixner wrote:
> >> This happens with both TEO and MENU governors in a VM guest. That's not
> >> only pointless it's also a performance issue as each rearm of the timer
> >> implies a VM exit.
> >
> > This is the (drv->state_count <= 1) case I assume, no governor does anything
> > sensible in that case.
>
> Indeed.
>
> > I was also curious about the performance angle recently FWIW, but didn't
> > hear back:
> > https://lore.kernel.org/all/73439919-e24d-4bd5-a7ed-d7633beb5e4f@arm.com/
>
> Sure, but I can tell you that two VM exits for a 10us idle are really
> harming performance a lot. That's why I noticed.
>
> >> Keep track of the idle time with a moving average and check it for being
> >> larger than TICK_NSEC in can_stop_idle_tick(). That cures this behaviour
> >> while still allowing the system to go into long idle sleeps once the
> >> work load stopped.
> >>
> >> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> >> ---
> >> kernel/time/tick-sched.c | 20 +++++++++++++++++---
> >> kernel/time/tick-sched.h | 9 +++++++++
> >> 2 files changed, 26 insertions(+), 3 deletions(-)
> >
> > Why here and not in cpuidle?
>
> I don't care where it is fixed, that's why I marked it RFC
>
> > We've recently added some code for the single state case to skip
> > governor see
>
> Duh. I just noticed, the VM has no driver, so this will not end up in
> cpuidle_select(). No wonder that changing the governor has no effect :)
>
> I set the governor to haltpoll now, but that does not work either as the
> stupid haltpoll driver is built in and not activated as it requires the
> force parameter unless the KVM hypervisor has KVM_HINTS_REALTIME set.
>
> Brilliant, intuitive and truly user friendly stuff all that.
>
> It's amazing as always that all the "performance experts" who cry murder
> on everything else never noticed this completely nonsensical default
> behaviour.
>
> Force enabling that driver and setting the governor to 'teo' makes it go
> away. 'menu' still sucks pretty much the same way as with none; slightly
> less so, but often enough.
>
> > e5c9ffc6ae1b ("cpuidle: Skip governor when only one idle state is available")
> > where that could also live.
>
> So either ladder or the powernv driver is broken and that gets fixed in
> the cpuidle core. Interesting choice.
>
> But as I explained above adding something to this hack won't help for
> the VM case with no driver active because cpuidle_not_available() is
> true and idle ends up in default_idle_call().
>
> So either the governor/driver muck provides some sensible default
> implementation or this has to go into default_idle_call().
>
> Oh well...

It looks like the issue is caused by the tick_nohz_idle_stop_tick() called right
before invoking default_idle_call().

After the recent changes mentioned above, cpuidle_select() will never stop the
tick when there's only one idle state in the cpuidle driver, so it would be
consistent to make the default case behave analogously. The default idle state
is never a deep one AFAICS.

So maybe something like the below?

---
 kernel/sched/idle.c | 2 --
 1 file changed, 2 deletions(-)

--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -186,8 +186,6 @@ static void cpuidle_idle_call(void)
 	}

 	if (cpuidle_not_available(drv, dev)) {
-		tick_nohz_idle_stop_tick();
-
 		default_idle_call();
 		goto exit_idle;
 	}
* Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions
2026-02-24 21:31 ` [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions Rafael J. Wysocki
@ 2026-02-24 21:55 ` Thomas Gleixner
2026-02-25 12:54 ` Rafael J. Wysocki
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2026-02-24 21:55 UTC (permalink / raw)
To: Rafael J. Wysocki, Christian Loehle
Cc: LKML, Peter Zijlstra, Frederic Weisbecker, Linux PM
On Tue, Feb 24 2026 at 22:31, Rafael J. Wysocki wrote:
> On Tuesday, February 24, 2026 5:13:06 PM CET Thomas Gleixner wrote:
>> So either the governor/driver muck provides some sensible default
>> implementation or this has to go into default_idle_call().
>>
>> Oh well...
>
> It looks like the issue is caused by the tick_nohz_idle_stop_tick() called right
> before invoking default_idle_call().
>
> After the recent changes mentioned above, cpuidle_select() will never stop the
> tick when there's only one idle state in the cpuidle driver, so it would be
> consistent to make the default case behave analogously. The default idle state
> is never a deep one AFAICS.
>
> So maybe something like the below?
>
> ---
> kernel/sched/idle.c | 2 --
> 1 file changed, 2 deletions(-)
>
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -186,8 +186,6 @@ static void cpuidle_idle_call(void)
> }
>
> if (cpuidle_not_available(drv, dev)) {
> - tick_nohz_idle_stop_tick();
> -
> default_idle_call();
> goto exit_idle;
> }

Which prevents VMs or other systems which do not have an idle driver from
stopping the tick at all. That's just obviously wrong, no?

Thanks,

	tglx
* Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions
2026-02-24 21:55 ` Thomas Gleixner
@ 2026-02-25 12:54 ` Rafael J. Wysocki
2026-02-25 13:10 ` Rafael J. Wysocki
0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2026-02-25 12:54 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Rafael J. Wysocki, Christian Loehle, LKML, Peter Zijlstra,
Frederic Weisbecker, Linux PM
On Tue, Feb 24, 2026 at 10:56 PM Thomas Gleixner <tglx@kernel.org> wrote:
>
> On Tue, Feb 24 2026 at 22:31, Rafael J. Wysocki wrote:
> > On Tuesday, February 24, 2026 5:13:06 PM CET Thomas Gleixner wrote:
> >> So either the governor/driver muck provides some sensible default
> >> implementation or this has to go into default_idle_call().
> >>
> >> Oh well...
> >
> > It looks like the issue is caused by the tick_nohz_idle_stop_tick() called right
> > before invoking default_idle_call().
> >
> > After the recent changes mentioned above, cpuidle_select() will never stop the
> > tick when there's only one idle state in the cpuidle driver, so it would be
> > consistent to make the default case behave analogously. The default idle state
> > is never a deep one AFAICS.
> >
> > So maybe something like the below?
> >
> > ---
> > kernel/sched/idle.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -186,8 +186,6 @@ static void cpuidle_idle_call(void)
> > }
> >
> > if (cpuidle_not_available(drv, dev)) {
> > - tick_nohz_idle_stop_tick();
> > -
> > default_idle_call();
> > goto exit_idle;
> > }
>
> Which prevents VMs or other systems which do not have an idle driver from
> stopping the tick at all. That's just obviously wrong, no?

The benefit from stopping the tick in cpuidle is that it doesn't kick
CPUs from idle states unnecessarily, so more energy can be saved (or
even some energy can be saved at all if the idle state target
residency is large enough), but if the idle state in question is
shallow, that's rather not super-useful. And I'd rather not expect
default idle to be a deep idle state because that would obviously hurt
low-latency use cases.

I must be missing something, so what is it?
* Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions
2026-02-25 12:54 ` Rafael J. Wysocki
@ 2026-02-25 13:10 ` Rafael J. Wysocki
2026-02-25 16:00 ` Thomas Gleixner
0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2026-02-25 13:10 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Christian Loehle, LKML, Peter Zijlstra, Frederic Weisbecker,
Linux PM
On Wed, Feb 25, 2026 at 1:54 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 24, 2026 at 10:56 PM Thomas Gleixner <tglx@kernel.org> wrote:
> >
> > On Tue, Feb 24 2026 at 22:31, Rafael J. Wysocki wrote:
> > > On Tuesday, February 24, 2026 5:13:06 PM CET Thomas Gleixner wrote:
> > >> So either the governor/driver muck provides some sensible default
> > >> implementation or this has to go into default_idle_call().
> > >>
> > >> Oh well...
> > >
> > > It looks like the issue is caused by the tick_nohz_idle_stop_tick() called right
> > > before invoking default_idle_call().
> > >
> > > After the recent changes mentioned above, cpuidle_select() will never stop the
> > > tick when there's only one idle state in the cpuidle driver, so it would be
> > > consistent to make the default case behave analogously. The default idle state
> > > is never a deep one AFAICS.
> > >
> > > So maybe something like the below?
> > >
> > > ---
> > > kernel/sched/idle.c | 2 --
> > > 1 file changed, 2 deletions(-)
> > >
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -186,8 +186,6 @@ static void cpuidle_idle_call(void)
> > > }
> > >
> > > if (cpuidle_not_available(drv, dev)) {
> > > - tick_nohz_idle_stop_tick();
> > > -
> > > default_idle_call();
> > > goto exit_idle;
> > > }
> >
> > Which prevents VMs or other systems which do not have an idle driver from
> > stopping the tick at all. That's just obviously wrong, no?
>
> The benefit from stopping the tick in cpuidle is that it doesn't kick
> CPUs from idle states unnecessarily, so more energy can be saved (or
> even some energy can be saved at all if the idle state target
> residency is large enough), but if the idle state in question is
> shallow, that's rather not super-useful. And I'd rather not expect
> default idle to be a deep idle state because that would obviously hurt
> low-latency use cases.
>
> I must be missing something, so what is it?

OK, if I'm not mistaken, the tick in a VM will effectively become a
periodic hrtimer in the host and it would prevent the host cpuidle
from stopping the tick. Fair enough.
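
To put a rough number on this, here is a minimal sketch that is not taken
from the thread or from kernel code. It assumes the guest tick keeps running
at a fixed rate on every idle vCPU and that each expiration has to be
serviced by a host timer. The helper names and the HZ values used with them
are hypothetical.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: upper bound on the number of guest tick
 * expirations per second the host has to service when the guest
 * never stops its tick. guest_hz is the guest's tick rate.
 */
static uint64_t guest_tick_events_per_sec(unsigned int guest_hz,
					  unsigned int idle_vcpus)
{
	return (uint64_t)guest_hz * idle_vcpus;
}

/*
 * Hypothetical helper: longest uninterrupted idle interval a host
 * CPU can get while it backs one vCPU with a running tick.
 * guest_hz must be non-zero.
 */
static uint64_t max_host_idle_ns(unsigned int guest_hz)
{
	return SKETCH_NSEC_PER_SEC / guest_hz;
}
```

Under these assumptions, 8 idle vCPUs with a 250 Hz tick produce up to 2000
timer expirations per second on the host, and no host CPU backing such a vCPU
can stay idle for longer than 4 ms at a time.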
* Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions
2026-02-25 13:10 ` Rafael J. Wysocki
@ 2026-02-25 16:00 ` Thomas Gleixner
0 siblings, 0 replies; 5+ messages in thread
From: Thomas Gleixner @ 2026-02-25 16:00 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Christian Loehle, LKML, Peter Zijlstra, Frederic Weisbecker,
Linux PM
On Wed, Feb 25 2026 at 14:10, Rafael J. Wysocki wrote:
> On Wed, Feb 25, 2026 at 1:54 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>> >
>> > Which prevents VMs or other systems which do not have an idle driver from
>> > stopping the tick at all. That's just obviously wrong, no?
>>
>> The benefit from stopping the tick in cpuidle is that it doesn't kick
>> CPUs from idle states unnecessarily, so more energy can be saved (or
>> even some energy can be saved at all if the idle state target
>> residency is large enough), but if the idle state in question is
>> shallow, that's rather not super-useful. And I'd rather not expect
>> default idle to be a deep idle state because that would obviously hurt
>> low-latency use cases.

There are systems out there where even HLT (or the architecture specific
equivalent) saves power magically in the firmware.

>> I must be missing something, so what is it?
>
> OK, if I'm not mistaken, the tick in a VM will effectively become a
> periodic hrtimer in the host and it would prevent the host cpuidle
> from stopping the tick. Fair enough.

That's the energy side.

The other problem is performance in the guest itself. If the guest idles
only briefly and can avoid rearming the timer on wakeup, then it wins
performance-wise. That's true for bare metal too, but the rearm on bare
metal is less expensive than a full VM exit.
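
The moving-average check described at the top of the thread aims at this
brief-idle case. A minimal userspace sketch of that idea follows. It is not
the code from the RFC patch: the struct, the function names, the 1/8 sample
weight and HZ=250 are assumptions made for illustration. Only the comparison
of the average against TICK_NSEC comes from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HZ=250, so one tick period is 4 ms. */
#define SKETCH_HZ		250
#define SKETCH_TICK_NSEC	(1000000000LL / SKETCH_HZ)

/* Assumption: each new sample contributes 1/8 to the average. */
#define SKETCH_AVG_WEIGHT	8

/* Hypothetical per-CPU state: running average of idle durations. */
struct idle_avg {
	int64_t avg_ns;
};

/* Fold the length of the idle period that just ended into the average. */
static void idle_avg_update(struct idle_avg *ia, int64_t idle_ns)
{
	assert(idle_ns >= 0);
	ia->avg_ns += (idle_ns - ia->avg_ns) / SKETCH_AVG_WEIGHT;
}

/*
 * Gate for stopping the tick: only allow it when recent idle periods
 * were, on average, longer than one tick period.
 */
static bool idle_avg_allows_tick_stop(const struct idle_avg *ia)
{
	return ia->avg_ns > SKETCH_TICK_NSEC;
}
```

With these assumptions, a run of 10us idle periods keeps the average far
below one tick period, so the tick stays armed and no rearm is needed on
wakeup. A single one-second idle period pushes the average above the
threshold again, which matches the stated goal of still allowing long idle
sleeps once the workload has stopped.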
Thanks,
tglx