From: Peter Zijlstra <peterz@infradead.org>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>,
    Frederic Weisbecker <frederic@kernel.org>,
    Vincent Guittot <vincent.guittot@linaro.org>,
    linux-kernel@vger.kernel.org,
    Thomas Gleixner <tglx@linutronix.de>,
    "Gautham R. Shenoy" <gautham.shenoy@amd.com>,
    Ingo Molnar <mingo@redhat.com>,
    Juri Lelli <juri.lelli@redhat.com>,
    Dietmar Eggemann <dietmar.eggemann@arm.com>,
    Steven Rostedt <rostedt@goodmis.org>,
    Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
    Daniel Bristot de Oliveira <bristot@redhat.com>,
    Valentin Schneider <vschneid@redhat.com>
Subject: Re: Stopping the tick on a fully loaded system
Date: Wed, 26 Jul 2023 00:28:51 +0200
Message-ID: <20230725222851.GC3784071@hirez.programming.kicks-ass.net>
In-Reply-To: <CAJZ5v0gJj_xGHcABCDoX2t8aR+9kXr7fvRFF+5KBO5MJz9kFWQ@mail.gmail.com>

On Tue, Jul 25, 2023 at 04:27:56PM +0200, Rafael J. Wysocki wrote:
> On Tue, Jul 25, 2023 at 3:07 PM Anna-Maria Behnsen wrote:
> >                     100% load           50% load            25% load
> >                     (top: ~2% idle)     (top: ~49% idle)    (top: ~74% idle;
> >                                                              33 CPUs are completely idle)
> >                     -----------------   -----------------   ----------------------------
> > Idle Total           1658703     100%    3150522     100%    2377035     100%
> > x >= 4ms                2504    0.15%          2    0.00%         53    0.00%
> > 4ms > x >= 2ms           390    0.02%          0    0.00%       4563    0.19%
> > 2ms > x >= 1ms            62    0.00%          1    0.00%         54    0.00%
> > 1ms > x >= 500us          67    0.00%          6    0.00%          2    0.00%
> > 500us > x >= 250us        93    0.01%         39    0.00%         11    0.00%
> > 250us > x >= 100us       280    0.02%       1145    0.04%        633    0.03%
> > 100us > x >= 50us        942    0.06%      30722    0.98%      13347    0.56%
> > 50us > x >= 25us       26728    1.61%     310932    9.87%     106083    4.46%
> > 25us > x >= 10us      825920   49.79%    2320683   73.66%    1722505   72.46%
> > 10us > x > 5us        795197   47.94%     442991   14.06%     506008   21.29%
> > 5us > x                 6520    0.39%      43994    1.40%      23645    0.99%
> >
> >
> > 99% of the tick stops only have an idle period shorter than 50us (50us is
> > 1.25% of a tick length).
>
> Well, this just means that the governor predicts overly long idle
> durations quite often under this workload.
>
> The governor's decision on whether or not to stop the tick is based on
> its idle duration prediction. If it overshoots, that's how it goes.

This is abysmal; IIRC TEO tracks a density function in C state buckets
and if it finds the idle period is more likely to be shorter than
'predicted' by the timer, it should pick something shallower.

Given we have this density function, picking something that's <1% likely
is insane. In fact, it seems to suggest the whole pick-alternative thing
is utterly broken.
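
As a rough cross-check of that '<1%' figure, the 100% load column of the table quoted above can be summed directly. This is only a sketch: the counts are copied from the table, the 4ms tick length is an assumption inferred from the "50us is 1.25% of a tick" remark (i.e. HZ=250), and the names are made up for illustration.

```python
# Idle-period histogram for the 100% load case, copied from the quoted table.
# Keys are the lower bound of each bin in microseconds (0 stands for "< 5us").
IDLE_COUNTS_100 = {
    4000: 2504, 2000: 390, 1000: 62, 500: 67, 250: 93, 100: 280,
    50: 942, 25: 26728, 10: 825920, 5: 795197, 0: 6520,
}

TICK_US = 4000  # assumption: HZ=250, so one tick is 4ms


def share_at_least(counts, threshold_us):
    """Percentage of idle periods whose bin starts at or above threshold_us."""
    total = sum(counts.values())
    hits = sum(n for lower, n in counts.items() if lower >= threshold_us)
    return 100.0 * hits / total


if __name__ == "__main__":
    print(f"idle >= one tick: {share_at_least(IDLE_COUNTS_100, TICK_US):.2f}%")
    print(f"idle <  50us    : {100 - share_at_least(IDLE_COUNTS_100, 50):.2f}%")
```

By these numbers, only about 0.15% of the recorded idle periods in the fully loaded case lasted a full tick or longer.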
> > This is also the reason for my opinion, that the return of
> > tick_nohz_next_event() is completely irrelevant in a (fully) loaded case:
>
> It is an upper bound and in a fully loaded case it may be way off.

But given we have our density function, we should be able to do much
better.

Oooh,... I think I see the problem. Our bins are strictly the available
C-states, but if you run this on a Zen3 that has ACPI-idle, then you end
up with something that only has 3 C states, like:

  $ for i in state*/residency ; do echo -n "${i}: "; cat "$i"; done
  state0/residency: 0
  state1/residency: 2
  state2/residency: 36

Which means we only have buckets: (0,0], (0,2000], (2000,36000] or somesuch
(residency is reported in us, the buckets are in ns). All of them very much
smaller than TICK_NSEC.

That means we don't track nearly enough data to reliably tell anything
about disabling the tick or not. We should have at least one bucket
beyond TICK_NSEC for this.
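
To put numbers on that, here is a small sketch that compares the bucket edges from the listing above against the tick length. TICK_NSEC is assumed to be 4ms (HZ=250), and the helper name is hypothetical; this is not how the governor itself is written.

```python
TICK_NSEC = 4_000_000  # assumption: HZ=250, i.e. a 4ms tick

# Target residencies from the sysfs listing above, converted from us to ns.
RESIDENCY_NS = [0 * 1000, 2 * 1000, 36 * 1000]


def bin_edges_beyond(edges_ns, limit_ns):
    """Return the bin edges that lie at or beyond limit_ns."""
    return [edge for edge in edges_ns if edge >= limit_ns]


if __name__ == "__main__":
    beyond = bin_edges_beyond(RESIDENCY_NS, TICK_NSEC)
    print("largest edge:", max(RESIDENCY_NS), "ns")
    print("edges at or beyond TICK_NSEC:", beyond)
```

With these inputs no edge reaches TICK_NSEC; the largest one, 36000ns, is under 1% of the assumed tick length.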

Hmm.. it is getting very late, but how about I get the cpuidle framework
to pad the drv states with a few 'disabled' C states so that we have at
least enough data to cross the TICK_NSEC boundary and say something
usable about things.
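
One way to picture the padding idea, purely as an illustration: keep appending extra bucket edges until one of them reaches TICK_NSEC. Nothing in this mail says how many states to add or at which residencies, so the growth factor, the 1us floor and the function name below are all assumptions, and this is not a patch.

```python
def pad_edges(edges_ns, tick_ns, factor=8):
    """Append hypothetical 'disabled' edges until one reaches tick_ns.

    edges_ns must be non-empty and sorted; factor is an arbitrary growth
    step picked for illustration only.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    padded = list(edges_ns)
    while padded[-1] < tick_ns:
        # Start from 1us if the deepest real state has a zero residency.
        padded.append(max(padded[-1], 1000) * factor)
    return padded


if __name__ == "__main__":
    # Edges from the Zen3 listing (in ns), assumed 4ms tick (HZ=250).
    print(pad_edges([0, 2000, 36000], 4_000_000))
```

The real states stay untouched; only the tail of the list grows, so the history now has at least one bucket that starts beyond the tick.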

Because as things stand, it's very likely we determine @stop_tick purely
based on what tick_nohz_get_sleep_length() tells us, not on what we've
learnt from recent history.

(FWIW intel_idle seems to not have an entry for Tigerlake !?! -- my poor
laptop, it feels neglected)