From: Frederic Weisbecker <frederic@kernel.org>
To: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Valentin Schneider <vschneid@redhat.com>,
"Rafael J . Wysocki" <rafael@kernel.org>
Subject: Re: Stopping the tick on a fully loaded system
Date: Sun, 23 Jul 2023 23:21:52 +0200 [thread overview]
Message-ID: <ZL2Z8InSLmI5GU9L@localhost.localdomain> (raw)
In-Reply-To: <ad370ab-5694-d6e4-c888-72bdc635824@linutronix.de>
(Adding Rafael in Cc)
Le Thu, Jul 20, 2023 at 03:00:37PM +0200, Anna-Maria Behnsen a écrit :
> I had also a look at teo. It makes things better but does not solve the
> underlying problem that I see here - please correct me if I missed
> something or if I'm simply wrong:
>
> Yes, the governors have to decide in the end, whether it makes sense to
> stop the tick or not. For this decision, the governors require information
> about the current state of the core and how long nothing has to be done
> propably. At the moment the governors therefore call
> tick_nohz_get_sleep_length(). This checks first whether the tick can be
> stopped. Then it takes into account whether rcu, irq_work, arch_work needs
> the CPU or a timer softirq is pending. If non of this is true, then the
> timers are checked. So tick_nohz_get_sleep_length() isn't only based on
> timers already.
Right but those things (rcu/irq work, etc...) act kind of like timers here
and they should be considered as exceptions.
The timer infrastructure shouldn't take into account the idle activity,
this is really a job for the cpuidle governors.
> The information about the sleep length of the scheduler perspective is
> completely missing in the current existing check for the probable sleep
> length.
>
> Sure, teo takes scheduler utilization into account directly in the
> governor. But for me it is not comprehensible, why the CPU utilization
> check is done after asking for the possible sleep length where timers are
> taken into account. If the CPU is busy anyway, the information generated by
> tick_nohz_next_event() is irrelevant. And when the CPU is not busy, then it
> makes sense to ask for the sleep length also from a timer perspective.
>
> When this CPU utilization check is implemented directly inside the
> governor, every governor has to implement it on it's own. So wouldn't it
> make sense to implement a "how utilized is the CPU out of a scheduler
> perspective" in one place and use this as the first check in
> tick_nohz_get_sleep_length()/tick_nohz_next_event()?
>
Well, beyond that, there might be other situations where the governor may
decide not to stop the tick even if tick_nohz_next_event() says it's possible
to do so. That's the purpose of having that next event as an input among many
others for the cpuidle governors.
As such, calling tmigr_cpu_deactivate() on next tick _evaluation_ time instead of
tick _stop_ time is always going to be problematic.
Can we fix that and call tmigr_cpu_deactivate() from tick_nohz_stop_tick()
instead? This will change a bit the locking scenario because
tick_nohz_stop_tick() doesn't hold the base lock. Is it a problem though?
In the worst case a remote tick happens and handles the earliest timer
for the current CPU while it's between tick_nohz_next_event() and
tick_nohz_stop_tick(), but then the current CPU would just propagate
an earlier deadline than needed. No big deal.
Though I could be overlooking some race or something else making that
not possible of course...
Thanks.
next prev parent reply other threads:[~2023-07-23 21:22 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-20 6:51 Stopping the tick on a fully loaded system Anna-Maria Behnsen
2023-07-20 7:38 ` Vincent Guittot
2023-07-20 13:00 ` Anna-Maria Behnsen
2023-07-20 13:55 ` Vincent Guittot
2023-07-23 21:21 ` Frederic Weisbecker [this message]
2023-07-24 8:23 ` Rafael J. Wysocki
2023-07-25 13:07 ` Anna-Maria Behnsen
2023-07-25 14:27 ` Rafael J. Wysocki
2023-07-25 22:28 ` Peter Zijlstra
2023-07-26 15:10 ` Rafael J. Wysocki
2023-07-26 15:53 ` Rafael J. Wysocki
2023-07-26 16:14 ` Peter Zijlstra
2023-07-26 16:49 ` Peter Zijlstra
2023-07-26 21:26 ` Peter Zijlstra
2023-07-27 7:59 ` Peter Zijlstra
2023-07-27 20:10 ` Rafael J. Wysocki
2023-07-26 16:40 ` Anna-Maria Behnsen
2023-07-26 18:30 ` Rafael J. Wysocki
2023-07-26 20:09 ` Peter Zijlstra
2023-07-26 10:59 ` Frederic Weisbecker
2023-07-26 15:07 ` Rafael J. Wysocki
2023-07-26 10:47 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZL2Z8InSLmI5GU9L@localhost.localdomain \
--to=frederic@kernel.org \
--cc=anna-maria@linutronix.de \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=gautham.shenoy@amd.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox