public inbox for linux-pm@vger.kernel.org
From: Qais Yousef <qyousef@layalina.io>
To: Lukasz Luba <lukasz.luba@arm.com>
Cc: rafael@kernel.org, daniel.lezcano@linaro.org,
	Kajetan Puchalski <kajetan.puchalski@arm.com>,
	Dietmar.Eggemann@arm.com, dsmythies@telus.net,
	yu.chen.surf@gmail.com, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v6 2/2] cpuidle: teo: Introduce util-awareness
Date: Tue, 18 Jul 2023 13:45:17 +0100	[thread overview]
Message-ID: <20230718124517.3pvgvmesdh6eevqb@airbuntu> (raw)
In-Reply-To: <2bfa9dda-410f-5c8a-bff4-54a5ea308a93@arm.com>

On 07/18/23 11:23, Lukasz Luba wrote:
> 
> 
> On 7/17/23 19:21, Qais Yousef wrote:
> > +CC Vincent and Peter
> > 
> > On 07/17/23 14:47, Lukasz Luba wrote:
> > > Hi Qais,
> > > 
> > > The rule is 'one size doesn't fit all', please see below.
> > > 
> > > On 7/11/23 18:58, Qais Yousef wrote:
> > > > Hi Kajetan
> > > > 
> > > > On 01/05/23 14:51, Kajetan Puchalski wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > @@ -510,9 +598,11 @@ static int teo_enable_device(struct cpuidle_driver *drv,
> > > > >    			     struct cpuidle_device *dev)
> > > > >    {
> > > > >    	struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
> > > > > +	unsigned long max_capacity = arch_scale_cpu_capacity(dev->cpu);
> > > > >    	int i;
> > > > >    	memset(cpu_data, 0, sizeof(*cpu_data));
> > > > > +	cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT;
> > > > 
> > > > Given that utilization is invariant, why do we set the threshold based on
> > > > cpu capacity?
> > > 
> > > 
> > > To treat CPUs differently, not with the same policy.
> > > 
> > > 
> > > > 
> > > > I'm not sure if this is a problem, but on little cores this threshold would be
> > > > too low. Given that util is invariant, I wondered if we need a single
> > > > threshold for all types of CPUs instead. Have you tried something like that?
> > > 
> > > A single threshold for all CPUs might be biased towards some CPUs. Let's
> > > pick the value 15 - which was tested to work really well in benchmarks
> > > for the big CPUs. On the other hand, when you set that value for little
> > > CPUs, with max_capacity = 124, then you have a 15/124 ~= 13% threshold.
> > > That means you are ~9x more likely to prefer entering a deeper idle state
> > > (at max freq). What if the Little's freq is set to e.g. < ~20% fmax, which
> > > corresponds to capacity < ~25? Let's try to simulate such a scenario.
> > 
> > Hmm what I'm struggling with is that PELT is invariant. So the time it takes to
> > rise and decay to threshold of 15 should be the same for all CPUs, no?
> 
> Yes, but that's not an issue. The idea is to have a threshold value
> set to a level which corresponds to a long enough sleep time (in
> wall-clock terms). Then you don't waste energy turning the core
> on/off too often.
> 
> > 
> > > 
> > > In a situation where we have utilization 14 on a Little CPU, the CPU capacity
> > > (effectively frequency) voting based on utilization would be
> > > 1.2 * 14 = ~17, so let's pick the OPP corresponding to capacity 17.
> > > In such conditions the little CPU would run the 14-util periodic task for
> > > 14/17 = ~82% of wall-clock time. That's a lot, and not suited for
> > > entering a deeper idle state on that CPU, is it?
> > 
> > Yes, runtime is stretched. But we counter this at the utilization level by
> > making PELT invariant. I thought that any CPU in the system will now take
> > the same amount of time to ramp up and decay to the same util level. No?
> 
> The stretched runtime is what we want to derive from the rq util
> information, mostly after task migration to some next cpu.
> 
> > 
> > But maybe what you're saying is that we don't need to take the invariance into
> > account?
> 
> Invariance is OK, since it is the common ground when we migrate those tasks
> across cores, and we can still conclude the cpu sleeping time
> based on util.
> 
> > 
> > My concern (not backed by a real problem yet) is that the threshold is
> > near 0, and since PELT is invariant, the time to gain a few points is constant
> > irrespective of CPU/capacity/freq, which means the little CPUs have to be
> > almost completely idle, with no activity at all, IIUC.
> 
> As I said, that is good for those Little CPUs in terms of better latency
> for the tasks they serve, and it also doesn't harm the energy. The
> deeper idle state for those tiny-silicon-area cores (and low-power
> cells) doesn't bring much on average for real workloads.
> This is the trade-off: you don't want to sacrifice the latency factor,
> so you would rather pay a bit more energy by not entering deep idle
> on Little cores, but get better latency there.

I can follow this argument much better than the time-stretch one, for sure.

I sort of agree, but I still feel cautious that this is too aggressive. I guess
time will tell us how suitable this is, or we'll need to evolve :-)

FWIW, these patches are now in GKI, so we should get complaints if it is
a real problem.

> For the big cores, which occupy a bigger silicon area and are made from
> High-Performance (HP) cells (with a low V-threshold), leaking more, that
> trade-off is set differently. That's why a similarly small util task on
> a big core might face larger latency, due to the need to wake up the
> cpu from a deeper idle state.
> 
> > 
> > > 
> > > Apart from that, the little CPUs are tiny in terms of silicon area
> > > and are less leaky in WFI than big cores. Therefore, they don't need
> > > aggressive entries into deeper idle states. At the same time, they
> > > are often used for serving interrupts, where latency is an important
> > > factor.
> > 
> > On Pixel 6 this threshold will translate to a util_threshold of 2, which looks
> > too low to me. Can't TEO do a good job before we reach such an extremely low
> > level of activity?
> 
> Kajetan has introduced a new trace event, 'cpu_idle_miss', which
> helped us decide how to set the trade-off for the different thresholds.
> It turned out that we would rather mispredict and be wrong by missing
> an opportunity to enter deeper idle. The opposite error was more
> harmful. You can enable that trace event on your platform and analyze
> your use cases.

Okay, thanks for the info.
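For anyone wanting to repeat that analysis, a rough sketch of enabling the
cpu_idle_miss event via ftrace (assuming tracefs is mounted at
/sys/kernel/tracing and the kernel carries the event; requires root):

```shell
# Enable the cpu_idle_miss trace event and turn tracing on.
cd /sys/kernel/tracing
echo 1 > events/power/cpu_idle_miss/enable
echo 1 > tracing_on

# Run the workload of interest, then inspect the recorded misses:
grep cpu_idle_miss trace
```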


Cheers

--
Qais Yousef


Thread overview: 37+ messages
2023-01-05 14:51 [PATCH v6 0/2] cpuidle: teo: Introduce util-awareness Kajetan Puchalski
2023-01-05 14:51 ` [PATCH v6 1/2] cpuidle: teo: Optionally skip polling states in teo_find_shallower_state() Kajetan Puchalski
2023-01-05 14:51 ` [PATCH v6 2/2] cpuidle: teo: Introduce util-awareness Kajetan Puchalski
2023-01-05 15:07   ` Rafael J. Wysocki
2023-01-05 15:20     ` Lukasz Luba
2023-01-05 15:34     ` Vincent Guittot
2023-01-05 17:11       ` Rafael J. Wysocki
2023-07-11 17:58   ` Qais Yousef
2023-07-17 13:47     ` Lukasz Luba
2023-07-17 18:21       ` Qais Yousef
2023-07-18 10:23         ` Lukasz Luba
2023-07-18 12:45           ` Qais Yousef [this message]
2023-07-18 12:02     ` Kajetan Puchalski
2023-07-18 13:24       ` Qais Yousef
2023-07-19 15:07         ` Kajetan Puchalski
2023-09-17  1:05         ` Qais Yousef
2023-09-18 11:41           ` Kajetan Puchalski
2023-09-19  0:04             ` Qais Yousef
2024-05-28  9:29           ` Vincent Guittot
2024-05-28  9:59             ` Lukasz Luba
2024-05-28 14:07               ` Vincent Guittot
2024-05-29 13:09                 ` Christian Loehle
2024-05-31  8:57                   ` Vincent Guittot
2024-06-12  7:25                 ` Lukasz Luba
2024-06-12  9:04                   ` Vincent Guittot
2024-06-12  9:17                     ` Lukasz Luba
2024-06-17  8:52                       ` Vincent Guittot
2024-06-19 12:20                       ` Lukasz Luba
2024-05-28 10:35             ` Christian Loehle
2024-05-28 12:12             ` Kajetan Puchalski
2024-05-29 10:23               ` Qais Yousef
2024-05-29 10:19             ` Qais Yousef
2024-06-12  7:53               ` Lukasz Luba
2024-06-16 21:48                 ` Qais Yousef
2024-06-17  8:13                   ` Lukasz Luba
2023-01-12 19:22 ` [PATCH v6 0/2] " Rafael J. Wysocki
2023-01-13 15:21   ` Kajetan Puchalski
