Re: [RFT][PATCH v1] cpuidle: teo: Avoid selecting deepest idle state over-eagerly

From: Christian Loehle <christian.loehle@arm.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux PM <linux-pm@vger.kernel.org>,
	dsmythies@telus.net
Cc: LKML <linux-kernel@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Artem Bityutskiy <artem.bityutskiy@linux.intel.com>,
	Aboorva Devarajan <aboorvad@linux.ibm.com>
Subject: Re: [RFT][PATCH v1] cpuidle: teo: Avoid selecting deepest idle state over-eagerly
Date: Thu, 13 Feb 2025 14:07:16 +0000
Message-ID: <8d147f4f-f511-4f44-b18e-2011b0fab17c@arm.com>
In-Reply-To: <12630185.O9o76ZdvQC@rjwysocki.net>

On 2/4/25 20:58, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It has been observed that the recent teo governor update which concluded
> with commit 16c8d7586c19 ("cpuidle: teo: Skip sleep length computation
> for low latency constraints") caused the max-jOPS score of the SPECjbb
> 2015 benchmark [1] on Intel Granite Rapids to decrease by around 1.4%.
> While it may be argued that this is not a significant decrease, the
> previous score can be restored by tweaking the inequality used by teo
> to decide whether or not to preselect the deepest enabled idle state.
> That change also causes the critical-jOPS score of SPECjbb to increase
> by around 2%.
> 
> Namely, the likelihood of selecting the deepest enabled idle state in
> teo on the platform in question has increased after commit 13ed5c4a6d9c
> ("cpuidle: teo: Skip getting the sleep length if wakeups are very
> frequent") because some timer wakeups were previously counted as non-
> timer ones and they were effectively added to the left-hand side of the
> inequality deciding whether or not to preselect the deepest idle state.
> 
> Many of them are now (accurately) counted as timer wakeups, so the left-
> hand side of that inequality is now effectively smaller in some cases,
> especially when timer wakeups often occur in the range below the target
> residency of the deepest enabled idle state and idle states with target
> residencies below RESIDENCY_THRESHOLD_NS are often selected, but the
> majority of recent idle intervals are still above that value most of
> the time.  As a result, the deepest enabled idle state may be selected
> more often than it used to be selected in some cases.
> 
> To counter that effect, add the sum of the hits metric for all of the
> idle states below the candidate one (which is the deepest enabled idle
> state at that point) to the left-hand side of the inequality mentioned
> above.  This will cause it to be more balanced because, in principle,
> putting both timer and non-timer wakeups on both sides of it is more
> consistent than only taking into account the timer wakeups in the range
> above the target residency of the deepest enabled idle state.
> 
> Link: https://www.spec.org/jbb2015/ [1]
> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpuidle/governors/teo.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -349,13 +349,13 @@
>  	}
>  
>  	/*
> -	 * If the sum of the intercepts metric for all of the idle states
> -	 * shallower than the current candidate one (idx) is greater than the
> +	 * If the sum of the intercepts and hits metric for all of the idle
> +	 * states below the current candidate one (idx) is greater than the
>  	 * sum of the intercepts and hits metrics for the candidate state and
>  	 * all of the deeper states, a shallower idle state is likely to be a
>  	 * better choice.
>  	 */
> -	if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
> +	if (2 * (idx_intercept_sum + idx_hit_sum) > cpu_data->total) {
>  		int first_suitable_idx = idx;
>  
>  		/*
> 
> 
> 
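
To make the effect of the changed inequality concrete, here is a minimal,
self-contained C sketch that puts the two conditions from the diff above
side by side. Only the two comparisons come from the patch; struct teo_sums
and both helper names are hypothetical stand-ins, not code from teo.c, and
the sketch assumes that total counts every hit and intercept across all
states. Rearranged, the old test is 2*I + H > T and the new one is
2*I + 2*H > T, so the patch effectively adds the hits sum H to the
left-hand side once more.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the counters used in the comparison; the
 * real governor keeps these in its per-CPU data and local variables.
 */
struct teo_sums {
	unsigned int idx_intercept_sum;	/* intercepts of states shallower than idx */
	unsigned int idx_hit_sum;	/* hits of states shallower than idx */
	unsigned int total;		/* assumed: all hits and intercepts combined */
};

/* Condition before the patch (the removed line in the diff). */
static inline bool prefer_shallower_old(const struct teo_sums *s)
{
	return 2 * s->idx_intercept_sum > s->total - s->idx_hit_sum;
}

/* Condition after the patch (the added line in the diff). */
static inline bool prefer_shallower_new(const struct teo_sums *s)
{
	return 2 * (s->idx_intercept_sum + s->idx_hit_sum) > s->total;
}
```

With made-up counts idx_intercept_sum = 30, idx_hit_sum = 30 and
total = 100, the old condition compares 60 against 70 and keeps the deepest
state as the candidate, while the new one compares 120 against 100 and goes
on to look for a shallower state.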

I'm curious, are Doug's numbers reproducible?
Or could you share the idle state usage numbers? Is that explainable?
Seems like a lot and it does worry me that I can't reproduce anything
as drastic.

I did a bit of x86 testing as well; the Raptor Lake results are below (I
won't post the non-x86 numbers now, but teo-tweak performs comparably to
teo mainline there).

Idle 5 min:
gov        iter  Joules  idles  idle_misses  idle_miss_ratio  belows  aboves
teo        0     170.02  12690  646          0.051            371     275
teo        1     123.17  8361   517          0.062            281     236
teo        2     122.76  7741   347          0.045            262     85
teo        3     118.5   8699   668          0.077            307     361
teo        4     80.46   8113   443          0.055            264     179
teo-tweak  0     115.05  10223  803          0.079            323     480
teo-tweak  1     164.41  8523   631          0.074            263     368
teo-tweak  2     163.91  8409   711          0.085            256     455
teo-tweak  3     137.22  8581   721          0.084            261     460
teo-tweak  4     174.95  8703   675          0.078            261     414
teo        0     164.34  8443   516          0.061            303     213
teo        1     167.85  8767   492          0.056            256     236
teo        2     166.25  7835   406          0.052            263     143
teo        3     189.77  8865   493          0.056            276     217
teo        4     136.97  9185   467          0.051            286     181

At least in the idle case you can see an increase in 'above' idle_misses.
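
For reading the tables: the rows are consistent with idle_misses being
belows + aboves and idle_miss_ratio being idle_misses / idles. A minimal C
sketch of that arithmetic follows; struct idle_sample and the helper names
are hypothetical and not taken from any measurement tool.

```c
/* Hypothetical container for one table row's raw counters. */
struct idle_sample {
	unsigned long idles;	/* total idle entries in the run */
	unsigned long belows;	/* "belows" column */
	unsigned long aboves;	/* "aboves" column */
};

/* Total misses: the two miss columns added together. */
static inline unsigned long idle_misses(const struct idle_sample *s)
{
	return s->belows + s->aboves;
}

/* Fraction of idle entries that were misses; 0 if there were no idles. */
static inline double idle_miss_ratio(const struct idle_sample *s)
{
	return s->idles ? (double)idle_misses(s) / (double)s->idles : 0.0;
}
```

For the first teo row (idles = 12690, belows = 371, aboves = 275) this
gives 646 misses and a ratio of about 0.051, matching the table.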

Firefox Youtube 4K video playback 2 min:
gov        iter  Joules  idles  idle_misses  idle_miss_ratio  belows  aboves
teo        0     260.09  67404  11189        0.166            1899    9290
teo        1     273.71  76649  12155        0.159            2233    9922
teo        2     231.45  59559  10344        0.174            1747    8597
teo        3     202.61  58223  10641        0.183            1748    8893
teo        4     217.56  61411  10744        0.175            1809    8935
teo-tweak  0     227.99  61209  11251        0.184            2110    9141
teo-tweak  1     222.44  61959  10323        0.167            1474    8849
teo-tweak  2     218.1   64380  11080        0.172            1845    9235
teo-tweak  3     207.4   60183  11267        0.187            1929    9338
teo-tweak  4     217.46  61253  10381        0.169            1620    8761
menu       0     225.72  87871  26032        0.296            25412   620
menu       1     200.36  86577  24712        0.285            24486   226
menu       2     214.79  84885  24750        0.292            24556   194
menu       3     206.07  88007  25938        0.295            25683   255
menu       4     216.48  88700  26504        0.299            26302   202

(The idle numbers aren't really reflected in the energy used, which is
dominated by active power.)
