RE: Performance regressions introduced via Revert "cpuidle: menu: Avoid discarding useful information" on 5.15 LTS

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

From: "Doug Smythies" <dsmythies@telus.net>
To: "'Harshvardhan Jha'" <harshvardhan.j.jha@oracle.com>,
	"'Christian Loehle'" <christian.loehle@arm.com>
Cc: "'Sasha Levin'" <sashal@kernel.org>,
	"'Greg Kroah-Hartman'" <gregkh@linuxfoundation.org>,
	<linux-pm@vger.kernel.org>, <stable@vger.kernel.org>,
	"'Rafael J. Wysocki'" <rafael@kernel.org>,
	"'Daniel Lezcano'" <daniel.lezcano@linaro.org>,
	"Doug Smythies" <dsmythies@telus.net>
Subject: RE: Performance regressions introduced via Revert "cpuidle: menu: Avoid discarding useful information" on 5.15 LTS
Date: Tue, 27 Jan 2026 21:06:40 -0800	[thread overview]
Message-ID: <003e01dc9013$e3bc5060$ab34f120$@telus.net> (raw)
In-Reply-To: <849ee0ff-e15b-4b69-84de-6503e3b3168d@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 8076 bytes --]

On 2026.01.27 07:45 Harshvardhan Jha wrote:
>On 08/12/25 6:17 PM, Christian Loehle wrote:
>> On 12/8/25 11:33, Harshvardhan Jha wrote:
>>> On 04/12/25 4:00 AM, Doug Smythies wrote:
>>>> On 2025.12.03 08:45 Christian Loehle wrote:
>>>>> On 12/3/25 16:18, Harshvardhan Jha wrote:
>>>>>>
>>>>>> While running performance benchmarks for the 5.15.196 LTS tags , it was
>>>>>> observed that several regressions across different benchmarks is being
>>>>>> introduced when compared to the previous 5.15.193 kernel tag. Running an
>>>>>> automated bisect on both of them narrowed down the culprit commit to:
>>>>>> - 5666bcc3c00f7 Revert "cpuidle: menu: Avoid discarding useful
>>>>>> information" for 5.15
>>>>>>
>>>>>> Regressions on 5.15.196 include:
>>>>>> -9.3% : Phoronix pts/sqlite using 2 processes on OnPrem X6-2
>>>>>> -6.3% : Phoronix system/sqlite on OnPrem X6-2
>>>>>> -18%  : rds-stress -M 1 (readonly rdma-mode) metrics with 1 depth & 1
>>>>>> thread & 1M buffer size on OnPrem X6-2
>>>>>> -4 -> -8% : rds-stress -M 2 (writeonly rdma-mode) metrics with 1 depth &
>>>>>> 1 thread & 1M buffer size on OnPrem X6-2
>>>>>> Up to -30% : Some Netpipe metrics on OnPrem X5-2
>>>>>>
>>>>>> The culprit commits' messages mention that these reverts were done due
>>>>>> to performance regressions introduced in Intel Jasper Lake systems but
>>>>>> this revert is causing issues in other systems unfortunately. I wanted
>>>>>> to know the maintainers' opinion on how we should proceed in order to
>>>>>> fix this. If we reapply it'll bring back the previous regressions on
>>>>>> Jasper Lake systems and if we don't revert it then it's stuck with
>>>>>> current regressions. If this problem has been reported before and a fix
>>>>>> is in the works then please let me know I shall follow developments to
>>>>>> that mail thread.
>>>>> The discussion regarding this can be found here:
>>>>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/36iykr223vmcfsoysexug6s274nq2oimcu55ybn6ww4il3g3cv@cohflgdbpnq7/__;!!ACWV5N9M2RV99hQ!MWXEz_wRbaLyJxDign2EXci2qNzAPpCyhi8qIORMdReh0g_yIVIt-Oqov23KT23A_rGBRRxJ4bHb_e6UQA-b9PW7hw$ 
>>>>> we explored an alternative to the full revert here:
>>>>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/4687373.LvFx2qVVIh@rafael.j.wysocki/__;!!ACWV5N9M2RV99hQ!MWXEz_wRbaLyJxDign2EXci2qNzAPpCyhi8qIORMdReh0g_yIVIt-Oqov23KT23A_rGBRRxJ4bHb_e6UQA9PSf_uMQ$ 
>>>>> unfortunately that didn't lead anywhere useful, so Rafael went with the
>>>>> full revert you're seeing now.
>>>>>
>>>>> Ultimately it seems to me that this "aggressiveness" on deep idle tradeoffs
>>>>> will highly depend on your platform, but also your workload, Jasper Lake
>>>>> in particular seems to favor deep idle states even when they don't seem
>>>>> to be a 'good' choice from a purely cpuidle (governor) perspective, so
>>>>> we're kind of stuck with that.
>>>>>
>>>>> For teo we've discussed a tunable knob in the past, which comes naturally with
>>>>> the logic, for menu there's nothing obvious that would be comparable.
>>>>> But for teo such a knob didn't generate any further interest (so far).
>>>>>
>>>>> That's the status, unless I missed anything?
>>>> By reading everything in the links Chrsitian provided, you can see
>>>> that we had difficulties repeating test results on other platforms.
>>>>
>>>> Of the tests listed herein, the only one that was easy to repeat on my
>>>> test server, was the " Phoronix pts/sqlite" one. I got (summary: no difference):
>>>>
>>>> Kernel 6.18									Reverted			
>>>> pts/sqlite-2.3.0			menu rc4		menu rc1		menu rc1		menu rc3	
>>>> 				performance		performance		performance		performance	
>>>> test	what			ave			ave			ave			ave	
>>>> 1	T/C 1			2.147	-0.2%		2.143	0.0%		2.16	-0.8%		2.156	-0.6%
>>>> 2	T/C 2			3.468	0.1%		3.473	0.0%		3.486	-0.4%		3.478	-0.1%
>>>> 3	T/C 4			4.336	0.3%		4.35	0.0%		4.355	-0.1%		4.354	-0.1%
>>>> 4	T/C 8			5.438	-0.1%		5.434	0.0%		5.456	-0.4%		5.45	-0.3%
>>>> 5	T/C 12			6.314	-0.2%		6.299	0.0%		6.307	-0.1%		6.29	0.1%
>>>>
>>>> Where:
>>>> T/C means: Threads / Copies
>>>> performance means: intel_pstate CPU frequency scaling driver and the performance CPU frequencay scaling governor.
>>>> Data points are in Seconds.
>>>> Ave means the average test result. The number of runs per test was increased from the default of 3 to 10.
>>>> The reversion was manually applied to kernel 6.18-rc1 for that test.
>>>> The reversion was included in kernel 6.18-rc3.
>>>> Kernel 6.18-rc4 had another code change to menu.c
>>>>
>>>> In case the formatting gets messed up, the table is also attached.
>>>>
>>>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz, 6 cores 12 CPUs.
>>>> HWP: Enabled.
>>> I was able to recover performance on 5.15 and 5.4 LTS based kernels
>>> after reapplying the revert on X6-2 systems.
>>>
>>> Architecture:                x86_64
>>>   CPU op-mode(s):            32-bit, 64-bit
>>>   Address sizes:             46 bits physical, 48 bits virtual
>>>   Byte Order:                Little Endian
>>> CPU(s):                      56
>>>   On-line CPU(s) list:       0-55
>>> Vendor ID:                   GenuineIntel
>>>   Model name:                Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
>>>    CPU family:              6
>>>     Model:                   79
>>>     Thread(s) per core:      2
>>>     Core(s) per socket:      14
>>>     Socket(s):               2
>>>     Stepping:                1
>>>     CPU(s) scaling MHz:      98%
>>>     CPU max MHz:             2600.0000
>>>     CPU min MHz:             1200.0000
>>>     BogoMIPS:                5188.26
... snip ...
>> It would be nice to get the idle states here, ideally how the states' usage changed
>> from base to revert.
>> The mentioned thread did this and should show how it can be done, but a dump of
>> cat /sys/devices/system/cpu/cpu*/cpuidle/state*/*
>> before and after the workload is usually fine to work with:
>> https://urldefense.com/v3/__https://lore.kernel.org/linux-pm/8da42386-282e-4f97-af93-4715ae206361@arm.com/__;!!ACWV5N9M2RV99hQ!PEhkFcO7emFLMaNxWEoE2Gtnw3zSkpghP17iuEvZM3W6KUpmkbgKw_tr91FwGfpzm4oA5f7c5sz8PkYvKiEVwI_iLIPpMt53$ 
> Apologies for the late reply, I'm attaching a tar ball which has the cpu
> states for the test suites before and after tests. The folders with the
> name of the test contain two folders good-kernel and bad-kernel
> containing two files having the before and after states. Please note
> that different machines were used for different test suites due to
> compatibility reasons. The jbb test was run using containers.

It is a considerable amount of work to manually extract and summarize the data.
I have only done it for the phoronix-sqlite data.
There seems to be 40 CPUs, 5 idle states, with idle state 3 defaulting to disabled.
I remember seeing a Linux-pm email about why but couldn't find it just now.
Summary (also attached as a PNG file, in case the formatting gets messed up):
The total idle entries (usage)  and time seem low to me, which is why the ???.

phoronix-sqlite						
	Good Kernel: Time between samples 4 seconds (estimated and ???)					
	Usage	Above	Below	Above	Below	
state 0	220	0	218	0.00%	99.09%	
state 1	70212	5213	34602	7.42%	49.28%	
state 2	30273	5237	1806	17.30%	5.97%	
state 3	0	0	0	0.00%	0.00%	
state 4	11824	2120	0	17.93%	0.00%	
						
total	112529	12570	36626	43.72%	 <<< Misses %	
						
	Bad Kernel: Time between samples 3.8 seconds (estimated and ???)					
	Usage	Above	Below	Above	Below	
state 0	262	0	260	0.00%	99.24%	
state 1	62751	3985	35588	6.35%	56.71%	
state 2	24941	7896	1433	31.66%	5.75%	
state 3	0	0	0	0.00%	0.00%	
state 4	24489	11543	0	47.14%	0.00%	
						
total	112443	23424	37281	53.99%	 <<< Misses %

Observe 2X use of idle state 4 for the "Bad Kernel"

I have a template now, and can summarize the other 40 CPU data
faster, but I would have to rework the template for the 56 CPU data,
and is it a 64 CPU data set at 4 idle states per CPU?

... Doug


[-- Attachment #2: sqlite-summary.png --]
[-- Type: image/png, Size: 37171 bytes --]

next prev parent reply	other threads:[~2026-01-28  5:06 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-03 16:18 Performance regressions introduced via Revert "cpuidle: menu: Avoid discarding useful information" on 5.15 LTS Harshvardhan Jha
2025-12-03 16:44 ` Christian Loehle
2025-12-03 22:30   ` Doug Smythies
2025-12-08 11:33     ` Harshvardhan Jha
2025-12-08 12:47       ` Christian Loehle
2026-01-13  7:06         ` Harshvardhan Jha
2026-01-13 14:13           ` Rafael J. Wysocki
2026-01-13 14:18             ` Rafael J. Wysocki
2026-01-14  4:28               ` Sergey Senozhatsky
2026-01-14  4:49                 ` Sergey Senozhatsky
2026-01-14  5:15                   ` Tomasz Figa
2026-01-14 20:07                     ` Rafael J. Wysocki
2026-01-29 10:23                       ` Harshvardhan Jha
2026-01-29 22:47                   ` Doug Smythies
2026-01-27 15:45         ` Harshvardhan Jha
2026-01-28  5:06           ` Doug Smythies [this message]
2026-01-28 23:53             ` Doug Smythies
2026-01-29 22:27               ` Doug Smythies
2026-01-30 19:28                 ` Rafael J. Wysocki
2026-02-01 19:20                   ` Christian Loehle
2026-02-02 17:31                     ` Harshvardhan Jha
2026-02-03  9:07                       ` Christian Loehle
2026-02-03  9:16                         ` Harshvardhan Jha
2026-02-03  9:31                           ` Christian Loehle
2026-02-03 10:22                             ` Harshvardhan Jha
2026-02-03 10:30                               ` Christian Loehle
2026-02-03 16:45                             ` Rafael J. Wysocki
2026-02-05  0:45                               ` Doug Smythies
2026-02-05  2:37                                 ` Sergey Senozhatsky
2026-02-05  5:18                                   ` Doug Smythies
2026-02-10  9:17                                     ` Sergey Senozhatsky
2026-02-11  4:27                                       ` Doug Smythies
2026-02-05  7:15                                   ` Christian Loehle
2026-02-10  8:02                                     ` Sergey Senozhatsky
2026-02-10  8:57                                       ` Christian Loehle
2026-02-11  4:27                                         ` Doug Smythies
2026-02-05  5:02                                 ` Doug Smythies
2026-02-10  9:33                                   ` Xueqin Luo
2026-02-10 10:04                                     ` Sergey Senozhatsky
2026-02-10 14:24                                       ` Rafael J. Wysocki
2026-02-11  1:34                                         ` Sergey Senozhatsky
2026-02-11  4:17                                           ` Doug Smythies
2026-02-11  8:58                                         ` Xueqin Luo
2026-02-10 14:20                                     ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='003e01dc9013$e3bc5060$ab34f120$@telus.net' \
    --to=dsmythies@telus.net \
    --cc=christian.loehle@arm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=harshvardhan.j.jha@oracle.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox