From: "Doug Smythies" <dsmythies@telus.net>
To: "'Harshvardhan Jha'" <harshvardhan.j.jha@oracle.com>,
"'Christian Loehle'" <christian.loehle@arm.com>
Cc: "'Sasha Levin'" <sashal@kernel.org>,
"'Greg Kroah-Hartman'" <gregkh@linuxfoundation.org>,
<linux-pm@vger.kernel.org>, <stable@vger.kernel.org>,
"'Rafael J. Wysocki'" <rafael@kernel.org>,
"'Daniel Lezcano'" <daniel.lezcano@linaro.org>,
"Doug Smythies" <dsmythies@telus.net>
Subject: RE: Performance regressions introduced via Revert "cpuidle: menu: Avoid discarding useful information" on 5.15 LTS
Date: Thu, 29 Jan 2026 14:27:13 -0800
Message-ID: <002601dc916e$6acbe650$4063b2f0$@telus.net>
In-Reply-To: <004e01dc90b1$4b28f9e0$e17aeda0$@telus.net>
[-- Attachment #1: Type: text/plain, Size: 6908 bytes --]
On 2026.01.28 15:53 Doug Smythies wrote:
> On 2026.01.27 21:07 Doug Smythies wrote:
>> On 2026.01.27 07:45 Harshvardhan Jha wrote:
>>> On 08/12/25 6:17 PM, Christian Loehle wrote:
>>>> On 12/8/25 11:33, Harshvardhan Jha wrote:
>>>>> On 04/12/25 4:00 AM, Doug Smythies wrote:
>>>>>> On 2025.12.03 08:45 Christian Loehle wrote:
>>>>>>> On 12/3/25 16:18, Harshvardhan Jha wrote:
> ... snip ...
>
>>>> It would be nice to get the idle states here, ideally how the states' usage changed
>>>> from base to revert.
>>>> The mentioned thread did this and should show how it can be done, but a dump of
>>>> cat /sys/devices/system/cpu/cpu*/cpuidle/state*/*
>>>> before and after the workload is usually fine to work with:
>>>> https://lore.kernel.org/linux-pm/8da42386-282e-4f97-af93-4715ae206361@arm.com/
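
For anyone wanting to collect the same data, here is a minimal Python
sketch of the before/after dump described above. It assumes the usual
cpuidle sysfs layout (cpuN/cpuidle/stateM/ with usage, above, below and
time files). The helper names snapshot() and delta_per_state() are
hypothetical, not part of any existing tool.

```python
from collections import defaultdict
from pathlib import Path

# Per-state counters exposed by cpuidle in sysfs (assumed present).
COUNTERS = ("usage", "above", "below", "time")


def snapshot(root="/sys/devices/system/cpu"):
    """Return {(cpu, state): {counter: value}} for every idle state found."""
    snap = {}
    for state_dir in Path(root).glob("cpu[0-9]*/cpuidle/state[0-9]*"):
        cpu = int(state_dir.parent.parent.name[len("cpu"):])
        state = int(state_dir.name[len("state"):])
        values = {}
        for name in COUNTERS:
            try:
                values[name] = int((state_dir / name).read_text().strip())
            except (OSError, ValueError):
                values[name] = 0  # counter missing or unreadable
        snap[(cpu, state)] = values
    return snap


def delta_per_state(before, after):
    """Sum (after - before) over all CPUs, grouped by idle state index."""
    totals = defaultdict(lambda: dict.fromkeys(COUNTERS, 0))
    zero = dict.fromkeys(COUNTERS, 0)
    for (cpu, state), now in after.items():
        old = before.get((cpu, state), zero)
        for name in COUNTERS:
            totals[state][name] += now[name] - old[name]
    return dict(totals)
```

Typical use would be to call snapshot() before the workload and again
after it, then pass both results to delta_per_state() to get per-state
totals across all CPUs.
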
>
>>> Apologies for the late reply, I'm attaching a tar ball which has the cpu
>>> states for the test suites before and after tests. The folders with the
>>> name of the test contain two folders good-kernel and bad-kernel
>>> containing two files having the before and after states. Please note
>>> that different machines were used for different test suites due to
>>> compatibility reasons. The jbb test was run using containers.
>
> Please provide the results of the test runs that were done for
> the supplied before and after idle data.
> In particular, what is the "fio" test, and what are its results? Its idle data is not very revealing.
> Is it a test I can run on my test computer?
I see that I have fio installed on my test computer.
>> It is a considerable amount of work to manually extract and summarize the data.
>> I have only done it for the phoronix-sqlite data.
>
> I have done the rest now, see below.
> I have also attached the results, in case the formatting gets screwed up.
>
>> There seem to be 40 CPUs and 5 idle states, with idle state 3 disabled by default.
>> I remember seeing a linux-pm email about why, but couldn't find it just now.
>> Summary (also attached as a PNG file, in case the formatting gets messed up):
>> The total idle entries (usage) and time seem low to me, which is why the ???.
>>
>> phoronix-sqlite
>> Good Kernel: Time between samples 4 seconds (estimated and ???)
>>             Usage    Above    Below   Above%   Below%
>> state 0       220        0      218    0.00%   99.09%
>> state 1     70212     5213    34602    7.42%   49.28%
>> state 2     30273     5237     1806   17.30%    5.97%
>> state 3         0        0        0    0.00%    0.00%
>> state 4     11824     2120        0   17.93%    0.00%
>>
>> total      112529    12570    36626   43.72% <<< Misses %
>>
>> Bad Kernel: Time between samples 3.8 seconds (estimated and ???)
>>             Usage    Above    Below   Above%   Below%
>> state 0       262        0      260    0.00%   99.24%
>> state 1     62751     3985    35588    6.35%   56.71%
>> state 2     24941     7896     1433   31.66%    5.75%
>> state 3         0        0        0    0.00%    0.00%
>> state 4     24489    11543        0   47.14%    0.00%
>>
>> total      112443    23424    37281   53.99% <<< Misses %
>>
>> Observe 2X use of idle state 4 for the "Bad Kernel"
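
For reference, the percentage columns and the total "Misses %" in these
tables appear consistent with above/usage, below/usage and
(above + below)/usage. A minimal Python sketch under that assumption
follows. The function name miss_stats() is hypothetical, and the input
rows are the phoronix-sqlite "Good Kernel" counters from the table above.

```python
def miss_stats(rows):
    """rows: list of (usage, above, below) tuples, one per idle state.

    Returns (per_state, total_pct) where per_state is a list of
    (above_pct, below_pct) and total_pct is the overall miss percentage.
    """
    per_state = []
    for usage, above, below in rows:
        if usage:
            per_state.append((100.0 * above / usage, 100.0 * below / usage))
        else:
            per_state.append((0.0, 0.0))  # disabled or unused state
    total_usage = sum(usage for usage, _, _ in rows)
    total_miss = sum(above + below for _, above, below in rows)
    total_pct = 100.0 * total_miss / total_usage if total_usage else 0.0
    return per_state, total_pct


# phoronix-sqlite, "Good Kernel": (usage, above, below) for states 0..4
good = [
    (220, 0, 218),
    (70212, 5213, 34602),
    (30273, 5237, 1806),
    (0, 0, 0),
    (11824, 2120, 0),
]
per_state, total = miss_stats(good)
print(f"total misses: {total:.2f}%")
```
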
>>
>> I have a template now, and can summarize the other 40 CPU data
>> faster, but I would have to rework the template for the 56 CPU data.
>> Also, is there a 64 CPU data set with 4 idle states per CPU?
>
> jbb: 40 CPUs; 5 idle states, with idle state 3 defaulting to disabled.
> POLL, C1, C1E, C3 (disabled), C6
>
> Good Kernel: Time between samples > 2 hours (estimated)
>               Usage      Above      Below     Above%     Below%
> state 0      297550          0     296084      0.00%     99.51%
> state 1     8062854     341043    4962635      4.23%     61.55%
> state 2    56708358   12688379    6252051     22.37%     11.02%
> state 3           0          0          0      0.00%      0.00%
> state 4    54624476   15868752          0     29.05%      0.00%
>
> total     119693238   28898174   11510770     33.76% <<< Misses
>
> Bad Kernel: Time between samples > 2 hours (estimated)
>               Usage      Above      Below     Above%     Below%
> state 0       90715          0      75134      0.00%     82.82%
> state 1     8878738     312970    6082180      3.52%     68.50%
> state 2    12048728    2576251     603316     21.38%      5.01%
> state 3           0          0          0      0.00%      0.00%
> state 4    85999424   44723273          0     52.00%      0.00%
>
> total     107017605   47612494    6760630     50.81% <<< Misses
>
> As with the previous test, observe 1.6X use of idle state 4 for the "Bad Kernel"
>
> fio: 64 CPUs; 4 idle states; POLL, C1, C1E, C6.
>
> fio
> Good Kernel: Time between samples ~ 1 minute (estimated)
>             Usage    Above    Below   Above%   Below%
> state 0      3822        0     3818    0.00%   99.90%
> state 1    148640     4406    68956    2.96%   46.39%
> state 2    593455    45344   105675    7.64%   17.81%
> state 3   3209648   807014        0   25.14%    0.00%
>
> total     3955565   856764   178449   26.17% <<< Misses
>
> Bad Kernel: Time between samples ~ 1 minute (estimated)
>             Usage    Above    Below   Above%   Below%
> state 0       916        0      756    0.00%   82.53%
> state 1     80230     2028    42791    2.53%   53.34%
> state 2     59231     6888     6791   11.63%   11.47%
> state 3   2455784   564797        0   23.00%    0.00%
>
> total     2596161   573713    50338   24.04% <<< Misses
>
> It is not clear why the number of idle entries differs so much
> between the tests, but there is a bit of a different distribution
> of the workload among the CPUs.
>
> rds-stress: 56 CPUs; 5 idle states, with idle state 3 defaulting to disabled.
> POLL, C1, C1E, C3 (disabled), C6
>
> rds-stress-test
> Good Kernel: Time between samples ~70 Seconds (estimated)
>             Usage    Above    Below   Above%   Below%
> state 0      1561        0     1435    0.00%   91.93%
> state 1     13855      899     2410    6.49%   17.39%
> state 2    467998   139254    23679   29.76%    5.06%
> state 3         0        0        0    0.00%    0.00%
> state 4    213132   107417        0   50.40%    0.00%
>
> total      696546   247570    27524   39.49% <<< Misses
>
> Bad Kernel: Time between samples ~ 70 Seconds (estimated)
>             Usage    Above    Below   Above%   Below%
> state 0       231        0      231    0.00%  100.00%
> state 1      5413      266     1186    4.91%   21.91%
> state 2     54365      719     3789    1.32%    6.97%
> state 3         0        0        0    0.00%    0.00%
> state 4    267055   148327        0   55.54%    0.00%
>
> total      327064   149312     5206   47.24% <<< Misses
>
> Again, differing numbers of idle entries between tests.
> This time the load distribution between CPUs is more
> obvious. In the "Bad" case most work is done on 2 or 3 CPUs.
> In the "Good" case the work is distributed over more CPUs.
> I assume, without proof, that the scheduler is deciding not to migrate
> the next bit of work to another CPU in one case versus the other.

The above is incorrect. The CPUs involved in the "Good"
and "Bad" tests are very similar: mainly 2 CPUs, with a little
use of a 3rd and 4th. See the attached graph for more detail.

All of the tests show higher usage of shallower idle states with
the "Good" kernel versus the "Bad" kernel, which was the expectation
of the original patch, as has been mentioned a few times in the emails.

My input is to revert the reversion.

... Doug

[-- Attachment #2: usage-v-cpu-v-state.png --]
[-- Type: image/png, Size: 104684 bytes --]