From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id A06B02FC86C; Tue, 14 Oct 2025 17:19:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760462399; cv=none; b=gj5Jy47rjKZPGEkxVX70g1L4HPKVm5VowlVxbVdvoA11ZhQzwjMZou6Mczp7+vsKYnLWsElHxbj3xYvRnjKKjT7irnC2GzZDLe0lqNjTXX2GN/V6GmNGhKSIapHo/nwW2UmKmS4H6WEhpWeyzzS9P5/FARHBZ3adRlS7wgtR/co= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760462399; c=relaxed/simple; bh=+Rok64GjH6bMzjCNLJns+E/72zeS8NKwoze0SUab/P0=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=WpmDhUWNdQhJ5OUgkAwVpOi8Xhja11OkDd98UfTvjco/4amlj0sD46xiSSBJslHZX38MYr2q8JXcdvpOvR/EJNYKDs3KSPhOsflWtiDFOdB4KMD1SQH6B0928X9xktEQmmdgG4ri+4s6ul04+amTU5DEdq2XZnAP++HlLY4oJ5M= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D193B1A9A; Tue, 14 Oct 2025 10:19:47 -0700 (PDT) Received: from [10.1.36.66] (unknown [10.1.36.66]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 847AB3F66E; Tue, 14 Oct 2025 10:19:53 -0700 (PDT) Message-ID: Date: Tue, 14 Oct 2025 18:19:51 +0100 Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: stable: commit "cpuidle: menu: Avoid discarding useful information" causes regressions To: "Rafael J. Wysocki" , Sergey Senozhatsky Cc: "Rafael J. Wysocki" , Greg Kroah-Hartman , Artem Bityutskiy , Sasha Levin , Daniel Lezcano , linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, Tomasz Figa , stable@vger.kernel.org References: <36iykr223vmcfsoysexug6s274nq2oimcu55ybn6ww4il3g3cv@cohflgdbpnq7> <08529809-5ca1-4495-8160-15d8e85ad640@arm.com> <2zreguw4djctgcmvgticnm4dctcuja7yfnp3r6bxaqon3i2pxf@thee3p3qduoq> <8da42386-282e-4f97-af93-4715ae206361@arm.com> <49cf14a1-b96f-4413-a17e-599bc1c104cd@arm.com> Content-Language: en-US From: Christian Loehle In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 10/14/25 16:54, Rafael J. Wysocki wrote: > On Tue, Oct 14, 2025 at 5:11 PM Christian Loehle > wrote: >> >> On 10/14/25 12:55, Sergey Senozhatsky wrote: >>> On (25/10/14 11:25), Christian Loehle wrote: >>>> On 10/14/25 11:23, Sergey Senozhatsky wrote: >>>>> On (25/10/14 10:50), Christian Loehle wrote: >>>>>>> Upstream fixup fa3fa55de0d ("cpuidle: governors: menu: Avoid using >>>>>>> invalid recent intervals data") doesn't address the problems we are >>>>>>> observing. Revert seems to be bringing performance metrics back to >>>>>>> pre-regression levels. >>>>>> >>>>>> Any details would be much appreciated. >>>>>> How do the idle state usages differ with and without >>>>>> "cpuidle: menu: Avoid discarding useful information"? >>>>>> What do the idle states look like in your platform? >>>>> >>>>> Sure, I can run tests. How do I get the numbers/stats >>>>> that you are asking for? >>>> >>>> Ideally just dump >>>> cat /sys/devices/system/cpu/cpu*/cpuidle/state*/* >>>> before and after the test. >>> >>> OK, got some data for you. The terminology being used here is as follows: >>> >>> - 6.1-base >>> is 6.1 stable with a9edb700846 "cpuidle: menu: Avoid discarding useful information" >>> >>> - 6.1-base-fixup >>> is 6.1 stable with a9edb700846 and fa3fa55de0d6 "cpuidle: governors: >>> menu: Avoid using invalid recent intervals data" cherry-pick >>> >>> - 6.1-revert >>> is 6.1 stable with a9edb700846 reverted (and no fixup commit, obviously) >>> >>> Just to show the scale of regression, results of some of the benchmarks: >>> >>> 6.1-base: 84.5 >>> 6.1-base-fixup: 76.5 >>> 6.1-revert: 59.5 >>> >>> (lower is better, 6.1-revert has the same results as previous stable >>> kernels). >> This immediately threw me off. >> The fixup was written for a specific system which had completely broken >> cpuidle. It shouldn't affect any sane system significantly. >> I double checked the numbers and your system looks fine, in fact none of >> the tests had any rejected cpuidle occurrences. So functionally base and >> base-fixup are identical for you. The cpuidle numbers are also reasonably >> 'in the noise', so just for the future some stats would be helpful on those >> scores. >> >> I can see a huge difference between base and revert in terms of cpuidle, >> so that's enough for me to take a look, I'll do that now. >> (6.1-revert has more C3_ACPI in favor of C1_ACPI.) >> >> (Also I can't send this email without at least recommending teo instead of menu >> for your platform / use-cases, if you deemed it unfit I'd love to know what >> didn't work for you!) > > Well, yeah. > > So I've already done some analysis. > > There are 4 C-states, POLL, C1, C6 and C10 (at least that's what the > MWAIT hints tell me). > > This is how many times each of them was requested during the workload > run on base 6.1.y: > > POLL: 21445 > C1: 2993722 > C6: 767029 > C10: 736854 > > and in percentage of the total idle state requests: > > POLL: 0,47% > C1: 66,25% > C6: 16,97% > C10: 16,31% > > With the problematic commit reverted, this became > > POLL: 16092 > C1: 2452591 > C6: 750933 > C10: 1150259 > > and (again) in percentage of the total: > > POLL: 0,37% > C1: 56,12% > C6: 17,18% > C10: 26,32% > > Overall, POLL is negligible and the revet had no effect on the number > of times C6 was requested. The difference is for C1 and C10 and it's > 10% in both cases, but going in opposite directions so to speak: C1 > was requested 10% less and C10 was requested 10% more after the > revert. > > Let's see how this corresponds to the residency numbers. > > For base 6.1.y there was > > POLL: 599883 > C1: 732303748 > C6: 576785253 > C10: 2020491489 > > and in percentage of the total > > POLL: 0,02% > C1: 21,99% > C6: 17,32% > C10: 60,67% > > After the revert it became > > POLL: 469451 > C1: 517623465 > C6: 508945687 > C10: 2567701673 > > and in percentage of the total > > POLL: 0,01% > C1: 14,40% > C6: 14,16% > C10: 71,43% > > so with the revert the CPUs spend around 7% more time in deep idle > states (C6 and C10 combined). > > I have to say that this is consistent with the intent of the > problematic commit, which is to reduce the number of times the deepest > idle state is requested although it is likely to be too deep. > > However, on the system in question this somehow causes performance to > drop significantly (even though shallow idle states are used more > often which should result in lower average idle state exit latency and > better performance). > > One possible explanation is that this somehow affects turbo > frequencies. That is, requesting shallower idle states on idle CPUs > prevents the other CPUs from getting sufficiently high turbo. > > Sergey, can you please run the workload under turbostat on the base > 6.1.y and on 6.1.y with the problematic commit reverted and send the > turbostat output from both runs (note: turbostat needs to be run as > root)? That's the most plausible explanation and would also be my guess. FWIW most of the C3_ACPI (== C10) with revert are objectively wrong with 78% idle misses (they were already pretty high with base around 72.5%). I'll leave this here for easier following: ===== 6.1-base: after minus before deltas (aggregated across CPUs) ===== +---------+-------------+------------+--------------+---------------+------------+------------+---------+ | state | time_diff_s | usage_diff | avg_resid_us | rejected_diff | above_diff | below_diff | share_% | +---------+-------------+------------+--------------+---------------+------------+------------+---------+ | POLL | 0.600 | 21,445 | 28.0 | 0 | 0 | 19,846 | 0.02 | | C1_ACPI | 732.304 | 2,993,722 | 244.6 | 0 | 3,816 | 280,613 | 21.99 | | C2_ACPI | 576.785 | 767,029 | 752.0 | 0 | 272,105 | 453 | 17.32 | | C3_ACPI | 2,020.491 | 736,854 | 2,742.1 | 0 | 534,424 | 0 | 60.67 | | TOTAL | 3,330.180 | 4,519,050 | | 0 | 810,345 | 300,912 | 100.00 | +---------+-------------+------------+--------------+---------------+------------+------------+---------+ ===== 6.1-revert: after minus before deltas (aggregated across CPUs) ===== +---------+-------------+------------+--------------+---------------+------------+------------+---------+ | state | time_diff_s | usage_diff | avg_resid_us | rejected_diff | above_diff | below_diff | share_% | +---------+-------------+------------+--------------+---------------+------------+------------+---------+ | POLL | 0.469 | 16,092 | 29.2 | 0 | 0 | 14,855 | 0.01 | | C1_ACPI | 517.623 | 2,452,591 | 211.1 | 0 | 4,109 | 150,500 | 14.40 | | C2_ACPI | 508.946 | 750,933 | 677.8 | 0 | 327,457 | 427 | 14.16 | | C3_ACPI | 2,567.702 | 1,150,259 | 2,232.3 | 0 | 895,311 | 0 | 71.43 | | TOTAL | 3,594.740 | 4,369,875 | | 0 | 1,226,877 | 165,782 | 100.00 | +---------+-------------+------------+--------------+---------------+------------+------------+---------+ ===== 6.1-revert minus 6.1-base (state-by-state deltas of the deltas) ===== +---------+-----------+----------+----------+---------------+----------+----------+ | state | Δshare_pp | Δusage | Δtime_s | Δavg_resid_us | Δabove | Δbelow | +---------+-----------+----------+----------+---------------+----------+----------+ | POLL | -0.00 | -5,353 | -0.130 | 1.2 | +0 | -4,991 | | C1_ACPI | -7.59 | -541,131 | -214.680 | -33.6 | +293 | -130,113 | | C2_ACPI | -3.16 | -16,096 | -67.840 | -74.2 | +55,352 | -26 | | C3_ACPI | +10.76 | +413,405 | 547.210 | -509.8 | +360,887 | +0 | | TOTAL | +0.00 | -149,175 | 264.560 | | +416,532 | -135,130 | +---------+-----------+----------+----------+---------------+----------+----------+