From: "Doug Smythies"
To: "'Christian Loehle'"
Subject: RE: [PATCHv2 0/3] cpuidle: teo: Fixing utilization and intercept logic
Date: Sun, 16 Jun 2024 17:20:43 -0700
Message-ID: <004a01dac04c$314c4360$93e4ca20$@telus.net>
In-Reply-To: <20240611112413.1241352-1-christian.loehle@arm.com>
References: <20240611112413.1241352-1-christian.loehle@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2024.06.11 04:24 Christian Loehle wrote:

...
> Happy for anyone to take a look and test as well.
...

I tested the patch set.
I do a set of tests adopted over some years now.
Readers may recall that some of the tests search over a wide range of operating conditions looking for areas to focus on in more detail.
One interesting observation is that everything seems to run much slower than the last time I did this, last August, Kernel 6.5-rc4.

Test system:
Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz (6 cores, 2 threads per core, 12 CPUs)
CPU Frequency scaling driver: intel_pstate
HWP (HardWare Pstate) control: Disabled
CPU frequency scaling governor: Performance
Idle states: 4: name : description:
   state0/name:POLL     desc:CPUIDLE CORE POLL IDLE
   state1/name:C1_ACPI  desc:ACPI FFH MWAIT 0x0
   state2/name:C2_ACPI  desc:ACPI FFH MWAIT 0x30
   state3/name:C3_ACPI  desc:ACPI FFH MWAIT 0x60
Idle driver: intel_idle
Idle governor: as per individual test
Kernel: 6.10-rc2 and with V1 and V2 patch sets (1000 Hz tick rate)

Legend:
teo: unmodified 6.10-rc2
menu:
ladder:
cl: Kernel 6.10-rc2 + Christian Loehle patch set V1
clv2: Kernel 6.10-rc2 + Christian Loehle patch set V2

System is extremely idle, other than the test work.

Test 1: 2 core ping pong sweep:

Pass a token between 2 CPUs on 2 different cores.
Do a variable amount of work at each stop.

Purpose: To utilize the shallowest idle states
and observe the transition from using more of 1
idle state to another.

Results relative to teo (negative is better):
          menu      ladder    clv2      cl
average   -2.09%    11.11%    2.88%     1.81%
max       10.63%    33.83%    9.45%     10.13%
min       -11.58%   6.25%     -3.61%    -3.34%

While there are a few operating conditions where clv2 performs better than teo, overall it is worse.

Further details:
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/2-1/2-core-pp-relative.png
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/2-1/2-core-pp-data.png
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/2-1/perf/

Test 2: 6 core ping pong sweep:

Pass a token between 6 CPUs on 6 different cores.
Do a variable amount of work at each stop.

Purpose: To utilize the midrange idle states
and observe the transitions between use of
idle states.

Note: This test has uncertainty in an area where the performance is bi-stable for all idle governors,
transitioning between much less power and slower performance and much more power and higher performance.
On either side of this area, the differences between all idle governors are negligible.
Only data from before this area (from results 1 to 95) was included in the below results.

Results relative to teo (negative is better):
          menu      ladder    cl        clv2
average   0.16%     4.32%     2.54%     2.64%
max       0.92%     14.32%    8.78%     8.50%
min       -0.44%    0.27%     0.09%     0.05%

One large clv2 difference seems to be excessive use of the deepest idle state,
with corresponding 100% hit rate on the "Idle State 3 was too deep" metric.
Example (20 second sample time):

teo: Idle state 3 entries: 600, 74.33% were too deep or 451. Processor power was 38.0 watts.
clv2: Idle state 3 entries: 4,375,243, 100.00% were too deep or 4,375,243. Processor power was 40.6 watts.
clv2 loop times were about 8% worse than teo.

Further details:
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/6-1/6-core-pp-data-detail-a.png
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/6-1/6-core-pp-data-detail-b.png
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/6-1/6-core-pp-data.png
http://smythies.com/~doug/linux/idle/teo-util3/ping-sweep/6-1/perf/

Test 3: sleeping ebizzy - 128 threads.

Purpose: This test has given interesting results in the past.
The test varies the sleep interval between record lookups.
The result is varying usage of idle states.

Results: relative to teo (negative is better):
          menu      clv2      ladder    cl
average   0.06%     0.38%     0.81%     0.35%
max       2.53%     3.20%     5.00%     2.87%
min       -2.13%    -1.66%    -3.30%    -2.13%

No strong conclusion here, from just the data.
However, clv2 seems to use a bit more processor power, on average.

Further details:

Test 4: adrestia wakeup latency tests. 500 threads.

Purpose: The test was reported in 2023.09 by the kernel test robot and looked
both interesting and gave interesting results, so I added it to the tests I run.

Results:
teo:wakeup cost (periodic, 20us): 3130nSec reference
clv2:wakeup cost (periodic, 20us): 3179nSec +1.57%
cl:wakeup cost (periodic, 20us): 3206nSec +2.43%
menu:wakeup cost (periodic, 20us): 2933nSec -6.29%
ladder:wakeup cost (periodic, 20us): 3530nSec +12.78%

No strong conclusion here, from just the data.
However, clv2 seems to use a bit more processor power, on average.
teo: 69.72 watts
clv2: 72.91 watts +4.6%

Note 1: The first 5 minutes of the test powers were discarded to allow for thermal stabilization.
Note 2: More work is required for this test, because the teo one actually took longer to execute,
due to more outlier results than the other tests.

Several other tests were run but are not written up herein.
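
For anyone wanting to check the "relative to teo" figures, here is a minimal sketch of how the Test 4 percentages can be reproduced from the reported wakeup costs. It assumes the relative value is simply (value - teo) / teo; the function and variable names are made up for illustration and are not from any test script.

```python
# Wakeup costs in nanoseconds, copied from the Test 4 results.
wakeup_cost_ns = {
    "teo": 3130,
    "clv2": 3179,
    "cl": 3206,
    "menu": 2933,
    "ladder": 3530,
}


def relative_to_reference(value, reference):
    """Percent difference from the reference; negative means lower (better)."""
    return (value - reference) / reference * 100.0


# teo is the reference governor, as in the tables in this message.
reference = wakeup_cost_ns["teo"]
relative = {
    governor: round(relative_to_reference(cost, reference), 2)
    for governor, cost in wakeup_cost_ns.items()
}

for governor, pct in relative.items():
    print(f"{governor:>6}: {wakeup_cost_ns[governor]} nSec  {pct:+.2f}%")
```

The same calculation applied to the processor power numbers (72.91 watts versus 69.72 watts) gives the +4.6% shown for clv2.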