From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Doug Smythies"
To: "'Rafael J. Wysocki'"
Cc: "'Mel Gorman'", "'Rafael Wysocki'", "'Jörg Otte'", "'Linux Kernel Mailing List'", "'Linux PM'", "'Srinivas Pandruvada'", "Doug Smythies"
Subject: RE: Performance of low-cpu utilisation benchmark regressed severely since 4.6
Date: Sun, 23 Apr 2017 08:31:25 -0700
Message-ID: <000501d2bc46$ad4b1fc0$07e15f40$@net>
References: <20170410084117.rjh3mtdx7hd2i5ze@techsingularity.net> <000a01d2b9e6$393afef0$abb0fcd0$@net> <000301d2bb31$c0037790$400a66b0$@net> 22LpdqAXDopZn22LudrJa9
In-Reply-To: 22LpdqAXDopZn22LudrJa9
Sender: linux-kernel-owner@vger.kernel.org
List-ID: X-Mailing-List:
linux-kernel@vger.kernel.org

On 2017.04.22 14:08 Rafael wrote:
> On Friday, April 21, 2017 11:29:06 PM Doug Smythies wrote:
>> On 2017.04.20 18:18 Rafael wrote:
>>> On Thursday, April 20, 2017 07:55:57 AM Doug Smythies wrote:
>>>> On 2017.04.19 01:16 Mel Gorman wrote:
>>>>> On Fri, Apr 14, 2017 at 04:01:40PM -0700, Doug Smythies wrote:
>>>>>> Hi Mel,
>>>
>>> [cut]
>>>
>>>>> And the revert does help, albeit not being an option for reasons Rafael
>>>>> covered.
>>>>
>>>> New data point: kernel 4.11-rc7 intel_pstate, powersave forcing the
>>>> load-based algorithm: Elapsed 3178 seconds.
>>>>
>>>> If I understand your data correctly, my load-based results are the
>>>> opposite of yours.
>>>>
>>>> Mel: 4.11-rc5 vanilla: Elapsed mean: 3750.20 seconds
>>>> Mel: 4.11-rc5 load based: Elapsed mean: 2503.27 seconds
>>>> Or: 33.25%
>>>>
>>>> Doug: 4.11-rc6 stock: Elapsed total (5 runs): 2364.45 seconds
>>>> Doug: 4.11-rc7 force load based: Elapsed total (5 runs): 3178 seconds
>>>> Or: -34.4%
>>>
>>> I wonder if you can do the same thing I've just advised Mel to do. That is,
>>> take my linux-next branch:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
>>>
>>> (which is new material for 4.12 on top of 4.11-rc7) and reduce
>>> INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL (in intel_pstate.c) in it by 1/2
>>> (force load-based if need be; I'm not sure what the PM profile of your
>>> test system is).
>>
>> I did not need to force load-based. I do not know how to figure it out from
>> an acpidump the way Srinivas does, so I did a trace and determined which
>> algorithm it was using from the data.
>>
>> Reference test, before changing INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL:
>> 3239.4 seconds.
>>
>> Test after changing INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL:
>> 3195.5 seconds.
>
> So it does have an effect, but relatively small.

I don't know how repeatable the test results are, i.e. whether the 1.36%
change is within experimental error or not.
That being said, the trend does seem consistent.

> I wonder if further reducing INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL to 2 ms
> will make any difference.

I went all the way to 1 ms, just for the test: 3123.9 seconds.

>> By far, and with any code, I get the fastest elapsed time, of course next
>> to performance mode, but not by much, by limiting the test to use just
>> 1 CPU: 1814.2 seconds.
>
> Interesting.
>
> It looks like the cost is mostly related to moving the load from one CPU to
> another and waiting for the new one to ramp up then.
>
> I guess the workload consists of many small tasks that each start on new
> CPUs and cause that ping-pong to happen.

Yes, and (from the trace data) many tasks are very, very small. Also, the
test appears to take a few holidays, of up to 1 second, during execution.

>> (performance governor, restated from a previous e-mail: 1776.05 seconds)
>
> But that causes the processor to stay in the maximum sustainable P-state all
> the time, which on Sandy Bridge is quite costly energetically.

Agreed. I only provide these data points as a reference and so that we know
what the boundary conditions (limits) are.

> We can do one more trick I forgot about. Namely, if we are about to increase
> the P-state, we can jump to the average of the target and the max instead of
> just the target, like in the appended patch (on top of linux-next).
>
> That will make the P-state selection really aggressive, and so costly
> energetically, but it should allow small jumps of the average load above 0
> to cause big jumps of the target P-state.

I'm already seeing the energy costs of some of this stuff.

Result: 3050.2 seconds. Idle power: 4.06 watts.

Idle power for kernel 4.11-rc7 (performance-based): 3.89 watts.
Idle power for kernel 4.11-rc7, using load-based: 4.01 watts.
Idle power for kernel 4.11-rc7 next linux-pm: 3.91 watts.