From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Doug Smythies"
Subject: RE: [PATCH] cpuidle: use high confidence factors only when considering polling
Date: Fri, 18 Mar 2016 13:59:39 -0700
Message-ID: <004901d18159$18967100$49c35300$@net>
References: <20160316121400.680a6a46@annuminas.surriel.com>
 <10828426.sI6CaBvZhk@vostro.rjw.lan>
 <000701d180df$e8a14340$b9e3c9c0$@net>
 <003301d18144$87bb8df0$9732a9d0$@net>
 <20160318152957.5c3b91bc@annuminas.surriel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from cmta13.telus.net ([209.171.16.86]:51614 "EHLO cmta13.telus.net"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751838AbcCRU7n (ORCPT ); Fri, 18 Mar 2016 16:59:43 -0400
In-Reply-To: <20160318152957.5c3b91bc@annuminas.surriel.com>
Content-Language: en-ca
Sender: linux-pm-owner@vger.kernel.org
List-Id: linux-pm@vger.kernel.org
To: 'Rik van Riel'
Cc: "'Rafael J. Wysocki'" , "'Rafael J. Wysocki'" , 'Viresh Kumar' ,
 'Srinivas Pandruvada' , "'Chen, Yu C'" , linux-pm@vger.kernel.org,
 'Arto Jantunen' , 'Len Brown'

On 2016.03.18 12:30 Rik van Riel wrote:
> On Fri, 18 Mar 2016 11:32:28 -0700 Doug Smythies wrote:
>> On 2016.03.18 06:12 Rafael J. Wysocki wrote:
>>> I'm wondering what happens if you replace the expected_interval in the
>>> "expected_interval >
>>> drv->states[CPUIDLE_DRIVER_STATE_START].target_residency" test with
>>> data->next_timer_us (with the Rik's patch applied, of course). Can
>>> you please try doing that?
>>
>> O.K. my reference: rvr6 is the above modification to rvr5
>> It works as well as "reverted".
>>
>> State   k45rc7-rjw10-rvr6 (mins)
>> 0.00      0.87
>> 1.00     24.20
>> 2.00      4.05
>> 3.00      1.72
>> 4.00    147.50
>>
>> total   178.34
>>
>> Energy:
>> Kernel 4.5-rc7-rjw10-rvr6: 55864 Joules
>>
>> Trace data (very crude summary):
>> Kernel 4.5-rc7-rjw10-rvr5: ~3049 long durations at high CPU load (idle state 0)
>> Kernel 4.5-rc7-rjw10-rvr5: ~183 long durations at high, but less, CPU load (not all idle state 0)
>
> What does "long duration" mean?
> Dozens of microseconds?
> Hundreds of microseconds?
> Milliseconds?

On average, hundreds of milliseconds, and as much as 4 seconds.

Specifically, for the Kernel 4.5-rc7-rjw10-rvr5 case, over the ~3049
long durations:
The average load was 97.2% and the average "long" duration was 295.2 ms.

Example 1: CPU 5 load 99.96% duration 1.96 seconds.
Example 2: CPU 5 load 99.74% duration 2.68 seconds.
Example 3: CPU 7 load 97.86% duration 2.30 seconds.

So, to repeat what I said the other day, but in another way:
The estimate can be correct 99.9% (or even more) of the time, but when
it is wrong and the CPU gets left in idle state 0, it can sometimes
stay there for a very long time.

> Either way, it appears there is something wrong with the
> code in get_typical_interval. One of the problems is
> that calculating in microseconds, when working with a
> threshold of 1-2 microseconds is not going to work well,
> and secondly the code declares success the moment the
> standard deviation is below 20 microseconds, which is
> also not the best idea when dealing with 1-2 microsecond
> thresholds :)
>
> Does the below patch help?

I'll report back later on that part.
The test computer is busy with something else at the moment.

... Doug