From: "Doug Smythies" <dsmythies@telus.net>
To: "'Rafael J. Wysocki'" <rafael@kernel.org>
Cc: "'Rafael J. Wysocki'" <rjw@rjwysocki.net>,
"'Srinivas Pandruvada'" <srinivas.pandruvada@linux.intel.com>,
"'Peter Zijlstra'" <peterz@infradead.org>,
"'LKML'" <linux-kernel@vger.kernel.org>,
"'Frederic Weisbecker'" <frederic@kernel.org>,
"'Mel Gorman'" <mgorman@suse.de>,
"'Daniel Lezcano'" <daniel.lezcano@linaro.org>,
"'Chen, Hu'" <hu1.chen@intel.com>,
"'Quentin Perret'" <quentin.perret@arm.com>,
"'Linux PM'" <linux-pm@vger.kernel.org>,
"'Giovanni Gherdovich'" <ggherdovich@suse.cz>
Subject: RE: [RFC/RFT][PATCH v8] cpuidle: New timer events oriented governor for tickless systems
Date: Mon, 7 Oct 2019 23:20:38 -0700
Message-ID: <000d01d57da0$8410f1c0$8c32d540$@net>
In-Reply-To: <CAJZ5v0jo-KQouuE3P51THvU33kViBVtDq1WknBFx+FWUY0e=ag@mail.gmail.com>
On 2019.10.06 08:34 Rafael J. Wysocki wrote:
> On Sun, Oct 6, 2019 at 4:46 PM Doug Smythies <dsmythies@telus.net> wrote:
>> On 2019.10.01 02:32 Rafael J. Wysocki wrote:
>>> On Sun, Sep 29, 2019 at 6:05 PM Doug Smythies <dsmythies@telus.net> wrote:
>>>> On 2019.09.26 09:32 Doug Smythies wrote:
>>>>
>>>>> If the deepest idle state is disabled, the system
>>>>> can become somewhat unstable, with anywhere between no problem
>>>>> at all, to the occasional temporary jump using a lot more
>>>>> power for a few seconds, to a permanent jump using a lot more
>>>>> power continuously. I have been unable to isolate the exact
>>>>> test load conditions under which this will occur. However,
>>>>> temporarily disabling and then enabling other idle states
>>>>> seems to make for a somewhat repeatable test. Note that the
>>>>> issue also occurs when only the deepest idle state is ever
>>>>> disabled, just not reliably.
>>>>>
>>>>> I want to know how you want to proceed before I do a bunch of
>>>>> regression testing.
>>>>
>> I do not think I stated it clearly before: The problem here is that some CPUs
>> seem to get stuck in idle state 0, and when they do, power consumption spikes,
>> often by several hundred percent and often indefinitely.
>
> That indeed has not been clear to me, thanks for the clarification!
>
>> I made a hack job automated test:
>> Kernel         tests   fail rate
>> 5.4-rc1         6616   13.45%
>> 5.3             2376    4.50%
>> 5.3-teov7      12136    0.00%  <<< teo.c reverted and teov7 put in its place.
>> 5.4-rc1-ds     11168    0.00%  <<< [old] proposed patch (> 7 hours test time)
   5.4-rc1-ds12    4224    0.005  <<< new proposed patch
>>
>> [old] Proposed patch (on top of kernel 5.4-rc1): [deleted]
> This change may cause the deepest state to be selected even if its
> "hits" metric is less than the "misses" one AFAICS, in which case the
> max_early_index state should be selected instead.
>
> It looks like the max_early_index computation is broken when the
> deepest state is disabled.
O.K. Thanks for your quick reply and insight.
I think long durations always need to be counted, but currently, if
the deepest idle state is disabled, they are not.
How about this?
(Test results are added above; more tests are pending if this might be a path forward.)
diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
index b5a0e49..a970d2c 100644
--- a/drivers/cpuidle/governors/teo.c
+++ b/drivers/cpuidle/governors/teo.c
@@ -155,10 +155,12 @@ static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 		cpu_data->states[i].early_hits -= early_hits >> DECAY_SHIFT;
-		if (drv->states[i].target_residency <= sleep_length_us) {
-			idx_timer = i;
-			if (drv->states[i].target_residency <= measured_us)
-				idx_hit = i;
+		if (!(drv->states[i].disabled || dev->states_usage[i].disable)){
+			if (drv->states[i].target_residency <= sleep_length_us) {
+				idx_timer = i;
+				if (drv->states[i].target_residency <= measured_us)
+					idx_hit = i;
+			}
 		}
 	}
@@ -256,39 +258,25 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		struct cpuidle_state *s = &drv->states[i];
 		struct cpuidle_state_usage *su = &dev->states_usage[i];
-		if (s->disabled || su->disable) {
-			/*
-			 * If the "early hits" metric of a disabled state is
-			 * greater than the current maximum, it should be taken
-			 * into account, because it would be a mistake to select
-			 * a deeper state with lower "early hits" metric. The
-			 * index cannot be changed to point to it, however, so
-			 * just increase the max count alone and let the index
-			 * still point to a shallower idle state.
-			 */
-			if (max_early_idx >= 0 &&
-			    count < cpu_data->states[i].early_hits)
-				count = cpu_data->states[i].early_hits;
-
-			continue;
-		}
-		if (idx < 0)
-			idx = i; /* first enabled state */
+		if (!(s->disabled || su->disable)) {
+			if (idx < 0)
+				idx = i; /* first enabled state */
-		if (s->target_residency > duration_us)
-			break;
+			if (s->target_residency > duration_us)
+				break;
-		if (s->exit_latency > latency_req && constraint_idx > i)
-			constraint_idx = i;
+			if (s->exit_latency > latency_req && constraint_idx > i)
+				constraint_idx = i;
-		idx = i;
+			idx = i;
-		if (count < cpu_data->states[i].early_hits &&
-		    !(tick_nohz_tick_stopped() &&
-		      drv->states[i].target_residency < TICK_USEC)) {
-			count = cpu_data->states[i].early_hits;
-			max_early_idx = i;
+			if (count < cpu_data->states[i].early_hits &&
+			    !(tick_nohz_tick_stopped() &&
+			      drv->states[i].target_residency < TICK_USEC)) {
+				count = cpu_data->states[i].early_hits;
+				max_early_idx = i;
+			}
 		}
 	}
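
For illustration only, the effect of the first hunk (the teo_update() change) can be sketched outside the kernel. This is a simplified stand-in, not the governor code: `struct sketch_state`, `classify()` and the `skip_disabled` flag are hypothetical names introduced here, and the residency values in the note below are made up. It shows which state index a long idle period would be attributed to, with and without the disabled-state check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the struct cpuidle_state fields used here. */
struct sketch_state {
	unsigned int target_residency;	/* microseconds */
	bool disabled;
};

/*
 * Return the index of the deepest state whose target residency does not
 * exceed duration_us, or -1 if no state qualifies. With skip_disabled
 * set, disabled states are ignored, mirroring the check that the first
 * hunk adds around the idx_timer/idx_hit assignments.
 */
static int classify(const struct sketch_state *states, int count,
		    unsigned int duration_us, bool skip_disabled)
{
	int idx = -1;
	int i;

	for (i = 0; i < count; i++) {
		if (skip_disabled && states[i].disabled)
			continue;

		if (states[i].target_residency <= duration_us)
			idx = i;
	}

	return idx;
}
```

With assumed target residencies of 0, 2, 20 and 200 us and the deepest state (index 3) disabled, a 5000 us idle period is classified as index 3 without the check and as index 2 with it, so the long duration is counted against the deepest state that is still enabled.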