linux-pm.vger.kernel.org archive mirror
* CPUs do not go idle - excessive energy consumption
@ 2017-10-12 15:28 Doug Smythies
From: Doug Smythies @ 2017-10-12 15:28 UTC (permalink / raw)
  To: linux-pm; +Cc: Doug Smythies

Hi all,

I am observing higher than nominal processor package power consumption under
some conditions. The worst case so far was an extra 6.3 watts, or 25%; more
typically the excess is between 0 and 4 watts (over 1 minute sampling intervals).

The simplest use case found so far is a single threaded 100% load on one CPU.
The problem appears to be that CPUs that should be idle and in a deep C state
sometimes are not. It is as though they have been forgotten, and only when they hit
the watchdog timer, or some other event, do they generally get sorted out. I have
looked at kernels back as far as 4.5, and the issue is always present.
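
One way to check whether a supposedly idle CPU is really reaching a deep C state is to compare idle-state residency counters before and after a sampling interval. The following is only a sketch, not what was used for the measurements in this post: it assumes the kernel exposes the cpuidle sysfs files (`name` and `time`, the latter in microseconds) under /sys/devices/system/cpu/cpuN/cpuidle/stateM/, and the function names `read_idle_times` and `residency` are made up for illustration.

```python
from pathlib import Path


def read_idle_times(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return {state_name: cumulative idle time in microseconds} for one CPU.

    Assumes the cpuidle sysfs layout (cpuN/cpuidle/stateM/{name,time}).
    """
    times = {}
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    for state_dir in sorted(base.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        times[name] = int((state_dir / "time").read_text())
    return times


def residency(before, after, interval_s):
    """Fraction of the interval spent in each idle state between two snapshots."""
    interval_us = interval_s * 1_000_000
    return {name: (after[name] - before[name]) / interval_us for name in after}
```

Taking `read_idle_times(3)` twice, some seconds apart, and passing both snapshots to `residency` should show a CPU that is supposed to be idle spending most of the interval in its deepest state; a CPU that has been "forgotten" would presumably show much lower residency.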
  
I am looking for assistance in finding the root cause, because
I am unfamiliar with the idle code and have made little (O.K. no) progress on my own.

Details (for my test server, with an Intel i7-2600K, currently kernel 4.14-rc4):

In an otherwise idle system, with one processor loaded 100%, processor power
consumption is expected to be about 24 watts. However, power consumption of up
to about 28 watts has sometimes been observed. Of course, the first thought is that the
extra power consumption is just due to some task, because the "otherwise idle" system
isn't really completely so. If that were true, we would expect the intel_pstate CPU
frequency scaling driver to be called often. It is not.

Although it is difficult to correlate higher power samples from turbostat with
intel_pstate trace data, they seem to be due to samples that combine high load with
a long duration. For a few years now, we would never expect to see high load without
frequent calls to the driver. Also, typically for my test server, with many
services turned off, other tasks should take very little run time per time slice.
Examples (from trace data acquired and post processed with intel_pstate_tracer.py):

CPU 3: Load 87.2%; Duration 4000.091 msec; Comment watchdog.
CPU 5: Load 100%; Duration 2536.22 msec; Comment idle.
CPU 3: Load 100%; Duration 1184.027 msec; Comment idle.
(I have thousands more examples.)

Higher power consumption with no user load at all seems to be rather rare
(between 0 and 40 occurrences in one hour, with arbitrary thresholds), and hard to detect via
turbostat output. However, if the system is booted with intel_idle.max_cstate=1,
then higher power consumption with no user load is very common, as are much wilder
variations on power consumption (from ~9 watts to ~32 watts).
For reference, the average rate is about 600 occurrences per hour with one 100% load, but
it does vary.

Note: While I tend to use taskset so as to know which CPU should be busy,
the issue also occurs when taskset is not used.

Note: I have arbitrarily set the threshold for this condition at >= 10% load
and >= 250 milliseconds duration (the time between calls to the intel_pstate
driver), and written a program to parse such samples out of the .csv files
generated by intel_pstate_tracer.py.
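
The filter described in the note above can be sketched roughly as follows. This is not the actual program, only a minimal illustration of the two thresholds. The column names `load` and `duration_ms` and the function name `suspect_samples` are assumptions and may need adjusting to match the header of the real .csv files.

```python
import csv
import io

# Thresholds from the note above (arbitrary, as stated there).
LOAD_THRESHOLD_PCT = 10.0
DURATION_THRESHOLD_MS = 250.0


def suspect_samples(csv_file, load_col="load", duration_col="duration_ms"):
    """Return the rows with load >= 10% and duration >= 250 ms.

    csv_file is any open text file with a header row; the column
    names are assumptions, not taken from the real tracer output.
    """
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if float(row[load_col]) >= LOAD_THRESHOLD_PCT
        and float(row[duration_col]) >= DURATION_THRESHOLD_MS
    ]
```

Dividing the number of rows returned by the trace length in hours would give an hourly rate of the kind quoted earlier.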

Note: This might be a separate problem, but the issue is made worse
by taking CPUs off-line and then bringing them back on-line (I think).

... Doug


2017-10-12 15:28 CPUs do not go idle - excessive energy consumption Doug Smythies
2017-10-12 15:52 ` Rafael J. Wysocki
2017-10-12 16:36 ` Doug Smythies
2017-10-13 14:10 ` Doug Smythies
2017-10-20  0:16 ` Doug Smythies
