* [Bug 49231] Single CPU bound process results in non-optimal turbo boost configuration
2012-10-21 23:56 [Bug 49231] New: Single CPU bound process results in non-optimal turbo boost configuration bugzilla-daemon
@ 2012-10-22 13:44 ` bugzilla-daemon
2012-10-22 13:57 ` bugzilla-daemon
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2012-10-22 13:44 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
Thomas Renninger <trenn@suse.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO
CC| |trenn@suse.de
--- Comment #1 from Thomas Renninger <trenn@suse.de> 2012-10-22 13:44:44 ---
Can you use "cpupower monitor" (tools/power/cpupower) instead of turbostat?
Then you will also see what idle states the kernel requested.
Does this single job (or something else) produce interrupts frequently (maybe
powertop helps in this respect)?
This would explain why C1 and no deeper sleep states are entered.
If properly configured, "cpupower top" should invoke the powertop tool.
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
From: bugzilla-daemon @ 2012-10-22 13:57 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
--- Comment #2 from Thomas Renninger <trenn@suse.de> 2012-10-22 13:57:09 ---
Another idea:
There is an Intel-specific CPU configuration register (perf-bias) which can be
set to values from 0 to 15.
It tells the CPU to favor either energy efficiency or performance (not much more
documentation exists, afaik).
You can read or set this register via:
cpupower set -b X
cpupower info -b
Maybe this changes C-state entering behavior?
From: bugzilla-daemon @ 2012-10-22 23:02 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
--- Comment #3 from Roger Scott <ras243-korg@yahoo.com> 2012-10-22 23:02:24 ---
Hi Thomas,
Thanks for the ideas. I'll do some more testing when I get home this evening
(Timezone=UTC+10). The problem was showing up with jobs which were possibly
hammering the interrupts so I used the simplest test I could think of which was
a never-ending-do-nothing-for-loop written in C. Unfortunately (or fortunately
depending on your perspective) this resulted in the same behaviour.
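
For anyone reproducing this, a bounded stand-in for such a do-nothing loop could look like the sketch below. It is not the reporter's C program: the function name busy_spin and the time limit are assumptions added so the loop ends on its own instead of running forever.

```python
import time


def busy_spin(duration_s):
    """Keep one core busy for duration_s seconds; return the loop count."""
    deadline = time.perf_counter() + duration_s
    iterations = 0
    # Pure user-space spinning: no sleeping and no I/O, so the load should
    # not generate device interrupts of its own.
    while time.perf_counter() < deadline:
        iterations += 1
    return iterations
```

Calling busy_spin(10.0) in one terminal leaves enough time to watch the core in "cpupower monitor" from another.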
From: bugzilla-daemon @ 2012-10-23 2:09 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
Len Brown <lenb@kernel.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |lenb@kernel.org
--- Comment #4 from Len Brown <lenb@kernel.org> 2012-10-23 02:09:34 ---
Even the "idle" case doesn't look right -- it is 2.73% busy.
Try using top to find out what is running.
core CPU   %c0   GHz   TSC   %c1   %c3   %c6   %c7  %pc2  %pc3  %pc6
          2.73  1.20  3.20  7.06  0.00  0.02 90.19 20.20  0.00 68.20
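
The summary row has two fewer fields than the header, which suggests the core and CPU columns are blank on that line. A small sketch of reading it under that assumption (parse_summary is a hypothetical helper, not part of turbostat):

```python
HEADER = "core CPU %c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6"
SUMMARY = "2.73 1.20 3.20 7.06 0.00 0.02 90.19 20.20 0.00 68.20"


def parse_summary(header, row):
    """Map the summary values onto the rightmost header columns."""
    names = header.split()
    values = [float(v) for v in row.split()]
    # Assumption: the missing fields are the leading core/CPU ids.
    return dict(zip(names[len(names) - len(values):], values))


stats = parse_summary(HEADER, SUMMARY)
# Per-core states should account for all of the time if the alignment is right.
core_total = sum(stats[k] for k in ("%c0", "%c1", "%c3", "%c6", "%c7"))
print(f"busy {stats['%c0']}%, core states total {core_total:.2f}%")
```

With this alignment the core-state columns add up to 100%, which supports reading 2.73 as the %c0 (busy) figure.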
From: bugzilla-daemon @ 2012-10-23 10:20 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
--- Comment #5 from Thomas Renninger <trenn@suse.de> 2012-10-23 10:20:27 ---
> so I used the simplest test I could think of which was a never-ending-
> do-nothing-for-loop written in C
JFI, I use:
cat /dev/zero >/dev/null &
to load one core at 100% CPU.
From: bugzilla-daemon @ 2012-10-24 4:21 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
--- Comment #6 from Roger Scott <ras243-korg@yahoo.com> 2012-10-24 04:21:12 ---
I've managed to do some more testing. Output from cpupower-monitor (excuse
formatting mess):
    |Nehalem                    || SandyBridge        || Mperf              || Idle_Stats
CPU | C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq || POLL | C1-S | C3-S | C6-S | C7-S
   0|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||100.00|  0.00|  3433||  0.00|  0.00|  0.00|  0.00|  0.00
   1|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  2.22| 97.78|  3427||  0.00|  0.00|  0.00|  0.16| 97.86
   2|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  2.17| 97.83|  3425||  0.00|  0.00|  0.00|  0.31| 97.75
   3|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  3.71| 96.29|  3433||  0.00|  0.00|  0.04|  1.06| 95.43
   4|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  2.04| 97.96|  3427||  0.00|  0.00|  0.00|  0.00| 98.21
   5|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  4.56| 95.44|  3434||  0.00|  0.00|  0.00|  1.19| 94.50
I'm guessing this means the idle cores should be mostly in C7 (the kernel
requests it, per the C7-S column) but for some reason the hardware never gets
there (C7 stays at 0.00).
I fiddled around with the perf-bias register but it didn't seem to make any
real difference. It might have made the cores go from C1 to C7 quicker once
the job was stopped but that's just my subjective opinion and I didn't do any
timing tests.
When running powertop I was getting 1000 wakeups-from-idle per second (i.e. the
kernel tick rate). These all came from the swapper threads/processes, one per
core. So I thought I'd try running with the NO_HZ setting and interestingly
the idle cores now stay in C7. Output from cpupower monitor with NO_HZ set:
    |Nehalem                    || SandyBridge        || Mperf              || Idle_Stats
CPU | C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq || POLL | C1-S | C3-S | C6-S | C7-S
   0|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.81|  0.19|  3799||  0.00|  0.00|  0.00|  0.00|  0.00
   1|  1.17|  0.00|  0.00|  0.00|| 98.58|  0.00|  0.00||  0.11| 99.89|  3719||  0.00|  0.00|  0.00|  0.00| 99.88
   2|  0.03|  0.00|  0.00|  0.00|| 95.76|  0.00|  0.00||  3.98| 96.02|  3796||  0.00|  0.03|  0.00|  0.00| 95.93
   3|  1.12|  0.00|  0.00|  0.00|| 98.82|  0.00|  0.00||  0.03| 99.97|  3715||  0.00|  0.00|  0.00|  0.00| 99.96
   4|  0.01|  0.00|  0.00|  0.00|| 98.76|  0.00|  0.00||  0.05| 99.95|  3755||  0.00|  0.00|  0.00|  0.00| 99.94
   5|  0.00|  0.00|  0.00|  0.00|| 99.74|  0.00|  0.00||  0.21| 99.79|  3666||  0.00|  0.01|  0.00|  0.00| 99.75
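
To put a number on the difference, the Freq values reported for the busy CPU 0 in the two runs (3433 with the tick, 3799 with NO_HZ) can be compared directly. A rough sketch, assuming the Mperf Freq column is an average frequency in MHz; gain_percent is a name made up for this illustration:

```python
def gain_percent(before_mhz, after_mhz):
    """Relative frequency increase, in percent of the 'before' value."""
    return (after_mhz - before_mhz) / before_mhz * 100.0


ticked_mhz = 3433    # CPU 0 Freq from the ticked run
tickless_mhz = 3799  # CPU 0 Freq from the NO_HZ run
print(f"{gain_percent(ticked_mhz, tickless_mhz):.1f}% higher with NO_HZ")
```

That works out to roughly a 10.7% higher clock on the loaded core once the idle cores actually reach C7.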
Naturally, the idle cores spend less time in C0 than on a ticked system, but my
previous 2.7% is still less than the 4.5% on the Xeon, which manages to turbo
boost itself properly. Just for fun I ran a job which should have simulated
1000 wakes/sec (i.e. a for loop calling usleep(1000)). Interestingly, despite
spending more time in C0 than originally, the cores still spent a reasonable
amount of time in C7 and were still boosted beyond 3.5GHz.
    |Nehalem                    || SandyBridge        || Mperf              || Idle_Stats
CPU | C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq || POLL | C1-S | C3-S | C6-S | C7-S
   0|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 98.43|  1.57|  3654||  0.00|  0.00|  0.00|  0.00|  0.00
   1|  1.36|  0.00|  0.00|  0.00|| 33.03|  0.00|  0.00||  3.78| 96.22|  3606||  0.00| 12.98| 22.92|  0.00| 60.35
   2|  8.65|  0.16|  0.00|  0.00|| 55.79|  0.00|  0.00||  4.23| 95.77|  3586||  0.00| 10.44|  8.28|  0.17| 76.83
   3|  4.42|  0.00|  0.00|  0.00|| 72.01|  0.00|  0.00||  4.47| 95.53|  3563||  0.00|  8.98|  2.78|  0.00| 83.75
   4|  1.47|  0.00|  0.00|  0.00|| 47.46|  0.00|  0.00||  3.21| 96.79|  3603||  0.00| 11.16| 14.78|  0.00| 70.87
   5|  6.18|  0.00|  0.00|  0.00|| 23.11|  0.00|  0.00||  2.15| 97.85|  3619||  0.00| 25.62| 11.99|  0.00| 60.30
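
The wakeup job described above (a loop around usleep(1000)) can be approximated with the sketch below. It is a Python stand-in rather than the original C test; periodic_wakeups and its duration limit are assumptions added for illustration, and the real wakeup rate will be somewhat under 1000/sec because each sleep lasts at least the requested interval.

```python
import time


def periodic_wakeups(duration_s, interval_s=0.001):
    """Sleep interval_s repeatedly for duration_s seconds; count wakeups."""
    deadline = time.monotonic() + duration_s
    wakeups = 0
    while time.monotonic() < deadline:
        # Each sleep ends with a timer wakeup of the CPU running this task.
        time.sleep(interval_s)
        wakeups += 1
    return wakeups
```

Running periodic_wakeups(10.0) while watching powertop should show the process as a steady source of wakeups-from-idle.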
I still think there's something a bit funny happening with the ticked system
which is inhibiting idle CPUs from entering C7 if one of their siblings is
busy, but I'd be happy if someone who knows more could explain why.
From: bugzilla-daemon @ 2012-10-24 11:44 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
Thomas Renninger <trenn@suse.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |RESOLVED
Resolution| |INVALID
--- Comment #7 from Thomas Renninger <trenn@suse.de> 2012-10-24 11:44:03 ---
> So I thought I'd try running with the NO_HZ setting and interestingly
> the idle cores now stay in C7
OK, so a tickless timer configuration is a must for this processor to enter its
deepest sleep states and thus enter boosting mode.
How/when processors enter deep sleep states (even if requested by the kernel)
is very HW specific. That the CPUs behave differently (one enters deeper
sleep states without NO_HZ, the other does not (probably not very efficiently))
may be interesting, but it looks like it works as designed.
Hm, not sure whether to close this as invalid or documented -> going for invalid
as nothing seems to be wrong.
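
Since the outcome hinges on the tick configuration, it can help to check what a given kernel was built with. The sketch below scans kernel config text for the two relevant options. The helper tick_settings and the sample text are hypothetical; on many distributions the real config is available as /boot/config-<version> or /proc/config.gz, but that location is an assumption and varies by system.

```python
def tick_settings(config_text):
    """Return the CONFIG_NO_HZ and CONFIG_HZ entries found in config_text."""
    wanted = ("CONFIG_NO_HZ", "CONFIG_HZ")
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_X is not set").
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in wanted:
            found[key] = value
    return found


# Hypothetical excerpt of a kernel config, for illustration only.
sample = """\
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_NO_HZ=y
"""
print(tick_settings(sample))
```

A result without CONFIG_NO_HZ would point to a ticked kernel like the one in the first measurement.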
From: bugzilla-daemon @ 2013-01-28 23:57 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
--- Comment #8 from Len Brown <lenb@kernel.org> 2013-01-28 23:57:30 ---
> nothing seems to be wrong.
Agreed.
Turbo depends on idle,
and idle depends on tickless.
If you tick 1000/sec on each thread, your system will simultaneously
have poor power and poor performance. There have been several proposals
to remove the tickful option from the kernel, and sightings like this are why.
From: bugzilla-daemon @ 2013-01-28 23:57 UTC (permalink / raw)
To: cpufreq
https://bugzilla.kernel.org/show_bug.cgi?id=49231
Len Brown <lenb@kernel.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |CLOSED