From: Mark Oteiza <mvoteiza@udel.edu>
To: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Subject: Re: PROBLEM: cpuidle: Quickly notice prediction failure for repeat mode
Date: Fri, 20 Dec 2013 17:54:45 -0500 [thread overview]
Message-ID: <20131220225445.GA6228@holos> (raw)
In-Reply-To: <52B352E3.1040903@linaro.org>
[-- Attachment #1: Type: text/plain, Size: 908 bytes --]
On 19/12/13 at 09:11pm, Daniel Lezcano wrote:
> On 12/19/2013 08:51 PM, Mark Oteiza wrote:
> >Hi,
> >
> >Commit 1485191 causes a regression on my hardware (laptop) manifesting
> >itself as increased power consumption when playing music on mpd using
> >pulseaudio.
> >
> >I bisected to find 1485191. I tested the commit it reverts (69a37bea),
> >and the commit prior 69a37bea, e34a00d6. Here's a summary:
> >
> > status sha power coretemp
> > ---------------------------------
> > BAD 14851912 14W 54C
> > GOOD 228b3023 9W 47C
> > GOOD 69a37bea ~8.8W 47C
> > GOOD e45a00d6 ~9.3W 47C
>
> Hi Mark,
>
> thanks for the measurements.
>
> Is it possible you do some perf benchmarking. IIRC, this patch was reverted
> because it introduced some perf regression.
>
> -- Daniel
Hi,
I hope the attached perf stats are useful.
--
Mark Oteiza
[-- Attachment #2: perf.txt --]
[-- Type: text/plain, Size: 15439 bytes --]
vim: ft=asciidoc
1. Test description
a. Common part
b. Varying configuration
2. Result summary
3. Raw perf stat
1. Test description
a. Common part
All tests using `perf stat -B -p PID` and playing out the track `Boom Jinx
- Too Free To Follow (Original Mix)`. Track length 9:25 (565 seconds).
Input #0, mp3, from '129225_Too_Free_To_Follow_Original_Mix.mp3':
Metadata:
encoder : LAME v3.96.1
artist : Boom Jinx
title : Too Free To Follow (Original Mix)
album : Come Play Perfect / Too Free To Follow
genre : Progressive House
publisher : Anjunadeep
track : 2/3
date : 2006
TDOR : 2006-01-01
TBPM : 125
Duration: 00:09:24.82, start: 0.000000, bitrate: 323 kb/s
The following three tests are on `3.12.5-1-ARCH`
b. Varying configuration
The tests are done with the following `mpd.conf`:
> # ~/.config/mpd/mpd.conf
>
> music_directory "/media/music"
> playlist_directory "~/.local/share/mpd/playlists"
> db_file "~/.local/share/mpd/mpd.db"
> log_file "~/.local/share/mpd/mpd.log"
> pid_file "~/.local/share/mpd/mpd.pid"
> state_file "~/.local/share/mpd/mpdstate"
>
> audio_output {
> type "pulse"
> name "PulseAudio"
> }
in one of three situations/configurations:
A. alsa only, no pulse running, audio_output block commented
B. pulse running, audio_output block commented
C. pulse running, audio_output block NOT commented
These three configurations are tested on the latest vanilla Arch Linux kernel
3.12.5-1-ARCH, and a patched 3.11.6-1-ARCH kernel with 1485191 reverted.
2. Result summary
On vanilla kernel `3.12.5-1-ARCH`
Test Cpufreq Frontend cyc. id. Backend cyc. id. Percent of all branches
--------------------------------------------------------------------------------------
A 1.399 GHz [96.81%] 43.74% [96.86%] 11.98% [13.09%] 1.50% [96.38%]
A 1.324 GHz [96.41%] 42.61% [96.80%] 11.76% [13.08%] 1.51% [96.94%]
B 1.858 GHz [79.91%] 47.34% [84.83%] 37.68% [80.26%] 4.15% [65.01%]
B 1.923 GHz [71.58%] 46.24% [90.12%] 36.23% [82.24%] 4.77% [64.75%]
C 2.479 GHz [82.12%] 52.43% [94.14%] 39.01% [88.21%] 4.75% [41.09%]
C 2.486 GHz [93.97%] 53.46% [94.14%] 39.26% [88.01%] 5.97% [29.84%]
On kernel `3.11.6-1-ARCH` with commit 1485191 reverted.
Test Cpufreq Frontend cyc. id. Backend cyc. id. Percent of all branches
--------------------------------------------------------------------------------------
A 1.207 GHz [96.77%] 44.12% [96.74%] 11.92% [13.09%] 1.48% [96.41%]
A 1.093 GHz [45.20%] 43.97% [96.89%] 26.83% [93.31%] 1.73% [67.77%]
B 1.569 GHz [78.11%] 40.73% [69.44%] 36.53% [80.72%] 4.52% [80.75%]
B 1.655 GHz [53.47%] 42.10% [82.75%] 34.82% [81.95%] 4.36% [90.81%]
C 1.863 GHz [82.98%] 51.42% [84.33%] 37.72% [80.99%] 3.81% [67.46%]
C 1.893 GHz [88.14%] 51.51% [70.09%] 30.86% [59.34%] 3.52% [88.33%]
3. Raw perf stat
### Test A ###
$ perf stat -B -p 5779
^C
Performance counter stats for process id '5779':
19269.559713 task-clock # 0.034 CPUs utilized
29,092 context-switches # 0.002 M/sec
450 cpu-migrations # 0.023 K/sec
780 page-faults # 0.040 K/sec
26,953,017,361 cycles # 1.399 GHz [96.81%]
11,788,193,213 stalled-cycles-frontend # 43.74% frontend cycles idle [96.86%]
3,229,712,476 stalled-cycles-backend # 11.98% backend cycles idle [13.09%]
12,790,505,825 instructions # 0.47 insns per cycle
# 0.92 stalled cycles per insn [83.78%]
3,789,342,481 branches # 196.649 M/sec [96.86%]
56,708,247 branch-misses # 1.50% of all branches [96.38%]
567.364674495 seconds time elapsed
### Test B ###
$ perf stat -B -p 21640 # pulse, mpd autodetect, GOOD
^C
Performance counter stats for process id '21640':
5461.629046 task-clock # 0.010 CPUs utilized
125,451 context-switches # 0.023 M/sec
2,953 cpu-migrations # 0.541 K/sec
1,224 page-faults # 0.224 K/sec
10,146,727,107 cycles # 1.858 GHz [79.91%]
4,803,761,857 stalled-cycles-frontend # 47.34% frontend cycles idle [84.83%]
3,822,947,306 stalled-cycles-backend # 37.68% backend cycles idle [80.26%]
14,068,598,186 instructions # 1.39 insns per cycle
# 0.34 stalled cycles per insn [90.26%]
1,203,848,486 branches # 220.419 M/sec [90.01%]
49,929,970 branch-misses # 4.15% of all branches [65.01%]
566.998160291 seconds time elapsed
### Test C ###
$ perf stat -B -p 22293 # pulse, pulse configured audio_device, BAD
^C
Performance counter stats for process id '22293':
5039.684394 task-clock # 0.009 CPUs utilized
201,452 context-switches # 0.040 M/sec
3,931 cpu-migrations # 0.780 K/sec
1,013 page-faults # 0.201 K/sec
12,491,465,987 cycles # 2.479 GHz [82.12%]
6,549,588,957 stalled-cycles-frontend # 52.43% frontend cycles idle [94.14%]
4,872,604,722 stalled-cycles-backend # 39.01% backend cycles idle [88.21%]
16,043,214,477 instructions # 1.28 insns per cycle
# 0.41 stalled cycles per insn [94.27%]
1,432,475,493 branches # 284.239 M/sec [94.45%]
68,002,415 branch-misses # 4.75% of all branches [41.09%]
567.793946819 seconds time elapsed
### Test B ###
$ perf stat -B -p 25849 # pulse, autodetect, OK?
^C
Performance counter stats for process id '25849':
4828.312531 task-clock # 0.008 CPUs utilized
52,635 context-switches # 0.011 M/sec
2,920 cpu-migrations # 0.605 K/sec
1,060 page-faults # 0.220 K/sec
9,283,525,030 cycles # 1.923 GHz [71.58%]
4,292,859,154 stalled-cycles-frontend # 46.24% frontend cycles idle [90.12%]
3,363,024,956 stalled-cycles-backend # 36.23% backend cycles idle [82.24%]
13,214,633,730 instructions # 1.42 insns per cycle
# 0.32 stalled cycles per insn [91.05%]
1,039,288,777 branches # 215.249 M/sec [91.32%]
49,566,186 branch-misses # 4.77% of all branches [64.75%]
575.362641552 seconds time elapsed
### Test C ###
# 13-18W
$ perf stat -B -p 26349 # pulse, configured audio_output. BAD
^C
Performance counter stats for process id '26349':
4928.812937 task-clock # 0.009 CPUs utilized
208,121 context-switches # 0.042 M/sec
4,009 cpu-migrations # 0.813 K/sec
1,013 page-faults # 0.206 K/sec
12,253,718,423 cycles # 2.486 GHz [93.97%]
6,550,856,654 stalled-cycles-frontend # 53.46% frontend cycles idle [94.14%]
4,810,974,054 stalled-cycles-backend # 39.26% backend cycles idle [88.01%]
16,108,633,255 instructions # 1.31 insns per cycle
# 0.41 stalled cycles per insn [94.10%]
1,445,452,744 branches # 293.266 M/sec [94.04%]
86,270,136 branch-misses # 5.97% of all branches [29.84%]
568.702015081 seconds time elapsed
### Test A ###
# alsa only , no config block GOOD
$ perf stat -B -p 26696
^C
Performance counter stats for process id '26696':
20122.934921 task-clock # 0.035 CPUs utilized
29,070 context-switches # 0.001 M/sec
281 cpu-migrations # 0.014 K/sec
1,020 page-faults # 0.051 K/sec
26,643,218,399 cycles # 1.324 GHz [96.41%]
11,352,679,468 stalled-cycles-frontend # 42.61% frontend cycles idle [96.80%]
3,132,458,091 stalled-cycles-backend # 11.76% backend cycles idle [13.08%]
46,389,485,704 instructions # 1.74 insns per cycle
# 0.24 stalled cycles per insn [96.90%]
3,788,903,036 branches # 188.288 M/sec [96.77%]
57,166,014 branch-misses # 1.51% of all branches [96.94%]
568.255786926 seconds time elapsed
On `3.11.6-1-ARCH` with commit 1485191 reverted.
### Test A ###
# alsa only, GOOD
$ perf stat -B -p 1260
^C
Performance counter stats for process id '1260':
22530.387748 task-clock # 0.039 CPUs utilized
29,028 context-switches # 0.001 M/sec
260 cpu-migrations # 0.012 K/sec
916 page-faults # 0.041 K/sec
27,183,889,434 cycles # 1.207 GHz [96.77%]
11,992,731,634 stalled-cycles-frontend # 44.12% frontend cycles idle [96.74%]
3,241,176,991 stalled-cycles-backend # 11.92% backend cycles idle [13.09%]
12,894,959,266 instructions # 0.47 insns per cycle
# 0.93 stalled cycles per insn [84.24%]
3,793,175,080 branches # 168.358 M/sec [97.00%]
56,150,170 branch-misses # 1.48% of all branches [96.41%]
574.810366198 seconds time elapsed
### Test B ###
# pulse no config block GOOD
$ perf stat -B -p 3240
^C
Performance counter stats for process id '3240':
5777.516582 task-clock # 0.010 CPUs utilized
49,661 context-switches # 0.009 M/sec
2,497 cpu-migrations # 0.432 K/sec
1,196 page-faults # 0.207 K/sec
9,067,410,074 cycles # 1.569 GHz [78.11%]
3,692,796,429 stalled-cycles-frontend # 40.73% frontend cycles idle [69.44%]
3,312,214,070 stalled-cycles-backend # 36.53% backend cycles idle [80.72%]
13,213,756,363 instructions # 1.46 insns per cycle
# 0.28 stalled cycles per insn [91.14%]
1,038,835,964 branches # 179.807 M/sec [91.00%]
46,978,484 branch-misses # 4.52% of all branches [80.75%]
598.581618771 seconds time elapsed
### Test C ###
# pulse, config block GOOD
$ perf stat -B -p 3399
^C
Performance counter stats for process id '3399':
6621.672200 task-clock # 0.012 CPUs utilized
188,868 context-switches # 0.029 M/sec
4,560 cpu-migrations # 0.689 K/sec
849 page-faults # 0.128 K/sec
12,337,231,455 cycles # 1.863 GHz [82.98%]
6,343,372,974 stalled-cycles-frontend # 51.42% frontend cycles idle [84.33%]
4,653,020,474 stalled-cycles-backend # 37.72% backend cycles idle [80.99%]
15,991,003,014 instructions # 1.30 insns per cycle
# 0.40 stalled cycles per insn [94.02%]
1,482,486,392 branches # 223.884 M/sec [84.26%]
56,429,425 branch-misses # 3.81% of all branches [67.46%]
566.108971634 seconds time elapsed
### Test A ###
$ perf stat -B -p 4085
^C
Performance counter stats for process id '4085':
23816.993016 task-clock # 0.042 CPUs utilized
29,306 context-switches # 0.001 M/sec
230 cpu-migrations # 0.010 K/sec
877 page-faults # 0.037 K/sec
26,022,945,276 cycles # 1.093 GHz [45.20%]
11,441,586,110 stalled-cycles-frontend # 43.97% frontend cycles idle [96.89%]
6,982,409,775 stalled-cycles-backend # 26.83% backend cycles idle [93.31%]
46,405,822,403 instructions # 1.78 insns per cycle
# 0.25 stalled cycles per insn [96.50%]
3,794,249,905 branches # 159.309 M/sec [96.84%]
65,778,344 branch-misses # 1.73% of all branches [67.77%]
568.510689271 seconds time elapsed
### Test B ###
$ perf stat -B -p 4506
^C
Performance counter stats for process id '4506':
5850.939684 task-clock # 0.010 CPUs utilized
49,814 context-switches # 0.009 M/sec
2,713 cpu-migrations # 0.464 K/sec
1,193 page-faults # 0.204 K/sec
9,682,944,877 cycles # 1.655 GHz [53.47%]
4,076,656,050 stalled-cycles-frontend # 42.10% frontend cycles idle [82.75%]
3,372,031,105 stalled-cycles-backend # 34.82% backend cycles idle [81.95%]
13,196,912,570 instructions # 1.36 insns per cycle
# 0.31 stalled cycles per insn [91.09%]
1,039,765,080 branches # 177.709 M/sec [91.03%]
45,330,554 branch-misses # 4.36% of all branches [90.81%]
570.162067313 seconds time elapsed
### Test C ###
$ perf stat -B -p 4884
^C
Performance counter stats for process id '4884':
6505.038805 task-clock # 0.011 CPUs utilized
188,878 context-switches # 0.029 M/sec
4,786 cpu-migrations # 0.736 K/sec
1,018 page-faults # 0.156 K/sec
12,315,800,288 cycles # 1.893 GHz [88.14%]
6,344,128,023 stalled-cycles-frontend # 51.51% frontend cycles idle [70.09%]
3,800,263,568 stalled-cycles-backend # 30.86% backend cycles idle [59.34%]
15,010,836,518 instructions # 1.22 insns per cycle
# 0.42 stalled cycles per insn [91.86%]
1,415,142,508 branches # 217.546 M/sec [94.11%]
49,838,213 branch-misses # 3.52% of all branches [88.33%]
566.421286698 seconds time elapsed
prev parent reply other threads:[~2013-12-20 22:53 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-19 19:51 PROBLEM: cpuidle: Quickly notice prediction failure for repeat mode Mark Oteiza
2013-12-19 19:58 ` Mark Oteiza
2013-12-19 20:11 ` Daniel Lezcano
2013-12-20 22:54 ` Mark Oteiza [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131220225445.GA6228@holos \
--to=mvoteiza@udel.edu \
--cc=daniel.lezcano@linaro.org \
--cc=linux-pm@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.