public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
From: Mark Oteiza <mvoteiza@udel.edu>
To: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Subject: Re: PROBLEM: cpuidle: Quickly notice prediction failure for repeat mode
Date: Fri, 20 Dec 2013 17:54:45 -0500	[thread overview]
Message-ID: <20131220225445.GA6228@holos> (raw)
In-Reply-To: <52B352E3.1040903@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 908 bytes --]

On 19/12/13 at 09:11pm, Daniel Lezcano wrote:
> On 12/19/2013 08:51 PM, Mark Oteiza wrote:
> >Hi,
> >
> >Commit 1485191 causes a regression on my hardware (laptop) manifesting
> >itself as increased power consumption when playing music on mpd using
> >pulseaudio.
> >
> >I bisected to find 1485191.  I tested the commit it reverts (69a37bea),
> >and the commit prior 69a37bea, e34a00d6.  Here's a summary:
> >
> >   status      sha   power  coretemp
> >   ---------------------------------
> >   BAD    14851912     14W       54C
> >   GOOD   228b3023      9W       47C
> >   GOOD   69a37bea   ~8.8W       47C
> >   GOOD   e45a00d6   ~9.3W       47C
> 
> Hi Mark,
> 
> thanks for the measurements.
> 
> Is it possible you do some perf benchmarking. IIRC, this patch was reverted
> because it introduced some perf regression.
> 
>   -- Daniel

Hi,

I hope the attached perf stats are useful.

-- 
Mark Oteiza

[-- Attachment #2: perf.txt --]
[-- Type: text/plain, Size: 15439 bytes --]

vim: ft=asciidoc

1. Test description
   a. Common part
   b. Varying configuration
2. Result summary
3. Raw perf stat


1. Test description

   a. Common part

All tests using `perf stat -B -p PID` and playing out the track `Boom Jinx
- Too Free To Follow (Original Mix)`. Track length 9:25 (565 seconds).

  Input #0, mp3, from '129225_Too_Free_To_Follow_Original_Mix.mp3':
    Metadata:
      encoder         : LAME v3.96.1
      artist          : Boom Jinx
      title           : Too Free To Follow (Original Mix)
      album           : Come Play Perfect / Too Free To Follow
      genre           : Progressive House
      publisher       : Anjunadeep
      track           : 2/3
      date            : 2006
      TDOR            : 2006-01-01
      TBPM            : 125
    Duration: 00:09:24.82, start: 0.000000, bitrate: 323 kb/s

The following three tests are on `3.12.5-1-ARCH`

   b. Varying configuration

The tests are done with the following `mpd.conf`:

 > # ~/.config/mpd/mpd.conf
 >
 > music_directory    "/media/music"
 > playlist_directory "~/.local/share/mpd/playlists"
 > db_file            "~/.local/share/mpd/mpd.db"
 > log_file           "~/.local/share/mpd/mpd.log"
 > pid_file           "~/.local/share/mpd/mpd.pid"
 > state_file         "~/.local/share/mpd/mpdstate"
 >
 > audio_output {
 >   type    "pulse"
 >   name    "PulseAudio"
 > }

in one of three situations/configurations:

  A. alsa only, no pulse running, audio_output block commented
  B. pulse running, audio_output block commented
  C. pulse running, audio_output block NOT commented

These three configurations are tested on the latest vanilla Arch Linux kernel
3.12.5-1-ARCH, and a patched 3.11.6-1-ARCH kernel with 1485191 reverted.


2. Result summary


On vanilla kernel `3.12.5-1-ARCH`

 Test  Cpufreq             Frontend cyc. id.  Backend cyc. id.  Percent of all branches
 --------------------------------------------------------------------------------------
 A     1.399 GHz [96.81%]  43.74% [96.86%]    11.98% [13.09%]   1.50% [96.38%]
 A     1.324 GHz [96.41%]  42.61% [96.80%]    11.76% [13.08%]   1.51% [96.94%]
 B     1.858 GHz [79.91%]  47.34% [84.83%]    37.68% [80.26%]   4.15% [65.01%]
 B     1.923 GHz [71.58%]  46.24% [90.12%]    36.23% [82.24%]   4.77% [64.75%]
 C     2.479 GHz [82.12%]  52.43% [94.14%]    39.01% [88.21%]   4.75% [41.09%]
 C     2.486 GHz [93.97%]  53.46% [94.14%]    39.26% [88.01%]   5.97% [29.84%]

On kernel `3.11.6-1-ARCH` with commit 1485191 reverted.

 Test  Cpufreq             Frontend cyc. id.  Backend cyc. id.  Percent of all branches
 --------------------------------------------------------------------------------------
 A     1.207 GHz [96.77%]  44.12% [96.74%]    11.92% [13.09%]   1.48% [96.41%]
 A     1.093 GHz [45.20%]  43.97% [96.89%]    26.83% [93.31%]   1.73% [67.77%]
 B     1.569 GHz [78.11%]  40.73% [69.44%]    36.53% [80.72%]   4.52% [80.75%]
 B     1.655 GHz [53.47%]  42.10% [82.75%]    34.82% [81.95%]   4.36% [90.81%]
 C     1.863 GHz [82.98%]  51.42% [84.33%]    37.72% [80.99%]   3.81% [67.46%]
 C     1.893 GHz [88.14%]  51.51% [70.09%]    30.86% [59.34%]   3.52% [88.33%]


3. Raw perf stat


### Test A ###
$ perf stat -B -p 5779
^C
 Performance counter stats for process id '5779':

      19269.559713 task-clock                #    0.034 CPUs utilized
            29,092 context-switches          #    0.002 M/sec
               450 cpu-migrations            #    0.023 K/sec
               780 page-faults               #    0.040 K/sec
    26,953,017,361 cycles                    #    1.399 GHz                     [96.81%]
    11,788,193,213 stalled-cycles-frontend   #   43.74% frontend cycles idle    [96.86%]
     3,229,712,476 stalled-cycles-backend    #   11.98% backend  cycles idle    [13.09%]
    12,790,505,825 instructions              #    0.47  insns per cycle
                                             #    0.92  stalled cycles per insn [83.78%]
     3,789,342,481 branches                  #  196.649 M/sec                   [96.86%]
        56,708,247 branch-misses             #    1.50% of all branches         [96.38%]

     567.364674495 seconds time elapsed


### Test B ###
$ perf stat -B -p 21640 # pulse, mpd autodetect, GOOD
^C
 Performance counter stats for process id '21640':

       5461.629046 task-clock                #    0.010 CPUs utilized
           125,451 context-switches          #    0.023 M/sec
             2,953 cpu-migrations            #    0.541 K/sec
             1,224 page-faults               #    0.224 K/sec
    10,146,727,107 cycles                    #    1.858 GHz                     [79.91%]
     4,803,761,857 stalled-cycles-frontend   #   47.34% frontend cycles idle    [84.83%]
     3,822,947,306 stalled-cycles-backend    #   37.68% backend  cycles idle    [80.26%]
    14,068,598,186 instructions              #    1.39  insns per cycle
                                             #    0.34  stalled cycles per insn [90.26%]
     1,203,848,486 branches                  #  220.419 M/sec                   [90.01%]
        49,929,970 branch-misses             #    4.15% of all branches         [65.01%]

     566.998160291 seconds time elapsed

### Test C ###
$ perf stat -B -p 22293 # pulse, pulse configured audio_device, BAD
^C
 Performance counter stats for process id '22293':

       5039.684394 task-clock                #    0.009 CPUs utilized
           201,452 context-switches          #    0.040 M/sec
             3,931 cpu-migrations            #    0.780 K/sec
             1,013 page-faults               #    0.201 K/sec
    12,491,465,987 cycles                    #    2.479 GHz                     [82.12%]
     6,549,588,957 stalled-cycles-frontend   #   52.43% frontend cycles idle    [94.14%]
     4,872,604,722 stalled-cycles-backend    #   39.01% backend  cycles idle    [88.21%]
    16,043,214,477 instructions              #    1.28  insns per cycle
                                             #    0.41  stalled cycles per insn [94.27%]
     1,432,475,493 branches                  #  284.239 M/sec                   [94.45%]
        68,002,415 branch-misses             #    4.75% of all branches         [41.09%]

     567.793946819 seconds time elapsed


### Test B ###
$ perf stat -B -p 25849 # pulse, autodetect, OK?
^C
 Performance counter stats for process id '25849':

       4828.312531 task-clock                #    0.008 CPUs utilized
            52,635 context-switches          #    0.011 M/sec
             2,920 cpu-migrations            #    0.605 K/sec
             1,060 page-faults               #    0.220 K/sec
     9,283,525,030 cycles                    #    1.923 GHz                     [71.58%]
     4,292,859,154 stalled-cycles-frontend   #   46.24% frontend cycles idle    [90.12%]
     3,363,024,956 stalled-cycles-backend    #   36.23% backend  cycles idle    [82.24%]
    13,214,633,730 instructions              #    1.42  insns per cycle
                                             #    0.32  stalled cycles per insn [91.05%]
     1,039,288,777 branches                  #  215.249 M/sec                   [91.32%]
        49,566,186 branch-misses             #    4.77% of all branches         [64.75%]

     575.362641552 seconds time elapsed

### Test C ###
# 13-18W
$ perf stat -B -p 26349 # pulse, configured audio_output. BAD
^C
 Performance counter stats for process id '26349':

       4928.812937 task-clock                #    0.009 CPUs utilized
           208,121 context-switches          #    0.042 M/sec
             4,009 cpu-migrations            #    0.813 K/sec
             1,013 page-faults               #    0.206 K/sec
    12,253,718,423 cycles                    #    2.486 GHz                     [93.97%]
     6,550,856,654 stalled-cycles-frontend   #   53.46% frontend cycles idle    [94.14%]
     4,810,974,054 stalled-cycles-backend    #   39.26% backend  cycles idle    [88.01%]
    16,108,633,255 instructions              #    1.31  insns per cycle
                                             #    0.41  stalled cycles per insn [94.10%]
     1,445,452,744 branches                  #  293.266 M/sec                   [94.04%]
        86,270,136 branch-misses             #    5.97% of all branches         [29.84%]

     568.702015081 seconds time elapsed


### Test A ###
# alsa only , no config block GOOD
$ perf stat -B -p 26696
^C
 Performance counter stats for process id '26696':

      20122.934921 task-clock                #    0.035 CPUs utilized
            29,070 context-switches          #    0.001 M/sec
               281 cpu-migrations            #    0.014 K/sec
             1,020 page-faults               #    0.051 K/sec
    26,643,218,399 cycles                    #    1.324 GHz                     [96.41%]
    11,352,679,468 stalled-cycles-frontend   #   42.61% frontend cycles idle    [96.80%]
     3,132,458,091 stalled-cycles-backend    #   11.76% backend  cycles idle    [13.08%]
    46,389,485,704 instructions              #    1.74  insns per cycle
                                             #    0.24  stalled cycles per insn [96.90%]
     3,788,903,036 branches                  #  188.288 M/sec                   [96.77%]
        57,166,014 branch-misses             #    1.51% of all branches         [96.94%]

     568.255786926 seconds time elapsed


On `3.11.6-1-ARCH` with commit 1485191 reverted.

### Test A ###
# alsa only, GOOD
$ perf stat -B -p 1260
^C
 Performance counter stats for process id '1260':

      22530.387748 task-clock                #    0.039 CPUs utilized
            29,028 context-switches          #    0.001 M/sec
               260 cpu-migrations            #    0.012 K/sec
               916 page-faults               #    0.041 K/sec
    27,183,889,434 cycles                    #    1.207 GHz                     [96.77%]
    11,992,731,634 stalled-cycles-frontend   #   44.12% frontend cycles idle    [96.74%]
     3,241,176,991 stalled-cycles-backend    #   11.92% backend  cycles idle    [13.09%]
    12,894,959,266 instructions              #    0.47  insns per cycle
                                             #    0.93  stalled cycles per insn [84.24%]
     3,793,175,080 branches                  #  168.358 M/sec                   [97.00%]
        56,150,170 branch-misses             #    1.48% of all branches         [96.41%]

     574.810366198 seconds time elapsed

### Test B ###
# pulse no config block GOOD
$ perf stat -B -p 3240
^C
 Performance counter stats for process id '3240':

       5777.516582 task-clock                #    0.010 CPUs utilized
            49,661 context-switches          #    0.009 M/sec
             2,497 cpu-migrations            #    0.432 K/sec
             1,196 page-faults               #    0.207 K/sec
     9,067,410,074 cycles                    #    1.569 GHz                     [78.11%]
     3,692,796,429 stalled-cycles-frontend   #   40.73% frontend cycles idle    [69.44%]
     3,312,214,070 stalled-cycles-backend    #   36.53% backend  cycles idle    [80.72%]
    13,213,756,363 instructions              #    1.46  insns per cycle
                                             #    0.28  stalled cycles per insn [91.14%]
     1,038,835,964 branches                  #  179.807 M/sec                   [91.00%]
        46,978,484 branch-misses             #    4.52% of all branches         [80.75%]

     598.581618771 seconds time elapsed


### Test C ###
# pulse, config block GOOD
$ perf stat -B -p 3399
^C
 Performance counter stats for process id '3399':

       6621.672200 task-clock                #    0.012 CPUs utilized
           188,868 context-switches          #    0.029 M/sec
             4,560 cpu-migrations            #    0.689 K/sec
               849 page-faults               #    0.128 K/sec
    12,337,231,455 cycles                    #    1.863 GHz                     [82.98%]
     6,343,372,974 stalled-cycles-frontend   #   51.42% frontend cycles idle    [84.33%]
     4,653,020,474 stalled-cycles-backend    #   37.72% backend  cycles idle    [80.99%]
    15,991,003,014 instructions              #    1.30  insns per cycle
                                             #    0.40  stalled cycles per insn [94.02%]
     1,482,486,392 branches                  #  223.884 M/sec                   [84.26%]
        56,429,425 branch-misses             #    3.81% of all branches         [67.46%]

     566.108971634 seconds time elapsed

### Test A ###
$ perf stat -B -p 4085
^C
 Performance counter stats for process id '4085':

      23816.993016 task-clock                #    0.042 CPUs utilized
            29,306 context-switches          #    0.001 M/sec
               230 cpu-migrations            #    0.010 K/sec
               877 page-faults               #    0.037 K/sec
    26,022,945,276 cycles                    #    1.093 GHz                     [45.20%]
    11,441,586,110 stalled-cycles-frontend   #   43.97% frontend cycles idle    [96.89%]
     6,982,409,775 stalled-cycles-backend    #   26.83% backend  cycles idle    [93.31%]
    46,405,822,403 instructions              #    1.78  insns per cycle
                                             #    0.25  stalled cycles per insn [96.50%]
     3,794,249,905 branches                  #  159.309 M/sec                   [96.84%]
        65,778,344 branch-misses             #    1.73% of all branches         [67.77%]

     568.510689271 seconds time elapsed

### Test B ###
$ perf stat -B -p 4506
^C
 Performance counter stats for process id '4506':

       5850.939684 task-clock                #    0.010 CPUs utilized
            49,814 context-switches          #    0.009 M/sec
             2,713 cpu-migrations            #    0.464 K/sec
             1,193 page-faults               #    0.204 K/sec
     9,682,944,877 cycles                    #    1.655 GHz                     [53.47%]
     4,076,656,050 stalled-cycles-frontend   #   42.10% frontend cycles idle    [82.75%]
     3,372,031,105 stalled-cycles-backend    #   34.82% backend  cycles idle    [81.95%]
    13,196,912,570 instructions              #    1.36  insns per cycle
                                             #    0.31  stalled cycles per insn [91.09%]
     1,039,765,080 branches                  #  177.709 M/sec                   [91.03%]
        45,330,554 branch-misses             #    4.36% of all branches         [90.81%]

     570.162067313 seconds time elapsed

### Test C ###
$ perf stat -B -p 4884
^C
 Performance counter stats for process id '4884':

       6505.038805 task-clock                #    0.011 CPUs utilized
           188,878 context-switches          #    0.029 M/sec
             4,786 cpu-migrations            #    0.736 K/sec
             1,018 page-faults               #    0.156 K/sec
    12,315,800,288 cycles                    #    1.893 GHz                     [88.14%]
     6,344,128,023 stalled-cycles-frontend   #   51.51% frontend cycles idle    [70.09%]
     3,800,263,568 stalled-cycles-backend    #   30.86% backend  cycles idle    [59.34%]
    15,010,836,518 instructions              #    1.22  insns per cycle
                                             #    0.42  stalled cycles per insn [91.86%]
     1,415,142,508 branches                  #  217.546 M/sec                   [94.11%]
        49,838,213 branch-misses             #    3.52% of all branches         [88.33%]

     566.421286698 seconds time elapsed

      reply	other threads:[~2013-12-20 22:53 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-19 19:51 PROBLEM: cpuidle: Quickly notice prediction failure for repeat mode Mark Oteiza
2013-12-19 19:58 ` Mark Oteiza
2013-12-19 20:11 ` Daniel Lezcano
2013-12-20 22:54   ` Mark Oteiza [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131220225445.GA6228@holos \
    --to=mvoteiza@udel.edu \
    --cc=daniel.lezcano@linaro.org \
    --cc=linux-pm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox