Strange values of multiplexed hardware counters

linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Michael Petlan <mpetlan@redhat.com>
To: "linux-perf-users@vger.kernel.org" <linux-perf-users@vger.kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>, 'William Cohen' <wcohen@redhat.com>
Subject: Strange values of multiplexed hardware counters
Date: Fri, 22 Jan 2016 18:10:54 +0100	[thread overview]
Message-ID: <1453482654.13794.155.camel@redhat.com> (raw)

Hi all,

I have encountered a strange problem with perf-stat. My Intel Ivy Bridge
machine has 4 generic registers.

When running perf-stat with 4 times the same event, it seems to work
fine:

$ perf stat -e cycles -e cycles -e cycles -e cycles ls
 Performance counter stats for 'ls':
         3,527,771      cycles
         3,527,771      cycles
         3,527,771      cycles
         3,527,771      cycles
       0.003955878 seconds time elapsed elapsed

When adding a fifth event, we need multiplexing/scaling:

$ perf stat -e cycles -e cycles -e cycles -e cycles -e cycles ls
 Performance counter stats for 'ls':
         2,984,685      cycles       (79.86%)
         3,782,493      cycles       (86.79%)
         3,816,329      cycles       (85.90%)
         4,038,067      cycles       (70.61%)
         2,869,339      cycles       (78.36%)
       0.006683959 seconds time elapsed

Considering that scaling is enabled and 4 counters must do the job of
five, the results still look weird to me.

I know that number of hardware event occurrences vary from run to run,
but I would expect that the five lines would contain similar numbers
after scaling.

In case a counter counted an event only for 75% of the measurement run
time and the real value on it is 1,500,000 I would say, that perf should
either report:

      1,500,000    cycles     (75.00%)
(when no scaling is applied to the result)

or

      2,000,000    cycles     (75.00%)
(when the value is scaled by 100/75 (extrapolated))

In case the second is true, I would expect the values to be roughly the
same in the experiment with the 'cycles' event opened 5 times.

Since the values differ a lot, it looks like the values are not scaled.
But it is still weird, because it seems to be proportional with the
percentage in the right column sometimes, but not this time. If 4.038M
refers to 70.61%, 3.816M cannot refer to 85.90%.

The sum of the percent values is 401.52% and that would match the fact
that 4 counters were used. The 1.52% that are extra here might stand for
some overhead caused multiplexing (?) or a rounding error (???).

But still, the values differ more than 1.52%.

                       ***************

Does anyone know where is the problem? Do I understand it wrong? Is the
current behaviour correct?

Thanks in advance.
Michael

Note 1: When doing the same experiment with OProfile's ocount, the
effect is the same. Having the CPU_CLK_UNHALTED event opened four times,
the results are exactly the same, but when adding a fifth one,
multiplexing kicks in and the numbers become weird:

Events were actively counted for 4344151 nanoseconds.
Event counts (actual) for /usr/bin/ls:
        Event                   Count                    % time counted
        CPU_CLK_UNHALTED        3,534,782                100.00
        CPU_CLK_UNHALTED        3,534,782                100.00
        CPU_CLK_UNHALTED        3,534,782                100.00
        CPU_CLK_UNHALTED        3,534,782                100.00

When adding the fifth '-e CPU_CLK_UNHALTED':

Events were actively counted for 3506335 nanoseconds.
Event counts (scaled) for /usr/bin/ls:
        Event                   Count                    % time counted
        CPU_CLK_UNHALTED        5,747,077                73.66
        CPU_CLK_UNHALTED        5,236,339                65.44
        CPU_CLK_UNHALTED        4,927,556                100.00
        CPU_CLK_UNHALTED        4,927,556                100.00
        CPU_CLK_UNHALTED        3,342,282                65.68

The sum of percents makes 404.78% here. It is interesting that OProfile
usually leaves two counters count for 100% of the time and the remaining
two are doing the job of three.

Note 2: Sometimes perf-stat does not print the percentages everywhere:

 Performance counter stats for 'ls':
       5,258,849      cycles    (72.47%)
       5,272,885      cycles    (72.50%)
       4,228,707      cycles    (95.03%)
       4,513,546      cycles
       2,932,594      cycles    (64.38%)
     0.003473492 seconds time elapsed

                 reply	other threads:[~2016-01-22 17:10 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1453482654.13794.155.camel@redhat.com \
    --to=mpetlan@redhat.com \
    --cc=jolsa@redhat.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=wcohen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).