linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Weird instruction profiling
@ 2015-02-18  9:57 Brendan Gregg
  2015-02-18 18:30 ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Brendan Gregg @ 2015-02-18  9:57 UTC (permalink / raw)
  To: linux-perf-use.

G'Day,

I was profiling and noticed some odd instruction counts, and boiled it
down to a 16 NOP loop:

# perf annotate -i perf.data.noplooper --stdio
 Percent |      Source code & Disassembly of noplooper
--------------------------------------------------------
         :
         :      Disassembly of section .text:
         :
         :      00000000004004ed <main>:
    0.00 :        4004ed:       push   %rbp
    0.00 :        4004ee:       mov    %rsp,%rbp
   20.86 :        4004f1:       nop
    0.00 :        4004f2:       nop
    0.00 :        4004f3:       nop
    0.00 :        4004f4:       nop
   19.84 :        4004f5:       nop
    0.00 :        4004f6:       nop
    0.00 :        4004f7:       nop
    0.00 :        4004f8:       nop
   18.73 :        4004f9:       nop
    0.00 :        4004fa:       nop
    0.00 :        4004fb:       nop
    0.00 :        4004fc:       nop
   19.08 :        4004fd:       nop
    0.00 :        4004fe:       nop
    0.00 :        4004ff:       nop
    0.00 :        400500:       nop
   21.49 :        400501:       jmp    4004f1 <main+0x4>

Despite millions of samples, many NOPs are never seen. (See the
Percent column.) I'm not using PEBS, but I suppose I should.

I think Andi mentioned this to me last year -- that instruction
profiling was no longer reliable. At the time I hadn't been bitten by
it, but now I have!

Is this due to parallel and out-of-order execution? (ie, we're
sampling the instruction pointer, but that's set to the resumption
instruction, not the instructions being processed in the backend?).
Thanks,

Brendan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird instruction profiling
  2015-02-18  9:57 Weird instruction profiling Brendan Gregg
@ 2015-02-18 18:30 ` Andi Kleen
  2015-02-18 23:33   ` David Ahern
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2015-02-18 18:30 UTC (permalink / raw)
  To: Brendan Gregg; +Cc: linux-perf-use.

Brendan Gregg <brendan.d.gregg@gmail.com> writes:
>
> Despite millions of samples, many NOPs are never seen. (See the
> Percent column.) I'm not using PEBS, but I suppose I should.

Yes you should. You get better results with :p / :pp (PEBS)
and the best results (but not at the cycles level) with
INST_RETIRED.PREC_DIST / ALL

cycles:upp

       │    0000000000400400 <main>:
  0.05 │ 0:   nop
       │      nop
       │      nop
 19.78 │      nop
  0.17 │      nop
       │      nop
       │      nop
 20.16 │      nop
  0.12 │      nop
       │      nop
       │      nop
 20.49 │      nop
  0.07 │      nop
       │      nop
       │      nop
 18.88 │      nop
 20.30 │    ↑ jmp    0


cpu/event=0xc0,umask=0x1,name=inst_retired_prec_dist/pp

       │    0000000000400400 <main>:
  6.13 │ 0:   nop
       │      nop
  0.02 │      nop
  0.02 │      nop
 23.69 │      nop
       │      nop
       │      nop
  0.02 │      nop
 24.05 │      nop
       │      nop
       │      nop
  0.02 │      nop
 23.60 │      nop
       │      nop
       │      nop
       │      nop
 22.46 │    ↑ jmp    0


This is nearly as good as you can get here because the machine can
retire four nops per cycle

A common trick is to run PREC_DIST/ALL in parallel with other events
and correlate.

> I think Andi mentioned this to me last year -- that instruction
> profiling was no longer reliable. 

It never was.

> Is this due to parallel and out-of-order execution? (ie, we're
> sampling the instruction pointer, but that's set to the resumption
> instruction, not the instructions being processed in the backend?).

Most problems are due to 'skid': It takes some time to trigger the
profiling interrupt after the event fired. Without PEBS the skid is
quite high. With PEBS it's a lot better because it writes out 
the information into the PEBS buffer faster, but also not zero and can
still be noticed. With PREC_DIST/ALL it does some additional tricks
to further reduce it.

There are also other problems, for example an event may not be tied
to an instruction. Some events have inherently large skid.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird instruction profiling
  2015-02-18 18:30 ` Andi Kleen
@ 2015-02-18 23:33   ` David Ahern
  2015-02-18 23:38     ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2015-02-18 23:33 UTC (permalink / raw)
  To: Andi Kleen, Brendan Gregg; +Cc: linux-perf-use.

On 2/18/15 11:30 AM, Andi Kleen wrote:

>> I think Andi mentioned this to me last year -- that instruction
>> profiling was no longer reliable.
>
> It never was.

Why is that? What about profiling with other hardware counters?

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird instruction profiling
  2015-02-18 23:33   ` David Ahern
@ 2015-02-18 23:38     ` Andi Kleen
  2015-02-18 23:46       ` David Ahern
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2015-02-18 23:38 UTC (permalink / raw)
  To: David Ahern; +Cc: Andi Kleen, Brendan Gregg, linux-perf-use.

On Wed, Feb 18, 2015 at 04:33:31PM -0700, David Ahern wrote:
> On 2/18/15 11:30 AM, Andi Kleen wrote:
> 
> >>I think Andi mentioned this to me last year -- that instruction
> >>profiling was no longer reliable.
> >
> >It never was.
> 
> Why is that? What about profiling with other hardware counters?

Practically all profiling has skid of some form.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird instruction profiling
  2015-02-18 23:38     ` Andi Kleen
@ 2015-02-18 23:46       ` David Ahern
  2015-02-20 19:04         ` Brendan Gregg
  0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2015-02-18 23:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Brendan Gregg, linux-perf-use.

On 2/18/15 4:38 PM, Andi Kleen wrote:
> On Wed, Feb 18, 2015 at 04:33:31PM -0700, David Ahern wrote:
>> On 2/18/15 11:30 AM, Andi Kleen wrote:
>>
>>>> I think Andi mentioned this to me last year -- that instruction
>>>> profiling was no longer reliable.
>>>
>>> It never was.
>>
>> Why is that? What about profiling with other hardware counters?
>
> Practically all profiling has skid of some form.

ok, so you meant that only in the context of skid meaning the hit 
instruction is off a few (ie., squint backwards at the annotate output).

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird instruction profiling
  2015-02-18 23:46       ` David Ahern
@ 2015-02-20 19:04         ` Brendan Gregg
  0 siblings, 0 replies; 6+ messages in thread
From: Brendan Gregg @ 2015-02-20 19:04 UTC (permalink / raw)
  To: David Ahern; +Cc: Andi Kleen, linux-perf-use.

On Wed, Feb 18, 2015 at 3:46 PM, David Ahern <dsahern@gmail.com> wrote:
> On 2/18/15 4:38 PM, Andi Kleen wrote:
>>
>> On Wed, Feb 18, 2015 at 04:33:31PM -0700, David Ahern wrote:
>>>
>>> On 2/18/15 11:30 AM, Andi Kleen wrote:
>>>
>>>>> I think Andi mentioned this to me last year -- that instruction
>>>>> profiling was no longer reliable.
>>>>
>>>>
>>>> It never was.
>>>
>>>
>>> Why is that? What about profiling with other hardware counters?
>>
>>
>> Practically all profiling has skid of some form.
>
>
> ok, so you meant that only in the context of skid meaning the hit
> instruction is off a few (ie., squint backwards at the annotate output).
>

Thanks Andi and David,

There's skid, but I was noticing what I thought was a different
artifact in the profile: where only every 4th instruction is seen.
Wouldn't this effect be more due to the IP pointing to the next
instruction while a group of 4 are executed in parallel? Then there's
skid on top of that....

Brendan

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-02-20 19:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-18  9:57 Weird instruction profiling Brendan Gregg
2015-02-18 18:30 ` Andi Kleen
2015-02-18 23:33   ` David Ahern
2015-02-18 23:38     ` Andi Kleen
2015-02-18 23:46       ` David Ahern
2015-02-20 19:04         ` Brendan Gregg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).