* Re: How to check perf commands are producing proper output or not?
[not found] <AANLkTikZc4L=-xzmOvqLC00qNHLuqCBOdcD3hofBNfFN@mail.gmail.com>
@ 2011-01-28 13:48 ` Arnaldo Carvalho de Melo
2011-01-28 14:37 ` Vince Weaver
0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-28 13:48 UTC (permalink / raw)
To: nelakurthi koteswararao; +Cc: linux-perf-users
Em Fri, Jan 28, 2011 at 05:49:00PM +0530, nelakurthi koteswararao escreveu:
> I run the Perftest commands on 2.6.35 tree .
> I want to test perf commands itself whether
> they are producing the correct results or not.
>
> That means..suppose if i have following perf commands
>
> How to write test to test the above commands whether they are producing
> the proper results specific to perf commands or not?
>
> Some tests like checking the processor registers at regular intervals to
> validate
> the perf results ? Any inputs regarding this query...?
That is a good question; we need more tests in tools/perf/builtin-test.c
that do these kinds of validations.
So far I wrote some that generate tracepoint events, but I can envision,
for instance, figuring out the size of the L2 processor cache, creating
an array of that size, then touching it over and over until it thrashes
the cache while measuring the relevant event.
- Arnaldo
* Re: How to check perf commands are producing proper output or not?
2011-01-28 13:48 ` How to check perf commands are producing proper output or not? Arnaldo Carvalho de Melo
@ 2011-01-28 14:37 ` Vince Weaver
2011-01-28 15:43 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 5+ messages in thread
From: Vince Weaver @ 2011-01-28 14:37 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: nelakurthi koteswararao, linux-perf-users
On Fri, 28 Jan 2011, Arnaldo Carvalho de Melo wrote:
>
> So far I wrote some that generate tracepoint events, but I can envision
> for instance, figuring out the size of the L2 processor cache, creating
> an array that is of that size, then go on touching it till it trashes
> the cache while measuring the relevant event.
Unfortunately it is harder to validate perf events than you might guess.
For your L2 example, most processors do hardware prefetching in a very
hard-to-model way, so the "expected" value never matches anything that
you get from the counters. This holds even if you write the tests in
assembly language to avoid any weird behavior from the C compiler (and
there can be weird behavior, even on something as simple as an array
walk).
Even things you might think are simple, like retired instructions, can
vary wildly. There are enough differences caused by Linux and processor
errata with retired-instruction counts that I wrote a whole 10-page
paper about the differences you see.
I have started writing some validation tests, though they use the PAPI
library, which runs on top of perf-events. You can see that and other
validation work linked to here:
http://web.eecs.utk.edu/~vweaver1/projects/validation-tests/
Vince
* Re: How to check perf commands are producing proper output or not?
2011-01-28 14:37 ` Vince Weaver
@ 2011-01-28 15:43 ` Arnaldo Carvalho de Melo
2011-01-28 22:03 ` Vince Weaver
0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-28 15:43 UTC (permalink / raw)
To: Vince Weaver; +Cc: nelakurthi koteswararao, linux-perf-users
Em Fri, Jan 28, 2011 at 09:37:03AM -0500, Vince Weaver escreveu:
> On Fri, 28 Jan 2011, Arnaldo Carvalho de Melo wrote:
> >
> > So far I wrote some that generate tracepoint events, but I can envision
> > for instance, figuring out the size of the L2 processor cache, creating
> > an array that is of that size, then go on touching it till it trashes
> > the cache while measuring the relevant event.
>
> Unfortunately it is harder to validate perf events than you might guess.
My expectation so far, for the basic tests, is not to precisely detect
each and every event, but to make sure that if a large number of cache
misses is being provoked, a large number of cache misses is being
detected, to test the basic working of the system.
After the basic tests are in place, we can go on making them as precise
as is practical.
> For your L2 example, most processors do hardware prefetch in very hard to
> model way, so the "expected" value never matches anything that you get
> with the counters. This is even if you write the tests in assembly
> language to avoid any weird behavior from the C compilers (and there can
> be, even on something as simple as an array walk).
>
> Even things you might think are simple, like retired-instructions can vary
> wildly. There are enough differences caused by Linux and processor errata
> with retired-instructions that I wrote a whole 10 page paper about the
> differences you see.
Cool, can I get the paper URL? I'd love to read it
> I have started writing some validation tests, though they use the PAPI
> library which runs on top of perf-events. You can see that and other
> validation work linked to here:
> http://web.eecs.utk.edu/~vweaver1/projects/validation-tests/
Will look at it, thanks for the pointer, thanks for working on it!
- Arnaldo
* Re: How to check perf commands are producing proper output or not?
2011-01-28 15:43 ` Arnaldo Carvalho de Melo
@ 2011-01-28 22:03 ` Vince Weaver
2011-01-29 13:07 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 5+ messages in thread
From: Vince Weaver @ 2011-01-28 22:03 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: nelakurthi koteswararao, linux-perf-users
On Fri, 28 Jan 2011, Arnaldo Carvalho de Melo wrote:
>
> My expectation so far, for basic tests, is not to precisely detect each
> and every event, but to make sure that if a big number of cache misses
> is being provoked, a big number of cache misses is being detected, to
> test the basic working of the system.
So do you want to just check for a non-zero number for the various
counts? That can be done easily.
If you want something closer, such as branch counts matching to within
10%, you end up needing assembly-coded benchmarks, as compiler variation
can make it hard to tell whether the results you are getting are right
or just a coincidence.
> After basic tests are in place, we go on trying to make them more
> precise as is practical.
You do have to be careful in situations like this. The in-kernel AMD
branch count was wrong for a few kernel releases. The count returned a
roughly plausible number of branches, but it turned out it was only
counting retired taken branches, not total retired branches. Only a
fairly exact assembly-language test I was working on exposed this
problem.
Another problem is with cache results; this is why I wrote these tests
to begin with. I had users of PAPI/perf_events tell me that our tool was
"broken" because they got implausibly low results for L2 cache misses
(on the order of maybe 20 misses for walking an array the size of the L2
cache). It turns out that the hardware prefetchers on modern machines
are so good that unless you do random walks of the array, it will look
like the counter is broken.
> > Even things you might think are simple, like retired-instructions can vary
> > wildly. There are enough differences caused by Linux and processor errata
> > with retired-instructions that I wrote a whole 10 page paper about the
> > differences you see.
>
> Cool, can I get the paper URL? I'd love to read it
Sure, it's:
http://web.eecs.utk.edu/~vweaver1/projects/deterministic/fhpm2010_weaver.pdf
> > I have started writing some validation tests, though they use the PAPI
> > library which runs on top of perf-events. You can see that and other
> > validation work linked to here:
> > http://web.eecs.utk.edu/~vweaver1/projects/validation-tests/
>
> Will look at it, thanks for the pointer, thanks for working on it!
I'd be glad to help a bit with this, as I think it's an important topic.
I should check out the current perf regression tests to see what's there.
I've been working at higher levels with libpfm4 and PAPI because a lot of
the events I have problems with aren't necessarily the ones exposed by
perf (without using raw events).
Vince
* Re: How to check perf commands are producing proper output or not?
2011-01-28 22:03 ` Vince Weaver
@ 2011-01-29 13:07 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-29 13:07 UTC (permalink / raw)
To: Vince Weaver; +Cc: nelakurthi koteswararao, Han Pingtian, linux-perf-users
Em Fri, Jan 28, 2011 at 05:03:06PM -0500, Vince Weaver escreveu:
> On Fri, 28 Jan 2011, Arnaldo Carvalho de Melo wrote:
> > My expectation so far, for basic tests, is not to precisely detect each
> > and every event, but to make sure that if a big number of cache misses
> > is being provoked, a big number of cache misses is being detected, to
> > test the basic working of the system.
>
> So do you want to just check for a non-zero number for the various counts?
> That can be easily done.
Yeah, to check the basic functioning of the infrastructure.
> If you want something closer, such as if something like branches match
> to within 10% you end up needing assembly-coded benchmarks, as compiler
> variation can make it hard to tell if the results you are getting are
> right or just coincidence.
> > After basic tests are in place, we go on trying to make them more
> > precise as is practical.
> You do have to be careful in situations like this. The in-kernel AMD
> branches count was wrong for a few kernel releases. The count returned
> a roughly plausible number of branches, but it turns out it was only
> counting retired-taken not retired-total branches. Only a fairly exact
> assembly-language test I was working on showed this problem.
And we want to prevent that from happening again for this specific case,
so regression tests need to be in place; total agreement here :-)
> Another problem is with cache results. This is why I wrote these tests to
> begin with. I had users of PAPI/perf_events tell me that our tool was
> "broken" because they got implausibly low results for L2 cache misses (on
> the order of maybe 20 misses for walking an array the size of L2 cache).
> It turns out that the HW prefetchers on modern machines are so good that
> unless you do random walks of the cache it will look like the counter is
> broken.
:-)
> > > Even things you might think are simple, like retired-instructions can vary
> > > wildly. There are enough differences caused by Linux and processor errata
> > > with retired-instructions that I wrote a whole 10 page paper about the
> > > differences you see.
> >
> > Cool, can I get the paper URL? I'd love to read it
>
> Sure, it's:
> http://web.eecs.utk.edu/~vweaver1/projects/deterministic/fhpm2010_weaver.pdf
Thanks!
> > > I have started writing some validation tests, though they use the PAPI
> > > library which runs on top of perf-events. You can see that and other
> > > validation work linked to here:
> > > http://web.eecs.utk.edu/~vweaver1/projects/validation-tests/
> >
> > Will look at it, thanks for the pointer, thanks for working on it!
>
> I'd be glad to help a bit with this, as I think it's an important topic.
> I should check out the current perf regression tests to see what's there.
> I've been working at higher levels with libpfm4 and PAPI because a lot of
> the events I have problems with aren't necessarily the ones exposed by
> perf (without using raw events).
The tests there are rather simple, basically exercising the
counter-creation infrastructure that I factored out of the tools and
into tools/perf/util/{evsel,evlist}.[ch] and that I'm exposing through a
python binding.
They create syscall tracepoint events that we can then trigger by making
the syscalls (I used open and the pid-getting routines), making them
happen on specific CPUs (using sched_setaffinity), and then checking
whether the syscalls generated on a particular CPU were correctly
counted.
No tests for hardware counters are in there now; this discussion should
provide insights to people interested in writing them :-)
- Arnaldo