[perf] howto switch from pfmon

public inbox for linux-kernel@vger.kernel.org

* [perf] howto switch from pfmon
@ 2009-06-22 20:54 Brice Goglin
  2009-06-23 12:12 ` Andi Kleen
  2009-06-23 13:14 ` Ingo Molnar
  0 siblings, 2 replies; 41+ messages in thread
From: Brice Goglin @ 2009-06-22 20:54 UTC (permalink / raw)
  To: Peter Zijlstra, paulus, mingo, LKML

Hello,

I am trying to play with perfcounters in current git (actually in latest
mmotm). I'd like to reproduce what I previously did with pfmon, but I
couldn't so far.

Something like
    pfmon --follow-exec 'foobar' -e
CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
-- <shell script>
gives the number of memory accesses to dram node #0 and #1 for all
processes whose name matches 'foobar'.

So there are several questions here:
1) is it possible to specify counter names like the above or do we have
to use raw counter numbers? I tried raw numbers from [1] without
success. How am I supposed to find and specify these raw numbers?
2) how do we specify "subevents"?
3) is there anything similar to --follow-exec, or --follow-pthreads for
getting separated outputs for each thread?

I guess there are still a lot of things on the TODOlist but I'd like to
understand a bit more where things are going. Sorry I didn't read all
the archives about this, there are way too many of them recently :)

thanks,
Brice

[1]
https://aiya.ms.mff.cuni.cz/svn/rip/trunk/doc/devel/native_events_barcelona.txt

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-06-22 20:54 [perf] howto switch from pfmon Brice Goglin
@ 2009-06-23 12:12 ` Andi Kleen
  2009-06-23 12:23   ` Peter Zijlstra
  2009-06-23 13:57   ` Ingo Molnar
  2009-06-23 13:14 ` Ingo Molnar
  1 sibling, 2 replies; 41+ messages in thread
From: Andi Kleen @ 2009-06-23 12:12 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, mingo, LKML

Brice Goglin <Brice.Goglin@inria.fr> writes:

> Hello,
>
> I am trying to play with perfcounters in current git (actually in latest
> mmotm). I'd like to reproduce what I previously did with pfmon, but I
> couldn't so far.
>
> Something like
>     pfmon --follow-exec 'foobar' -e
> CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> -- <shell script>
> gives the number of memory accesses to dram node #0 and #1 for all
> processes whose name matches 'foobar'.

My understanding based on recent emails on the topic is that the
perfctr gods decreed you are not to do any of this because they cannot
think of a use case for it, therefore none exist.

-Andi



-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [perf] howto switch from pfmon
  2009-06-23 12:12 ` Andi Kleen
@ 2009-06-23 12:23   ` Peter Zijlstra
  2009-06-23 13:57   ` Ingo Molnar
  1 sibling, 0 replies; 41+ messages in thread
From: Peter Zijlstra @ 2009-06-23 12:23 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Brice Goglin, paulus, mingo, LKML

On Tue, 2009-06-23 at 14:12 +0200, Andi Kleen wrote:
> Brice Goglin <Brice.Goglin@inria.fr> writes:
> 
> > Hello,
> >
> > I am trying to play with perfcounters in current git (actually in latest
> > mmotm). I'd like to reproduce what I previously did with pfmon, but I
> > couldn't so far.
> >
> > Something like
> >     pfmon --follow-exec 'foobar' -e
> > CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> > -- <shell script>
> > gives the number of memory accesses to dram node #0 and #1 for all
> > processes whose name matches 'foobar'.
> 
> My understanding based on recent emails on the topic is that the
> perfctr gods decreed you are not to do any of this because they cannot
> think of a use case for it, therefore none exist.

I wouldn't put it like that.

But we haven't gotten around to implementing uncore pmu stuff --
assuming that is what was meant.

What would be accurate is to say that we think uncore is a lot less
interesting than a lot of other pmu features.



* Re: [perf] howto switch from pfmon
  2009-06-23 12:12 ` Andi Kleen
  2009-06-23 12:23   ` Peter Zijlstra
@ 2009-06-23 13:57   ` Ingo Molnar
  1 sibling, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 13:57 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Brice Goglin, Peter Zijlstra, paulus, LKML


* Andi Kleen <andi@firstfloor.org> wrote:

> Brice Goglin <Brice.Goglin@inria.fr> writes:
> 
> > Hello,
> >
> > I am trying to play with perfcounters in current git (actually in latest
> > mmotm). I'd like to reproduce what I previously did with pfmon, but I
> > couldn't so far.
> >
> > Something like
> >     pfmon --follow-exec 'foobar' -e
> > CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> > -- <shell script>
> > gives the number of memory accesses to dram node #0 and #1 for all
> > processes whose name matches 'foobar'.
> 
> My understanding based on recent emails on the topic is that the 
> perfctr gods decreed you are not to do any of this because they 
> cannot think of a use case for it, therefore none exist.

You are working for Intel, right?

Is the trolling of AMD related threads now an officially sanctioned 
activity by Intel, or do you do it out of personal motivation, in 
your free time? I'd really like to know, because what you do here is 
quite unprofessional and quite a distraction.

[ Btw., 'perfctr' is the name of another project, the one you wanted
  to attack here is called 'perfcounters'. ]

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-22 20:54 [perf] howto switch from pfmon Brice Goglin
  2009-06-23 12:12 ` Andi Kleen
@ 2009-06-23 13:14 ` Ingo Molnar
  2009-06-23 13:22   ` Peter Zijlstra
                     ` (3 more replies)
  1 sibling, 4 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 13:14 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML

* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Hello,
>
> I am trying to play with perfcounters in current git (actually in
> latest mmotm). I'd like to reproduce what I previously did with
> pfmon, but I couldn't so far.
>
> Something like
>     pfmon --follow-exec 'foobar' -e
> CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> -- <shell script>
> gives the number of memory accesses to dram node #0 and #1 for all
> processes whose name matches 'foobar'.
>
> So there are several questions here:
> 1) is it possible to specify counter names like the above or do we have
> to use raw counter numbers? I tried raw numbers from [1] without
> success. How am I supposed to find and specify these raw numbers?
> 2) how do we specify "subevents"?
> 3) is there anything similar to --follow-exec, or --follow-pthreads for
> getting separated outputs for each thread?
>
> I guess there are still a lot of things on the TODOlist but I'd 
> like to understand a bit more where things are going. Sorry I 
> didn't read all the archives about this, there are way too many of 
> them recently :)

Yeah, there's indeed still a lot on the TODO list :-)

CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE is a Barcelona hardware event, 
so if you know that it maps to raw ID 0x100000e0 then you can always 
extend the events that 'perf' knows about via raw events:

 $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
 Time: 0.186

 Performance counter stats for './hackbench 10':

     4381248335  cycles
     1964394846  instructions         #      0.448 IPC
            838  raw 0x1000ffe0

    0.215382037  seconds time elapsed.

That 'r1000ffe0' is the raw event. You can also do a profile with
such events:

  perf record -f -e r1000ffe0 ./hackbench 10

and look at it via 'perf report'.

Figuring out raw codes is certainly avoidable, we could probably
integrate all the oprofile (and PAPI) event names into perf too,
from the /usr/share/oprofile/ event lists perhaps - for easier
migration for those who got used to those event names. It also gives
a wider set of events - which is useful if you got used to any
specific name.

The Barcelona events are listed in section 3.14 of "BIOS
and Kernel Developer's Guide for AMD Family 10h Processors", that's
where all the projects take these symbols from. If you want to
contribute then creating such tables for 'perf', for model-specific
events would certainly be useful.

[ Note, there's no need to specify any --follow-* flags as that is
  implicit in 'perf'. (and you'll probably also notice that perf
  stat is a lot faster at following fast-forking or
  context-switching workloads than is pfmon, because it's not ptrace
  based.) ]

And please let us know if you see any weirdness/difficulty while
using 'perf' or if you just notice some quirky thing in the tool.

Thanks,

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-23 13:14 ` Ingo Molnar
@ 2009-06-23 13:22   ` Peter Zijlstra
  2009-06-23 13:38     ` Ingo Molnar
  2009-06-23 13:25   ` Ingo Molnar
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 41+ messages in thread
From: Peter Zijlstra @ 2009-06-23 13:22 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Brice Goglin, paulus, LKML

On Tue, 2009-06-23 at 15:14 +0200, Ingo Molnar wrote:
> * Brice Goglin <Brice.Goglin@inria.fr> wrote:
> 
> > Hello,
> >
> > I am trying to play with perfcounters in current git (actually in
> > latest mmotm). I'd like to reproduce what I previously did with
> > pfmon, but I couldn't so far.
> >
> > Something like
> >     pfmon --follow-exec 'foobar' -e
> > CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> > -- <shell script>
> > gives the number of memory accesses to dram node #0 and #1 for all
> > processes whose name matches 'foobar'.
> >
> > So there are several questions here:
> > 1) is it possible to specify counter names like the above or do we have
> > to use raw counter numbers? I tried raw numbers from [1] without
> > success. How am I supposed to find and specify these raw numbers?
> > 2) how do we specify "subevents"?
> > 3) is there anything similar to --follow-exec, or --follow-pthreads for
> > getting separated outputs for each thread?
> >
> > I guess there are still a lot of things on the TODOlist but I'd 
> > like to understand a bit more where things are going. Sorry I 
> > didn't read all the archives about this, there are way too many of 
> > them recently :)
> 
> Yeah, there's indeed still a lot on the TODO list :-)
> 
> CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE is a Barcelona hardware event, 
> so if you know that it maps to raw ID 0x100000e0 then you can always 
> extend the events that 'perf' knows about via raw events:
> 
>  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
>  Time: 0.186
> 
>  Performance counter stats for './hackbench 10':
> 
>      4381248335  cycles
>      1964394846  instructions         #      0.448 IPC
>             838  raw 0x1000ffe0
> 
>     0.215382037  seconds time elapsed.

Just to clarify, the event code is 1E0h, and Ingo used an FFh unit mask.
These are combined using the arch masks below:

#define K7_EVNTSEL_EVENT_MASK   0x7000000FFULL
#define K7_EVNTSEL_UNIT_MASK    0x00000FF00ULL

to form the raw event code used: 0x1000ffe0
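The way the two masks combine can be sketched in shell arithmetic. This is only an illustration: amd_raw_event is a hypothetical helper name, not something perf provides, and it assumes the event select bits above bit 7 are placed starting at bit 32, as the event mask suggests. For event 1E0h with unit mask FFh it yields a nine-digit value, which agrees with the correction posted later in this thread rather than with the eight-digit value quoted here.

```shell
# Sketch: build the rNNN string for 'perf stat -e' from an AMD event select
# and unit mask. amd_raw_event is a hypothetical helper, not a perf feature.
EVENT_MASK=0x7000000FF   # K7_EVNTSEL_EVENT_MASK as quoted above
UNIT_MASK=0x00000FF00    # K7_EVNTSEL_UNIT_MASK as quoted above

amd_raw_event() {
    event=$(( $1 ))
    unit=$(( $2 ))
    # The low 8 bits of the event select stay in place and the bits above
    # them move up to bit 32; the unit mask goes into bits 8-15.
    sel=$(( ((event & 0xFF) | ((event >> 8) << 32)) & $EVENT_MASK ))
    um=$(( (unit << 8) & $UNIT_MASK ))
    printf 'r%x\n' $(( sel | um ))
}

amd_raw_event 0x1E0 0xFF   # prints r10000ffe0
```

The result can be passed straight to 'perf stat -e' or 'perf record -e'.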



* Re: [perf] howto switch from pfmon
  2009-06-23 13:22   ` Peter Zijlstra
@ 2009-06-23 13:38     ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 13:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Brice Goglin, paulus, LKML


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Tue, 2009-06-23 at 15:14 +0200, Ingo Molnar wrote:
> > * Brice Goglin <Brice.Goglin@inria.fr> wrote:
> > 
> > > Hello,
> > >
> > > I am trying to play with perfcounters in current git (actually in
> > > latest mmotm). I'd like to reproduce what I previously did with
> > > pfmon, but I couldn't so far.
> > >
> > > Something like
> > >     pfmon --follow-exec 'foobar' -e
> > > CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_0,CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE:LOCAL_TO_1
> > > -- <shell script>
> > > gives the number of memory accesses to dram node #0 and #1 for all
> > > processes whose name matches 'foobar'.
> > >
> > > So there are several questions here:
> > > 1) is it possible to specify counter names like the above or do we have
> > > to use raw counter numbers? I tried raw numbers from [1] without
> > > success. How am I supposed to find and specify these raw numbers?
> > > 2) how do we specify "subevents"?
> > > 3) is there anything similar to --follow-exec, or --follow-pthreads for
> > > getting separated outputs for each thread?
> > >
> > > I guess there are still a lot of things on the TODOlist but I'd 
> > > like to understand a bit more where things are going. Sorry I 
> > > didn't read all the archives about this, there are way too many of 
> > > them recently :)
> > 
> > Yeah, there's indeed still a lot on the TODO list :-)
> > 
> > CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE is a Barcelona hardware event, 
> > so if you know that it maps to raw ID 0x100000e0 then you can always 
> > extend the events that 'perf' knows about via raw events:
> > 
> >  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
> >  Time: 0.186
> > 
> >  Performance counter stats for './hackbench 10':
> > 
> >      4381248335  cycles
> >      1964394846  instructions         #      0.448 IPC
> >             838  raw 0x1000ffe0
> > 
> >     0.215382037  seconds time elapsed.
> 
> Just to clarify, the event code is 1E0h, and Ingo used an FFh unit mask.
> These are combined using the arch masks below:
> 
> #define K7_EVNTSEL_EVENT_MASK   0x7000000FFULL
> #define K7_EVNTSEL_UNIT_MASK    0x00000FF00ULL
> 
> to form the raw event code used: 0x1000ffe0

Yes. The individual node mappings are 01, 02 .. 80 - ff is 'all 8 
nodes'.
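Since each unit-mask bit selects one target node, the per-node raw events can be generated rather than typed by hand. A minimal sketch, assuming the nine-digit base code 0x1000000e0 for event 1E0h (the form used by the r1000001e0-style events elsewhere in this thread); node_events is a hypothetical helper name:

```shell
# Sketch: print one raw event per target node, where unit-mask bit N
# selects node N. node_events is a hypothetical helper name.
node_events() {
    n=0
    while [ "$n" -lt "$1" ]; do
        printf 'r%x\n' $(( 0x1000000E0 | ((1 << n) << 8) ))
        n=$(( n + 1 ))
    done
}

node_events 4
# prints:
# r1000001e0
# r1000002e0
# r1000004e0
# r1000008e0
```

Something like `perf stat $(node_events 4 | sed 's/^/-e /') ./your-app` would then count requests to each of the first four nodes.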

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-23 13:14 ` Ingo Molnar
  2009-06-23 13:22   ` Peter Zijlstra
@ 2009-06-23 13:25   ` Ingo Molnar
  2009-06-23 13:47   ` Ingo Molnar
  2009-06-23 14:21   ` Brice Goglin
  3 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 13:25 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Ingo Molnar <mingo@elte.hu> wrote:

> > I guess there are still a lot of things on the TODOlist but I'd 
> > like to understand a bit more where things are going. Sorry I 
> > didn't read all the archives about this, there are way too many 
> > of them recently :)
> 
> Yeah, there's indeed still a lot on the TODO list :-)
> 
> CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE is a Barcelona hardware event, 
> so if you know that it maps to raw ID 0x100000e0 then you can 
> always extend the events that 'perf' knows about via raw events:
> 
>  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10

Note, beyond using raw events, if you are interested in profiling 
out 'locality badness' of your app, you are probably quite well 
served with the default metrics on Barcelona as well:

 $ perf stat ~/hackbench  10
 Time: 0.205

  Performance counter stats for '/home/mingo/hackbench 10':

    2187.328436  task-clock-msecs     #      3.315 CPUs 
          54554  context-switches     #      0.025 M/sec
           1160  CPU-migrations       #      0.001 M/sec
          17755  page-faults          #      0.008 M/sec
     4995437535  cycles               #   2283.808 M/sec
     2150881875  instructions         #      0.431 IPC  
      644099534  cache-references     #    294.469 M/sec
        8516562  cache-misses         #      3.894 M/sec

    0.659895237  seconds time elapsed.

The cache-misses event occurs often enough that profiling based on it 
is meaningful. Raw DRAM access stats can be useful too - but DRAM sits 
further down the memory hierarchy, and your app can already be hurting 
from flip-flopping its working set without hitting the DRAM channels 
too hard.

So perhaps 'cache-misses' is a good first-level approximation metric 
to measure and profile along. You can get a good 
(last-level-)cache-misses profile using the auto-freq counters:

  perf record -e cache-misses -F 10000 ./your-app

The '-F 10000' tells the kernel to do 10 kHz sampling of your-app, 
regardless of how frequent cache-misses are. The tools (perf report) 
will take the weight of events into account, so it's all 
well-normalized between the functions.

So you don't need to specify the 'sampling interval' by hand to get a 
sufficient number of samples, you just specify a sampling frequency 
- and the perfcounters subsystem takes care of the rest.

Also, your system won't over-sample nor under-sample if your workload 
idles around occasionally.

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-23 13:14 ` Ingo Molnar
  2009-06-23 13:22   ` Peter Zijlstra
  2009-06-23 13:25   ` Ingo Molnar
@ 2009-06-23 13:47   ` Ingo Molnar
  2009-06-23 14:00     ` Brice Goglin
  2009-06-23 14:21   ` Brice Goglin
  3 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 13:47 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Ingo Molnar <mingo@elte.hu> wrote:

>  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
>  Time: 0.186

Correction: that should be r10000ffe0.

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-23 13:47   ` Ingo Molnar
@ 2009-06-23 14:00     ` Brice Goglin
  2009-06-23 14:36       ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-06-23 14:00 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
>
>   
>>  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
>>  Time: 0.186
>>     
>
> Correction: that should be r10000ffe0.
>   

Oh thanks a lot, it seems to work now!

One strange thing I noticed: sometimes perf reports that there were some
accesses to target numa nodes 4-7 while my box only has 4 numa nodes:
If I request counters only for the non-existing target numa nodes (4-7,
with -e r1000010e0 -e r1000020e0 -e r1000040e0 -e r1000080e0), I always
get 4 zeros.
But if I mix some counters from the existing nodes (0-3) with some
counters from non-existing nodes (4-7), the non-existing ones report
some small but non-empty values.
Does it ring any bell?

Brice


* Re: [perf] howto switch from pfmon
  2009-06-23 14:00     ` Brice Goglin
@ 2009-06-23 14:36       ` Ingo Molnar
  2009-06-23 15:22         ` Brice Goglin
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 14:36 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> > * Ingo Molnar <mingo@elte.hu> wrote:
> >
> >   
> >>  $ perf stat -e cycles -e instructions -e r1000ffe0 ./hackbench 10
> >>  Time: 0.186
> >>     
> >
> > Correction: that should be r10000ffe0.
> 
> Oh thanks a lot, it seems to work now!

btw., it might make sense to expose NUMA imbalance via generic 
enumeration. Right now we have:

        PERF_COUNT_HW_CPU_CYCLES                = 0,
        PERF_COUNT_HW_INSTRUCTIONS              = 1,
        PERF_COUNT_HW_CACHE_REFERENCES          = 2,
        PERF_COUNT_HW_CACHE_MISSES              = 3,
        PERF_COUNT_HW_BRANCH_INSTRUCTIONS       = 4,
        PERF_COUNT_HW_BRANCH_MISSES             = 5,
        PERF_COUNT_HW_BUS_CYCLES                = 6,

plus we have cache stats:

 * Generalized hardware cache counters:
 *
 *       { L1-D, L1-I, LLC, ITLB, DTLB, BPU } x
 *       { read, write, prefetch } x
 *       { accesses, misses }

NUMA is here to stay, and expressing local versus remote access 
stats seems useful. We could add two generic counters:

        PERF_COUNT_HW_RAM_LOCAL                 = 7,
        PERF_COUNT_HW_RAM_REMOTE                = 8,

And map them properly on all CPUs that support such stats. They'd be 
accessible via '-e ram-local-refs' and '-e ram-remote-refs' type of 
event symbols.

What is your typical usage pattern of this counter? What (general) 
kind of app do you profile with it and how do you make use of the 
specific node masks?

Would a local/all-remote distinction be enough, or do you need to 
make a distinction between the individual nodes to get the best 
insight into the workload?

> One strange thing I noticed: sometimes perf reports that there 
> were some accesses to target numa nodes 4-7 while my box only has 
> 4 numa nodes: If I request counters only for the non-existing 
> target numa nodes (4-7, with -e r1000010e0 -e r1000020e0 -e 
> r1000040e0 -e r1000080e0), I always get 4 zeros.
>
> But if I mix some counters from the existing nodes (0-3) with 
> some counters from non-existing nodes (4-7), the non-existing ones 
> report some small but non-empty values. Does it ring any bell?

I can see that too. I have a similar system (4 nodes), and if i use 
the stats for nodes 4-7 (non-existent) i get:

phoenix:~> perf stat -e r1000010e0 -e r1000020e0 -e r1000040e0 -e r1000080e0 --repeat 10 ./hackbench 30
Time: 0.490
Time: 0.435
Time: 0.492
Time: 0.569
Time: 0.491
Time: 0.498
Time: 0.549
Time: 0.530
Time: 0.543
Time: 0.482

 Performance counter stats for './hackbench 30' (10 runs):

              0  raw 0x1000010e0        ( +-   0.000% )
              0  raw 0x1000020e0        ( +-   0.000% )
              0  raw 0x1000040e0        ( +-   0.000% )
              0  raw 0x1000080e0        ( +-   0.000% )

    0.610303953  seconds time elapsed.

( Note the --repeat option - that way you can repeat workloads and 
  observe their statistical properties. )

If i try the first 4 nodes i get:

phoenix:~> perf stat -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 --repeat 10 ./hackbench 30
Time: 0.403
Time: 0.431
Time: 0.406
Time: 0.421
Time: 0.461
Time: 0.423
Time: 0.495
Time: 0.462
Time: 0.434
Time: 0.459

 Performance counter stats for './hackbench 30' (10 runs):

       52255370  raw 0x1000001e0        ( +-   5.510% )
       46052950  raw 0x1000002e0        ( +-   8.067% )
       45966395  raw 0x1000004e0        ( +-  10.341% )
       63240044  raw 0x1000008e0        ( +-  11.707% )

    0.530894007  seconds time elapsed.

Quite noisy across runs - which is expected on NUMA, as the memory 
allocations are not really deterministic and some more NUMA friendly 
than others. This box has all relevant NUMA options enabled:

 CONFIG_NUMA=y
 CONFIG_K8_NUMA=y
 CONFIG_X86_64_ACPI_NUMA=y
 CONFIG_ACPI_NUMA=y

But if i 'mix' counters, i too get weird stats:

phoenix:~> perf stat -e r1000020e0 -e r1000040e0 -e r1000080e0 -e r10000ffe0 --repeat 10 ./hackbench 30
Time: 0.432
Time: 0.446
Time: 0.428
Time: 0.472
Time: 0.443
Time: 0.454
Time: 0.398
Time: 0.438
Time: 0.403
Time: 0.463

 Performance counter stats for './hackbench 30' (10 runs):

        2355436  raw 0x1000020e0        ( +-   8.989% )
              0  raw 0x1000040e0        ( +-   0.000% )
              0  raw 0x1000080e0        ( +-   0.000% )
      204768941  raw 0x10000ffe0        ( +-   0.788% )

    0.528447241  seconds time elapsed.

That 2355436 count for node 5 should have been zero.
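To see which target nodes a given raw event covers, the unit mask can be decoded back into node numbers. A minimal sketch, assuming the unit mask sits in bits 8-15 of the raw code and bit N stands for node N; raw_event_nodes is a hypothetical helper name:

```shell
# Sketch: list the target nodes selected by a raw event string such as
# r1000020e0. raw_event_nodes is a hypothetical helper name.
raw_event_nodes() {
    raw=$(( 0x${1#r} ))                 # strip the leading 'r', parse as hex
    unit=$(( (raw >> 8) & 0xFF ))       # unit mask lives in bits 8-15
    n=0
    out=
    while [ "$n" -lt 8 ]; do
        if [ $(( (unit >> n) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$n"
        fi
        n=$(( n + 1 ))
    done
    printf '%s\n' "$out"
}

raw_event_nodes r1000020e0   # prints 5
raw_event_nodes r10000ffe0   # prints 0 1 2 3 4 5 6 7
```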

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-23 14:36       ` Ingo Molnar
@ 2009-06-23 15:22         ` Brice Goglin
  2009-06-29 19:29           ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-06-23 15:22 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> btw., it might make sense to expose NUMA imbalance via generic 
> enumeration. Right now we have:
>
>         PERF_COUNT_HW_CPU_CYCLES                = 0,
>         PERF_COUNT_HW_INSTRUCTIONS              = 1,
>         PERF_COUNT_HW_CACHE_REFERENCES          = 2,
>         PERF_COUNT_HW_CACHE_MISSES              = 3,
>         PERF_COUNT_HW_BRANCH_INSTRUCTIONS       = 4,
>         PERF_COUNT_HW_BRANCH_MISSES             = 5,
>         PERF_COUNT_HW_BUS_CYCLES                = 6,
>
> plus we have cache stats:
>
>  * Generalized hardware cache counters:
>  *
>  *       { L1-D, L1-I, LLC, ITLB, DTLB, BPU } x
>  *       { read, write, prefetch } x
>  *       { accesses, misses }
>   

By the way, is there a way to know which cache was actually used when we
request cache references/misses? Always the largest/top one by default?

> NUMA is here to stay, and expressing local versus remote access 
> stats seems useful. We could add two generic counters:
>
>         PERF_COUNT_HW_RAM_LOCAL                 = 7,
>         PERF_COUNT_HW_RAM_REMOTE                = 8,
>
> And map them properly on all CPUs that support such stats. They'd be 
> accessible via '-e ram-local-refs' and '-e ram-remote-refs' type of 
> event symbols.
>
> What is your typical usage pattern of this counter? What (general) 
> kind of app do you profile with it and how do you make use of the 
> specific node masks?
>
> Would a local/all-remote distinction be enough, or do you need to 
> make a distinction between the individual nodes to get the best 
> insight into the workload?
>   

People here work on OpenMP runtime systems where you try to keep threads
and data together. So in the end, what's important is to maximize the
overall local/remote access ratio. But during development, it may be useful
to have a distinction between individual nodes so as to understand
what's going on. That said, we still have raw numbers when we really
need that many details, and I don't know if it'd be easy for you to add
a generic counter with a sort of node-number attribute.

(including part of your other email here since it's relevant)

> How many threads does your workload typically run, and how do you 
> get their stats displayed?
>   

In the aforementioned OpenMP stuff, we use pfmon to get the local/remote
numa memory access ratio of each thread. In this specific case, we bind
one thread per core (even with an O(1) scheduler, people tend to avoid
launching hundreds of threads on current machines). pfmon gives us
something similar to the output of 'perf stat' in a file whose filename
contains process and thread IDs. We apply our own custom script to
convert these many pfmon output files into a single summary saying for
each thread, its thread ID, its core binding, its individual numa node
access numbers and percentages, and if they were local or remote (with
the Barcelona counters we were talking about, you need to check where
you were running before you know if accesses to node X are actually
local or remote accesses).
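The last step of such a summary, turning per-node counts into a local-access percentage once the thread's own node is known, could look roughly like the sketch below. It is not the actual script described above: local_ratio is a hypothetical helper, and the counts in the example call are made-up illustrative numbers.

```shell
# Sketch: local_ratio HOME_NODE COUNT0 COUNT1 ... prints the percentage of
# requests that went to the thread's own node. Hypothetical helper; the
# thread-to-node binding has to be supplied by the caller.
local_ratio() {
    home=$1
    shift
    printf '%s\n' "$*" | awk -v home="$home" '{
        total = 0
        for (i = 1; i <= NF; i++) total += $i
        if (total == 0) { print "n/a"; next }
        printf "%.1f\n", 100 * $(home + 1) / total
    }'
}

# Illustrative counts for nodes 0-3, thread bound to node 0.
local_ratio 0 900 50 30 20   # prints 90.0
```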

thanks,
Brice


* Re: [perf] howto switch from pfmon
  2009-06-23 15:22         ` Brice Goglin
@ 2009-06-29 19:29           ` Ingo Molnar
  2009-08-06 16:59             ` Brice Goglin
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-06-29 19:29 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> > How many threads does your workload typically run, and how do 
> > you get their stats displayed?
> 
> In the aforementioned OpenMP stuff, we use pfmon to get the 
> local/remote numa memory access ratio of each thread. In this 
> specific case, we bind one thread per core (even with an O(1) 
> scheduler, people tend to avoid launching hundreds of threads on 
> current machines). pfmon gives us something similar to the output 
> of 'perf stat' in a file whose filename contains process and 
> thread IDs. We apply our own custom script to convert these many 
> pfmon output files into a single summary saying for each thread, 
> its thread ID, its core binding, its individual numa node access 
> numbers and percentages, and if they were local or remote (with 
> the Barcelona counters we were talking about, you need to check 
> where you were running before you know if accesses to node X are 
> actually local or remote accesses).

Update: based on your feedback the latest perfcounters tree includes 
the following new perf record features:

    -s, --stat            per thread counts
    -n, --no-samples      don't sample

--stat instructs the kernel to gather precise per task/thread stats 
and emits those counts to the data file. Via --no-samples one can do 
non-profiling runs - i.e. only statistics collection.

The 'perf stat' pretty printing side is not fully implemented yet - 
right now you can only see these stats if you look for 
PERF_EVENT_READ counts in the raw event log:

   perf report -D | grep PERF_EVENT_READ

But the biggest piece, the kernel and perf record side is there 
already. What kind of output would you prefer? Maybe you'd like to 
take a stab at implementing the perf report side?

	Ingo


* Re: [perf] howto switch from pfmon
  2009-06-29 19:29           ` Ingo Molnar
@ 2009-08-06 16:59             ` Brice Goglin
  2009-08-06 17:40               ` Peter Zijlstra
  0 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 16:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> Update: based on your feedback the latest perfcounters tree includes 
> the following new perf record features:
>
>     -s, --stat            per thread counts
>     -n, --no-samples      don't sample
>
> --stat instructs the kernel to gather precise per task/thread stats 
> and emits those counts to the data file. Via --no-samples one can do 
> non-profiling runs - i.e. only statistics collection.
>
> The 'perf stat' pretty printing side is not fully implemented yet - 
> right now you can only see these stats if you look for 
> PERF_EVENT_READ counts in the raw event log:
>
>    perf report -D | grep PERF_EVENT_READ
>   

Ok I am getting remote/local accesses for my threads now. But I am not
sure which line corresponds to which event.

$ /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s
-e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream
[...]
$ /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf report -D |
grep PERF_EVENT_READ
0x7cd0 [0x30]: PERF_EVENT_READ: 4651 4651 210827
0x7388 [0x30]: PERF_EVENT_READ: 4651 4651 241742
0x8cf0 [0x30]: PERF_EVENT_READ: 4651 4651 315938
0x9da0 [0x30]: PERF_EVENT_READ: 4651 4651 9461794
0x7208 [0x30]: PERF_EVENT_READ: 4651 4652 24954
0x8c90 [0x30]: PERF_EVENT_READ: 4651 4652 408056
0x7ca0 [0x30]: PERF_EVENT_READ: 4651 4652 8962423
0x9ce0 [0x30]: PERF_EVENT_READ: 4651 4652 9117
0x7df0 [0x30]: PERF_EVENT_READ: 4651 4653 21645
0x9d70 [0x30]: PERF_EVENT_READ: 4651 4653 23606
0x7358 [0x30]: PERF_EVENT_READ: 4651 4653 29266
0x8e70 [0x30]: PERF_EVENT_READ: 4651 4653 9339173
[...]

I can easily sort them by thread id, but I don't know how to match my 4
events with each group of 4 lines.

Maybe perf report learned some better way to show per-thread statistics
in the meantime?
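The grouping by thread id can be sketched with awk. It assumes the line layout shown above (pid, tid and value as the last three fields) and only counts and sums the reads per thread; it does not recover which event each line belongs to. per_thread_totals is a hypothetical helper name.

```shell
# Sketch: group PERF_EVENT_READ lines by thread id and print
# "tid number-of-reads sum-of-values". Hypothetical helper; it cannot tell
# which event a given line came from.
per_thread_totals() {
    awk '$3 == "PERF_EVENT_READ:" { n[$5]++; sum[$5] += $6 }
         END { for (tid in sum) printf "%s %d %d\n", tid, n[tid], sum[tid] }' |
        sort -n
}

# Three lines copied from the output above stand in for 'perf report -D'.
per_thread_totals <<'EOF'
0x7cd0 [0x30]: PERF_EVENT_READ: 4651 4651 210827
0x7388 [0x30]: PERF_EVENT_READ: 4651 4651 241742
0x7208 [0x30]: PERF_EVENT_READ: 4651 4652 24954
EOF
# prints:
# 4651 2 452569
# 4652 1 24954
```

On real data the input would come from `perf report -D | per_thread_totals`.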

Brice



* Re: [perf] howto switch from pfmon
  2009-08-06 16:59             ` Brice Goglin
@ 2009-08-06 17:40               ` Peter Zijlstra
  2009-08-06 17:48                 ` Brice Goglin
  2009-08-06 19:01                 ` [perf] howto switch from pfmon Brice Goglin
  0 siblings, 2 replies; 41+ messages in thread
From: Peter Zijlstra @ 2009-08-06 17:40 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Ingo Molnar, paulus, LKML

On Thu, 2009-08-06 at 18:59 +0200, Brice Goglin wrote:

> I can easily sort them by thread id, but I don't know how to match my 4
> events with each group of 4 lines.
> 
> Maybe perf report gained some better way to show per-thread statistics
> in the meantime?

Nah, it needs some love..

The below might be a starting point, it compiles, didn't check the
result. builtin-stat might be a nice place to look for more bits..

---

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8cb58d6..c053fd8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -112,7 +112,9 @@ struct read_event {
 	struct perf_event_header header;
 	u32 pid,tid;
 	u64 value;
-	u64 format[3];
+	u64 time_enabled;
+	u64 time_running;
+	u64 id;
 };
 
 typedef union event_union {
@@ -1690,14 +1692,37 @@ static void trace_event(event_t *event)
 	dprintf(".\n");
 }
 
+static struct perf_header	*header;
+
+static struct perf_counter_attr *perf_header__find_attr(u64 id)
+{
+	int i;
+
+	for (i = 0; i < header->attrs; i++) {
+		struct perf_header_attr *attr = header->attr[i];
+		int j;
+
+		for (j = 0; j < attr->ids; j++) {
+			if (attr->id[j] == id)
+				return &attr->attr;
+		}
+	}
+
+	return NULL;
+}
+
 static int
 process_read_event(event_t *event, unsigned long offset, unsigned long head)
 {
-	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %Lu\n",
+	struct perf_counter_attr *attr = perf_header__find_attr(event->read.id);
+
+	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %s %Lu\n",
 			(void *)(offset + head),
 			(void *)(long)(event->header.size),
 			event->read.pid,
 			event->read.tid,
+			attr ? __event_name(attr->type, attr->config)
+			     : "FAIL",
 			event->read.value);
 
 	return 0;
@@ -1743,8 +1768,6 @@ process_event(event_t *event, unsigned long offset, unsigned long head)
 	return 0;
 }
 
-static struct perf_header	*header;
-
 static u64 perf_header__sample_type(void)
 {
 	u64 sample_type = 0;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 7bdad8d..4858d83 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -223,9 +239,15 @@ char *event_name(int counter)
 {
 	u64 config = attrs[counter].config;
 	int type = attrs[counter].type;
+
+	return __event_name(type, config);
+}
+
+char *__event_name(int type, u64 config)
+{
 	static char buf[32];
 
-	if (attrs[counter].type == PERF_TYPE_RAW) {
+	if (type == PERF_TYPE_RAW) {
 		sprintf(buf, "raw 0x%llx", config);
 		return buf;
 	}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1ea5d09..192a962 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -10,6 +10,7 @@ extern int			nr_counters;
 extern struct perf_counter_attr attrs[MAX_COUNTERS];
 
 extern char *event_name(int ctr);
+extern char *__event_name(int type, u64 config);
 
 extern int parse_events(const struct option *opt, const char *str, int unset);
 



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-08-06 17:40               ` Peter Zijlstra
@ 2009-08-06 17:48                 ` Brice Goglin
  2009-08-06 17:59                   ` Peter Zijlstra
  2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
  2009-08-06 19:01                 ` [perf] howto switch from pfmon Brice Goglin
  1 sibling, 2 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 17:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, paulus, LKML

Peter Zijlstra wrote:
> On Thu, 2009-08-06 at 18:59 +0200, Brice Goglin wrote:
>
>   
>> I can easily sort them by thread id, but I don't know how to match my 4
>> events with each group of 4 lines.
>>
>> Maybe perf report gained some better way to show per-thread statistics
>> in the meantime?
>>     
>
> Nah, it needs some love..
>
> The below might be a starting point, it compiles, didn't check the
> result. builtin-stat might be a nice place to look for more bits..
>   

Thanks, now I get for each thread:

0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 209113
0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 307215
0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 9203221
0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628

Looks like it fails to stringify my raw events, except the first one.

Brice


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-08-06 17:48                 ` Brice Goglin
@ 2009-08-06 17:59                   ` Peter Zijlstra
  2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
  1 sibling, 0 replies; 41+ messages in thread
From: Peter Zijlstra @ 2009-08-06 17:59 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Ingo Molnar, paulus, LKML

On Thu, 2009-08-06 at 19:48 +0200, Brice Goglin wrote:
> Peter Zijlstra wrote:
> > On Thu, 2009-08-06 at 18:59 +0200, Brice Goglin wrote:
> >
> >   
> >> I can easily sort them by thread id, but I don't know how to match my 4
> >> events with each group of 4 lines.
> >>
> >> Maybe perf report gained some better way to show per-thread statistics
> >> in the meantime?
> >>     
> >
> > Nah, it needs some love..
> >
> > The below might be a starting point, it compiles, didn't check the
> > result. builtin-stat might be a nice place to look for more bits..
> >   
> 
> Thanks, now I get for each thread:
> 
> 0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 209113
> 0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 307215
> 0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 9203221
> 0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628
> 
> Looks like it fails to stringify my raw events, except the first one.

/me mumbles something unintelligible

I'll try and sort that out after dinner, unless you beat me to it :-)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 17:48                 ` Brice Goglin
  2009-08-06 17:59                   ` Peter Zijlstra
@ 2009-08-06 18:57                   ` Peter Zijlstra
  2009-08-06 19:03                     ` Brice Goglin
                                       ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: Peter Zijlstra @ 2009-08-06 18:57 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Ingo Molnar, paulus, LKML

On Thu, 2009-08-06 at 19:48 +0200, Brice Goglin wrote:
> Peter Zijlstra wrote:
> > On Thu, 2009-08-06 at 18:59 +0200, Brice Goglin wrote:
> >
> >   
> >> I can easily sort them by thread id, but I don't know how to match my 4
> >> events with each group of 4 lines.
> >>
> >> Maybe perf report gained some better way to show per-thread statistics
> >> in the meantime?
> >>     
> >
> > Nah, it needs some love..
> >
> > The below might be a starting point, it compiles, didn't check the
> > result. builtin-stat might be a nice place to look for more bits..
> >   
> 
> Thanks, now I get for each thread:
> 
> 0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 209113
> 0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 307215
> 0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 FAIL 9203221
> 0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628
> 
> Looks like it fails to stringify my raw events, except the first one.

---
Subject: perf tools: Fix reading of perf.data file header

A silly mistake made us re-read the first attribute for every attribute.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 450384b..95a44bc 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -213,9 +213,10 @@ struct perf_header *perf_header__read(int fd)
 
 	for (i = 0; i < nr_attrs; i++) {
 		struct perf_header_attr *attr;
-		off_t tmp = lseek(fd, 0, SEEK_CUR);
+		off_t tmp;
 
 		do_read(fd, &f_attr, sizeof(f_attr));
+		tmp = lseek(fd, 0, SEEK_CUR);
 
 		attr = perf_header_attr__new(&f_attr.attr);
 



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
@ 2009-08-06 19:03                     ` Brice Goglin
  2009-08-06 19:59                       ` Ingo Molnar
  2009-08-07  6:37                     ` [tip:perfcounters/urgent] perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header tip-bot for Peter Zijlstra
  2009-08-07  7:39                     ` tip-bot for Peter Zijlstra
  2 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 19:03 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, paulus, LKML

$ /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf report -D |
grep _READ | sort -k5
0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628
0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000002e0 209113
0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000004e0 307215
0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000008e0 9203221
0x8a08 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000001e0 9210788
0x9020 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000002e0 302344
0x9608 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000004e0 198705
0x9d28 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000008e0 473471
[...]

Now I know which thread actually reads from where.
Looks like we're good to go now! thanks a lot Peter!

Tested-by: Brice Goglin <Brice.Goglin@inria.fr>

Brice



Peter Zijlstra wrote:
> Subject: perf tools: Fix reading of perf.data file header
>
> A silly mistake made us re-read the first attribute for every attribute.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 450384b..95a44bc 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -213,9 +213,10 @@ struct perf_header *perf_header__read(int fd)
>  
>  	for (i = 0; i < nr_attrs; i++) {
>  		struct perf_header_attr *attr;
> -		off_t tmp = lseek(fd, 0, SEEK_CUR);
> +		off_t tmp;
>  
>  		do_read(fd, &f_attr, sizeof(f_attr));
> +		tmp = lseek(fd, 0, SEEK_CUR);
>  
>  		attr = perf_header_attr__new(&f_attr.attr);
>  
>
>
>   


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 19:03                     ` Brice Goglin
@ 2009-08-06 19:59                       ` Ingo Molnar
  2009-08-06 20:03                         ` Brice Goglin
  2009-08-06 23:35                         ` Brice Goglin
  0 siblings, 2 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-08-06 19:59 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> $ /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf report -D |
> grep _READ | sort -k5
> 0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628
> 0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000002e0 209113
> 0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000004e0 307215
> 0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000008e0 9203221
> 0x8a08 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000001e0 9210788
> 0x9020 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000002e0 302344
> 0x9608 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000004e0 198705
> 0x9d28 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000008e0 473471
> [...]
> 
> Now I know which thread actually reads from where.
> Looks like we're good to go now! thanks a lot Peter!
> 
> Tested-by: Brice Goglin <Brice.Goglin@inria.fr>

Thanks Brice.

It would be nice to add this as some "perf report -s/--stats" flag, 
to not have to go via -D (which is a 'print debug output' kind of 
ad-hoc thing and subject to format changes in the future).

Would you be interested in sending a patch that adds that flag to 
'perf report', to print out these statistics entries (if any), in a 
tabular form suitable for your purposes? Below is a past patch to 
builtin-report.c that shows how to add new options.

	Ingo

-------------------->
From 429764873cf3fc3e73142872a674bb27cda589c1 Mon Sep 17 00:00:00 2001
From: Mike Galbraith <efault@gmx.de>
Date: Thu, 2 Jul 2009 08:09:46 +0200
Subject: [PATCH] perf_counter tools: Enable kernel module symbol loading in tools

Add the -m/--modules option to perf report and perf annotate,
which enables live module symbol/image loading. To be used
with -k/--vmlinux.

(Also give perf annotate a -P/--full-paths option.)

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246514986.13293.48.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/builtin-annotate.c |   25 ++++++++++++++++++++-----
 tools/perf/builtin-report.c   |    9 ++++++++-
 tools/perf/builtin-top.c      |   12 ++++++++++--
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 8820568..08ea6c5 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -43,6 +43,10 @@ static int		dump_trace = 0;
 
 static int		verbose;
 
+static int		modules;
+
+static int		full_paths;
+
 static int		print_line;
 
 static unsigned long	page_size;
@@ -171,7 +175,7 @@ static int load_kernel(void)
 	if (!kernel_dso)
 		return -1;
 
-	err = dso__load_kernel(kernel_dso, vmlinux, NULL, verbose, 0);
+	err = dso__load_kernel(kernel_dso, vmlinux, NULL, verbose, modules);
 	if (err <= 0) {
 		dso__delete(kernel_dso);
 		kernel_dso = NULL;
@@ -1268,19 +1272,25 @@ static void print_summary(char *filename)
 
 static void annotate_sym(struct dso *dso, struct symbol *sym)
 {
-	char *filename = dso->name;
+	char *filename = dso->name, *d_filename;
 	u64 start, end, len;
 	char command[PATH_MAX*2];
 	FILE *file;
 
 	if (!filename)
 		return;
-	if (dso == kernel_dso)
+	if (sym->module)
+		filename = sym->module->path;
+	else if (dso == kernel_dso)
 		filename = vmlinux;
 
 	start = sym->obj_start;
 	if (!start)
 		start = sym->start;
+	if (full_paths)
+		d_filename = filename;
+	else
+		d_filename = basename(filename);
 
 	end = start + sym->end - sym->start + 1;
 	len = sym->end - sym->start;
@@ -1291,13 +1301,14 @@ static void annotate_sym(struct dso *dso, struct symbol *sym)
 	}
 
 	printf("\n\n------------------------------------------------\n");
-	printf(" Percent |	Source code & Disassembly of %s\n", filename);
+	printf(" Percent |	Source code & Disassembly of %s\n", d_filename);
 	printf("------------------------------------------------\n");
 
 	if (verbose >= 2)
 		printf("annotating [%p] %30s : [%p] %30s\n", dso, dso->name, sym, sym->name);
 
-	sprintf(command, "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS %s", (u64)start, (u64)end, filename);
+	sprintf(command, "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS %s|grep -v %s",
+			(u64)start, (u64)end, filename, filename);
 
 	if (verbose >= 3)
 		printf("doing: %s\n", command);
@@ -1472,8 +1483,12 @@ static const struct option options[] = {
 	OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
 		    "dump raw trace in ASCII"),
 	OPT_STRING('k', "vmlinux", &vmlinux, "file", "vmlinux pathname"),
+	OPT_BOOLEAN('m', "modules", &modules,
+		    "load module symbols - WARNING: use only with -k and LIVE kernel"),
 	OPT_BOOLEAN('l', "print-line", &print_line,
 		    "print matching source lines (may be slow)"),
+	OPT_BOOLEAN('P', "full-paths", &full_paths,
+		    "Don't shorten the displayed pathnames"),
 	OPT_END()
 };
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 38d136f..b44476c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -46,6 +46,8 @@ static int		dump_trace = 0;
 static int		verbose;
 #define eprintf(x...)	do { if (verbose) fprintf(stderr, x); } while (0)
 
+static int		modules;
+
 static int		full_paths;
 
 static unsigned long	page_size;
@@ -188,7 +190,7 @@ static int load_kernel(void)
 	if (!kernel_dso)
 		return -1;
 
-	err = dso__load_kernel(kernel_dso, vmlinux, NULL, verbose, 0);
+	err = dso__load_kernel(kernel_dso, vmlinux, NULL, verbose, modules);
 	if (err <= 0) {
 		dso__delete(kernel_dso);
 		kernel_dso = NULL;
@@ -648,6 +650,9 @@ sort__sym_print(FILE *fp, struct hist_entry *self)
 		ret += fprintf(fp, "[%c] %s",
 			self->dso == kernel_dso ? 'k' :
 			self->dso == hypervisor_dso ? 'h' : '.', self->sym->name);
+
+		if (self->sym->module)
+			ret += fprintf(fp, "\t[%s]", self->sym->module->name);
 	} else {
 		ret += fprintf(fp, "%#016llx", (u64)self->ip);
 	}
@@ -1710,6 +1715,8 @@ static const struct option options[] = {
 	OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
 		    "dump raw trace in ASCII"),
 	OPT_STRING('k', "vmlinux", &vmlinux, "file", "vmlinux pathname"),
+	OPT_BOOLEAN('m', "modules", &modules,
+		    "load module symbols - WARNING: use only with -k and LIVE kernel"),
 	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
 		   "sort by key(s): pid, comm, dso, symbol, parent"),
 	OPT_BOOLEAN('P', "full-paths", &full_paths,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9bb25fc..aa044ea 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -66,6 +66,7 @@ static unsigned int		page_size;
 static unsigned int		mmap_pages			= 16;
 static int			freq				=  0;
 static int			verbose				=  0;
+static char			*vmlinux			=  NULL;
 
 static char			*sym_filter;
 static unsigned long		filter_start;
@@ -265,7 +266,10 @@ static void print_sym_table(void)
 			printf("%9.1f %10ld - ", syme->weight, syme->snap_count);
 
 		color_fprintf(stdout, color, "%4.1f%%", pcnt);
-		printf(" - %016llx : %s\n", sym->start, sym->name);
+		printf(" - %016llx : %s", sym->start, sym->name);
+		if (sym->module)
+			printf("\t[%s]", sym->module->name);
+		printf("\n");
 	}
 }
 
@@ -359,12 +363,13 @@ static int parse_symbols(void)
 {
 	struct rb_node *node;
 	struct symbol  *sym;
+	int modules = vmlinux ? 1 : 0;
 
 	kernel_dso = dso__new("[kernel]", sizeof(struct sym_entry));
 	if (kernel_dso == NULL)
 		return -1;
 
-	if (dso__load_kernel(kernel_dso, NULL, symbol_filter, 1, 0) <= 0)
+	if (dso__load_kernel(kernel_dso, vmlinux, symbol_filter, verbose, modules) <= 0)
 		goto out_delete_dso;
 
 	node = rb_first(&kernel_dso->syms);
@@ -680,6 +685,7 @@ static const struct option options[] = {
 			    "system-wide collection from all CPUs"),
 	OPT_INTEGER('C', "CPU", &profile_cpu,
 		    "CPU to profile on"),
+	OPT_STRING('k', "vmlinux", &vmlinux, "file", "vmlinux pathname"),
 	OPT_INTEGER('m', "mmap-pages", &mmap_pages,
 		    "number of mmap data pages"),
 	OPT_INTEGER('r', "realtime", &realtime_prio,
@@ -709,6 +715,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __used)
 {
 	int counter;
 
+	symbol__init();
+
 	page_size = sysconf(_SC_PAGE_SIZE);
 
 	argc = parse_options(argc, argv, options, top_usage, 0);

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 19:59                       ` Ingo Molnar
@ 2009-08-06 20:03                         ` Brice Goglin
  2009-08-06 23:35                         ` Brice Goglin
  1 sibling, 0 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 20:03 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> * Brice Goglin <Brice.Goglin@inria.fr> wrote:
>
>   
>> $ /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf report -D |
>> grep _READ | sort -k5
>> 0x8bb8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000001e0 494628
>> 0x8fc0 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000002e0 209113
>> 0x9698 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000004e0 307215
>> 0x9cf8 [0x30]: PERF_EVENT_READ: 6268 6268 raw 0x1000008e0 9203221
>> 0x8a08 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000001e0 9210788
>> 0x9020 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000002e0 302344
>> 0x9608 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000004e0 198705
>> 0x9d28 [0x30]: PERF_EVENT_READ: 6268 6269 raw 0x1000008e0 473471
>> [...]
>>
>> Now I know which thread actually reads from where.
>> Looks like we're good to go now! thanks a lot Peter!
>>
>> Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
>>     
>
> Thanks Brice.
>
> It would be nice to add this as some "perf report -s/--stats" flag, 
> to not have to go via -D (which is a 'print debug output' kind of 
> ad-hoc thing and subject to format changes in the future).
>
> Would you be interested in sending a patch that adds that flag to 
> 'perf report', to print out these statistics entries (if any), in a 
> tabular form suitable for your purposes? Below is a past patch to 
> builtin-report.c that shows how to add new options.
>
>   

I'll see what I can do. I have been looking at the code tonight before
Peter fixed the last problem but I didn't manage to understand much of
the code so far. So thanks for the example patch :)

Brice


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 19:59                       ` Ingo Molnar
  2009-08-06 20:03                         ` Brice Goglin
@ 2009-08-06 23:35                         ` Brice Goglin
  2009-08-07  6:13                           ` Brice Goglin
  2009-08-07  6:32                           ` Ingo Molnar
  1 sibling, 2 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 23:35 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> It would be nice to add this as some "perf report -s/--stats" flag, 
> to not have to go via -D (which is a 'print debug output' kind of 
> ad-hoc thing and subject to format changes in the future).
>
> Would you be interested in sending a patch that adds that flag to 
> 'perf report', to print out these statistics entries (if any), in a 
> tabular form suitable for your purposes? Below is a past patch to 
> builtin-report.c that shows how to add new options.
>   

Here's a quick'n'dirty first try. Read events are copied into the
show_stat_event array during process_read_event, and __cmd_report sorts
the array by tid before displaying it.

perf report -S (-s is already used for --sort) now prints the following
after the existing output. It shows things like
# Per-thread statistics:
# PID    TID       Event          Count
  16709   16709   cache-misses   82727
  16709   16709   cache-references   41238768
  16709   16710   cache-misses   6462
  16709   16710   cache-references   76119375
or
# Per-thread statistics:
# PID    TID       Event          Count
  6268   6268   raw 0x1000001e0   494628
  6268   6268   raw 0x1000002e0   209113
  6268   6268   raw 0x1000004e0   307215
  6268   6268   raw 0x1000008e0   9203221
  6268   6269   raw 0x1000001e0   9210788
  6268   6269   raw 0x1000002e0   302344
  6268   6269   raw 0x1000004e0   198705
  6268   6269   raw 0x1000008e0   473471

Obviously, there's still a lot of pretty printing to do, but you'll
be able to tell whether the general idea is ok or not.
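
The sort-by-tid step can be sketched on its own as follows. This is a
minimal standalone illustration, not the patch itself: struct stat_rec
is a hypothetical stand-in for struct read_event, and the comparator
uses explicit comparisons instead of subtracting the two tids, with an
extra tie-break on the counter id so that the order within one thread is
deterministic (qsort is not guaranteed to be stable).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fields of a read event used here. */
struct stat_rec {
	uint32_t pid, tid;
	uint64_t id;		/* counter id */
	uint64_t value;		/* counter value */
};

/* Order by tid first, then by counter id. */
static int cmp_by_tid(const void *a, const void *b)
{
	const struct stat_rec *x = a;
	const struct stat_rec *y = b;

	if (x->tid != y->tid)
		return x->tid < y->tid ? -1 : 1;
	if (x->id != y->id)
		return x->id < y->id ? -1 : 1;
	return 0;
}

/* Sort n records in place so that each thread's counters are adjacent. */
static void sort_by_tid(struct stat_rec *recs, size_t n)
{
	assert(recs != NULL || n == 0);
	qsort(recs, n, sizeof(*recs), cmp_by_tid);
}
```

Comparing instead of subtracting avoids relying on the difference of two
unsigned values fitting in an int, at the cost of a few more lines.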

Brice


Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>

Index: linux-2.6.31/tools/perf/builtin-report.c
===================================================================
--- linux-2.6.31.orig/tools/perf/builtin-report.c	2009-08-07 00:42:55.000000000 +0200
+++ linux-2.6.31/tools/perf/builtin-report.c	2009-08-07 01:26:33.000000000 +0200
@@ -24,6 +24,8 @@
 #include "util/parse-options.h"
 #include "util/parse-events.h"
 
+#include <stdlib.h>
+
 #define SHOW_KERNEL	1
 #define SHOW_USER	2
 #define SHOW_HV		4
@@ -52,6 +54,10 @@
 
 static int		full_paths;
 static int		show_nr_samples;
+static int		show_stat;
+static int		show_stat_events;
+static int		show_stat_event_max;
+static struct read_event	*show_stat_event;
 
 static unsigned long	page_size;
 static unsigned long	mmap_window = 32;
@@ -126,6 +132,8 @@
 	struct read_event		read;
 } event_t;
 
+static struct perf_counter_attr *perf_header__find_attr(u64 id);
+
 static int repsep_fprintf(FILE *fp, const char *fmt, ...)
 {
 	int n;
@@ -1350,6 +1358,13 @@
 	}
 }
 
+static int compar_read_event_by_tid(const void *e1, const void *e2)
+{
+	const struct read_event *event1 = e1;
+	const struct read_event *event2 = e2;
+	return event1->tid - event2->tid;
+}
+
 static size_t output__fprintf(FILE *fp, u64 total_samples)
 {
 	struct hist_entry *pos;
@@ -1430,6 +1445,21 @@
 	}
 	fprintf(fp, "\n");
 
+	if (show_stat && show_stat_events) {
+		int i;
+		qsort(&show_stat_event[0], show_stat_events, sizeof(struct read_event), compar_read_event_by_tid);
+		fprintf(fp, "# Per-thread statistics:\n");
+		fprintf(fp, "# PID    TID       Event          Count\n");
+		for(i=0; i<show_stat_events; i++) {
+			struct read_event *event = &show_stat_event[i];
+			struct perf_counter_attr *attr = perf_header__find_attr(event->id);
+			printf("  %d   %d   %s   %Lu\n",
+			       event->pid, event->tid,
+			       attr ? __event_name(attr->type, attr->config) : "unknown",
+			       event->value);
+		}
+	}
+
 	return ret;
 }
 
@@ -1703,6 +1733,24 @@
 {
 	struct perf_counter_attr *attr = perf_header__find_attr(event->read.id);
 
+	if (show_stat) {
+		if (!show_stat_event) {
+			show_stat_events = 0;
+			show_stat_event_max = 16;
+			show_stat_event = malloc(show_stat_event_max * sizeof(*show_stat_event));
+			if (!show_stat_event)
+				die("cannot allocate show_stat_event array");
+		}
+		if (show_stat_events == show_stat_event_max) {
+			show_stat_event_max *= 2;
+			show_stat_event = realloc(show_stat_event, show_stat_event_max * sizeof(*show_stat_event));
+			if (!show_stat_event)
+				die("cannot enlarge show_stat_event array");
+		}
+		memcpy(&show_stat_event[show_stat_events], &event->read, sizeof(struct read_event));
+		show_stat_events++;
+	}
+
 	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %s %Lu\n",
 			(void *)(offset + head),
 			(void *)(long)(event->header.size),
@@ -1998,6 +2046,8 @@
 		    "Show a column with the number of samples"),
 	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
 		   "sort by key(s): pid, comm, dso, symbol, parent"),
+	OPT_BOOLEAN('S', "stat", &show_stat,
+		    "show per-thread event counters"),
 	OPT_BOOLEAN('P', "full-paths", &full_paths,
 		    "Don't shorten the pathnames taking into account the cwd"),
 	OPT_STRING('p', "parent", &parent_pattern, "regex",



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 23:35                         ` Brice Goglin
@ 2009-08-07  6:13                           ` Brice Goglin
  2009-08-07  6:32                           ` Ingo Molnar
  1 sibling, 0 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-07  6:13 UTC (permalink / raw)
  Cc: Ingo Molnar, Peter Zijlstra, paulus, LKML

Brice Goglin wrote:
> Ingo Molnar wrote:
>   
>> It would be nice to add this as some "perf report -s/--stats" flag, 
>> to not have to go via -D (which is a 'print debug output' kind of 
>> ad-hoc thing and subject to format changes in the future).
>>
>> Would you be interested in sending a patch that adds that flag to 
>> 'perf report', to print out these statistics entries (if any), in a 
>> tabular form suitable for your purposes? Below is a past patch to 
>> builtin-report.c that shows how to add new options.
>>   
>>     
>
> Here's a quick'n'dirty first try. Read events are copied into the
> show_stat_event array during process_read_event, and __cmd_report sorts
> the array by tid before displaying it.
>
> perf report -S (-s is already used for --sort) now prints the following
> after the existing output. It shows things like
> # Per-thread statistics:
> # PID    TID       Event          Count
>   16709   16709   cache-misses   82727
>   16709   16709   cache-references   41238768
>   16709   16710   cache-misses   6462
>   16709   16710   cache-references   76119375
> or
> # Per-thread statistics:
> # PID    TID       Event          Count
>   6268   6268   raw 0x1000001e0   494628
>   6268   6268   raw 0x1000002e0   209113
>   6268   6268   raw 0x1000004e0   307215
>   6268   6268   raw 0x1000008e0   9203221
>   6268   6269   raw 0x1000001e0   9210788
>   6268   6269   raw 0x1000002e0   302344
>   6268   6269   raw 0x1000004e0   198705
>   6268   6269   raw 0x1000008e0   473471
>
> Obviously, there's still a lot of pretty printing to do, but you'll
> be able to tell whether the general idea is ok or not.
>   


Is there an easy way to know how many threads and how many different
event types were recorded? Once I know that, I could directly gather
values into a 2D matrix and display a single line per thread with all
the corresponding event counters (so that people can easily apply awk to
manipulate them).
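
One way to picture that matrix, without knowing the thread and event
counts in advance, is to discover them while filling the table. This is
only a sketch: struct stat_rec, struct stat_table, the fixed MAX_THREADS
and MAX_EVENTS limits and the helper names are all hypothetical, not
existing perf code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_THREADS	64	/* arbitrary limits for the sketch */
#define MAX_EVENTS	8

struct stat_rec {
	uint32_t pid, tid;
	uint64_t id, value;
};

struct stat_table {
	uint32_t tids[MAX_THREADS];		/* row labels */
	uint64_t ids[MAX_EVENTS];		/* column labels */
	uint64_t val[MAX_THREADS][MAX_EVENTS];
	int nr_threads, nr_events;
};

/* Return the row for tid, adding it if new; -1 if the table is full. */
static int row_of(struct stat_table *t, uint32_t tid)
{
	int i;

	for (i = 0; i < t->nr_threads; i++)
		if (t->tids[i] == tid)
			return i;
	if (t->nr_threads == MAX_THREADS)
		return -1;
	t->tids[t->nr_threads] = tid;
	return t->nr_threads++;
}

/* Return the column for id, adding it if new; -1 if the table is full. */
static int col_of(struct stat_table *t, uint64_t id)
{
	int i;

	for (i = 0; i < t->nr_events; i++)
		if (t->ids[i] == id)
			return i;
	if (t->nr_events == MAX_EVENTS)
		return -1;
	t->ids[t->nr_events] = id;
	return t->nr_events++;
}

/* Pivot n (tid, id, value) records into t. Returns 0, or -1 if full. */
static int build_table(struct stat_table *t, const struct stat_rec *recs, int n)
{
	int i;

	assert(t != NULL && (recs != NULL || n == 0));
	memset(t, 0, sizeof(*t));
	for (i = 0; i < n; i++) {
		int r = row_of(t, recs[i].tid);
		int c = col_of(t, recs[i].id);

		if (r < 0 || c < 0)
			return -1;
		t->val[r][c] = recs[i].value;
	}
	return 0;
}

/* Print one line per thread: the tid, then one value per event column. */
static void print_table(FILE *fp, const struct stat_table *t)
{
	int r, c;

	for (r = 0; r < t->nr_threads; r++) {
		fprintf(fp, "%u", (unsigned int)t->tids[r]);
		for (c = 0; c < t->nr_events; c++)
			fprintf(fp, " %llu", (unsigned long long)t->val[r][c]);
		fputc('\n', fp);
	}
}
```

Rows and columns appear in the order they are first seen, so the input
does not need to be sorted, and the whitespace-separated output is easy
to post-process with awk.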

I am adding some pretty printing to the previous patch and waiting for
your feedback (for instance, which global variable names and command
line option names you would prefer).

Brice

PS: Is there a perf counters mailing list?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-06 23:35                         ` Brice Goglin
  2009-08-07  6:13                           ` Brice Goglin
@ 2009-08-07  6:32                           ` Ingo Molnar
  2009-08-07  7:38                             ` Brice Goglin
  2009-08-07 11:55                             ` [PATCH] perf report: Display per-thread event counters Brice Goglin
  1 sibling, 2 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-08-07  6:32 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> > It would be nice to add this as some "perf report -s/--stats" flag, 
> > to not have to go via -D (which is a 'print debug output' kind of 
> > ad-hoc thing and subject to format changes in the future).
> >
> > Would you be interested in sending a patch that adds that flag 
> > to 'perf report', to print out these statistics entries (if 
> > any), in a tabular form suitable for your purposes? Below is a 
> > past patch to builtin-report.c that shows how to add new 
> > options.
> 
> Here's a quick'n'dirty first try. Read events are copied into the 
> show_stat_event array during process_read_event, and __cmd_report 
> sorts the array by tid before displaying it.
> 
> perf report -S (-s is already used for --sort) now prints the following
> after the existing output. It shows things like
> # Per-thread statistics:
> # PID    TID       Event          Count
>   16709   16709   cache-misses   82727
>   16709   16709   cache-references   41238768
>   16709   16710   cache-misses   6462
>   16709   16710   cache-references   76119375
> or
> # Per-thread statistics:
> # PID    TID       Event          Count
>   6268   6268   raw 0x1000001e0   494628
>   6268   6268   raw 0x1000002e0   209113
>   6268   6268   raw 0x1000004e0   307215
>   6268   6268   raw 0x1000008e0   9203221
>   6268   6269   raw 0x1000001e0   9210788
>   6268   6269   raw 0x1000002e0   302344
>   6268   6269   raw 0x1000004e0   198705
>   6268   6269   raw 0x1000008e0   473471
> 
> Obviously, there's still a lot of pretty printing to do, but 
> you'll be able to tell whether the general idea is ok or not.

Yeah, the general idea looks OK to me.

I think it would be nice to share the pretty-printing code with 
'perf stat' (builtin-stat.c).

Plus, to make it easier for your scripting needs, we could add a 
--pretty=raw type of flag to both perf stat and perf report, which 
would emit the data in a raw way - to be used in gnuplot almost 
straight away, etc.

But for the typical interactive use it would be nice to do the 2D 
tabular form that 'perf stat' does. (btw: feel free to enhance that 
output as well, where it seems appropriate)

here's a few mostly stylistic comments:

>  static int		full_paths;
>  static int		show_nr_samples;
> +static int		show_stat;
> +static int		show_stat_events;
> +static int		show_stat_event_max;
> +static struct read_event	*show_stat_event;

> @@ -126,6 +132,8 @@
>  	struct read_event		read;
>  } event_t;
>  
> +static struct perf_counter_attr *perf_header__find_attr(u64 id);

Small cleanliness detail: could this function be moved to this spot? 
That way we could avoid this prototype declaration.

> +static int compar_read_event_by_tid(const void *e1, const void *e2)
> +{
> +	const struct read_event *event1 = e1;
> +	const struct read_event *event2 = e2;
> +	return event1->tid - event2->tid;
> +}

another small detail: i'd suggest s/compar/compare, and please put a 
newline after local variables, i.e. something like:

static int compare_read_event_by_tid(const void *e1, const void *e2)
{
	const struct read_event *event1 = e1;
	const struct read_event *event2 = e2;

	return event1->tid - event2->tid;
}
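
For illustration, here is a self-contained sketch of how such a comparator plugs into qsort(). The struct is a cut-down stand-in for the patch's read_event (the field types are an assumption), sort_read_events_by_tid() is a hypothetical helper name, and the comparison uses explicit < and > instead of a subtraction, which avoids wraparound should tid be an unsigned 32-bit field.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the patch's struct read_event (assumed layout). */
struct read_event_sketch {
	uint32_t pid;
	uint32_t tid;
	uint64_t value;
};

/* qsort() comparator: orders entries by ascending tid. */
static int compare_read_event_by_tid(const void *e1, const void *e2)
{
	const struct read_event_sketch *event1 = e1;
	const struct read_event_sketch *event2 = e2;

	if (event1->tid < event2->tid)
		return -1;
	if (event1->tid > event2->tid)
		return 1;

	return 0;
}

/* Hypothetical helper: sort an array of nr entries in place. */
static void sort_read_events_by_tid(struct read_event_sketch *events, size_t nr)
{
	qsort(events, nr, sizeof(*events), compare_read_event_by_tid);
}
```

Note that qsort() is not guaranteed to be stable, so entries sharing a tid may come out in any relative order unless the comparator also looks at a second key.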

> @@ -1430,6 +1445,21 @@
>  	}
>  	fprintf(fp, "\n");
>  
> +	if (show_stat && show_stat_events) {
> +		int i;

[please add a newline here too]

> +		qsort(&show_stat_event[0], show_stat_events, sizeof(struct read_event), compar_read_event_by_tid);
> +		fprintf(fp, "# Per-thread statistics:\n");
> +		fprintf(fp, "# PID    TID       Event          Count\n");
> +		for(i=0; i<show_stat_events; i++) {

We write loops a tiny bit differently in the kernel - there's an 
easy way to check such details: run your patch through 
scripts/checkpatch.pl.

> +			struct read_event *event = &show_stat_event[i];
> +			struct perf_counter_attr *attr = perf_header__find_attr(event->id);

[please add a newline here too]

> +			printf("  %d   %d   %s   %Lu\n",
> +			       event->pid, event->tid,
> +			       attr ? __event_name(attr->type, attr->config) : "unknown",
> +			       event->value);
> +		}
> +	}

i'd also suggest to put this block into a helper inline function, 
which can start with:

	if (!show_stat || !show_stat_events)
		return;

That way it looks a (tiny) bit more structured and we win an 
indentation level.

> +	if (show_stat) {
> +		if (!show_stat_event) {
> +			show_stat_events = 0;
> +			show_stat_event_max = 16;
> +			show_stat_event = malloc(show_stat_event_max * sizeof(*show_stat_event));
> +			if (!show_stat_event)
> +				die("cannot allocate show_stat_event array");
> +		}
> +		if (show_stat_events == show_stat_event_max) {
> +			show_stat_event_max *= 2;
> +			show_stat_event = realloc(show_stat_event, show_stat_event_max * sizeof(*show_stat_event));
> +			if (!show_stat_event)
> +				die("cannot enlarge show_stat_event array");
> +		}
> +		memcpy(&show_stat_event[show_stat_events], &event->read, sizeof(struct read_event));
> +		show_stat_events++;
> +	}

this too could move into a helper inline function.
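
A minimal sketch of what such a helper could look like, under a few assumptions: the struct names and stat_events__add() are hypothetical, the event struct is a simplified stand-in for read_event, and the helper returns -1 on allocation failure where the patch calls die(). It relies on realloc(NULL, size) behaving like malloc(size), so the first allocation and later growth share one code path.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the patch's struct read_event (assumed layout). */
struct read_event_sketch {
	uint32_t pid;
	uint32_t tid;
	uint64_t value;
};

/* Growable array of events; zero-initialise it before first use. */
struct stat_events {
	struct read_event_sketch *event;
	int nr;
	int max;
};

/* Append a copy of *ev, doubling the array when full. Returns 0 or -1. */
static int stat_events__add(struct stat_events *events,
			    const struct read_event_sketch *ev)
{
	if (events->nr == events->max) {
		int new_max = events->max ? events->max * 2 : 16;
		struct read_event_sketch *tmp;

		/* Keep the old pointer until realloc() is known to have worked. */
		tmp = realloc(events->event, (size_t)new_max * sizeof(*tmp));
		if (!tmp)
			return -1;

		events->event = tmp;
		events->max = new_max;
	}

	events->event[events->nr++] = *ev;

	return 0;
}
```

Assigning the realloc() result to a temporary first keeps the original array reachable if the allocation fails, which matters as soon as the caller does something other than exit.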

> @@ -1998,6 +2046,8 @@
>  		    "Show a column with the number of samples"),
>  	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
>  		   "sort by key(s): pid, comm, dso, symbol, parent"),
> +	OPT_BOOLEAN('S', "stat", &show_stat,
> +		    "show per-thread event counters"),

Ok, there's indeed a flag clash with -s/--sort as you noticed.

-S looks good to me, how about going one step further and changing 
perf record to use -S/--stat as well, to make the flag consistent 
across all tools?

In any case, this patch moves in the right direction - this kind 
of functionality is exactly what we need.

	Ingo

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-07  6:32                           ` Ingo Molnar
@ 2009-08-07  7:38                             ` Brice Goglin
  2009-08-07  7:45                               ` Ingo Molnar
  2009-08-07 11:55                             ` [PATCH] perf report: Display per-thread event counters Brice Goglin
  1 sibling, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-08-07  7:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> Ok, there's indeed a flag clash with -s/--sort as you noticed.
>
> -S looks good to me, how about going one step further and changing 
> perf record to use -S/--stat as well, to make the flag consistent 
> across all tools?
>   

Sure, but perf stat already uses -S for --scale :)

Brice


* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-07  7:38                             ` Brice Goglin
@ 2009-08-07  7:45                               ` Ingo Molnar
  2009-08-07  8:18                                 ` Brice Goglin
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-08-07  7:45 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> > Ok, there's indeed a flag clash with -s/--sort as you noticed.
> >
> > -S looks good to me, how about going one step further and changing 
> > perf record to use -S/--stat as well, to make the flag consistent 
> > across all tools?
> 
> Sure, but perf stat already uses -S for --scale :)

heh :-)

I completely forgot about it. And since we now have --scale enabled 
by default (and it doesn't make much sense to disable it i guess), it 
would not be a big issue to rename that to -c/--scale to free up that 
flag for -S/--stat?

	Ingo

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-07  7:45                               ` Ingo Molnar
@ 2009-08-07  8:18                                 ` Brice Goglin
  2009-08-07  8:23                                   ` Ingo Molnar
                                                     ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-07  8:18 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> I completely forgot about it. And since we now have --scale enabled 
> by default (and it doesn't make much sense to disable it i guess), it 
> would not be a big issue to rename that to -c/--scale to free up that 
> flag for -S/--stat?
>
> 	Ingo
>   


Here you are. But how do you switch "scale" off now that it's enabled
by default?
I tried --scale=0, --no-scale and so on, apparently without success.

Brice

[PATCH] perf stat: change -S for --scale into -c

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>

Index: linux-rc/tools/perf/Documentation/perf-stat.txt
===================================================================
--- linux-rc.orig/tools/perf/Documentation/perf-stat.txt	2009-08-07 10:09:40.000000000 +0200
+++ linux-rc/tools/perf/Documentation/perf-stat.txt	2009-08-07 10:09:46.000000000 +0200
@@ -40,7 +40,7 @@
 -a::
         system-wide collection
 
--S::
+-c::
         scale counter values
 
 EXAMPLES
Index: linux-rc/tools/perf/builtin-stat.c
===================================================================
--- linux-rc.orig/tools/perf/builtin-stat.c	2009-08-07 09:46:49.000000000 +0200
+++ linux-rc/tools/perf/builtin-stat.c	2009-08-07 10:07:44.000000000 +0200
@@ -496,7 +496,7 @@
 		    "stat events on existing pid"),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-	OPT_BOOLEAN('S', "scale", &scale,
+	OPT_BOOLEAN('c', "scale", &scale,
 		    "scale/normalize counters"),
 	OPT_BOOLEAN('v', "verbose", &verbose,
 		    "be more verbose (show counter open errors, etc)"),



* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-07  8:18                                 ` Brice Goglin
@ 2009-08-07  8:23                                   ` Ingo Molnar
  2009-08-07  8:27                                   ` Ingo Molnar
  2009-08-07  8:30                                   ` [tip:perfcounters/core] perf stat: Rename -S/--scale to -c/--scale tip-bot for Brice Goglin
  2 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-08-07  8:23 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> > I completely forgot about it. And since we now have --scale enabled 
> > by default (and it doesn't make much sense to disable it i guess), it 
> > would not be a big issue to rename that to -c/--scale to free up that 
> > flag for -S/--stat?
> >
> > 	Ingo
> >   
> 
> 
> Here you are. But how do you switch "scale" off now that it's 
> enabled by default? I tried --scale=0, --no-scale and so on, 
> apparently without success.

--no-scale seems to work here:

aldebaran:~> perf stat --no-scale -e 0:0 -e 0:0 -e 0:0 -e 0:0 -e 0:0 
-e 0:0 -e 0:0 -e 0:0 ~/loop_1b_instructions

 Performance counter stats for '/home/mingo/loop_1b_instructions':

      469128160  cycles                  
      469072870  cycles                  
      470517049  cycles                  
      473750835  cycles                  
      476987439  cycles                  
      477364955  cycles                  
      474188871  cycles                  
      470950809  cycles                  

    0.237171581  seconds time elapsed

aldebaran:~> perf stat --scale -e 0:0 -e 0:0 -e 0:0 -e 0:0 -e 0:0 -e 
0:0 -e 0:0 -e 0:0 ~/loop_1b_instructions

 Performance counter stats for '/home/mingo/loop_1b_instructions':

      756242908  cycles                    (scaled from 62.00%)
      756238670  cycles                    (scaled from 62.00%)
      756237417  cycles                    (scaled from 62.23%)
      756093940  cycles                    (scaled from 62.66%)
      756097133  cycles                    (scaled from 63.08%)
      756141523  cycles                    (scaled from 63.10%)
      756142674  cycles                    (scaled from 62.68%)
      756154220  cycles                    (scaled from 62.26%)

    0.236972028  seconds time elapsed
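
The "scaled from N%" annotation reflects counter multiplexing: when more counters are requested than the hardware can host at once, each one only runs for part of the time it is enabled, and the raw count is extrapolated. The sketch below shows that arithmetic as commonly described (raw count times time enabled, divided by time running). The function name scale_count() and the figures in the comments are illustrative assumptions, not taken from the perf source.

```c
#include <stdint.h>

/*
 * Extrapolate a raw count to the full enabled period.
 * Example: 620 events seen while running 62 out of 100 time units
 * gives an estimate of 1000 events ("scaled from 62%").
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
			    uint64_t time_running)
{
	/* Counter never got scheduled: nothing to extrapolate from. */
	if (time_running == 0)
		return 0;

	/* Counter ran the whole time: the raw value is already exact. */
	if (time_running >= time_enabled)
		return raw;

	/* Floating point avoids 64-bit overflow in raw * time_enabled. */
	return (uint64_t)((double)raw * (double)time_enabled /
			  (double)time_running + 0.5);
}
```

This is in line with the two runs above, assuming the unscaled run was multiplexed in the same way: roughly 4.7e8 raw cycles at about 62% run time extrapolate to roughly 7.6e8.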

	Ingo

* Re: [PATCH] perf tools: Fix reading of perf.data file header
  2009-08-07  8:18                                 ` Brice Goglin
  2009-08-07  8:23                                   ` Ingo Molnar
@ 2009-08-07  8:27                                   ` Ingo Molnar
  2009-08-07  8:30                                   ` [tip:perfcounters/core] perf stat: Rename -S/--scale to -c/--scale tip-bot for Brice Goglin
  2 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-08-07  8:27 UTC (permalink / raw)
  To: Brice Goglin; +Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> [PATCH] perf stat: change -S for --scale into -c

Applied, thanks Brice!

Find below the full changelog - i filled in the 'why' bits.

	Ingo

------------------------->
From d4b2db0a624500c978f14393e6933d5d6aa552d2 Mon Sep 17 00:00:00 2001
From: Brice Goglin <Brice.Goglin@inria.fr>
Date: Fri, 7 Aug 2009 10:18:39 +0200
Subject: [PATCH] perf stat: Rename -S/--scale to -c/--scale

We want to use a coherent flag for -S/--stat across all tools,
so free up -S in perf stat.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <4A7BE35F.4060604@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/Documentation/perf-stat.txt |    2 +-
 tools/perf/builtin-stat.c              |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 0d74346..484080d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -40,7 +40,7 @@ OPTIONS
 -a::
         system-wide collection
 
--S::
+-c::
         scale counter values
 
 EXAMPLES
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f9510ee..b4b06c7 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -496,7 +496,7 @@ static const struct option options[] = {
 		    "stat events on existing pid"),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-	OPT_BOOLEAN('S', "scale", &scale,
+	OPT_BOOLEAN('c', "scale", &scale,
 		    "scale/normalize counters"),
 	OPT_BOOLEAN('v', "verbose", &verbose,
 		    "be more verbose (show counter open errors, etc)"),

* [tip:perfcounters/core] perf stat: Rename -S/--scale to -c/--scale
  2009-08-07  8:18                                 ` Brice Goglin
  2009-08-07  8:23                                   ` Ingo Molnar
  2009-08-07  8:27                                   ` Ingo Molnar
@ 2009-08-07  8:30                                   ` tip-bot for Brice Goglin
  2 siblings, 0 replies; 41+ messages in thread
From: tip-bot for Brice Goglin @ 2009-08-07  8:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Brice.Goglin, hpa, mingo, a.p.zijlstra, tglx, mingo

Commit-ID:  d4b2db0a624500c978f14393e6933d5d6aa552d2
Gitweb:     http://git.kernel.org/tip/d4b2db0a624500c978f14393e6933d5d6aa552d2
Author:     Brice Goglin <Brice.Goglin@inria.fr>
AuthorDate: Fri, 7 Aug 2009 10:18:39 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 7 Aug 2009 10:26:12 +0200

perf stat: Rename -S/--scale to -c/--scale

We want to use a coherent flag for -S/--stat across all tools,
so free up -S in perf stat.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <4A7BE35F.4060604@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 tools/perf/Documentation/perf-stat.txt |    2 +-
 tools/perf/builtin-stat.c              |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 0d74346..484080d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -40,7 +40,7 @@ OPTIONS
 -a::
         system-wide collection
 
--S::
+-c::
         scale counter values
 
 EXAMPLES
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f9510ee..b4b06c7 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -496,7 +496,7 @@ static const struct option options[] = {
 		    "stat events on existing pid"),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-	OPT_BOOLEAN('S', "scale", &scale,
+	OPT_BOOLEAN('c', "scale", &scale,
 		    "scale/normalize counters"),
 	OPT_BOOLEAN('v', "verbose", &verbose,
 		    "be more verbose (show counter open errors, etc)"),

* [PATCH] perf report: Display per-thread event counters
  2009-08-07  6:32                           ` Ingo Molnar
  2009-08-07  7:38                             ` Brice Goglin
@ 2009-08-07 11:55                             ` Brice Goglin
  2009-08-08 11:54                               ` [tip:perfcounters/core] perf report: Fix and improve the displaying of " tip-bot for Brice Goglin
  2009-08-08 12:14                               ` [PATCH] perf report: Display " Ingo Molnar
  1 sibling, 2 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-07 11:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, paulus, LKML

Here's a better patch. I moved everything to util/values.[ch] so that we may
reuse it in perf stat. But I don't see yet where I am supposed to get something
like PERF_EVENT_READ in builtin-stat.c so I haven't touched it yet.

We get something like this now:
#  PID   TID  cache-misses  cache-references
  4658  4659        495581           3238779
  4658  4662        498246           3236823
  4658  4663        499531           3243162

Then it'll be easy to add --pretty=raw to display a single line per thread/event.

By the way, -S was also used for --symbol... So I used -T/--threads here.
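
The aligned table relies on measuring every cell before printing anything. A minimal sketch of that technique, assuming hypothetical helper names column_width() and format_cell(), and using PRIu64 as a portable substitute for a "%Lu"-style format:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Width needed for a column: the widest of the header and all its values. */
static int column_width(const char *header, const uint64_t *values, int nr)
{
	int width = (int)strlen(header);
	int i;

	for (i = 0; i < nr; i++) {
		/* With a NULL buffer and size 0, snprintf() only reports the length. */
		int w = snprintf(NULL, 0, "%" PRIu64, values[i]);

		if (w > width)
			width = w;
	}

	return width;
}

/* Format one right-aligned cell into buf; returns the length needed. */
static int format_cell(char *buf, size_t size, int width, uint64_t value)
{
	return snprintf(buf, size, "%*" PRIu64, width, value);
}
```

With the header "cache-misses" and six-digit values, the header is the widest entry, so each value is padded on the left to 12 characters.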





perf report: Add -T/--threads to display per-thread counter values
    
We get something like this now:
#  PID   TID  cache-misses  cache-references
  4658  4659        495581           3238779
  4658  4662        498246           3236823
  4658  4663        499531           3243162

Per-thread arrays of counter values are managed in util/values.[ch]
    
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index e72e931..370344a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -27,6 +27,9 @@ OPTIONS
 -n
 --show-nr-samples
 	Show the number of samples for each symbol
+-T
+--threads
+	Show per-thread event counters
 -C::
 --comms=::
 	Only consider symbols in these comms. CSV that understands
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 1916e44..9fc133f 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -310,6 +310,7 @@ LIB_H += util/sigchain.h
 LIB_H += util/symbol.h
 LIB_H += util/module.h
 LIB_H += util/color.h
+LIB_H += util/values.h
 
 LIB_OBJS += util/abspath.o
 LIB_OBJS += util/alias.o
@@ -337,6 +338,7 @@ LIB_OBJS += util/color.o
 LIB_OBJS += util/pager.o
 LIB_OBJS += util/header.o
 LIB_OBJS += util/callchain.o
+LIB_OBJS += util/values.o
 
 BUILTIN_OBJS += builtin-annotate.o
 BUILTIN_OBJS += builtin-help.o
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a5e2f8d..f7be85e 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -17,6 +17,7 @@
 #include "util/string.h"
 #include "util/callchain.h"
 #include "util/strlist.h"
+#include "util/values.h"
 
 #include "perf.h"
 #include "util/header.h"
@@ -53,6 +54,9 @@ static int		modules;
 static int		full_paths;
 static int		show_nr_samples;
 
+static int		show_threads;
+static struct perf_read_values	show_threads_values;
+
 static unsigned long	page_size;
 static unsigned long	mmap_window = 32;
 
@@ -1432,6 +1436,9 @@ print_entries:
 	}
 	fprintf(fp, "\n");
 
+	if (show_threads)
+		perf_read_values_display(fp, &show_threads_values);
+
 	return ret;
 }
 
@@ -1717,6 +1724,16 @@ process_read_event(event_t *event, unsigned long offset, unsigned long head)
 {
 	struct perf_counter_attr *attr = perf_header__find_attr(event->read.id);
 
+	if (show_threads) {
+		char *name = attr ? __event_name(attr->type, attr->config)
+				   : "unknown";
+		perf_read_values_add_value(&show_threads_values,
+					   event->read.pid, event->read.tid,
+					   event->read.id,
+					   name,
+					   event->read.value);
+	}
+
 	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %s %Lu\n",
 			(void *)(offset + head),
 			(void *)(long)(event->header.size),
@@ -1798,6 +1815,9 @@ static int __cmd_report(void)
 
 	register_idle_thread();
 
+	if (show_threads)
+		perf_read_values_init(&show_threads_values);
+
 	input = open(input_name, O_RDONLY);
 	if (input < 0) {
 		fprintf(stderr, " failed to open file: %s", input_name);
@@ -1945,6 +1965,9 @@ done:
 	output__resort(total);
 	output__fprintf(stdout, total);
 
+	if (show_threads)
+		perf_read_values_destroy(&show_threads_values);
+
 	return rc;
 }
 
@@ -2011,6 +2034,8 @@ static const struct option options[] = {
 		    "load module symbols - WARNING: use only with -k and LIVE kernel"),
 	OPT_BOOLEAN('n', "show-nr-samples", &show_nr_samples,
 		    "Show a column with the number of samples"),
+	OPT_BOOLEAN('T', "threads", &show_threads,
+		    "Show per-thread event counters"),
 	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
 		   "sort by key(s): pid, comm, dso, symbol, parent"),
 	OPT_BOOLEAN('P', "full-paths", &full_paths,
diff --git a/tools/perf/util/values.c b/tools/perf/util/values.c
new file mode 100644
index 0000000..8551c0b
--- /dev/null
+++ b/tools/perf/util/values.c
@@ -0,0 +1,171 @@
+#include <stdlib.h>
+
+#include "util.h"
+#include "values.h"
+
+void perf_read_values_init(struct perf_read_values *values)
+{
+	values->threads_max = 16;
+	values->pid = malloc(values->threads_max * sizeof(*values->pid));
+	values->tid = malloc(values->threads_max * sizeof(*values->tid));
+	values->value = malloc(values->threads_max * sizeof(*values->value));
+	if (!values->pid || !values->tid || !values->value)
+		die("failed to allocate read_values threads arrays");
+	values->threads = 0;
+
+	values->counters_max = 16;
+	values->counterrawid = malloc(values->counters_max
+				      * sizeof(*values->counterrawid));
+	values->countername = malloc(values->counters_max
+				     * sizeof(*values->countername));
+	if (!values->counterrawid || !values->countername)
+		die("failed to allocate read_values counters arrays");
+	values->counters = 0;
+}
+
+void perf_read_values_destroy(struct perf_read_values *values)
+{
+	int i;
+
+	if (!values->threads_max || !values->counters_max)
+		return;
+
+	for (i = 0; i < values->threads; i++)
+		free(values->value[i]);
+	free(values->pid);
+	free(values->tid);
+	free(values->counterrawid);
+	for (i = 0; i < values->counters; i++)
+		free(values->countername[i]);
+	free(values->countername);
+}
+
+static void perf_read_values__enlarge_threads(struct perf_read_values *values)
+{
+	values->threads_max *= 2;
+	values->pid = realloc(values->pid,
+			      values->threads_max * sizeof(*values->pid));
+	values->tid = realloc(values->tid,
+			      values->threads_max * sizeof(*values->tid));
+	values->value = realloc(values->value,
+				values->threads_max * sizeof(*values->value));
+	if (!values->pid || !values->tid || !values->value)
+		die("failed to enlarge read_values threads arrays");
+}
+
+static int perf_read_values__findnew_thread(struct perf_read_values *values,
+					    u32 pid, u32 tid)
+{
+	int i;
+
+	for (i = 0; i < values->threads; i++)
+		if (values->pid[i] == pid && values->tid[i] == tid)
+			return i;
+
+	if (values->threads == values->threads_max)
+		perf_read_values__enlarge_threads(values);
+
+	i = values->threads++;
+	values->pid[i] = pid;
+	values->tid[i] = tid;
+	values->value[i] = malloc(values->counters_max * sizeof(**values->value));
+	if (!values->value[i])
+		die("failed to allocate read_values counters array");
+
+	return i;
+}
+
+static void perf_read_values__enlarge_counters(struct perf_read_values *values)
+{
+	int i;
+
+	values->counters_max *= 2;
+	values->counterrawid = realloc(values->counterrawid,
+				       values->counters_max * sizeof(*values->counterrawid));
+	values->countername = realloc(values->countername,
+				      values->counters_max * sizeof(*values->countername));
+	if (!values->counterrawid || !values->countername)
+		die("failed to enlarge read_values counters arrays");
+
+	for (i = 0; i < values->threads; i++) {
+		values->value[i] = realloc(values->value[i],
+					   values->counters_max * sizeof(**values->value));
+		if (!values->value[i])
+			die("failed to enlarge read_values counters arrays");
+	}
+}
+
+static int perf_read_values__findnew_counter(struct perf_read_values *values,
+					     u64 rawid, char *name)
+{
+	int i;
+
+	for (i = 0; i < values->counters; i++)
+		if (values->counterrawid[i] == rawid)
+			return i;
+
+	if (values->counters == values->counters_max)
+		perf_read_values__enlarge_counters(values);
+
+	i = values->counters++;
+	values->counterrawid[i] = rawid;
+	values->countername[i] = strdup(name);
+
+	return i;
+}
+
+void perf_read_values_add_value(struct perf_read_values *values,
+				u32 pid, u32 tid,
+				u64 rawid, char *name, u64 value)
+{
+	int tindex, cindex;
+
+	tindex = perf_read_values__findnew_thread(values, pid, tid);
+	cindex = perf_read_values__findnew_counter(values, rawid, name);
+
+	values->value[tindex][cindex] = value;
+}
+
+void perf_read_values_display(FILE *fp, struct perf_read_values *values)
+{
+	int i, j;
+	int pidwidth, tidwidth;
+	int *counterwidth;
+
+	counterwidth = malloc(values->counters * sizeof(*counterwidth));
+	if (!counterwidth)
+		die("failed to allocate counterwidth array");
+	tidwidth = 3;
+	pidwidth = 3;
+	for (j = 0; j < values->counters; j++)
+		counterwidth[j] = strlen(values->countername[j]);
+	for (i = 0; i < values->threads; i++) {
+		int width;
+
+		width = snprintf(NULL, 0, "%d", values->pid[i]);
+		if (width > pidwidth)
+			pidwidth = width;
+		width = snprintf(NULL, 0, "%d", values->tid[i]);
+		if (width > tidwidth)
+			tidwidth = width;
+		for (j = 0; j < values->counters; j++) {
+			width = snprintf(NULL, 0, "%Lu", values->value[i][j]);
+			if (width > counterwidth[j])
+				counterwidth[j] = width;
+		}
+	}
+
+	fprintf(fp, "# %*s  %*s", pidwidth, "PID", tidwidth, "TID");
+	for (j = 0; j < values->counters; j++)
+		fprintf(fp, "  %*s", counterwidth[j], values->countername[j]);
+	fprintf(fp, "\n");
+
+	for (i = 0; i < values->threads; i++) {
+		fprintf(fp, "  %*d  %*d", pidwidth, values->pid[i],
+			tidwidth, values->tid[i]);
+		for (j = 0; j < values->counters; j++)
+			fprintf(fp, "  %*Lu",
+				counterwidth[j], values->value[i][j]);
+		fprintf(fp, "\n");
+	}
+}
diff --git a/tools/perf/util/values.h b/tools/perf/util/values.h
new file mode 100644
index 0000000..e41be5e
--- /dev/null
+++ b/tools/perf/util/values.h
@@ -0,0 +1,26 @@
+#ifndef _PERF_VALUES_H
+#define _PERF_VALUES_H
+
+#include "types.h"
+
+struct perf_read_values {
+	int threads;
+	int threads_max;
+	u32 *pid, *tid;
+	int counters;
+	int counters_max;
+	u64 *counterrawid;
+	char **countername;
+	u64 **value;
+};
+
+void perf_read_values_init(struct perf_read_values *values);
+void perf_read_values_destroy(struct perf_read_values *values);
+
+void perf_read_values_add_value(struct perf_read_values *values,
+				u32 pid, u32 tid,
+				u64 rawid, char *name, u64 value);
+
+void perf_read_values_display(FILE *fp, struct perf_read_values *values);
+
+#endif /* _PERF_VALUES_H */



* [tip:perfcounters/core] perf report: Fix and improve the displaying of per-thread event counters
  2009-08-07 11:55                             ` [PATCH] perf report: Display per-thread event counters Brice Goglin
@ 2009-08-08 11:54                               ` tip-bot for Brice Goglin
  2009-08-08 12:14                               ` [PATCH] perf report: Display " Ingo Molnar
  1 sibling, 0 replies; 41+ messages in thread
From: tip-bot for Brice Goglin @ 2009-08-08 11:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Brice.Goglin, hpa, mingo, a.p.zijlstra, tglx, mingo

Commit-ID:  1c0cbc4e4a707afea1fd88f5f297f094c8adff82
Gitweb:     http://git.kernel.org/tip/1c0cbc4e4a707afea1fd88f5f297f094c8adff82
Author:     Brice Goglin <Brice.Goglin@inria.fr>
AuthorDate: Fri, 7 Aug 2009 13:55:24 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 8 Aug 2009 13:53:13 +0200

perf report: Fix and improve the displaying of per-thread event counters

Improve and fix the handling of per-thread counter stats
recorded via perf record -s. Previously we only displayed
it in debug printouts (-D) and even that output was hard
to disambiguate.

I moved everything to util/values.[ch] so that we may reuse
it in perf stat.

We get something like this now:

 #  PID   TID  cache-misses  cache-references
   4658  4659        495581           3238779
   4658  4662        498246           3236823
   4658  4663        499531           3243162

Then it'll be easy to add --pretty=raw to display a single line per thread/event.

By the way, -S was also used for --symbol... So I used -T/--threads here.

perf report: Add -T/--threads to display per-thread counter values

 We get something like this now:
 #  PID   TID  cache-misses  cache-references
   4658  4659        495581           3238779
   4658  4662        498246           3236823
   4658  4663        499531           3243162

Per-thread arrays of counter values are managed in util/values.[ch]

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <4A7C162C.1030707@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 tools/perf/Documentation/perf-report.txt |    3 +
 tools/perf/Makefile                      |    2 +
 tools/perf/builtin-report.c              |   25 +++++
 tools/perf/util/values.c                 |  171 ++++++++++++++++++++++++++++++
 tools/perf/util/values.h                 |   26 +++++
 5 files changed, 227 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index e72e931..370344a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -27,6 +27,9 @@ OPTIONS
 -n
 --show-nr-samples
 	Show the number of samples for each symbol
+-T
+--threads
+	Show per-thread event counters
 -C::
 --comms=::
 	Only consider symbols in these comms. CSV that understands
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 1916e44..9fc133f 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -310,6 +310,7 @@ LIB_H += util/sigchain.h
 LIB_H += util/symbol.h
 LIB_H += util/module.h
 LIB_H += util/color.h
+LIB_H += util/values.h
 
 LIB_OBJS += util/abspath.o
 LIB_OBJS += util/alias.o
@@ -337,6 +338,7 @@ LIB_OBJS += util/color.o
 LIB_OBJS += util/pager.o
 LIB_OBJS += util/header.o
 LIB_OBJS += util/callchain.o
+LIB_OBJS += util/values.o
 
 BUILTIN_OBJS += builtin-annotate.o
 BUILTIN_OBJS += builtin-help.o
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 99274ce..4163918 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -17,6 +17,7 @@
 #include "util/string.h"
 #include "util/callchain.h"
 #include "util/strlist.h"
+#include "util/values.h"
 
 #include "perf.h"
 #include "util/header.h"
@@ -53,6 +54,9 @@ static int		modules;
 static int		full_paths;
 static int		show_nr_samples;
 
+static int		show_threads;
+static struct perf_read_values	show_threads_values;
+
 static unsigned long	page_size;
 static unsigned long	mmap_window = 32;
 
@@ -1473,6 +1477,9 @@ print_entries:
 
 	free(rem_sq_bracket);
 
+	if (show_threads)
+		perf_read_values_display(fp, &show_threads_values);
+
 	return ret;
 }
 
@@ -1758,6 +1765,16 @@ process_read_event(event_t *event, unsigned long offset, unsigned long head)
 {
 	struct perf_counter_attr *attr = perf_header__find_attr(event->read.id);
 
+	if (show_threads) {
+		char *name = attr ? __event_name(attr->type, attr->config)
+				   : "unknown";
+		perf_read_values_add_value(&show_threads_values,
+					   event->read.pid, event->read.tid,
+					   event->read.id,
+					   name,
+					   event->read.value);
+	}
+
 	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %s %Lu\n",
 			(void *)(offset + head),
 			(void *)(long)(event->header.size),
@@ -1839,6 +1856,9 @@ static int __cmd_report(void)
 
 	register_idle_thread();
 
+	if (show_threads)
+		perf_read_values_init(&show_threads_values);
+
 	input = open(input_name, O_RDONLY);
 	if (input < 0) {
 		fprintf(stderr, " failed to open file: %s", input_name);
@@ -1993,6 +2013,9 @@ done:
 	output__resort(total);
 	output__fprintf(stdout, total);
 
+	if (show_threads)
+		perf_read_values_destroy(&show_threads_values);
+
 	return rc;
 }
 
@@ -2066,6 +2089,8 @@ static const struct option options[] = {
 		    "load module symbols - WARNING: use only with -k and LIVE kernel"),
 	OPT_BOOLEAN('n', "show-nr-samples", &show_nr_samples,
 		    "Show a column with the number of samples"),
+	OPT_BOOLEAN('T', "threads", &show_threads,
+		    "Show per-thread event counters"),
 	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
 		   "sort by key(s): pid, comm, dso, symbol, parent"),
 	OPT_BOOLEAN('P', "full-paths", &full_paths,
diff --git a/tools/perf/util/values.c b/tools/perf/util/values.c
new file mode 100644
index 0000000..8551c0b
--- /dev/null
+++ b/tools/perf/util/values.c
@@ -0,0 +1,171 @@
+#include <stdlib.h>
+
+#include "util.h"
+#include "values.h"
+
+void perf_read_values_init(struct perf_read_values *values)
+{
+	values->threads_max = 16;
+	values->pid = malloc(values->threads_max * sizeof(*values->pid));
+	values->tid = malloc(values->threads_max * sizeof(*values->tid));
+	values->value = malloc(values->threads_max * sizeof(*values->value));
+	if (!values->pid || !values->tid || !values->value)
+		die("failed to allocate read_values threads arrays");
+	values->threads = 0;
+
+	values->counters_max = 16;
+	values->counterrawid = malloc(values->counters_max
+				      * sizeof(*values->counterrawid));
+	values->countername = malloc(values->counters_max
+				     * sizeof(*values->countername));
+	if (!values->counterrawid || !values->countername)
+		die("failed to allocate read_values counters arrays");
+	values->counters = 0;
+}
+
+void perf_read_values_destroy(struct perf_read_values *values)
+{
+	int i;
+
+	if (!values->threads_max || !values->counters_max)
+		return;
+
+	for (i = 0; i < values->threads; i++)
+		free(values->value[i]);
+	free(values->pid);
+	free(values->tid);
+	free(values->counterrawid);
+	for (i = 0; i < values->counters; i++)
+		free(values->countername[i]);
+	free(values->countername);
+}
+
+static void perf_read_values__enlarge_threads(struct perf_read_values *values)
+{
+	values->threads_max *= 2;
+	values->pid = realloc(values->pid,
+			      values->threads_max * sizeof(*values->pid));
+	values->tid = realloc(values->tid,
+			      values->threads_max * sizeof(*values->tid));
+	values->value = realloc(values->value,
+				values->threads_max * sizeof(*values->value));
+	if (!values->pid || !values->tid || !values->value)
+		die("failed to enlarge read_values threads arrays");
+}
+
+static int perf_read_values__findnew_thread(struct perf_read_values *values,
+					    u32 pid, u32 tid)
+{
+	int i;
+
+	for (i = 0; i < values->threads; i++)
+		if (values->pid[i] == pid && values->tid[i] == tid)
+			return i;
+
+	if (values->threads == values->threads_max)
+		perf_read_values__enlarge_threads(values);
+
+	i = values->threads++;
+	values->pid[i] = pid;
+	values->tid[i] = tid;
+	values->value[i] = malloc(values->counters_max * sizeof(**values->value));
+	if (!values->value[i])
+		die("failed to allocate read_values counters array");
+
+	return i;
+}
+
+static void perf_read_values__enlarge_counters(struct perf_read_values *values)
+{
+	int i;
+
+	values->counters_max *= 2;
+	values->counterrawid = realloc(values->counterrawid,
+				       values->counters_max * sizeof(*values->counterrawid));
+	values->countername = realloc(values->countername,
+				      values->counters_max * sizeof(*values->countername));
+	if (!values->counterrawid || !values->countername)
+		die("failed to enlarge read_values counters arrays");
+
+	for (i = 0; i < values->threads; i++) {
+		values->value[i] = realloc(values->value[i],
+					   values->counters_max * sizeof(**values->value));
+		if (!values->value[i])
+			die("failed to enlarge read_values counters arrays");
+	}
+}
+
+static int perf_read_values__findnew_counter(struct perf_read_values *values,
+					     u64 rawid, char *name)
+{
+	int i;
+
+	for (i = 0; i < values->counters; i++)
+		if (values->counterrawid[i] == rawid)
+			return i;
+
+	if (values->counters == values->counters_max)
+		perf_read_values__enlarge_counters(values);
+
+	i = values->counters++;
+	values->counterrawid[i] = rawid;
+	values->countername[i] = strdup(name);
+
+	return i;
+}
+
+void perf_read_values_add_value(struct perf_read_values *values,
+				u32 pid, u32 tid,
+				u64 rawid, char *name, u64 value)
+{
+	int tindex, cindex;
+
+	tindex = perf_read_values__findnew_thread(values, pid, tid);
+	cindex = perf_read_values__findnew_counter(values, rawid, name);
+
+	values->value[tindex][cindex] = value;
+}
+
+void perf_read_values_display(FILE *fp, struct perf_read_values *values)
+{
+	int i, j;
+	int pidwidth, tidwidth;
+	int *counterwidth;
+
+	counterwidth = malloc(values->counters * sizeof(*counterwidth));
+	if (!counterwidth)
+		die("failed to allocate counterwidth array");
+	tidwidth = 3;
+	pidwidth = 3;
+	for (j = 0; j < values->counters; j++)
+		counterwidth[j] = strlen(values->countername[j]);
+	for (i = 0; i < values->threads; i++) {
+		int width;
+
+		width = snprintf(NULL, 0, "%d", values->pid[i]);
+		if (width > pidwidth)
+			pidwidth = width;
+		width = snprintf(NULL, 0, "%d", values->tid[i]);
+		if (width > tidwidth)
+			tidwidth = width;
+		for (j = 0; j < values->counters; j++) {
+			width = snprintf(NULL, 0, "%Lu", values->value[i][j]);
+			if (width > counterwidth[j])
+				counterwidth[j] = width;
+		}
+	}
+
+	fprintf(fp, "# %*s  %*s", pidwidth, "PID", tidwidth, "TID");
+	for (j = 0; j < values->counters; j++)
+		fprintf(fp, "  %*s", counterwidth[j], values->countername[j]);
+	fprintf(fp, "\n");
+
+	for (i = 0; i < values->threads; i++) {
+		fprintf(fp, "  %*d  %*d", pidwidth, values->pid[i],
+			tidwidth, values->tid[i]);
+		for (j = 0; j < values->counters; j++)
+			fprintf(fp, "  %*Lu",
+				counterwidth[j], values->value[i][j]);
+		fprintf(fp, "\n");
+	}
+}
diff --git a/tools/perf/util/values.h b/tools/perf/util/values.h
new file mode 100644
index 0000000..e41be5e
--- /dev/null
+++ b/tools/perf/util/values.h
@@ -0,0 +1,26 @@
+#ifndef _PERF_VALUES_H
+#define _PERF_VALUES_H
+
+#include "types.h"
+
+struct perf_read_values {
+	int threads;
+	int threads_max;
+	u32 *pid, *tid;
+	int counters;
+	int counters_max;
+	u64 *counterrawid;
+	char **countername;
+	u64 **value;
+};
+
+void perf_read_values_init(struct perf_read_values *values);
+void perf_read_values_destroy(struct perf_read_values *values);
+
+void perf_read_values_add_value(struct perf_read_values *values,
+				u32 pid, u32 tid,
+				u64 rawid, char *name, u64 value);
+
+void perf_read_values_display(FILE *fp, struct perf_read_values *values);
+
+#endif /* _PERF_VALUES_H */

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf report: Display per-thread event counters
  2009-08-07 11:55                             ` [PATCH] perf report: Display per-thread event counters Brice Goglin
  2009-08-08 11:54                               ` [tip:perfcounters/core] perf report: Fix and improve the displaying of " tip-bot for Brice Goglin
@ 2009-08-08 12:14                               ` Ingo Molnar
  2009-08-08 16:10                                 ` Brice Goglin
  1 sibling, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-08-08 12:14 UTC (permalink / raw)
  To: Brice Goglin, Frédéric Weisbecker, Mike Galbraith,
	Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, paulus, LKML

* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Here's a better patch. I moved everything to util/values.[ch] so 
> that we may reuse it in perf stat. [...]

Nice patch! I've applied it, you can find it in the latest -tip 
tree:

  http://people.redhat.com/mingo/tip.git/README

please send enhancements/fixes on top of this.

> [...] But I don't see yet where I am supposed to get something like 
> PERF_EVENT_READ in builtin-stat.c so I haven't touched it yet.

Yeah. 'perf stat' is not really getting events but is doing a 
read-out of the counter value(s) and constructs its 'read event' 
that way. So you won't find PERF_EVENT_READ in builtin-stat.c, you'll 
find:

                res = read(fd[cpu][counter], single_count, nv * sizeof(u64));

in read_counter(). The printout is then done in print_counter().
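
As a rough illustration of what such a read() hands back: when both time
fields are requested in the counter's read_format, the buffer holds the raw
count followed by the time the counter was enabled and the time it was
actually running, and a multiplexed count can be extrapolated from their
ratio. The sketch below is not the builtin-stat.c code; scale_count() and
its argument layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Extrapolate a counter value from a read() buffer.
 * Assumed layout (hypothetical helper, not taken from builtin-stat.c):
 *   buf[0] = raw count, buf[1] = time enabled, buf[2] = time running.
 * Returns 0 and stores the estimate in *out, or -1 if the counter never ran.
 */
static int scale_count(const uint64_t *buf, size_t nv, uint64_t *out)
{
	assert(buf && out && nv >= 1);

	if (nv < 3) {			/* no time fields: the value is exact */
		*out = buf[0];
		return 0;
	}
	if (buf[2] == 0)		/* counter was never scheduled */
		return -1;

	/* count * enabled / running, rounded to nearest */
	*out = (uint64_t)((double)buf[0] * (double)buf[1] /
			  (double)buf[2] + 0.5);
	return 0;
}
```

A counter that ran for only half of its enabled time has its count doubled;
one that never ran is reported as an error rather than as zero.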

> We get something like this now:
> #  PID   TID  cache-misses  cache-references
>   4658  4659        495581           3238779
>   4658  4662        498246           3236823
>   4658  4663        499531           3243162
> 
> Then it'll be easy to add --pretty=raw to display a single line 
> per thread/event.

ok.
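
The aligned columns in that table come from measuring every field before
printing anything. A minimal sketch of the technique, using
snprintf(NULL, 0, ...) to obtain the printed width of a value;
column_width() and format_cell() are hypothetical helpers written for this
example, not functions from the patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Width needed for a column: the widest of the title and all its values. */
static int column_width(const char *title, const uint64_t *vals, size_t n)
{
	int width = (int)strlen(title);
	size_t i;

	for (i = 0; i < n; i++) {
		/* With a NULL buffer and size 0, snprintf only reports the length. */
		int w = snprintf(NULL, 0, "%" PRIu64, vals[i]);

		if (w > width)
			width = w;
	}
	return width;
}

/* Right-align one value in a column of the given width. */
static int format_cell(char *dst, size_t len, int width, uint64_t val)
{
	return snprintf(dst, len, "%*" PRIu64, width, val);
}
```

Computing the widths in a first pass keeps the header and every row aligned
without guessing a fixed column size.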

> By the way, -S was also used for --symbol... So I used -T/--threads 
> here.

Hm, indeed - and -s was taken for --sort. Maybe we could rename 
-S/--symbols to -y/--symbols - this too is, I think, a rarely used 
feature.

I think pure 'statistics' runs like you do will be a pretty popular 
workflow, so intuitive naming/placement of options is important.

> perf report: Add -T/--threads to display per-thread counter values
>     
> We get something like this now:
> #  PID   TID  cache-misses  cache-references
>   4658  4659        495581           3238779
>   4658  4662        498246           3236823
>   4658  4663        499531           3243162

Btw., another thing to do would be to allow the 'dual' recording of 
both the stat values (collected when threads exit) and regular 
samples that perf report deals with.

I.e. don't treat 'perf record -s' as exclusive of regular 
'perf record', but instead have a -s/--sample-type option that can 
take values such as:

  -s stats
  -s samples
  -s call-graph

And any combination thereof, such as:

  -s stats,samples

The default would be '-s samples'. Right now call-graph recording is 
triggered via a separate option (-g/--call-graph) - but maybe it 
could be merged into a more generic -s/--sample option mechanism?
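
A sketch of how such a comma-separated option could be turned into a set of
flags. Everything here is hypothetical: the RECORD_* names and
parse_sample_types() are made up for illustration and do not exist in perf;
only the three keys and the proposed default come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
	RECORD_STATS      = 1 << 0,
	RECORD_SAMPLES    = 1 << 1,
	RECORD_CALL_GRAPH = 1 << 2,
};

/*
 * Parse "stats,samples,..." into a bitmask.
 * A missing argument yields the proposed default (samples only);
 * an unknown or empty key yields -1.
 */
static int parse_sample_types(const char *arg)
{
	static const struct { const char *name; int bit; } keys[] = {
		{ "stats",      RECORD_STATS },
		{ "samples",    RECORD_SAMPLES },
		{ "call-graph", RECORD_CALL_GRAPH },
	};
	int mask = 0;

	if (!arg || !*arg)
		return RECORD_SAMPLES;

	while (*arg) {
		size_t len = strcspn(arg, ",");	/* length of the current key */
		size_t k;
		int found = 0;

		for (k = 0; k < sizeof(keys) / sizeof(keys[0]); k++) {
			if (strlen(keys[k].name) == len &&
			    !strncmp(arg, keys[k].name, len)) {
				mask |= keys[k].bit;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;

		arg += len;
		if (*arg == ',')
			arg++;
	}
	return mask;
}
```

With this shape, call-graph recording becomes one more bit in the mask
instead of a separate option.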

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf report: Display per-thread event counters
  2009-08-08 12:14                               ` [PATCH] perf report: Display " Ingo Molnar
@ 2009-08-08 16:10                                 ` Brice Goglin
  2009-08-08 16:13                                   ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-08-08 16:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Mike Galbraith,
	Arnaldo Carvalho de Melo, Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
>> [...] But I don't see yet where I am supposed to get something like 
>> PERF_EVENT_READ in builtin-stat.c so I haven't touched it yet.
>>     
>
> Yeah. 'perf stat' is not really getting events but is doing a 
> read-out of the counter value(s) and constructs its 'read event' 
> that way. So you won't find PERF_EVENT_READ in builtin-stat.c, you'll 
> find:
>
>                 res = read(fd[cpu][counter], single_count, nv * sizeof(u64));
>
> in read_counter(). The printout is then done in print_counter().
>   

Is there a way to get per-thread counters there? I wrote the code to
gather per-cpu counters there, but I don't see any way to get the
corresponding thread-id.

I looked at perf record to get some help. But I don't see where the
PERF_EVENT_READ events are generated. I guess they are generated directly
by the kernel, read by perf record, and written as-is to the output file?

thanks,
Brice


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] perf report: Display per-thread event counters
  2009-08-08 16:10                                 ` Brice Goglin
@ 2009-08-08 16:13                                   ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2009-08-08 16:13 UTC (permalink / raw)
  To: Brice Goglin
  Cc: Frédéric Weisbecker, Mike Galbraith,
	Arnaldo Carvalho de Melo, Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> >> [...] But I don't see yet where I am supposed to get something like 
> >> PERF_EVENT_READ in builtin-stat.c so I haven't touched it yet.
> >>     
> >
> > Yeah. 'perf stat' is not really getting events but is doing a 
> > read-out of the counter value(s) and constructs its 'read event' 
> > that way. So you won't find PERF_EVENT_READ in builtin-stat.c, you'll 
> > find:
> >
> >                 res = read(fd[cpu][counter], single_count, nv * sizeof(u64));
> >
> > in read_counter(). The printout is then done in print_counter().
> 
> Is there a way to get per-thread counters there? I wrote the code 
> to gather per-cpu counters there, but I don't see any way to get 
> the corresponding thread-id.
> 
> I looked at perf record to get some help. But I don't see where 
> the PERF_EVENT_READ events are generated. I guess they are 
> generated directly by the kernel, read by perf record, and written 
> as-is to the output file?

Inherited counters are not accessible to the parent context. (They 
don't even have any fds instantiated, for performance and 
transparency reasons.)

I think perf stat could be enhanced to work not by reading the raw 
counters but by doing a mini "perf-record" internally: mmap the 
samples buffer and get all the events from there?
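
To make that idea concrete, here is a simplified sketch of walking a flat
buffer of variable-sized records and picking out the read events for one
thread. It ignores the ring-buffer wrap-around and the control page of a
real mmap'ed buffer. The ev_read layout mirrors the read_event struct from
Peter's patch earlier in this thread; the EV_READ value and the function
name are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EV_READ 8	/* assumed record type for read events in this sketch */

struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

struct ev_read {
	struct ev_header header;
	uint32_t pid, tid;
	uint64_t value;
	uint64_t time_enabled;
	uint64_t time_running;
	uint64_t id;
};

/* Sum the values of all read records for one tid; -1 on a malformed buffer. */
static int sum_reads_for_tid(const unsigned char *buf, size_t len,
			     uint32_t tid, uint64_t *sum)
{
	size_t off = 0;

	assert(buf && sum);
	*sum = 0;

	while (off + sizeof(struct ev_header) <= len) {
		struct ev_header hdr;

		/* Copy out the header to avoid unaligned accesses. */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || off + hdr.size > len)
			return -1;

		if (hdr.type == EV_READ && hdr.size >= sizeof(struct ev_read)) {
			struct ev_read ev;

			memcpy(&ev, buf + off, sizeof(ev));
			if (ev.tid == tid)
				*sum += ev.value;
		}
		off += hdr.size;	/* skip to the next record */
	}
	return off == len ? 0 : -1;
}
```

Records of unknown type are skipped by their size field, so the walker keeps
working when other event types are interleaved with the read events.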

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [tip:perfcounters/urgent] perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header
  2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
  2009-08-06 19:03                     ` Brice Goglin
@ 2009-08-07  6:37                     ` tip-bot for Peter Zijlstra
  2009-08-07  7:39                     ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 41+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-08-07  6:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Brice.Goglin, hpa, mingo, a.p.zijlstra, tglx, mingo

Commit-ID:  28d4db10a0f2a2a6cb9a843522f433ff0185c784
Gitweb:     http://git.kernel.org/tip/28d4db10a0f2a2a6cb9a843522f433ff0185c784
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Thu, 6 Aug 2009 20:57:41 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 7 Aug 2009 08:32:55 +0200

perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header

Brice Goglin reported that only the first result from a
multi-counter perf record --stat run is accurate, the
rest looks bogus.

A silly mistake made us re-read the first attribute for
every recorded attribute.

Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: paulus@samba.org
LKML-Reference: <1249585061.4975.17.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 tools/perf/util/header.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 450384b..95a44bc 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -213,9 +213,10 @@ struct perf_header *perf_header__read(int fd)
 
 	for (i = 0; i < nr_attrs; i++) {
 		struct perf_header_attr *attr;
-		off_t tmp = lseek(fd, 0, SEEK_CUR);
+		off_t tmp;
 
 		do_read(fd, &f_attr, sizeof(f_attr));
+		tmp = lseek(fd, 0, SEEK_CUR);
 
 		attr = perf_header_attr__new(&f_attr.attr);
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip:perfcounters/urgent] perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header
  2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
  2009-08-06 19:03                     ` Brice Goglin
  2009-08-07  6:37                     ` [tip:perfcounters/urgent] perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header tip-bot for Peter Zijlstra
@ 2009-08-07  7:39                     ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 41+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-08-07  7:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Brice.Goglin, hpa, mingo, a.p.zijlstra, tglx, mingo

Commit-ID:  ce71e78d91c4e7c02ce85f26319e53a824be4ffb
Gitweb:     http://git.kernel.org/tip/ce71e78d91c4e7c02ce85f26319e53a824be4ffb
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Thu, 6 Aug 2009 20:57:41 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 7 Aug 2009 09:37:29 +0200

perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header

Brice Goglin reported that only the first result from a
multi-counter perf record --stat run is accurate, the
rest looks bogus.

A silly mistake made us re-read the first attribute for
every recorded attribute.

Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: paulus@samba.org
LKML-Reference: <1249585061.4975.17.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 tools/perf/util/header.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 450384b..95a44bc 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -213,9 +213,10 @@ struct perf_header *perf_header__read(int fd)
 
 	for (i = 0; i < nr_attrs; i++) {
 		struct perf_header_attr *attr;
-		off_t tmp = lseek(fd, 0, SEEK_CUR);
+		off_t tmp;
 
 		do_read(fd, &f_attr, sizeof(f_attr));
+		tmp = lseek(fd, 0, SEEK_CUR);
 
 		attr = perf_header_attr__new(&f_attr.attr);
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-08-06 17:40               ` Peter Zijlstra
  2009-08-06 17:48                 ` Brice Goglin
@ 2009-08-06 19:01                 ` Brice Goglin
  1 sibling, 0 replies; 41+ messages in thread
From: Brice Goglin @ 2009-08-06 19:01 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, paulus, LKML

Tested-by: Brice Goglin <Brice.Goglin@inria.fr>

This one and the next patch did the trick.



Peter Zijlstra wrote:
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 8cb58d6..c053fd8 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -112,7 +112,9 @@ struct read_event {
>  	struct perf_event_header header;
>  	u32 pid,tid;
>  	u64 value;
> -	u64 format[3];
> +	u64 time_enabled;
> +	u64 time_running;
> +	u64 id;
>  };
>  
>  typedef union event_union {
> @@ -1690,14 +1692,37 @@ static void trace_event(event_t *event)
>  	dprintf(".\n");
>  }
>  
> +static struct perf_header	*header;
> +
> +static struct perf_counter_attr *perf_header__find_attr(u64 id)
> +{
> +	int i;
> +
> +	for (i = 0; i < header->attrs; i++) {
> +		struct perf_header_attr *attr = header->attr[i];
> +		int j;
> +
> +		for (j = 0; j < attr->ids; j++) {
> +			if (attr->id[j] == id)
> +				return &attr->attr;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
>  static int
>  process_read_event(event_t *event, unsigned long offset, unsigned long head)
>  {
> -	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %Lu\n",
> +	struct perf_counter_attr *attr = perf_header__find_attr(event->read.id);
> +
> +	dprintf("%p [%p]: PERF_EVENT_READ: %d %d %s %Lu\n",
>  			(void *)(offset + head),
>  			(void *)(long)(event->header.size),
>  			event->read.pid,
>  			event->read.tid,
> +			attr ? __event_name(attr->type, attr->config)
> +			     : "FAIL",
>  			event->read.value);
>  
>  	return 0;
> @@ -1743,8 +1768,6 @@ process_event(event_t *event, unsigned long offset, unsigned long head)
>  	return 0;
>  }
>  
> -static struct perf_header	*header;
> -
>  static u64 perf_header__sample_type(void)
>  {
>  	u64 sample_type = 0;
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 7bdad8d..4858d83 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -223,9 +239,15 @@ char *event_name(int counter)
>  {
>  	u64 config = attrs[counter].config;
>  	int type = attrs[counter].type;
> +
> +	return __event_name(type, config);
> +}
> +
> +char *__event_name(int type, u64 config)
> +{
>  	static char buf[32];
>  
> -	if (attrs[counter].type == PERF_TYPE_RAW) {
> +	if (type == PERF_TYPE_RAW) {
>  		sprintf(buf, "raw 0x%llx", config);
>  		return buf;
>  	}
> diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> index 1ea5d09..192a962 100644
> --- a/tools/perf/util/parse-events.h
> +++ b/tools/perf/util/parse-events.h
> @@ -10,6 +10,7 @@ extern int			nr_counters;
>  extern struct perf_counter_attr attrs[MAX_COUNTERS];
>  
>  extern char *event_name(int ctr);
> +extern char *__event_name(int type, u64 config);
>  
>  extern int parse_events(const struct option *opt, const char *str, int unset);
>  
>
>
>   


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-06-23 13:14 ` Ingo Molnar
                     ` (2 preceding siblings ...)
  2009-06-23 13:47   ` Ingo Molnar
@ 2009-06-23 14:21   ` Brice Goglin
  2009-06-23 14:51     ` Ingo Molnar
  3 siblings, 1 reply; 41+ messages in thread
From: Brice Goglin @ 2009-06-23 14:21 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, paulus, LKML

Ingo Molnar wrote:
> You can also do a profile with such events:
>
>   perf record -f -e r1000ffe0 ./hackbench 10
>
> and look at it via 'perf report'.
>   

I am not sure what the perf.data profile file contains but 'perf report'
only shows percentages. Is there a way to get a 'perf stat'-like output
from 'perf report'? Or maybe just have a -f option in 'perf stat' to
send the output into a file (with the PID in the name).

By the way, there's a typo in the description in
tools/perf/Documentation/perf-report.txt, you want s/via perf report/via
perf record/

> [ Note, there's no need to specify any --follow-* flags as that is
>   implicit in 'perf'. (and you'll probably also notice that perf
>   stat is a lot faster at following fast-forking or
>   context-switching workloads than is pfmon, because it's not ptrace
>   based.) ]
>   

What about threads? I didn't find any way to get per-thread counters.

Ideally, I'd like to be able to see no perf-related output on
stdout/stderr at runtime, and later have a look at per-thread counters
like 'perf stat' does at runtime.

thanks,
Brice

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-06-23 14:21   ` Brice Goglin
@ 2009-06-23 14:51     ` Ingo Molnar
  2009-06-23 15:29       ` Jaswinder Singh Rajput
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2009-06-23 14:51 UTC (permalink / raw)
  To: Brice Goglin, Mike Galbraith, Arnaldo Carvalho de Melo,
	Jaswinder Singh Rajput, Thomas Gleixner
  Cc: Peter Zijlstra, paulus, LKML


* Brice Goglin <Brice.Goglin@inria.fr> wrote:

> Ingo Molnar wrote:
> > You can also do a profile with such events:
> >
> >   perf record -f -e r1000ffe0 ./hackbench 10
> >
> > and look at it via 'perf report'.
> >   
> 
> I am not sure what the perf.data profile file contains but 'perf 
> report' only shows percentages. Is there a way to get a 'perf 
> stat'-like output from 'perf report'? Or maybe just have a -f 
> option in 'perf stat' to send the output into a file (with the PID 
> in the name).

It's not yet possible but it's a very good feature request.

> By the way, there's a typo in the description in 
> tools/perf/Documentation/perf-report.txt, you want s/via perf 
> report/via perf record/

thanks, fixed and pushed out. You can generally find the latest 
'perf' stuff at:

  http://people.redhat.com/mingo/tip.git/README

> > [ Note, there's no need to specify any --follow-* flags as that is
> >   implicit in 'perf'. (and you'll probably also notice that perf
> >   stat is a lot faster at following fast-forking or
> >   context-switching workloads than is pfmon, because it's not ptrace
> >   based.) ]
> 
> What about threads? I didn't find any way to get per-thread 
> counters.
> 
> Ideally, I'd like to be able to see no perf-related output on 
> stdout/stderr at runtime, and later have a look at per-thread 
> counters like 'perf stat' does at runtime.

That's not possible yet either, but makes a lot of sense.

How many threads does your workload typically run, and how do you 
get their stats displayed?

Per thread info is currently available in the profile output:

   perf report --sort comm,pid,symbol

But it would be nice to either extend perf report with a --stat 
option:

   perf report --stat

or to extend perf stat to take an input file via -i:

   perf stat -i perf.data

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [perf] howto switch from pfmon
  2009-06-23 14:51     ` Ingo Molnar
@ 2009-06-23 15:29       ` Jaswinder Singh Rajput
  0 siblings, 0 replies; 41+ messages in thread
From: Jaswinder Singh Rajput @ 2009-06-23 15:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Brice Goglin, Mike Galbraith, Arnaldo Carvalho de Melo,
	Thomas Gleixner, Peter Zijlstra, paulus, LKML

On Tue, 2009-06-23 at 16:51 +0200, Ingo Molnar wrote:
> Per thread info is currently available in the profile output:
> 
>    perf report --sort comm,pid,symbol
> 
> But it would be nice to either extend perf report with a --stat 
> option:
> 
>    perf report --stat
> 
> or to extend perf stat to take an input file via -i:
> 
>    perf stat -i perf.data
> 

I prefer 'perf report --stat' as it already handles the file.

--
JSR


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2009-08-08 16:13 UTC | newest]

Thread overview: 41+ messages
2009-06-22 20:54 [perf] howto switch from pfmon Brice Goglin
2009-06-23 12:12 ` Andi Kleen
2009-06-23 12:23   ` Peter Zijlstra
2009-06-23 13:57   ` Ingo Molnar
2009-06-23 13:14 ` Ingo Molnar
2009-06-23 13:22   ` Peter Zijlstra
2009-06-23 13:38     ` Ingo Molnar
2009-06-23 13:25   ` Ingo Molnar
2009-06-23 13:47   ` Ingo Molnar
2009-06-23 14:00     ` Brice Goglin
2009-06-23 14:36       ` Ingo Molnar
2009-06-23 15:22         ` Brice Goglin
2009-06-29 19:29           ` Ingo Molnar
2009-08-06 16:59             ` Brice Goglin
2009-08-06 17:40               ` Peter Zijlstra
2009-08-06 17:48                 ` Brice Goglin
2009-08-06 17:59                   ` Peter Zijlstra
2009-08-06 18:57                   ` [PATCH] perf tools: Fix reading of perf.data file header Peter Zijlstra
2009-08-06 19:03                     ` Brice Goglin
2009-08-06 19:59                       ` Ingo Molnar
2009-08-06 20:03                         ` Brice Goglin
2009-08-06 23:35                         ` Brice Goglin
2009-08-07  6:13                           ` Brice Goglin
2009-08-07  6:32                           ` Ingo Molnar
2009-08-07  7:38                             ` Brice Goglin
2009-08-07  7:45                               ` Ingo Molnar
2009-08-07  8:18                                 ` Brice Goglin
2009-08-07  8:23                                   ` Ingo Molnar
2009-08-07  8:27                                   ` Ingo Molnar
2009-08-07  8:30                                   ` [tip:perfcounters/core] perf stat: Rename -S/--scale to -c/--scale tip-bot for Brice Goglin
2009-08-07 11:55                             ` [PATCH] perf report: Display per-thread event counters Brice Goglin
2009-08-08 11:54                               ` [tip:perfcounters/core] perf report: Fix and improve the displaying of " tip-bot for Brice Goglin
2009-08-08 12:14                               ` [PATCH] perf report: Display " Ingo Molnar
2009-08-08 16:10                                 ` Brice Goglin
2009-08-08 16:13                                   ` Ingo Molnar
2009-08-07  6:37                     ` [tip:perfcounters/urgent] perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header tip-bot for Peter Zijlstra
2009-08-07  7:39                     ` tip-bot for Peter Zijlstra
2009-08-06 19:01                 ` [perf] howto switch from pfmon Brice Goglin
2009-06-23 14:21   ` Brice Goglin
2009-06-23 14:51     ` Ingo Molnar
2009-06-23 15:29       ` Jaswinder Singh Rajput
