* Some event modifiers missing from output and desire option to compute value based on measurements
@ 2014-03-13 16:58 William Cohen
2014-03-13 17:30 ` Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: William Cohen @ 2014-03-13 16:58 UTC
To: linux-perf-users
When experimenting with perf I wanted to have separate counts for events in userspace and the kernel. I used:
$ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e cache-misses:u -e cache-misses:k make
The associated output below includes the event modifiers for all the events, but the 3.06 and 0.37 insns per cycle values look off. Shouldn't instructions:u/cycles:u and instructions:k/cycles:k be the values reported for "insns per cycle"?
 Performance counter stats for 'make':
   340,034,597,510 instructions:u           #    3.06  insns per cycle   [83.70%]
    40,963,149,231 instructions:k           #    0.37  insns per cycle   [83.67%]
   185,451,244,302 cycles:u                                              [83.54%]
    36,901,938,457 cycles:k                                              [83.64%]
        34,811,408 cache-misses:u                                        [83.96%]
       102,781,614 cache-misses:k                                        [83.85%]
      66.290033775 seconds time elapsed
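For reference, the two ratios asked about above can be worked out by hand from the counts in this output. The following is only a small Python sketch of that arithmetic, not something perf does; the `counts` dictionary and the `ipc` helper are names made up for illustration.

```python
# Counts copied from the "perf stat" output above.
counts = {
    "instructions:u": 340_034_597_510,
    "instructions:k": 40_963_149_231,
    "cycles:u": 185_451_244_302,
    "cycles:k": 36_901_938_457,
}


def ipc(counts, modifier):
    """Instructions per cycle for one modifier ('u' or 'k')."""
    # Pair each instruction count with the cycle count that has the
    # same modifier, which is the pairing the question asks for.
    return counts[f"instructions:{modifier}"] / counts[f"cycles:{modifier}"]


for modifier in ("u", "k"):
    label = f"instructions:{modifier}/cycles:{modifier}"
    print(f"{label} = {ipc(counts, modifier):.2f}")
```

With these counts the per-modifier ratios come out at roughly 1.83 for user space and 1.11 for the kernel, neither of which matches the 3.06 and 0.37 that were printed.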
For the L1-icache-load-misses and iTLB-load-misses events, the event modifier appears to be dropped, as shown in the output below:
$ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e L1-icache-load-misses:u -e L1-icache-load-misses:k make
 Performance counter stats for 'make':
   340,522,617,398 instructions:u           #    3.09  insns per cycle   [83.55%]
    41,045,130,555 instructions:k           #    0.37  insns per cycle   [83.82%]
   184,082,319,783 cycles:u                                              [84.12%]
    36,447,301,873 cycles:k                                              [84.21%]
       849,438,930 L1-icache-load-misses                                 [82.96%]
       445,141,089 L1-icache-load-misses                                 [83.67%]
      65.611650172 seconds time elapsed
$ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e iTLB-load-misses:u -e iTLB-load-misses:k make
 2,074,873,836,988 instructions:u           #    2.09  insns per cycle   [83.47%]
   128,412,604,104 instructions:k           #    0.13  insns per cycle   [83.51%]
 1,786,252,236,017 cycles:u                                              [83.52%]
   202,469,325,995 cycles:k                                              [83.46%]
       786,419,505 iTLB-load-misses                                      [83.44%]
        40,548,044 iTLB-load-misses                                      [83.43%]
    3800.440742009 seconds time elapsed
It appears that the output lists the measurements in the same order they are specified on the command line, but it would be nice if the output were clearer about which events are being measured. If I am reading the output correctly, the L1-icache-load-misses per instruction is pretty poor for kernel-space. Much of the time I am looking at ratios of events, and it would be nice if "perf stat" had a way to compute the ratios directly. Maybe a "-m, --math" option allowing algebraic expressions where you could do:
perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e L1-icache-load-misses:u -e L1-icache-load-misses:k \
--math instructions:u/icache-load-misses:u \
--math instructions:k/icache-load-misses:k \
make
 Performance counter stats for 'make':
   340,522,617,398 instructions:u           #    3.09  insns per cycle   [83.55%]
    41,045,130,555 instructions:k           #    0.37  insns per cycle   [83.82%]
   184,082,319,783 cycles:u                                              [84.12%]
    36,447,301,873 cycles:k                                              [84.21%]
       849,438,930 L1-icache-load-misses                                 [82.96%]
       445,141,089 L1-icache-load-misses                                 [83.67%]
            401.09 instructions:u/icache-load-misses:u
             92.20 instructions:k/icache-load-misses:k
      65.611650172 seconds time elapsed
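To make the idea concrete, here is a minimal Python sketch of the kind of evaluation such an option might do. Nothing like this exists in perf; `evaluate_ratio` is a hypothetical helper, it only handles a single "a/b" expression, and it assumes the two unlabelled L1-icache-load-misses rows are the :u and :k counts in command-line order.

```python
def evaluate_ratio(expression, counts):
    """Evaluate one 'numerator/denominator' expression against named counts.

    Only plain event names are handled; event specs that themselves
    contain '/' are out of scope for this sketch.
    """
    numerator, denominator = expression.split("/", 1)
    return counts[numerator] / counts[denominator]


# Counts copied from the output above. The modifiers on the cache-miss
# rows are an assumption based on the order of the -e options.
counts = {
    "instructions:u": 340_522_617_398,
    "instructions:k": 41_045_130_555,
    "L1-icache-load-misses:u": 849_438_930,
    "L1-icache-load-misses:k": 445_141_089,
}

expressions = (
    "instructions:u/L1-icache-load-misses:u",
    "instructions:k/L1-icache-load-misses:k",
)
for expr in expressions:
    print(f"{evaluate_ratio(expr, counts):>18.2f} {expr}")
```

A real option would need a proper expression parser and a way to refer unambiguously to events whose names contain slashes.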
-Will
* Re: Some event modifiers missing from output and desire option to compute value based on measurements
2014-03-13 16:58 Some event modifiers missing from output and desire option to compute value based on measurements William Cohen
@ 2014-03-13 17:30 ` Andi Kleen
2014-03-13 18:39 ` William Cohen
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2014-03-13 17:30 UTC
To: William Cohen; +Cc: linux-perf-users
William Cohen <wcohen@redhat.com> writes:
> When experimenting with perf I wanted to have separate counts for events in userspace and the kernel. I used:
>
> $ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e cache-misses:u -e cache-misses:k make
>
> The associated output below includes the event modifiers for all the events, but the 3.06 and 0.37 insns per cycle values look off. Shouldn't instructions:u/cycles:u and instructions:k/cycles:k be the values reported for "insns per cycle"?
Yes, the event-matching code currently assumes there is only a single
event of each kind and always uses the last.
> It appears that the output is listing the measurements in the same
> order they are specified on the command line, but it would be nice if
> the output was clearer on the events being measured. If I am reading
> the output correctly, the L1-icache-load-misses per instruction is
> pretty poor for kernel-space. Much of the time I am looking at ratios
> of events and it would be nice if "perf stat" had a way to have it
> compute the ratios directly. Maybe a "-m, --math" option allowing
> algebraic expressions where you could do:
Most people just use -x, and load the result into a spreadsheet or
other script that does the computations. At some point you usually want
to plot the data or do other more complex manipulations than your
simple facility would provide.
You may also find this script useful
https://github.com/andikleen/pmu-tools/blob/master/interval-normalize.py
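As a rough illustration of that post-processing approach (this is not what interval-normalize.py does), a Python sketch could read the `-x,` output and keep the counts in order. It assumes each line starts with the count and that the event name is in the second field; the field layout differs between perf versions, so `name_column` is a hypothetical knob to adjust, and `parse_perf_csv` is a made-up name.

```python
import csv
import io


def parse_perf_csv(text, name_column=1):
    """Return a list of (event_name, count) tuples in output order.

    A list is used rather than a dict because the same event name can
    appear more than once in the output.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        # Skip blank lines, short lines, and comment lines.
        if len(fields) <= name_column or fields[0].startswith("#"):
            continue
        try:
            count = float(fields[0])
        except ValueError:
            # Non-numeric values such as "<not counted>" are skipped.
            continue
        rows.append((fields[name_column], count))
    return rows


# Sample input in the assumed "count,event" layout.
sample = (
    "340522617398,instructions:u\n"
    "41045130555,instructions:k\n"
    "184082319783,cycles:u\n"
    "36447301873,cycles:k\n"
)
rows = parse_perf_csv(sample)
by_name = dict(rows)
print(by_name["instructions:u"] / by_name["cycles:u"])
```

From there the ratios, plots, or any other manipulation are ordinary script code.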
-Andi
--
ak@linux.intel.com -- Speaking for myself only
* Re: Some event modifiers missing from output and desire option to compute value based on measurements
2014-03-13 17:30 ` Andi Kleen
@ 2014-03-13 18:39 ` William Cohen
2014-03-21 12:49 ` Christopher Covington
0 siblings, 1 reply; 5+ messages in thread
From: William Cohen @ 2014-03-13 18:39 UTC
To: Andi Kleen; +Cc: linux-perf-users
On 03/13/2014 01:30 PM, Andi Kleen wrote:
> William Cohen <wcohen@redhat.com> writes:
>
>> When experimenting with perf I wanted to have separate counts for events in userspace and the kernel. I used:
>>
>> $ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e cache-misses:u -e cache-misses:k make
>>
>> The associated output below includes the event modifiers for all the events, but the 3.06 and 0.37 insns per cycle values look off. Shouldn't instructions:u/cycles:u and instructions:k/cycles:k be the values reported for "insns per cycle"?
>
> Yes the event match code currently assumes there's only a single event
> each and always uses the last.
>
>> It appears that the output is listing the measurements in the same
>> order they are specified on the command line, but it would be nice if
>> the output was clearer on the events being measured. If I am reading
>> the output correctly, the L1-icache-load-misses per instruction is
>> pretty poor for kernel-space. Much of the time I am looking at ratios
>> of events and it would be nice if "perf stat" had a way to have it
>> compute the ratios directly. Maybe a "-m, --math" option allowing
>> algebraic expressions where you could do:
Hi Andi,
So the missing event modifier is still a problem. The events being passed to perf are not going to match the names in the output. Also, a script using the perf output is not going to be able to distinguish between the same event with different modifiers.
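The only workaround a script has today is to pair the rows with the requested events by position. Below is a small Python sketch of that, assuming the output really does follow command-line order (observed above, but not something perf promises); `relabel` is a hypothetical helper.

```python
def relabel(requested_events, output_rows):
    """Pair each output row with the event spec requested at that position.

    requested_events: event specs in the order given with -e.
    output_rows: (printed_name, count) tuples in output order.
    """
    if len(requested_events) != len(output_rows):
        raise ValueError("row count does not match the number of events")
    labelled = []
    for requested, (printed, count) in zip(requested_events, output_rows):
        # Sanity check: the base name must agree even if the modifier
        # was dropped from the printed name.
        if requested.split(":")[0] != printed.split(":")[0]:
            raise ValueError(f"{printed!r} does not match {requested!r}")
        labelled.append((requested, count))
    return labelled


requested = [
    "instructions:u", "instructions:k",
    "cycles:u", "cycles:k",
    "L1-icache-load-misses:u", "L1-icache-load-misses:k",
]
# Names and counts as printed in the earlier L1-icache-load-misses run.
printed_rows = [
    ("instructions:u", 340_522_617_398),
    ("instructions:k", 41_045_130_555),
    ("cycles:u", 184_082_319_783),
    ("cycles:k", 36_447_301_873),
    ("L1-icache-load-misses", 849_438_930),
    ("L1-icache-load-misses", 445_141_089),
]
for name, count in relabel(requested, printed_rows):
    print(f"{count:>18,} {name}")
```

This is fragile, which is the point: the names in the output should carry the modifiers themselves.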
>
> Most people just use -x, and load the result into a spreadsheet or
> other script that does the computations. At some point you usually want
> to plot the data or do other more complex manipulations than your
> simple facility would provide.
>
> You may also find this script useful
>
> https://github.com/andikleen/pmu-tools/blob/master/interval-normalize.py
>
> -Andi
>
Thanks for the pointer to the interval-normalize.py script.
Yes, many people are probably using more sophisticated tools such as spreadsheets to analyze the data from perf. However, something like a "-m, --math" option would give a bit more insight than the basic "perf stat" without having to resort to those tools. "perf stat" is already generating all sorts of derived numbers such as IPC, events/second, and percent of cache misses, so it seems like a small step to provide some flexibility for the user to specify exactly what to compute.
-Will
* Re: Some event modifiers missing from output and desire option to compute value based on measurements
2014-03-13 18:39 ` William Cohen
@ 2014-03-21 12:49 ` Christopher Covington
2014-03-21 14:16 ` David Ahern
0 siblings, 1 reply; 5+ messages in thread
From: Christopher Covington @ 2014-03-21 12:49 UTC
To: William Cohen; +Cc: Andi Kleen, linux-perf-users
Hi Will,
On 03/13/2014 02:39 PM, William Cohen wrote:
> On 03/13/2014 01:30 PM, Andi Kleen wrote:
>> William Cohen <wcohen@redhat.com> writes:
>>
>>> When experimenting with perf I wanted to have separate counts for events in userspace and the kernel. I used:
>>>
>>> $ perf stat -e instructions:u -e instructions:k -e cycles:u -e cycles:k -e cache-misses:u -e cache-misses:k make
>>>
>>> The associated output below includes the event modifiers for all the events, but the 3.06 and 0.37 insns per cycle values look off. Shouldn't instructions:u/cycles:u and instructions:k/cycles:k be the values reported for "insns per cycle"?
>>
>> Yes the event match code currently assumes there's only a single event
>> each and always uses the last.
>>
>>> It appears that the output is listing the measurements in the same
>>> order they are specified on the command line, but it would be nice if
>>> the output was clearer on the events being measured. If I am reading
>>> the output correctly, the L1-icache-load-misses per instruction is
>>> pretty poor for kernel-space. Much of the time I am looking at ratios
>>> of events and it would be nice if "perf stat" had a way to have it
>>> compute the ratios directly. Maybe a "-m, --math" option allowing
>>> algebraic expressions where you could do:
>
> Hi Andi,
>
> So the missing event modifier is still a problem. The events being passed to perf are not going to match the names in the output. Also, a script using the perf output is not going to be able to distinguish between the same event with different modifiers.
>
>>
>> Most people just use -x, and load the result into a spreadsheet or
>> other script that does the computations. At some point you usually want
>> to plot the data or do other more complex manipulations than your
>> simple facility would provide.
>>
>> You may also find this script useful
>>
>> https://github.com/andikleen/pmu-tools/blob/master/interval-normalize.py
>>
>> -Andi
>>
>
> Thanks for the pointer to the interval-normalize.py script.
>
> Yes, many people are probably using more sophisticated tools such as
> spreadsheets to analyze the data from perf. However, something like a "-m,
> --math" option would give a bit more insight than the basic "perf stat"
> without having to resort to those tools. "perf stat" is already
> generating all sorts of derived numbers such as IPC, events/second, and
> percent of cache misses, so it seems like a small step to provide some
> flexibility for the user to specify exactly what to compute.
I don't know how useful a reference this is, but here's an out-of-tree
"periodic" command with math flag support.
https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.10/tree/tools/perf/builtin-periodic.c?h=LNX.LA.3.6_rb1.1&id=4235d779be748291ed2ec5581dd64e7d1a529297
Christopher
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.