* perf_event sampling a multithreaded process - ioctl with PERF_EVENT_IOC_SET_OUTPUT fails
@ 2017-04-19 15:15 Daniel John FitzGerald
2017-04-19 15:46 ` Andi Kleen
0 siblings, 1 reply; 4+ messages in thread
From: Daniel John FitzGerald @ 2017-04-19 15:15 UTC (permalink / raw)
To: linux-perf-users
I've been trying to prototype an application that would use perf_event
sampling (sample_period > 0 in *attr parameter passed on call to
perf_event_open()) to monitor multiple performance counters for a
potentially multi-threaded process. After playing with
perf_event_open() for a while, the solution I've come up with works like
this:
1. Use fork() to create a child process that we will use to execute the
   multi-threaded test after the parent gives it the go-ahead over an
   IPC pipe. In my test program, the child simply calls a local
   function that runs the load, so there is no need to exec.
2. In the *attr parameter passed to perf_event_open(), set inherit=1
   and disabled=1.
3. For each logical CPU that could execute the program:
   1. For each event that we want to monitor:
      1. If this is the first event we've processed for this CPU:
         1. In the *attr parameter passed to perf_event_open(), set
            disabled=1.
         2. Call perf_event_open() to open the event:
            leaderFD = perf_event_open(*attr, <child_process_id>,
            <this_cpu_id>, -1, 0);
         3. Call mmap() to allocate a ring buffer:
            ringBuffer = mmap(NULL, MMAPBUFFERSIZE, PROT_READ | PROT_WRITE,
            MAP_SHARED, leaderFD, 0);
      2. Else:
         1. In the *attr parameter passed to perf_event_open(), set
            disabled=0.
         2. Call perf_event_open() to open the event:
            followerFD = perf_event_open(*attr, <child_process_id>,
            <this_cpu_id>, leaderFD, 0);
         3. Issue an ioctl() with PERF_EVENT_IOC_SET_OUTPUT to tell
            perf_event to associate followerFD with leaderFD, which
            will cause perf_event to report this event's kernel
            notifications to the mmap'd ring buffer at ringBuffer.
4. Issue an ioctl() with PERF_EVENT_IOC_ENABLE on the first event on
   each CPU to enable all events.
5. Issue the IPC synchronization signal (in my case, close a shared
   pipe) to alert the child process to begin its test.
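The per-CPU part of the steps above (step 3) can be sketched roughly as follows. This is a minimal illustration, not the code attached later in the thread: the helper name open_events_on_cpu(), its parameters, and the local perf_event_open() syscall wrapper are all assumptions made for the example. mmap_len is expected to be (1 + 2^n) pages, as the perf_event_open(2) man page describes.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/perf_event.h>

/* glibc provides no wrapper for perf_event_open(), so go through syscall(). */
static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                           int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Hypothetical helper: open n_events counters for `pid` on ONE cpu, with
 * every follower redirected into the leader's ring buffer.
 * `base` is a template attr, `configs` holds the raw event codes.
 * On success returns 0, fds[0] is the leader and *ring is its mapping.
 * On failure closes whatever was opened and returns -1 with errno set.
 */
static int open_events_on_cpu(const struct perf_event_attr *base, pid_t pid,
                              int cpu, const uint64_t *configs,
                              size_t n_events, size_t mmap_len,
                              int *fds, void **ring)
{
    size_t opened = 0;
    int saved;

    if (!base || !configs || !fds || !ring || n_events == 0 || cpu < 0) {
        errno = EINVAL;
        return -1;
    }
    *ring = MAP_FAILED;

    for (size_t i = 0; i < n_events; i++) {
        struct perf_event_attr attr = *base;
        attr.config = configs[i];
        attr.disabled = (i == 0);          /* only the leader starts disabled */

        /* Same pid and same cpu for every event that shares the buffer. */
        fds[i] = perf_event_open(&attr, pid, cpu, i == 0 ? -1 : fds[0], 0);
        if (fds[i] < 0)
            goto fail;
        opened++;

        if (i == 0) {
            *ring = mmap(NULL, mmap_len, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fds[0], 0);
            if (*ring == MAP_FAILED)
                goto fail;
        } else if (ioctl(fds[i], PERF_EVENT_IOC_SET_OUTPUT, fds[0]) < 0) {
            goto fail;
        }
    }
    return 0;

fail:
    saved = errno;                          /* keep the original error code */
    if (*ring != MAP_FAILED)
        munmap(*ring, mmap_len);
    *ring = MAP_FAILED;
    for (size_t i = 0; i < opened; i++)
        close(fds[i]);
    errno = saved;
    return -1;
}
```

A caller would invoke this once per logical CPU and then issue ioctl(fds[0], PERF_EVENT_IOC_ENABLE, 0) on each leader, as in step 4. Passing the cpu value through a single parameter makes it harder for the leader and its followers to end up on different CPUs.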
The problem that I am running into is that whenever I attempt to call
ioctl() with PERF_EVENT_IOC_SET_OUTPUT, I get a -1 response with
errno==EINVAL. After looking through the mail archives as well as at
similar code in the oprofile and PAPI projects, I can't figure out why
this would be. I know that I can successfully call perf_event_open() on
all events for all CPUs. Furthermore, I can create a stand-alone ring
buffer for multiple events on the same CPU (I haven't tried doing it for
all events on all CPUs but feel it would work). Does anybody have an
idea about what I could be doing wrong here?
For reference, here are the contents of the *attr structure, event:
memset(&event, 0, sizeof(event));             // Initialize all fields to zero
event.type = PERF_TYPE_RAW;                   // We will use custom "raw" config values, defined in pwr8_hpm_MMCR_map[]
event.size = sizeof(struct perf_event_attr);  // The size of the perf_event_attr structure
assert(event.config == 0);                    // This will be set when we open our individual performance counters
event.sample_period = 1048576;                // Generate a sample overflow notification every 1M events
event.sample_type = PERF_SAMPLE_IP;           // Record the instruction pointer
event.sample_type |= PERF_SAMPLE_TID;         // Record the process and thread IDs
event.sample_type |= PERF_SAMPLE_TIME;        // Record a timestamp with each sample
event.sample_type |= PERF_SAMPLE_READ;        // Record counter values for all events in the group
event.sample_type |= PERF_SAMPLE_CPU;         // Record the CPU number associated with this event
event.disabled = 1;                           // Disable this counter on initialization (only done for parent event)
event.inherit = 1;                            // Sample for child threads as well
event.exclude_kernel = 1;                     // Exclude events that happened in kernel space
event.exclude_hv = 1;                         // Exclude events that happened in hypervisor space
event.exclude_idle = 0;                       // Include events that happened when the CPU is idle
event.precise_ip = 2;                         // Request a precise IP address whenever possible
assert(event.watermark == 0);                 // Required to use wakeup_events
assert(event.freq == 0);                      // Indicate that we're using counter-based sampling
event.wakeup_events = 1;                      // Begin sending overflow notifications after x events
event.bp_type = HW_BREAKPOINT_EMPTY;          // No breakpoint type
assert(event.config1 == 0);                   // Not used
assert(event.config2 == 0);                   // Not used
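For the sample_type above, each PERF_RECORD_SAMPLE in the ring buffer should have a fixed layout, going by the perf_event_open(2) man page: header, ip, pid/tid, time, cpu/reserved, then the PERF_SAMPLE_READ payload. The decoder below is a sketch under two assumptions: read_format is 0 (as the memset leaves it), in which case the READ payload is a single u64 for this event rather than values for the whole group, and the record does not wrap around the end of the buffer. The struct sample and decode_sample() names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/* Decoded fields of one sample (hypothetical struct for this example). */
struct sample {
    uint64_t ip;
    uint32_t pid, tid;
    uint64_t time;
    uint32_t cpu;
    uint64_t value;   /* counter value from PERF_SAMPLE_READ, read_format == 0 */
};

/*
 * Decode one record starting at `rec`, with `len` contiguous bytes available.
 * Returns 0 on success, -1 if the record is not a sample or is too short.
 * Assumes sample_type == IP | TID | TIME | CPU | READ and read_format == 0;
 * any other combination changes the layout.
 */
static int decode_sample(const void *rec, size_t len, struct sample *out)
{
    struct perf_event_header hdr;
    const unsigned char *p = rec;
    const size_t need = sizeof(hdr) + 5 * sizeof(uint64_t);
    uint32_t pair[2];

    if (!rec || !out || len < sizeof(hdr))
        return -1;
    memcpy(&hdr, p, sizeof(hdr));
    if (hdr.type != PERF_RECORD_SAMPLE || hdr.size < need || len < hdr.size)
        return -1;
    p += sizeof(hdr);

    memcpy(&out->ip, p, 8);    p += 8;      /* PERF_SAMPLE_IP   */
    memcpy(pair, p, 8);        p += 8;      /* PERF_SAMPLE_TID  */
    out->pid = pair[0];
    out->tid = pair[1];
    memcpy(&out->time, p, 8);  p += 8;      /* PERF_SAMPLE_TIME */
    memcpy(pair, p, 8);        p += 8;      /* PERF_SAMPLE_CPU  */
    out->cpu = pair[0];                     /* pair[1] is reserved */
    memcpy(&out->value, p, 8);              /* PERF_SAMPLE_READ */
    return 0;
}
```

A real reader would also have to honor data_head and data_tail in the mmap control page and copy out records that straddle the end of the buffer before decoding them.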
--
Regards,
Dan FitzGerald
An enlightenment painter would paint a grand house on a lawn;
A romantic painter would paint it on fire.
* Re: perf_event sampling a multithreaded process - ioctl with PERF_EVENT_IOC_SET_OUTPUT fails
2017-04-19 15:15 perf_event sampling a multithreaded process - ioctl with PERF_EVENT_IOC_SET_OUTPUT fails Daniel John FitzGerald
@ 2017-04-19 15:46 ` Andi Kleen
2017-04-19 16:37 ` Daniel John FitzGerald
0 siblings, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2017-04-19 15:46 UTC (permalink / raw)
To: Daniel John FitzGerald; +Cc: linux-perf-users
Daniel John FitzGerald <daniel.j.fitzgerald@gmail.com> writes:
>
> The problem that I am running into is that whenever I attempt to call
> ioctl() with PERF_EVENT_IOC_SET_OUTPUT, I get a -1 response with
> errno==EINVAL.
There are lots of restrictions to PERF_EVENT_IOC_SET_OUTPUT.
Likely you're violating that it has to be on the same CPU or same task.
-Andi
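The CPU/task restriction mentioned above can be approximated as a small predicate. This is a sketch of the rule as commonly understood from the kernel's set-output checks, not a copy of kernel code: struct event_target and may_share_output() are hypothetical names, and the kernel applies further checks (for example on clocks) that are not modelled here.

```c
#include <stdbool.h>
#include <sys/types.h>

/* What the caller passed to perf_event_open() for one event (hypothetical). */
struct event_target {
    pid_t pid;   /* pid argument */
    int   cpu;   /* cpu argument, -1 meaning "any CPU" */
};

/*
 * Approximates the placement restriction for PERF_EVENT_IOC_SET_OUTPUT:
 *  - both events must have been opened with the same cpu argument, and
 *  - if that cpu is -1 (a per-task buffer), both must target the same task.
 */
static bool may_share_output(struct event_target ev, struct event_target out)
{
    if (ev.cpu != out.cpu)
        return false;               /* no cross-CPU buffers */
    if (out.cpu == -1 && ev.pid != out.pid)
        return false;               /* per-task buffer needs the same task */
    return true;
}
```

Running a check like this over the recorded (pid, cpu) pairs before the ioctl() turns an opaque EINVAL into a message naming the mismatched argument.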
* Re: perf_event sampling a multithreaded process - ioctl with PERF_EVENT_IOC_SET_OUTPUT fails
2017-04-19 15:46 ` Andi Kleen
@ 2017-04-19 16:37 ` Daniel John FitzGerald
2017-04-19 16:51 ` Daniel John FitzGerald
0 siblings, 1 reply; 4+ messages in thread
From: Daniel John FitzGerald @ 2017-04-19 16:37 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-perf-users
I'm confused as to why either would be the case; I'm specifying the same PID and same CPU number. Is there any documentation out there on all of the restrictions to PERF_EVENT_IOC_SET_OUTPUT?
For reference, attached is the source for my function that calls perf_event_open(), mmaps the ring buffer, and tries to do the ioctl.
Regards,
Dan FitzGerald
An enlightenment painter would paint a grand house on a lawn;
A romantic painter would paint it on fire.
[-- Attachment #2: code-snippit.c --]
[-- Type: application/x-c-file, Size: 7701 bytes --]
* Re: perf_event sampling a multithreaded process - ioctl with PERF_EVENT_IOC_SET_OUTPUT fails
2017-04-19 16:37 ` Daniel John FitzGerald
@ 2017-04-19 16:51 ` Daniel John FitzGerald
0 siblings, 0 replies; 4+ messages in thread
From: Daniel John FitzGerald @ 2017-04-19 16:51 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-perf-users
*Sighs*
I found the problem. I sent the wrong index counter as the
perf_event_open CPU... they were all being opened on different CPUs.
Thanks for the help :-)
Regards,
Dan FitzGerald
An enlightenment painter would paint a grand house on a lawn;
A romantic painter would paint it on fire.
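The attached code is not visible in the archive, so the exact bug cannot be shown. The sketch below only illustrates the general shape of the fix: in a nested CPU/event loop, the cpu argument should come from the CPU list, never from the event loop index. struct open_step and plan_opens() are hypothetical names used for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* One planned perf_event_open() call (hypothetical struct for this example). */
struct open_step {
    int    cpu;          /* value to pass as the cpu argument */
    size_t event_index;  /* which event config to use */
    bool   is_leader;    /* first event on this CPU owns the ring buffer */
};

/*
 * Fill `plan` with n_cpus * n_events steps, grouped per CPU.
 * `plan` must have room for n_cpus * n_events entries.
 * Returns the number of steps written.
 */
static size_t plan_opens(const int *cpu_ids, size_t n_cpus, size_t n_events,
                         struct open_step *plan)
{
    size_t n = 0;

    for (size_t c = 0; c < n_cpus; c++) {
        for (size_t e = 0; e < n_events; e++) {
            plan[n].cpu = cpu_ids[c];      /* the CPU id, not the index e */
            plan[n].event_index = e;
            plan[n].is_leader = (e == 0);
            n++;
        }
    }
    return n;
}
```

Building the plan first and printing it before any syscalls are made gives a quick way to confirm that every follower shares its leader's CPU.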