* [RFC PATCH 0/1] Notifications for perf sideband events
@ 2017-06-06 14:49 Naveen N. Rao
2017-06-06 15:16 ` [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS Naveen N. Rao
0 siblings, 1 reply; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 14:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa
Cc: linux-kernel
Currently, there is no way to ask for signals to be delivered when a
certain number of sideband events have been logged into the ring
buffer. This is problematic if we are only interested in, say, context
switch events. This patch provides a way to achieve this.

As noted, this is an RFC and I am not too particular about the interface
or the ioctl name. Kindly suggest if you think there is a better way to
achieve this.
- Naveen
---
Here is a sample program demonstrating the same:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <signal.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>

static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
		int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, hw_event, pid, cpu,
		       group_fd, flags);
}

static void sigio_handler(int n, siginfo_t *info, void *uc)
{
	fprintf (stderr, "Caught %s\n", info->si_code == POLL_HUP ? "POLL_HUP" :
		 (info->si_code == POLL_IN ? "POLL_IN" : "other signal"));

	/* Re-arm: ask for another notification after 2 more records */
	if (ioctl(info->si_fd, PERF_EVENT_IOC_REFRESH, 2) == -1)
		perror("SIGIO: IOC_REFRESH");
}

int main(int argc, char **argv)
{
	struct perf_event_attr pe;
	struct sigaction act;
	int fd;
	void *buf;

	memset(&act, 0, sizeof(act));
	act.sa_sigaction = sigio_handler;
	act.sa_flags = SA_SIGINFO;
	sigaction(SIGIO, &act, 0);

	/* Dummy software event: only the context switch records matter */
	memset(&pe, 0, sizeof(struct perf_event_attr));
	pe.size = sizeof(struct perf_event_attr);
	pe.type = PERF_TYPE_SOFTWARE;
	pe.config = PERF_COUNT_SW_DUMMY;
	pe.disabled = 1;
	pe.sample_period = 1;
	pe.context_switch = 1;

	fd = perf_event_open(&pe, 0, -1, -1, 0);
	if (fd == -1) {
		fprintf(stderr, "Error opening leader %llx\n",
			(unsigned long long)pe.config);
		exit(EXIT_FAILURE);
	}

	buf = mmap(NULL, sysconf(_SC_PAGESIZE) * 2, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		fprintf(stderr, "Can't mmap buffer\n");
		return -1;
	}

	/* Route notifications for this fd to us as SIGIO */
	if (fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_ASYNC) == -1)
		return -2;

	if (fcntl(fd, F_SETSIG, SIGIO) == -1)
		return -3;

	if (fcntl(fd, F_SETOWN, getpid()) == -1)
		return -4;

	/* Count records rather than overflows, then arm for 2 records */
	if (ioctl(fd, PERF_EVENT_IOC_COUNT_RECORDS, 0) == -1)
		return -5;

	if (ioctl(fd, PERF_EVENT_IOC_REFRESH, 2) == -1)
		return -6;

	fprintf (stderr, "Sleep 1\n");
	sleep(1);
	fprintf (stderr, "Sleep 2\n");
	sleep(1);
	fprintf (stderr, "Sleep 3\n");
	sleep(1);

	/* Disable the event counter */
	ioctl(fd, PERF_EVENT_IOC_DISABLE, 1);

	close(fd);

	return 0;
}
A sample output:
$ time ./cs
Sleep 1
Caught POLL_HUP
Sleep 2
Caught POLL_HUP
Sleep 3
Caught POLL_HUP
real 0m3.060s
user 0m0.001s
sys 0m0.005s
Naveen N. Rao (1):
  kernel/events: Introduce IOC_COUNT_RECORDS

 include/linux/perf_event.h      |  1 +
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 16 +++++++++++++++-
 kernel/events/ring_buffer.c     |  9 +++++++++
 4 files changed, 26 insertions(+), 1 deletion(-)
--
2.12.2
* [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 14:49 [RFC PATCH 0/1] Notifications for perf sideband events Naveen N. Rao
@ 2017-06-06 15:16 ` Naveen N. Rao
2017-06-06 15:51 ` Arnaldo Carvalho de Melo
2017-06-06 16:17 ` Peter Zijlstra
0 siblings, 2 replies; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 15:16 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa
Cc: linux-kernel

Many perf sideband events (context switches, namespaces, ...) are useful
by themselves without the need for subscribing to any overflow events.
However, it is not possible to subscribe for notifications when such
records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
a way to request this.

With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
after which to generate a notification, rather than the number of
overflow events.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 include/linux/perf_event.h      |  1 +
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 16 +++++++++++++++-
 kernel/events/ring_buffer.c     |  9 +++++++++
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 24a635887f28..016f2da2bba7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -683,6 +683,7 @@ struct perf_event {
 	struct irq_work			pending;

 	atomic_t			event_limit;
+	bool				count_records;

 	/* address range filters */
 	struct perf_addr_filters_head	addr_filters;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b1c0b187acfe..fb989ac71ded 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -408,6 +408,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
 #define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
+#define PERF_EVENT_IOC_COUNT_RECORDS	_IO ('$', 10)

 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6e75a5c9412d..637064880b36 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2674,6 +2674,16 @@ void perf_event_addr_filters_sync(struct perf_event *event)
 }
 EXPORT_SYMBOL_GPL(perf_event_addr_filters_sync);

+static int _perf_event_count_records(struct perf_event *event)
+{
+	if (event->attr.inherit || !is_sampling_event(event))
+		return -EINVAL;
+
+	event->count_records = 1;
+
+	return 0;
+}
+
 static int _perf_event_refresh(struct perf_event *event, int refresh)
 {
 	/*
@@ -4699,6 +4709,9 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 		func = _perf_event_reset;
 		break;

+	case PERF_EVENT_IOC_COUNT_RECORDS:
+		return _perf_event_count_records(event);
+
 	case PERF_EVENT_IOC_REFRESH:
 		return _perf_event_refresh(event, arg);

@@ -7342,7 +7355,8 @@ static int __perf_event_overflow(struct perf_event *event,
 	 */

 	event->pending_kill = POLL_IN;
-	if (events && atomic_dec_and_test(&event->event_limit)) {
+	if (events && !event->count_records &&
+			atomic_dec_and_test(&event->event_limit)) {
 		ret = 1;
 		event->pending_kill = POLL_HUP;

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 2831480c63a2..9b9ca0608fed 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -126,6 +126,7 @@ __perf_output_begin(struct perf_output_handle *handle,
 		u64			 id;
 		u64			 lost;
 	} lost_event;
+	int events = atomic_read(&event->event_limit);

 	rcu_read_lock();
 	/*
@@ -197,6 +198,14 @@ __perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
 		local_add(rb->watermark, &rb->wakeup);

+	if (events && event->count_records &&
+			atomic_dec_and_test(&event->event_limit)) {
+		event->pending_kill = POLL_HUP;
+		local_inc(&rb->wakeup);
+
+		perf_event_disable_inatomic(event);
+	}
+
 	page_shift = PAGE_SHIFT + page_order(rb);

 	handle->page = (offset >> page_shift) & (rb->nr_pages - 1);
--
2.12.2
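The countdown that the patch adds can be followed more easily in isolation. The following is a minimal userspace sketch, not kernel code: `struct model_event`, `model_refresh()`, `model_log_record()` and `model_overflow()` are hypothetical names invented for illustration, and the assumption that IOC_REFRESH adds to `event_limit` and re-enables the event is a simplification of the existing ioctl. It only mirrors the two conditions touched by the diff: with `count_records` set, each logged record consumes the budget and the last one raises POLL_HUP, while overflows no longer consume it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the pending_kill values used in the patch. */
enum model_kill { MODEL_NONE, MODEL_POLL_IN, MODEL_POLL_HUP };

/* Hypothetical model of the three perf_event fields involved. */
struct model_event {
	atomic_int event_limit;	/* budget armed by IOC_REFRESH */
	int count_records;	/* set by the proposed IOC_COUNT_RECORDS */
	bool disabled;
};

/* Assumed IOC_REFRESH behaviour: add to the budget and enable the event. */
void model_refresh(struct model_event *ev, int refresh)
{
	assert(refresh > 0);
	atomic_fetch_add(&ev->event_limit, refresh);
	ev->disabled = false;
}

/* Mirrors the hunk added to __perf_output_begin(): one call per record. */
enum model_kill model_log_record(struct model_event *ev)
{
	int events = atomic_load(&ev->event_limit);

	if (ev->disabled)
		return MODEL_NONE;

	/* atomic_fetch_sub() returns the old value; old == 1 means we hit 0 */
	if (events && ev->count_records &&
	    atomic_fetch_sub(&ev->event_limit, 1) == 1) {
		ev->disabled = true;
		return MODEL_POLL_HUP;
	}
	return MODEL_NONE;
}

/* Mirrors the changed condition in __perf_event_overflow(). */
enum model_kill model_overflow(struct model_event *ev)
{
	int events = atomic_load(&ev->event_limit);
	enum model_kill kill = MODEL_POLL_IN;

	if (events && !ev->count_records &&
	    atomic_fetch_sub(&ev->event_limit, 1) == 1) {
		kill = MODEL_POLL_HUP;
		ev->disabled = true;
	}
	return kill;
}
```

Arming the model with a budget of 2 and logging two records yields one POLL_HUP on the second record. That is consistent with the sample output in the cover letter, presumably because each sleep() produces a switch-out and a switch-in record.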
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 15:16 ` [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS Naveen N. Rao
@ 2017-06-06 15:51 ` Arnaldo Carvalho de Melo
2017-06-06 16:56 ` Naveen N. Rao
2017-06-06 16:17 ` Peter Zijlstra
1 sibling, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-06 15:51 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Peter Zijlstra, Ingo Molnar, Jiri Olsa, linux-kernel

On Tue, Jun 06, 2017 at 08:46:28PM +0530, Naveen N. Rao wrote:
> Many perf sideband events (context switches, namespaces, ...) are useful
> by themselves without the need for subscribing to any overflow events.
> However, it is not possible to subscribe for notifications when such
> records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> a way to request this.
>
> With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> after which to generate a notification, rather than the number of
> overflow events.

Can you take a look at tools/perf/python/twatch.py?

[acme@jouet linux]$ make O=/tmp/build/perf -C tools/perf install-bin
[root@jouet linux]# export PYTHONPATH=/tmp/build/perf/python/
[root@jouet linux]# python tools/perf/python/twatch.py
cpu: 0, pid: 29860, tid: 29860 { type: exit, pid: 29860, ppid: 29860, tid: 29860, ptid: 29860, time: 117729363047027}
cpu: 0, pid: 29854, tid: 29854 { type: exit, pid: 29854, ppid: 29854, tid: 29854, ptid: 29854, time: 117729363617885}
cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29865, ppid: 29853, tid: 29865, ptid: 29853, time: 117729363800225}
cpu: 0, pid: 29865, tid: 29865 { type: comm, pid: 29865, tid: 29865, comm: fixdep }
cpu: 0, pid: 29865, tid: 29865 { type: exit, pid: 29865, ppid: 29865, tid: 29865, ptid: 29865, time: 117729364898505}
cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29866, ppid: 29853, tid: 29866, ptid: 29853, time: 117729365022416}
cpu: 0, pid: 29866, tid: 29866 { type: comm, pid: 29866, tid: 29866, comm: rm }
cpu: 0, pid: 29866, tid: 29866 { type: exit, pid: 29866, ppid: 29866, tid: 29866, ptid: 29866, time: 117729365665831}
cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29867, ppid: 29853, tid: 29867, ptid: 29853, time: 117729365846030}
cpu: 0, pid: 29867, tid: 29867 { type: comm, pid: 29867, tid: 29867, comm: mv }
cpu: 2, pid: 28218, tid: 28218 { type: exit, pid: 28218, ppid: 28218, tid: 28218, ptid: 28218, time: 117729704900029}
^CTraceback (most recent call last):
  File "tools/perf/python/twatch.py", line 68, in <module>
    main()
  File "tools/perf/python/twatch.py", line 40, in main
    evlist.poll(timeout = -1)
KeyboardInterrupt
[root@jouet linux]#

This is using the python binding to get notifications for such meta
events "synchronously", you can do the same with a C proggie, of course,
and using just what we have in the kernel already.

See its changelog comments to see examples:

  git log tools/perf/python/twatch.py

For instance, what I think you want is in:

[acme@jouet linux]$ git log --oneline -1 cfeb1d90a1b1db96383b48888cb7a5f10ca12e12
cfeb1d90a1b1 perf python: Use attr.watermark in twatch.py
[acme@jouet linux]$ git log --oneline -1 58b32c1b538f2d197ce385d6a314e83f8b787021
58b32c1b538f perf python: Make twatch.py use soft dummy event, freq=0
[acme@jouet linux]$

- Arnaldo

> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  include/linux/perf_event.h      |  1 +
>  include/uapi/linux/perf_event.h |  1 +
>  kernel/events/core.c            | 16 +++++++++++++++-
>  kernel/events/ring_buffer.c     |  9 +++++++++
>  4 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 24a635887f28..016f2da2bba7 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -683,6 +683,7 @@ struct perf_event {
>  	struct irq_work			pending;
>
>  	atomic_t			event_limit;
> +	bool				count_records;
>
>  	/* address range filters */
>  	struct perf_addr_filters_head	addr_filters;
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b1c0b187acfe..fb989ac71ded 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -408,6 +408,7 @@ struct perf_event_attr {
>  #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
>  #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
>  #define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
> +#define PERF_EVENT_IOC_COUNT_RECORDS	_IO ('$', 10)
>
>  enum perf_event_ioc_flags {
>  	PERF_IOC_FLAG_GROUP		= 1U << 0,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 6e75a5c9412d..637064880b36 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2674,6 +2674,16 @@ void perf_event_addr_filters_sync(struct perf_event *event)
>  }
>  EXPORT_SYMBOL_GPL(perf_event_addr_filters_sync);
>
> +static int _perf_event_count_records(struct perf_event *event)
> +{
> +	if (event->attr.inherit || !is_sampling_event(event))
> +		return -EINVAL;
> +
> +	event->count_records = 1;
> +
> +	return 0;
> +}
> +
>  static int _perf_event_refresh(struct perf_event *event, int refresh)
>  {
>  	/*
> @@ -4699,6 +4709,9 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
>  		func = _perf_event_reset;
>  		break;
>
> +	case PERF_EVENT_IOC_COUNT_RECORDS:
> +		return _perf_event_count_records(event);
> +
>  	case PERF_EVENT_IOC_REFRESH:
>  		return _perf_event_refresh(event, arg);
>
> @@ -7342,7 +7355,8 @@ static int __perf_event_overflow(struct perf_event *event,
>  	 */
>
>  	event->pending_kill = POLL_IN;
> -	if (events && atomic_dec_and_test(&event->event_limit)) {
> +	if (events && !event->count_records &&
> +			atomic_dec_and_test(&event->event_limit)) {
>  		ret = 1;
>  		event->pending_kill = POLL_HUP;
>
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 2831480c63a2..9b9ca0608fed 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -126,6 +126,7 @@ __perf_output_begin(struct perf_output_handle *handle,
>  		u64			 id;
>  		u64			 lost;
>  	} lost_event;
> +	int events = atomic_read(&event->event_limit);
>
>  	rcu_read_lock();
>  	/*
> @@ -197,6 +198,14 @@ __perf_output_begin(struct perf_output_handle *handle,
>  	if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
>  		local_add(rb->watermark, &rb->wakeup);
>
> +	if (events && event->count_records &&
> +			atomic_dec_and_test(&event->event_limit)) {
> +		event->pending_kill = POLL_HUP;
> +		local_inc(&rb->wakeup);
> +
> +		perf_event_disable_inatomic(event);
> +	}
> +
>  	page_shift = PAGE_SHIFT + page_order(rb);
>
>  	handle->page = (offset >> page_shift) & (rb->nr_pages - 1);
> --
> 2.12.2
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 15:51 ` Arnaldo Carvalho de Melo
@ 2017-06-06 16:56 ` Naveen N. Rao
0 siblings, 0 replies; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 16:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Ingo Molnar, Jiri Olsa, linux-kernel

Hi Arnaldo,

On 2017/06/06 12:51PM, Arnaldo Carvalho de Melo wrote:
> On Tue, Jun 06, 2017 at 08:46:28PM +0530, Naveen N. Rao wrote:
> > Many perf sideband events (context switches, namespaces, ...) are useful
> > by themselves without the need for subscribing to any overflow events.
> > However, it is not possible to subscribe for notifications when such
> > records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> > a way to request this.
> >
> > With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> > after which to generate a notification, rather than the number of
> > overflow events.
>
> Can you take a look at tools/perf/python/twatch.py?
>
> [acme@jouet linux]$ make O=/tmp/build/perf -C tools/perf install-bin
> [root@jouet linux]# export PYTHONPATH=/tmp/build/perf/python/
> [root@jouet linux]# python tools/perf/python/twatch.py
> cpu: 0, pid: 29860, tid: 29860 { type: exit, pid: 29860, ppid: 29860, tid: 29860, ptid: 29860, time: 117729363047027}
> cpu: 0, pid: 29854, tid: 29854 { type: exit, pid: 29854, ppid: 29854, tid: 29854, ptid: 29854, time: 117729363617885}
> cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29865, ppid: 29853, tid: 29865, ptid: 29853, time: 117729363800225}
> cpu: 0, pid: 29865, tid: 29865 { type: comm, pid: 29865, tid: 29865, comm: fixdep }
> cpu: 0, pid: 29865, tid: 29865 { type: exit, pid: 29865, ppid: 29865, tid: 29865, ptid: 29865, time: 117729364898505}
> cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29866, ppid: 29853, tid: 29866, ptid: 29853, time: 117729365022416}
> cpu: 0, pid: 29866, tid: 29866 { type: comm, pid: 29866, tid: 29866, comm: rm }
> cpu: 0, pid: 29866, tid: 29866 { type: exit, pid: 29866, ppid: 29866, tid: 29866, ptid: 29866, time: 117729365665831}
> cpu: 0, pid: 29853, tid: 29853 { type: fork, pid: 29867, ppid: 29853, tid: 29867, ptid: 29853, time: 117729365846030}
> cpu: 0, pid: 29867, tid: 29867 { type: comm, pid: 29867, tid: 29867, comm: mv }
> cpu: 2, pid: 28218, tid: 28218 { type: exit, pid: 28218, ppid: 28218, tid: 28218, ptid: 28218, time: 117729704900029}
> ^CTraceback (most recent call last):
>   File "tools/perf/python/twatch.py", line 68, in <module>
>     main()
>   File "tools/perf/python/twatch.py", line 40, in main
>     evlist.poll(timeout = -1)
> KeyboardInterrupt
> [root@jouet linux]#
>
> This is using the python binding to get notifications for such meta
> events "synchronously", you can do the same with a C proggie, of course,
> and using just what we have in the kernel already.
>
> See its changelog comments to see examples:
>
>   git log tools/perf/python/twatch.py
>
> For instance, what I think you want is in:
>
> [acme@jouet linux]$ git log --oneline -1 cfeb1d90a1b1db96383b48888cb7a5f10ca12e12
> cfeb1d90a1b1 perf python: Use attr.watermark in twatch.py
> [acme@jouet linux]$ git log --oneline -1 58b32c1b538f2d197ce385d6a314e83f8b787021
> 58b32c1b538f perf python: Make twatch.py use soft dummy event, freq=0
> [acme@jouet linux]$

Thanks. I did come across the above commits when I was looking into
this. And this is very close to what I was looking for, but the key
difference is that I want to be notified through a signal, rather than
having to use poll()/select(). This is for self-profiling, so a process
won't need a separate thread for monitoring the mmap buffer.

- Naveen
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 15:16 ` [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS Naveen N. Rao
2017-06-06 15:51 ` Arnaldo Carvalho de Melo
@ 2017-06-06 16:17 ` Peter Zijlstra
2017-06-06 17:12 ` Naveen N. Rao
1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-06-06 16:17 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On Tue, Jun 06, 2017 at 08:46:28PM +0530, Naveen N. Rao wrote:
> Many perf sideband events (context switches, namespaces, ...) are useful
> by themselves without the need for subscribing to any overflow events.
> However, it is not possible to subscribe for notifications when such
> records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> a way to request this.
>
> With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> after which to generate a notification, rather than the number of
> overflow events.

You forgot to explain why? As is I'm not terribly excited to have more
'crud' in that output path.
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 16:17 ` Peter Zijlstra
@ 2017-06-06 17:12 ` Naveen N. Rao
0 siblings, 0 replies; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 17:12 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

Hi Peter,

On 2017/06/06 06:17PM, Peter Zijlstra wrote:
> On Tue, Jun 06, 2017 at 08:46:28PM +0530, Naveen N. Rao wrote:
> > Many perf sideband events (context switches, namespaces, ...) are useful
> > by themselves without the need for subscribing to any overflow events.
> > However, it is not possible to subscribe for notifications when such
> > records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> > a way to request this.
> >
> > With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> > after which to generate a notification, rather than the number of
> > overflow events.
>
> You forgot to explain why? As is I'm not terribly excited to have more
> 'crud' in that output path.

The usecase is a process wanting to profile itself for context switches.
Currently, there is no way to ask for signals to be delivered for
PERF_RECORD_SWITCH (or, other sideband) events, except setting
{watermark=1, wakeup_watermark=1} and using poll()/select(). But, for
self-profiling, that requires a separate thread to be used.

In addition, it would be easier for user-space to ask for notification
after a certain number of records, rather than a certain number of
bytes. This is specifically important for context switch events, since
we would otherwise not be able to block (context switch out would
generate an event which would wake us up immediately, interrupting any
existing system calls). I have included an example program in the cover
letter which demonstrates this scenario.

For the above reasons, I felt it would be simpler to extend the use of
IOC_REFRESH ioctl.

- Naveen
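For comparison, the existing alternative mentioned above ({watermark=1, wakeup_watermark=1} plus poll()) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not code from the thread: the helper names `init_switch_attr()`, `open_switch_event()` and `wait_for_records()` are made up here, and the attribute choices simply combine the dummy event from the cover letter with the watermark settings named in the message.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <poll.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a dummy software event that logs only
 * context switch records and wakes pollers as soon as data is available.
 */
void init_switch_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;
	attr->disabled = 1;
	attr->context_switch = 1;	/* emit PERF_RECORD_SWITCH records */
	attr->watermark = 1;		/* interpret wakeup_watermark as bytes */
	attr->wakeup_watermark = 1;	/* wake up once any bytes are logged */
}

/* Hypothetical helper: open the event on the calling thread, any CPU. */
int open_switch_event(void)
{
	struct perf_event_attr attr;

	init_switch_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: block until the fd is readable or the timeout hits. */
int wait_for_records(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	return poll(&pfd, 1, timeout_ms);
}
```

To get wakeups, the fd would still have to be mmap()ed and enabled with PERF_EVENT_IOC_ENABLE, as in the cover-letter program. The thread calling `wait_for_records()` is the extra monitoring thread that the signal-based proposal is trying to avoid.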
* [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
@ 2017-06-06 14:51 Naveen N. Rao
2017-06-06 16:25 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 14:51 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa
Cc: linux-kernel

Many perf sideband events (context switches, namespaces, ...) are useful
by themselves without the need for subscribing to any overflow events.
However, it is not possible to subscribe for notifications when such
records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
a way to request this.

With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
after which to generate a notification, rather than the number of
overflow events.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 include/linux/perf_event.h      |  1 +
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 16 +++++++++++++++-
 kernel/events/ring_buffer.c     |  9 +++++++++
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 24a635887f28..016f2da2bba7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -683,6 +683,7 @@ struct perf_event {
 	struct irq_work			pending;

 	atomic_t			event_limit;
+	bool				count_records;

 	/* address range filters */
 	struct perf_addr_filters_head	addr_filters;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b1c0b187acfe..fb989ac71ded 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -408,6 +408,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
 #define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
+#define PERF_EVENT_IOC_COUNT_RECORDS	_IO ('$', 10)

 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6e75a5c9412d..637064880b36 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2674,6 +2674,16 @@ void perf_event_addr_filters_sync(struct perf_event *event)
 }
 EXPORT_SYMBOL_GPL(perf_event_addr_filters_sync);

+static int _perf_event_count_records(struct perf_event *event)
+{
+	if (event->attr.inherit || !is_sampling_event(event))
+		return -EINVAL;
+
+	event->count_records = 1;
+
+	return 0;
+}
+
 static int _perf_event_refresh(struct perf_event *event, int refresh)
 {
 	/*
@@ -4699,6 +4709,9 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 		func = _perf_event_reset;
 		break;

+	case PERF_EVENT_IOC_COUNT_RECORDS:
+		return _perf_event_count_records(event);
+
 	case PERF_EVENT_IOC_REFRESH:
 		return _perf_event_refresh(event, arg);

@@ -7342,7 +7355,8 @@ static int __perf_event_overflow(struct perf_event *event,
 	 */

 	event->pending_kill = POLL_IN;
-	if (events && atomic_dec_and_test(&event->event_limit)) {
+	if (events && !event->count_records &&
+			atomic_dec_and_test(&event->event_limit)) {
 		ret = 1;
 		event->pending_kill = POLL_HUP;

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 2831480c63a2..9b9ca0608fed 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -126,6 +126,7 @@ __perf_output_begin(struct perf_output_handle *handle,
 		u64			 id;
 		u64			 lost;
 	} lost_event;
+	int events = atomic_read(&event->event_limit);

 	rcu_read_lock();
 	/*
@@ -197,6 +198,14 @@ __perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
 		local_add(rb->watermark, &rb->wakeup);

+	if (events && event->count_records &&
+			atomic_dec_and_test(&event->event_limit)) {
+		event->pending_kill = POLL_HUP;
+		local_inc(&rb->wakeup);
+
+		perf_event_disable_inatomic(event);
+	}
+
 	page_shift = PAGE_SHIFT + page_order(rb);

 	handle->page = (offset >> page_shift) & (rb->nr_pages - 1);
--
2.12.2
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 14:51 Naveen N. Rao
@ 2017-06-06 16:25 ` Peter Zijlstra
2017-06-06 17:32 ` Naveen N. Rao
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-06-06 16:25 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On Tue, Jun 06, 2017 at 08:21:18PM +0530, Naveen N. Rao wrote:
> Many perf sideband events (context switches, namespaces, ...) are useful
> by themselves without the need for subscribing to any overflow events.
> However, it is not possible to subscribe for notifications when such
> records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> a way to request this.
>
> With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> after which to generate a notification, rather than the number of
> overflow events.
>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  include/linux/perf_event.h      |  1 +
>  include/uapi/linux/perf_event.h |  1 +
>  kernel/events/core.c            | 16 +++++++++++++++-
>  kernel/events/ring_buffer.c     |  9 +++++++++
>  4 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 24a635887f28..016f2da2bba7 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -683,6 +683,7 @@ struct perf_event {
>  	struct irq_work			pending;
>
>  	atomic_t			event_limit;
> +	bool				count_records;

This is an instant nack ;-) Never, that is _never_ use bool in composite
types.
* Re: [RFC PATCH 1/1] kernel/events: Introduce IOC_COUNT_RECORDS
2017-06-06 16:25 ` Peter Zijlstra
@ 2017-06-06 17:32 ` Naveen N. Rao
0 siblings, 0 replies; 9+ messages in thread
From: Naveen N. Rao @ 2017-06-06 17:32 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On 2017/06/06 06:25PM, Peter Zijlstra wrote:
> On Tue, Jun 06, 2017 at 08:21:18PM +0530, Naveen N. Rao wrote:
> > Many perf sideband events (context switches, namespaces, ...) are useful
> > by themselves without the need for subscribing to any overflow events.
> > However, it is not possible to subscribe for notifications when such
> > records are logged into the ring buffer. Introduce IOC_COUNT_RECORDS as
> > a way to request this.
> >
> > With IOC_COUNT_RECORDS set, IOC_REFRESH takes the number of records
> > after which to generate a notification, rather than the number of
> > overflow events.
> >
> > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> > ---
> >  include/linux/perf_event.h      |  1 +
> >  include/uapi/linux/perf_event.h |  1 +
> >  kernel/events/core.c            | 16 +++++++++++++++-
> >  kernel/events/ring_buffer.c     |  9 +++++++++
> >  4 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 24a635887f28..016f2da2bba7 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -683,6 +683,7 @@ struct perf_event {
> >  	struct irq_work			pending;
> >
> >  	atomic_t			event_limit;
> > +	bool				count_records;
>
> This is an instant nack ;-) Never, that is _never_ use bool in composite
> types.

Ouch! Sorry.
I did briefly consider labeling this 'EARLY RFC', but that still
wouldn't have done justice :/

Thanks for the review,
- Naveen