* [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
@ 2016-10-21 11:38 Wang Nan
[not found] ` <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Wang Nan @ 2016-10-21 11:38 UTC (permalink / raw)
To: mtk.manpages
Cc: wangnan0, pi3orama, linux-kernel, linux-man, lizefan,
vincent.weaver
Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
man2/perf_event_open.2 | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index fade28c..2d3acad 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -2865,7 +2865,18 @@ The argument is a BPF program file descriptor that was created by
 a previous
 .BR bpf (2)
 system call.
-.SS Using prctl(2)
+.TP
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
+.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
+This allows pausing and resuming the event's ring-buffer. A
+paused ring-buffer does not prevent samples generation, but simply
+discards them. The discarded samples are considered lost, causes
+.BR PERF_RECORD_LOST
+to be generated when possible.
+
+The argument is an integer. Nonzero value pauses the ring-buffer,
+zero value resumes the ring-buffer.
+.SS Using prctl
 A process can enable or disable all the event groups that are
 attached to it using the
 .BR prctl (2)
--
2.10.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
[parent not found: <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* [PATCH 2/2] perf_event_open.2: Document write_backward
[not found] ` <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2016-10-21 11:38 ` Wang Nan
2016-10-21 21:25 ` Vince Weaver
2016-10-21 21:16 ` [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Vince Weaver
2016-10-22 10:02 ` Michael Kerrisk (man-pages)
2 siblings, 1 reply; 8+ messages in thread
From: Wang Nan @ 2016-10-21 11:38 UTC (permalink / raw)
To: mtk.manpages
Cc: wangnan0, pi3orama, linux-kernel, linux-man, lizefan,
vincent.weaver
Linux 4.7 (9ecda41acb971ebd07c8fb35faf24005c0baea12) introduces
write_backward attribute to perf_event_attr. Document this feature.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
man2/perf_event_open.2 | 56 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 53 insertions(+), 3 deletions(-)
diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 2d3acad..e5fdfec 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -244,8 +244,8 @@ struct perf_event_attr {
                            due to exec */
           use_clockid : 1, /* use clockid for time fields */
           context_switch : 1, /* context switch data */
-
-          __reserved_1 : 37;
+          write_backward : 1, /* Write ring buffer from end to beginning */
+          __reserved_1 : 36;
 
     union {
         __u32 wakeup_events; /* wakeup every n events */
@@ -1127,6 +1127,29 @@ The advantage of this method is that it will give full
 information even with strict
 .I perf_event_paranoid
 settings.
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the resuling event use a backward ring-buffer, which
+writes samples from the end of the ring-buffer.
+
+It is not allowed to connect events with backward and forward
+ring-buffer settings together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+
+Backward ring-buffer is useful when the ring-buffer is overwritable
+(created by readonly
+.BR mmap (2)
+). In this case,
+.IR data_tail
+is useless,
+.IR data_head
+points to the head of the most recent sample in a backward
+ring-buffer. It is easy to iterate over the whole ring-buffer by reading
+samples one by one because size of a sample can be found from decoding
+its header. In contract, in a forward overwritable ring-buffer, the only
+information is the end of the most recent sample which is pointed by
+.IR data_head,
+but the size of a sample can't be determined from the end of it.
 .TP
 .IR "wakeup_events" ", " "wakeup_watermark"
 This union sets how many samples
@@ -1671,7 +1694,9 @@ And vice versa:
 .TP
 .I data_head
 This points to the head of the data section.
-The value continuously increases, it does not wrap.
+The value continuously increases (or decrease if
+.IR write_backward
+is set), it does not wrap.
 The value needs to be manually wrapped by the size of the mmap buffer
 before accessing the samples.
 
@@ -2727,6 +2752,24 @@ Starting with Linux 3.18,
 .B POLL_HUP
 is indicated if the event being monitored is attached to a different
 process and that process exits.
+.SS Reading from overwritable ring-buffer
+Reader is unable to update
+.IR data_tail
+if the mapping is not
+.BR PROT_WRITE .
+In this case, kernel will overwrite data without considering whether
+they are read or not, so ring-buffer is overwritable and
+behaves like a flight recorder. To read from an overwritable
+ring-buffer, setting
+.IR write_backward
+is suggested, or it would be hard to find a proper position to start
+decoding. In addition, ring-buffer should be paused before reading
+through
+.BR ioctl (2)
+with
+.B PERF_EVENT_IOC_PAUSE_OUTPUT
+to avoid racing between kernel and reader. Ring-buffer should be resumed
+after finish reading.
 .SS rdpmc instruction
 Starting with Linux 3.4 on x86, you can use the
 .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
@@ -2839,6 +2882,13 @@ The file descriptors must all be on the same CPU.
 
 The argument specifies the desired file descriptor, or \-1 if
 output should be ignored.
+
+Two events with different
+.IR write_backward
+settings are not allowed to be connected together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+.B EINVAL
+is returned in this case.
 .TP
 .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
 .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
--
2.10.1
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 8+ messages in thread
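The iteration that patch 2/2 describes (start at data_head, decode each header, step forward by its size) can be sketched as below. This is a minimal sketch under stated assumptions, not the perf tool's implementation: the function name `count_backward_records` is hypothetical, the buffer is assumed to be paused, the data area size is assumed to be a power of two, and a record that would run past one full buffer length is simply treated as partially overwritten and ignored.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: count the complete records in a paused,
 * backward-written ring buffer.
 *   data: start of the data area (not the metadata page)
 *   size: size of the data area in bytes, assumed to be a power of two
 *   head: the data_head value read from the metadata page
 */
static size_t count_backward_records(const unsigned char *data,
                                     uint64_t size, uint64_t head)
{
    uint64_t mask = size - 1;
    uint64_t pos = head;   /* data_head marks the most recent record */
    size_t count = 0;

    /* Unsigned arithmetic keeps pos - head valid across the 2^64 wrap. */
    while (pos - head < size) {
        struct perf_event_header hdr;

        memcpy(&hdr, data + (pos & mask), sizeof(hdr));

        /* A zero or malformed size means unwritten or damaged space. */
        if (hdr.size < sizeof(hdr) || (hdr.size & 7))
            break;
        /* The oldest record may have been partially overwritten. */
        if (pos - head + hdr.size > size)
            break;

        pos += hdr.size;   /* step from newer records toward older ones */
        count++;
    }
    return count;
}
```

The walk runs from the newest record toward the oldest, which works only because each record begins with a header that carries its size. In a forward overwritable buffer, data_head marks the end of the newest record and there is no header to decode at that position.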
* Re: [PATCH 2/2] perf_event_open.2: Document write_backward
2016-10-21 11:38 ` [PATCH 2/2] perf_event_open.2: Document write_backward Wang Nan
@ 2016-10-21 21:25 ` Vince Weaver
2016-10-22 10:05 ` Michael Kerrisk (man-pages)
0 siblings, 1 reply; 8+ messages in thread
From: Vince Weaver @ 2016-10-21 21:25 UTC (permalink / raw)
To: Wang Nan
Cc: mtk.manpages, pi3orama, linux-kernel, linux-man, lizefan,
vincent.weaver

On Fri, 21 Oct 2016, Wang Nan wrote:

> context_switch : 1, /* context switch data */
> -
> - __reserved_1 : 37;
> + write_backward : 1, /* Write ring buffer from end to beginning */
> + __reserved_1 : 36;

This removes a blank line, not sure if intentional or not.

> +.IR "write_backward" " (since Linux 4.6)"

It wasn't committed until Linux 4.7 from what I can tell?

> +This makes the resuling event use a backward ring-buffer, which

resulting

> +writes samples from the end of the ring-buffer.
> +
> +It is not allowed to connect events with backward and forward
> +ring-buffer settings together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +
> +Backward ring-buffer is useful when the ring-buffer is overwritable
> +(created by readonly
> +.BR mmap (2)
> +).

A ring buffer is over-writable when it is mmapped readonly?
Is this a hard requirement?
Can you set the read-backwards bit if not mapped readonly?

Otherwise the documentation seems reasonable.

Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] perf_event_open.2: Document write_backward
2016-10-21 21:25 ` Vince Weaver
@ 2016-10-22 10:05 ` Michael Kerrisk (man-pages)
2016-10-24 6:44 ` Wangnan (F)
0 siblings, 1 reply; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-10-22 10:05 UTC (permalink / raw)
To: Vince Weaver, Wang Nan
Cc: mtk.manpages, pi3orama, linux-kernel, linux-man, lizefan

On 10/21/2016 11:25 PM, Vince Weaver wrote:
> On Fri, 21 Oct 2016, Wang Nan wrote:
>
>> context_switch : 1, /* context switch data */
>> -
>> - __reserved_1 : 37;
>> + write_backward : 1, /* Write ring buffer from end to beginning */
>> + __reserved_1 : 36;
>
> This removes a blank line, not sure if intentional or not.

Maybe it would be better to keep it. I don't feel too strongly about
this though.

>> +.IR "write_backward" " (since Linux 4.6)"
>
> It wasn't committed until Linux 4.7 from what I can tell?

Yes, that's my recollection too.

>
>> +This makes the resuling event use a backward ring-buffer, which
> resulting
>
>> +writes samples from the end of the ring-buffer.
>> +
>> +It is not allowed to connect events with backward and forward
>> +ring-buffer settings together using
>> +.B PERF_EVENT_IOC_SET_OUTPUT.
>> +
>> +Backward ring-buffer is useful when the ring-buffer is overwritable
>> +(created by readonly
>> +.BR mmap (2)
>> +).
>
> A ring buffer is over-writable when it is mmapped readonly?
> Is this a hard requirement?
> Can you set the read-backwards bit if not mapped readonly?

Wang Nan, could you perhaps clarify this in the next version of the patch?

>
> Otherwise the documentation seems reasonable.
>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>

Thanks for reviewing both patches, Vince.

Wang Nan, please include the Reviewed-by: in the next patch iteration.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] perf_event_open.2: Document write_backward
2016-10-22 10:05 ` Michael Kerrisk (man-pages)
@ 2016-10-24 6:44 ` Wangnan (F)
0 siblings, 0 replies; 8+ messages in thread
From: Wangnan (F) @ 2016-10-24 6:44 UTC (permalink / raw)
To: Michael Kerrisk (man-pages), Vince Weaver
Cc: pi3orama, linux-kernel, linux-man, lizefan

On 2016/10/22 18:05, Michael Kerrisk (man-pages) wrote:
> On 10/21/2016 11:25 PM, Vince Weaver wrote:
>> On Fri, 21 Oct 2016, Wang Nan wrote:
>>
>>> context_switch : 1, /* context switch data */
>>> -
>>> - __reserved_1 : 37;
>>> + write_backward : 1, /* Write ring buffer from end to beginning */
>>> + __reserved_1 : 36;
>> This removes a blank line, not sure if intentional or not.
> Maybe it would be better to keep it. I don't feel too strongly about
> this though.
>
>>> +.IR "write_backward" " (since Linux 4.6)"
>> It wasn't committed until Linux 4.7 from what I can tell?
> Yes, that's my recollection too.
>
>>> +This makes the resuling event use a backward ring-buffer, which
>> resulting
>>
>>> +writes samples from the end of the ring-buffer.
>>> +
>>> +It is not allowed to connect events with backward and forward
>>> +ring-buffer settings together using
>>> +.B PERF_EVENT_IOC_SET_OUTPUT.
>>> +
>>> +Backward ring-buffer is useful when the ring-buffer is overwritable
>>> +(created by readonly
>>> +.BR mmap (2)
>>> +).
>> A ring buffer is over-writable when it is mmapped readonly?
>> Is this a hard requirement?

I'd like to explain over-writable ring buffer in patch 1/1 like this:

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index fade28c..561331c 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -1687,6 +1687,15 @@ the
 .I data_tail
 value should be written by user space to reflect the last read data.
 In this case, the kernel will not overwrite unread data.
+
+When the mapping is read only (without
+.BR PROT_WRITE ),
+setting .I data_tail is not allowed.
+In this case, the kernel will overwrite data when sample coming, unless
+the ring buffer is paused by a
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT
+.BR ioctl (2)
+system call before reading.
 .TP
 .IR data_offset " (since Linux 4.1)"
 .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f

The ring buffer becomes over-writable because there's no way to tell
the kernel the position of the last read data when it is mmapped read-only.

>> Can you set the read-backwards bit if not mapped readonly?

I don't understand why we need read-backwards. Mapping with PROT_WRITE
is the *default* setting. In this case a user program like perf is able
to tell the kernel its reading position by writing to 'data_tail'. The
kernel then won't overwrite unread data, and the data is read forward.

Or do you think the naming is confusing? The name 'write_backward' is
kernel-centric: it adjusts kernel behavior. The kernel *writes* data, so
I call it 'write_backward'. The name 'read-backwards' is user-centric,
because the user *reads* data.

Thank you.
^ permalink raw reply related [flat|nested] 8+ messages in thread
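To make the setup discussed above concrete, here is a minimal sketch of the two pieces on the user side: an attribute with write_backward set, and a mapping made without PROT_WRITE. The helper names `init_backward_attr` and `map_overwritable` are hypothetical, and the event type, sample period, and sample_type are arbitrary illustrative choices. The sketch assumes kernel headers recent enough to declare the write_backward bit. The perf_event_open call itself is omitted.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill attr for a sampling event whose ring buffer
 * is written backward. Event type and period are illustrative only.
 */
static void init_backward_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_CPU_CLOCK;
    attr->sample_period = 100000;
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID;
    attr->disabled = 1;
    attr->write_backward = 1;  /* kernel writes from the end of the buffer */
}

/*
 * Hypothetical helper: map the ring buffer of perf_fd read-only. Without
 * PROT_WRITE the reader cannot update data_tail, so the kernel overwrites
 * old data. data_pages must be a power of two; one extra page holds the
 * metadata. Returns NULL on failure.
 */
static void *map_overwritable(int perf_fd, size_t data_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *base = mmap(NULL, (1 + data_pages) * page, PROT_READ,
                      MAP_SHARED, perf_fd, 0);

    return base == MAP_FAILED ? NULL : base;
}
```

The mapping protection and the write_backward bit are set independently: the protection decides whether the buffer is overwritable, and the bit decides the direction in which the kernel writes.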
* Re: [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
[not found] ` <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2016-10-21 11:38 ` [PATCH 2/2] perf_event_open.2: Document write_backward Wang Nan
@ 2016-10-21 21:16 ` Vince Weaver
2016-10-22 10:00 ` Michael Kerrisk (man-pages)
2016-10-22 10:02 ` Michael Kerrisk (man-pages)
2 siblings, 1 reply; 8+ messages in thread
From: Vince Weaver @ 2016-10-21 21:16 UTC (permalink / raw)
To: Wang Nan
Cc: mtk.manpages, pi3orama, linux-kernel, linux-man, lizefan,
vincent.weaver

On Fri, 21 Oct 2016, Wang Nan wrote:

> -.SS Using prctl(2)
> +.SS Using prctl

why this change?

> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
> +.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
> +This allows pausing and resuming the event's ring-buffer. A
> +paused ring-buffer does not prevent samples generation, but simply
> +discards them. The discarded samples are considered lost, causes
> +.BR PERF_RECORD_LOST
> +to be generated when possible.

I don't know if it's worth mentioning that the reason to add this is to
allow reading the ring-buffer without having to worry about data being
overwritten.

There are a few odd wording choices (mostly plural nouns) but otherwise
looks fine to me.

Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
2016-10-21 21:16 ` [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Vince Weaver
@ 2016-10-22 10:00 ` Michael Kerrisk (man-pages)
0 siblings, 0 replies; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-10-22 10:00 UTC (permalink / raw)
To: Vince Weaver, Wang Nan
Cc: mtk.manpages, pi3orama, linux-kernel, linux-man, lizefan

On 10/21/2016 11:16 PM, Vince Weaver wrote:
> On Fri, 21 Oct 2016, Wang Nan wrote:
>
>
>> -.SS Using prctl(2)
>> +.SS Using prctl
>
> why this change?

I suspect a diff against a slightly stale version of the page, since
I added the '(2)' just a few days ago. Wang Nan, please do pull
the latest version of the page :-).

>> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
>> +.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
>> +This allows pausing and resuming the event's ring-buffer. A
>> +paused ring-buffer does not prevent samples generation, but simply
>> +discards them. The discarded samples are considered lost, causes
>> +.BR PERF_RECORD_LOST
>> +to be generated when possible.
>
> I don't know if it's worth mentioning that the reason to add this is to
> allow reading the ring-buffer without having to worry about data being
> overwritten.

Wang Nan, what do you think? Should this be added?

> There are a few odd wording choices (mostly plural nouns) but otherwise
> looks fine to me.
>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>

Wang Nan, I'll send a few wording corrections. Could you please include
Vince's Reviewed-by tag on your next revision?

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
[not found] ` <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2016-10-21 11:38 ` [PATCH 2/2] perf_event_open.2: Document write_backward Wang Nan
2016-10-21 21:16 ` [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Vince Weaver
@ 2016-10-22 10:02 ` Michael Kerrisk (man-pages)
2 siblings, 0 replies; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-10-22 10:02 UTC (permalink / raw)
To: Wang Nan
Cc: mtk.manpages, pi3orama, linux-kernel, linux-man, lizefan,
vincent.weaver

Hello Wang Nan

Thanks for this patch! A few comments below.

On 10/21/2016 01:38 PM, Wang Nan wrote:
> Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
> PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> ---
> man2/perf_event_open.2 | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index fade28c..2d3acad 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -2865,7 +2865,18 @@ The argument is a BPF program file descriptor that was created by
> a previous
> .BR bpf (2)
> system call.
> -.SS Using prctl(2)
> +.TP
> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
> +.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
> +This allows pausing and resuming the event's ring-buffer. A
> +paused ring-buffer does not prevent samples generation, but simply

s/samples generation/generation of samples/

> +discards them. The discarded samples are considered lost, causes

s/them/the samples/
s/causes/causing/

> +.BR PERF_RECORD_LOST
> +to be generated when possible.
> +
> +The argument is an integer. Nonzero value pauses the ring-buffer,

s/Nonzero/a nonzero/

> +zero value resumes the ring-buffer.

s/zero value/zero/

> +.SS Using prctl

As noted by Vince, the change to this SS line should not be part of this
patch.

> A process can enable or disable all the event groups that are
> attached to it using the
> .BR prctl (2)

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-10-24 6:44 UTC | newest]
Thread overview: 8+ messages
2016-10-21 11:38 [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Wang Nan
[not found] ` <1477049893-143199-1-git-send-email-wangnan0-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2016-10-21 11:38 ` [PATCH 2/2] perf_event_open.2: Document write_backward Wang Nan
2016-10-21 21:25 ` Vince Weaver
2016-10-22 10:05 ` Michael Kerrisk (man-pages)
2016-10-24 6:44 ` Wangnan (F)
2016-10-21 21:16 ` [PATCH 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Vince Weaver
2016-10-22 10:00 ` Michael Kerrisk (man-pages)
2016-10-22 10:02 ` Michael Kerrisk (man-pages)