* [PATCH v3][manpages 0/2] Document new feature in perf_event_open
From: Wang Nan @ 2016-10-24 6:52 UTC
To: mtk.manpages, vincent.weaver
Cc: pi3orama, linux-kernel, lizefan, linux-man, Wang Nan

Describe PERF_EVENT_IOC_PAUSE_OUTPUT and write_backward in man pages.

v2 -> v3:
  Correct words. Explain the relationship between readonly ring buffer
  and over-writable ring buffer in patch 1/2.

Wang Nan (2):
  perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
  perf_event_open.2: Document write_backward

 man2/perf_event_open.2 | 81 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 2 deletions(-)

--
2.10.1

* [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
From: Wang Nan @ 2016-10-24 6:52 UTC
To: mtk.manpages, vincent.weaver
Cc: pi3orama, linux-kernel, lizefan, linux-man, Wang Nan

Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
 man2/perf_event_open.2 | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index fade28c..561331c 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -1687,6 +1687,15 @@ the
 .I data_tail
 value should be written by user space to reflect the last read data.
 In this case, the kernel will not overwrite unread data.
+
+When the mapping is read only (without
+.BR PROT_WRITE ),
+setting .I data_tail is not allowed.
+In this case, the kernel will overwrite data when sample coming, unless
+the ring buffer is paused by a
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT
+.BR ioctl (2)
+system call before reading.
 .TP
 .IR data_offset " (since Linux 4.1)"
 .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f
@@ -2865,6 +2874,21 @@ The argument is a BPF program file descriptor that was created by
 a previous
 .BR bpf (2)
 system call.
+.TP
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
+.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
+This allows pausing and resuming the event's ring-buffer. A
+paused ring-buffer does not prevent generation of samples, but simply
+discards the samples. The discarded samples are considered lost,
+causing
+.BR PERF_RECORD_LOST
+to be generated when possible.
+
+The argument is an integer. A nonzero value pauses the ring-buffer,
+zero resumes the ring-buffer.
+
+Pausing a read only ring buffer before reading from it without having
+to worry about data being overwritten.
 .SS Using prctl(2)
 A process can enable or disable all the event groups that are
 attached to it using the
--
2.10.1

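Editorial sketch, not part of the patch: the pause/resume semantics documented above (a nonzero argument pauses the ring buffer, zero resumes it) could be exercised from C roughly as follows. The helper name perf_set_output_paused is hypothetical, and the code assumes kernel headers from Linux 4.7 or later so that PERF_EVENT_IOC_PAUSE_OUTPUT is defined.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper (not from the patch): pause or resume output to the
 * ring buffer of a perf event file descriptor.
 * paused != 0 pauses the ring buffer, paused == 0 resumes it.
 * Returns 0 on success or a negative errno value on failure.
 */
int perf_set_output_paused(int perf_fd, int paused)
{
    /* The kernel only looks at whether the argument is zero or nonzero. */
    unsigned long arg = paused ? 1UL : 0UL;

    if (ioctl(perf_fd, PERF_EVENT_IOC_PAUSE_OUTPUT, arg) == -1)
        return -errno;
    return 0;
}
```

A reader of a read-only mapping would presumably call perf_set_output_paused(fd, 1) before walking the buffer and perf_set_output_paused(fd, 0) afterwards. Samples generated in between are discarded and, per the patch text, may be reported through PERF_RECORD_LOST.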
* Re: [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
From: Michael Kerrisk (man-pages) @ 2016-11-09 13:26 UTC
To: Wang Nan, vincent.weaver
Cc: mtk.manpages, pi3orama, linux-kernel, lizefan, linux-man

Hello Wang Nan,

On 10/24/2016 08:52 AM, Wang Nan wrote:
> Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
> PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> ---
>  man2/perf_event_open.2 | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index fade28c..561331c 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -1687,6 +1687,15 @@ the
>  .I data_tail
>  value should be written by user space to reflect the last read data.
>  In this case, the kernel will not overwrite unread data.
> +
> +When the mapping is read only (without
> +.BR PROT_WRITE ),
> +setting .I data_tail is not allowed.

Missing line breaks in the preceding line.

> +In this case, the kernel will overwrite data when sample coming, unless

I find that last line hard to understand. s/sample coming/a sample
arrives/?

> +the ring buffer is paused by a
> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT
> +.BR ioctl (2)
> +system call before reading.
>  .TP
>  .IR data_offset " (since Linux 4.1)"
>  .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f
> @@ -2865,6 +2874,21 @@ The argument is a BPF program file descriptor that was created by
>  a previous
>  .BR bpf (2)
>  system call.
> +.TP
> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
> +.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
> +This allows pausing and resuming the event's ring-buffer. A
> +paused ring-buffer does not prevent generation of samples, but simply
> +discards the samples. The discarded samples are considered lost,
> +causing
> +.BR PERF_RECORD_LOST
> +to be generated when possible.
> +
> +The argument is an integer. A nonzero value pauses the ring-buffer,
> +zero resumes the ring-buffer.
> +
> +Pausing a read only ring buffer before reading from it without having
> +to worry about data being overwritten.

That last sentence seems incomplete. I can't understand what you
mean here.

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
From: Michael Kerrisk (man-opages) @ 2018-08-13 16:39 UTC
To: Wang Nan, vincent.weaver
Cc: mtk.manpages, pi3orama, linux-kernel, lizefan, linux-man

Hello Wangnan,

On 10/24/2016 08:52 AM, Wang Nan wrote:
> Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
> PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.

Just to confirm, I presume this patch has been superseded by the one
from Vince that I just applied.

Cheers,

Michael

> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> ---
>  man2/perf_event_open.2 | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index fade28c..561331c 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -1687,6 +1687,15 @@ the
>  .I data_tail
>  value should be written by user space to reflect the last read data.
>  In this case, the kernel will not overwrite unread data.
> +
> +When the mapping is read only (without
> +.BR PROT_WRITE ),
> +setting .I data_tail is not allowed.
> +In this case, the kernel will overwrite data when sample coming, unless
> +the ring buffer is paused by a
> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT
> +.BR ioctl (2)
> +system call before reading.
>  .TP
>  .IR data_offset " (since Linux 4.1)"
>  .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f
> @@ -2865,6 +2874,21 @@ The argument is a BPF program file descriptor that was created by
>  a previous
>  .BR bpf (2)
>  system call.
> +.TP
> +.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
> +.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
> +This allows pausing and resuming the event's ring-buffer. A
> +paused ring-buffer does not prevent generation of samples, but simply
> +discards the samples. The discarded samples are considered lost,
> +causing
> +.BR PERF_RECORD_LOST
> +to be generated when possible.
> +
> +The argument is an integer. A nonzero value pauses the ring-buffer,
> +zero resumes the ring-buffer.
> +
> +Pausing a read only ring buffer before reading from it without having
> +to worry about data being overwritten.
>  .SS Using prctl(2)
>  A process can enable or disable all the event groups that are
>  attached to it using the
>

* [PATCH v3][manpages 2/2] perf_event_open.2: Document write_backward
From: Wang Nan @ 2016-10-24 6:52 UTC
To: mtk.manpages, vincent.weaver
Cc: pi3orama, linux-kernel, lizefan, linux-man, Wang Nan

Linux 4.7 (9ecda41acb971ebd07c8fb35faf24005c0baea12) introduces write_backward
attribute to perf_event_attr. Document this feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
 man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 561331c..fccde79 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -245,7 +245,8 @@ struct perf_event_attr {
           use_clockid    : 1,   /* use clockid for time fields */
           context_switch : 1,   /* context switch data */
 
-          __reserved_1   : 37;
+          write_backward : 1,   /* Write ring buffer from end to beginning */
+          __reserved_1   : 36;
 
     union {
         __u32 wakeup_events;    /* wakeup every n events */
@@ -1127,6 +1128,31 @@ The advantage of this method is that it will give full
 information even with strict
 .I perf_event_paranoid
 settings.
+.IR "write_backward" " (since Linux 4.7)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the resuling event use a backward ring-buffer, which
+writes samples from the end of the ring-buffer to the beginning.
+
+It is not allowed to connect events with backward and forward
+ring-buffer settings together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+
+Backward ring-buffer is useful for ring-buffers created by readonly
+.BR mmap (2).
+In this case,
+.IR data_tail
+is useless (because user space programs are not allowed to write to it).
+.IR data_head
+points to the head of the most recent sample. In a backward
+ring-buffer, it is easy to iterate over the whole ring-buffer by reading
+samples one by one from
+.IR data_head
+because size of a sample can be found from decoding its header.
+
+For a forward read only ring-buffer in contract,
+.IR data_head
+points to the end of the most recent sample, but the size of a sample
+can't be determined from the end of it.
 .TP
 .IR "wakeup_events" ", " "wakeup_watermark"
 This union sets how many samples
@@ -1671,7 +1697,9 @@ And vice versa:
 .TP
 .I data_head
 This points to the head of the data section.
-The value continuously increases, it does not wrap.
+The value continuously increases (or decrease if
+.IR write_backward
+is set), it does not wrap.
 The value needs to be manually wrapped by the size of the mmap buffer
 before accessing the samples.
 
@@ -2736,6 +2764,24 @@ Starting with Linux 3.18,
 .B POLL_HUP
 is indicated if the event being monitored is attached to a different
 process and that process exits.
+.SS Reading from overwritable ring-buffer
+Reader is unable to update
+.IR data_tail
+if the mapping is not
+.BR PROT_WRITE .
+In this case, kernel will overwrite data without considering whether
+they are read or not, so ring-buffer is overwritable and
+behaves like a flight recorder. To read from an overwritable
+ring-buffer, setting
+.IR write_backward
+is suggested, or it would be hard to find a proper position to start
+decoding. In addition, ring-buffer should be paused before reading
+through
+.BR ioctl (2)
+with
+.B PERF_EVENT_IOC_PAUSE_OUTPUT
+to avoid racing between kernel and reader. Ring-buffer should be resumed
+after finish reading.
 .SS rdpmc instruction
 Starting with Linux 3.4 on x86, you can use the
 .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
@@ -2848,6 +2894,13 @@ The file descriptors must all be on the same CPU.
 
 The argument specifies the desired file descriptor, or \-1 if
 output should be ignored.
+
+Two events with different
+.IR write_backward
+settings are not allowed to be connected together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+.B EINVAL
+is returned in this case.
 .TP
 .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
 .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
--
2.10.1

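Editorial sketch, not part of the patch: the claim that a backward ring buffer is easy to iterate from data_head, because each record's size is found in its header, can be illustrated with a small walker over a snapshot of the data area. The function name count_backward_records and the local struct record_header are hypothetical; the struct mirrors the layout of struct perf_event_header. The sketch assumes the data area size is a power of two, that records are 8-byte aligned as the kernel writes them (so a header never straddles the end of the buffer), and that the caller has already paused output so the buffer is not changing underneath it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as struct perf_event_header in <linux/perf_event.h>. */
struct record_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;  /* size of the whole record in bytes, header included */
};

/*
 * Hypothetical helper: count the complete records in a backward ring buffer.
 * data      - start of the data area of the mmap'ed buffer
 * data_size - size of the data area in bytes (must be a power of two)
 * data_head - value read from the metadata page; in a backward buffer it
 *             is the (unwrapped) position of the newest record's header
 */
size_t count_backward_records(const unsigned char *data, uint64_t data_size,
                              uint64_t data_head)
{
    uint64_t mask = data_size - 1;
    uint64_t pos = data_head;
    size_t count = 0;

    assert(data_size != 0 && (data_size & mask) == 0);

    /* Walk from the newest record toward older ones, one header at a time. */
    while (pos - data_head < data_size) {
        struct record_header hdr;

        memcpy(&hdr, data + (pos & mask), sizeof(hdr));
        if (hdr.size < sizeof(hdr))
            break;  /* zero or malformed size: nothing written here yet */
        if (pos - data_head + hdr.size > data_size)
            break;  /* oldest record was partly overwritten by newer data */
        pos += hdr.size;
        count++;
    }
    return count;
}
```

The unsigned subtraction pos - data_head stays correct even though data_head decreases from zero and therefore looks like a very large 64-bit value. A real reader would decode each record where this sketch only counts it. In a forward buffer the same walk is not possible, because data_head marks the end of the newest record and nothing at that position gives the record's size.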
* Re: [PATCH v3][manpages 2/2] perf_event_open.2: Document write_backward
From: Michael Kerrisk (man-pages) @ 2016-11-09 13:30 UTC
To: Wang Nan, vincent.weaver
Cc: mtk.manpages, pi3orama, linux-kernel, lizefan, linux-man

Hello Wang Nan,

On 10/24/2016 08:52 AM, Wang Nan wrote:
> Linux 4.7 (9ecda41acb971ebd07c8fb35faf24005c0baea12) introduces write_backward
> attribute to perf_event_attr. Document this feature.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> ---
>  man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index 561331c..fccde79 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -245,7 +245,8 @@ struct perf_event_attr {
>            use_clockid    : 1,   /* use clockid for time fields */
>            context_switch : 1,   /* context switch data */
>
> -          __reserved_1   : 37;
> +          write_backward : 1,   /* Write ring buffer from end to beginning */
> +          __reserved_1   : 36;
>
>      union {
>          __u32 wakeup_events;    /* wakeup every n events */
> @@ -1127,6 +1128,31 @@ The advantage of this method is that it will give full
>  information even with strict
>  .I perf_event_paranoid
>  settings.
> +.IR "write_backward" " (since Linux 4.7)"
> +.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
> +This makes the resuling event use a backward ring-buffer, which

s/resuling/resulting/
s/This/Setting this bit/ ?

> +writes samples from the end of the ring-buffer to the beginning.
> +
> +It is not allowed to connect events with backward and forward
> +ring-buffer settings together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +
> +Backward ring-buffer is useful for ring-buffers created by readonly
> +.BR mmap (2).
> +In this case,
> +.IR data_tail
> +is useless (because user space programs are not allowed to write to it).
> +.IR data_head
> +points to the head of the most recent sample. In a backward
> +ring-buffer, it is easy to iterate over the whole ring-buffer by reading
> +samples one by one from
> +.IR data_head
> +because size of a sample can be found from decoding its header.
> +
> +For a forward read only ring-buffer in contract,

What does "in contract" here mean? This needs to be clarified.

> +.IR data_head
> +points to the end of the most recent sample, but the size of a sample
> +can't be determined from the end of it.
>  .TP
>  .IR "wakeup_events" ", " "wakeup_watermark"
>  This union sets how many samples
> @@ -1671,7 +1697,9 @@ And vice versa:
>  .TP
>  .I data_head
>  This points to the head of the data section.
> -The value continuously increases, it does not wrap.
> +The value continuously increases (or decrease if
> +.IR write_backward
> +is set), it does not wrap.
>  The value needs to be manually wrapped by the size of the mmap buffer
>  before accessing the samples.
>
> @@ -2736,6 +2764,24 @@ Starting with Linux 3.18,
>  .B POLL_HUP
>  is indicated if the event being monitored is attached to a different
>  process and that process exits.
> +.SS Reading from overwritable ring-buffer
> +Reader is unable to update
> +.IR data_tail
> +if the mapping is not
> +.BR PROT_WRITE .
> +In this case, kernel will overwrite data without considering whether
> +they are read or not, so ring-buffer is overwritable and
> +behaves like a flight recorder. To read from an overwritable
> +ring-buffer, setting
> +.IR write_backward
> +is suggested, or it would be hard to find a proper position to start
> +decoding. In addition, ring-buffer should be paused before reading
> +through
> +.BR ioctl (2)
> +with
> +.B PERF_EVENT_IOC_PAUSE_OUTPUT
> +to avoid racing between kernel and reader. Ring-buffer should be resumed
> +after finish reading.
>  .SS rdpmc instruction
>  Starting with Linux 3.4 on x86, you can use the
>  .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
> @@ -2848,6 +2894,13 @@ The file descriptors must all be on the same CPU.
>
>  The argument specifies the desired file descriptor, or \-1 if
>  output should be ignored.
> +
> +Two events with different
> +.IR write_backward
> +settings are not allowed to be connected together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +.B EINVAL
> +is returned in this case.
>  .TP
>  .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
>  .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

end of thread (newest message: 2018-08-13 16:39 UTC)

Thread overview: 6+ messages
2016-10-24  6:52 [PATCH v3][manpages 0/2] Document new feature in perf_event_open Wang Nan
2016-10-24  6:52 ` [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT Wang Nan
2016-11-09 13:26   ` Michael Kerrisk (man-pages)
2018-08-13 16:39   ` Michael Kerrisk (man-opages)
2016-10-24  6:52 ` [PATCH v3][manpages 2/2] perf_event_open.2: Document write_backward Wang Nan
2016-11-09 13:30   ` Michael Kerrisk (man-pages)