* [PATCH] perf_events: fix errors path in perf_output_begin()
@ 2010-05-17 10:46 Stephane Eranian
2010-05-17 11:04 ` John Kacur
2010-05-17 11:46 ` Frederic Weisbecker
0 siblings, 2 replies; 9+ messages in thread
From: Stephane Eranian @ 2010-05-17 10:46 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, mingo, paulus, davem, fweisbec, acme, perfmon2-devel,
eranian, eranian
In case the sampling buffer has no "payload" pages, nr_pages is 0.
The problem is that the error path in perf_output_begin() skips to
a label which assumes perf_output_lock() has been issued which is
not the case. That triggers a WARN_ON() is perf_output_unlock().
This patch fixes the problem by adding a new label and skipping
perf_task_unlock() in case data->nr_pages is 0.
Signed-off-by: Stephane Eranian <eranian@google.com>
--
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index a4fa381..95137b6 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3035,8 +3035,10 @@ int perf_output_begin(struct perf_output_handle *handle,
handle->nmi = nmi;
handle->sample = sample;
- if (!data->nr_pages)
- goto fail;
+ if (!data->nr_pages) {
+ atomic_inc(&data->lost);
+ goto out;
+ }
have_lost = atomic_read(&data->lost);
if (have_lost)
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 10:46 [PATCH] perf_events: fix errors path in perf_output_begin() Stephane Eranian
@ 2010-05-17 11:04 ` John Kacur
2010-05-17 11:57 ` Stephane Eranian
2010-05-17 11:46 ` Frederic Weisbecker
1 sibling, 1 reply; 9+ messages in thread
From: John Kacur @ 2010-05-17 11:04 UTC (permalink / raw)
To: Stephane Eranian
Cc: linux-kernel, peterz, mingo, paulus, davem, fweisbec, acme,
perfmon2-devel, eranian
On Mon, 17 May 2010, Stephane Eranian wrote:
> In case the sampling buffer has no "payload" pages, nr_pages is 0.
> The problem is that the error path in perf_output_begin() skips to
> a label which assumes perf_output_lock() has been issued which is
> not the case. That triggers a WARN_ON() is perf_output_unlock().
>
> This patch fixes the problem by adding a new label and skipping
> perf_task_unlock() in case data->nr_pages is 0.
It does? Your code looks good to me, but the description above is
a little flawed, since no new label is added.
Thanks
>
> Signed-off-by: Stephane Eranian <eranian@google.com>
>
> --
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index a4fa381..95137b6 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -3035,8 +3035,10 @@ int perf_output_begin(struct perf_output_handle *handle,
> handle->nmi = nmi;
> handle->sample = sample;
>
> - if (!data->nr_pages)
> - goto fail;
> + if (!data->nr_pages) {
> + atomic_inc(&data->lost);
> + goto out;
> + }
>
> have_lost = atomic_read(&data->lost);
> if (have_lost)
> --
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 11:04 ` John Kacur
@ 2010-05-17 11:57 ` Stephane Eranian
0 siblings, 0 replies; 9+ messages in thread
From: Stephane Eranian @ 2010-05-17 11:57 UTC (permalink / raw)
To: John Kacur
Cc: linux-kernel, peterz, mingo, paulus, davem, fweisbec, acme,
perfmon2-devel, eranian
On Mon, May 17, 2010 at 1:04 PM, John Kacur <jkacur@redhat.com> wrote:
>
>
> On Mon, 17 May 2010, Stephane Eranian wrote:
>
>> In case the sampling buffer has no "payload" pages, nr_pages is 0.
>> The problem is that the error path in perf_output_begin() skips to
>> a label which assumes perf_output_lock() has been issued which is
>> not the case. That triggers a WARN_ON() is perf_output_unlock().
>>
>> This patch fixes the problem by adding a new label and skipping
>> perf_task_unlock() in case data->nr_pages is 0.
>
> It does? Your code looks good to me, but the description above is
> a little flawed, since no new label is added.
>
Sorry about that. I had written the text before the final version of the patch!
I will resubmit.
>> --
>> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
>> index a4fa381..95137b6 100644
>> --- a/kernel/perf_event.c
>> +++ b/kernel/perf_event.c
>> @@ -3035,8 +3035,10 @@ int perf_output_begin(struct perf_output_handle *handle,
>> handle->nmi = nmi;
>> handle->sample = sample;
>>
>> - if (!data->nr_pages)
>> - goto fail;
>> + if (!data->nr_pages) {
>> + atomic_inc(&data->lost);
>> + goto out;
>> + }
>>
>> have_lost = atomic_read(&data->lost);
>> if (have_lost)
>> --
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 10:46 [PATCH] perf_events: fix errors path in perf_output_begin() Stephane Eranian
2010-05-17 11:04 ` John Kacur
@ 2010-05-17 11:46 ` Frederic Weisbecker
2010-05-17 11:56 ` Stephane Eranian
1 sibling, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2010-05-17 11:46 UTC (permalink / raw)
To: Stephane Eranian
Cc: linux-kernel, peterz, mingo, paulus, davem, acme, perfmon2-devel,
eranian
On Mon, May 17, 2010 at 12:46:01PM +0200, Stephane Eranian wrote:
> In case the sampling buffer has no "payload" pages, nr_pages is 0.
> The problem is that the error path in perf_output_begin() skips to
> a label which assumes perf_output_lock() has been issued which is
> not the case. That triggers a WARN_ON() is perf_output_unlock().
>
> This patch fixes the problem by adding a new label and skipping
> perf_task_unlock() in case data->nr_pages is 0.
>
> Signed-off-by: Stephane Eranian <eranian@google.com>
>
> --
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index a4fa381..95137b6 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -3035,8 +3035,10 @@ int perf_output_begin(struct perf_output_handle *handle,
> handle->nmi = nmi;
> handle->sample = sample;
>
> - if (!data->nr_pages)
> - goto fail;
> + if (!data->nr_pages) {
> + atomic_inc(&data->lost);
> + goto out;
> + }
Oh indeed, handle->lock is in a random state.
Whatever its value we have an unbalanced put_cpu()
anyway.
And we don't race with someone else, data->lock = -1
and will then warn.
I just have a tiny doubt: should we really count this
path to the lost events? I'm not sure when we can have
data->nr_pages == 0, does this happen if we mmap after
enabling the event?
All I know is that I observed I already lost events in
this path using perf lock.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 11:46 ` Frederic Weisbecker
@ 2010-05-17 11:56 ` Stephane Eranian
2010-05-17 12:03 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Stephane Eranian @ 2010-05-17 11:56 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, peterz, mingo, paulus, davem, acme, perfmon2-devel,
eranian
On Mon, May 17, 2010 at 1:46 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Mon, May 17, 2010 at 12:46:01PM +0200, Stephane Eranian wrote:
>> In case the sampling buffer has no "payload" pages, nr_pages is 0.
>> The problem is that the error path in perf_output_begin() skips to
>> a label which assumes perf_output_lock() has been issued which is
>> not the case. That triggers a WARN_ON() is perf_output_unlock().
>>
>> This patch fixes the problem by adding a new label and skipping
>> perf_task_unlock() in case data->nr_pages is 0.
>>
>> Signed-off-by: Stephane Eranian <eranian@google.com>
>>
>> --
>> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
>> index a4fa381..95137b6 100644
>> --- a/kernel/perf_event.c
>> +++ b/kernel/perf_event.c
>> @@ -3035,8 +3035,10 @@ int perf_output_begin(struct perf_output_handle *handle,
>> handle->nmi = nmi;
>> handle->sample = sample;
>>
>> - if (!data->nr_pages)
>> - goto fail;
>> + if (!data->nr_pages) {
>> + atomic_inc(&data->lost);
>> + goto out;
>> + }
>
>
>
> Oh indeed, handle->lock is in a random state.
> Whatever its value we have an unbalanced put_cpu()
> anyway.
>
Yes, I caught this problem while debugging the perf record -c
issue.
> And we don't race with someone else, data->lock = -1
> and will then warn.
>
> I just have a tiny doubt: should we really count this
> path to the lost events? I'm not sure when we can have
> data->nr_pages == 0, does this happen if we mmap after
> enabling the event?
>
Well, nr_pages = 0 means all you have is the sampling buffer
header page. You cannot save any sample, so you actually
loose everything. Perf reports a dozen samples but those
are synthetic. The kernel cannot reject an mmap() for
just one page because of the remapped counters, thus it
has to handle this gracefully and at least report the lost
samples.
> All I know is that I observed I already lost events in
> this path using perf lock.
>
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 11:56 ` Stephane Eranian
@ 2010-05-17 12:03 ` Peter Zijlstra
2010-05-17 12:04 ` Stephane Eranian
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-17 12:03 UTC (permalink / raw)
To: Stephane Eranian
Cc: Frederic Weisbecker, linux-kernel, mingo, paulus, davem, acme,
perfmon2-devel, eranian
On Mon, 2010-05-17 at 13:56 +0200, Stephane Eranian wrote:
> >> + if (!data->nr_pages) {
> >> + atomic_inc(&data->lost);
> >> + goto out;
> >> + }
> Well, nr_pages = 0 means all you have is the sampling buffer
> header page. You cannot save any sample, so you actually
> loose everything. Perf reports a dozen samples but those
> are synthetic. The kernel cannot reject an mmap() for
> just one page because of the remapped counters, thus it
> has to handle this gracefully and at least report the lost
> samples.
So you want to preserve this state for when you munmap() and mmap()
again? The only user of data->lost is writing the PERF_RECORD_LOST
event, which only ever happens when you have pages, so counting it when
there's no pages seems futile.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 12:03 ` Peter Zijlstra
@ 2010-05-17 12:04 ` Stephane Eranian
2010-05-17 12:09 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Stephane Eranian @ 2010-05-17 12:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Frederic Weisbecker, linux-kernel, mingo, paulus, davem, acme,
perfmon2-devel, eranian
On Mon, May 17, 2010 at 2:03 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2010-05-17 at 13:56 +0200, Stephane Eranian wrote:
>> >> + if (!data->nr_pages) {
>> >> + atomic_inc(&data->lost);
>> >> + goto out;
>> >> + }
>
>
>> Well, nr_pages = 0 means all you have is the sampling buffer
>> header page. You cannot save any sample, so you actually
>> loose everything. Perf reports a dozen samples but those
>> are synthetic. The kernel cannot reject an mmap() for
>> just one page because of the remapped counters, thus it
>> has to handle this gracefully and at least report the lost
>> samples.
>
> So you want to preserve this state for when you munmap() and mmap()
> again? The only user of data->lost is writing the PERF_RECORD_LOST
> event, which only ever happens when you have pages, so counting it when
> there's no pages seems futile.
You're right. But couldn't you report lost samples in the buffer header as well?
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 12:04 ` Stephane Eranian
@ 2010-05-17 12:09 ` Peter Zijlstra
2010-05-17 12:23 ` Stephane Eranian
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-17 12:09 UTC (permalink / raw)
To: Stephane Eranian
Cc: Frederic Weisbecker, linux-kernel, mingo, paulus, davem, acme,
perfmon2-devel, eranian
On Mon, 2010-05-17 at 14:04 +0200, Stephane Eranian wrote:
>
> > So you want to preserve this state for when you munmap() and mmap()
> > again? The only user of data->lost is writing the PERF_RECORD_LOST
> > event, which only ever happens when you have pages, so counting it when
> > there's no pages seems futile.
>
> You're right. But couldn't you report lost samples in the buffer header as well?
I guess you could, but what's the point? If you ask for sampling but
don't provide a buffer you don't get anything -- how's that a surprise?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf_events: fix errors path in perf_output_begin()
2010-05-17 12:09 ` Peter Zijlstra
@ 2010-05-17 12:23 ` Stephane Eranian
0 siblings, 0 replies; 9+ messages in thread
From: Stephane Eranian @ 2010-05-17 12:23 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Frederic Weisbecker, linux-kernel, mingo, paulus, davem, acme,
perfmon2-devel, eranian
On Mon, May 17, 2010 at 2:09 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2010-05-17 at 14:04 +0200, Stephane Eranian wrote:
>>
>> > So you want to preserve this state for when you munmap() and mmap()
>> > again? The only user of data->lost is writing the PERF_RECORD_LOST
>> > event, which only ever happens when you have pages, so counting it when
>> > there's no pages seems futile.
>>
>> You're right. But couldn't you report lost samples in the buffer header as well?
>
> I guess you could, but what's the point? If you ask for sampling but
> don't provide a buffer you don't get anything -- how's that a surprise?
>
Fine. I'll update the patch to not account for data->lost.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-05-17 12:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-17 10:46 [PATCH] perf_events: fix errors path in perf_output_begin() Stephane Eranian
2010-05-17 11:04 ` John Kacur
2010-05-17 11:57 ` Stephane Eranian
2010-05-17 11:46 ` Frederic Weisbecker
2010-05-17 11:56 ` Stephane Eranian
2010-05-17 12:03 ` Peter Zijlstra
2010-05-17 12:04 ` Stephane Eranian
2010-05-17 12:09 ` Peter Zijlstra
2010-05-17 12:23 ` Stephane Eranian
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox