public inbox for linux-kernel@vger.kernel.org
* [PATCH] perf_events: fix read() bogus counts when in error state
@ 2009-11-26 17:24 Stephane Eranian
  2009-11-26 17:36 ` Peter Zijlstra
  2009-11-26 17:51 ` [tip:perf/core] perf_events: Fix " tip-bot for Stephane Eranian
  0 siblings, 2 replies; 4+ messages in thread
From: Stephane Eranian @ 2009-11-26 17:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, paulus, perfmon2-devel, eranian, eranian

	When a pinned group cannot be scheduled it goes into error state.
	Normally a group cannot go out of error state without being explicitly
	re-enabled or disabled. There was a bug in per-thread mode, whereby
	upon termination of the thread, the group would transition from error
	to off leading to bogus counts and timing information returned by
	read().

	It is important to realize that the current perf_events implementation
	gives system-wide events higher priority than per-thread events, even
	when the per-thread events are pinned. It is not clear to me whether
	this is by design of the API or just a side effect of the
	implementation. I believe it is desirable that a system-wide tool gets
	priority access to the PMU, but this causes issues with per-thread
	events, especially when they request pinning.

	A per-thread pinned event can stay evicted until system-wide events
	free enough PMU resources. With this patch it is now possible to
	detect this when counting, but it remains unclear how the situation
	could be detected when sampling, where it creates potentially large
	blind spots and thus bias, degrading the quality of the data
	collected.

	The API is missing a clear definition of what it means to be pinned
	for a per-thread event vs. a system-wide event. Likewise, it does not
	clearly state that system-wide events have higher priority than
	per-thread events.

	Signed-off-by: Stephane Eranian <eranian@google.com>
	
---
 perf_event.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0b0d5f7..7a8bb5b 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -333,7 +333,16 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 		event->group_leader->nr_siblings--;
 
 	update_event_times(event);
-	event->state = PERF_EVENT_STATE_OFF;
+
+	/*
+	 * If event was in error state, then keep it
+	 * that way, otherwise bogus counts will be
+	 * returned on read(). The only way to get out
+	 * of error state is by explicit re-enabling
+	 * of the event
+	 */
+	if (event->state > PERF_EVENT_STATE_OFF)
+		event->state = PERF_EVENT_STATE_OFF;
 
 	/*
 	 * If this was a group event with sibling events then
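
For readers following the comparison in the hunk above, here is a minimal
standalone sketch of why `event->state > PERF_EVENT_STATE_OFF` leaves the
error state untouched. It is not kernel code. It assumes the state enum
orders the error state below the off state (the numeric values shown are
an assumption mirroring the kernel header of that era), and the helper
names state_after_list_del() and check_error_state_is_sticky() are
hypothetical, introduced only for illustration.

```c
#include <assert.h>

/*
 * Assumption: these values mirror the kernel's perf event state enum of
 * that era, where the error state sorts below the off state.
 */
enum perf_event_active_state {
	PERF_EVENT_STATE_ERROR    = -2,
	PERF_EVENT_STATE_OFF      = -1,
	PERF_EVENT_STATE_INACTIVE =  0,
	PERF_EVENT_STATE_ACTIVE   =  1,
};

/* Hypothetical helper: the state an event is left in by list_del_event(). */
static enum perf_event_active_state
state_after_list_del(enum perf_event_active_state state)
{
	/* Only states above OFF are demoted; ERROR and OFF are left alone. */
	if (state > PERF_EVENT_STATE_OFF)
		state = PERF_EVENT_STATE_OFF;
	return state;
}

/* Hypothetical self-check of the behaviour described in the patch comment. */
static void check_error_state_is_sticky(void)
{
	assert(state_after_list_del(PERF_EVENT_STATE_ERROR) == PERF_EVENT_STATE_ERROR);
	assert(state_after_list_del(PERF_EVENT_STATE_ACTIVE) == PERF_EVENT_STATE_OFF);
}
```

With the old unconditional assignment, the first assertion would fail:
an event in error state would be reported as off once the thread exits,
which is the transition the patch description blames for the bogus
counts.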


* Re: [PATCH] perf_events: fix read() bogus counts when in error state
  2009-11-26 17:24 [PATCH] perf_events: fix read() bogus counts when in error state Stephane Eranian
@ 2009-11-26 17:36 ` Peter Zijlstra
  2009-11-26 17:48   ` Stephane Eranian
  2009-11-26 17:51 ` [tip:perf/core] perf_events: Fix " tip-bot for Stephane Eranian
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2009-11-26 17:36 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, mingo, paulus, perfmon2-devel, eranian

On Thu, 2009-11-26 at 09:24 -0800, Stephane Eranian wrote:
> 	When a pinned group cannot be scheduled it goes into error state.
> 	Normally a group cannot go out of error state without being explicitly
> 	re-enabled or disabled. There was a bug in per-thread mode, whereby
> 	upon termination of the thread, the group would transition from error
> 	to off leading to bogus counts and timing information returned by
> 	read().

> 	Signed-off-by: Stephane Eranian <eranian@google.com>

Right, good catch, totally forgot about error state :/

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
	
> ---
>  perf_event.c |   11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 0b0d5f7..7a8bb5b 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -333,7 +333,16 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
>  		event->group_leader->nr_siblings--;
>  
>  	update_event_times(event);
> -	event->state = PERF_EVENT_STATE_OFF;
> +
> +	/*
> +	 * If event was in error state, then keep it
> +	 * that way, otherwise bogus counts will be
> +	 * returned on read(). The only way to get out
> +	 * of error state is by explicit re-enabling
> +	 * of the event
> +	 */
> +	if (event->state > PERF_EVENT_STATE_OFF)
> +		event->state = PERF_EVENT_STATE_OFF;
>  
>  	/*
>  	 * If this was a group event with sibling events then




* Re: [PATCH] perf_events: fix read() bogus counts when in error state
  2009-11-26 17:36 ` Peter Zijlstra
@ 2009-11-26 17:48   ` Stephane Eranian
  0 siblings, 0 replies; 4+ messages in thread
From: Stephane Eranian @ 2009-11-26 17:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mingo, paulus, perfmon2-devel, eranian

On Thu, Nov 26, 2009 at 6:36 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2009-11-26 at 09:24 -0800, Stephane Eranian wrote:
>>       When a pinned group cannot be scheduled it goes into error state.
>>       Normally a group cannot go out of error state without being explicitly
>>       re-enabled or disabled. There was a bug in per-thread mode, whereby
>>       upon termination of the thread, the group would transition from error
>>       to off leading to bogus counts and timing information returned by
>>       read().
>
>>       Signed-off-by: Stephane Eranian <eranian@google.com>
>
> Right, good catch, totally forgot about error state :/
>
You need to clarify what pinning actually means for per-thread events,
and I'd like to see an explanation of what happens when sampling.


* [tip:perf/core] perf_events: Fix read() bogus counts when in error state
  2009-11-26 17:24 [PATCH] perf_events: fix read() bogus counts when in error state Stephane Eranian
  2009-11-26 17:36 ` Peter Zijlstra
@ 2009-11-26 17:51 ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Stephane Eranian @ 2009-11-26 17:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, eranian, hpa, mingo, a.p.zijlstra, tglx,
	mingo

Commit-ID:  b2e74a265ded1a185f762ebaab967e9e0d008dd8
Gitweb:     http://git.kernel.org/tip/b2e74a265ded1a185f762ebaab967e9e0d008dd8
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Thu, 26 Nov 2009 09:24:30 -0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 26 Nov 2009 18:49:59 +0100

perf_events: Fix read() bogus counts when in error state

When a pinned group cannot be scheduled it goes into error state.

Normally a group cannot go out of error state without being
explicitly re-enabled or disabled. There was a bug in per-thread
mode, whereby upon termination of the thread, the group would
transition from error to off leading to bogus counts and timing
information returned by read().

Fix it by preserving the error state when the event is removed from
its context.

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: perfmon2-devel@lists.sourceforge.net
LKML-Reference: <4b0eb9ce.0508d00a.573b.ffffeab6@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/perf_event.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index f8c7939..0b9ca2d 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -338,7 +338,16 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 		event->group_leader->nr_siblings--;
 
 	update_event_times(event);
-	event->state = PERF_EVENT_STATE_OFF;
+
+	/*
+	 * If event was in error state, then keep it
+	 * that way, otherwise bogus counts will be
+	 * returned on read(). The only way to get out
+	 * of error state is by explicit re-enabling
+	 * of the event
+	 */
+	if (event->state > PERF_EVENT_STATE_OFF)
+		event->state = PERF_EVENT_STATE_OFF;
 
 	/*
 	 * If this was a group event with sibling events then

