Re: [BUG] perf_events: PERF_FORMAT_GROUP not working correctly when monitoring another task

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Corey Ashford <cjashfor@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Carl Love <carll@us.ibm.com>, Maynard Johnson <mpjohn@us.ibm.com>,
	stephane eranian <eranian@googlemail.com>
Subject: Re: [BUG] perf_events: PERF_FORMAT_GROUP not working correctly when monitoring another task
Date: Thu, 06 May 2010 13:09:28 -0700	[thread overview]
Message-ID: <4BE321F8.1050309@linux.vnet.ibm.com> (raw)
In-Reply-To: <1273160566.5605.404.camel@twins>

On 05/06/2010 08:42 AM, Peter Zijlstra wrote:
> On Mon, 2010-05-03 at 19:06 -0700, Corey Ashford wrote:
>> In the last couple of days, I've run across what appears to be a
>> kernel bug in 2.6.33.3 (haven't tested later kernels yet) having to do
>> with using the PERF_FORMAT_GROUP feature in combination with
>> enable_on_exec and reading counts from a remote task.
>>
>> What happens is that when we go to read the entire group up, only the
>> first counter in can be accessed; the read() call returns too few
>> bytes.  This problem doesn't occur if measuring the from the same
>> task.
>>
>> I have attached a "cut down", though it's not terribly small.  It is a
>> cut down from the "task" example program in libpfm4/perf_examples.  In
>> addition to trimming the program down a lot, I've removed the
>> dependency on libpfm4 and made modifications so that it will compile
>> in the tools/perf subdirectory.  If you copy the attachment into your
>> tools/perf subdir, you should be able to compile it with just:
>>
>> gcc -o show_fg_bug show_fg_bug.c
>>
>> Then invoke it by passing it an executable that will give it something
>> to chew on a little, e.g.:
>>
>> ./show_fg_bug md5sum `which gdb`
>>
>> The test cases creates two counters and places them in the same group,
>> and sets the PERF_FORMAT_GROUP option on the first counter.  It
>> fork/execs the child and when the child is done executing, it reads
>> back the counter values.
>>
>> When I run it, I see this output:
>>
>> % ./show_fg_bug md5sum `which gdb`
>> 825b15d7279ef21d6c9d018d775758ae  /usr/bin/gdb
>> Error! tried to read 40 bytes, but got 32
>>              58684138 PERF_COUNT_HW_CPU_CYCLES (35469840 : 35469840)
>>                     0 PERF_COUNT_HW_INSTRUCTIONS (35469840 : 35469840)
>>
>> Oddly enough, if you look at the "nr" (number of counters) value that
>> gets read up as part of the group, it is two, but it can only read the
>> first of the two counters.  Another data point is that it doesn't
>> matter how many counters you add to the group, only the first can be
>> read up.
>>
>> Please let me know if you have any questions about this.
>
> Looks like you hit the exact same bug Stephane did:
>    http://lkml.org/lkml/2010/4/9/142
>
> The below patch seems to cure it for me.
>
> # gcc -o show_fg_bug show_fg_bug.c
> # ./show_fg_bug md5sum `which gdb`
> bc88d978c10446689326245529e6c4c1  /usr/bin/gdb
>              29721534 PERF_COUNT_HW_CPU_CYCLES (18657335 : 18657335)
>              18681747 PERF_COUNT_HW_INSTRUCTIONS (18657335 : 18657335)
>
>
> ---
> Subject: perf: Fix exit() vs PERF_FORMAT_GROUP
> From: Peter Zijlstra<a.p.zijlstra@chello.nl>
> Date: Thu May 06 17:31:38 CEST 2010
>
> Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't work as
> expected if the task the counters were attached to quit before the
> read() call.
>
> The cause is that we unconditionally destroy the grouping when we
> remove counters from their context. Fix this by only doing this when
> we free the counter itself.
>
> Reported-by: Corey Ashford<cjashfor@linux.vnet.ibm.com>
> Reported-by: Stephane Eranian<eranian@google.com>
> Signed-off-by: Peter Zijlstra<a.p.zijlstra@chello.nl>
> ---
> Index: linux-2.6/include/linux/perf_event.h
> ===================================================================
> --- linux-2.6.orig/include/linux/perf_event.h
> +++ linux-2.6/include/linux/perf_event.h
> @@ -548,6 +548,7 @@ struct pmu {
>    * enum perf_event_active_state - the states of a event
>    */
>   enum perf_event_active_state {
> +	PERF_EVENT_STATE_FREE		= -3,
>   	PERF_EVENT_STATE_ERROR		= -2,
>   	PERF_EVENT_STATE_OFF		= -1,
>   	PERF_EVENT_STATE_INACTIVE	=  0,
> Index: linux-2.6/kernel/perf_event.c
> ===================================================================
> --- linux-2.6.orig/kernel/perf_event.c
> +++ linux-2.6/kernel/perf_event.c
> @@ -342,6 +342,9 @@ list_del_event(struct perf_event *event,
>   	if (event->state>  PERF_EVENT_STATE_OFF)
>   		event->state = PERF_EVENT_STATE_OFF;
>
> +	if (event->state>  PERF_EVENT_STATE_FREE)
> +		return;
> +
>   	/*
>   	 * If this was a group event with sibling events then
>   	 * upgrade the siblings to singleton events by adding them
> @@ -1861,6 +1864,8 @@ int perf_event_release_kernel(struct per
>   {
>   	struct perf_event_context *ctx = event->ctx;
>
> +	event->state = PERF_EVENT_STATE_FREE;
> +
>   	WARN_ON_ONCE(ctx->parent_ctx);
>   	mutex_lock(&ctx->mutex);
>   	perf_event_remove_from_context(event);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

I applied your patch to 2.6.33.3.  It applied with a few offsets, but no 
errors:
% patch -p1 < ~/pz_patch.diff
patching file include/linux/perf_event.h
Hunk #1 succeeded at 517 (offset -31 lines).
patching file kernel/perf_event.c
Hunk #1 succeeded at 349 (offset 7 lines).
Hunk #2 succeeded at 1777 (offset -87 lines).

The patch works with the test case I provided, but for some reason it 
breaks the normal operation of the libpfm4 "task" utility.  If I put 
more than one event in a group, I get zero counts on all but the first 
event.  That's even if I don't use the PERF_FORMAT_GROUP option.

So something appears to be messed up.  I will see if I can construct an 
arch-independent test case which demonstrates the problem.

-- 
Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
cjashfor@us.ibm.com

next prev parent reply	other threads:[~2010-05-06 20:09 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-04  2:06 [BUG] perf_events: PERF_FORMAT_GROUP not working correctly when monitoring another task Corey Ashford
2010-05-06 15:42 ` Peter Zijlstra
2010-05-06 20:09   ` Corey Ashford [this message]
2010-05-06 20:22     ` Peter Zijlstra
2010-05-07  2:40       ` Corey Ashford
2010-05-07 18:41   ` [tip:perf/core] perf: Fix exit() vs PERF_FORMAT_GROUP tip-bot for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BE321F8.1050309@linux.vnet.ibm.com \
    --to=cjashfor@linux.vnet.ibm.com \
    --cc=acme@redhat.com \
    --cc=carll@us.ibm.com \
    --cc=eranian@googlemail.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpjohn@us.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox