public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ian Munsie <imunsie@au1.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Ingo Molnar <mingo@elte.hu>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Mike Galbraith <efault@gmx.de>, Paul Mackerras <paulus@samba.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 3/3] perf record/report: Process events in order
Date: Tue, 07 Dec 2010 12:15:45 +1100	[thread overview]
Message-ID: <1291679635-sup-9860@au1.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.00.1012061447080.2653@localhost6.localdomain6>

Hi Thomas,

Excerpts from Thomas Gleixner's message of Tue Dec 07 01:41:22 +1100 2010:
> Does the patch below on top of my previous one work for you ?

Not quite. It does work on a patched kernel, but has some problems on
unpatched kernels and issues with perf report -D.

I set ordered_samples and ordering_requires_timestamps in perf report to
test this (I've included the changes below). Since you don't move the
dump_printfs like my patches, we can't set those flags unconditionally
in perf report, otherwise perf report -D would be completely broken.

That's OK, we can just set it conditionally on !dump_trace and report -D
will then always display events in file order. Personally I think it's
handy to see which order events are being processed in to track down why
samples aren't being attributed, but that's mostly important during
debugging the timestamp ordering, and perhaps not as useful otherwise.


>      case PERF_RECORD_HEADER_ATTR:
> -        return ops->attr(event, session);
> +        /* This updates session->sample_id_all */
> +        ret = ops->attr(event, session);
> +        if (ops->ordering_requires_timestamps &&
> +            ops->ordered_samples && !session->sample_id_all) {
> +            session->ordered_samples.next_flush = ULLONG_MAX;
> +            flush_sample_queue(session, ops);
> +            ops->ordered_samples = false;
> +        }

Makes sense. I did something similar in the report layer that I was
about to send when I saw this email, but this way we have a generic
solution for other parts of perf that might want this.
The problem here is that we only get the PERF_RECORD_HEADER_ATTR if perf
record has been piped somewhere, so running perf record <load>; perf
report on an unpatched kernel results in the COMM, MMAP, etc events
being processed last (timestamp -1ULL) and no userspace samples are
attributed at all:

$ ./perf report|cat
# Events: 1K cycles
#
# Overhead  Command  Shared Object                 Symbol
# ........  .......  .............  .....................
#
    99.42%  cachetest  [unknown]      [.] 0xf75e0ccb
     0.45%  cachetest  [kernel]       [k] page_fault
     0.06%  cachetest  [kernel]       [k] hpet_next_event
     0.06%  cachetest  [kernel]       [k] run_timer_softirq
     0.01%  cachetest  [kernel]       [k] flush_signal_handlers
     0.00%  cachetest  [kernel]       [k] native_write_msr_safe

> +    /*
> +     * For all non synthetized events we get the sample data

s/synthetized/synthesized/

> +    event__parse_sample(event, session, &sample);
> +    if (dump_trace)
> +        perf_session__print_tstamp(session, event, &sample);

Moving this here after the dump_printf("%#Lx [%#x]: PERF_RECORD_%s"...
changes the output of perf report -D in a bad way. Changing the spacing
in dump_printf can make up for it, or juggle the code around some more.


How do you want to proceed? At this point either version of the patches
are pretty functionally equivelant. Mine does the perf report -D
reordering as well, but that isn't really necessary to solve the bug.
Either way we only have a few minor fixups left.


If we want to use yours, here's my v3 patch respun on top of them to
enable ordering by timestamps in record & report:

>From 581eb3fbad176a83d1830a792174ff5087015897 Mon Sep 17 00:00:00 2001
From: Ian Munsie <imunsie@au1.ibm.com>
Date: Tue, 7 Dec 2010 10:20:44 +1100
Subject: [PATCH v3] perf record/report: Process events in order

This patch changes perf report to ask for the ID info on all events be
default if recording from multiple CPUs.

Perf report (without -D) will now process the events in order if the
kernel is able to provide timestamps on all events. This ensures that
events such as COMM and MMAP which are necessary to correctly interpret
samples are processed prior to those samples so that they are attributed
correctly.

Before:
 # perf record ./cachetest
 # perf report

 # Events: 6K cycles
 #
 # Overhead  Command      Shared Object                           Symbol
 # ........  .......  .................  ...............................
 #
     74.11%    :3259  [unknown]          [k] 0x4a6c
      1.50%  cachetest  ld-2.11.2.so       [.] 0x1777c
      1.46%    :3259  [kernel.kallsyms]  [k] .perf_event_mmap_ctx
      1.25%    :3259  [kernel.kallsyms]  [k] restore
      0.74%    :3259  [kernel.kallsyms]  [k] ._raw_spin_lock
      0.71%    :3259  [kernel.kallsyms]  [k] .filemap_fault
      0.66%    :3259  [kernel.kallsyms]  [k] .memset
      0.54%  cachetest  [kernel.kallsyms]  [k] .sha_transform
      0.54%    :3259  [kernel.kallsyms]  [k] .copy_4K_page
      0.54%    :3259  [kernel.kallsyms]  [k] .find_get_page
      0.52%    :3259  [kernel.kallsyms]  [k] .trace_hardirqs_off
      0.50%    :3259  [kernel.kallsyms]  [k] .__do_fault
<SNIP>

After:
 # perf report

 # Events: 6K cycles
 #
 # Overhead  Command      Shared Object                           Symbol
 # ........  .......  .................  ...............................
 #
     44.28%  cachetest  cachetest          [.] sumArrayNaive
     22.53%  cachetest  cachetest          [.] sumArrayOptimal
      6.59%  cachetest  ld-2.11.2.so       [.] 0x1777c
      2.13%  cachetest  [unknown]          [k] 0x340
      1.46%  cachetest  [kernel.kallsyms]  [k] .perf_event_mmap_ctx
      1.25%  cachetest  [kernel.kallsyms]  [k] restore
      0.74%  cachetest  [kernel.kallsyms]  [k] ._raw_spin_lock
      0.71%  cachetest  [kernel.kallsyms]  [k] .filemap_fault
      0.66%  cachetest  [kernel.kallsyms]  [k] .memset
      0.54%  cachetest  [kernel.kallsyms]  [k] .copy_4K_page
      0.54%  cachetest  [kernel.kallsyms]  [k] .find_get_page
      0.54%  cachetest  [kernel.kallsyms]  [k] .sha_transform
      0.52%  cachetest  [kernel.kallsyms]  [k] .trace_hardirqs_off
      0.50%  cachetest  [kernel.kallsyms]  [k] .__do_fault
<SNIP>

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 tools/perf/builtin-record.c |    5 ++++-
 tools/perf/builtin-report.c |    5 +++++
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 699dd21..310dd21 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -285,7 +285,7 @@ static void create_counter(int counter, int cpu)
 	if (system_wide)
 		attr->sample_type	|= PERF_SAMPLE_CPU;
 
-	if (sample_time)
+	if (sample_time || system_wide || !no_inherit || cpu_list)
 		attr->sample_type	|= PERF_SAMPLE_TIME;
 
 	if (raw_samples) {
@@ -327,6 +327,9 @@ try_again:
 				 * Old kernel, no attr->sample_id_type_all field
 				 */
 				sample_id_all_avail = false;
+				if (!sample_time && !raw_samples)
+					attr->sample_type &= ~PERF_SAMPLE_TIME;
+
 				goto retry_sample_id;
 			}
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 904519f..d1ad782 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -244,6 +244,8 @@ static struct perf_event_ops event_ops = {
 	.event_type = event__process_event_type,
 	.tracing_data = event__process_tracing_data,
 	.build_id = event__process_build_id,
+	/* { .ordered_samples && !dump_trace } */
+	/* { .ordering_requires_timestamps && !dump_trace } */
 };
 
 extern volatile int session_done;
@@ -319,6 +321,9 @@ static int __cmd_report(void)
 	if (ret)
 		goto out_delete;
 
+	event_ops.ordered_samples = event_ops.ordering_requires_timestamps =
+		!dump_trace;
+
 	ret = perf_session__process_events(session, &event_ops);
 	if (ret)
 		goto out_delete;
-- 
1.7.2.3

  reply	other threads:[~2010-12-07  1:15 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-04 22:26 [GIT PULL 0/5] perf/core: Support for PERF_SAMPLE_ in all events Arnaldo Carvalho de Melo
2010-12-04 22:26 ` [PATCH 1/5] perf events: Fix event inherit fallout of precalculated headers Arnaldo Carvalho de Melo
2010-12-04 22:26 ` [PATCH 2/5] perf events: Separate the routines handling the PERF_SAMPLE_ identity fields Arnaldo Carvalho de Melo
2010-12-04 22:26 ` [PATCH 3/5] perf events: Make sample_type identity fields available in all PERF_RECORD_ events Arnaldo Carvalho de Melo
2010-12-04 22:26 ` [PATCH 4/5] perf session: Parse sample earlier Arnaldo Carvalho de Melo
2010-12-04 22:26 ` [PATCH 5/5] perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events Arnaldo Carvalho de Melo
2010-12-05 13:32 ` [Patch] perf: session: Sort all events if ordered_samples=true Thomas Gleixner
2010-12-06  2:37   ` Process events in order if recording from multiple CPUs Ian Munsie
2010-12-06  2:37   ` [PATCH 1/3] perf tool: Better displaying of unresolved DSOs and symbols Ian Munsie
2010-12-07  6:56     ` [tip:perf/core] perf hist: " tip-bot for Ian Munsie
2010-12-06  2:37   ` [PATCH 2/3] perf: Move all output for perf report -D into trace_event Ian Munsie
2010-12-06  2:37   ` [PATCH 3/3] perf record/report: Process events in order Ian Munsie
2010-12-06  9:20     ` Thomas Gleixner
2010-12-06 12:42       ` Ian Munsie
2010-12-06 13:04         ` Thomas Gleixner
2010-12-06 13:11           ` Ian Munsie
2010-12-06 14:41             ` Thomas Gleixner
2010-12-07  1:15               ` Ian Munsie [this message]
2010-12-07 10:47                 ` Thomas Gleixner
2010-12-07 10:54                   ` Arnaldo Carvalho de Melo
2010-12-07 12:24                     ` Thomas Gleixner
2010-12-07  6:57   ` [tip:perf/core] perf session: Sort all events if ordered_samples=true tip-bot for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1291679635-sup-9860@au1.ibm.com \
    --to=imunsie@au1.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@ghostprotocols.net \
    --cc=efault@gmx.de \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulus@samba.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox