linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@elte.hu>
To: Arun Sharma <asharma@fb.com>
Cc: arun@sharma-home.net, Stephane Eranian <eranian@google.com>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Lin Ming <ming.m.lin@intel.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	eranian@gmail.com, Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLES
Date: Mon, 25 Apr 2011 19:37:51 +0200
Message-ID: <20110425173751.GA28239@elte.hu>
In-Reply-To: <20110424061645.GA12013@radium.snc4.facebook.com>


* Arun Sharma <asharma@fb.com> wrote:

> On Sat, Apr 23, 2011 at 10:14:09PM +0200, Ingo Molnar wrote:
> > 
> > The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate
> > cycles the CPU does nothing useful, because it is stalled on a
> > cache-miss or some other condition.
> 
> Conceptually looks fine. I'd prefer a more precise name such as: 
> PERF_COUNT_EXECUTION_STALLED_CYCLES (to differentiate from frontend or 
> retirement stalls).

Ok.

Your script:

> # ./analyze.py
> Percent idle: 27%
>         Retirement Stalls: 82%
>         Backend Stalls: 0%
>         Frontend Stalls: 62%
>         Instruction Starvation: 62%
>         icache stalls: 0%
> 
> does give me a signal about where to look. The script below is
> a quick and dirty hack. I haven't really validated it with 
> many workloads. I'm posting it here anyway hoping that it'd
> result in better kernel support for these types of analyses.

It is pretty useful IMO.

The frontend/backend characterisation is pretty generic - most modern CPUs 
share that and have similar events.

So we could try to generalize these and get most of the statistics your script 
outputs.
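
As a rough illustration of the kind of statistic such generalized events
would feed, here is a minimal sketch that turns raw counter values into
percentages of total cycles. The event names (`cycles`, `frontend_stalls`,
`backend_stalls`) and the sample values are hypothetical, and this is not
the logic of the analyze.py script quoted above, which is not shown in this
message.

```python
def stall_breakdown(counts, cycles_key="cycles"):
    """Express each stall counter as a whole-number percentage of total cycles.

    `counts` maps (hypothetical) event names to raw counter values.
    """
    cycles = counts[cycles_key]
    if cycles <= 0:
        raise ValueError("total cycle count must be positive")
    # Stall categories may overlap, so the percentages need not sum to 100.
    return {
        name: round(100.0 * value / cycles)
        for name, value in counts.items()
        if name != cycles_key
    }


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    sample = {"cycles": 1000, "frontend_stalls": 620, "backend_stalls": 0}
    for name, pct in stall_breakdown(sample).items():
        print(f"{name}: {pct}%")
```

The sketch does not assume the categories are mutually exclusive, which
matches the quoted output, where the listed percentages add up to more
than 100%.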

> Even if we cover this with various generic PERF_COUNT_*STALL events,
> we'll still have a need for other events:
> 
> * Things that give info about instruction mixes.
> 
>   Ratio of {loads, stores, floating point, branches, conditional branches}
>   to total instructions.

We have this at least partially covered, but yeah, we stopped short of covering 
all instruction types so complete ratios cannot be built yet.
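
To make the "complete ratios" point concrete, here is a small sketch that
computes an instruction mix from per-class counts and reports the share of
instructions that no counted class accounts for. The class names and
numbers are hypothetical, and the sketch assumes the counted classes are
disjoint (so, for example, it would not take both "branches" and
"conditional branches" as inputs).

```python
def instruction_mix(counts, total_key="instructions"):
    """Return each instruction class as a fraction of total instructions.

    `counts` maps (hypothetical) class names to counter values and must
    include the total under `total_key`. Classes are assumed disjoint.
    The "uncovered" entry is the fraction not attributed to any class.
    """
    total = counts[total_key]
    if total <= 0:
        raise ValueError("total instruction count must be positive")
    classes = {k: v for k, v in counts.items() if k != total_key}
    mix = {name: value / total for name, value in classes.items()}
    # Whatever the counted classes do not explain stays unattributed.
    mix["uncovered"] = (total - sum(classes.values())) / total
    return mix


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    sample = {"instructions": 200, "loads": 50, "stores": 20, "branches": 40}
    for name, ratio in instruction_mix(sample).items():
        print(f"{name}: {ratio:.2f}")
```

A large "uncovered" fraction is one way to see that the set of counted
instruction types is still incomplete.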

> * Activity related to micro architecture specific caches
> 
>   People using -funroll-loops may have a significant performance opportunity.
>   But it's hard to spot bottlenecks in the instruction decoder.
> 
> * Monitoring traffic on Hypertransport/QPI links

Cross-node accesses ought to be covered by Peter's RFC patch. In terms of 
isolating cross-CPU cache accesses, I suspect we could do that too if it really 
matters to analysis in practice.

Basically, the way to go about it is through testcases like the ones you wrote: 
they demonstrate the utility of a given type of event, and that justifies 
generalizing it as well.

> Like you observe, most people will not look at these events, so
> focusing on getting the common events right makes sense. But I
> still like access to all events (either via a mapping file or
> a library such as libpfm4). Hiding them in "perf list" sounds
> like a reasonable way of keeping complexity out.

Yes. We have access to raw events for relatively obscure (or too CPU dependent) 
events - but what we do not want to do is to extend that space without adding 
*any* generic event in essence. If something like offcore or uncore PMU support 
is useful enough to be in the kernel, then it should also be useful enough to 
gain generic events.

> PS: branch-misses:pp was spot on for the example above.

heh :-)

Thanks,

	Ingo
