Re: [PATCH 1/2 -tip] perf_counter: Add generalized hardware vectored co-processor support for AMD and Intel Corei7/Nehalem


From: Ingo Molnar <mingo@elte.hu>
To: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Arjan van de Ven <arjan@infradead.org>,
	Paul Mackerras <paulus@samba.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Anton Blanchard <anton@samba.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	x86 maintainers <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: Re: [PATCH 1/2 -tip] perf_counter: Add generalized hardware vectored co-processor support for AMD and Intel Corei7/Nehalem
Date: Sat, 4 Jul 2009 12:03:31 +0200
Message-ID: <20090704100331.GC2139@elte.hu>
In-Reply-To: <1246627555.2322.42.camel@jaswinder.satnam>


* Jaswinder Singh Rajput <jaswinder@kernel.org> wrote:

> On Fri, 2009-07-03 at 18:19 +0530, Jaswinder Singh Rajput wrote:
> > On Fri, 2009-07-03 at 17:25 +0530, Jaswinder Singh Rajput wrote:
> > > On Fri, 2009-07-03 at 12:29 +0200, Ingo Molnar wrote:
> > > > * Jaswinder Singh Rajput <jaswinder@kernel.org> wrote:
> > > > 
> > > > >  Performance counter stats for '/usr/bin/rhythmbox /home/jaswinder/Music/singhiskinng.mp3':
> > > > > 
> > > > >        17552264  vec-adds                  (scaled from 66.28%)
> > > > >        19715258  vec-muls                  (scaled from 66.63%)
> > > > >        15862733  vec-divs                  (scaled from 66.82%)
> > > > >     23735187095  vec-idle-cycles           (scaled from 66.89%)
> > > > >        11353159  vec-stall-cycles          (scaled from 66.90%)
> > > > >        36628571  vec-ops                   (scaled from 66.48%)
> > > > 
> > > > Is stall-cycles equivalent to busy-cycles? 
> > > 
> > > 
> > > hmm, normally we can use these terms interchangeably. But they can be
> > > different sometimes.
> > > 
> > > busy means it is already executing some instructions so it will not take
> > > another instruction.
> > > 
> > > stall can be busy (executing) or non-executing; maybe it is waiting for
> > > some operands due to a cache miss.
> > > 
> > > 
> > > > I.e. do we have this 
> > > > general relationship to the cycle event:
> > > > 
> > > > 	cycles = vec-stall-cycles + vec-idle-cycles
> > > > 
> > > > ?
> > 
> > Like on AMD :
> > 
> >     13390918485  vec-adds                  (scaled from 57.07%)
> >     22465091289  vec-muls                  (scaled from 57.22%)
> >      2643789384  vec-divs                  (scaled from 57.21%)
> >     17922784596  vec-idle-cycles           (scaled from 57.23%)
> >      6402888606  vec-stall-cycles          (scaled from 57.17%)
> >     55823491597  cycles                    (scaled from 57.05%)
> >     51035264218  vec-ops                   (scaled from 57.05%)
> > 
> >   187.494664172  seconds time elapsed
> > 
> > vec-idle-cycles + vec-stall-cycles = 24325673202
> > 
> > so cycles = 2.29 * (vec-idle-cycles + vec-stall-cycles)

That equation is entirely bogus.

> > 
> > On AMD I used : EventSelect 0D7h Dispatch Stall for FPU Full The 
> > number of processor cycles the decoder is stalled because the 
> > scheduler for the Floating Point Unit is full. This condition 
> > can be caused by a lack of parallelism in FP-intensive code, or 
> > by cache misses on FP operand loads (which could also show up as 
> > EventSelect 0D8h instead, depending on the nature of the 
> > instruction sequences). May occur simultaneously with certain 
> > other stall conditions; see EventSelect 0D1h
> > 
> > So stall is due to lack of parallelism and cache misses. If we 
> > keep on increasing the size of FP units and cache, maybe at some 
> > point we can get vec-stall-cycles = zero.
> > 
> 
> I mean, stall is mainly due to lack of parallelism and cache 
> misses. If we keep on increasing the size of FP units and cache 
> then stall time will keep on decreasing (of course it will never 
> be zero ;)
> 
> And the same thing will happen for Intel.
> 
> So stall is not equal to busy.
> 
> Please let me know what is next, should I remove the busy term 
> from the alias?

What is needed is for you to understand these events and provide a 
generalization around them that makes sense. Or to declare it 
honestly when you don't.

The numbers simply don't add up:

> >     13390918485  vec-adds                  (scaled from 57.07%)
> >     22465091289  vec-muls                  (scaled from 57.22%)
> >      2643789384  vec-divs                  (scaled from 57.21%)
> >     17922784596  vec-idle-cycles           (scaled from 57.23%)
> >      6402888606  vec-stall-cycles          (scaled from 57.17%)
> >     55823491597  cycles                    (scaled from 57.05%)
> >     51035264218  vec-ops                   (scaled from 57.05%)

vec-idle-cycles + vec-stall-cycles does not add up to cycles - 
because a stall is not an 'interchangeable' term with 'busy' as you 
claimed before, but a special state of the pipeline, a subset of 
busy.
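
The mismatch can be checked with a few lines of arithmetic. This is only a 
sketch: it uses nothing but the counter values quoted above, the variable 
names are illustrative, and the counts themselves are scaled estimates 
(each was "scaled from" roughly 57%), so exact equality would not be 
expected even if the relationship held. A factor of more than two is far 
outside that kind of error.

```python
# Counter values copied from the AMD 'perf stat' output quoted above.
# Variable names are illustrative; they are not perf identifiers.
vec_idle_cycles = 17922784596
vec_stall_cycles = 6402888606
cycles = 55823491597

# If "cycles = vec-stall-cycles + vec-idle-cycles" held, these would match.
idle_plus_stall = vec_idle_cycles + vec_stall_cycles
unaccounted = cycles - idle_plus_stall
ratio = cycles / idle_plus_stall

print(f"idle + stall       = {idle_plus_stall}")
print(f"cycles             = {cycles}")
print(f"unaccounted cycles = {unaccounted}")
print(f"cycles / (idle + stall) = {ratio:.2f}")
```

More than half of the measured cycles fall into neither bucket, which is 
consistent with stall being only a subset of the non-idle cycles.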

I prefer to apply patches from people who understand what they are 
doing - and more importantly, who express and declare their own 
limits properly when they _don't_ understand something and are 
guessing.

Frankly, your patches don't give me this impression and you are also 
babbling way too much about things you clearly don't understand, and 
thus you hinder the discussions with noise.

It's not bad at all to not understand something (we all are at 
various stages of a big and constantly refreshing learning curve), 
but it's very bad to pretend you understand something while you 
clearly don't. What we need in lkml discussions is an honest laying 
down of facts, opinions and doubts.

Why the heck didn't you say:

 " I don't know much about PMUs or vector units yet, but I have found
   these blurbs in the Intel and AMD docs. What do you think 
   about structuring these events the following way? Someone who 
   knows this stuff should review this first, it is quite likely 
   incomplete. "

	Ingo


