public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <borislav.petkov@amd.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	greg@kroah.com, mingo@elte.hu, norsk5@yahoo.com,
	tglx@linutronix.de, mchehab@redhat.com, aris@redhat.com,
	edt@aei.ca, linux-kernel@vger.kernel.org,
	randy.dunlap@oracle.com, Sam Ravnborg <sam@ravnborg.org>
Subject: Re: [PATCH 0/4] amd64_edac: misc fixes
Date: Wed, 3 Jun 2009 20:20:23 +0200	[thread overview]
Message-ID: <20090603182023.GA28083@aftab> (raw)
In-Reply-To: <4A24248E.3060005@zytor.com>

On Mon, Jun 01, 2009 at 11:57:18AM -0700, H. Peter Anvin wrote:
> Borislav Petkov wrote:
> > Actually, popcnt got added to gas in July 2006 so checking the gas
> > version should suffice, IMHO.
> 
> gas is part of binutils.
> 
> > Anyway, I proposed something similar before but Andrew suggested that we
> > should simply slap in the opcode so we don't need the Kbuild changes.
> > The advantage of the approach is that it works unconditionally on all
> > toolchains and introduces less code changes. Hmm...
> 
> That really sucks, though, in the long run.  I personally prefer to have
> the "right thing" -- which in this case is probably gcc intrinsics --
> and then a fallback that will gradually fall out of use.

Ok, here's a simple performance data measurement exercise:

I went and rerouted all the cpumask_weight calls in sched.c through a
noinline local definition:

static noinline unsigned int my_weight(const struct cpumask *mask)
{
       return cpumask_weight(mask);
}

so that I could be able to dynamically ftrace the invocations. Compiling
a kernel (make -j8) on a quad core Fam10h gave the following trace
(excerpt):

          <idle>-0     [000]   313.120141: my_weight <-scheduler_tick
          <idle>-0     [000]   313.120145: my_weight <-select_nohz_load_balancer
          <idle>-0     [000]   313.124133: my_weight <-scheduler_tick
          <idle>-0     [000]   313.124138: my_weight <-select_nohz_load_balancer
          <idle>-0     [000]   313.128124: my_weight <-scheduler_tick
          <idle>-0     [000]   313.128127: my_weight <-select_nohz_load_balancer
          <idle>-0     [000]   313.132116: my_weight <-scheduler_tick
          <idle>-0     [000]   313.132120: my_weight <-select_nohz_load_balancer
          <idle>-0     [000]   313.136109: my_weight <-scheduler_tick
          <idle>-0     [000]   313.136114: my_weight <-select_nohz_load_balancer
           <...>-3986  [002]   313.138868: my_weight <-sched_balance_self
           <...>-3986  [002]   313.138870: my_weight <-sched_balance_self
           <...>-4064  [003]   313.138942: my_weight <-sched_balance_self
           <...>-4064  [003]   313.138945: my_weight <-sched_balance_self
           <...>-4064  [000]   313.142034: my_weight <-sched_balance_self
           <...>-4064  [000]   313.142037: my_weight <-sched_balance_self
           <...>-4065  [001]   313.143509: my_weight <-sched_balance_self
           <...>-4065  [001]   313.143511: my_weight <-sched_balance_self
            make-3777  [000]   313.146553: my_weight <-sched_balance_self
            make-3777  [000]   313.146554: my_weight <-sched_balance_self
           <...>-4066  [001]   313.146614: my_weight <-sched_balance_self
           <...>-4066  [001]   313.146614: my_weight <-sched_balance_self
           <...>-4066  [003]   313.149516: my_weight <-sched_balance_self


and the following stats:

compile time: ~309.373623 secs
my_weight calls on _all_ cores: 54005
	(cpu0: 14262, cpu1: 14417, cpu2: 11654, cpu3: 13672)

leading to approx. 174.56 calls per second on _ALL_ cores combined. If,
hypothetically speaking, this is a representative workload and we forget
the ftrace overhead, it looks like there's no need to switch to the
hardware version of hweight since this'll bring a bunch of code changes
which simply wouldn't justify themselves wrt to performance improvement.
It is just not worth the effort.

Of course, I'm open for suggestions wrt to a better workload but from
looking at the code, the most frequent hweight call site seems to be
scheduler_tick which happens with HZ frequency and even this is by
several magnitudes not enough for a measurable performance improvement.

Hmm..?

-- 
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
  System  | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
 Research | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
  Center  | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
  (OSRC)  | Registergericht München, HRB Nr. 43632


      reply	other threads:[~2009-06-03 18:20 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-20 18:43 [PATCH 0/4] amd64_edac: misc fixes Borislav Petkov
2009-05-20 18:43 ` [PATCH 1/4] x86: msr.h: fix build error Borislav Petkov
2009-05-20 18:43 ` [PATCH 2/4] amd64_edac: do not enable module by default Borislav Petkov
2009-05-20 18:43 ` [PATCH 3/4] EDAC: do not enable modules " Borislav Petkov
2009-05-20 18:43 ` [PATCH 4/4] amd64_edac: add MAINTAINERS entry Borislav Petkov
2009-05-20 21:41 ` [PATCH 0/4] amd64_edac: misc fixes Randy Dunlap
2009-05-28 23:47 ` Andrew Morton
2009-05-29 10:33   ` Borislav Petkov
2009-05-29 20:01     ` Andrew Morton
2009-05-30  8:19       ` Borislav Petkov
2009-05-30  8:40         ` Andrew Morton
2009-05-30 10:31           ` Borislav Petkov
2009-05-30 19:22           ` H. Peter Anvin
2009-06-01 14:53             ` Borislav Petkov
2009-06-01 16:54               ` H. Peter Anvin
2009-06-01 17:02                 ` H. Peter Anvin
2009-06-01 17:31                   ` H.J. Lu
2009-06-01 18:12                 ` Borislav Petkov
2009-06-01 18:57                   ` H. Peter Anvin
2009-06-03 18:20                     ` Borislav Petkov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090603182023.GA28083@aftab \
    --to=borislav.petkov@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=aris@redhat.com \
    --cc=edt@aei.ca \
    --cc=greg@kroah.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@redhat.com \
    --cc=mingo@elte.hu \
    --cc=norsk5@yahoo.com \
    --cc=randy.dunlap@oracle.com \
    --cc=sam@ravnborg.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox