public inbox for linux-kernel@vger.kernel.org
From: "John Hawkes" <hawkes@sgi.com>
To: "Chen, Kenneth W" <kenneth.w.chen@intel.com>,
	"Tony Luck" <tony.luck@gmail.com>,
	"Andrew Morton" <akpm@osdl.org>, <linux-ia64@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: "Jack Steiner" <steiner@sgi.com>, "Dan Higgins" <djh@sgi.com>,
	"John Hesterberg" <jh@sgi.com>, "Greg Edwards" <edwardsg@sgi.com>
Subject: Re: [PATCH] ia64: change defconfig to NR_CPUS==1024
Date: Fri, 6 Jan 2006 09:06:18 -0800
Message-ID: <000701c612e3$8324eff0$6f00a8c0@comcast.net>
In-Reply-To: 200601052233.k05MX4g15045@unix-os.sc.intel.com

From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
> What type of heavy workloads have you measured? Including db transaction
> processing and decision making workloads?

I haven't used a db transaction processing benchmark, but I have used other
workloads with large process counts and high context-switch rates.

> > The potential
> > extra cachemiss seems to be lost in the noise.  The for_each_*cpu()
> > macros are relatively efficient in skipping past zeroed cpumask bits.
> > Workloads that impose higher loads on the CPU Scheduler tend to
> > bottleneck on non-Scheduler parts of the kernel, and it's the Scheduler
> > which makes the principal use of the cpumask_t, so these extra
> > cachemiss inefficiencies and extra CPU cycles to scan zero mask words
> > just get lost in the general system overhead.
>
> I found the above claims are generally false for workloads that put tons
> of pressure on the CPU cache, especially db workloads.  Typically
> for a db workload, the working set in user space is so large that making
> a trip into the kernel has a far larger secondary effect than the primary
> cache miss that occurred in the kernel.  In other words, cache lines evicted
> by the kernel code have a far larger impact on overall application
> performance and lead to lower overall system performance.  So
> when you say "get lost in the general system overhead", did you consider
> the secondary effect on application performance?

The current default is 512p, which is eight 64-bit words -- a cacheline.
Increasing to 1024p adds another eight words -- one more cacheline -- to the
cpumask_t.  I doubt you're going to see a performance regression on your db
transaction processing benchmark because of one additional cachemiss during
active or passive load-balancing.
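
As a rough illustration of the arithmetic above, here is a minimal sketch that computes how many words and bytes a one-bit-per-CPU mask needs. The names WORD_BITS, MASK_WORDS, mask_words() and mask_bytes() are hypothetical helpers for this example, not kernel identifiers, and the 64-bit word size is an assumption that matches ia64.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint64_t */

/* Assumption: one mask word is 64 bits, as on ia64. */
#define WORD_BITS 64u

/* Words needed to hold one bit per CPU, rounded up to a whole word. */
#define MASK_WORDS(nr_cpus) (((nr_cpus) + WORD_BITS - 1u) / WORD_BITS)

/* Compile-time check of the figures quoted in the message. */
static_assert(MASK_WORDS(512u) == 8u, "512 CPUs should need 8 words");
static_assert(MASK_WORDS(1024u) == 16u, "1024 CPUs should need 16 words");

/* Run-time form of the same calculation. */
static inline size_t mask_words(size_t nr_cpus)
{
    return MASK_WORDS(nr_cpus);
}

/* Size of the mask in bytes for a given CPU count. */
static inline size_t mask_bytes(size_t nr_cpus)
{
    return mask_words(nr_cpus) * sizeof(uint64_t);
}
```

Under these assumptions the mask grows from 64 bytes at 512 CPUs to 128 bytes at 1024 CPUs, that is, by eight more words. Whether 64 bytes is exactly one cacheline depends on the cache level and processor in question.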

I agree that throughout the kernel we ought to be aware of increasing
cachemisses and the lengthening code paths, but I don't believe this
particular one is some evil that needs to be suppressed.  We have far more
micro-performance-impacting algorithms and data structures in the kernel right
now that we ought to consider -- e.g., cache coloring conflicts with the
struct runqueue -- as well as the obvious algorithm tweaks that greatly affect
processor assignments -- e.g., whether or not to call wake_idle().

> What we found is that going from NR_CPUS = 64 to 128 has a small
> performance impact on db transaction processing workloads, though I have
> not measured the difference between 128 and 1024.

Going from 64 (one word) to >64 (an array of words) produces a qualitative
change to the emitted code in how the cpumask_t is passed in calling sequences
and how it is manipulated.  I completely understand that you can detect a
small performance regression between 64 and 128.  I just don't believe you can
conclude that going from 512 to 1024 will exhibit a similar measurable
regression.
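
To make the one-word versus array-of-words distinction concrete, here is a simplified sketch. The types mask64_t and mask_wide_t and both functions are hypothetical stand-ins for this example, not the kernel's cpumask_t or its API, and the 1024-CPU width is just the value under discussion.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint64_t */

#define DEMO_NR_CPUS 1024
#define DEMO_WORDS   ((DEMO_NR_CPUS + 63) / 64)

/* Up to 64 CPUs: the whole mask is a single word. */
typedef struct { uint64_t bits[1]; } mask64_t;

/* More than 64 CPUs: the mask becomes an array of words. */
typedef struct { uint64_t bits[DEMO_WORDS]; } mask_wide_t;

static_assert(sizeof(mask64_t) == 8, "single-word mask is 8 bytes");
static_assert(sizeof(mask_wide_t) == 128, "1024-bit mask is 128 bytes");

/* Single word: cheap to pass by value, tested with one AND. */
static inline int mask64_intersects(mask64_t a, mask64_t b)
{
    return (a.bits[0] & b.bits[0]) != 0;
}

/* Array of words: passed by pointer and tested with a loop. */
static inline int mask_wide_intersects(const mask_wide_t *a,
                                       const mask_wide_t *b)
{
    for (size_t i = 0; i < DEMO_WORDS; i++) {
        if (a->bits[i] & b->bits[i])
            return 1;   /* stop at the first overlapping word */
    }
    return 0;
}
```

In this sketch the step from one word to many changes the shape of the code (value versus pointer, single operation versus loop), while the step from 8 words to 16 only changes the loop bound. That is the qualitative versus quantitative difference the paragraph above describes.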

John Hawkes



