public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <npiggin@suse.de>
To: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: newidle balancing in NUMA domain?
Date: Tue, 24 Nov 2009 07:53:22 +0100
Message-ID: <20091124065322.GC20981@wotan.suse.de>
In-Reply-To: <1258991617.6182.21.camel@marge.simson.net>

On Mon, Nov 23, 2009 at 04:53:37PM +0100, Mike Galbraith wrote:
> On Mon, 2009-11-23 at 16:29 +0100, Nick Piggin wrote:
> 
> > So basically about the least well performing or scalable possible
> > software architecture. This is exactly the wrong thing to optimise
> > for, guys.
> 
> Hm.  Isn't fork/exec our daily bread?

No. Not for handing out tiny chunks of work and attempting to do
them in parallel. There is this thing called Amdahl's law, and if
you write a parallel program that wantonly uses the heaviest
possible primitives in its serial sections, then it doesn't deserve
to go fast.
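
To put rough numbers on the Amdahl's law point, here is a minimal
sketch, assuming a program whose serial sections take a fraction s of
the single-CPU run time and whose remaining work spreads perfectly
over n CPUs. The function name and the sample fractions are
illustrative assumptions, not measurements from x264 or a kbuild.

```python
def amdahl_speedup(serial_fraction: float, n_cpus: int) -> float:
    """Upper bound on speedup per Amdahl's law.

    serial_fraction: share of single-CPU run time that cannot be
                     parallelised (assumed, not measured).
    n_cpus:          number of CPUs working on the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be within [0, 1]")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part runs at full cost; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


if __name__ == "__main__":
    # Hypothetical serial fractions, chosen only to show the trend.
    for s in (0.01, 0.05, 0.20):
        for n in (4, 16, 64):
            print(f"s={s:.2f} n={n:3d} speedup<={amdahl_speedup(s, n):.2f}")
```

However many CPUs are added, the speedup stays below 1/s. Heavy
primitives in the serial sections make s larger, and that lowers the
ceiling no matter how well the scheduler balances.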

That is what IPC or shared memory is for. Vastly faster, vastly more
scalable, vastly easier for scheduler balancing (via either manual or
automatic placement).


> > The fact that you have to coax the scheduler into touching heaps
> > more remote cachelines and vastly increasing the amount of inter
> > node task migration should have been kind of a hint.
> > 
> > 
> > > Fork balancing only works until all cpus are active. But once a core
> > > goes idle it's left idle until we hit a general load-balance cycle.
> > > Newidle helps because it picks up these threads from other cpus,
> > > completing the current batch sooner, allowing the program to continue
> > > with the next.
> > > 
> > > There's just not much you can do from the fork() side of things once
> > > you've got them all running.
> > 
> > It sounds like allowing fork balancing to be more aggressive could
> > definitely help.
> 
> It doesn't. Task which is _already_ forked, placed and waiting over
> yonder can't do spit for getting this cpu active again without running
> so he can phone home.  This isn't only observable with x264, it just
> rubs our noses in it.  It is also quite observable in a kbuild.  What if
> the waiter is your next fork?

I'm not saying that vastly increasing task movement between NUMA
nodes won't *help* some workloads. Indeed they tend to be ones that
aren't very well parallelised (then it becomes critical to wake up
any waiter if a CPU becomes free because it might be holding a
heavily contended resource).

But can you appreciate that these are at one side of the spectrum of
workloads, and that others will much prefer to keep good affinity?
No matter how "nice" your workload is, you can't keep traffic off
the interconnect if the kernel screws up your NUMA placement.

And also, I'm not saying that we were at _exactly_ the right place
before and there was no room for improvement, but considering that
we didn't have a lot of active _regressions_ in the balancer, we
can really use that to our favour and concentrate changes in code
that does have regressions. And be really conservative and careful
with changes to the balancer.



