From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>,
	hpa@zytor.com, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, pjt@google.com, cl@linux.com,
	bharata.rao@gmail.com, akpm@linux-foundation.org,
	Lee.Schermerhorn@hp.com, aarcange@redhat.com, danms@us.ibm.com,
	suresh.b.siddha@intel.com, tglx@linutronix.de,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/numa] sched/numa: Introduce sys_numa_{t,m}bind()
Date: Sat, 19 May 2012 13:19:09 +0200
Message-ID: <20120519111908.GC2012@gmail.com>
In-Reply-To: <1337357128.573.88.camel@twins>


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> > > I very much believe in doing the simple thing first, and 
> > > this is that,
> > 
> > Leave out your syscalls (which might not be useful for 
> > managed runtimes), and you actually have the simple thing :)
> 
> Right, but the virt people could actually trivially use those, 
> and vnuma doesn't have the scrambling issue outlined earlier 
> since the guest kernel would also try to keep home-node 
> affinity.
> 
> Avi already said patching kvm would be like 5 minutes work.

These APIs also match what user-space numa daemons started doing 
already.

> It also absolutely avoids the false sharing issue otherwise 
> present with per-cpu memory, since you explicitly tell it 
> where it belongs.

The grouping is also a natural extension to task and memory 
affinities and groups in general.

It also allows us to turn auto-migration off by default, which 
is a plus in my book. Without enough numbers I'm not convinced 
that we really *want* auto-discovery turned on all the time, for 
all workloads. The thing is, in practice most workloads that 
matter are short-run, and even trivial forms of CPU migration 
don't ever happen for bursts of activity. We place them and 
that's it.

Managed runtimes on the other hand can be expected to know about 
and manage their locality - they do it anyway, by running guest 
scheduler(s). So this patch-set gives them the ability to 
express locality in a simple way, without the host kernel 
scanning actively.

We can auto-scan on top of this, if the numbers support it, but 
in the simple case where both the guest and the host are smart 
then simply expressing locality and telling each other is vastly 
superior to any scanning method.

Thanks,

	Ingo

