From: Andrea Arcangeli <aarcange@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Lee Schermerhorn <lee.schermerhorn@hp.com>,
linux-numa@vger.kernel.org, akpm@linux-foundation.org,
Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
Hugh Dickins <hughd@google.com>,
andi@firstfloor.org, David Rientjes <rientjes@google.com>,
Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH/RFC 0/8] numa - Migrate-on-Fault
Date: Mon, 15 Nov 2010 15:33:50 +0100
Message-ID: <20101115143350.GC6809@random.random>
In-Reply-To: <alpine.DEB.2.00.1011150809030.19175@router.home>
Hi everyone,
On Mon, Nov 15, 2010 at 08:13:14AM -0600, Christoph Lameter wrote:
> On Sun, 14 Nov 2010, KOSAKI Motohiro wrote:
>
> > Nice!
>
> Let's not get overenthused. There has been no conclusive proof that the
> overhead introduced by automatic migration schemes is consistently less
> than the benefit obtained by moving the data. Quite to the contrary. We
> have over a decade's worth of research and attempts on this issue and there
> was no general improvement to be had that way.
>
> The reason that the manual placement interfaces exist is because there was
> no generally beneficial migration scheme available. The manual interfaces
> allow the writing of various automatic migration schemes in user space.
>
> If we can come up with something that is an improvement then let's go
> this way but I am skeptical.

I find the patchset very interesting in general, but I agree with
Christoph.

It's good to give the patchset more visibility, as it's quite unique in
this area, but when talking with Lee I also thought the synchronous
migrate-on-fault was probably too aggressive. I prefer an algorithm
where memory follows cpus and cpus follow memory in a fully dynamic
way.

I suggested to Lee during our chat (and also to others during
KS+Plumbers) that we need a more dynamic algorithm that works
asynchronously in the background. Specifically, I want the cpu to
follow memory closely whenever idle status allows it (changing cpu at
context switch is cheap; I don't like pinning or the "single" home node
concept), and then memory to slowly follow the cpu in tandem, in the
background, from a kernel thread. With the cpu following memory fast
and memory following the cpu slowly, things should converge over time
to an optimal behavior. I like the migration being done from a kthread
like khugepaged/ksmd, not synchronously adding latency to the page
fault (or having to take down ptes to trigger the migrate-on-fault).
Migration should never require the app to exit kernel and take a fault
just to migrate; it happens transparently as far as userland is
concerned, unless of course it trips on the migration pte at just the
wrong time :).
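
To make the "cpu follows memory fast, memory follows cpu slow" idea
concrete, here is a toy user-space model. It is only a sketch under
stated assumptions and not kernel code or anything from the patchset:
the names step, simulate, idle_nodes and batch are hypothetical, pages
are reduced to a list of node ids, and the "kthread" is just a loop
that moves at most batch pages per tick.

```python
from collections import Counter


def step(cpu_node, pages, idle_nodes, batch=2):
    """Run one tick of the toy model and return the (possibly new) cpu node.

    pages      -- non-empty list of node ids, one per page; mutated in place
    idle_nodes -- set of nodes whose cpus are idle during this tick
    batch      -- pages the background worker may move per tick
                  (hypothetical knob, not a real kernel tunable)
    """
    counts = Counter(pages)

    # Fast path: the cpu follows memory. Pick the node holding most of the
    # task's pages, preferring the current node on ties, and move the task
    # there only if that node is idle.
    best = max(counts, key=lambda n: (counts[n], n == cpu_node))
    if (best != cpu_node
            and counts[best] > counts[cpu_node]
            and best in idle_nodes):
        cpu_node = best

    # Slow path: memory follows the cpu. A background worker moves at most
    # `batch` remote pages per tick to the node the task now runs on.
    moved = 0
    for i, node in enumerate(pages):
        if moved == batch:
            break
        if node != cpu_node:
            pages[i] = cpu_node
            moved += 1

    return cpu_node


def simulate(cpu_node, pages, idle_nodes, ticks, batch=2):
    """Run `ticks` steps on a copy of `pages`; return (cpu_node, pages)."""
    pages = list(pages)
    for _ in range(ticks):
        cpu_node = step(cpu_node, pages, idle_nodes, batch)
    return cpu_node, pages
```

In this model, a task that starts on node 0 with most of its pages on
node 1 jumps to node 1 at once if node 1 is idle, and only the few
stragglers are migrated. If node 1 stays busy, the task stays put and
its pages trickle over to node 0 at batch pages per tick. Either way
the placement converges without the task ever taking a fault to
trigger the move.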

So the patchset looks very interesting, and it may actually be optimal
for some slower hardware, but my perception is that these days memory
being remote isn't as big a deal as failing to keep both memory
controllers in action simultaneously (using just one controller is
worse than using both simultaneously from the wrong end; locality is
not as important as not stepping on each other's toes). So in general
synchronous migrate-on-fault seems a bit too aggressive to me and not
ideal for newer hardware. Still, this is one of the most interesting
patchsets I've seen in this area so far.

The homenode logic, ironically, may be optimal with the most important
bench, because the way that bench is set up all vms are fairly small
and there are plenty of them, so a vm never has more memory than fits
in the ram of a single node. But I like a dynamic approach that works
best in all environments, even if it's clearly not as simple and maybe
not as optimal in the one relevant benchmark we care about. I'm unsure
what the homenode is supposed to decide when the task has two, three or
four times the ram that fits in a single node (and that may not be
such an uncommon scenario after all).

I admit I haven't read enough about this homenode logic, but I was
never personally attracted to it, as in my view there should never be
a single "home" for any task.