From: Ingo Molnar <mingo@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Paul Turner <pjt@google.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 0/8] Announcement: Enhanced NUMA scheduling with adaptive affinity
Date: Tue, 13 Nov 2012 08:24:41 +0100
Message-ID: <20121113072441.GA21386@gmail.com>
In-Reply-To: <0000013af701ca15-3acab23b-a16d-4e38-9dc0-efef05cbc5f2-000000@email.amazonses.com>
* Christoph Lameter <cl@linux.com> wrote:
> On Mon, 12 Nov 2012, Peter Zijlstra wrote:
>
> > The biggest conceptual addition, beyond the elimination of
> > the home node, is that the scheduler is now able to
> > recognize 'private' versus 'shared' pages, by carefully
> > analyzing the pattern of how CPUs touch the working set
> > pages. The scheduler automatically recognizes tasks that
> > share memory with each other (and make dominant use of that
> > memory) - versus tasks that allocate and use their working
> > set privately.
>
> That is a key distinction to make and if this really works
> then that is major progress.
I posted updated benchmark results yesterday, and the approach
is indeed a performance breakthrough:
http://lkml.org/lkml/2012/11/12/330
It also made the code more generic and more maintainable from a
scheduler POV.
> > This new scheduler code is then able to group tasks that are
> > "memory related" via their memory access patterns together:
> > in the NUMA context moving them on the same node if
> > possible, and spreading them amongst nodes if they use
> > private memory.
>
> What happens if processes' memory accesses are related but the
> common set of data does not fit into the memory provided by a
> single node?
The other (very common) node-overload case is that there are
more tasks for a shared piece of memory than fits on a single
node.
I have measured two such workloads, one is the Java SPEC
benchmark:

   v3.7-vanilla:    494828 transactions/sec
   v3.7-NUMA:       627228 transactions/sec    [ +26.7% ]

the other is the 'numa01' testcase of autonumabench:

   v3.7-vanilla:    340.3 seconds
   v3.7-NUMA:       216.9 seconds              [ +56% ]
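
For readers who want to check the bracketed percentages, here is a minimal sketch of the arithmetic. The function names are hypothetical and only the four measured values come from the figures above. Throughput is higher-is-better, so its gain is new/old - 1. Runtime is lower-is-better, so the equivalent speedup is old/new - 1, which is the convention the +56% figure appears to follow.

```python
def throughput_gain_percent(old: float, new: float) -> float:
    """Relative gain for a higher-is-better metric (e.g. transactions/sec)."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new / old - 1.0) * 100.0


def runtime_speedup_percent(old: float, new: float) -> float:
    """Relative speedup for a lower-is-better metric (e.g. seconds)."""
    if new <= 0:
        raise ValueError("new must be positive")
    return (old / new - 1.0) * 100.0


if __name__ == "__main__":
    # Java SPEC benchmark, transactions/sec
    print(f"{throughput_gain_percent(494828, 627228):.1f}%")  # prints 26.8%
    # autonumabench numa01, seconds
    print(f"{runtime_speedup_percent(340.3, 216.9):.1f}%")    # prints 56.9%
```

The quoted +26.7% and +56% match these values rounded down rather than to nearest.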
> The correct resolution usually is in that case to interleave
> the pages over both nodes in use.
I'd not go as far as to claim that to be a general rule: the
correct placement depends on the system and workload specifics:
how much memory is on each node, how many tasks run on each
node, and whether the access patterns and working sets of the
tasks are symmetric amongst each other - which is not a given at
all.
Consider, say, a database server that executes small and large
queries over a large, memory-shared database, and has worker
tasks to clients, to serve each query. Depending on the nature
of the queries, interleaving can easily be the wrong thing to
do.
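
A toy model can make the point concrete. It is only a sketch under loud assumptions: every page of a working set is accessed equally often, interleaving spreads pages evenly across nodes, and the function names and page counts are hypothetical rather than taken from any measurement.

```python
def local_fraction_interleaved(num_nodes: int) -> float:
    """Pages spread evenly over num_nodes: a task running on any one
    node finds roughly 1/num_nodes of its accesses local."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be >= 1")
    return 1.0 / num_nodes


def local_fraction_colocated(working_set_pages: int,
                             node_capacity_pages: int) -> float:
    """Task and as many of its pages as fit are placed on one node;
    pages that overflow the node are assumed to be remote."""
    if working_set_pages <= 0 or node_capacity_pages < 0:
        raise ValueError("invalid page counts")
    local_pages = min(working_set_pages, node_capacity_pages)
    return local_pages / working_set_pages


if __name__ == "__main__":
    nodes, capacity = 2, 4000          # hypothetical 2-node system
    for name, pages in (("small query", 1000), ("large query", 8000)):
        print(name,
              "interleaved:", local_fraction_interleaved(nodes),
              "colocated:", local_fraction_colocated(pages, capacity))
```

In this model the small query gets all of its accesses local when colocated but only half when interleaved, while the large query sees no difference between the two. Real workloads have skewed access patterns, so the model only illustrates why a single placement rule does not fit every query.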
Thanks,
Ingo