From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Paul Mundt <lethal@linux-sh.org>
Cc: Andi Kleen <ak@suse.de>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-mm <linux-mm@kvack.org>,
Christoph Lameter <clameter@sgi.com>,
Nishanth Aravamudan <nacc@us.ibm.com>,
kxr@sgi.com, akpm@linux-foundation.org,
Eric Whitney <eric.whitney@hp.com>
Subject: Re: [PATCH/RFC] Allow selected nodes to be excluded from MPOL_INTERLEAVE masks
Date: Wed, 01 Aug 2007 09:54:06 -0400
Message-ID: <1185976446.5059.27.camel@localhost>
In-Reply-To: <20070801112116.GA9617@linux-sh.org>

On Wed, 2007-08-01 at 20:21 +0900, Paul Mundt wrote:
> On Wed, Aug 01, 2007 at 01:07:43PM +0200, Andi Kleen wrote:
> >
> > > As long as interleaving is possible after boot, then yes. It's only the
> > > boot-time interleave that we would like to avoid,
> >
> > But when anybody does interleaving later it could just as easily
> > fill up your small nodes, couldn't it?
> >
> Yes, but these are in embedded environments where we have control over
> what the applications are doing. Most of these sorts of things are for
> applications where we know what sort of latency requirements we have to deal
> with, and so the workload is very much tied to the worst-case range of
> nodes, or just to a particular node. We might only have certain buffers
> that need to be backed by faster memory as well, so while most of the
> application pages will come from node 0 (system memory), certain other
> allocations will come from other nodes. We've been experimenting with
> doing that through tmpfs with mpol tuning.
>
> In the general case however it's fairly safe to include the tiny nodes as
> part of a larger set with a prefer policy so we don't immediately OOM.
>
> > Boot time allocations are small compared to what user space
> > later can allocate.
> >
> Yes, we only want certain applications to explicitly poke at those nodes,
> but they do have a use case for interleave, so it is not functionality I
> would want to lose completely.

This is why I wanted to use an "obscure boot option". I don't see this
as strictly an architectural/platform issue. Rather, it's a combination
of the arch/platform and how it's being used for specific applications,
so I don't see how one could accomplish this with a heuristic.

As Paul mentioned, in embedded systems one has a bit more control over
what applications are doing. In that case, I could envision a config
option that specifies the initial/default value of no_interleave_nodes
at kernel build time, dispensing with the boot option. [Any interest
in such an option, Paul?] But for platforms like ours, which tend to run
enterprise distro kernels, I need a way to specify, on a per-site or
per-installation basis, which nodes should be used. Our approach would
be to document this in a "best practices" doc that the customer or, more
likely, our field software specialists would use to optimize the
platform and OS config for the application.
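
To make the idea concrete, here is a rough userspace sketch of what
excluding nodes from an interleave mask amounts to. It is not the code
from the patch: the function names, the 64-node limit, and the
"1,3-5" list syntax are all assumptions made for illustration, and
in-kernel code would work on nodemask_t rather than a plain integer.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a node list such as "1,3-5" into a 64-bit
 * mask with one bit per node. Returns 0 on success, -1 on malformed
 * input or a node number >= 64. *mask is only written on success.
 */
static int parse_node_list(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*s))
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;

		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= 64)
			return -1;

		for (unsigned long n = lo; n <= hi; n++)
			m |= UINT64_C(1) << n;

		if (*end == ',') {
			end++;
			if (*end == '\0')	/* reject trailing comma */
				return -1;
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}

	*mask = m;
	return 0;
}

/*
 * Hypothetical helper: drop the excluded nodes from a requested
 * interleave mask. This is the whole effect being discussed; whether
 * it applies only at boot or also to later requests is a policy choice
 * this sketch does not make.
 */
static uint64_t effective_interleave_mask(uint64_t requested,
					  uint64_t no_interleave)
{
	return requested & ~no_interleave;
}
```

With a no_interleave list of "1,3-5" and a request covering nodes 0-7,
the effective mask would be left with nodes 0, 2, 6 and 7. A build-time
default would only change where the initial list string comes from.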
>
> > And do you really want them in the normal fallback lists? The normal zone
> > reservation heuristics probably won't work unless you put them into
> > special low zones.
> >
> That's something else to look at also, though I would very much like to
> avoid having to construct custom zonelists. It would be nice to keep things as
> simple and as non-invasive as possible. As far as the existing NUMA code
> goes, we're not quite all the way there yet in terms of supporting these
> things as well as we can, but it has proven to be a pretty good starting
> point.

Yes, there are rumblings on the mailing list about passing just a
starting [preferred] node and a node mask to the page allocator. I'm
too backed up with other things to think much about this yet.

Lee
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .