From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
Nishanth Aravamudan <nacc@us.ibm.com>,
Adam Litke <agl@us.ibm.com>, Andy Whitcroft <apw@canonical.com>,
eric.whitney@hp.com
Subject: Re: [PATCH 0/5] Huge Pages Nodes Allowed
Date: Wed, 17 Jun 2009 14:02:16 +0100
Message-ID: <20090617130216.GF28529@csn.ul.ie>
In-Reply-To: <20090616135228.25248.22018.sendpatchset@lts-notebook>
On Tue, Jun 16, 2009 at 09:52:28AM -0400, Lee Schermerhorn wrote:
> Because of asymmetries in some NUMA platforms, and "interesting"
> topologies emerging in the "scale up x86" world, we have need for
> better control over the placement of "fresh huge pages". A while
> back Nish Aravamudan floated a series of patches to add per node
> controls for allocating pages to the hugepage pool and removing
> them. Nish apparently moved on to other tasks before those patches
> were accepted. I have kept a copy of Nish's patches and have
> intended to rebase and test them and resubmit.
>
> In an [off-list] exchange with Mel Gorman, who admits to knowledge
> in the huge pages area, I asked his opinion of per node controls
> for huge pages and he suggested another approach: using the mempolicy
> of the task that changes nr_hugepages to constrain the fresh huge
> page allocations. I considered this approach but it seemed to me
> to be a misuse of mempolicy for populating the huge pages free
> pool.
Why would it be a misuse? Fundamentally, the huge page pools are being
filled by the current process when nr_hugepages is being used. Or are
you concerned about the specification of hugepages on the kernel command
line?
> Interleave policy doesn't have the same "this node" semantics
> that we want
By "this node" semantics, do you mean allocating from one specific node?
In that case, why would specifying a nodemask of just one node not be
sufficient?
> and bind policy would require constructing a custom
> node mask for node as well as addressing OOM, which we don't want
> during fresh huge page allocation.
Would the required mask not already be set up when the process set the
policy? OOM is not a major concern; it doesn't trigger for failed
hugepage allocations.
> One could derive a node mask
> of allowed nodes for huge pages from the mempolicy of the task
> that is modifying nr_hugepages and use that for fresh huge pages
> with GFP_THISNODE. However, if we're not going to use mempolicy
> directly--e.g., via alloc_page_current() or alloc_page_vma() [with
> yet another on-stack pseudo-vma :(]--I thought it cleaner to
> define a "nodes allowed" nodemask for populating the [persistent]
> huge pages free pool.
>
How about adding alloc_page_mempolicy() that takes the explicit mempolicy
you need?
> This patch series introduces a [per hugepage size] "sysctl",
> hugepages_nodes_allowed, that specifies a nodemask to constrain
> the allocation of persistent, fresh huge pages. The nodemask
> may be specified by a sysctl, a sysfs huge pages attribute and
> on the kernel boot command line.
>
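To make the "nodes allowed" idea concrete, here is a toy userspace model in
C. It is only a sketch under my own assumptions, not the code from the patch
series: the names toy_pool, toy_next_allowed and toy_grow_pool are
hypothetical, the nodemask is modelled as a plain bitmask, and "allocating" a
huge page is just incrementing a per-node counter. It shows fresh pages being
interleaved over the allowed nodes only.

```c
/* Toy userspace model, not kernel code. All names here are hypothetical. */
#define MAX_NODES 8

struct toy_pool {
    unsigned long nodes_allowed;  /* bit n set => node n may receive fresh pages */
    int next_nid;                 /* node at which the next attempt starts */
    int nr_pages[MAX_NODES];      /* huge pages currently in the pool, per node */
};

/* Return the first allowed node at or after 'start', wrapping around.
 * Returns -1 if the mask selects no node at all. */
static int toy_next_allowed(const struct toy_pool *p, int start)
{
    for (int i = 0; i < MAX_NODES; i++) {
        int nid = (start + i) % MAX_NODES;

        if (p->nodes_allowed & (1UL << nid))
            return nid;
    }
    return -1;
}

/* Add 'count' pages, interleaving over the allowed nodes only.
 * Returns the number of pages actually added. */
static int toy_grow_pool(struct toy_pool *p, int count)
{
    int added = 0;

    while (added < count) {
        int nid = toy_next_allowed(p, p->next_nid);

        if (nid < 0)
            break;                /* empty mask: nothing can be added */
        p->nr_pages[nid]++;       /* stands in for a successful allocation */
        p->next_nid = (nid + 1) % MAX_NODES;
        added++;
    }
    return added;
}
```

With nodes_allowed set to 0x5 (nodes 0 and 2), growing the pool by five pages
in this model places pages on nodes 0 and 2 alternately and never touches
node 1. A real allocation can of course fail on a given node, which this
model ignores.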
> The series includes a patch to free hugepages from the pool in a
> "round robin" fashion, interleaved across all on-line nodes to
> balance the hugepage pool across nodes. Nish had a patch to do
> this, too.
>
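The "round robin" freeing described above can be pictured with a similar toy
model. Again this is a sketch under assumptions, not the patch itself:
toy_free_state and toy_shrink_pool are hypothetical names, four nodes are
assumed, and a node with no free huge pages is simply skipped.

```c
/* Toy userspace model, not kernel code. All names here are hypothetical. */
#define TOY_NODES 4

struct toy_free_state {
    int free_pages[TOY_NODES];  /* free huge pages in the pool, per node */
    int next_nid_to_free;       /* node the next free attempt starts from */
};

/* Release up to 'count' pages, taking one page per node in turn so the
 * pool shrinks evenly. Returns the number of pages actually released. */
static int toy_shrink_pool(struct toy_free_state *s, int count)
{
    int freed = 0;
    int misses = 0;             /* consecutive nodes that had nothing to free */

    while (freed < count && misses < TOY_NODES) {
        int nid = s->next_nid_to_free;

        s->next_nid_to_free = (nid + 1) % TOY_NODES;
        if (s->free_pages[nid] > 0) {
            s->free_pages[nid]--;
            freed++;
            misses = 0;
        } else {
            misses++;           /* skip the empty node, try the next one */
        }
    }
    return freed;
}
```

Starting from per-node counts of {3, 0, 2, 1}, freeing four pages in this
model leaves {1, 0, 1, 0}: node 0 gives up two pages and nodes 2 and 3 one
each, rather than node 0 being drained first.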
> Together, these changes don't provide the fine grain of control
> that per node attributes would.
I'm failing to understand at the moment why mem policies set by numactl
would not do the job for allocation at least. Freeing is a different problem.
> Specifically, there is no easy
> way to reduce the persistent huge page count for a specific node.
> I think the degree of control provided by these patches is the
> minimal necessary and sufficient for managing the persistent
> huge page pool. However, with a bit more reorganization, we
> could implement per node controls if others would find that
> useful.
>
> For more info, see the patch descriptions and the updated kernel
> hugepages documentation.
>
--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab