From: Nishanth Aravamudan <nacc@us.ibm.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Mel Gorman <mel@skynet.ie>, Christoph Lameter <clameter@sgi.com>,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, rientjes@google.com,
kamezawa.hiroyu@jp.fujitsu.com
Subject: Re: [PATCH 6/6] Use one zonelist that is filtered by nodemask
Date: Fri, 9 Nov 2007 10:14:55 -0800
Message-ID: <20071109181455.GH7507@us.ibm.com>
In-Reply-To: <1194628732.5296.14.camel@localhost>

On 09.11.2007 [12:18:52 -0500], Lee Schermerhorn wrote:
> On Fri, 2007-11-09 at 08:45 -0800, Nishanth Aravamudan wrote:
> > On 09.11.2007 [16:14:55 +0000], Mel Gorman wrote:
> > > On (09/11/07 07:45), Christoph Lameter didst pronounce:
> > > > On Fri, 9 Nov 2007, Mel Gorman wrote:
> > > >
> > > > > struct page * fastcall
> > > > > __alloc_pages(gfp_t gfp_mask, unsigned int order,
> > > > > struct zonelist *zonelist)
> > > > > {
> > > > > + /*
> > > > > + * Use a temporary nodemask for __GFP_THISNODE allocations. If the
> > > > > + * cost of allocating on the stack or the stack usage becomes
> > > > > + * noticeable, allocate the nodemasks per node at boot or compile time
> > > > > + */
> > > > > + if (unlikely(gfp_mask & __GFP_THISNODE)) {
> > > > > + nodemask_t nodemask;
> > > >
> > > > Hmmm.. This places a potentially big structure on the stack. nodemask can
> > > > contain up to 1024 bits which means 128 bytes. Maybe keep an array of
> > > > gfp_thisnode nodemasks (node_nodemask?) and use node_nodemask[nid]?
> > > >
> > >
> > > That is what I was hinting at in the comment as a possible solution.
> > >
> > > > > +
> > > > > + return __alloc_pages_internal(gfp_mask, order,
> > > > > + zonelist, nodemask_thisnode(numa_node_id(), &nodemask));
> > > >
> > > > Argh.... GFP_THISNODE must use the nid passed to alloc_pages_node()
> > > > and *not* the local numa node id. Only if the node specified to
> > > > alloc_pages_node() is -1 will this work.
> > > >
> > >
> > > alloc_pages_node() calls __alloc_pages_nodemask(), though, whereas this
> > > function, if I'm reading it right, is called without a node id. Given no
> > > other details on the nid, the current one seemed a logical choice.
> >
> > Yeah, I guess the context here matters (and is a little hard to follow
> > because there are a few places that change in different ways here):
> >
> > For allocating pages from a particular node (GFP_THISNODE with nid),
> > the nid clearly must be specified. This only happens with
> > alloc_pages_node(), AFAICT. So, in that interface, the right thing is
> > done and the appropriate nodemask will be built.
>
> I agree. In an earlier patch, Mel was ignoring nid and using
> numa_node_id() here. This was causing your [Nish's] hugetlb pool
> allocation patches to fail. Mel fixed that ~9oct07.

Yep, and that's why I'm on the Cc, I think :)

> > On the other hand, if we call alloc_pages() with GFP_THISNODE set, there
> > is no nid to base the allocation on, so we "fallback" to numa_node_id()
> > [ almost like the nid had been specified as -1 ].
> >
> > So I guess this is logical -- but I wonder, do we have any callers of
> > alloc_pages(GFP_THISNODE) ? It seems like an odd thing to do, when
> > alloc_pages_node() exists?
>
> I don't know if we have any current callers that do this, but absent any
> documentation specifying otherwise, Mel's implementation matches what
> I'd expect the behavior to be if I DID call alloc_pages with 'THISNODE.
> However, we could specify that THISNODE is ignored in __alloc_pages()
> and recommend the use of alloc_pages_node() passing numa_node_id() as
> the nid parameter to achieve the behavior. This would eliminate the
> check for 'THISNODE in __alloc_pages(). Just mask it off before calling
> down to __alloc_pages_internal().
>
> Does this make sense?

The caller could also just pass -1 as the nid, since alloc_pages_node()
should then substitute numa_node_id() on the caller's behalf... But I
agree, there is no documentation saying GFP_THISNODE is *not* allowed
for alloc_pages(), so we should probably handle it.
-Nish
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center