From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail138.messagelabs.com (mail138.messagelabs.com [216.82.249.35]) by kanga.kvack.org (Postfix) with SMTP id 2021F6B01FC for ; Fri, 16 Apr 2010 11:03:04 -0400 (EDT) Received: by pvg11 with SMTP id 11so1670879pvg.14 for ; Fri, 16 Apr 2010 08:03:02 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20100416111539.GC19264@csn.ul.ie> References: <1270522777-9216-1-git-send-email-lliubbo@gmail.com> <20100413083855.GS25756@csn.ul.ie> <20100416111539.GC19264@csn.ul.ie> Date: Fri, 16 Apr 2010 23:03:02 +0800 Message-ID: Subject: Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page From: Bob Liu Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org To: Mel Gorman Cc: kamezawa.hiroyu@jp.fujitsu.com, minchan.kim@gmail.com, akpm@linux-foundation.org, linux-mm@kvack.org, andi@firstfloor.org, rientjes@google.com, lee.schermerhorn@hp.com, cl@linux-foundation.org List-ID: On Fri, Apr 16, 2010 at 7:15 PM, Mel Gorman wrote: > On Tue, Apr 13, 2010 at 10:28:35PM +0800, Bob Liu wrote: >> On Tue, Apr 13, 2010 at 4:38 PM, Mel Gorman wrote: >> > On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote: >> >> On 4/6/10, Bob Liu wrote: >> >> > In funtion migrate_pages(), if the dest node have no >> >> > enough free pages,it will fallback to other nodes. >> >> > Add GFP_THISNODE to avoid this, the same as what >> >> > funtion new_page_node() do in migrate.c. >> >> > >> >> > Signed-off-by: Bob Liu >> >> > --- >> >> > =C2=A0mm/mempolicy.c | =C2=A0 =C2=A03 ++- >> >> > =C2=A01 files changed, 2 insertions(+), 1 deletions(-) >> >> > >> >> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c >> >> > index 08f40a2..fc5ddf5 100644 >> >> > --- a/mm/mempolicy.c >> >> > +++ b/mm/mempolicy.c >> >> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,= struct list_head *pagelist, >> >> > >> >> > =C2=A0static struct page *new_node_page(struct page *page, unsigned= long node, int **x) >> >> > =C2=A0{ >> >> > - =C2=A0 =C2=A0 =C2=A0 return alloc_pages_exact_node(node, GFP_HIGH= USER_MOVABLE, 0); >> >> > + =C2=A0 =C2=A0 =C2=A0 return alloc_pages_exact_node(node, >> >> > + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 GFP_HIGHUSER_MOVABLE | GFP_THISNO= DE, 0); >> >> > =C2=A0} >> >> > >> >> >> >> Hi, Minchan and Kame >> >> =C2=A0 =C2=A0 =C2=A0Would you please add ack or review to this thread= . It's BUGFIX >> >> and not change, so i don't resend one. >> >> >> > >> > Sorry for taking so long to get around to this thread. I talked on thi= s >> > patch already but it's in another thread. Here is what I said there >> > >> > =3D=3D=3D=3D >> > This appears to be a valid bug fix. =C2=A0I agree that the way things = are structured >> > that __GFP_THISNODE should be used in new_node_page(). But maybe a fol= low-on >> > patch is also required. The behaviour is now; >> > >> > o new_node_page will not return NULL if the target node is empty (fine= ). >> > o migrate_pages will translate this into -ENOMEM (fine) >> > o do_migrate_pages breaks early if it gets -ENOMEM ? >> > >> > It's the last part I'd like you to double check. migrate_pages() takes= a >> > nodemask of allowed nodes to migrate to. Rather than sending this down >> > to the allocator, it iterates over the nodes allowed in the mask. If o= ne >> > of those nodes is full, it returns -ENOMEM. >> > >> > If -ENOMEM is returned from migrate_pages, should it not move to the >> > next node? >> > =3D=3D=3D=3D >> >> Hm.I think early return is ok but not sure about it :) >> >> As Christoph said >> "The intended semantic is the preservation of the relative position of t= he >> page to the beginning of the node set." >> "F.e. if you use page >> migration (or cpuset automigration) to shift an application running on 1= 0 >> nodes up by two nodes to make a hole that would allow you to run another >> application on the lower nodes. Applications place pages intentionally o= n >> certain nodes to be able to manage memory distances." >> >> If move to the next node instead of early return, the relative position = of the >> page to the beginning of the node set will be break; >> > > Yeah, but the user requested that a number of nodes to be used. Sure, if = the > first node is not free, a page will be allocated on the second node inste= ad > but is that not what was requested? If the user wanted one and only one > node to be used, they wouldn't have specified multiple nodes. I'm not > convinced an early return is what was intended here. > Hmm. What about this change? If the from_nodes and to_nodes' weight is different= , then we can don't preserv of the relative position of the page to the begin= ning of the node set. This case if a page allocation from the dest node failed, it will be allocated from the next node instead of early return. diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 08f40a2..094d092 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist, static struct page *new_node_page(struct page *page, unsigned long node, int **x) { - return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0); + return alloc_pages_exact_node(node, + GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0); } /* @@ -945,10 +946,26 @@ int do_migrate_pages(struct mm_struct *mm, node_clear(source, tmp); err =3D migrate_to_node(mm, source, dest, flags); +retry: if (err > 0) busy +=3D err; - if (err < 0) + if (err < 0) { + /* + * If the dest node have no enough memory, and from_nodes + * to_nodes have no equal weight(don't need protect offset.) + * try to migrate to next node. + */ + if((nodes_weight(*from_nodes) !=3D nodes_weight(*to_nodes)) + && (err =3D=3D -ENOMEM)) { + for_each_node_mask(s, *to_nodes) { + if(s !=3D dest) { + err =3D migrate_to_node(mm, source, s, flags); + goto retry; + } + } + } break; + } Thanks ! --=20 Regards, --Bob -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org