linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mempolicy:add GFP_THISNODE when allocing new page
@ 2010-04-06  2:59 Bob Liu
  2010-04-06  4:33 ` Minchan Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Bob Liu @ 2010-04-06  2:59 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, andi, rientjes, lee.schermerhorn, Bob Liu

In funtion migrate_pages(), if the dest node have no
enough free pages,it will fallback to other nodes.
Add GFP_THISNODE to avoid this, the same as what
funtion new_page_node() do in migrate.c.

Signed-off-by: Bob Liu <lliubbo@gmail.com>
---
 mm/mempolicy.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 08f40a2..fc5ddf5 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
 
 static struct page *new_node_page(struct page *page, unsigned long node, int **x)
 {
-	return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+	return alloc_pages_exact_node(node,
+				GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
 }
 
 /*
-- 
1.5.6.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  2:59 [PATCH] mempolicy:add GFP_THISNODE when allocing new page Bob Liu
@ 2010-04-06  4:33 ` Minchan Kim
  2010-04-06  4:56   ` Bob Liu
  2010-04-13  8:19 ` KAMEZAWA Hiroyuki
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Minchan Kim @ 2010-04-06  4:33 UTC (permalink / raw)
  To: Bob Liu; +Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn, Mel Gorman

On Tue, Apr 6, 2010 at 11:59 AM, Bob Liu <lliubbo@gmail.com> wrote:
> In funtion migrate_pages(), if the dest node have no
> enough free pages,it will fallback to other nodes.
> Add GFP_THISNODE to avoid this, the same as what
> funtion new_page_node() do in migrate.c.
>
> Signed-off-by: Bob Liu <lliubbo@gmail.com>

Yes. It can be fixed. but I have a different concern.

I looked at 6484eb3e2a81807722c5f28ef.
"   page allocator: do not check NUMA node ID when the caller knows
the node is valid

   Callers of alloc_pages_node() can optionally specify -1 as a node to mean
   "allocate from the current node".  However, a number of the callers in
   fast paths know for a fact their node is valid.  To avoid a comparison and
   branch, this patch adds alloc_pages_exact_node() that only checks the nid
   with VM_BUG_ON().  Callers that know their node is valid are then
   converted."

alloc_pages_exact_node's naming would be not good.
It is not for allocate page from exact node but just for
removing check of node's valid.
Some people like me who is poor english could misunderstood it.

How about changing name with following?
/* This function can allocate page to fallback list of node*/
alloc_pages_by_nodeid(...)

And instead of it, let's change alloc_pages_exact_node with following.
static inline struct page *alloc_pages_exact_node(...)
{
 VM_BUG_ON ..
 return __alloc_pages(gfp_mask|__GFP_THISNODE...);
}

I think it's more clear than old.
What do you think about it?

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  4:33 ` Minchan Kim
@ 2010-04-06  4:56   ` Bob Liu
  2010-04-06  5:06     ` Minchan Kim
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-06  4:56 UTC (permalink / raw)
  To: Minchan Kim; +Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn, Mel Gorman

On 4/6/10, Minchan Kim <minchan.kim@gmail.com> wrote:
> On Tue, Apr 6, 2010 at 11:59 AM, Bob Liu <lliubbo@gmail.com> wrote:
>> In funtion migrate_pages(), if the dest node have no
>> enough free pages,it will fallback to other nodes.
>> Add GFP_THISNODE to avoid this, the same as what
>> funtion new_page_node() do in migrate.c.
>>
>> Signed-off-by: Bob Liu <lliubbo@gmail.com>
>
> Yes. It can be fixed. but I have a different concern.
>
> I looked at 6484eb3e2a81807722c5f28ef.
> "   page allocator: do not check NUMA node ID when the caller knows
> the node is valid
>
>    Callers of alloc_pages_node() can optionally specify -1 as a node to mean
>    "allocate from the current node".  However, a number of the callers in
>    fast paths know for a fact their node is valid.  To avoid a comparison
> and
>    branch, this patch adds alloc_pages_exact_node() that only checks the nid
>    with VM_BUG_ON().  Callers that know their node is valid are then
>    converted."
>
> alloc_pages_exact_node's naming would be not good.
> It is not for allocate page from exact node but just for
> removing check of node's valid.
> Some people like me who is poor english could misunderstood it.
>
> How about changing name with following?
> /* This function can allocate page to fallback list of node*/
> alloc_pages_by_nodeid(...)
>
> And instead of it, let's change alloc_pages_exact_node with following.
> static inline struct page *alloc_pages_exact_node(...)
> {
>  VM_BUG_ON ..
>  return __alloc_pages(gfp_mask|__GFP_THISNODE...);
> }
>
> I think it's more clear than old.
> What do you think about it?
>
Hm.. I agree with you, I was also misunderstanding by the name.
But let's still waiting for some other reply.

By the way, what about your opinion using GFP_THISNODE or
__GFP_THISNODE in __alloc_pages().
I think GFP_THISNODE is ok.

Thanks.
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  4:56   ` Bob Liu
@ 2010-04-06  5:06     ` Minchan Kim
  0 siblings, 0 replies; 21+ messages in thread
From: Minchan Kim @ 2010-04-06  5:06 UTC (permalink / raw)
  To: Bob Liu; +Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn, Mel Gorman

On Tue, Apr 6, 2010 at 1:56 PM, Bob Liu <lliubbo@gmail.com> wrote:
> On 4/6/10, Minchan Kim <minchan.kim@gmail.com> wrote:
>> On Tue, Apr 6, 2010 at 11:59 AM, Bob Liu <lliubbo@gmail.com> wrote:
>>> In funtion migrate_pages(), if the dest node have no
>>> enough free pages,it will fallback to other nodes.
>>> Add GFP_THISNODE to avoid this, the same as what
>>> funtion new_page_node() do in migrate.c.
>>>
>>> Signed-off-by: Bob Liu <lliubbo@gmail.com>
>>
>> Yes. It can be fixed. but I have a different concern.
>>
>> I looked at 6484eb3e2a81807722c5f28ef.
>> "   page allocator: do not check NUMA node ID when the caller knows
>> the node is valid
>>
>>    Callers of alloc_pages_node() can optionally specify -1 as a node to mean
>>    "allocate from the current node".  However, a number of the callers in
>>    fast paths know for a fact their node is valid.  To avoid a comparison
>> and
>>    branch, this patch adds alloc_pages_exact_node() that only checks the nid
>>    with VM_BUG_ON().  Callers that know their node is valid are then
>>    converted."
>>
>> alloc_pages_exact_node's naming would be not good.
>> It is not for allocate page from exact node but just for
>> removing check of node's valid.
>> Some people like me who is poor english could misunderstood it.
>>
>> How about changing name with following?
>> /* This function can allocate page to fallback list of node*/
>> alloc_pages_by_nodeid(...)
>>
>> And instead of it, let's change alloc_pages_exact_node with following.
>> static inline struct page *alloc_pages_exact_node(...)
>> {
>>  VM_BUG_ON ..
>>  return __alloc_pages(gfp_mask|__GFP_THISNODE...);
>> }
>>
>> I think it's more clear than old.
>> What do you think about it?
>>
> Hm.. I agree with you, I was also misunderstanding by the name.
> But let's still waiting for some other reply.
>
> By the way, what about your opinion using GFP_THISNODE or
> __GFP_THISNODE in __alloc_pages().
> I think GFP_THISNODE is ok.
>

Yes. It would be good except alloc_fresh_huge_page_node.
-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  2:59 [PATCH] mempolicy:add GFP_THISNODE when allocing new page Bob Liu
  2010-04-06  4:33 ` Minchan Kim
@ 2010-04-13  8:19 ` KAMEZAWA Hiroyuki
  2010-04-13  8:20 ` Bob Liu
  2010-04-13  8:27 ` Minchan Kim
  3 siblings, 0 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-04-13  8:19 UTC (permalink / raw)
  To: Bob Liu; +Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn

On Tue,  6 Apr 2010 10:59:37 +0800
Bob Liu <lliubbo@gmail.com> wrote:

> In funtion migrate_pages(), if the dest node have no
> enough free pages,it will fallback to other nodes.
> Add GFP_THISNODE to avoid this, the same as what
> funtion new_page_node() do in migrate.c.
> 
> Signed-off-by: Bob Liu <lliubbo@gmail.com>

Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


> ---
>  mm/mempolicy.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 08f40a2..fc5ddf5 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
>  
>  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
>  {
> -	return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
> +	return alloc_pages_exact_node(node,
> +				GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>  }
>  
>  /*
> -- 
> 1.5.6.3
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  2:59 [PATCH] mempolicy:add GFP_THISNODE when allocing new page Bob Liu
  2010-04-06  4:33 ` Minchan Kim
  2010-04-13  8:19 ` KAMEZAWA Hiroyuki
@ 2010-04-13  8:20 ` Bob Liu
  2010-04-13  8:38   ` Mel Gorman
  2010-04-13  8:27 ` Minchan Kim
  3 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-13  8:20 UTC (permalink / raw)
  To: mel, kamezawa.hiroyu, minchan.kim
  Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn, Bob Liu

On 4/6/10, Bob Liu <lliubbo@gmail.com> wrote:
> In funtion migrate_pages(), if the dest node have no
> enough free pages,it will fallback to other nodes.
> Add GFP_THISNODE to avoid this, the same as what
> funtion new_page_node() do in migrate.c.
>
> Signed-off-by: Bob Liu <lliubbo@gmail.com>
> ---
>  mm/mempolicy.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 08f40a2..fc5ddf5 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
>
>  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
>  {
> -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
> +       return alloc_pages_exact_node(node,
> +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>  }
>

Hi, Minchan and Kame
     Would you please add ack or review to this thread. It's BUGFIX
and not change, so i don't resend one.

     About code clean, there should be some new CLEANUP patches or
just don't make any changes decided after we finish before
discussions.

Thanks!
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-06  2:59 [PATCH] mempolicy:add GFP_THISNODE when allocing new page Bob Liu
                   ` (2 preceding siblings ...)
  2010-04-13  8:20 ` Bob Liu
@ 2010-04-13  8:27 ` Minchan Kim
  3 siblings, 0 replies; 21+ messages in thread
From: Minchan Kim @ 2010-04-13  8:27 UTC (permalink / raw)
  To: Bob Liu; +Cc: akpm, linux-mm, andi, rientjes, lee.schermerhorn

On Tue, Apr 6, 2010 at 11:59 AM, Bob Liu <lliubbo@gmail.com> wrote:
> In funtion migrate_pages(), if the dest node have no
> enough free pages,it will fallback to other nodes.
> Add GFP_THISNODE to avoid this, the same as what
> funtion new_page_node() do in migrate.c.
>
> Signed-off-by: Bob Liu <lliubbo@gmail.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

Thanks, Bob.


-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-13  8:20 ` Bob Liu
@ 2010-04-13  8:38   ` Mel Gorman
  2010-04-13 14:28     ` Bob Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Mel Gorman @ 2010-04-13  8:38 UTC (permalink / raw)
  To: Bob Liu
  Cc: kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi, rientjes,
	lee.schermerhorn

On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote:
> On 4/6/10, Bob Liu <lliubbo@gmail.com> wrote:
> > In funtion migrate_pages(), if the dest node have no
> > enough free pages,it will fallback to other nodes.
> > Add GFP_THISNODE to avoid this, the same as what
> > funtion new_page_node() do in migrate.c.
> >
> > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > ---
> >  mm/mempolicy.c |    3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 08f40a2..fc5ddf5 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
> >
> >  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
> >  {
> > -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
> > +       return alloc_pages_exact_node(node,
> > +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
> >  }
> >
> 
> Hi, Minchan and Kame
>      Would you please add ack or review to this thread. It's BUGFIX
> and not change, so i don't resend one.
> 

Sorry for taking so long to get around to this thread. I talked on this
patch already but it's in another thread. Here is what I said there

====
This appears to be a valid bug fix.  I agree that the way things are structured
that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
patch is also required. The behaviour is now;

o new_node_page will not return NULL if the target node is empty (fine).
o migrate_pages will translate this into -ENOMEM (fine)
o do_migrate_pages breaks early if it gets -ENOMEM ?

It's the last part I'd like you to double check. migrate_pages() takes a
nodemask of allowed nodes to migrate to. Rather than sending this down
to the allocator, it iterates over the nodes allowed in the mask. If one
of those nodes is full, it returns -ENOMEM.

If -ENOMEM is returned from migrate_pages, should it not move to the
next node?
====

My concern before acking this patch is that the function might be exiting
too early when given a set of nodes. Granted, because __GFP_THISNODE is not
specified, it's perfectly possible that migration is currently moving pages
to the wrong node which is also very bad.

>      About code clean, there should be some new CLEANUP patches or
> just don't make any changes decided after we finish before
> discussions.
> 

Cleanup patches can be sent separately. I might be biased against a function
rename but the bugfix is more important.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-13  8:38   ` Mel Gorman
@ 2010-04-13 14:28     ` Bob Liu
  2010-04-16  0:41       ` Christoph Lameter
  2010-04-16 11:15       ` Mel Gorman
  0 siblings, 2 replies; 21+ messages in thread
From: Bob Liu @ 2010-04-13 14:28 UTC (permalink / raw)
  To: Mel Gorman
  Cc: kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi, rientjes,
	lee.schermerhorn, cl

On Tue, Apr 13, 2010 at 4:38 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote:
>> On 4/6/10, Bob Liu <lliubbo@gmail.com> wrote:
>> > In funtion migrate_pages(), if the dest node have no
>> > enough free pages,it will fallback to other nodes.
>> > Add GFP_THISNODE to avoid this, the same as what
>> > funtion new_page_node() do in migrate.c.
>> >
>> > Signed-off-by: Bob Liu <lliubbo@gmail.com>
>> > ---
>> >  mm/mempolicy.c |    3 ++-
>> >  1 files changed, 2 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> > index 08f40a2..fc5ddf5 100644
>> > --- a/mm/mempolicy.c
>> > +++ b/mm/mempolicy.c
>> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
>> >
>> >  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
>> >  {
>> > -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
>> > +       return alloc_pages_exact_node(node,
>> > +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>> >  }
>> >
>>
>> Hi, Minchan and Kame
>>      Would you please add ack or review to this thread. It's BUGFIX
>> and not change, so i don't resend one.
>>
>
> Sorry for taking so long to get around to this thread. I talked on this
> patch already but it's in another thread. Here is what I said there
>
> ====
> This appears to be a valid bug fix.  I agree that the way things are structured
> that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
> patch is also required. The behaviour is now;
>
> o new_node_page will not return NULL if the target node is empty (fine).
> o migrate_pages will translate this into -ENOMEM (fine)
> o do_migrate_pages breaks early if it gets -ENOMEM ?
>
> It's the last part I'd like you to double check. migrate_pages() takes a
> nodemask of allowed nodes to migrate to. Rather than sending this down
> to the allocator, it iterates over the nodes allowed in the mask. If one
> of those nodes is full, it returns -ENOMEM.
>
> If -ENOMEM is returned from migrate_pages, should it not move to the
> next node?
> ====

Hm.I think early return is ok but not sure about it :)

As Christoph said
"The intended semantic is the preservation of the relative position of the
page to the beginning of the node set."
"F.e. if you use page
migration (or cpuset automigration) to shift an application running on 10
nodes up by two nodes to make a hole that would allow you to run another
application on the lower nodes. Applications place pages intentionally on
certain nodes to be able to manage memory distances."

If move to the next node instead of early return, the relative position of the
page to the beginning of the node set will be break;

(BTW:I am still not very clear about the preservation of the relative
position of the
page to the beginning of the node set. I think if the user call
migrate_pages() with
different count of src and dest nodes, the  relative position will also break.
eg. if call migrate_pags() from nodes is node(1,2,3) , dest nodes is
just node(3).
the current code logical will move pages in node 1, 2 to node 3. this case the
relative position is breaked).

Add Christoph  to cc.

>
> My concern before acking this patch is that the function might be exiting
> too early when given a set of nodes. Granted, because __GFP_THISNODE is not
> specified, it's perfectly possible that migration is currently moving pages
> to the wrong node which is also very bad.
>
>>      About code clean, there should be some new CLEANUP patches or
>> just don't make any changes decided after we finish before
>> discussions.
>>
>
> Cleanup patches can be sent separately. I might be biased against a function
> rename but the bugfix is more important.
>

Thanks!

-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-13 14:28     ` Bob Liu
@ 2010-04-16  0:41       ` Christoph Lameter
  2010-04-16  1:02         ` Bob Liu
  2010-04-16 11:15       ` Mel Gorman
  1 sibling, 1 reply; 21+ messages in thread
From: Christoph Lameter @ 2010-04-16  0:41 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Tue, 13 Apr 2010, Bob Liu wrote:

> If move to the next node instead of early return, the relative position of the
> page to the beginning of the node set will be break;

Right.

> (BTW:I am still not very clear about the preservation of the relative
> position of the
> page to the beginning of the node set. I think if the user call
> migrate_pages() with
> different count of src and dest nodes, the  relative position will also break.
> eg. if call migrate_pags() from nodes is node(1,2,3) , dest nodes is
> just node(3).
> the current code logical will move pages in node 1, 2 to node 3. this case the
> relative position is breaked).

But in that case the user has specified that the set of nodes should be
compacted during migration and therefore requested what ocurred.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-16  0:41       ` Christoph Lameter
@ 2010-04-16  1:02         ` Bob Liu
  2010-04-16 16:02           ` Christoph Lameter
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-16  1:02 UTC (permalink / raw)
  To: Christoph Lameter, Mel Gorman
  Cc: kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi, rientjes,
	lee.schermerhorn

On Fri, Apr 16, 2010 at 8:41 AM, Christoph Lameter
<cl@linux-foundation.org> wrote:
> On Tue, 13 Apr 2010, Bob Liu wrote:
>
>> If move to the next node instead of early return, the relative position of the
>> page to the beginning of the node set will be break;
>
> Right.
>

Thanks!
Then would you please acking this patch?  So as mel.

>> (BTW:I am still not very clear about the preservation of the relative
>> position of the
>> page to the beginning of the node set. I think if the user call
>> migrate_pages() with
>> different count of src and dest nodes, the  relative position will also break.
>> eg. if call migrate_pags() from nodes is node(1,2,3) , dest nodes is
>> just node(3).
>> the current code logical will move pages in node 1, 2 to node 3. this case the
>> relative position is breaked).
>
> But in that case the user has specified that the set of nodes should be
> compacted during migration and therefore requested what ocurred.
>

-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-13 14:28     ` Bob Liu
  2010-04-16  0:41       ` Christoph Lameter
@ 2010-04-16 11:15       ` Mel Gorman
  2010-04-16 15:03         ` Bob Liu
  1 sibling, 1 reply; 21+ messages in thread
From: Mel Gorman @ 2010-04-16 11:15 UTC (permalink / raw)
  To: Bob Liu
  Cc: kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi, rientjes,
	lee.schermerhorn, cl

On Tue, Apr 13, 2010 at 10:28:35PM +0800, Bob Liu wrote:
> On Tue, Apr 13, 2010 at 4:38 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> > On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote:
> >> On 4/6/10, Bob Liu <lliubbo@gmail.com> wrote:
> >> > In funtion migrate_pages(), if the dest node have no
> >> > enough free pages,it will fallback to other nodes.
> >> > Add GFP_THISNODE to avoid this, the same as what
> >> > funtion new_page_node() do in migrate.c.
> >> >
> >> > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> >> > ---
> >> >  mm/mempolicy.c |    3 ++-
> >> >  1 files changed, 2 insertions(+), 1 deletions(-)
> >> >
> >> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> >> > index 08f40a2..fc5ddf5 100644
> >> > --- a/mm/mempolicy.c
> >> > +++ b/mm/mempolicy.c
> >> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
> >> >
> >> >  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
> >> >  {
> >> > -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
> >> > +       return alloc_pages_exact_node(node,
> >> > +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
> >> >  }
> >> >
> >>
> >> Hi, Minchan and Kame
> >>      Would you please add ack or review to this thread. It's BUGFIX
> >> and not change, so i don't resend one.
> >>
> >
> > Sorry for taking so long to get around to this thread. I talked on this
> > patch already but it's in another thread. Here is what I said there
> >
> > ====
> > This appears to be a valid bug fix.  I agree that the way things are structured
> > that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
> > patch is also required. The behaviour is now;
> >
> > o new_node_page will not return NULL if the target node is empty (fine).
> > o migrate_pages will translate this into -ENOMEM (fine)
> > o do_migrate_pages breaks early if it gets -ENOMEM ?
> >
> > It's the last part I'd like you to double check. migrate_pages() takes a
> > nodemask of allowed nodes to migrate to. Rather than sending this down
> > to the allocator, it iterates over the nodes allowed in the mask. If one
> > of those nodes is full, it returns -ENOMEM.
> >
> > If -ENOMEM is returned from migrate_pages, should it not move to the
> > next node?
> > ====
> 
> Hm.I think early return is ok but not sure about it :)
> 
> As Christoph said
> "The intended semantic is the preservation of the relative position of the
> page to the beginning of the node set."
> "F.e. if you use page
> migration (or cpuset automigration) to shift an application running on 10
> nodes up by two nodes to make a hole that would allow you to run another
> application on the lower nodes. Applications place pages intentionally on
> certain nodes to be able to manage memory distances."
> 
> If move to the next node instead of early return, the relative position of the
> page to the beginning of the node set will be break;
> 

Yeah, but the user requested that a number of nodes to be used. Sure, if the
first node is not free, a page will be allocated on the second node instead
but is that not what was requested? If the user wanted one and only one
node to be used, they wouldn't have specified multiple nodes. I'm not
convinced an early return is what was intended here.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-16 11:15       ` Mel Gorman
@ 2010-04-16 15:03         ` Bob Liu
  2010-04-16 15:55           ` Christoph Lameter
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-16 15:03 UTC (permalink / raw)
  To: Mel Gorman
  Cc: kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi, rientjes,
	lee.schermerhorn, cl

On Fri, Apr 16, 2010 at 7:15 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Tue, Apr 13, 2010 at 10:28:35PM +0800, Bob Liu wrote:
>> On Tue, Apr 13, 2010 at 4:38 PM, Mel Gorman <mel@csn.ul.ie> wrote:
>> > On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote:
>> >> On 4/6/10, Bob Liu <lliubbo@gmail.com> wrote:
>> >> > In funtion migrate_pages(), if the dest node have no
>> >> > enough free pages,it will fallback to other nodes.
>> >> > Add GFP_THISNODE to avoid this, the same as what
>> >> > funtion new_page_node() do in migrate.c.
>> >> >
>> >> > Signed-off-by: Bob Liu <lliubbo@gmail.com>
>> >> > ---
>> >> >  mm/mempolicy.c |    3 ++-
>> >> >  1 files changed, 2 insertions(+), 1 deletions(-)
>> >> >
>> >> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> >> > index 08f40a2..fc5ddf5 100644
>> >> > --- a/mm/mempolicy.c
>> >> > +++ b/mm/mempolicy.c
>> >> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
>> >> >
>> >> >  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
>> >> >  {
>> >> > -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
>> >> > +       return alloc_pages_exact_node(node,
>> >> > +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>> >> >  }
>> >> >
>> >>
>> >> Hi, Minchan and Kame
>> >>      Would you please add ack or review to this thread. It's BUGFIX
>> >> and not change, so i don't resend one.
>> >>
>> >
>> > Sorry for taking so long to get around to this thread. I talked on this
>> > patch already but it's in another thread. Here is what I said there
>> >
>> > ====
>> > This appears to be a valid bug fix.  I agree that the way things are structured
>> > that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
>> > patch is also required. The behaviour is now;
>> >
>> > o new_node_page will not return NULL if the target node is empty (fine).
>> > o migrate_pages will translate this into -ENOMEM (fine)
>> > o do_migrate_pages breaks early if it gets -ENOMEM ?
>> >
>> > It's the last part I'd like you to double check. migrate_pages() takes a
>> > nodemask of allowed nodes to migrate to. Rather than sending this down
>> > to the allocator, it iterates over the nodes allowed in the mask. If one
>> > of those nodes is full, it returns -ENOMEM.
>> >
>> > If -ENOMEM is returned from migrate_pages, should it not move to the
>> > next node?
>> > ====
>>
>> Hm.I think early return is ok but not sure about it :)
>>
>> As Christoph said
>> "The intended semantic is the preservation of the relative position of the
>> page to the beginning of the node set."
>> "F.e. if you use page
>> migration (or cpuset automigration) to shift an application running on 10
>> nodes up by two nodes to make a hole that would allow you to run another
>> application on the lower nodes. Applications place pages intentionally on
>> certain nodes to be able to manage memory distances."
>>
>> If move to the next node instead of early return, the relative position of the
>> page to the beginning of the node set will be break;
>>
>
> Yeah, but the user requested that a number of nodes to be used. Sure, if the
> first node is not free, a page will be allocated on the second node instead
> but is that not what was requested? If the user wanted one and only one
> node to be used, they wouldn't have specified multiple nodes. I'm not
> convinced an early return is what was intended here.
>

Hmm.
What about this change? If the from_nodes and to_nodes' weight is different,
then we can don't preserv of the relative position of the page to the beginning
of the node set. This case if a page allocation from the dest node
failed, it will
be allocated from the next node instead of early return.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 08f40a2..094d092 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
struct list_head *pagelist,

 static struct page *new_node_page(struct page *page, unsigned long
node, int **x)
 {
-       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+       return alloc_pages_exact_node(node,
+                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
 }

 /*
@@ -945,10 +946,26 @@ int do_migrate_pages(struct mm_struct *mm,

                node_clear(source, tmp);
                err = migrate_to_node(mm, source, dest, flags);
+retry:
                if (err > 0)
                        busy += err;
-               if (err < 0)
+               if (err < 0) {
+                       /*
+                        * If the dest node have no enough memory, and
from_nodes
+                        * to_nodes have no equal weight(don't need
protect offset.)
+                        * try to migrate to next node.
+                        */
+                       if((nodes_weight(*from_nodes) !=
nodes_weight(*to_nodes))
+                               && (err == -ENOMEM)) {
+                               for_each_node_mask(s, *to_nodes) {
+                                       if(s != dest) {
+                                               err =
migrate_to_node(mm, source, s, flags);
+                                               goto retry;
+                                       }
+                               }
+                       }
                        break;
+               }

Thanks !
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-16 15:03         ` Bob Liu
@ 2010-04-16 15:55           ` Christoph Lameter
  2010-04-17 13:54             ` Bob Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Lameter @ 2010-04-16 15:55 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Fri, 16 Apr 2010, Bob Liu wrote:

> Hmm.
> What about this change? If the from_nodes and to_nodes' weight is different,
> then we can don't preserv of the relative position of the page to the beginning
> of the node set. This case if a page allocation from the dest node
> failed, it will
> be allocated from the next node instead of early return.

Understand what you are doing first. The fallback is already there.

> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 08f40a2..094d092 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
> struct list_head *pagelist,
>
>  static struct page *new_node_page(struct page *page, unsigned long
> node, int **x)
>  {
> -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
> +       return alloc_pages_exact_node(node,
> +                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);

You eliminate falling back to the next node?

GFP_THISNODE forces allocation from the node. Without it we will fallback.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-16  1:02         ` Bob Liu
@ 2010-04-16 16:02           ` Christoph Lameter
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Lameter @ 2010-04-16 16:02 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Fri, 16 Apr 2010, Bob Liu wrote:

> >> If move to the next node instead of early return, the relative position of the
> >> page to the beginning of the node set will be break;
> >
> > Right.
> >
>
> Thanks!
> Then would you please acking this patch?  So as mel.

Which patch? Could you clarify what you are trying to do?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-16 15:55           ` Christoph Lameter
@ 2010-04-17 13:54             ` Bob Liu
  2010-04-19 17:47               ` Christoph Lameter
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-17 13:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Fri, Apr 16, 2010 at 11:55 PM, Christoph Lameter
<cl@linux-foundation.org> wrote:
> On Fri, 16 Apr 2010, Bob Liu wrote:
>
>> Hmm.
>> What about this change? If the from_nodes and to_nodes' weight is different,
>> then we can don't preserv of the relative position of the page to the beginning
>> of the node set. This case if a page allocation from the dest node
>> failed, it will
>> be allocated from the next node instead of early return.
>
> Understand what you are doing first. The fallback is already there.
>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 08f40a2..094d092 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
>> struct list_head *pagelist,
>>
>>  static struct page *new_node_page(struct page *page, unsigned long
>> node, int **x)
>>  {
>> -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
>> +       return alloc_pages_exact_node(node,
>> +                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>
> You eliminate falling back to the next node?
>
> GFP_THISNODE forces allocation from the node. Without it we will fallback.
>

Yeah, but I think we shouldn't fallback at this case, what we want is
alloc a page
from exactly the dest node during migrate_to_node(dest).So I added
GFP_THISNODE.

And mel concerned that
====
This appears to be a valid bug fix.  I agree that the way things are structured
that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
patch is also required. The behaviour is now;

o new_node_page will not return NULL if the target node is empty (fine).
o migrate_pages will translate this into -ENOMEM (fine)
o do_migrate_pages breaks early if it gets -ENOMEM ?

It's the last part I'd like you to double check. migrate_pages() takes a
nodemask of allowed nodes to migrate to. Rather than sending this down
to the allocator, it iterates over the nodes allowed in the mask. If one
of those nodes is full, it returns -ENOMEM.

If -ENOMEM is returned from migrate_pages, should it not move to the
next node?
====

In my opinion, when we want to preserve the relative position of the page to
the beginning of the node set, early return is ok. Else should try to alloc the
new page from the next node(to_nodes).

So I added retry path to allocate new page from next node only when
from_nodes' weight is different from to_nodes', this case the user should
konw the relative position of the page to the beginning of the node set
can be changed.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 08f40a2..094d092 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
struct list_head *pagelist,

 static struct page *new_node_page(struct page *page, unsigned long
node, int **x)
 {
-       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+       return alloc_pages_exact_node(node,
+                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
 }

 /*
@@ -945,10 +946,26 @@ int do_migrate_pages(struct mm_struct *mm,

               node_clear(source, tmp);
               err = migrate_to_node(mm, source, dest, flags);
+retry:
               if (err > 0)
                       busy += err;
-               if (err < 0)
+               if (err < 0) {
+                       /*
+                        * If the dest node have no enough memory, and
from_nodes
+                        * to_nodes have no equal weight(don't need
protect offset.)
+                        * try to migrate to next node.
+                        */
+                       if((nodes_weight(*from_nodes) !=
nodes_weight(*to_nodes))
+                               && (err == -ENOMEM)) {
+                               for_each_node_mask(s, *to_nodes) {
+                                       if(s != dest) {
+                                               err =
migrate_to_node(mm, source, s, flags);
+                                               goto retry;
+                                       }
+                               }
+                       }
                       break;
+               }

Thanks!
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-17 13:54             ` Bob Liu
@ 2010-04-19 17:47               ` Christoph Lameter
  2010-04-20  2:08                 ` Bob Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Lameter @ 2010-04-19 17:47 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Sat, 17 Apr 2010, Bob Liu wrote:

> > GFP_THISNODE forces allocation from the node. Without it we will fallback.
> >
>
> Yeah, but I think we shouldn't fallback at this case, what we want is
> alloc a page
> from exactly the dest node during migrate_to_node(dest).So I added
> GFP_THISNODE.

Why would we want that?

>
> And mel concerned that
> ====
> This appears to be a valid bug fix.  I agree that the way things are structured
> that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
> patch is also required. The behaviour is now;
>
> o new_node_page will not return NULL if the target node is empty (fine).
> o migrate_pages will translate this into -ENOMEM (fine)
> o do_migrate_pages breaks early if it gets -ENOMEM ?
>
> It's the last part I'd like you to double check. migrate_pages() takes a
> nodemask of allowed nodes to migrate to. Rather than sending this down
> to the allocator, it iterates over the nodes allowed in the mask. If one
> of those nodes is full, it returns -ENOMEM.
>
> If -ENOMEM is returned from migrate_pages, should it not move to the
> next node?
> ====

?? It will move onto the next node if you leave things as is. If you add
GFP_THISNODE then you can get NULL back from the page allocator because
there is no memory on the local node. Without GFP_THISNODe the allocation
will fallback.

> In my opinion, when we want to preserve the relative position of the page to
> the beginning of the node set, early return is ok. Else should try to alloc the
> new page from the next node(to_nodes).

???

> So I added retry path to allocate new page from next node only when
> from_nodes' weight is different from to_nodes', this case the user should
> konw the relative position of the page to the beginning of the node set
> can be changed.

There is no point in your patch since the functionality is already there
without it.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-19 17:47               ` Christoph Lameter
@ 2010-04-20  2:08                 ` Bob Liu
  2010-04-21 14:13                   ` Christoph Lameter
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-20  2:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Tue, Apr 20, 2010 at 1:47 AM, Christoph Lameter
<cl@linux-foundation.org> wrote:
> On Sat, 17 Apr 2010, Bob Liu wrote:
>
>> > GFP_THISNODE forces allocation from the node. Without it we will fallback.
>> >
>>
>> Yeah, but I think we shouldn't fallback at this case, what we want is
>> alloc a page
>> from exactly the dest node during migrate_to_node(dest).So I added
>> GFP_THISNODE.
>
> Why would we want that?
>

Because if dest node have no memory, it will fallback to other nodes.
The dest node's fallback nodes may be nodes in nodemask from_nodes.
It maybe make circulation ?.(I am not sure.)

What's more,i think it against the user's request.
The user wants to move pages from from_nodes to to_nodes, if fallback
happened, the pages may be moved to other nodes instead of any node in
nodemask to_nodes.
I am not sure if the user can expect this and accept.

Thanks a lot for your patient reply. :)
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-20  2:08                 ` Bob Liu
@ 2010-04-21 14:13                   ` Christoph Lameter
  2010-04-22  1:03                     ` Bob Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Lameter @ 2010-04-21 14:13 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Tue, 20 Apr 2010, Bob Liu wrote:

> On Tue, Apr 20, 2010 at 1:47 AM, Christoph Lameter
> <cl@linux-foundation.org> wrote:
> > On Sat, 17 Apr 2010, Bob Liu wrote:
> >
> >> > GFP_THISNODE forces allocation from the node. Without it we will fallback.
> >> >
> >>
> >> Yeah, but I think we shouldn't fallback at this case, what we want is
> >> alloc a page
> >> from exactly the dest node during migrate_to_node(dest).So I added
> >> GFP_THISNODE.
> >
> > Why would we want that?
> >
>
> Because if dest node have no memory, it will fallback to other nodes.
> The dest node's fallback nodes may be nodes in nodemask from_nodes.
> It maybe make circulation ?.(I am not sure.)
>
> What's more,i think it against the user's request.

The problem is your perception of NUMA against the kernel NUMA design. As
long as you have this problem I would suggest that you do not submit
patches against NUMA functionality in the kernel.

> The user wants to move pages from from_nodes to to_nodes, if fallback
> happened, the pages may be moved to other nodes instead of any node in
> nodemask to_nodes.
> I am not sure if the user can expect this and accept.

Sure the user always had it this way. NUMA allocations (like also
MPOL_INTERLEAVE round robin) are *only* attempts to allocate on specific
nodes.

There was never a guarantee (until GFP_THISNODE arrived on the scene to
fix SLAB breakage but that was very late in NUMA design of the kernel).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-21 14:13                   ` Christoph Lameter
@ 2010-04-22  1:03                     ` Bob Liu
  2010-04-22 14:38                       ` Christoph Lameter
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Liu @ 2010-04-22  1:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Wed, Apr 21, 2010 at 10:13 PM, Christoph Lameter
<cl@linux-foundation.org> wrote:
> On Tue, 20 Apr 2010, Bob Liu wrote:
>
>> On Tue, Apr 20, 2010 at 1:47 AM, Christoph Lameter
>> <cl@linux-foundation.org> wrote:
>> > On Sat, 17 Apr 2010, Bob Liu wrote:
>> >
>> >> > GFP_THISNODE forces allocation from the node. Without it we will fallback.
>> >> >
>> >>
>> >> Yeah, but I think we shouldn't fallback at this case, what we want is
>> >> alloc a page
>> >> from exactly the dest node during migrate_to_node(dest).So I added
>> >> GFP_THISNODE.
>> >
>> > Why would we want that?
>> >
>>
>> Because if dest node have no memory, it will fallback to other nodes.
>> The dest node's fallback nodes may be nodes in nodemask from_nodes.
>> It maybe make circulation ?.(I am not sure.)
>>
>> What's more,i think it against the user's request.
>
> The problem is your perception of NUMA against the kernel NUMA design. As
> long as you have this problem I would suggest that you do not submit
> patches against NUMA functionality in the kernel.
>
ok :)

>> The user wants to move pages from from_nodes to to_nodes, if fallback
>> happened, the pages may be moved to other nodes instead of any node in
>> nodemask to_nodes.
>> I am not sure if the user can expect this and accept.
>
> Sure the user always had it this way. NUMA allocations (like also
> MPOL_INTERLEAVE round robin) are *only* attempts to allocate on specific
> nodes.
>
> There was never a guarantee (until GFP_THISNODE arrived on the scene to
> fix SLAB breakage but that was very late in NUMA design of the kernel).
>

Thanks for your patient reply.
Just one small point, why do_move_pages() in migrate.c needs GFP_THISNODE ?

-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page
  2010-04-22  1:03                     ` Bob Liu
@ 2010-04-22 14:38                       ` Christoph Lameter
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Lameter @ 2010-04-22 14:38 UTC (permalink / raw)
  To: Bob Liu
  Cc: Mel Gorman, kamezawa.hiroyu, minchan.kim, akpm, linux-mm, andi,
	rientjes, lee.schermerhorn

On Thu, 22 Apr 2010, Bob Liu wrote:

> Just one small point, why do_move_pages() in migrate.c needs GFP_THISNODE ?

Because the move_pages function call allows the user explicitly specify
the node for each page. If we cannot move the page to the node the user
wants then the best fallback is to keep it where it was.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-04-22 14:40 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-06  2:59 [PATCH] mempolicy:add GFP_THISNODE when allocing new page Bob Liu
2010-04-06  4:33 ` Minchan Kim
2010-04-06  4:56   ` Bob Liu
2010-04-06  5:06     ` Minchan Kim
2010-04-13  8:19 ` KAMEZAWA Hiroyuki
2010-04-13  8:20 ` Bob Liu
2010-04-13  8:38   ` Mel Gorman
2010-04-13 14:28     ` Bob Liu
2010-04-16  0:41       ` Christoph Lameter
2010-04-16  1:02         ` Bob Liu
2010-04-16 16:02           ` Christoph Lameter
2010-04-16 11:15       ` Mel Gorman
2010-04-16 15:03         ` Bob Liu
2010-04-16 15:55           ` Christoph Lameter
2010-04-17 13:54             ` Bob Liu
2010-04-19 17:47               ` Christoph Lameter
2010-04-20  2:08                 ` Bob Liu
2010-04-21 14:13                   ` Christoph Lameter
2010-04-22  1:03                     ` Bob Liu
2010-04-22 14:38                       ` Christoph Lameter
2010-04-13  8:27 ` Minchan Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).