Re: [PATCH] do_migrate_range: avoid failure as much as possible

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  2:47 [PATCH] do_migrate_range: avoid failure as much as possible Bob Liu
@ 2010-10-25  2:40 ` KAMEZAWA Hiroyuki
  2010-10-25  2:57   ` Wu Fengguang
  0 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  2:40 UTC (permalink / raw)
  To: Bob Liu; +Cc: akpm, linux-mm, fengguang.wu, mel, kosaki.motohiro

On Mon, 25 Oct 2010 10:47:31 +0800
Bob Liu <lliubbo@gmail.com> wrote:

> It's normal for isolate_lru_page() to fail at times. The failures are
> typically temporal and may well go away when offline_pages() retries
> the call. So it seems more reasonable to migrate as much as possible
> to increase the chance of complete success in next retry.
> 
> This patch remove page_count() check and remove putback_lru_pages() and
> call migrate_pages() regardless of not_managed to reduce failure as much
> as possible.
> 
> Signed-off-by: Bob Liu <lliubbo@gmail.com>

-EBUSY should be returned.

-Kame


> ---
>  mm/memory_hotplug.c |   12 ------------
>  1 files changed, 0 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..b64cc9b 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -687,7 +687,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  	unsigned long pfn;
>  	struct page *page;
>  	int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
> -	int not_managed = 0;
>  	int ret = 0;
>  	LIST_HEAD(source);
>  
> @@ -709,10 +708,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  					    page_is_file_cache(page));
>  
>  		} else {
> -			/* Becasue we don't have big zone->lock. we should
> -			   check this again here. */
> -			if (page_count(page))
> -				not_managed++;
>  #ifdef CONFIG_DEBUG_VM
>  			printk(KERN_ALERT "removing pfn %lx from LRU failed\n",
>  			       pfn);
> @@ -720,13 +715,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  #endif
>  		}
>  	}
> -	ret = -EBUSY;
> -	if (not_managed) {
> -		if (!list_empty(&source))
> -			putback_lru_pages(&source);
> -		goto out;
> -	}
> -	ret = 0;
>  	if (list_empty(&source))
>  		goto out;
>  	/* this function returns # of failed pages */
> -- 
> 1.6.3.3
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] do_migrate_range: avoid failure as much as possible
@ 2010-10-25  2:47 Bob Liu
  2010-10-25  2:40 ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 15+ messages in thread
From: Bob Liu @ 2010-10-25  2:47 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, fengguang.wu, kamezawa.hiroyu, mel, kosaki.motohiro,
	Bob Liu

It's normal for isolate_lru_page() to fail at times. The failures are
typically temporal and may well go away when offline_pages() retries
the call. So it seems more reasonable to migrate as much as possible
to increase the chance of complete success in next retry.

This patch remove page_count() check and remove putback_lru_pages() and
call migrate_pages() regardless of not_managed to reduce failure as much
as possible.

Signed-off-by: Bob Liu <lliubbo@gmail.com>
---
 mm/memory_hotplug.c |   12 ------------
 1 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..b64cc9b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -687,7 +687,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 	unsigned long pfn;
 	struct page *page;
 	int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
-	int not_managed = 0;
 	int ret = 0;
 	LIST_HEAD(source);
 
@@ -709,10 +708,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 					    page_is_file_cache(page));
 
 		} else {
-			/* Becasue we don't have big zone->lock. we should
-			   check this again here. */
-			if (page_count(page))
-				not_managed++;
 #ifdef CONFIG_DEBUG_VM
 			printk(KERN_ALERT "removing pfn %lx from LRU failed\n",
 			       pfn);
@@ -720,13 +715,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 		}
 	}
-	ret = -EBUSY;
-	if (not_managed) {
-		if (!list_empty(&source))
-			putback_lru_pages(&source);
-		goto out;
-	}
-	ret = 0;
 	if (list_empty(&source))
 		goto out;
 	/* this function returns # of failed pages */
-- 
1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  2:40 ` KAMEZAWA Hiroyuki
@ 2010-10-25  2:57   ` Wu Fengguang
  2010-10-25  3:05     ` KAMEZAWA Hiroyuki
  2010-10-25  3:06     ` Wu Fengguang
  0 siblings, 2 replies; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  2:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 10:47:31 +0800
> Bob Liu <lliubbo@gmail.com> wrote:
> 
> > It's normal for isolate_lru_page() to fail at times. The failures are
> > typically temporal and may well go away when offline_pages() retries
> > the call. So it seems more reasonable to migrate as much as possible
> > to increase the chance of complete success in next retry.
> > 
> > This patch remove page_count() check and remove putback_lru_pages() and
> > call migrate_pages() regardless of not_managed to reduce failure as much
> > as possible.
> > 
> > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> 
> -EBUSY should be returned.

It does return -EBUSY when ALL pages cannot be isolated from LRU (or
is non-LRU pages at all). That means offline_pages() will repeat calls
to do_migrate_range() as fast as possible as long as it can make
progress.

Is that behavior good enough? It does need some comment for this
non-obvious return value. 

btw, the caller side code can be simplified (no behavior change).

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index dd186c1..606d358 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -848,17 +848,13 @@ repeat:
 	pfn = scan_lru_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have page on LRU */
 		ret = do_migrate_range(pfn, end_pfn);
-		if (!ret) {
-			drain = 1;
-			goto repeat;
-		} else {
-			if (ret < 0)
-				if (--retry_max == 0)
-					goto failed_removal;
+		if (ret < 0) {
+			if (--retry_max <= 0)
+				goto failed_removal;
 			yield();
-			drain = 1;
-			goto repeat;
 		}
+		drain = 1;
+		goto repeat;
 	}
 	/* drain all zone's lru pagevec, this is asyncronous... */
 	lru_add_drain_all();

Thanks,
Fengguang

> > ---
> >  mm/memory_hotplug.c |   12 ------------
> >  1 files changed, 0 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index a4cfcdc..b64cc9b 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -687,7 +687,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> >  	unsigned long pfn;
> >  	struct page *page;
> >  	int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
> > -	int not_managed = 0;
> >  	int ret = 0;
> >  	LIST_HEAD(source);
> >  
> > @@ -709,10 +708,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> >  					    page_is_file_cache(page));
> >  
> >  		} else {
> > -			/* Becasue we don't have big zone->lock. we should
> > -			   check this again here. */
> > -			if (page_count(page))
> > -				not_managed++;
> >  #ifdef CONFIG_DEBUG_VM
> >  			printk(KERN_ALERT "removing pfn %lx from LRU failed\n",
> >  			       pfn);
> > @@ -720,13 +715,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> >  #endif
> >  		}
> >  	}
> > -	ret = -EBUSY;
> > -	if (not_managed) {
> > -		if (!list_empty(&source))
> > -			putback_lru_pages(&source);
> > -		goto out;
> > -	}
> > -	ret = 0;
> >  	if (list_empty(&source))
> >  		goto out;
> >  	/* this function returns # of failed pages */
> > -- 
> > 1.6.3.3
> > 
> > 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  2:57   ` Wu Fengguang
@ 2010-10-25  3:05     ` KAMEZAWA Hiroyuki
  2010-10-25  3:09       ` KAMEZAWA Hiroyuki
  2010-10-25  3:28       ` Wu Fengguang
  2010-10-25  3:06     ` Wu Fengguang
  1 sibling, 2 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  3:05 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, 25 Oct 2010 10:57:03 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 25 Oct 2010 10:47:31 +0800
> > Bob Liu <lliubbo@gmail.com> wrote:
> > 
> > > It's normal for isolate_lru_page() to fail at times. The failures are
> > > typically temporal and may well go away when offline_pages() retries
> > > the call. So it seems more reasonable to migrate as much as possible
> > > to increase the chance of complete success in next retry.
> > > 
> > > This patch remove page_count() check and remove putback_lru_pages() and
> > > call migrate_pages() regardless of not_managed to reduce failure as much
> > > as possible.
> > > 
> > > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > 
> > -EBUSY should be returned.
> 
> It does return -EBUSY when ALL pages cannot be isolated from LRU (or
> is non-LRU pages at all). That means offline_pages() will repeat calls
> to do_migrate_range() as fast as possible as long as it can make
> progress.
> 
I read the patch wrong ? "ret = -EBUSY" is dropped and "ret" will be
0 or just a return code of migrate_page().




> Is that behavior good enough? It does need some comment for this
> non-obvious return value. 
> 
> btw, the caller side code can be simplified (no behavior change).
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index dd186c1..606d358 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -848,17 +848,13 @@ repeat:
>  	pfn = scan_lru_pages(start_pfn, end_pfn);
>  	if (pfn) { /* We have page on LRU */
>  		ret = do_migrate_range(pfn, end_pfn);
> -		if (!ret) {
> -			drain = 1;
> -			goto repeat;
> -		} else {
> -			if (ret < 0)
> -				if (--retry_max == 0)
> -					goto failed_removal;
> +		if (ret < 0) {
> +			if (--retry_max <= 0)
> +				goto failed_removal;
>  			yield();
> -			drain = 1;
> -			goto repeat;
>  		}
> +		drain = 1;
> +		goto repeat;
>  	}

This changes behavior.

This "ret" can be > 0 because migrate_page()'s return code is
"Return: Number of pages not migrated or error code."

Then, 
ret < 0  ===> maybe ebusy
ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
ret == 0 ===> ok, all condition green. try next chunk soon.

Then, I added "yield()" and --retrym_max for !ret cases.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  2:57   ` Wu Fengguang
  2010-10-25  3:05     ` KAMEZAWA Hiroyuki
@ 2010-10-25  3:06     ` Wu Fengguang
  2010-10-25  3:16       ` Wu Fengguang
  1 sibling, 1 reply; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  3:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 10:57:03AM +0800, Wu Fengguang wrote:
> On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 25 Oct 2010 10:47:31 +0800
> > Bob Liu <lliubbo@gmail.com> wrote:
> > 
> > > It's normal for isolate_lru_page() to fail at times. The failures are
> > > typically temporal and may well go away when offline_pages() retries
> > > the call. So it seems more reasonable to migrate as much as possible
> > > to increase the chance of complete success in next retry.
> > > 
> > > This patch remove page_count() check and remove putback_lru_pages() and
> > > call migrate_pages() regardless of not_managed to reduce failure as much
> > > as possible.
> > > 
> > > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > 
> > -EBUSY should be returned.
> 
> It does return -EBUSY when ALL pages cannot be isolated from LRU (or
> is non-LRU pages at all). That means offline_pages() will repeat calls
> to do_migrate_range() as fast as possible as long as it can make
> progress.
> 
> Is that behavior good enough? It does need some comment for this
> non-obvious return value. 
> 
> btw, the caller side code can be simplified (no behavior change).
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index dd186c1..606d358 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -848,17 +848,13 @@ repeat:
>  	pfn = scan_lru_pages(start_pfn, end_pfn);
>  	if (pfn) { /* We have page on LRU */
>  		ret = do_migrate_range(pfn, end_pfn);
> -		if (!ret) {
> -			drain = 1;
> -			goto repeat;
> -		} else {
> -			if (ret < 0)
> -				if (--retry_max == 0)
> -					goto failed_removal;
> +		if (ret < 0) {
> +			if (--retry_max <= 0)
> +				goto failed_removal;
>  			yield();
> -			drain = 1;
> -			goto repeat;
>  		}
> +		drain = 1;
> +		goto repeat;
>  	}
>  	/* drain all zone's lru pagevec, this is asyncronous... */
>  	lru_add_drain_all();

And it seems the costly drain operations could be avoided as long as
it's making progress. What do you think?

--- linux-next.orig/mm/memory_hotplug.c	2010-10-25 11:04:05.000000000 +0800
+++ linux-next/mm/memory_hotplug.c	2010-10-25 11:04:22.000000000 +0800
@@ -852,8 +852,8 @@ repeat:
 			if (--retry_max <= 0)
 				goto failed_removal;
 			yield();
+			drain = 1;
 		}
-		drain = 1;
 		goto repeat;
 	}
 	/* drain all zone's lru pagevec, this is asyncronous... */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:05     ` KAMEZAWA Hiroyuki
@ 2010-10-25  3:09       ` KAMEZAWA Hiroyuki
  2010-10-25  3:48         ` Wu Fengguang
  2010-10-25  3:28       ` Wu Fengguang
  1 sibling, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  3:09 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Wu Fengguang, Bob Liu, akpm@linux-foundation.org,
	linux-mm@kvack.org, mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, 25 Oct 2010 12:05:50 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> This changes behavior.
> 
> This "ret" can be > 0 because migrate_page()'s return code is
> "Return: Number of pages not migrated or error code."
> 
> Then, 
> ret < 0  ===> maybe ebusy
> ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> ret == 0 ===> ok, all condition green. try next chunk soon.
> 
> Then, I added "yield()" and --retrym_max for !ret cases.
                                               ^^^^^^^^
						wrong.

The code here does

ret == 0 ==> ok, all condition green, try next chunk.
ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
	     do yield.
ret < 0  ==> some pages may not be able to be isolated. reduce retrycount and yield()

Thanks,
-Kame 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:06     ` Wu Fengguang
@ 2010-10-25  3:16       ` Wu Fengguang
  0 siblings, 0 replies; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  3:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 11:06:34AM +0800, Wu Fengguang wrote:
> On Mon, Oct 25, 2010 at 10:57:03AM +0800, Wu Fengguang wrote:
> > On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 25 Oct 2010 10:47:31 +0800
> > > Bob Liu <lliubbo@gmail.com> wrote:
> > > 
> > > > It's normal for isolate_lru_page() to fail at times. The failures are
> > > > typically temporal and may well go away when offline_pages() retries
> > > > the call. So it seems more reasonable to migrate as much as possible
> > > > to increase the chance of complete success in next retry.
> > > > 
> > > > This patch remove page_count() check and remove putback_lru_pages() and
> > > > call migrate_pages() regardless of not_managed to reduce failure as much
> > > > as possible.
> > > > 
> > > > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > > 
> > > -EBUSY should be returned.
> > 
> > It does return -EBUSY when ALL pages cannot be isolated from LRU (or
> > is non-LRU pages at all). That means offline_pages() will repeat calls
> > to do_migrate_range() as fast as possible as long as it can make
> > progress.
> > 
> > Is that behavior good enough? It does need some comment for this
> > non-obvious return value. 
> > 
> > btw, the caller side code can be simplified (no behavior change).
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index dd186c1..606d358 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -848,17 +848,13 @@ repeat:
> >  	pfn = scan_lru_pages(start_pfn, end_pfn);
> >  	if (pfn) { /* We have page on LRU */
> >  		ret = do_migrate_range(pfn, end_pfn);
> > -		if (!ret) {
> > -			drain = 1;
> > -			goto repeat;
> > -		} else {
> > -			if (ret < 0)
> > -				if (--retry_max == 0)
> > -					goto failed_removal;
> > +		if (ret < 0) {
> > +			if (--retry_max <= 0)
> > +				goto failed_removal;
> >  			yield();
> > -			drain = 1;
> > -			goto repeat;
> >  		}
> > +		drain = 1;
> > +		goto repeat;
> >  	}
> >  	/* drain all zone's lru pagevec, this is asyncronous... */
> >  	lru_add_drain_all();
> 
> And it seems the costly drain operations could be avoided as long as
> it's making progress. What do you think?
> 
> --- linux-next.orig/mm/memory_hotplug.c	2010-10-25 11:04:05.000000000 +0800
> +++ linux-next/mm/memory_hotplug.c	2010-10-25 11:04:22.000000000 +0800
> @@ -852,8 +852,8 @@ repeat:
>  			if (--retry_max <= 0)
>  				goto failed_removal;
>  			yield();
> +			drain = 1;
>  		}
> -		drain = 1;
>  		goto repeat;
>  	}
>  	/* drain all zone's lru pagevec, this is asyncronous... */

This is a more heavy weight patch for the above one-liner change.
I don't have real experiences to understand the requirements for
memory hot remove, so the idea may be way too imaginary.

--- linux-next.orig/mm/memory_hotplug.c	2010-10-25 11:04:05.000000000 +0800
+++ linux-next/mm/memory_hotplug.c	2010-10-25 11:12:07.000000000 +0800
@@ -788,7 +788,7 @@ static int offline_pages(unsigned long s
 {
 	unsigned long pfn, nr_pages, expire;
 	long offlined_pages;
-	int ret, drain, retry_max, node;
+	int ret, retry_max, node;
 	struct zone *zone;
 	struct memory_notify arg;
 
@@ -827,7 +827,6 @@ static int offline_pages(unsigned long s
 
 	pfn = start_pfn;
 	expire = jiffies + timeout;
-	drain = 0;
 	retry_max = 5;
 repeat:
 	/* start memory hot removal */
@@ -838,13 +837,6 @@ repeat:
 	if (signal_pending(current))
 		goto failed_removal;
 	ret = 0;
-	if (drain) {
-		lru_add_drain_all();
-		flush_scheduled_work();
-		cond_resched();
-		drain_all_pages();
-	}
-
 	pfn = scan_lru_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have page on LRU */
 		ret = do_migrate_range(pfn, end_pfn);
@@ -852,15 +844,19 @@ repeat:
 			if (--retry_max <= 0)
 				goto failed_removal;
 			yield();
+			lru_add_drain_all();
+			flush_scheduled_work();
+			cond_resched();
+			drain_all_pages();
 		}
-		drain = 1;
 		goto repeat;
 	}
-	/* drain all zone's lru pagevec, this is asyncronous... */
+
+	/* drain all zone's lru pagevec, this is asynchronous... */
 	lru_add_drain_all();
 	flush_scheduled_work();
 	yield();
-	/* drain pcp pages , this is synchrouns. */
+	/* drain pcp pages , this is asynchronous. */
 	drain_all_pages();
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
@@ -869,7 +865,7 @@ repeat:
 		goto failed_removal;
 	}
 	printk(KERN_INFO "Offlined Pages %ld\n", offlined_pages);
-	/* Ok, all of our target is islaoted.
+	/* Ok, all of our target is isolated.
 	   We cannot do rollback at this point. */
 	offline_isolated_pages(start_pfn, end_pfn);
 	/* reset pagetype flags and makes migrate type to be MOVABLE */
Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:05     ` KAMEZAWA Hiroyuki
  2010-10-25  3:09       ` KAMEZAWA Hiroyuki
@ 2010-10-25  3:28       ` Wu Fengguang
  2010-10-25  3:50         ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  3:28 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 11:05:50AM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 10:57:03 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 25 Oct 2010 10:47:31 +0800
> > > Bob Liu <lliubbo@gmail.com> wrote:
> > > 
> > > > It's normal for isolate_lru_page() to fail at times. The failures are
> > > > typically temporal and may well go away when offline_pages() retries
> > > > the call. So it seems more reasonable to migrate as much as possible
> > > > to increase the chance of complete success in next retry.
> > > > 
> > > > This patch remove page_count() check and remove putback_lru_pages() and
> > > > call migrate_pages() regardless of not_managed to reduce failure as much
> > > > as possible.
> > > > 
> > > > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > > 
> > > -EBUSY should be returned.
> > 
> > It does return -EBUSY when ALL pages cannot be isolated from LRU (or
> > is non-LRU pages at all). That means offline_pages() will repeat calls
> > to do_migrate_range() as fast as possible as long as it can make
> > progress.
> > 
> I read the patch wrong ? "ret = -EBUSY" is dropped and "ret" will be
> 0 or just a return code of migrate_page().

        for () {
                ret = isolate_lru_page(page);
        }

        if (list_empty(&source))
                goto out;

out:
        return ret;

So do_migrate_range() will return -EBUSY if the last isolate_lru_page() returns
-EBUSY.

> 
> 
> 
> > Is that behavior good enough? It does need some comment for this
> > non-obvious return value. 
> > 
> > btw, the caller side code can be simplified (no behavior change).
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index dd186c1..606d358 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -848,17 +848,13 @@ repeat:
> >     pfn = scan_lru_pages(start_pfn, end_pfn);
> >     if (pfn) { /* We have page on LRU */
> >             ret = do_migrate_range(pfn, end_pfn);
> > -           if (!ret) {
> > -                   drain = 1;
> > -                   goto repeat;
> > -           } else {
> > -                   if (ret < 0)
> > -                           if (--retry_max == 0)
> > -                                   goto failed_removal;
> > +           if (ret < 0) {
> > +                   if (--retry_max <= 0)
> > +                           goto failed_removal;
> >                     yield();
> > -                   drain = 1;
> > -                   goto repeat;
> >             }
> > +           drain = 1;
> > +           goto repeat;
> >     }
> 
> This changes behavior.

Ah yes!
 
> This "ret" can be > 0 because migrate_page()'s return code is
> "Return: Number of pages not migrated or error code."
> 
> Then, 
> ret < 0  ===> maybe ebusy
> ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> ret == 0 ===> ok, all condition green. try next chunk soon.
> 
> Then, I added "yield()" and --retrym_max for !ret cases.

You are right, there is the "ret > 0, some pages are not migrated" case.
But I'm not sure it's PG_writeback pages, because migrate_pages() will wait on
writeback after pass 2.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:48         ` Wu Fengguang
@ 2010-10-25  3:48           ` KAMEZAWA Hiroyuki
  2010-10-25  4:06             ` Wu Fengguang
  2010-10-25  4:00           ` Bob Liu
  1 sibling, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  3:48 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, 25 Oct 2010 11:48:33 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 25 Oct 2010 12:05:50 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > 
> > > This changes behavior.
> > > 
> > > This "ret" can be > 0 because migrate_page()'s return code is
> > > "Return: Number of pages not migrated or error code."
> > > 
> > > Then, 
> > > ret < 0  ===> maybe ebusy
> > > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > > ret == 0 ===> ok, all condition green. try next chunk soon.
> > > 
> > > Then, I added "yield()" and --retrym_max for !ret cases.
> >                                                ^^^^^^^^
> > 						wrong.
> > 
> > The code here does
> > 
> > ret == 0 ==> ok, all condition green, try next chunk.
> 
> It seems reasonable to remove the drain operations for "ret == 0"
> case.  That would help large NUMA boxes noticeably I guess.
> 
Maybe.

> > ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> > 	     do yield.
> 
> Don't know how to deal with the possible "migration fail" pages --
> sorry I have no idea about that situation at all.
> 

In typical case, page_count() > 0 by get_user_pages() or PG_writeback is set.
All we can do is just waiting.


> Perhaps, OOM while offlining pages?
> 

I never see that..because memory offline is scheduled to be done only when
there are free memory.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:09       ` KAMEZAWA Hiroyuki
@ 2010-10-25  3:48         ` Wu Fengguang
  2010-10-25  3:48           ` KAMEZAWA Hiroyuki
  2010-10-25  4:00           ` Bob Liu
  0 siblings, 2 replies; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  3:48 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 12:05:50 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> 
> > This changes behavior.
> > 
> > This "ret" can be > 0 because migrate_page()'s return code is
> > "Return: Number of pages not migrated or error code."
> > 
> > Then, 
> > ret < 0  ===> maybe ebusy
> > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > ret == 0 ===> ok, all condition green. try next chunk soon.
> > 
> > Then, I added "yield()" and --retrym_max for !ret cases.
>                                                ^^^^^^^^
> 						wrong.
> 
> The code here does
> 
> ret == 0 ==> ok, all condition green, try next chunk.

It seems reasonable to remove the drain operations for "ret == 0"
case.  That would help large NUMA boxes noticeably I guess.

> ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> 	     do yield.

Don't know how to deal with the possible "migration fail" pages --
sorry I have no idea about that situation at all.

Perhaps, OOM while offlining pages?

> ret < 0  ==> some pages may not be able to be isolated. reduce retrycount and yield()

Makes good sense.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:28       ` Wu Fengguang
@ 2010-10-25  3:50         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  3:50 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, 25 Oct 2010 11:28:27 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Mon, Oct 25, 2010 at 11:05:50AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 25 Oct 2010 10:57:03 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > 
> > > On Mon, Oct 25, 2010 at 10:40:17AM +0800, KAMEZAWA Hiroyuki wrote:
> > > > On Mon, 25 Oct 2010 10:47:31 +0800
> > > > Bob Liu <lliubbo@gmail.com> wrote:
> > > > 
> > > > > It's normal for isolate_lru_page() to fail at times. The failures are
> > > > > typically temporal and may well go away when offline_pages() retries
> > > > > the call. So it seems more reasonable to migrate as much as possible
> > > > > to increase the chance of complete success in next retry.
> > > > > 
> > > > > This patch remove page_count() check and remove putback_lru_pages() and
> > > > > call migrate_pages() regardless of not_managed to reduce failure as much
> > > > > as possible.
> > > > > 
> > > > > Signed-off-by: Bob Liu <lliubbo@gmail.com>
> > > > 
> > > > -EBUSY should be returned.
> > > 
> > > It does return -EBUSY when ALL pages cannot be isolated from LRU (or
> > > is non-LRU pages at all). That means offline_pages() will repeat calls
> > > to do_migrate_range() as fast as possible as long as it can make
> > > progress.
> > > 
> > I read the patch wrong ? "ret = -EBUSY" is dropped and "ret" will be
> > 0 or just a return code of migrate_page().
> 
>         for () {
>                 ret = isolate_lru_page(page);
>         }
> 
>         if (list_empty(&source))
>                 goto out;
> 
> out:
>         return ret;
> 
> So do_migrate_range() will return -EBUSY if the last isolate_lru_page() returns
> -EBUSY.
> 

Then this patch should be put onto "immediately quit loop when not_managed++" patch.
Please write it in description or make a patch series which other guys can see it.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:48         ` Wu Fengguang
  2010-10-25  3:48           ` KAMEZAWA Hiroyuki
@ 2010-10-25  4:00           ` Bob Liu
  1 sibling, 0 replies; 15+ messages in thread
From: Bob Liu @ 2010-10-25  4:00 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: KAMEZAWA Hiroyuki, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 11:48 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
>> On Mon, 25 Oct 2010 12:05:50 +0900
>> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>
>> > This changes behavior.
>> >
>> > This "ret" can be > 0 because migrate_page()'s return code is
>> > "Return: Number of pages not migrated or error code."
>> >
>> > Then,
>> > ret < 0  ===> maybe ebusy
>> > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
>> > ret == 0 ===> ok, all condition green. try next chunk soon.
>> >
>> > Then, I added "yield()" and --retrym_max for !ret cases.
>>                                                ^^^^^^^^
>>                                               wrong.
>>
>> The code here does
>>
>> ret == 0 ==> ok, all condition green, try next chunk.
>
> It seems reasonable to remove the drain operations for "ret == 0"
> case.  That would help large NUMA boxes noticeably I guess.
>
>> ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
>>            do yield.
>
> Don't know how to deal with the possible "migration fail" pages --
> sorry I have no idea about that situation at all.
>
> Perhaps, OOM while offlining pages?
>
>> ret < 0  ==> some pages may not be able to be isolated. reduce retrycount and yield()
>
> Makes good sense.
>
> Thanks,
> Fengguang
>

Hi, Wu

What about acking these two patches first which doesn't change logic too much.
[PATCH 2/3] do_migrate_range: exit loop if not_managed is true.
[PATCH 3/3] do_migrate_range: reduce list_empty() check.

For the current, I think more work&discussion is needed and make a
clean one is better.
Thanks.
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  3:48           ` KAMEZAWA Hiroyuki
@ 2010-10-25  4:06             ` Wu Fengguang
  2010-10-25  4:34               ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  4:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 11:48:16AM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 11:48:33 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 25 Oct 2010 12:05:50 +0900
> > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > 
> > > > This changes behavior.
> > > > 
> > > > This "ret" can be > 0 because migrate_page()'s return code is
> > > > "Return: Number of pages not migrated or error code."
> > > > 
> > > > Then, 
> > > > ret < 0  ===> maybe ebusy
> > > > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > > > ret == 0 ===> ok, all condition green. try next chunk soon.
> > > > 
> > > > Then, I added "yield()" and --retrym_max for !ret cases.
> > >                                                ^^^^^^^^
> > > 						wrong.
> > > 
> > > The code here does
> > > 
> > > ret == 0 ==> ok, all condition green, try next chunk.
> > 
> > It seems reasonable to remove the drain operations for "ret == 0"
> > case.  That would help large NUMA boxes noticeably I guess.
> > 
> Maybe.
> 
> > > ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> > > 	     do yield.
> > 
> > Don't know how to deal with the possible "migration fail" pages --
> > sorry I have no idea about that situation at all.
> > 
> 
> In typical case, page_count() > 0 by get_user_pages() or PG_writeback is set.
> All we can do is just waiting.

OK.

> > Perhaps, OOM while offlining pages?
> > 
> 
> I never see that..because memory offline is scheduled to be done only when
> there are free memory.

OK.

On OOM migrate_page() will return -ENOMEM, which will be handled in
the "ret < 0" case. So it will give up after some retries.

migrate_page() has a comment /* Permanent failure */ when returning
positive ret. So it looks safer not to retry indefinitely on the
"ret > 0" case?

Then it's reduced to two cases: "ret != 0, cannot make smooth
progress, unconditional retries may livelock" and "ret ==0, makes some
progress, safe to retry".

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  4:06             ` Wu Fengguang
@ 2010-10-25  4:34               ` KAMEZAWA Hiroyuki
  2010-10-25  4:55                 ` Wu Fengguang
  0 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  4:34 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, 25 Oct 2010 12:06:04 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Mon, Oct 25, 2010 at 11:48:16AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 25 Oct 2010 11:48:33 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > 
> > > On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> > > > On Mon, 25 Oct 2010 12:05:50 +0900
> > > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > > 
> > > > > This changes behavior.
> > > > > 
> > > > > This "ret" can be > 0 because migrate_page()'s return code is
> > > > > "Return: Number of pages not migrated or error code."
> > > > > 
> > > > > Then, 
> > > > > ret < 0  ===> maybe ebusy
> > > > > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > > > > ret == 0 ===> ok, all condition green. try next chunk soon.
> > > > > 
> > > > > Then, I added "yield()" and --retrym_max for !ret cases.
> > > >                                                ^^^^^^^^
> > > > 						wrong.
> > > > 
> > > > The code here does
> > > > 
> > > > ret == 0 ==> ok, all condition green, try next chunk.
> > > 
> > > It seems reasonable to remove the drain operations for "ret == 0"
> > > case.  That would help large NUMA boxes noticeably I guess.
> > > 
> > Maybe.
> > 
> > > > ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> > > > 	     do yield.
> > > 
> > > Don't know how to deal with the possible "migration fail" pages --
> > > sorry I have no idea about that situation at all.
> > > 
> > 
> > In typical case, page_count() > 0 by get_user_pages() or PG_writeback is set.
> > All we can do is just waiting.
> 
> OK.
> 
> > > Perhaps, OOM while offlining pages?
> > > 
> > 
> > I never see that..because memory offline is scheduled to be done only when
> > there are free memory.
> 
> OK.
> 
> On OOM migrate_page() will return -ENOMEM, which will be handled in
> the "ret < 0" case. So it will give up after some retries.
> 
> migrate_page() has a comment /* Permanent failure */ when returning
> positive ret. So it looks safer not to retry indefinitely on the
> "ret > 0" case?
> 
> Then it's reduced to two cases: "ret != 0, cannot make smooth
> progress, unconditional retries may livelock" and "ret ==0, makes some
> progress, safe to retry".
> 
Memory offline is designed to be able to stop by Ctrl-C. And it has timeout
of 120 sec.

I don't called as livelock.

Thanks,
-Kame



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] do_migrate_range: avoid failure as much as possible
  2010-10-25  4:34               ` KAMEZAWA Hiroyuki
@ 2010-10-25  4:55                 ` Wu Fengguang
  0 siblings, 0 replies; 15+ messages in thread
From: Wu Fengguang @ 2010-10-25  4:55 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Bob Liu, akpm@linux-foundation.org, linux-mm@kvack.org,
	mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com

On Mon, Oct 25, 2010 at 12:34:48PM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 12:06:04 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > On Mon, Oct 25, 2010 at 11:48:16AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 25 Oct 2010 11:48:33 +0800
> > > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > 
> > > > On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> > > > > On Mon, 25 Oct 2010 12:05:50 +0900
> > > > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > > > 
> > > > > > This changes behavior.
> > > > > > 
> > > > > > This "ret" can be > 0 because migrate_page()'s return code is
> > > > > > "Return: Number of pages not migrated or error code."
> > > > > > 
> > > > > > Then, 
> > > > > > ret < 0  ===> maybe ebusy
> > > > > > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > > > > > ret == 0 ===> ok, all condition green. try next chunk soon.
> > > > > > 
> > > > > > Then, I added "yield()" and --retrym_max for !ret cases.
> > > > >                                                ^^^^^^^^
> > > > > 						wrong.
> > > > > 
> > > > > The code here does
> > > > > 
> > > > > ret == 0 ==> ok, all condition green, try next chunk.
> > > > 
> > > > It seems reasonable to remove the drain operations for "ret == 0"
> > > > case.  That would help large NUMA boxes noticeably I guess.
> > > > 
> > > Maybe.

OK, I'll post a patch for it.

> > > > > ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> > > > > 	     do yield.
> > > > 
> > > > Don't know how to deal with the possible "migration fail" pages --
> > > > sorry I have no idea about that situation at all.
> > > > 
> > > 
> > > In typical case, page_count() > 0 by get_user_pages() or PG_writeback is set.
> > > All we can do is just waiting.
> > 
> > OK.
> > 
> > > > Perhaps, OOM while offlining pages?
> > > > 
> > > 
> > > I never see that..because memory offline is scheduled to be done only when
> > > there are free memory.
> > 
> > OK.
> > 
> > On OOM migrate_page() will return -ENOMEM, which will be handled in
> > the "ret < 0" case. So it will give up after some retries.
> > 
> > migrate_page() has a comment /* Permanent failure */ when returning
> > positive ret. So it looks safer not to retry indefinitely on the
> > "ret > 0" case?
> > 
> > Then it's reduced to two cases: "ret != 0, cannot make smooth
> > progress, unconditional retries may livelock" and "ret ==0, makes some
> > progress, safe to retry".
> > 
> Memory offline is designed to be able to stop by Ctrl-C. And it has timeout
> of 120 sec.
> 
> I don't called as livelock.

Ah sorry for overlooking that!  I should really think twice..(after
thinking twice) I find it's even better. Unmigratible pages will be
put back to LRU. Then -EBUSY will be returned when trying to isolate
it the next time. So it's an imaginary problem.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2010-10-25  4:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-25  2:47 [PATCH] do_migrate_range: avoid failure as much as possible Bob Liu
2010-10-25  2:40 ` KAMEZAWA Hiroyuki
2010-10-25  2:57   ` Wu Fengguang
2010-10-25  3:05     ` KAMEZAWA Hiroyuki
2010-10-25  3:09       ` KAMEZAWA Hiroyuki
2010-10-25  3:48         ` Wu Fengguang
2010-10-25  3:48           ` KAMEZAWA Hiroyuki
2010-10-25  4:06             ` Wu Fengguang
2010-10-25  4:34               ` KAMEZAWA Hiroyuki
2010-10-25  4:55                 ` Wu Fengguang
2010-10-25  4:00           ` Bob Liu
2010-10-25  3:28       ` Wu Fengguang
2010-10-25  3:50         ` KAMEZAWA Hiroyuki
2010-10-25  3:06     ` Wu Fengguang
2010-10-25  3:16       ` Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).