linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [patch 1/2]compaction: check migrated page number
@ 2012-09-06 10:44 Shaohua Li
  2012-09-06 12:17 ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2012-09-06 10:44 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mgorman, aarcange


isolate_migratepages_range() might isolate none pages, for example, when
zone->lru_lock is contended and compaction is async. In this case, we should
abort compaction, otherwise, compact_zone will run a useless loop and make
zone->lru_lock is even contended.

Signed-off-by: Shaohua Li <shli@fusionio.com>
---
 mm/compaction.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux/mm/compaction.c
===================================================================
--- linux.orig/mm/compaction.c	2012-08-22 09:51:39.295322268 +0800
+++ linux/mm/compaction.c	2012-09-06 14:46:13.923144263 +0800
@@ -402,6 +402,8 @@ isolate_migratepages_range(struct zone *
 
 	trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);
 
+	if (!nr_isolated)
+		return 0;
 	return low_pfn;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-06 10:44 [patch 1/2]compaction: check migrated page number Shaohua Li
@ 2012-09-06 12:17 ` Mel Gorman
  2012-09-06 12:55   ` Shaohua Li
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2012-09-06 12:17 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-mm, akpm, aarcange

On Thu, Sep 06, 2012 at 06:44:04PM +0800, Shaohua Li wrote:
> 
> isolate_migratepages_range() might isolate none pages, for example, when
> zone->lru_lock is contended and compaction is async. In this case, we should
> abort compaction, otherwise, compact_zone will run a useless loop and make
> zone->lru_lock is even contended.
> 

It might also isolate no pages because the range was 100% allocated and
there were no free pages to isolate. This is perfectly normal and I suspect
this patch effectively disables compaction. What problem did you observe
that this patch is aimed at?

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-06 12:17 ` Mel Gorman
@ 2012-09-06 12:55   ` Shaohua Li
  2012-09-06 13:25     ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2012-09-06 12:55 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, akpm, aarcange

On Thu, Sep 06, 2012 at 01:17:25PM +0100, Mel Gorman wrote:
> On Thu, Sep 06, 2012 at 06:44:04PM +0800, Shaohua Li wrote:
> > 
> > isolate_migratepages_range() might isolate none pages, for example, when
> > zone->lru_lock is contended and compaction is async. In this case, we should
> > abort compaction, otherwise, compact_zone will run a useless loop and make
> > zone->lru_lock is even contended.
> > 
> 
> It might also isolate no pages because the range was 100% allocated and
> there were no free pages to isolate. This is perfectly normal and I suspect
> this patch effectively disables compaction. What problem did you observe
> that this patch is aimed at?

I'm running a random swapin/out workload. When memory is fragmented enough, I
saw 100% cpu usage. perf shows zone->lru_lock is heavily contended in
isolate_migratepages_range. I'm using slub(I didn't see the problem with slab),
the allocation is for radix_tree_node slab, which needs 4 pages. Even If I just
apply the second patch, the system is still in 100% cpu usage. The
spin_is_contended check can't cure the problem completely. Trace shows
compact_zone will run a useless loop and each loop contend the lru_lock. With
this patch, the cpu usage becomes normal (about 20% utilization).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-06 12:55   ` Shaohua Li
@ 2012-09-06 13:25     ` Mel Gorman
  2012-09-07  4:12       ` Shaohua Li
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2012-09-06 13:25 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-mm, akpm, aarcange

On Thu, Sep 06, 2012 at 08:55:26PM +0800, Shaohua Li wrote:
> On Thu, Sep 06, 2012 at 01:17:25PM +0100, Mel Gorman wrote:
> > On Thu, Sep 06, 2012 at 06:44:04PM +0800, Shaohua Li wrote:
> > > 
> > > isolate_migratepages_range() might isolate none pages, for example, when
> > > zone->lru_lock is contended and compaction is async. In this case, we should
> > > abort compaction, otherwise, compact_zone will run a useless loop and make
> > > zone->lru_lock is even contended.
> > > 
> > 
> > It might also isolate no pages because the range was 100% allocated and
> > there were no free pages to isolate. This is perfectly normal and I suspect
> > this patch effectively disables compaction. What problem did you observe
> > that this patch is aimed at?
> 
> I'm running a random swapin/out workload. When memory is fragmented enough, I
> saw 100% cpu usage. perf shows zone->lru_lock is heavily contended in
> isolate_migratepages_range. I'm using slub(I didn't see the problem with slab),
> the allocation is for radix_tree_node slab, which needs 4 pages.

Ok, the fragmentaiton is due to high-order unmovable kernel allocations from
SLUB which will have diminishing returns over time.  One option to address
this is to check if it's a high-order kernel allocation that can fail and
not compact in that case. SLUB will fall back to using order-0 instead.

> Even If I just
> apply the second patch, the system is still in 100% cpu usage. The
> spin_is_contended check can't cure the problem completely.

Are you sure it's really contention in that case and not just a lot of
time is spent in compaction trying to satisfy the radix_tree_node
allocation requests?

> Trace shows
> compact_zone will run a useless loop and each loop contend the lru_lock. With
> this patch, the cpu usage becomes normal (about 20% utilization).

I suspect the reason why this patch has an effect is because compaction is
no longer running. It finds a 100% full pageblock quickly and then aborts and
that is not the right fix. Can you try something like this instead please?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8f6eea3..bd5bd6d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2114,6 +2114,10 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		return NULL;
 	}
 
+	/* Do not compact for high-order kernel allocations that can fail */
+	if ((gfp_mask & (__GFP_NORETRY | __GFP_MOVABLE)) == __GFP_NORETRY)
+		return NULL;
+
 	current->flags |= PF_MEMALLOC;
 	*did_some_progress = try_to_compact_pages(zonelist, order, gfp_mask,
 						nodemask, sync_migration,

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-06 13:25     ` Mel Gorman
@ 2012-09-07  4:12       ` Shaohua Li
  2012-09-07 15:52         ` Andrea Arcangeli
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2012-09-07  4:12 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, akpm, aarcange

On Thu, Sep 06, 2012 at 02:25:51PM +0100, Mel Gorman wrote:
> On Thu, Sep 06, 2012 at 08:55:26PM +0800, Shaohua Li wrote:
> > On Thu, Sep 06, 2012 at 01:17:25PM +0100, Mel Gorman wrote:
> > > On Thu, Sep 06, 2012 at 06:44:04PM +0800, Shaohua Li wrote:
> > > > 
> > > > isolate_migratepages_range() might isolate none pages, for example, when
> > > > zone->lru_lock is contended and compaction is async. In this case, we should
> > > > abort compaction, otherwise, compact_zone will run a useless loop and make
> > > > zone->lru_lock is even contended.
> > > > 
> > > 
> > > It might also isolate no pages because the range was 100% allocated and
> > > there were no free pages to isolate. This is perfectly normal and I suspect
> > > this patch effectively disables compaction. What problem did you observe
> > > that this patch is aimed at?
> > 
> > I'm running a random swapin/out workload. When memory is fragmented enough, I
> > saw 100% cpu usage. perf shows zone->lru_lock is heavily contended in
> > isolate_migratepages_range. I'm using slub(I didn't see the problem with slab),
> > the allocation is for radix_tree_node slab, which needs 4 pages.
> 
> Ok, the fragmentaiton is due to high-order unmovable kernel allocations from
> SLUB which will have diminishing returns over time.  One option to address
> this is to check if it's a high-order kernel allocation that can fail and
> not compact in that case. SLUB will fall back to using order-0 instead.

I tried actually, and it doesn't help. The problem is compact_zone keeps
running isolate_migratepages_range, which does nothing except doing a
lock/unlock.
 
> > Even If I just
> > apply the second patch, the system is still in 100% cpu usage. The
> > spin_is_contended check can't cure the problem completely.
> 
> Are you sure it's really contention in that case and not just a lot of
> time is spent in compaction trying to satisfy the radix_tree_node
> allocation requests?

certainly it's the contention.
 
> > Trace shows
> > compact_zone will run a useless loop and each loop contend the lru_lock. With
> > this patch, the cpu usage becomes normal (about 20% utilization).
> 
> I suspect the reason why this patch has an effect is because compaction is
> no longer running. It finds a 100% full pageblock quickly and then aborts and
> that is not the right fix. Can you try something like this instead please?

That debug patch doesn't help. My system just hang.

I thought your worry is valid, we shouldn't abort if 100% full pageblock is
found. How about this one? With it, the cpu usage is normal in my workload.
Occassionally I saw cpu usage reaches high (up to 80%), but recovered
immediately. Without the patch, the cpu usage keeps in 100%.

Thanks,
Shaohua


Subject: compaction: check migrated page number

isolate_migratepages_range() might isolate none pages, for example, when
zone->lru_lock is contended and compaction is async. In this case, we should
abort compaction, otherwise, compact_zone will run a useless loop and make
zone->lru_lock is even contended.

Signed-off-by: Shaohua Li <shli@fusionio.com>
---
 mm/compaction.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux/mm/compaction.c
===================================================================
--- linux.orig/mm/compaction.c	2012-09-06 18:37:52.636413761 +0800
+++ linux/mm/compaction.c	2012-09-07 10:51:16.734081959 +0800
@@ -618,7 +618,7 @@ typedef enum {
 static isolate_migrate_t isolate_migratepages(struct zone *zone,
 					struct compact_control *cc)
 {
-	unsigned long low_pfn, end_pfn;
+	unsigned long low_pfn, end_pfn, old_low_pfn;
 
 	/* Do not scan outside zone boundaries */
 	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
@@ -633,8 +633,9 @@ static isolate_migrate_t isolate_migrate
 	}
 
 	/* Perform the isolation */
+	old_low_pfn = low_pfn;
 	low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn);
-	if (!low_pfn)
+	if (!low_pfn || old_low_pfn == low_pfn)
 		return ISOLATE_ABORT;
 
 	cc->migrate_pfn = low_pfn;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-07  4:12       ` Shaohua Li
@ 2012-09-07 15:52         ` Andrea Arcangeli
  2012-09-10  1:05           ` Shaohua Li
  0 siblings, 1 reply; 7+ messages in thread
From: Andrea Arcangeli @ 2012-09-07 15:52 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Mel Gorman, linux-mm, akpm

On Fri, Sep 07, 2012 at 12:12:12PM +0800, Shaohua Li wrote:
> Subject: compaction: check migrated page number
> 
> isolate_migratepages_range() might isolate none pages, for example, when
> zone->lru_lock is contended and compaction is async. In this case, we should
> abort compaction, otherwise, compact_zone will run a useless loop and make
> zone->lru_lock is even contended.
> 
> Signed-off-by: Shaohua Li <shli@fusionio.com>
> ---
>  mm/compaction.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: linux/mm/compaction.c
> ===================================================================
> --- linux.orig/mm/compaction.c	2012-09-06 18:37:52.636413761 +0800
> +++ linux/mm/compaction.c	2012-09-07 10:51:16.734081959 +0800
> @@ -618,7 +618,7 @@ typedef enum {
>  static isolate_migrate_t isolate_migratepages(struct zone *zone,
>  					struct compact_control *cc)
>  {
> -	unsigned long low_pfn, end_pfn;
> +	unsigned long low_pfn, end_pfn, old_low_pfn;
>  
>  	/* Do not scan outside zone boundaries */
>  	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
> @@ -633,8 +633,9 @@ static isolate_migrate_t isolate_migrate
>  	}
>  
>  	/* Perform the isolation */
> +	old_low_pfn = low_pfn;
>  	low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn);
> -	if (!low_pfn)
> +	if (!low_pfn || old_low_pfn == low_pfn)
>  		return ISOLATE_ABORT;
>  
>  	cc->migrate_pfn = low_pfn;

Looks good to me.

This other below approach should also work:

diff --git a/mm/compaction.c b/mm/compaction.c
index 7fcd3a5..aefb712 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -70,8 +70,7 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
 
 		/* async aborts if taking too long or contended */
 		if (!cc->sync) {
-			if (cc->contended)
-				*cc->contended = true;
+			cc->contended = true;
 			return false;
 		}
 
@@ -634,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 
 	/* Perform the isolation */
 	low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn);
-	if (!low_pfn)
+	if (!low_pfn || cc->contended)
 		return ISOLATE_ABORT;
 
 	cc->migrate_pfn = low_pfn;
@@ -831,6 +830,7 @@ static unsigned long compact_zone_order(struct zone *zone,
 				 int order, gfp_t gfp_mask,
 				 bool sync, bool *contended)
 {
+	unsigned long ret;
 	struct compact_control cc = {
 		.nr_freepages = 0,
 		.nr_migratepages = 0,
@@ -838,12 +838,14 @@ static unsigned long compact_zone_order(struct zone *zone,
 		.migratetype = allocflags_to_migratetype(gfp_mask),
 		.zone = zone,
 		.sync = sync,
-		.contended = contended,
 	};
 	INIT_LIST_HEAD(&cc.freepages);
 	INIT_LIST_HEAD(&cc.migratepages);
 
-	return compact_zone(zone, &cc);
+	ret = compact_zone(zone, &cc);
+	if (contended)
+		*contended = cc.contended;
+	return ret;
 }
 
 int sysctl_extfrag_threshold = 500;
diff --git a/mm/internal.h b/mm/internal.h
index 53418cd..dbb32ff 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -130,7 +130,7 @@ struct compact_control {
 	int order;			/* order a direct compactor needs */
 	int migratetype;		/* MOVABLE, RECLAIMABLE etc */
 	struct zone *zone;
-	bool *contended;		/* True if a lock was contended */
+	bool contended;			/* True if a lock was contended */
 };
 
 unsigned long

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [patch 1/2]compaction: check migrated page number
  2012-09-07 15:52         ` Andrea Arcangeli
@ 2012-09-10  1:05           ` Shaohua Li
  0 siblings, 0 replies; 7+ messages in thread
From: Shaohua Li @ 2012-09-10  1:05 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Mel Gorman, linux-mm, akpm

On Fri, Sep 07, 2012 at 05:52:43PM +0200, Andrea Arcangeli wrote:
> On Fri, Sep 07, 2012 at 12:12:12PM +0800, Shaohua Li wrote:
> > Subject: compaction: check migrated page number
> > 
> > isolate_migratepages_range() might isolate none pages, for example, when
> > zone->lru_lock is contended and compaction is async. In this case, we should
> > abort compaction, otherwise, compact_zone will run a useless loop and make
> > zone->lru_lock is even contended.
> > 
> > Signed-off-by: Shaohua Li <shli@fusionio.com>
> > ---
> >  mm/compaction.c |    5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > Index: linux/mm/compaction.c
> > ===================================================================
> > --- linux.orig/mm/compaction.c	2012-09-06 18:37:52.636413761 +0800
> > +++ linux/mm/compaction.c	2012-09-07 10:51:16.734081959 +0800
> > @@ -618,7 +618,7 @@ typedef enum {
> >  static isolate_migrate_t isolate_migratepages(struct zone *zone,
> >  					struct compact_control *cc)
> >  {
> > -	unsigned long low_pfn, end_pfn;
> > +	unsigned long low_pfn, end_pfn, old_low_pfn;
> >  
> >  	/* Do not scan outside zone boundaries */
> >  	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
> > @@ -633,8 +633,9 @@ static isolate_migrate_t isolate_migrate
> >  	}
> >  
> >  	/* Perform the isolation */
> > +	old_low_pfn = low_pfn;
> >  	low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn);
> > -	if (!low_pfn)
> > +	if (!low_pfn || old_low_pfn == low_pfn)
> >  		return ISOLATE_ABORT;
> >  
> >  	cc->migrate_pfn = low_pfn;
> 
> Looks good to me.
> 
> This other below approach should also work:

Yep, the logic is the same. But your code looks prettier, thanks! I'll send a
formal path.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-09-10  1:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-06 10:44 [patch 1/2]compaction: check migrated page number Shaohua Li
2012-09-06 12:17 ` Mel Gorman
2012-09-06 12:55   ` Shaohua Li
2012-09-06 13:25     ` Mel Gorman
2012-09-07  4:12       ` Shaohua Li
2012-09-07 15:52         ` Andrea Arcangeli
2012-09-10  1:05           ` Shaohua Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).