Re: [patch 1/2]compaction: check migrated page number

From: Shaohua Li <shli@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, aarcange@redhat.com
Subject: Re: [patch 1/2]compaction: check migrated page number
Date: Fri, 7 Sep 2012 12:12:12 +0800
Message-ID: <20120907041212.GA31391@kernel.org>
In-Reply-To: <20120906132551.GS11266@suse.de>

On Thu, Sep 06, 2012 at 02:25:51PM +0100, Mel Gorman wrote:
> On Thu, Sep 06, 2012 at 08:55:26PM +0800, Shaohua Li wrote:
> > On Thu, Sep 06, 2012 at 01:17:25PM +0100, Mel Gorman wrote:
> > > On Thu, Sep 06, 2012 at 06:44:04PM +0800, Shaohua Li wrote:
> > > > 
> > > > isolate_migratepages_range() might isolate no pages, for example, when
> > > > zone->lru_lock is contended and compaction is async. In this case, we should
> > > > abort compaction; otherwise, compact_zone will run a useless loop and make
> > > > zone->lru_lock even more contended.
> > > > 
> > > 
> > > It might also isolate no pages because the range was 100% allocated and
> > > there were no free pages to isolate. This is perfectly normal and I suspect
> > > this patch effectively disables compaction. What problem did you observe
> > > that this patch is aimed at?
> > 
> > I'm running a random swapin/out workload. When memory is fragmented enough, I
> > see 100% cpu usage. perf shows zone->lru_lock is heavily contended in
> > isolate_migratepages_range. I'm using slub (I didn't see the problem with slab);
> > the allocation is for the radix_tree_node slab, which needs 4 pages.
> 
> Ok, the fragmentation is due to high-order unmovable kernel allocations from
> SLUB which will have diminishing returns over time.  One option to address
> this is to check if it's a high-order kernel allocation that can fail and
> not compact in that case. SLUB will fall back to using order-0 instead.

I actually tried that, and it doesn't help. The problem is that compact_zone
keeps running isolate_migratepages_range, which does nothing except take and
release the lock.
 
> > Even if I just
> > apply the second patch, the system is still in 100% cpu usage. The
> > spin_is_contended check can't cure the problem completely.
> 
> Are you sure it's really contention in that case and not just a lot of
> time is spent in compaction trying to satisfy the radix_tree_node
> allocation requests?

It's certainly the contention.
 
> > Trace shows
> > compact_zone will run a useless loop and each iteration contends for the
> > lru_lock. With this patch, the cpu usage becomes normal (about 20% utilization).
> 
> I suspect the reason why this patch has an effect is because compaction is
> no longer running. It finds a 100% full pageblock quickly and then aborts and
> that is not the right fix. Can you try something like this instead please?

That debug patch doesn't help. My system just hangs.

I think your concern is valid: we shouldn't abort if a 100% full pageblock is
found. How about this one? With it, the cpu usage is normal in my workload.
Occasionally I saw cpu usage go high (up to 80%), but it recovered
immediately. Without the patch, the cpu usage stays at 100%.
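
To illustrate the idea outside the kernel, here is a minimal user-space sketch
of the no-progress check. It is a model under stated assumptions, not the
kernel code: struct scan_state, scan_range() and isolate_step() are
hypothetical stand-ins, and the contended lock is reduced to a flag that makes
the scan return without advancing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's isolate_migrate_t. */
typedef enum { ISOLATE_ABORT, ISOLATE_NONE, ISOLATE_SUCCESS } isolate_result_t;

/* Hypothetical, heavily reduced model of the scanner state. */
struct scan_state {
	unsigned long migrate_pfn;	/* where the migrate scanner resumes */
	bool lock_contended;		/* models a contended lock in async mode */
};

/*
 * Models the range scan: returns the pfn the scan reached.
 * Assumption: when the lock is contended, the scan bails out immediately and
 * returns the pfn it started from. A fully allocated range is still scanned
 * to the end, so it advances even though nothing is isolated.
 */
static unsigned long scan_range(const struct scan_state *st,
				unsigned long low_pfn, unsigned long end_pfn)
{
	assert(low_pfn <= end_pfn);
	if (st->lock_contended)
		return low_pfn;
	return end_pfn;
}

/* One isolation step with the check from the patch: abort on no progress. */
static isolate_result_t isolate_step(struct scan_state *st,
				     unsigned long end_pfn)
{
	unsigned long old_low_pfn = st->migrate_pfn;
	unsigned long low_pfn = scan_range(st, old_low_pfn, end_pfn);

	if (!low_pfn || low_pfn == old_low_pfn)
		return ISOLATE_ABORT;

	st->migrate_pfn = low_pfn;
	return ISOLATE_SUCCESS;
}
```

In this model, a step that cannot advance returns ISOLATE_ABORT so the caller
stops looping, while a step that scans a full range without isolating anything
still moves migrate_pfn forward and is not treated as an abort.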

Thanks,
Shaohua


Subject: compaction: check migrated page number

isolate_migratepages_range() might isolate no pages, for example, when
zone->lru_lock is contended and compaction is async. In this case, we should
abort compaction; otherwise, compact_zone will run a useless loop and make
zone->lru_lock even more contended.

Signed-off-by: Shaohua Li <shli@fusionio.com>
---
 mm/compaction.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux/mm/compaction.c
===================================================================
--- linux.orig/mm/compaction.c	2012-09-06 18:37:52.636413761 +0800
+++ linux/mm/compaction.c	2012-09-07 10:51:16.734081959 +0800
@@ -618,7 +618,7 @@ typedef enum {
 static isolate_migrate_t isolate_migratepages(struct zone *zone,
 					struct compact_control *cc)
 {
-	unsigned long low_pfn, end_pfn;
+	unsigned long low_pfn, end_pfn, old_low_pfn;
 
 	/* Do not scan outside zone boundaries */
 	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
@@ -633,8 +633,9 @@ static isolate_migrate_t isolate_migrate
 	}
 
 	/* Perform the isolation */
+	old_low_pfn = low_pfn;
 	low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn);
-	if (!low_pfn)
+	if (!low_pfn || old_low_pfn == low_pfn)
 		return ISOLATE_ABORT;
 
 	cc->migrate_pfn = low_pfn;


Thread overview: 7+ messages
2012-09-06 10:44 [patch 1/2]compaction: check migrated page number Shaohua Li
2012-09-06 12:17 ` Mel Gorman
2012-09-06 12:55   ` Shaohua Li
2012-09-06 13:25     ` Mel Gorman
2012-09-07  4:12       ` Shaohua Li [this message]
2012-09-07 15:52         ` Andrea Arcangeli
2012-09-10  1:05           ` Shaohua Li