linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Xishi Qiu <qiuxishi@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	riel@redhat.com, aquini@redhat.com, linux-mm@kvack.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: skip the page buddy block instead of one page
Date: Thu, 15 Aug 2013 01:39:21 +0900	[thread overview]
Message-ID: <20130814163921.GC2706@gmail.com> (raw)
In-Reply-To: <20130814161642.GM2296@suse.de>

On Wed, Aug 14, 2013 at 05:16:42PM +0100, Mel Gorman wrote:
> On Thu, Aug 15, 2013 at 12:52:29AM +0900, Minchan Kim wrote:
> > Hi Mel,
> > 
> > On Wed, Aug 14, 2013 at 09:57:11AM +0100, Mel Gorman wrote:
> > > On Wed, Aug 14, 2013 at 12:45:41PM +0800, Xishi Qiu wrote:
> > > > A large free page buddy block will continue many times, so if the page 
> > > > is free, skip the whole page buddy block instead of one page.
> > > > 
> > > > Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
> > > 
> > > page_order cannot be used unless zone->lock is held which is not held in
> > > this path. Acquiring the lock would prevent parallel allocations from the
> > 
> > Argh, I missed that. And it seems you already pointed it out long time ago
> > someone try to do same things if I remember correctly. :(
> 
> It feels familiar but I do not remember why.
> 
> > But let's think about it more.
> > 
> > It's always not right because CMA and memory-hotplug already isolated
> > free pages in the range to MIGRATE_ISOLATE right before starting migration
> > so we could use page_order safely in those contexts even if we don't hold
> > zone->lock.
> >  
> 
> Both of those are teh corner cases. Neither operation happen frequently
> in comparison to something like THP allocations for example. I think an
> optimisation along those lines is marginal at best.

In embedded side, we don't use THP yet but uses CMA and memory-hotplug so
your claim isn't the case for the embedded world.
And as I said, CMA area is last fallback so it's likely to have many free
pages so bigger CMA area is, the bigger the benefit(CPU and Power) is,
I guess.

> 
> > In addition, it's likely to have many free pages in case of CMA because CMA
> > makes MIGRATE_CMA fallback of MIGRATE_MOVABLE to minimize number of migrations.
> > Even CMA area was full, it could have many free pages once driver who is
> > CMA area's owner releases the CMA area. So, the bigger CMA space is,
> > the bigger patch's benefit is. And it could help memory-hotplug, too.
> > 
> > Only problem is normal compaction. The worst case is just skipping
> > pageblock_nr_pages, for instace, 4M(of course, it depends on configuration).
> > but we can make the race window very small by dobule checking PageBuddy.
> > Still, it could make the race theoretically but I think it's really really
> > unlikely and still enhance compaction overhead withtout holding the lock.
> > Even if the race happens, normal compaction's customers(ex, THP) doesn't
> > have critical result and just fallback. So I think it isn't not bad tradeoff.
> > 
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 05ccb4c..2341d52 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -520,8 +520,18 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> >  			goto next_pageblock;
> >  
> >  		/* Skip if free */
> > -		if (PageBuddy(page))
> > +		if (PageBuddy(page)) {
> > +			/*
> > +			 * page_order is racy without zone->lock but worst case
> > +			 * by the racing is just skipping pageblock_nr_pages.
> > +			 * but even the race is really unlikely by double
> > +			 * check of PageBuddy.
> > +			 */
> > +			unsigned long order = page_order(page);
> > +			if (PageBuddy(page))
> > +				low_pfn += (1 << order) - 1;
> >  			continue;
> > +		}
> >  
> 
> Even if the page is still page buddy, there is no guarantee that it's
> the same page order as the first read. It could have be currently
> merging with adjacent buddies for example. There is also a really
> small race that a page was freed, allocated with some number stuffed
> into page->private and freed again before the second PageBuddy check.
> It's a bit of a hand grenade. How much of a performance benefit is there

1. Just worst case is skipping pageblock_nr_pages
2. Race is really small
3. Higher order page allocation customer always have graceful fallback.

If you really have a concern about that, we can add following.

  reply	other threads:[~2013-08-14 16:39 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-14  4:45 [PATCH] mm: skip the page buddy block instead of one page Xishi Qiu
2013-08-14  7:07 ` Minchan Kim
2013-08-14  8:57 ` Mel Gorman
2013-08-14  9:14   ` Xishi Qiu
2013-08-14 15:52   ` Minchan Kim
2013-08-14 16:16     ` Mel Gorman
2013-08-14 16:39       ` Minchan Kim [this message]
2013-08-14 18:00         ` Mel Gorman
2013-08-14 19:11           ` Minchan Kim
2013-08-15  2:32           ` Xishi Qiu
2013-08-15  2:44             ` Minchan Kim
2013-08-15  3:46               ` Xishi Qiu
2013-08-15  3:59                 ` Wanpeng Li
2013-08-15  3:59                 ` Wanpeng Li
2013-08-15  4:17                 ` Minchan Kim
2013-08-15  4:24                   ` Minchan Kim
2013-08-15  7:45                     ` Xishi Qiu
2013-08-15  9:51                       ` Wanpeng Li
2013-08-15  9:51                       ` Wanpeng Li
2013-08-15 11:15                         ` Xishi Qiu
2013-08-15 11:23                           ` Wanpeng Li
2013-08-15 11:23                           ` Wanpeng Li
2013-08-15 11:17                         ` Xishi Qiu
2013-08-15  6:38                   ` Xishi Qiu
2013-08-15 11:30                   ` Mel Gorman
2013-08-15 13:19                     ` Minchan Kim
2013-08-15 13:42                       ` Mel Gorman
2013-08-15 14:16                         ` Minchan Kim
2013-08-14 20:26     ` Andrew Morton
2013-08-14 22:22       ` Mel Gorman
2014-01-17 14:32         ` [PATCH] mm: Improve documentation of page_order Mel Gorman
2014-01-17 18:40           ` Rafael Aquini
2014-01-17 18:53           ` Laura Abbott
2014-01-17 19:59             ` Mel Gorman
2014-01-21 11:05               ` [PATCH] mm: Improve documentation of page_order v2 Mel Gorman
2014-01-20  6:12           ` [PATCH] mm: Improve documentation of page_order Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130814163921.GC2706@gmail.com \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aquini@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=qiuxishi@huawei.com \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).