Re: [PATCH] mm: skip the page buddy block instead of one page

From: Minchan Kim <minchan@kernel.org>
To: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	riel@redhat.com, aquini@redhat.com, linux-mm@kvack.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: skip the page buddy block instead of one page
Date: Thu, 15 Aug 2013 11:44:27 +0900
Message-ID: <20130815024427.GA2718@gmail.com>
In-Reply-To: <520C3DD2.8010905@huawei.com>

Hi Xishi,

On Thu, Aug 15, 2013 at 10:32:50AM +0800, Xishi Qiu wrote:
> On 2013/8/15 2:00, Mel Gorman wrote:
> 
> >>> Even if the page is still page buddy, there is no guarantee that it's
> >>> the same page order as the first read. It could be currently
> >>> merging with adjacent buddies for example. There is also a really
> >>> small race that a page was freed, allocated with some number stuffed
> >>> into page->private and freed again before the second PageBuddy check.
> >>> It's a bit of a hand grenade. How much of a performance benefit is there
> >>
> >> 1. The worst case is just skipping pageblock_nr_pages
> > 
> > No, the worst case is that page_order returns a number that is
> > completely garbage and low_pfn goes off the end of the zone
> > 
> >> 2. The race window is really small
> >> 3. Higher-order page allocation callers always have a graceful fallback.
> >>
> 
> Hi Minchan, 
> I think in this case, we may get the wrong value from page_order(page).
> 
> 1. page is in page buddy
> 
> > if (PageBuddy(page)) {
> 
> 2. someone allocated the page, and set page->private to another value
> 
> > 	int nr_pages = (1 << page_order(page)) - 1;
> 
> 3. someone freed the page
> 
> > 	if (PageBuddy(page)) {
> 
> 4. we will skip wrong pages
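
The four steps above can be replayed deterministically in userspace. This is only a minimal sketch and not code from the thread: `struct fake_page`, `fake_page_order()` and `replay_race()` are hypothetical stand-ins for `struct page`, `page_order()` and the scanner, and the other CPU's allocation and free are simply inlined at the points where they would have to land. It also assumes that freeing the page stores a fresh order in the field that models page->private.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two fields the race touches. */
struct fake_page {
	bool buddy;          /* models PageBuddy(page) */
	unsigned long priv;  /* models page->private (the order while the page is free) */
};

/* Models page_order(): the value is only meaningful while the page is free. */
static unsigned long fake_page_order(const struct fake_page *page)
{
	return page->priv;
}

/*
 * Replays steps 1-4 on a single thread. Returns the number of pages the
 * scanner would skip, or 0 if either PageBuddy check fails.
 */
static unsigned long replay_race(struct fake_page *page, unsigned long stuffed)
{
	unsigned long nr_pages;

	/* Keep the shift below well defined in this sketch. */
	assert(stuffed < sizeof(unsigned long) * CHAR_BIT);

	if (!page->buddy)		/* step 1: first PageBuddy check passes */
		return 0;

	page->buddy = false;		/* step 2: another CPU allocates the page */
	page->priv = stuffed;		/* and stores an unrelated value */

	/* The scanner reads the order now, so it sees the unrelated value. */
	nr_pages = (1UL << fake_page_order(page)) - 1;

	page->priv = 0;			/* step 3: the page is freed again, */
	page->buddy = true;		/* here assumed to come back as order 0 */

	if (!page->buddy)		/* second PageBuddy check also passes */
		return 0;

	return nr_pages;		/* step 4: the skip is based on the stale value */
}
```

Starting from a free order-3 page and stuffing the value 20, `replay_race()` returns (1 << 20) - 1 even though both PageBuddy checks passed and the page ends up as an order-0 buddy page.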

So, what is the actual consequence of that?
As I said, the worst case is just skipping (pageblock_nr_pages - 1) pages.
The case you describe is correct in theory, and Mel and I have already
pointed it out. But how often could it happen in practice?
I believe it is REALLY, REALLY rare.
So, as Mel said, if you have workloads that show a benefit
from this patch, I think we could accept it.
Could you try that and respin with the numbers?
I guess a big contiguous memory range, or memory-hotplug, that is
full of free pages could benefit, especially on an embedded CPU,
which is rather slower than a server or desktop one.

Thanks.

> 
> > 		nr_pages = min(nr_pages, MAX_ORDER_NR_PAGES - 1);
> > 		low_pfn += nr_pages;
> > 		continue;
> > 	}
> > }
> > 
> > It's still race-prone meaning that it really should be backed by some
> > performance data justifying it.
> > 
> 
> 
> 
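
The effect of the min() clamp in the quoted snippet can be checked with plain arithmetic. The following is a rough sketch under stated assumptions, not the patch itself: `SKETCH_MAX_ORDER` is assumed to be 11, giving 1024 pages per maximum-order block, whereas the real MAX_ORDER_NR_PAGES depends on the kernel configuration. `skip_buddy_block()` and `min_ul()` are hypothetical helper names, and the enclosing loop's own low_pfn increment is not modelled.

```c
#include <assert.h>

/* Assumed values for illustration only; the real ones depend on the config. */
#define SKETCH_MAX_ORDER		11
#define SKETCH_MAX_ORDER_NR_PAGES	(1UL << (SKETCH_MAX_ORDER - 1))

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Returns the new low_pfn after skipping a buddy block. nr_pages is what
 * (1 << page_order(page)) - 1 produced and may be garbage if the read raced.
 */
static unsigned long skip_buddy_block(unsigned long low_pfn, unsigned long nr_pages)
{
	/* A garbage order can never move the scanner further than this. */
	nr_pages = min_ul(nr_pages, SKETCH_MAX_ORDER_NR_PAGES - 1);
	return low_pfn + nr_pages;
}

/* True if the skip would still leave low_pfn inside [.., end_pfn). */
static int skip_stays_in_range(unsigned long low_pfn, unsigned long nr_pages,
			       unsigned long end_pfn)
{
	assert(low_pfn < end_pfn);
	return skip_buddy_block(low_pfn, nr_pages) < end_pfn;
}
```

With these assumed values, a sane order-3 read moves low_pfn from 4096 to 4103, while a garbage read of (1 << 20) - 1 is clamped and moves it to 5119. The clamp bounds the damage, but near the end of the scanned range even a clamped skip can land past end_pfn, which is why the range check is kept as a separate helper here.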

-- 
Kind regards,
Minchan Kim

