From: Xishi Qiu <qiuxishi@huawei.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
riel@redhat.com, aquini@redhat.com, linux-mm@kvack.org,
LKML <linux-kernel@vger.kernel.org>,
Xishi Qiu <qiuxishi@huawei.com>
Subject: Re: [PATCH] mm: skip the page buddy block instead of one page
Date: Thu, 15 Aug 2013 11:46:07 +0800
Message-ID: <520C4EFF.8040305@huawei.com>
In-Reply-To: <20130815024427.GA2718@gmail.com>

On 2013/8/15 10:44, Minchan Kim wrote:
> Hi Xishi,
>
> On Thu, Aug 15, 2013 at 10:32:50AM +0800, Xishi Qiu wrote:
>> On 2013/8/15 2:00, Mel Gorman wrote:
>>
>>>>> Even if the page is still page buddy, there is no guarantee that it's
>>>>> the same page order as the first read. It could have be currently
>>>>> merging with adjacent buddies for example. There is also a really
>>>>> small race that a page was freed, allocated with some number stuffed
>>>>> into page->private and freed again before the second PageBuddy check.
>>>>> It's a bit of a hand grenade. How much of a performance benefit is there
>>>>
>>>> 1. Just worst case is skipping pageblock_nr_pages
>>>
>>> No, the worst case is that page_order returns a number that is
>>> completely garbage and low_pfn goes off the end of the zone
>>>
>>>> 2. Race is really small
>>>> 3. Higher order page allocation customer always have graceful fallback.
>>>>
>>
>> Hi Minchan,
>> I think in this case, we may get the wrong value from page_order(page).
>>
>> 1. page is in page buddy
>>
>>> if (PageBuddy(page)) {
>>
>> 2. someone allocated the page, and set page->private to another value
>>
>>> int nr_pages = (1 << page_order(page)) - 1;
>>
>> 3. someone freed the page
>>
>>> if (PageBuddy(page)) {
>>
>> 4. we will skip wrong pages
>
> So, what's the result by that?
> As I said, it's just skipping (pageblock_nr_pages -1) at worst case

Hi Minchan,

I mean that if page->private is set to a large number, we will skip
2^private pages, not (pageblock_nr_pages - 1). I found that other code,
such as filesystems, also uses page->private. Here is the comment about
private:

	/* Mapping-private opaque data:
	 * usually used for buffer_heads
	 * if PagePrivate set; used for
	 * swp_entry_t if PageSwapCache;
	 * indicates order in the buddy
	 * system if PG_buddy is set.
	 */
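
The arithmetic can be sketched in userspace C. This is only an
illustration under stated assumptions, not the kernel code: struct
fake_page, skip_unclamped(), skip_clamped() and sketch_selftest() are
hypothetical names, and SKETCH_MAX_ORDER = 11 is an assumed value. It
compares trusting private as an order with bounding the result the way
the min(nr_pages, MAX_ORDER_NR_PAGES - 1) line quoted in this mail does.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define SKETCH_MAX_ORDER 11UL	/* assumed value */
#define SKETCH_MAX_ORDER_NR_PAGES (1UL << (SKETCH_MAX_ORDER - 1))

struct fake_page {
	unsigned long private;	/* order if buddy, else owner-defined data */
};

/* Pages to skip when private is trusted as an order with no bound. */
static unsigned long skip_unclamped(const struct fake_page *page)
{
	/* Shifting by >= the width of the type is undefined in C, so saturate. */
	if (page->private >= sizeof(unsigned long) * CHAR_BIT)
		return ULONG_MAX;
	return (1UL << page->private) - 1;
}

/* Same computation, bounded by the largest block the sketch allows. */
static unsigned long skip_clamped(const struct fake_page *page)
{
	unsigned long nr = skip_unclamped(page);
	unsigned long max = SKETCH_MAX_ORDER_NR_PAGES - 1;

	return nr < max ? nr : max;
}

/* Small self-check: a sane order versus a stale, non-order value. */
static void sketch_selftest(void)
{
	struct fake_page sane = { .private = 3 };
	struct fake_page stale = { .private = 40 };	/* e.g. left by another user */

	assert(skip_unclamped(&sane) == 7);
	assert(skip_clamped(&sane) == 7);
	assert(skip_unclamped(&stale) > SKETCH_MAX_ORDER_NR_PAGES - 1);
	assert(skip_clamped(&stale) == SKETCH_MAX_ORDER_NR_PAGES - 1);
}
```

With the clamp the skip is bounded, but a stale value can still make the
scanner skip pages that are not part of a free block, so the race itself
remains.
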
Thanks,
Xishi Qiu
> and the case you mentioned is right academically and I and Mel
> already pointed out that. But how often could that happen in real
> practice? I believe such is REALLY REALLY rare.
> So, as Mel said, if you have some workloads to see the benefit
> from this patch, I think we could accept the patch.
> Could you try and respin with the number?
> I guess big contigous memory range or memory-hotplug which are
> full of free pages in embedded CPU which is rather slower than server
> or desktop side could have benefit.
>
> Thanks.
>
>>
>>> nr_pages = min(nr_pages, MAX_ORDER_NR_PAGES - 1);
>>> low_pfn += nr_pages;
>>> continue;
>>> }
>>> }
>>>
>>> It's still race-prone meaning that it really should be backed by some
>>> performance data justifying it.
>>>
>>
>>
>>
>