linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Gioh Kim <gioh.kim@lge.com>,
	akpm@linux-foundation.org, mgorman@suse.de, riel@redhat.com,
	hannes@cmpxchg.org, rientjes@google.com, vdavydov@parallels.com,
	iamjoonsoo.kim@lge.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, gunho.lee@lge.com
Subject: Re: [RFCv2] mm: page allocation for less fragmentation
Date: Wed, 01 Apr 2015 14:05:30 +0200
Message-ID: <551BDF0A.2090503@suse.cz>
In-Reply-To: <551343E3.3050709@lge.com>

On 03/26/2015 12:25 AM, Gioh Kim wrote:
>
>
> On 2015-03-26 at 7:16 AM, Vlastimil Babka wrote:
>> On 25.3.2015 3:39, Gioh Kim wrote:
>>> My driver allocates more than 40MB of pages via alloc_page() at a time and
>>> maps them at a virtual address. In total it uses 300~400MB of pages.
>>>
>>> If I run a heavy load test for a few days on a 1GB memory system, I cannot allocate even order=3 pages
>>> because of the external fragmentation.
>>>
>>> I thought I needed an anti-fragmentation solution for my driver.
>>> But there is no allocation function that considers fragmentation.
>>> Compaction is not helpful because it works only on movable pages, not unmovable pages.
>>>
>>> This patch proposes an allocation function that allocates only pages in the same pageblock.
>>>
>>> I tested this patch as follows:
>>>
>>> 1. The driver allocates about 400MB, and then I run "cat /proc/pagetypeinfo;cat /proc/buddyinfo":
>>>
>>> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
>>> Node    0, zone   Normal, type    Unmovable   3864    728    394    216    129     47     18      9      1      0      0
>>> Node    0, zone   Normal, type  Reclaimable    902     96     68     17      3      0      1      0      0      0      0
>>> Node    0, zone   Normal, type      Movable   5146    663    178     91     43     16      4      0      0      0      0
>>> Node    0, zone   Normal, type      Reserve      1      4      6      6      2      1      1      1      0      1      1
>>> Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
>>> Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
>>>
>>> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
>>> Node 0, zone   Normal          135            3          124            2            0            0
>>> Node 0, zone   Normal   9880   1489    647    332    177     64     24     10      1      1      1
>>>
>>> 2. The driver frees all pages and allocates pages again with alloc_pages_compact.
>>
>> This is not a good test setup. You shouldn't switch the allocation types during
>> a single system boot. You should compare results from a boot where the common
>> allocation is used and from a boot where your new allocation is used.
>
> The new allocator is slower, so I don't think it can replace the current allocator.
> I don't aim to change the general allocator.

I'm not saying you should replace the current allocator for everything.
Use it just for your driver, that's fine. But when you perform/simulate
your driver allocation, use either the general allocator or the new
allocator; don't switch from one to the other during a single boot.

> The main purpose of the new allocator is to be a specific allocator for when the system has too much fragmentation.
> If some driver consumes a lot of memory and generates fragmentation, it can use the new allocator instead at that time.
> I want to make a kind of compaction for drivers that allocate unmovable pages.
>
> Therefore I tested it that way.
> I first generated fragmentation and then called the new allocator.
> I wanted to check whether the fragmentation was caused by my driver
> and whether the pages of the driver could be compacted.
> I thought the pages were compacted.
>
> If I freed the pages and called the common allocator again,
> it could decrease fragmentation a little (not as much as the new allocator).
> But there was no page compaction and fragmentation would soon increase.

Yes, we need data comparing the common and new allocators in the same
scenario. Presumably that's what you have in the v3 submission.
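
One way to make such a comparison quantifiable is to reduce each /proc/buddyinfo line to a couple of numbers. The following is only a minimal sketch: the function names and the choice of order 3 as the threshold (taken from the "cannot allocate even order=3 pages" complaint) are illustrative assumptions, not part of the patch or of any kernel tooling. The two sample lines are the buddyinfo lines quoted in this message.

```python
def parse_buddyinfo_line(line):
    """Return the per-order free block counts from one /proc/buddyinfo line."""
    # Expected shape: "Node 0, zone   Normal   c0 c1 ... c10"
    fields = line.split()
    zone_idx = fields.index("zone")
    # fields[zone_idx + 1] is the zone name; the counts follow it.
    return [int(x) for x in fields[zone_idx + 2:]]


def free_pages(counts, min_order=0):
    """Free base pages held in blocks of at least min_order."""
    return sum(n << order for order, n in enumerate(counts) if order >= min_order)


def high_order_fraction(counts, min_order=3):
    """Share of free memory sitting in blocks of order >= min_order."""
    total = free_pages(counts)
    return free_pages(counts, min_order) / total if total else 0.0


# The two buddyinfo lines quoted in this message.
FIRST = "Node 0, zone   Normal   9880   1489    647    332    177     64     24     10      1      1      1"
SECOND = "Node 0, zone   Normal   5693    877    266    544    320    108     43     12      1      1      1"

for label, line in (("first listing", FIRST), ("second listing", SECOND)):
    counts = parse_buddyinfo_line(line)
    print(label, free_pages(counts), round(high_order_fraction(counts), 3))
```

Run on lines captured from two separate boots under the same workload, the same two numbers (total free pages, and the share of them in order-3-or-larger blocks) would give a like-for-like comparison. The two listings above come from a single boot, so they only illustrate the calculation.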

>
>
>>
>>> This is a kind of compaction of the driver.
>>> The following is the result of "cat /proc/pagetypeinfo;cat /proc/buddyinfo"
>>>
>>> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
>>> Node    0, zone   Normal, type    Unmovable      8      5      1    432    272     91     37     11      1      0      0
>>> Node    0, zone   Normal, type  Reclaimable    901     96     68     17      3      0      1      0      0      0      0
>>> Node    0, zone   Normal, type      Movable   4790    776    192     91     43     16      4      0      0      0      0
>>> Node    0, zone   Normal, type      Reserve      1      4      6      6      2      1      1      1      0      1      1
>>> Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
>>> Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
>>>
>>> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
>>> Node 0, zone   Normal          135            3          124            2            0            0
>>> Node 0, zone   Normal   5693    877    266    544    320    108     43     12      1      1      1
>>
>> The number of unmovable pageblocks didn't change here. The stats for free
>> unmovable pages do look better for higher orders than in the first listing
>> above, but even the common allocation logic would give you that result, if you
>> allocated your 400 MB using (many) order-0 allocations (since you apparently
>> don't care about physically contiguous memory). That would also prefer order-0
>> free pages before splitting higher orders. So this doesn't demonstrate benefits
>> of the alloc_pages_compact() approach I'm afraid. The results suggest that the
>> system was in a worse state when the first allocation happened, and meanwhile
>> some pages were freed, creating the large numbers of order-0 unmovable free
>> pages. Or maybe the system got fragmented in the first allocation because your
>> driver tries to allocate the memory with high-order allocations before falling
>> back to lower orders? That would probably defeat the natural anti-fragmentation
>> of the buddy system.
>
> My driver allocates pages only with alloc_page(), not with high-order alloc_pages().
>
> Yes, if I freed the pages and called alloc_page() again, it could decrease fragmentation at that time.
> But there was no compaction and fragmentation would soon increase,
> because the allocated pages were scattered all over the system.
>
> The new allocator compacts pages. I believe it can decrease fragmentation for a long time.

If that's what v3 shows, ok. Let me check.
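
The earlier point that order-0 requests consume free order-0 pages before any higher-order block is split can be illustrated with a toy model. This is a sketch under loud assumptions: `alloc_order0` is a hypothetical helper, it models a single free list per order, and it ignores migratetype fallback, per-cpu page lists, and coalescing on free. The input is the Unmovable row of the first pagetypeinfo listing.

```python
def alloc_order0(free, n):
    """Allocate n order-0 pages from per-order free block counts (toy model).

    Each request takes the smallest free order available. Taking a block of
    order k > 0 splits it, leaving one free buddy at every order below k.
    """
    free = list(free)
    for _ in range(n):
        order = next((o for o, count in enumerate(free) if count > 0), None)
        if order is None:
            raise MemoryError("no free pages left")
        free[order] -= 1
        for lower in range(order):
            free[lower] += 1
    return free


# Unmovable free counts for orders 0..10, from the first listing in this thread.
UNMOVABLE = [3864, 728, 394, 216, 129, 47, 18, 9, 1, 0, 0]

after = alloc_order0(UNMOVABLE, 3864)
print(after)  # order 0 is drained, all higher orders are untouched
```

In this model the higher-order counts only start to drop once the order-0 list is empty, which is why many order-0 allocations from the common allocator would also leave the higher orders looking better.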

>>
>> So a proper test could be based on this:
>>
>>> If I run a heavy load test for a few days on a 1GB memory system, I cannot allocate even order=3 pages
>>> because of the external fragmentation.
>>
>> With this patch, is the situation quantifiably better? Can you post the
>> pagetypeinfo/buddyinfo for a system boot where all driver allocations use the
>> common allocator, and for a system boot with the patch? That should be
>> comparable if the workload is the same for both boots.
>>
>
> OK, I will. It can be a good test.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>


