[RFC][PATCH 0/3] big chunk memory allocator v2

public inbox for linux-arm-kernel@lists.infradead.org

* [RFC][PATCH 0/3] big chunk memory allocator v2
       [not found] <20101026190042.57f30338.kamezawa.hiroyu@jp.fujitsu.com>
@ 2010-10-27 23:22 ` Minchan Kim
  2010-10-29  9:20   ` Michał Nazarewicz
  0 siblings, 1 reply; 10+ messages in thread
From: Minchan Kim @ 2010-10-27 23:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Oct 26, 2010 at 7:00 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Hi, here is version 2.
>
> I only did small test and it seems to work (but I think there will be bug...)
> I post this now just because I'll be out of office 10/31-11/15 with ksummit and
> a private trip.
>
> Any comments are welcome but please see the interface is enough for use cases or
> not.  For example) If MAX_ORDER alignment is too bad, I need to rewrite almost
> all code.

First of all, thanks for your endless effort on embedded systems.
It's time for stakeholders to review this.
I've Cc'ed some people. Many of them may have to attend KS,
so I hope the SAMSUNG guys can review this.

They may not be able to test this, since ARM doesn't support the movable
zone yet (I will look into this).
As Kame said, please review whether this patch provides enough of an
interface and meets your requirements.
I think it can't meet _all_ of your requirements (e.g. latency and
guaranteeing that a big contiguous chunk of memory is obtained), but I
believe it can cover many of the non-critical cases.

>
> Now interface is:
>
>
> struct page *__alloc_contig_pages(unsigned long base, unsigned long end,
>                        unsigned long nr_pages, int align_order,
>                        int node, gfp_t gfpflag, nodemask_t *mask)
>
>  * @base: the lowest pfn which caller wants.
>  * @end:  the highest pfn which caller wants.
>  * @nr_pages: the length of a chunk of pages to be allocated.
>  * @align_order: alignment of start address of returned chunk in order.
>  *   Returned' page's order will be aligned to (1 << align_order).If smaller
>  *   than MAX_ORDER, it's raised to MAX_ORDER.
>  * @node: allocate near memory to the node, If -1, current node is used.
> ?* @gfpflag: see include/linux/gfp.h
> ?* @nodemask: allocate memory within the nodemask.
>
> If the caller wants a FIXED address, set end - base == nr_pages.
>
> The patch is based onto the latest mmotm + Bob's 3 patches for fixing
> memory_hotplug.c (they are queued.)
>
> Thanks,
> -Kame
>
>
>
>
>
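
To make the alignment question easier to review, here is a minimal userspace
sketch of the rules in the quoted kernel-doc. It is not code from the patch.
The sketch_* names and SKETCH_MAX_ORDER (set to 11 here) are hypothetical, and
treating @end as exclusive is an assumption taken only from the
"end - base == nr_pages" rule.

```c
#include <stdbool.h>

/* Assumed for illustration; the real value depends on kernel configuration. */
#define SKETCH_MAX_ORDER 11

/* Orders below MAX_ORDER are raised to MAX_ORDER, as the kernel-doc says. */
static int sketch_effective_order(int align_order)
{
	return align_order < SKETCH_MAX_ORDER ? SKETCH_MAX_ORDER : align_order;
}

/* Round a pfn up to the next multiple of (1 << order). */
static unsigned long sketch_align_up(unsigned long pfn, int order)
{
	unsigned long mask = (1UL << order) - 1;

	return (pfn + mask) & ~mask;
}

/* "If the caller wants a FIXED address, set end - base == nr_pages." */
static bool sketch_is_fixed(unsigned long base, unsigned long end,
			    unsigned long nr_pages)
{
	return end - base == nr_pages;
}

/*
 * Find the first aligned start pfn whose chunk fits in [base, end).
 * Treating @end as exclusive is an assumption of this sketch.
 */
static bool sketch_first_candidate(unsigned long base, unsigned long end,
				   unsigned long nr_pages, int align_order,
				   unsigned long *start)
{
	unsigned long pfn = sketch_align_up(base,
					    sketch_effective_order(align_order));

	if (pfn >= end || nr_pages > end - pfn)
		return false;
	*start = pfn;
	return true;
}
```

Under these assumptions, a fixed-address request whose base is not aligned to
(1 << MAX_ORDER) pages finds no candidate at all, which may be one of the cases
where MAX_ORDER alignment is "too bad".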



-- 
Kind regards,
Minchan Kim


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-27 23:22 ` [RFC][PATCH 0/3] big chunk memory allocator v2 Minchan Kim
@ 2010-10-29  9:20   ` Michał Nazarewicz
  2010-10-29 10:31     ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Michał Nazarewicz @ 2010-10-29  9:20 UTC (permalink / raw)
  To: linux-arm-kernel

> On Tue, Oct 26, 2010 at 7:00 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>> I only did small test and it seems to work (but I think there will be bug...)
>> I post this now just because I'll be out of office 10/31-11/15 with ksummit and
>> a private trip.
>>
>> Any comments are welcome but please see the interface is enough for use cases or
>> not.  For example) If MAX_ORDER alignment is too bad, I need to rewrite almost
>> all code.

On Thu, 28 Oct 2010 01:22:38 +0200, Minchan Kim <minchan.kim@gmail.com> wrote:
> First of all, thanks for your endless effort on embedded systems.
> It's time for stakeholders to review this.
> I've Cc'ed some people. Many of them may have to attend KS,
> so I hope the SAMSUNG guys can review this.
>
> They may not be able to test this, since ARM doesn't support the movable
> zone yet (I will look into this).
> As Kame said, please review whether this patch provides enough of an
> interface and meets your requirements.
> I think it can't meet _all_ of your requirements (e.g. latency and
> guaranteeing that a big contiguous chunk of memory is obtained), but I
> believe it can cover many of the non-critical cases.

I'm currently working on a framework (the CMA framework some may be aware of) which
in principle is meant for the same purpose: allocating physically contiguous blocks
of memory.  I'm hoping to help with latency, remove the need for MAX_ORDER alignment,
and help with fragmentation by letting different drivers allocate memory from
different memory ranges.

When I posted CMA, it was suggested that I create a new migration type
dedicated to contiguous allocations.  I think I have already done that, and thanks to
this new migration type we have an area of memory that (i) only accepts movable
and reclaimable pages and (ii) is used only if all other (non-reserved) pages have
been allocated.

I'm currently working on migration so that the movable and reclaimable pages
allocated in the area dedicated to CMA can be freed, and Kame's work is quite helpful
in this regard as I have something to base my work on. :)

Nonetheless, it's conference time now (ELC, PLC; interestingly both are in
Cambridge :P) so I guess we, here at SPRC, will look into it more after PLC.

>> Now interface is:
>>
>> struct page *__alloc_contig_pages(unsigned long base, unsigned long end,
>>                        unsigned long nr_pages, int align_order,
>>                        int node, gfp_t gfpflag, nodemask_t *mask)
>>
>>  * @base: the lowest pfn which caller wants.
>>  * @end:  the highest pfn which caller wants.
>>  * @nr_pages: the length of a chunk of pages to be allocated.
>>  * @align_order: alignment of start address of returned chunk in order.
>>  *   Returned' page's order will be aligned to (1 << align_order).If smaller
>>  *   than MAX_ORDER, it's raised to MAX_ORDER.
>>  * @node: allocate near memory to the node, If -1, current node is used

PS. Please note that Pawel's new address is <pawel@osciak.com>.  Fixing in Cc.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29  9:20   ` Michał Nazarewicz
@ 2010-10-29 10:31     ` Andi Kleen
  2010-10-29 10:59       ` KAMEZAWA Hiroyuki
  2010-10-29 13:11       ` Minchan Kim
  0 siblings, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2010-10-29 10:31 UTC (permalink / raw)
  To: linux-arm-kernel

> When I was posting CMA, it had been suggested to create a new migration type
> dedicated to contiguous allocations.  I think I already did that and thanks to
> this new migration type we have (i) an area of memory that only accepts movable
> and reclaimable pages and 

Aka highmem next generation :-(

> (ii) is used only if all other (non-reserved) pages have
> been allocated.

That will be near always the case after some uptime, as memory fills up
with caches. Unless you do early reclaim? 

-Andi


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 10:31     ` Andi Kleen
@ 2010-10-29 10:59       ` KAMEZAWA Hiroyuki
  2010-10-29 12:29         ` Andi Kleen
  2010-10-29 13:11       ` Minchan Kim
  1 sibling, 1 reply; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-29 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 29 Oct 2010 12:31:54 +0200
Andi Kleen <andi.kleen@intel.com> wrote:

> > When I was posting CMA, it had been suggested to create a new migration type
> > dedicated to contiguous allocations.  I think I already did that and thanks to
> > this new migration type we have (i) an area of memory that only accepts movable
> > and reclaimable pages and 
> 
> Aka highmem next generation :-(
> 

Yes. But Nick's new shrink_slab() may help even without a
new zone.


> > (ii) is used only if all other (non-reserved) pages have
> > been allocated.
> 
> That will be near always the case after some uptime, as memory fills up
> with caches. Unless you do early reclaim? 
> 

Memory migration always uses alloc_page() to get migration target
pages. So memory will be reclaimed if it is filled with cache.

As for my patch, I may have to preallocate all required pages before starting,
but I didn't do that this time.

Thanks,
-Kame


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 10:59       ` KAMEZAWA Hiroyuki
@ 2010-10-29 12:29         ` Andi Kleen
  2010-10-29 12:31           ` KAMEZAWA Hiroyuki
  2010-10-29 12:43           ` Michał Nazarewicz
  0 siblings, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2010-10-29 12:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 29, 2010 at 11:59:00AM +0100, KAMEZAWA Hiroyuki wrote:
> On Fri, 29 Oct 2010 12:31:54 +0200
> Andi Kleen <andi.kleen@intel.com> wrote:
> 
> > > When I was posting CMA, it had been suggested to create a new migration type
> > > dedicated to contiguous allocations.  I think I already did that and thanks to
> > > this new migration type we have (i) an area of memory that only accepts movable
> > > and reclaimable pages and 
> > 
> > Aka highmem next generation :-(
> > 
> 
> yes. But Nick's new shrink_slab() may be a new help even without
> new zone.

You would really need callbacks into lots of code. Christoph
used to have some patches for directed shrink of dcache/icache,
but they are currently not on the table.

I don't think Nick's patch does that, he simply optimizes the existing
shrinker (which in practice tends to not shrink a lot) to be a bit
less wasteful.

The coverage will never be 100% in any case. So you always have to
make a choice between movable or fully usable. That's essentially
highmem with most of its problems.

> 
> 
> > > (ii) is used only if all other (non-reserved) pages have
> > > been allocated.
> > 
> > That will be near always the case after some uptime, as memory fills up
> > with caches. Unless you do early reclaim? 
> > 
> 
> memory migration always do work with alloc_page() for getting migration target
> pages. So, memory will be reclaimed if filled by cache.

I was talking about that CMA paragraph, not your patch.

If I understand it correctly CMA wants to define
a new zone which is somehow similar to movable, but only sometimes used
when another zone is full (which is the usual state in normal
operation actually)

It was unclear to me how this was all supposed to work. At least
as described in the paragraph it cannot I think.

> About my patch, I may have to prealloc all required pages before start.
> But I didn't do that at this time.

preallocate when? I thought the whole point of the large memory allocator
was to not have to pre-allocate.

-Andi


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 12:29         ` Andi Kleen
@ 2010-10-29 12:31           ` KAMEZAWA Hiroyuki
  2010-10-29 12:43           ` Michał Nazarewicz
  1 sibling, 0 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-29 12:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 29 Oct 2010 14:29:28 +0200
Andi Kleen <andi.kleen@intel.com> wrote:

> 
> > About my patch, I may have to prealloc all required pages before start.
> > But I didn't do that at this time.
> 
> preallocate when? I thought the whole point of the large memory allocator
> was to not have to pre-allocate.
> 

Yes, one-by-one allocation protects the allocation from a sudden attack.
I'm just wondering whether to add a knob for "migrate pages here" :)


Thanks,
-Kame


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 12:29         ` Andi Kleen
  2010-10-29 12:31           ` KAMEZAWA Hiroyuki
@ 2010-10-29 12:43           ` Michał Nazarewicz
  2010-10-29 14:27             ` Andi Kleen
  1 sibling, 1 reply; 10+ messages in thread
From: Michał Nazarewicz @ 2010-10-29 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

>>>> When I was posting CMA, it had been suggested to create a new migration type
>>>> dedicated to contiguous allocations.  I think I already did that and thanks to
>>>> this new migration type we have (i) an area of memory that only accepts movable
>>>> and reclaimable pages and

>> Andi Kleen <andi.kleen@intel.com> wrote:
>>> Aka highmem next generation :-(

> On Fri, Oct 29, 2010 at 11:59:00AM +0100, KAMEZAWA Hiroyuki wrote:
>> yes. But Nick's new shrink_slab() may be a new help even without
>> new zone.

On Fri, 29 Oct 2010 14:29:28 +0200, Andi Kleen <andi.kleen@intel.com> wrote:
> You would really need callbacks into lots of code. Christoph
> used to have some patches for directed shrink of dcache/icache,
> but they are currently not on the table.
>
> I don't think Nick's patch does that, he simply optimizes the existing
> shrinker (which in practice tends to not shrink a lot) to be a bit
> less wasteful.
>
> The coverage will never be 100% in any case. So you always have to
> make a choice between movable or fully usable. That's essentially
> highmem with most of its problems.

Yep.

>>>> (ii) is used only if all other (non-reserved) pages have
>>>> been allocated.

>>> That will be near always the case after some uptime, as memory fills up
>>> with caches. Unless you do early reclaim?

Hmm... true.  Still, the point remains that only movable and reclaimable pages are
allowed in the marked regions.  In effect this means that, from the unmovable pages'
point of view, the area is unusable, but I haven't thought of any other way to
guarantee that, despite fragmentation, a long sequence of free/movable/reclaimable
pages is available.

>> memory migration always do work with alloc_page() for getting migration target
>> pages. So, memory will be reclaimed if filled by cache.
>
> Was talking about that paragraph CMA, not your patch.
>
> If I understand it correctly CMA wants to define
> a new zone which is somehow similar to movable, but only sometimes used
> when another zone is full (which is the usual state in normal
> operation actually)
>
> It was unclear to me how this was all supposed to work. At least
> as described in the paragraph it cannot I think.

It's not a new zone, just a new migrate type.  I haven't tested it yet,
but the idea is that once a pageblock's migrate type is set to this
new MIGRATE_CMA type, the buddy allocator never changes it, and in the
fallback list it is put at the end of the entries for MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.

If I got everything right, this means that pages from MIGRATE_CMA pageblocks
are available for movable and reclaimable allocations but not for unmovable ones.
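
A minimal userspace sketch of that fallback idea follows, as a model only and
not the actual patch. The sk_* names are hypothetical, the order of the
non-CMA entries is an assumption loosely modelled on the existing fallback
lists, and giving the CMA type no fallbacks of its own is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum sk_migratetype {
	SK_UNMOVABLE,
	SK_RECLAIMABLE,
	SK_MOVABLE,
	SK_CMA,
	SK_NR_TYPES,
	SK_NONE = -1	/* terminates a fallback list */
};

#define SK_MAX_FALLBACKS 4

/*
 * The CMA type sits at the end of the lists for reclaimable and movable
 * requests and is absent from the list for unmovable requests.
 */
static const enum sk_migratetype
sk_fallbacks[SK_NR_TYPES][SK_MAX_FALLBACKS] = {
	[SK_UNMOVABLE]   = { SK_RECLAIMABLE, SK_MOVABLE, SK_NONE, SK_NONE },
	[SK_RECLAIMABLE] = { SK_UNMOVABLE, SK_MOVABLE, SK_CMA, SK_NONE },
	[SK_MOVABLE]     = { SK_RECLAIMABLE, SK_UNMOVABLE, SK_CMA, SK_NONE },
	[SK_CMA]         = { SK_NONE, SK_NONE, SK_NONE, SK_NONE },
};

/* May a request of type @request take pages from a @block-type pageblock? */
static bool sk_may_allocate_from(enum sk_migratetype request,
				 enum sk_migratetype block)
{
	int i;

	if (request == block)
		return true;
	for (i = 0; i < SK_MAX_FALLBACKS; i++) {
		if (sk_fallbacks[request][i] == SK_NONE)
			break;
		if (sk_fallbacks[request][i] == block)
			return true;
	}
	return false;
}

/* Last type tried for @request, or SK_NONE if it has no fallbacks. */
static enum sk_migratetype sk_last_fallback(enum sk_migratetype request)
{
	enum sk_migratetype last = SK_NONE;
	int i;

	for (i = 0; i < SK_MAX_FALLBACKS; i++) {
		if (sk_fallbacks[request][i] == SK_NONE)
			break;
		last = sk_fallbacks[request][i];
	}
	return last;
}
```

In this model sk_may_allocate_from(SK_MOVABLE, SK_CMA) is true while
sk_may_allocate_from(SK_UNMOVABLE, SK_CMA) is false, which is the property
described above.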

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 10:31     ` Andi Kleen
  2010-10-29 10:59       ` KAMEZAWA Hiroyuki
@ 2010-10-29 13:11       ` Minchan Kim
  1 sibling, 0 replies; 10+ messages in thread
From: Minchan Kim @ 2010-10-29 13:11 UTC (permalink / raw)
  To: linux-arm-kernel

2010/10/29 Andi Kleen <andi.kleen@intel.com>:
>> When I was posting CMA, it had been suggested to create a new migration type
>> dedicated to contiguous allocations.  I think I already did that and thanks to
>> this new migration type we have (i) an area of memory that only accepts movable
>> and reclaimable pages and
>
> Aka highmem next generation :-(

I'm lost. What is "highmem next generation"?
Could you point me to it?

-- 
Kind regards,
Minchan Kim


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 12:43           ` Michał Nazarewicz
@ 2010-10-29 14:27             ` Andi Kleen
  2010-10-29 14:58               ` Michał Nazarewicz
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2010-10-29 14:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 29, 2010 at 01:43:51PM +0100, Michał Nazarewicz wrote:
> >>>> (ii) is used only if all other (non-reserved) pages have
> >>>> been allocated.
> 
> >>> That will be near always the case after some uptime, as memory fills up
> >>> with caches. Unless you do early reclaim?
> 
> Hmm... true.  Still, the point remains that only movable and reclaimable pages are
> allowed in the marked regions.  In effect this means that, from the unmovable pages'
> point of view, the area is unusable, but I haven't thought of any other way to
> guarantee that, despite fragmentation, a long sequence of free/movable/reclaimable
> pages is available.

Essentially a movable zone as defined today.

That gets you near all the problems of highmem (except for the mapping
problem and you're a bit more flexible in the splits): 

Someone has to decide at boot how much should be movable
and what not, some workloads will run out of space, some may
deadlock when it runs out of management objects, etc.etc. 
Classic highmem had a long string of issues with all of this.

If it were an easy problem it would have been solved long ago, but it isn't really.

-Andi


* [RFC][PATCH 0/3] big chunk memory allocator v2
  2010-10-29 14:27             ` Andi Kleen
@ 2010-10-29 14:58               ` Michał Nazarewicz
  0 siblings, 0 replies; 10+ messages in thread
From: Michał Nazarewicz @ 2010-10-29 14:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 29 Oct 2010 16:27:41 +0200, Andi Kleen <andi.kleen@intel.com> wrote:

> On Fri, Oct 29, 2010 at 01:43:51PM +0100, Michał Nazarewicz wrote:
>> Hmm... true.  Still, the point remains that only movable and reclaimable pages are
>> allowed in the marked regions.  In effect this means that, from the unmovable pages'
>> point of view, the area is unusable, but I haven't thought of any other way to
>> guarantee that, despite fragmentation, a long sequence of free/movable/reclaimable
>> pages is available.

> Essentially a movable zone as defined today.

Ah, right, I was somehow under the impression that the movable zone can be used as a
fallback zone.  When I'm finished with my current approach I'll look more closely into it.

> That gets you near all the problems of highmem (except for the mapping
> problem and you're a bit more flexible in the splits):
>
> Someone has to decide at boot how much should be movable
> and what not, some workloads will run out of space, some may
> deadlock when it runs out of management objects, etc.etc.
> Classic highmem had a long string of issues with all of this.

Here's where the rest of CMA comes in.  The solution may not be perfect but it's
probably better than nothing.  The idea is to define regions for each device
(with the possibility of a single region being shared) which, hopefully, can help
with fragmentation.

In its current form, CMA is designed mostly for embedded systems where one can
define what kinds of devices will be used, but in general this could be used
for other systems as well.
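
To illustrate the per-device region idea, here is a minimal userspace sketch.
It is purely hypothetical and is not the CMA API: the sk_* names, the device
names used with it, and the simple bump allocation are all assumptions made
for the example.

```c
#include <stddef.h>
#include <string.h>

#define SK_ALLOC_FAIL ((unsigned long)-1)

/* A contiguous range of page frames reserved for one or more devices. */
struct sk_region {
	unsigned long base_pfn;
	unsigned long nr_pages;
	unsigned long used;	/* pages handed out so far */
};

/* Maps a device name to its region; several devices may share one. */
struct sk_dev_map {
	const char *dev;
	struct sk_region *region;
};

/* Look up the region assigned to @dev, or NULL if there is none. */
static struct sk_region *sk_region_for(const struct sk_dev_map *map,
					size_t n, const char *dev)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(map[i].dev, dev) == 0)
			return map[i].region;
	return NULL;
}

/*
 * Hand out @nr contiguous pages from @r (no freeing in this sketch).
 * Returns the start pfn, or SK_ALLOC_FAIL if the region is exhausted.
 */
static unsigned long sk_region_alloc(struct sk_region *r, unsigned long nr)
{
	unsigned long start;

	if (!r || nr > r->nr_pages - r->used)
		return SK_ALLOC_FAIL;
	start = r->base_pfn + r->used;
	r->used += nr;
	return start;
}
```

Because each device only draws from its own region, one driver exhausting its
region does not fragment or consume the range used by another, while two
devices mapped to the same region simply share its space.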

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--



Thread overview: 10+ messages
     [not found] <20101026190042.57f30338.kamezawa.hiroyu@jp.fujitsu.com>
2010-10-27 23:22 ` [RFC][PATCH 0/3] big chunk memory allocator v2 Minchan Kim
2010-10-29  9:20   ` Michał Nazarewicz
2010-10-29 10:31     ` Andi Kleen
2010-10-29 10:59       ` KAMEZAWA Hiroyuki
2010-10-29 12:29         ` Andi Kleen
2010-10-29 12:31           ` KAMEZAWA Hiroyuki
2010-10-29 12:43           ` Michał Nazarewicz
2010-10-29 14:27             ` Andi Kleen
2010-10-29 14:58               ` Michał Nazarewicz
2010-10-29 13:11       ` Minchan Kim
