From: David Hildenbrand <david@redhat.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Vlastimil Babka <vbabka@suse.cz>, John Dias <joaodias@google.com>,
Suren Baghdasaryan <surenb@google.com>,
pullip.cho@samsung.com
Subject: Re: [RFC 0/7] Support high-order page bulk allocation
Date: Mon, 17 Aug 2020 18:44:50 +0200
Message-ID: <f047f2b2-9f62-cbf4-3c6b-a0f3bf1e9406@redhat.com>
In-Reply-To: <20200817163018.GC3852332@google.com>
On 17.08.20 18:30, Minchan Kim wrote:
> On Mon, Aug 17, 2020 at 05:45:59PM +0200, David Hildenbrand wrote:
>> On 17.08.20 17:27, Minchan Kim wrote:
>>> On Sun, Aug 16, 2020 at 02:31:22PM +0200, David Hildenbrand wrote:
>>>> On 14.08.20 19:31, Minchan Kim wrote:
>>>>> There is special HW that requires bulk allocation of high-order
>>>>> pages. For example, 4800 * order-4 pages.
>>>>>
>>>>> To meet the requirement, one option is using a CMA area because
>>>>> the page allocator with compaction easily fails to meet the
>>>>> requirement under memory pressure and is too slow for 4800
>>>>> iterations. However, CMA also has the following drawbacks:
>>>>>
>>>>> * 4800 order-4 cma_alloc() calls are too slow
>>>>>
>>>>> To avoid the slowness, we could try to allocate 300M of contiguous
>>>>> memory at once and then split it into order-4 chunks.
>>>>> The problem with this approach is that the CMA allocation fails if
>>>>> any page in the range couldn't be migrated out, which happens
>>>>> easily with fs writes under memory pressure.
>>>>
>>>> Why not choose a value in between? Like trying to allocate MAX_ORDER - 1
>>>> chunks and splitting them. That would already heavily reduce the call frequency.
>>>
>>> I think you meant this:
>>>
>>> alloc_pages(GFP_KERNEL|__GFP_NOWARN, MAX_ORDER - 1)
>>>
>>> It would work if the system has lots of non-fragmented free memory.
>>> However, once memory is fragmented, it doesn't work. That's why we have
>>> easily seen even order-4 allocation failures in the field, and that's
>>> why CMA was there.
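
Just to spell out what I meant with the splitting, here is a minimal
sketch (hypothetical names, no error/cleanup handling; note that
split_page() hands back independent order-0 pages, so each "order-4
chunk" below is really just 16 physically contiguous order-0 pages):

#include <linux/gfp.h>
#include <linux/mm.h>

/* Carve MAX_ORDER - 1 blocks into order-4 sized chunks. */
static int alloc_order4_chunks(struct page **chunks, int nr_chunks)
{
	const int per_block = (1 << (MAX_ORDER - 1)) >> 4;
	int filled = 0;

	/* Assumes nr_chunks is a multiple of per_block; sketch only. */
	while (filled < nr_chunks) {
		struct page *page = alloc_pages(GFP_KERNEL | __GFP_NOWARN,
						MAX_ORDER - 1);
		int i;

		if (!page)
			return -ENOMEM;	/* earlier blocks leak here; sketch */

		/* Split the non-compound block into order-0 pages. */
		split_page(page, MAX_ORDER - 1);
		for (i = 0; i < per_block; i++)
			chunks[filled++] = page + (i << 4);
	}
	return 0;
}

With the usual MAX_ORDER that cuts the number of allocator calls by a
factor of 64; but you are right that once memory is fragmented it fails
just like the plain order-4 allocations do.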
>>>
>>> CMA has more logic to isolate the memory during allocation/freeing as
>>> well as fragmentation avoidance, so it has less chance of being stolen
>>> by others and a higher success ratio. That's why I want this API
>>> to be used with CMA or the movable zone.
>>
>> I was talking about doing MAX_ORDER - 1 CMA allocations instead of one
>> big 300M allocation. As you correctly note, memory placed into CMA
>> should be movable, except for (short/long) term pinnings. In these
>> cases, doing allocations smaller than 300M and splitting them up should
>> be good enough to reduce the call frequency, no?
>
> I should have written that. The 300M I mentioned is really the minimum size.
> In some scenarios, we need way more than 300M, up to several GB.
> Furthermore, the demand will increase in the near future.
And what will the driver do with that data besides providing it to the
device? Can it be mapped to user space? I think we really need more
information / the actual user.
>>
>>>
>>> A use case is that a device can set up an exclusive CMA area when the
>>> system boots. When the device needs 4800 order-4 pages, it could call
>>> this bulk API against the area so that it is effectively guaranteed to
>>> allocate enough pages quickly.
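
Setting such an exclusive area up at boot should be the easy part.
Roughly, and with a made-up size and name, something like:

#include <linux/cma.h>
#include <linux/sizes.h>

static struct cma *hw_cma;	/* hypothetical per-device CMA area */

/* Must run while memblock is still up, e.g. from the platform's
 * reserved-memory setup code. */
static int __init hw_cma_reserve(void)
{
	return cma_declare_contiguous(0, SZ_512M, 0, 0, 0, false,
				      "hw_cma", &hw_cma);
}

The interesting part is what that area then has to guarantee, which is
what the questions below were aiming at.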
>>
>> Just wondering
>>
>> a) Why does it have to be fast?
>
> That's because it's related to application latency, which ends up
> making the user experience bad.
Okay, but in theory, your device's needs are very similar to an
application's needs, except that you require order-4 pages, correct?
Similar to an application that starts up and pins 300M (or more), just
with order-4 pages.
I don't quite get yet why you need a range allocator for that. Because
you intend to use CMA?
>
>> b) Why does it need that many order-4 pages?
>
> It's a HW requirement. I can't say much about that.
Hm.
>
>> c) How dynamic is the device need at runtime?
>
> Whenever the application is launched. It depends on the user's usage pattern.
>
>> d) Would it be reasonable in your setup to mark a CMA region in a way
>> such that it will never be used for other (movable) allocations,
>
> I don't get your point. If we don't want the area to be used up for
> other movable allocations, why should we use it as CMA in the first place?
> It sounds like reserved memory that just wastes the memory.
Right, it's just very hard to get what you are trying to achieve without
the actual user at hand.
For example, will the pages you allocate be movable? Does the device
allow for that? If not, then the MOVABLE zone is usually not valid
(similar to gigantic pages not being allocated from the MOVABLE zone).
So you're stuck with the NORMAL zone or CMA. Especially for the NORMAL
zone, alloc_contig_range() is currently not prepared to properly handle
sub-MAX_ORDER - 1 ranges. If any involved pageblock contains an
unmovable page, the allocation will fail (see pageblock isolation /
has_unmovable_pages()). So CMA would be your only option.
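
The batching I keep suggesting would then look roughly like the below.
Again only a sketch with hypothetical names: cma_alloc() returns
physically contiguous order-0 pages, so the order-4 chunks are simply
every 16th page of a batch, and cleanup on failure is omitted.

#include <linux/cma.h>
#include <linux/mm.h>

static int cma_alloc_order4_chunks(struct cma *cma, struct page **chunks,
				   int nr_chunks)
{
	const int batch_pages = 1 << (MAX_ORDER - 1);
	const int per_batch = batch_pages >> 4;
	int filled = 0;

	/* Assumes nr_chunks is a multiple of per_batch; sketch only. */
	while (filled < nr_chunks) {
		struct page *base;
		int i;

		/* Align to order 4 so every chunk is order-4 aligned. */
		base = cma_alloc(cma, batch_pages, 4, false);
		if (!base)
			return -ENOMEM;

		for (i = 0; i < per_batch; i++)
			chunks[filled++] = base + (i << 4);
	}
	/* Freeing is symmetrical: cma_release(cma, base, batch_pages)
	 * per batch. */
	return 0;
}

That keeps the number of cma_alloc() calls low without requiring the
whole multi-GB range to be migratable in one go.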
--
Thanks,
David / dhildenb