From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Johannes Weiner <hannes@cmpxchg.org>, yangge1116@126.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
21cnbao@gmail.com, david@redhat.com, vbabka@suse.cz,
liuzixing@hygon.cn
Subject: Re: [PATCH V7] mm, compaction: don't use ALLOC_CMA for unmovable allocations
Date: Wed, 18 Dec 2024 15:57:54 +0800
Message-ID: <ded3d2bf-650e-4ddc-b2be-d6faddeb3037@linux.alibaba.com>
In-Reply-To: <20241217155551.GA37530@cmpxchg.org>

On 2024/12/17 23:55, Johannes Weiner wrote:
> Hello Yangge,
>
> On Tue, Dec 17, 2024 at 07:46:44PM +0800, yangge1116@126.com wrote:
>> From: yangge <yangge1116@126.com>
>>
>> Since commit 984fdba6a32e ("mm, compaction: use proper alloc_flags
>> in __compaction_suitable()") allows compaction to proceed when free
>> pages required for compaction reside in the CMA pageblocks, it's
>> possible that __compaction_suitable() always returns true, and in
>> some cases, that is not acceptable.
>>
>> There are 4 NUMA nodes on my machine, and each NUMA node has 32GB
>> of memory. I have configured 16GB of CMA memory on each NUMA node,
>> and starting a 32GB virtual machine with device passthrough is
>> extremely slow, taking almost an hour.
>>
>> During the start-up of the virtual machine, it will call
>> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to allocate memory.
>> Long-term GUP cannot allocate memory from the CMA area, so a maximum
>> of 16GB of non-CMA memory on a NUMA node can be used as virtual
>> machine memory. Since there is 16GB of free CMA memory on the NUMA
>> node, the order-0 watermark is always met for compaction, so
>> __compaction_suitable() always returns true, even if the node is
>> unable to allocate non-CMA memory for the virtual machine.
>>
>> For costly allocations, because __compaction_suitable() always
>> returns true, __alloc_pages_slowpath() can't exit at the appropriate
>> place, resulting in excessively long virtual machine startup times.
>> Call trace:
>> __alloc_pages_slowpath
>>     if (compact_result == COMPACT_SKIPPED ||
>>         compact_result == COMPACT_DEFERRED)
>>         goto nopage; // should exit __alloc_pages_slowpath() from here
>>
>> Other unmovable allocations, like dma_buf, which can be large in a
>> Linux system, are also unable to allocate memory from CMA, and these
>> allocations suffer from the same problems described above. In order
>> to quickly fall back to a remote node, we should remove ALLOC_CMA both
>> in __compaction_suitable() and __isolate_free_page() for unmovable
>> allocations. After this fix, starting a 32GB virtual machine with
>> device passthrough takes only a few seconds.
>
> The symptom is obviously bad, but I don't understand this fix.
>
> The reason we do ALLOC_CMA is that, even for unmovable allocations,
> you can create space in non-CMA space by moving migratable pages over
> to CMA space. This is not a property we want to lose. But I also don't

Good point. I missed that and I need to withdraw my reviewed tag. Thanks.
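
For readers following the thread, the watermark arithmetic being discussed can be sketched roughly as below. This is only a toy model under stated assumptions, not the kernel code: struct toy_node, toy_watermark_ok() and toy_example_node() are hypothetical names, the watermark value is made up, and the page counts assume 4KB pages. It illustrates why a node with 16GB of free CMA passes the check whenever CMA pages are counted, and why the same check fails once they are excluded.

```c
#include <stdbool.h>

/* Toy model of one NUMA node. Hypothetical names, not kernel structures. */
struct toy_node {
    unsigned long free_pages;     /* all free pages, CMA included */
    unsigned long free_cma_pages; /* free pages inside CMA pageblocks */
    unsigned long watermark;      /* pages that must stay free (assumed value) */
};

/*
 * Simplified watermark test: if the caller may not use CMA, free CMA
 * pages are excluded before comparing against the watermark.
 */
static bool toy_watermark_ok(const struct toy_node *n, bool may_use_cma)
{
    unsigned long usable = n->free_pages;

    if (!may_use_cma) {
        /* Clamp at zero so inconsistent counters cannot underflow. */
        usable = usable > n->free_cma_pages ?
                 usable - n->free_cma_pages : 0;
    }
    return usable > n->watermark;
}

/* Scenario from the report: 16GB of free CMA, almost no free non-CMA memory. */
static struct toy_node toy_example_node(void)
{
    struct toy_node n = {
        .free_pages = 4194304UL + 1024UL, /* 16GB of CMA plus 4MB non-CMA */
        .free_cma_pages = 4194304UL,      /* 16GB / 4KB */
        .watermark = 16384UL,             /* assumed, for illustration only */
    };
    return n;
}
```

With the example node, toy_watermark_ok() returns true when CMA pages are counted and false when they are not. The model says nothing about Johannes's point that free CMA pages can still serve as migration targets to make room in non-CMA space, which is the property the real suitability check needs to keep.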