Re: [PATCH v3] mm: shmem: always support large folios for internal shmem mount

public inbox for linux-mm@kvack.org

From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
	"David Hildenbrand (Arm)" <david@kernel.org>,
	akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, ljs@kernel.org,
	lance.yang@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v3] mm: shmem: always support large folios for internal shmem mount
Date: Thu, 23 Apr 2026 08:43:48 +0800
Message-ID: <73d1150f-8eea-4523-8d29-335f91d38e1b@linux.alibaba.com>
In-Reply-To: <12bdade5-b239-4456-bb5a-f2648c867db8@huawei.com>



On 4/22/26 11:03 PM, Kefeng Wang wrote:
> 
> 
> On 4/22/2026 2:28 PM, Baolin Wang wrote:
>> CC Kefeng,
>>
>> On 4/21/26 9:39 PM, David Hildenbrand (Arm) wrote:
>>> On 4/21/26 08:27, Baolin Wang wrote:
>>>>
>>>>
>>>> On 4/21/26 3:00 AM, David Hildenbrand (Arm) wrote:
>>>>> On 4/17/26 14:45, Baolin Wang wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Indeed. Good point.
>>>>>>
>>>>>>
>>>>>> Not really. There could be files created before remount whose mappings
>>>>>> don't support large folios (with 'huge=never' option), while files
>>>>>> created after remount will have mappings that support large folios (if
>>>>>> remounted with 'huge=always' option).
>>>>>>
>>>>>> It looks like the previous commit 5a90c155defa was also problematic. The
>>>>>> huge mount option has introduced a lot of tricky issues :(
>>>>>>
>>>>>> Now I think Zi's previous suggestion should be able to clean up this
>>>>>> mess? That is, calling mapping_set_large_folios() unconditionally for
>>>>>> all shmem mounts, and revisiting Kefeng's first version to fix the
>>>>>> performance issue.
>>>>>
>>>>> Okay, so you'll send a patch to just set mapping_set_large_folios()
>>>>> unconditionally?
>>>>
>>>> I'm still hesitating on this. If we set mapping_set_large_folios()
>>>> unconditionally, we need to re-fix the performance regression that was
>>>> addressed by commit 5a90c155defa.
>>>
>>> Just so I can follow: where is the test for large folios that we would
>>> unlock large folios and cause a regression?
>>
>> I spent some time investigating the performance regression that was 
>> addressed by commit 5a90c155defa ("tmpfs: don't enable large folios if 
>> not supported"). From my testing, I found that the performance issue 
>> no longer exists on upstream:
>>
>> mount tmpfs -t tmpfs -o size=50G /mnt/tmpfs
>>
>> Base:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (3.1 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (3.1 GB/s)
>>
>> Base + revert 5a90c155defa:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.3 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (3.3 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (3.1 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (3.1 GB/s)
>>
>> The data is basically consistent with minor fluctuation noise.
>>
>> Later, I continued investigating and found that commit 665575cff098b 
>> ("filemap: move prefaulting out of hot write path") fixed the write 
>> operation performance.
>>
>> Base + revert 665575cff098b + revert 5a90c155defa:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (2.9 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (2.6 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (2.6 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (2.5 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (2.5 GB/s)
>>
>> We can see that after reverting commit 665575cff098b, there is a 
>> noticeable drop in write performance for tmpfs files.
>>
>> So my conclusion is that we can now safely revert commit 5a90c155defa 
>> to set mapping_set_large_folios() for all shmem mounts unconditionally.
>>
>> Kefeng, please correct me if I missed anything.
> 
> Hi Baolin, I found my test case, "bonnie Block/Re Write":
> 
> ./bonnie -d /tmp -s Size (Size is one of 100, 256, 512, 1024, 2048, 4096).
> 
> The dd test behaves similarly, and as commit 4e527d5841e2
> ("iomap: fault in smaller chunks for non-large folio mappings") says,
> the issue is:
> 
> "If chunk is 2MB, total 512 pages need to be handled finally. During this
> period, fault_in_iov_iter_readable() is called to check iov_iter readable
> validity. Since only 4KB will be handled each time, below address space
> will be checked over and over again"
> 
> But after 665575cff098b, fault_in_iov_iter_readable() was moved out of
> the hot write path, so the issue should be fixed.
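
To put rough numbers on the quoted commit message, here is a small
back-of-the-envelope sketch. It is a simplified model, not a measurement:
it assumes every 4KB iteration re-checks the not-yet-copied remainder of
the 2MB chunk, which is my reading of "checked over and over again". The
variable and function names are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the repeated readability check described in the
# commit message for 4e527d5841e2. All names here are hypothetical.
CHUNK_BYTES=$((2 * 1024 * 1024))   # 2MB chunk, as quoted
PAGE_BYTES=4096                    # 4KB handled per iteration, as quoted

# Number of 4KB iterations needed to consume one chunk.
pages_per_chunk() {
    echo $((CHUNK_BYTES / PAGE_BYTES))
}

# ASSUMPTION: iteration i re-checks the remaining (n - i) pages, so the
# total is PAGE_BYTES * (n + (n-1) + ... + 1) = PAGE_BYTES * n * (n+1) / 2.
rechecked_bytes() {
    n=$(pages_per_chunk)
    echo $((PAGE_BYTES * n * (n + 1) / 2))
}

echo "iterations per chunk: $(pages_per_chunk)"
echo "bytes checked (model): $(rechecked_bytes), bytes copied: $CHUNK_BYTES"
```

Under that assumption the checked range is a few hundred times larger
than the data actually copied, which would explain why moving the
prefault out of the hot path makes the difference disappear.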

Kefeng, thanks for confirming.
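
For anyone who wants to repeat the dd sweep quoted above, a minimal
sketch follows. The bs/count pairs in the thread all work out to roughly
4 GiB per run (count = 4 GiB / bs, rounded down); that total is inferred
from the numbers, not stated anywhere, and the function names and the
RUN/TARGET variables are hypothetical. By default the script only prints
the commands; set RUN=1 to execute them on a mounted tmpfs.

```shell
#!/bin/sh
# Regenerate the dd sweep from the thread. TOTAL_KIB is an inferred value.
TOTAL_KIB=$((4 * 1024 * 1024))          # 4 GiB expressed in KiB
TARGET=${TARGET:-/mnt/tmpfs/test}       # output file used in the thread

# count_for_bs BS_KIB: whole blocks of BS_KIB KiB that fit in TOTAL_KIB.
count_for_bs() {
    echo $((TOTAL_KIB / $1))
}

# dd_cmd BS_KIB: print the dd command line for one block size.
dd_cmd() {
    echo "dd if=/dev/zero of=$TARGET bs=${1}K count=$(count_for_bs "$1")"
}

# Block sizes (in KiB) taken from the thread.
for bs in 400 800 1600 2200 3000 4500; do
    cmd=$(dd_cmd "$bs")
    if [ "${RUN:-0}" = "1" ]; then
        $cmd            # execute; dd reports the throughput itself
    else
        echo "$cmd"     # dry run: only show what would be executed
    fi
done
```

Running the same sweep on each kernel (base, base + revert 5a90c155defa,
base + both reverts) should give numbers comparable to the ones above.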
