Message-ID: <73d1150f-8eea-4523-8d29-335f91d38e1b@linux.alibaba.com>
Date: Thu, 23 Apr 2026 08:43:48 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm: shmem: always support large folios for internal shmem mount
To: Kefeng Wang, "David Hildenbrand (Arm)", akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, ljs@kernel.org, lance.yang@linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dave Hansen
References: <26f954be62348591e720c4e8b7a9099b74dc1d6d.1776331555.git.baolin.wang@linux.alibaba.com>
 <1b3c0401-6d10-4a28-97c8-8e3858d8dc3d@kernel.org>
 <015de194-99b9-4f9e-8c89-d35807c6fd08@linux.alibaba.com>
 <07e26d39-6155-4661-b3df-c2419535ed43@kernel.org>
 <116df9f9-4db7-40d4-a4a4-30a87c0feffa@linux.alibaba.com>
 <12bdade5-b239-4456-bb5a-f2648c867db8@huawei.com>
From: Baolin Wang
In-Reply-To: <12bdade5-b239-4456-bb5a-f2648c867db8@huawei.com>

On 4/22/26 11:03 PM, Kefeng Wang wrote:
>
> On 4/22/2026 2:28 PM, Baolin Wang wrote:
>> CC Kefeng,
>>
>> On 4/21/26 9:39 PM, David Hildenbrand (Arm) wrote:
>>> On 4/21/26 08:27, Baolin Wang wrote:
>>>>
>>>> On 4/21/26 3:00 AM, David Hildenbrand (Arm) wrote:
>>>>> On 4/17/26 14:45, Baolin Wang wrote:
>>>>>>
>>>>>> Indeed. Good point.
>>>>>>
>>>>>> Not really. There could be files created before remount whose
>>>>>> mappings don't support large folios (with 'huge=never' option),
>>>>>> while files created after remount will have mappings that support
>>>>>> large folios (if remounted with 'huge=always' option).
>>>>>>
>>>>>> It looks like the previous commit 5a90c155defa was also
>>>>>> problematic. The huge mount option has introduced a lot of tricky
>>>>>> issues :(
>>>>>>
>>>>>> Now I think Zi's previous suggestion should be able to clean up this
>>>>>> mess? That is, calling mapping_set_large_folios() unconditionally for
>>>>>> all shmem mounts, and revisiting Kefeng's first version to fix the
>>>>>> performance issue.
>>>>>
>>>>> Okay, so you'll send a patch to just set mapping_set_large_folios()
>>>>> unconditionally?
>>>>
>>>> I'm still hesitating on this. If we set mapping_set_large_folios()
>>>> unconditionally, we need to re-fix the performance regression that was
>>>> addressed by commit 5a90c155defa.
>>>
>>> Just so I can follow: where is the test for large folios that we would
>>> unlock large folios and cause a regression?
>>
>> I spent some time investigating the performance regression that was
>> addressed by commit 5a90c155defa ("tmpfs: don't enable large folios if
>> not supported").
>> From my testing, I found that the performance issue
>> no longer exists on upstream:
>>
>> mount tmpfs -t tmpfs -o size=50G /mnt/tmpfs
>>
>> Base:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (3.1 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (3.1 GB/s)
>>
>> Base + revert 5a90c155defa:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.3 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (3.3 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (3.2 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (3.1 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (3.1 GB/s)
>>
>> The data is basically consistent with minor fluctuation noise.
>>
>> Later, I continued investigating and found that commit 665575cff098b
>> ("filemap: move prefaulting out of hot write path") fixed the write
>> operation performance.
>>
>> Base + revert 665575cff098b + revert 5a90c155defa:
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=400K count=10485 (3.0 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=800K count=5242 (2.9 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=1600K count=2621 (2.6 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=2200K count=1906 (2.6 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=3000K count=1398 (2.5 GB/s)
>> dd if=/dev/zero of=/mnt/tmpfs/test bs=4500K count=932 (2.5 GB/s)
>>
>> We can see that after reverting commit 665575cff098b, there is a
>> noticeable drop in write performance for tmpfs files.
>>
>> So my conclusion is that we can now safely revert commit 5a90c155defa
>> to set mapping_set_large_folios() for all shmem mounts unconditionally.
>>
>> Kefeng, please correct me if I missed anything.
>
> Hi Baolin, I found my test case, "bonnie Block/Re Write":
>
> ./bonnie -d /tmp -s Size (size is from 100,256,512,1024,2048,4096).
>
> But the dd test is similar as well, and as commit 4e527d5841e2
> ("iomap: fault in smaller chunks for non-large folio mappings") said,
> the issue is,
>
> "If chunk is 2MB, total 512 pages need to be handled finally. During this
> period, fault_in_iov_iter_readable() is called to check iov_iter readable
> validity. Since only 4KB will be handled each time, below address space
> will be checked over and over again"
>
> But after 665575cff098b, fault_in_iov_iter_readable() is moved, so the
> issue should be fixed.

Kefeng, thanks for confirming.
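
For anyone following the thread, the quoted description from commit 4e527d5841e2 can be made concrete with a rough back-of-the-envelope model. This is only an illustrative sketch, not kernel code: the function and constant names are hypothetical, and it assumes that each 4KB step re-checks everything from the current offset to the end of the 2MB chunk.

```python
# Illustrative model only; not kernel code. All names here are hypothetical.
PAGE_SIZE = 4 * 1024          # bytes handled per step (4KB, from the quoted text)
CHUNK_SIZE = 2 * 1024 * 1024  # chunk size (2MB, from the quoted text)


def checked_bytes_per_chunk(chunk_size: int = CHUNK_SIZE,
                            step: int = PAGE_SIZE) -> int:
    """Total bytes covered by the readability check while writing one chunk.

    Assumption: every step re-checks the whole remaining range of the
    chunk, although only `step` bytes are actually copied.
    """
    total = 0
    offset = 0
    while offset < chunk_size:
        total += chunk_size - offset  # remaining range is checked again
        offset += step                # only one page is copied per step
    return total


iterations = CHUNK_SIZE // PAGE_SIZE
total = checked_bytes_per_chunk()
amplification = total / CHUNK_SIZE

print(f"steps per chunk:       {iterations}")
print(f"bytes covered by check: {total}")
print(f"check amplification:    {amplification}x")
```

Under that assumption, one 2MB chunk takes 512 steps and the check covers roughly 256 times the chunk's own size, which is consistent with the "checked over and over again" wording in the quoted commit message.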