From: Lance Yang <lance.yang@linux.dev>
To: ziy@nvidia.com
Cc: akpm@linux-foundation.org, david@kernel.org, willy@infradead.org,
	songliubraving@fb.com, clm@fb.com, dsterba@suse.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
	ljs@kernel.org, baolin.wang@linux.alibaba.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, shuah@kernel.org, linux-btrfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v5 13/14] mm/khugepaged: enable clean pagecache folio collapse for writable files
Date: Fri, 8 May 2026 15:46:43 +0800
Message-ID: <20260508074643.55548-1-lance.yang@linux.dev>
In-Reply-To: <20260429153538.727855-9-ziy@nvidia.com>

On Wed, Apr 29, 2026 at 11:35:36AM -0400, Zi Yan wrote:
>collapse_file() is capable of collapsing pagecache folios from writable
>files to PMD folios. Now enable clean pagecache folio collapse in addition
>to read-only pagecache folio collapse by removing the
>inode_is_open_for_write() from file_thp_enabled() and only performing
>filemap_flush() if the file is read-only.
>
>This means userspace needs to explicitly flush the content of pagecache
>folios before khugepaged can collapse the folios, or use
>madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is
>that blindly enabling dirty pagecache folio from writable files collapse
>makes khugepaged flush these folios all the time. It is undesirable to
>cause system level pagecache flushes.
>
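
For anyone following along, the userspace side described above could look
roughly like the sketch below. It is only an illustration under my own
assumptions: flush_then_collapse() and max_tries are made-up names, the
fallback value 25 for MADV_COLLAPSE is the asm-generic uapi value for
libc headers that predate it, and retrying on EAGAIN follows the
madvise(2) description rather than anything in this patch. Per the commit
message, MADV_COLLAPSE already flushes in its retry, so the explicit
fsync() here just models the "userspace flushes first" option.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* MADV_COLLAPSE exists since Linux 6.1; older libc headers may lack it. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* asm-generic uapi value (assumption for old headers) */
#endif

/*
 * Hypothetical helper, not part of the patch: write back dirty pagecache
 * for fd, then ask the kernel to collapse [addr, addr + len) synchronously.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int flush_then_collapse(int fd, void *addr, size_t len, int max_tries)
{
	assert(max_tries > 0);

	/* Explicit flush so the folios are clean by the time they are scanned. */
	if (fsync(fd) < 0)
		return -1;

	for (int i = 0; i < max_tries; i++) {
		if (madvise(addr, len, MADV_COLLAPSE) == 0)
			return 0;
		/* EAGAIN means "temporarily unavailable"; anything else is final. */
		if (errno != EAGAIN)
			return -1;
	}
	return -1;
}
```

A caller would mmap() a PMD-aligned range of the file, write to it, and
then call flush_then_collapse(fd, addr, len, 3). Whether the collapse
actually succeeds still depends on the kernel version, the filesystem's
large folio support and the THP settings.
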
>To properly support dirty pagecache folio collapse, filemap_flush() needs
>to be avoided. Potentially, merging associated buffer instead of dropping
>it with filemap_release_folio() might be needed.
>
>NOTE: this breaks khugepaged selftests for writable file pagecache
>collapse, which is set to fail all the time. The next commit fix it.
>
>Signed-off-by: Zi Yan <ziy@nvidia.com>
>---
> mm/huge_memory.c | 2 +-
> mm/khugepaged.c | 9 ++++++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 9b3abb98a7e51..e1e9d59db6e70 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
> 	if (!mapping_pmd_folio_support(vma->vm_file->f_mapping))
> 		return false;
>
>-	return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
>+	return S_ISREG(inode->i_mode);
> }
>
> /* If returns true, we are unable to access the VMA's folios. */
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 1ee15b48962a3..fb7ff643973cc 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
> 				 * forcing writeback in loop.
> 				 */

Nit: the comment above now looks stale. It still says:

    "There won't be new dirty pages."

That was true when file_thp_enabled() rejected writable-open files, but
not after this patch ;)

Otherwise, LGTM.

Reviewed-by: Lance Yang <lance.yang@linux.dev>

> 				xas_unlock_irq(&xas);
>-				filemap_flush(mapping);
>+				/*
>+				 * Only flush for read-only files. Writable
>+				 * files can have their folios dirty at any
>+				 * time; blindly flushing them would cause
>+				 * undesirable system-wide writeback.
>+				 */
>+				if (!inode_is_open_for_write(mapping->host))
>+					filemap_flush(mapping);
> 				result = SCAN_PAGE_DIRTY_OR_WRITEBACK;
> 				goto xa_unlocked;
> 			} else if (folio_test_writeback(folio)) {
>--
>2.53.0
>
>
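
To spell out how I read the dirty-folio branch in the khugepaged.c hunk
above, here is a tiny userspace model. It is not kernel code, and every
name in it (model_dirty_folio(), struct model_outcome, the MODEL_ enum)
is hypothetical. It only restates what the hunk shows: a dirty folio
always ends the attempt with SCAN_PAGE_DIRTY_OR_WRITEBACK, and the async
flush is issued only when the file is not open for write.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the dirty-folio branch; not kernel code. */
enum model_scan_result {
	MODEL_SCAN_PAGE_DIRTY_OR_WRITEBACK,
};

struct model_outcome {
	bool flush_issued;		/* would filemap_flush() be called? */
	enum model_scan_result result;	/* what the branch would report */
};

/* Outcome of hitting a dirty folio, depending on how the file is opened. */
static struct model_outcome model_dirty_folio(bool open_for_write)
{
	struct model_outcome out = {
		/* After the patch: flush only when nobody has it open for write. */
		.flush_issued = !open_for_write,
		/* Either way the collapse attempt is abandoned for now. */
		.result = MODEL_SCAN_PAGE_DIRTY_OR_WRITEBACK,
	};

	return out;
}
```

In the open_for_write case nothing flushes on khugepaged's behalf and new
dirty folios can keep showing up, which is why the "won't be new dirty
pages" wording in the existing comment no longer holds.
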

Thread overview: 59+ messages
2026-04-29 15:29 [PATCH v5 00/14] Remove CONFIG_READ_ONLY_THP_FOR_FS and enable file THP for writable files Zi Yan
2026-04-29 15:29 ` [PATCH v5 01/14] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check Zi Yan
2026-04-30 14:37 ` Zi Yan
2026-04-30 15:04 ` Andrew Morton
2026-05-04 3:48 ` Nico Pache
2026-05-07 3:29 ` Lance Yang
2026-05-07 5:52 ` Zi Yan
2026-05-07 6:08 ` Zi Yan
2026-05-07 6:57 ` Zi Yan
2026-05-08 19:39 ` David Hildenbrand (Arm)
2026-04-29 15:29 ` [PATCH v5 02/14] mm/khugepaged: add folio dirty check after try_to_unmap() Zi Yan
2026-04-30 15:11 ` Zi Yan
2026-05-04 3:53 ` Nico Pache
2026-05-06 5:23 ` Lance Yang
2026-04-29 15:29 ` [PATCH v5 03/14] mm/huge_memory: remove READ_ONLY_THP_FOR_FS from file_thp_enabled() Zi Yan
2026-05-04 3:57 ` Nico Pache
2026-05-07 4:29 ` Lance Yang
2026-05-08 19:43 ` David Hildenbrand (Arm)
2026-04-29 15:29 ` [PATCH v5 04/14] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in hugepage_enabled() Zi Yan
2026-05-04 4:00 ` Nico Pache
2026-05-07 4:49 ` Lance Yang
2026-05-08 18:54 ` Andrew Morton
2026-04-29 15:35 ` [PATCH v5 05/14] mm: remove READ_ONLY_THP_FOR_FS Kconfig option Zi Yan
2026-05-04 4:02 ` Nico Pache
2026-05-07 12:48 ` Lance Yang
2026-05-08 2:52 ` Wei Yang
2026-05-08 3:22 ` Lance Yang
2026-04-29 15:35 ` [PATCH v5 06/14] mm: fs: remove filemap_nr_thps*() functions and their users Zi Yan
2026-05-07 12:59 ` Lance Yang
2026-04-29 15:35 ` [PATCH v5 07/14] fs: remove nr_thps from struct address_space Zi Yan
2026-05-04 4:11 ` Nico Pache
2026-04-29 15:35 ` [PATCH v5 08/14] mm/huge_memory: remove folio split check for READ_ONLY_THP_FOR_FS Zi Yan
2026-04-29 15:35 ` [PATCH v5 09/14] mm/truncate: use folio_split() in truncate_inode_partial_folio() Zi Yan
2026-04-30 15:12 ` Zi Yan
2026-05-08 7:01 ` Lance Yang
2026-05-08 19:46 ` David Hildenbrand (Arm)
2026-04-29 15:35 ` [PATCH v5 10/14] fs/btrfs: remove a comment referring to READ_ONLY_THP_FOR_FS Zi Yan
2026-04-29 15:35 ` [PATCH v5 11/14] selftests/mm: remove READ_ONLY_THP_FOR_FS in khugepaged Zi Yan
2026-04-30 15:16 ` Zi Yan
2026-04-30 15:27 ` Zi Yan
2026-05-08 19:48 ` David Hildenbrand (Arm)
2026-05-04 4:23 ` Nico Pache
2026-05-06 13:11 ` Zi Yan
2026-05-08 19:51 ` David Hildenbrand (Arm)
2026-05-04 10:11 ` Nico Pache
2026-05-06 13:15 ` Zi Yan
2026-05-07 6:35 ` Nico Pache
2026-05-07 7:21 ` Zi Yan
2026-05-07 7:24 ` Zi Yan
2026-05-08 20:06 ` David Hildenbrand (Arm)
2026-04-29 15:35 ` [PATCH v5 12/14] selftests/mm: remove READ_ONLY_THP_FOR_FS code from guard-regions Zi Yan
2026-04-29 15:35 ` [PATCH v5 13/14] mm/khugepaged: enable clean pagecache folio collapse for writable files Zi Yan
2026-04-30 15:18 ` Zi Yan
2026-05-08 20:09 ` David Hildenbrand (Arm)
2026-05-08 7:46 ` Lance Yang [this message]
2026-05-08 20:13 ` David Hildenbrand (Arm)
2026-04-29 15:35 ` [PATCH v5 14/14] selftests/mm: add writable-file collapse tests for khugepaged Zi Yan
2026-04-29 16:13 ` [PATCH v5 00/14] Remove CONFIG_READ_ONLY_THP_FOR_FS and enable file THP for writable files Andrew Morton
2026-05-09 22:10 ` Zi Yan