From: Lance Yang <lance.yang@linux.dev>
To: ziy@nvidia.com
Cc: akpm@linux-foundation.org, david@kernel.org, willy@infradead.org,
	songliubraving@fb.com, clm@fb.com, dsterba@suse.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
	ljs@kernel.org, baolin.wang@linux.alibaba.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, shuah@kernel.org, linux-btrfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v5 13/14] mm/khugepaged: enable clean pagecache folio collapse for writable files
Date: Fri,  8 May 2026 15:46:43 +0800
Message-ID: <20260508074643.55548-1-lance.yang@linux.dev>
In-Reply-To: <20260429153538.727855-9-ziy@nvidia.com>


On Wed, Apr 29, 2026 at 11:35:36AM -0400, Zi Yan wrote:
>collapse_file() is capable of collapsing pagecache folios from writable
>files to PMD folios. Now enable clean pagecache folio collapse in addition
>to read-only pagecache folio collapse by removing the
>inode_is_open_for_write() from file_thp_enabled() and only performing
>filemap_flush() if the file is read-only.
>
>This means userspace needs to explicitly flush the content of pagecache
>folios before khugepaged can collapse the folios, or use
>madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is
>that blindly enabling collapse of dirty pagecache folios from writable
>files would make khugepaged flush these folios all the time. It is
>undesirable to cause system-level pagecache flushes.
>
>To properly support dirty pagecache folio collapse, filemap_flush() needs
>to be avoided. Potentially, merging the associated buffers instead of
>dropping them with filemap_release_folio() might be needed.
>
>NOTE: this breaks the khugepaged selftests for writable file pagecache
>collapse, which are set to fail all the time. The next commit fixes them.
>
>Signed-off-by: Zi Yan <ziy@nvidia.com>
>---
> mm/huge_memory.c | 2 +-
> mm/khugepaged.c  | 9 ++++++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 9b3abb98a7e51..e1e9d59db6e70 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
> 	if (!mapping_pmd_folio_support(vma->vm_file->f_mapping))
> 		return false;
> 
>-	return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
>+	return S_ISREG(inode->i_mode);
> }
> 
> /* If returns true, we are unable to access the VMA's folios. */
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 1ee15b48962a3..fb7ff643973cc 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
> 				 * forcing writeback in loop.
> 				 */

Nit: the comment above now looks stale. It still says:

"There won’t be new dirty pages."

That was true when file_thp_enabled() rejected writable-open files, but
not after this patch ;)

Otherwise, LGTM.
Reviewed-by: Lance Yang <lance.yang@linux.dev>

> 				xas_unlock_irq(&xas);
>-				filemap_flush(mapping);
>+				/*
>+				 * Only flush for read-only files. Writable
>+				 * files can have their folios dirty at any
>+				 * time; blindly flushing them would cause
>+				 * undesirable system-wide writeback.
>+				 */
>+				if (!inode_is_open_for_write(mapping->host))
>+					filemap_flush(mapping);
> 				result = SCAN_PAGE_DIRTY_OR_WRITEBACK;
> 				goto xa_unlocked;
> 			} else if (folio_test_writeback(folio)) {
>-- 
>2.53.0
>
>
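
For readers following the commit message quoted above, here is a minimal
userspace sketch of what "explicitly flush, then collapse" could look like.
It calls fsync() so the file's pagecache folios are clean, then requests a
synchronous collapse with madvise(MADV_COLLAPSE) instead of waiting for
khugepaged. The helper name flush_then_collapse() and its negative-errno
return convention are hypothetical and not part of the patch. The fallback
#define assumes the Linux UAPI value of MADV_COLLAPSE (25) for libc headers
that predate it. Whether the collapse actually succeeds depends on the
kernel version, the filesystem's large folio support and the THP settings,
so this is a sketch under those assumptions, not a reference implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: older libc headers may lack this; 25 is the Linux UAPI value. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/*
 * Hypothetical helper, for illustration only.
 *
 * Writes back the dirty pagecache of the file behind @fd, maps the first
 * @len bytes shared and read-only, and asks the kernel to collapse that
 * range. Returns 0 on success or a negative errno value on failure.
 */
int flush_then_collapse(int fd, size_t len)
{
	void *addr;
	int ret = 0;

	/* Step 1: explicit flush, so the folios are clean before collapse. */
	if (fsync(fd) < 0)
		return -errno;

	addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED)
		return -errno;

	/* Step 2: best-effort synchronous collapse of the mapped range. */
	if (madvise(addr, len, MADV_COLLAPSE) < 0)
		ret = -errno;

	munmap(addr, len);
	return ret;
}
```

A caller would pass a descriptor for a regular file and a length that covers
at least one PMD-sized chunk (2 MiB on x86-64 with 4 KiB pages). The sketch
does not try to control the alignment of the mapping, and a non-zero return
(for example -EINVAL on kernels or filesystems without support) should be
treated as "not collapsed" rather than as a fatal error.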
