All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths
@ 2026-06-23 23:16 Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
	ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
	usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)

We are doing a large number of redundant lru_add_drain() calls in
both wp_can_reuse_anon_folio() and do_swap_page(), leading to LRU
lock contention and unnecessary overhead.

In wp_can_reuse_anon_folio(), we can check the refcount against the
lru_cache before deciding to drain. In do_swap_page(), the drain is
now entirely redundant after Kairui's work to route SYNC I/O through
the swapcache in the same way as ASYNC I/O.

Build the kernel within a 1GB memcg using 20 threads with zRAM swap.
The number of lru_add_drain() calls is reduced from 276,787 to
230,283, while sys time decreases slightly from 3m40.125s to
3m37.128s.

Build the kernel within an 800MB memcg using 20 threads with zRAM
swap. The number of lru_add_drain() calls is reduced from 796,661 to
537,262, while sys time decreases slightly from 6m25.981s to
6m22.678s.

-v2:
 * collect the reviewed-by and acked-by tags from Usama, Baoquan,
   Shakeel, Kairui, thanks!
 * add patch4 to free swapcache for non-LRU folios, as suggested
   by Kairui, thanks!

-RFC:
 https://lore.kernel.org/linux-mm/20260611105124.98668-1-baohua@kernel.org/ 

Barry Song (Xiaomi) (4):
  mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  mm: entirely remove lru_add_drain in do_swap_page
  mm: try to free swapcache for non-LRU folios

 mm/memory.c | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

-- 
2.39.3 (Apple Git-146)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
	ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
	usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)

We always unconditionally drain the LRU before retrying anon folio
reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
are in lru_cache, and use the refcount to avoid many unnecessary LRU
drains.

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index ff338c2abe92..f6848f4234a6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
 	 */
 	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
 		return false;
-	if (!folio_test_lru(folio))
+	if (!folio_test_lru(folio)) {
+		/*
+		 * Assume folio is on lru_cache and holds a cache reference.
+		 */
+		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
+			return false;
 		/*
 		 * We cannot easily detect+handle references from
 		 * remote LRU caches or references to LRU folios.
 		 */
 		lru_add_drain();
+	}
 	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
 		return false;
 	if (!folio_trylock(folio))
-- 
2.39.3 (Apple Git-146)


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)
  3 siblings, 0 replies; 5+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
	ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
	usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)

The "we just allocated them without exposing them to the swapcache"
case no longer exists, as Kairui has routed synchronous I/O through
the swapcache as well in his series "unify swapin use swap cache and
cleanup flags"[1]. As a result, folio_ref_count() should never be 1
in this path, since at least two references are held (base ref plus
swapcache). Remove the folio_ref_count()==1 check and update the
comment accordingly.

[1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/

Acked-by: Usama Arif <usama.arif@linux.dev>
Reviewed-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index f6848f4234a6..abd0adcf65f0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 
 	/*
 	 * Same logic as in do_wp_page(); however, optimize for pages that are
-	 * certainly not shared either because we just allocated them without
-	 * exposing them to the swapcache or because the swap entry indicates
-	 * exclusivity.
+	 * certainly not because the swap entry indicates exclusivity.
 	 */
-	if (!folio_test_ksm(folio) &&
-	    (exclusive || folio_ref_count(folio) == 1)) {
+	if (!folio_test_ksm(folio) && exclusive) {
 		if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
 		    !pte_needs_soft_dirty_wp(vma, pte)) {
 			pte = pte_mkwrite(pte, vma);
-- 
2.39.3 (Apple Git-146)


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
  2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)
  3 siblings, 0 replies; 5+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
	ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
	usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)

We are doing a lot of redundant lru_add_drain() calls in
do_swap_page(), especially for synchronous I/O devices. For
example, the test program below currently ends up draining
lru_cache 100% of the time:

int main(int argc, char *argv[])
{
        int i;
 #define SIZE 100*1024*1024
	while(1) {
		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		for (int i = 0; i < SIZE/sizeof(int); i++)
			p[i] =  i%64;
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		for (int i = 0; i < SIZE/sizeof(int); i++)
			p[i] =  i%64;
		munmap(p, SIZE);
	}
	return 0;
}

Folio reuse now relies primarily on the exclusive hint, making
lru_cache draining to drop the refcount in lru_cache largely
irrelevant.

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index abd0adcf65f0..2983a6baf474 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	} else if (folio != swapcache)
 		page = folio_page(folio, 0);
 
-	/*
-	 * If we want to map a page that's in the swapcache writable, we
-	 * have to detect via the refcount if we're really the exclusive
-	 * owner. Try removing the extra reference from the local LRU
-	 * caches if required.
-	 */
-	if ((vmf->flags & FAULT_FLAG_WRITE) &&
-	    !folio_test_ksm(folio) && !folio_test_lru(folio))
-		lru_add_drain();
-
 	folio_throttle_swaprate(folio, GFP_KERNEL);
 
 	/*
-- 
2.39.3 (Apple Git-146)


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios
  2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
                   ` (2 preceding siblings ...)
  2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
  3 siblings, 0 replies; 5+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
	ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
	usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi),
	Kairui Song

Originally, we unconditionally called lru_add_drain() for write
swap-in page faults. This might drop the reference held by the per-CPU
LRU cache if the folio happened to reside there. However, there was no
guarantee that the folio was actually cached on the current CPU.

Now that lru_add_drain() has been removed, we have lost one
opportunity to drop a reference held by the LRU cache. We could
instead incorporate that possibility into the condition evaluated by
should_try_to_free_swap().

Suggested-by: Kairui Song <ryncsn@gmail.com>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 2983a6baf474..14577c67c61a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5087,8 +5087,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 * Remove the swap entry and conditionally try to free up the swapcache.
 	 * Do it after mapping, so raced page faults will likely see the folio
 	 * in swap cache and wait on the folio lock.
+	 * Assume non-LRU folios may be queued in the LRU cache, which contributes
+	 * an additional reference to the folio.
 	 */
-	if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+	if (should_try_to_free_swap(si, folio, vma, nr_pages +
+			!folio_test_lru(folio), vmf->flags))
 		folio_free_swap(folio);
 
 	folio_unlock(folio);
-- 
2.39.3 (Apple Git-146)


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-06-23 23:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.