Linux-mm Archive on lore.kernel.org
* [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths
@ 2026-06-11 10:51 Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

We are doing a large number of redundant lru_add_drain() calls in
both wp_can_reuse_anon_folio() and do_swap_page(), leading to LRU
lock contention and unnecessary overhead.

In wp_can_reuse_anon_folio(), we can check whether the refcount is
consistent with a single extra lru_cache reference before deciding to
drain. In do_swap_page(), the drain is now entirely redundant after
Kairui's work to route SYNC I/O through the swapcache in the same way
as ASYNC I/O.

When building the kernel within a 1GB memcg using 20 threads with
zRAM swap, the number of lru_add_drain() calls is reduced from
276,787 to 230,283, while sys time decreases slightly from 3m40.125s
to 3m37.128s.

When building the kernel within an 800MB memcg using 20 threads with
zRAM swap, the number of lru_add_drain() calls is reduced from
796,661 to 537,262, while sys time decreases slightly from 6m25.981s
to 6m22.678s.
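
For readers who want the relative figures, the small helper below is a
sketch (not part of the series) that turns the counts and timings quoted
above into percentages. The function names are made up for illustration.
By this arithmetic the drain count drops by roughly 17% and 33% in the
two runs, while sys time drops by about 1% in both.

```c
#include <assert.h>
#include <stdio.h>

/* Relative reduction, in percent, going from `before` to `after`. */
double pct_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Convert a time(1)-style "XmY.Zs" reading to seconds. */
double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Print the reductions for the two runs quoted in the cover letter. */
void print_summary(void)
{
	printf("1GB memcg:   drains -%.2f%%, sys -%.2f%%\n",
	       pct_reduction(276787, 230283),
	       pct_reduction(to_seconds(3, 40.125), to_seconds(3, 37.128)));
	printf("800MB memcg: drains -%.2f%%, sys -%.2f%%\n",
	       pct_reduction(796661, 537262),
	       pct_reduction(to_seconds(6, 25.981), to_seconds(6, 22.678)));
}
```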

Barry Song (Xiaomi) (3):
  mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  mm: entirely remove lru_add_drain in do_swap_page

 mm/memory.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

-- 
2.39.3 (Apple Git-146)




* [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:09   ` Shakeel Butt
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2 siblings, 1 reply; 8+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

wp_can_reuse_anon_folio() currently drains the LRU cache whenever the
folio is not on the LRU, before re-checking the refcount. Instead,
assume a !LRU anon folio sits in lru_cache, which holds one reference,
and return early when the refcount is still too high for reuse. This
avoids many unnecessary LRU drains.
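
The decision flow described above can be sketched in userspace as
follows. This is only a model of the checks in the diff below:
struct folio_model, enum reuse_step and reuse_precheck() are
hypothetical names, and the booleans stand in for folio_test_ksm(),
folio_test_lru() and folio_test_swapcache().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state the check reads. */
struct folio_model {
	int refcount;
	bool ksm;
	bool lru;
	bool swapcache;
};

enum reuse_step {
	REUSE_REJECT,       /* return false without draining */
	REUSE_DRAIN_FIRST,  /* call lru_add_drain(), then re-check refcount */
	REUSE_CHECK_DIRECT, /* folio is on the LRU, go to the final check */
};

enum reuse_step reuse_precheck(const struct folio_model *f)
{
	assert(f->refcount >= 1);
	if (f->ksm || f->refcount > 3)
		return REUSE_REJECT;
	if (!f->lru) {
		/* Assume one reference belongs to lru_cache. */
		if (f->refcount > 2 + (int)f->swapcache)
			return REUSE_REJECT;
		return REUSE_DRAIN_FIRST;
	}
	return REUSE_CHECK_DIRECT;
}
```

In this model, the only input that changes from "drain" to "reject"
compared with the old code is a !LRU, non-swapcache folio with a
refcount of exactly 3, since larger refcounts were already rejected by
the first check.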

Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 56be920c56d7..487a34377a7b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
 	 */
 	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
 		return false;
-	if (!folio_test_lru(folio))
+	if (!folio_test_lru(folio)) {
+		/*
+		 * Assume folio is on lru_cache and holds a cache reference.
+		 */
+		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
+			return false;
 		/*
 		 * We cannot easily detect+handle references from
 		 * remote LRU caches or references to LRU folios.
 		 */
 		lru_add_drain();
+	}
 	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
 		return false;
 	if (!folio_trylock(folio))
-- 
2.39.3 (Apple Git-146)




* [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:12   ` Shakeel Butt
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2 siblings, 1 reply; 8+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

The "we just allocated them without exposing them to the swapcache"
case no longer exists, as Kairui has routed synchronous I/O through
the swapcache as well in his series "unify swapin use swap cache and
cleanup flags"[1]. As a result, folio_ref_count() should never be 1
in this path, since at least two references are held (base ref plus
swapcache). Remove the folio_ref_count()==1 check and update the
comment accordingly.
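
As a sanity check on that reasoning, the sketch below models the old
and new conditions as plain booleans and compares them exhaustively.
The function names are hypothetical. It assumes, as stated above, that
the refcount in this path is never below 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before this patch. */
bool reuse_cond_old(bool ksm, bool exclusive, int refcount)
{
	return !ksm && (exclusive || refcount == 1);
}

/* Condition after this patch. */
bool reuse_cond_new(bool ksm, bool exclusive)
{
	return !ksm && exclusive;
}

/* True if both conditions agree for every refcount in [min_ref, max_ref]. */
bool conditions_agree(int min_ref, int max_ref)
{
	assert(min_ref <= max_ref);
	for (int ref = min_ref; ref <= max_ref; ref++)
		for (int ksm = 0; ksm <= 1; ksm++)
			for (int excl = 0; excl <= 1; excl++)
				if (reuse_cond_old(ksm, excl, ref) !=
				    reuse_cond_new(ksm, excl))
					return false;
	return true;
}
```

The two conditions agree for any range starting at 2 and differ only
when a refcount of 1 is possible, which is exactly the case the commit
message argues no longer occurs.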

[1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 487a34377a7b..ce8ef27e7a54 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 
 	/*
 	 * Same logic as in do_wp_page(); however, optimize for pages that are
-	 * certainly not shared either because we just allocated them without
-	 * exposing them to the swapcache or because the swap entry indicates
-	 * exclusivity.
+	 * certainly not shared because the swap entry indicates exclusivity.
 	 */
-	if (!folio_test_ksm(folio) &&
-	    (exclusive || folio_ref_count(folio) == 1)) {
+	if (!folio_test_ksm(folio) && exclusive) {
 		if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
 		    !pte_needs_soft_dirty_wp(vma, pte)) {
 			pte = pte_mkwrite(pte, vma);
-- 
2.39.3 (Apple Git-146)




* [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:40   ` Shakeel Butt
  2 siblings, 1 reply; 8+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

We are doing a lot of redundant lru_add_drain() calls in
do_swap_page(), especially for synchronous I/O devices. For
example, the test program below currently ends up draining
lru_cache 100% of the time:

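#include <sys/mman.h>

#define SIZE (100UL * 1024 * 1024)

int main(void)
{
	while (1) {
		/* Map 100MB of private anonymous memory. */
		volatile int *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		/* Populate, page everything out, then write-fault it back in. */
		for (unsigned long i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		for (unsigned long i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		munmap((void *)p, SIZE);
	}
	return 0;
}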

Folio reuse now relies primarily on the exclusive hint, so draining
lru_cache just to drop the reference it holds is largely irrelevant.

Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ce8ef27e7a54..b5a78670bcc8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	} else if (folio != swapcache)
 		page = folio_page(folio, 0);
 
-	/*
-	 * If we want to map a page that's in the swapcache writable, we
-	 * have to detect via the refcount if we're really the exclusive
-	 * owner. Try removing the extra reference from the local LRU
-	 * caches if required.
-	 */
-	if ((vmf->flags & FAULT_FLAG_WRITE) &&
-	    !folio_test_ksm(folio) && !folio_test_lru(folio))
-		lru_add_drain();
-
 	folio_throttle_swaprate(folio, GFP_KERNEL);
 
 	/*
-- 
2.39.3 (Apple Git-146)




* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-11 18:09   ` Shakeel Butt
  2026-06-11 18:17     ` Shakeel Butt
  0 siblings, 1 reply; 8+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:09 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> wp_can_reuse_anon_folio() currently drains the LRU cache whenever the
> folio is not on the LRU, before re-checking the refcount. Instead,
> assume a !LRU anon folio sits in lru_cache, which holds one reference,
> and return early when the refcount is still too high for reuse. This
> avoids many unnecessary LRU drains.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
>  mm/memory.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 56be920c56d7..487a34377a7b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
>  	 */
>  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
>  		return false;
> -	if (!folio_test_lru(folio))
> +	if (!folio_test_lru(folio)) {
> +		/*
> +		 * Assume folio is on lru_cache and holds a cache reference.
> +		 */
> +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> +			return false;

In your experiments, how many drains were eliminated by this specific
check?

I wonder if that data could motivate introducing
lru_add_drain_folio(folio), which would drain only if the given folio is
in the local lru_add cache.
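
A userspace sketch of what such a helper might look like is given
below. Everything here is hypothetical: lru_add_drain_folio() does not
exist in the kernel, and struct lru_add_batch, BATCH_MAX and the
*_model names are illustrative stand-ins for the per-CPU folio batch.
The model assumes the batch holds one reference per queued folio and
drops it when the folio is moved to the LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 31 /* arbitrary capacity for this model */

struct folio_model {
	int refcount;
	bool lru;
};

struct lru_add_batch {
	size_t nr;
	struct folio_model *folios[BATCH_MAX];
};

/* Queue a folio; the batch takes its own reference. */
bool batch_add(struct lru_add_batch *b, struct folio_model *f)
{
	if (b->nr == BATCH_MAX)
		return false;
	f->refcount++;
	b->folios[b->nr++] = f;
	return true;
}

/* Move every queued folio to the LRU and drop the batch references. */
void batch_drain(struct lru_add_batch *b)
{
	for (size_t i = 0; i < b->nr; i++) {
		b->folios[i]->lru = true;
		b->folios[i]->refcount--;
	}
	b->nr = 0;
}

/* Drain only if `f` is queued in the local batch; report whether we did. */
bool lru_add_drain_folio_model(struct lru_add_batch *local,
			       const struct folio_model *f)
{
	for (size_t i = 0; i < local->nr; i++) {
		if (local->folios[i] == f) {
			batch_drain(local);
			return true;
		}
	}
	return false;
}
```

The linear scan is bounded by the batch size, so in this model the
lookup is cheap compared with a full drain that takes the LRU lock.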

>  		/*
>  		 * We cannot easily detect+handle references from
>  		 * remote LRU caches or references to LRU folios.
>  		 */
>  		lru_add_drain();
> +	}
>  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
>  		return false;
>  	if (!folio_trylock(folio))
> -- 
> 2.39.3 (Apple Git-146)
> 



* Re: [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-11 18:12   ` Shakeel Butt
  0 siblings, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:12 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:23PM +0800, Barry Song (Xiaomi) wrote:
> The "we just allocated them without exposing them to the swapcache"
> case no longer exists, as Kairui has routed synchronous I/O through
> the swapcache as well in his series "unify swapin use swap cache and
> cleanup flags"[1]. As a result, folio_ref_count() should never be 1
> in this path, since at least two references are held (base ref plus
> swapcache). Remove the folio_ref_count()==1 check and update the
> comment accordingly.
> 
> [1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 18:09   ` Shakeel Butt
@ 2026-06-11 18:17     ` Shakeel Butt
  0 siblings, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:17 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > wp_can_reuse_anon_folio() currently drains the LRU cache whenever the
> > folio is not on the LRU, before re-checking the refcount. Instead,
> > assume a !LRU anon folio sits in lru_cache, which holds one reference,
> > and return early when the refcount is still too high for reuse. This
> > avoids many unnecessary LRU drains.
> > 
> > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > ---
> >  mm/memory.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 56be920c56d7..487a34377a7b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> >  	 */
> >  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> >  		return false;
> > -	if (!folio_test_lru(folio))
> > +	if (!folio_test_lru(folio)) {
> > +		/*
> > +		 * Assume folio is on lru_cache and holds a cache reference.
> > +		 */
> > +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > +			return false;
> 
> In your experiments, how many drains were eliminated by this specific
> check?
> 
> I wonder if that data could motivate introducing
> lru_add_drain_folio(folio), which would drain only if the given folio is
> in the local lru_add cache.

Actually, if we can peek into the lru_add cache, and
folio_ref_count(folio) is exactly (2 + folio_test_swapcache(folio)) with
the folio not on the LRU, then we can just reuse the folio when it is in
the lru_add cache, without draining, right?
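
The reference accounting behind that question can be written down as a
small sketch. The function names are hypothetical, and the model only
captures the counting argument: one base reference, one reference held
by the local lru_add cache, plus one more if the folio is in the
swapcache. It says nothing about the locking or swapcache handling the
real reuse path still has to do.

```c
#include <assert.h>
#include <stdbool.h>

/* References expected while the folio still sits in the local lru_add cache. */
int expected_refs_in_batch(bool swapcache)
{
	/* base reference + lru_add cache reference (+ swapcache reference) */
	return 2 + (swapcache ? 1 : 0);
}

/* True if every reference is accounted for without draining. */
bool reuse_candidate_without_drain(int refcount, bool on_lru,
				   bool swapcache, bool in_local_batch)
{
	assert(refcount >= 1);
	if (on_lru || !in_local_batch)
		return false;
	return refcount == expected_refs_in_batch(swapcache);
}
```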

> 
> >  		/*
> >  		 * We cannot easily detect+handle references from
> >  		 * remote LRU caches or references to LRU folios.
> >  		 */
> >  		lru_add_drain();
> > +	}
> >  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
> >  		return false;
> >  	if (!folio_trylock(folio))
> > -- 
> > 2.39.3 (Apple Git-146)
> > 



* Re: [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
@ 2026-06-11 18:40   ` Shakeel Butt
  0 siblings, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:40 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:24PM +0800, Barry Song (Xiaomi) wrote:
> We are doing a lot of redundant lru_add_drain() calls in
> do_swap_page(), especially for synchronous I/O devices. For
> example, the test program below currently ends up draining
> lru_cache 100% of the time:
> 
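> #include <sys/mman.h>
>
> #define SIZE (100UL * 1024 * 1024)
>
> int main(void)
> {
> 	while (1) {
> 		/* Map 100MB of private anonymous memory. */
> 		volatile int *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> 				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
> 		if (p == MAP_FAILED)
> 			return 1;
> 		/* Populate, page everything out, then write-fault it back in. */
> 		for (unsigned long i = 0; i < SIZE / sizeof(int); i++)
> 			p[i] = i % 64;
> 		madvise((void *)p, SIZE, MADV_PAGEOUT);
> 		for (unsigned long i = 0; i < SIZE / sizeof(int); i++)
> 			p[i] = i % 64;
> 		munmap((void *)p, SIZE);
> 	}
> 	return 0;
> }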
> 
> Folio reuse now relies primarily on the exclusive hint, so draining
> lru_cache just to drop the reference it holds is largely irrelevant.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>


