Linux-mm Archive on lore.kernel.org
* [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths
@ 2026-06-11 10:51 Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

We are doing a large number of redundant lru_add_drain() calls in
both wp_can_reuse_anon_folio() and do_swap_page(), leading to LRU
lock contention and unnecessary overhead.

In wp_can_reuse_anon_folio(), we can check whether the refcount is
consistent with a single lru_cache reference before deciding to drain.
In do_swap_page(), the drain is now entirely redundant after Kairui's
work to route SYNC I/O through the swapcache in the same way as ASYNC
I/O.

Test 1: building the kernel within a 1GB memcg using 20 threads with
zRAM swap. The number of lru_add_drain() calls drops from 276,787 to
230,283 (about 17% fewer), while sys time decreases slightly from
3m40.125s to 3m37.128s.

Test 2: building the kernel within an 800MB memcg using 20 threads with
zRAM swap. The number of lru_add_drain() calls drops from 796,661 to
537,262 (about 33% fewer), while sys time decreases slightly from
6m25.981s to 6m22.678s.

Barry Song (Xiaomi) (3):
  mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  mm: entirely remove lru_add_drain in do_swap_page

 mm/memory.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

-- 
2.39.3 (Apple Git-146)



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:09   ` Shakeel Butt
  2026-06-12  3:41   ` Baoquan He
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2 siblings, 2 replies; 15+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

Today wp_can_reuse_anon_folio() calls lru_add_drain() for every !LRU
anon folio before rechecking whether it can be reused. Instead, assume
a !LRU anon folio is sitting in lru_cache, which holds one reference,
and use the refcount to skip drains that cannot make the folio
reusable.
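
To make the refcount reasoning concrete, here is a rough userspace model
of the pre-lock checks as they look with this patch applied. It is only
a sketch, not kernel code: the enum, the name wp_reuse_precheck() and
its plain int/bool parameters are hypothetical stand-ins for
folio_ref_count(), folio_test_ksm(), folio_test_lru() and
folio_test_swapcache().

```c
#include <stdbool.h>

/* Hypothetical outcomes of the checks done before folio_trylock(). */
enum reuse_step {
	REUSE_REJECT,			/* return false, no drain */
	REUSE_DRAIN_THEN_RECHECK,	/* lru_add_drain(), then recheck */
	REUSE_RECHECK,			/* go straight to the final recheck */
};

/* Userspace model of the patched checks; all parameters are plain values. */
static enum reuse_step wp_reuse_precheck(int refcount, bool ksm,
					 bool on_lru, bool in_swapcache)
{
	if (ksm || refcount > 3)
		return REUSE_REJECT;
	if (!on_lru) {
		/*
		 * Assume one reference belongs to lru_cache. If the count is
		 * still too high once that one is discounted, a drain
		 * cannot help.
		 */
		if (refcount > 2 + (int)in_swapcache)
			return REUSE_REJECT;
		return REUSE_DRAIN_THEN_RECHECK;
	}
	return REUSE_RECHECK;
}
```

In this model, a !LRU folio outside the swapcache with a refcount of 3
is rejected without a drain, since dropping one lru_cache reference
would still leave it above the final limit of 1.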

Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 56be920c56d7..487a34377a7b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
 	 */
 	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
 		return false;
-	if (!folio_test_lru(folio))
+	if (!folio_test_lru(folio)) {
+		/*
+		 * Assume folio is on lru_cache and holds a cache reference.
+		 */
+		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
+			return false;
 		/*
 		 * We cannot easily detect+handle references from
 		 * remote LRU caches or references to LRU folios.
 		 */
 		lru_add_drain();
+	}
 	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
 		return false;
 	if (!folio_trylock(folio))
-- 
2.39.3 (Apple Git-146)




* [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:12   ` Shakeel Butt
  2026-06-12  1:18   ` Baoquan He
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2 siblings, 2 replies; 15+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

The "we just allocated them without exposing them to the swapcache"
case no longer exists, as Kairui has routed synchronous I/O through
the swapcache as well in his series "unify swapin use swap cache and
cleanup flags"[1]. As a result, folio_ref_count() should never be 1
in this path, since at least two references are held (base ref plus
swapcache). Remove the folio_ref_count()==1 check and update the
comment accordingly.
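
The equivalence this relies on can be sketched in plain C. The helpers
below are hypothetical models of the old and new conditions (they are
not kernel functions), with the folio state passed in as plain values.
They agree whenever the refcount is at least 2, which is what the
swapcache routing is expected to guarantee in this path.

```c
#include <stdbool.h>

/* Model of the old condition: !ksm && (exclusive || refcount == 1). */
static bool reuse_cond_old(bool ksm, bool exclusive, int refcount)
{
	return !ksm && (exclusive || refcount == 1);
}

/* Model of the new condition: !ksm && exclusive. */
static bool reuse_cond_new(bool ksm, bool exclusive)
{
	return !ksm && exclusive;
}

/* True if both models agree for every refcount in [lo, hi]. */
static bool reuse_conds_agree(int lo, int hi)
{
	for (int ref = lo; ref <= hi; ref++)
		for (int ksm = 0; ksm <= 1; ksm++)
			for (int excl = 0; excl <= 1; excl++)
				if (reuse_cond_old(ksm, excl, ref) !=
				    reuse_cond_new(ksm, excl))
					return false;
	return true;
}
```

The two models only diverge at a refcount of exactly 1, the case the
commit message argues can no longer occur.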

[1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 487a34377a7b..ce8ef27e7a54 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 
 	/*
 	 * Same logic as in do_wp_page(); however, optimize for pages that are
-	 * certainly not shared either because we just allocated them without
-	 * exposing them to the swapcache or because the swap entry indicates
-	 * exclusivity.
+	 * certainly not shared because the swap entry indicates exclusivity.
 	 */
-	if (!folio_test_ksm(folio) &&
-	    (exclusive || folio_ref_count(folio) == 1)) {
+	if (!folio_test_ksm(folio) && exclusive) {
 		if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
 		    !pte_needs_soft_dirty_wp(vma, pte)) {
 			pte = pte_mkwrite(pte, vma);
-- 
2.39.3 (Apple Git-146)




* [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-11 10:51 ` Barry Song (Xiaomi)
  2026-06-11 18:40   ` Shakeel Butt
  2026-06-12  1:39   ` Baoquan He
  2 siblings, 2 replies; 15+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-11 10:51 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, david, ljs, liam, vbabka, rppt, surenb, mhocko,
	chrisl, kasong, shikemeng, nphamcs, baoquan.he, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt, Barry Song (Xiaomi)

We are doing a lot of redundant lru_add_drain() calls in
do_swap_page(), especially for synchronous I/O devices. For
example, the test program below currently ends up draining
lru_cache 100% of the time:

#include <stddef.h>
#include <sys/mman.h>

#define SIZE (100UL * 1024 * 1024)

int main(void)
{
	while (1) {
		volatile int *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
				       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		/* Fault in every page, page it out, then write-fault it back. */
		for (size_t i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		for (size_t i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		munmap((void *)p, SIZE);
	}
	return 0;
}

Folio reuse now relies primarily on the exclusive hint, so draining
lru_cache to drop its reference on the folio is largely irrelevant.

Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 mm/memory.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ce8ef27e7a54..b5a78670bcc8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	} else if (folio != swapcache)
 		page = folio_page(folio, 0);
 
-	/*
-	 * If we want to map a page that's in the swapcache writable, we
-	 * have to detect via the refcount if we're really the exclusive
-	 * owner. Try removing the extra reference from the local LRU
-	 * caches if required.
-	 */
-	if ((vmf->flags & FAULT_FLAG_WRITE) &&
-	    !folio_test_ksm(folio) && !folio_test_lru(folio))
-		lru_add_drain();
-
 	folio_throttle_swaprate(folio, GFP_KERNEL);
 
 	/*
-- 
2.39.3 (Apple Git-146)




* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-11 18:09   ` Shakeel Butt
  2026-06-11 18:17     ` Shakeel Butt
  2026-06-12  3:41   ` Baoquan He
  1 sibling, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:09 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> We always unconditionally drain the LRU before retrying anon folio
> reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> are in lru_cache, and use the refcount to avoid many unnecessary LRU
> drains.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
>  mm/memory.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 56be920c56d7..487a34377a7b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
>  	 */
>  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
>  		return false;
> -	if (!folio_test_lru(folio))
> +	if (!folio_test_lru(folio)) {
> +		/*
> +		 * Assume folio is on lru_cache and holds a cache reference.
> +		 */
> +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> +			return false;

In your experiments, how many drains were avoided by this specific
check?

I wonder whether that data could motivate introducing
lru_add_drain_folio(folio), which would drain only if the given folio is
in the local lru_add cache.

>  		/*
>  		 * We cannot easily detect+handle references from
>  		 * remote LRU caches or references to LRU folios.
>  		 */
>  		lru_add_drain();
> +	}
>  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
>  		return false;
>  	if (!folio_trylock(folio))
> -- 
> 2.39.3 (Apple Git-146)
> 



* Re: [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-11 18:12   ` Shakeel Butt
  2026-06-12  1:18   ` Baoquan He
  1 sibling, 0 replies; 15+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:12 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:23PM +0800, Barry Song (Xiaomi) wrote:
> The "we just allocated them without exposing them to the swapcache"
> case no longer exists, as Kairui has routed synchronous I/O through
> the swapcache as well in his series "unify swapin use swap cache and
> cleanup flags"[1]. As a result, folio_ref_count() should never be 1
> in this path, since at least two references are held (base ref plus
> swapcache). Remove the folio_ref_count()==1 check and update the
> comment accordingly.
> 
> [1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 18:09   ` Shakeel Butt
@ 2026-06-11 18:17     ` Shakeel Butt
  2026-06-12  1:08       ` Baoquan He
  2026-06-12  1:35       ` Barry Song
  0 siblings, 2 replies; 15+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:17 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > We always unconditionally drain the LRU before retrying anon folio
> > reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> > are in lru_cache, and use the refcount to avoid many unnecessary LRU
> > drains.
> > 
> > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > ---
> >  mm/memory.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 56be920c56d7..487a34377a7b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> >  	 */
> >  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> >  		return false;
> > -	if (!folio_test_lru(folio))
> > +	if (!folio_test_lru(folio)) {
> > +		/*
> > +		 * Assume folio is on lru_cache and holds a cache reference.
> > +		 */
> > +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > +			return false;
> 
> In your experiments, how much amount of drains were reduced due to this specific
> check?
> 
> I wonder if that data can motivate to introduce lru_add_drain_folio(folio) which
> only drains if the given folio is in the local lru_add cache.

Actually, if we could peek into the lru_add cache, then when
folio_ref_count(folio) is exactly (2 + folio_test_swapcache(folio)), the
folio is not on the LRU, and it is in the local lru_add cache, we could
just reuse the folio without draining, right?

> 
> >  		/*
> >  		 * We cannot easily detect+handle references from
> >  		 * remote LRU caches or references to LRU folios.
> >  		 */
> >  		lru_add_drain();
> > +	}
> >  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
> >  		return false;
> >  	if (!folio_trylock(folio))
> > -- 
> > 2.39.3 (Apple Git-146)
> > 



* Re: [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
@ 2026-06-11 18:40   ` Shakeel Butt
  2026-06-12  1:39   ` Baoquan He
  1 sibling, 0 replies; 15+ messages in thread
From: Shakeel Butt @ 2026-06-11 18:40 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Thu, Jun 11, 2026 at 06:51:24PM +0800, Barry Song (Xiaomi) wrote:
> We are doing a lot of redundant lru_add_drain() calls in
> do_swap_page(), especially for synchronous I/O devices. For
> example, the test program below currently ends up draining
> lru_cache 100% of the time:
> 
> int main(int argc, char *argv[])
> {
>         int i;
>  #define SIZE 100*1024*1024
> 	while(1) {
> 		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
>                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
> 		for (int i = 0; i < SIZE/sizeof(int); i++)
> 			p[i] =  i%64;
> 		madvise((void *)p, SIZE, MADV_PAGEOUT);
> 		for (int i = 0; i < SIZE/sizeof(int); i++)
> 			p[i] =  i%64;
> 		munmap(p, SIZE);
> 	}
> 	return 0;
> }
> 
> Folio reuse now relies primarily on the exclusive hint, making
> lru_cache draining to drop the refcount in lru_cache largely
> irrelevant.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 18:17     ` Shakeel Butt
@ 2026-06-12  1:08       ` Baoquan He
  2026-06-12  1:57         ` Barry Song
  2026-06-12  1:35       ` Barry Song
  1 sibling, 1 reply; 15+ messages in thread
From: Baoquan He @ 2026-06-12  1:08 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Barry Song (Xiaomi), akpm, linux-mm, linux-kernel, david, ljs,
	liam, vbabka, rppt, surenb, mhocko, chrisl, kasong, shikemeng,
	nphamcs, youngjun.park, jp.kobryn, usama.arif

On 06/11/26 at 11:17am, Shakeel Butt wrote:
> On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> > On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > > We always unconditionally drain the LRU before retrying anon folio
> > > reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> > > are in lru_cache, and use the refcount to avoid many unnecessary LRU
> > > drains.
> > > 
> > > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > > ---
> > >  mm/memory.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 56be920c56d7..487a34377a7b 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> > >  	 */
> > >  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> > >  		return false;
> > > -	if (!folio_test_lru(folio))
> > > +	if (!folio_test_lru(folio)) {
> > > +		/*
> > > +		 * Assume folio is on lru_cache and holds a cache reference.
> > > +		 */
> > > +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > > +			return false;
> > 
> > In your experiments, how much amount of drains were reduced due to this specific
> > check?
> > 
> > I wonder if that data can motivate to introduce lru_add_drain_folio(folio) which
> > only drains if the given folio is in the local lru_add cache.
> 
> Actually if we can peek into lru_add cache and folio_ref_count(folio) is exactly
> equal to (2 + folio_test_swapcache(folio)) and folio is not on LRU then we can
> just reuse folio if it is in lru_add cache without draining, right?

That sounds more reasonable, if we can touch only the wanted folio when
it exists in the pvec. I believe it would improve efficiency more than
the current patch.

> 
> > 
> > >  		/*
> > >  		 * We cannot easily detect+handle references from
> > >  		 * remote LRU caches or references to LRU folios.
> > >  		 */
> > >  		lru_add_drain();
> > > +	}
> > >  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
> > >  		return false;
> > >  	if (!folio_trylock(folio))
> > > -- 
> > > 2.39.3 (Apple Git-146)
> > > 



* Re: [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
  2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
  2026-06-11 18:12   ` Shakeel Butt
@ 2026-06-12  1:18   ` Baoquan He
  1 sibling, 0 replies; 15+ messages in thread
From: Baoquan He @ 2026-06-12  1:18 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt

On 06/11/26 at 06:51pm, Barry Song (Xiaomi) wrote:
> The "we just allocated them without exposing them to the swapcache"
> case no longer exists, as Kairui has routed synchronous I/O through
> the swapcache as well in his series "unify swapin use swap cache and
> cleanup flags"[1]. As a result, folio_ref_count() should never be 1
> in this path, since at least two references are held (base ref plus
> swapcache). Remove the folio_ref_count()==1 check and update the
> comment accordingly.
> 
> [1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
>  mm/memory.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 487a34377a7b..ce8ef27e7a54 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  
>  	/*
>  	 * Same logic as in do_wp_page(); however, optimize for pages that are
> -	 * certainly not shared either because we just allocated them without
> -	 * exposing them to the swapcache or because the swap entry indicates
> -	 * exclusivity.
> +	 * certainly not because the swap entry indicates exclusivity.
>  	 */
> -	if (!folio_test_ksm(folio) &&
> -	    (exclusive || folio_ref_count(folio) == 1)) {
> +	if (!folio_test_ksm(folio) && exclusive) {
>  		if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
>  		    !pte_needs_soft_dirty_wp(vma, pte)) {
>  			pte = pte_mkwrite(pte, vma);

Reviewed-by: Baoquan He <baoquan.he@linux.dev>

> -- 
> 2.39.3 (Apple Git-146)
> 
> 



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 18:17     ` Shakeel Butt
  2026-06-12  1:08       ` Baoquan He
@ 2026-06-12  1:35       ` Barry Song
  1 sibling, 0 replies; 15+ messages in thread
From: Barry Song @ 2026-06-12  1:35 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, baoquan.he,
	youngjun.park, jp.kobryn, usama.arif

On Fri, Jun 12, 2026 at 2:18 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> > On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > > We always unconditionally drain the LRU before retrying anon folio
> > > reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> > > are in lru_cache, and use the refcount to avoid many unnecessary LRU
> > > drains.
> > >
> > > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > > ---
> > >  mm/memory.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 56be920c56d7..487a34377a7b 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> > >      */
> > >     if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> > >             return false;
> > > -   if (!folio_test_lru(folio))
> > > +   if (!folio_test_lru(folio)) {
> > > +           /*
> > > +            * Assume folio is on lru_cache and holds a cache reference.
> > > +            */
> > > +           if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > > +                   return false;
> >
> > In your experiments, how much amount of drains were reduced due to this specific
> > check?
> >
> > I wonder if that data can motivate to introduce lru_add_drain_folio(folio) which
> > only drains if the given folio is in the local lru_add cache.
>
> Actually if we can peek into lru_add cache and folio_ref_count(folio) is exactly
> equal to (2 + folio_test_swapcache(folio)) and folio is not on LRU then we can
> just reuse folio if it is in lru_add cache without draining, right?
>

The real problem is that !folio_test_lru(folio) does not tell us
whether the folio is still sitting in lru_cache, and determining that
is unlikely to be cheap. We could either scan the lru_cache or add a
separate flag to indicate lru_cache membership, but neither option
sounds particularly appealing.
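
For illustration, the "scan the lru_cache" option would amount to a
linear search like the userspace sketch below. Everything here is
hypothetical: struct lru_batch_model, its capacity of 31 slots and
lru_batch_model_contains() are stand-ins, not the kernel's per-CPU
batching structures or API.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MODEL_CAP 31	/* assumed capacity, for illustration only */

/* Hypothetical model of one CPU's pending-LRU batch. */
struct lru_batch_model {
	size_t nr;				/* number of used slots */
	const void *folios[BATCH_MODEL_CAP];	/* queued folio pointers */
};

/* Linear scan: true if the given folio pointer is queued in this batch. */
static bool lru_batch_model_contains(const struct lru_batch_model *b,
				     const void *folio)
{
	for (size_t i = 0; i < b->nr && i < BATCH_MODEL_CAP; i++)
		if (b->folios[i] == folio)
			return true;
	return false;
}
```

Such a scan would add work to the fault path for each !LRU folio it
checks, which is the cost being weighed against a dedicated flag.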

Thanks
Barry



* Re: [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page
  2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
  2026-06-11 18:40   ` Shakeel Butt
@ 2026-06-12  1:39   ` Baoquan He
  1 sibling, 0 replies; 15+ messages in thread
From: Baoquan He @ 2026-06-12  1:39 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt

On 06/11/26 at 06:51pm, Barry Song (Xiaomi) wrote:
> We are doing a lot of redundant lru_add_drain() calls in
> do_swap_page(), especially for synchronous I/O devices. For
> example, the test program below currently ends up draining
> lru_cache 100% of the time:
> 
> int main(int argc, char *argv[])
> {
>         int i;
>  #define SIZE 100*1024*1024
> 	while(1) {
> 		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
>                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
> 		for (int i = 0; i < SIZE/sizeof(int); i++)
> 			p[i] =  i%64;
> 		madvise((void *)p, SIZE, MADV_PAGEOUT);
> 		for (int i = 0; i < SIZE/sizeof(int); i++)
> 			p[i] =  i%64;
> 		munmap(p, SIZE);
> 	}
> 	return 0;
> }
> 
> Folio reuse now relies primarily on the exclusive hint, making
> lru_cache draining to drop the refcount in lru_cache largely
> irrelevant.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
>  mm/memory.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index ce8ef27e7a54..b5a78670bcc8 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	} else if (folio != swapcache)
>  		page = folio_page(folio, 0);
>  
> -	/*
> -	 * If we want to map a page that's in the swapcache writable, we
> -	 * have to detect via the refcount if we're really the exclusive
> -	 * owner. Try removing the extra reference from the local LRU
> -	 * caches if required.
> -	 */
> -	if ((vmf->flags & FAULT_FLAG_WRITE) &&
> -	    !folio_test_ksm(folio) && !folio_test_lru(folio))
> -		lru_add_drain();
> -
>  	folio_throttle_swaprate(folio, GFP_KERNEL);

Reviewed-by: Baoquan He <baoquan.he@linux.dev>

>  
>  	/*
> -- 
> 2.39.3 (Apple Git-146)
> 



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-12  1:08       ` Baoquan He
@ 2026-06-12  1:57         ` Barry Song
  2026-06-12  3:40           ` Baoquan He
  0 siblings, 1 reply; 15+ messages in thread
From: Barry Song @ 2026-06-12  1:57 UTC (permalink / raw)
  To: Baoquan He
  Cc: Shakeel Butt, akpm, linux-mm, linux-kernel, david, ljs, liam,
	vbabka, rppt, surenb, mhocko, chrisl, kasong, shikemeng, nphamcs,
	youngjun.park, jp.kobryn, usama.arif

On Fri, Jun 12, 2026 at 9:08 AM Baoquan He <baoquan.he@linux.dev> wrote:
>
> On 06/11/26 at 11:17am, Shakeel Butt wrote:
> > On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> > > On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > > > We always unconditionally drain the LRU before retrying anon folio
> > > > reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> > > > are in lru_cache, and use the refcount to avoid many unnecessary LRU
> > > > drains.
> > > >
> > > > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > > > ---
> > > >  mm/memory.c | 8 +++++++-
> > > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 56be920c56d7..487a34377a7b 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> > > >    */
> > > >   if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> > > >           return false;
> > > > - if (!folio_test_lru(folio))
> > > > + if (!folio_test_lru(folio)) {
> > > > +         /*
> > > > +          * Assume folio is on lru_cache and holds a cache reference.
> > > > +          */
> > > > +         if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > > > +                 return false;
> > >
> > > In your experiments, how much amount of drains were reduced due to this specific
> > > check?
> > >
> > > I wonder if that data can motivate to introduce lru_add_drain_folio(folio) which
> > > only drains if the given folio is in the local lru_add cache.
> >
> > Actually if we can peek into lru_add cache and folio_ref_count(folio) is exactly
> > equal to (2 + folio_test_swapcache(folio)) and folio is not on LRU then we can
> > just reuse folio if it is in lru_add cache without draining, right?
>
> Sounds more reasonable if we can only touch the wanted folio if it
> exists in pvec. Believe it can improve efficiency more than the
> current patch.
>

Technically yes, but in practice it seems quite hard. As I explained
to Shakeel, we don't have an easy way to determine whether a folio is
sitting in the lru_cache. !folio_test_lru(folio) only tells us that
the folio may either have been removed from the LRU or still be
sitting in the lru_cache; it does not distinguish between the two.
Unless we want to scan the lru_cache to look up the folio, we may need
a dedicated folio flag, similar to PG_lru, to indicate lru_cache
membership. I guess people would be quite unhappy about adding a new
flag?

On the other hand, the folio may be sitting in the lru_cache of
another CPU, which the current drain cannot flush. As things stand
today, the drain only succeeds if the folio happens to be queued on
the current CPU's lru_cache, giving it roughly a 1/nr_cpus chance of
working. lru_add_drain_all() would clearly be too expensive to use here.

So another possibility is to drop this drain as well. The folio can
be released later, at the cost of missing some opportunities for
reuse.

Best Regards
Barry



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-12  1:57         ` Barry Song
@ 2026-06-12  3:40           ` Baoquan He
  0 siblings, 0 replies; 15+ messages in thread
From: Baoquan He @ 2026-06-12  3:40 UTC (permalink / raw)
  To: Barry Song
  Cc: Shakeel Butt, akpm, linux-mm, linux-kernel, david, ljs, liam,
	vbabka, rppt, surenb, mhocko, chrisl, kasong, shikemeng, nphamcs,
	youngjun.park, jp.kobryn, usama.arif

On 06/12/26 at 09:57am, Barry Song wrote:
> On Fri, Jun 12, 2026 at 9:08 AM Baoquan He <baoquan.he@linux.dev> wrote:
> >
> > On 06/11/26 at 11:17am, Shakeel Butt wrote:
> > > On Thu, Jun 11, 2026 at 11:09:43AM -0700, Shakeel Butt wrote:
> > > > On Thu, Jun 11, 2026 at 06:51:22PM +0800, Barry Song (Xiaomi) wrote:
> > > > > We always unconditionally drain the LRU before retrying anon folio
> > > > > reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> > > > > are in lru_cache, and use the refcount to avoid many unnecessary LRU
> > > > > drains.
> > > > >
> > > > > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > > > > ---
> > > > >  mm/memory.c | 8 +++++++-
> > > > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > > index 56be920c56d7..487a34377a7b 100644
> > > > > --- a/mm/memory.c
> > > > > +++ b/mm/memory.c
> > > > > @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> > > > >    */
> > > > >   if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> > > > >           return false;
> > > > > - if (!folio_test_lru(folio))
> > > > > + if (!folio_test_lru(folio)) {
> > > > > +         /*
> > > > > +          * Assume folio is on lru_cache and holds a cache reference.
> > > > > +          */
> > > > > +         if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> > > > > +                 return false;
> > > >
> > > > In your experiments, how much amount of drains were reduced due to this specific
> > > > check?
> > > >
> > > > I wonder if that data can motivate to introduce lru_add_drain_folio(folio) which
> > > > only drains if the given folio is in the local lru_add cache.
> > >
> > > Actually if we can peek into lru_add cache and folio_ref_count(folio) is exactly
> > > equal to (2 + folio_test_swapcache(folio)) and folio is not on LRU then we can
> > > just reuse folio if it is in lru_add cache without draining, right?
> >
> > Sounds more reasonable if we can only touch the wanted folio if it
> > exists in pvec. Believe it can improve efficiency more than the
> > current patch.
> >
> 
> Technically yes, but in practice it seems quite hard. As I explained
> to Shakeel, we don't have an easy way to determine whether a folio is
> sitting in the lru_cache. !folio_test_lru(folio) only tells us that
> the folio may either have been removed from the LRU or still be
> sitting in the lru_cache; it does not distinguish between the two.
> Unless we want to scan the lru_cache to look up the folio, we may need
> a dedicated folio flag, similar to PG_lru, to indicate lru_cache
> membership -
> I guess people would be quite unhappy about adding a new flag?
> 
> On the other hand, the folio may be sitting in the lru_cache of
> another CPU, which the current drain cannot flush. As things stand
> today, the drain only succeeds if the folio happens to be queued on
> the current CPU's lru_cache, giving it roughly a 1/nr_cpus chance of
> working. drain_all would clearly be too expensive to avoid.

BAM, a folio sitting in another CPU's pvec is a real barrier. If it is in
the local CPU's pvec, it definitely deserves a try, as Shakeel said, when
refcount == (2 + folio_test_swapcache(folio)).

So the current solution is the best option for now.
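
To make the cost of the "scan the lru_cache" alternative concrete, here is
a minimal user-space sketch of such a lookup. Everything in it is
hypothetical: struct batch_model, MODEL_BATCH_SIZE and
local_batch_contains() are illustration-only names, not kernel APIs, and
the capacity of 31 is an assumption. It models only one CPU's batch, so a
folio queued on another CPU would still be reported as absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative capacity of one per-CPU batch; the value is an assumption. */
#define MODEL_BATCH_SIZE 31

/* Hypothetical stand-in for the local CPU's pending-LRU-add batch. */
struct batch_model {
	unsigned int nr;                      /* number of queued entries */
	const void *folios[MODEL_BATCH_SIZE]; /* queued folio pointers */
};

/*
 * Linear scan of the local batch. Returns true only if the folio is
 * queued in this batch; it cannot see batches owned by other CPUs.
 */
static bool local_batch_contains(const struct batch_model *b,
				 const void *folio)
{
	assert(b->nr <= MODEL_BATCH_SIZE);
	for (unsigned int i = 0; i < b->nr; i++)
		if (b->folios[i] == folio)
			return true;
	return false;
}
```

The scan is bounded by the batch size, but it would run on every
write-protect fault that reaches this path, and a miss says nothing about
remote CPUs.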

> 
> So another possibility is to drop this drain as well. The folio can
> be released later, at the cost of missing some opportunities for
> reuse.
> 
> Best Regards
> Barry



* Re: [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
  2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
  2026-06-11 18:09   ` Shakeel Butt
@ 2026-06-12  3:41   ` Baoquan He
  1 sibling, 0 replies; 15+ messages in thread
From: Baoquan He @ 2026-06-12  3:41 UTC (permalink / raw)
  To: Barry Song (Xiaomi)
  Cc: akpm, linux-mm, linux-kernel, david, ljs, liam, vbabka, rppt,
	surenb, mhocko, chrisl, kasong, shikemeng, nphamcs, youngjun.park,
	jp.kobryn, usama.arif, shakeel.butt

On 06/11/26 at 06:51pm, Barry Song (Xiaomi) wrote:
> We always unconditionally drain the LRU before retrying anon folio
> reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> are in lru_cache, and use the refcount to avoid many unnecessary LRU
> drains.
> 
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
>  mm/memory.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 56be920c56d7..487a34377a7b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
>  	 */
>  	if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
>  		return false;
> -	if (!folio_test_lru(folio))
> +	if (!folio_test_lru(folio)) {
> +		/*
> +		 * Assume folio is on lru_cache and holds a cache reference.
> +		 */
> +		if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> +			return false;
>  		/*
>  		 * We cannot easily detect+handle references from
>  		 * remote LRU caches or references to LRU folios.
>  		 */
>  		lru_add_drain();
> +	}
>  	if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
>  		return false;
>  	if (!folio_trylock(folio))

Reviewed-by: Baoquan He <baoquan.he@linux.dev>
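
For readers following the refcount arithmetic, the decision made before
the final "> 1 + swapcache" check can be modelled in user space as below.
This is only a sketch of the hunk quoted above, not kernel code: struct
folio_model, enum reuse_step and reuse_first_step() are hypothetical names
introduced for illustration, and the fields stand in for
folio_ref_count(), folio_test_ksm(), folio_test_lru() and
folio_test_swapcache().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state the check reads. */
struct folio_model {
	int refcount;   /* models folio_ref_count() */
	bool ksm;       /* models folio_test_ksm() */
	bool lru;       /* models folio_test_lru() */
	bool swapcache; /* models folio_test_swapcache() */
};

enum reuse_step {
	REUSE_REJECT,           /* return false, no drain */
	REUSE_DRAIN_THEN_CHECK, /* lru_add_drain(), then the final check */
	REUSE_CHECK_NO_DRAIN,   /* folio is on the LRU: final check only */
};

/* Models the patched logic up to, but not including, the final check. */
static enum reuse_step reuse_first_step(const struct folio_model *f)
{
	assert(f->refcount >= 1);
	if (f->ksm || f->refcount > 3)
		return REUSE_REJECT;
	if (!f->lru) {
		/*
		 * Added by the patch: allow the mapping reference, an
		 * assumed lru_cache reference and an optional swapcache
		 * reference. Anything beyond that cannot be dropped by
		 * a drain, so skip the drain.
		 */
		if (f->refcount > 2 + (f->swapcache ? 1 : 0))
			return REUSE_REJECT;
		return REUSE_DRAIN_THEN_CHECK;
	}
	return REUSE_CHECK_NO_DRAIN;
}
```

The case the patch changes is a !LRU folio outside the swapcache with a
refcount of 3: the old code drained and then failed the final check, while
the model above rejects it up front.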

> -- 
> 2.39.3 (Apple Git-146)
> 



Thread overview: 15+ messages
2026-06-11 10:51 [RFC PATCH 0/3] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
2026-06-11 10:51 ` [RFC PATCH 1/3] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
2026-06-11 18:09   ` Shakeel Butt
2026-06-11 18:17     ` Shakeel Butt
2026-06-12  1:08       ` Baoquan He
2026-06-12  1:57         ` Barry Song
2026-06-12  3:40           ` Baoquan He
2026-06-12  1:35       ` Barry Song
2026-06-12  3:41   ` Baoquan He
2026-06-11 10:51 ` [RFC PATCH 2/3] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
2026-06-11 18:12   ` Shakeel Butt
2026-06-12  1:18   ` Baoquan He
2026-06-11 10:51 ` [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
2026-06-11 18:40   ` Shakeel Butt
2026-06-12  1:39   ` Baoquan He
