* [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
2026-06-24 10:14 ` Kairui Song
2026-06-24 15:02 ` David Hildenbrand (Arm)
2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
` (2 subsequent siblings)
3 siblings, 2 replies; 11+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
To: akpm, linux-mm
Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)
We unconditionally drain the LRU cache before retrying anon folio
reuse in wp_can_reuse_anon_folio(). Instead, assume that !LRU anon
folios sit in lru_cache, which holds one reference, and bail out early
when the refcount would still be too high after a drain. This avoids
many unnecessary LRU drains.
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
mm/memory.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index ff338c2abe92..f6848f4234a6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
*/
if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
return false;
- if (!folio_test_lru(folio))
+ if (!folio_test_lru(folio)) {
+ /*
+ * Assume folio is on lru_cache and holds a cache reference.
+ */
+ if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
+ return false;
/*
* We cannot easily detect+handle references from
* remote LRU caches or references to LRU folios.
*/
lru_add_drain();
+ }
if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
return false;
if (!folio_trylock(folio))
--
2.39.3 (Apple Git-146)
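
The refcount reasoning in this patch can be followed with a small
userspace model. This is only a sketch: struct folio_model,
model_lru_add_drain() and model_refcount_precheck() are hypothetical
names invented for illustration, not kernel APIs, and the model assumes
that draining the local LRU cache drops exactly one reference when the
folio sits there. It mirrors only the checks visible in the hunk above
and stops before folio_trylock().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the folio state the checks read. */
struct folio_model {
	int refcount;            /* models folio_ref_count() */
	bool ksm;                /* models folio_test_ksm() */
	bool lru;                /* models folio_test_lru() */
	bool swapcache;          /* models folio_test_swapcache() */
	bool in_local_lru_cache; /* assumption: this CPU's cache holds one ref */
};

/* Counts how often the modeled drain runs. */
static int drains;

/*
 * Models lru_add_drain(): if the folio sits in the local cache, move it
 * to the LRU and drop the cache's reference. Otherwise nothing changes.
 */
static void model_lru_add_drain(struct folio_model *f)
{
	drains++;
	if (f->in_local_lru_cache) {
		assert(f->refcount > 0);
		f->in_local_lru_cache = false;
		f->lru = true;
		f->refcount--;
	}
}

/* Models the refcount checks of the patched hunk, up to folio_trylock(). */
static bool model_refcount_precheck(struct folio_model *f)
{
	if (f->ksm || f->refcount > 3)
		return false;
	if (!f->lru) {
		/* Even after dropping one cache reference, still too many. */
		if (f->refcount > 2 + f->swapcache)
			return false;
		model_lru_add_drain(f);
	}
	if (f->refcount > 1 + f->swapcache)
		return false;
	return true;
}
```

In this model, a non-LRU folio with three references and no swapcache
is rejected by the new early check without a drain, since even a
successful drain would leave two references. The same folio with two
references is still drained and then passes.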
* Re: [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-24 10:14 ` Kairui Song
2026-06-24 15:02 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 11+ messages in thread
From: Kairui Song @ 2026-06-24 10:14 UTC (permalink / raw)
To: Barry Song (Xiaomi)
Cc: akpm, linux-mm, baoquan.he, chrisl, david, jp.kobryn, liam,
linux-kernel, ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng,
surenb, usama.arif, vbabka, youngjun.park
On Wed, Jun 24, 2026 at 7:16 AM Barry Song (Xiaomi) <baohua@kernel.org> wrote:
>
> We always unconditionally drain the LRU before retrying anon folio
> reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> are in lru_cache, and use the refcount to avoid many unnecessary LRU
> drains.
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Reviewed-by: Baoquan He <baoquan.he@linux.dev>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/memory.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index ff338c2abe92..f6848f4234a6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> */
> if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> return false;
> - if (!folio_test_lru(folio))
> + if (!folio_test_lru(folio)) {
> + /*
> + * Assume folio is on lru_cache and holds a cache reference.
> + */
> + if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> + return false;
> /*
> * We cannot easily detect+handle references from
> * remote LRU caches or references to LRU folios.
> */
> lru_add_drain();
> + }
> if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio))
> return false;
> if (!folio_trylock(folio))
> --
> 2.39.3 (Apple Git-146)
>
Reviewed-by: Kairui Song <kasong@tencent.com>
* Re: [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio()
2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
2026-06-24 10:14 ` Kairui Song
@ 2026-06-24 15:02 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 11+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-24 15:02 UTC (permalink / raw)
To: Barry Song (Xiaomi), akpm, linux-mm
Cc: baoquan.he, chrisl, jp.kobryn, kasong, liam, linux-kernel, ljs,
mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park
On 6/24/26 01:16, Barry Song (Xiaomi) wrote:
> We always unconditionally drain the LRU before retrying anon folio
> reuse in wp_can_reuse_anon_folio(). Instead, assume !LRU anon folios
> are in lru_cache, and use the refcount to avoid many unnecessary LRU
> drains.
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Reviewed-by: Baoquan He <baoquan.he@linux.dev>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/memory.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index ff338c2abe92..f6848f4234a6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4193,12 +4193,18 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> */
> if (folio_test_ksm(folio) || folio_ref_count(folio) > 3)
> return false;
> - if (!folio_test_lru(folio))
> + if (!folio_test_lru(folio)) {
> + /*
> + * Assume folio is on lru_cache and holds a cache reference.
> + */
> + if (folio_ref_count(folio) > 2 + folio_test_swapcache(folio))
> + return false;
I'm not keen on making this function even uglier, so no, not like that.
We have the earlier "folio_ref_count(folio) > 3" check.
In which scenarios can you trigger this such that we would care?
If the answer is "I don't know" there is no reason for a change.
--
Cheers,
David
* [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
2026-06-24 15:07 ` David Hildenbrand (Arm)
2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)
3 siblings, 1 reply; 11+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
To: akpm, linux-mm
Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)
The "we just allocated them without exposing them to the swapcache"
case no longer exists, as Kairui has routed synchronous I/O through
the swapcache as well in his series "unify swapin use swap cache and
cleanup flags"[1]. As a result, folio_ref_count() should never be 1
in this path, since at least two references are held (base ref plus
swapcache). Remove the folio_ref_count()==1 check and update the
comment accordingly.
[1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
Acked-by: Usama Arif <usama.arif@linux.dev>
Reviewed-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
mm/memory.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index f6848f4234a6..abd0adcf65f0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
/*
* Same logic as in do_wp_page(); however, optimize for pages that are
- * certainly not shared either because we just allocated them without
- * exposing them to the swapcache or because the swap entry indicates
- * exclusivity.
+ * certainly not shared because the swap entry indicates exclusivity.
*/
- if (!folio_test_ksm(folio) &&
- (exclusive || folio_ref_count(folio) == 1)) {
+ if (!folio_test_ksm(folio) && exclusive) {
if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
!pte_needs_soft_dirty_wp(vma, pte)) {
pte = pte_mkwrite(pte, vma);
--
2.39.3 (Apple Git-146)
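
The argument in the commit message can be checked with a small
userspace model. This is only a sketch: old_reuse_cond(),
new_reuse_cond() and conditions_agree() are hypothetical helpers, not
kernel code, and the model takes the message's claim as its premise,
namely that the folio always holds at least two references on this
path (base ref plus swapcache).

```c
#include <stdbool.h>

/* Condition before the patch. */
static bool old_reuse_cond(bool ksm, bool exclusive, int refcount)
{
	return !ksm && (exclusive || refcount == 1);
}

/* Condition after the patch. */
static bool new_reuse_cond(bool ksm, bool exclusive)
{
	return !ksm && exclusive;
}

/*
 * Returns true if both conditions give the same answer for every
 * combination of flags and every refcount in [min_ref, max_ref].
 */
static bool conditions_agree(int min_ref, int max_ref)
{
	for (int ksm = 0; ksm <= 1; ksm++)
		for (int excl = 0; excl <= 1; excl++)
			for (int ref = min_ref; ref <= max_ref; ref++)
				if (old_reuse_cond(ksm, excl, ref) !=
				    new_reuse_cond(ksm, excl))
					return false;
	return true;
}
```

With refcounts starting at 2 the two conditions never differ. They
diverge only if a refcount of 1 were possible, which is the case the
message says no longer exists.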
* Re: [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic
2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-24 15:07 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-24 15:07 UTC (permalink / raw)
To: Barry Song (Xiaomi), akpm, linux-mm
Cc: baoquan.he, chrisl, jp.kobryn, kasong, liam, linux-kernel, ljs,
mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park
On 6/24/26 01:16, Barry Song (Xiaomi) wrote:
> The "we just allocated them without exposing them to the swapcache"
> case no longer exists, as Kairui has routed synchronous I/O through
> the swapcache as well in his series "unify swapin use swap cache and
> cleanup flags"[1]. As a result, folio_ref_count() should never be 1
> in this path, since at least two references are held (base ref plus
> swapcache). Remove the folio_ref_count()==1 check and update the
> comment accordingly.
>
> [1] https://lore.kernel.org/all/20251220-swap-table-p2-v5-0-8862a265a033@tencent.com/
>
> Acked-by: Usama Arif <usama.arif@linux.dev>
> Reviewed-by: Kairui Song <kasong@tencent.com>
> Reviewed-by: Baoquan He <baoquan.he@linux.dev>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/memory.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index f6848f4234a6..abd0adcf65f0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5049,12 +5049,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>
> /*
> * Same logic as in do_wp_page(); however, optimize for pages that are
s/Same/Similar/ ?
> - * certainly not shared either because we just allocated them without
> - * exposing them to the swapcache or because the swap entry indicates
> - * exclusivity.
> + * certainly not shared because the swap entry indicates exclusivity.
> */
> - if (!folio_test_ksm(folio) &&
> - (exclusive || folio_ref_count(folio) == 1)) {
> + if (!folio_test_ksm(folio) && exclusive) {
Hmm, but KSM folios should never have "exclusive" set. So I think you can drop
that as well (was only relevant with folio_ref_count==1 check IIRC).
--
Cheers,
David
* [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page
2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 1/4] mm: avoid unnecessary lru drain for wp_can_reuse_anon_folio() Barry Song (Xiaomi)
2026-06-23 23:16 ` [PATCH v2 2/4] mm: drop stale folio_ref_count()==1 check in do_swap_page reuse logic Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
2026-06-24 10:16 ` Kairui Song
2026-06-24 15:10 ` David Hildenbrand (Arm)
2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)
3 siblings, 2 replies; 11+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
To: akpm, linux-mm
Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi)
We are doing a lot of redundant lru_add_drain() calls in
do_swap_page(), especially for synchronous I/O devices. For
example, the test program below currently ends up draining
lru_cache 100% of the time:
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define SIZE (100UL * 1024 * 1024)

int main(void)
{
	while (1) {
		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		/* Populate the region, page it out, then write-fault it back in. */
		for (size_t i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		for (size_t i = 0; i < SIZE / sizeof(int); i++)
			p[i] = i % 64;
		munmap((void *)p, SIZE);
	}
	return 0;
}
Folio reuse now relies primarily on the exclusive hint, so draining
lru_cache just to drop the reference it holds is largely irrelevant.
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
mm/memory.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index abd0adcf65f0..2983a6baf474 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
} else if (folio != swapcache)
page = folio_page(folio, 0);
- /*
- * If we want to map a page that's in the swapcache writable, we
- * have to detect via the refcount if we're really the exclusive
- * owner. Try removing the extra reference from the local LRU
- * caches if required.
- */
- if ((vmf->flags & FAULT_FLAG_WRITE) &&
- !folio_test_ksm(folio) && !folio_test_lru(folio))
- lru_add_drain();
-
folio_throttle_swaprate(folio, GFP_KERNEL);
/*
--
2.39.3 (Apple Git-146)
* Re: [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page
2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
@ 2026-06-24 10:16 ` Kairui Song
2026-06-24 15:10 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 11+ messages in thread
From: Kairui Song @ 2026-06-24 10:16 UTC (permalink / raw)
To: Barry Song (Xiaomi)
Cc: akpm, linux-mm, baoquan.he, chrisl, david, jp.kobryn, liam,
linux-kernel, ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng,
surenb, usama.arif, vbabka, youngjun.park
On Wed, Jun 24, 2026 at 7:18 AM Barry Song (Xiaomi) <baohua@kernel.org> wrote:
>
> We are doing a lot of redundant lru_add_drain() calls in
> do_swap_page(), especially for synchronous I/O devices. For
> example, the test program below currently ends up draining
> lru_cache 100% of the time:
>
> int main(int argc, char *argv[])
> {
> int i;
> #define SIZE 100*1024*1024
> while(1) {
> volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
> for (int i = 0; i < SIZE/sizeof(int); i++)
> p[i] = i%64;
> madvise((void *)p, SIZE, MADV_PAGEOUT);
> for (int i = 0; i < SIZE/sizeof(int); i++)
> p[i] = i%64;
> munmap(p, SIZE);
> }
> return 0;
> }
>
> Folio reuse now relies primarily on the exclusive hint, making
> lru_cache draining to drop the refcount in lru_cache largely
> irrelevant.
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Reviewed-by: Baoquan He <baoquan.he@linux.dev>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/memory.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index abd0adcf65f0..2983a6baf474 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> } else if (folio != swapcache)
> page = folio_page(folio, 0);
>
> - /*
> - * If we want to map a page that's in the swapcache writable, we
> - * have to detect via the refcount if we're really the exclusive
> - * owner. Try removing the extra reference from the local LRU
> - * caches if required.
> - */
> - if ((vmf->flags & FAULT_FLAG_WRITE) &&
> - !folio_test_ksm(folio) && !folio_test_lru(folio))
> - lru_add_drain();
> -
> folio_throttle_swaprate(folio, GFP_KERNEL);
>
> /*
> --
> 2.39.3 (Apple Git-146)
>
Thanks, I see the previously discussed problem is addressed in the next
patch, so this is totally fine:
Reviewed-by: Kairui Song <kasong@tencent.com>
* Re: [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page
2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
2026-06-24 10:16 ` Kairui Song
@ 2026-06-24 15:10 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 11+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-24 15:10 UTC (permalink / raw)
To: Barry Song (Xiaomi), akpm, linux-mm
Cc: baoquan.he, chrisl, jp.kobryn, kasong, liam, linux-kernel, ljs,
mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park
On 6/24/26 01:16, Barry Song (Xiaomi) wrote:
> We are doing a lot of redundant lru_add_drain() calls in
> do_swap_page(), especially for synchronous I/O devices. For
> example, the test program below currently ends up draining
> lru_cache 100% of the time:
>
> int main(int argc, char *argv[])
> {
> int i;
> #define SIZE 100*1024*1024
> while(1) {
> volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
> for (int i = 0; i < SIZE/sizeof(int); i++)
> p[i] = i%64;
> madvise((void *)p, SIZE, MADV_PAGEOUT);
> for (int i = 0; i < SIZE/sizeof(int); i++)
> p[i] = i%64;
> munmap(p, SIZE);
> }
> return 0;
> }
>
> Folio reuse now relies primarily on the exclusive hint, making
> lru_cache draining to drop the refcount in lru_cache largely
> irrelevant.
Makes sense, we'll fall back to do_wp_page(), where we handle the
non-exclusive case either way.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios
2026-06-23 23:16 [PATCH v2 0/4] mm: drop redundant lru_add_drain in anon folio reuse paths Barry Song (Xiaomi)
` (2 preceding siblings ...)
2026-06-23 23:16 ` [PATCH v2 3/4] mm: entirely remove lru_add_drain in do_swap_page Barry Song (Xiaomi)
@ 2026-06-23 23:16 ` Barry Song (Xiaomi)
2026-06-24 15:20 ` David Hildenbrand (Arm)
3 siblings, 1 reply; 11+ messages in thread
From: Barry Song (Xiaomi) @ 2026-06-23 23:16 UTC (permalink / raw)
To: akpm, linux-mm
Cc: baoquan.he, chrisl, david, jp.kobryn, kasong, liam, linux-kernel,
ljs, mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park, Barry Song (Xiaomi),
Kairui Song
Originally, we unconditionally called lru_add_drain() for write
swap-in page faults. This might drop the reference held by the per-CPU
LRU cache if the folio happened to reside there. However, there was no
guarantee that the folio was actually cached on the current CPU.
Now that lru_add_drain() has been removed, we have lost one
opportunity to drop a reference held by the LRU cache. We could
instead incorporate that possibility into the condition evaluated by
should_try_to_free_swap().
Suggested-by: Kairui Song <ryncsn@gmail.com>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
mm/memory.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index 2983a6baf474..14577c67c61a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5087,8 +5087,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
* Remove the swap entry and conditionally try to free up the swapcache.
* Do it after mapping, so raced page faults will likely see the folio
* in swap cache and wait on the folio lock.
+ * Assume non-LRU folios may be queued in the LRU cache, which contributes
+ * an additional reference to the folio.
*/
- if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+ if (should_try_to_free_swap(si, folio, vma, nr_pages +
+ !folio_test_lru(folio), vmf->flags))
folio_free_swap(folio);
folio_unlock(folio);
--
2.39.3 (Apple Git-146)
* Re: [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios
2026-06-23 23:16 ` [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios Barry Song (Xiaomi)
@ 2026-06-24 15:20 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-24 15:20 UTC (permalink / raw)
To: Barry Song (Xiaomi), akpm, linux-mm
Cc: baoquan.he, chrisl, jp.kobryn, kasong, liam, linux-kernel, ljs,
mhocko, nphamcs, rppt, shakeel.butt, shikemeng, surenb,
usama.arif, vbabka, youngjun.park, Kairui Song
On 6/24/26 01:16, Barry Song (Xiaomi) wrote:
> Originally, we unconditionally called lru_add_drain() for write
> swap-in page faults. This might drop the reference held by the per-CPU
> LRU cache if the folio happened to reside there. However, there was no
> guarantee that the folio was actually cached on the current CPU.
>
> Now that lru_add_drain() has been removed, we have lost one
> opportunity to drop a reference held by the LRU cache. We could
> instead incorporate that possibility into the condition evaluated by
> should_try_to_free_swap().
>
> Suggested-by: Kairui Song <ryncsn@gmail.com>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/memory.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 2983a6baf474..14577c67c61a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5087,8 +5087,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> * Remove the swap entry and conditionally try to free up the swapcache.
> * Do it after mapping, so raced page faults will likely see the folio
> * in swap cache and wait on the folio lock.
> + * Assume non-LRU folios may be queued in the LRU cache, which contributes
> + * an additional reference to the folio.
> */
> - if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
> + if (should_try_to_free_swap(si, folio, vma, nr_pages +
> + !folio_test_lru(folio), vmf->flags))
> folio_free_swap(folio);
>
> folio_unlock(folio);
Hm, in wp_can_reuse_anon_folio() we'll try dropping the swapcache ourselves.
So I wonder if we still need that handling ("If we want to map a page that's in
the swapcache writable, we ...") at all?
Ahh, I see the problem now:
commit 4b34f1d82c6549837b2061096dea249e881a4495
Author: Kairui Song <kasong@tencent.com>
Date: Sat Dec 20 03:43:35 2025 +0800
mm, swap: free the swap cache after folio is mapped
Currently, we remove the folio from the swap cache and free the swap cache
before mapping the PTE. To reduce repeated faults due to parallel swapins
of the same PTE, change it to remove the folio from the swap cache after
it is mapped. So new faults from the swap PTE will be much more likely to
see the folio in the swap cache and wait on it.
This does not eliminate all swapin races: an ongoing swapin fault may
still see an empty swap cache. That's harmless, as the PTE is changed
before the swap cache is cleared, so it will just return and not trigger
any repeated faults. This does help to reduce the chance.
That changed the behavior such that we *must* now always fall back to do_wp_page().
What a mess (I didn't ack)
--
Cheers,
David