public inbox for linux-mm@kvack.org
* [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
@ 2026-02-25 21:26 Barry Song
  0 siblings, 0 replies; 8+ messages in thread
From: Barry Song @ 2026-02-25 21:26 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, Barry Song, wangzicheng, Suren Baghdasaryan,
	Lei Liu, Matthew Wilcox, Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, Tangquan Zheng

From: Barry Song <baohua@kernel.org>

MGLRU activates folios when a new folio is added and
lru_gen_in_fault() returns true. The problem is that when a
page fault occurs at address N, readahead may bring in many
folios around N, and those folios are also activated even
though many of them may never be accessed.

A previous attempt by Lei Liu proposed introducing a separate
LRU for readahead[1], but that approach is likely over-designed.

This patch instead activates folios lazily, only when they are
actually mapped, so that unused folios do not occupy higher-
priority positions in the LRU and become harder to reclaim.

A similar optimization could also be applied to swapin readahead,
but this RFC limits the change to file-backed folios for now.

Based on Tangquan's observations, this can significantly reduce
file refaults on Android devices when using MGLRU.

BTW, it seems somewhat odd that all LRU APIs are defined in
swap.c and swap.h.

[1] https://lore.kernel.org/linux-mm/20250916072226.220426-1-liulei.rjpt@vivo.com/

Cc: wangzicheng <wangzicheng@honor.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Lei Liu <liulei.rjpt@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Signed-off-by: Barry Song <baohua@kernel.org>
---
 include/linux/swap.h |  1 +
 mm/filemap.c         |  2 ++
 mm/swap.c            | 17 ++++++++++++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 62fc7499b408..ce88ec560527 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -335,6 +335,7 @@ void folio_add_lru(struct folio *);
 void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
 void mark_page_accessed(struct page *);
 void folio_mark_accessed(struct folio *);
+void folio_activate_on_mapped(struct folio *folio);
 
 static inline bool folio_may_be_lru_cached(struct folio *folio)
 {
diff --git a/mm/filemap.c b/mm/filemap.c
index 6cd7974d4ada..0b8f383facdb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3567,6 +3567,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		}
 	}
 
+	folio_activate_on_mapped(folio);
 	if (!lock_folio_maybe_drop_mmap(vmf, folio, &fpin))
 		goto out_retry;
 
@@ -3926,6 +3927,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 					nr_pages, &rss, &mmap_miss, file_end);
 
 		folio_unlock(folio);
+		folio_activate_on_mapped(folio);
 	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
 	add_mm_counter(vma->vm_mm, folio_type, rss);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
diff --git a/mm/swap.c b/mm/swap.c
index bb19ccbece46..1a991586c5af 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -488,6 +488,20 @@ void folio_mark_accessed(struct folio *folio)
 }
 EXPORT_SYMBOL(folio_mark_accessed);
 
+void folio_activate_on_mapped(struct folio *folio)
+{
+	if (lru_gen_enabled() && lru_gen_in_fault() &&
+			!(current->flags & PF_MEMALLOC) &&
+			!folio_test_active(folio) &&
+			!folio_test_unevictable(folio)) {
+		if (folio_test_lru(folio))
+			folio_activate(folio);
+		else /* still in lru cache */
+			__lru_cache_activate_folio(folio);
+	}
+}
+EXPORT_SYMBOL(folio_activate_on_mapped);
+
 /**
  * folio_add_lru - Add a folio to an LRU list.
  * @folio: The folio to be added to the LRU.
@@ -506,7 +520,8 @@ void folio_add_lru(struct folio *folio)
 	/* see the comment in lru_gen_folio_seq() */
 	if (lru_gen_enabled() && !folio_test_unevictable(folio) &&
 	    lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
-		folio_set_active(folio);
+		if (!folio_is_file_lru(folio))
+			folio_set_active(folio);
 
 	folio_batch_add_and_move(folio, lru_add);
 }
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
@ 2026-02-25 22:37 Barry Song
  2026-02-26 12:57 ` wangzicheng
  0 siblings, 1 reply; 8+ messages in thread
From: Barry Song @ 2026-02-25 22:37 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, Barry Song, wangzicheng, Suren Baghdasaryan,
	Lei Liu, Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Kairui Song, Tangquan Zheng

From: Barry Song <baohua@kernel.org>

MGLRU activates folios when a new folio is added and
lru_gen_in_fault() returns true. The problem is that when a
page fault occurs at address N, readahead may bring in many
folios around N, and those folios are also activated even
though many of them may never be accessed.

A previous attempt by Lei Liu proposed introducing a separate
LRU for readahead[1], but that approach is likely over-designed.

This patch instead activates folios lazily, only when they are
actually mapped, so that unused folios do not occupy higher-
priority positions in the LRU and become harder to reclaim.

A similar optimization could also be applied to swapin readahead,
but this RFC limits the change to file-backed folios for now.

Based on Tangquan's observations, this can significantly reduce
file refaults on Android devices when using MGLRU.

BTW, it seems somewhat odd that all LRU APIs are defined in
swap.c and swap.h.

[1] https://lore.kernel.org/linux-mm/20250916072226.220426-1-liulei.rjpt@vivo.com/

Cc: wangzicheng <wangzicheng@honor.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Lei Liu <liulei.rjpt@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Signed-off-by: Barry Song <baohua@kernel.org>
---
 include/linux/swap.h |  1 +
 mm/filemap.c         |  2 ++
 mm/swap.c            | 16 +++++++++++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 62fc7499b408..ce88ec560527 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -335,6 +335,7 @@ void folio_add_lru(struct folio *);
 void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
 void mark_page_accessed(struct page *);
 void folio_mark_accessed(struct folio *);
+void folio_activate_on_mapped(struct folio *folio);
 
 static inline bool folio_may_be_lru_cached(struct folio *folio)
 {
diff --git a/mm/filemap.c b/mm/filemap.c
index 6cd7974d4ada..0b8f383facdb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3567,6 +3567,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		}
 	}
 
+	folio_activate_on_mapped(folio);
 	if (!lock_folio_maybe_drop_mmap(vmf, folio, &fpin))
 		goto out_retry;
 
@@ -3926,6 +3927,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 					nr_pages, &rss, &mmap_miss, file_end);
 
 		folio_unlock(folio);
+		folio_activate_on_mapped(folio);
 	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
 	add_mm_counter(vma->vm_mm, folio_type, rss);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
diff --git a/mm/swap.c b/mm/swap.c
index bb19ccbece46..e50b1e794ef1 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -488,6 +488,19 @@ void folio_mark_accessed(struct folio *folio)
 }
 EXPORT_SYMBOL(folio_mark_accessed);
 
+void folio_activate_on_mapped(struct folio *folio)
+{
+	if (lru_gen_enabled() && lru_gen_in_fault() &&
+			!(current->flags & PF_MEMALLOC) &&
+			!folio_test_active(folio) &&
+			!folio_test_unevictable(folio)) {
+		if (folio_test_lru(folio))
+			folio_activate(folio);
+		else /* still in lru cache */
+			__lru_cache_activate_folio(folio);
+	}
+}
+
 /**
  * folio_add_lru - Add a folio to an LRU list.
  * @folio: The folio to be added to the LRU.
@@ -506,7 +519,8 @@ void folio_add_lru(struct folio *folio)
 	/* see the comment in lru_gen_folio_seq() */
 	if (lru_gen_enabled() && !folio_test_unevictable(folio) &&
 	    lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
-		folio_set_active(folio);
+		if (!folio_is_file_lru(folio))
+			folio_set_active(folio);
 
 	folio_batch_add_and_move(folio, lru_add);
 }
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* RE: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-02-25 22:37 [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped Barry Song
@ 2026-02-26 12:57 ` wangzicheng
  2026-02-27  0:15   ` Barry Song
  0 siblings, 1 reply; 8+ messages in thread
From: wangzicheng @ 2026-02-26 12:57 UTC (permalink / raw)
  To: Barry Song, akpm@linux-foundation.org, linux-mm@kvack.org
  Cc: linux-kernel@vger.kernel.org, Barry Song, Suren Baghdasaryan,
	Lei Liu, Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Kairui Song, Tangquan Zheng, wangtao



> -----Original Message-----
> From: Barry Song <21cnbao@gmail.com>
> Sent: Thursday, February 26, 2026 6:37 AM
> To: akpm@linux-foundation.org; linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org; Barry Song <baohua@kernel.org>;
> wangzicheng <wangzicheng@honor.com>; Suren Baghdasaryan
> <surenb@google.com>; Lei Liu <liulei.rjpt@vivo.com>; Matthew Wilcox
> (Oracle) <willy@infradead.org>; Axel Rasmussen
> <axelrasmussen@google.com>; Yuanchu Xie <yuanchu@google.com>; Wei
> Xu <weixugc@google.com>; Kairui Song <kasong@tencent.com>; Tangquan
> Zheng <zhengtangquan@oppo.com>
> Subject: [PATCH RFC] mm/mglru: lazily activate folios while folios are really
> mapped
> 
> From: Barry Song <baohua@kernel.org>
> 
> MGLRU activates folios when a new folio is added and
> lru_gen_in_fault() returns true. The problem is that when a
> page fault occurs at address N, readahead may bring in many
> folios around N, and those folios are also activated even
> though many of them may never be accessed.
> 
> A previous attempt by Lei Liu proposed introducing a separate
> LRU for readahead[1], but that approach is likely over-designed.
> 
> This patch instead activates folios lazily, only when they are
> actually mapped, so that unused folios do not occupy higher-
> priority positions in the LRU and become harder to reclaim.
> 
> A similar optimization could also be applied to swapin readahead,
> but this RFC limits the change to file-backed folios for now.
> 
> Based on Tangquan's observations, this can significantly reduce
> file refaults on Android devices when using MGLRU.
> 
> BTW, it seems somewhat odd that all LRU APIs are defined in
> swap.c and swap.h.
> 
> [1] https://lore.kernel.org/linux-mm/20250916072226.220426-1-
> liulei.rjpt@vivo.com/
> 
> Cc: wangzicheng <wangzicheng@honor.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lei Liu <liulei.rjpt@vivo.com>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Cc: Axel Rasmussen <axelrasmussen@google.com>
> Cc: Yuanchu Xie <yuanchu@google.com>
> Cc: Wei Xu <weixugc@google.com>
> Cc: Kairui Song <kasong@tencent.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <baohua@kernel.org>
> ---
>  include/linux/swap.h |  1 +
>  mm/filemap.c         |  2 ++
>  mm/swap.c            | 16 +++++++++++++++-
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 62fc7499b408..ce88ec560527 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -335,6 +335,7 @@ void folio_add_lru(struct folio *);
>  void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
>  void mark_page_accessed(struct page *);
>  void folio_mark_accessed(struct folio *);
> +void folio_activate_on_mapped(struct folio *folio);
> 
>  static inline bool folio_may_be_lru_cached(struct folio *folio)
>  {
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 6cd7974d4ada..0b8f383facdb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3567,6 +3567,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>  		}
>  	}
> 
> +	folio_activate_on_mapped(folio);
>  	if (!lock_folio_maybe_drop_mmap(vmf, folio, &fpin))
>  		goto out_retry;
> 
> @@ -3926,6 +3927,7 @@ vm_fault_t filemap_map_pages(struct vm_fault
> *vmf,
>  					nr_pages, &rss, &mmap_miss,
> file_end);
> 
>  		folio_unlock(folio);
> +		folio_activate_on_mapped(folio);
>  	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) !=
> NULL);
>  	add_mm_counter(vma->vm_mm, folio_type, rss);
>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> diff --git a/mm/swap.c b/mm/swap.c
> index bb19ccbece46..e50b1e794ef1 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -488,6 +488,19 @@ void folio_mark_accessed(struct folio *folio)
>  }
>  EXPORT_SYMBOL(folio_mark_accessed);
> 
> +void folio_activate_on_mapped(struct folio *folio)
> +{
> +	if (lru_gen_enabled() && lru_gen_in_fault() &&
> +			!(current->flags & PF_MEMALLOC) &&
> +			!folio_test_active(folio) &&
> +			!folio_test_unevictable(folio)) {
> +		if (folio_test_lru(folio))
> +			folio_activate(folio);
> +		else /* still in lru cache */
> +			__lru_cache_activate_folio(folio);
> +	}
> +}
> +
>  /**
>   * folio_add_lru - Add a folio to an LRU list.
>   * @folio: The folio to be added to the LRU.
> @@ -506,7 +519,8 @@ void folio_add_lru(struct folio *folio)
>  	/* see the comment in lru_gen_folio_seq() */
>  	if (lru_gen_enabled() && !folio_test_unevictable(folio) &&
>  	    lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
> -		folio_set_active(folio);
> +		if (!folio_is_file_lru(folio))
> +			folio_set_active(folio);
> 
>  	folio_batch_add_and_move(folio, lru_add);
>  }
> --
> 2.39.3 (Apple Git-146)

Hi Barry,

Setting PG_active only for non-file LRU folios in folio_add_lru() looks
reasonable and should help avoid over-protecting readahead pages that
are never actually accessed.

For our workloads, which already suffer from file under-protection, we see
two sides here: on the positive side, keeping only actually-used readahead
pages in memory could improve performance; on the other hand, since we
already see file under-protection issues, it's not clear whether this change
might exacerbate them or even hurt performance.

We'll test this when available and report back. We hope to have a
chance to discuss this topic at LSF/MM/BPF.

Thanks,
Zicheng


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-02-26 12:57 ` wangzicheng
@ 2026-02-27  0:15   ` Barry Song
  2026-02-28 10:28     ` wangzicheng
  0 siblings, 1 reply; 8+ messages in thread
From: Barry Song @ 2026-02-27  0:15 UTC (permalink / raw)
  To: wangzicheng
  Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Suren Baghdasaryan, Lei Liu,
	Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, Tangquan Zheng, wangtao

Hi Zicheng,

On Thu, Feb 26, 2026 at 8:57 PM wangzicheng <wangzicheng@honor.com> wrote:
[...]
> >  /**
> >   * folio_add_lru - Add a folio to an LRU list.
> >   * @folio: The folio to be added to the LRU.
> > @@ -506,7 +519,8 @@ void folio_add_lru(struct folio *folio)
> >       /* see the comment in lru_gen_folio_seq() */
> >       if (lru_gen_enabled() && !folio_test_unevictable(folio) &&
> >           lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
> > -             folio_set_active(folio);
> > +             if (!folio_is_file_lru(folio))
> > +                     folio_set_active(folio);
> >
> >       folio_batch_add_and_move(folio, lru_add);
> >  }
> > --
> > 2.39.3 (Apple Git-146)
>
> Hi Barry,
>
> Setting only non-filelru-folio in folio_add_lru looks reasonable and
> should help with over-protecting readahead pages that are never
> actually accessed.
>
> For our workloads that already suffer from file under-protection, we see two
> sides here: on the positive side, keeping only actually-used readahead pages
> in memory could improve performance; on the other hand, since we already

Right, the fundamental principle of LRU is to place cold pages at
the tail, not at the head, making cold pages easier to reclaim and
hot pages harder to reclaim.

> see file under-protect issues, it's not clear whether this change might
> exacerbate that or even hurt performance.

I find your concern a bit surprising. If I understand correctly,
you’re observing that file folios are currently being over-reclaimed.
In that case, placing hot pages at the tail might make them harder
to reclaim after PTE scanning (since they may still be young), but
this seems to violate the fundamental principle of LRU. Moreover,
when scanning encounters young file folios, reclaim will simply
continue scanning more folios to find reclaimable ones, so scanning
hot folios only wastes CPU time.
Since read-ahead cold folios are placed at the head, relatively hotter
folios may be reclaimed instead, causing refaults and further triggering
reclaim, which can worsen the situation.

>
> We'll test this when available and report back. We hope to have a
> chance to discuss this topic at LSF/MM/BPF.
>

Sure, thanks!

Barry


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-02-27  0:15   ` Barry Song
@ 2026-02-28 10:28     ` wangzicheng
  2026-03-01  4:16       ` Barry Song
  0 siblings, 1 reply; 8+ messages in thread
From: wangzicheng @ 2026-02-28 10:28 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Suren Baghdasaryan, Lei Liu,
	Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, Tangquan Zheng, wangtao

Hi Barry,
> 
> I find your concern a bit surprising. If I understand correctly,
> you’re observing that file folios are currently being over-reclaimed.
> In that case, placing hot pages at the tail might make them harder
> to reclaim after PTE scanning (since they may still be young), but
> this seems to violate the fundamental principle of LRU. Moreover,
> when scanning encounters young file folios, reclaim will simply
> continue scanning more folios to find reclaimable ones, so scanning
> hot folios only wastes CPU time.
> Since read-ahead cold folios are placed at the head, relatively hotter
> folios may be reclaimed instead, causing refaults and further triggering
> reclaim, which can worsen the situation.
> 
Thank you for the detailed explanation.
> >
> > We'll test this when available and report back. We hope to have a
> > chance to discuss this topic at LSF/MM/BPF.
> >
> 
> Sure, thanks!
> 
> Barry

For evaluation I’m using a workload that repeatedly cold-starts and
drives the same user actions in 20+ apps on Android.
I’m comparing the baseline (v6.6) against the patched kernel and watching
`workingset_refault_file` in /proc/vmstat, expecting it to go down.

I ran 3 runs per kernel, but `workingset_refault_file` is quite noisy;
the coefficient of variation is around 40%, so the result doesn’t look
statistically solid.

Do you have any suggestions on how to measure the benefit more
robustly? For example:
- different or longer-running workloads,
- better normalization for refaults (per time, per faults, etc.),
- or other vmstat metrics that you found more stable in practice?

I’m also considering increasing the number of runs and using a t-test,
or comparing the CDF between baseline and patched kernels.
If you have a preferred methodology, I’d like to align with that.

Thanks,
Zicheng

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-02-28 10:28     ` wangzicheng
@ 2026-03-01  4:16       ` Barry Song
  2026-03-19 10:12         ` wangzicheng
  0 siblings, 1 reply; 8+ messages in thread
From: Barry Song @ 2026-03-01  4:16 UTC (permalink / raw)
  To: wangzicheng
  Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Suren Baghdasaryan, Lei Liu,
	Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, Tangquan Zheng, wangtao

On Sat, Feb 28, 2026 at 6:28 PM wangzicheng <wangzicheng@honor.com> wrote:
>
> Hi Barry,
> >
> > I find your concern a bit surprising. If I understand correctly,
> > you’re observing that file folios are currently being over-reclaimed.
> > In that case, placing hot pages at the tail might make them harder
> > to reclaim after PTE scanning (since they may still be young), but
> > this seems to violate the fundamental principle of LRU. Moreover,
> > when scanning encounters young file folios, reclaim will simply
> > continue scanning more folios to find reclaimable ones, so scanning
> > hot folios only wastes CPU time.
> > Since read-ahead cold folios are placed at the head, relatively hotter
> > folios may be reclaimed instead, causing refaults and further triggering
> > reclaim, which can worsen the situation.
> >
> Thank you for the detailed explanation.
> > >
> > > We'll test this when available and report back. We hope to have a
> > > chance to discuss this topic at LSF/MM/BPF.
> > >
> >
> > Sure, thanks!
> >
> > Barry
>
> For evaluation I’m using a workload that repeatedly cold-starts and
> drives same user actions in 20+ apps on Android.
> I’m comparing baseline(v6.6) vs. the patched kernel and watching
> `/proc/vmstat -> workingset_refault_file`, expecting it to go down.
>
> I ran 3 runs per kernel, but `workingset_refault_file` is quite noisy,
> the Coefficient of Variation is around 40%, so the result doesn’t look
> statistically solid.
>
> Do you have any suggestions on how to measure the benefit more
> robustly? For example:
> - different or longer-running workloads,
> - better normalization for refaults (per time, per faults, etc.),
> - or other vmstat metrics that you found more stable in practice?

I've cc'ed Tangquan, and he may be able to share how he was testing.
Basically, you may want to disable Wi-Fi, as it can introduce a lot of
variability between runs. Aside from refault metrics, you should also
see reduced I/O load and fewer swap-out/in events if you run the same
sequence of apps consistently.

>
> I’m also considering increasing the number of runs and using a t-test,
> or comparing the CDF between baseline and patched kernels.
> If you have a preferred methodology, I’d like to align with that.
>

Thanks
Barry


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-03-01  4:16       ` Barry Song
@ 2026-03-19 10:12         ` wangzicheng
  2026-03-20  9:59           ` Re: " 郑堂权(Blues Zheng)
  0 siblings, 1 reply; 8+ messages in thread
From: wangzicheng @ 2026-03-19 10:12 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Suren Baghdasaryan, Lei Liu,
	Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, Tangquan Zheng, wangtao, liulu 00013167

Hi Barry,

Thank you for the suggestion.

I have re-designed the workload and am getting relatively promising results.
The workload repeatedly launches and switches between 30 apps
for 500 rounds. Since the test takes quite a long time, the final results
appear relatively stable across runs.

The testing was done on an Android 16 device with kernel 6.6.89,
8GB RAM, MGLRU enabled.

However, the results are not very easy to interpret.

Average number of kept-alive apps: ±0.08 apps
Average available memory (sampled after each app launch):
baseline vs patched: 2216MB vs 2218MB (~2MB difference)

Below is the vmstat comparison (patched vs baseline):

Metric                       Change
---------------------------  --------
pgpgin                       +2.06%
pgpgout                      +3.10%
pswpin                       +14.13%
pswpout                      +4.55%
pgfault                      -3.19%
pgmajfault                   +12.75%
workingset_refault_anon      +14.77%
workingset_refault_file      +3.48%
workingset_activate_anon     -3.45%
workingset_activate_file     -17.76%
workingset_restore_anon      -3.44%
workingset_restore_file      -19.13%

In v6.6, when PG_active is set, pages go to the youngest generation,
while pages without PG_active go to the second oldest generation.
```
static inline bool lru_gen_add_folio(
...
	if (folio_test_active(folio))
		seq = lrugen->max_seq;
	...
	else
		seq = lrugen->min_seq[type] + 1;
```

My rough expectation was that the patch should make file pages more
prone to reclaim and make file page hot/cold aging more accurate, so
both file refault and anon refault might decrease. But here anon refault
increases instead.

I’m not sure if this assumption is correct. Could you share your thoughts
on how to interpret these results?

Thanks,
Zicheng

> -----Original Message-----
> From: owner-linux-mm@kvack.org <owner-linux-mm@kvack.org> On Behalf
> Of Barry Song
> Sent: Sunday, March 1, 2026 12:16 PM
> To: wangzicheng <wangzicheng@honor.com>
> Cc: akpm@linux-foundation.org; linux-mm@kvack.org; linux-
> kernel@vger.kernel.org; Suren Baghdasaryan <surenb@google.com>; Lei Liu
> <liulei.rjpt@vivo.com>; Matthew Wilcox (Oracle) <willy@infradead.org>;
> Axel Rasmussen <axelrasmussen@google.com>; Yuanchu Xie
> <yuanchu@google.com>; Wei Xu <weixugc@google.com>; Kairui Song
> <kasong@tencent.com>; Tangquan Zheng <zhengtangquan@oppo.com>;
> wangtao <tao.wangtao@honor.com>
> Subject: Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are
> really mapped
> 
> On Sat, Feb 28, 2026 at 6:28 PM wangzicheng <wangzicheng@honor.com>
> wrote:
> >
> > Hi Barry,
> > >
> > > I find your concern a bit surprising. If I understand correctly,
> > > you’re observing that file folios are currently being over-reclaimed.
> > > In that case, placing hot pages at the tail might make them harder
> > > to reclaim after PTE scanning (since they may still be young), but
> > > this seems to violate the fundamental principle of LRU. Moreover,
> > > when scanning encounters young file folios, reclaim will simply
> > > continue scanning more folios to find reclaimable ones, so scanning
> > > hot folios only wastes CPU time.
> > > Since read-ahead cold folios are placed at the head, relatively hotter
> > > folios may be reclaimed instead, causing refaults and further triggering
> > > reclaim, which can worsen the situation.
> > >
> > Thank you for the detailed explanation.
> > > >
> > > > We'll test this when available and report back. We hope to have a
> > > > chance to discuss this topic at LSF/MM/BPF.
> > > >
> > >
> > > Sure, thanks!
> > >
> > > Barry
> >
> > For evaluation I’m using a workload that repeatedly cold-starts and
> > drives same user actions in 20+ apps on Android.
> > I’m comparing baseline(v6.6) vs. the patched kernel and watching
> > `/proc/vmstat -> workingset_refault_file`, expecting it to go down.
> >
> > I ran 3 runs per kernel, but `workingset_refault_file` is quite noisy,
> > the Coefficient of Variation is around 40%, so the result doesn’t look
> > statistically solid.
> >
> > Do you have any suggestions on how to measure the benefit more
> > robustly? For example:
> > - different or longer-running workloads,
> > - better normalization for refaults (per time, per faults, etc.),
> > - or other vmstat metrics that you found more stable in practice?
> 
> I've cc'ed Tangquan, and he may be able to share how he was testing.
> Basically, you may want to disable Wi-Fi, as it can introduce a lot of
> variability between runs. Aside from refault metrics, you should also
> see reduced I/O load and fewer swap-out/in events if you run the same
> sequence of apps consistently.
> 
> >
> > I’m also considering increasing the number of runs and using a t-test,
> > or comparing the CDF between baseline and patched kernels.
> > If you have a preferred methodology, I’d like to align with that.
> >
> 
> Thanks
> Barry


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped
  2026-03-19 10:12         ` wangzicheng
@ 2026-03-20  9:59           ` 郑堂权(Blues Zheng)
  0 siblings, 0 replies; 8+ messages in thread
From: 郑堂权(Blues Zheng) @ 2026-03-20  9:59 UTC (permalink / raw)
  To: wangzicheng, Barry Song
  Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Suren Baghdasaryan, Lei Liu,
	Matthew Wilcox (Oracle), Axel Rasmussen, Yuanchu Xie, Wei Xu,
	Kairui Song, wangtao, liulu 00013167

Hi Zicheng,

We ran the same RFC on a 6.6 kernel with 8 GB RAM and zstd in our internal
whole-system perf model. /proc/vmstat (before → after; % = reduction):

pgpgin                      57807848    55738480    −3.58%
pgpgout                     31585160    26367420    −16.52%
pswpin                       2305528     1534481    −33.44%
pswpout                      6618935     5327316    −19.51%
workingset_refault_anon      2104047     1356316    −35.54%
workingset_refault_file      9020966     8407346    −6.80%
workingset_activate_anon     1196828      412937    −65.50%
workingset_activate_file     2941357     1468218    −50.08%
workingset_restore_anon       590337      412322    −30.15%
workingset_restore_file      1801398     1285060    −28.66%
workingset_nodereclaim        201014      152864    −23.95%

Here both file and anon refaults drop, unlike your Android run; the
difference is likely due to workload and environment.



-----Original Message-----
From: wangzicheng <wangzicheng@honor.com>
Sent: March 19, 2026 18:13
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; Suren Baghdasaryan <surenb@google.com>; Lei Liu <liulei.rjpt@vivo.com>; Matthew Wilcox (Oracle) <willy@infradead.org>; Axel Rasmussen <axelrasmussen@google.com>; Yuanchu Xie <yuanchu@google.com>; Wei Xu <weixugc@google.com>; Kairui Song <kasong@tencent.com>; 郑堂权(Blues Zheng) <zhengtangquan@oppo.com>; wangtao <tao.wangtao@honor.com>; liulu 00013167 <liulu.liu@honor.com>
Subject: RE: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped

Hi Barry,

Thank you for the suggestion.

I have re-designed the workload and get the relative promising results.
The workload repeatedly launches and switches between 30 apps for 500 rounds. Since the test takes quite a long time, the final results appear relatively stable across runs.

The testing was done on an Android 16 device with kernel 6.6.89, 8GB RAM, MGLRU enabled.

However, the results are not very easy to interpret.

Average number of kept-alive apps: ±0.08 apps
Average available memory (sampled after each app launch):
baseline vs patched: 2216MB vs 2218MB (~2MB difference)

Below is the vmstat comparison (patched vs baseline):

Metric                       Change
---------------------------  --------
pgpgin                       +2.06%
pgpgout                      +3.10%
pswpin                       +14.13%
pswpout                      +4.55%
pgfault                      -3.19%
pgmajfault                   +12.75%
workingset_refault_anon      +14.77%
workingset_refault_file      +3.48%
workingset_activate_anon     -3.45%
workingset_activate_file     -17.76%
workingset_restore_anon      -3.44%
workingset_restore_file      -19.13%

In v6.6, when PG_active is set, pages go to the youngest generation, while pages without PG_active go to the second oldest generation.
```
static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
{
	...
	if (folio_test_active(folio))
		seq = lrugen->max_seq;
	...
	else
		seq = lrugen->min_seq[type] + 1;
	...
}
```

My rough expectation was that the patch would make file pages more prone to reclaim and make file page hot/cold aging more accurate, so both file and anon refaults might decrease. But here anon refaults increase instead.

I’m not sure if this assumption is correct. Could you share your thoughts on how to interpret these results?

Thanks,
Zicheng

> -----Original Message-----
> From: owner-linux-mm@kvack.org <owner-linux-mm@kvack.org> On Behalf Of
> Barry Song
> Sent: Sunday, March 1, 2026 12:16 PM
> To: wangzicheng <wangzicheng@honor.com>
> Cc: akpm@linux-foundation.org; linux-mm@kvack.org; linux-
> kernel@vger.kernel.org; Suren Baghdasaryan <surenb@google.com>; Lei
> Liu <liulei.rjpt@vivo.com>; Matthew Wilcox (Oracle)
> <willy@infradead.org>; Axel Rasmussen <axelrasmussen@google.com>;
> Yuanchu Xie <yuanchu@google.com>; Wei Xu <weixugc@google.com>; Kairui
> Song <kasong@tencent.com>; Tangquan Zheng <zhengtangquan@oppo.com>;
> wangtao <tao.wangtao@honor.com>
> Subject: Re: [PATCH RFC] mm/mglru: lazily activate folios while folios
> are really mapped
>
> On Sat, Feb 28, 2026 at 6:28 PM wangzicheng <wangzicheng@honor.com>
> wrote:
> >
> > Hi Barry,
> > >
> > > I find your concern a bit surprising. If I understand correctly,
> > > you’re observing that file folios are currently being over-reclaimed.
> > > In that case, placing hot pages at the tail might make them harder
> > > to reclaim after PTE scanning (since they may still be young), but
> > > this seems to violate the fundamental principle of LRU. Moreover,
> > > when scanning encounters young file folios, reclaim will simply
> > > continue scanning more folios to find reclaimable ones, so
> > > scanning hot folios only wastes CPU time.
> > > Since read-ahead cold folios are placed at the head, relatively
> > > hotter folios may be reclaimed instead, causing refaults and
> > > further triggering reclaim, which can worsen the situation.
> > >
> > Thank you for the detailed explanation.
> > > >
> > > > We'll test this when available and report back. We hope to have
> > > > a chance to discuss this topic at LSF/MM/BPF.
> > > >
> > >
> > > Sure, thanks!
> > >
> > > Barry
> >
> > For evaluation I’m using a workload that repeatedly cold-starts and
> > drives same user actions in 20+ apps on Android.
> > I’m comparing baseline(v6.6) vs. the patched kernel and watching
> > `/proc/vmstat -> workingset_refault_file`, expecting it to go down.
> >
> > I ran 3 runs per kernel, but `workingset_refault_file` is quite
> > noisy, the Coefficient of Variation is around 40%, so the result
> > doesn’t look statistically solid.
> >
> > Do you have any suggestions on how to measure the benefit more
> > robustly? For example:
> > - different or longer-running workloads,
> > - better normalization for refaults (per time, per faults, etc.),
> > - or other vmstat metrics that you found more stable in practice?
>
> I've cc'ed Tangquan, and he may be able to share how he was testing.
> Basically, you may want to disable Wi-Fi, as it can introduce a lot of
> variability between runs. Aside from refault metrics, you should also
> see reduced I/O load and fewer swap-out/in events if you run the same
> sequence of apps consistently.
>
> >
> > I’m also considering increasing the number of runs and using a
> > t-test, or comparing the CDF between baseline and patched kernels.
> > If you have a preferred methodology, I’d like to align with that.
> >
>
> Thanks
> Barry



end of thread, other threads:[~2026-03-20  9:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-25 22:37 [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped Barry Song
2026-02-26 12:57 ` wangzicheng
2026-02-27  0:15   ` Barry Song
2026-02-28 10:28     ` wangzicheng
2026-03-01  4:16       ` Barry Song
2026-03-19 10:12         ` wangzicheng
2026-03-20  9:59           ` Reply: " Tangquan Zheng (Blues Zheng)
  -- strict thread matches above, loose matches on Subject: below --
2026-02-25 21:26 Barry Song
