From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Kairui Song <kasong@tencent.com>,
Kalesh Singh <kaleshsingh@google.com>
Subject: Re: [PATCH mm-unstable v2 6/6] mm/mglru: rework workingset protection
Date: Fri, 6 Dec 2024 21:44:15 -0700
Message-ID: <Z1PSn79GPcCxeI_g@google.com>
In-Reply-To: <20241206003126.1338283-7-yuzhao@google.com>
On Thu, Dec 05, 2024 at 05:31:26PM -0700, Yu Zhao wrote:
> With the aging feedback no longer considering the distribution of
> folios in each generation, rework workingset protection to better
> distribute folios across MAX_NR_GENS. This is achieved by reusing
> PG_workingset and PG_referenced/LRU_REFS_FLAGS in a slightly different
> way.
>
> For folios accessed multiple times through file descriptors, make
> lru_gen_inc_refs() set additional bits of LRU_REFS_WIDTH in
> folio->flags after PG_referenced, then PG_workingset after
> LRU_REFS_WIDTH. After all its bits are set, i.e.,
> LRU_REFS_FLAGS|BIT(PG_workingset), a folio is lazily promoted into the
> second oldest generation in the eviction path. And when
> folio_inc_gen() does that, it clears LRU_REFS_FLAGS so that
> lru_gen_inc_refs() can start over. For this case, LRU_REFS_MASK is
> only valid when PG_referenced is set.
>
> For folios accessed multiple times through page tables,
> folio_update_gen() from a page table walk or lru_gen_set_refs() from a
> rmap walk sets PG_referenced after the accessed bit is cleared for the
> first time. Thereafter, those two paths set PG_workingset and promote
> folios to the youngest generation. Like folio_inc_gen(), when
> folio_update_gen() does that, it also clears PG_referenced. For this
> case, LRU_REFS_MASK is not used.
>
> For both of the cases, after PG_workingset is set on a folio, it
> remains until this folio is either reclaimed, or "deactivated" by
> lru_gen_clear_refs(). It can be set again if lru_gen_test_recent()
> returns true upon a refault.
>
> When adding folios to the LRU lists, lru_gen_distance() distributes
> them as follows:
> +---------------------------------+---------------------------------+
> |    Accessed thru page tables    | Accessed thru file descriptors  |
> +---------------------------------+---------------------------------+
> | PG_active (set while isolated)  |                                 |
> +----------------+----------------+----------------+----------------+
> |  PG_workingset |  PG_referenced |  PG_workingset | LRU_REFS_FLAGS |
> +---------------------------------+---------------------------------+
> |<--------- MIN_NR_GENS --------->|                                 |
> |<-------------------------- MAX_NR_GENS -------------------------->|
>
> After this patch, some typical client and server workloads showed
> improvements under heavy memory pressure. For example, Python TPC-C,
> which was used to benchmark a different approach [1] to better detect
> refault distances, showed a significant decrease in total refaults:
>                               Before        After         Change
>   Time (seconds)              10801         10801         0%
>   Executed (transactions)     41472         43663         +5%
>   workingset_nodes            109070        120244        +10%
>   workingset_refault_anon     5019627       7281831       +45%
>   workingset_refault_file     1294678786    554855564     -57%
>   workingset_refault_total    1299698413    562137395     -57%
>
> [1] https://lore.kernel.org/20230920190244.16839-1-ryncsn@gmail.com/
>
> Reported-by: Kairui Song <kasong@tencent.com>
> Closes: https://lore.kernel.org/CAOUHufahuWcKf5f1Sg3emnqX+cODuR=2TQo7T4Gr-QYLujn4RA@mail.gmail.com/
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> Tested-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>  include/linux/mm_inline.h |  94 +++++++++++++------------
>  include/linux/mmzone.h    |  82 +++++++++++++---------
>  mm/swap.c                 |  23 +++---
>  mm/vmscan.c               | 142 +++++++++++++++++++++++---------------
>  mm/workingset.c           |  29 ++++----
>  5 files changed, 209 insertions(+), 161 deletions(-)
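
For readers following the file-descriptor case in the quoted commit message, here is a rough userspace model of the counting scheme it describes. It is only a sketch and not the kernel code: the struct, the toy_* names, and the counter width of 2 are all assumptions made for illustration (the real LRU_REFS_WIDTH depends on the configuration).

```c
#include <stdbool.h>

/* Hypothetical counter width; the real LRU_REFS_WIDTH is config-dependent. */
#define TOY_REFS_WIDTH 2
#define TOY_REFS_MAX ((1u << TOY_REFS_WIDTH) - 1)

/* Hypothetical stand-in for the relevant bits of folio->flags. */
struct toy_fd_folio {
	bool referenced;   /* stands in for PG_referenced */
	unsigned int refs; /* stands in for the LRU_REFS_MASK bits */
	bool workingset;   /* stands in for PG_workingset */
};

/* Models one access through a file descriptor. */
static void toy_inc_refs(struct toy_fd_folio *f)
{
	if (!f->referenced) {
		/* The counter is only meaningful once referenced is set. */
		f->referenced = true;
		f->refs = 0;
	} else if (f->refs < TOY_REFS_MAX) {
		f->refs++;
	} else {
		f->workingset = true;
	}
}

/* True once every bit is set, i.e. the folio is due for lazy promotion. */
static bool toy_all_set(const struct toy_fd_folio *f)
{
	return f->referenced && f->refs == TOY_REFS_MAX && f->workingset;
}

/* Models promotion: clear the reference bits so counting can start over. */
static void toy_promote(struct toy_fd_folio *f)
{
	f->referenced = false;
	f->refs = 0;
	/* workingset stays set, as the commit message describes. */
}
```

Under these assumptions, a folio needs five accesses before toy_all_set() reports it ready for promotion: one to set referenced, three to saturate the counter, and one to set workingset.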

Some outlier results from LULESH (Livermore Unstructured Lagrangian
Explicit Shock Hydrodynamics) [1] caught my eye. The following fix
made the benchmark a lot happier (128GB DRAM + Optane swap):

                          Before    After    Change
  Average (z/s)           6894      7574     +10%
  Deviation (10 samples)  12.96%    1.76%    -86%

[1] https://asc.llnl.gov/codes/proxy-apps/lulesh

Andrew, can you please fold it in? Thanks!

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 90bbc2b3be8b..5e03a61c894f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -916,8 +916,7 @@ static enum folio_references folio_check_references(struct folio *folio,
 		if (!referenced_ptes)
 			return FOLIOREF_RECLAIM;
 
-		lru_gen_set_refs(folio);
-		return FOLIOREF_ACTIVATE;
+		return lru_gen_set_refs(folio) ? FOLIOREF_ACTIVATE : FOLIOREF_KEEP;
 	}
 
 	referenced_folio = folio_test_clear_referenced(folio);
@@ -4173,11 +4172,7 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 			old_gen = folio_update_gen(folio, new_gen);
 			if (old_gen >= 0 && old_gen != new_gen)
 				update_batch_size(walk, folio, old_gen, new_gen);
-
-			continue;
-		}
-
-		if (lru_gen_set_refs(folio)) {
+		} else if (lru_gen_set_refs(folio)) {
 			old_gen = folio_lru_gen(folio);
 			if (old_gen >= 0 && old_gen != new_gen)
 				folio_activate(folio);
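
To illustrate what the first hunk changes, here is a small userspace sketch. It is not the kernel implementation: the toy_* names and the two-flag struct are hypothetical, and the body of toy_set_refs() is an assumption based on the commit message (the first reference only sets the referenced flag; later ones set workingset and report that the folio should be promoted).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two folio flags involved. */
struct toy_folio {
	bool referenced; /* stands in for PG_referenced */
	bool workingset; /* stands in for PG_workingset */
};

/* Hypothetical counterparts of FOLIOREF_RECLAIM/KEEP/ACTIVATE. */
enum toy_ref { TOY_RECLAIM, TOY_KEEP, TOY_ACTIVATE };

/*
 * Assumed behavior: the first call only records the reference and
 * returns false; any later call marks the folio as workingset and
 * returns true.
 */
static bool toy_set_refs(struct toy_folio *f)
{
	if (!f->referenced && !f->workingset) {
		f->referenced = true;
		return false;
	}
	f->workingset = true;
	return true;
}

/* Mirrors the patched branch: activate only when toy_set_refs() says so. */
static enum toy_ref toy_check_references(struct toy_folio *f, int referenced_ptes)
{
	if (!referenced_ptes)
		return TOY_RECLAIM;
	return toy_set_refs(f) ? TOY_ACTIVATE : TOY_KEEP;
}
```

In this model, a folio seen referenced for the first time is kept rather than activated, and only a second reference activates it. Before the fix, the equivalent branch returned the activate result unconditionally.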