From: Nhat Pham <nphamcs@gmail.com>
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, yosryahmed@google.com,
shakeel.butt@linux.dev, linux-mm@kvack.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, flintglass@gmail.com,
chengming.zhou@linux.dev
Subject: [PATCH v3 0/2] improving dynamic zswap shrinker protection scheme
Date: Mon, 5 Aug 2024 16:22:41 -0700
Message-ID: <20240805232243.2896283-1-nphamcs@gmail.com>
v3: No (intended) functional change
* Small cleanups, renamings, etc. (suggested by Yosry Ahmed)
v2:
* Add more details in comments, patch changelog, documentation, etc.
about the second chance scheme and its ability to modulate the
writeback rate (patch 1) (suggested by Yosry Ahmed).
* Move the referenced bit (patch 1) (suggested by Yosry Ahmed).
When experimenting with the memory-pressure based (i.e. "dynamic") zswap
shrinker in production, we observed a sharp increase in the number of
swapins, which led to performance regression. We were able to trace this
regression to the following problems with the shrinker's warm pages
protection scheme:
1. The protection decays far too rapidly, and the decay is coupled with
zswap stores, leading to anomalous patterns in which a small batch of
zswap stores effectively erases all the protection in place for the
warmer pages in the zswap LRU.
This observation has also been corroborated upstream by Takero Funaki
(in [1]).
2. We inaccurately track the number of swapped in pages, missing the
non-pivot pages that are part of the readahead window, while counting
the pages that are found in the zswap pool.
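To make item 2 concrete, here is a toy userspace model of one readahead
window. It is not the kernel code: struct toy_page and both counting helpers
are hypothetical names made up for illustration. It only contrasts a count
that looks at the pivot page alone (wherever it was found) with a count of
every page in the window that was actually read from disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page brought in by one swapin readahead window. */
struct toy_page {
	bool is_pivot;	/* the page that actually faulted */
	bool in_zswap;	/* true: found in the zswap pool; false: read from disk */
};

/* Miscount described above: only the pivot is counted, even on a zswap hit. */
static size_t count_pivot_only(const struct toy_page *win, size_t n)
{
	size_t count = 0;

	assert(win != NULL);
	for (size_t i = 0; i < n; i++)
		if (win[i].is_pivot)
			count++;
	return count;
}

/* Intended count: every page read from disk, pivot or not. */
static size_t count_disk_swapins(const struct toy_page *win, size_t n)
{
	size_t count = 0;

	assert(win != NULL);
	for (size_t i = 0; i < n; i++)
		if (!win[i].in_zswap)
			count++;
	return count;
}
```

For a window with the pivot and two neighbours on disk plus one neighbour in
zswap, the first helper reports 1 and the second reports 3. For a window whose
only page is a pivot found in zswap, the first reports 1 and the second 0.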
To alleviate these two issues, this patch series improves the dynamic zswap
shrinker in the following manner:
1. Replace the protection size tracking scheme with a second chance
algorithm. This new scheme removes the need for haphazard stats
decaying, and automatically adjusts the pace of page aging with memory
pressure, and the writeback rate with pool activity: slowing down when
the pool is dominated by zswpouts, and speeding up when the pool is
dominated by stale entries.
2. Fix the tracking of the number of swapins to take into account
non-pivot pages in the readahead window.
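For readers unfamiliar with second chance algorithms, the idea in item 1 can
be sketched in userspace C roughly as follows. This is a simplified model
under stated assumptions, not the code in patch 1: struct toy_entry,
toy_store() and toy_shrink_pass() are hypothetical names, a plain array
stands in for the zswap LRU, and locking, memcg handling and the real
writeback path are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy entry; the real zswap entry carries much more state. */
struct toy_entry {
	int id;
	bool referenced;	/* set on store, cleared on the shrinker's first visit */
	bool written_back;
};

/* Store path: a freshly stored entry starts with its second chance intact. */
static void toy_store(struct toy_entry *e, int id)
{
	assert(e != NULL);
	e->id = id;
	e->referenced = true;
	e->written_back = false;
}

/*
 * One shrinker pass over the toy LRU. An entry whose referenced bit is set
 * is spared once (the bit is cleared); an entry seen with the bit already
 * clear is written back. Returns the number of entries written back.
 */
static size_t toy_shrink_pass(struct toy_entry *lru, size_t n)
{
	size_t written = 0;

	assert(lru != NULL);
	for (size_t i = 0; i < n; i++) {
		if (lru[i].written_back)
			continue;
		if (lru[i].referenced) {
			lru[i].referenced = false;	/* second chance */
			continue;
		}
		lru[i].written_back = true;
		written++;
	}
	return written;
}
```

In this model, a first pass over four fresh entries writes back nothing; if
one entry is stored again before the second pass, that pass writes back the
other three and spares the re-stored one. That is the self-modulating
behaviour described above: frequent stores keep entries protected, while a
pool full of stale entries is drained quickly, with no separate decay
counter to tune.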
With these two changes in place, in a kernel-building benchmark without
any cold data added, the number of swapins is reduced by 64.12%. This
translates to a 10.32% reduction in build time. We also observe a 3%
reduction in kernel CPU time.
In another benchmark, with cold data added (to gauge the new algorithm's
ability to offload cold data), the new second chance scheme outperforms
the old protection scheme by around 0.7%, and actually wrote back around
21% more pages to the backing swap device. So the new scheme is at least
as good as, if not better than, the old scheme on this front as well.
[1]: https://lore.kernel.org/linux-mm/CAPpodddcGsK=0Xczfuk8usgZ47xeyf4ZjiofdT+ujiyz6V2pFQ@mail.gmail.com/
Nhat Pham (2):
  zswap: implement a second chance algorithm for dynamic zswap shrinker
  zswap: track swapins from disk more accurately

 include/linux/zswap.h |  16 +++----
 mm/page_io.c          |  11 ++++-
 mm/swap_state.c       |   8 +---
 mm/zswap.c            | 108 ++++++++++++++++++++++++------------------
 4 files changed, 82 insertions(+), 61 deletions(-)
base-commit: cca1345bd26a67fc61a92ff0c6d81766c259e522
--
2.43.0