From: Hao Jia <jiahao.kernel@gmail.com>
To: akpm@linux-foundation.org, tj@kernel.org, hannes@cmpxchg.org,
shakeel.butt@linux.dev, mhocko@kernel.org, yosry@kernel.org,
mkoutny@suse.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
muchun.song@linux.dev, roman.gushchin@linux.dev
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, Hao Jia <jiahao1@lixiang.com>
Subject: [PATCH v4 2/5] mm/zswap: Factor writeback loop out of shrink_worker()
Date: Thu, 18 Jun 2026 12:48:54 +0800 [thread overview]
Message-ID: <20260618044857.69439-3-jiahao.kernel@gmail.com> (raw)
In-Reply-To: <20260618044857.69439-1-jiahao.kernel@gmail.com>
From: Hao Jia <jiahao1@lixiang.com>
In preparation for sharing the writeback loop with proactive
writeback, move the memcg iteration into zswap_iter_global() and the
loop into zswap_try_to_writeback(lower, upper). shrink_worker() is
reduced to computing the accept threshold and invoking the helper.
Suggested-by: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Hao Jia <jiahao1@lixiang.com>
---
mm/zswap.c | 136 +++++++++++++++++++++++++++++++----------------------
1 file changed, 81 insertions(+), 55 deletions(-)
diff --git a/mm/zswap.c b/mm/zswap.c
index d7d031dee4cd..e29f8a61412d 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1380,61 +1380,75 @@ static long shrink_memcg(struct mem_cgroup *memcg,
return walk_arg.bytes_written;
}
-static void shrink_worker(struct work_struct *w)
+/*
+ * Global iteration uses a global cursor to select from all online
+ * memcgs in a round-robin fashion.
+ *
+ * We save iteration cursor memcg into zswap_next_shrink,
+ * which can be modified by the offline memcg cleaner
+ * zswap_memcg_offline_cleanup().
+ *
+ * Since the offline cleaner is called only once, we cannot leave an
+ * offline memcg reference in zswap_next_shrink.
+ * We can rely on the cleaner only if we get online memcg under lock.
+ *
+ * If we get an offline memcg, we cannot determine if the cleaner has
+ * already been called or will be called later. We must put back the
+ * reference before returning from this function. Otherwise, the
+ * offline memcg left in zswap_next_shrink will hold the reference
+ * until the next run of shrink_worker().
+ */
+static struct mem_cgroup *zswap_iter_global(void)
{
struct mem_cgroup *memcg;
- int failures = 0, attempts = 0;
- unsigned long thr;
- long ret;
-
- /* Reclaim down to the accept threshold */
- thr = zswap_accept_thr_pages();
/*
- * Global reclaim will select cgroup in a round-robin fashion from all
- * online memcgs, but memcgs that have no pages in zswap and
- * writeback-disabled memcgs (memory.zswap.writeback=0) are not
- * candidates for shrinking.
+ * Start from the next memcg after zswap_next_shrink.
+ * When the offline cleaner has already advanced the cursor,
+ * advancing the cursor here overlooks one memcg, but this
+ * should be negligibly rare.
*
- * Shrinking will be aborted if we encounter the following
- * MAX_RECLAIM_RETRIES times:
- * - No writeback-candidate memcgs found in a memcg tree walk.
- * - Shrinking a writeback-candidate memcg failed.
- *
- * We save iteration cursor memcg into zswap_next_shrink,
- * which can be modified by the offline memcg cleaner
- * zswap_memcg_offline_cleanup().
- *
- * Since the offline cleaner is called only once, we cannot leave an
- * offline memcg reference in zswap_next_shrink.
- * We can rely on the cleaner only if we get online memcg under lock.
- *
- * If we get an offline memcg, we cannot determine if the cleaner has
- * already been called or will be called later. We must put back the
- * reference before returning from this function. Otherwise, the
- * offline memcg left in zswap_next_shrink will hold the reference
- * until the next run of shrink_worker().
+ * If we get an online memcg, keep the extra reference in case
+ * the original one obtained by mem_cgroup_iter() is dropped by
+ * zswap_memcg_offline_cleanup() while we are shrinking the
+ * memcg.
*/
+ spin_lock(&zswap_shrink_lock);
do {
- /*
- * Start shrinking from the next memcg after zswap_next_shrink.
- * When the offline cleaner has already advanced the cursor,
- * advancing the cursor here overlooks one memcg, but this
- * should be negligibly rare.
- *
- * If we get an online memcg, keep the extra reference in case
- * the original one obtained by mem_cgroup_iter() is dropped by
- * zswap_memcg_offline_cleanup() while we are shrinking the
- * memcg.
- */
- spin_lock(&zswap_shrink_lock);
- do {
- memcg = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
- zswap_next_shrink = memcg;
- } while (memcg && !mem_cgroup_tryget_online(memcg));
- spin_unlock(&zswap_shrink_lock);
+ memcg = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
+ zswap_next_shrink = memcg;
+ } while (memcg && !mem_cgroup_tryget_online(memcg));
+ spin_unlock(&zswap_shrink_lock);
+
+ return memcg;
+}
+
+/*
+ * Walk the memcg tree and write back zswap pages until the
+ * (lower_pages, upper_pages) window closes, or abort encounter
+ * MAX_RECLAIM_RETRIES times of the following conditions:
+ * - No writeback-candidate memcgs found in a memcg tree walk.
+ * - Shrinking a writeback-candidate memcg failed.
+ *
+ * For shrink_worker(), it passes lower=thr and upper=zswap_total_pages().
+ * The @upper limit is refreshed in each iteration by re-evaluating
+ * zswap_total_pages(), and the window closes once the total falls
+ * below the threshold.
+ */
+static void zswap_try_to_writeback(unsigned long lower_pages,
+ unsigned long upper_pages)
+{
+ int failures = 0, attempts = 0;
+ struct mem_cgroup *iter_memcg;
+
+ while (lower_pages < upper_pages) {
+ unsigned long batch_size;
+ long shrunk;
- if (!memcg) {
+ cond_resched();
+
+ iter_memcg = zswap_iter_global();
+ if (!iter_memcg) {
/*
* Continue shrinking without incrementing failures if
* we found candidate memcgs in the last tree walk.
@@ -1443,12 +1457,16 @@ static void shrink_worker(struct work_struct *w)
break;
attempts = 0;
- goto resched;
+ continue;
}
- ret = shrink_memcg(memcg, NR_ZSWAP_WB_BATCH);
+ batch_size = min(upper_pages - lower_pages, NR_ZSWAP_WB_BATCH);
+ shrunk = shrink_memcg(iter_memcg, batch_size);
/* drop the extra reference */
- mem_cgroup_put(memcg);
+ mem_cgroup_put(iter_memcg);
+
+ /* zswap total pages might have changed, refresh it. */
+ upper_pages = zswap_total_pages();
/*
* There are no writeback-candidate pages in the memcg.
@@ -1456,15 +1474,23 @@ static void shrink_worker(struct work_struct *w)
* with pages in zswap. Skip this without incrementing attempts
* and failures.
*/
- if (ret == -ENOENT)
+ if (shrunk == -ENOENT)
continue;
++attempts;
- if (ret <= 0 && ++failures == MAX_RECLAIM_RETRIES)
+ if (shrunk <= 0 && ++failures == MAX_RECLAIM_RETRIES)
break;
-resched:
- cond_resched();
- } while (zswap_total_pages() > thr);
+ }
+}
+
+static void shrink_worker(struct work_struct *w)
+{
+ unsigned long thr;
+
+ /* Reclaim down to the accept threshold */
+ thr = zswap_accept_thr_pages();
+
+ zswap_try_to_writeback(thr, zswap_total_pages());
}
/*********************************
--
2.34.1
next prev parent reply other threads:[~2026-06-18 4:49 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-18 4:48 [PATCH v4 0/5] mm/zswap: Implement per-cgroup proactive writeback Hao Jia
2026-06-18 4:48 ` [PATCH v4 1/5] mm/zswap: Extend shrink_memcg() writeback capability Hao Jia
2026-06-18 4:48 ` Hao Jia [this message]
2026-06-18 4:48 ` [PATCH v4 3/5] mm/zswap: Implement proactive writeback Hao Jia
2026-06-18 4:48 ` [PATCH v4 4/5] mm/zswap: Add per-memcg stat for " Hao Jia
2026-06-18 4:48 ` [PATCH v4 5/5] selftests/cgroup: Add tests for zswap " Hao Jia
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260618044857.69439-3-jiahao.kernel@gmail.com \
--to=jiahao.kernel@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=jiahao1@lixiang.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mkoutny@suse.com \
--cc=muchun.song@linux.dev \
--cc=nphamcs@gmail.com \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=tj@kernel.org \
--cc=yosry@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox