From: Hao Jia <jiahao.kernel@gmail.com>
To: akpm@linux-foundation.org, tj@kernel.org, hannes@cmpxchg.org,
shakeel.butt@linux.dev, mhocko@kernel.org, yosry@kernel.org,
mkoutny@suse.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
muchun.song@linux.dev, roman.gushchin@linux.dev
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, Hao Jia <jiahao1@lixiang.com>
Subject: [PATCH v4 2/5] mm/zswap: Factor writeback loop out of shrink_worker()
Date: Thu, 18 Jun 2026 12:48:54 +0800
Message-ID: <20260618044857.69439-3-jiahao.kernel@gmail.com>
In-Reply-To: <20260618044857.69439-1-jiahao.kernel@gmail.com>
From: Hao Jia <jiahao1@lixiang.com>

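In preparation for sharing the writeback loop with proactive
writeback, move the memcg iteration into zswap_iter_global() and the
loop into zswap_try_to_writeback(lower, upper). shrink_worker() is
reduced to computing the accept threshold and invoking the helper.
The helper caps each shrink_memcg() batch at
min(upper - lower, NR_ZSWAP_WB_BATCH) and checks the window before the
first iteration rather than after it.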

Suggested-by: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Hao Jia <jiahao1@lixiang.com>
---
mm/zswap.c | 136 +++++++++++++++++++++++++++++++----------------------
1 file changed, 81 insertions(+), 55 deletions(-)
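
A rough userspace model of the loop this patch factors out may help when reviewing the control flow. It is only a sketch under stated assumptions: every toy_* name, MAX_RETRIES, WB_BATCH and NO_CANDIDATES are hypothetical stand-ins (for MAX_RECLAIM_RETRIES, NR_ZSWAP_WB_BATCH and -ENOENT), the toy shrinker counts pages rather than whatever shrink_memcg() reports, and the end-of-walk failure check is assumed from the unchanged context that the hunks below elide.

```c
#include <stddef.h>

#define MAX_RETRIES   16     /* hypothetical stand-in for MAX_RECLAIM_RETRIES */
#define WB_BATCH      32UL   /* hypothetical stand-in for NR_ZSWAP_WB_BATCH */
#define NO_CANDIDATES (-2L)  /* hypothetical stand-in for -ENOENT */

/* Toy memcg: pages held in the pool and whether writeback is allowed. */
struct toy_group {
	unsigned long pages;
	int writeback_enabled;
};

struct toy_pool {
	struct toy_group *groups;
	size_t nr;
	size_t cursor;	/* plays the role of zswap_next_shrink */
};

static unsigned long toy_total_pages(const struct toy_pool *p)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < p->nr; i++)
		sum += p->groups[i].pages;
	return sum;
}

/* Next group in round-robin order; NULL marks the end of one full walk. */
static struct toy_group *toy_iter(struct toy_pool *p)
{
	if (p->cursor == p->nr) {
		p->cursor = 0;
		return NULL;
	}
	return &p->groups[p->cursor++];
}

/* Write back up to @batch pages; NO_CANDIDATES if nothing is eligible. */
static long toy_shrink(struct toy_group *g, unsigned long batch)
{
	unsigned long n;

	if (!g->writeback_enabled || g->pages == 0)
		return NO_CANDIDATES;
	n = g->pages < batch ? g->pages : batch;
	g->pages -= n;
	return (long)n;
}

/* Shrink until upper <= lower or MAX_RETRIES failures; returns pages written. */
static unsigned long toy_try_to_writeback(struct toy_pool *p,
					  unsigned long lower,
					  unsigned long upper)
{
	int failures = 0, attempts = 0;
	unsigned long written = 0;

	while (lower < upper) {
		struct toy_group *g = toy_iter(p);
		unsigned long batch;
		long shrunk;

		if (!g) {
			/* A full walk without any candidate counts as a failure. */
			if (!attempts && ++failures == MAX_RETRIES)
				break;
			attempts = 0;
			continue;
		}

		batch = upper - lower < WB_BATCH ? upper - lower : WB_BATCH;
		shrunk = toy_shrink(g, batch);
		upper = toy_total_pages(p);	/* refresh the window */

		if (shrunk == NO_CANDIDATES)
			continue;
		++attempts;
		if (shrunk <= 0 && ++failures == MAX_RETRIES)
			break;
		written += (unsigned long)shrunk;
	}
	return written;
}
```

In this model a caller in the role of shrink_worker() would pass its threshold as lower and toy_total_pages() as upper. A pool whose groups all have writeback disabled still terminates, because each empty walk increments failures.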
diff --git a/mm/zswap.c b/mm/zswap.c
index d7d031dee4cd..e29f8a61412d 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1380,61 +1380,75 @@ static long shrink_memcg(struct mem_cgroup *memcg,
return walk_arg.bytes_written;
}
-static void shrink_worker(struct work_struct *w)
+/*
+ * Global iteration uses a global cursor to select from all online
+ * memcgs in a round-robin fashion.
+ *
+ * We save iteration cursor memcg into zswap_next_shrink,
+ * which can be modified by the offline memcg cleaner
+ * zswap_memcg_offline_cleanup().
+ *
+ * Since the offline cleaner is called only once, we cannot leave an
+ * offline memcg reference in zswap_next_shrink.
+ * We can rely on the cleaner only if we get online memcg under lock.
+ *
+ * If we get an offline memcg, we cannot determine if the cleaner has
+ * already been called or will be called later. We must put back the
+ * reference before returning from this function. Otherwise, the
+ * offline memcg left in zswap_next_shrink will hold the reference
+ * until the next run of shrink_worker().
+ */
+static struct mem_cgroup *zswap_iter_global(void)
{
struct mem_cgroup *memcg;
- int failures = 0, attempts = 0;
- unsigned long thr;
- long ret;
-
- /* Reclaim down to the accept threshold */
- thr = zswap_accept_thr_pages();
/*
- * Global reclaim will select cgroup in a round-robin fashion from all
- * online memcgs, but memcgs that have no pages in zswap and
- * writeback-disabled memcgs (memory.zswap.writeback=0) are not
- * candidates for shrinking.
+ * Start from the next memcg after zswap_next_shrink.
+ * When the offline cleaner has already advanced the cursor,
+ * advancing the cursor here overlooks one memcg, but this
+ * should be negligibly rare.
*
- * Shrinking will be aborted if we encounter the following
- * MAX_RECLAIM_RETRIES times:
- * - No writeback-candidate memcgs found in a memcg tree walk.
- * - Shrinking a writeback-candidate memcg failed.
- *
- * We save iteration cursor memcg into zswap_next_shrink,
- * which can be modified by the offline memcg cleaner
- * zswap_memcg_offline_cleanup().
- *
- * Since the offline cleaner is called only once, we cannot leave an
- * offline memcg reference in zswap_next_shrink.
- * We can rely on the cleaner only if we get online memcg under lock.
- *
- * If we get an offline memcg, we cannot determine if the cleaner has
- * already been called or will be called later. We must put back the
- * reference before returning from this function. Otherwise, the
- * offline memcg left in zswap_next_shrink will hold the reference
- * until the next run of shrink_worker().
+ * If we get an online memcg, keep the extra reference in case
+ * the original one obtained by mem_cgroup_iter() is dropped by
+ * zswap_memcg_offline_cleanup() while we are shrinking the
+ * memcg.
*/
+ spin_lock(&zswap_shrink_lock);
do {
- /*
- * Start shrinking from the next memcg after zswap_next_shrink.
- * When the offline cleaner has already advanced the cursor,
- * advancing the cursor here overlooks one memcg, but this
- * should be negligibly rare.
- *
- * If we get an online memcg, keep the extra reference in case
- * the original one obtained by mem_cgroup_iter() is dropped by
- * zswap_memcg_offline_cleanup() while we are shrinking the
- * memcg.
- */
- spin_lock(&zswap_shrink_lock);
- do {
- memcg = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
- zswap_next_shrink = memcg;
- } while (memcg && !mem_cgroup_tryget_online(memcg));
- spin_unlock(&zswap_shrink_lock);
+ memcg = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
+ zswap_next_shrink = memcg;
+ } while (memcg && !mem_cgroup_tryget_online(memcg));
+ spin_unlock(&zswap_shrink_lock);
+
+ return memcg;
+}
+
+/*
+ * Walk the memcg tree and write back zswap pages until the
+ * (lower_pages, upper_pages) window closes, or abort once the
+ * following have occurred MAX_RECLAIM_RETRIES times in total:
+ * - No writeback-candidate memcgs found in a memcg tree walk.
+ * - Shrinking a writeback-candidate memcg failed.
+ *
+ * shrink_worker() passes lower_pages=thr and
+ * upper_pages=zswap_total_pages(). @upper_pages is refreshed in each
+ * iteration by re-evaluating zswap_total_pages(), and the window
+ * closes once the total is no longer above the threshold.
+ */
+static void zswap_try_to_writeback(unsigned long lower_pages,
+ unsigned long upper_pages)
+{
+ int failures = 0, attempts = 0;
+ struct mem_cgroup *iter_memcg;
+
+ while (lower_pages < upper_pages) {
+ unsigned long batch_size;
+ long shrunk;
- if (!memcg) {
+ cond_resched();
+
+ iter_memcg = zswap_iter_global();
+ if (!iter_memcg) {
/*
* Continue shrinking without incrementing failures if
* we found candidate memcgs in the last tree walk.
@@ -1443,12 +1457,16 @@ static void shrink_worker(struct work_struct *w)
break;
attempts = 0;
- goto resched;
+ continue;
}
- ret = shrink_memcg(memcg, NR_ZSWAP_WB_BATCH);
+ batch_size = min(upper_pages - lower_pages, NR_ZSWAP_WB_BATCH);
+ shrunk = shrink_memcg(iter_memcg, batch_size);
/* drop the extra reference */
- mem_cgroup_put(memcg);
+ mem_cgroup_put(iter_memcg);
+
+ /* zswap total pages might have changed, refresh it. */
+ upper_pages = zswap_total_pages();
/*
* There are no writeback-candidate pages in the memcg.
@@ -1456,15 +1474,23 @@ static void shrink_worker(struct work_struct *w)
* with pages in zswap. Skip this without incrementing attempts
* and failures.
*/
- if (ret == -ENOENT)
+ if (shrunk == -ENOENT)
continue;
++attempts;
- if (ret <= 0 && ++failures == MAX_RECLAIM_RETRIES)
+ if (shrunk <= 0 && ++failures == MAX_RECLAIM_RETRIES)
break;
-resched:
- cond_resched();
- } while (zswap_total_pages() > thr);
+ }
+}
+
+static void shrink_worker(struct work_struct *w)
+{
+ unsigned long thr;
+
+ /* Reclaim down to the accept threshold */
+ thr = zswap_accept_thr_pages();
+
+ zswap_try_to_writeback(thr, zswap_total_pages());
}
/*********************************
--
2.34.1
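
The cursor handling that zswap_iter_global() now encapsulates can be illustrated with a small userspace sketch. Everything named toy_* here is hypothetical: toy_next() only imitates the "node after prev, NULL at the end of a walk" behaviour of mem_cgroup_iter(NULL, prev, NULL), toy_tryget_online() imitates mem_cgroup_tryget_online(), and the model ignores both zswap_shrink_lock and the references that the real iterator takes and drops.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy memcg with an online flag and a reference count. */
struct toy_memcg {
	int id;
	bool online;
	int refs;
};

struct toy_tree {
	struct toy_memcg *nodes;
	size_t nr;
	struct toy_memcg *next_shrink;	/* models zswap_next_shrink */
};

/* Node after @prev in walk order; NULL once the walk is complete. */
static struct toy_memcg *toy_next(struct toy_tree *t, struct toy_memcg *prev)
{
	size_t i = prev ? (size_t)(prev - t->nodes) + 1 : 0;

	return i < t->nr ? &t->nodes[i] : NULL;
}

/* Take an extra reference only if the memcg is still online. */
static bool toy_tryget_online(struct toy_memcg *m)
{
	if (!m->online)
		return false;
	m->refs++;
	return true;
}

/*
 * Advance the shared cursor past offline memcgs. The cursor never
 * rests on an offline memcg: it ends up on an online one (returned
 * with an extra reference) or on NULL at the end of the walk.
 */
static struct toy_memcg *toy_iter_global(struct toy_tree *t)
{
	struct toy_memcg *m;

	/* The kernel version holds zswap_shrink_lock around this loop. */
	do {
		m = toy_next(t, t->next_shrink);
		t->next_shrink = m;
	} while (m && !toy_tryget_online(m));

	return m;
}
```

The caller is expected to drop the extra reference once it has finished with the returned memcg, as zswap_try_to_writeback() does with mem_cgroup_put(). A NULL return tells the caller that one full walk has ended, and the next call starts again from the first memcg.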
Thread overview: 6+ messages
2026-06-18 4:48 [PATCH v4 0/5] mm/zswap: Implement per-cgroup proactive writeback Hao Jia
2026-06-18 4:48 ` [PATCH v4 1/5] mm/zswap: Extend shrink_memcg() writeback capability Hao Jia
2026-06-18 4:48 ` Hao Jia [this message]
2026-06-18 4:48 ` [PATCH v4 3/5] mm/zswap: Implement proactive writeback Hao Jia
2026-06-18 4:48 ` [PATCH v4 4/5] mm/zswap: Add per-memcg stat for " Hao Jia
2026-06-18 4:48 ` [PATCH v4 5/5] selftests/cgroup: Add tests for zswap " Hao Jia