From: Takero Funaki <flintglass@gmail.com>
To: Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosryahmed@google.com>,
Nhat Pham <nphamcs@gmail.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>,
Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Takero Funaki <flintglass@gmail.com>,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while pagein/out IO
Date: Sat, 6 Jul 2024 02:25:22 +0000 [thread overview]
Message-ID: <20240706022523.1104080-7-flintglass@gmail.com> (raw)
In-Reply-To: <20240706022523.1104080-1-flintglass@gmail.com>
To prevent the zswap global shrinker from writing back pages
simultaneously with IO performed for memory reclaim and faults, delay
the writeback when zswap_store() rejects pages or zswap_load() cannot
find entry in pool.
When the zswap shrinker is running and zswap rejects an incoming page,
simulatenous zswap writeback and the rejected page lead to IO contention
on swap device. In this case, the writeback of the rejected page must be
higher priority as it is necessary for actual memory reclaim progress.
The zswap global shrinker can run in the background and should not
interfere with memory reclaim.
The same logic applies to zswap_load(). When zswap cannot find requested
page from pool and read IO is performed, shrinker should be interrupted.
To avoid IO contention, save the timestamp jiffies when zswap cannot
buffer the pagein/out IO and interrupt the global shrinker. The shrinker
resumes the writeback in 500 msec since the saved timestamp.
Signed-off-by: Takero Funaki <flintglass@gmail.com>
---
mm/zswap.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 45 insertions(+), 2 deletions(-)
diff --git a/mm/zswap.c b/mm/zswap.c
index def0f948a4ab..59ba4663c74f 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -35,6 +35,8 @@
#include <linux/pagemap.h>
#include <linux/workqueue.h>
#include <linux/list_lru.h>
+#include <linux/delay.h>
+#include <linux/jiffies.h>
#include "swap.h"
#include "internal.h"
@@ -176,6 +178,14 @@ static bool zswap_next_shrink_changed;
static struct work_struct zswap_shrink_work;
static struct shrinker *zswap_shrinker;
+/*
+ * To avoid IO contention between pagein/out and global shrinker writeback,
+ * track the last jiffies of pagein/out and delay the writeback.
+ * Default to 500msec in alignment with mq-deadline read timeout.
+ */
+#define ZSWAP_GLOBAL_SHRINKER_DELAY_MS 500
+static unsigned long zswap_shrinker_delay_start;
+
/*
* struct zswap_entry
*
@@ -244,6 +254,14 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name, \
zpool_get_type((p)->zpools[0]))
+static inline void zswap_shrinker_delay_update(void)
+{
+ unsigned long now = jiffies;
+
+ if (now != zswap_shrinker_delay_start)
+ zswap_shrinker_delay_start = now;
+}
+
/*********************************
* pool functions
**********************************/
@@ -1378,6 +1396,8 @@ static void shrink_worker(struct work_struct *w)
struct mem_cgroup *memcg;
int ret, failures = 0, progress;
unsigned long thr;
+ unsigned long now, sleepuntil;
+ const unsigned long delay = msecs_to_jiffies(ZSWAP_GLOBAL_SHRINKER_DELAY_MS);
/* Reclaim down to the accept threshold */
thr = zswap_accept_thr_pages();
@@ -1405,6 +1425,21 @@ static void shrink_worker(struct work_struct *w)
* until the next run of shrink_worker().
*/
do {
+ /*
+ * delay shrinking to allow the last rejected page completes
+ * its writeback
+ */
+ sleepuntil = delay + READ_ONCE(zswap_shrinker_delay_start);
+ now = jiffies;
+ /*
+ * If zswap did not reject pages for long, sleepuntil-now may
+ * underflow. We assume the timestamp is valid only if
+ * now < sleepuntil < now + delay + 1
+ */
+ if (time_before(now, sleepuntil) &&
+ time_before(sleepuntil, now + delay + 1))
+ fsleep(jiffies_to_usecs(sleepuntil - now));
+
spin_lock(&zswap_shrink_lock);
/*
@@ -1526,8 +1561,10 @@ bool zswap_store(struct folio *folio)
VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
/* Large folios aren't supported */
- if (folio_test_large(folio))
+ if (folio_test_large(folio)) {
+ zswap_shrinker_delay_update();
return false;
+ }
if (!zswap_enabled)
goto check_old;
@@ -1648,6 +1685,8 @@ bool zswap_store(struct folio *folio)
zswap_entry_cache_free(entry);
reject:
obj_cgroup_put(objcg);
+ zswap_shrinker_delay_update();
+
if (need_global_shrink)
queue_work(shrink_wq, &zswap_shrink_work);
check_old:
@@ -1691,8 +1730,10 @@ bool zswap_load(struct folio *folio)
else
entry = xa_load(tree, offset);
- if (!entry)
+ if (!entry) {
+ zswap_shrinker_delay_update();
return false;
+ }
if (entry->length)
zswap_decompress(entry, page);
@@ -1835,6 +1876,8 @@ static int zswap_setup(void)
if (ret)
goto hp_fail;
+ zswap_shrinker_delay_update();
+
shrink_wq = alloc_workqueue("zswap-shrink",
WQ_UNBOUND, 1);
if (!shrink_wq)
--
2.43.0
next prev parent reply other threads:[~2024-07-06 2:25 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-06 2:25 [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink Takero Funaki
2024-07-06 2:25 ` [PATCH v2 1/6] mm: zswap: fix global shrinker memcg iteration Takero Funaki
2024-07-08 4:54 ` Chengming Zhou
2024-07-17 1:54 ` Yosry Ahmed
2024-07-06 2:25 ` [PATCH v2 2/6] mm: zswap: fix global shrinker error handling logic Takero Funaki
2024-07-17 2:39 ` Yosry Ahmed
2024-07-06 2:25 ` [PATCH v2 3/6] mm: zswap: proactive shrinking before pool size limit is hit Takero Funaki
2024-07-12 23:18 ` Nhat Pham
2024-07-06 2:25 ` [PATCH v2 4/6] mm: zswap: make writeback run in the background Takero Funaki
2024-07-06 2:25 ` [PATCH v2 5/6] mm: zswap: store incompressible page as-is Takero Funaki
2024-07-06 23:53 ` Nhat Pham
2024-07-07 9:38 ` Takero Funaki
2024-07-12 22:36 ` Nhat Pham
2024-07-08 3:56 ` Chengming Zhou
2024-07-08 13:44 ` Takero Funaki
2024-07-09 13:26 ` Chengming Zhou
2024-07-12 22:47 ` Nhat Pham
2024-07-16 2:30 ` Chengming Zhou
2024-07-06 2:25 ` Takero Funaki [this message]
2024-07-08 19:17 ` [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while pagein/out IO Nhat Pham
2024-07-09 0:57 ` Nhat Pham
2024-07-10 21:21 ` Takero Funaki
2024-07-10 22:10 ` Nhat Pham
2024-07-15 7:33 ` Takero Funaki
2024-07-06 17:32 ` [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink Andrew Morton
2024-07-07 10:54 ` Takero Funaki
2024-07-09 0:53 ` Nhat Pham
2024-07-10 22:26 ` Takero Funaki
2024-07-12 23:02 ` Nhat Pham
2024-07-15 8:20 ` Takero Funaki
2024-07-26 18:13 ` Nhat Pham
2024-07-26 18:25 ` Nhat Pham
2024-07-17 2:53 ` Yosry Ahmed
2024-07-17 17:49 ` Nhat Pham
2024-07-17 18:05 ` Yosry Ahmed
2024-07-17 19:01 ` Nhat Pham
2024-07-19 14:55 ` Takero Funaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240706022523.1104080-7-flintglass@gmail.com \
--to=flintglass@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cerasuolodomenico@gmail.com \
--cc=chengming.zhou@linux.dev \
--cc=corbet@lwn.net \
--cc=hannes@cmpxchg.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox