* [PATCH v7 0/3] [PATCH v7 0/3] mm/swap: use swap_ops to register swap device's methods
@ 2026-05-15 1:57 Baoquan He
2026-05-15 1:57 ` [PATCH v7 1/3] " Baoquan He
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Baoquan He @ 2026-05-15 1:57 UTC (permalink / raw)
To: linux-mm
Cc: akpm, chrisl, usama.arif, baohua, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel, Baoquan He
Patch 1 uses swap_ops to register each swap device's methods. This
simplifies the code logic and benefits any new type of swap device
added later.
Patch 2 renames the page_io functions to a consistent
swap_<backend>_<op>_<folio> naming convention. Patch 3 fixes a
leaky abstraction where FS swap unplug bypassed swap_ops, adding
an .unplug callback and dispatcher.
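The registration pattern described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: every demo_* name, the flag values, and the integer return codes are hypothetical stand-ins chosen so that the dispatch is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flags standing in for SWP_FS_OPS / SWP_SYNCHRONOUS_IO. */
#define DEMO_FS_OPS   0x1u
#define DEMO_SYNC_IO  0x2u

struct demo_dev;

/* Hypothetical ops table: one set of methods per backend type. */
struct demo_ops {
	int (*read)(struct demo_dev *dev, int slot);
	int (*write)(struct demo_dev *dev, int slot);
};

struct demo_dev {
	unsigned int flags;
	const struct demo_ops *ops;	/* chosen once, at "swapon" time */
};

/* Each backend returns a distinct base value so the caller can tell which ran. */
static int demo_fs_read(struct demo_dev *dev, int slot)     { (void)dev; return 100 + slot; }
static int demo_fs_write(struct demo_dev *dev, int slot)    { (void)dev; return 150 + slot; }
static int demo_sync_read(struct demo_dev *dev, int slot)   { (void)dev; return 200 + slot; }
static int demo_sync_write(struct demo_dev *dev, int slot)  { (void)dev; return 250 + slot; }
static int demo_async_read(struct demo_dev *dev, int slot)  { (void)dev; return 300 + slot; }
static int demo_async_write(struct demo_dev *dev, int slot) { (void)dev; return 350 + slot; }

static const struct demo_ops demo_fs_ops    = { .read = demo_fs_read,    .write = demo_fs_write };
static const struct demo_ops demo_sync_ops  = { .read = demo_sync_read,  .write = demo_sync_write };
static const struct demo_ops demo_async_ops = { .read = demo_async_read, .write = demo_async_write };

/* Pick the table from the flags once and validate it up front. */
int demo_init_ops(struct demo_dev *dev)
{
	if (dev->flags & DEMO_FS_OPS)
		dev->ops = &demo_fs_ops;
	else if (dev->flags & DEMO_SYNC_IO)
		dev->ops = &demo_sync_ops;
	else
		dev->ops = &demo_async_ops;

	if (!dev->ops->read || !dev->ops->write)
		return -EINVAL;
	return 0;
}

/* The I/O paths no longer test flags; they only dispatch through the table. */
int demo_read(struct demo_dev *dev, int slot)
{
	assert(dev->ops != NULL);	/* demo_init_ops() must have succeeded */
	return dev->ops->read(dev, slot);
}

int demo_write(struct demo_dev *dev, int slot)
{
	assert(dev->ops != NULL);
	return dev->ops->write(dev, slot);
}
```

In this sketch the flag tests run once in demo_init_ops(), so a new backend type would only need a new table and one more branch there.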
Changelog:
===
v7:
- Drop the old patch 1 "[PATCH v6 1/3] mm/swap: rename mm/page_io.c
to mm/swap_io.c" as Christoph suggested.
- Add the .unplug callback to address Christoph's review about
the leaky abstraction.
- Minor cleanups and fixes per review feedback.
v6:
- Fix a bug in mm/swapfile.c that Kairui found while reviewing patch 2/3,
by moving the code to the appropriate place and adding a comment to
explain it.
v5:
- Change the return value of init_swap_ops() to -EINVAL as per Chris's
suggestion and adjust its invocation in swapon() accordingly.
- Add Chris's and Usama's Ack tags.
v4:
- Fix a typo: opeations -> operations.
- Fix a bug inside init_swap_ops(). I took a chance in v3: I thought
the change was trivial, so I only compiled it and didn't boot the
kernel to test. It is now fixed and tested.
Thanks to Usama for catching the above two issues.
v3:
- Rename setup_swap_ops() to init_swap_ops(), which reflects the
function's behaviour a little better.
- Check whether sis->ops, sis->ops->read_folio and
sis->ops->write_folio are NULL in init_swap_ops() rather than at each
call site, and fail swapon immediately if the check fails. This was
suggested by Chris.
- Call init_swap_ops() before the setup_swap_extents() invocation. This
doesn't harm anything and helps with adding a
sis->ops->swap_activate method later.
v2:
- Lots of cleanup for patch 2/3: renaming, moving data
structures, and using const properly
- Collected tags from Kairui, Nhat and Barry
v1:
- https://lore.kernel.org/linux-mm/20260302104016.163542-1-bhe@redhat.com/
Baoquan He (3):
mm/swap: use swap_ops to register swap device's methods
mm/page_io.c: rename page_io functions to consistent naming
mm/swap: add unplug callback to swap_ops to fix leaky abstraction
include/linux/swap.h | 3 ++
mm/page_io.c | 115 ++++++++++++++++++++++++++-----------------
mm/swap.h | 11 ++++-
mm/swapfile.c | 9 ++++
mm/zswap.c | 2 +-
5 files changed, 94 insertions(+), 46 deletions(-)
--
2.52.0
* [PATCH v7 1/3] mm/swap: use swap_ops to register swap device's methods
2026-05-15 1:57 [PATCH v7 0/3] [PATCH v7 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
@ 2026-05-15 1:57 ` Baoquan He
2026-05-15 1:57 ` [PATCH v7 2/3] mm/page_io.c: rename page_io functions to consistent naming Baoquan He
2026-05-15 1:57 ` [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction Baoquan He
2 siblings, 0 replies; 6+ messages in thread
From: Baoquan He @ 2026-05-15 1:57 UTC (permalink / raw)
To: linux-mm
Cc: akpm, chrisl, usama.arif, baohua, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel, Baoquan He
This simplifies the code and makes the logic clearer. It also makes any
swap device type added later easier to handle.
Currently there are three types of swap devices: bdev_fs, bdev_sync
and bdev_async, and only the read_folio and write_folio operations are
included. In the future, more swap device types could be added and
more operations adapted into swap_ops.
Suggested-by: Chris Li <chrisl@kernel.org>
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
include/linux/swap.h | 3 ++
mm/page_io.c | 104 +++++++++++++++++++++++++------------------
mm/swap.h | 10 ++++-
mm/swapfile.c | 9 ++++
mm/zswap.c | 2 +-
5 files changed, 83 insertions(+), 45 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7a09df6977a5..0d045fc8ec35 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -20,6 +20,8 @@ struct notifier_block;
struct bio;
+struct swap_ops;
+
#define SWAP_FLAG_PREFER 0x8000 /* set if swap priority specified */
#define SWAP_FLAG_PRIO_MASK 0x7fff
#define SWAP_FLAG_DISCARD 0x10000 /* enable discard for swap */
@@ -282,6 +284,7 @@ struct swap_info_struct {
struct work_struct reclaim_work; /* reclaim worker */
struct list_head discard_clusters; /* discard clusters list */
struct plist_node avail_list; /* entry in swap_avail_head */
+ const struct swap_ops *ops;
};
static inline swp_entry_t page_swap_entry(struct page *page)
diff --git a/mm/page_io.c b/mm/page_io.c
index 70cea9e24d2f..2b520af88376 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -240,6 +240,9 @@ static void swap_zeromap_folio_clear(struct folio *folio)
int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
{
int ret = 0;
+ struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
+
+ VM_WARN_ON_FOLIO(!folio_test_swapcache(folio), folio);
if (folio_free_swap(folio))
goto out_unlock;
@@ -285,7 +288,7 @@ int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
}
rcu_read_unlock();
- __swap_writepage(folio, swap_plug);
+ sis->ops->write_folio(sis, folio, swap_plug);
return 0;
out_unlock:
folio_unlock(folio);
@@ -375,10 +378,11 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
mempool_free(sio, sio_pool);
}
-static void swap_writepage_fs(struct folio *folio, struct swap_iocb **swap_plug)
+static void swap_writepage_fs(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **swap_plug)
{
struct swap_iocb *sio = swap_plug ? *swap_plug : NULL;
- struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
struct file *swap_file = sis->swap_file;
loff_t pos = swap_dev_pos(folio->swap);
@@ -411,8 +415,9 @@ static void swap_writepage_fs(struct folio *folio, struct swap_iocb **swap_plug)
*swap_plug = sio;
}
-static void swap_writepage_bdev_sync(struct folio *folio,
- struct swap_info_struct *sis)
+static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug)
{
struct bio_vec bv;
struct bio bio;
@@ -431,8 +436,9 @@ static void swap_writepage_bdev_sync(struct folio *folio,
__end_swap_bio_write(&bio);
}
-static void swap_writepage_bdev_async(struct folio *folio,
- struct swap_info_struct *sis)
+static void swap_writepage_bdev_async(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug)
{
struct bio *bio;
@@ -448,29 +454,6 @@ static void swap_writepage_bdev_async(struct folio *folio,
submit_bio(bio);
}
-void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug)
-{
- struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
-
- VM_BUG_ON_FOLIO(!folio_test_swapcache(folio), folio);
- /*
- * ->flags can be updated non-atomically,
- * but that will never affect SWP_FS_OPS, so the data_race
- * is safe.
- */
- if (data_race(sis->flags & SWP_FS_OPS))
- swap_writepage_fs(folio, swap_plug);
- /*
- * ->flags can be updated non-atomically,
- * but that will never affect SWP_SYNCHRONOUS_IO, so the data_race
- * is safe.
- */
- else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
- swap_writepage_bdev_sync(folio, sis);
- else
- swap_writepage_bdev_async(folio, sis);
-}
-
void swap_write_unplug(struct swap_iocb *sio)
{
struct iov_iter from;
@@ -539,9 +522,10 @@ static bool swap_read_folio_zeromap(struct folio *folio)
return true;
}
-static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug)
+static void swap_read_folio_fs(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug)
{
- struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
struct swap_iocb *sio = NULL;
loff_t pos = swap_dev_pos(folio->swap);
@@ -573,8 +557,9 @@ static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug)
*plug = sio;
}
-static void swap_read_folio_bdev_sync(struct folio *folio,
- struct swap_info_struct *sis)
+static void swap_read_folio_bdev_sync(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug)
{
struct bio_vec bv;
struct bio bio;
@@ -595,8 +580,9 @@ static void swap_read_folio_bdev_sync(struct folio *folio,
put_task_struct(current);
}
-static void swap_read_folio_bdev_async(struct folio *folio,
- struct swap_info_struct *sis)
+static void swap_read_folio_bdev_async(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug)
{
struct bio *bio;
@@ -610,6 +596,44 @@ static void swap_read_folio_bdev_async(struct folio *folio,
submit_bio(bio);
}
+static const struct swap_ops bdev_fs_swap_ops = {
+ .read_folio = swap_read_folio_fs,
+ .write_folio = swap_writepage_fs,
+};
+
+static const struct swap_ops bdev_sync_swap_ops = {
+ .read_folio = swap_read_folio_bdev_sync,
+ .write_folio = swap_writepage_bdev_sync,
+};
+
+static const struct swap_ops bdev_async_swap_ops = {
+ .read_folio = swap_read_folio_bdev_async,
+ .write_folio = swap_writepage_bdev_async,
+};
+
+int init_swap_ops(struct swap_info_struct *sis)
+{
+ /*
+ * ->flags can be updated non-atomically, but that will
+ * never affect SWP_FS_OPS, so the data_race is safe.
+ */
+ if (data_race(sis->flags & SWP_FS_OPS))
+ sis->ops = &bdev_fs_swap_ops;
+ /*
+ * ->flags can be updated non-atomically, but that will
+ * never affect SWP_SYNCHRONOUS_IO, so the data_race is safe.
+ */
+ else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
+ sis->ops = &bdev_sync_swap_ops;
+ else
+ sis->ops = &bdev_async_swap_ops;
+
+ if (!sis->ops || !sis->ops->read_folio || !sis->ops->write_folio)
+ return -EINVAL;
+
+ return 0;
+}
+
void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
{
struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
@@ -644,13 +668,7 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
/* We have to read from slower devices. Increase zswap protection. */
zswap_folio_swapin(folio);
- if (data_race(sis->flags & SWP_FS_OPS)) {
- swap_read_folio_fs(folio, plug);
- } else if (synchronous) {
- swap_read_folio_bdev_sync(folio, sis);
- } else {
- swap_read_folio_bdev_async(folio, sis);
- }
+ sis->ops->read_folio(sis, folio, plug);
finish:
if (workingset) {
diff --git a/mm/swap.h b/mm/swap.h
index a77016f2423b..8d4375f91632 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -217,6 +217,15 @@ extern void __swap_cluster_free_entries(struct swap_info_struct *si,
/* linux/mm/page_io.c */
int sio_pool_init(void);
struct swap_iocb;
+struct swap_ops {
+ void (*read_folio)(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug);
+ void (*write_folio)(struct swap_info_struct *sis,
+ struct folio *folio,
+ struct swap_iocb **plug);
+};
+int init_swap_ops(struct swap_info_struct *sis);
void swap_read_folio(struct folio *folio, struct swap_iocb **plug);
void __swap_read_unplug(struct swap_iocb *plug);
static inline void swap_read_unplug(struct swap_iocb *plug)
@@ -226,7 +235,6 @@ static inline void swap_read_unplug(struct swap_iocb *plug)
}
void swap_write_unplug(struct swap_iocb *sio);
int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug);
-void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug);
/* linux/mm/swap_state.c */
extern struct address_space swap_space __read_mostly;
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9174f1eeffb0..29ae79d0fa2e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3597,6 +3597,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
}
}
+ /*
+ * init_swap_ops() sets si->ops based on flags. It does not need
+ * swapon_mutex, and must complete before enable_swap_info()
+ * exposes the device.
+ */
+ error = init_swap_ops(si);
+ if (error)
+ goto bad_swap_unlock_inode;
+
error = zswap_swapon(si->type, maxpages);
if (error)
goto bad_swap_unlock_inode;
diff --git a/mm/zswap.c b/mm/zswap.c
index 4b5149173b0e..192401f46de4 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1054,7 +1054,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
folio_set_reclaim(folio);
/* start writeback */
- __swap_writepage(folio, NULL);
+ si->ops->write_folio(si, folio, NULL);
out:
if (ret && ret != -EEXIST) {
--
2.52.0
* [PATCH v7 2/3] mm/page_io.c: rename page_io functions to consistent naming
2026-05-15 1:57 [PATCH v7 0/3] [PATCH v7 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
2026-05-15 1:57 ` [PATCH v7 1/3] " Baoquan He
@ 2026-05-15 1:57 ` Baoquan He
2026-05-15 1:57 ` [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction Baoquan He
2 siblings, 0 replies; 6+ messages in thread
From: Baoquan He @ 2026-05-15 1:57 UTC (permalink / raw)
To: linux-mm
Cc: akpm, chrisl, usama.arif, baohua, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel, Baoquan He
Rename the swap I/O functions to use a consistent
swap_<backend>_<op>_<folio> naming scheme, so the backend type
immediately follows the swap_ prefix. The new names align with
swap_ops callbacks: .write_folio and .read_folio.
swap_writepage_fs -> swap_fs_write_folio
swap_writepage_bdev_sync -> swap_bdev_sync_write_folio
swap_writepage_bdev_async -> swap_bdev_async_write_folio
swap_read_folio_fs -> swap_fs_read_folio
swap_read_folio_bdev_sync -> swap_bdev_sync_read_folio
swap_read_folio_bdev_async -> swap_bdev_async_read_folio
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
mm/page_io.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/mm/page_io.c b/mm/page_io.c
index 2b520af88376..38b94c560c37 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -378,7 +378,7 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
mempool_free(sio, sio_pool);
}
-static void swap_writepage_fs(struct swap_info_struct *sis,
+static void swap_fs_write_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **swap_plug)
{
@@ -415,7 +415,7 @@ static void swap_writepage_fs(struct swap_info_struct *sis,
*swap_plug = sio;
}
-static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
+static void swap_bdev_sync_write_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug)
{
@@ -436,7 +436,7 @@ static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
__end_swap_bio_write(&bio);
}
-static void swap_writepage_bdev_async(struct swap_info_struct *sis,
+static void swap_bdev_async_write_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug)
{
@@ -522,7 +522,7 @@ static bool swap_read_folio_zeromap(struct folio *folio)
return true;
}
-static void swap_read_folio_fs(struct swap_info_struct *sis,
+static void swap_fs_read_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug)
{
@@ -557,7 +557,7 @@ static void swap_read_folio_fs(struct swap_info_struct *sis,
*plug = sio;
}
-static void swap_read_folio_bdev_sync(struct swap_info_struct *sis,
+static void swap_bdev_sync_read_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug)
{
@@ -580,7 +580,7 @@ static void swap_read_folio_bdev_sync(struct swap_info_struct *sis,
put_task_struct(current);
}
-static void swap_read_folio_bdev_async(struct swap_info_struct *sis,
+static void swap_bdev_async_read_folio(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug)
{
@@ -597,18 +597,18 @@ static void swap_read_folio_bdev_async(struct swap_info_struct *sis,
}
static const struct swap_ops bdev_fs_swap_ops = {
- .read_folio = swap_read_folio_fs,
- .write_folio = swap_writepage_fs,
+ .read_folio = swap_fs_read_folio,
+ .write_folio = swap_fs_write_folio,
};
static const struct swap_ops bdev_sync_swap_ops = {
- .read_folio = swap_read_folio_bdev_sync,
- .write_folio = swap_writepage_bdev_sync,
+ .read_folio = swap_bdev_sync_read_folio,
+ .write_folio = swap_bdev_sync_write_folio,
};
static const struct swap_ops bdev_async_swap_ops = {
- .read_folio = swap_read_folio_bdev_async,
- .write_folio = swap_writepage_bdev_async,
+ .read_folio = swap_bdev_async_read_folio,
+ .write_folio = swap_bdev_async_write_folio,
};
int init_swap_ops(struct swap_info_struct *sis)
--
2.52.0
* [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction
2026-05-15 1:57 [PATCH v7 0/3] [PATCH v7 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
2026-05-15 1:57 ` [PATCH v7 1/3] " Baoquan He
2026-05-15 1:57 ` [PATCH v7 2/3] mm/page_io.c: rename page_io functions to consistent naming Baoquan He
@ 2026-05-15 1:57 ` Baoquan He
2026-05-15 4:39 ` Barry Song
2 siblings, 1 reply; 6+ messages in thread
From: Baoquan He @ 2026-05-15 1:57 UTC (permalink / raw)
To: linux-mm
Cc: akpm, chrisl, usama.arif, baohua, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel, Baoquan He
When swap_ops was introduced, the FS-swap batch submission remained
as a standalone swap_write_unplug() that directly called
mapping->a_ops->swap_rw(). This meant callers still had implicit
knowledge of filesystem internals rather than going through the
swap_ops abstraction.
Fix this by adding an unplug callback to struct swap_ops.
Each ops table provides its own implementation:
- bdev_fs_swap_ops uses the existing FS batch-submission logic
- bdev_sync/bdev_async_swap_ops leave it NULL since block-layer
plugging handles their I/O
The swap_iocb now carries a pointer to its ops table so that
swap_write_unplug() can dispatch through the callback without
the caller needing to know the swap device type.
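The optional-callback dispatch described above can be sketched in userspace C. This is a hedged illustration only: the demo_* types and the pending/submitted counters are hypothetical, and in the actual patch a swap_iocb is only ever created by the FS write path, so the block-device batch here exists purely to show the NULL case.

```c
#include <assert.h>
#include <stddef.h>

struct demo_batch;

/* Hypothetical ops table with an optional unplug method. */
struct demo_ops {
	void (*unplug)(struct demo_batch *batch);	/* may be NULL */
};

/* Hypothetical batch that remembers which ops table created it. */
struct demo_batch {
	const struct demo_ops *ops;
	int pending;	/* queued but not yet submitted */
	int submitted;	/* handed to the backend */
};

/* FS-style backend: flush everything that was queued in the batch. */
static void demo_fs_unplug(struct demo_batch *batch)
{
	batch->submitted += batch->pending;
	batch->pending = 0;
}

static const struct demo_ops demo_fs_ops   = { .unplug = demo_fs_unplug };
/* Block-device-style backend: no batching here, so no unplug method. */
static const struct demo_ops demo_bdev_ops = { .unplug = NULL };

/* Callers use this without knowing which backend owns the batch. */
void demo_write_unplug(struct demo_batch *batch)
{
	assert(batch != NULL);
	if (batch->ops && batch->ops->unplug)
		batch->ops->unplug(batch);
}
```

Storing the ops pointer in the batch itself is what lets demo_write_unplug() stay backend-agnostic; a backend without an unplug method is simply a no-op in this sketch.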
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
mm/page_io.c | 11 ++++++++++-
mm/swap.h | 1 +
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/mm/page_io.c b/mm/page_io.c
index 38b94c560c37..2c36d261ad98 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -329,6 +329,7 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
struct swap_iocb {
struct kiocb iocb;
+ const struct swap_ops *ops;
struct bio_vec bvec[SWAP_CLUSTER_MAX];
int pages;
int len;
@@ -401,6 +402,7 @@ static void swap_fs_write_folio(struct swap_info_struct *sis,
init_sync_kiocb(&sio->iocb, swap_file);
sio->iocb.ki_complete = sio_write_complete;
sio->iocb.ki_pos = pos;
+ sio->ops = sis->ops;
sio->pages = 0;
sio->len = 0;
}
@@ -454,7 +456,7 @@ static void swap_bdev_async_write_folio(struct swap_info_struct *sis,
submit_bio(bio);
}
-void swap_write_unplug(struct swap_iocb *sio)
+static void swap_fs_write_folio_unplug(struct swap_iocb *sio)
{
struct iov_iter from;
struct address_space *mapping = sio->iocb.ki_filp->f_mapping;
@@ -466,6 +468,12 @@ void swap_write_unplug(struct swap_iocb *sio)
sio_write_complete(&sio->iocb, ret);
}
+void swap_write_unplug(struct swap_iocb *sio)
+{
+ if (sio->ops && sio->ops->unplug)
+ sio->ops->unplug(sio);
+}
+
static void sio_read_complete(struct kiocb *iocb, long ret)
{
struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb);
@@ -599,6 +607,7 @@ static void swap_bdev_async_read_folio(struct swap_info_struct *sis,
static const struct swap_ops bdev_fs_swap_ops = {
.read_folio = swap_fs_read_folio,
.write_folio = swap_fs_write_folio,
+ .unplug = swap_fs_write_folio_unplug,
};
static const struct swap_ops bdev_sync_swap_ops = {
diff --git a/mm/swap.h b/mm/swap.h
index 8d4375f91632..67e3b1617146 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -224,6 +224,7 @@ struct swap_ops {
void (*write_folio)(struct swap_info_struct *sis,
struct folio *folio,
struct swap_iocb **plug);
+ void (*unplug)(struct swap_iocb *sio);
};
int init_swap_ops(struct swap_info_struct *sis);
void swap_read_folio(struct folio *folio, struct swap_iocb **plug);
--
2.52.0
* Re: [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction
2026-05-15 1:57 ` [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction Baoquan He
@ 2026-05-15 4:39 ` Barry Song
2026-05-15 5:43 ` Baoquan He
0 siblings, 1 reply; 6+ messages in thread
From: Barry Song @ 2026-05-15 4:39 UTC (permalink / raw)
To: Baoquan He
Cc: linux-mm, akpm, chrisl, usama.arif, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel
On Fri, May 15, 2026 at 9:58 AM Baoquan He <baoquan.he@linux.dev> wrote:
>
> When swap_ops was introduced, the FS-swap batch submission remained
> as a standalone swap_write_unplug() that directly called
> mapping->a_ops->swap_rw(). This meant callers still had implicit
> knowledge of filesystem internals rather than going through the
> swap_ops abstraction.
>
> Fix this by adding an unplug callback to struct swap_ops.
> Each ops table provides its own implementation:
> - bdev_fs_swap_ops uses the existing FS batch-submission logic
> - bdev_sync/bdev_async_swap_ops leave it NULL since block-layer
> plugging handles their I/O
>
> The swap_iocb now carries a pointer to its ops table so that
> swap_write_unplug() can dispatch through the callback without
> the caller needing to know the swap device type.
>
> Signed-off-by: Baoquan He <baoquan.he@linux.dev>
> ---
> mm/page_io.c | 11 ++++++++++-
> mm/swap.h | 1 +
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 38b94c560c37..2c36d261ad98 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -329,6 +329,7 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
>
> struct swap_iocb {
> struct kiocb iocb;
> + const struct swap_ops *ops;
> struct bio_vec bvec[SWAP_CLUSTER_MAX];
> int pages;
> int len;
> @@ -401,6 +402,7 @@ static void swap_fs_write_folio(struct swap_info_struct *sis,
> init_sync_kiocb(&sio->iocb, swap_file);
> sio->iocb.ki_complete = sio_write_complete;
> sio->iocb.ki_pos = pos;
> + sio->ops = sis->ops;
> sio->pages = 0;
> sio->len = 0;
> }
> @@ -454,7 +456,7 @@ static void swap_bdev_async_write_folio(struct swap_info_struct *sis,
> submit_bio(bio);
> }
>
> -void swap_write_unplug(struct swap_iocb *sio)
> +static void swap_fs_write_folio_unplug(struct swap_iocb *sio)
> {
> struct iov_iter from;
> struct address_space *mapping = sio->iocb.ki_filp->f_mapping;
> @@ -466,6 +468,12 @@ void swap_write_unplug(struct swap_iocb *sio)
> sio_write_complete(&sio->iocb, ret);
> }
>
> +void swap_write_unplug(struct swap_iocb *sio)
> +{
> + if (sio->ops && sio->ops->unplug)
> + sio->ops->unplug(sio);
> +}
> +
Hi Baoquan,
we have already "sis->ops" check in init_swap_ops, do we need !sis->ops
again?
+int init_swap_ops(struct swap_info_struct *sis)
+{
+ ...
+
+ if (!sis->ops || !sis->ops->read_folio || !sis->ops->write_folio)
+ return -EINVAL;
+
+ return 0;
+}
* Re: [PATCH v7 3/3] mm/swap: add unplug callback to swap_ops to fix leaky abstraction
2026-05-15 4:39 ` Barry Song
@ 2026-05-15 5:43 ` Baoquan He
0 siblings, 0 replies; 6+ messages in thread
From: Baoquan He @ 2026-05-15 5:43 UTC (permalink / raw)
To: Barry Song
Cc: linux-mm, akpm, chrisl, usama.arif, kasong, nphamcs, shikemeng,
youngjun.park, hch, linux-kernel
On 05/15/26 at 12:39pm, Barry Song wrote:
> On Fri, May 15, 2026 at 9:58 AM Baoquan He <baoquan.he@linux.dev> wrote:
> >
> > When swap_ops was introduced, the FS-swap batch submission remained
> > as a standalone swap_write_unplug() that directly called
> > mapping->a_ops->swap_rw(). This meant callers still had implicit
> > knowledge of filesystem internals rather than going through the
> > swap_ops abstraction.
> >
> > Fix this by adding an unplug callback to struct swap_ops.
> > Each ops table provides its own implementation:
> > - bdev_fs_swap_ops uses the existing FS batch-submission logic
> > - bdev_sync/bdev_async_swap_ops leave it NULL since block-layer
> > plugging handles their I/O
> >
> > The swap_iocb now carries a pointer to its ops table so that
> > swap_write_unplug() can dispatch through the callback without
> > the caller needing to know the swap device type.
> >
> > Signed-off-by: Baoquan He <baoquan.he@linux.dev>
> > ---
> > mm/page_io.c | 11 ++++++++++-
> > mm/swap.h | 1 +
> > 2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_io.c b/mm/page_io.c
> > index 38b94c560c37..2c36d261ad98 100644
> > --- a/mm/page_io.c
> > +++ b/mm/page_io.c
> > @@ -329,6 +329,7 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
> >
> > struct swap_iocb {
> > struct kiocb iocb;
> > + const struct swap_ops *ops;
> > struct bio_vec bvec[SWAP_CLUSTER_MAX];
> > int pages;
> > int len;
> > @@ -401,6 +402,7 @@ static void swap_fs_write_folio(struct swap_info_struct *sis,
> > init_sync_kiocb(&sio->iocb, swap_file);
> > sio->iocb.ki_complete = sio_write_complete;
> > sio->iocb.ki_pos = pos;
> > + sio->ops = sis->ops;
> > sio->pages = 0;
> > sio->len = 0;
> > }
> > @@ -454,7 +456,7 @@ static void swap_bdev_async_write_folio(struct swap_info_struct *sis,
> > submit_bio(bio);
> > }
> >
> > -void swap_write_unplug(struct swap_iocb *sio)
> > +static void swap_fs_write_folio_unplug(struct swap_iocb *sio)
> > {
> > struct iov_iter from;
> > struct address_space *mapping = sio->iocb.ki_filp->f_mapping;
> > @@ -466,6 +468,12 @@ void swap_write_unplug(struct swap_iocb *sio)
> > sio_write_complete(&sio->iocb, ret);
> > }
> >
> > +void swap_write_unplug(struct swap_iocb *sio)
> > +{
> > + if (sio->ops && sio->ops->unplug)
> > + sio->ops->unplug(sio);
> > +}
> > +
>
> Hi Baoquan,
>
> we have already "sis->ops" check in init_swap_ops, do we need !sis->ops
> again?
That's a good point, and no, I don't think we need it. I will fix it in
v8. Thanks.
>
> +int init_swap_ops(struct swap_info_struct *sis)
> +{
> + ...
> +
> + if (!sis->ops || !sis->ops->read_folio || !sis->ops->write_folio)
> + return -EINVAL;
> +
> + return 0;
> +}