The Linux Kernel Mailing List
* [PATCH v5 0/3] mm/swap: use swap_ops to register swap device's methods
@ 2026-05-11  7:33 Baoquan He
  2026-05-11  7:33 ` [PATCH v5 1/3] mm/swap: rename mm/page_io.c to mm/swap_io.c Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Baoquan He @ 2026-05-11  7:33 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, usama.arif, chrisl, baohua, kasong, nphamcs, shikemeng,
	youngjun.park, linux-kernel, Baoquan He

This series uses swap_ops to register each swap device's methods. This
simplifies the code logic and benefits any new type of swap device
added later.

This patchset also does the following renaming:
-------
   file renaming:
   ---
   mm/page_io.c to mm/swap_io.c

   function renaming:
   ---
   swap_writepage_* to swap_write_folio_* in file mm/swap_io.c

Changelog:
===
-v5:
 * Change the return value of init_swap_ops() to -EINVAL as per Chris's
   suggestion and adjust its invocation in swapon() accordingly.
 * Add Chris and Usama's Ack tags.

-v4:
 * Fix a typo opeations -> operations
 * Fix a code bug inside init_swap_ops(). I was taking a chance at the
   time: I thought the change was trivial, so in v3 I only compiled the
   kernel and didn't run it to test. It is now fixed and the test passed.
   Thanks to Usama for catching the above two issues.

-v3:
 * Rename setup_swap_ops() to init_swap_ops(), which reflects the
   function's behaviour a little better.
 * Check whether sis->ops, sis->ops->read_folio and sis->ops->write_folio
   are NULL in init_swap_ops() instead of spreading the checks across the
   call sites. If the check fails, fail swapon immediately. This was
   suggested by Chris.
 * Call init_swap_ops() before the setup_swap_extents() invocation. This
   doesn't harm anything and helps when a sis->ops->swap_activate method
   is added later.

-v2:
 * lots of cleanup for patch 2/3: renaming, moving data
   structures, and using const properly
 * collected tags from Kairui, Nhat and Barry

-v1:
 https://lore.kernel.org/linux-mm/20260302104016.163542-1-bhe@redhat.com/

Baoquan He (3):
  mm/swap: rename mm/page_io.c to mm/swap_io.c
  mm/swap: use swap_ops to register swap device's methods
  mm/swap_io.c: rename swap_writepage_* to swap_write_folio_*

 MAINTAINERS                 |   2 +-
 include/linux/swap.h        |   2 +
 mm/Makefile                 |   2 +-
 mm/swap.h                   |  12 ++++-
 mm/{page_io.c => swap_io.c} | 104 ++++++++++++++++++++----------------
 mm/swapfile.c               |   4 ++
 mm/zswap.c                  |   2 +-
 7 files changed, 78 insertions(+), 50 deletions(-)
 rename mm/{page_io.c => swap_io.c} (90%)

-- 
2.52.0



* [PATCH v5 1/3] mm/swap: rename mm/page_io.c to mm/swap_io.c
  2026-05-11  7:33 [PATCH v5 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
@ 2026-05-11  7:33 ` Baoquan He
  2026-05-11  7:33 ` [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
  2026-05-11  7:33 ` [PATCH v5 3/3] mm/swap_io.c: rename swap_writepage_* to swap_write_folio_* Baoquan He
  2 siblings, 0 replies; 8+ messages in thread
From: Baoquan He @ 2026-05-11  7:33 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, usama.arif, chrisl, baohua, kasong, nphamcs, shikemeng,
	youngjun.park, linux-kernel, Baoquan He

The code in mm/page_io.c is only related to swap I/O; it has
nothing to do with other page I/O.

Rename it to avoid confusion.

Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: Kairui Song <kasong@tencent.com>
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
 MAINTAINERS                 | 2 +-
 mm/Makefile                 | 2 +-
 mm/swap.h                   | 2 +-
 mm/{page_io.c => swap_io.c} | 2 --
 4 files changed, 3 insertions(+), 5 deletions(-)
 rename mm/{page_io.c => swap_io.c} (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index b2040011a386..4cc5fad4446f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17033,7 +17033,7 @@ F:	Documentation/mm/swap-table.rst
 F:	include/linux/swap.h
 F:	include/linux/swapfile.h
 F:	include/linux/swapops.h
-F:	mm/page_io.c
+F:	mm/swap_io.c
 F:	mm/swap.c
 F:	mm/swap.h
 F:	mm/swap_table.h
diff --git a/mm/Makefile b/mm/Makefile
index 8ad2ab08244e..a65ac900096a 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -75,7 +75,7 @@ ifdef CONFIG_MMU
 	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
 endif
 
-obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o
+obj-$(CONFIG_SWAP)	+= swap_io.o swap_state.o swapfile.o
 obj-$(CONFIG_ZSWAP)	+= zswap.o
 obj-$(CONFIG_HAS_DMA)	+= dmapool.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o hugetlb_sysfs.o hugetlb_sysctl.o
diff --git a/mm/swap.h b/mm/swap.h
index a77016f2423b..161185057993 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -214,7 +214,7 @@ extern void __swap_cluster_free_entries(struct swap_info_struct *si,
 					struct swap_cluster_info *ci,
 					unsigned int ci_off, unsigned int nr_pages);
 
-/* linux/mm/page_io.c */
+/* linux/mm/swap_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
 void swap_read_folio(struct folio *folio, struct swap_iocb **plug);
diff --git a/mm/page_io.c b/mm/swap_io.c
similarity index 99%
rename from mm/page_io.c
rename to mm/swap_io.c
index 70cea9e24d2f..91b33d955e63 100644
--- a/mm/page_io.c
+++ b/mm/swap_io.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- *  linux/mm/page_io.c
- *
  *  Copyright (C) 1991, 1992, 1993, 1994  Linus Torvalds
  *
  *  Swap reorganised 29.12.95, 
-- 
2.52.0



* [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods
  2026-05-11  7:33 [PATCH v5 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
  2026-05-11  7:33 ` [PATCH v5 1/3] mm/swap: rename mm/page_io.c to mm/swap_io.c Baoquan He
@ 2026-05-11  7:33 ` Baoquan He
  2026-05-11  8:37   ` Kairui Song
  2026-05-11  7:33 ` [PATCH v5 3/3] mm/swap_io.c: rename swap_writepage_* to swap_write_folio_* Baoquan He
  2 siblings, 1 reply; 8+ messages in thread
From: Baoquan He @ 2026-05-11  7:33 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, usama.arif, chrisl, baohua, kasong, nphamcs, shikemeng,
	youngjun.park, linux-kernel, Baoquan He

This simplifies the code and makes the logic clearer. It also makes any
swap device type added later easier to handle.

Currently there are three types of swap devices: bdev_fs, bdev_sync
and bdev_async, and only the read_folio and write_folio operations are
included. In the future, more swap device types could be added and
more operations, where appropriate, could be moved into swap_ops.
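
For illustration only (this is not part of the patch), the ops-table
pattern described above can be sketched in userspace C. Every name
below (toy_swap_dev, toy_swap_ops, toy_read, toy_write and the two
backends) is hypothetical, and the callbacks return an int tag purely
so that the chosen backend is observable; the real callbacks return
void and take a folio and a plug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct toy_swap_dev;

struct toy_swap_ops {
	int (*read_folio)(struct toy_swap_dev *dev, int folio_id);
	int (*write_folio)(struct toy_swap_dev *dev, int folio_id);
};

struct toy_swap_dev {
	const struct toy_swap_ops *ops;	/* chosen once, at setup time */
	int reads;
	int writes;
};

/* Backend 1: pretend synchronous I/O. Returns 1 to identify itself. */
static int toy_sync_read(struct toy_swap_dev *dev, int folio_id)
{
	(void)folio_id;
	dev->reads++;
	return 1;
}

static int toy_sync_write(struct toy_swap_dev *dev, int folio_id)
{
	(void)folio_id;
	dev->writes++;
	return 1;
}

/* Backend 2: pretend asynchronous I/O. Returns 2 to identify itself. */
static int toy_async_read(struct toy_swap_dev *dev, int folio_id)
{
	(void)folio_id;
	dev->reads++;
	return 2;
}

static int toy_async_write(struct toy_swap_dev *dev, int folio_id)
{
	(void)folio_id;
	dev->writes++;
	return 2;
}

static const struct toy_swap_ops toy_sync_ops = {
	.read_folio = toy_sync_read,
	.write_folio = toy_sync_write,
};

static const struct toy_swap_ops toy_async_ops = {
	.read_folio = toy_async_read,
	.write_folio = toy_async_write,
};

/* Callers dispatch through the table and never test the device type. */
static int toy_read(struct toy_swap_dev *dev, int folio_id)
{
	assert(dev->ops && dev->ops->read_folio);
	return dev->ops->read_folio(dev, folio_id);
}

static int toy_write(struct toy_swap_dev *dev, int folio_id)
{
	assert(dev->ops && dev->ops->write_folio);
	return dev->ops->write_folio(dev, folio_id);
}
```

In this sketch, supporting a new device type means adding one more
const table; toy_read() and toy_write() stay unchanged.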

Suggested-by: Chris Li <chrisl@kernel.org>
Acked-by: Chris Li <chrisl@kernel.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Co-developed-by: Barry Song <baohua@kernel.org>
Signed-off-by: Barry Song <baohua@kernel.org>
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
 include/linux/swap.h |   2 +
 mm/swap.h            |  10 ++++-
 mm/swap_io.c         | 102 +++++++++++++++++++++++++------------------
 mm/swapfile.c        |   4 ++
 mm/zswap.c           |   2 +-
 5 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7a09df6977a5..8edd629e30ba 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -242,6 +242,7 @@ struct swap_sequential_cluster {
 	unsigned int next[SWAP_NR_ORDERS]; /* Likely next allocation offset */
 };
 
+struct swap_ops;
 /*
  * The in-memory structure used to track swap areas.
  */
@@ -282,6 +283,7 @@ struct swap_info_struct {
 	struct work_struct reclaim_work; /* reclaim worker */
 	struct list_head discard_clusters; /* discard clusters list */
 	struct plist_node avail_list;   /* entry in swap_avail_head */
+	const struct swap_ops *ops;
 };
 
 static inline swp_entry_t page_swap_entry(struct page *page)
diff --git a/mm/swap.h b/mm/swap.h
index 161185057993..29bdc679fa98 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -217,6 +217,15 @@ extern void __swap_cluster_free_entries(struct swap_info_struct *si,
 /* linux/mm/swap_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
+struct swap_ops {
+	void (*read_folio)(struct swap_info_struct *sis,
+			struct folio *folio,
+			struct swap_iocb **plug);
+	void (*write_folio)(struct swap_info_struct *sis,
+			struct folio *folio,
+			struct swap_iocb **plug);
+};
+int init_swap_ops(struct swap_info_struct *sis);
 void swap_read_folio(struct folio *folio, struct swap_iocb **plug);
 void __swap_read_unplug(struct swap_iocb *plug);
 static inline void swap_read_unplug(struct swap_iocb *plug)
@@ -226,7 +235,6 @@ static inline void swap_read_unplug(struct swap_iocb *plug)
 }
 void swap_write_unplug(struct swap_iocb *sio);
 int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug);
-void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug);
 
 /* linux/mm/swap_state.c */
 extern struct address_space swap_space __read_mostly;
diff --git a/mm/swap_io.c b/mm/swap_io.c
index 91b33d955e63..cef884339717 100644
--- a/mm/swap_io.c
+++ b/mm/swap_io.c
@@ -238,6 +238,7 @@ static void swap_zeromap_folio_clear(struct folio *folio)
 int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
 {
 	int ret = 0;
+	struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
 
 	if (folio_free_swap(folio))
 		goto out_unlock;
@@ -283,7 +284,7 @@ int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
 	}
 	rcu_read_unlock();
 
-	__swap_writepage(folio, swap_plug);
+	sis->ops->write_folio(sis, folio, swap_plug);
 	return 0;
 out_unlock:
 	folio_unlock(folio);
@@ -373,10 +374,11 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
 	mempool_free(sio, sio_pool);
 }
 
-static void swap_writepage_fs(struct folio *folio, struct swap_iocb **swap_plug)
+static void swap_writepage_fs(struct swap_info_struct *sis,
+			      struct folio *folio,
+			      struct swap_iocb **swap_plug)
 {
 	struct swap_iocb *sio = swap_plug ? *swap_plug : NULL;
-	struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
 	struct file *swap_file = sis->swap_file;
 	loff_t pos = swap_dev_pos(folio->swap);
 
@@ -409,8 +411,9 @@ static void swap_writepage_fs(struct folio *folio, struct swap_iocb **swap_plug)
 		*swap_plug = sio;
 }
 
-static void swap_writepage_bdev_sync(struct folio *folio,
-		struct swap_info_struct *sis)
+static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
+				     struct folio *folio,
+				     struct swap_iocb **plug)
 {
 	struct bio_vec bv;
 	struct bio bio;
@@ -429,8 +432,9 @@ static void swap_writepage_bdev_sync(struct folio *folio,
 	__end_swap_bio_write(&bio);
 }
 
-static void swap_writepage_bdev_async(struct folio *folio,
-		struct swap_info_struct *sis)
+static void swap_writepage_bdev_async(struct swap_info_struct *sis,
+				      struct folio *folio,
+				      struct swap_iocb **plug)
 {
 	struct bio *bio;
 
@@ -446,29 +450,6 @@ static void swap_writepage_bdev_async(struct folio *folio,
 	submit_bio(bio);
 }
 
-void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug)
-{
-	struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
-
-	VM_BUG_ON_FOLIO(!folio_test_swapcache(folio), folio);
-	/*
-	 * ->flags can be updated non-atomically,
-	 * but that will never affect SWP_FS_OPS, so the data_race
-	 * is safe.
-	 */
-	if (data_race(sis->flags & SWP_FS_OPS))
-		swap_writepage_fs(folio, swap_plug);
-	/*
-	 * ->flags can be updated non-atomically,
-	 * but that will never affect SWP_SYNCHRONOUS_IO, so the data_race
-	 * is safe.
-	 */
-	else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
-		swap_writepage_bdev_sync(folio, sis);
-	else
-		swap_writepage_bdev_async(folio, sis);
-}
-
 void swap_write_unplug(struct swap_iocb *sio)
 {
 	struct iov_iter from;
@@ -537,9 +518,10 @@ static bool swap_read_folio_zeromap(struct folio *folio)
 	return true;
 }
 
-static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug)
+static void swap_read_folio_fs(struct swap_info_struct *sis,
+			       struct folio *folio,
+			       struct swap_iocb **plug)
 {
-	struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
 	struct swap_iocb *sio = NULL;
 	loff_t pos = swap_dev_pos(folio->swap);
 
@@ -571,8 +553,9 @@ static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug)
 		*plug = sio;
 }
 
-static void swap_read_folio_bdev_sync(struct folio *folio,
-		struct swap_info_struct *sis)
+static void swap_read_folio_bdev_sync(struct swap_info_struct *sis,
+				      struct folio *folio,
+				      struct swap_iocb **plug)
 {
 	struct bio_vec bv;
 	struct bio bio;
@@ -593,8 +576,9 @@ static void swap_read_folio_bdev_sync(struct folio *folio,
 	put_task_struct(current);
 }
 
-static void swap_read_folio_bdev_async(struct folio *folio,
-		struct swap_info_struct *sis)
+static void swap_read_folio_bdev_async(struct swap_info_struct *sis,
+				       struct folio *folio,
+				       struct swap_iocb **plug)
 {
 	struct bio *bio;
 
@@ -608,6 +592,44 @@ static void swap_read_folio_bdev_async(struct folio *folio,
 	submit_bio(bio);
 }
 
+static const struct swap_ops bdev_fs_swap_ops = {
+	.read_folio = swap_read_folio_fs,
+	.write_folio = swap_writepage_fs,
+};
+
+static const struct swap_ops bdev_sync_swap_ops = {
+	.read_folio = swap_read_folio_bdev_sync,
+	.write_folio = swap_writepage_bdev_sync,
+};
+
+static const struct swap_ops bdev_async_swap_ops = {
+	.read_folio = swap_read_folio_bdev_async,
+	.write_folio = swap_writepage_bdev_async,
+};
+
+int init_swap_ops(struct swap_info_struct *sis)
+{
+	/*
+	 * ->flags can be updated non-atomically, but that will
+	 * never affect SWP_FS_OPS, so the data_race is safe.
+	 */
+	if (data_race(sis->flags & SWP_FS_OPS))
+		sis->ops = &bdev_fs_swap_ops;
+	/*
+	 * ->flags can be updated non-atomically, but that will
+	 * never affect SWP_SYNCHRONOUS_IO, so the data_race is safe.
+	 */
+	else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
+		sis->ops = &bdev_sync_swap_ops;
+	else
+		sis->ops = &bdev_async_swap_ops;
+
+	if (!sis->ops || !sis->ops->read_folio || !sis->ops->write_folio)
+		return -EINVAL;
+
+	return 0;
+}
+
 void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 {
 	struct swap_info_struct *sis = __swap_entry_to_info(folio->swap);
@@ -642,13 +664,7 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 	/* We have to read from slower devices. Increase zswap protection. */
 	zswap_folio_swapin(folio);
 
-	if (data_race(sis->flags & SWP_FS_OPS)) {
-		swap_read_folio_fs(folio, plug);
-	} else if (synchronous) {
-		swap_read_folio_bdev_sync(folio, sis);
-	} else {
-		swap_read_folio_bdev_async(folio, sis);
-	}
+	sis->ops->read_folio(sis, folio, plug);
 
 finish:
 	if (workingset) {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9174f1eeffb0..82d2c9b35b11 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3518,6 +3518,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto bad_swap_unlock_inode;
 	}
 
+	error = init_swap_ops(si);
+	if (error)
+		goto bad_swap_unlock_inode;
+
 	si->max = maxpages;
 	si->pages = maxpages - 1;
 	nr_extents = setup_swap_extents(si, swap_file, &span);
diff --git a/mm/zswap.c b/mm/zswap.c
index 4b5149173b0e..192401f46de4 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1054,7 +1054,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	folio_set_reclaim(folio);
 
 	/* start writeback */
-	__swap_writepage(folio, NULL);
+	si->ops->write_folio(si, folio, NULL);
 
 out:
 	if (ret && ret != -EEXIST) {
-- 
2.52.0



* [PATCH v5 3/3] mm/swap_io.c: rename swap_writepage_* to swap_write_folio_*
  2026-05-11  7:33 [PATCH v5 0/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
  2026-05-11  7:33 ` [PATCH v5 1/3] mm/swap: rename mm/page_io.c to mm/swap_io.c Baoquan He
  2026-05-11  7:33 ` [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
@ 2026-05-11  7:33 ` Baoquan He
  2 siblings, 0 replies; 8+ messages in thread
From: Baoquan He @ 2026-05-11  7:33 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, usama.arif, chrisl, baohua, kasong, nphamcs, shikemeng,
	youngjun.park, linux-kernel, Baoquan He

All these swap_writepage_* functions handle a passed-in folio,
not a page. This renaming makes them consistent with their
counterpart swap_read_folio_* functions.

Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Baoquan He <baoquan.he@linux.dev>
---
 mm/swap_io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/swap_io.c b/mm/swap_io.c
index cef884339717..e2710d5fb44e 100644
--- a/mm/swap_io.c
+++ b/mm/swap_io.c
@@ -374,7 +374,7 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
 	mempool_free(sio, sio_pool);
 }
 
-static void swap_writepage_fs(struct swap_info_struct *sis,
+static void swap_write_folio_fs(struct swap_info_struct *sis,
 			      struct folio *folio,
 			      struct swap_iocb **swap_plug)
 {
@@ -411,7 +411,7 @@ static void swap_writepage_fs(struct swap_info_struct *sis,
 		*swap_plug = sio;
 }
 
-static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
+static void swap_write_folio_bdev_sync(struct swap_info_struct *sis,
 				     struct folio *folio,
 				     struct swap_iocb **plug)
 {
@@ -432,7 +432,7 @@ static void swap_writepage_bdev_sync(struct swap_info_struct *sis,
 	__end_swap_bio_write(&bio);
 }
 
-static void swap_writepage_bdev_async(struct swap_info_struct *sis,
+static void swap_write_folio_bdev_async(struct swap_info_struct *sis,
 				      struct folio *folio,
 				      struct swap_iocb **plug)
 {
@@ -594,17 +594,17 @@ static void swap_read_folio_bdev_async(struct swap_info_struct *sis,
 
 static const struct swap_ops bdev_fs_swap_ops = {
 	.read_folio = swap_read_folio_fs,
-	.write_folio = swap_writepage_fs,
+	.write_folio = swap_write_folio_fs,
 };
 
 static const struct swap_ops bdev_sync_swap_ops = {
 	.read_folio = swap_read_folio_bdev_sync,
-	.write_folio = swap_writepage_bdev_sync,
+	.write_folio = swap_write_folio_bdev_sync,
 };
 
 static const struct swap_ops bdev_async_swap_ops = {
 	.read_folio = swap_read_folio_bdev_async,
-	.write_folio = swap_writepage_bdev_async,
+	.write_folio = swap_write_folio_bdev_async,
 };
 
 int init_swap_ops(struct swap_info_struct *sis)
-- 
2.52.0



* Re: [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods
  2026-05-11  7:33 ` [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods Baoquan He
@ 2026-05-11  8:37   ` Kairui Song
  2026-05-11  9:03     ` Baoquan He
  2026-05-11 12:17     ` Baoquan He
  0 siblings, 2 replies; 8+ messages in thread
From: Kairui Song @ 2026-05-11  8:37 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-mm, akpm, usama.arif, chrisl, baohua, nphamcs, shikemeng,
	youngjun.park, linux-kernel

On Mon, May 11, 2026 at 3:43 PM Baoquan He <baoquan.he@linux.dev> wrote:
>
> This simplifies codes and makes logic clearer. And also makes later any
> new swap device type being added easier to handle.
>
> Currently there are three types of swap devices: bdev_fs, bdev_sync
> and bdev_async, and only operations read_folio and write_folio are
> included. In the future, there could be more swap device types added
> and more appropriate operations adapted into swap_ops.

Hi Baoquan,

Thanks for the patch. Sorry, I was busy with travel and a few other
series and didn't get a chance to look at the previous few versions.

> +struct swap_ops {
> +       void (*read_folio)(struct swap_info_struct *sis,
> +                       struct folio *folio,
> +                       struct swap_iocb **plug);
> +       void (*write_folio)(struct swap_info_struct *sis,
> +                       struct folio *folio,
> +                       struct swap_iocb **plug);
> +};

Overall, I really like this idea; we can have a cleaner interface
starting from here.

> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9174f1eeffb0..82d2c9b35b11 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3518,6 +3518,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>                 goto bad_swap_unlock_inode;
>         }
>
> +       error = init_swap_ops(si);
> +       if (error)
> +               goto bad_swap_unlock_inode;
> +

But this part seems wrong? init_swap_ops() is called before
setup_swap_extents() -> swap_activate() sets SWP_FS_OPS, and before
SWP_SYNCHRONOUS_IO is set further below, so the flag branches in
init_swap_ops() never take effect, am I right? You need to move this
call to some place after these flags are set. Maybe right before
swapon_mutex is taken, so the mutex is a clean barrier: after it the
swap device will be used, and everything set up before it will be seen
by users. Please also add some comments.
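
The ordering problem can be reproduced with a small userspace model.
This is only a sketch: the TOY_* flag values, struct toy_si and all
toy_* functions are hypothetical stand-ins, and a string takes the
place of the ops pointer. It shows that selecting the ops before the
flags are set always falls through to the async default.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values; the kernel's SWP_* definitions differ. */
#define TOY_FS_OPS	0x1u
#define TOY_SYNC_IO	0x2u

struct toy_si {
	unsigned int flags;
	const char *ops_name;	/* stands in for the ops pointer */
};

/* Same shape as init_swap_ops(): pick by flags, default to async. */
static void toy_init_ops(struct toy_si *si)
{
	if (si->flags & TOY_FS_OPS)
		si->ops_name = "fs";
	else if (si->flags & TOY_SYNC_IO)
		si->ops_name = "sync";
	else
		si->ops_name = "async";
}

/* Stands in for the later swapon() steps that set the flags. */
static void toy_activate(struct toy_si *si, unsigned int dev_flags)
{
	si->flags |= dev_flags;
}

/* Ordering as posted: ops are chosen while flags are still zero. */
static const char *toy_swapon_early(unsigned int dev_flags)
{
	struct toy_si si = { 0 };

	toy_init_ops(&si);
	toy_activate(&si, dev_flags);
	return si.ops_name;
}

/* Suggested ordering: ops are chosen after the flags are final. */
static const char *toy_swapon_late(unsigned int dev_flags)
{
	struct toy_si si = { 0 };

	toy_activate(&si, dev_flags);
	toy_init_ops(&si);
	return si.ops_name;
}
```

In this model toy_swapon_early() returns "async" for every input,
while toy_swapon_late() returns the backend that matches the flags.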


* Re: [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods
  2026-05-11  8:37   ` Kairui Song
@ 2026-05-11  9:03     ` Baoquan He
  2026-05-11 12:17     ` Baoquan He
  1 sibling, 0 replies; 8+ messages in thread
From: Baoquan He @ 2026-05-11  9:03 UTC (permalink / raw)
  To: Kairui Song
  Cc: linux-mm, akpm, usama.arif, chrisl, baohua, nphamcs, shikemeng,
	youngjun.park, linux-kernel

On 05/11/26 at 04:37pm, Kairui Song wrote:
> On Mon, May 11, 2026 at 3:43 PM Baoquan He <baoquan.he@linux.dev> wrote:
> >
> > This simplifies codes and makes logic clearer. And also makes later any
> > new swap device type being added easier to handle.
> >
> > Currently there are three types of swap devices: bdev_fs, bdev_sync
> > and bdev_async, and only operations read_folio and write_folio are
> > included. In the future, there could be more swap device types added
> > and more appropriate operations adapted into swap_ops.
> 
> Hi Baoquan,
> 
> Thanks for the patch. Sorry I was busy with travel and a few other
> series and didn't get a chance to look at a few previous versions.
> 
> > +struct swap_ops {
> > +       void (*read_folio)(struct swap_info_struct *sis,
> > +                       struct folio *folio,
> > +                       struct swap_iocb **plug);
> > +       void (*write_folio)(struct swap_info_struct *sis,
> > +                       struct folio *folio,
> > +                       struct swap_iocb **plug);
> > +};
> 
> Overall, I really like this idea, we can have a cleaner interface
> starting there.
> 
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 9174f1eeffb0..82d2c9b35b11 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -3518,6 +3518,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> >                 goto bad_swap_unlock_inode;
> >         }
> >
> > +       error = init_swap_ops(si);
> > +       if (error)
> > +               goto bad_swap_unlock_inode;
> > +
> 
> But this part seems wrong? init_swap_ops is called before
> setup_swap_extents -> swap_activate sets SWP_FS_OPS, or
> SWP_SYNCHRONOUS_IO is set below, so the branches in init_swap_ops
> never take effect, am I right? You need to move this someplace after
> these flags are set. Maybe right before swapon_mutex so the mutex is a
> clean barrier that the swap device will be used and all things before
> that will be seen by users, with some comments.

You are right, this is wrong.

I checked my queued local patches. I had added code related to
activation/deactivation and register/unregister according to Youngjun's
comments. Later Barry helped post v2 while I was on leave, and I just
picked up the code from Barry's v2 to post v3. This wrong code defaults
to bdev_async_swap_ops, so I didn't catch it when testing.

I will post v6 to fix it. Thanks a lot for the careful review and for
catching it.

Thanks
Baoquan


* Re: [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods
  2026-05-11  8:37   ` Kairui Song
  2026-05-11  9:03     ` Baoquan He
@ 2026-05-11 12:17     ` Baoquan He
  2026-05-11 14:38       ` Kairui Song
  1 sibling, 1 reply; 8+ messages in thread
From: Baoquan He @ 2026-05-11 12:17 UTC (permalink / raw)
  To: Kairui Song
  Cc: linux-mm, akpm, usama.arif, chrisl, baohua, nphamcs, shikemeng,
	youngjun.park, linux-kernel

On 05/11/26 at 04:37pm, Kairui Song wrote:
> On Mon, May 11, 2026 at 3:43 PM Baoquan He <baoquan.he@linux.dev> wrote:
> >
> > This simplifies codes and makes logic clearer. And also makes later any
> > new swap device type being added easier to handle.
> >
> > Currently there are three types of swap devices: bdev_fs, bdev_sync
> > and bdev_async, and only operations read_folio and write_folio are
> > included. In the future, there could be more swap device types added
> > and more appropriate operations adapted into swap_ops.
> 
> Hi Baoquan,
> 
> Thanks for the patch. Sorry I was busy with travel and a few other
> series and didn't get a chance to look at a few previous versions.
> 
> > +struct swap_ops {
> > +       void (*read_folio)(struct swap_info_struct *sis,
> > +                       struct folio *folio,
> > +                       struct swap_iocb **plug);
> > +       void (*write_folio)(struct swap_info_struct *sis,
> > +                       struct folio *folio,
> > +                       struct swap_iocb **plug);
> > +};
> 
> Overall, I really like this idea, we can have a cleaner interface
> starting there.
> 
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 9174f1eeffb0..82d2c9b35b11 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -3518,6 +3518,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> >                 goto bad_swap_unlock_inode;
> >         }
> >
> > +       error = init_swap_ops(si);
> > +       if (error)
> > +               goto bad_swap_unlock_inode;
> > +
> 
> But this part seems wrong? init_swap_ops is called before
> setup_swap_extents -> swap_activate sets SWP_FS_OPS, or
> SWP_SYNCHRONOUS_IO is set below, so the branches in init_swap_ops
> never take effect, am I right? You need to move this someplace after
> these flags are set. Maybe right before swapon_mutex so the mutex is a
> clean barrier that the swap device will be used and all things before
> that will be seen by users, with some comments.


Speaking of comments, do the sentences below look OK to you?

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 82d2c9b35b11..8012e5e334f9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3518,10 +3518,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto bad_swap_unlock_inode;
 	}
 
-	error = init_swap_ops(si);
-	if (error)
-		goto bad_swap_unlock_inode;
-
 	si->max = maxpages;
 	si->pages = maxpages - 1;
 	nr_extents = setup_swap_extents(si, swap_file, &span);
@@ -3616,6 +3612,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto free_swap_zswap;
 	}
 
+	/*
+	 * init_swap_ops() sets si->ops based on flags. It does not need
+	 * swapon_mutex, and must complete before enable_swap_info()
+	 * exposes the device.
+	 */
+	error = init_swap_ops(si);
+	if (error)
+		goto bad_swap_unlock_inode;
+
 	mutex_lock(&swapon_mutex);
 	prio = DEF_SWAP_PRIO;
 	if (swap_flags & SWAP_FLAG_PREFER)


* Re: [PATCH v5 2/3] mm/swap: use swap_ops to register swap device's methods
  2026-05-11 12:17     ` Baoquan He
@ 2026-05-11 14:38       ` Kairui Song
  0 siblings, 0 replies; 8+ messages in thread
From: Kairui Song @ 2026-05-11 14:38 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-mm, akpm, usama.arif, chrisl, baohua, nphamcs, shikemeng,
	youngjun.park, linux-kernel

On Mon, May 11, 2026 at 8:17 PM Baoquan He <baoquan.he@linux.dev> wrote:
> Speaking of comments, do the sentences below look OK to you?
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 82d2c9b35b11..8012e5e334f9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3518,10 +3518,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>                 goto bad_swap_unlock_inode;
>         }
>
> -       error = init_swap_ops(si);
> -       if (error)
> -               goto bad_swap_unlock_inode;
> -
>         si->max = maxpages;
>         si->pages = maxpages - 1;
>         nr_extents = setup_swap_extents(si, swap_file, &span);
> @@ -3616,6 +3612,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>                 goto free_swap_zswap;
>         }
>
> +       /*
> +        * init_swap_ops() sets si->ops based on flags. It does not need
> +        * swapon_mutex, and must complete before enable_swap_info()
> +        * exposes the device.
> +        */
> +       error = init_swap_ops(si);
> +       if (error)
> +               goto bad_swap_unlock_inode;
> +

Right, LGTM!
>         mutex_lock(&swapon_mutex);
>         prio = DEF_SWAP_PRIO;
>         if (swap_flags & SWAP_FLAG_PREFER)

