* [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback
@ 2025-06-24  2:21 Joanne Koong
  2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
                   ` (16 more replies)
  0 siblings, 17 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

This series adds fuse iomap support for buffered writes and dirty folio
writeback. This is needed so that granular uptodate and dirty tracking
can be used in fuse when large folios are enabled. This has two big
advantages: for writes, only the relevant portions of a folio need to be
read into the page cache instead of the entire folio, and for writeback,
only the dirty portions need to be written back instead of the entire
folio.

This patchset is composed of two parts:
a) Christoph's iomap changes (patches 1 to 11), which are taken verbatim
   from [1]
b) fuse changes to use iomap (patches 12 to 16)

Please note the following:
* this patchset version temporarily drops the CONFIG_BLOCK iomap refactoring
  patches that will be needed before this series can be merged. For now, this
  breaks compilation for environments where CONFIG_BLOCK is not set; the
  CONFIG_BLOCK iomap changes will be added back once the core changes in this
  patchset are ready to go.
* this patchset does not enable large folios yet. That will be sent out in a
  separate future patchset.

This series is on top of commit 86731a2a6 ("Linux 6.16-rc3") in the linux
tree.

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/20250617105514.3393938-1-hch@lst.de/


Changelog
---------
v2 -> v3:
* Fix up fuse patches to use iomap APIs from Christoph's patches
* Drop CONFIG_BLOCK patches
* Add patch to use iomap for invalidation and partial uptodateness check
* Add patch for refactoring fuse writeback to use iomap_writepage_ctx inode
v2: https://lore.kernel.org/linux-fsdevel/20250613214642.2903225-1-joannelkoong@gmail.com/

v1 -> v2:
* Drop IOMAP_IN_MEM type and just use IOMAP_MAPPED for fuse
* Separate out new helper functions added to iomap into separate commits
* Update iomap documentation
* Clean up iomap_writeback_dirty_folio() locking logic per Christoph's
  recommendation
* Refactor ->map_blocks() to generic ->writeback_folio()
* Refactor ->submit_ioend() to generic ->writeback_complete()
* Add patch for changing 'count' to 'async_writeback'
* Rebase commits onto linux branch instead of fuse branch
v1: https://lore.kernel.org/linux-fsdevel/20250606233803.1421259-1-joannelkoong@gmail.com/


Christoph Hellwig (7):
  iomap: pass more arguments using struct iomap_writepage_ctx
  iomap: refactor the writeback interface
  iomap: hide ioends from the generic writeback code
  iomap: move all ioend handling to ioend.c
  iomap: rename iomap_writepage_map to iomap_writeback_folio
  iomap: export iomap_writeback_folio
  iomap: replace iomap_folio_ops with iomap_write_ops

Joanne Koong (9):
  iomap: cleanup the pending writeback tracking in
    iomap_writepage_map_blocks
  iomap: add public helpers for uptodate state manipulation
  iomap: move folio_unlock out of iomap_writeback_folio
  iomap: add read_folio_range() handler for buffered writes
  fuse: use iomap for buffered writes
  fuse: use iomap for writeback
  fuse: use iomap for folio laundering
  fuse: hook into iomap for invalidating and checking partial
    uptodateness
  fuse: refactor writeback to use iomap_writepage_ctx inode

 Documentation/filesystems/iomap/design.rst    |   3 -
 .../filesystems/iomap/operations.rst          |  51 +-
 block/fops.c                                  |  37 +-
 fs/fuse/Kconfig                               |   1 +
 fs/fuse/file.c                                | 327 ++++++-------
 fs/gfs2/aops.c                                |   8 +-
 fs/gfs2/bmap.c                                |  48 +-
 fs/gfs2/bmap.h                                |   1 +
 fs/gfs2/file.c                                |   3 +-
 fs/iomap/buffered-io.c                        | 438 +++++-------------
 fs/iomap/internal.h                           |   1 -
 fs/iomap/ioend.c                              | 220 ++++++++-
 fs/iomap/trace.h                              |   2 +-
 fs/xfs/xfs_aops.c                             | 238 ++++++----
 fs/xfs/xfs_file.c                             |   6 +-
 fs/xfs/xfs_iomap.c                            |  12 +-
 fs/xfs/xfs_iomap.h                            |   1 +
 fs/xfs/xfs_reflink.c                          |   3 +-
 fs/zonefs/file.c                              |  40 +-
 include/linux/iomap.h                         |  81 ++--
 20 files changed, 775 insertions(+), 746 deletions(-)

-- 
2.47.1


* [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-06-24  6:07   ` Johannes Thumshirn
  2025-07-02 15:57   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks Joanne Koong
                   ` (15 subsequent siblings)
  16 siblings, 2 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

Add inode and wbc fields to pass the inode and writeback context that
are needed through the entire writeback call chain, and let the callers
initialize all fields in the writeback context before calling
iomap_writepages to simplify the argument passing.
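
To illustrate the new calling convention (a minimal sketch; the
example_* names are placeholders and not part of this patch), a
converted filesystem now builds the whole context up front and passes
only the wpc, mirroring the blkdev conversion below:

static int example_writepages(struct address_space *mapping,
		struct writeback_control *wbc)
{
	struct iomap_writepage_ctx wpc = {
		.inode	= mapping->host,	/* inode under writeback */
		.wbc	= wbc,			/* writeback control */
		.ops	= &example_writeback_ops,
	};

	return iomap_writepages(&wpc);
}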

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
---
 block/fops.c           |  8 +++++--
 fs/gfs2/aops.c         |  8 +++++--
 fs/iomap/buffered-io.c | 52 +++++++++++++++++++-----------------------
 fs/xfs/xfs_aops.c      | 24 +++++++++++++------
 fs/zonefs/file.c       |  8 +++++--
 include/linux/iomap.h  |  6 ++---
 6 files changed, 61 insertions(+), 45 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index 1309861d4c2c..3394263d942b 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -558,9 +558,13 @@ static const struct iomap_writeback_ops blkdev_writeback_ops = {
 static int blkdev_writepages(struct address_space *mapping,
 		struct writeback_control *wbc)
 {
-	struct iomap_writepage_ctx wpc = { };
+	struct iomap_writepage_ctx wpc = {
+		.inode		= mapping->host,
+		.wbc		= wbc,
+		.ops		= &blkdev_writeback_ops
+	};
 
-	return iomap_writepages(mapping, wbc, &wpc, &blkdev_writeback_ops);
+	return iomap_writepages(&wpc);
 }
 
 const struct address_space_operations def_blk_aops = {
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 14f204cd5a82..47d74afd63ac 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -159,7 +159,11 @@ static int gfs2_writepages(struct address_space *mapping,
 			   struct writeback_control *wbc)
 {
 	struct gfs2_sbd *sdp = gfs2_mapping2sbd(mapping);
-	struct iomap_writepage_ctx wpc = { };
+	struct iomap_writepage_ctx wpc = {
+		.inode		= mapping->host,
+		.wbc		= wbc,
+		.ops		= &gfs2_writeback_ops,
+	};
 	int ret;
 
 	/*
@@ -168,7 +172,7 @@ static int gfs2_writepages(struct address_space *mapping,
 	 * want balance_dirty_pages() to loop indefinitely trying to write out
 	 * pages held in the ail that it can't find.
 	 */
-	ret = iomap_writepages(mapping, wbc, &wpc, &gfs2_writeback_ops);
+	ret = iomap_writepages(&wpc);
 	if (ret == 0 && wbc->nr_to_write > 0)
 		set_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags);
 	return ret;
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 3729391a18f3..71ad17bf827f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1626,20 +1626,19 @@ static int iomap_submit_ioend(struct iomap_writepage_ctx *wpc, int error)
 }
 
 static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
-		struct writeback_control *wbc, struct inode *inode, loff_t pos,
-		u16 ioend_flags)
+		loff_t pos, u16 ioend_flags)
 {
 	struct bio *bio;
 
 	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
-			       REQ_OP_WRITE | wbc_to_write_flags(wbc),
+			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
 			       GFP_NOFS, &iomap_ioend_bioset);
 	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
 	bio->bi_end_io = iomap_writepage_end_bio;
-	bio->bi_write_hint = inode->i_write_hint;
-	wbc_init_bio(wbc, bio);
+	bio->bi_write_hint = wpc->inode->i_write_hint;
+	wbc_init_bio(wpc->wbc, bio);
 	wpc->nr_folios = 0;
-	return iomap_init_ioend(inode, bio, pos, ioend_flags);
+	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
 }
 
 static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
@@ -1678,9 +1677,7 @@ static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
  * writepage context that the caller will need to submit.
  */
 static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
-		struct writeback_control *wbc, struct folio *folio,
-		struct inode *inode, loff_t pos, loff_t end_pos,
-		unsigned len)
+		struct folio *folio, loff_t pos, loff_t end_pos, unsigned len)
 {
 	struct iomap_folio_state *ifs = folio->private;
 	size_t poff = offset_in_folio(folio, pos);
@@ -1701,8 +1698,7 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 		error = iomap_submit_ioend(wpc, 0);
 		if (error)
 			return error;
-		wpc->ioend = iomap_alloc_ioend(wpc, wbc, inode, pos,
-				ioend_flags);
+		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
 	}
 
 	if (!bio_add_folio(&wpc->ioend->io_bio, folio, len, poff))
@@ -1756,24 +1752,24 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 	if (wpc->ioend->io_offset + wpc->ioend->io_size > end_pos)
 		wpc->ioend->io_size = end_pos - wpc->ioend->io_offset;
 
-	wbc_account_cgroup_owner(wbc, folio, len);
+	wbc_account_cgroup_owner(wpc->wbc, folio, len);
 	return 0;
 }
 
 static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
-		struct writeback_control *wbc, struct folio *folio,
-		struct inode *inode, u64 pos, u64 end_pos,
-		unsigned dirty_len, unsigned *count)
+		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
+		unsigned *count)
 {
 	int error;
 
 	do {
 		unsigned map_len;
 
-		error = wpc->ops->map_blocks(wpc, inode, pos, dirty_len);
+		error = wpc->ops->map_blocks(wpc, wpc->inode, pos, dirty_len);
 		if (error)
 			break;
-		trace_iomap_writepage_map(inode, pos, dirty_len, &wpc->iomap);
+		trace_iomap_writepage_map(wpc->inode, pos, dirty_len,
+				&wpc->iomap);
 
 		map_len = min_t(u64, dirty_len,
 			wpc->iomap.offset + wpc->iomap.length - pos);
@@ -1787,8 +1783,8 @@ static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
 		case IOMAP_HOLE:
 			break;
 		default:
-			error = iomap_add_to_ioend(wpc, wbc, folio, inode, pos,
-					end_pos, map_len);
+			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
+					map_len);
 			if (!error)
 				(*count)++;
 			break;
@@ -1870,10 +1866,10 @@ static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
 }
 
 static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
-		struct writeback_control *wbc, struct folio *folio)
+		struct folio *folio)
 {
 	struct iomap_folio_state *ifs = folio->private;
-	struct inode *inode = folio->mapping->host;
+	struct inode *inode = wpc->inode;
 	u64 pos = folio_pos(folio);
 	u64 end_pos = pos + folio_size(folio);
 	u64 end_aligned = 0;
@@ -1920,8 +1916,8 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 */
 	end_aligned = round_up(end_pos, i_blocksize(inode));
 	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
-		error = iomap_writepage_map_blocks(wpc, wbc, folio, inode,
-				pos, end_pos, rlen, &count);
+		error = iomap_writepage_map_blocks(wpc, folio, pos, end_pos,
+				rlen, &count);
 		if (error)
 			break;
 		pos += rlen;
@@ -1957,10 +1953,9 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 }
 
 int
-iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
-		struct iomap_writepage_ctx *wpc,
-		const struct iomap_writeback_ops *ops)
+iomap_writepages(struct iomap_writepage_ctx *wpc)
 {
+	struct address_space *mapping = wpc->inode->i_mapping;
 	struct folio *folio = NULL;
 	int error;
 
@@ -1972,9 +1967,8 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
 			PF_MEMALLOC))
 		return -EIO;
 
-	wpc->ops = ops;
-	while ((folio = writeback_iter(mapping, wbc, folio, &error)))
-		error = iomap_writepage_map(wpc, wbc, folio);
+	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
+		error = iomap_writepage_map(wpc, folio);
 	return iomap_submit_ioend(wpc, error);
 }
 EXPORT_SYMBOL_GPL(iomap_writepages);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 63151feb9c3f..65485a52df3b 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -636,19 +636,29 @@ xfs_vm_writepages(
 	xfs_iflags_clear(ip, XFS_ITRUNCATED);
 
 	if (xfs_is_zoned_inode(ip)) {
-		struct xfs_zoned_writepage_ctx	xc = { };
+		struct xfs_zoned_writepage_ctx	xc = {
+			.ctx = {
+				.inode	= mapping->host,
+				.wbc	= wbc,
+				.ops	= &xfs_zoned_writeback_ops
+			},
+		};
 		int				error;
 
-		error = iomap_writepages(mapping, wbc, &xc.ctx,
-					 &xfs_zoned_writeback_ops);
+		error = iomap_writepages(&xc.ctx);
 		if (xc.open_zone)
 			xfs_open_zone_put(xc.open_zone);
 		return error;
 	} else {
-		struct xfs_writepage_ctx	wpc = { };
-
-		return iomap_writepages(mapping, wbc, &wpc.ctx,
-				&xfs_writeback_ops);
+		struct xfs_writepage_ctx	wpc = {
+			.ctx = {
+				.inode	= mapping->host,
+				.wbc	= wbc,
+				.ops	= &xfs_writeback_ops
+			},
+		};
+
+		return iomap_writepages(&wpc.ctx);
 	}
 }
 
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 42e2c0065bb3..edca4bbe4b72 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -152,9 +152,13 @@ static const struct iomap_writeback_ops zonefs_writeback_ops = {
 static int zonefs_writepages(struct address_space *mapping,
 			     struct writeback_control *wbc)
 {
-	struct iomap_writepage_ctx wpc = { };
+	struct iomap_writepage_ctx wpc = {
+		.inode		= mapping->host,
+		.wbc		= wbc,
+		.ops		= &zonefs_writeback_ops,
+	};
 
-	return iomap_writepages(mapping, wbc, &wpc, &zonefs_writeback_ops);
+	return iomap_writepages(&wpc);
 }
 
 static int zonefs_swap_activate(struct swap_info_struct *sis,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 522644d62f30..00179c9387c5 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -448,6 +448,8 @@ struct iomap_writeback_ops {
 
 struct iomap_writepage_ctx {
 	struct iomap		iomap;
+	struct inode		*inode;
+	struct writeback_control *wbc;
 	struct iomap_ioend	*ioend;
 	const struct iomap_writeback_ops *ops;
 	u32			nr_folios;	/* folios added to the ioend */
@@ -461,9 +463,7 @@ void iomap_finish_ioends(struct iomap_ioend *ioend, int error);
 void iomap_ioend_try_merge(struct iomap_ioend *ioend,
 		struct list_head *more_ioends);
 void iomap_sort_ioends(struct list_head *ioend_list);
-int iomap_writepages(struct address_space *mapping,
-		struct writeback_control *wbc, struct iomap_writepage_ctx *wpc,
-		const struct iomap_writeback_ops *ops);
+int iomap_writepages(struct iomap_writepage_ctx *wpc);
 
 /*
  * Flags for direct I/O ->end_io:
-- 
2.47.1


* [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
  2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-06-24  6:09   ` Johannes Thumshirn
  2025-07-02 15:59   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 03/16] iomap: refactor the writeback interface Joanne Koong
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

We don't care about the count of outstanding ioends, only whether there
is one. Replace the count variable passed to iomap_writepage_map_blocks
with a boolean to make that clearer.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
[hch: rename the variable, update the commit message]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 71ad17bf827f..11a55da26a6f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1758,7 +1758,7 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 
 static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
 		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
-		unsigned *count)
+		bool *wb_pending)
 {
 	int error;
 
@@ -1786,7 +1786,7 @@ static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
 			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
 					map_len);
 			if (!error)
-				(*count)++;
+				*wb_pending = true;
 			break;
 		}
 		dirty_len -= map_len;
@@ -1873,7 +1873,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	u64 pos = folio_pos(folio);
 	u64 end_pos = pos + folio_size(folio);
 	u64 end_aligned = 0;
-	unsigned count = 0;
+	bool wb_pending = false;
 	int error = 0;
 	u32 rlen;
 
@@ -1917,13 +1917,13 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	end_aligned = round_up(end_pos, i_blocksize(inode));
 	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
 		error = iomap_writepage_map_blocks(wpc, folio, pos, end_pos,
-				rlen, &count);
+				rlen, &wb_pending);
 		if (error)
 			break;
 		pos += rlen;
 	}
 
-	if (count)
+	if (wb_pending)
 		wpc->nr_folios++;
 
 	/*
@@ -1945,7 +1945,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		if (atomic_dec_and_test(&ifs->write_bytes_pending))
 			folio_end_writeback(folio);
 	} else {
-		if (!count)
+		if (!wb_pending)
 			folio_end_writeback(folio);
 	}
 	mapping_set_error(inode->i_mapping, error);
-- 
2.47.1


* [PATCH v3 03/16] iomap: refactor the writeback interface
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
  2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
  2025-06-24  2:21 ` [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:13   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 04/16] iomap: hide ioends from the generic writeback code Joanne Koong
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

Replace ->map_blocks with a new ->writeback_range, which differs in the
following ways:

 - it must also queue up the I/O for writeback, that is, it calls into
   the slightly refactored and extended-in-scope iomap_add_to_ioend for
   each region
 - it can handle only a part of the requested region, that is, the retry
   loop for partial mappings moves into the caller
 - it handles cleanup on failures as well, and thus also replaces the
   discard_folio method, which was only implemented by XFS

This will allow the iomap writeback code to also be used by file systems
that are not block based, like fuse.
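
As a minimal sketch of the new contract (example_iomap_begin() is a
hypothetical mapping helper, not part of this patch), a simple
->writeback_range refreshes the cached mapping only when the position
falls outside of it, then queues the range for writeback:

static ssize_t example_writeback_range(struct iomap_writepage_ctx *wpc,
		struct folio *folio, u64 pos, unsigned int len, u64 end_pos)
{
	if (pos < wpc->iomap.offset ||
	    pos >= wpc->iomap.offset + wpc->iomap.length) {
		int error;

		/* the cached mapping is stale, look up a new one */
		error = example_iomap_begin(wpc->inode, pos,
				i_size_read(wpc->inode) - pos, IOMAP_WRITE,
				&wpc->iomap, NULL);
		if (error)
			return error;	/* also discard reservations here */
	}

	/*
	 * Queue up as much of the range as the mapping covers; the caller
	 * loops over any remainder.  Returns the number of bytes processed
	 * or a negative errno.
	 */
	return iomap_add_to_ioend(wpc, folio, pos, end_pos, len);
}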

Co-developed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../filesystems/iomap/operations.rst          |  23 +--
 block/fops.c                                  |  25 ++-
 fs/gfs2/bmap.c                                |  26 +--
 fs/iomap/buffered-io.c                        |  93 +++++------
 fs/iomap/trace.h                              |   2 +-
 fs/xfs/xfs_aops.c                             | 154 ++++++++++--------
 fs/zonefs/file.c                              |  28 ++--
 include/linux/iomap.h                         |  20 +--
 8 files changed, 187 insertions(+), 184 deletions(-)

diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 3b628e370d88..b28f215db6e5 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -271,7 +271,7 @@ writeback.
 It does not lock ``i_rwsem`` or ``invalidate_lock``.
 
 The dirty bit will be cleared for all folios run through the
-``->map_blocks`` machinery described below even if the writeback fails.
+``->writeback_range`` machinery described below even if the writeback fails.
 This is to prevent dirty folio clots when storage devices fail; an
 ``-EIO`` is recorded for userspace to collect via ``fsync``.
 
@@ -283,15 +283,14 @@ The ``ops`` structure must be specified and is as follows:
 .. code-block:: c
 
  struct iomap_writeback_ops {
-     int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
-                       loff_t offset, unsigned len);
-     int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
-     void (*discard_folio)(struct folio *folio, loff_t pos);
+    ssize_t (*writeback_range)(struct iomap_writepage_ctx *wpc,
+    		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
+    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
  };
 
 The fields are as follows:
 
-  - ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
+  - ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
     range (in bytes) given by ``offset`` and ``len``.
     iomap calls this function for each dirty fs block in each dirty folio,
     though it will `reuse mappings
@@ -316,18 +315,6 @@ The fields are as follows:
     transactions from process context before submitting the bio.
     This function is optional.
 
-  - ``discard_folio``: iomap calls this function after ``->map_blocks``
-    fails to schedule I/O for any part of a dirty folio.
-    The function should throw away any reservations that may have been
-    made for the write.
-    The folio will be marked clean and an ``-EIO`` recorded in the
-    pagecache.
-    Filesystems can use this callback to `remove
-    <https://lore.kernel.org/all/20201029163313.1766967-1-bfoster@redhat.com/>`_
-    delalloc reservations to avoid having delalloc reservations for
-    clean pagecache.
-    This function is optional.
-
 Pagecache Writeback Completion
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/block/fops.c b/block/fops.c
index 3394263d942b..b500ff8f55dd 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -537,22 +537,29 @@ static void blkdev_readahead(struct readahead_control *rac)
 	iomap_readahead(rac, &blkdev_iomap_ops);
 }
 
-static int blkdev_map_blocks(struct iomap_writepage_ctx *wpc,
-		struct inode *inode, loff_t offset, unsigned int len)
+static ssize_t blkdev_writeback_range(struct iomap_writepage_ctx *wpc,
+		struct folio *folio, u64 offset, unsigned int len, u64 end_pos)
 {
-	loff_t isize = i_size_read(inode);
+	loff_t isize = i_size_read(wpc->inode);
 
 	if (WARN_ON_ONCE(offset >= isize))
 		return -EIO;
-	if (offset >= wpc->iomap.offset &&
-	    offset < wpc->iomap.offset + wpc->iomap.length)
-		return 0;
-	return blkdev_iomap_begin(inode, offset, isize - offset,
-				  IOMAP_WRITE, &wpc->iomap, NULL);
+
+	if (offset < wpc->iomap.offset ||
+	    offset >= wpc->iomap.offset + wpc->iomap.length) {
+		int error;
+
+		error = blkdev_iomap_begin(wpc->inode, offset, isize - offset,
+				IOMAP_WRITE, &wpc->iomap, NULL);
+		if (error)
+			return error;
+	}
+
+	return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
 }
 
 static const struct iomap_writeback_ops blkdev_writeback_ops = {
-	.map_blocks		= blkdev_map_blocks,
+	.writeback_range	= blkdev_writeback_range,
 };
 
 static int blkdev_writepages(struct address_space *mapping,
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 7703d0471139..0cc41de54aba 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -2469,23 +2469,25 @@ int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length)
 	return error;
 }
 
-static int gfs2_map_blocks(struct iomap_writepage_ctx *wpc, struct inode *inode,
-		loff_t offset, unsigned int len)
+static ssize_t gfs2_writeback_range(struct iomap_writepage_ctx *wpc,
+		struct folio *folio, u64 offset, unsigned int len, u64 end_pos)
 {
-	int ret;
-
-	if (WARN_ON_ONCE(gfs2_is_stuffed(GFS2_I(inode))))
+	if (WARN_ON_ONCE(gfs2_is_stuffed(GFS2_I(wpc->inode))))
 		return -EIO;
 
-	if (offset >= wpc->iomap.offset &&
-	    offset < wpc->iomap.offset + wpc->iomap.length)
-		return 0;
+	if (offset < wpc->iomap.offset ||
+	    offset >= wpc->iomap.offset + wpc->iomap.length) {
+		int ret;
 
-	memset(&wpc->iomap, 0, sizeof(wpc->iomap));
-	ret = gfs2_iomap_get(inode, offset, INT_MAX, &wpc->iomap);
-	return ret;
+		memset(&wpc->iomap, 0, sizeof(wpc->iomap));
+		ret = gfs2_iomap_get(wpc->inode, offset, INT_MAX, &wpc->iomap);
+		if (ret)
+			return ret;
+	}
+
+	return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
 }
 
 const struct iomap_writeback_ops gfs2_writeback_ops = {
-	.map_blocks		= gfs2_map_blocks,
+	.writeback_range	= gfs2_writeback_range,
 };
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 11a55da26a6f..80d8acfaa068 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1676,14 +1676,30 @@ static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
  * At the end of a writeback pass, there will be a cached ioend remaining on the
  * writepage context that the caller will need to submit.
  */
-static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
-		struct folio *folio, loff_t pos, loff_t end_pos, unsigned len)
+ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
+		loff_t pos, loff_t end_pos, unsigned int dirty_len)
 {
 	struct iomap_folio_state *ifs = folio->private;
 	size_t poff = offset_in_folio(folio, pos);
 	unsigned int ioend_flags = 0;
+	unsigned int map_len = min_t(u64, dirty_len,
+		wpc->iomap.offset + wpc->iomap.length - pos);
 	int error;
 
+	trace_iomap_add_to_ioend(wpc->inode, pos, dirty_len, &wpc->iomap);
+
+	WARN_ON_ONCE(!folio->private && map_len < dirty_len);
+
+	switch (wpc->iomap.type) {
+	case IOMAP_INLINE:
+		WARN_ON_ONCE(1);
+		return -EIO;
+	case IOMAP_HOLE:
+		return map_len;
+	default:
+		break;
+	}
+
 	if (wpc->iomap.type == IOMAP_UNWRITTEN)
 		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
 	if (wpc->iomap.flags & IOMAP_F_SHARED)
@@ -1701,11 +1717,11 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
 	}
 
-	if (!bio_add_folio(&wpc->ioend->io_bio, folio, len, poff))
+	if (!bio_add_folio(&wpc->ioend->io_bio, folio, map_len, poff))
 		goto new_ioend;
 
 	if (ifs)
-		atomic_add(len, &ifs->write_bytes_pending);
+		atomic_add(map_len, &ifs->write_bytes_pending);
 
 	/*
 	 * Clamp io_offset and io_size to the incore EOF so that ondisk
@@ -1748,63 +1764,34 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 	 * Note that this defeats the ability to chain the ioends of
 	 * appending writes.
 	 */
-	wpc->ioend->io_size += len;
+	wpc->ioend->io_size += map_len;
 	if (wpc->ioend->io_offset + wpc->ioend->io_size > end_pos)
 		wpc->ioend->io_size = end_pos - wpc->ioend->io_offset;
 
-	wbc_account_cgroup_owner(wpc->wbc, folio, len);
-	return 0;
+	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
+	return map_len;
 }
+EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
 
-static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
-		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
+static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
+		struct folio *folio, u64 pos, u32 rlen, u64 end_pos,
 		bool *wb_pending)
 {
-	int error;
-
 	do {
-		unsigned map_len;
-
-		error = wpc->ops->map_blocks(wpc, wpc->inode, pos, dirty_len);
-		if (error)
-			break;
-		trace_iomap_writepage_map(wpc->inode, pos, dirty_len,
-				&wpc->iomap);
-
-		map_len = min_t(u64, dirty_len,
-			wpc->iomap.offset + wpc->iomap.length - pos);
-		WARN_ON_ONCE(!folio->private && map_len < dirty_len);
+		ssize_t ret;
 
-		switch (wpc->iomap.type) {
-		case IOMAP_INLINE:
-			WARN_ON_ONCE(1);
-			error = -EIO;
-			break;
-		case IOMAP_HOLE:
-			break;
-		default:
-			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
-					map_len);
-			if (!error)
-				*wb_pending = true;
-			break;
-		}
-		dirty_len -= map_len;
-		pos += map_len;
-	} while (dirty_len && !error);
+		ret = wpc->ops->writeback_range(wpc, folio, pos, rlen, end_pos);
+		if (WARN_ON_ONCE(ret == 0 || ret > rlen))
+			return -EIO;
+		if (ret < 0)
+			return ret;
+		rlen -= ret;
+		pos += ret;
+		if (wpc->iomap.type != IOMAP_HOLE)
+			*wb_pending = true;
+	} while (rlen);
 
-	/*
-	 * We cannot cancel the ioend directly here on error.  We may have
-	 * already set other pages under writeback and hence we have to run I/O
-	 * completion to mark the error state of the pages under writeback
-	 * appropriately.
-	 *
-	 * Just let the file system know what portion of the folio failed to
-	 * map.
-	 */
-	if (error && wpc->ops->discard_folio)
-		wpc->ops->discard_folio(folio, pos);
-	return error;
+	return 0;
 }
 
 /*
@@ -1916,8 +1903,8 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 */
 	end_aligned = round_up(end_pos, i_blocksize(inode));
 	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
-		error = iomap_writepage_map_blocks(wpc, folio, pos, end_pos,
-				rlen, &wb_pending);
+		error = iomap_writeback_range(wpc, folio, pos, rlen, end_pos,
+				&wb_pending);
 		if (error)
 			break;
 		pos += rlen;
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index 455cc6f90be0..aaea02c9560a 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -169,7 +169,7 @@ DEFINE_EVENT(iomap_class, name,	\
 DEFINE_IOMAP_EVENT(iomap_iter_dstmap);
 DEFINE_IOMAP_EVENT(iomap_iter_srcmap);
 
-TRACE_EVENT(iomap_writepage_map,
+TRACE_EVENT(iomap_add_to_ioend,
 	TP_PROTO(struct inode *inode, u64 pos, unsigned int dirty_len,
 		 struct iomap *iomap),
 	TP_ARGS(inode, pos, dirty_len, iomap),
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 65485a52df3b..8157b6d92c8e 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -233,6 +233,47 @@ xfs_end_bio(
 	spin_unlock_irqrestore(&ip->i_ioend_lock, flags);
 }
 
+/*
+ * We cannot cancel the ioend directly on error.  We may have already set other
+ * pages under writeback and hence we have to run I/O completion to mark the
+ * error state of the pages under writeback appropriately.
+ *
+ * If the folio has delalloc blocks on it, the caller is asking us to punch them
+ * out. If we don't, we can leave a stale delalloc mapping covered by a clean
+ * page that needs to be dirtied again before the delalloc mapping can be
+ * converted. This stale delalloc mapping can trip up a later direct I/O read
+ * operation on the same region.
+ *
+ * We prevent this by truncating away the delalloc regions on the folio. Because
+ * they are delalloc, we can do this without needing a transaction. Indeed - if
+ * we get ENOSPC errors, we have to be able to do this truncation without a
+ * transaction as there is no space left for block reservation (typically why
+ * we see a ENOSPC in writeback).
+ */
+static void
+xfs_discard_folio(
+	struct folio		*folio,
+	loff_t			pos)
+{
+	struct xfs_inode	*ip = XFS_I(folio->mapping->host);
+	struct xfs_mount	*mp = ip->i_mount;
+
+	if (xfs_is_shutdown(mp))
+		return;
+
+	xfs_alert_ratelimited(mp,
+		"page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
+			folio, ip->i_ino, pos);
+
+	/*
+	 * The end of the punch range is always the offset of the first
+	 * byte of the next folio. Hence the end offset is only dependent on the
+	 * folio itself and not the start offset that is passed in.
+	 */
+	xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, pos,
+				folio_pos(folio) + folio_size(folio), NULL);
+}
+
 /*
  * Fast revalidation of the cached writeback mapping. Return true if the current
  * mapping is valid, false otherwise.
@@ -275,16 +316,17 @@ xfs_imap_valid(
 	return true;
 }
 
-static int
-xfs_map_blocks(
+static ssize_t
+xfs_writeback_range(
 	struct iomap_writepage_ctx *wpc,
-	struct inode		*inode,
-	loff_t			offset,
-	unsigned int		len)
+	struct folio		*folio,
+	u64			offset,
+	unsigned int		len,
+	u64			end_pos)
 {
-	struct xfs_inode	*ip = XFS_I(inode);
+	struct xfs_inode	*ip = XFS_I(wpc->inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	ssize_t			count = i_blocksize(inode);
+	ssize_t			count = i_blocksize(wpc->inode);
 	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, offset + count);
 	xfs_fileoff_t		cow_fsb;
@@ -292,7 +334,7 @@ xfs_map_blocks(
 	struct xfs_bmbt_irec	imap;
 	struct xfs_iext_cursor	icur;
 	int			retries = 0;
-	int			error = 0;
+	ssize_t			ret = 0;
 	unsigned int		*seq;
 
 	if (xfs_is_shutdown(mp))
@@ -316,7 +358,7 @@ xfs_map_blocks(
 	 * out that ensures that we always see the current value.
 	 */
 	if (xfs_imap_valid(wpc, ip, offset))
-		return 0;
+		goto map_blocks;
 
 	/*
 	 * If we don't have a valid map, now it's time to get a new one for this
@@ -351,7 +393,7 @@ xfs_map_blocks(
 	 */
 	if (xfs_imap_valid(wpc, ip, offset)) {
 		xfs_iunlock(ip, XFS_ILOCK_SHARED);
-		return 0;
+		goto map_blocks;
 	}
 
 	/*
@@ -389,7 +431,12 @@ xfs_map_blocks(
 
 	xfs_bmbt_to_iomap(ip, &wpc->iomap, &imap, 0, 0, XFS_WPC(wpc)->data_seq);
 	trace_xfs_map_blocks_found(ip, offset, count, whichfork, &imap);
-	return 0;
+map_blocks:
+	ret = iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
+	if (ret < 0)
+		goto out_error;
+	return ret;
+
 allocate_blocks:
 	/*
 	 * Convert a dellalloc extent to a real one. The current page is held
@@ -402,9 +449,9 @@ xfs_map_blocks(
 	else
 		seq = &XFS_WPC(wpc)->data_seq;
 
-	error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
-				&wpc->iomap, seq);
-	if (error) {
+	ret = xfs_bmapi_convert_delalloc(ip, whichfork, offset, &wpc->iomap,
+			seq);
+	if (ret) {
 		/*
 		 * If we failed to find the extent in the COW fork we might have
 		 * raced with a COW to data fork conversion or truncate.
@@ -412,10 +459,10 @@ xfs_map_blocks(
 		 * the former case, but prevent additional retries to avoid
 		 * looping forever for the latter case.
 		 */
-		if (error == -EAGAIN && whichfork == XFS_COW_FORK && !retries++)
+		if (ret == -EAGAIN && whichfork == XFS_COW_FORK && !retries++)
 			goto retry;
-		ASSERT(error != -EAGAIN);
-		return error;
+		ASSERT(ret != -EAGAIN);
+		goto out_error;
 	}
 
 	/*
@@ -433,7 +480,11 @@ xfs_map_blocks(
 	ASSERT(wpc->iomap.offset <= offset);
 	ASSERT(wpc->iomap.offset + wpc->iomap.length > offset);
 	trace_xfs_map_blocks_alloc(ip, offset, count, whichfork, &imap);
-	return 0;
+	goto map_blocks;
+
+out_error:
+	xfs_discard_folio(folio, offset);
+	return ret;
 }
 
 static bool
@@ -488,47 +539,9 @@ xfs_submit_ioend(
 	return 0;
 }
 
-/*
- * If the folio has delalloc blocks on it, the caller is asking us to punch them
- * out. If we don't, we can leave a stale delalloc mapping covered by a clean
- * page that needs to be dirtied again before the delalloc mapping can be
- * converted. This stale delalloc mapping can trip up a later direct I/O read
- * operation on the same region.
- *
- * We prevent this by truncating away the delalloc regions on the folio. Because
- * they are delalloc, we can do this without needing a transaction. Indeed - if
- * we get ENOSPC errors, we have to be able to do this truncation without a
- * transaction as there is no space left for block reservation (typically why
- * we see a ENOSPC in writeback).
- */
-static void
-xfs_discard_folio(
-	struct folio		*folio,
-	loff_t			pos)
-{
-	struct xfs_inode	*ip = XFS_I(folio->mapping->host);
-	struct xfs_mount	*mp = ip->i_mount;
-
-	if (xfs_is_shutdown(mp))
-		return;
-
-	xfs_alert_ratelimited(mp,
-		"page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
-			folio, ip->i_ino, pos);
-
-	/*
-	 * The end of the punch range is always the offset of the first
-	 * byte of the next folio. Hence the end offset is only dependent on the
-	 * folio itself and not the start offset that is passed in.
-	 */
-	xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, pos,
-				folio_pos(folio) + folio_size(folio), NULL);
-}
-
 static const struct iomap_writeback_ops xfs_writeback_ops = {
-	.map_blocks		= xfs_map_blocks,
+	.writeback_range	= xfs_writeback_range,
 	.submit_ioend		= xfs_submit_ioend,
-	.discard_folio		= xfs_discard_folio,
 };
 
 struct xfs_zoned_writepage_ctx {
@@ -542,20 +555,22 @@ XFS_ZWPC(struct iomap_writepage_ctx *ctx)
 	return container_of(ctx, struct xfs_zoned_writepage_ctx, ctx);
 }
 
-static int
-xfs_zoned_map_blocks(
+static ssize_t
+xfs_zoned_writeback_range(
 	struct iomap_writepage_ctx *wpc,
-	struct inode		*inode,
-	loff_t			offset,
-	unsigned int		len)
+	struct folio		*folio,
+	u64			offset,
+	unsigned int		len,
+	u64			end_pos)
 {
-	struct xfs_inode	*ip = XFS_I(inode);
+	struct xfs_inode	*ip = XFS_I(wpc->inode);
 	struct xfs_mount	*mp = ip->i_mount;
 	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, offset + len);
 	xfs_filblks_t		count_fsb;
 	struct xfs_bmbt_irec	imap, del;
 	struct xfs_iext_cursor	icur;
+	ssize_t			ret;
 
 	if (xfs_is_shutdown(mp))
 		return -EIO;
@@ -586,7 +601,7 @@ xfs_zoned_map_blocks(
 		imap.br_state = XFS_EXT_NORM;
 		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 		xfs_bmbt_to_iomap(ip, &wpc->iomap, &imap, 0, 0, 0);
-		return 0;
+		goto map_blocks;
 	}
 	end_fsb = min(end_fsb, imap.br_startoff + imap.br_blockcount);
 	count_fsb = end_fsb - offset_fsb;
@@ -603,9 +618,13 @@ xfs_zoned_map_blocks(
 	wpc->iomap.offset = offset;
 	wpc->iomap.length = XFS_FSB_TO_B(mp, count_fsb);
 	wpc->iomap.flags = IOMAP_F_ANON_WRITE;
-
 	trace_xfs_zoned_map_blocks(ip, offset, wpc->iomap.length);
-	return 0;
+
+map_blocks:
+	ret = iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
+	if (ret < 0)
+		xfs_discard_folio(folio, offset);
+	return ret;
 }
 
 static int
@@ -621,9 +640,8 @@ xfs_zoned_submit_ioend(
 }
 
 static const struct iomap_writeback_ops xfs_zoned_writeback_ops = {
-	.map_blocks		= xfs_zoned_map_blocks,
+	.writeback_range	= xfs_zoned_writeback_range,
 	.submit_ioend		= xfs_zoned_submit_ioend,
-	.discard_folio		= xfs_discard_folio,
 };
 
 STATIC int
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index edca4bbe4b72..c88e2c851753 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -124,29 +124,33 @@ static void zonefs_readahead(struct readahead_control *rac)
  * Map blocks for page writeback. This is used only on conventional zone files,
  * which implies that the page range can only be within the fixed inode size.
  */
-static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc,
-				   struct inode *inode, loff_t offset,
-				   unsigned int len)
+static ssize_t zonefs_writeback_range(struct iomap_writepage_ctx *wpc,
+		struct folio *folio, u64 offset, unsigned len, u64 end_pos)
 {
-	struct zonefs_zone *z = zonefs_inode_zone(inode);
+	struct zonefs_zone *z = zonefs_inode_zone(wpc->inode);
 
 	if (WARN_ON_ONCE(zonefs_zone_is_seq(z)))
 		return -EIO;
-	if (WARN_ON_ONCE(offset >= i_size_read(inode)))
+	if (WARN_ON_ONCE(offset >= i_size_read(wpc->inode)))
 		return -EIO;
 
 	/* If the mapping is already OK, nothing needs to be done */
-	if (offset >= wpc->iomap.offset &&
-	    offset < wpc->iomap.offset + wpc->iomap.length)
-		return 0;
+	if (offset < wpc->iomap.offset ||
+	    offset >= wpc->iomap.offset + wpc->iomap.length) {
+		int error;
+
+		error = zonefs_write_iomap_begin(wpc->inode, offset,
+				z->z_capacity - offset, IOMAP_WRITE,
+				&wpc->iomap, NULL);
+		if (error)
+			return error;
+	}
 
-	return zonefs_write_iomap_begin(inode, offset,
-					z->z_capacity - offset,
-					IOMAP_WRITE, &wpc->iomap, NULL);
+	return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
 }
 
 static const struct iomap_writeback_ops zonefs_writeback_ops = {
-	.map_blocks		= zonefs_write_map_blocks,
+	.writeback_range	= zonefs_writeback_range,
 };
 
 static int zonefs_writepages(struct address_space *mapping,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 00179c9387c5..063e18476286 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -416,18 +416,20 @@ static inline struct iomap_ioend *iomap_ioend_from_bio(struct bio *bio)
 
 struct iomap_writeback_ops {
 	/*
-	 * Required, maps the blocks so that writeback can be performed on
-	 * the range starting at offset.
+	 * Required, performs writeback on the passed in range
 	 *
-	 * Can return arbitrarily large regions, but we need to call into it at
+	 * Can map arbitrarily large regions, but we need to call into it at
 	 * least once per folio to allow the file systems to synchronize with
 	 * the write path that could be invalidating mappings.
 	 *
 	 * An existing mapping from a previous call to this method can be reused
 	 * by the file system if it is still valid.
+	 *
+	 * Returns the number of bytes processed or a negative errno.
 	 */
-	int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
-			  loff_t offset, unsigned len);
+	ssize_t (*writeback_range)(struct iomap_writepage_ctx *wpc,
+			struct folio *folio, u64 pos, unsigned int len,
+			u64 end_pos);
 
 	/*
 	 * Optional, allows the file systems to hook into bio submission,
@@ -438,12 +440,6 @@ struct iomap_writeback_ops {
 	 * the bio could not be submitted.
 	 */
 	int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
-
-	/*
-	 * Optional, allows the file system to discard state on a page where
-	 * we failed to submit any I/O.
-	 */
-	void (*discard_folio)(struct folio *folio, loff_t pos);
 };
 
 struct iomap_writepage_ctx {
@@ -463,6 +459,8 @@ void iomap_finish_ioends(struct iomap_ioend *ioend, int error);
 void iomap_ioend_try_merge(struct iomap_ioend *ioend,
 		struct list_head *more_ioends);
 void iomap_sort_ioends(struct list_head *ioend_list);
+ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
+		loff_t pos, loff_t end_pos, unsigned int dirty_len);
 int iomap_writepages(struct iomap_writepage_ctx *wpc);
 
 /*
-- 
2.47.1


* [PATCH v3 04/16] iomap: hide ioends from the generic writeback code
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (2 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 03/16] iomap: refactor the writeback interface Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:38   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation Joanne Koong
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

Replace the ioend pointer in struct iomap_writepage_ctx with a void
*wb_ctx one to facilitate non-block, non-ioend writeback for fuse.
Rename the submit_ioend method to writeback_submit and make it mandatory
so that the generic writeback code stops seeing ioends and bios.
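
For bio-based file systems the pending wb_ctx is still a struct
iomap_ioend, so a custom submit hook can install its own completion
handler before calling the stock ioend_writeback_submit() helper, which
only sets a default bi_end_io when none is installed yet.  A minimal
sketch (the example_* names are placeholders, not part of this patch):

static int example_writeback_submit(struct iomap_writepage_ctx *wpc,
		int error)
{
	struct iomap_ioend *ioend = wpc->wb_ctx;

	/* defer completion work to our own handler */
	ioend->io_bio.bi_end_io = example_end_bio;
	return ioend_writeback_submit(wpc, error);
}

static const struct iomap_writeback_ops example_writeback_ops = {
	.writeback_range	= example_writeback_range,
	.writeback_submit	= example_writeback_submit,
};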

Co-developed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../filesystems/iomap/operations.rst          | 16 +---
 block/fops.c                                  |  1 +
 fs/gfs2/bmap.c                                |  1 +
 fs/iomap/buffered-io.c                        | 91 ++++++++++---------
 fs/xfs/xfs_aops.c                             | 60 ++++++------
 fs/zonefs/file.c                              |  1 +
 include/linux/iomap.h                         | 19 ++--
 7 files changed, 93 insertions(+), 96 deletions(-)

diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index b28f215db6e5..ead56b27ec3f 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -285,7 +285,7 @@ The ``ops`` structure must be specified and is as follows:
  struct iomap_writeback_ops {
    ssize_t (*writeback_range)(struct iomap_writepage_ctx *wpc,
    		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
-    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
+    int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
  };
 
 The fields are as follows:
@@ -307,13 +307,7 @@ The fields are as follows:
     purpose.
     This function must be supplied by the filesystem.
 
-  - ``submit_ioend``: Allows the file systems to hook into writeback bio
-    submission.
-    This might include pre-write space accounting updates, or installing
-    a custom ``->bi_end_io`` function for internal purposes, such as
-    deferring the ioend completion to a workqueue to run metadata update
-    transactions from process context before submitting the bio.
-    This function is optional.
+  - ``writeback_submit``: Submit the previously built writeback context.
 
 Pagecache Writeback Completion
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -328,12 +322,6 @@ the address space.
 This can happen in interrupt or process context, depending on the
 storage device.
 
-Filesystems that need to update internal bookkeeping (e.g. unwritten
-extent conversions) should provide a ``->submit_ioend`` function to
-set ``struct iomap_end::bio::bi_end_io`` to its own function.
-This function should call ``iomap_finish_ioends`` after finishing its
-own work (e.g. unwritten extent conversion).
-
 Some filesystems may wish to `amortize the cost of running metadata
 transactions
 <https://lore.kernel.org/all/20220120034733.221737-1-david@fromorbit.com/>`_
diff --git a/block/fops.c b/block/fops.c
index b500ff8f55dd..cf79cbcf80f0 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -560,6 +560,7 @@ static ssize_t blkdev_writeback_range(struct iomap_writepage_ctx *wpc,
 
 static const struct iomap_writeback_ops blkdev_writeback_ops = {
 	.writeback_range	= blkdev_writeback_range,
+	.writeback_submit	= ioend_writeback_submit,
 };
 
 static int blkdev_writepages(struct address_space *mapping,
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 0cc41de54aba..3cd46a09e820 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -2490,4 +2490,5 @@ static ssize_t gfs2_writeback_range(struct iomap_writepage_ctx *wpc,
 
 const struct iomap_writeback_ops gfs2_writeback_ops = {
 	.writeback_range	= gfs2_writeback_range,
+	.writeback_submit	= ioend_writeback_submit,
 };
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 80d8acfaa068..50cfddff1393 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1579,7 +1579,7 @@ u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
 	return folio_count;
 }
 
-static void iomap_writepage_end_bio(struct bio *bio)
+static void ioend_writeback_end_bio(struct bio *bio)
 {
 	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
 
@@ -1588,42 +1588,30 @@ static void iomap_writepage_end_bio(struct bio *bio)
 }
 
 /*
- * Submit an ioend.
- *
- * If @error is non-zero, it means that we have a situation where some part of
- * the submission process has failed after we've marked pages for writeback.
- * We cannot cancel ioend directly in that case, so call the bio end I/O handler
- * with the error status here to run the normal I/O completion handler to clear
- * the writeback bit and let the file system proess the errors.
+ * We cannot cancel the ioend directly in case of an error, so call the bio end
+ * I/O handler with the error status here to run the normal I/O completion
+ * handler.
  */
-static int iomap_submit_ioend(struct iomap_writepage_ctx *wpc, int error)
+int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
 {
-	if (!wpc->ioend)
-		return error;
+	struct iomap_ioend *ioend = wpc->wb_ctx;
 
-	/*
-	 * Let the file systems prepare the I/O submission and hook in an I/O
-	 * comletion handler.  This also needs to happen in case after a
-	 * failure happened so that the file system end I/O handler gets called
-	 * to clean up.
-	 */
-	if (wpc->ops->submit_ioend) {
-		error = wpc->ops->submit_ioend(wpc, error);
-	} else {
-		if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
-			error = -EIO;
-		if (!error)
-			submit_bio(&wpc->ioend->io_bio);
-	}
+	if (!ioend->io_bio.bi_end_io)
+		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
+
+	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
+		error = -EIO;
 
 	if (error) {
-		wpc->ioend->io_bio.bi_status = errno_to_blk_status(error);
-		bio_endio(&wpc->ioend->io_bio);
+		ioend->io_bio.bi_status = errno_to_blk_status(error);
+		bio_endio(&ioend->io_bio);
+		return error;
 	}
 
-	wpc->ioend = NULL;
-	return error;
+	submit_bio(&ioend->io_bio);
+	return 0;
 }
+EXPORT_SYMBOL_GPL(ioend_writeback_submit);
 
 static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
 		loff_t pos, u16 ioend_flags)
@@ -1634,7 +1622,6 @@ static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
 			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
 			       GFP_NOFS, &iomap_ioend_bioset);
 	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
-	bio->bi_end_io = iomap_writepage_end_bio;
 	bio->bi_write_hint = wpc->inode->i_write_hint;
 	wbc_init_bio(wpc->wbc, bio);
 	wpc->nr_folios = 0;
@@ -1644,16 +1631,17 @@ static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
 static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
 		u16 ioend_flags)
 {
+	struct iomap_ioend *ioend = wpc->wb_ctx;
+
 	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
 		return false;
 	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
-	    (wpc->ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
+	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
 		return false;
-	if (pos != wpc->ioend->io_offset + wpc->ioend->io_size)
+	if (pos != ioend->io_offset + ioend->io_size)
 		return false;
 	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
-	    iomap_sector(&wpc->iomap, pos) !=
-	    bio_end_sector(&wpc->ioend->io_bio))
+	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
 		return false;
 	/*
 	 * Limit ioend bio chain lengths to minimise IO completion latency. This
@@ -1679,6 +1667,7 @@ static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
 ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 		loff_t pos, loff_t end_pos, unsigned int dirty_len)
 {
+	struct iomap_ioend *ioend = wpc->wb_ctx;
 	struct iomap_folio_state *ifs = folio->private;
 	size_t poff = offset_in_folio(folio, pos);
 	unsigned int ioend_flags = 0;
@@ -1709,15 +1698,17 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
 		ioend_flags |= IOMAP_IOEND_BOUNDARY;
 
-	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
+	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
 new_ioend:
-		error = iomap_submit_ioend(wpc, 0);
-		if (error)
-			return error;
-		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
+		if (ioend) {
+			error = wpc->ops->writeback_submit(wpc, 0);
+			if (error)
+				return error;
+		}
+		wpc->wb_ctx = ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
 	}
 
-	if (!bio_add_folio(&wpc->ioend->io_bio, folio, map_len, poff))
+	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
 		goto new_ioend;
 
 	if (ifs)
@@ -1764,9 +1755,9 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 	 * Note that this defeats the ability to chain the ioends of
 	 * appending writes.
 	 */
-	wpc->ioend->io_size += map_len;
-	if (wpc->ioend->io_offset + wpc->ioend->io_size > end_pos)
-		wpc->ioend->io_size = end_pos - wpc->ioend->io_offset;
+	ioend->io_size += map_len;
+	if (ioend->io_offset + ioend->io_size > end_pos)
+		ioend->io_size = end_pos - ioend->io_offset;
 
 	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
 	return map_len;
@@ -1956,6 +1947,18 @@ iomap_writepages(struct iomap_writepage_ctx *wpc)
 
 	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
 		error = iomap_writepage_map(wpc, folio);
-	return iomap_submit_ioend(wpc, error);
+
+	/*
+	 * If @error is non-zero, it means that we have a situation where some
+	 * part of the submission process has failed after we've marked pages
+	 * for writeback.
+	 *
+	 * We cannot cancel the writeback directly in that case, so always call
+	 * ->writeback_submit to run the I/O completion handler to clear the
+	 * writeback bit and let the file system process the errors.
+	 */
+	if (wpc->wb_ctx)
+		return wpc->ops->writeback_submit(wpc, error);
+	return error;
 }
 EXPORT_SYMBOL_GPL(iomap_writepages);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8157b6d92c8e..35ff2cfcd7e7 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -507,41 +507,40 @@ xfs_ioend_needs_wq_completion(
 }
 
 static int
-xfs_submit_ioend(
-	struct iomap_writepage_ctx *wpc,
-	int			status)
+xfs_writeback_submit(
+	struct iomap_writepage_ctx	*wpc,
+	int				error)
 {
-	struct iomap_ioend	*ioend = wpc->ioend;
-	unsigned int		nofs_flag;
+	struct iomap_ioend		*ioend = wpc->wb_ctx;
 
 	/*
-	 * We can allocate memory here while doing writeback on behalf of
-	 * memory reclaim.  To avoid memory allocation deadlocks set the
-	 * task-wide nofs context for the following operations.
+	 * Convert CoW extents to regular.
+	 *
+	 * We can allocate memory here while doing writeback on behalf of memory
+	 * reclaim.  To avoid memory allocation deadlocks, set the task-wide
+	 * nofs context.
 	 */
-	nofs_flag = memalloc_nofs_save();
+	if (!error && (ioend->io_flags & IOMAP_IOEND_SHARED)) {
+		unsigned int		nofs_flag;
 
-	/* Convert CoW extents to regular */
-	if (!status && (ioend->io_flags & IOMAP_IOEND_SHARED)) {
-		status = xfs_reflink_convert_cow(XFS_I(ioend->io_inode),
+		nofs_flag = memalloc_nofs_save();
+		error = xfs_reflink_convert_cow(XFS_I(ioend->io_inode),
 				ioend->io_offset, ioend->io_size);
+		memalloc_nofs_restore(nofs_flag);
 	}
 
-	memalloc_nofs_restore(nofs_flag);
-
-	/* send ioends that might require a transaction to the completion wq */
+	/*
+	 * Send ioends that might require a transaction to the completion wq.
+	 */
 	if (xfs_ioend_needs_wq_completion(ioend))
 		ioend->io_bio.bi_end_io = xfs_end_bio;
 
-	if (status)
-		return status;
-	submit_bio(&ioend->io_bio);
-	return 0;
+	return ioend_writeback_submit(wpc, error);
 }
 
 static const struct iomap_writeback_ops xfs_writeback_ops = {
 	.writeback_range	= xfs_writeback_range,
-	.submit_ioend		= xfs_submit_ioend,
+	.writeback_submit	= xfs_writeback_submit,
 };
 
 struct xfs_zoned_writepage_ctx {
@@ -628,20 +627,25 @@ xfs_zoned_writeback_range(
 }
 
 static int
-xfs_zoned_submit_ioend(
-	struct iomap_writepage_ctx *wpc,
-	int			status)
+xfs_zoned_writeback_submit(
+	struct iomap_writepage_ctx	*wpc,
+	int				error)
 {
-	wpc->ioend->io_bio.bi_end_io = xfs_end_bio;
-	if (status)
-		return status;
-	xfs_zone_alloc_and_submit(wpc->ioend, &XFS_ZWPC(wpc)->open_zone);
+	struct iomap_ioend		*ioend = wpc->wb_ctx;
+
+	ioend->io_bio.bi_end_io = xfs_end_bio;
+	if (error) {
+		ioend->io_bio.bi_status = errno_to_blk_status(error);
+		bio_endio(&ioend->io_bio);
+		return error;
+	}
+	xfs_zone_alloc_and_submit(ioend, &XFS_ZWPC(wpc)->open_zone);
 	return 0;
 }
 
 static const struct iomap_writeback_ops xfs_zoned_writeback_ops = {
 	.writeback_range	= xfs_zoned_writeback_range,
-	.submit_ioend		= xfs_zoned_submit_ioend,
+	.writeback_submit	= xfs_zoned_writeback_submit,
 };
 
 STATIC int
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index c88e2c851753..0c64185325d3 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -151,6 +151,7 @@ static ssize_t zonefs_writeback_range(struct iomap_writepage_ctx *wpc,
 
 static const struct iomap_writeback_ops zonefs_writeback_ops = {
 	.writeback_range	= zonefs_writeback_range,
+	.writeback_submit	= ioend_writeback_submit,
 };
 
 static int zonefs_writepages(struct address_space *mapping,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 063e18476286..047100f94092 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -391,8 +391,7 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
 /*
  * Structure for writeback I/O completions.
  *
- * File systems implementing ->submit_ioend (for buffered I/O) or ->submit_io
- * for direct I/O) can split a bio generated by iomap.  In that case the parent
+ * File systems can split a bio generated by iomap.  In that case the parent
  * ioend it was split from is recorded in ioend->io_parent.
  */
 struct iomap_ioend {
@@ -416,7 +415,7 @@ static inline struct iomap_ioend *iomap_ioend_from_bio(struct bio *bio)
 
 struct iomap_writeback_ops {
 	/*
-	 * Required, performs writeback on the passed in range
+	 * Performs writeback on the passed-in range
 	 *
 	 * Can map arbitrarily large regions, but we need to call into it at
 	 * least once per folio to allow the file systems to synchronize with
@@ -432,23 +431,22 @@ struct iomap_writeback_ops {
 			u64 end_pos);
 
 	/*
-	 * Optional, allows the file systems to hook into bio submission,
-	 * including overriding the bi_end_io handler.
+	 * Submit a writeback context previously built up by ->writeback_range.
 	 *
-	 * Returns 0 if the bio was successfully submitted, or a negative
-	 * error code if status was non-zero or another error happened and
-	 * the bio could not be submitted.
+	 * Returns 0 if the context was successfully submitted, or a negative
+	 * error code if not.  If @error is non-zero, a failure occurred and
+	 * the writeback context should be completed with an error.
 	 */
-	int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
+	int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
 };
 
 struct iomap_writepage_ctx {
 	struct iomap		iomap;
 	struct inode		*inode;
 	struct writeback_control *wbc;
-	struct iomap_ioend	*ioend;
 	const struct iomap_writeback_ops *ops;
 	u32			nr_folios;	/* folios added to the ioend */
+	void			*wb_ctx;	/* pending writeback context */
 };
 
 struct iomap_ioend *iomap_init_ioend(struct inode *inode, struct bio *bio,
@@ -461,6 +459,7 @@ void iomap_ioend_try_merge(struct iomap_ioend *ioend,
 void iomap_sort_ioends(struct list_head *ioend_list);
 ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 		loff_t pos, loff_t end_pos, unsigned int dirty_len);
+int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error);
 int iomap_writepages(struct iomap_writepage_ctx *wpc);
 
 /*
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (3 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 04/16] iomap: hide ioends from the generic writeback code Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:39   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 06/16] iomap: move all ioend handling to ioend.c Joanne Koong
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Add a new iomap_start_folio_write helper to abstract away the
write_bytes_pending handling, and export it and the existing
iomap_finish_folio_write for non-iomap writeback in fuse.
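<br>
As a rough sketch (not part of this patch), a non-iomap writeback path
could bracket its own I/O with these helpers.  This assumes a
synchronous write primitive and that writeback was already started on
the folio; fs_send_write() is a made-up stand-in for the filesystem's
own write path:

	static int fs_writeback_range(struct inode *inode,
			struct folio *folio, loff_t pos, size_t len)
	{
		int err;

		/* Account this range in write_bytes_pending (if an ifs exists). */
		iomap_start_folio_write(inode, folio, len);
		err = fs_send_write(inode, folio, pos, len);	/* hypothetical */
		/* Ends folio writeback once all pending sub-ranges finish. */
		iomap_finish_folio_write(inode, folio, len);
		return err;
	}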

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 20 +++++++++++++++-----
 include/linux/iomap.h  |  5 +++++
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 50cfddff1393..b4a8d2241d70 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1535,7 +1535,18 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops,
 }
 EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 
-static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
+void iomap_start_folio_write(struct inode *inode, struct folio *folio,
+		size_t len)
+{
+	struct iomap_folio_state *ifs = folio->private;
+
+	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
+	if (ifs)
+		atomic_add(len, &ifs->write_bytes_pending);
+}
+EXPORT_SYMBOL_GPL(iomap_start_folio_write);
+
+void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
 		size_t len)
 {
 	struct iomap_folio_state *ifs = folio->private;
@@ -1546,6 +1557,7 @@ static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
 	if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
 		folio_end_writeback(folio);
 }
+EXPORT_SYMBOL_GPL(iomap_finish_folio_write);
 
 /*
  * We're now finished for good with this ioend structure.  Update the page
@@ -1668,7 +1680,6 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 		loff_t pos, loff_t end_pos, unsigned int dirty_len)
 {
 	struct iomap_ioend *ioend = wpc->wb_ctx;
-	struct iomap_folio_state *ifs = folio->private;
 	size_t poff = offset_in_folio(folio, pos);
 	unsigned int ioend_flags = 0;
 	unsigned int map_len = min_t(u64, dirty_len,
@@ -1711,8 +1722,7 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
 		goto new_ioend;
 
-	if (ifs)
-		atomic_add(map_len, &ifs->write_bytes_pending);
+	iomap_start_folio_write(wpc->inode, folio, map_len);
 
 	/*
 	 * Clamp io_offset and io_size to the incore EOF so that ondisk
@@ -1880,7 +1890,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		 * all blocks.
 		 */
 		WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending) != 0);
-		atomic_inc(&ifs->write_bytes_pending);
+		iomap_start_folio_write(inode, folio, 1);
 	}
 
 	/*
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 047100f94092..bfd178fb7cfc 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -460,6 +460,11 @@ void iomap_sort_ioends(struct list_head *ioend_list);
 ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 		loff_t pos, loff_t end_pos, unsigned int dirty_len);
 int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error);
+
+void iomap_start_folio_write(struct inode *inode, struct folio *folio,
+		size_t len);
+void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
+		size_t len);
 int iomap_writepages(struct iomap_writepage_ctx *wpc);
 
 /*
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 06/16] iomap: move all ioend handling to ioend.c
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (4 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:40   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio Joanne Koong
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

Now that the writeback code has the proper abstractions, all the ioend
code can be self-contained in ioend.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/iomap/buffered-io.c | 215 ----------------------------------------
 fs/iomap/internal.h    |   1 -
 fs/iomap/ioend.c       | 220 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 219 insertions(+), 217 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b4a8d2241d70..c262f883f9f9 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1559,221 +1559,6 @@ void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
 }
 EXPORT_SYMBOL_GPL(iomap_finish_folio_write);
 
-/*
- * We're now finished for good with this ioend structure.  Update the page
- * state, release holds on bios, and finally free up memory.  Do not use the
- * ioend after this.
- */
-u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
-{
-	struct inode *inode = ioend->io_inode;
-	struct bio *bio = &ioend->io_bio;
-	struct folio_iter fi;
-	u32 folio_count = 0;
-
-	if (ioend->io_error) {
-		mapping_set_error(inode->i_mapping, ioend->io_error);
-		if (!bio_flagged(bio, BIO_QUIET)) {
-			pr_err_ratelimited(
-"%s: writeback error on inode %lu, offset %lld, sector %llu",
-				inode->i_sb->s_id, inode->i_ino,
-				ioend->io_offset, ioend->io_sector);
-		}
-	}
-
-	/* walk all folios in bio, ending page IO on them */
-	bio_for_each_folio_all(fi, bio) {
-		iomap_finish_folio_write(inode, fi.folio, fi.length);
-		folio_count++;
-	}
-
-	bio_put(bio);	/* frees the ioend */
-	return folio_count;
-}
-
-static void ioend_writeback_end_bio(struct bio *bio)
-{
-	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
-
-	ioend->io_error = blk_status_to_errno(bio->bi_status);
-	iomap_finish_ioend_buffered(ioend);
-}
-
-/*
- * We cannot cancel the ioend directly in case of an error, so call the bio end
- * I/O handler with the error status here to run the normal I/O completion
- * handler.
- */
-int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
-{
-	struct iomap_ioend *ioend = wpc->wb_ctx;
-
-	if (!ioend->io_bio.bi_end_io)
-		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
-
-	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
-		error = -EIO;
-
-	if (error) {
-		ioend->io_bio.bi_status = errno_to_blk_status(error);
-		bio_endio(&ioend->io_bio);
-		return error;
-	}
-
-	submit_bio(&ioend->io_bio);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(ioend_writeback_submit);
-
-static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
-		loff_t pos, u16 ioend_flags)
-{
-	struct bio *bio;
-
-	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
-			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
-			       GFP_NOFS, &iomap_ioend_bioset);
-	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
-	bio->bi_write_hint = wpc->inode->i_write_hint;
-	wbc_init_bio(wpc->wbc, bio);
-	wpc->nr_folios = 0;
-	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
-}
-
-static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
-		u16 ioend_flags)
-{
-	struct iomap_ioend *ioend = wpc->wb_ctx;
-
-	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
-		return false;
-	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
-	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
-		return false;
-	if (pos != ioend->io_offset + ioend->io_size)
-		return false;
-	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
-	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
-		return false;
-	/*
-	 * Limit ioend bio chain lengths to minimise IO completion latency. This
-	 * also prevents long tight loops ending page writeback on all the
-	 * folios in the ioend.
-	 */
-	if (wpc->nr_folios >= IOEND_BATCH_SIZE)
-		return false;
-	return true;
-}
-
-/*
- * Test to see if we have an existing ioend structure that we could append to
- * first; otherwise finish off the current ioend and start another.
- *
- * If a new ioend is created and cached, the old ioend is submitted to the block
- * layer instantly.  Batching optimisations are provided by higher level block
- * plugging.
- *
- * At the end of a writeback pass, there will be a cached ioend remaining on the
- * writepage context that the caller will need to submit.
- */
-ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
-		loff_t pos, loff_t end_pos, unsigned int dirty_len)
-{
-	struct iomap_ioend *ioend = wpc->wb_ctx;
-	size_t poff = offset_in_folio(folio, pos);
-	unsigned int ioend_flags = 0;
-	unsigned int map_len = min_t(u64, dirty_len,
-		wpc->iomap.offset + wpc->iomap.length - pos);
-	int error;
-
-	trace_iomap_add_to_ioend(wpc->inode, pos, dirty_len, &wpc->iomap);
-
-	WARN_ON_ONCE(!folio->private && map_len < dirty_len);
-
-	switch (wpc->iomap.type) {
-	case IOMAP_INLINE:
-		WARN_ON_ONCE(1);
-		return -EIO;
-	case IOMAP_HOLE:
-		return map_len;
-	default:
-		break;
-	}
-
-	if (wpc->iomap.type == IOMAP_UNWRITTEN)
-		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
-	if (wpc->iomap.flags & IOMAP_F_SHARED)
-		ioend_flags |= IOMAP_IOEND_SHARED;
-	if (folio_test_dropbehind(folio))
-		ioend_flags |= IOMAP_IOEND_DONTCACHE;
-	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
-		ioend_flags |= IOMAP_IOEND_BOUNDARY;
-
-	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
-new_ioend:
-		if (ioend) {
-			error = wpc->ops->writeback_submit(wpc, 0);
-			if (error)
-				return error;
-		}
-		wpc->wb_ctx = ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
-	}
-
-	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
-		goto new_ioend;
-
-	iomap_start_folio_write(wpc->inode, folio, map_len);
-
-	/*
-	 * Clamp io_offset and io_size to the incore EOF so that ondisk
-	 * file size updates in the ioend completion are byte-accurate.
-	 * This avoids recovering files with zeroed tail regions when
-	 * writeback races with appending writes:
-	 *
-	 *    Thread 1:                  Thread 2:
-	 *    ------------               -----------
-	 *    write [A, A+B]
-	 *    update inode size to A+B
-	 *    submit I/O [A, A+BS]
-	 *                               write [A+B, A+B+C]
-	 *                               update inode size to A+B+C
-	 *    <I/O completes, updates disk size to min(A+B+C, A+BS)>
-	 *    <power failure>
-	 *
-	 *  After reboot:
-	 *    1) with A+B+C < A+BS, the file has zero padding in range
-	 *       [A+B, A+B+C]
-	 *
-	 *    |<     Block Size (BS)   >|
-	 *    |DDDDDDDDDDDD0000000000000|
-	 *    ^           ^        ^
-	 *    A          A+B     A+B+C
-	 *                       (EOF)
-	 *
-	 *    2) with A+B+C > A+BS, the file has zero padding in range
-	 *       [A+B, A+BS]
-	 *
-	 *    |<     Block Size (BS)   >|<     Block Size (BS)    >|
-	 *    |DDDDDDDDDDDD0000000000000|00000000000000000000000000|
-	 *    ^           ^             ^           ^
-	 *    A          A+B           A+BS       A+B+C
-	 *                             (EOF)
-	 *
-	 *    D = Valid Data
-	 *    0 = Zero Padding
-	 *
-	 * Note that this defeats the ability to chain the ioends of
-	 * appending writes.
-	 */
-	ioend->io_size += map_len;
-	if (ioend->io_offset + ioend->io_size > end_pos)
-		ioend->io_size = end_pos - ioend->io_offset;
-
-	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
-	return map_len;
-}
-EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
-
 static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
 		struct folio *folio, u64 pos, u32 rlen, u64 end_pos,
 		bool *wb_pending)
diff --git a/fs/iomap/internal.h b/fs/iomap/internal.h
index f6992a3bf66a..d05cb3aed96e 100644
--- a/fs/iomap/internal.h
+++ b/fs/iomap/internal.h
@@ -4,7 +4,6 @@
 
 #define IOEND_BATCH_SIZE	4096
 
-u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend);
 u32 iomap_finish_ioend_direct(struct iomap_ioend *ioend);
 
 #endif /* _IOMAP_INTERNAL_H */
diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
index 18894ebba6db..81f4bac5a3a9 100644
--- a/fs/iomap/ioend.c
+++ b/fs/iomap/ioend.c
@@ -1,10 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Copyright (c) 2024-2025 Christoph Hellwig.
+ * Copyright (c) 2016-2025 Christoph Hellwig.
  */
 #include <linux/iomap.h>
 #include <linux/list_sort.h>
+#include <linux/pagemap.h>
+#include <linux/writeback.h>
 #include "internal.h"
+#include "trace.h"
 
 struct bio_set iomap_ioend_bioset;
 EXPORT_SYMBOL_GPL(iomap_ioend_bioset);
@@ -28,6 +31,221 @@ struct iomap_ioend *iomap_init_ioend(struct inode *inode,
 }
 EXPORT_SYMBOL_GPL(iomap_init_ioend);
 
+/*
+ * We're now finished for good with this ioend structure.  Update the folio
+ * state, release holds on bios, and finally free up memory.  Do not use the
+ * ioend after this.
+ */
+static u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
+{
+	struct inode *inode = ioend->io_inode;
+	struct bio *bio = &ioend->io_bio;
+	struct folio_iter fi;
+	u32 folio_count = 0;
+
+	if (ioend->io_error) {
+		mapping_set_error(inode->i_mapping, ioend->io_error);
+		if (!bio_flagged(bio, BIO_QUIET)) {
+			pr_err_ratelimited(
+"%s: writeback error on inode %lu, offset %lld, sector %llu",
+				inode->i_sb->s_id, inode->i_ino,
+				ioend->io_offset, ioend->io_sector);
+		}
+	}
+
+	/* walk all folios in bio, ending page IO on them */
+	bio_for_each_folio_all(fi, bio) {
+		iomap_finish_folio_write(inode, fi.folio, fi.length);
+		folio_count++;
+	}
+
+	bio_put(bio);	/* frees the ioend */
+	return folio_count;
+}
+
+static void ioend_writeback_end_bio(struct bio *bio)
+{
+	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
+
+	ioend->io_error = blk_status_to_errno(bio->bi_status);
+	iomap_finish_ioend_buffered(ioend);
+}
+
+/*
+ * We cannot cancel the ioend directly in case of an error, so call the bio end
+ * I/O handler with the error status here to run the normal I/O completion
+ * handler.
+ */
+int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
+{
+	struct iomap_ioend *ioend = wpc->wb_ctx;
+
+	if (!ioend->io_bio.bi_end_io)
+		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
+
+	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
+		error = -EIO;
+
+	if (error) {
+		ioend->io_bio.bi_status = errno_to_blk_status(error);
+		bio_endio(&ioend->io_bio);
+		return error;
+	}
+
+	submit_bio(&ioend->io_bio);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ioend_writeback_submit);
+
+static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
+		loff_t pos, u16 ioend_flags)
+{
+	struct bio *bio;
+
+	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
+			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
+			       GFP_NOFS, &iomap_ioend_bioset);
+	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
+	bio->bi_write_hint = wpc->inode->i_write_hint;
+	wbc_init_bio(wpc->wbc, bio);
+	wpc->nr_folios = 0;
+	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
+}
+
+static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
+		u16 ioend_flags)
+{
+	struct iomap_ioend *ioend = wpc->wb_ctx;
+
+	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
+		return false;
+	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
+	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
+		return false;
+	if (pos != ioend->io_offset + ioend->io_size)
+		return false;
+	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
+	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
+		return false;
+	/*
+	 * Limit ioend bio chain lengths to minimise IO completion latency. This
+	 * also prevents long tight loops ending page writeback on all the
+	 * folios in the ioend.
+	 */
+	if (wpc->nr_folios >= IOEND_BATCH_SIZE)
+		return false;
+	return true;
+}
+
+/*
+ * Test to see if we have an existing ioend structure that we could append to
+ * first; otherwise finish off the current ioend and start another.
+ *
+ * If a new ioend is created and cached, the old ioend is submitted to the block
+ * layer instantly.  Batching optimisations are provided by higher level block
+ * plugging.
+ *
+ * At the end of a writeback pass, there will be a cached ioend remaining on the
+ * writepage context that the caller will need to submit.
+ */
+ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
+		loff_t pos, loff_t end_pos, unsigned int dirty_len)
+{
+	struct iomap_ioend *ioend = wpc->wb_ctx;
+	size_t poff = offset_in_folio(folio, pos);
+	unsigned int ioend_flags = 0;
+	unsigned int map_len = min_t(u64, dirty_len,
+		wpc->iomap.offset + wpc->iomap.length - pos);
+	int error;
+
+	trace_iomap_add_to_ioend(wpc->inode, pos, dirty_len, &wpc->iomap);
+
+	WARN_ON_ONCE(!folio->private && map_len < dirty_len);
+
+	switch (wpc->iomap.type) {
+	case IOMAP_INLINE:
+		WARN_ON_ONCE(1);
+		return -EIO;
+	case IOMAP_HOLE:
+		return map_len;
+	default:
+		break;
+	}
+
+	if (wpc->iomap.type == IOMAP_UNWRITTEN)
+		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
+	if (wpc->iomap.flags & IOMAP_F_SHARED)
+		ioend_flags |= IOMAP_IOEND_SHARED;
+	if (folio_test_dropbehind(folio))
+		ioend_flags |= IOMAP_IOEND_DONTCACHE;
+	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
+		ioend_flags |= IOMAP_IOEND_BOUNDARY;
+
+	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
+new_ioend:
+		if (ioend) {
+			error = wpc->ops->writeback_submit(wpc, 0);
+			if (error)
+				return error;
+		}
+		wpc->wb_ctx = ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
+	}
+
+	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
+		goto new_ioend;
+
+	iomap_start_folio_write(wpc->inode, folio, map_len);
+
+	/*
+	 * Clamp io_offset and io_size to the incore EOF so that ondisk
+	 * file size updates in the ioend completion are byte-accurate.
+	 * This avoids recovering files with zeroed tail regions when
+	 * writeback races with appending writes:
+	 *
+	 *    Thread 1:                  Thread 2:
+	 *    ------------               -----------
+	 *    write [A, A+B]
+	 *    update inode size to A+B
+	 *    submit I/O [A, A+BS]
+	 *                               write [A+B, A+B+C]
+	 *                               update inode size to A+B+C
+	 *    <I/O completes, updates disk size to min(A+B+C, A+BS)>
+	 *    <power failure>
+	 *
+	 *  After reboot:
+	 *    1) with A+B+C < A+BS, the file has zero padding in range
+	 *       [A+B, A+B+C]
+	 *
+	 *    |<     Block Size (BS)   >|
+	 *    |DDDDDDDDDDDD0000000000000|
+	 *    ^           ^        ^
+	 *    A          A+B     A+B+C
+	 *                       (EOF)
+	 *
+	 *    2) with A+B+C > A+BS, the file has zero padding in range
+	 *       [A+B, A+BS]
+	 *
+	 *    |<     Block Size (BS)   >|<     Block Size (BS)    >|
+	 *    |DDDDDDDDDDDD0000000000000|00000000000000000000000000|
+	 *    ^           ^             ^           ^
+	 *    A          A+B           A+BS       A+B+C
+	 *                             (EOF)
+	 *
+	 *    D = Valid Data
+	 *    0 = Zero Padding
+	 *
+	 * Note that this defeats the ability to chain the ioends of
+	 * appending writes.
+	 */
+	ioend->io_size += map_len;
+	if (ioend->io_offset + ioend->io_size > end_pos)
+		ioend->io_size = end_pos - ioend->io_offset;
+
+	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
+	return map_len;
+}
+EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
+
 static u32 iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 {
 	if (ioend->io_parent) {
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (5 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 06/16] iomap: move all ioend handling to ioend.c Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:41   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio Joanne Koong
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

->writepage is gone, and our naming wasn't always that great to start
with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c262f883f9f9..c6bbee68812e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1586,7 +1586,7 @@ static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
  * If the folio is entirely beyond i_size, return false.  If it straddles
  * i_size, adjust end_pos and zero all data beyond i_size.
  */
-static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
+static bool iomap_writeback_handle_eof(struct folio *folio, struct inode *inode,
 		u64 *end_pos)
 {
 	u64 isize = i_size_read(inode);
@@ -1638,7 +1638,7 @@ static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
 	return true;
 }
 
-static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
+static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
 		struct folio *folio)
 {
 	struct iomap_folio_state *ifs = folio->private;
@@ -1656,7 +1656,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 
 	trace_iomap_writepage(inode, pos, folio_size(folio));
 
-	if (!iomap_writepage_handle_eof(folio, inode, &end_pos)) {
+	if (!iomap_writeback_handle_eof(folio, inode, &end_pos)) {
 		folio_unlock(folio);
 		return 0;
 	}
@@ -1741,7 +1741,7 @@ iomap_writepages(struct iomap_writepage_ctx *wpc)
 		return -EIO;
 
 	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
-		error = iomap_writepage_map(wpc, folio);
+		error = iomap_writeback_folio(wpc, folio);
 
 	/*
 	 * If @error is non-zero, it means that we have a situation where some
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (6 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:43   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 09/16] iomap: export iomap_writeback_folio Joanne Koong
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Move unlocking the folio out of iomap_writeback_folio into the caller.
This means the end-of-writeback machinery now runs with the folio locked
when no writeback happened, or when writeback completed extremely fast.

This prepares for exporting iomap_writeback_folio for use in folio
laundering.
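<br>
Callers are now expected to do the unlock themselves, i.e.:

	error = iomap_writeback_folio(wpc, folio);
	folio_unlock(folio);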

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c6bbee68812e..2973fced2a52 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1656,10 +1656,8 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
 
 	trace_iomap_writepage(inode, pos, folio_size(folio));
 
-	if (!iomap_writeback_handle_eof(folio, inode, &end_pos)) {
-		folio_unlock(folio);
+	if (!iomap_writeback_handle_eof(folio, inode, &end_pos))
 		return 0;
-	}
 	WARN_ON_ONCE(end_pos <= pos);
 
 	if (i_blocks_per_folio(inode, folio) > 1) {
@@ -1713,7 +1711,6 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
 	 * already at this point.  In that case we need to clear the writeback
 	 * bit ourselves right after unlocking the page.
 	 */
-	folio_unlock(folio);
 	if (ifs) {
 		if (atomic_dec_and_test(&ifs->write_bytes_pending))
 			folio_end_writeback(folio);
@@ -1740,8 +1737,10 @@ iomap_writepages(struct iomap_writepage_ctx *wpc)
 			PF_MEMALLOC))
 		return -EIO;
 
-	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
+	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error))) {
 		error = iomap_writeback_folio(wpc, folio);
+		folio_unlock(folio);
+	}
 
 	/*
 	 * If @error is non-zero, it means that we have a situation where some
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 09/16] iomap: export iomap_writeback_folio
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (7 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:44   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops Joanne Koong
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

Allow fuse to use iomap_writeback_folio for folio laundering.  Note
that the caller needs to manually submit the pending writeback context.
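<br>
For illustration only, a launder path built on this export might look
roughly like the following; the ops structure is hypothetical, and wbc
setup and error handling are elided:

	static int fs_launder_folio(struct folio *folio)
	{
		struct iomap_writepage_ctx wpc = {
			.inode	= folio->mapping->host,
			.ops	= &fs_writeback_ops,	/* hypothetical */
		};
		int error;

		error = iomap_writeback_folio(&wpc, folio);
		/* iomap does not submit the pending context for us here. */
		if (wpc.wb_ctx)
			error = wpc.ops->writeback_submit(&wpc, error);
		return error;
	}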

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/iomap/buffered-io.c | 4 ++--
 include/linux/iomap.h  | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 2973fced2a52..d7fa885b1a0c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1638,8 +1638,7 @@ static bool iomap_writeback_handle_eof(struct folio *folio, struct inode *inode,
 	return true;
 }
 
-static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
-		struct folio *folio)
+int iomap_writeback_folio(struct iomap_writepage_ctx *wpc, struct folio *folio)
 {
 	struct iomap_folio_state *ifs = folio->private;
 	struct inode *inode = wpc->inode;
@@ -1721,6 +1720,7 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
 	mapping_set_error(inode->i_mapping, error);
 	return error;
 }
+EXPORT_SYMBOL_GPL(iomap_writeback_folio);
 
 int
 iomap_writepages(struct iomap_writepage_ctx *wpc)
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index bfd178fb7cfc..7e06b3a392f8 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -465,6 +465,7 @@ void iomap_start_folio_write(struct inode *inode, struct folio *folio,
 		size_t len);
 void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
 		size_t len);
+int iomap_writeback_folio(struct iomap_writepage_ctx *wpc, struct folio *folio);
 int iomap_writepages(struct iomap_writepage_ctx *wpc);
 
 /*
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (8 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 09/16] iomap: export iomap_writeback_folio Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:48   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes Joanne Koong
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

From: Christoph Hellwig <hch@lst.de>

The iomap_folio_ops are only used for buffered writes, including
the zero and unshare variants.  Rename them to iomap_write_ops
to better describe their usage, and pass them through the callchain
like the other operation-specific methods instead of through the
iomap.
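<br>
As a sketch of the resulting calling convention (the fs_ names are
hypothetical, the iomap APIs are the ones changed by this patch):

	static const struct iomap_write_ops fs_write_ops = {
		.iomap_valid	= fs_iomap_valid,
	};

	static ssize_t fs_buffered_write(struct kiocb *iocb,
			struct iov_iter *from)
	{
		return iomap_file_buffered_write(iocb, from, &fs_iomap_ops,
				&fs_write_ops, NULL);
	}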

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/filesystems/iomap/design.rst    |  3 -
 .../filesystems/iomap/operations.rst          |  8 +-
 block/fops.c                                  |  3 +-
 fs/gfs2/bmap.c                                | 21 ++---
 fs/gfs2/bmap.h                                |  1 +
 fs/gfs2/file.c                                |  3 +-
 fs/iomap/buffered-io.c                        | 79 +++++++++++--------
 fs/xfs/xfs_file.c                             |  6 +-
 fs/xfs/xfs_iomap.c                            | 12 ++-
 fs/xfs/xfs_iomap.h                            |  1 +
 fs/xfs/xfs_reflink.c                          |  3 +-
 fs/zonefs/file.c                              |  3 +-
 include/linux/iomap.h                         | 22 +++---
 13 files changed, 89 insertions(+), 76 deletions(-)

diff --git a/Documentation/filesystems/iomap/design.rst b/Documentation/filesystems/iomap/design.rst
index f2df9b6df988..0f7672676c0b 100644
--- a/Documentation/filesystems/iomap/design.rst
+++ b/Documentation/filesystems/iomap/design.rst
@@ -167,7 +167,6 @@ structure below:
      struct dax_device   *dax_dev;
      void                *inline_data;
      void                *private;
-     const struct iomap_folio_ops *folio_ops;
      u64                 validity_cookie;
  };
 
@@ -292,8 +291,6 @@ The fields are as follows:
    <https://lore.kernel.org/all/20180619164137.13720-7-hch@lst.de/>`_.
    This value will be passed unchanged to ``->iomap_end``.
 
- * ``folio_ops`` will be covered in the section on pagecache operations.
-
  * ``validity_cookie`` is a magic freshness value set by the filesystem
    that should be used to detect stale mappings.
    For pagecache operations this is critical for correct operation
diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index ead56b27ec3f..1f5732835567 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -57,16 +57,12 @@ The following address space operations can be wrapped easily:
  * ``bmap``
  * ``swap_activate``
 
-``struct iomap_folio_ops``
+``struct iomap_write_ops``
 --------------------------
 
-The ``->iomap_begin`` function for pagecache operations may set the
-``struct iomap::folio_ops`` field to an ops structure to override
-default behaviors of iomap:
-
 .. code-block:: c
 
- struct iomap_folio_ops {
+ struct iomap_write_ops {
      struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
                                 unsigned len);
      void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
diff --git a/block/fops.c b/block/fops.c
index cf79cbcf80f0..9e77be5ccb5f 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -723,7 +723,8 @@ blkdev_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 static ssize_t blkdev_buffered_write(struct kiocb *iocb, struct iov_iter *from)
 {
-	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL);
+	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL,
+			NULL);
 }
 
 /*
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 3cd46a09e820..a7832636da34 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -963,12 +963,16 @@ static struct folio *
 gfs2_iomap_get_folio(struct iomap_iter *iter, loff_t pos, unsigned len)
 {
 	struct inode *inode = iter->inode;
+	struct gfs2_inode *ip = GFS2_I(inode);
 	unsigned int blockmask = i_blocksize(inode) - 1;
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	unsigned int blocks;
 	struct folio *folio;
 	int status;
 
+	if (!gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
+		return iomap_get_folio(iter, pos, len);
+
 	blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
 	status = gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
 	if (status)
@@ -987,7 +991,7 @@ static void gfs2_iomap_put_folio(struct inode *inode, loff_t pos,
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 
-	if (!gfs2_is_stuffed(ip))
+	if (gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
 		gfs2_trans_add_databufs(ip->i_gl, folio,
 					offset_in_folio(folio, pos),
 					copied);
@@ -995,13 +999,14 @@ static void gfs2_iomap_put_folio(struct inode *inode, loff_t pos,
 	folio_unlock(folio);
 	folio_put(folio);
 
-	if (tr->tr_num_buf_new)
-		__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
-
-	gfs2_trans_end(sdp);
+	if (gfs2_is_jdata(ip) || gfs2_is_stuffed(ip)) {
+		if (tr->tr_num_buf_new)
+			__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
+		gfs2_trans_end(sdp);
+	}
 }
 
-static const struct iomap_folio_ops gfs2_iomap_folio_ops = {
+const struct iomap_write_ops gfs2_iomap_write_ops = {
 	.get_folio = gfs2_iomap_get_folio,
 	.put_folio = gfs2_iomap_put_folio,
 };
@@ -1078,8 +1083,6 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 		gfs2_trans_end(sdp);
 	}
 
-	if (gfs2_is_stuffed(ip) || gfs2_is_jdata(ip))
-		iomap->folio_ops = &gfs2_iomap_folio_ops;
 	return 0;
 
 out_trans_end:
@@ -1304,7 +1307,7 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from, loff_t length
 		return 0;
 	length = min(length, inode->i_size - from);
 	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
-			NULL);
+			&gfs2_iomap_write_ops, NULL);
 }
 
 #define GFS2_JTRUNC_REVOKES 8192
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index 4e8b1e8ebdf3..6cdc72dd55a3 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -44,6 +44,7 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
 }
 
 extern const struct iomap_ops gfs2_iomap_ops;
+extern const struct iomap_write_ops gfs2_iomap_write_ops;
 extern const struct iomap_writeback_ops gfs2_writeback_ops;
 
 int gfs2_unstuff_dinode(struct gfs2_inode *ip);
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index fd1147aa3891..2908f5bee21d 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1058,7 +1058,8 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	}
 
 	pagefault_disable();
-	ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops, NULL);
+	ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops,
+			&gfs2_iomap_write_ops, NULL);
 	pagefault_enable();
 	if (ret > 0)
 		written += ret;
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d7fa885b1a0c..f6e410c9ea7b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -742,28 +742,27 @@ static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
 	return 0;
 }
 
-static struct folio *__iomap_get_folio(struct iomap_iter *iter, size_t len)
+static struct folio *__iomap_get_folio(struct iomap_iter *iter,
+		const struct iomap_write_ops *write_ops, size_t len)
 {
-	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
 	loff_t pos = iter->pos;
 
 	if (!mapping_large_folio_support(iter->inode->i_mapping))
 		len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
 
-	if (folio_ops && folio_ops->get_folio)
-		return folio_ops->get_folio(iter, pos, len);
-	else
-		return iomap_get_folio(iter, pos, len);
+	if (write_ops && write_ops->get_folio)
+		return write_ops->get_folio(iter, pos, len);
+	return iomap_get_folio(iter, pos, len);
 }
 
-static void __iomap_put_folio(struct iomap_iter *iter, size_t ret,
+static void __iomap_put_folio(struct iomap_iter *iter,
+		const struct iomap_write_ops *write_ops, size_t ret,
 		struct folio *folio)
 {
-	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
 	loff_t pos = iter->pos;
 
-	if (folio_ops && folio_ops->put_folio) {
-		folio_ops->put_folio(iter->inode, pos, ret, folio);
+	if (write_ops && write_ops->put_folio) {
+		write_ops->put_folio(iter->inode, pos, ret, folio);
 	} else {
 		folio_unlock(folio);
 		folio_put(folio);
@@ -800,10 +799,10 @@ static int iomap_write_begin_inline(const struct iomap_iter *iter,
  * offset, and length. Callers can optionally pass a max length *plen,
  * otherwise init to zero.
  */
-static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
+static int iomap_write_begin(struct iomap_iter *iter,
+		const struct iomap_write_ops *write_ops, struct folio **foliop,
 		size_t *poffset, u64 *plen)
 {
-	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	loff_t pos = iter->pos;
 	u64 len = min_t(u64, SIZE_MAX, iomap_length(iter));
@@ -818,7 +817,7 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
 	if (fatal_signal_pending(current))
 		return -EINTR;
 
-	folio = __iomap_get_folio(iter, len);
+	folio = __iomap_get_folio(iter, write_ops, len);
 	if (IS_ERR(folio))
 		return PTR_ERR(folio);
 
@@ -832,8 +831,8 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
 	 * could do the wrong thing here (zero a page range incorrectly or fail
 	 * to zero) and corrupt data.
 	 */
-	if (folio_ops && folio_ops->iomap_valid) {
-		bool iomap_valid = folio_ops->iomap_valid(iter->inode,
+	if (write_ops && write_ops->iomap_valid) {
+		bool iomap_valid = write_ops->iomap_valid(iter->inode,
 							 &iter->iomap);
 		if (!iomap_valid) {
 			iter->iomap.flags |= IOMAP_F_STALE;
@@ -859,8 +858,7 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
 	return 0;
 
 out_unlock:
-	__iomap_put_folio(iter, 0, folio);
-
+	__iomap_put_folio(iter, write_ops, 0, folio);
 	return status;
 }
 
@@ -932,7 +930,8 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
 	return __iomap_write_end(iter->inode, pos, len, copied, folio);
 }
 
-static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
+static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
+		const struct iomap_write_ops *write_ops)
 {
 	ssize_t total_written = 0;
 	int status = 0;
@@ -976,7 +975,8 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			break;
 		}
 
-		status = iomap_write_begin(iter, &folio, &offset, &bytes);
+		status = iomap_write_begin(iter, write_ops, &folio, &offset,
+				&bytes);
 		if (unlikely(status)) {
 			iomap_write_failed(iter->inode, iter->pos, bytes);
 			break;
@@ -1005,7 +1005,7 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			i_size_write(iter->inode, pos + written);
 			iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
 		}
-		__iomap_put_folio(iter, written, folio);
+		__iomap_put_folio(iter, write_ops, written, folio);
 
 		if (old_size < pos)
 			pagecache_isize_extended(iter->inode, old_size, pos);
@@ -1038,7 +1038,8 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 
 ssize_t
 iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
-		const struct iomap_ops *ops, void *private)
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private)
 {
 	struct iomap_iter iter = {
 		.inode		= iocb->ki_filp->f_mapping->host,
@@ -1055,7 +1056,7 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 		iter.flags |= IOMAP_DONTCACHE;
 
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.status = iomap_write_iter(&iter, i);
+		iter.status = iomap_write_iter(&iter, i, write_ops);
 
 	if (unlikely(iter.pos == iocb->ki_pos))
 		return ret;
@@ -1289,7 +1290,8 @@ void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
 }
 EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
 
-static int iomap_unshare_iter(struct iomap_iter *iter)
+static int iomap_unshare_iter(struct iomap_iter *iter,
+		const struct iomap_write_ops *write_ops)
 {
 	struct iomap *iomap = &iter->iomap;
 	u64 bytes = iomap_length(iter);
@@ -1304,14 +1306,15 @@ static int iomap_unshare_iter(struct iomap_iter *iter)
 		bool ret;
 
 		bytes = min_t(u64, SIZE_MAX, bytes);
-		status = iomap_write_begin(iter, &folio, &offset, &bytes);
+		status = iomap_write_begin(iter, write_ops, &folio, &offset,
+				&bytes);
 		if (unlikely(status))
 			return status;
 		if (iomap->flags & IOMAP_F_STALE)
 			break;
 
 		ret = iomap_write_end(iter, bytes, bytes, folio);
-		__iomap_put_folio(iter, bytes, folio);
+		__iomap_put_folio(iter, write_ops, bytes, folio);
 		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
@@ -1329,7 +1332,8 @@ static int iomap_unshare_iter(struct iomap_iter *iter)
 
 int
 iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
-		const struct iomap_ops *ops)
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops)
 {
 	struct iomap_iter iter = {
 		.inode		= inode,
@@ -1344,7 +1348,7 @@ iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
 
 	iter.len = min(len, size - pos);
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.status = iomap_unshare_iter(&iter);
+		iter.status = iomap_unshare_iter(&iter, write_ops);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iomap_file_unshare);
@@ -1363,7 +1367,8 @@ static inline int iomap_zero_iter_flush_and_stale(struct iomap_iter *i)
 	return filemap_write_and_wait_range(mapping, i->pos, end);
 }
 
-static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
+static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
+		const struct iomap_write_ops *write_ops)
 {
 	u64 bytes = iomap_length(iter);
 	int status;
@@ -1374,7 +1379,8 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		bool ret;
 
 		bytes = min_t(u64, SIZE_MAX, bytes);
-		status = iomap_write_begin(iter, &folio, &offset, &bytes);
+		status = iomap_write_begin(iter, write_ops, &folio, &offset,
+				&bytes);
 		if (status)
 			return status;
 		if (iter->iomap.flags & IOMAP_F_STALE)
@@ -1387,7 +1393,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		folio_mark_accessed(folio);
 
 		ret = iomap_write_end(iter, bytes, bytes, folio);
-		__iomap_put_folio(iter, bytes, folio);
+		__iomap_put_folio(iter, write_ops, bytes, folio);
 		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
@@ -1403,7 +1409,8 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 
 int
 iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
-		const struct iomap_ops *ops, void *private)
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private)
 {
 	struct iomap_iter iter = {
 		.inode		= inode,
@@ -1433,7 +1440,8 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
 	    filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
 		iter.len = plen;
 		while ((ret = iomap_iter(&iter, ops)) > 0)
-			iter.status = iomap_zero_iter(&iter, did_zero);
+			iter.status = iomap_zero_iter(&iter, did_zero,
+					write_ops);
 
 		iter.len = len - (iter.pos - pos);
 		if (ret || !iter.len)
@@ -1464,7 +1472,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
 			continue;
 		}
 
-		iter.status = iomap_zero_iter(&iter, did_zero);
+		iter.status = iomap_zero_iter(&iter, did_zero, write_ops);
 	}
 	return ret;
 }
@@ -1472,7 +1480,8 @@ EXPORT_SYMBOL_GPL(iomap_zero_range);
 
 int
 iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
-		const struct iomap_ops *ops, void *private)
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private)
 {
 	unsigned int blocksize = i_blocksize(inode);
 	unsigned int off = pos & (blocksize - 1);
@@ -1481,7 +1490,7 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
 	if (!off)
 		return 0;
 	return iomap_zero_range(inode, pos, blocksize - off, did_zero, ops,
-			private);
+			write_ops, private);
 }
 EXPORT_SYMBOL_GPL(iomap_truncate_page);
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 48254a72071b..6e0970f24df5 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -979,7 +979,8 @@ xfs_file_buffered_write(
 
 	trace_xfs_file_buffered_write(iocb, from);
 	ret = iomap_file_buffered_write(iocb, from,
-			&xfs_buffered_write_iomap_ops, NULL);
+			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
+			NULL);
 
 	/*
 	 * If we hit a space limit, try to free up some lingering preallocated
@@ -1059,7 +1060,8 @@ xfs_file_buffered_write_zoned(
 retry:
 	trace_xfs_file_buffered_write(iocb, from);
 	ret = iomap_file_buffered_write(iocb, from,
-			&xfs_buffered_write_iomap_ops, &ac);
+			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
+			&ac);
 	if (ret == -ENOSPC && !cleared_space) {
 		/*
 		 * Kick off writeback to convert delalloc space and release the
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ff05e6b1b0bb..2e94a9435002 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -79,6 +79,9 @@ xfs_iomap_valid(
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 
+	if (iomap->type == IOMAP_HOLE)
+		return true;
+
 	if (iomap->validity_cookie !=
 			xfs_iomap_inode_sequence(ip, iomap->flags)) {
 		trace_xfs_iomap_invalid(ip, iomap);
@@ -89,7 +92,7 @@ xfs_iomap_valid(
 	return true;
 }
 
-static const struct iomap_folio_ops xfs_iomap_folio_ops = {
+const struct iomap_write_ops xfs_iomap_write_ops = {
 	.iomap_valid		= xfs_iomap_valid,
 };
 
@@ -151,7 +154,6 @@ xfs_bmbt_to_iomap(
 		iomap->flags |= IOMAP_F_DIRTY;
 
 	iomap->validity_cookie = sequence_cookie;
-	iomap->folio_ops = &xfs_iomap_folio_ops;
 	return 0;
 }
 
@@ -2198,7 +2200,8 @@ xfs_zero_range(
 		return dax_zero_range(inode, pos, len, did_zero,
 				      &xfs_dax_write_iomap_ops);
 	return iomap_zero_range(inode, pos, len, did_zero,
-				&xfs_buffered_write_iomap_ops, ac);
+			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
+			ac);
 }
 
 int
@@ -2214,5 +2217,6 @@ xfs_truncate_page(
 		return dax_truncate_page(inode, pos, did_zero,
 					&xfs_dax_write_iomap_ops);
 	return iomap_truncate_page(inode, pos, did_zero,
-				   &xfs_buffered_write_iomap_ops, ac);
+			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
+			ac);
 }
diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
index 674f8ac1b9bd..ebcce7d49446 100644
--- a/fs/xfs/xfs_iomap.h
+++ b/fs/xfs/xfs_iomap.h
@@ -57,5 +57,6 @@ extern const struct iomap_ops xfs_seek_iomap_ops;
 extern const struct iomap_ops xfs_xattr_iomap_ops;
 extern const struct iomap_ops xfs_dax_write_iomap_ops;
 extern const struct iomap_ops xfs_atomic_write_cow_iomap_ops;
+extern const struct iomap_write_ops xfs_iomap_write_ops;
 
 #endif /* __XFS_IOMAP_H__*/
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index ad3bcb76d805..3f177b4ec131 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1881,7 +1881,8 @@ xfs_reflink_unshare(
 				&xfs_dax_write_iomap_ops);
 	else
 		error = iomap_file_unshare(inode, offset, len,
-				&xfs_buffered_write_iomap_ops);
+				&xfs_buffered_write_iomap_ops,
+				&xfs_iomap_write_ops);
 	if (error)
 		goto out;
 
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 0c64185325d3..6fd3b23858db 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -572,7 +572,8 @@ static ssize_t zonefs_file_buffered_write(struct kiocb *iocb,
 	if (ret <= 0)
 		goto inode_unlock;
 
-	ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops, NULL);
+	ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops,
+			NULL, NULL);
 	if (ret == -EIO)
 		zonefs_io_error(inode, true);
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 7e06b3a392f8..8d20a926b645 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -101,8 +101,6 @@ struct vm_fault;
  */
 #define IOMAP_NULL_ADDR -1ULL	/* addr is not valid */
 
-struct iomap_folio_ops;
-
 struct iomap {
 	u64			addr; /* disk offset of mapping, bytes */
 	loff_t			offset;	/* file offset of mapping, bytes */
@@ -113,7 +111,6 @@ struct iomap {
 	struct dax_device	*dax_dev; /* dax_dev for dax operations */
 	void			*inline_data;
 	void			*private; /* filesystem private */
-	const struct iomap_folio_ops *folio_ops;
 	u64			validity_cookie; /* used with .iomap_valid() */
 };
 
@@ -143,16 +140,11 @@ static inline bool iomap_inline_data_valid(const struct iomap *iomap)
 }
 
 /*
- * When a filesystem sets folio_ops in an iomap mapping it returns, get_folio
- * and put_folio will be called for each folio written to.  This only applies
- * to buffered writes as unbuffered writes will not typically have folios
- * associated with them.
- *
  * When get_folio succeeds, put_folio will always be called to do any
  * cleanup work necessary.  put_folio is responsible for unlocking and putting
  * @folio.
  */
-struct iomap_folio_ops {
+struct iomap_write_ops {
 	struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
 			unsigned len);
 	void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
@@ -335,7 +327,8 @@ static inline bool iomap_want_unshare_iter(const struct iomap_iter *iter)
 }
 
 ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
-		const struct iomap_ops *ops, void *private);
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private);
 int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops);
 void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
 bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
@@ -344,11 +337,14 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
 bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
-		const struct iomap_ops *ops);
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
-		bool *did_zero, const struct iomap_ops *ops, void *private);
+		bool *did_zero, const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private);
 int iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
-		const struct iomap_ops *ops, void *private);
+		const struct iomap_ops *ops,
+		const struct iomap_write_ops *write_ops, void *private);
 vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops,
 		void *private);
 typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (9 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 17:51   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Add a read_folio_range() handler for buffered writes that filesystems
may pass in if they wish to provide a custom handler for synchronously
reading in the contents of a folio.
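<br>
A minimal sketch of such a handler, assuming a synchronous
fs_read_range() primitive (hypothetical):

	static int fs_read_folio_range(const struct iomap_iter *iter,
			struct folio *folio, loff_t pos, size_t len)
	{
		/* Must read the range in before returning. */
		return fs_read_range(iter->inode, folio,
				offset_in_folio(folio, pos), pos, len);
	}

	static const struct iomap_write_ops fs_write_ops = {
		.read_folio_range	= fs_read_folio_range,
	};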

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
[hch: renamed to read_folio_range, pass fewer arguments]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../filesystems/iomap/operations.rst          |  6 +++++
 fs/iomap/buffered-io.c                        | 25 +++++++++++--------
 include/linux/iomap.h                         | 10 ++++++++
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 1f5732835567..813e26dbd21e 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -68,6 +68,8 @@ The following address space operations can be wrapped easily:
      void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
                        struct folio *folio);
      bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
+     int (*read_folio_range)(const struct iomap_iter *iter,
+     			struct folio *folio, loff_t pos, size_t len);
  };
 
 iomap calls these functions:
@@ -123,6 +125,10 @@ iomap calls these functions:
     ``->iomap_valid``, then the iomap should considered stale and the
     validation failed.
 
+  - ``read_folio_range``: Called to synchronously read in the range that will
+    be written to. If this function is not provided, iomap will default to
+    submitting a bio read request.
+
 These ``struct kiocb`` flags are significant for buffered I/O with iomap:
 
  * ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f6e410c9ea7b..897a3ccea2df 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -667,22 +667,23 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 					 pos + len - 1);
 }
 
-static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
-		size_t poff, size_t plen, const struct iomap *iomap)
+static int iomap_read_folio_range(const struct iomap_iter *iter,
+		struct folio *folio, loff_t pos, size_t len)
 {
+	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	struct bio_vec bvec;
 	struct bio bio;
 
-	bio_init(&bio, iomap->bdev, &bvec, 1, REQ_OP_READ);
-	bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
-	bio_add_folio_nofail(&bio, folio, plen, poff);
+	bio_init(&bio, srcmap->bdev, &bvec, 1, REQ_OP_READ);
+	bio.bi_iter.bi_sector = iomap_sector(srcmap, pos);
+	bio_add_folio_nofail(&bio, folio, len, offset_in_folio(folio, pos));
 	return submit_bio_wait(&bio);
 }
 
-static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
+static int __iomap_write_begin(const struct iomap_iter *iter,
+		const struct iomap_write_ops *write_ops, size_t len,
 		struct folio *folio)
 {
-	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	struct iomap_folio_state *ifs;
 	loff_t pos = iter->pos;
 	loff_t block_size = i_blocksize(iter->inode);
@@ -731,8 +732,12 @@ static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
 			if (iter->flags & IOMAP_NOWAIT)
 				return -EAGAIN;
 
-			status = iomap_read_folio_sync(block_start, folio,
-					poff, plen, srcmap);
+			if (write_ops && write_ops->read_folio_range)
+				status = write_ops->read_folio_range(iter,
+						folio, block_start, plen);
+			else
+				status = iomap_read_folio_range(iter,
+						folio, block_start, plen);
 			if (status)
 				return status;
 		}
@@ -848,7 +853,7 @@ static int iomap_write_begin(struct iomap_iter *iter,
 	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(folio, pos, len, NULL, srcmap);
 	else
-		status = __iomap_write_begin(iter, len, folio);
+		status = __iomap_write_begin(iter, write_ops, len, folio);
 
 	if (unlikely(status))
 		goto out_unlock;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 8d20a926b645..5ec651606c51 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -166,6 +166,16 @@ struct iomap_write_ops {
 	 * locked by the iomap code.
 	 */
 	bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
+
+	/*
+	 * Optional if the filesystem wishes to provide a custom handler for
+	 * reading in the contents of a folio, otherwise iomap will default to
+	 * submitting a bio read request.
+	 *
+	 * The read must be done synchronously.
+	 */
+	int (*read_folio_range)(const struct iomap_iter *iter,
+			struct folio *folio, loff_t pos, size_t len);
 };
 
 /*
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (10 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-06-24 10:07   ` Miklos Szeredi
                     ` (2 more replies)
  2025-06-24  2:21 ` [PATCH v3 13/16] fuse: use iomap for writeback Joanne Koong
                   ` (4 subsequent siblings)
  16 siblings, 3 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Have buffered writes go through iomap. This has two advantages:
* granular large folio synchronous reads
* granular large folio dirty tracking

If, for example, there is a 1 MB large folio and a write is issued at
pos 1 to pos 1 MB - 2, only the head and tail pages need to be read in
and marked uptodate instead of the entire folio. Non-relevant trailing
pages are also skipped (e.g. for a 1 MB large folio with a write issued
at pos 1 to 4099, only the first two pages are read in and the ones
after that are skipped).
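
To make the page math concrete, here is a hedged sketch of the second
example (illustration only, not code from this patch):

	/* Illustration only: 4 KB pages, folio starts at file offset 0 */
	loff_t start = 1, end = 4099;		/* write covers [1, 4099] */
	pgoff_t first = start >> PAGE_SHIFT;	/* page 0 */
	pgoff_t last = end >> PAGE_SHIFT;	/* page 1 */
	/* pages 0 and 1 are read in; pages 2-255 are never touched */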

iomap also has granular dirty tracking. This is useful at writeback
time: only the dirty portions of the large folio are written back
instead of the entire folio. For example, if there is a 1 MB large
folio and only 2 bytes in it are dirty, only the page containing those
dirty bytes gets written out. Please note that granular writeback only
happens once fuse also uses iomap for writeback (separate commit).

.release_folio needs to be set to iomap_release_folio so that any
allocated iomap ifs structs get freed.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/Kconfig |   1 +
 fs/fuse/file.c  | 140 ++++++++++++++++++------------------------------
 2 files changed, 53 insertions(+), 88 deletions(-)

diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
index ca215a3cba3e..a774166264de 100644
--- a/fs/fuse/Kconfig
+++ b/fs/fuse/Kconfig
@@ -2,6 +2,7 @@
 config FUSE_FS
 	tristate "FUSE (Filesystem in Userspace) support"
 	select FS_POSIX_ACL
+	select FS_IOMAP
 	help
 	  With FUSE it is possible to implement a fully functional filesystem
 	  in a userspace program.
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f102afc03359..a7f11c1a4f89 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -21,6 +21,7 @@
 #include <linux/filelock.h>
 #include <linux/splice.h>
 #include <linux/task_io_accounting_ops.h>
+#include <linux/iomap.h>
 
 static int fuse_send_open(struct fuse_mount *fm, u64 nodeid,
 			  unsigned int open_flags, int opcode,
@@ -788,12 +789,16 @@ static void fuse_short_read(struct inode *inode, u64 attr_ver, size_t num_read,
 	}
 }
 
-static int fuse_do_readfolio(struct file *file, struct folio *folio)
+static int fuse_do_readfolio(struct file *file, struct folio *folio,
+			     size_t off, size_t len)
 {
 	struct inode *inode = folio->mapping->host;
 	struct fuse_mount *fm = get_fuse_mount(inode);
-	loff_t pos = folio_pos(folio);
-	struct fuse_folio_desc desc = { .length = folio_size(folio) };
+	loff_t pos = folio_pos(folio) + off;
+	struct fuse_folio_desc desc = {
+		.offset = off,
+		.length = len,
+	};
 	struct fuse_io_args ia = {
 		.ap.args.page_zeroing = true,
 		.ap.args.out_pages = true,
@@ -820,8 +825,6 @@ static int fuse_do_readfolio(struct file *file, struct folio *folio)
 	if (res < desc.length)
 		fuse_short_read(inode, attr_ver, res, &ia.ap);
 
-	folio_mark_uptodate(folio);
-
 	return 0;
 }
 
@@ -834,13 +837,26 @@ static int fuse_read_folio(struct file *file, struct folio *folio)
 	if (fuse_is_bad(inode))
 		goto out;
 
-	err = fuse_do_readfolio(file, folio);
+	err = fuse_do_readfolio(file, folio, 0, folio_size(folio));
+	if (!err)
+		folio_mark_uptodate(folio);
+
 	fuse_invalidate_atime(inode);
  out:
 	folio_unlock(folio);
 	return err;
 }
 
+static int fuse_iomap_read_folio_range(const struct iomap_iter *iter,
+				       struct folio *folio, loff_t pos,
+				       size_t len)
+{
+	struct file *file = iter->private;
+	size_t off = offset_in_folio(folio, pos);
+
+	return fuse_do_readfolio(file, folio, off, len);
+}
+
 static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
 			       int err)
 {
@@ -1375,6 +1391,24 @@ static void fuse_dio_unlock(struct kiocb *iocb, bool exclusive)
 	}
 }
 
+static const struct iomap_write_ops fuse_iomap_write_ops = {
+	.read_folio_range = fuse_iomap_read_folio_range,
+};
+
+static int fuse_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+			    unsigned int flags, struct iomap *iomap,
+			    struct iomap *srcmap)
+{
+	iomap->type = IOMAP_MAPPED;
+	iomap->length = length;
+	iomap->offset = offset;
+	return 0;
+}
+
+static const struct iomap_ops fuse_iomap_ops = {
+	.iomap_begin	= fuse_iomap_begin,
+};
+
 static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
@@ -1384,6 +1418,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	struct inode *inode = mapping->host;
 	ssize_t err, count;
 	struct fuse_conn *fc = get_fuse_conn(inode);
+	bool writeback = false;
 
 	if (fc->writeback_cache) {
 		/* Update size (EOF optimization) and mode (SUID clearing) */
@@ -1397,8 +1432,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
 						file_inode(file))) {
 			goto writethrough;
 		}
-
-		return generic_file_write_iter(iocb, from);
+		writeback = true;
 	}
 
 writethrough:
@@ -1420,6 +1454,15 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
 			goto out;
 		written = direct_write_fallback(iocb, from, written,
 				fuse_perform_write(iocb, from));
+	} else if (writeback) {
+		/*
+		 * Use iomap so that we can do granular uptodate reads
+		 * and granular dirty tracking for large folios.
+		 */
+		written = iomap_file_buffered_write(iocb, from,
+						    &fuse_iomap_ops,
+						    &fuse_iomap_write_ops,
+						    file);
 	} else {
 		written = fuse_perform_write(iocb, from);
 	}
@@ -2209,84 +2252,6 @@ static int fuse_writepages(struct address_space *mapping,
 	return err;
 }
 
-/*
- * It's worthy to make sure that space is reserved on disk for the write,
- * but how to implement it without killing performance need more thinking.
- */
-static int fuse_write_begin(struct file *file, struct address_space *mapping,
-		loff_t pos, unsigned len, struct folio **foliop, void **fsdata)
-{
-	pgoff_t index = pos >> PAGE_SHIFT;
-	struct fuse_conn *fc = get_fuse_conn(file_inode(file));
-	struct folio *folio;
-	loff_t fsize;
-	int err = -ENOMEM;
-
-	WARN_ON(!fc->writeback_cache);
-
-	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
-			mapping_gfp_mask(mapping));
-	if (IS_ERR(folio))
-		goto error;
-
-	if (folio_test_uptodate(folio) || len >= folio_size(folio))
-		goto success;
-	/*
-	 * Check if the start of this folio comes after the end of file,
-	 * in which case the readpage can be optimized away.
-	 */
-	fsize = i_size_read(mapping->host);
-	if (fsize <= folio_pos(folio)) {
-		size_t off = offset_in_folio(folio, pos);
-		if (off)
-			folio_zero_segment(folio, 0, off);
-		goto success;
-	}
-	err = fuse_do_readfolio(file, folio);
-	if (err)
-		goto cleanup;
-success:
-	*foliop = folio;
-	return 0;
-
-cleanup:
-	folio_unlock(folio);
-	folio_put(folio);
-error:
-	return err;
-}
-
-static int fuse_write_end(struct file *file, struct address_space *mapping,
-		loff_t pos, unsigned len, unsigned copied,
-		struct folio *folio, void *fsdata)
-{
-	struct inode *inode = folio->mapping->host;
-
-	/* Haven't copied anything?  Skip zeroing, size extending, dirtying. */
-	if (!copied)
-		goto unlock;
-
-	pos += copied;
-	if (!folio_test_uptodate(folio)) {
-		/* Zero any unwritten bytes at the end of the page */
-		size_t endoff = pos & ~PAGE_MASK;
-		if (endoff)
-			folio_zero_segment(folio, endoff, PAGE_SIZE);
-		folio_mark_uptodate(folio);
-	}
-
-	if (pos > inode->i_size)
-		i_size_write(inode, pos);
-
-	folio_mark_dirty(folio);
-
-unlock:
-	folio_unlock(folio);
-	folio_put(folio);
-
-	return copied;
-}
-
 static int fuse_launder_folio(struct folio *folio)
 {
 	int err = 0;
@@ -3145,11 +3110,10 @@ static const struct address_space_operations fuse_file_aops  = {
 	.writepages	= fuse_writepages,
 	.launder_folio	= fuse_launder_folio,
 	.dirty_folio	= filemap_dirty_folio,
+	.release_folio	= iomap_release_folio,
 	.migrate_folio	= filemap_migrate_folio,
 	.bmap		= fuse_bmap,
 	.direct_IO	= fuse_direct_IO,
-	.write_begin	= fuse_write_begin,
-	.write_end	= fuse_write_end,
 };
 
 void fuse_init_file_inode(struct inode *inode, unsigned int flags)
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 13/16] fuse: use iomap for writeback
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (11 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-06-25 14:17   ` kernel test robot
  2025-07-02 18:13   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 14/16] fuse: use iomap for folio laundering Joanne Koong
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Use iomap for dirty folio writeback in ->writepages().
This allows for granular dirty writeback of large folios.

Only the dirty portions of the large folio will be written instead of
the entire folio. For example, if there is a 1 MB large folio and only
2 bytes in it are dirty, only the page containing those dirty bytes
will be written out.
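
As a hedged illustration of the example above (assuming 4 KB blocks;
not code from this patch), the 2 dirty bytes round out to a single
block-aligned range:

	/* Illustration only: 2 dirty bytes at file offsets 8190-8191 */
	u64 pos = round_down(8190, PAGE_SIZE);	/* 4096 */
	u64 end = round_up(8192, PAGE_SIZE);	/* 8192 */
	/*
	 * ->writeback_range() is handed just this one 4 KB range; the
	 * folio's other 255 pages are skipped.
	 */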

.dirty_folio needs to be set to iomap_dirty_folio so that the bitmap
iomap uses for dirty tracking correctly reflects dirty regions that need
to be written back.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/file.c | 124 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 75 insertions(+), 49 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a7f11c1a4f89..2b4b950eaeed 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1837,7 +1837,7 @@ static void fuse_writepage_finish(struct fuse_writepage_args *wpa)
 		 * scope of the fi->lock alleviates xarray lock
 		 * contention and noticeably improves performance.
 		 */
-		folio_end_writeback(ap->folios[i]);
+		iomap_finish_folio_write(inode, ap->folios[i], 1);
 		dec_wb_stat(&bdi->wb, WB_WRITEBACK);
 		wb_writeout_inc(&bdi->wb);
 	}
@@ -2024,19 +2024,20 @@ static void fuse_writepage_add_to_bucket(struct fuse_conn *fc,
 }
 
 static void fuse_writepage_args_page_fill(struct fuse_writepage_args *wpa, struct folio *folio,
-					  uint32_t folio_index)
+					  uint32_t folio_index, loff_t offset, unsigned len)
 {
 	struct inode *inode = folio->mapping->host;
 	struct fuse_args_pages *ap = &wpa->ia.ap;
 
 	ap->folios[folio_index] = folio;
-	ap->descs[folio_index].offset = 0;
-	ap->descs[folio_index].length = folio_size(folio);
+	ap->descs[folio_index].offset = offset;
+	ap->descs[folio_index].length = len;
 
 	inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
 }
 
 static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio,
+							     size_t offset,
 							     struct fuse_file *ff)
 {
 	struct inode *inode = folio->mapping->host;
@@ -2049,7 +2050,7 @@ static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio
 		return NULL;
 
 	fuse_writepage_add_to_bucket(fc, wpa);
-	fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio), 0);
+	fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio) + offset, 0);
 	wpa->ia.write.in.write_flags |= FUSE_WRITE_CACHE;
 	wpa->inode = inode;
 	wpa->ia.ff = ff;
@@ -2075,7 +2076,7 @@ static int fuse_writepage_locked(struct folio *folio)
 	if (!ff)
 		goto err;
 
-	wpa = fuse_writepage_args_setup(folio, ff);
+	wpa = fuse_writepage_args_setup(folio, 0, ff);
 	error = -ENOMEM;
 	if (!wpa)
 		goto err_writepage_args;
@@ -2084,7 +2085,7 @@ static int fuse_writepage_locked(struct folio *folio)
 	ap->num_folios = 1;
 
 	folio_start_writeback(folio);
-	fuse_writepage_args_page_fill(wpa, folio, 0);
+	fuse_writepage_args_page_fill(wpa, folio, 0, 0, folio_size(folio));
 
 	spin_lock(&fi->lock);
 	list_add_tail(&wpa->queue_entry, &fi->queued_writes);
@@ -2105,7 +2106,7 @@ struct fuse_fill_wb_data {
 	struct fuse_file *ff;
 	struct inode *inode;
 	unsigned int max_folios;
-	unsigned int nr_pages;
+	unsigned int nr_bytes;
 };
 
 static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
@@ -2147,21 +2148,28 @@ static void fuse_writepages_send(struct fuse_fill_wb_data *data)
 }
 
 static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
+				     loff_t offset, unsigned len,
 				     struct fuse_args_pages *ap,
 				     struct fuse_fill_wb_data *data)
 {
+	struct folio *prev_folio;
+	struct fuse_folio_desc prev_desc;
+
 	WARN_ON(!ap->num_folios);
 
 	/* Reached max pages */
-	if (data->nr_pages + folio_nr_pages(folio) > fc->max_pages)
+	if ((data->nr_bytes + len) / PAGE_SIZE > fc->max_pages)
 		return true;
 
 	/* Reached max write bytes */
-	if ((data->nr_pages * PAGE_SIZE) + folio_size(folio) > fc->max_write)
+	if (data->nr_bytes + len > fc->max_write)
 		return true;
 
 	/* Discontinuity */
-	if (folio_next_index(ap->folios[ap->num_folios - 1]) != folio->index)
+	prev_folio = ap->folios[ap->num_folios - 1];
+	prev_desc = ap->descs[ap->num_folios - 1];
+	if ((folio_pos(prev_folio) + prev_desc.offset + prev_desc.length) !=
+	    folio_pos(folio) + offset)
 		return true;
 
 	/* Need to grow the pages array?  If so, did the expansion fail? */
@@ -2171,85 +2179,103 @@ static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
 	return false;
 }
 
-static int fuse_writepages_fill(struct folio *folio,
-		struct writeback_control *wbc, void *_data)
+static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
+					  struct folio *folio, u64 pos,
+					  unsigned len, u64 end_pos)
 {
-	struct fuse_fill_wb_data *data = _data;
+	struct fuse_fill_wb_data *data = wpc->wb_ctx;
 	struct fuse_writepage_args *wpa = data->wpa;
 	struct fuse_args_pages *ap = &wpa->ia.ap;
 	struct inode *inode = data->inode;
 	struct fuse_inode *fi = get_fuse_inode(inode);
 	struct fuse_conn *fc = get_fuse_conn(inode);
-	int err;
+	loff_t offset = offset_in_folio(folio, pos);
+
+	WARN_ON_ONCE(!data);
+	/* len will always be page aligned */
+	WARN_ON_ONCE(len & (PAGE_SIZE - 1));
 
 	if (!data->ff) {
-		err = -EIO;
 		data->ff = fuse_write_file_get(fi);
 		if (!data->ff)
-			goto out_unlock;
+			return -EIO;
 	}
 
-	if (wpa && fuse_writepage_need_send(fc, folio, ap, data)) {
+	iomap_start_folio_write(inode, folio, 1);
+
+	if (wpa && fuse_writepage_need_send(fc, folio, offset, len, ap, data)) {
 		fuse_writepages_send(data);
 		data->wpa = NULL;
-		data->nr_pages = 0;
+		data->nr_bytes = 0;
 	}
 
 	if (data->wpa == NULL) {
-		err = -ENOMEM;
-		wpa = fuse_writepage_args_setup(folio, data->ff);
+		wpa = fuse_writepage_args_setup(folio, offset, data->ff);
 		if (!wpa)
-			goto out_unlock;
+			return -ENOMEM;
 		fuse_file_get(wpa->ia.ff);
 		data->max_folios = 1;
 		ap = &wpa->ia.ap;
 	}
-	folio_start_writeback(folio);
 
-	fuse_writepage_args_page_fill(wpa, folio, ap->num_folios);
-	data->nr_pages += folio_nr_pages(folio);
+	fuse_writepage_args_page_fill(wpa, folio, ap->num_folios,
+				      offset, len);
+	data->nr_bytes += len;
 
-	err = 0;
 	ap->num_folios++;
 	if (!data->wpa)
 		data->wpa = wpa;
-out_unlock:
-	folio_unlock(folio);
 
-	return err;
+	return len;
 }
 
+static int fuse_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
+				       int error)
+{
+	struct fuse_fill_wb_data *data = wpc->wb_ctx;
+
+	WARN_ON_ONCE(!data);
+
+	if (data->wpa) {
+		WARN_ON(!data->wpa->ia.ap.num_folios);
+		fuse_writepages_send(data);
+	}
+
+	if (data->ff)
+		fuse_file_put(data->ff, false);
+
+	return error;
+}
+
+static const struct iomap_writeback_ops fuse_writeback_ops = {
+	.writeback_range	= fuse_iomap_writeback_range,
+	.writeback_submit	= fuse_iomap_writeback_submit,
+};
+
 static int fuse_writepages(struct address_space *mapping,
 			   struct writeback_control *wbc)
 {
 	struct inode *inode = mapping->host;
 	struct fuse_conn *fc = get_fuse_conn(inode);
-	struct fuse_fill_wb_data data;
-	int err;
+	struct fuse_fill_wb_data data = {
+		.inode = inode,
+	};
+	struct iomap_writepage_ctx wpc = {
+		.inode = inode,
+		.iomap.type = IOMAP_MAPPED,
+		.wbc = wbc,
+		.ops = &fuse_writeback_ops,
+		.wb_ctx	= &data,
+	};
 
-	err = -EIO;
 	if (fuse_is_bad(inode))
-		goto out;
+		return -EIO;
 
 	if (wbc->sync_mode == WB_SYNC_NONE &&
 	    fc->num_background >= fc->congestion_threshold)
 		return 0;
 
-	data.inode = inode;
-	data.wpa = NULL;
-	data.ff = NULL;
-	data.nr_pages = 0;
-
-	err = write_cache_pages(mapping, wbc, fuse_writepages_fill, &data);
-	if (data.wpa) {
-		WARN_ON(!data.wpa->ia.ap.num_folios);
-		fuse_writepages_send(&data);
-	}
-	if (data.ff)
-		fuse_file_put(data.ff, false);
-
-out:
-	return err;
+	return iomap_writepages(&wpc);
 }
 
 static int fuse_launder_folio(struct folio *folio)
@@ -3109,7 +3135,7 @@ static const struct address_space_operations fuse_file_aops  = {
 	.readahead	= fuse_readahead,
 	.writepages	= fuse_writepages,
 	.launder_folio	= fuse_launder_folio,
-	.dirty_folio	= filemap_dirty_folio,
+	.dirty_folio	= iomap_dirty_folio,
 	.release_folio	= iomap_release_folio,
 	.migrate_folio	= filemap_migrate_folio,
 	.bmap		= fuse_bmap,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 14/16] fuse: use iomap for folio laundering
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (12 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 13/16] fuse: use iomap for writeback Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-06-25 17:02   ` kernel test robot
  2025-07-02 18:14   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness Joanne Koong
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Use iomap for folio laundering, which will do granular dirty
writeback when laundering a large folio.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/file.c | 52 ++++++++++++--------------------------------------
 1 file changed, 12 insertions(+), 40 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 2b4b950eaeed..35ecc03c0c48 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2062,45 +2062,6 @@ static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio
 	return wpa;
 }
 
-static int fuse_writepage_locked(struct folio *folio)
-{
-	struct address_space *mapping = folio->mapping;
-	struct inode *inode = mapping->host;
-	struct fuse_inode *fi = get_fuse_inode(inode);
-	struct fuse_writepage_args *wpa;
-	struct fuse_args_pages *ap;
-	struct fuse_file *ff;
-	int error = -EIO;
-
-	ff = fuse_write_file_get(fi);
-	if (!ff)
-		goto err;
-
-	wpa = fuse_writepage_args_setup(folio, 0, ff);
-	error = -ENOMEM;
-	if (!wpa)
-		goto err_writepage_args;
-
-	ap = &wpa->ia.ap;
-	ap->num_folios = 1;
-
-	folio_start_writeback(folio);
-	fuse_writepage_args_page_fill(wpa, folio, 0, 0, folio_size(folio));
-
-	spin_lock(&fi->lock);
-	list_add_tail(&wpa->queue_entry, &fi->queued_writes);
-	fuse_flush_writepages(inode);
-	spin_unlock(&fi->lock);
-
-	return 0;
-
-err_writepage_args:
-	fuse_file_put(ff, false);
-err:
-	mapping_set_error(folio->mapping, error);
-	return error;
-}
-
 struct fuse_fill_wb_data {
 	struct fuse_writepage_args *wpa;
 	struct fuse_file *ff;
@@ -2281,8 +2242,19 @@ static int fuse_writepages(struct address_space *mapping,
 static int fuse_launder_folio(struct folio *folio)
 {
 	int err = 0;
+	struct fuse_fill_wb_data data = {
+		.inode = folio->mapping->host,
+	};
+	struct iomap_writepage_ctx wpc = {
+		.inode = folio->mapping->host,
+		.iomap.type = IOMAP_MAPPED,
+		.ops = &fuse_writeback_ops,
+		.wb_ctx	= &data,
+	};
+
 	if (folio_clear_dirty_for_io(folio)) {
-		err = fuse_writepage_locked(folio);
+		err = iomap_writeback_folio(&wpc, folio);
+		err = fuse_iomap_writeback_submit(&wpc, err);
 		if (!err)
 			folio_wait_writeback(folio);
 	}
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (13 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 14/16] fuse: use iomap for folio laundering Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 18:14   ` Darrick J. Wong
  2025-06-24  2:21 ` [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode Joanne Koong
  2025-06-24 13:31 ` [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Christoph Hellwig
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

Hook into iomap_invalidate_folio() so that if the entire folio is being
invalidated during truncation, the dirty state is cleared and the folio
doesn't get written back. The folio's corresponding ifs struct will
also get freed.

Hook into iomap_is_partially_uptodate() since iomap tracks uptodateness
granularly when it does buffered writes.
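
For illustration (not code from this patch), wiring up the hook lets
the VFS ask iomap's per-block uptodate bitmap questions like:

	/* Illustration only: is the first 4 KB of the folio uptodate? */
	bool uptodate = mapping->a_ops->is_partially_uptodate(folio, 0, 4096);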

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 35ecc03c0c48..865d04b8ef31 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3109,6 +3109,8 @@ static const struct address_space_operations fuse_file_aops  = {
 	.launder_folio	= fuse_launder_folio,
 	.dirty_folio	= iomap_dirty_folio,
 	.release_folio	= iomap_release_folio,
+	.invalidate_folio = iomap_invalidate_folio,
+	.is_partially_uptodate = iomap_is_partially_uptodate,
 	.migrate_folio	= filemap_migrate_folio,
 	.bmap		= fuse_bmap,
 	.direct_IO	= fuse_direct_IO,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (14 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness Joanne Koong
@ 2025-06-24  2:21 ` Joanne Koong
  2025-07-02 18:15   ` Darrick J. Wong
  2025-06-24 13:31 ` [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Christoph Hellwig
  16 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-24  2:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: hch, miklos, brauner, djwong, anuj20.g, linux-xfs, linux-doc,
	linux-block, gfs2, kernel-team

struct iomap_writepage_ctx includes a pointer to the file inode. In
writeback, use that instead of also passing the inode into
fuse_fill_wb_data.

No functional changes.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/file.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 865d04b8ef31..4f17ba69ddfc 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2070,16 +2070,16 @@ struct fuse_fill_wb_data {
 	unsigned int nr_bytes;
 };
 
-static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
+static bool fuse_pages_realloc(struct fuse_fill_wb_data *data,
+			       unsigned int max_pages)
 {
 	struct fuse_args_pages *ap = &data->wpa->ia.ap;
-	struct fuse_conn *fc = get_fuse_conn(data->inode);
 	struct folio **folios;
 	struct fuse_folio_desc *descs;
 	unsigned int nfolios = min_t(unsigned int,
 				     max_t(unsigned int, data->max_folios * 2,
 					   FUSE_DEFAULT_MAX_PAGES_PER_REQ),
-				    fc->max_pages);
+				    max_pages);
 	WARN_ON(nfolios <= data->max_folios);
 
 	folios = fuse_folios_alloc(nfolios, GFP_NOFS, &descs);
@@ -2096,10 +2096,10 @@ static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
 	return true;
 }
 
-static void fuse_writepages_send(struct fuse_fill_wb_data *data)
+static void fuse_writepages_send(struct inode *inode,
+				 struct fuse_fill_wb_data *data)
 {
 	struct fuse_writepage_args *wpa = data->wpa;
-	struct inode *inode = data->inode;
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	spin_lock(&fi->lock);
@@ -2134,7 +2134,8 @@ static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
 		return true;
 
 	/* Need to grow the pages array?  If so, did the expansion fail? */
-	if (ap->num_folios == data->max_folios && !fuse_pages_realloc(data))
+	if (ap->num_folios == data->max_folios &&
+	    !fuse_pages_realloc(data, fc->max_pages))
 		return true;
 
 	return false;
@@ -2147,7 +2148,7 @@ static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
 	struct fuse_fill_wb_data *data = wpc->wb_ctx;
 	struct fuse_writepage_args *wpa = data->wpa;
 	struct fuse_args_pages *ap = &wpa->ia.ap;
-	struct inode *inode = data->inode;
+	struct inode *inode = wpc->inode;
 	struct fuse_inode *fi = get_fuse_inode(inode);
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	loff_t offset = offset_in_folio(folio, pos);
@@ -2165,7 +2166,7 @@ static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
 	iomap_start_folio_write(inode, folio, 1);
 
 	if (wpa && fuse_writepage_need_send(fc, folio, offset, len, ap, data)) {
-		fuse_writepages_send(data);
+		fuse_writepages_send(inode, data);
 		data->wpa = NULL;
 		data->nr_bytes = 0;
 	}
@@ -2199,7 +2200,7 @@ static int fuse_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
 
 	if (data->wpa) {
 		WARN_ON(!data->wpa->ia.ap.num_folios);
-		fuse_writepages_send(data);
+		fuse_writepages_send(wpc->inode, data);
 	}
 
 	if (data->ff)
@@ -2218,9 +2219,7 @@ static int fuse_writepages(struct address_space *mapping,
 {
 	struct inode *inode = mapping->host;
 	struct fuse_conn *fc = get_fuse_conn(inode);
-	struct fuse_fill_wb_data data = {
-		.inode = inode,
-	};
+	struct fuse_fill_wb_data data = {};
 	struct iomap_writepage_ctx wpc = {
 		.inode = inode,
 		.iomap.type = IOMAP_MAPPED,
@@ -2242,9 +2241,7 @@ static int fuse_writepages(struct address_space *mapping,
 static int fuse_launder_folio(struct folio *folio)
 {
 	int err = 0;
-	struct fuse_fill_wb_data data = {
-		.inode = folio->mapping->host,
-	};
+	struct fuse_fill_wb_data data = {};
 	struct iomap_writepage_ctx wpc = {
 		.inode = folio->mapping->host,
 		.iomap.type = IOMAP_MAPPED,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx
  2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
@ 2025-06-24  6:07   ` Johannes Thumshirn
  2025-06-24 22:02     ` Joanne Koong
  2025-07-02 15:57   ` Darrick J. Wong
  1 sibling, 1 reply; 50+ messages in thread
From: Johannes Thumshirn @ 2025-06-24  6:07 UTC (permalink / raw)
  To: Joanne Koong, linux-fsdevel@vger.kernel.org
  Cc: hch, miklos@szeredi.hu, brauner@kernel.org, djwong@kernel.org,
	anuj20.g@samsung.com, linux-xfs@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-block@vger.kernel.org,
	gfs2@lists.linux.dev, kernel-team@meta.com

On 24.06.25 04:23, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Add inode and wpc fields to pass the inode and writeback context that
> are needed in the entire writeback call chain, and let the callers
> initialize all fields in the writeback context before calling
> iomap_writepages to simplify the argument passing.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Joanne Koong <joannelkoong@gmail.com>

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

But IIRC it should also have Joanne's S-o-B as she's carrying it in her 
tree.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
  2025-06-24  2:21 ` [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks Joanne Koong
@ 2025-06-24  6:09   ` Johannes Thumshirn
  2025-07-02 15:59   ` Darrick J. Wong
  1 sibling, 0 replies; 50+ messages in thread
From: Johannes Thumshirn @ 2025-06-24  6:09 UTC (permalink / raw)
  To: Joanne Koong, linux-fsdevel@vger.kernel.org
  Cc: hch, miklos@szeredi.hu, brauner@kernel.org, djwong@kernel.org,
	anuj20.g@samsung.com, linux-xfs@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-block@vger.kernel.org,
	gfs2@lists.linux.dev, kernel-team@meta.com

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
@ 2025-06-24 10:07   ` Miklos Szeredi
  2025-06-24 21:52     ` Joanne Koong
  2025-06-25 11:10   ` kernel test robot
  2025-07-02 17:55   ` Darrick J. Wong
  2 siblings, 1 reply; 50+ messages in thread
From: Miklos Szeredi @ 2025-06-24 10:07 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Tue, 24 Jun 2025 at 04:23, Joanne Koong <joannelkoong@gmail.com> wrote:

>  static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
>         struct file *file = iocb->ki_filp;
> @@ -1384,6 +1418,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>         struct inode *inode = mapping->host;
>         ssize_t err, count;
>         struct fuse_conn *fc = get_fuse_conn(inode);
> +       bool writeback = false;
>
>         if (fc->writeback_cache) {
>                 /* Update size (EOF optimization) and mode (SUID clearing) */
> @@ -1397,8 +1432,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>                                                 file_inode(file))) {
>                         goto writethrough;
>                 }
> -
> -               return generic_file_write_iter(iocb, from);
> +               writeback = true;

Doing this in the else branch makes the writethrough label (which is
wrong now) unnecessary.
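
A minimal sketch of that shape (needs_suidgid_drop abbreviates the
existing guard; this is an assumption, not the final code):

	if (fc->writeback_cache) {
		/* size/mode updates as before, elided */
		if (!needs_suidgid_drop)	/* abbreviated condition */
			writeback = true;
		/* else: fall through to the write-through path below */
	}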

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback
  2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
                   ` (15 preceding siblings ...)
  2025-06-24  2:21 ` [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode Joanne Koong
@ 2025-06-24 13:31 ` Christoph Hellwig
  16 siblings, 0 replies; 50+ messages in thread
From: Christoph Hellwig @ 2025-06-24 13:31 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:19PM -0700, Joanne Koong wrote:
> This patchset is composed of two parts:
> a) Christoph's iomap changes (patches 1 to 11) which are taken verbatim
>    from [1]
> b) fuse changes to use iomap (patches 12 to 16)
> 
> Please note the following:
> * this patchset version temporarily drops the CONFIG_BLOCK iomap refactoring
>   patches that will be needed to merge in the series. As of now, this breaks
>   compilation for environments where CONFIG_BLOCK is not set, but the
>   CONFIG_BLOCK iomap changes will be re-added back in once the core changes
>   in this patchset are ready to go.
> * this patchset does not enable large folios yet. That will be sent out in a
>   separate future patchset.

Cool, so the previous series works for fuse.  I'll try to find some
time this week to address the review comments and enable the
!CONFIG_BLOCK build.  It would be great if we could figure out a way to
kill off ->launder_folio for fuse in the meantime, but it looks like that
got stuck a bit.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-06-24 10:07   ` Miklos Szeredi
@ 2025-06-24 21:52     ` Joanne Koong
  0 siblings, 0 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24 21:52 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, hch, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Tue, Jun 24, 2025 at 3:07 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, 24 Jun 2025 at 04:23, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> >  static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  {
> >         struct file *file = iocb->ki_filp;
> > @@ -1384,6 +1418,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >         struct inode *inode = mapping->host;
> >         ssize_t err, count;
> >         struct fuse_conn *fc = get_fuse_conn(inode);
> > +       bool writeback = false;
> >
> >         if (fc->writeback_cache) {
> >                 /* Update size (EOF optimization) and mode (SUID clearing) */
> > @@ -1397,8 +1432,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >                                                 file_inode(file))) {
> >                         goto writethrough;
> >                 }
> > -
> > -               return generic_file_write_iter(iocb, from);
> > +               writeback = true;
>
> Doing this in the else branch makes the writethrough label (which is
> wrong now) unnecessary.

Hi Miklos,

That's a great point, I'll get rid of the writethrough label in v4.


Thanks,
Joanne

>
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx
  2025-06-24  6:07   ` Johannes Thumshirn
@ 2025-06-24 22:02     ` Joanne Koong
  0 siblings, 0 replies; 50+ messages in thread
From: Joanne Koong @ 2025-06-24 22:02 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: linux-fsdevel@vger.kernel.org, hch, miklos@szeredi.hu,
	brauner@kernel.org, djwong@kernel.org, anuj20.g@samsung.com,
	linux-xfs@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-block@vger.kernel.org, gfs2@lists.linux.dev,
	kernel-team@meta.com

On Mon, Jun 23, 2025 at 11:07 PM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 24.06.25 04:23, Joanne Koong wrote:
> > From: Christoph Hellwig <hch@lst.de>
> >
> > Add inode and wpc fields to pass the inode and writeback context that
> > are needed in the entire writeback call chain, and let the callers
> > initialize all fields in the writeback context before calling
> > iomap_writepages to simplify the argument passing.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
>
> Looks good,
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> But IIRC it should also have Joanne's S-o-B as she's carrying it in her
> tree.

Hi Johannes,

I'll fix this up in v4 to include a signed-off-by line.

Thanks,
Joanne

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
  2025-06-24 10:07   ` Miklos Szeredi
@ 2025-06-25 11:10   ` kernel test robot
  2025-07-02 17:55   ` Darrick J. Wong
  2 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2025-06-25 11:10 UTC (permalink / raw)
  To: Joanne Koong, linux-fsdevel
  Cc: oe-kbuild-all, hch, miklos, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

Hi Joanne,

kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on xfs-linux/for-next linus/master v6.16-rc3 next-20250625]
[cannot apply to gfs2/for-next mszeredi-fuse/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/iomap-pass-more-arguments-using-struct-iomap_writepage_ctx/20250624-102709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20250624022135.832899-13-joannelkoong%40gmail.com
patch subject: [PATCH v3 12/16] fuse: use iomap for buffered writes
config: arm64-randconfig-003-20250625 (https://download.01.org/0day-ci/archive/20250625/202506251857.gt3y42uZ-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506251857.gt3y42uZ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506251857.gt3y42uZ-lkp@intel.com/

All errors (new ones prefixed by >>):

   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_cache_write_iter':
>> file.c:(.text+0x880c): undefined reference to `iomap_file_buffered_write'
>> file.c:(.text+0x880c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_file_buffered_write'
>> aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x360): undefined reference to `iomap_release_folio'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 13/16] fuse: use iomap for writeback
  2025-06-24  2:21 ` [PATCH v3 13/16] fuse: use iomap for writeback Joanne Koong
@ 2025-06-25 14:17   ` kernel test robot
  2025-06-25 16:48     ` Joanne Koong
  2025-07-02 18:13   ` Darrick J. Wong
  1 sibling, 1 reply; 50+ messages in thread
From: kernel test robot @ 2025-06-25 14:17 UTC (permalink / raw)
  To: Joanne Koong, linux-fsdevel
  Cc: oe-kbuild-all, hch, miklos, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

Hi Joanne,

kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on xfs-linux/for-next linus/master v6.16-rc3 next-20250625]
[cannot apply to gfs2/for-next mszeredi-fuse/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/iomap-pass-more-arguments-using-struct-iomap_writepage_ctx/20250624-102709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20250624022135.832899-14-joannelkoong%40gmail.com
patch subject: [PATCH v3 13/16] fuse: use iomap for writeback
config: arm64-randconfig-003-20250625 (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506252117.9V3HTO0i-lkp@intel.com/

All errors (new ones prefixed by >>):

   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepages':
>> file.c:(.text+0xa30): undefined reference to `iomap_writepages'
>> file.c:(.text+0xa30): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_writepages'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepage_finish':
>> file.c:(.text+0x1fd8): undefined reference to `iomap_finish_folio_write'
>> file.c:(.text+0x1fd8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_finish_folio_write'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_cache_write_iter':
   file.c:(.text+0x888c): undefined reference to `iomap_file_buffered_write'
   file.c:(.text+0x888c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_file_buffered_write'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_iomap_writeback_range':
>> file.c:(.text+0x9258): undefined reference to `iomap_start_folio_write'
>> file.c:(.text+0x9258): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_start_folio_write'
>> aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x370): undefined reference to `iomap_dirty_folio'
   aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x3a0): undefined reference to `iomap_release_folio'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 13/16] fuse: use iomap for writeback
  2025-06-25 14:17   ` kernel test robot
@ 2025-06-25 16:48     ` Joanne Koong
  2025-07-01 11:34       ` Christian Brauner
  0 siblings, 1 reply; 50+ messages in thread
From: Joanne Koong @ 2025-06-25 16:48 UTC (permalink / raw)
  To: kernel test robot
  Cc: linux-fsdevel, oe-kbuild-all, hch, miklos, brauner, djwong,
	anuj20.g, linux-xfs, linux-doc, linux-block, gfs2, kernel-team

On Wed, Jun 25, 2025 at 7:18 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Joanne,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on brauner-vfs/vfs.all]
> [also build test ERROR on xfs-linux/for-next linus/master v6.16-rc3 next-20250625]
> [cannot apply to gfs2/for-next mszeredi-fuse/for-next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/iomap-pass-more-arguments-using-struct-iomap_writepage_ctx/20250624-102709
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
> patch link:    https://lore.kernel.org/r/20250624022135.832899-14-joannelkoong%40gmail.com
> patch subject: [PATCH v3 13/16] fuse: use iomap for writeback
> config: arm64-randconfig-003-20250625 (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/config)
> compiler: aarch64-linux-gcc (GCC) 12.3.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506252117.9V3HTO0i-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepages':
> >> file.c:(.text+0xa30): undefined reference to `iomap_writepages'
> >> file.c:(.text+0xa30): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_writepages'
>    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepage_finish':
> >> file.c:(.text+0x1fd8): undefined reference to `iomap_finish_folio_write'
> >> file.c:(.text+0x1fd8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_finish_folio_write'
>    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_cache_write_iter':
>    file.c:(.text+0x888c): undefined reference to `iomap_file_buffered_write'
>    file.c:(.text+0x888c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_file_buffered_write'
>    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_iomap_writeback_range':
> >> file.c:(.text+0x9258): undefined reference to `iomap_start_folio_write'
> >> file.c:(.text+0x9258): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_start_folio_write'
> >> aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x370): undefined reference to `iomap_dirty_folio'
>    aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x3a0): undefined reference to `iomap_release_folio'
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

As noted in the cover letter [1]

   Please note the following:
   * this patchset version temporarily drops the CONFIG_BLOCK iomap refactoring
     patches that will be needed to merge in the series. As of now, this breaks
     compilation for environments where CONFIG_BLOCK is not set, but the
     CONFIG_BLOCK iomap changes will be re-added back in once the core changes
     in this patchset are ready to go.


These errors will be fixed in later versions when the CONFIG_BLOCK
patches get added in.


Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/20250624022135.832899-1-joannelkoong@gmail.com/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 14/16] fuse: use iomap for folio laundering
  2025-06-24  2:21 ` [PATCH v3 14/16] fuse: use iomap for folio laundering Joanne Koong
@ 2025-06-25 17:02   ` kernel test robot
  2025-07-02 18:14   ` Darrick J. Wong
  1 sibling, 0 replies; 50+ messages in thread
From: kernel test robot @ 2025-06-25 17:02 UTC (permalink / raw)
  To: Joanne Koong, linux-fsdevel
  Cc: oe-kbuild-all, hch, miklos, brauner, djwong, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

Hi Joanne,

kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on xfs-linux/for-next linus/master v6.16-rc3 next-20250625]
[cannot apply to gfs2/for-next mszeredi-fuse/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/iomap-pass-more-arguments-using-struct-iomap_writepage_ctx/20250624-102709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20250624022135.832899-15-joannelkoong%40gmail.com
patch subject: [PATCH v3 14/16] fuse: use iomap for folio laundering
config: arm64-randconfig-003-20250625 (https://download.01.org/0day-ci/archive/20250626/202506260003.qJL8KxcS-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250626/202506260003.qJL8KxcS-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506260003.qJL8KxcS-lkp@intel.com/

All errors (new ones prefixed by >>):

   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepages':
   file.c:(.text+0xbf0): undefined reference to `iomap_writepages'
   file.c:(.text+0xbf0): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_writepages'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepage_finish':
   file.c:(.text+0x1fd8): undefined reference to `iomap_finish_folio_write'
   file.c:(.text+0x1fd8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_finish_folio_write'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_cache_write_iter':
   file.c:(.text+0x888c): undefined reference to `iomap_file_buffered_write'
   file.c:(.text+0x888c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_file_buffered_write'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_launder_folio':
>> file.c:(.text+0x8f74): undefined reference to `iomap_writeback_folio'
>> file.c:(.text+0x8f74): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_writeback_folio'
   aarch64-linux-ld: fs/fuse/file.o: in function `fuse_iomap_writeback_range':
   file.c:(.text+0x90d8): undefined reference to `iomap_start_folio_write'
   file.c:(.text+0x90d8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_start_folio_write'
   aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x370): undefined reference to `iomap_dirty_folio'
   aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x3a0): undefined reference to `iomap_release_folio'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 13/16] fuse: use iomap for writeback
  2025-06-25 16:48     ` Joanne Koong
@ 2025-07-01 11:34       ` Christian Brauner
  0 siblings, 0 replies; 50+ messages in thread
From: Christian Brauner @ 2025-07-01 11:34 UTC (permalink / raw)
  To: Joanne Koong
  Cc: kernel test robot, linux-fsdevel, oe-kbuild-all, hch, miklos,
	djwong, anuj20.g, linux-xfs, linux-doc, linux-block, gfs2,
	kernel-team

On Wed, Jun 25, 2025 at 09:48:15AM -0700, Joanne Koong wrote:
> On Wed, Jun 25, 2025 at 7:18 AM kernel test robot <lkp@intel.com> wrote:
> >
> > Hi Joanne,
> >
> > kernel test robot noticed the following build errors:
> >
> > [auto build test ERROR on brauner-vfs/vfs.all]
> > [also build test ERROR on xfs-linux/for-next linus/master v6.16-rc3 next-20250625]
> > [cannot apply to gfs2/for-next mszeredi-fuse/for-next]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> >
> > url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/iomap-pass-more-arguments-using-struct-iomap_writepage_ctx/20250624-102709
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
> > patch link:    https://lore.kernel.org/r/20250624022135.832899-14-joannelkoong%40gmail.com
> > patch subject: [PATCH v3 13/16] fuse: use iomap for writeback
> > config: arm64-randconfig-003-20250625 (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/config)
> > compiler: aarch64-linux-gcc (GCC) 12.3.0
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506252117.9V3HTO0i-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202506252117.9V3HTO0i-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> >    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepages':
> > >> file.c:(.text+0xa30): undefined reference to `iomap_writepages'
> > >> file.c:(.text+0xa30): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_writepages'
> >    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_writepage_finish':
> > >> file.c:(.text+0x1fd8): undefined reference to `iomap_finish_folio_write'
> > >> file.c:(.text+0x1fd8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_finish_folio_write'
> >    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_cache_write_iter':
> >    file.c:(.text+0x888c): undefined reference to `iomap_file_buffered_write'
> >    file.c:(.text+0x888c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_file_buffered_write'
> >    aarch64-linux-ld: fs/fuse/file.o: in function `fuse_iomap_writeback_range':
> > >> file.c:(.text+0x9258): undefined reference to `iomap_start_folio_write'
> > >> file.c:(.text+0x9258): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `iomap_start_folio_write'
> > >> aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x370): undefined reference to `iomap_dirty_folio'
> >    aarch64-linux-ld: fs/fuse/file.o:(.rodata+0x3a0): undefined reference to `iomap_release_folio'
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> 
> As noted in the cover letter [1]
> 
>    Please note the following:
>    * this patchset version temporarily drops the CONFIG_BLOCK iomap refactoring
>      patches that will be needed to merge in the series. As of now, this breaks
>      compilation for environments where CONFIG_BLOCK is not set, but the
>      CONFIG_BLOCK iomap changes will be re-added back in once the core changes
>      in this patchset are ready to go.
> 
> 
> These errors will be fixed in later versions when the CONFIG_BLOCK
> patches get added in.

Ok, sounds good. Main thing is I can merge it all in one go.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx
  2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
  2025-06-24  6:07   ` Johannes Thumshirn
@ 2025-07-02 15:57   ` Darrick J. Wong
  1 sibling, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 15:57 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:20PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Add inode and wbc fields to pass the inode and writeback context that
> are needed in the entire writeback call chain, and let the callers
> initialize all fields in the writeback context before calling
> iomap_writepages to simplify the argument passing.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Joanne Koong <joannelkoong@gmail.com>

Nice signature reduction :)
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  block/fops.c           |  8 +++++--
>  fs/gfs2/aops.c         |  8 +++++--
>  fs/iomap/buffered-io.c | 52 +++++++++++++++++++-----------------------
>  fs/xfs/xfs_aops.c      | 24 +++++++++++++------
>  fs/zonefs/file.c       |  8 +++++--
>  include/linux/iomap.h  |  6 ++---
>  6 files changed, 61 insertions(+), 45 deletions(-)
> 
> diff --git a/block/fops.c b/block/fops.c
> index 1309861d4c2c..3394263d942b 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -558,9 +558,13 @@ static const struct iomap_writeback_ops blkdev_writeback_ops = {
>  static int blkdev_writepages(struct address_space *mapping,
>  		struct writeback_control *wbc)
>  {
> -	struct iomap_writepage_ctx wpc = { };
> +	struct iomap_writepage_ctx wpc = {
> +		.inode		= mapping->host,
> +		.wbc		= wbc,
> +		.ops		= &blkdev_writeback_ops
> +	};
>  
> -	return iomap_writepages(mapping, wbc, &wpc, &blkdev_writeback_ops);
> +	return iomap_writepages(&wpc);
>  }
>  
>  const struct address_space_operations def_blk_aops = {
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index 14f204cd5a82..47d74afd63ac 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -159,7 +159,11 @@ static int gfs2_writepages(struct address_space *mapping,
>  			   struct writeback_control *wbc)
>  {
>  	struct gfs2_sbd *sdp = gfs2_mapping2sbd(mapping);
> -	struct iomap_writepage_ctx wpc = { };
> +	struct iomap_writepage_ctx wpc = {
> +		.inode		= mapping->host,
> +		.wbc		= wbc,
> +		.ops		= &gfs2_writeback_ops,
> +	};
>  	int ret;
>  
>  	/*
> @@ -168,7 +172,7 @@ static int gfs2_writepages(struct address_space *mapping,
>  	 * want balance_dirty_pages() to loop indefinitely trying to write out
>  	 * pages held in the ail that it can't find.
>  	 */
> -	ret = iomap_writepages(mapping, wbc, &wpc, &gfs2_writeback_ops);
> +	ret = iomap_writepages(&wpc);
>  	if (ret == 0 && wbc->nr_to_write > 0)
>  		set_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags);
>  	return ret;
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 3729391a18f3..71ad17bf827f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1626,20 +1626,19 @@ static int iomap_submit_ioend(struct iomap_writepage_ctx *wpc, int error)
>  }
>  
>  static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
> -		struct writeback_control *wbc, struct inode *inode, loff_t pos,
> -		u16 ioend_flags)
> +		loff_t pos, u16 ioend_flags)
>  {
>  	struct bio *bio;
>  
>  	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
> -			       REQ_OP_WRITE | wbc_to_write_flags(wbc),
> +			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
>  			       GFP_NOFS, &iomap_ioend_bioset);
>  	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
>  	bio->bi_end_io = iomap_writepage_end_bio;
> -	bio->bi_write_hint = inode->i_write_hint;
> -	wbc_init_bio(wbc, bio);
> +	bio->bi_write_hint = wpc->inode->i_write_hint;
> +	wbc_init_bio(wpc->wbc, bio);
>  	wpc->nr_folios = 0;
> -	return iomap_init_ioend(inode, bio, pos, ioend_flags);
> +	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
>  }
>  
>  static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
> @@ -1678,9 +1677,7 @@ static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
>   * writepage context that the caller will need to submit.
>   */
>  static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
> -		struct writeback_control *wbc, struct folio *folio,
> -		struct inode *inode, loff_t pos, loff_t end_pos,
> -		unsigned len)
> +		struct folio *folio, loff_t pos, loff_t end_pos, unsigned len)
>  {
>  	struct iomap_folio_state *ifs = folio->private;
>  	size_t poff = offset_in_folio(folio, pos);
> @@ -1701,8 +1698,7 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  		error = iomap_submit_ioend(wpc, 0);
>  		if (error)
>  			return error;
> -		wpc->ioend = iomap_alloc_ioend(wpc, wbc, inode, pos,
> -				ioend_flags);
> +		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
>  	}
>  
>  	if (!bio_add_folio(&wpc->ioend->io_bio, folio, len, poff))
> @@ -1756,24 +1752,24 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  	if (wpc->ioend->io_offset + wpc->ioend->io_size > end_pos)
>  		wpc->ioend->io_size = end_pos - wpc->ioend->io_offset;
>  
> -	wbc_account_cgroup_owner(wbc, folio, len);
> +	wbc_account_cgroup_owner(wpc->wbc, folio, len);
>  	return 0;
>  }
>  
>  static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
> -		struct writeback_control *wbc, struct folio *folio,
> -		struct inode *inode, u64 pos, u64 end_pos,
> -		unsigned dirty_len, unsigned *count)
> +		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
> +		unsigned *count)
>  {
>  	int error;
>  
>  	do {
>  		unsigned map_len;
>  
> -		error = wpc->ops->map_blocks(wpc, inode, pos, dirty_len);
> +		error = wpc->ops->map_blocks(wpc, wpc->inode, pos, dirty_len);
>  		if (error)
>  			break;
> -		trace_iomap_writepage_map(inode, pos, dirty_len, &wpc->iomap);
> +		trace_iomap_writepage_map(wpc->inode, pos, dirty_len,
> +				&wpc->iomap);
>  
>  		map_len = min_t(u64, dirty_len,
>  			wpc->iomap.offset + wpc->iomap.length - pos);
> @@ -1787,8 +1783,8 @@ static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
>  		case IOMAP_HOLE:
>  			break;
>  		default:
> -			error = iomap_add_to_ioend(wpc, wbc, folio, inode, pos,
> -					end_pos, map_len);
> +			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
> +					map_len);
>  			if (!error)
>  				(*count)++;
>  			break;
> @@ -1870,10 +1866,10 @@ static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
>  }
>  
>  static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> -		struct writeback_control *wbc, struct folio *folio)
> +		struct folio *folio)
>  {
>  	struct iomap_folio_state *ifs = folio->private;
> -	struct inode *inode = folio->mapping->host;
> +	struct inode *inode = wpc->inode;
>  	u64 pos = folio_pos(folio);
>  	u64 end_pos = pos + folio_size(folio);
>  	u64 end_aligned = 0;
> @@ -1920,8 +1916,8 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	 */
>  	end_aligned = round_up(end_pos, i_blocksize(inode));
>  	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
> -		error = iomap_writepage_map_blocks(wpc, wbc, folio, inode,
> -				pos, end_pos, rlen, &count);
> +		error = iomap_writepage_map_blocks(wpc, folio, pos, end_pos,
> +				rlen, &count);
>  		if (error)
>  			break;
>  		pos += rlen;
> @@ -1957,10 +1953,9 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  }
>  
>  int
> -iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
> -		struct iomap_writepage_ctx *wpc,
> -		const struct iomap_writeback_ops *ops)
> +iomap_writepages(struct iomap_writepage_ctx *wpc)
>  {
> +	struct address_space *mapping = wpc->inode->i_mapping;
>  	struct folio *folio = NULL;
>  	int error;
>  
> @@ -1972,9 +1967,8 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
>  			PF_MEMALLOC))
>  		return -EIO;
>  
> -	wpc->ops = ops;
> -	while ((folio = writeback_iter(mapping, wbc, folio, &error)))
> -		error = iomap_writepage_map(wpc, wbc, folio);
> +	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
> +		error = iomap_writepage_map(wpc, folio);
>  	return iomap_submit_ioend(wpc, error);
>  }
>  EXPORT_SYMBOL_GPL(iomap_writepages);
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 63151feb9c3f..65485a52df3b 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -636,19 +636,29 @@ xfs_vm_writepages(
>  	xfs_iflags_clear(ip, XFS_ITRUNCATED);
>  
>  	if (xfs_is_zoned_inode(ip)) {
> -		struct xfs_zoned_writepage_ctx	xc = { };
> +		struct xfs_zoned_writepage_ctx	xc = {
> +			.ctx = {
> +				.inode	= mapping->host,
> +				.wbc	= wbc,
> +				.ops	= &xfs_zoned_writeback_ops
> +			},
> +		};
>  		int				error;
>  
> -		error = iomap_writepages(mapping, wbc, &xc.ctx,
> -					 &xfs_zoned_writeback_ops);
> +		error = iomap_writepages(&xc.ctx);
>  		if (xc.open_zone)
>  			xfs_open_zone_put(xc.open_zone);
>  		return error;
>  	} else {
> -		struct xfs_writepage_ctx	wpc = { };
> -
> -		return iomap_writepages(mapping, wbc, &wpc.ctx,
> -				&xfs_writeback_ops);
> +		struct xfs_writepage_ctx	wpc = {
> +			.ctx = {
> +				.inode	= mapping->host,
> +				.wbc	= wbc,
> +				.ops	= &xfs_writeback_ops
> +			},
> +		};
> +
> +		return iomap_writepages(&wpc.ctx);
>  	}
>  }
>  
> diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
> index 42e2c0065bb3..edca4bbe4b72 100644
> --- a/fs/zonefs/file.c
> +++ b/fs/zonefs/file.c
> @@ -152,9 +152,13 @@ static const struct iomap_writeback_ops zonefs_writeback_ops = {
>  static int zonefs_writepages(struct address_space *mapping,
>  			     struct writeback_control *wbc)
>  {
> -	struct iomap_writepage_ctx wpc = { };
> +	struct iomap_writepage_ctx wpc = {
> +		.inode		= mapping->host,
> +		.wbc		= wbc,
> +		.ops		= &zonefs_writeback_ops,
> +	};
>  
> -	return iomap_writepages(mapping, wbc, &wpc, &zonefs_writeback_ops);
> +	return iomap_writepages(&wpc);
>  }
>  
>  static int zonefs_swap_activate(struct swap_info_struct *sis,
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 522644d62f30..00179c9387c5 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -448,6 +448,8 @@ struct iomap_writeback_ops {
>  
>  struct iomap_writepage_ctx {
>  	struct iomap		iomap;
> +	struct inode		*inode;
> +	struct writeback_control *wbc;
>  	struct iomap_ioend	*ioend;
>  	const struct iomap_writeback_ops *ops;
>  	u32			nr_folios;	/* folios added to the ioend */
> @@ -461,9 +463,7 @@ void iomap_finish_ioends(struct iomap_ioend *ioend, int error);
>  void iomap_ioend_try_merge(struct iomap_ioend *ioend,
>  		struct list_head *more_ioends);
>  void iomap_sort_ioends(struct list_head *ioend_list);
> -int iomap_writepages(struct address_space *mapping,
> -		struct writeback_control *wbc, struct iomap_writepage_ctx *wpc,
> -		const struct iomap_writeback_ops *ops);
> +int iomap_writepages(struct iomap_writepage_ctx *wpc);
>  
>  /*
>   * Flags for direct I/O ->end_io:
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
  2025-06-24  2:21 ` [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks Joanne Koong
  2025-06-24  6:09   ` Johannes Thumshirn
@ 2025-07-02 15:59   ` Darrick J. Wong
  1 sibling, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 15:59 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:21PM -0700, Joanne Koong wrote:
> We don't care about the count of outstanding ioends, only whether there is
> one.
> Replace the count variable passed to iomap_writepage_map_blocks with a
> boolean to make that more clear.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> [hch: rename the variable, update the commit message]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

/methinks this also fixes a theoretical logic bug if *count should
ever overflow back to zero so

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D
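
To make the theoretical wrap concrete (a contrived illustration; no real
writeback pass can queue this many ranges for one folio, so this is purely
about the type):

	unsigned int count = 0;
	unsigned long long i;

	/* What (*count)++ did for each range queued for writeback. */
	for (i = 0; i < (1ULL << 32); i++)
		count++;

	/* count has now wrapped back to 0, so "if (count)" would wrongly
	 * conclude that no writeback was queued; a bool cannot wrap. */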

> ---
>  fs/iomap/buffered-io.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 71ad17bf827f..11a55da26a6f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1758,7 +1758,7 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  
>  static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
>  		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
> -		unsigned *count)
> +		bool *wb_pending)
>  {
>  	int error;
>  
> @@ -1786,7 +1786,7 @@ static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
>  			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
>  					map_len);
>  			if (!error)
> -				(*count)++;
> +				*wb_pending = true;
>  			break;
>  		}
>  		dirty_len -= map_len;
> @@ -1873,7 +1873,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	u64 pos = folio_pos(folio);
>  	u64 end_pos = pos + folio_size(folio);
>  	u64 end_aligned = 0;
> -	unsigned count = 0;
> +	bool wb_pending = false;
>  	int error = 0;
>  	u32 rlen;
>  
> @@ -1917,13 +1917,13 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	end_aligned = round_up(end_pos, i_blocksize(inode));
>  	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
>  		error = iomap_writepage_map_blocks(wpc, folio, pos, end_pos,
> -				rlen, &count);
> +				rlen, &wb_pending);
>  		if (error)
>  			break;
>  		pos += rlen;
>  	}
>  
> -	if (count)
> +	if (wb_pending)
>  		wpc->nr_folios++;
>  
>  	/*
> @@ -1945,7 +1945,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		if (atomic_dec_and_test(&ifs->write_bytes_pending))
>  			folio_end_writeback(folio);
>  	} else {
> -		if (!count)
> +		if (!wb_pending)
>  			folio_end_writeback(folio);
>  	}
>  	mapping_set_error(inode->i_mapping, error);
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 03/16] iomap: refactor the writeback interface
  2025-06-24  2:21 ` [PATCH v3 03/16] iomap: refactor the writeback interface Joanne Koong
@ 2025-07-02 17:13   ` Darrick J. Wong
  2025-07-03 12:16     ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:13 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:22PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Replace ->map_blocks with a new ->writeback_range, which differs in the
> following ways:
> 
>  - it must also queue up the I/O for writeback; that is, it calls into
>    the slightly refactored and extended-in-scope iomap_add_to_ioend for
>    each region
>  - it can handle only a part of the requested region; that is, the retry
>    loop for partial mappings moves to the caller
>  - it handles cleanup on failures as well, and thus also replaces the
>    discard_folio method, which only XFS implemented.
> 
> This will allow the iomap writeback code to also be used for file
> systems that are not block based, like fuse.

Sounds like a good goal. ;)
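
For orientation, a minimal sketch of what the new contract asks of a
filesystem (hypothetical callback and mapper names; the real conversions
are in the per-fs hunks below): map the dirty range into wpc->iomap, queue
it via iomap_add_to_ioend, and report how many bytes were consumed.

	static ssize_t myfs_writeback_range(struct iomap_writepage_ctx *wpc,
			struct folio *folio, u64 pos, unsigned int len,
			u64 end_pos)
	{
		int error;

		/* Fill wpc->iomap with a mapping covering at least part of
		 * [pos, pos + len); myfs_map_blocks is a hypothetical mapper. */
		error = myfs_map_blocks(wpc, pos, len);
		if (error) {
			/* Failure cleanup (the old ->discard_folio job)
			 * would also live here. */
			return error;
		}

		/* Queue the mapped part for writeback; this may consume less
		 * than len, in which case the core loop calls back in for
		 * the rest. */
		return iomap_add_to_ioend(wpc, folio, pos, end_pos, len);
	}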

> Co-developed-by: Joanne Koong <joannelkoong@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  .../filesystems/iomap/operations.rst          |  23 +--
>  block/fops.c                                  |  25 ++-
>  fs/gfs2/bmap.c                                |  26 +--
>  fs/iomap/buffered-io.c                        |  93 +++++------
>  fs/iomap/trace.h                              |   2 +-
>  fs/xfs/xfs_aops.c                             | 154 ++++++++++--------
>  fs/zonefs/file.c                              |  28 ++--
>  include/linux/iomap.h                         |  20 +--
>  8 files changed, 187 insertions(+), 184 deletions(-)
> 
> diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> index 3b628e370d88..b28f215db6e5 100644
> --- a/Documentation/filesystems/iomap/operations.rst
> +++ b/Documentation/filesystems/iomap/operations.rst

<snip>

> @@ -283,15 +283,14 @@ The ``ops`` structure must be specified and is as follows:
>  .. code-block:: c
>  
>   struct iomap_writeback_ops {
> -     int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
> -                       loff_t offset, unsigned len);
> -     int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
> -     void (*discard_folio)(struct folio *folio, loff_t pos);
> +    int (*writeback_range)(struct iomap_writepage_ctx *wpc,
> +    		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);

Why does @pos change from loff_t to u64 here?  Are we expecting
filesystems that set FOP_UNSIGNED_OFFSET?
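
For background on the signedness question (an illustrative aside, not from
the thread; /dev/mem is given as an assumed example of an unsigned-offset
file): loff_t is a signed 64-bit type, so a u64 pos admits offsets that
loff_t cannot represent.

	/* Illustration only: the loff_t vs u64 difference is the upper half
	 * of the offset space, which only FOP_UNSIGNED_OFFSET files (such
	 * as /dev/mem) may legally reach. */
	static bool pos_fits_loff_t(u64 pos)
	{
		return pos <= LLONG_MAX;
	}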

> +    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);

Nit:   ^^ indenting change here.

>   };
>  
>  The fields are as follows:
>  
> -  - ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
> +  - ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
>      range (in bytes) given by ``offset`` and ``len``.
>      iomap calls this function for each dirty fs block in each dirty folio,
>      though it will `reuse mappings

<snip>

> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 11a55da26a6f..80d8acfaa068 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1748,63 +1764,34 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  	 * Note that this defeats the ability to chain the ioends of
>  	 * appending writes.
>  	 */
> -	wpc->ioend->io_size += len;
> +	wpc->ioend->io_size += map_len;
>  	if (wpc->ioend->io_offset + wpc->ioend->io_size > end_pos)
>  		wpc->ioend->io_size = end_pos - wpc->ioend->io_offset;
>  
> -	wbc_account_cgroup_owner(wpc->wbc, folio, len);
> -	return 0;
> +	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
> +	return map_len;
>  }
> +EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
>  
> -static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
> -		struct folio *folio, u64 pos, u64 end_pos, unsigned dirty_len,
> +static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
> +		struct folio *folio, u64 pos, u32 rlen, u64 end_pos,
>  		bool *wb_pending)
>  {
> -	int error;
> -
>  	do {
> -		unsigned map_len;
> -
> -		error = wpc->ops->map_blocks(wpc, wpc->inode, pos, dirty_len);
> -		if (error)
> -			break;
> -		trace_iomap_writepage_map(wpc->inode, pos, dirty_len,
> -				&wpc->iomap);
> -
> -		map_len = min_t(u64, dirty_len,
> -			wpc->iomap.offset + wpc->iomap.length - pos);
> -		WARN_ON_ONCE(!folio->private && map_len < dirty_len);
> +		ssize_t ret;
>  
> -		switch (wpc->iomap.type) {
> -		case IOMAP_INLINE:
> -			WARN_ON_ONCE(1);
> -			error = -EIO;
> -			break;
> -		case IOMAP_HOLE:
> -			break;
> -		default:
> -			error = iomap_add_to_ioend(wpc, folio, pos, end_pos,
> -					map_len);
> -			if (!error)
> -				*wb_pending = true;
> -			break;
> -		}
> -		dirty_len -= map_len;
> -		pos += map_len;
> -	} while (dirty_len && !error);
> +		ret = wpc->ops->writeback_range(wpc, folio, pos, rlen, end_pos);
> +		if (WARN_ON_ONCE(ret == 0 || ret > rlen))
> +			return -EIO;
> +		if (ret < 0)
> +			return ret;
> +		rlen -= ret;
> +		pos += ret;
> +		if (wpc->iomap.type != IOMAP_HOLE)
> +			*wb_pending = true;

/me wonders if this should be an outparam of ->writeback_range to signal
that it actually added the folio to the writeback ioend chain?  Or maybe
just a boolean in iomap_writepage_ctx that we clear before calling
->writeback_range and iomap_add_to_ioend can set it as appropriate?
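
A rough sketch of that second option (a hypothetical wb_added flag in the
writeback context, cleared by the core and set by iomap_add_to_ioend; not
part of the series as posted):

	/* Assumes a new field in struct iomap_writepage_ctx:
	 *	bool wb_added;	set by iomap_add_to_ioend on success
	 */
	static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
			struct folio *folio, u64 pos, u32 rlen, u64 end_pos,
			bool *wb_pending)
	{
		do {
			ssize_t ret;

			wpc->wb_added = false;	/* cleared before each call */
			ret = wpc->ops->writeback_range(wpc, folio, pos, rlen,
					end_pos);
			if (WARN_ON_ONCE(ret == 0 || ret > rlen))
				return -EIO;
			if (ret < 0)
				return ret;
			rlen -= ret;
			pos += ret;
			if (wpc->wb_added)	/* instead of iomap.type check */
				*wb_pending = true;
		} while (rlen);

		return 0;
	}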

> +	} while (rlen);
>  
> -	/*
> -	 * We cannot cancel the ioend directly here on error.  We may have
> -	 * already set other pages under writeback and hence we have to run I/O
> -	 * completion to mark the error state of the pages under writeback
> -	 * appropriately.
> -	 *
> -	 * Just let the file system know what portion of the folio failed to
> -	 * map.
> -	 */
> -	if (error && wpc->ops->discard_folio)
> -		wpc->ops->discard_folio(folio, pos);
> -	return error;
> +	return 0;
>  }
>  
>  /*

<snip>

> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 65485a52df3b..8157b6d92c8e 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -233,6 +233,47 @@ xfs_end_bio(
>  	spin_unlock_irqrestore(&ip->i_ioend_lock, flags);
>  }
>  
> +/*
> + * We cannot cancel the ioend directly on error.  We may have already set other
> + * pages under writeback and hence we have to run I/O completion to mark the
> + * error state of the pages under writeback appropriately.
> + *
> + * If the folio has delalloc blocks on it, the caller is asking us to punch them
> + * out. If we don't, we can leave a stale delalloc mapping covered by a clean
> + * page that needs to be dirtied again before the delalloc mapping can be
> + * converted. This stale delalloc mapping can trip up a later direct I/O read
> + * operation on the same region.
> + *
> + * We prevent this by truncating away the delalloc regions on the folio. Because
> + * they are delalloc, we can do this without needing a transaction. Indeed - if
> + * we get ENOSPC errors, we have to be able to do this truncation without a
> + * transaction as there is no space left for block reservation (typically why
> + * we see a ENOSPC in writeback).
> + */
> +static void
> +xfs_discard_folio(

Nice, one less indirect call. :)

> +	struct folio		*folio,
> +	loff_t			pos)
> +{
> +	struct xfs_inode	*ip = XFS_I(folio->mapping->host);
> +	struct xfs_mount	*mp = ip->i_mount;
> +
> +	if (xfs_is_shutdown(mp))
> +		return;
> +
> +	xfs_alert_ratelimited(mp,
> +		"page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
> +			folio, ip->i_ino, pos);
> +
> +	/*
> +	 * The end of the punch range is always the offset of the first
> +	 * byte of the next folio. Hence the end offset is only dependent on the
> +	 * folio itself and not the start offset that is passed in.
> +	 */
> +	xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, pos,
> +				folio_pos(folio) + folio_size(folio), NULL);
> +}
> +
>  /*
>   * Fast revalidation of the cached writeback mapping. Return true if the current
>   * mapping is valid, false otherwise.

<snip>

> @@ -389,7 +431,12 @@ xfs_map_blocks(
>  
>  	xfs_bmbt_to_iomap(ip, &wpc->iomap, &imap, 0, 0, XFS_WPC(wpc)->data_seq);
>  	trace_xfs_map_blocks_found(ip, offset, count, whichfork, &imap);
> -	return 0;
> +map_blocks:

Should this jump label be named add_to_ioend or something?  We
already mapped the blocks.  The same applies to the zoned version of
this function.

--D

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 04/16] iomap: hide ioends from the generic writeback code
  2025-06-24  2:21 ` [PATCH v3 04/16] iomap: hide ioends from the generic writeback code Joanne Koong
@ 2025-07-02 17:38   ` Darrick J. Wong
  2025-07-03 12:19     ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:38 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:23PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Replace the ioend pointer in iomap_writepage_ctx with a void *wb_ctx
> one to facilitate non-block, non-ioend writeback for fuse.  Rename
> the submit_ioend method to writeback_submit and make it mandatory so
> that the generic writeback code stops seeing ioends and bios.
> 
> Co-developed-by: Joanne Koong <joannelkoong@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  .../filesystems/iomap/operations.rst          | 16 +---
>  block/fops.c                                  |  1 +
>  fs/gfs2/bmap.c                                |  1 +
>  fs/iomap/buffered-io.c                        | 91 ++++++++++---------
>  fs/xfs/xfs_aops.c                             | 60 ++++++------
>  fs/zonefs/file.c                              |  1 +
>  include/linux/iomap.h                         | 19 ++--
>  7 files changed, 93 insertions(+), 96 deletions(-)
> 
> diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> index b28f215db6e5..ead56b27ec3f 100644
> --- a/Documentation/filesystems/iomap/operations.rst
> +++ b/Documentation/filesystems/iomap/operations.rst
> @@ -285,7 +285,7 @@ The ``ops`` structure must be specified and is as follows:
>   struct iomap_writeback_ops {
>      int (*writeback_range)(struct iomap_writepage_ctx *wpc,
>      		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
> -    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
> +    int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
>   };
>  
>  The fields are as follows:
> @@ -307,13 +307,7 @@ The fields are as follows:
>      purpose.
>      This function must be supplied by the filesystem.
>  
> -  - ``submit_ioend``: Allows the file systems to hook into writeback bio
> -    submission.
> -    This might include pre-write space accounting updates, or installing
> -    a custom ``->bi_end_io`` function for internal purposes, such as
> -    deferring the ioend completion to a workqueue to run metadata update
> -    transactions from process context before submitting the bio.
> -    This function is optional.
> +  - ``writeback_submit``: Submit the previously built writeback context.
>  
>  Pagecache Writeback Completion
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> @@ -328,12 +322,6 @@ the address space.
>  This can happen in interrupt or process context, depending on the
>  storage device.
>  
> -Filesystems that need to update internal bookkeeping (e.g. unwritten
> -extent conversions) should provide a ``->submit_ioend`` function to
> -set ``struct iomap_end::bio::bi_end_io`` to its own function.
> -This function should call ``iomap_finish_ioends`` after finishing its
> -own work (e.g. unwritten extent conversion).
> -

I really wish you wouldn't delete the documentation that talks about
what sort of things you might do in a ->writeback_submit function.
That might be obvious to us who've been around for a long time, but I
don't think that's so obvious to the junior programmers.

>  Some filesystems may wish to `amortize the cost of running metadata
>  transactions
>  <https://lore.kernel.org/all/20220120034733.221737-1-david@fromorbit.com/>`_

<snip>

> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 80d8acfaa068..50cfddff1393 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1579,7 +1579,7 @@ u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
>  	return folio_count;
>  }
>  
> -static void iomap_writepage_end_bio(struct bio *bio)
> +static void ioend_writeback_end_bio(struct bio *bio)
>  {
>  	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
>  
> @@ -1588,42 +1588,30 @@ static void iomap_writepage_end_bio(struct bio *bio)
>  }
>  
>  /*
> - * Submit an ioend.

Please retain the summary.

> - *
> - * If @error is non-zero, it means that we have a situation where some part of
> - * the submission process has failed after we've marked pages for writeback.
> - * We cannot cancel ioend directly in that case, so call the bio end I/O handler
> - * with the error status here to run the normal I/O completion handler to clear
> - * the writeback bit and let the file system proess the errors.
> + * We cannot cancel the ioend directly in case of an error, so call the bio end
> + * I/O handler with the error status here to run the normal I/O completion
> + * handler.
>   */
> -static int iomap_submit_ioend(struct iomap_writepage_ctx *wpc, int error)
> +int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
>  {
> -	if (!wpc->ioend)
> -		return error;
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
>  
> -	/*
> -	 * Let the file systems prepare the I/O submission and hook in an I/O
> -	 * comletion handler.  This also needs to happen in case after a
> -	 * failure happened so that the file system end I/O handler gets called
> -	 * to clean up.
> -	 */
> -	if (wpc->ops->submit_ioend) {
> -		error = wpc->ops->submit_ioend(wpc, error);
> -	} else {
> -		if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
> -			error = -EIO;
> -		if (!error)
> -			submit_bio(&wpc->ioend->io_bio);
> -	}
> +	if (!ioend->io_bio.bi_end_io)
> +		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
> +
> +	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
> +		error = -EIO;
>  
>  	if (error) {
> -		wpc->ioend->io_bio.bi_status = errno_to_blk_status(error);
> -		bio_endio(&wpc->ioend->io_bio);
> +		ioend->io_bio.bi_status = errno_to_blk_status(error);
> +		bio_endio(&ioend->io_bio);
> +		return error;
>  	}
>  
> -	wpc->ioend = NULL;
> -	return error;
> +	submit_bio(&ioend->io_bio);
> +	return 0;
>  }
> +EXPORT_SYMBOL_GPL(ioend_writeback_submit);
>  
>  static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
>  		loff_t pos, u16 ioend_flags)
> @@ -1634,7 +1622,6 @@ static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
>  			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
>  			       GFP_NOFS, &iomap_ioend_bioset);
>  	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
> -	bio->bi_end_io = iomap_writepage_end_bio;
>  	bio->bi_write_hint = wpc->inode->i_write_hint;
>  	wbc_init_bio(wpc->wbc, bio);
>  	wpc->nr_folios = 0;
> @@ -1644,16 +1631,17 @@ static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
>  static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
>  		u16 ioend_flags)
>  {
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
> +
>  	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
>  		return false;
>  	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
> -	    (wpc->ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
> +	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
>  		return false;
> -	if (pos != wpc->ioend->io_offset + wpc->ioend->io_size)
> +	if (pos != ioend->io_offset + ioend->io_size)
>  		return false;
>  	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
> -	    iomap_sector(&wpc->iomap, pos) !=
> -	    bio_end_sector(&wpc->ioend->io_bio))
> +	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
>  		return false;
>  	/*
>  	 * Limit ioend bio chain lengths to minimise IO completion latency. This
> @@ -1679,6 +1667,7 @@ static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
>  ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
>  		loff_t pos, loff_t end_pos, unsigned int dirty_len)
>  {
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
>  	struct iomap_folio_state *ifs = folio->private;
>  	size_t poff = offset_in_folio(folio, pos);
>  	unsigned int ioend_flags = 0;
> @@ -1709,15 +1698,17 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
>  	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
>  		ioend_flags |= IOMAP_IOEND_BOUNDARY;
>  
> -	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
> +	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
>  new_ioend:
> -		error = iomap_submit_ioend(wpc, 0);
> -		if (error)
> -			return error;
> -		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
> +		if (ioend) {
> +			error = wpc->ops->writeback_submit(wpc, 0);

Should we call ioend_writeback_submit directly if
!wpc->ops->writeback_submit, to avoid the indirect call hit for simpler
filesystems?
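
A sketch of that fallback (hypothetical wrapper; the series as posted makes
->writeback_submit mandatory instead):

	static int iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
			int error)
	{
		/* Avoid the indirect call for filesystems that do not hook
		 * bio submission and just want the generic ioend path. */
		if (wpc->ops->writeback_submit)
			return wpc->ops->writeback_submit(wpc, error);
		return ioend_writeback_submit(wpc, error);
	}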

--D

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation
  2025-06-24  2:21 ` [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation Joanne Koong
@ 2025-07-02 17:39   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:39 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:24PM -0700, Joanne Koong wrote:
> Add a new iomap_start_folio_write helper to abstract away the
> write_bytes_pending handling, and export it and the existing
> iomap_finish_folio_write for non-iomap writeback in fuse.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> [hch: split from a larger patch]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 20 +++++++++++++++-----
>  include/linux/iomap.h  |  5 +++++
>  2 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 50cfddff1393..b4a8d2241d70 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1535,7 +1535,18 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops,
>  }
>  EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
>  
> -static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
> +void iomap_start_folio_write(struct inode *inode, struct folio *folio,
> +		size_t len)
> +{
> +	struct iomap_folio_state *ifs = folio->private;
> +
> +	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
> +	if (ifs)
> +		atomic_add(len, &ifs->write_bytes_pending);
> +}
> +EXPORT_SYMBOL_GPL(iomap_start_folio_write);
> +
> +void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
>  		size_t len)
>  {
>  	struct iomap_folio_state *ifs = folio->private;
> @@ -1546,6 +1557,7 @@ static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
>  	if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
>  		folio_end_writeback(folio);
>  }
> +EXPORT_SYMBOL_GPL(iomap_finish_folio_write);
>  
>  /*
>   * We're now finished for good with this ioend structure.  Update the page
> @@ -1668,7 +1680,6 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
>  		loff_t pos, loff_t end_pos, unsigned int dirty_len)
>  {
>  	struct iomap_ioend *ioend = wpc->wb_ctx;
> -	struct iomap_folio_state *ifs = folio->private;
>  	size_t poff = offset_in_folio(folio, pos);
>  	unsigned int ioend_flags = 0;
>  	unsigned int map_len = min_t(u64, dirty_len,
> @@ -1711,8 +1722,7 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
>  	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
>  		goto new_ioend;
>  
> -	if (ifs)
> -		atomic_add(map_len, &ifs->write_bytes_pending);
> +	iomap_start_folio_write(wpc->inode, folio, map_len);
>  
>  	/*
>  	 * Clamp io_offset and io_size to the incore EOF so that ondisk
> @@ -1880,7 +1890,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		 * all blocks.
>  		 */
>  		WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending) != 0);
> -		atomic_inc(&ifs->write_bytes_pending);
> +		iomap_start_folio_write(inode, folio, 1);
>  	}
>  
>  	/*
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 047100f94092..bfd178fb7cfc 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -460,6 +460,11 @@ void iomap_sort_ioends(struct list_head *ioend_list);
>  ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
>  		loff_t pos, loff_t end_pos, unsigned int dirty_len);
>  int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error);
> +
> +void iomap_start_folio_write(struct inode *inode, struct folio *folio,
> +		size_t len);
> +void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
> +		size_t len);
>  int iomap_writepages(struct iomap_writepage_ctx *wpc);
>  
>  /*
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 06/16] iomap: move all ioend handling to ioend.c
  2025-06-24  2:21 ` [PATCH v3 06/16] iomap: move all ioend handling to ioend.c Joanne Koong
@ 2025-07-02 17:40   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:40 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:25PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Now that the writeback code has the proper abstractions, all the ioend
> code can be self-contained in ioend.c.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Joanne Koong <joannelkoong@gmail.com>

Looks ok,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 215 ----------------------------------------
>  fs/iomap/internal.h    |   1 -
>  fs/iomap/ioend.c       | 220 ++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 219 insertions(+), 217 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index b4a8d2241d70..c262f883f9f9 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1559,221 +1559,6 @@ void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
>  }
>  EXPORT_SYMBOL_GPL(iomap_finish_folio_write);
>  
> -/*
> - * We're now finished for good with this ioend structure.  Update the page
> - * state, release holds on bios, and finally free up memory.  Do not use the
> - * ioend after this.
> - */
> -u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
> -{
> -	struct inode *inode = ioend->io_inode;
> -	struct bio *bio = &ioend->io_bio;
> -	struct folio_iter fi;
> -	u32 folio_count = 0;
> -
> -	if (ioend->io_error) {
> -		mapping_set_error(inode->i_mapping, ioend->io_error);
> -		if (!bio_flagged(bio, BIO_QUIET)) {
> -			pr_err_ratelimited(
> -"%s: writeback error on inode %lu, offset %lld, sector %llu",
> -				inode->i_sb->s_id, inode->i_ino,
> -				ioend->io_offset, ioend->io_sector);
> -		}
> -	}
> -
> -	/* walk all folios in bio, ending page IO on them */
> -	bio_for_each_folio_all(fi, bio) {
> -		iomap_finish_folio_write(inode, fi.folio, fi.length);
> -		folio_count++;
> -	}
> -
> -	bio_put(bio);	/* frees the ioend */
> -	return folio_count;
> -}
> -
> -static void ioend_writeback_end_bio(struct bio *bio)
> -{
> -	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
> -
> -	ioend->io_error = blk_status_to_errno(bio->bi_status);
> -	iomap_finish_ioend_buffered(ioend);
> -}
> -
> -/*
> - * We cannot cancel the ioend directly in case of an error, so call the bio end
> - * I/O handler with the error status here to run the normal I/O completion
> - * handler.
> - */
> -int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
> -{
> -	struct iomap_ioend *ioend = wpc->wb_ctx;
> -
> -	if (!ioend->io_bio.bi_end_io)
> -		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
> -
> -	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
> -		error = -EIO;
> -
> -	if (error) {
> -		ioend->io_bio.bi_status = errno_to_blk_status(error);
> -		bio_endio(&ioend->io_bio);
> -		return error;
> -	}
> -
> -	submit_bio(&ioend->io_bio);
> -	return 0;
> -}
> -EXPORT_SYMBOL_GPL(ioend_writeback_submit);
> -
> -static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
> -		loff_t pos, u16 ioend_flags)
> -{
> -	struct bio *bio;
> -
> -	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
> -			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
> -			       GFP_NOFS, &iomap_ioend_bioset);
> -	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
> -	bio->bi_write_hint = wpc->inode->i_write_hint;
> -	wbc_init_bio(wpc->wbc, bio);
> -	wpc->nr_folios = 0;
> -	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
> -}
> -
> -static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
> -		u16 ioend_flags)
> -{
> -	struct iomap_ioend *ioend = wpc->wb_ctx;
> -
> -	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
> -		return false;
> -	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
> -	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
> -		return false;
> -	if (pos != ioend->io_offset + ioend->io_size)
> -		return false;
> -	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
> -	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
> -		return false;
> -	/*
> -	 * Limit ioend bio chain lengths to minimise IO completion latency. This
> -	 * also prevents long tight loops ending page writeback on all the
> -	 * folios in the ioend.
> -	 */
> -	if (wpc->nr_folios >= IOEND_BATCH_SIZE)
> -		return false;
> -	return true;
> -}
> -
> -/*
> - * Test to see if we have an existing ioend structure that we could append to
> - * first; otherwise finish off the current ioend and start another.
> - *
> - * If a new ioend is created and cached, the old ioend is submitted to the block
> - * layer instantly.  Batching optimisations are provided by higher level block
> - * plugging.
> - *
> - * At the end of a writeback pass, there will be a cached ioend remaining on the
> - * writepage context that the caller will need to submit.
> - */
> -ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
> -		loff_t pos, loff_t end_pos, unsigned int dirty_len)
> -{
> -	struct iomap_ioend *ioend = wpc->wb_ctx;
> -	size_t poff = offset_in_folio(folio, pos);
> -	unsigned int ioend_flags = 0;
> -	unsigned int map_len = min_t(u64, dirty_len,
> -		wpc->iomap.offset + wpc->iomap.length - pos);
> -	int error;
> -
> -	trace_iomap_add_to_ioend(wpc->inode, pos, dirty_len, &wpc->iomap);
> -
> -	WARN_ON_ONCE(!folio->private && map_len < dirty_len);
> -
> -	switch (wpc->iomap.type) {
> -	case IOMAP_INLINE:
> -		WARN_ON_ONCE(1);
> -		return -EIO;
> -	case IOMAP_HOLE:
> -		return map_len;
> -	default:
> -		break;
> -	}
> -
> -	if (wpc->iomap.type == IOMAP_UNWRITTEN)
> -		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
> -	if (wpc->iomap.flags & IOMAP_F_SHARED)
> -		ioend_flags |= IOMAP_IOEND_SHARED;
> -	if (folio_test_dropbehind(folio))
> -		ioend_flags |= IOMAP_IOEND_DONTCACHE;
> -	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
> -		ioend_flags |= IOMAP_IOEND_BOUNDARY;
> -
> -	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
> -new_ioend:
> -		if (ioend) {
> -			error = wpc->ops->writeback_submit(wpc, 0);
> -			if (error)
> -				return error;
> -		}
> -		wpc->wb_ctx = ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
> -	}
> -
> -	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
> -		goto new_ioend;
> -
> -	iomap_start_folio_write(wpc->inode, folio, map_len);
> -
> -	/*
> -	 * Clamp io_offset and io_size to the incore EOF so that ondisk
> -	 * file size updates in the ioend completion are byte-accurate.
> -	 * This avoids recovering files with zeroed tail regions when
> -	 * writeback races with appending writes:
> -	 *
> -	 *    Thread 1:                  Thread 2:
> -	 *    ------------               -----------
> -	 *    write [A, A+B]
> -	 *    update inode size to A+B
> -	 *    submit I/O [A, A+BS]
> -	 *                               write [A+B, A+B+C]
> -	 *                               update inode size to A+B+C
> -	 *    <I/O completes, updates disk size to min(A+B+C, A+BS)>
> -	 *    <power failure>
> -	 *
> -	 *  After reboot:
> -	 *    1) with A+B+C < A+BS, the file has zero padding in range
> -	 *       [A+B, A+B+C]
> -	 *
> -	 *    |<     Block Size (BS)   >|
> -	 *    |DDDDDDDDDDDD0000000000000|
> -	 *    ^           ^        ^
> -	 *    A          A+B     A+B+C
> -	 *                       (EOF)
> -	 *
> -	 *    2) with A+B+C > A+BS, the file has zero padding in range
> -	 *       [A+B, A+BS]
> -	 *
> -	 *    |<     Block Size (BS)   >|<     Block Size (BS)    >|
> -	 *    |DDDDDDDDDDDD0000000000000|00000000000000000000000000|
> -	 *    ^           ^             ^           ^
> -	 *    A          A+B           A+BS       A+B+C
> -	 *                             (EOF)
> -	 *
> -	 *    D = Valid Data
> -	 *    0 = Zero Padding
> -	 *
> -	 * Note that this defeats the ability to chain the ioends of
> -	 * appending writes.
> -	 */
> -	ioend->io_size += map_len;
> -	if (ioend->io_offset + ioend->io_size > end_pos)
> -		ioend->io_size = end_pos - ioend->io_offset;
> -
> -	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
> -	return map_len;
> -}
> -EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
> -
>  static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
>  		struct folio *folio, u64 pos, u32 rlen, u64 end_pos,
>  		bool *wb_pending)
> diff --git a/fs/iomap/internal.h b/fs/iomap/internal.h
> index f6992a3bf66a..d05cb3aed96e 100644
> --- a/fs/iomap/internal.h
> +++ b/fs/iomap/internal.h
> @@ -4,7 +4,6 @@
>  
>  #define IOEND_BATCH_SIZE	4096
>  
> -u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend);
>  u32 iomap_finish_ioend_direct(struct iomap_ioend *ioend);
>  
>  #endif /* _IOMAP_INTERNAL_H */
> diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
> index 18894ebba6db..81f4bac5a3a9 100644
> --- a/fs/iomap/ioend.c
> +++ b/fs/iomap/ioend.c
> @@ -1,10 +1,13 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /*
> - * Copyright (c) 2024-2025 Christoph Hellwig.
> + * Copyright (c) 2016-2025 Christoph Hellwig.
>   */
>  #include <linux/iomap.h>
>  #include <linux/list_sort.h>
> +#include <linux/pagemap.h>
> +#include <linux/writeback.h>
>  #include "internal.h"
> +#include "trace.h"
>  
>  struct bio_set iomap_ioend_bioset;
>  EXPORT_SYMBOL_GPL(iomap_ioend_bioset);
> @@ -28,6 +31,221 @@ struct iomap_ioend *iomap_init_ioend(struct inode *inode,
>  }
>  EXPORT_SYMBOL_GPL(iomap_init_ioend);
>  
> +/*
> + * We're now finished for good with this ioend structure.  Update the folio
> + * state, release holds on bios, and finally free up memory.  Do not use the
> + * ioend after this.
> + */
> +static u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
> +{
> +	struct inode *inode = ioend->io_inode;
> +	struct bio *bio = &ioend->io_bio;
> +	struct folio_iter fi;
> +	u32 folio_count = 0;
> +
> +	if (ioend->io_error) {
> +		mapping_set_error(inode->i_mapping, ioend->io_error);
> +		if (!bio_flagged(bio, BIO_QUIET)) {
> +			pr_err_ratelimited(
> +"%s: writeback error on inode %lu, offset %lld, sector %llu",
> +				inode->i_sb->s_id, inode->i_ino,
> +				ioend->io_offset, ioend->io_sector);
> +		}
> +	}
> +
> +	/* walk all folios in bio, ending page IO on them */
> +	bio_for_each_folio_all(fi, bio) {
> +		iomap_finish_folio_write(inode, fi.folio, fi.length);
> +		folio_count++;
> +	}
> +
> +	bio_put(bio);	/* frees the ioend */
> +	return folio_count;
> +}
> +
> +static void ioend_writeback_end_bio(struct bio *bio)
> +{
> +	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
> +
> +	ioend->io_error = blk_status_to_errno(bio->bi_status);
> +	iomap_finish_ioend_buffered(ioend);
> +}
> +
> +/*
> + * We cannot cancel the ioend directly in case of an error, so call the bio end
> + * I/O handler with the error status here to run the normal I/O completion
> + * handler.
> + */
> +int ioend_writeback_submit(struct iomap_writepage_ctx *wpc, int error)
> +{
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
> +
> +	if (!ioend->io_bio.bi_end_io)
> +		ioend->io_bio.bi_end_io = ioend_writeback_end_bio;
> +
> +	if (WARN_ON_ONCE(wpc->iomap.flags & IOMAP_F_ANON_WRITE))
> +		error = -EIO;
> +
> +	if (error) {
> +		ioend->io_bio.bi_status = errno_to_blk_status(error);
> +		bio_endio(&ioend->io_bio);
> +		return error;
> +	}
> +
> +	submit_bio(&ioend->io_bio);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(ioend_writeback_submit);
> +
> +static struct iomap_ioend *iomap_alloc_ioend(struct iomap_writepage_ctx *wpc,
> +		loff_t pos, u16 ioend_flags)
> +{
> +	struct bio *bio;
> +
> +	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
> +			       REQ_OP_WRITE | wbc_to_write_flags(wpc->wbc),
> +			       GFP_NOFS, &iomap_ioend_bioset);
> +	bio->bi_iter.bi_sector = iomap_sector(&wpc->iomap, pos);
> +	bio->bi_write_hint = wpc->inode->i_write_hint;
> +	wbc_init_bio(wpc->wbc, bio);
> +	wpc->nr_folios = 0;
> +	return iomap_init_ioend(wpc->inode, bio, pos, ioend_flags);
> +}
> +
> +static bool iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t pos,
> +		u16 ioend_flags)
> +{
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
> +
> +	if (ioend_flags & IOMAP_IOEND_BOUNDARY)
> +		return false;
> +	if ((ioend_flags & IOMAP_IOEND_NOMERGE_FLAGS) !=
> +	    (ioend->io_flags & IOMAP_IOEND_NOMERGE_FLAGS))
> +		return false;
> +	if (pos != ioend->io_offset + ioend->io_size)
> +		return false;
> +	if (!(wpc->iomap.flags & IOMAP_F_ANON_WRITE) &&
> +	    iomap_sector(&wpc->iomap, pos) != bio_end_sector(&ioend->io_bio))
> +		return false;
> +	/*
> +	 * Limit ioend bio chain lengths to minimise IO completion latency. This
> +	 * also prevents long tight loops ending page writeback on all the
> +	 * folios in the ioend.
> +	 */
> +	if (wpc->nr_folios >= IOEND_BATCH_SIZE)
> +		return false;
> +	return true;
> +}
> +
> +/*
> + * Test to see if we have an existing ioend structure that we could append to
> + * first; otherwise finish off the current ioend and start another.
> + *
> + * If a new ioend is created and cached, the old ioend is submitted to the block
> + * layer instantly.  Batching optimisations are provided by higher level block
> + * plugging.
> + *
> + * At the end of a writeback pass, there will be a cached ioend remaining on the
> + * writepage context that the caller will need to submit.
> + */
> +ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
> +		loff_t pos, loff_t end_pos, unsigned int dirty_len)
> +{
> +	struct iomap_ioend *ioend = wpc->wb_ctx;
> +	size_t poff = offset_in_folio(folio, pos);
> +	unsigned int ioend_flags = 0;
> +	unsigned int map_len = min_t(u64, dirty_len,
> +		wpc->iomap.offset + wpc->iomap.length - pos);
> +	int error;
> +
> +	trace_iomap_add_to_ioend(wpc->inode, pos, dirty_len, &wpc->iomap);
> +
> +	WARN_ON_ONCE(!folio->private && map_len < dirty_len);
> +
> +	switch (wpc->iomap.type) {
> +	case IOMAP_INLINE:
> +		WARN_ON_ONCE(1);
> +		return -EIO;
> +	case IOMAP_HOLE:
> +		return map_len;
> +	default:
> +		break;
> +	}
> +
> +	if (wpc->iomap.type == IOMAP_UNWRITTEN)
> +		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
> +	if (wpc->iomap.flags & IOMAP_F_SHARED)
> +		ioend_flags |= IOMAP_IOEND_SHARED;
> +	if (folio_test_dropbehind(folio))
> +		ioend_flags |= IOMAP_IOEND_DONTCACHE;
> +	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
> +		ioend_flags |= IOMAP_IOEND_BOUNDARY;
> +
> +	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
> +new_ioend:
> +		if (ioend) {
> +			error = wpc->ops->writeback_submit(wpc, 0);
> +			if (error)
> +				return error;
> +		}
> +		wpc->wb_ctx = ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
> +	}
> +
> +	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
> +		goto new_ioend;
> +
> +	iomap_start_folio_write(wpc->inode, folio, map_len);
> +
> +	/*
> +	 * Clamp io_offset and io_size to the incore EOF so that ondisk
> +	 * file size updates in the ioend completion are byte-accurate.
> +	 * This avoids recovering files with zeroed tail regions when
> +	 * writeback races with appending writes:
> +	 *
> +	 *    Thread 1:                  Thread 2:
> +	 *    ------------               -----------
> +	 *    write [A, A+B]
> +	 *    update inode size to A+B
> +	 *    submit I/O [A, A+BS]
> +	 *                               write [A+B, A+B+C]
> +	 *                               update inode size to A+B+C
> +	 *    <I/O completes, updates disk size to min(A+B+C, A+BS)>
> +	 *    <power failure>
> +	 *
> +	 *  After reboot:
> +	 *    1) with A+B+C < A+BS, the file has zero padding in range
> +	 *       [A+B, A+B+C]
> +	 *
> +	 *    |<     Block Size (BS)   >|
> +	 *    |DDDDDDDDDDDD0000000000000|
> +	 *    ^           ^        ^
> +	 *    A          A+B     A+B+C
> +	 *                       (EOF)
> +	 *
> +	 *    2) with A+B+C > A+BS, the file has zero padding in range
> +	 *       [A+B, A+BS]
> +	 *
> +	 *    |<     Block Size (BS)   >|<     Block Size (BS)    >|
> +	 *    |DDDDDDDDDDDD0000000000000|00000000000000000000000000|
> +	 *    ^           ^             ^           ^
> +	 *    A          A+B           A+BS       A+B+C
> +	 *                             (EOF)
> +	 *
> +	 *    D = Valid Data
> +	 *    0 = Zero Padding
> +	 *
> +	 * Note that this defeats the ability to chain the ioends of
> +	 * appending writes.
> +	 */
> +	ioend->io_size += map_len;
> +	if (ioend->io_offset + ioend->io_size > end_pos)
> +		ioend->io_size = end_pos - ioend->io_offset;
> +
> +	wbc_account_cgroup_owner(wpc->wbc, folio, map_len);
> +	return map_len;
> +}
> +EXPORT_SYMBOL_GPL(iomap_add_to_ioend);
> +
>  static u32 iomap_finish_ioend(struct iomap_ioend *ioend, int error)
>  {
>  	if (ioend->io_parent) {
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio
  2025-06-24  2:21 ` [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio Joanne Koong
@ 2025-07-02 17:41   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:41 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:26PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> ->writepage is gone, and our naming wasn't always that great to start
> with.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Yesssssss
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index c262f883f9f9..c6bbee68812e 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1586,7 +1586,7 @@ static int iomap_writeback_range(struct iomap_writepage_ctx *wpc,
>   * If the folio is entirely beyond i_size, return false.  If it straddles
>   * i_size, adjust end_pos and zero all data beyond i_size.
>   */
> -static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
> +static bool iomap_writeback_handle_eof(struct folio *folio, struct inode *inode,
>  		u64 *end_pos)
>  {
>  	u64 isize = i_size_read(inode);
> @@ -1638,7 +1638,7 @@ static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
>  	return true;
>  }
>  
> -static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> +static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
>  		struct folio *folio)
>  {
>  	struct iomap_folio_state *ifs = folio->private;
> @@ -1656,7 +1656,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  
>  	trace_iomap_writepage(inode, pos, folio_size(folio));
>  
> -	if (!iomap_writepage_handle_eof(folio, inode, &end_pos)) {
> +	if (!iomap_writeback_handle_eof(folio, inode, &end_pos)) {
>  		folio_unlock(folio);
>  		return 0;
>  	}
> @@ -1741,7 +1741,7 @@ iomap_writepages(struct iomap_writepage_ctx *wpc)
>  		return -EIO;
>  
>  	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
> -		error = iomap_writepage_map(wpc, folio);
> +		error = iomap_writeback_folio(wpc, folio);
>  
>  	/*
>  	 * If @error is non-zero, it means that we have a situation where some
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio
  2025-06-24  2:21 ` [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio Joanne Koong
@ 2025-07-02 17:43   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:43 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:27PM -0700, Joanne Koong wrote:
> Move unlocking the folio out of iomap_writeback_folio into the caller.
> This means the end writeback machinery is now run with the folio locked
> when no writeback happened, or writeback completed extremely fast.
> 
> This prepares for exporting iomap_writeback_folio for use in folio
> laundering.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> [hch: split from a larger patch]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Seems fine to me, since the folio lock state is the same before and
after the call to iomap_writeback_folio.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/iomap/buffered-io.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index c6bbee68812e..2973fced2a52 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1656,10 +1656,8 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
>  
>  	trace_iomap_writepage(inode, pos, folio_size(folio));
>  
> -	if (!iomap_writeback_handle_eof(folio, inode, &end_pos)) {
> -		folio_unlock(folio);
> +	if (!iomap_writeback_handle_eof(folio, inode, &end_pos))
>  		return 0;
> -	}
>  	WARN_ON_ONCE(end_pos <= pos);
>  
>  	if (i_blocks_per_folio(inode, folio) > 1) {
> @@ -1713,7 +1711,6 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
>  	 * already at this point.  In that case we need to clear the writeback
>  	 * bit ourselves right after unlocking the page.
>  	 */
> -	folio_unlock(folio);
>  	if (ifs) {
>  		if (atomic_dec_and_test(&ifs->write_bytes_pending))
>  			folio_end_writeback(folio);
> @@ -1740,8 +1737,10 @@ iomap_writepages(struct iomap_writepage_ctx *wpc)
>  			PF_MEMALLOC))
>  		return -EIO;
>  
> -	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error)))
> +	while ((folio = writeback_iter(mapping, wpc->wbc, folio, &error))) {
>  		error = iomap_writeback_folio(wpc, folio);
> +		folio_unlock(folio);
> +	}
>  
>  	/*
>  	 * If @error is non-zero, it means that we have a situation where some
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/16] iomap: export iomap_writeback_folio
  2025-06-24  2:21 ` [PATCH v3 09/16] iomap: export iomap_writeback_folio Joanne Koong
@ 2025-07-02 17:44   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:44 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:28PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Allow fuse to use iomap_writeback_folio for folio laundering.  Note
> that the caller needs to manually submit the pending writeback context.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Joanne Koong <joannelkoong@gmail.com>

Seems reasonable to me...
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D
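
For a concrete sense of why the export is useful, here is a minimal
sketch of a launder-style caller. This is not taken from the series;
the wpc initialization and the submission step are assumptions, and
the real fuse conversion follows in a later patch:

static int example_launder_folio(struct folio *folio)
{
        struct iomap_writepage_ctx wpc = {
                .inode = folio->mapping->host,
                /* .ops / .wb_ctx filled in as the filesystem requires */
        };
        int err;

        /*
         * iomap_writeback_folio() maps and queues the dirty ranges of
         * this folio but does not submit them; per the commit message,
         * the caller must submit the pending writeback context itself
         * afterwards (the submission helper is filesystem-specific).
         */
        err = iomap_writeback_folio(&wpc, folio);

        /* <submit the pending writeback context here> */

        return err;
}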

> ---
>  fs/iomap/buffered-io.c | 4 ++--
>  include/linux/iomap.h  | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 2973fced2a52..d7fa885b1a0c 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1638,8 +1638,7 @@ static bool iomap_writeback_handle_eof(struct folio *folio, struct inode *inode,
>  	return true;
>  }
>  
> -static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
> -		struct folio *folio)
> +int iomap_writeback_folio(struct iomap_writepage_ctx *wpc, struct folio *folio)
>  {
>  	struct iomap_folio_state *ifs = folio->private;
>  	struct inode *inode = wpc->inode;
> @@ -1721,6 +1720,7 @@ static int iomap_writeback_folio(struct iomap_writepage_ctx *wpc,
>  	mapping_set_error(inode->i_mapping, error);
>  	return error;
>  }
> +EXPORT_SYMBOL_GPL(iomap_writeback_folio);
>  
>  int
>  iomap_writepages(struct iomap_writepage_ctx *wpc)
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index bfd178fb7cfc..7e06b3a392f8 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -465,6 +465,7 @@ void iomap_start_folio_write(struct inode *inode, struct folio *folio,
>  		size_t len);
>  void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
>  		size_t len);
> +int iomap_writeback_folio(struct iomap_writepage_ctx *wpc, struct folio *folio);
>  int iomap_writepages(struct iomap_writepage_ctx *wpc);
>  
>  /*
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops
  2025-06-24  2:21 ` [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops Joanne Koong
@ 2025-07-02 17:48   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:48 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:29PM -0700, Joanne Koong wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> The iomap_folio_ops are only used for buffered writes, including
> the zero and unshare variants.  Rename them to iomap_write_ops
> to better describe the usage, and pass them through the callchain
> like the other operation specific methods instead of through the
> iomap.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  Documentation/filesystems/iomap/design.rst    |  3 -
>  .../filesystems/iomap/operations.rst          |  8 +-
>  block/fops.c                                  |  3 +-
>  fs/gfs2/bmap.c                                | 21 ++---
>  fs/gfs2/bmap.h                                |  1 +
>  fs/gfs2/file.c                                |  3 +-
>  fs/iomap/buffered-io.c                        | 79 +++++++++++--------
>  fs/xfs/xfs_file.c                             |  6 +-
>  fs/xfs/xfs_iomap.c                            | 12 ++-
>  fs/xfs/xfs_iomap.h                            |  1 +
>  fs/xfs/xfs_reflink.c                          |  3 +-
>  fs/zonefs/file.c                              |  3 +-
>  include/linux/iomap.h                         | 22 +++---
>  13 files changed, 89 insertions(+), 76 deletions(-)
> 
> diff --git a/Documentation/filesystems/iomap/design.rst b/Documentation/filesystems/iomap/design.rst
> index f2df9b6df988..0f7672676c0b 100644
> --- a/Documentation/filesystems/iomap/design.rst
> +++ b/Documentation/filesystems/iomap/design.rst
> @@ -167,7 +167,6 @@ structure below:
>       struct dax_device   *dax_dev;
>       void                *inline_data;
>       void                *private;
> -     const struct iomap_folio_ops *folio_ops;

I considered this a minor layering violation when it was added, and I'm
glad it's finally moving.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D


>       u64                 validity_cookie;
>   };
>  
> @@ -292,8 +291,6 @@ The fields are as follows:
>     <https://lore.kernel.org/all/20180619164137.13720-7-hch@lst.de/>`_.
>     This value will be passed unchanged to ``->iomap_end``.
>  
> - * ``folio_ops`` will be covered in the section on pagecache operations.
> -
>   * ``validity_cookie`` is a magic freshness value set by the filesystem
>     that should be used to detect stale mappings.
>     For pagecache operations this is critical for correct operation
> diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> index ead56b27ec3f..1f5732835567 100644
> --- a/Documentation/filesystems/iomap/operations.rst
> +++ b/Documentation/filesystems/iomap/operations.rst
> @@ -57,16 +57,12 @@ The following address space operations can be wrapped easily:
>   * ``bmap``
>   * ``swap_activate``
>  
> -``struct iomap_folio_ops``
> +``struct iomap_write_ops``
>  --------------------------
>  
> -The ``->iomap_begin`` function for pagecache operations may set the
> -``struct iomap::folio_ops`` field to an ops structure to override
> -default behaviors of iomap:
> -
>  .. code-block:: c
>  
> - struct iomap_folio_ops {
> + struct iomap_write_ops {
>       struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
>                                  unsigned len);
>       void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
> diff --git a/block/fops.c b/block/fops.c
> index cf79cbcf80f0..9e77be5ccb5f 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -723,7 +723,8 @@ blkdev_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  
>  static ssize_t blkdev_buffered_write(struct kiocb *iocb, struct iov_iter *from)
>  {
> -	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL);
> +	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL,
> +			NULL);
>  }
>  
>  /*
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 3cd46a09e820..a7832636da34 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -963,12 +963,16 @@ static struct folio *
>  gfs2_iomap_get_folio(struct iomap_iter *iter, loff_t pos, unsigned len)
>  {
>  	struct inode *inode = iter->inode;
> +	struct gfs2_inode *ip = GFS2_I(inode);
>  	unsigned int blockmask = i_blocksize(inode) - 1;
>  	struct gfs2_sbd *sdp = GFS2_SB(inode);
>  	unsigned int blocks;
>  	struct folio *folio;
>  	int status;
>  
> +	if (!gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
> +		return iomap_get_folio(iter, pos, len);
> +
>  	blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
>  	status = gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
>  	if (status)
> @@ -987,7 +991,7 @@ static void gfs2_iomap_put_folio(struct inode *inode, loff_t pos,
>  	struct gfs2_inode *ip = GFS2_I(inode);
>  	struct gfs2_sbd *sdp = GFS2_SB(inode);
>  
> -	if (!gfs2_is_stuffed(ip))
> +	if (gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
>  		gfs2_trans_add_databufs(ip->i_gl, folio,
>  					offset_in_folio(folio, pos),
>  					copied);
> @@ -995,13 +999,14 @@ static void gfs2_iomap_put_folio(struct inode *inode, loff_t pos,
>  	folio_unlock(folio);
>  	folio_put(folio);
>  
> -	if (tr->tr_num_buf_new)
> -		__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> -
> -	gfs2_trans_end(sdp);
> +	if (gfs2_is_jdata(ip) || gfs2_is_stuffed(ip)) {
> +		if (tr->tr_num_buf_new)
> +			__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> +		gfs2_trans_end(sdp);
> +	}
>  }
>  
> -static const struct iomap_folio_ops gfs2_iomap_folio_ops = {
> +const struct iomap_write_ops gfs2_iomap_write_ops = {
>  	.get_folio = gfs2_iomap_get_folio,
>  	.put_folio = gfs2_iomap_put_folio,
>  };
> @@ -1078,8 +1083,6 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
>  		gfs2_trans_end(sdp);
>  	}
>  
> -	if (gfs2_is_stuffed(ip) || gfs2_is_jdata(ip))
> -		iomap->folio_ops = &gfs2_iomap_folio_ops;
>  	return 0;
>  
>  out_trans_end:
> @@ -1304,7 +1307,7 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from, loff_t length
>  		return 0;
>  	length = min(length, inode->i_size - from);
>  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> -			NULL);
> +			&gfs2_iomap_write_ops, NULL);
>  }
>  
>  #define GFS2_JTRUNC_REVOKES 8192
> diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
> index 4e8b1e8ebdf3..6cdc72dd55a3 100644
> --- a/fs/gfs2/bmap.h
> +++ b/fs/gfs2/bmap.h
> @@ -44,6 +44,7 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
>  }
>  
>  extern const struct iomap_ops gfs2_iomap_ops;
> +extern const struct iomap_write_ops gfs2_iomap_write_ops;
>  extern const struct iomap_writeback_ops gfs2_writeback_ops;
>  
>  int gfs2_unstuff_dinode(struct gfs2_inode *ip);
> diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
> index fd1147aa3891..2908f5bee21d 100644
> --- a/fs/gfs2/file.c
> +++ b/fs/gfs2/file.c
> @@ -1058,7 +1058,8 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
>  	}
>  
>  	pagefault_disable();
> -	ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops, NULL);
> +	ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops,
> +			&gfs2_iomap_write_ops, NULL);
>  	pagefault_enable();
>  	if (ret > 0)
>  		written += ret;
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index d7fa885b1a0c..f6e410c9ea7b 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -742,28 +742,27 @@ static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
>  	return 0;
>  }
>  
> -static struct folio *__iomap_get_folio(struct iomap_iter *iter, size_t len)
> +static struct folio *__iomap_get_folio(struct iomap_iter *iter,
> +		const struct iomap_write_ops *write_ops, size_t len)
>  {
> -	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
>  	loff_t pos = iter->pos;
>  
>  	if (!mapping_large_folio_support(iter->inode->i_mapping))
>  		len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
>  
> -	if (folio_ops && folio_ops->get_folio)
> -		return folio_ops->get_folio(iter, pos, len);
> -	else
> -		return iomap_get_folio(iter, pos, len);
> +	if (write_ops && write_ops->get_folio)
> +		return write_ops->get_folio(iter, pos, len);
> +	return iomap_get_folio(iter, pos, len);
>  }
>  
> -static void __iomap_put_folio(struct iomap_iter *iter, size_t ret,
> +static void __iomap_put_folio(struct iomap_iter *iter,
> +		const struct iomap_write_ops *write_ops, size_t ret,
>  		struct folio *folio)
>  {
> -	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
>  	loff_t pos = iter->pos;
>  
> -	if (folio_ops && folio_ops->put_folio) {
> -		folio_ops->put_folio(iter->inode, pos, ret, folio);
> +	if (write_ops && write_ops->put_folio) {
> +		write_ops->put_folio(iter->inode, pos, ret, folio);
>  	} else {
>  		folio_unlock(folio);
>  		folio_put(folio);
> @@ -800,10 +799,10 @@ static int iomap_write_begin_inline(const struct iomap_iter *iter,
>   * offset, and length. Callers can optionally pass a max length *plen,
>   * otherwise init to zero.
>   */
> -static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
> +static int iomap_write_begin(struct iomap_iter *iter,
> +		const struct iomap_write_ops *write_ops, struct folio **foliop,
>  		size_t *poffset, u64 *plen)
>  {
> -	const struct iomap_folio_ops *folio_ops = iter->iomap.folio_ops;
>  	const struct iomap *srcmap = iomap_iter_srcmap(iter);
>  	loff_t pos = iter->pos;
>  	u64 len = min_t(u64, SIZE_MAX, iomap_length(iter));
> @@ -818,7 +817,7 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
>  	if (fatal_signal_pending(current))
>  		return -EINTR;
>  
> -	folio = __iomap_get_folio(iter, len);
> +	folio = __iomap_get_folio(iter, write_ops, len);
>  	if (IS_ERR(folio))
>  		return PTR_ERR(folio);
>  
> @@ -832,8 +831,8 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
>  	 * could do the wrong thing here (zero a page range incorrectly or fail
>  	 * to zero) and corrupt data.
>  	 */
> -	if (folio_ops && folio_ops->iomap_valid) {
> -		bool iomap_valid = folio_ops->iomap_valid(iter->inode,
> +	if (write_ops && write_ops->iomap_valid) {
> +		bool iomap_valid = write_ops->iomap_valid(iter->inode,
>  							 &iter->iomap);
>  		if (!iomap_valid) {
>  			iter->iomap.flags |= IOMAP_F_STALE;
> @@ -859,8 +858,7 @@ static int iomap_write_begin(struct iomap_iter *iter, struct folio **foliop,
>  	return 0;
>  
>  out_unlock:
> -	__iomap_put_folio(iter, 0, folio);
> -
> +	__iomap_put_folio(iter, write_ops, 0, folio);
>  	return status;
>  }
>  
> @@ -932,7 +930,8 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
>  	return __iomap_write_end(iter->inode, pos, len, copied, folio);
>  }
>  
> -static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> +static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
> +		const struct iomap_write_ops *write_ops)
>  {
>  	ssize_t total_written = 0;
>  	int status = 0;
> @@ -976,7 +975,8 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  			break;
>  		}
>  
> -		status = iomap_write_begin(iter, &folio, &offset, &bytes);
> +		status = iomap_write_begin(iter, write_ops, &folio, &offset,
> +				&bytes);
>  		if (unlikely(status)) {
>  			iomap_write_failed(iter->inode, iter->pos, bytes);
>  			break;
> @@ -1005,7 +1005,7 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  			i_size_write(iter->inode, pos + written);
>  			iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>  		}
> -		__iomap_put_folio(iter, written, folio);
> +		__iomap_put_folio(iter, write_ops, written, folio);
>  
>  		if (old_size < pos)
>  			pagecache_isize_extended(iter->inode, old_size, pos);
> @@ -1038,7 +1038,8 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  
>  ssize_t
>  iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
> -		const struct iomap_ops *ops, void *private)
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private)
>  {
>  	struct iomap_iter iter = {
>  		.inode		= iocb->ki_filp->f_mapping->host,
> @@ -1055,7 +1056,7 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
>  		iter.flags |= IOMAP_DONTCACHE;
>  
>  	while ((ret = iomap_iter(&iter, ops)) > 0)
> -		iter.status = iomap_write_iter(&iter, i);
> +		iter.status = iomap_write_iter(&iter, i, write_ops);
>  
>  	if (unlikely(iter.pos == iocb->ki_pos))
>  		return ret;
> @@ -1289,7 +1290,8 @@ void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
>  }
>  EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
>  
> -static int iomap_unshare_iter(struct iomap_iter *iter)
> +static int iomap_unshare_iter(struct iomap_iter *iter,
> +		const struct iomap_write_ops *write_ops)
>  {
>  	struct iomap *iomap = &iter->iomap;
>  	u64 bytes = iomap_length(iter);
> @@ -1304,14 +1306,15 @@ static int iomap_unshare_iter(struct iomap_iter *iter)
>  		bool ret;
>  
>  		bytes = min_t(u64, SIZE_MAX, bytes);
> -		status = iomap_write_begin(iter, &folio, &offset, &bytes);
> +		status = iomap_write_begin(iter, write_ops, &folio, &offset,
> +				&bytes);
>  		if (unlikely(status))
>  			return status;
>  		if (iomap->flags & IOMAP_F_STALE)
>  			break;
>  
>  		ret = iomap_write_end(iter, bytes, bytes, folio);
> -		__iomap_put_folio(iter, bytes, folio);
> +		__iomap_put_folio(iter, write_ops, bytes, folio);
>  		if (WARN_ON_ONCE(!ret))
>  			return -EIO;
>  
> @@ -1329,7 +1332,8 @@ static int iomap_unshare_iter(struct iomap_iter *iter)
>  
>  int
>  iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
> -		const struct iomap_ops *ops)
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops)
>  {
>  	struct iomap_iter iter = {
>  		.inode		= inode,
> @@ -1344,7 +1348,7 @@ iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
>  
>  	iter.len = min(len, size - pos);
>  	while ((ret = iomap_iter(&iter, ops)) > 0)
> -		iter.status = iomap_unshare_iter(&iter);
> +		iter.status = iomap_unshare_iter(&iter, write_ops);
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iomap_file_unshare);
> @@ -1363,7 +1367,8 @@ static inline int iomap_zero_iter_flush_and_stale(struct iomap_iter *i)
>  	return filemap_write_and_wait_range(mapping, i->pos, end);
>  }
>  
> -static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> +static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
> +		const struct iomap_write_ops *write_ops)
>  {
>  	u64 bytes = iomap_length(iter);
>  	int status;
> @@ -1374,7 +1379,8 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  		bool ret;
>  
>  		bytes = min_t(u64, SIZE_MAX, bytes);
> -		status = iomap_write_begin(iter, &folio, &offset, &bytes);
> +		status = iomap_write_begin(iter, write_ops, &folio, &offset,
> +				&bytes);
>  		if (status)
>  			return status;
>  		if (iter->iomap.flags & IOMAP_F_STALE)
> @@ -1387,7 +1393,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  		folio_mark_accessed(folio);
>  
>  		ret = iomap_write_end(iter, bytes, bytes, folio);
> -		__iomap_put_folio(iter, bytes, folio);
> +		__iomap_put_folio(iter, write_ops, bytes, folio);
>  		if (WARN_ON_ONCE(!ret))
>  			return -EIO;
>  
> @@ -1403,7 +1409,8 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  
>  int
>  iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> -		const struct iomap_ops *ops, void *private)
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private)
>  {
>  	struct iomap_iter iter = {
>  		.inode		= inode,
> @@ -1433,7 +1440,8 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
>  	    filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
>  		iter.len = plen;
>  		while ((ret = iomap_iter(&iter, ops)) > 0)
> -			iter.status = iomap_zero_iter(&iter, did_zero);
> +			iter.status = iomap_zero_iter(&iter, did_zero,
> +					write_ops);
>  
>  		iter.len = len - (iter.pos - pos);
>  		if (ret || !iter.len)
> @@ -1464,7 +1472,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
>  			continue;
>  		}
>  
> -		iter.status = iomap_zero_iter(&iter, did_zero);
> +		iter.status = iomap_zero_iter(&iter, did_zero, write_ops);
>  	}
>  	return ret;
>  }
> @@ -1472,7 +1480,8 @@ EXPORT_SYMBOL_GPL(iomap_zero_range);
>  
>  int
>  iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
> -		const struct iomap_ops *ops, void *private)
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private)
>  {
>  	unsigned int blocksize = i_blocksize(inode);
>  	unsigned int off = pos & (blocksize - 1);
> @@ -1481,7 +1490,7 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
>  	if (!off)
>  		return 0;
>  	return iomap_zero_range(inode, pos, blocksize - off, did_zero, ops,
> -			private);
> +			write_ops, private);
>  }
>  EXPORT_SYMBOL_GPL(iomap_truncate_page);
>  
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 48254a72071b..6e0970f24df5 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -979,7 +979,8 @@ xfs_file_buffered_write(
>  
>  	trace_xfs_file_buffered_write(iocb, from);
>  	ret = iomap_file_buffered_write(iocb, from,
> -			&xfs_buffered_write_iomap_ops, NULL);
> +			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
> +			NULL);
>  
>  	/*
>  	 * If we hit a space limit, try to free up some lingering preallocated
> @@ -1059,7 +1060,8 @@ xfs_file_buffered_write_zoned(
>  retry:
>  	trace_xfs_file_buffered_write(iocb, from);
>  	ret = iomap_file_buffered_write(iocb, from,
> -			&xfs_buffered_write_iomap_ops, &ac);
> +			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
> +			&ac);
>  	if (ret == -ENOSPC && !cleared_space) {
>  		/*
>  		 * Kick off writeback to convert delalloc space and release the
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index ff05e6b1b0bb..2e94a9435002 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -79,6 +79,9 @@ xfs_iomap_valid(
>  {
>  	struct xfs_inode	*ip = XFS_I(inode);
>  
> +	if (iomap->type == IOMAP_HOLE)
> +		return true;
> +
>  	if (iomap->validity_cookie !=
>  			xfs_iomap_inode_sequence(ip, iomap->flags)) {
>  		trace_xfs_iomap_invalid(ip, iomap);
> @@ -89,7 +92,7 @@ xfs_iomap_valid(
>  	return true;
>  }
>  
> -static const struct iomap_folio_ops xfs_iomap_folio_ops = {
> +const struct iomap_write_ops xfs_iomap_write_ops = {
>  	.iomap_valid		= xfs_iomap_valid,
>  };
>  
> @@ -151,7 +154,6 @@ xfs_bmbt_to_iomap(
>  		iomap->flags |= IOMAP_F_DIRTY;
>  
>  	iomap->validity_cookie = sequence_cookie;
> -	iomap->folio_ops = &xfs_iomap_folio_ops;
>  	return 0;
>  }
>  
> @@ -2198,7 +2200,8 @@ xfs_zero_range(
>  		return dax_zero_range(inode, pos, len, did_zero,
>  				      &xfs_dax_write_iomap_ops);
>  	return iomap_zero_range(inode, pos, len, did_zero,
> -				&xfs_buffered_write_iomap_ops, ac);
> +			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
> +			ac);
>  }
>  
>  int
> @@ -2214,5 +2217,6 @@ xfs_truncate_page(
>  		return dax_truncate_page(inode, pos, did_zero,
>  					&xfs_dax_write_iomap_ops);
>  	return iomap_truncate_page(inode, pos, did_zero,
> -				   &xfs_buffered_write_iomap_ops, ac);
> +			&xfs_buffered_write_iomap_ops, &xfs_iomap_write_ops,
> +			ac);
>  }
> diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
> index 674f8ac1b9bd..ebcce7d49446 100644
> --- a/fs/xfs/xfs_iomap.h
> +++ b/fs/xfs/xfs_iomap.h
> @@ -57,5 +57,6 @@ extern const struct iomap_ops xfs_seek_iomap_ops;
>  extern const struct iomap_ops xfs_xattr_iomap_ops;
>  extern const struct iomap_ops xfs_dax_write_iomap_ops;
>  extern const struct iomap_ops xfs_atomic_write_cow_iomap_ops;
> +extern const struct iomap_write_ops xfs_iomap_write_ops;
>  
>  #endif /* __XFS_IOMAP_H__*/
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index ad3bcb76d805..3f177b4ec131 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1881,7 +1881,8 @@ xfs_reflink_unshare(
>  				&xfs_dax_write_iomap_ops);
>  	else
>  		error = iomap_file_unshare(inode, offset, len,
> -				&xfs_buffered_write_iomap_ops);
> +				&xfs_buffered_write_iomap_ops,
> +				&xfs_iomap_write_ops);
>  	if (error)
>  		goto out;
>  
> diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
> index 0c64185325d3..6fd3b23858db 100644
> --- a/fs/zonefs/file.c
> +++ b/fs/zonefs/file.c
> @@ -572,7 +572,8 @@ static ssize_t zonefs_file_buffered_write(struct kiocb *iocb,
>  	if (ret <= 0)
>  		goto inode_unlock;
>  
> -	ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops, NULL);
> +	ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops,
> +			NULL, NULL);
>  	if (ret == -EIO)
>  		zonefs_io_error(inode, true);
>  
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 7e06b3a392f8..8d20a926b645 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -101,8 +101,6 @@ struct vm_fault;
>   */
>  #define IOMAP_NULL_ADDR -1ULL	/* addr is not valid */
>  
> -struct iomap_folio_ops;
> -
>  struct iomap {
>  	u64			addr; /* disk offset of mapping, bytes */
>  	loff_t			offset;	/* file offset of mapping, bytes */
> @@ -113,7 +111,6 @@ struct iomap {
>  	struct dax_device	*dax_dev; /* dax_dev for dax operations */
>  	void			*inline_data;
>  	void			*private; /* filesystem private */
> -	const struct iomap_folio_ops *folio_ops;
>  	u64			validity_cookie; /* used with .iomap_valid() */
>  };
>  
> @@ -143,16 +140,11 @@ static inline bool iomap_inline_data_valid(const struct iomap *iomap)
>  }
>  
>  /*
> - * When a filesystem sets folio_ops in an iomap mapping it returns, get_folio
> - * and put_folio will be called for each folio written to.  This only applies
> - * to buffered writes as unbuffered writes will not typically have folios
> - * associated with them.
> - *
>   * When get_folio succeeds, put_folio will always be called to do any
>   * cleanup work necessary.  put_folio is responsible for unlocking and putting
>   * @folio.
>   */
> -struct iomap_folio_ops {
> +struct iomap_write_ops {
>  	struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
>  			unsigned len);
>  	void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
> @@ -335,7 +327,8 @@ static inline bool iomap_want_unshare_iter(const struct iomap_iter *iter)
>  }
>  
>  ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
> -		const struct iomap_ops *ops, void *private);
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private);
>  int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops);
>  void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
>  bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
> @@ -344,11 +337,14 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
>  void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
>  bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
>  int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
> -		const struct iomap_ops *ops);
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops);
>  int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
> -		bool *did_zero, const struct iomap_ops *ops, void *private);
> +		bool *did_zero, const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private);
>  int iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
> -		const struct iomap_ops *ops, void *private);
> +		const struct iomap_ops *ops,
> +		const struct iomap_write_ops *write_ops, void *private);
>  vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops,
>  		void *private);
>  typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length,
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes
  2025-06-24  2:21 ` [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes Joanne Koong
@ 2025-07-02 17:51   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:51 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:30PM -0700, Joanne Koong wrote:
> Add a read_folio_range() handler for buffered writes that filesystems
> may pass in if they wish to provide a custom handler for synchronously
> reading in the contents of a folio.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> [hch: renamed to read_folio_range, pass less arguments]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  .../filesystems/iomap/operations.rst          |  6 +++++
>  fs/iomap/buffered-io.c                        | 25 +++++++++++--------
>  include/linux/iomap.h                         | 10 ++++++++
>  3 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> index 1f5732835567..813e26dbd21e 100644
> --- a/Documentation/filesystems/iomap/operations.rst
> +++ b/Documentation/filesystems/iomap/operations.rst
> @@ -68,6 +68,8 @@ The following address space operations can be wrapped easily:
>       void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
>                         struct folio *folio);
>       bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
> +     int (*read_folio_range)(const struct iomap_iter *iter,
> +     			struct folio *folio, loff_t pos, size_t len);
>   };
>  
>  iomap calls these functions:
> @@ -123,6 +125,10 @@ iomap calls these functions:
>      ``->iomap_valid``, then the iomap should be considered stale and the
>      validation failed.
>  
> +  - ``read_folio_range``: Called to synchronously read in the range that will
> +    be written to. If this function is not provided, iomap will default to
> +    submitting a bio read request.

Excellent!
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D
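
As an illustration of the new hook (hypothetical names throughout;
fuse's real handler appears in patch 12 of this series), a
filesystem-side implementation might look like:

static int example_read_folio_range(const struct iomap_iter *iter,
                struct folio *folio, loff_t pos, size_t len)
{
        size_t off = offset_in_folio(folio, pos);

        /*
         * Synchronously fill folio bytes [off, off + len) over the
         * filesystem's own transport instead of submitting a bio read.
         * On success, iomap marks the range uptodate on our behalf.
         * example_fs_sync_read() is a stand-in for that transport.
         */
        return example_fs_sync_read(iter->inode, folio, off, pos, len);
}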

> +
>  These ``struct kiocb`` flags are significant for buffered I/O with iomap:
>  
>   * ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index f6e410c9ea7b..897a3ccea2df 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -667,22 +667,23 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
>  					 pos + len - 1);
>  }
>  
> -static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
> -		size_t poff, size_t plen, const struct iomap *iomap)
> +static int iomap_read_folio_range(const struct iomap_iter *iter,
> +		struct folio *folio, loff_t pos, size_t len)
>  {
> +	const struct iomap *srcmap = iomap_iter_srcmap(iter);
>  	struct bio_vec bvec;
>  	struct bio bio;
>  
> -	bio_init(&bio, iomap->bdev, &bvec, 1, REQ_OP_READ);
> -	bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
> -	bio_add_folio_nofail(&bio, folio, plen, poff);
> +	bio_init(&bio, srcmap->bdev, &bvec, 1, REQ_OP_READ);
> +	bio.bi_iter.bi_sector = iomap_sector(srcmap, pos);
> +	bio_add_folio_nofail(&bio, folio, len, offset_in_folio(folio, pos));
>  	return submit_bio_wait(&bio);
>  }
>  
> -static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
> +static int __iomap_write_begin(const struct iomap_iter *iter,
> +		const struct iomap_write_ops *write_ops, size_t len,
>  		struct folio *folio)
>  {
> -	const struct iomap *srcmap = iomap_iter_srcmap(iter);
>  	struct iomap_folio_state *ifs;
>  	loff_t pos = iter->pos;
>  	loff_t block_size = i_blocksize(iter->inode);
> @@ -731,8 +732,12 @@ static int __iomap_write_begin(const struct iomap_iter *iter, size_t len,
>  			if (iter->flags & IOMAP_NOWAIT)
>  				return -EAGAIN;
>  
> -			status = iomap_read_folio_sync(block_start, folio,
> -					poff, plen, srcmap);
> +			if (write_ops && write_ops->read_folio_range)
> +				status = write_ops->read_folio_range(iter,
> +						folio, block_start, plen);
> +			else
> +				status = iomap_read_folio_range(iter,
> +						folio, block_start, plen);
>  			if (status)
>  				return status;
>  		}
> @@ -848,7 +853,7 @@ static int iomap_write_begin(struct iomap_iter *iter,
>  	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
>  		status = __block_write_begin_int(folio, pos, len, NULL, srcmap);
>  	else
> -		status = __iomap_write_begin(iter, len, folio);
> +		status = __iomap_write_begin(iter, write_ops, len, folio);
>  
>  	if (unlikely(status))
>  		goto out_unlock;
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 8d20a926b645..5ec651606c51 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -166,6 +166,16 @@ struct iomap_write_ops {
>  	 * locked by the iomap code.
>  	 */
>  	bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
> +
> +	/*
> +	 * Optional if the filesystem wishes to provide a custom handler for
> +	 * reading in the contents of a folio, otherwise iomap will default to
> +	 * submitting a bio read request.
> +	 *
> +	 * The read must be done synchronously.
> +	 */
> +	int (*read_folio_range)(const struct iomap_iter *iter,
> +			struct folio *folio, loff_t pos, size_t len);
>  };
>  
>  /*
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
  2025-06-24 10:07   ` Miklos Szeredi
  2025-06-25 11:10   ` kernel test robot
@ 2025-07-02 17:55   ` Darrick J. Wong
  2025-07-02 17:57     ` Darrick J. Wong
  2 siblings, 1 reply; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:55 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:31PM -0700, Joanne Koong wrote:
> Have buffered writes go through iomap. This has two advantages:
> * granular large folio synchronous reads
> * granular large folio dirty tracking
> 
> If for example there is a 1 MB large folio and a write issued at pos 1
> to pos 1 MB - 2, only the head and tail pages will need to be read in
> and marked uptodate instead of the entire folio needing to be read in.
> Non-relevant trailing pages are also skipped (eg if for a 1 MB large
> folio a write is issued at pos 1 to 4099, only the first two pages are
> read in and the ones after that are skipped).
> 
> iomap also has granular dirty tracking. This is useful in that when it
> comes to writeback time, only the dirty portions of the large folio will
> be written instead of having to write out the entire folio. For example
> if there is a 1 MB large folio and only 2 bytes in it are dirty, only
> the page for those dirty bytes gets written out. Please note that
> granular writeback is only done once fuse also uses iomap in writeback
> (separate commit).
> 
> .release_folio needs to be set to iomap_release_folio so that any
> allocated iomap ifs structs get freed.

What happens in the !iomap case, which can still happen for
!writeback_cache filesystems?  I don't think you can call
iomap_release_folio, because iomap doesn't own folio->private in that
case.

--D

> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fuse/Kconfig |   1 +
>  fs/fuse/file.c  | 140 ++++++++++++++++++------------------------------
>  2 files changed, 53 insertions(+), 88 deletions(-)
> 
> diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
> index ca215a3cba3e..a774166264de 100644
> --- a/fs/fuse/Kconfig
> +++ b/fs/fuse/Kconfig
> @@ -2,6 +2,7 @@
>  config FUSE_FS
>  	tristate "FUSE (Filesystem in Userspace) support"
>  	select FS_POSIX_ACL
> +	select FS_IOMAP
>  	help
>  	  With FUSE it is possible to implement a fully functional filesystem
>  	  in a userspace program.
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index f102afc03359..a7f11c1a4f89 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -21,6 +21,7 @@
>  #include <linux/filelock.h>
>  #include <linux/splice.h>
>  #include <linux/task_io_accounting_ops.h>
> +#include <linux/iomap.h>
>  
>  static int fuse_send_open(struct fuse_mount *fm, u64 nodeid,
>  			  unsigned int open_flags, int opcode,
> @@ -788,12 +789,16 @@ static void fuse_short_read(struct inode *inode, u64 attr_ver, size_t num_read,
>  	}
>  }
>  
> -static int fuse_do_readfolio(struct file *file, struct folio *folio)
> +static int fuse_do_readfolio(struct file *file, struct folio *folio,
> +			     size_t off, size_t len)
>  {
>  	struct inode *inode = folio->mapping->host;
>  	struct fuse_mount *fm = get_fuse_mount(inode);
> -	loff_t pos = folio_pos(folio);
> -	struct fuse_folio_desc desc = { .length = folio_size(folio) };
> +	loff_t pos = folio_pos(folio) + off;
> +	struct fuse_folio_desc desc = {
> +		.offset = off,
> +		.length = len,
> +	};
>  	struct fuse_io_args ia = {
>  		.ap.args.page_zeroing = true,
>  		.ap.args.out_pages = true,
> @@ -820,8 +825,6 @@ static int fuse_do_readfolio(struct file *file, struct folio *folio)
>  	if (res < desc.length)
>  		fuse_short_read(inode, attr_ver, res, &ia.ap);
>  
> -	folio_mark_uptodate(folio);
> -
>  	return 0;
>  }
>  
> @@ -834,13 +837,26 @@ static int fuse_read_folio(struct file *file, struct folio *folio)
>  	if (fuse_is_bad(inode))
>  		goto out;
>  
> -	err = fuse_do_readfolio(file, folio);
> +	err = fuse_do_readfolio(file, folio, 0, folio_size(folio));
> +	if (!err)
> +		folio_mark_uptodate(folio);
> +
>  	fuse_invalidate_atime(inode);
>   out:
>  	folio_unlock(folio);
>  	return err;
>  }
>  
> +static int fuse_iomap_read_folio_range(const struct iomap_iter *iter,
> +				       struct folio *folio, loff_t pos,
> +				       size_t len)
> +{
> +	struct file *file = iter->private;
> +	size_t off = offset_in_folio(folio, pos);
> +
> +	return fuse_do_readfolio(file, folio, off, len);
> +}
> +
>  static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
>  			       int err)
>  {
> @@ -1375,6 +1391,24 @@ static void fuse_dio_unlock(struct kiocb *iocb, bool exclusive)
>  	}
>  }
>  
> +static const struct iomap_write_ops fuse_iomap_write_ops = {
> +	.read_folio_range = fuse_iomap_read_folio_range,
> +};
> +
> +static int fuse_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> +			    unsigned int flags, struct iomap *iomap,
> +			    struct iomap *srcmap)
> +{
> +	iomap->type = IOMAP_MAPPED;
> +	iomap->length = length;
> +	iomap->offset = offset;
> +	return 0;
> +}
> +
> +static const struct iomap_ops fuse_iomap_ops = {
> +	.iomap_begin	= fuse_iomap_begin,
> +};
> +
>  static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
>  	struct file *file = iocb->ki_filp;
> @@ -1384,6 +1418,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	struct inode *inode = mapping->host;
>  	ssize_t err, count;
>  	struct fuse_conn *fc = get_fuse_conn(inode);
> +	bool writeback = false;
>  
>  	if (fc->writeback_cache) {
>  		/* Update size (EOF optimization) and mode (SUID clearing) */
> @@ -1397,8 +1432,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  						file_inode(file))) {
>  			goto writethrough;
>  		}
> -
> -		return generic_file_write_iter(iocb, from);
> +		writeback = true;
>  	}
>  
>  writethrough:
> @@ -1420,6 +1454,15 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  			goto out;
>  		written = direct_write_fallback(iocb, from, written,
>  				fuse_perform_write(iocb, from));
> +	} else if (writeback) {
> +		/*
> +		 * Use iomap so that we can do granular uptodate reads
> +		 * and granular dirty tracking for large folios.
> +		 */
> +		written = iomap_file_buffered_write(iocb, from,
> +						    &fuse_iomap_ops,
> +						    &fuse_iomap_write_ops,
> +						    file);
>  	} else {
>  		written = fuse_perform_write(iocb, from);
>  	}
> @@ -2209,84 +2252,6 @@ static int fuse_writepages(struct address_space *mapping,
>  	return err;
>  }
>  
> -/*
> - * It's worthy to make sure that space is reserved on disk for the write,
> - * but how to implement it without killing performance need more thinking.
> - */
> -static int fuse_write_begin(struct file *file, struct address_space *mapping,
> -		loff_t pos, unsigned len, struct folio **foliop, void **fsdata)
> -{
> -	pgoff_t index = pos >> PAGE_SHIFT;
> -	struct fuse_conn *fc = get_fuse_conn(file_inode(file));
> -	struct folio *folio;
> -	loff_t fsize;
> -	int err = -ENOMEM;
> -
> -	WARN_ON(!fc->writeback_cache);
> -
> -	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> -			mapping_gfp_mask(mapping));
> -	if (IS_ERR(folio))
> -		goto error;
> -
> -	if (folio_test_uptodate(folio) || len >= folio_size(folio))
> -		goto success;
> -	/*
> -	 * Check if the start of this folio comes after the end of file,
> -	 * in which case the readpage can be optimized away.
> -	 */
> -	fsize = i_size_read(mapping->host);
> -	if (fsize <= folio_pos(folio)) {
> -		size_t off = offset_in_folio(folio, pos);
> -		if (off)
> -			folio_zero_segment(folio, 0, off);
> -		goto success;
> -	}
> -	err = fuse_do_readfolio(file, folio);
> -	if (err)
> -		goto cleanup;
> -success:
> -	*foliop = folio;
> -	return 0;
> -
> -cleanup:
> -	folio_unlock(folio);
> -	folio_put(folio);
> -error:
> -	return err;
> -}
> -
> -static int fuse_write_end(struct file *file, struct address_space *mapping,
> -		loff_t pos, unsigned len, unsigned copied,
> -		struct folio *folio, void *fsdata)
> -{
> -	struct inode *inode = folio->mapping->host;
> -
> -	/* Haven't copied anything?  Skip zeroing, size extending, dirtying. */
> -	if (!copied)
> -		goto unlock;
> -
> -	pos += copied;
> -	if (!folio_test_uptodate(folio)) {
> -		/* Zero any unwritten bytes at the end of the page */
> -		size_t endoff = pos & ~PAGE_MASK;
> -		if (endoff)
> -			folio_zero_segment(folio, endoff, PAGE_SIZE);
> -		folio_mark_uptodate(folio);
> -	}
> -
> -	if (pos > inode->i_size)
> -		i_size_write(inode, pos);
> -
> -	folio_mark_dirty(folio);
> -
> -unlock:
> -	folio_unlock(folio);
> -	folio_put(folio);
> -
> -	return copied;
> -}
> -
>  static int fuse_launder_folio(struct folio *folio)
>  {
>  	int err = 0;
> @@ -3145,11 +3110,10 @@ static const struct address_space_operations fuse_file_aops  = {
>  	.writepages	= fuse_writepages,
>  	.launder_folio	= fuse_launder_folio,
>  	.dirty_folio	= filemap_dirty_folio,
> +	.release_folio	= iomap_release_folio,
>  	.migrate_folio	= filemap_migrate_folio,
>  	.bmap		= fuse_bmap,
>  	.direct_IO	= fuse_direct_IO,
> -	.write_begin	= fuse_write_begin,
> -	.write_end	= fuse_write_end,
>  };
>  
>  void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> -- 
> 2.47.1
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-07-02 17:55   ` Darrick J. Wong
@ 2025-07-02 17:57     ` Darrick J. Wong
  2025-07-02 22:57       ` Joanne Koong
  0 siblings, 1 reply; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 17:57 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Wed, Jul 02, 2025 at 10:55:09AM -0700, Darrick J. Wong wrote:
> On Mon, Jun 23, 2025 at 07:21:31PM -0700, Joanne Koong wrote:
> > Have buffered writes go through iomap. This has two advantages:
> > * granular large folio synchronous reads
> > * granular large folio dirty tracking
> > 
> > If for example there is a 1 MB large folio and a write issued at pos 1
> > to pos 1 MB - 2, only the head and tail pages will need to be read in
> > and marked uptodate instead of the entire folio needing to be read in.
> > Non-relevant trailing pages are also skipped (eg if for a 1 MB large
> > folio a write is issued at pos 1 to 4099, only the first two pages are
> > read in and the ones after that are skipped).
> > 
> > iomap also has granular dirty tracking. This is useful in that when it
> > comes to writeback time, only the dirty portions of the large folio will
> > be written instead of having to write out the entire folio. For example
> > if there is a 1 MB large folio and only 2 bytes in it are dirty, only
> > the page for those dirty bytes gets written out. Please note that
> > granular writeback is only done once fuse also uses iomap in writeback
> > (separate commit).
> > 
> > .release_folio needs to be set to iomap_release_folio so that any
> > allocated iomap ifs structs get freed.
> 
> What happens in the !iomap case, which can still happen for
> !writeback_cache filesystems?  I don't think you can call
> iomap_release_folio, because iomap doesn't own folio->private in that
> case.

...and I think the answer to that is that the !writeback_cache case
passes all file IO directly to the fuse server and never touches the
page cache at all?

--D

> 
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  fs/fuse/Kconfig |   1 +
> >  fs/fuse/file.c  | 140 ++++++++++++++++++------------------------------
> >  2 files changed, 53 insertions(+), 88 deletions(-)
> > 
> > diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
> > index ca215a3cba3e..a774166264de 100644
> > --- a/fs/fuse/Kconfig
> > +++ b/fs/fuse/Kconfig
> > @@ -2,6 +2,7 @@
> >  config FUSE_FS
> >  	tristate "FUSE (Filesystem in Userspace) support"
> >  	select FS_POSIX_ACL
> > +	select FS_IOMAP
> >  	help
> >  	  With FUSE it is possible to implement a fully functional filesystem
> >  	  in a userspace program.
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index f102afc03359..a7f11c1a4f89 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/filelock.h>
> >  #include <linux/splice.h>
> >  #include <linux/task_io_accounting_ops.h>
> > +#include <linux/iomap.h>
> >  
> >  static int fuse_send_open(struct fuse_mount *fm, u64 nodeid,
> >  			  unsigned int open_flags, int opcode,
> > @@ -788,12 +789,16 @@ static void fuse_short_read(struct inode *inode, u64 attr_ver, size_t num_read,
> >  	}
> >  }
> >  
> > -static int fuse_do_readfolio(struct file *file, struct folio *folio)
> > +static int fuse_do_readfolio(struct file *file, struct folio *folio,
> > +			     size_t off, size_t len)
> >  {
> >  	struct inode *inode = folio->mapping->host;
> >  	struct fuse_mount *fm = get_fuse_mount(inode);
> > -	loff_t pos = folio_pos(folio);
> > -	struct fuse_folio_desc desc = { .length = folio_size(folio) };
> > +	loff_t pos = folio_pos(folio) + off;
> > +	struct fuse_folio_desc desc = {
> > +		.offset = off,
> > +		.length = len,
> > +	};
> >  	struct fuse_io_args ia = {
> >  		.ap.args.page_zeroing = true,
> >  		.ap.args.out_pages = true,
> > @@ -820,8 +825,6 @@ static int fuse_do_readfolio(struct file *file, struct folio *folio)
> >  	if (res < desc.length)
> >  		fuse_short_read(inode, attr_ver, res, &ia.ap);
> >  
> > -	folio_mark_uptodate(folio);
> > -
> >  	return 0;
> >  }
> >  
> > @@ -834,13 +837,26 @@ static int fuse_read_folio(struct file *file, struct folio *folio)
> >  	if (fuse_is_bad(inode))
> >  		goto out;
> >  
> > -	err = fuse_do_readfolio(file, folio);
> > +	err = fuse_do_readfolio(file, folio, 0, folio_size(folio));
> > +	if (!err)
> > +		folio_mark_uptodate(folio);
> > +
> >  	fuse_invalidate_atime(inode);
> >   out:
> >  	folio_unlock(folio);
> >  	return err;
> >  }
> >  
> > +static int fuse_iomap_read_folio_range(const struct iomap_iter *iter,
> > +				       struct folio *folio, loff_t pos,
> > +				       size_t len)
> > +{
> > +	struct file *file = iter->private;
> > +	size_t off = offset_in_folio(folio, pos);
> > +
> > +	return fuse_do_readfolio(file, folio, off, len);
> > +}
> > +
> >  static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
> >  			       int err)
> >  {
> > @@ -1375,6 +1391,24 @@ static void fuse_dio_unlock(struct kiocb *iocb, bool exclusive)
> >  	}
> >  }
> >  
> > +static const struct iomap_write_ops fuse_iomap_write_ops = {
> > +	.read_folio_range = fuse_iomap_read_folio_range,
> > +};
> > +
> > +static int fuse_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> > +			    unsigned int flags, struct iomap *iomap,
> > +			    struct iomap *srcmap)
> > +{
> > +	iomap->type = IOMAP_MAPPED;
> > +	iomap->length = length;
> > +	iomap->offset = offset;
> > +	return 0;
> > +}
> > +
> > +static const struct iomap_ops fuse_iomap_ops = {
> > +	.iomap_begin	= fuse_iomap_begin,
> > +};
> > +
> >  static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  {
> >  	struct file *file = iocb->ki_filp;
> > @@ -1384,6 +1418,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  	struct inode *inode = mapping->host;
> >  	ssize_t err, count;
> >  	struct fuse_conn *fc = get_fuse_conn(inode);
> > +	bool writeback = false;
> >  
> >  	if (fc->writeback_cache) {
> >  		/* Update size (EOF optimization) and mode (SUID clearing) */
> > @@ -1397,8 +1432,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  						file_inode(file))) {
> >  			goto writethrough;
> >  		}
> > -
> > -		return generic_file_write_iter(iocb, from);
> > +		writeback = true;
> >  	}
> >  
> >  writethrough:
> > @@ -1420,6 +1454,15 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  			goto out;
> >  		written = direct_write_fallback(iocb, from, written,
> >  				fuse_perform_write(iocb, from));
> > +	} else if (writeback) {
> > +		/*
> > +		 * Use iomap so that we can do granular uptodate reads
> > +		 * and granular dirty tracking for large folios.
> > +		 */
> > +		written = iomap_file_buffered_write(iocb, from,
> > +						    &fuse_iomap_ops,
> > +						    &fuse_iomap_write_ops,
> > +						    file);
> >  	} else {
> >  		written = fuse_perform_write(iocb, from);
> >  	}
> > @@ -2209,84 +2252,6 @@ static int fuse_writepages(struct address_space *mapping,
> >  	return err;
> >  }
> >  
> > -/*
> > - * It's worthy to make sure that space is reserved on disk for the write,
> > - * but how to implement it without killing performance need more thinking.
> > - */
> > -static int fuse_write_begin(struct file *file, struct address_space *mapping,
> > -		loff_t pos, unsigned len, struct folio **foliop, void **fsdata)
> > -{
> > -	pgoff_t index = pos >> PAGE_SHIFT;
> > -	struct fuse_conn *fc = get_fuse_conn(file_inode(file));
> > -	struct folio *folio;
> > -	loff_t fsize;
> > -	int err = -ENOMEM;
> > -
> > -	WARN_ON(!fc->writeback_cache);
> > -
> > -	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> > -			mapping_gfp_mask(mapping));
> > -	if (IS_ERR(folio))
> > -		goto error;
> > -
> > -	if (folio_test_uptodate(folio) || len >= folio_size(folio))
> > -		goto success;
> > -	/*
> > -	 * Check if the start of this folio comes after the end of file,
> > -	 * in which case the readpage can be optimized away.
> > -	 */
> > -	fsize = i_size_read(mapping->host);
> > -	if (fsize <= folio_pos(folio)) {
> > -		size_t off = offset_in_folio(folio, pos);
> > -		if (off)
> > -			folio_zero_segment(folio, 0, off);
> > -		goto success;
> > -	}
> > -	err = fuse_do_readfolio(file, folio);
> > -	if (err)
> > -		goto cleanup;
> > -success:
> > -	*foliop = folio;
> > -	return 0;
> > -
> > -cleanup:
> > -	folio_unlock(folio);
> > -	folio_put(folio);
> > -error:
> > -	return err;
> > -}
> > -
> > -static int fuse_write_end(struct file *file, struct address_space *mapping,
> > -		loff_t pos, unsigned len, unsigned copied,
> > -		struct folio *folio, void *fsdata)
> > -{
> > -	struct inode *inode = folio->mapping->host;
> > -
> > -	/* Haven't copied anything?  Skip zeroing, size extending, dirtying. */
> > -	if (!copied)
> > -		goto unlock;
> > -
> > -	pos += copied;
> > -	if (!folio_test_uptodate(folio)) {
> > -		/* Zero any unwritten bytes at the end of the page */
> > -		size_t endoff = pos & ~PAGE_MASK;
> > -		if (endoff)
> > -			folio_zero_segment(folio, endoff, PAGE_SIZE);
> > -		folio_mark_uptodate(folio);
> > -	}
> > -
> > -	if (pos > inode->i_size)
> > -		i_size_write(inode, pos);
> > -
> > -	folio_mark_dirty(folio);
> > -
> > -unlock:
> > -	folio_unlock(folio);
> > -	folio_put(folio);
> > -
> > -	return copied;
> > -}
> > -
> >  static int fuse_launder_folio(struct folio *folio)
> >  {
> >  	int err = 0;
> > @@ -3145,11 +3110,10 @@ static const struct address_space_operations fuse_file_aops  = {
> >  	.writepages	= fuse_writepages,
> >  	.launder_folio	= fuse_launder_folio,
> >  	.dirty_folio	= filemap_dirty_folio,
> > +	.release_folio	= iomap_release_folio,
> >  	.migrate_folio	= filemap_migrate_folio,
> >  	.bmap		= fuse_bmap,
> >  	.direct_IO	= fuse_direct_IO,
> > -	.write_begin	= fuse_write_begin,
> > -	.write_end	= fuse_write_end,
> >  };
> >  
> >  void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> > -- 
> > 2.47.1
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 13/16] fuse: use iomap for writeback
  2025-06-24  2:21 ` [PATCH v3 13/16] fuse: use iomap for writeback Joanne Koong
  2025-06-25 14:17   ` kernel test robot
@ 2025-07-02 18:13   ` Darrick J. Wong
  2025-07-02 23:20     ` Joanne Koong
  1 sibling, 1 reply; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 18:13 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:32PM -0700, Joanne Koong wrote:
> Use iomap for dirty folio writeback in ->writepages().
> This allows for granular dirty writeback of large folios.
> 
> Only the dirty portions of the large folio will be written instead of
> having to write out the entire folio. For example if there is a 1 MB
> large folio and only 2 bytes in it are dirty, only the page for those
> dirty bytes will be written out.
> 
> .dirty_folio needs to be set to iomap_dirty_folio so that the bitmap
> iomap uses for dirty tracking correctly reflects dirty regions that need
> to be written back.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fuse/file.c | 124 ++++++++++++++++++++++++++++++-------------------
>  1 file changed, 75 insertions(+), 49 deletions(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index a7f11c1a4f89..2b4b950eaeed 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1837,7 +1837,7 @@ static void fuse_writepage_finish(struct fuse_writepage_args *wpa)
>  		 * scope of the fi->lock alleviates xarray lock
>  		 * contention and noticeably improves performance.
>  		 */
> -		folio_end_writeback(ap->folios[i]);
> +		iomap_finish_folio_write(inode, ap->folios[i], 1);
>  		dec_wb_stat(&bdi->wb, WB_WRITEBACK);
>  		wb_writeout_inc(&bdi->wb);
>  	}
> @@ -2024,19 +2024,20 @@ static void fuse_writepage_add_to_bucket(struct fuse_conn *fc,
>  }
>  
>  static void fuse_writepage_args_page_fill(struct fuse_writepage_args *wpa, struct folio *folio,
> -					  uint32_t folio_index)
> +					  uint32_t folio_index, loff_t offset, unsigned len)
>  {
>  	struct inode *inode = folio->mapping->host;
>  	struct fuse_args_pages *ap = &wpa->ia.ap;
>  
>  	ap->folios[folio_index] = folio;
> -	ap->descs[folio_index].offset = 0;
> -	ap->descs[folio_index].length = folio_size(folio);
> +	ap->descs[folio_index].offset = offset;
> +	ap->descs[folio_index].length = len;
>  
>  	inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
>  }
>  
>  static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio,
> +							     size_t offset,
>  							     struct fuse_file *ff)
>  {
>  	struct inode *inode = folio->mapping->host;
> @@ -2049,7 +2050,7 @@ static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio
>  		return NULL;
>  
>  	fuse_writepage_add_to_bucket(fc, wpa);
> -	fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio), 0);
> +	fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio) + offset, 0);
>  	wpa->ia.write.in.write_flags |= FUSE_WRITE_CACHE;
>  	wpa->inode = inode;
>  	wpa->ia.ff = ff;
> @@ -2075,7 +2076,7 @@ static int fuse_writepage_locked(struct folio *folio)
>  	if (!ff)
>  		goto err;
>  
> -	wpa = fuse_writepage_args_setup(folio, ff);
> +	wpa = fuse_writepage_args_setup(folio, 0, ff);
>  	error = -ENOMEM;
>  	if (!wpa)
>  		goto err_writepage_args;
> @@ -2084,7 +2085,7 @@ static int fuse_writepage_locked(struct folio *folio)
>  	ap->num_folios = 1;
>  
>  	folio_start_writeback(folio);
> -	fuse_writepage_args_page_fill(wpa, folio, 0);
> +	fuse_writepage_args_page_fill(wpa, folio, 0, 0, folio_size(folio));
>  
>  	spin_lock(&fi->lock);
>  	list_add_tail(&wpa->queue_entry, &fi->queued_writes);
> @@ -2105,7 +2106,7 @@ struct fuse_fill_wb_data {
>  	struct fuse_file *ff;
>  	struct inode *inode;
>  	unsigned int max_folios;
> -	unsigned int nr_pages;
> +	unsigned int nr_bytes;
>  };
>  
>  static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
> @@ -2147,21 +2148,28 @@ static void fuse_writepages_send(struct fuse_fill_wb_data *data)
>  }
>  
>  static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
> +				     loff_t offset, unsigned len,
>  				     struct fuse_args_pages *ap,
>  				     struct fuse_fill_wb_data *data)
>  {
> +	struct folio *prev_folio;
> +	struct fuse_folio_desc prev_desc;
> +
>  	WARN_ON(!ap->num_folios);
>  
>  	/* Reached max pages */
> -	if (data->nr_pages + folio_nr_pages(folio) > fc->max_pages)
> +	if ((data->nr_bytes + len) / PAGE_SIZE > fc->max_pages)
>  		return true;
>  
>  	/* Reached max write bytes */
> -	if ((data->nr_pages * PAGE_SIZE) + folio_size(folio) > fc->max_write)
> +	if (data->nr_bytes + len > fc->max_write)
>  		return true;
>  
>  	/* Discontinuity */
> -	if (folio_next_index(ap->folios[ap->num_folios - 1]) != folio->index)
> +	prev_folio = ap->folios[ap->num_folios - 1];
> +	prev_desc = ap->descs[ap->num_folios - 1];
> +	if ((folio_pos(prev_folio) + prev_desc.offset + prev_desc.length) !=
> +	    folio_pos(folio) + offset)
>  		return true;
>  
>  	/* Need to grow the pages array?  If so, did the expansion fail? */
> @@ -2171,85 +2179,103 @@ static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
>  	return false;
>  }
>  
> -static int fuse_writepages_fill(struct folio *folio,
> -		struct writeback_control *wbc, void *_data)
> +static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
> +					  struct folio *folio, u64 pos,
> +					  unsigned len, u64 end_pos)
>  {
> -	struct fuse_fill_wb_data *data = _data;
> +	struct fuse_fill_wb_data *data = wpc->wb_ctx;
>  	struct fuse_writepage_args *wpa = data->wpa;
>  	struct fuse_args_pages *ap = &wpa->ia.ap;
>  	struct inode *inode = data->inode;
>  	struct fuse_inode *fi = get_fuse_inode(inode);
>  	struct fuse_conn *fc = get_fuse_conn(inode);
> -	int err;
> +	loff_t offset = offset_in_folio(folio, pos);
> +
> +	WARN_ON_ONCE(!data);
> +	/* len will always be page aligned */
> +	WARN_ON_ONCE(len & (PAGE_SIZE - 1));
>  
>  	if (!data->ff) {
> -		err = -EIO;
>  		data->ff = fuse_write_file_get(fi);
>  		if (!data->ff)
> -			goto out_unlock;
> +			return -EIO;
>  	}
>  
> -	if (wpa && fuse_writepage_need_send(fc, folio, ap, data)) {
> +	iomap_start_folio_write(inode, folio, 1);
> +
> +	if (wpa && fuse_writepage_need_send(fc, folio, offset, len, ap, data)) {
>  		fuse_writepages_send(data);
>  		data->wpa = NULL;
> -		data->nr_pages = 0;
> +		data->nr_bytes = 0;
>  	}
>  
>  	if (data->wpa == NULL) {
> -		err = -ENOMEM;
> -		wpa = fuse_writepage_args_setup(folio, data->ff);
> +		wpa = fuse_writepage_args_setup(folio, offset, data->ff);
>  		if (!wpa)
> -			goto out_unlock;
> +			return -ENOMEM;

If we error out here, what subtracts from write_bytes_pending the
quantity that we just added in iomap_start_folio_write?

(It would have helped a lot if the cover letter had linked to a git
branch so I could go look at the final product for myself...)

--D

>  		fuse_file_get(wpa->ia.ff);
>  		data->max_folios = 1;
>  		ap = &wpa->ia.ap;
>  	}
> -	folio_start_writeback(folio);
>  
> -	fuse_writepage_args_page_fill(wpa, folio, ap->num_folios);
> -	data->nr_pages += folio_nr_pages(folio);
> +	fuse_writepage_args_page_fill(wpa, folio, ap->num_folios,
> +				      offset, len);
> +	data->nr_bytes += len;
>  
> -	err = 0;
>  	ap->num_folios++;
>  	if (!data->wpa)
>  		data->wpa = wpa;
> -out_unlock:
> -	folio_unlock(folio);
>  
> -	return err;
> +	return len;
>  }
>  
> +static int fuse_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
> +				       int error)
> +{
> +	struct fuse_fill_wb_data *data = wpc->wb_ctx;
> +
> +	WARN_ON_ONCE(!data);
> +
> +	if (data->wpa) {
> +		WARN_ON(!data->wpa->ia.ap.num_folios);
> +		fuse_writepages_send(data);
> +	}
> +
> +	if (data->ff)
> +		fuse_file_put(data->ff, false);
> +
> +	return error;
> +}
> +
> +static const struct iomap_writeback_ops fuse_writeback_ops = {
> +	.writeback_range	= fuse_iomap_writeback_range,
> +	.writeback_submit	= fuse_iomap_writeback_submit,
> +};
> +
>  static int fuse_writepages(struct address_space *mapping,
>  			   struct writeback_control *wbc)
>  {
>  	struct inode *inode = mapping->host;
>  	struct fuse_conn *fc = get_fuse_conn(inode);
> -	struct fuse_fill_wb_data data;
> -	int err;
> +	struct fuse_fill_wb_data data = {
> +		.inode = inode,
> +	};
> +	struct iomap_writepage_ctx wpc = {
> +		.inode = inode,
> +		.iomap.type = IOMAP_MAPPED,
> +		.wbc = wbc,
> +		.ops = &fuse_writeback_ops,
> +		.wb_ctx	= &data,
> +	};
>  
> -	err = -EIO;
>  	if (fuse_is_bad(inode))
> -		goto out;
> +		return -EIO;
>  
>  	if (wbc->sync_mode == WB_SYNC_NONE &&
>  	    fc->num_background >= fc->congestion_threshold)
>  		return 0;
>  
> -	data.inode = inode;
> -	data.wpa = NULL;
> -	data.ff = NULL;
> -	data.nr_pages = 0;
> -
> -	err = write_cache_pages(mapping, wbc, fuse_writepages_fill, &data);
> -	if (data.wpa) {
> -		WARN_ON(!data.wpa->ia.ap.num_folios);
> -		fuse_writepages_send(&data);
> -	}
> -	if (data.ff)
> -		fuse_file_put(data.ff, false);
> -
> -out:
> -	return err;
> +	return iomap_writepages(&wpc);
>  }
>  
>  static int fuse_launder_folio(struct folio *folio)
> @@ -3109,7 +3135,7 @@ static const struct address_space_operations fuse_file_aops  = {
>  	.readahead	= fuse_readahead,
>  	.writepages	= fuse_writepages,
>  	.launder_folio	= fuse_launder_folio,
> -	.dirty_folio	= filemap_dirty_folio,
> +	.dirty_folio	= iomap_dirty_folio,
>  	.release_folio	= iomap_release_folio,
>  	.migrate_folio	= filemap_migrate_folio,
>  	.bmap		= fuse_bmap,
> -- 
> 2.47.1
> 
> 


* Re: [PATCH v3 14/16] fuse: use iomap for folio laundering
  2025-06-24  2:21 ` [PATCH v3 14/16] fuse: use iomap for folio laundering Joanne Koong
  2025-06-25 17:02   ` kernel test robot
@ 2025-07-02 18:14   ` Darrick J. Wong
  1 sibling, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 18:14 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:33PM -0700, Joanne Koong wrote:
> Use iomap for folio laundering, which will do granular dirty
> writeback when laundering a large folio.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fuse/file.c | 52 ++++++++++++--------------------------------------
>  1 file changed, 12 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 2b4b950eaeed..35ecc03c0c48 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -2062,45 +2062,6 @@ static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio
>  	return wpa;
>  }
>  
> -static int fuse_writepage_locked(struct folio *folio)
> -{
> -	struct address_space *mapping = folio->mapping;
> -	struct inode *inode = mapping->host;
> -	struct fuse_inode *fi = get_fuse_inode(inode);
> -	struct fuse_writepage_args *wpa;
> -	struct fuse_args_pages *ap;
> -	struct fuse_file *ff;
> -	int error = -EIO;
> -
> -	ff = fuse_write_file_get(fi);
> -	if (!ff)
> -		goto err;
> -
> -	wpa = fuse_writepage_args_setup(folio, 0, ff);
> -	error = -ENOMEM;
> -	if (!wpa)
> -		goto err_writepage_args;
> -
> -	ap = &wpa->ia.ap;
> -	ap->num_folios = 1;
> -
> -	folio_start_writeback(folio);
> -	fuse_writepage_args_page_fill(wpa, folio, 0, 0, folio_size(folio));
> -
> -	spin_lock(&fi->lock);
> -	list_add_tail(&wpa->queue_entry, &fi->queued_writes);
> -	fuse_flush_writepages(inode);
> -	spin_unlock(&fi->lock);
> -
> -	return 0;
> -
> -err_writepage_args:
> -	fuse_file_put(ff, false);
> -err:
> -	mapping_set_error(folio->mapping, error);
> -	return error;
> -}
> -
>  struct fuse_fill_wb_data {
>  	struct fuse_writepage_args *wpa;
>  	struct fuse_file *ff;
> @@ -2281,8 +2242,19 @@ static int fuse_writepages(struct address_space *mapping,
>  static int fuse_launder_folio(struct folio *folio)
>  {
>  	int err = 0;
> +	struct fuse_fill_wb_data data = {
> +		.inode = folio->mapping->host,
> +	};
> +	struct iomap_writepage_ctx wpc = {
> +		.inode = folio->mapping->host,
> +		.iomap.type = IOMAP_MAPPED,
> +		.ops = &fuse_writeback_ops,
> +		.wb_ctx	= &data,
> +	};
> +
>  	if (folio_clear_dirty_for_io(folio)) {
> -		err = fuse_writepage_locked(folio);
> +		err = iomap_writeback_folio(&wpc, folio);
> +		err = fuse_iomap_writeback_submit(&wpc, err);

Weird kernel test robot complaints aside, this looks like a reasonable
way to write a single folio, similar to what ->writepage used to do.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

>  		if (!err)
>  			folio_wait_writeback(folio);
>  	}
> -- 
> 2.47.1
> 
> 


* Re: [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness
  2025-06-24  2:21 ` [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness Joanne Koong
@ 2025-07-02 18:14   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 18:14 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:34PM -0700, Joanne Koong wrote:
> Hook into iomap_invalidate_folio() so that if the entire folio is being
> invalidated during truncation, the dirty state is cleared and the folio
> doesn't get written back. As well the folio's corresponding ifs struct
> will get freed.
> 
> Hook into iomap_is_partially_uptodate() since iomap tracks uptodateness
> granularly when it does buffered writes.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>

Seems reasonable
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/fuse/file.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 35ecc03c0c48..865d04b8ef31 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3109,6 +3109,8 @@ static const struct address_space_operations fuse_file_aops  = {
>  	.launder_folio	= fuse_launder_folio,
>  	.dirty_folio	= iomap_dirty_folio,
>  	.release_folio	= iomap_release_folio,
> +	.invalidate_folio = iomap_invalidate_folio,
> +	.is_partially_uptodate = iomap_is_partially_uptodate,
>  	.migrate_folio	= filemap_migrate_folio,
>  	.bmap		= fuse_bmap,
>  	.direct_IO	= fuse_direct_IO,
> -- 
> 2.47.1
> 
> 


* Re: [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode
  2025-06-24  2:21 ` [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode Joanne Koong
@ 2025-07-02 18:15   ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-02 18:15 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Mon, Jun 23, 2025 at 07:21:35PM -0700, Joanne Koong wrote:
> struct iomap_writepage_ctx includes a pointer to the file inode. In
> writeback, use that instead of also passing the inode into
> fuse_fill_wb_data.
> 
> No functional changes.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>

Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/fuse/file.c | 27 ++++++++++++---------------
>  1 file changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 865d04b8ef31..4f17ba69ddfc 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -2070,16 +2070,16 @@ struct fuse_fill_wb_data {
>  	unsigned int nr_bytes;
>  };
>  
> -static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
> +static bool fuse_pages_realloc(struct fuse_fill_wb_data *data,
> +			       unsigned int max_pages)
>  {
>  	struct fuse_args_pages *ap = &data->wpa->ia.ap;
> -	struct fuse_conn *fc = get_fuse_conn(data->inode);
>  	struct folio **folios;
>  	struct fuse_folio_desc *descs;
>  	unsigned int nfolios = min_t(unsigned int,
>  				     max_t(unsigned int, data->max_folios * 2,
>  					   FUSE_DEFAULT_MAX_PAGES_PER_REQ),
> -				    fc->max_pages);
> +				    max_pages);
>  	WARN_ON(nfolios <= data->max_folios);
>  
>  	folios = fuse_folios_alloc(nfolios, GFP_NOFS, &descs);
> @@ -2096,10 +2096,10 @@ static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
>  	return true;
>  }
>  
> -static void fuse_writepages_send(struct fuse_fill_wb_data *data)
> +static void fuse_writepages_send(struct inode *inode,
> +				 struct fuse_fill_wb_data *data)
>  {
>  	struct fuse_writepage_args *wpa = data->wpa;
> -	struct inode *inode = data->inode;
>  	struct fuse_inode *fi = get_fuse_inode(inode);
>  
>  	spin_lock(&fi->lock);
> @@ -2134,7 +2134,8 @@ static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
>  		return true;
>  
>  	/* Need to grow the pages array?  If so, did the expansion fail? */
> -	if (ap->num_folios == data->max_folios && !fuse_pages_realloc(data))
> +	if (ap->num_folios == data->max_folios &&
> +	    !fuse_pages_realloc(data, fc->max_pages))
>  		return true;
>  
>  	return false;
> @@ -2147,7 +2148,7 @@ static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
>  	struct fuse_fill_wb_data *data = wpc->wb_ctx;
>  	struct fuse_writepage_args *wpa = data->wpa;
>  	struct fuse_args_pages *ap = &wpa->ia.ap;
> -	struct inode *inode = data->inode;
> +	struct inode *inode = wpc->inode;
>  	struct fuse_inode *fi = get_fuse_inode(inode);
>  	struct fuse_conn *fc = get_fuse_conn(inode);
>  	loff_t offset = offset_in_folio(folio, pos);
> @@ -2165,7 +2166,7 @@ static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
>  	iomap_start_folio_write(inode, folio, 1);
>  
>  	if (wpa && fuse_writepage_need_send(fc, folio, offset, len, ap, data)) {
> -		fuse_writepages_send(data);
> +		fuse_writepages_send(inode, data);
>  		data->wpa = NULL;
>  		data->nr_bytes = 0;
>  	}
> @@ -2199,7 +2200,7 @@ static int fuse_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
>  
>  	if (data->wpa) {
>  		WARN_ON(!data->wpa->ia.ap.num_folios);
> -		fuse_writepages_send(data);
> +		fuse_writepages_send(wpc->inode, data);
>  	}
>  
>  	if (data->ff)
> @@ -2218,9 +2219,7 @@ static int fuse_writepages(struct address_space *mapping,
>  {
>  	struct inode *inode = mapping->host;
>  	struct fuse_conn *fc = get_fuse_conn(inode);
> -	struct fuse_fill_wb_data data = {
> -		.inode = inode,
> -	};
> +	struct fuse_fill_wb_data data = {};
>  	struct iomap_writepage_ctx wpc = {
>  		.inode = inode,
>  		.iomap.type = IOMAP_MAPPED,
> @@ -2242,9 +2241,7 @@ static int fuse_writepages(struct address_space *mapping,
>  static int fuse_launder_folio(struct folio *folio)
>  {
>  	int err = 0;
> -	struct fuse_fill_wb_data data = {
> -		.inode = folio->mapping->host,
> -	};
> +	struct fuse_fill_wb_data data = {};
>  	struct iomap_writepage_ctx wpc = {
>  		.inode = folio->mapping->host,
>  		.iomap.type = IOMAP_MAPPED,
> -- 
> 2.47.1
> 
> 


* Re: [PATCH v3 12/16] fuse: use iomap for buffered writes
  2025-07-02 17:57     ` Darrick J. Wong
@ 2025-07-02 22:57       ` Joanne Koong
  0 siblings, 0 replies; 50+ messages in thread
From: Joanne Koong @ 2025-07-02 22:57 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Wed, Jul 2, 2025 at 10:57 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Wed, Jul 02, 2025 at 10:55:09AM -0700, Darrick J. Wong wrote:
> > On Mon, Jun 23, 2025 at 07:21:31PM -0700, Joanne Koong wrote:
> > > Have buffered writes go through iomap. This has two advantages:
> > > * granular large folio synchronous reads
> > > * granular large folio dirty tracking
> > >
> > > If for example there is a 1 MB large folio and a write issued at pos 1
> > > to pos 1 MB - 2, only the head and tail pages will need to be read in
> > > and marked uptodate instead of the entire folio needing to be read in.
> > > Non-relevant trailing pages are also skipped (eg if for a 1 MB large
> > > folio a write is issued at pos 1 to 4099, only the first two pages are
> > > read in and the ones after that are skipped).
> > >
> > > iomap also has granular dirty tracking. This is useful in that when it
> > > comes to writeback time, only the dirty portions of the large folio will
> > > be written instead of having to write out the entire folio. For example
> > > if there is a 1 MB large folio and only 2 bytes in it are dirty, only
> > > the page for those dirty bytes get written out. Please note that
> > > granular writeback is only done once fuse also uses iomap in writeback
> > > (separate commit).
> > >
> > > .release_folio needs to be set to iomap_release_folio so that any
> > > allocated iomap ifs structs get freed.
> >
> > What happens in the !iomap case, which can still happen for
> > !writeback_cache filesystems?  I don't think you can call
> > iomap_release_folio, because iomap doesn't own folio->private in that
> > case.

AFAICS, fuse doesn't otherwise attach any private data to the folio in
the non-writeback paths, so I don't think this is an issue;
ifs_free() would be a no-op.
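
For reference, this is why that's safe. Paraphrased from my reading of
fs/iomap/buffered-io.c around this series (not the verbatim source, and
details may have drifted):

        static void ifs_free(struct folio *folio)
        {
                struct iomap_folio_state *ifs = folio_detach_private(folio);

                if (!ifs)       /* fuse's non-writeback paths land here */
                        return;
                /* ... WARN_ON_ONCE sanity checks, then kfree(ifs) ... */
        }

iomap_release_folio() only tears down what iomap itself attached to
folio->private, so a folio that never went through the iomap write path
falls straight through.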

>
> ...and I think the answer to that is that the !writeback_cache case
> passes all file IO directly to the fuse server and never touches the
> page cache at all?

There are two !writeback_cache cases, direct io and writethrough.
For writethrough, the file IO gets passed to the fuse server and is
also written to the page cache.

>
> --D
>


* Re: [PATCH v3 13/16] fuse: use iomap for writeback
  2025-07-02 18:13   ` Darrick J. Wong
@ 2025-07-02 23:20     ` Joanne Koong
  0 siblings, 0 replies; 50+ messages in thread
From: Joanne Koong @ 2025-07-02 23:20 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-fsdevel, hch, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Wed, Jul 2, 2025 at 11:13 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Mon, Jun 23, 2025 at 07:21:32PM -0700, Joanne Koong wrote:
> > Use iomap for dirty folio writeback in ->writepages().
> > This allows for granular dirty writeback of large folios.
> >
> > Only the dirty portions of the large folio will be written instead of
> > having to write out the entire folio. For example if there is a 1 MB
> > large folio and only 2 bytes in it are dirty, only the page for those
> > dirty bytes will be written out.
> >
> > .dirty_folio needs to be set to iomap_dirty_folio so that the bitmap
> > iomap uses for dirty tracking correctly reflects dirty regions that need
> > to be written back.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  fs/fuse/file.c | 124 ++++++++++++++++++++++++++++++-------------------
> >  1 file changed, 75 insertions(+), 49 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index a7f11c1a4f89..2b4b950eaeed 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -1837,7 +1837,7 @@ static void fuse_writepage_finish(struct fuse_writepage_args *wpa)
> >                * scope of the fi->lock alleviates xarray lock
> >                * contention and noticeably improves performance.
> >                */
> > -             folio_end_writeback(ap->folios[i]);
> > +             iomap_finish_folio_write(inode, ap->folios[i], 1);
> >               dec_wb_stat(&bdi->wb, WB_WRITEBACK);
> >               wb_writeout_inc(&bdi->wb);
> >       }
> > @@ -2024,19 +2024,20 @@ static void fuse_writepage_add_to_bucket(struct fuse_conn *fc,
> >  }
> >
> >  static void fuse_writepage_args_page_fill(struct fuse_writepage_args *wpa, struct folio *folio,
> > -                                       uint32_t folio_index)
> > +                                       uint32_t folio_index, loff_t offset, unsigned len)
> >  {
> >       struct inode *inode = folio->mapping->host;
> >       struct fuse_args_pages *ap = &wpa->ia.ap;
> >
> >       ap->folios[folio_index] = folio;
> > -     ap->descs[folio_index].offset = 0;
> > -     ap->descs[folio_index].length = folio_size(folio);
> > +     ap->descs[folio_index].offset = offset;
> > +     ap->descs[folio_index].length = len;
> >
> >       inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
> >  }
> >
> >  static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio,
> > +                                                          size_t offset,
> >                                                            struct fuse_file *ff)
> >  {
> >       struct inode *inode = folio->mapping->host;
> > @@ -2049,7 +2050,7 @@ static struct fuse_writepage_args *fuse_writepage_args_setup(struct folio *folio
> >               return NULL;
> >
> >       fuse_writepage_add_to_bucket(fc, wpa);
> > -     fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio), 0);
> > +     fuse_write_args_fill(&wpa->ia, ff, folio_pos(folio) + offset, 0);
> >       wpa->ia.write.in.write_flags |= FUSE_WRITE_CACHE;
> >       wpa->inode = inode;
> >       wpa->ia.ff = ff;
> > @@ -2075,7 +2076,7 @@ static int fuse_writepage_locked(struct folio *folio)
> >       if (!ff)
> >               goto err;
> >
> > -     wpa = fuse_writepage_args_setup(folio, ff);
> > +     wpa = fuse_writepage_args_setup(folio, 0, ff);
> >       error = -ENOMEM;
> >       if (!wpa)
> >               goto err_writepage_args;
> > @@ -2084,7 +2085,7 @@ static int fuse_writepage_locked(struct folio *folio)
> >       ap->num_folios = 1;
> >
> >       folio_start_writeback(folio);
> > -     fuse_writepage_args_page_fill(wpa, folio, 0);
> > +     fuse_writepage_args_page_fill(wpa, folio, 0, 0, folio_size(folio));
> >
> >       spin_lock(&fi->lock);
> >       list_add_tail(&wpa->queue_entry, &fi->queued_writes);
> > @@ -2105,7 +2106,7 @@ struct fuse_fill_wb_data {
> >       struct fuse_file *ff;
> >       struct inode *inode;
> >       unsigned int max_folios;
> > -     unsigned int nr_pages;
> > +     unsigned int nr_bytes;
> >  };
> >
> >  static bool fuse_pages_realloc(struct fuse_fill_wb_data *data)
> > @@ -2147,21 +2148,28 @@ static void fuse_writepages_send(struct fuse_fill_wb_data *data)
> >  }
> >
> >  static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
> > +                                  loff_t offset, unsigned len,
> >                                    struct fuse_args_pages *ap,
> >                                    struct fuse_fill_wb_data *data)
> >  {
> > +     struct folio *prev_folio;
> > +     struct fuse_folio_desc prev_desc;
> > +
> >       WARN_ON(!ap->num_folios);
> >
> >       /* Reached max pages */
> > -     if (data->nr_pages + folio_nr_pages(folio) > fc->max_pages)
> > +     if ((data->nr_bytes + len) / PAGE_SIZE > fc->max_pages)
> >               return true;
> >
> >       /* Reached max write bytes */
> > -     if ((data->nr_pages * PAGE_SIZE) + folio_size(folio) > fc->max_write)
> > +     if (data->nr_bytes + len > fc->max_write)
> >               return true;
> >
> >       /* Discontinuity */
> > -     if (folio_next_index(ap->folios[ap->num_folios - 1]) != folio->index)
> > +     prev_folio = ap->folios[ap->num_folios - 1];
> > +     prev_desc = ap->descs[ap->num_folios - 1];
> > +     if ((folio_pos(prev_folio) + prev_desc.offset + prev_desc.length) !=
> > +         folio_pos(folio) + offset)
> >               return true;
> >
> >       /* Need to grow the pages array?  If so, did the expansion fail? */
> > @@ -2171,85 +2179,103 @@ static bool fuse_writepage_need_send(struct fuse_conn *fc, struct folio *folio,
> >       return false;
> >  }
> >
> > -static int fuse_writepages_fill(struct folio *folio,
> > -             struct writeback_control *wbc, void *_data)
> > +static ssize_t fuse_iomap_writeback_range(struct iomap_writepage_ctx *wpc,
> > +                                       struct folio *folio, u64 pos,
> > +                                       unsigned len, u64 end_pos)
> >  {
> > -     struct fuse_fill_wb_data *data = _data;
> > +     struct fuse_fill_wb_data *data = wpc->wb_ctx;
> >       struct fuse_writepage_args *wpa = data->wpa;
> >       struct fuse_args_pages *ap = &wpa->ia.ap;
> >       struct inode *inode = data->inode;
> >       struct fuse_inode *fi = get_fuse_inode(inode);
> >       struct fuse_conn *fc = get_fuse_conn(inode);
> > -     int err;
> > +     loff_t offset = offset_in_folio(folio, pos);
> > +
> > +     WARN_ON_ONCE(!data);
> > +     /* len will always be page aligned */
> > +     WARN_ON_ONCE(len & (PAGE_SIZE - 1));
> >
> >       if (!data->ff) {
> > -             err = -EIO;
> >               data->ff = fuse_write_file_get(fi);
> >               if (!data->ff)
> > -                     goto out_unlock;
> > +                     return -EIO;
> >       }
> >
> > -     if (wpa && fuse_writepage_need_send(fc, folio, ap, data)) {
> > +     iomap_start_folio_write(inode, folio, 1);
> > +
> > +     if (wpa && fuse_writepage_need_send(fc, folio, offset, len, ap, data)) {
> >               fuse_writepages_send(data);
> >               data->wpa = NULL;
> > -             data->nr_pages = 0;
> > +             data->nr_bytes = 0;
> >       }
> >
> >       if (data->wpa == NULL) {
> > -             err = -ENOMEM;
> > -             wpa = fuse_writepage_args_setup(folio, data->ff);
> > +             wpa = fuse_writepage_args_setup(folio, offset, data->ff);
> >               if (!wpa)
> > -                     goto out_unlock;
> > +                     return -ENOMEM;
>
> If we error out here, what subtracts from write_bytes_pending the
> quantity that we just added in iomap_start_folio_write?
>
> (It would have helped a lot if the cover letter had linked to a git
> branch so I could go look at the final product for myself...)
>

Ugh, you're right, this needs to be accounted for in the error cases.
I'll fix this in v4. Thanks for spotting this. I'll make a git branch
and include a link to it in v4 as well.
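Something like this, presumably (an untested sketch of the fix, not the
actual v4 patch):

        if (data->wpa == NULL) {
                wpa = fuse_writepage_args_setup(folio, offset, data->ff);
                if (!wpa) {
                        /* balance the earlier iomap_start_folio_write() */
                        iomap_finish_folio_write(inode, folio, 1);
                        return -ENOMEM;
                }
                fuse_file_get(wpa->ia.ff);
                data->max_folios = 1;
                ap = &wpa->ia.ap;
        }

(Or alternatively, delay the iomap_start_folio_write() call until after
the allocation has succeeded, so there is nothing to unwind.)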

> --D
>


* Re: [PATCH v3 03/16] iomap: refactor the writeback interface
  2025-07-02 17:13   ` Darrick J. Wong
@ 2025-07-03 12:16     ` Christoph Hellwig
  2025-07-03 16:51       ` Darrick J. Wong
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2025-07-03 12:16 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Joanne Koong, linux-fsdevel, hch, miklos, brauner, anuj20.g,
	linux-xfs, linux-doc, linux-block, gfs2, kernel-team

On Wed, Jul 02, 2025 at 10:13:53AM -0700, Darrick J. Wong wrote:
> > +    int (*writeback_range)(struct iomap_writepage_ctx *wpc,
> > +    		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
> 
> Why does @pos change from loff_t to u64 here?  Are we expecting
> filesystems that set FOP_UNSIGNED_OFFSET?

It doesn't really change; it matches what iomap_writepage_map_blocks
was doing.  I guess it simply carries the existing inconsistency
forward rather than fixing it.

> > +    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
> 
> Nit:   ^^ indenting change here.

Yeah, RST formatting is a mess unfortunately.  I think the problem is
that the existing code uses 4-space indents.  I wonder if that's
required by RST?

> > +		if (wpc->iomap.type != IOMAP_HOLE)
> > +			*wb_pending = true;
> 
> /me wonders if this should be an outparam of ->writeback_range to signal
> that it actually added the folio to the writeback ioend chain?  Or maybe
> just a boolean in iomap_writepage_ctx that we clear before calling
> ->writeback_range and iomap_add_to_ioend can set it as appropriate?

What's the benefit of that?  A hole pretty clearly signals that there
is no writeback here.

> Should this jump label should be named add_to_ioend or something?  We
> already mapped the blocks.  The same applies to the zoned version of
> this function.

The newer version already uses a map_blocks helper for both again.



* Re: [PATCH v3 04/16] iomap: hide ioends from the generic writeback code
  2025-07-02 17:38   ` Darrick J. Wong
@ 2025-07-03 12:19     ` Christoph Hellwig
  0 siblings, 0 replies; 50+ messages in thread
From: Christoph Hellwig @ 2025-07-03 12:19 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Joanne Koong, linux-fsdevel, hch, miklos, brauner, anuj20.g,
	linux-xfs, linux-doc, linux-block, gfs2, kernel-team

On Wed, Jul 02, 2025 at 10:38:19AM -0700, Darrick J. Wong wrote:
> > -Filesystems that need to update internal bookkeeping (e.g. unwritten
> > -extent conversions) should provide a ``->submit_ioend`` function to
> > -set ``struct iomap_end::bio::bi_end_io`` to its own function.
> > -This function should call ``iomap_finish_ioends`` after finishing its
> > -own work (e.g. unwritten extent conversion).
> > -
> 
> I really wish you wouldn't delete the documentation that talks about
> what sort of things you might do in a ->writeback_submit function.
> That might be obvious to us who've been around for a long time, but I
> don't think that's so obvious to the junior programmers.

Because it's somewhere between wrong and totally arbitrary.  That whole
file is a real pain in the butt because of that approach, and I really
should have fought much harder against adding it.

> >  /*
> > - * Submit an ioend.
> 
> Please retain the summary.

The summary is split up into comments in the places where it makes
sense now.

> >  
> > -	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
> > +	if (!ioend || !iomap_can_add_to_ioend(wpc, pos, ioend_flags)) {
> >  new_ioend:
> > -		error = iomap_submit_ioend(wpc, 0);
> > -		if (error)
> > -			return error;
> > -		wpc->ioend = iomap_alloc_ioend(wpc, pos, ioend_flags);
> > +		if (ioend) {
> > +			error = wpc->ops->writeback_submit(wpc, 0);
> 
> Should we call ioend_writeback_submit directly if
> !wpc->ops->writeback_submit, to avoid the indirect call hit for simpler
> filesystems?

No.  Compared to all the other indirect calls here it doesn't matter,
and having arbitrary defaults tends to cause problems down the road.
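
(For the record, the rejected fallback would look roughly like this,
using the hypothetical ioend_writeback_submit name from the review;
this is not in the series, it's shown only to illustrate the default
being objected to:

        if (wpc->ops->writeback_submit)
                error = wpc->ops->writeback_submit(wpc, 0);
        else
                error = ioend_writeback_submit(wpc, 0);

i.e. the saving is one indirect call, at the price of an implicit
default that callers can't see at the call site.)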



* Re: [PATCH v3 03/16] iomap: refactor the writeback interface
  2025-07-03 12:16     ` Christoph Hellwig
@ 2025-07-03 16:51       ` Darrick J. Wong
  0 siblings, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2025-07-03 16:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joanne Koong, linux-fsdevel, miklos, brauner, anuj20.g, linux-xfs,
	linux-doc, linux-block, gfs2, kernel-team

On Thu, Jul 03, 2025 at 02:16:54PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 02, 2025 at 10:13:53AM -0700, Darrick J. Wong wrote:
> > > +    int (*writeback_range)(struct iomap_writepage_ctx *wpc,
> > > +    		struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
> > 
> > Why does @pos change from loff_t to u64 here?  Are we expecting
> > filesystems that set FOP_UNSIGNED_OFFSET?
> 
> It doesn't really change, it matches what iomap_writepage_map_blocks
> was doing.  I guess it simply doesn't fix the existing inconsistency.
> 
> > > +    int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
> > 
> > Nit:   ^^ indenting change here.
> 
> Yeah, RST formatting is a mess unfortunately.   I think the problem is
> that the exiting code uses 4 space indents.  I wonder if that's required
> by %##% RST?

It's a code block, so it's not going to make the rst parser choke.
However, it will result in weirdly indented output:

 struct iomap_writeback_ops {
     int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
                       loff_t offset, unsigned len);
     int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
    void (*discard_folio)(struct folio *folio, loff_t pos);
 };

is what I got when I removed an indentation space from discard_folio.
Hilariously, it actually makes the "(" line up, which appeals to my
column-aligning brain and looks better. :P

So having now seriously undercut my own point, I'll relax to "meh do
whatever".

> > > +		if (wpc->iomap.type != IOMAP_HOLE)
> > > +			*wb_pending = true;
> > 
> > /me wonders if this should be an outparam of ->writeback_range to signal
> > that it actually added the folio to the writeback ioend chain?  Or maybe
> > just a boolean in iomap_writepage_ctx that we clear before calling
> > ->writeback_range and iomap_add_to_ioend can set it as appropriate?
> 
> What's the benefit of that?  A hole pretty clearly signal there is
> no writeback here.

Fair enough.  In my head it was "the code that actually sets up the
ioend should set this flag" but I guess we can detect it from the
mapping after the fact instead of passing things around.

> > Should this jump label should be named add_to_ioend or something?  We
> > already mapped the blocks.  The same applies to the zoned version of
> > this function.
> 
> The newer version already uses a map_blocks helper for both again.

Ah, so it does.

--D


Thread overview: 50+ messages
2025-06-24  2:21 [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Joanne Koong
2025-06-24  2:21 ` [PATCH v3 01/16] iomap: pass more arguments using struct iomap_writepage_ctx Joanne Koong
2025-06-24  6:07   ` Johannes Thumshirn
2025-06-24 22:02     ` Joanne Koong
2025-07-02 15:57   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 02/16] iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks Joanne Koong
2025-06-24  6:09   ` Johannes Thumshirn
2025-07-02 15:59   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 03/16] iomap: refactor the writeback interface Joanne Koong
2025-07-02 17:13   ` Darrick J. Wong
2025-07-03 12:16     ` Christoph Hellwig
2025-07-03 16:51       ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 04/16] iomap: hide ioends from the generic writeback code Joanne Koong
2025-07-02 17:38   ` Darrick J. Wong
2025-07-03 12:19     ` Christoph Hellwig
2025-06-24  2:21 ` [PATCH v3 05/16] iomap: add public helpers for uptodate state manipulation Joanne Koong
2025-07-02 17:39   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 06/16] iomap: move all ioend handling to ioend.c Joanne Koong
2025-07-02 17:40   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 07/16] iomap: rename iomap_writepage_map to iomap_writeback_folio Joanne Koong
2025-07-02 17:41   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 08/16] iomap: move folio_unlock out of iomap_writeback_folio Joanne Koong
2025-07-02 17:43   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 09/16] iomap: export iomap_writeback_folio Joanne Koong
2025-07-02 17:44   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 10/16] iomap: replace iomap_folio_ops with iomap_write_ops Joanne Koong
2025-07-02 17:48   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 11/16] iomap: add read_folio_range() handler for buffered writes Joanne Koong
2025-07-02 17:51   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 12/16] fuse: use iomap " Joanne Koong
2025-06-24 10:07   ` Miklos Szeredi
2025-06-24 21:52     ` Joanne Koong
2025-06-25 11:10   ` kernel test robot
2025-07-02 17:55   ` Darrick J. Wong
2025-07-02 17:57     ` Darrick J. Wong
2025-07-02 22:57       ` Joanne Koong
2025-06-24  2:21 ` [PATCH v3 13/16] fuse: use iomap for writeback Joanne Koong
2025-06-25 14:17   ` kernel test robot
2025-06-25 16:48     ` Joanne Koong
2025-07-01 11:34       ` Christian Brauner
2025-07-02 18:13   ` Darrick J. Wong
2025-07-02 23:20     ` Joanne Koong
2025-06-24  2:21 ` [PATCH v3 14/16] fuse: use iomap for folio laundering Joanne Koong
2025-06-25 17:02   ` kernel test robot
2025-07-02 18:14   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 15/16] fuse: hook into iomap for invalidating and checking partial uptodateness Joanne Koong
2025-07-02 18:14   ` Darrick J. Wong
2025-06-24  2:21 ` [PATCH v3 16/16] fuse: refactor writeback to use iomap_writepage_ctx inode Joanne Koong
2025-07-02 18:15   ` Darrick J. Wong
2025-06-24 13:31 ` [PATCH v3 00/16] fuse: use iomap for buffered writes + writeback Christoph Hellwig
