linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state
@ 2025-08-10 10:15 alexjlzheng
  2025-08-10 10:15 ` [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: alexjlzheng @ 2025-08-10 10:15 UTC (permalink / raw)
  To: brauner, djwong; +Cc: linux-xfs, linux-fsdevel, linux-kernel, Jinliang Zheng

From: Jinliang Zheng <alexjlzheng@tencent.com>

With iomap_folio_state, we can track uptodate state at the block
level, and read_folio can correctly handle partially uptodate
folios.

Therefore, when a partial write occurs, accept the block-aligned
partial write instead of rejecting the entire write.

This patchset has been tested with xfstests' generic and xfs groups, and
there are no new failed cases compared to the latest upstream kernel.

Changelog:

V2: use & instead of % on 64-bit variables to avoid __moddi3 link errors on m68k/xtensa:
       m68k-linux-ld: fs/iomap/buffered-io.o: in function `iomap_adjust_read_range':
    >> buffered-io.c:(.text+0xa8a): undefined reference to `__moddi3'
    >> m68k-linux-ld: buffered-io.c:(.text+0xaa8): undefined reference to `__moddi3'

V1: https://lore.kernel.org/linux-fsdevel/20250810044806.3433783-1-alexjlzheng@tencent.com/

Jinliang Zheng (4):
  iomap: make sure iomap_adjust_read_range() are aligned with block_size
  iomap: move iter revert case out of the unwritten branch
  iomap: make iomap_write_end() return the number of written length again
  iomap: don't abandon the whole thing with iomap_folio_state

 fs/iomap/buffered-io.c | 68 +++++++++++++++++++++++++++++-------------
 1 file changed, 47 insertions(+), 21 deletions(-)

-- 
2.49.0



* [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size
  2025-08-10 10:15 [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state alexjlzheng
@ 2025-08-10 10:15 ` alexjlzheng
  2025-08-11 10:39   ` Christoph Hellwig
  2025-08-12  7:24   ` Geert Uytterhoeven
  2025-08-10 10:15 ` [PATCH v2 2/4] iomap: move iter revert case out of the unwritten branch alexjlzheng
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: alexjlzheng @ 2025-08-10 10:15 UTC (permalink / raw)
  To: brauner, djwong; +Cc: linux-xfs, linux-fsdevel, linux-kernel, Jinliang Zheng

From: Jinliang Zheng <alexjlzheng@tencent.com>

iomap_folio_state marks the uptodate state in units of block_size, so
it is better to check that pos and length are aligned with block_size.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/iomap/buffered-io.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index fd827398afd2..934458850ddb 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -234,6 +234,9 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	unsigned first = poff >> block_bits;
 	unsigned last = (poff + plen - 1) >> block_bits;
 
+	BUG_ON(*pos & (block_size - 1));
+	BUG_ON(length & (block_size - 1));
+
 	/*
 	 * If the block size is smaller than the page size, we need to check the
 	 * per-block uptodate status and adjust the offset and length if needed
-- 
2.49.0



* [PATCH v2 2/4] iomap: move iter revert case out of the unwritten branch
  2025-08-10 10:15 [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state alexjlzheng
  2025-08-10 10:15 ` [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
@ 2025-08-10 10:15 ` alexjlzheng
  2025-08-11 10:40   ` Christoph Hellwig
  2025-08-10 10:15 ` [PATCH v2 3/4] iomap: make iomap_write_end() return the number of written length again alexjlzheng
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: alexjlzheng @ 2025-08-10 10:15 UTC (permalink / raw)
  To: brauner, djwong; +Cc: linux-xfs, linux-fsdevel, linux-kernel, Jinliang Zheng

From: Jinliang Zheng <alexjlzheng@tencent.com>

This reverts commit e1f453d4336d ("iomap: do some small logical
cleanup in buffered write") in preparation for the next patches,
which allow iomap_write_end() to return a partial write length.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/iomap/buffered-io.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 934458850ddb..641034f621c1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1019,6 +1019,11 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
 
 		if (old_size < pos)
 			pagecache_isize_extended(iter->inode, old_size, pos);
+		if (written < bytes)
+			iomap_write_failed(iter->inode, pos + written,
+					   bytes - written);
+		if (unlikely(copied != written))
+			iov_iter_revert(i, copied - written);
 
 		cond_resched();
 		if (unlikely(written == 0)) {
@@ -1028,9 +1033,6 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
 			 * halfway through, might be a race with munmap,
 			 * might be severe memory pressure.
 			 */
-			iomap_write_failed(iter->inode, pos, bytes);
-			iov_iter_revert(i, copied);
-
 			if (chunk > PAGE_SIZE)
 				chunk /= 2;
 			if (copied) {
-- 
2.49.0



* [PATCH v2 3/4] iomap: make iomap_write_end() return the number of written length again
  2025-08-10 10:15 [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state alexjlzheng
  2025-08-10 10:15 ` [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
  2025-08-10 10:15 ` [PATCH v2 2/4] iomap: move iter revert case out of the unwritten branch alexjlzheng
@ 2025-08-10 10:15 ` alexjlzheng
  2025-08-10 10:15 ` [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state alexjlzheng
  2025-08-11 10:38 ` [PATCH v2 0/4] iomap: allow partial folio write " Christoph Hellwig
  4 siblings, 0 replies; 12+ messages in thread
From: alexjlzheng @ 2025-08-10 10:15 UTC (permalink / raw)
  To: brauner, djwong; +Cc: linux-xfs, linux-fsdevel, linux-kernel, Jinliang Zheng

From: Jinliang Zheng <alexjlzheng@tencent.com>

The next patch allows iomap_write_end() to conditionally accept
partial writes, so make iomap_write_end() return the number of
accepted bytes in preparation for that change.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/iomap/buffered-io.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 641034f621c1..f80386a57d37 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -873,7 +873,7 @@ static int iomap_write_begin(struct iomap_iter *iter,
 	return status;
 }
 
-static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	flush_dcache_folio(folio);
@@ -890,11 +890,11 @@ static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 * redo the whole thing.
 	 */
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
-		return false;
+		return 0;
 	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
 	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
 	filemap_dirty_folio(inode->i_mapping, folio);
-	return true;
+	return copied;
 }
 
 static void iomap_write_end_inline(const struct iomap_iter *iter,
@@ -915,10 +915,10 @@ static void iomap_write_end_inline(const struct iomap_iter *iter,
 }
 
 /*
- * Returns true if all copied bytes have been written to the pagecache,
- * otherwise return false.
+ * Returns the number of copied bytes that have been written to the
+ * pagecache, or zero if the block is only partially updated.
  */
-static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
+static int iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
 		struct folio *folio)
 {
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
@@ -926,7 +926,7 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
 
 	if (srcmap->type == IOMAP_INLINE) {
 		iomap_write_end_inline(iter, folio, pos, copied);
-		return true;
+		return copied;
 	}
 
 	if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
@@ -934,7 +934,7 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
 
 		bh_written = block_write_end(pos, len, copied, folio);
 		WARN_ON_ONCE(bh_written != copied && bh_written != 0);
-		return bh_written == copied;
+		return bh_written;
 	}
 
 	return __iomap_write_end(iter->inode, pos, len, copied, folio);
@@ -1000,8 +1000,7 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
 			flush_dcache_folio(folio);
 
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
-		written = iomap_write_end(iter, bytes, copied, folio) ?
-			  copied : 0;
+		written = iomap_write_end(iter, bytes, copied, folio);
 
 		/*
 		 * Update the in-memory inode size after copying the data into
@@ -1315,7 +1314,7 @@ static int iomap_unshare_iter(struct iomap_iter *iter,
 	do {
 		struct folio *folio;
 		size_t offset;
-		bool ret;
+		int ret;
 
 		bytes = min_t(u64, SIZE_MAX, bytes);
 		status = iomap_write_begin(iter, write_ops, &folio, &offset,
@@ -1327,7 +1326,7 @@ static int iomap_unshare_iter(struct iomap_iter *iter,
 
 		ret = iomap_write_end(iter, bytes, bytes, folio);
 		__iomap_put_folio(iter, write_ops, bytes, folio);
-		if (WARN_ON_ONCE(!ret))
+		if (WARN_ON_ONCE(ret != bytes))
 			return -EIO;
 
 		cond_resched();
@@ -1388,7 +1387,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
 	do {
 		struct folio *folio;
 		size_t offset;
-		bool ret;
+		int ret;
 
 		bytes = min_t(u64, SIZE_MAX, bytes);
 		status = iomap_write_begin(iter, write_ops, &folio, &offset,
@@ -1406,7 +1405,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
 
 		ret = iomap_write_end(iter, bytes, bytes, folio);
 		__iomap_put_folio(iter, write_ops, bytes, folio);
-		if (WARN_ON_ONCE(!ret))
+		if (WARN_ON_ONCE(ret != bytes))
 			return -EIO;
 
 		status = iomap_iter_advance(iter, &bytes);
-- 
2.49.0



* [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state
  2025-08-10 10:15 [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state alexjlzheng
                   ` (2 preceding siblings ...)
  2025-08-10 10:15 ` [PATCH v2 3/4] iomap: make iomap_write_end() return the number of written length again alexjlzheng
@ 2025-08-10 10:15 ` alexjlzheng
  2025-08-11 10:41   ` Christoph Hellwig
  2025-08-11 10:38 ` [PATCH v2 0/4] iomap: allow partial folio write " Christoph Hellwig
  4 siblings, 1 reply; 12+ messages in thread
From: alexjlzheng @ 2025-08-10 10:15 UTC (permalink / raw)
  To: brauner, djwong; +Cc: linux-xfs, linux-fsdevel, linux-kernel, Jinliang Zheng

From: Jinliang Zheng <alexjlzheng@tencent.com>

With iomap_folio_state, we can track uptodate state at the block
level, and read_folio can correctly handle partially uptodate
folios.

Therefore, when a partial write occurs, accept the block-aligned
partial write instead of rejecting the entire write.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/iomap/buffered-io.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f80386a57d37..19bf879f3333 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -873,6 +873,25 @@ static int iomap_write_begin(struct iomap_iter *iter,
 	return status;
 }
 
+static int iomap_trim_tail_partial(struct inode *inode, loff_t pos,
+		size_t copied, struct folio *folio)
+{
+	struct iomap_folio_state *ifs = folio->private;
+	unsigned block_size, last_blk, last_blk_bytes;
+
+	if (!ifs || !copied)
+		return 0;
+
+	block_size = 1 << inode->i_blkbits;
+	last_blk = offset_in_folio(folio, pos + copied - 1) >> inode->i_blkbits;
+	last_blk_bytes = (pos + copied) & (block_size - 1);
+
+	if (!ifs_block_is_uptodate(ifs, last_blk))
+		copied -= min(copied, last_blk_bytes);
+
+	return copied;
+}
+
 static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
@@ -886,12 +905,15 @@ static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 * read_folio might come in and destroy our partial write.
 	 *
 	 * Do the simplest thing and just treat any short write to a
-	 * non-uptodate page as a zero-length write, and force the caller to
-	 * redo the whole thing.
+	 * non-uptodate block as a zero-length write, and force the caller to
+	 * redo the write beginning from that block.
 	 */
-	if (unlikely(copied < len && !folio_test_uptodate(folio)))
-		return 0;
-	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
+	if (unlikely(copied < len && !folio_test_uptodate(folio))) {
+		copied = iomap_trim_tail_partial(inode, pos, copied, folio);
+		if (!copied)
+			return 0;
+	}
+	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), copied);
 	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
-- 
2.49.0



* Re: [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state
  2025-08-10 10:15 [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state alexjlzheng
                   ` (3 preceding siblings ...)
  2025-08-10 10:15 ` [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state alexjlzheng
@ 2025-08-11 10:38 ` Christoph Hellwig
  2025-08-11 12:08   ` Jinliang Zheng
  4 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2025-08-11 10:38 UTC (permalink / raw)
  To: alexjlzheng
  Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel,
	Jinliang Zheng

On Sun, Aug 10, 2025 at 06:15:50PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> With iomap_folio_state, we can track uptodate state at the block
> level, and read_folio can correctly handle partially uptodate
> folios.
> 
> Therefore, when a partial write occurs, accept the block-aligned
> partial write instead of rejecting the entire write.

We're not rejecting the entire write, but instead moving on to the
next loop iteration.

> This patchset has been tested with xfstests' generic and xfs groups, and
> there are no new failed cases compared to the latest upstream kernel.

What is the motivation for this series?  Do you see performance
improvements in a workload you care about?



* Re: [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size
  2025-08-10 10:15 ` [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
@ 2025-08-11 10:39   ` Christoph Hellwig
  2025-08-12  7:24   ` Geert Uytterhoeven
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2025-08-11 10:39 UTC (permalink / raw)
  To: alexjlzheng
  Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel,
	Jinliang Zheng

On Sun, Aug 10, 2025 at 06:15:51PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> iomap_folio_state marks the uptodate state in units of block_size, so
> it is better to check that pos and length are aligned with block_size.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>  fs/iomap/buffered-io.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index fd827398afd2..934458850ddb 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -234,6 +234,9 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
>  	unsigned first = poff >> block_bits;
>  	unsigned last = (poff + plen - 1) >> block_bits;
>  
> +	BUG_ON(*pos & (block_size - 1));
> +	BUG_ON(length & (block_size - 1));

We should not add BUG_ON calls like this, please turn these into
WARN_ON_ONCE instead.



* Re: [PATCH v2 2/4] iomap: move iter revert case out of the unwritten branch
  2025-08-10 10:15 ` [PATCH v2 2/4] iomap: move iter revert case out of the unwritten branch alexjlzheng
@ 2025-08-11 10:40   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2025-08-11 10:40 UTC (permalink / raw)
  To: alexjlzheng
  Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel,
	Jinliang Zheng

On Sun, Aug 10, 2025 at 06:15:52PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> This reverts commit e1f453d4336d ("iomap: do some small logical
> cleanup in buffered write") in preparation for the next patches,
> which allow iomap_write_end() to return a partial write length.

Please provide a reason why you revert it.  As a courtesy it would also
be nice to Cc the author of the	commit.



* Re: [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state
  2025-08-10 10:15 ` [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state alexjlzheng
@ 2025-08-11 10:41   ` Christoph Hellwig
  2025-08-11 12:18     ` Jinliang Zheng
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2025-08-11 10:41 UTC (permalink / raw)
  To: alexjlzheng
  Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel,
	Jinliang Zheng

Where "the whole thing" is the current iteration in the write loop.
Can you spell this out a bit better?

Also please include the rationale why you are changing the logic
here in the commit log.



* Re: [PATCH v2 0/4] iomap: allow partial folio write with iomap_folio_state
  2025-08-11 10:38 ` [PATCH v2 0/4] iomap: allow partial folio write " Christoph Hellwig
@ 2025-08-11 12:08   ` Jinliang Zheng
  0 siblings, 0 replies; 12+ messages in thread
From: Jinliang Zheng @ 2025-08-11 12:08 UTC (permalink / raw)
  To: hch
  Cc: alexjlzheng, alexjlzheng, brauner, djwong, linux-fsdevel,
	linux-kernel, linux-xfs

On Mon, 11 Aug 2025 03:38:17 -0700, Christoph Hellwig wrote:
> On Sun, Aug 10, 2025 at 06:15:50PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > With iomap_folio_state, we can track uptodate state at the block
> > level, and read_folio can correctly handle partially uptodate
> > folios.
> > 
> > Therefore, when a partial write occurs, accept the block-aligned
> > partial write instead of rejecting the entire write.
>

Thank you for your reply. :)
 
> We're not rejecting the entire write, but instead moving on to the
> next loop iteration.

Yes, but the next iteration will need to re-copy from the beginning,
which means everything copied in this iteration is wasted. The
purpose of this patch set is to reduce the number of bytes that must
be re-copied and discarded.

For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
bytes are 2MB-3kB.

Without this patchset, we'd need to re-copy 2MB-3kB in the next
iteration.

 |<-------------------- 2MB -------------------->|
 +-------+-------+-------+-------+-------+-------+
 | block |  ...  | block | block |  ...  | block | folio
 +-------+-------+-------+-------+-------+-------+
 |<-4kB->|

 |<--------------- copied 2MB-3kB --------->|       first time copied
 |<-------- 1MB -------->|                          next time we need copy (chunk /= 2)
                         |<-------- 1MB -------->|  next next time we need copy.

 |<------ 2MB-3kB bytes duplicate copy ---->|

With this patchset, we can accept 2MB-4kB of bytes, which is block-aligned.
This means we only need to process the remaining 4kB in the next iteration.

 |<-------------------- 2MB -------------------->|
 +-------+-------+-------+-------+-------+-------+
 | block |  ...  | block | block |  ...  | block | folio
 +-------+-------+-------+-------+-------+-------+
 |<-4kB->|

 |<--------------- copied 2MB-3kB --------->|       first time copied
                                         |<-4kB->|  next time we need copy

                                         |<>|
                              only 1kB bytes duplicate copy


> 
> > This patchset has been tested with xfstests' generic and xfs groups, and
> > there are no new failed cases compared to the latest upstream kernel.
> 
> What is the motivation for this series?  Do you see performance
> improvements in a workload you care about?

Partial writes are inherently unusual and don't account for a
significant portion of performance testing.

However, in scenarios with numerous memory errors, this patchset can
significantly reduce the number of bytes that must be re-copied.

thanks,
Jinliang Zheng :)


* Re: [PATCH v2 4/4] iomap: don't abandon the whole thing with iomap_folio_state
  2025-08-11 10:41   ` Christoph Hellwig
@ 2025-08-11 12:18     ` Jinliang Zheng
  0 siblings, 0 replies; 12+ messages in thread
From: Jinliang Zheng @ 2025-08-11 12:18 UTC (permalink / raw)
  To: hch
  Cc: alexjlzheng, alexjlzheng, brauner, djwong, linux-fsdevel,
	linux-kernel, linux-xfs

On Mon, 11 Aug 2025 03:41:39 -0700, Christoph Hellwig wrote:
> Where "the whole thing" is the current iteration in the write loop.
> Can you spell this out a bit better?

Hahaha, I was also confused about "the whole thing". I guess it refers to a
partial write in a folio. It appears in the comments of __iomap_write_end().

static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
		size_t copied, struct folio *folio)
{
	flush_dcache_folio(folio);

	/*
	 * The blocks that were entirely written will now be uptodate, so we
	 * don't have to worry about a read_folio reading them and overwriting a
	 * partial write.  However, if we've encountered a short write and only
	 * partially written into a block, it will not be marked uptodate, so a
	 * read_folio might come in and destroy our partial write.
	 *
	 * Do the simplest thing and just treat any short write to a
	 * non-uptodate page as a zero-length write, and force the caller to
	 * redo the whole thing.
                ^^^^^^^^^^^^^^^ <------------------ look look look, it's here :)
	 */
	if (unlikely(copied < len && !folio_test_uptodate(folio)))
		return false;
	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
	filemap_dirty_folio(inode->i_mapping, folio);
	return true;
}

> 
> Also please include the rationale why you are changing the logic
> here in the commit log.

Hahaha, what I want to express is that we no longer need to define a
partial write at folio granularity; block granularity is more
appropriate.
Please forgive my poor English. :-<

thanks,
Jinliang Zheng :)



* Re: [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size
  2025-08-10 10:15 ` [PATCH v2 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
  2025-08-11 10:39   ` Christoph Hellwig
@ 2025-08-12  7:24   ` Geert Uytterhoeven
  1 sibling, 0 replies; 12+ messages in thread
From: Geert Uytterhoeven @ 2025-08-12  7:24 UTC (permalink / raw)
  To: alexjlzheng
  Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel,
	Jinliang Zheng

Hi Jinliang,

On Mon, 11 Aug 2025 at 18:49, <alexjlzheng@gmail.com> wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> iomap_folio_state marks the uptodate state in units of block_size, so
> it is better to check that pos and length are aligned with block_size.
>
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>

Thanks for your patch!

> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -234,6 +234,9 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
>         unsigned first = poff >> block_bits;
>         unsigned last = (poff + plen - 1) >> block_bits;
>
> +       BUG_ON(*pos & (block_size - 1));
> +       BUG_ON(length & (block_size - 1));

!IS_ALIGNED(...)

> +
>         /*
>          * If the block size is smaller than the page size, we need to check the
>          * per-block uptodate status and adjust the offset and length if needed

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

