* [PATCH v3 0/4] allow partial folio write with iomap_folio_state
@ 2025-08-12 9:15 alexjlzheng
2025-08-12 9:15 ` [PATCH v3 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
From: alexjlzheng @ 2025-08-12 9:15 UTC (permalink / raw)
To: brauner, djwong
Cc: linux-xfs, linux-fsdevel, linux-kernel, yi.zhang, Jinliang Zheng
From: Jinliang Zheng <alexjlzheng@tencent.com>
With iomap_folio_state, we can track uptodate state at the block level,
and read_folio can correctly handle partially uptodate folios.
Therefore, when a partial write occurs, accept the block-aligned
partial write instead of rejecting the entire write.
For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
bytes are 2MB-3kB.
Without this patchset, we'd need to recopy from the beginning of the
folio in the next iteration, which means 2MB-3kB of data gets copied
twice.
|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block | ... | block | block | ... | block | folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->| first time copied
|<-------- 1MB -------->| next time we need copy (chunk /= 2)
                        |<-------- 1MB -------->| next next time we need copy.
|<------ 2MB-3kB bytes duplicate copy ---->|
With this patchset, we can accept 2MB-4kB of bytes, which is block-aligned.
This means we only need to process the remaining 4kB in the next iteration,
so only 1kB of data gets copied twice.
|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block | ... | block | block | ... | block | folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->| first time copied
                                        |<-4kB->| next time we need copy
                                        |<>|
                                        only 1kB bytes duplicate copy
Although partial writes are inherently unusual and do not account for a
large share of time in performance testing, the optimization here still
makes sense in large-scale data centers.
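As a minimal standalone C sketch (hypothetical name; the series implements
this on top of iomap_folio_state as iomap_trim_tail_partial() in patch 4,
which additionally skips the trim when the last block is already uptodate),
the tail trimming looks roughly like:
static unsigned long trim_to_block_boundary(unsigned long pos,
					    unsigned long copied,
					    unsigned long block_size)
{
	/* bytes of the copy that spill into the last, partially written block */
	unsigned long tail = (pos + copied) & (block_size - 1);
	/* keep only the block-aligned prefix of the copy */
	return tail < copied ? copied - tail : 0;
}
With pos = 0, copied = 2MB-3kB and block_size = 4kB this returns 2MB-4kB,
matching the diagrams above.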
This patchset has been tested with the xfstests generic and xfs groups,
and there are no new failures compared to the latest upstream kernel.
Changelog:
V3: patch[1]: use WARN_ON() instead of BUG_ON()
patch[2]: make commit message clear
patch[3]: -
patch[4]: make commit message clear
V2: https://lore.kernel.org/linux-fsdevel/20250810101554.257060-1-alexjlzheng@tencent.com/
use & instead of % on 64-bit variables to keep m68k/xtensa happy and
avoid the __moddi3 link errors below (see the short sketch after this changelog):
m68k-linux-ld: fs/iomap/buffered-io.o: in function `iomap_adjust_read_range':
>> buffered-io.c:(.text+0xa8a): undefined reference to `__moddi3'
>> m68k-linux-ld: buffered-io.c:(.text+0xaa8): undefined reference to `__moddi3'
V1: https://lore.kernel.org/linux-fsdevel/20250810044806.3433783-1-alexjlzheng@tencent.com/
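For reference, a kernel-style sketch of the two forms mentioned in the V2
note above, assuming block_size is a power of two (as filesystem block
sizes are); pos_is_block_aligned() is only an illustrative helper, not
part of the series:
static bool pos_is_block_aligned(u64 pos, unsigned int block_size)
{
	/* pos % block_size may be emitted as a __moddi3 libcall on 32-bit targets */
	/* the mask form compiles to a plain AND and needs no libgcc helper */
	return (pos & (block_size - 1)) == 0;
}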
Jinliang Zheng (4):
iomap: make sure iomap_adjust_read_range() are aligned with block_size
iomap: move iter revert case out of the unwritten branch
iomap: make iomap_write_end() return the number of written length
again
iomap: don't abandon the whole copy when we have iomap_folio_state
fs/iomap/buffered-io.c | 68 +++++++++++++++++++++++++++++-------------
1 file changed, 47 insertions(+), 21 deletions(-)
--
2.49.0
* [PATCH v3 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
@ 2025-08-12 9:15 ` alexjlzheng
2025-08-12 9:15 ` [PATCH v3 2/4] iomap: move iter revert case out of the unwritten branch alexjlzheng
From: alexjlzheng @ 2025-08-12 9:15 UTC (permalink / raw)
To: brauner, djwong
Cc: linux-xfs, linux-fsdevel, linux-kernel, yi.zhang, Jinliang Zheng
From: Jinliang Zheng <alexjlzheng@tencent.com>
iomap_folio_state marks the uptodate state in units of block_size, so
it is better to check that pos and length are aligned with block_size.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/iomap/buffered-io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index fd827398afd2..0c38333933c6 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -234,6 +234,9 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
unsigned first = poff >> block_bits;
unsigned last = (poff + plen - 1) >> block_bits;
+ WARN_ON(*pos & (block_size - 1));
+ WARN_ON(length & (block_size - 1));
+
/*
* If the block size is smaller than the page size, we need to check the
* per-block uptodate status and adjust the offset and length if needed
--
2.49.0
* [PATCH v3 2/4] iomap: move iter revert case out of the unwritten branch
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
2025-08-12 9:15 ` [PATCH v3 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
@ 2025-08-12 9:15 ` alexjlzheng
2025-08-12 9:15 ` [PATCH v3 3/4] iomap: make iomap_write_end() return the number of written length again alexjlzheng
From: alexjlzheng @ 2025-08-12 9:15 UTC (permalink / raw)
To: brauner, djwong
Cc: linux-xfs, linux-fsdevel, linux-kernel, yi.zhang, Jinliang Zheng
From: Jinliang Zheng <alexjlzheng@tencent.com>
Commit e1f453d4336d ("iomap: do some small logical cleanup in buffered
write") moved the iomap_write_failed() and iov_iter_revert() calls into
the written == 0 branch, because at the time iomap_write_end() could
never report a partial write length.
In a subsequent patch, iomap_write_end() will be allowed to return a
block-aligned partial write length (partial relative to the folio-sized
write), which invalidates that assumption.
Move the calls back out of that branch to prepare for that change.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/iomap/buffered-io.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 0c38333933c6..109c3bad6ccf 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1019,6 +1019,11 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
if (old_size < pos)
pagecache_isize_extended(iter->inode, old_size, pos);
+ if (written < bytes)
+ iomap_write_failed(iter->inode, pos + written,
+ bytes - written);
+ if (unlikely(copied != written))
+ iov_iter_revert(i, copied - written);
cond_resched();
if (unlikely(written == 0)) {
@@ -1028,9 +1033,6 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
* halfway through, might be a race with munmap,
* might be severe memory pressure.
*/
- iomap_write_failed(iter->inode, pos, bytes);
- iov_iter_revert(i, copied);
-
if (chunk > PAGE_SIZE)
chunk /= 2;
if (copied) {
--
2.49.0
* [PATCH v3 3/4] iomap: make iomap_write_end() return the number of written length again
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
2025-08-12 9:15 ` [PATCH v3 1/4] iomap: make sure iomap_adjust_read_range() are aligned with block_size alexjlzheng
2025-08-12 9:15 ` [PATCH v3 2/4] iomap: move iter revert case out of the unwritten branch alexjlzheng
@ 2025-08-12 9:15 ` alexjlzheng
2025-08-12 9:15 ` [PATCH v3 4/4] iomap: don't abandon the whole copy when we have iomap_folio_state alexjlzheng
From: alexjlzheng @ 2025-08-12 9:15 UTC (permalink / raw)
To: brauner, djwong
Cc: linux-xfs, linux-fsdevel, linux-kernel, yi.zhang, Jinliang Zheng
From: Jinliang Zheng <alexjlzheng@tencent.com>
The next patch allows iomap_write_end() to conditionally accept partial
writes, so make iomap_write_end() return the number of accepted bytes in
preparation for that change.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/iomap/buffered-io.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 109c3bad6ccf..7b9193f8243a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -873,7 +873,7 @@ static int iomap_write_begin(struct iomap_iter *iter,
return status;
}
-static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
flush_dcache_folio(folio);
@@ -890,11 +890,11 @@ static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
* redo the whole thing.
*/
if (unlikely(copied < len && !folio_test_uptodate(folio)))
- return false;
+ return 0;
iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
filemap_dirty_folio(inode->i_mapping, folio);
- return true;
+ return copied;
}
static void iomap_write_end_inline(const struct iomap_iter *iter,
@@ -915,10 +915,10 @@ static void iomap_write_end_inline(const struct iomap_iter *iter,
}
/*
- * Returns true if all copied bytes have been written to the pagecache,
- * otherwise return false.
+ * Returns the number of copied bytes that have been written to the
+ * pagecache, or zero if a block would be left only partially updated.
*/
-static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
+static int iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
struct folio *folio)
{
const struct iomap *srcmap = iomap_iter_srcmap(iter);
@@ -926,7 +926,7 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
if (srcmap->type == IOMAP_INLINE) {
iomap_write_end_inline(iter, folio, pos, copied);
- return true;
+ return copied;
}
if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
@@ -934,7 +934,7 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
bh_written = block_write_end(pos, len, copied, folio);
WARN_ON_ONCE(bh_written != copied && bh_written != 0);
- return bh_written == copied;
+ return bh_written;
}
return __iomap_write_end(iter->inode, pos, len, copied, folio);
@@ -1000,8 +1000,7 @@ static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i,
flush_dcache_folio(folio);
copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
- written = iomap_write_end(iter, bytes, copied, folio) ?
- copied : 0;
+ written = iomap_write_end(iter, bytes, copied, folio);
/*
* Update the in-memory inode size after copying the data into
@@ -1315,7 +1314,7 @@ static int iomap_unshare_iter(struct iomap_iter *iter,
do {
struct folio *folio;
size_t offset;
- bool ret;
+ int ret;
bytes = min_t(u64, SIZE_MAX, bytes);
status = iomap_write_begin(iter, write_ops, &folio, &offset,
@@ -1327,7 +1326,7 @@ static int iomap_unshare_iter(struct iomap_iter *iter,
ret = iomap_write_end(iter, bytes, bytes, folio);
__iomap_put_folio(iter, write_ops, bytes, folio);
- if (WARN_ON_ONCE(!ret))
+ if (WARN_ON_ONCE(ret != bytes))
return -EIO;
cond_resched();
@@ -1388,7 +1387,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
do {
struct folio *folio;
size_t offset;
- bool ret;
+ int ret;
bytes = min_t(u64, SIZE_MAX, bytes);
status = iomap_write_begin(iter, write_ops, &folio, &offset,
@@ -1406,7 +1405,7 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
ret = iomap_write_end(iter, bytes, bytes, folio);
__iomap_put_folio(iter, write_ops, bytes, folio);
- if (WARN_ON_ONCE(!ret))
+ if (WARN_ON_ONCE(ret != bytes))
return -EIO;
status = iomap_iter_advance(iter, &bytes);
--
2.49.0
* [PATCH v3 4/4] iomap: don't abandon the whole copy when we have iomap_folio_state
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
2025-08-12 9:15 ` [PATCH v3 3/4] iomap: make iomap_write_end() return the number of written length again alexjlzheng
@ 2025-08-12 9:15 ` alexjlzheng
2025-08-25 6:41 ` [PATCH v3 0/4] allow partial folio write with iomap_folio_state Jinliang Zheng
2025-08-25 9:34 ` Christoph Hellwig
From: alexjlzheng @ 2025-08-12 9:15 UTC (permalink / raw)
To: brauner, djwong
Cc: linux-xfs, linux-fsdevel, linux-kernel, yi.zhang, Jinliang Zheng
From: Jinliang Zheng <alexjlzheng@tencent.com>
With iomap_folio_state, we can track uptodate state at the block level,
and read_folio can correctly handle partially uptodate folios.
Therefore, when a partial write occurs, accept the block-aligned
partial write instead of rejecting the entire write.
For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
bytes are 2MB-3kB.
Without this patchset, we'd need to recopy from the beginning of the
folio in the next iteration, which means 2MB-3kB of data gets copied
twice.
|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block | ... | block | block | ... | block | folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->| first time copied
|<-------- 1MB -------->| next time we need copy (chunk /= 2)
                        |<-------- 1MB -------->| next next time we need copy.
|<------ 2MB-3kB bytes duplicate copy ---->|
With this patchset, we can accept 2MB-4kB of bytes, which is block-aligned.
This means we only need to process the remaining 4kB in the next iteration,
so only 1kB of data gets copied twice.
|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block | ... | block | block | ... | block | folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->| first time copied
                                        |<-4kB->| next time we need copy
                                        |<>|
                                        only 1kB bytes duplicate copy
Although partial writes are inherently unusual and do not account for a
large share of time in performance testing, the optimization here still
makes sense in large-scale data centers.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/iomap/buffered-io.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7b9193f8243a..743e369b64d4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -873,6 +873,25 @@ static int iomap_write_begin(struct iomap_iter *iter,
return status;
}
+static int iomap_trim_tail_partial(struct inode *inode, loff_t pos,
+ size_t copied, struct folio *folio)
+{
+ struct iomap_folio_state *ifs = folio->private;
+ unsigned block_size, last_blk, last_blk_bytes;
+
+ if (!ifs || !copied)
+ return 0;
+
+ block_size = 1 << inode->i_blkbits;
+ last_blk = offset_in_folio(folio, pos + copied - 1) >> inode->i_blkbits;
+ last_blk_bytes = (pos + copied) & (block_size - 1);
+
+ if (!ifs_block_is_uptodate(ifs, last_blk))
+ copied -= min(copied, last_blk_bytes);
+
+ return copied;
+}
+
static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
@@ -886,12 +905,15 @@ static int __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
* read_folio might come in and destroy our partial write.
*
* Do the simplest thing and just treat any short write to a
- * non-uptodate page as a zero-length write, and force the caller to
- * redo the whole thing.
+ * non-uptodate block as a zero-length write, and force the caller to
+ * redo the write starting from that block.
*/
- if (unlikely(copied < len && !folio_test_uptodate(folio)))
- return 0;
- iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
+ if (unlikely(copied < len && !folio_test_uptodate(folio))) {
+ copied = iomap_trim_tail_partial(inode, pos, copied, folio);
+ if (!copied)
+ return 0;
+ }
+ iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), copied);
iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
filemap_dirty_folio(inode->i_mapping, folio);
return copied;
--
2.49.0
* Re: [PATCH v3 0/4] allow partial folio write with iomap_folio_state
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
2025-08-12 9:15 ` [PATCH v3 4/4] iomap: don't abandon the whole copy when we have iomap_folio_state alexjlzheng
@ 2025-08-25 6:41 ` Jinliang Zheng
2025-08-25 9:34 ` Christoph Hellwig
From: Jinliang Zheng @ 2025-08-25 6:41 UTC (permalink / raw)
To: alexjlzheng
Cc: alexjlzheng, brauner, djwong, linux-fsdevel, linux-kernel,
linux-xfs, yi.zhang, hch
On Tue, 12 Aug 2025 17:15:34 +0800, Jinliang Zheng wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> With iomap_folio_state, we can identify uptodate states at the block
> level, and a read_folio reading can correctly handle partially
> uptodate folios.
>
> Therefore, when a partial write occurs, accept the block-aligned
> partial write instead of rejecting the entire write.
>
> For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
> bytes are 2MB-3kB.
>
> Without this patchset, we'd need to recopy from the beginning of the
> folio in the next iteration, which means 2MB-3kB of bytes is copy
> duplicately.
>
> |<-------------------- 2MB -------------------->|
> +-------+-------+-------+-------+-------+-------+
> | block | ... | block | block | ... | block | folio
> +-------+-------+-------+-------+-------+-------+
> |<-4kB->|
>
> |<--------------- copied 2MB-3kB --------->| first time copied
> |<-------- 1MB -------->| next time we need copy (chunk /= 2)
> |<-------- 1MB -------->| next next time we need copy.
>
> |<------ 2MB-3kB bytes duplicate copy ---->|
>
> With this patchset, we can accept 2MB-4kB of bytes, which is block-aligned.
> This means we only need to process the remaining 4kB in the next iteration,
> which means there's only 1kB we need to copy duplicately.
>
> |<-------------------- 2MB -------------------->|
> +-------+-------+-------+-------+-------+-------+
> | block | ... | block | block | ... | block | folio
> +-------+-------+-------+-------+-------+-------+
> |<-4kB->|
>
> |<--------------- copied 2MB-3kB --------->| first time copied
> |<-4kB->| next time we need copy
>
> |<>|
> only 1kB bytes duplicate copy
>
> Although partial writes are inherently a relatively unusual situation and do
> not account for a large proportion of performance testing, the optimization
> here still makes sense in large-scale data centers.
>
> This patchset has been tested by xfstests' generic and xfs group, and
> there's no new failed cases compared to the lastest upstream version kernel.
Sorry, I forgot to cc Christoph Hellwig :)
thanks,
Jinliang Zheng
>
> Changelog:
>
> V3: patch[1]: use WARN_ON() instead of BUG_ON()
> patch[2]: make commit message clear
> patch[3]: -
> patch[4]: make commit message clear
>
> V2: https://lore.kernel.org/linux-fsdevel/20250810101554.257060-1-alexjlzheng@tencent.com/
> use & instead of % for 64 bit variable on m68k/xtensa, try to make them happy:
> m68k-linux-ld: fs/iomap/buffered-io.o: in function `iomap_adjust_read_range':
> >> buffered-io.c:(.text+0xa8a): undefined reference to `__moddi3'
> >> m68k-linux-ld: buffered-io.c:(.text+0xaa8): undefined reference to `__moddi3'
>
> V1: https://lore.kernel.org/linux-fsdevel/20250810044806.3433783-1-alexjlzheng@tencent.com/
>
> Jinliang Zheng (4):
> iomap: make sure iomap_adjust_read_range() are aligned with block_size
> iomap: move iter revert case out of the unwritten branch
> iomap: make iomap_write_end() return the number of written length
> again
> iomap: don't abandon the whole copy when we have iomap_folio_state
>
> fs/iomap/buffered-io.c | 68 +++++++++++++++++++++++++++++-------------
> 1 file changed, 47 insertions(+), 21 deletions(-)
>
> --
> 2.49.0
* Re: [PATCH v3 0/4] allow partial folio write with iomap_folio_state
2025-08-12 9:15 [PATCH v3 0/4] allow partial folio write with iomap_folio_state alexjlzheng
2025-08-25 6:41 ` [PATCH v3 0/4] allow partial folio write with iomap_folio_state Jinliang Zheng
@ 2025-08-25 9:34 ` Christoph Hellwig
2025-08-25 11:39 ` Jinliang Zheng
From: Christoph Hellwig @ 2025-08-25 9:34 UTC (permalink / raw)
To: alexjlzheng
Cc: brauner, djwong, linux-xfs, linux-fsdevel, linux-kernel, yi.zhang,
Jinliang Zheng
On Tue, Aug 12, 2025 at 05:15:34PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> With iomap_folio_state, we can identify uptodate states at the block
> level, and a read_folio reading can correctly handle partially
> uptodate folios.
>
> Therefore, when a partial write occurs, accept the block-aligned
> partial write instead of rejecting the entire write.
>
> For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
> bytes are 2MB-3kB.
I'd still love to see some explanation of why you are doing this.
Do you have a workload that actually hits this regularly, and where
it makes a difference? Can you provide numbers to quantify them?
* Re: [PATCH v3 0/4] allow partial folio write with iomap_folio_state
2025-08-25 9:34 ` Christoph Hellwig
@ 2025-08-25 11:39 ` Jinliang Zheng
2025-08-26 13:20 ` Christoph Hellwig
From: Jinliang Zheng @ 2025-08-25 11:39 UTC (permalink / raw)
To: hch
Cc: alexjlzheng, alexjlzheng, brauner, djwong, linux-fsdevel,
linux-kernel, linux-xfs, yi.zhang
On Mon, 25 Aug 2025 02:34:30 -0700, Christoph Hellwig wrote:
> On Tue, Aug 12, 2025 at 05:15:34PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> >
> > With iomap_folio_state, we can identify uptodate states at the block
> > level, and a read_folio reading can correctly handle partially
> > uptodate folios.
> >
> > Therefore, when a partial write occurs, accept the block-aligned
> > partial write instead of rejecting the entire write.
> >
> > For example, suppose a folio is 2MB, blocksize is 4kB, and the copied
> > bytes are 2MB-3kB.
>
> I'd still love to see some explanation of why you are doing this.
> Do you have a workload that actually hits this regularly, and where
> it makes a difference. Can you provide numbers to quantify them?
Thank you for your reply. :)
Actually, I discovered this while reading (and studying) the code for large
folios.
Given that short writes are inherently unusual, I don't think this patchset
will significantly improve performance in hot paths. It might help in scenarios
with frequent memory hardware errors, but unfortunately, I haven't built a
test scenario like that.
I'm posting this patchset just because I think we can do better in exception
handling: if we can reduce unnecessary copying, why not?
Hahaha, just my personal opinion. :)
thanks,
Jinliang Zheng
* Re: [PATCH v3 0/4] allow partial folio write with iomap_folio_state
2025-08-25 11:39 ` Jinliang Zheng
@ 2025-08-26 13:20 ` Christoph Hellwig
From: Christoph Hellwig @ 2025-08-26 13:20 UTC (permalink / raw)
To: Jinliang Zheng
Cc: hch, alexjlzheng, brauner, djwong, linux-fsdevel, linux-kernel,
linux-xfs, yi.zhang
On Mon, Aug 25, 2025 at 07:39:21PM +0800, Jinliang Zheng wrote:
> Actually, I discovered this while reading (and studying) the code for large
> folios.
>
> Given that short-writes are inherently unusual, I don't think this patchset
> will significantly improve performance in hot paths. It might help in scenarios
> with frequent memory hardware errors, but unfortunately, I haven't built a
> test scenario like that.
>
> I'm posting this patchset just because I think we can do better in exception
> handling: if we can reduce unnecessary copying, why not?
I'm always interested in the motivation, especially for something
adding more code or doing large changes. If it actually improves
performance, it's much easier to argue for. If it doesn't, that doesn't
mean the patch is bad, but it needs to have other upsides. I'll take
another close look, but please also add your motivation to the cover
letter and commit log for the next round.