From: alexjlzheng@gmail.com
To: brauner@kernel.org, djwong@kernel.org, hch@infradead.org,
kernel@pankajraghav.com
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, yi.zhang@huawei.com,
Jinliang Zheng <alexjlzheng@tencent.com>
Subject: [PATCH v5 0/4] allow partial folio write with iomap_folio_state
Date: Tue, 23 Sep 2025 12:21:54 +0800
Message-ID: <20250923042158.1196568-1-alexjlzheng@tencent.com>
From: Jinliang Zheng <alexjlzheng@tencent.com>
Currently, if a partial write occurs in a buffered write, the entire write
is discarded. While this is an uncommon case, it is still wasteful and we
can do better.

With iomap_folio_state, we can track uptodate state at the block level, and
read_folio can correctly handle partially uptodate folios.

Therefore, when a partial write occurs, accept the block-aligned part of
the write instead of rejecting the entire write.

For example, suppose a folio is 2MB, the block size is 4kB, and the copied
bytes are 2MB-3kB.

Without this patchset, we need to recopy from the beginning of the folio in
the next iteration, which means 2MB-3kB of bytes are copied twice.

  |<-------------------- 2MB -------------------->|
  +-------+-------+-------+-------+-------+-------+
  | block |  ...  | block | block |  ...  | block | folio
  +-------+-------+-------+-------+-------+-------+
  |<-4kB->|

  |<--------------- copied 2MB-3kB --------->|       first time copied
  |<-------- 1MB -------->|                          next time we need copy (chunk /= 2)
                          |<-------- 1MB -------->|  next next time we need copy.

  |<------ 2MB-3kB bytes duplicate copy ---->|

With this patchset, we can accept 2MB-4kB of bytes, which is block-aligned.
This means we only need to process the remaining 4kB in the next iteration,
so only 1kB is copied twice.

  |<-------------------- 2MB -------------------->|
  +-------+-------+-------+-------+-------+-------+
  | block |  ...  | block | block |  ...  | block | folio
  +-------+-------+-------+-------+-------+-------+
  |<-4kB->|

  |<--------------- copied 2MB-3kB --------->|       first time copied
                                          |<-4kB->|  next time we need copy

                                          |<>|
                           only 1kB bytes duplicate copy

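
The arithmetic in the example above can be sketched in plain C. This is
only an illustration of the rounding, not the code from the patches:
accept_block_aligned() is a hypothetical helper name, and the sketch
assumes the block size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the patchset): given a short copy of
 * `copied` bytes starting at byte offset `pos`, return how many of those
 * bytes can be kept if only the trailing partial block is dropped.
 * Assumes block_size is a power of two.
 */
static uint64_t accept_block_aligned(uint64_t pos, uint64_t copied,
				     uint64_t block_size)
{
	uint64_t end, aligned_end;

	assert(block_size && (block_size & (block_size - 1)) == 0);

	end = pos + copied;
	/* Round the end of the copied range down to a block boundary. */
	aligned_end = end & ~(block_size - 1);

	/* The copy never reached a block boundary: nothing can be kept. */
	if (aligned_end <= pos)
		return 0;
	return aligned_end - pos;
}
```

For pos = 0, copied = 2MB-3kB and a 4kB block size, the helper returns
2MB-4kB, so copied - accepted = 1kB has to be copied again and 4kB of the
folio remains for the next iteration. Without block-level acceptance the
accepted length is 0 and all 2MB-3kB would be recopied.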
Although partial writes are inherently unusual and do not make up a large
share of performance testing, the optimization still makes sense in
large-scale data centers.

This patchset has been tested with the generic and xfs groups of xfstests,
and there are no new failures compared to the latest upstream kernel.

Changelog:

V5: patch[1]: use WARN_ON_ONCE() instead of WARN_ON(), suggested by Pankaj Raghav (Samsung)

V4: https://lore.kernel.org/linux-fsdevel/eyyshgzsxupyen6ms3izkh45ydh3ekxycpk5p4dbets6mpyhch@q4db2ayr4g3r/
    patch[4]: better documentation in code, and add motivation to the cover letter

V3: https://lore.kernel.org/linux-xfs/aMPIDGq7pVuURg1t@infradead.org/
    patch[1]: use WARN_ON() instead of BUG_ON()
    patch[2]: make commit message clear
    patch[3]: -
    patch[4]: make commit message clear

V2: https://lore.kernel.org/linux-fsdevel/20250810101554.257060-1-alexjlzheng@tencent.com/
    use & instead of % for 64 bit variable on m68k/xtensa, try to make them happy:
       m68k-linux-ld: fs/iomap/buffered-io.o: in function `iomap_adjust_read_range':
    >> buffered-io.c:(.text+0xa8a): undefined reference to `__moddi3'
    >> m68k-linux-ld: buffered-io.c:(.text+0xaa8): undefined reference to `__moddi3'

V1: https://lore.kernel.org/linux-fsdevel/20250810044806.3433783-1-alexjlzheng@tencent.com/
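
As background for the V2 note: on 32-bit targets a 64-bit `%` is typically
lowered to a call to the libgcc helper __moddi3, which the kernel does not
link against, hence the undefined reference. For a power-of-two block size
the same remainder can be taken with a mask. A minimal sketch, assuming a
power-of-two block size (block_offset() is a hypothetical name, not a
function from the patches):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: offset of `pos` within its block, computed with a
 * mask instead of a 64-bit modulo. Only valid when block_size is a power
 * of two.
 */
static uint64_t block_offset(uint64_t pos, uint64_t block_size)
{
	assert(block_size && (block_size & (block_size - 1)) == 0);

	/* Same result as pos % block_size, with no division involved. */
	return pos & (block_size - 1);
}
```

A position is block-aligned exactly when block_offset() returns 0.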

Jinliang Zheng (4):
  iomap: make sure iomap_adjust_read_range() are aligned with block_size
  iomap: move iter revert case out of the unwritten branch
  iomap: make iomap_write_end() return the number of written length again
  iomap: don't abandon the whole copy when we have iomap_folio_state

 fs/iomap/buffered-io.c | 80 +++++++++++++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 25 deletions(-)

--
2.49.0