From: Matthew Wilcox <willy@infradead.org>
To: Jia Zhu <zhujia.zj@bytedance.com>
Cc: Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Baokun Li <libaokun@linux.alibaba.com>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>,
Ritesh Harjani <ritesh.list@gmail.com>,
Zhang Yi <yi.zhang@huawei.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ext4: avoid full buffer walks for large folio partial writes
Date: Wed, 3 Jun 2026 19:11:48 +0100 [thread overview]
Message-ID: <aiBuZE5NWMfOGAA6@casper.infradead.org> (raw)
In-Reply-To: <20260603134800.25155-1-zhujia.zj@bytedance.com>
On Wed, Jun 03, 2026 at 09:48:00PM +0800, Jia Zhu wrote:
> Ext4 buffered writes into large folios still walk every buffer_head in the
> folio in ext4_block_write_begin() and again in block_commit_write(). Before
> regular files used large folios this was cheap, but a large folio can
> contain hundreds of buffer_heads. Small overwrites of an existing large
> folio therefore pay work proportional to the folio size instead of the
> write size.
Is this a common case for you, or is this something you noticed by
inspection?
> Start the ext4 write_begin walk at the first buffer that overlaps the
> write. For already-uptodate large folio overwrites, add a partial commit
> path which marks only the written buffers uptodate and dirty. Leave
> non-uptodate folios on the old full-buffer commit path so BH_New cleanup
> and folio-uptodate discovery are preserved.
Wouldn't you get just as much benefit from this?
+++ b/fs/buffer.c
@@ -2096,6 +2096,7 @@ void block_commit_write(struct folio *folio, size_t from,
size_t to)
{
size_t block_start, block_end;
bool partial = false;
+ bool uptodate = folio_test_uptodate(folio);
unsigned blocksize;
struct buffer_head *bh, *head;
@@ -2118,6 +2119,8 @@ void block_commit_write(struct folio *folio, size_t from, size_t to)
clear_buffer_new(bh);
block_start = block_end;
+ if (uptodate && block_start >= to)
+ break;
bh = bh->b_this_page;
} while (bh != head);
> @@ -1191,17 +1191,18 @@ int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> head = folio_buffers(folio);
> if (!head)
> head = create_empty_buffers(folio, blocksize, 0);
> - block = EXT4_PG_TO_LBLK(inode, folio->index);
> + if (from == to)
> + return 0;
> + block_start = round_down(from, blocksize);
> + block = EXT4_PG_TO_LBLK(inode, folio->index) +
> + (block_start >> inode->i_blkbits);
> + bh = head;
> + for (i = 0; i < block_start; i += blocksize)
> + bh = bh->b_this_page;
>
> - for (bh = head, block_start = 0; bh != head || !block_start;
> - block++, block_start = block_end, bh = bh->b_this_page) {
> + for (; block_start < to;
> + block++, block_start = block_end, bh = bh->b_this_page) {
> block_end = block_start + blocksize;
> - if (block_end <= from || block_start >= to) {
> - if (folio_test_uptodate(folio)) {
> - set_buffer_uptodate(bh);
> - }
> - continue;
> - }
> if (WARN_ON_ONCE(buffer_new(bh)))
> clear_buffer_new(bh);
> if (!buffer_mapped(bh)) {
>
I'm unconvinced that this is safe ... but all of this is a distraction
form what we should really be doing which is converting ext4 to use
iomap instead of buffer heads.
prev parent reply other threads:[~2026-06-03 18:12 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-03 13:48 [PATCH] ext4: avoid full buffer walks for large folio partial writes Jia Zhu
2026-06-03 18:11 ` Matthew Wilcox [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiBuZE5NWMfOGAA6@casper.infradead.org \
--to=willy@infradead.org \
--cc=adilger.kernel@dilger.ca \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=libaokun@linux.alibaba.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=ritesh.list@gmail.com \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=yi.zhang@huawei.com \
--cc=zhujia.zj@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox