From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-mm@kvack.org, hch@infradead.org, willy@infradead.org
Subject: Re: [PATCH v3 6/7] iomap: remove old partial eof zeroing optimization
Date: Tue, 15 Jul 2025 08:36:54 -0400
Message-ID: <aHZLZid5gggmDD09@bfoster>
In-Reply-To: <20250715053417.GR2672049@frogsfrogsfrogs>
On Mon, Jul 14, 2025 at 10:34:17PM -0700, Darrick J. Wong wrote:
> On Mon, Jul 14, 2025 at 04:41:21PM -0400, Brian Foster wrote:
> > iomap_zero_range() optimizes the partial eof block zeroing use case
> > by force zeroing if the mapping is dirty. This is to avoid frequent
> > flushing on file extending workloads, which hurts performance.
> >
> > Now that the folio batch mechanism provides a more generic solution
> > and is used by the only real zero range user (XFS), this isolated
> > optimization is no longer needed. Remove the unnecessary code and
> > let callers use the folio batch or fall back to flushing by default.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Heh, I was staring at this last Friday chasing fuse+iomap bugs in
> fallocate zerorange and straining to remember what this does.
> Is this chunk still needed if the ->iomap_begin implementation doesn't
> (or forgets to) grab the folio batch for iomap?
>
No, the hunk removed by this patch is just an optimization. The fallback
code here flushes the range if it's dirty and retries the lookup (i.e.
picking up unwritten conversions that were pending via dirty pagecache).
That flush logic caused a performance regression in a particular
workload, so this special case was introduced to mitigate it by zeroing
just the first block or so when the folio is dirty. [1]

The reason for removing it now is mostly maintainability. XFS is really
the only user here and it is changing over to the more generic batch
mechanism, which effectively provides the same optimization, so this
basically becomes dead/duplicate code. If an fs doesn't use the batch
mechanism it just falls back to the flush and retry approach, which can
be slower but is functionally correct.
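
FWIW, the fallback that remains after this patch boils down to
something like the following. This is a hand-waved sketch for
illustration only, not the literal upstream code; the helper names are
approximate and error handling is trimmed:

	range_dirty = filemap_range_needs_writeback(mapping, pos,
						    pos + len - 1);
	while ((ret = iomap_iter(&iter, ops)) > 0) {
		const struct iomap *srcmap = iomap_iter_srcmap(&iter);

		/*
		 * Holes and unwritten extents are zero on disk, so they
		 * only need zeroing if dirty pagecache might convert them
		 * at writeback time. If the caller pre-populated the folio
		 * batch, it already told us exactly which folios matter
		 * and the flush can be skipped entirely.
		 */
		if (!iter.fbatch &&
		    (srcmap->type == IOMAP_HOLE ||
		     srcmap->type == IOMAP_UNWRITTEN)) {
			if (range_dirty) {
				/* flush, mark the mapping stale, retry */
				range_dirty = false;
				iter.status =
					iomap_zero_iter_flush_and_stale(&iter);
			} else {
				/* clean and zero on disk, skip ahead */
				iter.status = iomap_iter_advance_full(&iter);
			}
			continue;
		}

		iter.status = iomap_zero_iter(&iter, did_zero);
	}
	return ret;

So a filesystem that never populates the batch still gets correct
behavior, it just pays the flush when the range is dirty over a
hole/unwritten mapping.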
> My bug turned out to be a bug in my fuse+iomap design -- with the way
> iomap_zero_range does things, you have to flush+unmap, punch the range
> and zero the range. If you punch and realloc the range and *then* try
> to zero the range, the new unwritten extents cause iomap to miss dirty
> pages that fuse should've unmapped. Ooops.
>
I don't quite follow. How do you mean it misses dirty pages?

Brian

[1] Details described in the commit log of fde4c4c3ec1c ("iomap: elide
flush from partial eof zero range").
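
(For completeness, the caller side of the batch mechanism is
conceptually just: when ->iomap_begin() reports an unwritten mapping
under a zero range, look up the dirty folios over that mapping and
stash them in the iter before returning. Very rough sketch; the helper
name and the placeholder variables below approximate the series rather
than quote it:

	/*
	 * Hypothetical ->iomap_begin() fragment: for an unwritten mapping
	 * under IOMAP_ZERO, hand iomap the dirty folios over the mapping
	 * so only those get zeroed and the flush/retry fallback is never
	 * hit. mapping_is_unwritten, map_start_bytes and map_len_bytes are
	 * placeholders for the fs's own mapping state.
	 */
	if ((flags & IOMAP_ZERO) && mapping_is_unwritten)
		iomap_fill_dirty_folios(iter, map_start_bytes, map_len_bytes);

which effectively gives the same result as the removed special case,
without iomap having to guess from pagecache state.)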
> --D
>
> > ---
> > fs/iomap/buffered-io.c | 24 ------------------------
> > 1 file changed, 24 deletions(-)
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 194e3cc0857f..d2bbed692c06 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1484,33 +1484,9 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> >  		.private = private,
> >  	};
> >  	struct address_space *mapping = inode->i_mapping;
> > -	unsigned int blocksize = i_blocksize(inode);
> > -	unsigned int off = pos & (blocksize - 1);
> > -	loff_t plen = min_t(loff_t, len, blocksize - off);
> >  	int ret;
> >  	bool range_dirty;
> >  
> > -	/*
> > -	 * Zero range can skip mappings that are zero on disk so long as
> > -	 * pagecache is clean. If pagecache was dirty prior to zero range, the
> > -	 * mapping converts on writeback completion and so must be zeroed.
> > -	 *
> > -	 * The simplest way to deal with this across a range is to flush
> > -	 * pagecache and process the updated mappings. To avoid excessive
> > -	 * flushing on partial eof zeroing, special case it to zero the
> > -	 * unaligned start portion if already dirty in pagecache.
> > -	 */
> > -	if (!iter.fbatch && off &&
> > -	    filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
> > -		iter.len = plen;
> > -		while ((ret = iomap_iter(&iter, ops)) > 0)
> > -			iter.status = iomap_zero_iter(&iter, did_zero);
> > -
> > -		iter.len = len - (iter.pos - pos);
> > -		if (ret || !iter.len)
> > -			return ret;
> > -	}
> > -
> >  	/*
> >  	 * To avoid an unconditional flush, check pagecache state and only flush
> >  	 * if dirty and the fs returns a mapping that might convert on
> > --
> > 2.50.0
> >
> >
>