public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2 1/5] iomap, xfs: lift zero range hole mapping flush into xfs
Date: Wed, 4 Mar 2026 12:04:04 -0500	[thread overview]
Message-ID: <aahmBCz1xJBCPcZ-@bfoster> (raw)
In-Reply-To: <aahJcVkrkLRtsJO9@bfoster>

On Wed, Mar 04, 2026 at 10:02:09AM -0500, Brian Foster wrote:
> On Wed, Mar 04, 2026 at 06:41:23AM -0800, Christoph Hellwig wrote:
> > On Wed, Mar 04, 2026 at 09:17:33AM -0500, Brian Foster wrote:
> > > I tested the change below but it ended up failing xfs/131. Some fast and
> > > loose (i.e. LLM assisted) trace analysis suggests the issue is that this
> > > particular situation is racy. I.e., we write to a sparse file range and
> > > add COW fork dellaloc, writeback kicks in and drops the delalloc
> > > mapping, then zeroing occurs over said range and finds holes in both
> > > forks, then zone I/O completion occurs and maps blocks into the data
> > > fork.
> > 
> > Yes, that can happen.  But the folio will be locked or have the
> > writeback bit set over that whole period, so I can't see how writeback
> > can actually race with that?
> > 
> 
> I think that's why the flush (i.e. current behavior) works..
> 
> The change in my last mail attempts to replace the flush with improved
> reporting to only report a hole if one exists in both COW/data forks at
> the same time. So basically if the data fork is currently a hole and
> writeback kicks in, xfs_zoned_map_blocks() deletes the COW fork mapping
> and carries on with zoned writeback handling. At that point if zeroing
> occurs, it would see holes in both the COW (just removed) and data fork
> (not yet mapped in) and think there's nothing to do.
> 
> The idea is that if say we instead did something like transfer delalloc
> into the data fork at writeback time, if the data fork had a hole, then
> we could always tell from the iomap mapping whether we need to zero or
> not without flushing or consulting pagecache at all.
> 

This patch seems to work on a quick test. It's basically the two patches
squashed together (I'd post them as independent patches), so nothing too
different, but if we fix up the zero logic first that helps clean up the
indentation as a bonus.

Brian

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index be86d43044df..27470ec8372b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1590,6 +1590,7 @@ xfs_zoned_buffered_write_iomap_begin(
 {
 	struct iomap_iter	*iter =
 		container_of(iomap, struct iomap_iter, iomap);
+	struct address_space	*mapping = inode->i_mapping;
 	struct xfs_zone_alloc_ctx *ac = iter->private;
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1614,6 +1615,7 @@ xfs_zoned_buffered_write_iomap_begin(
 	if (error)
 		return error;
 
+restart:
 	error = xfs_ilock_for_iomap(ip, flags, &lockmode);
 	if (error)
 		return error;
@@ -1651,14 +1653,6 @@ xfs_zoned_buffered_write_iomap_begin(
 				&smap))
 			smap.br_startoff = end_fsb; /* fake hole until EOF */
 		if (smap.br_startoff > offset_fsb) {
-			/*
-			 * We never need to allocate blocks for zeroing a hole.
-			 */
-			if (flags & IOMAP_ZERO) {
-				xfs_hole_to_iomap(ip, iomap, offset_fsb,
-						smap.br_startoff);
-				goto out_unlock;
-			}
 			end_fsb = min(end_fsb, smap.br_startoff);
 		} else {
 			end_fsb = min(end_fsb,
@@ -1690,6 +1684,31 @@ xfs_zoned_buffered_write_iomap_begin(
 	count_fsb = min3(end_fsb - offset_fsb, XFS_MAX_BMBT_EXTLEN,
 			 XFS_B_TO_FSB(mp, 1024 * PAGE_SIZE));
 
+	/*
+	 * We don't allocate blocks for zeroing a hole, but we only report a
+	 * hole in zoned mode if one exists in both the COW and data forks.
+	 *
+	 * There is currently a corner case where writeback removes the COW fork
+	 * mapping and unlocks the inode, leaving a transient state where a hole
+	 * exists in both forks until write completion maps blocks into the data
+	 * fork. Until we can avoid this transient hole state, detect and avoid
+	 * this with a flush of any such range that appears dirty in pagecache.
+	 */
+	if ((flags & IOMAP_ZERO) && srcmap->type == IOMAP_HOLE) {
+		if (filemap_range_needs_writeback(mapping, offset,
+						  offset + count - 1)) {
+			xfs_iunlock(ip, lockmode);
+			error = filemap_write_and_wait_range(mapping, offset,
+							    offset + count - 1);
+			if (error)
+				return error;
+			goto restart;
+		}
+
+		xfs_hole_to_iomap(ip, iomap, offset_fsb, end_fsb);
+		goto out_unlock;
+	}
+
 	/*
 	 * The block reservation is supposed to cover all blocks that the
 	 * operation could possible write, but there is a nasty corner case


  reply	other threads:[~2026-03-04 17:04 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-29 15:50 [PATCH v2 0/5] iomap, xfs: improve zero range flushing and lookup Brian Foster
2026-01-29 15:50 ` [PATCH v2 1/5] iomap, xfs: lift zero range hole mapping flush into xfs Brian Foster
2026-02-10 16:15   ` Christoph Hellwig
2026-02-10 19:14     ` Brian Foster
2026-02-11 15:36       ` Christoph Hellwig
2026-02-13  6:06   ` Christoph Hellwig
2026-02-13 17:37     ` Brian Foster
2026-03-02 19:02       ` Brian Foster
2026-03-03 14:37         ` Christoph Hellwig
2026-03-03 19:00           ` Brian Foster
2026-03-04 13:13             ` Christoph Hellwig
2026-03-04 14:17               ` Brian Foster
2026-03-04 14:41                 ` Christoph Hellwig
2026-03-04 15:02                   ` Brian Foster
2026-03-04 17:04                     ` Brian Foster [this message]
2026-03-05 14:11                       ` Christoph Hellwig
2026-03-05 15:06                         ` Brian Foster
2026-03-05 16:10                           ` Christoph Hellwig
2026-02-13 10:20   ` Nirjhar Roy (IBM)
2026-02-13 16:24     ` Darrick J. Wong
2026-02-18 17:41       ` Nirjhar Roy (IBM)
2026-01-29 15:50 ` [PATCH v2 2/5] xfs: flush eof folio before insert range size update Brian Foster
2026-02-10 16:16   ` Christoph Hellwig
2026-02-10 19:14     ` Brian Foster
2026-01-29 15:50 ` [PATCH v2 3/5] xfs: look up cow fork extent earlier for buffered iomap_begin Brian Foster
2026-02-10 16:17   ` Christoph Hellwig
2026-01-29 15:50 ` [PATCH v2 4/5] xfs: only flush when COW fork blocks overlap data fork holes Brian Foster
2026-02-10 16:19   ` Christoph Hellwig
2026-02-10 19:18     ` Brian Foster
2026-02-17 15:06   ` Nirjhar Roy (IBM)
2026-02-18 15:37     ` Brian Foster
2026-02-18 17:40       ` Nirjhar Roy (IBM)
2026-01-29 15:50 ` [PATCH v2 5/5] xfs: replace zero range flush with folio batch Brian Foster
2026-02-10 16:21   ` Christoph Hellwig
2026-02-10 19:19     ` Brian Foster
2026-02-11 15:41       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aahmBCz1xJBCPcZ-@bfoster \
    --to=bfoster@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox