Linux XFS filesystem development
From: Pankaj Raghav <pankaj.raghav@linux.dev>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com, lukas@herbolt.com,
	dgc@kernel.org, gost.dev@samsung.com,
	Zhang Yi <yi.zhang@huaweicloud.com>,
	andres@anarazel.de, kundan.kumar@samsung.com, hch@lst.de,
	cem@kernel.org, hch@infradead.org,
	Pankaj Raghav <p.raghav@samsung.com>
Subject: Re: [PATCH v8 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES
Date: Thu, 2 Jul 2026 12:37:31 +0200
Message-ID: <75dd0085-e847-47de-aa67-a4f714d3d417@linux.dev>
In-Reply-To: <20260625172006.GC6078@frogsfrogsfrogs>


>> +	/*
>> +	 * Allocate written, zeroed extents across the range.  xfs_alloc_file_space()
>> +	 * rounds outward to block granularity:
>> +	 *  - holes (the punched interior and any unallocated edge block) are
>> +	 *    allocated and zeroed;
>> +	 *  - unwritten extents (including unwritten edge blocks) are converted to
>> +	 *    written and zeroed;
>> +	 *  - Already written edge blocks are skipped. The out-of-range bytes of
>> +	 *    a written edge block keep their data (offset_rd -> offset and
>> +	 *    end -> end_rd); their in-range bytes (offset -> offset_ru and
>> +	 *    end_ru -> end) were already zeroed by xfs_free_file_space().
>> +	 */
>> +	return xfs_alloc_file_space(ip, offset, len,
>> +			XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
> 
> ...and now we can just do an accelerated "write zeroes to disk" which is
> conveniently always within EOF now.  I /think/ this looks ok to me now,
> though I'm curious how extensively the new fallocate mode has been
> tested with fsx and unaligned file ranges?  And rt volumes with rt
> extent size > 1 fsblock.
> 

I am running into an issue with rtvol.

I ran generic/363 (fsx) on an rt volume with rextsize=2 (4k blocks, 8k rt extents). The first thing I hit was:

xfs_bmap_rtalloc: ASSERT(xfs_extlen_to_rtxmod == 0). WRITE_ZEROES uses
XFS_BMAPI_CONVERT, and xfs_bmap_extsize_align() bails early on convert, so the
rt-unaligned range becomes a sub-rt-extent allocation. I fixed it by rounding the alloc range
out to whole rt extents, like xfs_free_file_space() does for rt extents.

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 855602cb35e8..e52ad4c25b66 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -710,6 +710,13 @@ xfs_alloc_file_space(
        imapp = &imaps[0];
        startoffset_fsb = XFS_B_TO_FSBT(mp, offset);
        endoffset_fsb = XFS_B_TO_FSB(mp, offset + count);
+
+       if (mode == XFS_ALLOC_FILE_SPACE_WRITE_ZEROES &&
+           xfs_inode_has_bigrtalloc(ip)) {
+               startoffset_fsb = xfs_fileoff_rounddown_rtx(mp, startoffset_fsb);
+               endoffset_fsb = xfs_fileoff_roundup_rtx(mp, endoffset_fsb);
+       }
+
        allocatesize_fsb = endoffset_fsb - startoffset_fsb;

        /*
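
To make the rounding concrete, here is a small userspace sketch of the
arithmetic (not kernel code). The helper names and the struct are
hypothetical stand-ins that only mirror what XFS_B_TO_FSBT(),
XFS_B_TO_FSB() and the rtx round helpers compute; the numbers assume
the 4k block / rextsize=2 setup above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: file block range [start, end) of an allocation. */
struct fsb_range {
	uint64_t start;
	uint64_t end;
};

/* Round a file block number down to an rt extent boundary. */
static uint64_t rounddown_rtx(uint64_t fsb, uint32_t rextsize)
{
	return fsb - (fsb % rextsize);
}

/* Round a file block number up to an rt extent boundary. */
static uint64_t roundup_rtx(uint64_t fsb, uint32_t rextsize)
{
	uint64_t rem = fsb % rextsize;

	return rem ? fsb + (rextsize - rem) : fsb;
}

/*
 * Byte range -> block range.  The start truncates and the end rounds
 * up, as XFS_B_TO_FSBT() and XFS_B_TO_FSB() do.  With round_to_rtx set,
 * the range is additionally widened to whole rt extents.
 */
static struct fsb_range alloc_range(uint64_t offset, uint64_t count,
				    uint32_t blksz, uint32_t rextsize,
				    int round_to_rtx)
{
	struct fsb_range r;

	r.start = offset / blksz;
	r.end = (offset + count + blksz - 1) / blksz;
	if (round_to_rtx) {
		r.start = rounddown_rtx(r.start, rextsize);
		r.end = roundup_rtx(r.end, rextsize);
	}
	return r;
}
```

For offset=5000 and count=3000, the unrounded range is blocks [1, 2),
a single block, which is not a multiple of the rt extent size. With
rounding it becomes [0, 2), exactly one rt extent.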

With that "fixed", I ran into WARN_ON_ONCE(folio_pos > i_size) in iomap_zero_iter,
via xfs_falloc_setsize -> xfs_setattr_size -> xfs_zero_range. The
rt-aligned alloc now leaves written zeroed blocks past a non-rt-aligned
EOF. So WRITE_ZEROES on rtvol breaks "no written blocks past EOF".
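
As a rough illustration of the second problem, the sketch below counts
how many allocated blocks start at or beyond EOF once the end of the
range is rounded up to an rt extent. It is userspace arithmetic only;
blocks_past_eof() is a hypothetical helper, and it assumes the file
size ends up at offset + count, which is a simplification of what
xfs_falloc_setsize() does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of blocks in the allocation that lie
 * wholly past EOF after the end block is rounded up to an rt extent
 * boundary.  rextsize == 1 means no rounding takes place.
 */
static uint64_t blocks_past_eof(uint64_t offset, uint64_t count,
				uint64_t isize, uint32_t blksz,
				uint32_t rextsize)
{
	/* First block past the requested range (end rounds up). */
	uint64_t end_fsb = (offset + count + blksz - 1) / blksz;
	/* First block that starts at or beyond EOF. */
	uint64_t eof_fsb = (isize + blksz - 1) / blksz;
	uint64_t rem = end_fsb % rextsize;

	if (rem)
		end_fsb += rextsize - rem;

	return end_fsb > eof_fsb ? end_fsb - eof_fsb : 0;
}
```

With offset=8192, count=100 and a resulting size of 8292, the
allocation ends at block 3 without rounding, so nothing is past EOF.
With rextsize=2 the end is pushed to block 4, so block 3 (byte 12288
onwards) is a written block entirely beyond the 8292-byte EOF.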

I am not sure what the cleanest way to solve this for rtvol would be.

Let me know your thoughts!

@Zhang Yi: did you test this on bigalloc configurations? You might run into a similar problem
when the cluster size (extsize for XFS) is larger than the block size.

--
Pankaj

Thread overview: 9+ messages
2026-06-25 11:45 [PATCH v8 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav
2026-06-25 11:45 ` [PATCH v8 1/2] xfs: add an allocation mode to xfs_alloc_file_space() Pankaj Raghav
2026-06-25 17:01   ` Darrick J. Wong
2026-06-25 11:45 ` [PATCH v8 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES Pankaj Raghav
2026-06-25 17:20   ` Darrick J. Wong
2026-06-26 16:04     ` Pankaj Raghav
2026-07-02 10:37     ` Pankaj Raghav [this message]
2026-07-02 16:03       ` Darrick J. Wong
2026-07-02 19:44         ` Pankaj Raghav
