public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: "Pankaj Raghav (Samsung)" <pankaj.raghav@linux.dev>
Cc: Lukas Herbolt <lukas@herbolt.com>,
	linux-xfs@vger.kernel.org, djwong@kernel.org, cem@kernel.org,
	hch@infradead.org, p.raghav@samsung.com
Subject: Re: [PATCH v10] xfs: add FALLOC_FL_WRITE_ZEROES to XFS code base
Date: Thu, 26 Feb 2026 11:42:26 -0500	[thread overview]
Message-ID: <aaB38r55RPLj3ij-@bfoster> (raw)
In-Reply-To: <wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi>

On Thu, Feb 26, 2026 at 02:44:05PM +0000, Pankaj Raghav (Samsung) wrote:
> On Wed, Feb 25, 2026 at 09:39:33AM +0100, Lukas Herbolt wrote:
> > Add support for FALLOC_FL_WRITE_ZEROES if the underlying device enable
> > the unmap write zeroes operation.
> > 
> Hi Lukas,
> 
> I independently started implmenting this feature as well. I ran a test case
> on your patches and it resulted in a warning in iomap_zero_range.
> iomap_zero_range has a check for folios outside eof, and it is being
> called as a part of setsize, i.e, before we change the size of the file.
> 
> I think we need to do a PREALLOC and then do a XFS_BMAPI_ZERO with
> XFS_BMAPI_CONVERT. Or I don't know if we should change the warning in
> iomap_zero_range.
> 

The reason the warning is there is because iomap_zero_range() uses
buffered writes but doesn't actually bump i_size for writes beyond eof.
Therefore if it ends up zeroing folios that start beyond eof, writeback
would potentially toss those folios if i_size wasn't updated somehow or
another by the time it occurs..

I'd guess there are two likely scenarios that lead to this warning, but
you'd have to confirm. One is that we're unnecessarily zeroing an
unwritten range for some reason. That would probably be harmless, but
unexpected. The other would be zeroing written blocks beyond eof, which
is risky and probably something we want to avoid, but also suspicious in
that I don't think we should ever have written extents beyond eof in XFS
(but rather either delalloc or written).

Brian

> Doing unwritten extents first and then converting them to written with
> zeroes is what ext4 does as well. Maybe it is better this way because we
> can quickly allocate the blocks and return while holding the aglocks and
> then do the actually write. I guess someone more experienced with XFS
> can comment on that.
> 
> I can send what I have and I will CC you in the series.
> 
> This is the warning I get when I test your patch:
> 
> WARNING:
> 
> [  112.551102] WARNING: fs/iomap/buffered-io.c:1525 at iomap_zero_range+0x42d/0x7b0, CPU#2: write_zeroes/411
> [  112.560073] RIP: 0010:iomap_zero_range+0x42d/0x7b0
> [  112.593471]  xfs_zero_range+0x86/0xd0 [xfs]
> <snip>
> [  112.594100]  xfs_setattr_size+0x5c2/0xd90 [xfs]
> <snip>
> [  112.598895]  xfs_falloc_setsize+0x158/0x200 [xfs]
> 
> 
> This is the test case:
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
> #include <errno.h>
> 
> #ifndef FALLOC_FL_ZERO_RANGE
> #define FALLOC_FL_ZERO_RANGE 0x10
> #endif
> 
> #ifndef FALLOC_FL_WRITE_ZEROES
> #define FALLOC_FL_WRITE_ZEROES 0x80
> #endif
> 
> #define TEST_SIZE (10 * 1024 * 1024)
> 
> void test_fallocate(const char *filename, int mode, const char *mode_name) {
>     int fd;
> 
>     printf("Testing %s on %s...\n", mode_name, filename);
> 
>     unlink(filename);
> 
>     fd = open(filename, O_RDWR | O_CREAT, 0666);
>     if (fd < 0) {
>         perror("open failed");
>         return;
>     }
> 
>     if (fallocate(fd, mode, 0, TEST_SIZE) == 0) {
>         printf(" -> fallocate(%s) succeeded!\n", mode_name);
>     } else {
>         printf(" -> fallocate(%s) failed: %s\n", mode_name, strerror(errno));
>     }
> 
>     close(fd);
> 
>     /* Dump extent info using xfs_io */
>     char cmd[256];
>     snprintf(cmd, sizeof(cmd), "xfs_io -c 'bmap -vp' %s", filename);
>     printf("=== Extents for %s ===\n", filename);
>     system(cmd);
>     printf("\n");
> }
> 
> int main() {
>     printf("Starting fallocate tests...\n");
>     printf("------------------------------------------------\n\n");
> 
>     test_fallocate("test_zero_range.bin", FALLOC_FL_ZERO_RANGE, "FALLOC_FL_ZERO_RANGE");
>     test_fallocate("test_write_zeroes.bin", FALLOC_FL_WRITE_ZEROES, "FALLOC_FL_WRITE_ZEROES");
> 
>     printf("Test complete.\n");
>     return 0;
> }
> 
> --
> Pankaj
> 


  reply	other threads:[~2026-02-26 16:42 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-25  8:39 [PATCH v10] xfs: add FALLOC_FL_WRITE_ZEROES to XFS code base Lukas Herbolt
2026-02-25 17:52 ` Christoph Hellwig
2026-02-26 14:44 ` Pankaj Raghav (Samsung)
2026-02-26 16:42   ` Brian Foster [this message]
2026-02-27 12:05     ` Pankaj Raghav (Samsung)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaB38r55RPLj3ij-@bfoster \
    --to=bfoster@redhat.com \
    --cc=cem@kernel.org \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=lukas@herbolt.com \
    --cc=p.raghav@samsung.com \
    --cc=pankaj.raghav@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox