* [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs
@ 2026-02-27 14:08 Pankaj Raghav
2026-02-27 14:08 ` [RFC 1/2] xfs: add flags field to xfs_alloc_file_space Pankaj Raghav
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Pankaj Raghav @ 2026-02-27 14:08 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, dchinner, Darrick J . Wong, p.raghav, gost.dev,
pankaj.raghav, andres, cem, hch, lucas

The benefits of FALLOC_FL_WRITE_ZEROES were already discussed as part
of Zhang Yi's initial patches[1]. Postgres developer Andres also
mentioned they would like to use this feature in Postgres [2].

Lukas Herbolt recently sent a patch that adds this support, but I found
some issues with it[3]. I independently started working on these patches
a while back as well, so I thought I would send an RFC version of
this support.

I have implemented this support similar to what ext4 is doing: allocate
unwritten extents first, increase the size of the file, then convert those
extents to written and zero them with XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO.
This seems to work correctly without changing any of the core
infrastructure. But I am not sure if this is the most efficient way of
doing it in XFS or if there are some corner cases I am missing, so any
feedback is welcome.

[1] https://lore.kernel.org/linux-fsdevel/20250619111806.3546162-1-yi.zhang@huaweicloud.com/
[2] https://lore.kernel.org/linux-fsdevel/20260217055103.GA6174@lst.de/T/#m7935b9bab32bb5ff372507f84803b8753ad1c814
[3] https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/
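
From the caller's side, patch 2/2 of this series returns -EOPNOTSUPP when the block device lacks unmap write zeroes support, so userspace probably wants a fallback. Below is a minimal sketch, not part of the series. The helper name prealloc_zeroed() is hypothetical, the 0x80 fallback definition is an assumption for builds against older uapi headers, and falling back to plain zero writes is just one possible policy.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/falloc.h>

#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES 0x80	/* assumed value for older uapi headers */
#endif

/*
 * Hypothetical helper: make [offset, offset + len) read back as zeroes and
 * end up as written extents. Returns 0 on success, -1 with errno set.
 */
static int prealloc_zeroed(int fd, off_t offset, off_t len)
{
	static const char zeroes[64 * 1024];

	/* Preferred path: written, zeroed extents offloaded to the device. */
	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, offset, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	/* Unsupported by kernel, filesystem or device: write zeroes by hand. */
	while (len > 0) {
		size_t chunk = sizeof(zeroes);
		ssize_t ret;

		if ((off_t)chunk > len)
			chunk = (size_t)len;
		ret = pwrite(fd, zeroes, chunk, offset);
		if (ret < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		offset += ret;
		len -= ret;
	}
	return 0;
}
```

A caller would open the file read-write and call prealloc_zeroed(fd, 0, size) once before starting overwrites. FALLOC_FL_ZERO_RANGE is deliberately not used as a fallback here because it leaves unwritten extents, which is the state this feature is meant to avoid.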
=== Testing ===:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <linux/falloc.h>	/* FALLOC_FL_WRITE_ZEROES needs recent uapi headers */

/* Inferred from the 20480 512-byte blocks in the bmap output: 10 MiB. */
#define TEST_SIZE	(10 * 1024 * 1024)

void test_fallocate(const char *filename, int mode, const char *mode_name) {
	int fd;

	printf("Testing %s on %s...\n", mode_name, filename);
	unlink(filename);
	fd = open(filename, O_RDWR | O_CREAT, 0666);
	if (fd < 0) {
		perror("open failed");
		return;
	}

	if (fallocate(fd, mode, 0, TEST_SIZE) == 0) {
		printf(" -> fallocate(%s) succeeded!\n", mode_name);
	} else {
		printf(" -> fallocate(%s) failed: %s\n", mode_name, strerror(errno));
	}
	close(fd);

	/* Dump extent info using xfs_io */
	char cmd[256];
	snprintf(cmd, sizeof(cmd), "xfs_io -c 'bmap -vvp' %s", filename);
	printf("=== Extents for %s ===\n", filename);
	system(cmd);
	printf("\n");
}

int main(void) {
	printf("Starting fallocate tests...\n");
	printf("------------------------------------------------\n\n");
	test_fallocate("test_zero_range.bin", FALLOC_FL_ZERO_RANGE, "FALLOC_FL_ZERO_RANGE");
	test_fallocate("test_write_zeroes.bin", FALLOC_FL_WRITE_ZEROES, "FALLOC_FL_WRITE_ZEROES");
	printf("Test complete.\n");
	return 0;
}

This is the output:
root@debian:/mnt# ~/home/write_zeroes /mnt/hello
Starting fallocate tests...
------------------------------------------------

Testing FALLOC_FL_ZERO_RANGE on test_zero_range.bin...
 -> fallocate(FALLOC_FL_ZERO_RANGE) succeeded!
=== Extents for test_zero_range.bin ===
test_zero_range.bin:
 EXT: FILE-OFFSET   BLOCK-RANGE    AG AG-OFFSET        TOTAL FLAGS
   0: [0..20479]:   20672..41151    0 (20672..41151)   20480 010000
 FLAG Values:
    0100000 Shared extent
    0010000 Unwritten preallocated extent
    0001000 Doesn't begin on stripe unit
    0000100 Doesn't end on stripe unit
    0000010 Doesn't begin on stripe width
    0000001 Doesn't end on stripe width

Testing FALLOC_FL_WRITE_ZEROES on test_write_zeroes.bin...
 -> fallocate(FALLOC_FL_WRITE_ZEROES) succeeded!
=== Extents for test_write_zeroes.bin ===
test_write_zeroes.bin:
 EXT: FILE-OFFSET   BLOCK-RANGE    AG AG-OFFSET        TOTAL FLAGS
   0: [0..20479]:   41152..61631    0 (41152..61631)   20480 000000
 FLAG Values:
    0100000 Shared extent
    0010000 Unwritten preallocated extent
    0001000 Doesn't begin on stripe unit
    0000100 Doesn't end on stripe unit
    0000010 Doesn't begin on stripe width
    0000001 Doesn't end on stripe width

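
The bmap output above only shows extent state (unwritten vs. written). It does not show that the data reads back as zeroes. A complementary check could read the range back, sketched below. The helper name range_is_zero() and its return convention are hypothetical and not part of the posted test program.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the first `len` bytes of `path` all read
 * back as zero, 0 if a non-zero byte is found, and -1 on error or if the
 * file is shorter than `len`.
 */
static int range_is_zero(const char *path, off_t len)
{
	char buf[4096];
	off_t done = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	while (done < len) {
		size_t want = sizeof(buf);
		ssize_t got;

		if ((off_t)want > len - done)
			want = (size_t)(len - done);

		got = pread(fd, buf, want, done);
		if (got <= 0) {		/* read error or unexpected EOF */
			close(fd);
			return -1;
		}
		for (ssize_t i = 0; i < got; i++) {
			if (buf[i] != 0) {
				close(fd);
				return 0;
			}
		}
		done += got;
	}

	close(fd);
	return 1;
}
```

In the test program this could be called as range_is_zero(filename, TEST_SIZE) after each fallocate(). Both modes are expected to read back as zeroes, so the check complements the extent flags rather than replacing them.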
Pankaj Raghav (2):
xfs: add flags field to xfs_alloc_file_space
xfs: add support for FALLOC_FL_WRITE_ZEROES
fs/xfs/xfs_bmap_util.c | 5 ++--
fs/xfs/xfs_bmap_util.h | 2 +-
fs/xfs/xfs_file.c | 64 +++++++++++++++++++++++++++++++++++++++---
3 files changed, 64 insertions(+), 7 deletions(-)
base-commit: 4d750717498bbc1d8801281c32453a5f23d0bbe8
--
2.50.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-02-27 14:08 [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav
@ 2026-02-27 14:08 ` Pankaj Raghav
2026-03-03 15:24 ` Christoph Hellwig
2026-02-27 14:08 ` [RFC 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES Pankaj Raghav
2026-02-27 16:26 ` [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav (Samsung)
2 siblings, 1 reply; 11+ messages in thread
From: Pankaj Raghav @ 2026-02-27 14:08 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, dchinner, Darrick J . Wong, p.raghav, gost.dev,
pankaj.raghav, andres, cem, hch, lucas

Currently, xfs_alloc_file_space() hardcodes the XFS_BMAPI_PREALLOC flag
when calling xfs_bmapi_write(). This restricts its capability to only
allocating unwritten extents.

In preparation for adding FALLOC_FL_WRITE_ZEROES support, which needs to
allocate space and simultaneously convert it to written and zeroed
extents, introduce a 'flags' parameter to xfs_alloc_file_space(). This
allows callers to explicitly pass the required XFS_BMAPI_* allocation
flags.

Update all existing callers to pass XFS_BMAPI_PREALLOC to maintain the
current behavior. No functional changes intended.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 fs/xfs/xfs_bmap_util.c | 5 +++--
 fs/xfs/xfs_bmap_util.h | 2 +-
 fs/xfs/xfs_file.c      | 6 +++---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0ab00615f1ad..532200959d8d 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -646,7 +646,8 @@ int
 xfs_alloc_file_space(
 	struct xfs_inode *ip,
 	xfs_off_t offset,
-	xfs_off_t len)
+	xfs_off_t len,
+	uint32_t flags)
 {
 	xfs_mount_t *mp = ip->i_mount;
 	xfs_off_t count;
@@ -748,7 +749,7 @@ xfs_alloc_file_space(
 		 * will eventually reach the requested range.
 		 */
 		error = xfs_bmapi_write(tp, ip, startoffset_fsb,
-				allocatesize_fsb, XFS_BMAPI_PREALLOC, 0, imapp,
+				allocatesize_fsb, flags, 0, imapp,
 				&nimaps);
 		if (error) {
 			if (error != -ENOSR)
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index c477b3361630..1fd4844d4ec6 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -56,7 +56,7 @@ int xfs_bmap_last_extent(struct xfs_trans *tp, struct xfs_inode *ip,

 /* preallocation and hole punch interface */
 int xfs_alloc_file_space(struct xfs_inode *ip, xfs_off_t offset,
-		xfs_off_t len);
+		xfs_off_t len, uint32_t flags);
 int xfs_free_file_space(struct xfs_inode *ip, xfs_off_t offset,
 		xfs_off_t len, struct xfs_zone_alloc_ctx *ac);
 int xfs_collapse_file_space(struct xfs_inode *, xfs_off_t offset,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 6246f34df9fd..3bd099534c68 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1346,7 +1346,7 @@ xfs_falloc_zero_range(
 		len = round_up(offset + len, blksize) -
 			round_down(offset, blksize);
 		offset = round_down(offset, blksize);
-		error = xfs_alloc_file_space(ip, offset, len);
+		error = xfs_alloc_file_space(ip, offset, len, XFS_BMAPI_PREALLOC);
 	}
 	if (error)
 		return error;
@@ -1372,7 +1372,7 @@ xfs_falloc_unshare_range(
 	if (error)
 		return error;

-	error = xfs_alloc_file_space(XFS_I(inode), offset, len);
+	error = xfs_alloc_file_space(XFS_I(inode), offset, len, XFS_BMAPI_PREALLOC);
 	if (error)
 		return error;
 	return xfs_falloc_setsize(file, new_size);
@@ -1400,7 +1400,7 @@ xfs_falloc_allocate_range(
 	if (error)
 		return error;

-	error = xfs_alloc_file_space(XFS_I(inode), offset, len);
+	error = xfs_alloc_file_space(XFS_I(inode), offset, len, XFS_BMAPI_PREALLOC);
 	if (error)
 		return error;
 	return xfs_falloc_setsize(file, new_size);
--
2.50.1

^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-02-27 14:08 ` [RFC 1/2] xfs: add flags field to xfs_alloc_file_space Pankaj Raghav
@ 2026-03-03 15:24 ` Christoph Hellwig
2026-03-04 8:20 ` Pankaj Raghav (Samsung)
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:24 UTC (permalink / raw)
To: Pankaj Raghav
Cc: linux-xfs, bfoster, dchinner, Darrick J . Wong, gost.dev,
pankaj.raghav, andres, cem, hch, lucas

On Fri, Feb 27, 2026 at 03:08:41PM +0100, Pankaj Raghav wrote:
> Currently, xfs_alloc_file_space() hardcodes the XFS_BMAPI_PREALLOC flag
> when calling xfs_bmapi_write(). This restricts its capability to only
> allocating unwritten extents.
>
> In preparation for adding FALLOC_FL_WRITE_ZEROES support, which needs to
> allocate space and simultaneously convert it to written and zeroed
> extents, introduce a 'flags' parameter to xfs_alloc_file_space(). This
> allows callers to explicitly pass the required XFS_BMAPI_* allocation
> flags.
>
> Update all existing callers to pass XFS_BMAPI_PREALLOC to maintain the
> current behavior. No functional changes intended.
>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> ---
> fs/xfs/xfs_bmap_util.c | 5 +++--
> fs/xfs/xfs_bmap_util.h | 2 +-
> fs/xfs/xfs_file.c | 6 +++---
> 3 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 0ab00615f1ad..532200959d8d 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -646,7 +646,8 @@ int
> xfs_alloc_file_space(
> 	struct xfs_inode *ip,
> 	xfs_off_t offset,
> -	xfs_off_t len)
> +	xfs_off_t len,
> +	uint32_t flags)

Messed up indentation.

Given that we've been through this for a lot of iterations, what
about you just take Lukas' existing patch and help improving it?

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-03 15:24 ` Christoph Hellwig
@ 2026-03-04 8:20 ` Pankaj Raghav (Samsung)
2026-03-04 8:31 ` Pankaj Raghav (Samsung)
2026-03-04 9:31 ` Carlos Maiolino
0 siblings, 2 replies; 11+ messages in thread
From: Pankaj Raghav (Samsung) @ 2026-03-04 8:20 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Pankaj Raghav, linux-xfs, bfoster, dchinner, Darrick J . Wong,
gost.dev, andres, cem, lucas

On Tue, Mar 03, 2026 at 07:24:30AM -0800, Christoph Hellwig wrote:
> On Fri, Feb 27, 2026 at 03:08:41PM +0100, Pankaj Raghav wrote:
> > Currently, xfs_alloc_file_space() hardcodes the XFS_BMAPI_PREALLOC flag
> > when calling xfs_bmapi_write(). This restricts its capability to only
> > allocating unwritten extents.
> >
> > In preparation for adding FALLOC_FL_WRITE_ZEROES support, which needs to
> > allocate space and simultaneously convert it to written and zeroed
> > extents, introduce a 'flags' parameter to xfs_alloc_file_space(). This
> > allows callers to explicitly pass the required XFS_BMAPI_* allocation
> > flags.
> >
> > Update all existing callers to pass XFS_BMAPI_PREALLOC to maintain the
> > current behavior. No functional changes intended.
> >
> > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> > ---
> > fs/xfs/xfs_bmap_util.c | 5 +++--
> > fs/xfs/xfs_bmap_util.h | 2 +-
> > fs/xfs/xfs_file.c | 6 +++---
> > 3 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index 0ab00615f1ad..532200959d8d 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -646,7 +646,8 @@ int
> > xfs_alloc_file_space(
> > 	struct xfs_inode *ip,
> > 	xfs_off_t offset,
> > -	xfs_off_t len)
> > +	xfs_off_t len,
> > +	uint32_t flags)
>
> Messed up indentation.
>

Oops.

> Given that we've been through this for a lot of iterations, what
> about you just take Lukas' existing patch and help improving it?

I did review his patch[1]. The patches were broken when I tested it but I
did not get a reply from him after I reported them. That is why I decided
to send a new version.

[1] https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/

--
Pankaj

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-04 8:20 ` Pankaj Raghav (Samsung)
@ 2026-03-04 8:31 ` Pankaj Raghav (Samsung)
2026-03-04 9:31 ` Carlos Maiolino
1 sibling, 0 replies; 11+ messages in thread
From: Pankaj Raghav (Samsung) @ 2026-03-04 8:31 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Pankaj Raghav, linux-xfs, bfoster, dchinner, Darrick J . Wong,
gost.dev, andres, cem, lukas

> > > 	struct xfs_inode *ip,
> > > 	xfs_off_t offset,
> > > -	xfs_off_t len)
> > > +	xfs_off_t len,
> > > +	uint32_t flags)
> >
> > Messed up indentation.
> >
> Oops.
>
> > Given that we've been through this for a lot of iterations, what
> > about you just take Lukas' existing patch and help improving it?
>
> I did review his patch[1]. The patches were broken when I tested it but I
> did not get a reply from him after I reported them. That is why I decided
> to send a new version.
>
> [1] https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/
>

@Lukas, how do you want to move forward? Can I merge the changes I did
here into your patch?

--
Pankaj

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-04 8:20 ` Pankaj Raghav (Samsung)
2026-03-04 8:31 ` Pankaj Raghav (Samsung)
@ 2026-03-04 9:31 ` Carlos Maiolino
2026-03-04 9:46 ` Pankaj Raghav
1 sibling, 1 reply; 11+ messages in thread
From: Carlos Maiolino @ 2026-03-04 9:31 UTC (permalink / raw)
To: Pankaj Raghav (Samsung)
Cc: Christoph Hellwig, Pankaj Raghav, linux-xfs, bfoster, dchinner,
Darrick J . Wong, gost.dev, andres, lukas

> > > @@ -646,7 +646,8 @@ int
> > > xfs_alloc_file_space(
> > > 	struct xfs_inode *ip,
> > > 	xfs_off_t offset,
> > > -	xfs_off_t len)
> > > +	xfs_off_t len,
> > > +	uint32_t flags)
> >
> > Messed up indentation.
> >
> Oops.
>
> > Given that we've been through this for a lot of iterations, what
> > about you just take Lukas' existing patch and help improving it?
>
> I did review his patch[1]. The patches were broken when I tested it but I
> did not get a reply from him after I reported them. That is why I decided
> to send a new version.
>
> [1] https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/

If I properly got the timeline, you barely gave him time to reply:

your reply to the original patch: Date: Thu, 26 Feb 2026 14:44:05 +0000
your RFC time: Date: Fri, 27 Feb 2026 15:08:40 +0100

You have been around long enough to know that people usually require more
than 24 hours to reply.

Please, work with him to get this done. It's not a nice thing to do IMHO
to pass over somebody else's work if you are aware there is work being
done.

FWIW you also didn't Cc him in your RFC as you used the wrong email
address... I'm fixing the headers so he is aware of it.

If by any means I got the timeline wrong above, forget everything I said
other than the "work with him to get this done".

Carlos

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-04 9:31 ` Carlos Maiolino
@ 2026-03-04 9:46 ` Pankaj Raghav
2026-03-04 14:39 ` Lukas Herbolt
0 siblings, 1 reply; 11+ messages in thread
From: Pankaj Raghav @ 2026-03-04 9:46 UTC (permalink / raw)
To: Carlos Maiolino
Cc: Christoph Hellwig, Pankaj Raghav, linux-xfs, bfoster, dchinner,
Darrick J . Wong, gost.dev, andres, lukas

On 3/4/2026 10:31 AM, Carlos Maiolino wrote:
>>>> @@ -646,7 +646,8 @@ int
>>>> xfs_alloc_file_space(
>>>> 	struct xfs_inode *ip,
>>>> 	xfs_off_t offset,
>>>> -	xfs_off_t len)
>>>> +	xfs_off_t len,
>>>> +	uint32_t flags)
>>>
>>> Messed up indentation.
>>>
>> Oops.
>>
>>> Given that we've been through this for a lot of iterations, what
>>> about you just take Lukas' existing patch and help improving it?
>>
>> I did review his patch[1]. The patches were broken when I tested it but I
>> did not get a reply from him after I reported them. That is why I decided
>> to send a new version.
>>
>> [1] https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/
>
> If I properly got the timeline, you barely gave him time to reply:
>
> your reply to the original patch: Date: Thu, 26 Feb 2026 14:44:05 +0000
> your RFC time: Date: Fri, 27 Feb 2026 15:08:40 +0100
>
> You have been around long enough to know that people usually require more
> than 24 hours to reply.
>

I started working on this in parallel before I realized it was already
being worked on. I could have waited a bit longer after responding to
his patches. :) . Sorry for that.

> Please, work with him to get this done. It's not a nice thing to do IMHO
> to pass over somebody else's work if you are aware there is work being
> done.
>
> FWIW you also didn't Cc him in your RFC as you used the wrong email
> address... I'm fixing the headers so he is aware of it.
>

Yes, I replied back to the thread with the correct ID [1].

> If by any means I got the timeline wrong above, forget everything I said
> other than the "work with him to get this done".
>

His patches are not working properly. It is almost a week since I sent
that message on his thread. I have messaged him again on how to proceed.
I will wait and see what he replies :)

[1] https://lore.kernel.org/linux-xfs/20260227140842.1437710-1-p.raghav@samsung.com/T/#mb773dea20a7fc37772f811b26a5c5dd8941c3d2d

--
Pankaj

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-04 9:46 ` Pankaj Raghav
@ 2026-03-04 14:39 ` Lukas Herbolt
2026-03-04 19:57 ` Pankaj Raghav (Samsung)
0 siblings, 1 reply; 11+ messages in thread
From: Lukas Herbolt @ 2026-03-04 14:39 UTC (permalink / raw)
To: Pankaj Raghav
Cc: Carlos Maiolino, Christoph Hellwig, Pankaj Raghav, linux-xfs,
bfoster, dchinner, Darrick J . Wong, gost.dev, andres

On 2026-03-04 10:46, Pankaj Raghav wrote:
> On 3/4/2026 10:31 AM, Carlos Maiolino wrote:
>>>>> @@ -646,7 +646,8 @@ int
>>>>> xfs_alloc_file_space(
>>>>> 	struct xfs_inode *ip,
>>>>> 	xfs_off_t offset,
>>>>> -	xfs_off_t len)
>>>>> +	xfs_off_t len,
>>>>> +	uint32_t flags)
>>>>
>>>> Messed up indentation.
>>>>
>>> Oops.
>>>
>>>> Given that we've been through this for a lot of iterations, what
>>>> about you just take Lukas' existing patch and help improving it?
>>>
>>> I did review his patch[1]. The patches were broken when I tested it
>>> but I
>>> did not get a reply from him after I reported them. That is why I
>>> decided
>>> to send a new version.
>>>
>>> [1]
>>> https://lore.kernel.org/linux-xfs/wmxdwtvahubdga73cgzprqtj7fxyjgx5kxvr4cobtl6ski2i6y@ic2g3bfymkwi/
>>
>> If I properly got the timeline, you barely gave him time to reply:
>>
>> your reply to the original patch: Date: Thu, 26 Feb 2026 14:44:05
>> +0000
>> your RFC time: Date: Fri, 27 Feb 2026 15:08:40 +0100
>>
>> You have been around long enough to know that people usually require
>> more than 24 hours to reply.
>>
>
> I started working on this in parallel before I realized it was already
> being worked on. I could have waited a bit longer after responding to
> his patches. :) . Sorry for that.
>
>> Please, work with him to get this done. It's not a nice thing to do
>> IMHO
>> to pass over somebody else's work if you are aware there is work being
>> done.
>>
>> FWIW you also didn't Cc him in your RFC as you used the wrong email
>> address... I'm fixing the headers so he is aware of it.
>>
>
> Yes, I replied back to the thread with the correct ID [1].
>
>> If by any means I got the timeline wrong above, forget everything I
>> said
>> other than the "work with him to get this done".
>>
>
> His patches are not working properly. It is almost a week since I sent
> that message on his thread. I have messaged him again on how to
> proceed. I will wait and see what he replies :)

Sorry, the original CC somehow messed up the filtering and it fell through
the cracks of the email folders. If you agree I would add the `two stage
Ext4 like` into the original patch still utilizing the
xfs_falloc_zero_range. Doing the default XFS_BMAPI_PREALLOC and sending the
XFS_BMAPI_ZERO|XFS_BMAPI_CONVERT if the WR_ZERO is set and the device
supports it.

I think that would still be quite readable without duplicating the code.

--
-lhe

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC 1/2] xfs: add flags field to xfs_alloc_file_space
2026-03-04 14:39 ` Lukas Herbolt
@ 2026-03-04 19:57 ` Pankaj Raghav (Samsung)
0 siblings, 0 replies; 11+ messages in thread
From: Pankaj Raghav (Samsung) @ 2026-03-04 19:57 UTC (permalink / raw)
To: Lukas Herbolt
Cc: Carlos Maiolino, Christoph Hellwig, Pankaj Raghav, linux-xfs,
bfoster, dchinner, Darrick J . Wong, gost.dev, andres

> > > If by any means I got the timeline wrong above, forget everything I
> > > said
> > > other than the "work with him to get this done".
> > >
> >
> > His patches are not working properly. It is almost a week since I sent
> > that message on his thread. I have messaged him again on how to proceed.
> > I will wait and see what he replies :)
>
> Sorry, the original CC somehow messed up the filtering and it fell through
> the cracks of the email folders. If you agree I would add the `two stage
> Ext4 like` into the original patch still utilizing the
> xfs_falloc_zero_range. Doing the default XFS_BMAPI_PREALLOC and sending the
> XFS_BMAPI_ZERO|XFS_BMAPI_CONVERT if the WR_ZERO is set and the device
> supports it.
>

Sounds good. I think this is the right way to go.

> I think that would still be quite readable without duplicating the code.
>

Yeah. Maybe you also want to split your code into two patches, similar to
what I did here? IMO, it makes it a bit more readable.

--
Pankaj

^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES
2026-02-27 14:08 [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav
2026-02-27 14:08 ` [RFC 1/2] xfs: add flags field to xfs_alloc_file_space Pankaj Raghav
@ 2026-02-27 14:08 ` Pankaj Raghav
2026-02-27 16:26 ` [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav (Samsung)
2 siblings, 0 replies; 11+ messages in thread
From: Pankaj Raghav @ 2026-02-27 14:08 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, dchinner, Darrick J . Wong, p.raghav, gost.dev,
pankaj.raghav, andres, cem, hch, lucas

If the underlying block device supports the unmap write zeroes
operation, this flag allows users to quickly preallocate a file with
written extents that contain zeroes. This is beneficial for subsequent
overwrites as it prevents the need for unwritten-to-written extent
conversions, thereby significantly reducing metadata updates and journal
I/O overhead, improving overwrite performance.

When handling FALLOC_FL_WRITE_ZEROES, we first allocate unwritten
extents. This ensures that xfs_falloc_setsize() does not trip over a
written extent beyond i_size and trigger warnings in iomap_zero_range().
After the size is updated, we call xfs_alloc_file_space() again with the
XFS_BMAPI_CONVERT and XFS_BMAPI_ZERO flags to convert the unwritten
extents to written and offload the write zeroes operation to the device.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 fs/xfs/xfs_file.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 3bd099534c68..38688cdf4cdc 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1308,6 +1308,59 @@ xfs_falloc_force_zero(
 	return XFS_TEST_ERROR(ip->i_mount, XFS_ERRTAG_FORCE_ZERO_RANGE);
 }

+static int
+xfs_falloc_write_zeroes(
+	struct file *file,
+	int mode,
+	loff_t offset,
+	loff_t len,
+	struct xfs_zone_alloc_ctx *ac)
+{
+	struct inode *inode = file_inode(file);
+	struct xfs_inode *ip = XFS_I(inode);
+	unsigned int blksize = i_blocksize(inode);
+	loff_t new_size = 0;
+	int error;
+
+	if (!bdev_write_zeroes_unmap_sectors(
+			xfs_inode_buftarg(XFS_I(inode))->bt_bdev))
+		return -EOPNOTSUPP;
+
+	error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
+	if (error)
+		return error;
+
+	error = xfs_free_file_space(ip, offset, len, ac);
+	if (error)
+		return error;
+
+	len = round_up(offset + len, blksize) - round_down(offset, blksize);
+	offset = round_down(offset, blksize);
+
+	/*
+	 * Allocate unwritten extents first. This ensures that xfs_falloc_setsize
+	 * does not trip over a written extent beyond i_size and trigger warnings
+	 * in iomap_zero_range.
+	 */
+	error = xfs_alloc_file_space(ip, offset, len, XFS_BMAPI_PREALLOC);
+	if (error)
+		return error;
+
+	error = xfs_falloc_setsize(file, new_size);
+	if (error)
+		return error;
+
+	/*
+	 * Now convert the unwritten extents to written and zero them out using
+	 * unmap write zeroes.
+	 */
+	error = xfs_alloc_file_space(ip, offset, len, XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
+	if (error)
+		return error;
+
+	return 0;
+}
+
 /*
  * Punch a hole and prealloc the range. We use a hole punch rather than
  * unwritten extent conversion for two reasons:
@@ -1410,7 +1463,7 @@ xfs_falloc_allocate_range(
 		(FALLOC_FL_ALLOCATE_RANGE | FALLOC_FL_KEEP_SIZE | \
 		 FALLOC_FL_PUNCH_HOLE | FALLOC_FL_COLLAPSE_RANGE | \
 		 FALLOC_FL_ZERO_RANGE | FALLOC_FL_INSERT_RANGE | \
-		 FALLOC_FL_UNSHARE_RANGE)
+		 FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_WRITE_ZEROES)

 STATIC long
 __xfs_file_fallocate(
@@ -1462,6 +1515,9 @@ __xfs_file_fallocate(
 	case FALLOC_FL_ALLOCATE_RANGE:
 		error = xfs_falloc_allocate_range(file, mode, offset, len);
 		break;
+	case FALLOC_FL_WRITE_ZEROES:
+		error = xfs_falloc_write_zeroes(file, mode, offset, len, ac);
+		break;
 	default:
 		error = -EOPNOTSUPP;
 		break;
--
2.50.1

^ permalink raw reply related [flat|nested] 11+ messages in thread
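
The rounding in xfs_falloc_write_zeroes() widens the requested byte range to whole filesystem blocks before allocating. A userspace sketch of that arithmetic follows. The names byte_range and expand_to_blocks() are hypothetical, and the helpers use modulo arithmetic instead of the kernel's round_up()/round_down() macros, which assume a power-of-two alignment.

```c
#include <stdint.h>

/* Round x down to a multiple of align (align must be non-zero). */
static uint64_t round_down_u64(uint64_t x, uint64_t align)
{
	return x - (x % align);
}

/* Round x up to a multiple of align (align must be non-zero). */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return round_down_u64(x + align - 1, align);
}

/* Hypothetical container for a block-aligned byte range. */
struct byte_range {
	uint64_t offset;
	uint64_t len;
};

/*
 * Mirror of the two statements in the patch:
 *   len    = round_up(offset + len, blksize) - round_down(offset, blksize);
 *   offset = round_down(offset, blksize);
 * The new length is computed first because it depends on the old offset.
 */
static struct byte_range expand_to_blocks(uint64_t offset, uint64_t len,
					  uint64_t blksize)
{
	struct byte_range r;

	r.len = round_up_u64(offset + len, blksize) -
		round_down_u64(offset, blksize);
	r.offset = round_down_u64(offset, blksize);
	return r;
}
```

For example, with a 4096-byte block size, a request for offset 1000 and length 5000 covers bytes 1000..5999 and expands to offset 0, length 8192. A request that is already block-aligned is returned unchanged.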
* Re: [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs
2026-02-27 14:08 [RFC 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs Pankaj Raghav
2026-02-27 14:08 ` [RFC 1/2] xfs: add flags field to xfs_alloc_file_space Pankaj Raghav
2026-02-27 14:08 ` [RFC 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES Pankaj Raghav
@ 2026-02-27 16:26 ` Pankaj Raghav (Samsung)
2 siblings, 0 replies; 11+ messages in thread
From: Pankaj Raghav (Samsung) @ 2026-02-27 16:26 UTC (permalink / raw)
To: Pankaj Raghav
Cc: linux-xfs, bfoster, dchinner, Darrick J . Wong, gost.dev, andres,
cem, hch, lukas

On Fri, Feb 27, 2026 at 03:08:40PM +0100, Pankaj Raghav wrote:
> The benefits of FALLOC_FL_WRITE_ZEROES were already discussed as part
> of Zhang Yi's initial patches[1]. Postgres developer Andres also
> mentioned they would like to use this feature in Postgres [2].
>
> Lukas Herbolt recently sent a patch that adds this support, but I found
> some issues with it[3]. I independently started working on these patches
> a while back as well, so I thought I would send an RFC version of
> this support.
>

CCed the wrong Lukas by mistake.

cc: s/lucas/lukas/@herbolt.com

--
Pankaj

^ permalink raw reply [flat|nested] 11+ messages in thread