* [PATCH v7 0/2] add FALLOC_FL_WRITE_ZEROES support to xfs
@ 2026-06-22 8:31 Pankaj Raghav
2026-06-22 8:31 ` [PATCH v7 1/2] xfs: add an allocation mode to xfs_alloc_file_space() Pankaj Raghav
2026-06-22 8:31 ` [PATCH v7 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES Pankaj Raghav
0 siblings, 2 replies; 4+ messages in thread
From: Pankaj Raghav @ 2026-06-22 8:31 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, lukas, Darrick J . Wong, p.raghav, dgc, gost.dev,
pankaj.raghav, andres, kundan.kumar, hch, cem, hch
The benefits of FALLOC_FL_WRITE_ZEROES were already discussed as part
of Zhang Yi's initial patches [1]. Postgres developer Andres also
mentioned that they would like to use this feature in Postgres [2].
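For illustration only, a userspace caller might use the flag roughly as
follows. This is a minimal sketch and not part of this series:
prealloc_zeroed() is a hypothetical helper, the fallback #define assumes
the value 0x80 from the uapi header of kernels that define the flag, and
falling back to FALLOC_FL_ZERO_RANGE is just one possible policy.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <linux/falloc.h>

#ifndef FALLOC_FL_WRITE_ZEROES
/* Assumption: older headers lack the flag; 0x80 is the uapi value. */
#define FALLOC_FL_WRITE_ZEROES 0x80
#endif

/*
 * Hypothetical helper: try to provision written, zeroed extents.
 * Returns 1 if FALLOC_FL_WRITE_ZEROES succeeded, 0 if we fell back to
 * FALLOC_FL_ZERO_RANGE (which may leave unwritten extents), and -1 with
 * errno set on any other failure.
 */
static int prealloc_zeroed(int fd, off_t offset, off_t len)
{
	assert(len > 0);	/* fallocate() rejects empty ranges anyway */

	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, offset, len) == 0)
		return 1;

	/* Flag unknown to the kernel, or fs/device cannot do it. */
	if (errno != EOPNOTSUPP)
		return -1;

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
		return 0;

	return -1;
}
```

A caller that cares about overwrite latency could log the return value
to see whether written extents were actually provisioned.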
I tested the changes with fsstress and fsx, based on the xfstests patch I
sent recently to test this flag [4]. generic/363 helped me debug the
crash I noticed in the initial implementation [3].
Dave initially suggested creating a common helper based on
xfs_iomap_convert_unwritten(), but as can be seen in the previous
version, a lot of the code had to be rewritten. The changes had more in
common with xfs_alloc_file_space(), so this version reuses
xfs_alloc_file_space() for write zeroes.
Thanks to Christoph for all the review comments and design suggestions
that were made both offline and online for this series.
The stress tests generic/363, generic/127 and xfs/131 pass. I have started
the full xfstests suite for this series.
Changes since v6:
- Pass only the range that needs to be zeroed to xfs_alloc_file_space()
(Christoph).
- Add RVB from Christoph.
- Change the call order: call xfs_falloc_setsize() first, then
xfs_alloc_file_space().
- Remove the prep patch to allow xfs_setfilesize to take 64-bit len.
Changes since v5:
- Add a prep patch to allow xfs_setfilesize to take 64-bit len
(Sashiko)
Changes since v4:
- Introduce an enum for allocation mode in xfs_alloc_file_space (Christoph)
- Use xfs_setfilesize instead of updating the on-disk size in the
function.
Changes since v3:
- Introduce xfs_bmap_alloc_or_convert_range() in xfs_iomap.c for an
easier review experience (Christoph)
- Add extsz hint and rt support in xfs_bmap_alloc_or_convert_range()
Changes since v2:
- Add allow_write_zeroes to xfs_global so that we can enable this
feature independent of the HW underneath.
Changes since v1 [5.1 5.2]:
- Added a new function xfs_bmap_alloc_or_convert_range() based on Dave's
feedback.
- Changed the xfs_falloc_write_zeroes to use
xfs_bmap_alloc_or_convert_range() instead of doing prealloc and
convert approach.
[1] https://lore.kernel.org/linux-fsdevel/20250619111806.3546162-1-yi.zhang@huaweicloud.com/
[2] https://lore.kernel.org/linux-fsdevel/20260217055103.GA6174@lst.de/T/#m7935b9bab32bb5ff372507f84803b8753ad1c814
[3] https://lore.kernel.org/linux-xfs/6i2jvzn3lyugjlbgmjzpped3gogzyqv5mpe2uqaifz4vjpaega@pomzoq7ley77/
[4] https://lore.kernel.org/linux-xfs/20260312195308.738189-1-p.raghav@samsung.com/
[5.1] https://lore.kernel.org/linux-xfs/20260309180708.427553-2-lukas@herbolt.com/
[5.2] https://lore.kernel.org/linux-xfs/abC1LvRElctaHPe5@dread/
Pankaj Raghav (2):
xfs: add an allocation mode to xfs_alloc_file_space()
xfs: add support for FALLOC_FL_WRITE_ZEROES
fs/xfs/xfs_bmap_util.c | 42 +++++++++++++++++++----
fs/xfs/xfs_bmap_util.h | 7 +++-
fs/xfs/xfs_file.c | 76 +++++++++++++++++++++++++++++++++++++++---
3 files changed, 114 insertions(+), 11 deletions(-)
base-commit: 6e24acc45ab58d39a0162b4d5f3fd001d07d868e
--
2.51.2
* [PATCH v7 1/2] xfs: add an allocation mode to xfs_alloc_file_space()
From: Pankaj Raghav @ 2026-06-22 8:31 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, lukas, Darrick J . Wong, p.raghav, dgc, gost.dev,
pankaj.raghav, andres, kundan.kumar, hch, cem, hch
xfs_alloc_file_space() hardcodes XFS_BMAPI_PREALLOC to preallocate
unwritten extents across a range.
In preparation for FALLOC_FL_WRITE_ZEROES, add an explicit allocation
mode argument, enum xfs_alloc_file_space_mode, and derive the xfs_bmapi
flags from it. The only mode for now is XFS_ALLOC_FILE_SPACE_PREALLOC,
which preallocates unwritten extents and marks the inode as preallocated
exactly as before, so there is no functional change.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
fs/xfs/xfs_bmap_util.c | 25 +++++++++++++++++++++----
fs/xfs/xfs_bmap_util.h | 6 +++++-
fs/xfs/xfs_file.c | 9 ++++++---
3 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 3b9f262f8e91..8dfb3c1e3759 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -642,11 +642,19 @@ xfs_free_eofblocks(
return error;
}
+/*
+ * Allocate space for a file according to @mode:
+ *
+ * XFS_ALLOC_FILE_SPACE_PREALLOC:
+ * Preallocate unwritten extents across the range and mark the inode as
+ * preallocated.
+ */
int
xfs_alloc_file_space(
struct xfs_inode *ip,
xfs_off_t offset,
- xfs_off_t len)
+ xfs_off_t len,
+ enum xfs_alloc_file_space_mode mode)
{
xfs_mount_t *mp = ip->i_mount;
xfs_off_t count;
@@ -657,6 +665,7 @@ xfs_alloc_file_space(
int rt;
xfs_trans_t *tp;
xfs_bmbt_irec_t imaps[1], *imapp;
+ uint32_t bmapi_flags, nr_exts;
int error;
if (xfs_is_always_cow_inode(ip))
@@ -674,6 +683,15 @@ xfs_alloc_file_space(
if (len <= 0)
return -EINVAL;
+ switch (mode) {
+ case XFS_ALLOC_FILE_SPACE_PREALLOC:
+ bmapi_flags = XFS_BMAPI_PREALLOC;
+ nr_exts = XFS_IEXT_ADD_NOSPLIT_CNT;
+ break;
+ default:
+ return -EINVAL;
+ }
+
rt = XFS_IS_REALTIME_INODE(ip);
extsz = xfs_get_extsz_hint(ip);
@@ -733,8 +751,7 @@ xfs_alloc_file_space(
if (error)
break;
- error = xfs_iext_count_extend(tp, ip, XFS_DATA_FORK,
- XFS_IEXT_ADD_NOSPLIT_CNT);
+ error = xfs_iext_count_extend(tp, ip, XFS_DATA_FORK, nr_exts);
if (error)
goto error;
@@ -748,7 +765,7 @@ xfs_alloc_file_space(
* will eventually reach the requested range.
*/
error = xfs_bmapi_write(tp, ip, startoffset_fsb,
- allocatesize_fsb, XFS_BMAPI_PREALLOC, 0, imapp,
+ allocatesize_fsb, bmapi_flags, 0, imapp,
&nimaps);
if (error) {
if (error != -ENOSR)
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index c477b3361630..232b4c48247e 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -55,8 +55,12 @@ int xfs_bmap_last_extent(struct xfs_trans *tp, struct xfs_inode *ip,
int *is_empty);
/* preallocation and hole punch interface */
+enum xfs_alloc_file_space_mode {
+ XFS_ALLOC_FILE_SPACE_PREALLOC,
+};
+
int xfs_alloc_file_space(struct xfs_inode *ip, xfs_off_t offset,
- xfs_off_t len);
+ xfs_off_t len, enum xfs_alloc_file_space_mode mode);
int xfs_free_file_space(struct xfs_inode *ip, xfs_off_t offset,
xfs_off_t len, struct xfs_zone_alloc_ctx *ac);
int xfs_collapse_file_space(struct xfs_inode *, xfs_off_t offset,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 845a97c9b063..e90ea6ebdc8e 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1406,7 +1406,8 @@ xfs_falloc_zero_range(
len = round_up(offset + len, blksize) -
round_down(offset, blksize);
offset = round_down(offset, blksize);
- error = xfs_alloc_file_space(ip, offset, len);
+ error = xfs_alloc_file_space(ip, offset, len,
+ XFS_ALLOC_FILE_SPACE_PREALLOC);
}
if (error)
return error;
@@ -1432,7 +1433,8 @@ xfs_falloc_unshare_range(
if (error)
return error;
- error = xfs_alloc_file_space(XFS_I(inode), offset, len);
+ error = xfs_alloc_file_space(XFS_I(inode), offset, len,
+ XFS_ALLOC_FILE_SPACE_PREALLOC);
if (error)
return error;
return xfs_falloc_setsize(file, new_size);
@@ -1460,7 +1462,8 @@ xfs_falloc_allocate_range(
if (error)
return error;
- error = xfs_alloc_file_space(XFS_I(inode), offset, len);
+ error = xfs_alloc_file_space(XFS_I(inode), offset, len,
+ XFS_ALLOC_FILE_SPACE_PREALLOC);
if (error)
return error;
return xfs_falloc_setsize(file, new_size);
--
2.51.2
* [PATCH v7 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES
From: Pankaj Raghav @ 2026-06-22 8:31 UTC (permalink / raw)
To: linux-xfs
Cc: bfoster, lukas, Darrick J . Wong, p.raghav, dgc, gost.dev,
pankaj.raghav, andres, kundan.kumar, hch, cem, hch
If the underlying block device supports the unmap write zeroes
operation, this flag allows users to quickly preallocate a file with
written extents that contain zeroes. This benefits subsequent
overwrites because it avoids unwritten-to-written extent conversions,
which significantly reduces metadata updates and journal I/O overhead
and improves overwrite performance.
Punch the range first so it becomes a hole, update the size via
xfs_falloc_setsize() while it is still a hole (so its xfs_zero_range()
skips it and avoids rezeroing), then convert it to written
zeroed extents. A crash between the size update and the conversion is
safe, as a hole within i_size reads back as zeroes.
Co-developed-by: Lukas Herbolt <lukas@herbolt.com>
Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
I went back to calling xfs_falloc_setsize(), as using xfs_setfilesize()
would involve a lot of repetition in the function. By changing the call
order with xfs_falloc_setsize() we reuse most of the code.
fs/xfs/xfs_bmap_util.c | 19 ++++++++++--
fs/xfs/xfs_bmap_util.h | 1 +
fs/xfs/xfs_file.c | 67 +++++++++++++++++++++++++++++++++++++++++-
3 files changed, 83 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 8dfb3c1e3759..55722b815117 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -643,11 +643,18 @@ xfs_free_eofblocks(
}
/*
- * Allocate space for a file according to @mode:
+ * Allocate space or convert extents for a file according to @mode:
*
* XFS_ALLOC_FILE_SPACE_PREALLOC:
* Preallocate unwritten extents across the range and mark the inode as
* preallocated.
+ *
+ * XFS_ALLOC_FILE_SPACE_WRITE_ZEROES:
+ * Allocate written extents over holes and convert unwritten extents in the
+ * range to written extents, initialising both to contain zeroes.
+ *
+ * This function does not update the file size; callers that extend the file
+ * are responsible for updating it once the extents are allocated.
*/
int
xfs_alloc_file_space(
@@ -688,6 +695,10 @@ xfs_alloc_file_space(
bmapi_flags = XFS_BMAPI_PREALLOC;
nr_exts = XFS_IEXT_ADD_NOSPLIT_CNT;
break;
+ case XFS_ALLOC_FILE_SPACE_WRITE_ZEROES:
+ bmapi_flags = XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO;
+ nr_exts = XFS_IEXT_WRITE_UNWRITTEN_CNT;
+ break;
default:
return -EINVAL;
}
@@ -776,8 +787,10 @@ xfs_alloc_file_space(
allocatesize_fsb -= imapp->br_blockcount;
}
- ip->i_diflags |= XFS_DIFLAG_PREALLOC;
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+ if (mode == XFS_ALLOC_FILE_SPACE_PREALLOC) {
+ ip->i_diflags |= XFS_DIFLAG_PREALLOC;
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+ }
error = xfs_trans_commit(tp);
xfs_iunlock(ip, XFS_ILOCK_EXCL);
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 232b4c48247e..e3d506ca9610 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -57,6 +57,7 @@ int xfs_bmap_last_extent(struct xfs_trans *tp, struct xfs_inode *ip,
/* preallocation and hole punch interface */
enum xfs_alloc_file_space_mode {
XFS_ALLOC_FILE_SPACE_PREALLOC,
+ XFS_ALLOC_FILE_SPACE_WRITE_ZEROES,
};
int xfs_alloc_file_space(struct xfs_inode *ip, xfs_off_t offset,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index e90ea6ebdc8e..0e1332ccdf79 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1368,6 +1368,68 @@ xfs_falloc_force_zero(
return XFS_TEST_ERROR(ip->i_mount, XFS_ERRTAG_FORCE_ZERO_RANGE);
}
+static int
+xfs_falloc_write_zeroes(
+ struct file *file,
+ int mode,
+ loff_t offset,
+ loff_t len,
+ struct xfs_zone_alloc_ctx *ac)
+{
+ struct inode *inode = file_inode(file);
+ struct xfs_inode *ip = XFS_I(inode);
+ loff_t new_size = 0;
+ unsigned int blksize = i_blocksize(inode);
+ xfs_off_t offset_aligned = round_up(offset, blksize);
+ xfs_off_t end_aligned = round_down(offset + len, blksize);
+ xfs_off_t len_aligned = end_aligned - offset_aligned;
+ int error;
+
+ if (xfs_is_always_cow_inode(ip) ||
+ !bdev_write_zeroes_unmap_sectors(xfs_inode_buftarg(ip)->bt_bdev))
+ return -EOPNOTSUPP;
+
+ error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
+ if (error)
+ return error;
+
+ /*
+ *
+ * |----------|----------|----------|----------|----------|
+ * ^ ^ ^ ^ ^ ^
+ * | | | | | |
+ * | offset | | end |
+ * | | | |
+ * offset_rd offset_ru end_rd end_ru
+ *
+ * xfs_free_file_space() punches inside from offset_ru -> end_rd. It also
+ * zeroes offset -> offset_ru and end_rd -> end.
+ * Only pass offset_ru -> end_rd to be zeroed via xfs_alloc_file_space().
+ */
+ error = xfs_free_file_space(ip, offset, len, ac);
+ if (error)
+ return error;
+
+ /*
+ * Publish the new size while the punched range is still a hole, then
+ * fill it with written zeroes. Like the other fallocate modes we use
+ * xfs_falloc_setsize(), but it must run *before* we convert the range
+ * to written extents: xfs_setattr_size() zeroes [old EOF, new size) via
+ * xfs_zero_range(), which skips holes, so there is nothing to re-zero.
+ * It will also writeback partial EOF block before the on-disk size is
+ * logged.
+ */
+ error = xfs_falloc_setsize(file, new_size);
+ if (error)
+ return error;
+
+ if (len_aligned > 0)
+ error = xfs_alloc_file_space(ip, offset_aligned, len_aligned,
+ XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
+
+ return error;
+}
+
/*
* Punch a hole and prealloc the range. We use a hole punch rather than
* unwritten extent conversion for two reasons:
@@ -1473,7 +1535,7 @@ xfs_falloc_allocate_range(
(FALLOC_FL_ALLOCATE_RANGE | FALLOC_FL_KEEP_SIZE | \
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_COLLAPSE_RANGE | \
FALLOC_FL_ZERO_RANGE | FALLOC_FL_INSERT_RANGE | \
- FALLOC_FL_UNSHARE_RANGE)
+ FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_WRITE_ZEROES)
STATIC long
__xfs_file_fallocate(
@@ -1525,6 +1587,9 @@ __xfs_file_fallocate(
case FALLOC_FL_ALLOCATE_RANGE:
error = xfs_falloc_allocate_range(file, mode, offset, len);
break;
+ case FALLOC_FL_WRITE_ZEROES:
+ error = xfs_falloc_write_zeroes(file, mode, offset, len, ac);
+ break;
default:
error = -EOPNOTSUPP;
break;
--
2.51.2
* Re: [PATCH v7 2/2] xfs: add support for FALLOC_FL_WRITE_ZEROES
From: Pankaj Raghav @ 2026-06-23 20:21 UTC (permalink / raw)
To: Pankaj Raghav, linux-xfs, hch
Cc: bfoster, lukas, Darrick J . Wong, dgc, gost.dev, andres,
kundan.kumar, hch, cem
> +static int
> +xfs_falloc_write_zeroes(
> + struct file *file,
> + int mode,
> + loff_t offset,
> + loff_t len,
> + struct xfs_zone_alloc_ctx *ac)
> +{
> + struct inode *inode = file_inode(file);
> + struct xfs_inode *ip = XFS_I(inode);
> + loff_t new_size = 0;
> + unsigned int blksize = i_blocksize(inode);
> + xfs_off_t offset_aligned = round_up(offset, blksize);
> + xfs_off_t end_aligned = round_down(offset + len, blksize);
> + xfs_off_t len_aligned = end_aligned - offset_aligned;
> + int error;
> +
> + if (xfs_is_always_cow_inode(ip) ||
> + !bdev_write_zeroes_unmap_sectors(xfs_inode_buftarg(ip)->bt_bdev))
> + return -EOPNOTSUPP;
> +
> + error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
> + if (error)
> + return error;
> +
> + /*
> + *
> + * |----------|----------|----------|----------|----------|
> + * ^ ^ ^ ^ ^ ^
> + * | | | | | |
> + * | offset | | end |
> + * | | | |
> + * offset_rd offset_ru end_rd end_ru
> + *
> + * xfs_free_file_space() punches inside from offset_ru -> end_rd. It also
> + * zeroes offset -> offset_ru and end_rd -> end.
> + * Only pass offset_ru -> end_rd to be zeroed via xfs_alloc_file_space().
> + */
> + error = xfs_free_file_space(ip, offset, len, ac);
> + if (error)
> + return error;
> +
> + /*
> + * Publish the new size while the punched range is still a hole, then
> + * fill it with written zeroes. Like the other fallocate modes we use
> + * xfs_falloc_setsize(), but it must run *before* we convert the range
> + * to written extents: xfs_setattr_size() zeroes [old EOF, new size) via
> + * xfs_zero_range(), which skips holes, so there is nothing to re-zero.
> + * It will also writeback partial EOF block before the on-disk size is
> + * logged.
> + */
> + error = xfs_falloc_setsize(file, new_size);
> + if (error)
> + return error;
> +
> + if (len_aligned > 0)
> + error = xfs_alloc_file_space(ip, offset_aligned, len_aligned,
> + XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
> +
> + return error;
> +}
> +
Sashiko was not happy with this approach: there are cases where there is no data
corruption, but we might end up not allocating an extent and therefore get -ENOSPC at a later point.
I went back to what Zhang Yi pointed out in the previous version regarding semantics [1]. I think the
correct approach is the following:
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 0e1332ccdf79..a27862037d22 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1379,10 +1379,6 @@ xfs_falloc_write_zeroes(
struct inode *inode = file_inode(file);
struct xfs_inode *ip = XFS_I(inode);
loff_t new_size = 0;
- unsigned int blksize = i_blocksize(inode);
- xfs_off_t offset_aligned = round_up(offset, blksize);
- xfs_off_t end_aligned = round_down(offset + len, blksize);
- xfs_off_t len_aligned = end_aligned - offset_aligned;
int error;
if (xfs_is_always_cow_inode(ip) ||
@@ -1402,9 +1398,11 @@ xfs_falloc_write_zeroes(
* | | | |
* offset_rd offset_ru end_rd end_ru
*
- * xfs_free_file_space() punches inside from offset_ru -> end_rd. It also
- * zeroes offset -> offset_ru and end_rd -> end.
- * Only pass offset_ru -> end_rd to be zeroed via xfs_alloc_file_space().
+ * xfs_free_file_space() punches the aligned interior offset_ru -> end_rd
+ * to holes and byte-zeroes the in-range parts of the partial edge blocks,
+ * offset -> offset_ru and end_rd -> end. xfs_zero_range() only touches
+ * already-written blocks here; it skips holes and unwritten extents, so
+ * unallocated/unwritten edge blocks are left for the allocation below.
*/
error = xfs_free_file_space(ip, offset, len, ac);
if (error)
@@ -1423,11 +1421,19 @@ xfs_falloc_write_zeroes(
if (error)
return error;
- if (len_aligned > 0)
- error = xfs_alloc_file_space(ip, offset_aligned, len_aligned,
- XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
-
- return error;
+ /*
+ * Allocate written, zeroed extents across the range. xfs_alloc_file_space()
+ * rounds outward to block granularity:
+ * - holes (the punched interior and any unallocated edge block) are
+ * allocated and zeroed;
+ * - unwritten extents (including unwritten edge blocks) are converted to
+ * written and zeroed;
+ * - already-written blocks are skipped, so the out-of-range bytes of a
+ * written edge block keep their data; their in-range bytes were already
+ * zeroed by xfs_free_file_space() above.
+ */
+ return xfs_alloc_file_space(ip, offset, len,
+ XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
}
/*
We pass offset and len to xfs_alloc_file_space() without rounding, and its existing behaviour
handles them correctly. I could add test cases to xfstests that exercise all these edge cases so that
we don't regress.
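To make the difference concrete, here is a small userspace sketch of the
two rounding choices. It is an illustration under stated assumptions, not
kernel code: struct blk_range, range_inward() and range_outward() are
hypothetical names, and offsets are assumed to be non-negative.
range_inward() mirrors the v7 round_up()/round_down() computation, and
range_outward() mirrors the outward rounding to block granularity that
xfs_alloc_file_space() applies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a block-aligned byte range. */
struct blk_range {
	int64_t start;	/* first byte covered, block aligned */
	int64_t len;	/* bytes covered; <= 0 means nothing to allocate */
};

static int64_t blk_round_down(int64_t x, int64_t blksize)
{
	assert(blksize > 0 && x >= 0);
	return x - (x % blksize);
}

static int64_t blk_round_up(int64_t x, int64_t blksize)
{
	return blk_round_down(x + blksize - 1, blksize);
}

/* v7: only the whole blocks strictly inside [offset, offset + len). */
static struct blk_range range_inward(int64_t offset, int64_t len,
				     int64_t blksize)
{
	struct blk_range r;

	r.start = blk_round_up(offset, blksize);
	r.len = blk_round_down(offset + len, blksize) - r.start;
	return r;
}

/* Proposed: every block touched by [offset, offset + len). */
static struct blk_range range_outward(int64_t offset, int64_t len,
				      int64_t blksize)
{
	struct blk_range r;

	r.start = blk_round_down(offset, blksize);
	r.len = blk_round_up(offset + len, blksize) - r.start;
	return r;
}
```

With 4096-byte blocks, offset 100 and len 200 give an inward length that
is not positive, so v7 would skip the allocation entirely, whereas the
outward range covers the whole first block.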
If there are no more comments, I will send a v8 with this approach.
--
Pankaj
[1] https://lore.kernel.org/linux-xfs/557b2e5c-7c65-48de-87a9-6fba21eca99f@huaweicloud.com/