From: "Darrick J. Wong" <djwong@kernel.org>
To: Lukas Herbolt <lukas@herbolt.com>
Cc: linux-xfs@vger.kernel.org, cem@kernel.org, hch@infradead.org,
	p.raghav@samsung.com
Subject: Re: [PATCH v11 2/2] xfs: add FALLOC_FL_WRITE_ZEROES to XFS code base
Date: Tue, 10 Mar 2026 07:57:07 -0700
Message-ID: <20260310145707.GO1105363@frogsfrogsfrogs>
In-Reply-To: <274965b98fee964be0cdcf4503d394b3@herbolt.com>

On Tue, Mar 10, 2026 at 11:20:54AM +0100, Lukas Herbolt wrote:
> On 2026-03-10 01:44, Darrick J. Wong wrote:
> > On Mon, Mar 09, 2026 at 07:12:36PM +0100, Lukas Herbolt wrote:
> > > Add support for FALLOC_FL_WRITE_ZEROES if the underlying device
> > > enable the unmap write zeroes operation.
> > >
> > > Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
> > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> > > Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
> > >
> > > ---
> > > v11 changes:
> > > - split into 2 patches separating the bmapi_flags addition
> > > - 2 step allocation, to avoid zeroing beyond EOF
> > >
> > > fs/xfs/xfs_file.c | 41 +++++++++++++++++++++++++++++------------
> > > 1 file changed, 29 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > > index fd049a1fc9c6..f8c1611e3267 100644
> > > --- a/fs/xfs/xfs_file.c
> > > +++ b/fs/xfs/xfs_file.c
> > > @@ -1293,29 +1293,45 @@ xfs_falloc_zero_range(
> > >  	unsigned int		blksize = i_blocksize(inode);
> > >  	loff_t			new_size = 0;
> > >  	int			error;
> > > +	bool			need_convert = false;
> > >
> > >  	trace_xfs_zero_file_space(ip);
> > >
> > > +	if (mode & FALLOC_FL_WRITE_ZEROES) {
> > > +		if (xfs_is_always_cow_inode(ip) ||
> > > +		    !bdev_write_zeroes_unmap_sectors(
> > > +				xfs_inode_buftarg(ip)->bt_bdev))
> > > +			return -EOPNOTSUPP;
> > > +		need_convert = true;
> > > +	}
> > > +
> > >  	error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
> > >  	if (error)
> > >  		return error;
> > >
> > >  	if (xfs_falloc_force_zero(ip, ac)) {
> > >  		error = xfs_zero_range(ip, offset, len, ac, NULL);
> > > -	} else {
> > > -		error = xfs_free_file_space(ip, offset, len, ac);
> > > -		if (error)
> > > -			return error;
> > > -
> > > -		len = round_up(offset + len, blksize) -
> > > -			round_down(offset, blksize);
> > > -		offset = round_down(offset, blksize);
> > > -		error = xfs_alloc_file_space(ip, offset, len,
> > > -				XFS_BMAPI_PREALLOC);
> > > +		goto set_filesize;
> > >  	}
> > > +	error = xfs_free_file_space(ip, offset, len, ac);
> > >  	if (error)
> > >  		return error;
> > > -	return xfs_falloc_setsize(file, new_size);
> > > +
> > > +	len = round_up(offset + len, blksize) - round_down(offset, blksize);
> > > +	offset = round_down(offset, blksize);
> > > +	error = xfs_alloc_file_space(ip, offset, len, XFS_BMAPI_PREALLOC);
> > > +
> > > +set_filesize:
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	error = xfs_falloc_setsize(file, new_size);
> > > +	if (error)
> > > +		return error;
> > > +	if (need_convert)
> > > +		error = xfs_alloc_file_space(ip, offset, len,
> > > +				XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
> > > +	return error;
> > >  }
> >
> > I can't help but think this would be cleaner as:
> >
> > static int
> > xfs_falloc_write_zero_range(
> > 	struct file		*file,
> > 	int			mode,
> > 	loff_t			offset,
> > 	loff_t			len,
> > 	struct xfs_zone_alloc_ctx *ac)
> > {
> > 	struct inode		*inode = file_inode(file);
> > 	struct xfs_inode	*ip = XFS_I(inode);
> > 	unsigned int		blksize = i_blocksize(inode);
> > 	loff_t			new_size = 0;
> > 	int			error;
> >
> > 	trace_xfs_zero_file_space(ip);
> >
> > 	if (xfs_is_always_cow_inode(ip) ||
> > 	    !bdev_write_zeroes_unmap_sectors(
> > 			xfs_inode_buftarg(ip)->bt_bdev))
> > 		return -EOPNOTSUPP;
> >
> > 	error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
> > 	if (error)
> > 		return error;
> >
> > 	if (xfs_falloc_force_zero(ip, ac)) {
> > 		error = xfs_zero_range(ip, offset, len, ac, NULL);
> > 		if (error)
> > 			return error;
> >
> > 		return xfs_falloc_setsize(file, new_size);
> > 	}
> >
> > 	error = xfs_free_file_space(ip, offset, len, ac);
> > 	if (error)
> > 		return error;
> >
> > 	len = round_up(offset + len, blksize) -
> > 			round_down(offset, blksize);
> > 	offset = round_down(offset, blksize);
> > 	error = xfs_alloc_file_space(ip, offset, len);
> > 	if (error)
> > 		return error;
> >
> > 	error = xfs_falloc_setsize(file, new_size);
> > 	if (error)
> > 		return error;
> >
> > 	return xfs_alloc_file_space(ip, offset, len,
> > 			XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
> > }
> I didn't want to duplicate most of the xfs_falloc_zero_range, but if that's
> fine we can go this way.

I /much/ prefer that each fallocate mode have its own small cohesive
function.  We used to have one giant function with conditionals
everywhere and it was very hard to understand.
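
For illustration only, here is a rough user-space sketch of the
one-handler-per-mode shape (this is not kernel code).  demo_dispatch()
and the demo_op tags are made-up names standing in for the per-mode
helpers, and the fallback define of FALLOC_FL_WRITE_ZEROES as 0x80 is an
assumption for systems whose headers predate the flag:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/falloc.h>

/* Assumption: older uapi headers may not define this flag yet. */
#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES	0x80
#endif

/* Hypothetical tags standing in for the per-mode helper functions. */
enum demo_op {
	DEMO_ALLOCATE,
	DEMO_PUNCH_HOLE,
	DEMO_ZERO_RANGE,
	DEMO_WRITE_ZEROES,
};

/*
 * Pick exactly one handler per fallocate mode.  KEEP_SIZE is treated as a
 * modifier and masked off; any other combination of mode bits is rejected.
 */
static int
demo_dispatch(
	int		mode,
	enum demo_op	*op)
{
	assert(op != NULL);

	switch (mode & ~FALLOC_FL_KEEP_SIZE) {
	case 0:
		*op = DEMO_ALLOCATE;
		return 0;
	case FALLOC_FL_PUNCH_HOLE:
		*op = DEMO_PUNCH_HOLE;
		return 0;
	case FALLOC_FL_ZERO_RANGE:
		*op = DEMO_ZERO_RANGE;
		return 0;
	case FALLOC_FL_WRITE_ZEROES:
		*op = DEMO_WRITE_ZEROES;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

Each case would call one short function that handles only that mode, so
no helper needs a need_convert-style flag to tell the modes apart.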

> > ...assuming there's even a point to WRITE_ZEROES on a zoned file?
> I think this is covered in:
>
> 	if (xfs_is_always_cow_inode(ip) ||
> 	    !bdev_write_zeroes_unmap_sectors(
> 			xfs_inode_buftarg(ip)->bt_bdev))
> 		return -EOPNOTSUPP;

Yeah, it is.  I missed that, so this becomes shorter:

static int
xfs_falloc_write_zero_range(
	struct file		*file,
	int			mode,
	loff_t			offset,
	loff_t			len,
	struct xfs_zone_alloc_ctx *ac)
{
	struct inode		*inode = file_inode(file);
	struct xfs_inode	*ip = XFS_I(inode);
	unsigned int		blksize = i_blocksize(inode);
	loff_t			new_size = 0;
	int			error;

	if (xfs_is_always_cow_inode(ip) ||
	    !bdev_write_zeroes_unmap_sectors(
			xfs_inode_buftarg(ip)->bt_bdev))
		return -EOPNOTSUPP;

	trace_xfs_write_zero_range(ip);

	error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
	if (error)
		return error;

	error = xfs_free_file_space(ip, offset, len, ac);
	if (error)
		return error;

	len = round_up(offset + len, blksize) -
			round_down(offset, blksize);
	offset = round_down(offset, blksize);
	error = xfs_alloc_file_space(ip, offset, len,
			XFS_BMAPI_PREALLOC);
	if (error)
		return error;

	error = xfs_falloc_setsize(file, new_size);
	if (error)
		return error;

	return xfs_alloc_file_space(ip, offset, len,
			XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
}
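
In case the len/offset arithmetic in the middle is not obvious, here is
a small user-space sketch of how the byte range is widened to whole
blocks before the allocation calls.  demo_round_down(), demo_round_up()
and demo_block_align() are hypothetical stand-ins for the kernel
helpers, and the sketch assumes a power-of-two block size:

```c
#include <assert.h>
#include <stdint.h>

struct demo_range {
	int64_t		offset;
	int64_t		len;
};

/* Hypothetical stand-ins for round_down()/round_up(); power-of-two blk only. */
static int64_t
demo_round_down(int64_t x, int64_t blk)
{
	return x & ~(blk - 1);
}

static int64_t
demo_round_up(int64_t x, int64_t blk)
{
	return (x + blk - 1) & ~(blk - 1);
}

/*
 * Widen [offset, offset + len) so that it starts and ends on block
 * boundaries, mirroring the len/offset update in the function above.
 * The new length is computed from the original offset before the
 * offset itself is rounded down.
 */
static struct demo_range
demo_block_align(int64_t offset, int64_t len, int64_t blksize)
{
	struct demo_range	r;

	assert(blksize > 0 && (blksize & (blksize - 1)) == 0);

	r.len = demo_round_up(offset + len, blksize) -
			demo_round_down(offset, blksize);
	r.offset = demo_round_down(offset, blksize);
	return r;
}
```

For example, offset 5000 and len 3000 with 4096-byte blocks becomes
offset 4096 and len 4096, i.e. the single block containing the range.
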
--D