From: Dave Chinner <dgc@kernel.org>
To: "Pankaj Raghav (Samsung)" <pankaj.raghav@linux.dev>
Cc: Lukas Herbolt <lukas@herbolt.com>,
linux-xfs@vger.kernel.org, cem@kernel.org, hch@infradead.org,
djwong@kernel.org, p.raghav@samsung.com
Subject: Re: [PATCH v11 2/2] xfs: add FALLOC_FL_WRITE_ZEROES to XFS code base
Date: Mon, 16 Mar 2026 10:49:44 +1100 [thread overview]
Message-ID: <abdFmEn7lQKqqbv1@dread> (raw)
In-Reply-To: <my5vghqmg4npgqpunilj3btbgrvr2s4dbqkm2j6cpw2l3tccrk@p3h3scnzwz6o>
On Thu, Mar 12, 2026 at 09:36:50PM +0000, Pankaj Raghav (Samsung) wrote:
> > > +
> > > + len = round_up(offset + len, blksize) - round_down(offset, blksize);
> > > + offset = round_down(offset, blksize);
> > > + error = xfs_alloc_file_space(ip, offset, len, XFS_BMAPI_PREALLOC);
> >
> > xfs_alloc_file_space() already rounds the offset down and the range
> > end upwards to block size boundaries (XFS_B_TO_FSBT() rounds down,
> > XFS_B_TO_FSB rounds up). There is no need to do it here.
> >
> > > +
> > > +set_filesize:
> > > + if (error)
> > > + return error;
> > > +
> > > + error = xfs_falloc_setsize(file, new_size);
> > > + if (error)
> > > + return error;
> > > + if (need_convert)
> > > + error = xfs_alloc_file_space(ip, offset, len,
> > > + XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
> > > + return error;
> >
> > Honestly, this "prealloc, set file size, then convert and zero" thing
> > seems a bit ... clunky.
> >
> > I mean, this is trying to work around an issue with
> > xfs_falloc_setsize() operating on written extents beyond EOF to set
> > EOF. But why are we even using a generic truncate operation to set
> > the EOF when we have already done most of the complex truncate work
> > (exclusion, page cache invalidation, extent removal, etc) already?
>
> I agree.
>
> >
> > My logic:
> >
> > xfs_bmapi_write(off, len, XFS_BMAPI_ZERO) will allocate all the
> > holes and delalloc extents over the given range as written extents
> > that are initialised to contain zeroes.
> >
> > xfs_bmapi_write(off, len, XFS_BMAPI_CONVERT) will convert all the
> > unwritten extents over a given range to written extents, but will
> > not touch the data in those extents.
> >
> > xfs_bmapi_write(off, len, XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO) will
> > allocate holes and delalloc extents and convert unwritten extents
> > all to written extents, and it will write zeros to all of those
> > extents.
> >
> > This latter case is exactly what we want, and if gives us zeroes on
> > disk to the end of the extent beyond the new EOF. i.e. we've
> > completed all the data IO, and so now all we need
> > to do is update the file size transactionally, just like we do in
> > IO completion.
> >
> > Indeed, xfs_iomap_write_unwritten() integrates this size update
> > into post-IO unwritten extent conversion. i.e we already have code
> > that does exactly what we need, except for tweaking the
> > xfs_bmapi_write() flags for the zeroing behaviour we need.
>
> This makes sense.
>
> >
> > So, a helper function in fs/xfs/xfs_bmap_util.c such as this:
> >
> > /*
> > * This function is used to allocate written extents over holes
> > * and/or convert unwritten extents to written extents based on the
> > * @flags passed to it.
> > *
> > * If @flags is zero, if will allocate written extents for holes and
>
> s/zero/XFS_BMAPI_ZERO ?
No, I explicitly meant flags = 0.
> > * delalloc extents across the range.
> > *
> > * If XFS_BMAPI_CONVERT is specified in @flags, then it will also do
> > * conversion of unwritten extents in the range to written extents.
> > *
> > * If XFS_BMAPI_ZERO is specified in @flags, then both newly
>
> s/XFS_BMAPI_ZERO/(XFS_BMAPI_ZERO | XFS_BMAPI_CONVERT) ?
No I explictly describing what happens when you do "flags |=
XFS_BMAPI_ZERO;" to any other xfs_bmapi_write() operation
flag specification.
i.e. XFS_BMAPI_ZERO is an optional behavioural modifier for
allocation and/or unwritten extent conversion operations. i.e all it
does is add initialisation of the new/modified extents to contain
real zeroes.
e.g. flags = 0 means "allocate holes/delalloc as uninitialised
written extents". flags = 0 | XFS_BMAPI_ZERO
means "allocate holes/delalloc and initialise them to contain
zeros". Unwritten extents will not be modified by either operation.
Cheers,
Dave.
--
Dave Chinner
dgc@kernel.org
next prev parent reply other threads:[~2026-03-15 23:49 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-09 18:12 [PATCH v11 2/2] xfs: add FALLOC_FL_WRITE_ZEROES to XFS code base Lukas Herbolt
2026-03-10 0:44 ` Darrick J. Wong
2026-03-10 10:10 ` Pankaj Raghav (Samsung)
2026-03-10 11:22 ` Lukas Herbolt
2026-03-10 15:02 ` Darrick J. Wong
2026-03-10 10:20 ` Lukas Herbolt
2026-03-10 14:57 ` Darrick J. Wong
2026-03-11 0:12 ` Dave Chinner
2026-03-12 21:36 ` Pankaj Raghav (Samsung)
2026-03-15 23:49 ` Dave Chinner [this message]
2026-03-16 7:23 ` Pankaj Raghav
2026-03-16 5:03 ` Lukas Herbolt
2026-03-17 12:20 ` Pankaj Raghav (Samsung)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abdFmEn7lQKqqbv1@dread \
--to=dgc@kernel.org \
--cc=cem@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@infradead.org \
--cc=linux-xfs@vger.kernel.org \
--cc=lukas@herbolt.com \
--cc=p.raghav@samsung.com \
--cc=pankaj.raghav@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.