From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>, Florian Weimer <fw@deneb.enyo.de>,
Florian Weimer <fweimer@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
Hans Holmberg <hans.holmberg@wdc.com>,
linux-xfs@vger.kernel.org, Carlos Maiolino <cem@kernel.org>,
"Darrick J . Wong" <djwong@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
libc-alpha@sourceware.org
Subject: Re: [RFC] xfs: fake fallocate success for always CoW inodes
Date: Tue, 11 Nov 2025 10:04:57 +0100
Message-ID: <20251111090457.GB11723@lst.de>
In-Reply-To: <aRJaLn72i4yh1mkp@dread.disaster.area>
On Tue, Nov 11, 2025 at 08:33:34AM +1100, Dave Chinner wrote:
> > Not really. FALLOC_FL_WRITE_ZEROES does hardware-offloaded zeroing.
>
> That is not required functionality - it is an implementation
> optimisation.
It's also the reason why it exists.
> WRITE_ZEROES requires that the subsequent write must not need to
> perform filesystem metadata updates to guarantee data integrity.
> How the filesystem implements that is up to the filesystem....
No, it can't require that. But it is optimizing for that.
> > I think what Florian wants (although I might be misunderstanding him)
> > is an interface that will increase the file size up to the passed in
> > size, but never reduce it and lose data.
>
> Ah, that's not a "zeroing fallocate()" like was suggested. These are
> the existing FALLOC_FL_ALLOCATE_RANGE file extension semantics.
Yes, just without allocating.
> AFAICT, this is exactly what the proposed patch implements - it
> short circuits the bit we can't guarantee (ENOSPC prevention via
> preallocation) but retains all the other aspects (non-destructive
> truncate up) when it returns success.
Yes.
> I don't see how a glibc posix_fallocate() fallback that does a
> non-destructive truncate up through some new interface is any better
> than just having the filesystem implement ALLOCATE_RANGE without the
> ENOSPC guarantees in the first place?
For one, because applications specifically probing the low-level Linux
system call will find out what is supported and what is not. Linux
fallocate has always failed when it could not support the exact
semantics, while posix_fallocate in glibc has always had a (fairly
broken) fallback, so applications can somewhat reasonably expect it not
to fail.
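
To illustrate the probing pattern, here is a minimal sketch. It is not
code from this thread: reserve_or_extend() is a hypothetical helper, and
the size-only fallback is just one possible policy. It assumes a Linux
system with glibc, where fallocate() with mode 0 reports EOPNOTSUPP when
the filesystem cannot provide the allocation semantics.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  1 if the range was really preallocated,
 *          0 if only the file size was extended (no ENOSPC guarantee),
 *         -1 on error, with errno set.
 */
int reserve_or_extend(int fd, off_t offset, off_t len)
{
	struct stat st;

	/* Mode 0 is the default allocate-range behaviour of fallocate(2). */
	if (fallocate(fd, 0, offset, len) == 0)
		return 1;

	/* Any error other than "not supported" is a real failure. */
	if (errno != EOPNOTSUPP)
		return -1;

	/* Fallback: make sure the file is at least offset + len bytes. */
	if (fstat(fd, &st) != 0)
		return -1;
	if (st.st_size >= offset + len)
		return 0;
	return ftruncate(fd, offset + len) == 0 ? 0 : -1;
}
```

A caller that needs the ENOSPC guarantee can tell the two success cases
apart by the return value, which a silently faked success would not
allow.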
> > They are both quite different as they both zero the entire passed in
> > range, even if it already contains data, which is completely different
> > from the posix_fallocate or fallocate FALLOC_FL_ALLOCATE_RANGE semantics
> > that leave any existing data intact.
>
> Yes. However:
>
> fallocate(fd, FALLOC_FL_WRITE_ZEROES, old_eof, new_eof - old_eof);
>
> is exactly the "zeroing truncate up" operation that was being
> suggested. It will not overwrite any existing data, except if the
> application is racing other file extension operations with this one.
FALLOC_FL_WRITE_ZEROES is defined to zero the entire range.
FALLOC_FL_ALLOCATE_RANGE or a truncate up do not zero existing data.
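
For reference, the non-destructive "truncate up" semantics discussed
above can be approximated in userspace as follows. This is only a
sketch: grow_file_to() is a hypothetical name, not an existing
interface, and the size check and the ftruncate() call are not atomic,
so a concurrent writer extending the file in between could still lose
data.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: grow the file to at
 * least new_size bytes, but never shrink it.  Existing data is left
 * untouched; the newly exposed range reads back as zeroes.
 *
 * Returns 0 on success, -1 on error with errno set.
 */
int grow_file_to(int fd, off_t new_size)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -1;

	/* Already large enough: do nothing, so no data can be lost. */
	if (st.st_size >= new_size)
		return 0;

	/* Racy: the file may have grown since the fstat() above. */
	return ftruncate(fd, new_size);
}
```

Unlike FALLOC_FL_WRITE_ZEROES, this never rewrites bytes that are
already in the file, and unlike FALLOC_FL_ALLOCATE_RANGE it makes no
attempt to reserve space.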