linux-xfs.vger.kernel.org archive mirror
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>, Florian Weimer <fw@deneb.enyo.de>,
	Florian Weimer <fweimer@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Hans Holmberg <hans.holmberg@wdc.com>,
	linux-xfs@vger.kernel.org, Carlos Maiolino <cem@kernel.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	libc-alpha@sourceware.org
Subject: Re: [RFC] xfs: fake fallocate success for always CoW inodes
Date: Tue, 11 Nov 2025 10:04:57 +0100
Message-ID: <20251111090457.GB11723@lst.de>
In-Reply-To: <aRJaLn72i4yh1mkp@dread.disaster.area>

On Tue, Nov 11, 2025 at 08:33:34AM +1100, Dave Chinner wrote:
> > Not really.  FALLOC_FL_WRITE_ZEROES does hardware-offloaded zeroing.
> 
> That is not required functionality - it is an implementation
> optimisation.

It's also the reason why it exists.

> WRITE_ZEROES requires that the subsequent write must not need to
> perform filesystem metadata updates to guarantee data integrity.
> How the filesystem implements that is up to the filesystem....

No, it can't require that.  But it is optimizing for that.

> > I think what Florian wants (although I might be misunderstanding him)
> > is an interface that will increase the file size up to the passed in
> > size, but never reduce it and lose data.
> 
> Ah, that's not a "zeroing fallocate()" like was suggested. These are
> the existing FALLOC_FL_ALLOCATE_RANGE file extension semantics.

Yes, just without allocating.
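
To make the "grow but never shrink" semantics concrete, here is a
minimal userspace sketch.  grow_only() and grow_only_demo() are
hypothetical names for illustration, not an existing kernel or glibc
interface, and unlike a real interface the helper is not atomic: another
writer can change the file size between fstat() and ftruncate().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: grow fd to at least new_size, never shrink it.
 * Returns 0 on success, -1 with errno set on failure.
 */
int grow_only(int fd, off_t new_size)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	if (st.st_size >= new_size)
		return 0;	/* already large enough: leave the file alone */
	/* Extends the file; no ENOSPC guarantee is made for later writes. */
	return ftruncate(fd, new_size);
}

/* Self-check: returns 0 if data survives a grow and a smaller request is a no-op. */
int grow_only_demo(void)
{
	char path[] = "/tmp/grow_only_XXXXXX";
	char buf[5] = { 0 };
	struct stat st;
	int fd = mkstemp(path);
	int ret = -1;

	if (fd < 0)
		return -1;
	unlink(path);
	if (write(fd, "hello", 5) != 5)
		goto out;
	/* Grow to 4096 bytes, then ask for less: the second call must not shrink. */
	if (grow_only(fd, 4096) < 0 || grow_only(fd, 16) < 0)
		goto out;
	if (fstat(fd, &st) < 0 || st.st_size != 4096)
		goto out;
	if (pread(fd, buf, 5, 0) != 5 || memcmp(buf, "hello", 5) != 0)
		goto out;
	ret = 0;
out:
	close(fd);
	return ret;
}
```

The fstat()/ftruncate() pair is only an approximation; closing that race
is one reason to want the operation as a single call.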

> AFAICT, this is exactly what the proposed patch implements - it
> short circuits the bit we can't guarantee (ENOSPC prevention via
> preallocation) but retains all the other aspects (non-destructive
> truncate up) when it returns success.

Yes.

> I don't see how a glibc posix_fallocate() fallback that does a
> non-destructive truncate up through some new interface is any better
> than just having the filesystem implement ALLOCATE_RANGE without the
> ENOSPC guarantees in the first place?

For one, because applications that specifically probe the low-level
Linux system call will find out what is supported and what is not.
Linux fallocate has always failed when it does not support the exact
semantics, while posix_fallocate in glibc has always had a (fairly
broken) fallback, so applications can somewhat reasonably expect it
not to fail.
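
As a rough illustration of the two entry points, the sketch below calls
both on a scratch file.  The function names are hypothetical; only
fallocate(2) and posix_fallocate(3) are real.  The raw call reports
failure through errno (for example EOPNOTSUPP), while posix_fallocate
returns the error number directly and may be emulated by glibc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Raw Linux call: returns 0, or a negative errno such as -EOPNOTSUPP. */
int probe_fallocate(int fd, off_t len)
{
	if (fallocate(fd, 0, 0, len) == 0)
		return 0;
	return -errno;
}

/* glibc wrapper: returns 0 or a positive error number; errno is not set. */
int probe_posix_fallocate(int fd, off_t len)
{
	return posix_fallocate(fd, 0, len);
}

/* Runs both probes on a scratch file; returns 0 if the results are consistent. */
int probe_demo(void)
{
	char path[] = "/tmp/probe_falloc_XXXXXX";
	struct stat st;
	int fd = mkstemp(path);
	int raw, posix, ok;

	if (fd < 0)
		return -1;
	unlink(path);
	raw = probe_fallocate(fd, 4096);
	posix = probe_posix_fallocate(fd, 4096);
	printf("fallocate: %d, posix_fallocate: %d\n", raw, posix);
	/* Whenever either call reports success, the file must have grown. */
	ok = fstat(fd, &st) == 0 &&
	     ((raw != 0 && posix != 0) || st.st_size >= 4096);
	close(fd);
	return ok ? 0 : 1;
}
```

The printed values depend on the filesystem backing /tmp, so no
particular output should be expected.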

> > They are both quite different as they both zero the entire passed in
> > range, even if it already contains data, which is completely different
> > from the posix_fallocate or fallocate FALLOC_FL_ALLOCATE_RANGE semantics
> > that leave any existing data intact.
> 
> Yes. However:
> 
> 	fallocate(fd, FALLOC_FL_WRITE_ZEROES, old_eof, new_eof - old_eof);
> 
> is exactly the "zeroing truncate up" operation that was being
> suggested. It will not overwrite any existing data, except if the
> application is racing other file extension operations with this one.

FALLOC_FL_WRITE_ZEROES is defined to zero the entire range.
FALLOC_FL_ALLOCATE_RANGE or a truncate up do not zero existing data.
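
A small sketch of that difference, assuming a scratch file under /tmp
and a hypothetical function name.  The FALLOC_FL_WRITE_ZEROES part is
compiled only when the installed headers define the flag, and it is
expected to fail wherever the kernel, filesystem or device does not
support it.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 0 if the observed behaviour matches the semantics described
 * above, 1 if it does not, -1 on setup failure.  Calls that the
 * filesystem rejects are skipped rather than treated as a mismatch.
 */
int falloc_semantics_demo(void)
{
	char path[] = "/tmp/falloc_sem_XXXXXX";
	char buf[4];
	int fd = mkstemp(path);
	int ret = -1;

	if (fd < 0)
		return -1;
	unlink(path);
	if (pwrite(fd, "data", 4, 0) != 4)
		goto out;

	/* Mode 0 is the ALLOCATE_RANGE behaviour: existing bytes must survive. */
	if (fallocate(fd, 0, 0, 4096) == 0) {
		if (pread(fd, buf, 4, 0) != 4)
			goto out;
		if (memcmp(buf, "data", 4) != 0) {
			ret = 1;
			goto out;
		}
	}
#ifdef FALLOC_FL_WRITE_ZEROES
	/* Zeroes the whole range, including the bytes written above. */
	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, 0, 4096) == 0) {
		static const char zeroes[4] = { 0 };

		if (pread(fd, buf, 4, 0) != 4)
			goto out;
		if (memcmp(buf, zeroes, 4) != 0) {
			ret = 1;
			goto out;
		}
	}
#endif
	ret = 0;
out:
	close(fd);
	return ret;
}
```

Restricting the WRITE_ZEROES range to start at the old EOF avoids
touching existing data only as long as nothing else extends the file
concurrently.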

