From: Christoph Hellwig <hch@lst.de>
To: libc-hacker@sourceware.org, linux-fsdevel@vger.kernel.org
Cc: Trond Myklebust <trondmy@hammerspace.com>
Subject: Re: posix_fallocate behavior in glibc
Date: Mon, 29 Jul 2024 17:09:52 +0200
Message-ID: <20240729150952.GA29194@lst.de>
In-Reply-To: <20240626060134.GA22955@lst.de>
Dear glibc maintainers,
any comments or ideas on how to get glibc to stop making file systems
non-conformant by adding a broken wrapper?
On Wed, Jun 26, 2024 at 08:01:34AM +0200, Christoph Hellwig wrote:
> Hi all,
>
> Trond brought the glibc posix_fallocate behavior to my attention.
>
> As a refresher, this is how the Open Group defines posix_fallocate:
>
> The posix_fallocate() function shall ensure that any required storage
> for regular file data starting at offset and continuing for len bytes
> is allocated on the file system storage media. If posix_fallocate()
> returns successfully, subsequent writes to the specified file data
> shall not fail due to the lack of free space on the file system
> storage media.
>
> The glibc implementation in sysdeps/posix/posix_fallocate.c, which is
> also used by sysdeps/unix/sysv/linux/posix_fallocate.c as a fallback if
> the fallocate syscall returns EOPNOTSUPP, is implemented by doing
> single-byte writes at intervals of min(f.f_bsize, 4096).
>
> This assumes that writes to a file guarantee allocating space for future
> writes. Such an assumption is false for write-out-of-place file systems,
> which have been around since at least the early 1990s, but have become
> a lot more common in the last decade. Native Linux examples are
> all file systems sitting on zoned devices, where this is required
> behavior, but also the nilfs2 file system or the LFS mode in f2fs.
> On top of that, it is fairly common for storage systems exposing
> network file system access.
>
> How can we get rid of this glibc fallback that turns the implementations
> non-conformant and increases write amplification for no good reason?
---end quoted text---
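
For readers who have not looked at the fallback, the loop described in the
quoted text can be sketched roughly as follows. This is a simplified
illustration, not glibc's actual code: the names probe_step(), touch_byte()
and emulated_fallocate() are hypothetical, the 512-byte floor for a zero
block size is an assumption of the sketch, and overflow checks on
offset + len are omitted. It writes one byte per stride, preserving bytes
that already exist in the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/statvfs.h>
#include <sys/types.h>
#include <unistd.h>

/* Stride between probe writes: min(f_bsize, 4096), as in the quoted text.
   The 512-byte floor for a zero block size is an assumption of this sketch. */
off_t probe_step(unsigned long bsize)
{
    if (bsize == 0)
        return 512;
    return bsize < 4096 ? (off_t)bsize : 4096;
}

/* Rewrite one byte at pos: keep the existing value if pos lies inside the
   file, otherwise write a zero byte (which also extends the file). */
int touch_byte(int fd, off_t pos, off_t file_size)
{
    char byte = 0;

    if (pos < file_size && pread(fd, &byte, 1, pos) < 0)
        return errno;
    if (pwrite(fd, &byte, 1, pos) != 1)
        return errno ? errno : EIO;
    return 0;
}

/* Hypothetical emulation of posix_fallocate() by single-byte writes.
   Returns 0 on success or an errno value, like posix_fallocate(). */
int emulated_fallocate(int fd, off_t offset, off_t len)
{
    struct stat st;
    struct statvfs vfs;

    if (offset < 0 || len <= 0)
        return EINVAL;
    if (fstat(fd, &st) != 0 || fstatvfs(fd, &vfs) != 0)
        return errno;

    off_t step = probe_step(vfs.f_bsize);
    assert(step > 0);
    off_t end = offset + len;

    /* One write per stride across the requested range. */
    for (off_t pos = offset; pos < end; pos += step) {
        int err = touch_byte(fd, pos, st.st_size);
        if (err != 0)
            return err;
    }
    /* Touch the final byte so the last block is covered and the file is
       at least offset + len bytes long. */
    return touch_byte(fd, end - 1, st.st_size);
}
```

On a file system that overwrites in place, each of these writes forces the
containing block to be allocated, so later writes to the range cannot hit
ENOSPC. On a write-out-of-place file system every later write needs fresh
space regardless, so the loop only costs I/O and reserves nothing, which is
the non-conformance the quoted text complains about.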
Thread overview: 22+ messages
2024-06-26 6:01 posix_fallocate behavior in glibc Christoph Hellwig
2024-07-29 15:09 ` Christoph Hellwig [this message]
2024-07-29 15:11 ` Sam James
-- strict thread matches above, loose matches on Subject: below --
2024-07-29 16:09 Christoph Hellwig
2024-07-29 17:23 ` Paul Eggert
2024-07-29 17:43 ` Christoph Hellwig
2024-07-29 17:54 ` Adhemerval Zanella Netto
[not found] ` <CAPBLoAf11hM0PLhqPG5gUyivU9U1manpOOhDWCPugUmWc1VVUw@mail.gmail.com>
2024-07-29 18:45 ` Christoph Hellwig
2024-07-29 17:57 ` Florian Weimer
2024-07-29 18:44 ` Christoph Hellwig
2024-07-29 18:52 ` Florian Weimer
2024-07-29 19:01 ` Christoph Hellwig
2024-07-29 19:23 ` Florian Weimer
2024-07-30 15:47 ` Christoph Hellwig
2024-07-30 16:11 ` Paul Eggert
2024-07-30 16:20 ` Christoph Hellwig
2024-07-30 17:03 ` Florian Weimer
2024-07-30 17:08 ` Christoph Hellwig
2024-07-30 17:29 ` Florian Weimer
2024-07-30 17:52 ` Mark Wielaard
2024-07-31 2:32 ` Theodore Ts'o
2024-07-29 23:53 ` Dave Chinner