From: Jeff Layton <jlayton@kernel.org>
To: Trevor Gross <tmgross@umich.edu>, Jan Kara <jack@suse.cz>,
The 8472 <kernel@infinite-source.de>
Cc: Zack Weinberg <zack@owlfolio.org>, Rich Felker <dalias@libc.org>,
Alejandro Colomar <alx@kernel.org>,
Vincent Lefevre <vincent@vinc17.net>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
GNU libc development <libc-alpha@sourceware.org>
Subject: Re: [RFC v1] man/man2/close.2: CAVEATS: Document divergence from POSIX.1-2024
Date: Mon, 26 Jan 2026 19:49:28 -0500
Message-ID: <2d6276fca349357f56733268681424b0de5179f7.camel@kernel.org>
In-Reply-To: <DFYW8O4499ZS.2L1ABA5T5XFF2@umich.edu>

On Mon, 2026-01-26 at 17:01 -0600, Trevor Gross wrote:
> On Mon Jan 26, 2026 at 10:43 AM CST, Jeff Layton wrote:
> > On Mon, 2026-01-26 at 16:56 +0100, Jan Kara wrote:
> > > On Mon 26-01-26 14:53:12, The 8472 wrote:
> > > > On 26/01/2026 13:15, Jan Kara wrote:
> > > > > On Sun 25-01-26 10:37:01, Zack Weinberg wrote:
> > > > > > On Sat, Jan 24, 2026, at 4:57 PM, The 8472 wrote:
> > > > > > > > [QUERY: Do delayed errors ever happen in any of these situations?
> > > > > > > >
> > > > > > > > - The fd is not the last reference to the open file description
> > > > > > > >
> > > > > > > > - The OFD was opened with O_RDONLY
> > > > > > > >
> > > > > > > > - The OFD was opened with O_RDWR but has never actually
> > > > > > > > been written to
> > > > > > > >
> > > > > > > > - No data has been written to the OFD since the last call to
> > > > > > > > fsync() for that OFD
> > > > > > > >
> > > > > > > > - No data has been written to the OFD since the last call to
> > > > > > > > fdatasync() for that OFD
> > > > > > > >
> > > > > > > > If we can give some guidance about when people don’t need to
> > > > > > > > worry about delayed errors, it would be helpful.]
> > > > > >
> > > > > > In particular, I really hope delayed errors *aren’t* ever reported
> > > > > > when you close a file descriptor that *isn’t* the last reference
> > > > > > to its open file description, because the thread-safe way to close
> > > > > > stdout without losing write errors[2] depends on that not happening.
> > > > >
> > > > > So I've checked and in Linux ->flush callback for the file is called
> > > > > > whenever you close a file descriptor (regardless of whether there are other
> > > > > > file descriptors pointing to the same file description) so it's up to
> > > > > filesystem implementation what it decides to do and which error it will
> > > > > return... Checking the implementations e.g. FUSE and NFS *will* return
> > > > > delayed writeback errors on *first* descriptor close even if there are
> > > > > other still open descriptors for the description AFAICS.
> >
> > ...and I really wish they _didn't_.
> >
> > Reporting a writeback error on close is not particularly useful. Most
> > filesystems don't require you to write back all data on a close(). A
> > successful close() on those just means that no error has happened yet.
> >
> > Any application that cares about writeback errors needs to fsync(),
> > full stop.
>
> Is there a good middle ground solution here?
>
> It seems reasonable that an application may want to have different
> handling for errors expected during normal operation, such as temporary
> network failure with NFS, compared to more catastrophic things like
> failure to write to disk. The reason cited around [1] for avoiding fsync
> is that it comes with a cost that, for many applications, may not be
> worth it unless you are dealing with NFS.
>
> I was wondering if it could be worth a new fcntl that provides this kind
> of "best effort" error checking behavior without having the strict
> requirements of fsync. In effect, to report the errors that you might
> currently get at close() before actually calling close() and losing the
> fd.
>
For a long-held fd, I can see the appeal: spray writes at it and just
check occasionally (without blocking) that nothing has gone wrong.
Maybe when things are idle, you fsync().

A new fcntl(..., F_CHECKERR, ...) command that does a
file_check_and_advance_wb_err() on the fd and reports the result would
be pretty straightforward.

Would that be helpful for your use case? This would be like a
non-blocking fsync that just reports whether an error has occurred since
the last F_CHECKERR or fsync().

> Alternatively, it would be interesting to have a deferred fsync() that
> schedules a nonblocking sync event that can be polled for completion/
> errors, with flags to indicate immediate sync or allow automatic syncing
> as needed. But there is probably a better alternative to this
> complexity.
>
> [1]: https://github.com/rust-lang/libs-team/issues/705

Aside from the polling, I suppose you could effectively do this with
io_uring. I'm pretty sure you can issue an fsync() or sync_file_range()
that way, but I think it just ends up blocking a kernel thread until
writeback is done.

We've had people ask for a non-blocking fsync before. Maybe it's time
to get serious about adding one. What would such a thing look like?

It would be pretty simple to add a new fcntl(..., F_DATAWRITE) command
that kicks off writeback à la filemap_fdatawrite().

Then add fcntl(..., F_WB_CHECK):

That could do a non-blocking version of filemap_fdatawait(), and return
whether any folios are still under writeback. If there is a writeback
error, it can return that instead.

The catch, of course, is that a polling mechanism like this could easily
livelock. If there is a lot of memory pressure, it might always return
that something is still under writeback, no matter how often you hammer
F_WB_CHECK.

Maybe that's ok? You can always issue a blocking fsync() if you really
need to draw a line in the sand.
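
For comparison, a rough sketch of the nearest equivalent with existing
interfaces: sync_file_range() with SYNC_FILE_RANGE_WRITE starts
writeback without waiting for it to complete, and a later blocking
fsync() draws the line. start_writeback() and wait_and_check() are
hypothetical helper names, not proposed interfaces, and
sync_file_range() is Linux-specific and by itself gives none of the
durability guarantees of fsync().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: initiate write-out of the dirty pages of fd
 * without waiting for completion (the call can still block briefly,
 * e.g. if the request queue is full). Returns 0 or an errno value. */
static int start_writeback(int fd)
{
    /* offset 0 with nbytes 0 covers the file from start to end */
    if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE) < 0)
        return errno;
    return 0;
}

/* Hypothetical helper: the blocking fallback. Waits for writeback and
 * reports any error recorded for this fd. Returns 0 or an errno value. */
static int wait_and_check(int fd)
{
    if (fsync(fd) < 0)
        return errno;
    return 0;
}
```

What this cannot do is the non-blocking "are we done yet, and was there
an error?" query in between; that is the gap an F_WB_CHECK-style
command would fill.
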
--
Jeff Layton <jlayton@kernel.org>