From: "Zack Weinberg" <zack@owlfolio.org>
To: "Jeff Layton" <jlayton@kernel.org>,
"Trevor Gross" <tmgross@umich.edu>, "Jan Kara" <jack@suse.cz>,
"The 8472" <kernel@infinite-source.de>
Cc: "Rich Felker" <dalias@libc.org>,
"Alejandro Colomar" <alx@kernel.org>,
"Vincent Lefevre" <vincent@vinc17.net>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
"Christian Brauner" <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
"GNU libc development" <libc-alpha@sourceware.org>
Subject: Re: [RFC v1] man/man2/close.2: CAVEATS: Document divergence from POSIX.1-2024
Date: Wed, 28 Jan 2026 11:58:07 -0500
Message-ID: <037a7546-cbbf-4c00-bebd-57cee38785e1@app.fastmail.com>
In-Reply-To: <2d6276fca349357f56733268681424b0de5179f7.camel@kernel.org>

On Mon, Jan 26, 2026, at 7:49 PM, Jeff Layton wrote:
> On Mon, 2026-01-26 at 17:01 -0600, Trevor Gross wrote:
>> On Mon Jan 26, 2026 at 10:43 AM CST, Jeff Layton wrote:
>> > On Mon, 2026-01-26 at 16:56 +0100, Jan Kara wrote:
>> > > On Mon 26-01-26 14:53:12, The 8472 wrote:
>> > > > On 26/01/2026 13:15, Jan Kara wrote:
>> > > > > On Sun 25-01-26 10:37:01, Zack Weinberg wrote:
>> > > > > > On Sat, Jan 24, 2026, at 4:57 PM, The 8472 wrote:
...
>> > > > > > In particular, I really hope delayed errors *aren’t* ever reported
>> > > > > > when you close a file descriptor that *isn’t* the last reference
>> > > > > > to its open file description, because the thread-safe way to close
>> > > > > > stdout without losing write errors[2] depends on that not happening.
>> > > > >
>> > > > > So I've checked and in Linux ->flush callback for the file is called
>> > > > > whenever you close a file descriptor (regardless whether there are other
>> > > > > file descriptors pointing to the same file description) so it's up to
>> > > > > filesystem implementation what it decides to do and which error it will
>> > > > > return... Checking the implementations e.g. FUSE and NFS *will* return
>> > > > > delayed writeback errors on *first* descriptor close even if there are
>> > > > > other still open descriptors for the description AFAICS.
>> >
>> > ...and I really wish they _didn't_.
>> >
>> > Reporting a writeback error on close is not particularly useful. Most
>> > filesystems don't require you to write back all data on a close(). A
>> > successful close() on those just means that no error has happened yet.
>> >
>> > Any application that cares about writeback errors needs to fsync(),
>> > full stop.
>>
>> Is there a good middle ground solution here?
...
>> I was wondering if it could be worth a new fcntl that provides this kind
>> of "best effort" error checking behavior without having the strict
>> requirements of fsync. In effect, to report the errors that you might
>> currently get at close() before actually calling close() and losing the
>> fd.
...
> A new fcntl(..., F_CHECKERR, ...) command that does a
> file_check_and_advance_wb_err() on the fd and reports the result would
> be pretty straightforward.
>
> Would that be helpful for your use-case? This would be like a non-
> blocking fsync that just reports whether an error has occurred since
> the last F_CHECKERR or fsync().

I feel I need to point out that “should the kernel report errors on
close()” and “should the kernel add a new API to make life better for
programs that currently expect close() to report [some] errors” and
“should the Rust standard library propagate errors produced by close()
back up to the application” and “what should the close(2) manpage say
about errors” are four different conversation topics.

I am all in favor of moving toward a world where close() never fails
and there’s _something_ that reports write errors like fsync() without
also kicking your application off a performance cliff. But that’s not
the world we live in today, and this thread started as a conversation
about revising the close(2) manpage, and I’d kinda like to *finish*
revising the manpage in, like, the next couple weeks, not several
years from now :-) So I’d like to refocus on that topic.

Given what Jan Kara said earlier...

> Checking the implementations e.g. FUSE and NFS *will* return delayed
> writeback errors on *first* descriptor close even if there are other
> still open descriptors for the description AFAICS.
...
> fsync(2) must make sure data is persistently stored and return error if
> it was not. Thus as a VFS person I'd consider it a filesystem bug if an
> error preventing reading data later was not returned from fsync(2). OTOH
> that doesn't necessarily mean that later close doesn't return an error -
> e.g. FUSE does communicate with the server on close that can fail and
> error can be returned.
>
> With this in mind let me now try to answer your remaining questions:
>
>> >> - The OFD was opened with O_RDONLY
>
> If the filesystem supports atime, close can in principle report that atime
> update failed.
>
>> >> - The OFD was opened with O_RDWR but has never actually
>> >> been written to
>
> The same as above but with inode mtime updates.
>
>> >> - No data has been written to the OFD since the last call to
>> >> fsync() for that OFD
>
> No writeback errors should happen in this case. As I wrote above I'd
> consider this a filesystem bug.
>
>> >>
>> >> - No data has been written to the OFD since the last call to
>> >> fdatasync() for that OFD
>
> Errors can happen because some inode metadata (in practice probably only
> inode time stamps) may still need to be written out.
>
> So in the cases described above (except for fsync()) you may get delayed
> errors on close. But since in all those cases no data is lost, I don't
> think 99.9% of applications care at all...

... regrettably I think this does mean the close(2) manpage still needs
to tell people to watch out for errors, and should probably say that
errors _can_ happen even if the file wasn’t written to, but are much
less likely to be important in that case.
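
For illustration only, here is a minimal sketch of what "watch out for
errors" could look like in application code. The helper name
close_reporting() and its wrote_data parameter are invented for this
example. It relies on the documented Linux behavior that the descriptor
is released even when close() fails, so the call is never retried.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): close fd exactly once
   and report any error.  wrote_data says whether the caller wrote
   through this descriptor; it only changes the wording of the report. */
static int close_reporting(int fd, const char *name, int wrote_data)
{
    assert(fd >= 0);
    if (close(fd) == 0)
        return 0;

    int err = errno;
    /* Do not retry: on Linux the descriptor is already gone, and its
       number may have been reused by another thread in the meantime. */
    fprintf(stderr, "%s: close: %s%s\n", name, strerror(err),
            wrote_data ? "" : " (nothing was written, so no data was lost)");
    errno = err;  /* fprintf may have clobbered errno */
    return -1;
}
```

A caller that wrote data might use it as
"if (close_reporting(fd, path, 1)) status = 1;", and would still need
fsync() beforehand if it has to know the data reached storage.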

And my “how to close stdout in a thread-safe manner” sample code is
wrong, because I was wrong to think that the error reporting only
happened on the _final_ close, when the OFD is destroyed.

... What happens if the close is implicit in a dup2() operation? Here’s
that erroneous “how to close stdout” fragment, with comments
indicating what I thought could and could not fail at the time I wrote
it:

    // These allocate new fds, which can always fail, e.g. because
    // the program already has too many files open.
    int new_stdout = open("/dev/null", O_WRONLY);
    if (new_stdout == -1) perror_exit("/dev/null");
    int old_stdout = dup(1);
    if (old_stdout == -1) perror_exit("dup(1)");

    flockfile(stdout);
    if (fflush(stdout)) perror_exit("stdout: write error");
    dup2(new_stdout, 1); // cannot fail, atomically replaces fd 1
    funlockfile(stdout);

    // this close may receive delayed write errors from previous writes
    // to stdout
    if (close(old_stdout)) perror_exit("stdout: write error");

    // this close cannot fail, because it only drops an alternative
    // reference to the open file description now installed as fd 1
    close(new_stdout);

Note in particular that the first close _operation_ on fd 1 is a
consequence of dup2(new_stdout, 1). The dup2() manpage specifically
says “the close is performed silently (i.e. any errors during the
close are not reported by dup2())” but, if stdout points to a file on
an NFS mount, are those errors _lost_, or will they actually be
reported by the subsequent close(old_stdout)?
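
One way to sidestep that question, following Jeff Layton's point that
only fsync() reliably reports writeback errors, is to collect errors
before fd 1 is replaced. The sketch below is an illustration under
stated assumptions, not a vetted replacement: retire_stdout() is an
invented name, the cost of fsync() is assumed to be acceptable, and
EINVAL or EROFS from fsync() is assumed to mean only that fd 1 is a
pipe, socket, terminal or similar object that cannot be synced.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (the name is invented for this sketch): point fd 1
   at /dev/null, collecting write errors with fsync() beforehand rather
   than from any later close().  Returns 0, or -1 with errno set. */
static int retire_stdout(void)
{
    /* The code below assumes the stdio stream is backed by fd 1. */
    assert(fileno(stdout) == STDOUT_FILENO);

    int new_stdout = open("/dev/null", O_WRONLY);
    if (new_stdout == -1)
        return -1;

    int err = 0;
    flockfile(stdout);
    if (fflush(stdout) != 0) {
        err = errno;
    } else if (fsync(STDOUT_FILENO) != 0
               && errno != EINVAL && errno != EROFS) {
        /* EINVAL/EROFS: fd 1 does not support synchronization;
           assumed here not to be a write error. */
        err = errno;
    }
    /* Checked, unlike in the fragment above: dup2() can fail,
       for example with EINTR. */
    if (dup2(new_stdout, STDOUT_FILENO) == -1 && err == 0)
        err = errno;
    funlockfile(stdout);

    /* Result ignored on purpose: nothing was written through
       new_stdout, and earlier errors were already collected above. */
    close(new_stdout);

    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

This variant never duplicates the old fd 1, so it does not depend on
which close reports a delayed error. The price is a full fsync(), which
is exactly the performance cliff mentioned earlier.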

Incidentally, the dup2() manpage has a very similar example in its
NOTES section, also presuming that close only reports errors on the
_final_ close, not when it “merely” drops reference >=2 to an OFD.

(I’m starting to think we need dup3(old, new, O_SWAP_FDS). Or is that
already a thing somehow?)

zw