On Tue, Jan 20, 2026 at 09:35:43PM +0100, Alejandro Colomar wrote:
> Hi Rich, Zack,
>
> On Tue, Jan 20, 2026 at 12:46:59PM -0500, Rich Felker wrote:
> > On Tue, Jan 20, 2026 at 12:05:52PM -0500, Zack Weinberg wrote:
> > > > On Fri, May 23, 2025 at 02:10:57PM -0400, Zack Weinberg wrote:
>
> [...]
>
> > > Now, the abstract correct behavior is secondary to the fact that we
> > > know there are both systems where close should not be retried after
> > > EINTR (Linux) and systems where the fd is still open after EINTR
> > > (HP-UX). But it is my position that *portable code* should assume the
> > > Linux behavior, because that is the safest option. If you assume the
> > > HP-UX behavior on a machine that implements the Linux behavior, you
> > > might close some unrelated file out from under yourself (probably but
> > > not necessarily a different thread). If you assume the Linux behavior
> > > on a machine that implements the HP-UX behavior, you have leaked a
> > > file descriptor; the worst things that can do are much less severe.
> >
> > Unfortunately, regardless of what happens, code portable to old
> > systems needs to avoid getting in the situation to begin with. By
> > either not installing interrupting signal handlers or blocking EINTR
> > around close.
>
> [...]
>
> > > > While I agree with all of this, I think the tone is way too
> > > > proscriptive. The man pages are to document the behaviors, not tell
> > > > people how to program.
> > >
> > > I could be persuaded to tone it down a little but in this case I think
> > > the man page's job *is* to tell people how to program. We know lots of
> > > existing code has gotten the fine details of close() wrong and we are
> > > trying to document how to do it right.
> >
> > No, the job of the man pages absolutely is not "to tell people how to
> > program". It's to document behaviors. They are not a programming
> > tutorial. They are not polemic diatribes. They are unbiased statements
> > of facts.
> > Facts of what the standards say and what implementations do,
> > that equip programmers with the knowledge they need to make their own
> > informed decisions, rather than blindly following what someone who
> > thinks they know better told them to do.
>
> This reminds me a little bit of the realloc(p,0) fiasco of C89 and
> glibc.
>
> In most cases, I agree with you that manual pages are and should be
> aseptic, there are cases where I think the manual page needs to be
> tutorial. Especially when there's such a mess, we need to both explain
> all the possible behaviors (or at least mention them to some degree).

... and guide programmers about how to best use the API. I forgot to
finish the sentence.

>
> But for example, there's the case of realloc(p,0), where we have
> a fiasco that was pushed by a compoundment of wrong decisions by the
> C Committee, and prior to that from System V. We're a bit lucky that
> C17 accidentally broke it so badly that we now have it as UB, and that
> gives us the opportunity to fix it now (which BTW might also be the case
> for close(2)).
>
> In the case of realloc(3), I went and documented in the manual page that
> glibc is broken, and that ISO C is also broken.
>
> STANDARDS
>        malloc()
>        free()
>        calloc()
>        realloc()
>               C23, POSIX.1‐2024.
>
>        reallocarray()
>               POSIX.1‐2024.
>
>    realloc(p, 0)
>        The behavior of realloc(p, 0) in glibc doesn’t conform to
>        any of C99, C11, POSIX.1‐2001, POSIX.1‐2004, POSIX.1‐2008,
>        POSIX.1‐2013, POSIX.1‐2017, or POSIX.1‐2024. The C17
>        specification was changed to make it conforming, but that
>        specification made it impossible to write code that reli‐
>        ably determines if the input pointer is freed after real‐
>        loc(p, 0), and C23 changed it again to make this undefined
>        behavior, acknowledging that the C17 specification was
>        broad enough, so that undefined behavior wasn’t worse than
>        that.
>
>        reallocarray() suffers the same issues in glibc.
>
>        musl libc and the BSDs conform to all versions of ISO C
>        and POSIX.1.
>
>        gnulib provides the realloc‐posix module, which provides
>        wrappers realloc() and reallocarray() that conform to all
>        versions of ISO C and POSIX.1.
>
>        There’s a proposal to standardize the BSD behavior:
>        https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3621.txt.
>
> HISTORY
>        malloc()
>        free()
>        calloc()
>        realloc()
>               POSIX.1‐2001, C89.
>
>        reallocarray()
>               glibc 2.26. OpenBSD 5.6, FreeBSD 11.0.
>
>        malloc() and related functions rejected sizes greater than
>        PTRDIFF_MAX starting in glibc 2.30.
>
>        free() preserved errno starting in glibc 2.33.
>
>    realloc(p, 0)
>        C89 was ambiguous in its specification of realloc(p, 0).
>        C99 partially fixed this.
>
>        The original implementation in glibc would have been con‐
>        forming to C99. However, and ironically, trying to comply
>        with C99 before the standard was released, glibc changed
>        its behavior in glibc 2.1.1 into something that ended up
>        not conforming to the final C99 specification (but this is
>        debated, as the wording of the standard seems self‐contra‐
>        dicting).
>
> ...
>
> BUGS
>        Programmers would naturally expect by induction that
>        realloc(p, size) is consistent with free(p) and mal‐
>        loc(size), as that is the behavior in the general case.
>        This is not explicitly required by POSIX.1‐2024 or C11,
>        but all conforming implementations are consistent with
>        that.
>
>        The glibc implementation of realloc() is not consistent
>        with that, and as a consequence, it is dangerous to call
>        realloc(p, 0) in glibc.
>
>        A trivial workaround for glibc is calling it as
>            realloc(p, size?size:1).
>
>        The workaround for reallocarray() in glibc ——which shares
>        the same bug—— would be
>            reallocarray(p, n?n:1, size?size:1).
>
>
> Apart from documenting that glibc and ISO C are broken, we document how
> to best deal with it (see the last paragraph in BUGS).
> This is
> necessary because I fear that just by documenting the different
> behaviors, programmers would still not know what to do with that.
> Just take into account that even several members of the committee don't
> know how to deal with it.
>
> I'd be willing to have something similar for close(2).
>
>
> Have a lovely night!
> Alex
>
> P.S.: I have great news about realloc(p,0)! Microsoft is on-board with
> the change. They told me they like the proposal, and are willing to
> fix their realloc(3) implementation. They'll now conduct tests to make
> sure it doesn't break anything too badly, and will come back to me with
> any feedback they have from those tests.
>
> I'll put the standards proposal for realloc(3) on hold, waiting for
> Microsoft's feedback.
>
> > > > Aside: the reason EINTR *has to* be specified this way is that pthread
> > > > cancellation is aligned with EINTR. If EINTR were defined to have
> > > > closed the fd, then acting on cancellation during close would also
> > > > have closed the fd, but the cancellation handler would have no way to
> > > > distinguish this, leading to a situation where you're forced to either
> > > > leak fds or introduce a double-close vuln.
> > >
> > > The correct way to address this would be to make close() not be a
> > > cancellation point.
> >
> > This would also be a desirable change, one I would support if other
> > implementors are on-board with pushing for it.
> >
> > > > An outline of what I'd like to see instead:
> > > >
> > > > - Clear explanation of why double-close is a serious bug that must
> > > >   always be avoided. (I think we all agree on this.)
> > > >
> > > > - Statement that the historical Linux/glibc behavior and current POSIX
> > > >   requirement differ, without language that tries to paint the POSIX
> > > >   behavior as a HP-UX bug/quirk. Possibly citing real sources/history
> > > >   of the issue (Austin Group tracker items 529, 614; maybe others).
> > > >
> > > > - Consequence of just assuming the Linux behavior (fd leaks on
> > > >   conforming systems).
> > > >
> > > > - Consequences of assuming the POSIX behavior (double-close vulns on
> > > >   GNU/Linux, maybe others).
> > > >
> > > > - Survey of methods for avoiding the problem (ways to preclude EINTR,
> > > >   possibly ways to infer behavior, etc).
> > >
> > > This outline seems more or less reasonable to me but, if it's me
> > > writing the text, I _will_ characterize what POSIX currently says
> > > about EINTR returns from close() as a bug in POSIX. As far as I'm
> > > concerned, that is a fact, not polemic.
> > >
> > > I have found that arguing with you in particular, Rich, is generally
> > > not worth the effort. Therefore, unless you reply and _accept_ that
> > > the final version of the close manpage will say that POSIX is buggy,
> > > I am not going to write another version of this text, nor will I be
> > > drawn into further debate.
> >
> > I will not accept that because it's a gross violation of the
> > responsibility of document writing.
> >
> > Rich
>
> --
>
> --
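
As a sketch of the realloc(p, size?size:1) workaround quoted from the BUGS section earlier in this mail, the expression can be wrapped in a helper so that call sites cannot forget it. The name realloc_nonzero() is hypothetical; it only illustrates the quoted workaround and is not an existing API.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper (not a libc function).
 * Never passes a size of 0 to realloc(), so the call always behaves
 * like an ordinary resize: on success the old block is released and a
 * valid pointer is returned; on failure NULL is returned and the old
 * block is left untouched.
 */
static void *realloc_nonzero(void *p, size_t size)
{
    return realloc(p, size ? size : 1);
}
```

With this wrapper a NULL return always means failure, so the usual "keep the old pointer on NULL" idiom remains valid even when the requested size is 0.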