On Tue, Jan 20, 2026 at 09:35:43PM +0100, Alejandro Colomar wrote:
> Hi Rich, Zack,
> 
> On Tue, Jan 20, 2026 at 12:46:59PM -0500, Rich Felker wrote:
> > On Tue, Jan 20, 2026 at 12:05:52PM -0500, Zack Weinberg wrote:
> > > > On Fri, May 23, 2025 at 02:10:57PM -0400, Zack Weinberg wrote:
> 
> [...]
> 
> > > Now, the abstract correct behavior is secondary to the fact that we
> > > know there are both systems where close should not be retried after
> > > EINTR (Linux) and systems where the fd is still open after EINTR
> > > (HP-UX).  But it is my position that *portable code* should assume the
> > > Linux behavior, because that is the safest option.  If you assume the
> > > HP-UX behavior on a machine that implements the Linux behavior, you
> > > might close some unrelated file out from under yourself (probably but
> > > not necessarily a different thread).  If you assume the Linux behavior
> > > on a machine that implements the HP-UX behavior, you have leaked a
> > > file descriptor; the worst things that can do are much less severe.
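
To make the trade-off above concrete, here is a minimal sketch of a
wrapper that assumes the Linux behavior: it calls close() exactly once
and never retries after EINTR.  The name close_noretry() is
hypothetical, not an existing API, and treating EINTR as success is an
assumption of this sketch, not something any standard guarantees.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a libc function): close fd exactly once.
 * Assumes the Linux behavior, where the fd is already released when
 * close() fails with EINTR, so a retry could close an unrelated fd
 * that another thread has just opened.
 */
static int
close_noretry(int fd)
{
	int  saved_errno = errno;

	if (close(fd) == 0)
		return 0;

	if (errno == EINTR) {
		/* Linux: fd is gone.  HP-UX-like systems: fd is leaked. */
		errno = saved_errno;
		return 0;
	}

	return -1;  /* errno was set by close(), e.g. EBADF or EIO */
}
```

On a system with the HP-UX behavior, this leaks the descriptor when
EINTR happens, which is the less severe of the two failures described
above.
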
> > 
> > Unfortunately, regardless of what happens, code portable to old
> > systems needs to avoid getting into the situation to begin with, by
> > either not installing interrupting signal handlers or blocking
> > signals around close.
> 
> [...]
> 
> > > > While I agree with all of this, I think the tone is way too
> > > > prescriptive. The man pages are to document the behaviors, not tell
> > > > people how to program.
> > > 
> > > I could be persuaded to tone it down a little but in this case I think
> > > the man page's job *is* to tell people how to program.  We know lots of
> > > existing code has gotten the fine details of close() wrong and we are
> > > trying to document how to do it right.
> > 
> > No, the job of the man pages absolutely is not "to tell people how to
> > program". It's to document behaviors. They are not a programming
> > tutorial. They are not polemic diatribes. They are unbiased statements
> > of facts. Facts of what the standards say and what implementations do,
> > that equip programmers with the knowledge they need to make their own
> > informed decisions, rather than blindly following what someone who
> > thinks they know better told them to do.
> 
> This reminds me a little bit of the realloc(p,0) fiasco of C89 and
> glibc.
> 
> Although in most cases I agree with you that manual pages are and
> should be aseptic, there are cases where I think the manual page needs
> to be tutorial.  Especially when there's such a mess, we need to both
> explain all the possible behaviors (or at least mention them to some
> degree).

... and guide programmers about how to best use the API.

I forgot to finish the sentence.

> 
> But for example, there's the case of realloc(p,0), where we have
> a fiasco that was caused by an accumulation of wrong decisions by the
> C Committee and, before that, by System V.  We're a bit lucky that
> C17 accidentally broke it so badly that we now have it as UB, and that
> gives us the opportunity to fix it now (which BTW might also be the case
> for close(2)).
> 
> In the case of realloc(3), I went and documented in the manual page that
> glibc is broken, and that ISO C is also broken.
> 
> 	STANDARDS
> 	     malloc()
> 	     free()
> 	     calloc()
> 	     realloc()
> 		    C23, POSIX.1‐2024.
> 
> 	     reallocarray()
> 		    POSIX.1‐2024.
> 
> 	   realloc(p, 0)
> 	     The  behavior of realloc(p, 0) in glibc doesn’t conform to
> 	     any of C99, C11, POSIX.1‐2001, POSIX.1‐2004, POSIX.1‐2008,
> 	     POSIX.1‐2013,  POSIX.1‐2017,  or  POSIX.1‐2024.   The  C17
> 	     specification  was changed to make it conforming, but that
> 	     specification made it impossible to write code that  reli‐
> 	     ably  determines if the input pointer is freed after real‐
> 	     loc(p, 0), and C23 changed it again to make this undefined
> 	     behavior, acknowledging that  the  C17  specification  was
> 	     broad enough, so that undefined behavior wasn’t worse than
> 	     that.
> 
> 	     reallocarray() suffers the same issues in glibc.
> 
> 	     musl  libc  and  the BSDs conform to all versions of ISO C
> 	     and POSIX.1.
> 
> 	     gnulib provides the realloc‐posix module,  which  provides
> 	     wrappers  realloc() and reallocarray() that conform to all
> 	     versions of ISO C and POSIX.1.
> 
> 	     There’s a proposal to standardize the BSD behavior: https:
> 	     //www.open-std.org/jtc1/sc22/wg14/www/docs/n3621.txt.
> 
> 	HISTORY
> 	     malloc()
> 	     free()
> 	     calloc()
> 	     realloc()
> 		    POSIX.1‐2001, C89.
> 
> 	     reallocarray()
> 		    glibc 2.26.  OpenBSD 5.6, FreeBSD 11.0.
> 
> 	     malloc() and related functions rejected sizes greater than
> 	     PTRDIFF_MAX starting in glibc 2.30.
> 
> 	     free() preserved errno starting in glibc 2.33.
> 
> 	   realloc(p, 0)
> 	     C89 was ambiguous in its specification of  realloc(p,  0).
> 	     C99 partially fixed this.
> 
> 	     The  original implementation in glibc would have been con‐
> 	     forming to C99.  However, and ironically, trying to comply
> 	     with C99 before the standard was released,  glibc  changed
> 	     its  behavior  in glibc 2.1.1 into something that ended up
> 	     not conforming to the final C99 specification (but this is
> 	     debated, as the wording of the standard seems self‐contra‐
> 	     dicting).
> 
> 	...
> 
> 	BUGS
> 	     Programmers  would  naturally  expect  by  induction  that
> 	     realloc(p, size)  is  consistent  with  free(p)  and  mal‐
> 	     loc(size),  as  that  is the behavior in the general case.
> 	     This is not explicitly required by  POSIX.1‐2024  or  C11,
> 	     but  all  conforming  implementations  are consistent with
> 	     that.
> 
> 	     The glibc implementation of realloc()  is  not  consistent
> 	     with  that,  and as a consequence, it is dangerous to call
> 	     realloc(p, 0) in glibc.
> 
> 	     A  trivial  workaround  for  glibc  is   calling   it   as
> 	     realloc(p, size?size:1).
> 
> 	     The  workaround for reallocarray() in glibc ——which shares
> 	     the         same          bug——          would          be
> 	     reallocarray(p, n?n:1, size?size:1).
> 
> 
> Apart from documenting that glibc and ISO C are broken, we document how
> to best deal with it (see the last paragraph in BUGS).  This is
> necessary because I fear that just by documenting the different
> behaviors, programmers would still not know what to do with that.
> Just take into account that even several members of the committee don't
> know how to deal with it.
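
For reference, a minimal sketch of the workaround from that last BUGS
paragraph wrapped in a function.  The name realloc_nonzero() is
hypothetical; only the size?size:1 expression comes from the manual
page.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper (not a libc function): never pass size 0 to
 * realloc(), so the call behaves the same on glibc, musl, and the
 * BSDs.  On success the old pointer has been consumed; on failure
 * (NULL return) the old pointer is still valid and still owned by
 * the caller.
 */
static void *
realloc_nonzero(void *p, size_t size)
{
	return realloc(p, size ? size : 1);
}
```

With this, a NULL return always means failure, including when the
requested size is 0.
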
> 
> I'd be willing to have something similar for close(2).
> 
> 
> Have a lovely night!
> Alex
> 
> P.S.:  I have great news about realloc(p,0)!  Microsoft is on-board with
> the change.  They told me they like the proposal, and are willing to
> fix their realloc(3) implementation.  They'll now conduct tests to make
> sure it doesn't break anything too badly, and will come back to me with
> any feedback they have from those tests.
> 
> I'll put the standards proposal for realloc(3) on hold, waiting for
> Microsoft's feedback.
> 
> > > > Aside: the reason EINTR *has to* be specified this way is that pthread
> > > > cancellation is aligned with EINTR. If EINTR were defined to have
> > > > closed the fd, then acting on cancellation during close would also
> > > > have closed the fd, but the cancellation handler would have no way to
> > > > distinguish this, leading to a situation where you're forced to either
> > > > leak fds or introduce a double-close vuln.
> > > 
> > > The correct way to address this would be to make close() not be a
> > > cancellation point.
> > 
> > This would also be a desirable change, one I would support if other
> > implementors are on-board with pushing for it.
> > 
> > > > An outline of what I'd like to see instead:
> > > >
> > > > - Clear explanation of why double-close is a serious bug that must
> > > >   always be avoided. (I think we all agree on this.)
> > > >
> > > > - Statement that the historical Linux/glibc behavior and current POSIX
> > > >   requirement differ, without language that tries to paint the POSIX
> > > >   behavior as a HP-UX bug/quirk. Possibly citing real sources/history
> > > >   of the issue (Austin Group tracker items 529, 614; maybe others).
> > > >
> > > > - Consequence of just assuming the Linux behavior (fd leaks on
> > > >   conforming systems).
> > > >
> > > > - Consequences of assuming the POSIX behavior (double-close vulns on
> > > >   GNU/Linux, maybe others).
> > > >
> > > > - Survey of methods for avoiding the problem (ways to preclude EINTR,
> > > >   possibly ways to infer behavior, etc).
> > > 
> > > This outline seems more or less reasonable to me but, if it's me
> > > writing the text, I _will_ characterize what POSIX currently says
> > > about EINTR returns from close() as a bug in POSIX.  As far as I'm
> > > concerned, that is a fact, not polemic.
> > > 
> > > I have found that arguing with you in particular, Rich, is generally
> > > not worth the effort.  Therefore, unless you reply and _accept_ that
> > > the final version of the close manpage will say that POSIX is buggy,
> > > I am not going to write another version of this text, nor will I be
> > > drawn into further debate.
> > 
> > I will not accept that because it's a gross violation of the
> > responsibility of document writing.
> > 
> > Rich
> 
> -- 
> <https://www.alejandro-colomar.es>



-- 
<https://www.alejandro-colomar.es>