From: "Pali Rohár" <pali@kernel.org>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: linux-man@vger.kernel.org, Jan Kara <jack@suse.cz>,
Alejandro Colomar <alx@kernel.org>
Subject: Re: [PATCH v2] man/man3/readdir.3, man/man3type/stat.3type: Improve documentation about .d_ino and .st_ino
Date: Sat, 22 Nov 2025 01:53:29 +0100
Message-ID: <20251122005329.4vcs2cgxx44slutl@pali>
In-Reply-To: <20251121233957.7ul4pq6tdqu7ihcg@illithid>
On Friday 21 November 2025 17:39:57 G. Branden Robinson wrote:
> Hi Pali,
>
> At 2025-11-21T22:10:28+0100, Pali Rohár wrote:
> > On Wednesday 29 October 2025 20:34:13 Pali Rohár wrote:
> > > On Wednesday 29 October 2025 02:00:39 G. Branden Robinson wrote:
> > > > At 2025-10-29T00:53:06+0100, Pali Rohár wrote:
> > > > > If you are referring to the "bug", it is described in the
> > > > > informative RATIONALE section of readdir() in POSIX.1-2024.
> > > > > I mentioned it in my first email in the thread that Alejandro
> > > > > linked above.
> > > > >
> > > > > Here is a direct link to the POSIX spec, with the relevant part
> > > > > quoted below:
> > > > > https://pubs.opengroup.org/onlinepubs/9799919799/functions/readdir.html
> > > > >
> > > > > "When returning a directory entry for the root of a mounted file
> > > > > system, some historical implementations of readdir() returned
> > > > > the file serial number of the underlying mount point, rather
> > > > > than of the root of the mounted file system. This behavior is
> > > > > considered to be a bug, since the underlying file serial number
> > > > > has no significance to applications."
> [...]
> > > > > That part is in the "informative" section. I have not found
> > > > > anything in the normative sections that would disallow the
> > > > > "historical" behavior, so my understanding was that the
> > > > > "historical" behavior is conforming too.
> > > > >
> > > > > Please correct me if I'm wrong here, or if it should be
> > > > > understood in a different way.
> > > >
> > > > I can't speak for the Austin Group, but I don't read the text
> > > > quite the same way. I interpret it as saying that some historical
> > > > implementations of readdir() would _not_ "return a pointer to a
> > > > structure representing the directory entry at the current position
> > > > in the directory stream specified by the argument dirp, and
> > > > position the directory stream at the next entry." But I suspect
> > > > that's not what it _intends_ to say.
> > > >
> > > > Instead, these implementations "returned [sic] the file serial
> > > > number of the underlying mount point", which I interpret to mean
> > > > that they would return a pointer to a _dirent_ struct whose
> > > > _d_ino_ member was not the file serial number of the file (of
> > > > directory type) named by the _d_name_ member but a pointer to a
> > > > _dirent_ struct whose _d_ino_ member was the file serial number of
> > > > the underlying mount point.
> > > >
> > > > I think there are two conclusions we can reach here.
> > > >
> > > > 1. POSIX.1-2024 might be a little sloppy in the wording of its
> > > > "RATIONALE" for this interface. Presumably no historical
> > > > implementation's readdir() returned a _d_ino_ number directly.
> > > > (Though with all the exuberant integer/pointer punning that
> > > > used to go on in Unix, I wouldn't bet a lot of money that *no*
> > > > implementation ever did.) I'll wager a nickel that readdir()
> > > > has always, on every implementation, returned a pointer to a
> > > > _dirent_ struct, and it is only the value of the _d_ino_
> > > > member of the pointed-to struct that some implementations have
> > > > populated inconsistently when the entry is a directory that is
> > > > a mount point.
> > > >
> > > > If I'm right, this is an example of the common linguistic
> > > > error of synecdoche: confusing a container with (a subset of)
> > > > its contents.
> > > >
> > > > 2. The behavior POSIX describes as buggy is, in fact,
> > > > nonconforming.
> > >
> > > Only two? I can imagine somebody coming up with another conclusion.
> > > (just a joke)
>
> I wouldn't bet against your joke proving out in reality. ;-)
>
> > > Anyway, I think that it is important to document the existing Linux
> > > behavior first; whether it is POSIX-conforming or not is a second
> > > step. We can drop the information about POSIX conformity from the
> > > manpage until we figure out what the answer is.
> > >
> > > > > Also I have not read all those 4000 pages,
>
> Pity the person who has. :) And mastery of all 4000+ pages should not
> be necessary for an implementor to make sense of a reference entry for a
> single function, command, or data object.
>
> > > > > so maybe there is something hidden. It is quite hard to find
> > > > > information about this topic and that is why I think this should
> > > > > be documented in Linux manpages.
> > > >
> > > > I reckon someone should open a Mantis ticket with the Austin
> > > > Group's issue tracker to get clarity on what I characterized as
> > > > "sloppy" wording. Either it is and we can get the standard
> > > > clarified, or I'm wrong and an authority can point out how.
> > > > (Maybe both!)
> > > >
> > > > I'm subscribed to the austin-group-l reflector and will take an
> > > > action item to file this ticket. I'll try to do so within a week.
> > > > (I have a lot of old Unix books and would like to rummage around
> > > > in them for any documented land mines in this area.)
> [...]
> > > Thanks for taking that part. It would be really nice if the Austin
> > > Group could clarify the whole situation in a non-confusing way.
> > >
> > > Anyway, an inode number is always tied to a specific mounted
> > > filesystem. So when an application does something with inodes,
> > > it always needs the pair (dev_t, ino_t), unless all the inodes
> > > belong to the same filesystem device.
> > >
> > > readdir() and getdents() return just ino_t, and without knowing
> > > dev_t, applications cannot use the returned ino_t for anything
> > > useful. On "historical" implementations, dev_t can be fetched,
> > > for example, with one fstat(dir_fd, &st) call, as dev_t would be
> > > the same for all readdir() and getdents() entries. But on a
> > > non-"historical" implementation, it would be necessary to call
> > > stat() on every single entry. For example, a /mnt/ directory,
> > > which usually contains just mount points, will contain entries
> > > that each have inode number 2 (a common inode number for the
> > > root of a filesystem).
> > >
> > > I looked into archives and I have found that this problem was
> > > already discussed in the past. Here are some email threads from
> > > coreutils:
> > > https://lore.kernel.org/lkml/87y6oyhkz8.fsf@meyering.net/t/#u
> > > https://public-inbox.org/bug-coreutils/8763c5wcgn.fsf@meyering.net/t/#u
> > > https://public-inbox.org/bug-coreutils/87iqvi2j0q.fsf@rho.meyering.net/t/#u
> > > https://public-inbox.org/bug-coreutils/87verkborm.fsf@rho.meyering.net/
> > > https://public-inbox.org/bug-coreutils/022320061637.4398.43FDE4D7000110830000112E22007507440A050E040D0C079D0A@comcast.net/
> > >
> > > Maybe they could be a good reference for future discussion by austin
> > > group.
> > >
> > > Just a personal idea: if there were some xgetdents syscall
> > > (like statx for stat), it could return both inode numbers
> > > with dev_t, and the application could take whichever it wants.
> > >
> > > For example, NFSv4's readdir can return both inode numbers
> > > (depending on what the client asks for). The NFSv4.1 spec nicely
> > > documents this problem, with the UNIX background of mount point
> > > crossing:
> > > https://www.rfc-editor.org/rfc/rfc8881.html#section-5.8.2.23
> > >
> > > Pali
> >
> > Hello Branden, did you have time to file a ticket with the Austin
> > Group?
>
> Not yet--I procrastinated and got preoccupied by exciting new
> undefined or ambiguously interpretable behavior of GNU troff.
>
> https://www.mail-archive.com/groff@gnu.org/msg20834.html
>
> > If the ticket system is public, could you send a link for reference?
>
> It is public...
>
> https://austingroupbugs.net/view_all_bug_page.php
>
> ...but to file a ticket or comment on one, I believe you need to create
> an account. If you file a ticket yourself because you tire of waiting
> on me (which I'll understand), please let me know when you do so I can
> take this item off my to do list.
>
> Regards,
> Branden
You are experienced with the Austin Group, so I will leave this to you.
I'm fine with waiting here.
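
To illustrate the (dev_t, ino_t) point from the quoted discussion, here
is a minimal C sketch; it is not taken from any patch in this thread.
It calls fstat() once on the directory, then fstatat() on each entry,
and counts entries whose d_ino differs from st_ino or whose st_dev
differs from the directory's. The helper names same_file() and
count_ino_mismatches() are made up for this example; opendir(),
readdir(), dirfd(), fstat(), fstatat() and closedir() are the only
real interfaces used.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <fcntl.h>      /* AT_SYMLINK_NOFOLLOW */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: two stat results name the same file only if
 * both the device and the inode number match. */
int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: count entries of dir_path whose readdir() d_ino
 * differs from the st_ino reported by fstatat(), or whose st_dev
 * differs from the directory's own st_dev (for example mount points).
 * Returns -1 if the directory cannot be opened or examined. */
int count_ino_mismatches(const char *dir_path, int verbose)
{
    assert(dir_path != NULL);

    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    /* One fstat() gives the dev_t of the directory itself. */
    struct stat dir_st;
    if (fstat(dirfd(dir), &dir_st) != 0) {
        closedir(dir);
        return -1;
    }

    int mismatches = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* ".." may legitimately live on another filesystem. */
        if (strcmp(ent->d_name, "..") == 0)
            continue;

        /* Per-entry stat: the only portable way to learn the entry's
         * own (dev_t, ino_t) pair. */
        struct stat st;
        if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue;   /* entry vanished or is inaccessible; skip it */

        int other_dev = st.st_dev != dir_st.st_dev;
        int other_ino = (unsigned long long)st.st_ino
                        != (unsigned long long)ent->d_ino;
        if (other_dev || other_ino) {
            mismatches++;
            if (verbose)
                printf("%s: d_ino=%llu st_ino=%llu%s\n", ent->d_name,
                       (unsigned long long)ent->d_ino,
                       (unsigned long long)st.st_ino,
                       other_dev ? " (different st_dev)" : "");
        }
    }

    closedir(dir);
    return mismatches;
}
```

Calling count_ino_mismatches("/mnt", 1) on a system with filesystems
mounted under /mnt would be one way to see which d_ino values are
reported for mount points. The result depends on the kernel and the
filesystems involved, so no expected output is given here.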