From: Jim Meyering <jim@meyering.net>
To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: make getdents/readdir POSIX compliant wrt mount-point dirent.d_ino
Date: Tue, 01 Sep 2009 15:07:23 +0200
Message-ID: <87y6oyhkz8.fsf@meyering.net>
Currently, on all unix and linux-based systems the dirent.d_ino of a mount
point (as read from its parent directory) fails to match the stat-returned
st_ino value for that same entry. That is contrary to POSIX 2008.
I'm bringing this up today because I've just had to disable an
optimization in coreutils ls -i:
http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/17887
Normally, work-arounds in coreutils penalize non-linux, or old-linux
kernels, but this is the first that has penalized *all* unix/linux-based
systems. Ironically, the sole system that can still take advantage
of the optimization is Cygwin.
I'm hoping that Linux can catch up before too long.
------------------------
The POSIX readdir spec says this:
    The structure dirent defined in the <dirent.h> header describes a
    directory entry. The value of the structure's d_ino member shall be set
    to the file serial number of the file named by the d_name member.
The description for sys/stat.h makes the connection between
"file serial number" and the stat.st_ino member:
    The <sys/stat.h> header shall define the stat structure, which shall
    include at least the following members:
    ...
    ino_t st_ino    File serial number.
------------------------
The current linux/unix readdir behavior makes it so ls -i cannot perform
the optimization of printing only readdir-returned inode numbers, and
instead must incur the cost of actually stat'ing each entry in order to
be assured that it prints valid inode numbers.
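To make that cost concrete, here is a minimal sketch of the stat-per-entry
approach in shell. It is only an illustration, not what ls does internally:
the function name list_inodes_via_stat is made up for this example, and it
assumes GNU stat (for --format), which examines the entry itself rather than
following symlinks.

```shell
#!/bin/sh
# Hypothetical helper: print "inode name" for each entry of a directory,
# running stat once per entry instead of trusting readdir's d_ino.
list_inodes_via_stat()
{
  dir=$1
  # The three globs cover ordinary names, dot-files, and names like "..x".
  for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
    # Skip a glob that matched nothing (it is left as a literal pattern).
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    printf '%s %s\n' "$(env stat --format=%i "$entry")" \
                     "$(basename "$entry")"
  done
}

# List the directory given as the first argument, or "." by default.
list_inodes_via_stat "${1:-.}"
```

Each entry costs one extra stat invocation here, which is the same kind of
per-entry overhead that ls -i now has to pay in order to print inode numbers
that agree with st_ino.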
If you have GNU coreutils 6.0 or newer on your system (but not built from
today's git repository), you can demonstrate the mismatch with the
following shell code. If not, use the C program in
<http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/14020>.
#!/bin/sh
mount_points=$(df --local -P 2>&1 | sed -n 's,.*[0-9]% \(/.\),\1,p')

# Given e.g., /dev/shm, produce the list of GNU ls options that
# let us list just that entry using readdir data from its parent:
# ls -i -I '[^s]*' -I 's[^h]*' -I 'sh[^m]*' -I 'shm?*' -I '.?*' \
#   -I '?' -I '??' /dev
ls_ignore_options()
{
  name=$1
  opts="-I '.?*' -I '$name?*'"
  while :; do
    glob=$(echo "$name"|sed 's/\(.*\)\(.\)$/\1[^\2]*/')
    opts="$opts -I '$glob'"
    name=$(echo "$name"|sed 's/.$//')
    test -z "$name" && break
    glob=$(echo "$name"|sed 's/./?/g')
    opts="$opts -I '$glob'"
  done
  echo "$opts"
}

inode_via_readdir()
{
  mount_point=$1
  base=$(basename $mount_point)
  case $base in
    .*) skip_test_ 'mount point component starts with "."' ;;
    *[*?]*) skip_test_ 'mount point component contains "?" or "*"' ;;
  esac
  opts=$(ls_ignore_options "$base")
  parent_dir=$(dirname $mount_point)
  eval "ls -i $opts $parent_dir" | sed 's/ .*//'
}

first_failure=1
for dir in $mount_points; do
  readdir_inode=$(inode_via_readdir $dir)
  stat_inode=$(env stat --format=%i $dir)
  if test "$readdir_inode" != "$stat_inode"; then
    test $first_failure = 1 \
      && printf '%8s %8s %-20s\n' st_ino d_ino mount-point
    printf '%8d %8d %-20s\n' $stat_inode $readdir_inode $dir
    first_failure=0
  fi
done
#--------------------------------------------------------------
For example, here's the result of running it on one
of my systems:
  st_ino    d_ino mount-point
    3508    36850 /lib/init/rw
     824   376097 /dev
    6237     3532 /dev/shm
       2     8177 /boot
       2    12265 /full
       2   147197 /h
       2   298428 /f
       2   310689 /usr
       2    73585 /var
    6992   253457 /t
       2   327041 /b
       2     4113 /d
       2   302521 /x
       2    53378 /media/sdd1
The d_ino number is what ls -i $parent_dir would print,
before today's fix, while the st_ino value is the correct inode
number for that directory.