From: buhr@stat.wisc.edu (Kevin Buhr)
To: trond.myklebust@fys.uio.no
Cc: linux-kernel@vger.kernel.org
Subject: negative NFS cookies: bad C library or bad kernel?
Date: 02 Dec 2000 22:49:16 -0600 [thread overview]
Message-ID: <vbaaeae2joz.fsf@mozart.stat.wisc.edu> (raw)
Trond:
Fiddling with the Crytographic File System the other day, I managed to
tickle a mysterious bug. When some directories grew large enough,
suddenly a chunk of files would half "disappear". "find" would list
them fine, but "ls" and "echo *" wouldn't.
After a bit of troubleshooting, I discovered that the CFS daemon
(which presents itself to the system as an NFS daemon) was using
small, big-endian cookies in its directory entries. These became
large positive and negative little-endian "d_off" values in the dirent
structs.
The C library (in glibc-2.1.3/sysdeps/unix/sysv/linux/getdents.c) does
some fancy, double-buffering footwork in getdents(2) to try to guess
how many bytes of kernel_dirents it needs to read into a temporary
buffer to fill the user-supplied buffer with user dirents (which have
an extra "d_type" field). When its heuristic screws up, it does an
lseek on the directory so the next getdents(2) will start with the
right directory entry:
if ((char *) dp + new_reclen > buf + nbytes)
{
/* Our heuristic failed. We read too many entries. Reset
the stream. `last_offset' contains the last known
position. If it is zero this is the first record we are
reading. In this case do a relative search. */
if (last_offset == 0)
__lseek (fd, -retval, SEEK_CUR);
else
__lseek (fd, last_offset, SEEK_SET);
break;
}
In my case, for "ls" and "bash", the "last_offset" happened to be a
negative little-endian cookie. The kernel's "default_lseek" returned
EINVAL, the error was ignored, and "ls" and "bash" were blissfully
unaware that a bunch of directory entries had been read into the
temporary buffer and forever lost. Since "find" used a different
buffer size, it happened to have a positive little-endian cookie for
"last_offset" and didn't exhibit the problem.
A fix was easy---after modifying CFS to convert its cookies to small,
little-endian numbers, everything worked fine.
However, who's to blame here? It can't be CFS---any four-byte cookie
should be valid, right?
Is the kernel NFS client code to blame? If it's going to be using
cookies as offsets, shouldn't we have an nfs_lseek that special-cases
directory lseeks (at least those using SEEK_SET) to take negative
offsets, so utilities and libraries don't need to be bigfile-aware
just to read directories? And what in the world can we do about bogus
code like the:
__lseek (fd, -retval, SEEK_CUR);
that appears above? Shouldn't any non-SEEK_SET lseek on an NFS
directory fail with an error?
Any thoughts?
Thanks.
Kevin <buhr@stat.wisc.edu>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
next reply other threads:[~2000-12-03 5:20 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2000-12-03 4:49 Kevin Buhr [this message]
2000-12-03 13:30 ` negative NFS cookies: bad C library or bad kernel? Trond Myklebust
2000-12-04 5:28 ` Kevin Buhr
2000-12-04 23:40 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=vbaaeae2joz.fsf@mozart.stat.wisc.edu \
--to=buhr@stat.wisc.edu \
--cc=linux-kernel@vger.kernel.org \
--cc=trond.myklebust@fys.uio.no \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox