* ls hangs on NFS share from Apple Xserve
From: bnies @ 2004-09-08 8:42 UTC
To: nfs

Hi,

A couple of months ago I reported a bug that occurs with Linux NFS client
and Mac OS X NFS server. Here's the bug report:

http://sourceforge.net/tracker/index.php?func=detail&aid=964204&group_id=14&atid=100014

And here are other docs related to this problem:

http://discussions.info.apple.com/webx?14@5.gRhIaxY5v2O.0@.689495bc
http://discussions.info.apple.com/webx?13@140.HfAHaE8Ew3J.148064@.6897b10d/2
http://groups.google.ch/groups?hl=de&lr=&ie=UTF-8&selm=9c87e8e6.0405280036.16f3c991%40posting.google.com
http://algesten.blogspot.com/2004/07/mac-os-x-server-nfs-unique-cookie-bug.html
http://sources.redhat.com/bugzilla/show_bug.cgi?id=353

Martin Algesten reported that this might be caused by non-unique NFS cookies
in the same READDIRPLUS reply from the Apple NFS server but in our environment
with MacOS X 10.3.5 and SuSE Linux 9.0 (Kernel 2.4.21-243-smp4G, glibc-2.3.2-88)
I cannot confirm this. The NFS cookies are unique in the same READDIRPLUS
reply but not unique during the same NFS session.

We don't have problems accessing the MacOS X 10.3.5 NFS server from our
Solaris clients and also don't have problems with Solaris, Linux and NetApp
NFS servers from Linux clients. I assume this is a bug with the Linux Kernel
NFS implementation or the glibc library which provides the getdents64 call
that loops. Apple says it's a Linux bug.

According to SuSE the patch linux-2.4.21-15-seekdir.dif from
http://client.linux-nfs.org/Linux-2.4.x/2.4.21/ is already included in their
kernel.

Could someone who is familiar with NFS and the Linux NFS or glibc code
analyze and solve this problem? I can provide network traffic and strace logs
if you don't have the equipment to reproduce this problem.

Thanks in advance.
Regards,
Bernd

-------------------------------------------------------
This SF.Net email is sponsored by BEA Weblogic Workshop
FREE Java Enterprise J2EE developer tools!
Get your free copy of BEA WebLogic Workshop 8.1 today.
http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: ls hangs on NFS share from Apple Xserve
From: Trond Myklebust @ 2004-09-08 16:52 UTC
To: bnies; +Cc: nfs

On Wed, 08/09/2004 at 04:42, bnies@bluewin.ch wrote:
> Martin Algesten reported that this might be caused by non-unique NFS cookies
> in the same READDIRPLUS reply from the Apple NFS server but in our environment
> with MacOS X 10.3.5 and SuSE Linux 9.0 (Kernel 2.4.21-243-smp4G, glibc-2.3.2-88)
> I cannot confirm this. The NFS cookies are unique in the same READDIRPLUS
> reply but not unique during the same NFS session.
>

What does this mean? That 2 different READDIRPLUS entries for the same
directory may return the same cookie? That would be a server bug...

Cheers,
  Trond
* Re: ls hangs on NFS share from Apple Xserve
From: Bernd Nies @ 2004-09-08 18:36 UTC
To: nfs

Hi Trond,

>>Martin Algesten reported that this might be caused by non-unique NFS cookies
>>in the same READDIRPLUS reply from the Apple NFS server but in our environment
>>with MacOS X 10.3.5 and SuSE Linux 9.0 (Kernel 2.4.21-243-smp4G, glibc-2.3.2-88)
>>I cannot confirm this. The NFS cookies are unique in the same READDIRPLUS
>>reply but not unique during the same NFS session.
>>
>
> What does this mean? That 2 different READDIRPLUS entries for the same
> directory may return the same cookie? That would be a server bug...

The problem reported by Martin Algesten was: His MacOS X 10.3.4 NFS server
sent a duplicate cookie within the _same_ READDIRPLUS reply. See:

http://algesten.blogspot.com/index.html#109103599929132337

I can't observe that on our MacOS X 10.3.5 server. I only see that while
traversing an NFS share with ls -lR on a Linux client, the same cookie ID
appears in different READDIRPLUS replies of different directories. Because
our Solaris NFS servers do the same and the Linux NFS client works fine with
them, I guess this is allowed.

So I think the endlessly looping ls -lR from a Linux NFS client on a Mac OS X
10.3.5 NFS share that I reported is not related to the bug reported by Martin
Algesten.

Here is the ethereal snoop of such an incident:

http://www.nies.ch/download/apple-linux-nfs-loop.tar.gz

I did a ls -lR on a MacOS X NFS share and in a certain directory it hung.
A strace on the ls process shows that it loops over these system calls:

  getdents64(5, /* 3 entries */, 4096) = 128
  lstat64("/share/dir/file1.txt", {st_mode=S_IFREG|0444, st_size=8550, ...}) = 0
  getxattr("/share/dir/file1.txt", "system.posix_acl_access", (nil), 0) = -1 EOPNOTSUPP (Operation not supported)
  lstat64("/share/dir/file2.txt", {st_mode=S_IFREG|0444, st_size=6570, ...}) = 0
  getxattr("/share/dir/file2.txt", "system.posix_acl_access", (nil), 0) = -1 EOPNOTSUPP (Operation not supported)
  lstat64("/share/dir/file3.txt", {st_mode=S_IFREG|0444, st_size=23411, ...}) = 0
  getxattr("/share/dir/file3.txt", "system.posix_acl_access", (nil), 0) = -1 EOPNOTSUPP (Operation not supported)

This does not only happen with ls. It happens with all commands that want to
read a certain directory (e.g. tar). A quick workaround is to create a new
file in the directory where it loops.

Regards,
Bernd
* Re: ls hangs on NFS share from Apple Xserve
From: Trond Myklebust @ 2004-09-08 19:53 UTC
To: Bernd Nies; +Cc: nfs

As far as I can see from your traces, 2 consecutive READDIRs on the same
directory are giving totally different results...

In your TCP file, the first time the Linux client reads through the
directory which begins with "Autorun.inf", the entry "install.sh" (for
instance) has cookie 292. The second time round, it has cookie 252.

Our client will not work with such a server.

Cheers,
  Trond
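The comparison described in the message above (same entry, different cookie on
a second pass) can be done mechanically once the names and cookies have been
extracted from a capture. The following is only a minimal sketch: the function
name `unstable_cookies` and the dictionary input format are assumptions made
for illustration, and the cookie shown for "Autorun.inf" is invented. Only the
292/252 pair for "install.sh" comes from the thread.

```python
def unstable_cookies(first_pass, second_pass):
    """Return {name: (cookie_first, cookie_second)} for every entry that
    appears in both passes over a directory but with a different cookie."""
    changed = {}
    for name, cookie in first_pass.items():
        other = second_pass.get(name)
        if other is not None and other != cookie:
            changed[name] = (cookie, other)
    return changed


# Hypothetical data: only install.sh's cookies (292, then 252) are taken
# from the thread; the value for Autorun.inf is made up.
first = {"Autorun.inf": 12, "install.sh": 292}
second = {"Autorun.inf": 12, "install.sh": 252}

for name, (old, new) in unstable_cookies(first, second).items():
    print(f"{name}: cookie {old} on first pass, {new} on second pass")
```

An empty result would not prove the server is well behaved; it would only say
that these two passes agreed with each other.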
* Re: ls hangs on NFS share from Apple Xserve
From: Bernd Nies @ 2004-09-08 20:07 UTC
To: nfs

Trond Myklebust wrote:
> As far as I can see from your traces, 2 consecutive READDIRs on the same
> directory is giving totally different results...
>
> In your TCP file, the first time the Linux client reads through the
> directory which begins with "Autorun.inf", the entry "install.sh" (for
> instance) has cookie 292. The second time round, it has cookie 252.
>
> Our client will not work with such a server.

Are you sure it reads the same directory twice? A ls -lR shouldn't do
this and I ran it only once per traffic capture. Between the UDP and NFS
traffics I unmounted the directory. The share contains directories whose
contents are sometimes similar.

The server is working fine. The client loops on only about one in a thousand
directories, and if client A loops over one directory, client B does not.

Regards,
Bernd
* Re: ls hangs on NFS share from Apple Xserve
From: Trond Myklebust @ 2004-09-08 20:46 UTC
To: Bernd Nies; +Cc: nfs

On Wed, 08/09/2004 at 16:07, Bernd Nies wrote:
> Are you sure it reads the same directory twice? A ls -lR shouldn't do

Yes. Check the tcpdumps.

> this and I ran it only once per traffic capture. Between the UDP and NFS
> traffics I unmounted the directory. The share contains directories which
> contents are sometimes similar.

Huh? Of course the client may end up reading in the directory multiple
times.

Unless you have allocated a large buffer for it, readdir() will end up
calling multiple times down into the kernel, so the read of an entire
directory is NOT going to be atomic.
If the mtime on the directory has changed in the meantime, the cached
version may be flushed out, and the directory read in again. If the
server has screwed up the cookies, then that will confuse the client.

Cheers,
  Trond
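To illustrate the point above about non-atomic directory reads, here is a toy
model, not the kernel's code. It assumes a directory is a list of
(name, cookie) pairs, that a read returns a fixed number of entries following
a given cookie, and that the server renumbers its entries between the two
reads. All names and cookie values are invented, and `read_chunk` is a
hypothetical helper.

```python
def read_chunk(listing, cookie, count):
    """Toy READDIR: return up to `count` (name, cookie) pairs that follow
    the entry carrying `cookie`. A cookie of 0 means start of directory."""
    start = 0
    if cookie != 0:
        cookies = [c for _, c in listing]
        start = cookies.index(cookie) + 1
    return listing[start:start + count]


# Made-up directory as numbered by the server during the first read.
first_numbering = [("a", 10), ("b", 20), ("c", 30), ("d", 40)]
# Same directory after the client's cached copy was flushed: a (hypothetical)
# misbehaving server now hands out different cookies for "a" and "b".
second_numbering = [("a", 20), ("b", 10), ("c", 30), ("d", 40)]

seen = []
chunk = read_chunk(first_numbering, 0, 2)        # first call into the "kernel"
seen += [name for name, _ in chunk]
cookie = chunk[-1][1]                            # remember where we stopped

# The cache is flushed here; the client resumes with the cookie it remembered.
chunk = read_chunk(second_numbering, cookie, 2)
seen += [name for name, _ in chunk]
print(seen)
```

In this model the remembered cookie now points at a different entry, so "b"
is returned a second time. With other renumberings the client could just as
well skip entries.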
* Re: ls hangs on NFS share from Apple Xserve
From: Bernd Nies @ 2004-09-08 21:51 UTC
To: nfs

> Unless you have allocated a large buffer for it, readdir() will end up
> calling multiple times down into the kernel, so the read of an entire
> directory is NOT going to be atomic.
> If the mtime on the directory has changed in the meantime, the cached
> version may be flushed out, and the directory read in again. If the
> server has screwed up the cookies, then that will confuse the client.

OK, right. I tracked down one incident captured here:

http://www.nies.ch/download/apple-nfs-loop-1dir.dump.gz

A READDIRPLUS reports a directory entry "." with cookie 276 (Frame 157)
and the next READDIRPLUS reports a file "package-use.html" with the same
cookie 276 (Frame 161). This is WRONG by the server, right?

But why is it working then without problems on a Solaris NFS client?

Well, it seems that I have to call Apple support and listen to an hour of
iTunes again ...

Thanks a lot.

Regards,
Bernd
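How a duplicated cookie like the one described above could turn into an
endless loop can be sketched with a toy client that resumes purely by cookie.
This is an illustration under stated assumptions, not the Linux client's
algorithm: the helper names, the chunk size of two, the rule that the first
matching cookie wins, the entry order, and every name and cookie other than
"." and "package-use.html" sharing 276 are made up.

```python
def read_chunk(listing, cookie, count):
    """Toy READDIR: return up to `count` (name, cookie) pairs following the
    FIRST entry that carries `cookie`. A cookie of 0 means start."""
    start = 0
    if cookie != 0:
        start = [c for _, c in listing].index(cookie) + 1
    return listing[start:start + count]


def list_directory(listing, count=2, max_calls=8):
    """Read a whole directory in chunks, resuming by cookie only.

    Returns (names, finished). `max_calls` is a guard so the demonstration
    terminates; a real client without such a guard would keep going.
    """
    names, cookie = [], 0
    for _ in range(max_calls):
        chunk = read_chunk(listing, cookie, count)
        if not chunk:
            return names, True
        names.extend(name for name, _ in chunk)
        cookie = chunk[-1][1]
    return names, False


# Unique cookies: the listing completes.
healthy = [("..", 272), (".", 276), ("index.html", 280),
           ("package-use.html", 284), ("z.html", 288)]
# "." and "package-use.html" share cookie 276, as in the capture above.
broken = [("..", 272), (".", 276), ("index.html", 280),
          ("package-use.html", 276), ("z.html", 284)]

print(list_directory(healthy))
print(list_directory(broken)[1])
```

With the `broken` listing, each chunk ends on cookie 276, which the lookup
resolves back to ".", so the same two entries are returned until the guard
trips. That is at least consistent with the repeating getdents64 pattern in
the strace earlier in the thread.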
* Re: ls hangs on NFS share from Apple Xserve
From: Trond Myklebust @ 2004-09-08 22:30 UTC
To: Bernd Nies; +Cc: nfs

On Wed, 08/09/2004 at 17:51, Bernd Nies wrote:
> OK, right. I tracked down one incident captured here:
>
> http://www.nies.ch/download/apple-nfs-loop-1dir.dump.gz
>
> A READDIRPLUS reports a directory entry "." with cookie 276 (Frame 157)
> and the next READDIRPLUS reports a file "package-use.html" with same
> cookie 276 (Frame 161). This is WRONG by the server, right?
>
> But why is it working then without problems on a Solaris NFS client?

They probably do not rely exclusively on the cookie to track where they
are in the READDIR stream: for instance it is possible to cache the
filename too, and to use that as a backup solution if the server is
being cranky.
We could possibly add this sort of information to Linux too.

Note though, this scheme fails for the case where the user wants to use
telldir()/seekdir(), so if you have an older glibc that uses the
horrible heuristic algorithm, you will still see odd behaviour.
More obviously: it also breaks down if the filename you cached was one
of the entries that got deleted...

Cheers,
  Trond
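The backup scheme suggested above (remember the filename alongside the cookie)
might look roughly like the sketch below. It is speculative: nothing in the
thread says how Solaris actually does it, `resume_index` is a hypothetical
helper, and all names and cookies except the shared 276 are invented.

```python
def resume_index(listing, last_cookie, last_name):
    """Return the index of the next entry to read, or None if the position
    cannot be recovered.

    `listing` is a list of (name, cookie) pairs. An entry matching both the
    cached cookie and the cached filename is preferred; if none exists, the
    cookie is treated as unreliable and the filename alone is used.
    """
    for i, (name, cookie) in enumerate(listing):
        if cookie == last_cookie and name == last_name:
            return i + 1
    for i, (name, _) in enumerate(listing):
        if name == last_name:
            return i + 1
    return None  # cached filename is gone, e.g. the entry was deleted


listing = [("..", 272), (".", 276), ("index.html", 280),
           ("package-use.html", 276), ("z.html", 284)]

# Cookie 276 is ambiguous, but the cached name picks the right entry.
print(resume_index(listing, 276, "package-use.html"))
# The cookie is unknown after a renumbering; the name still finds the spot.
print(resume_index(listing, 999, "index.html"))
# The cached entry was deleted: the limitation noted in the message above.
print(resume_index(listing, 276, "gone.html"))
```

The last call shows the failure mode: once the cached filename has
disappeared from the directory, there is nothing left to fall back on.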
* Re: ls hangs on NFS share from Apple Xserve
From: bnies @ 2004-09-09 8:08 UTC
To: nfs

Hi Trond,

>They probably do not rely exclusively on the cookie to track where they
>are in the READDIR stream: for instance it is possible to cache the
>filename too, and to use that as a backup solution if the server is
>being cranky.
>We could possibly add this sort of information to Linux too.

That's probably a good idea. At least the network file system connecting
the various flavours of Unix should be consistent and reliable. Maybe this
bug also exists in other BSD derivatives.

Sun developed NFS. Is the Solaris client and server code available?

I filed a bug in OpenDarwin's Bugzilla:

http://www.opendarwin.org/bugzilla/show_bug.cgi?id=2201

It seems that the Mac OS X NFS implementation is kind of buggy, because I
found other problems with it:

http://discussions.info.apple.com/webx?14@246.XlfoaVdSw0s.2@.6899e34d
http://discussions.info.apple.com/webx?14@246.XlfoaVdSw0s.2@.6899e2eb
http://discussions.info.apple.com/webx?14@246.XlfoaVdSw0s.0@.6899e444

Regards,
Bernd
* Re: ls hangs on NFS share from Apple Xserve
From: Trond Myklebust @ 2004-09-09 15:00 UTC
To: bnies; +Cc: nfs

On Thu, 09/09/2004 at 04:08, bnies@bluewin.ch wrote:
> Sun developed NFS. Is the Solaris client and server code available?

You'll have to wait until they GPL Solaris 8-)

Cheers,
  Trond