* lockd / statd fun (sorry)
From: Gavin Hamill @ 2004-04-23 10:37 UTC
To: nfs

Hullo,

I'm afraid I have to dredge up old ground here...

I'm running about 30 diskless workstations PXE-booting to an NFS-root and NFS-homedir with NIS logins, and the workstations are regularly getting the familiar

Apr 23 11:17:18 10.0.0.13 kernel: nsm_mon_unmon: rpc failed, status=-13
Apr 23 11:17:18 10.0.0.13 kernel: lockd: cannot monitor 10.0.0.253
Apr 23 11:17:18 10.0.0.13 kernel: lockd: failed to monitor 10.0.0.253
Apr 23 11:17:18 10.0.0.13 kernel: nsm_mon_unmon: rpc failed, status=-13
Apr 23 11:17:18 10.0.0.13 kernel: lockd: cannot monitor 10.0.0.253

kernel messages.

Now, I've read as much as I can on the topic, and I have made sure that both statd and lockd are running on both the client and the server. The machines have generally worked well, but OpenOffice.org's setup seems to require locking, and the messages have increasingly irritated me, so I need to turn to the oracles on such matters :)

I'm using kernel-mode NFS server on both the physical server and the workstations... On the server, I see

10714 ?        S      0:00 /sbin/rpc.statd
10733 ?        SW     0:00 [lockd]
10734 ?        SW     0:00  \_ [rpciod]

and on the clients I see the same (from memory - I can't access them via SSH from here). Clients are using a 2.4.22 kernel and the server is on 2.4.24.

I checked the manpage for statd, and was interested by the /var/lib/nfs/sm directory - on the server, this directory is completely empty, and /var/lib/nfs/state contains only 4 bytes: 001d 0000 - does this sound normal?

/etc/hosts.allow was always blank, but today I tried the advice I found in another mailing list posting, to add

statd: 10.0.0.
I then restarted nfs-common (rpc.statd) and nfs-kernel-server (rpc.nfsd and rpc.mountd) with logs thusly:

Apr 23 11:06:26 10.0.0.9 kernel: nsm_mon_unmon: rpc failed, status=-13
Apr 23 11:06:26 10.0.0.9 kernel: lockd: cannot monitor 10.0.0.253
Apr 23 11:06:26 10.0.0.9 kernel: lockd: failed to monitor 10.0.0.253
Apr 23 11:06:40 fon kernel: nfsd: last server has exited
Apr 23 11:06:40 fon kernel: nfsd: unexporting all filesystems
Apr 23 11:06:42 10.0.0.24 kernel: nfs: server 10.0.0.253 not responding, still trying
Apr 23 11:06:43 10.0.0.19 kernel: nfs: server 10.0.0.253 not responding, still trying
Apr 23 11:06:43 10.0.0.9 kernel: nfs: server 10.0.0.253 not responding, still
Apr 23 11:06:45 10.0.0.19 kernel: nfs: server 10.0.0.253 OK
Apr 23 11:06:56 10.0.0.19 kernel: nsm_mon_unmon: rpc failed, status=-13
Apr 23 11:06:56 10.0.0.19 kernel: lockd: cannot monitor 10.0.0.253
Apr 23 11:06:56 10.0.0.19 kernel: lockd: failed to monitor 10.0.0.253

i.e. nothing's changed :(

Should the /var/lib/nfs be in use for the clients, too? My boot-sequence is based on KNOPPIX and uses an initrd to symlink much of the filesystem to a ramdisk, so I'm a bit concerned I might have messed up the permissions here.

Any advice warmly welcomed.

Cheers,
Gavin.

P.S. Olaf Kirch's "statd simplified" looks very interesting! :))

-------------------------------------------------------
This SF.net email is sponsored by: The Robotic Monkeys at ThinkGeek
For a limited time only, get FREE Ground shipping on all orders of $35 or more. Hurry up and shop folks, this offer expires April 30th!
http://www.thinkgeek.com/freeshipping/?cpg=12297
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
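
On the question of whether a 4-byte /var/lib/nfs/state is normal: a minimal sketch, assuming that the "001d 0000" quoted above is hexdump-style output (16-bit words on a little-endian host) and that statd stores its state counter as a native 32-bit integer. The helper name decode_state is hypothetical and not part of nfs-utils.

```python
import struct

def decode_state(raw: bytes) -> int:
    """Decode a 4-byte state file as a little-endian 32-bit integer (assumed layout)."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (value,) = struct.unpack("<i", raw)
    return value

# Assumption: "001d 0000" printed as little-endian 16-bit words is this byte sequence.
sample = bytes([0x1D, 0x00, 0x00, 0x00])
print(decode_state(sample))  # prints 29
```

Under those assumptions the file holds the number 29, which would fit a small counter that changes when statd restarts. To try it on a real machine, something like decode_state(open("/var/lib/nfs/state", "rb").read()) should do, run as a user allowed to read the file.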
* Re: lockd / statd fun (sorry)
From: Bernd Schubert @ 2004-04-23 11:32 UTC
To: nfs
Cc: Gavin Hamill, Chip Salzenberg

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Gavin,

> I'm afraid I have to dredge up old ground here...
>
> I'm running about 30 diskless workstations PXE-booting to an NFS-root and
> NFS-homedir with NIS logins, and the workstations are regularly getting the
> familiar
>
> Apr 23 11:17:18 10.0.0.13 kernel: nsm_mon_unmon: rpc failed, status=-13
> Apr 23 11:17:18 10.0.0.13 kernel: lockd: cannot monitor 10.0.0.253
> Apr 23 11:17:18 10.0.0.13 kernel: lockd: failed to monitor 10.0.0.253
> Apr 23 11:17:18 10.0.0.13 kernel: nsm_mon_unmon: rpc failed, status=-13
> Apr 23 11:17:18 10.0.0.13 kernel: lockd: cannot monitor 10.0.0.253
>

We also use a diskless environment and also see that problem. However, as I posted a long time ago to this list, it only happens if the nfs-utils are compiled with the '--secure-statd' configure option.

So every time we perform a general Debian (testing) update, and the nfs-utils are updated along with it, we see that problem. Every time this happens, I fetch the Debian nfs-utils source and recompile them without the '--secure-statd' option.

When I posted that workaround, Trond told me that it's not good, since faked packets can be sent to the statd daemon that way. However, for us it's better to have an unsecured statd running than none at all. Also, the rpc.statd manpage says that statd can be protected by the 'tcp_wrapper library'.

Cheers,
Bernd

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAiP6zC8BUnAF+ydYRAv6qAJ9XovApFhzLv1E7/EBUxUbstj6UbQCeKW2a
0BIpyiHKUmqYIGnvxIO0JmQ=
=DGzb
-----END PGP SIGNATURE-----
* Re: lockd / statd fun (sorry)
From: Gavin Hamill @ 2004-04-23 11:48 UTC
To: nfs

On Friday 23 April 2004 12:32, Bernd Schubert wrote:

> Every time this happens, I fetch the debian nfs-utils source and recompile
> them without the '--secure-statd' option.

OK, I did see your message in the archives, but wasn't sure that it was relevant here (I'm getting error -13 when you got -5) but I'm game for a laugh :)

The workstations are running unstable, whilst the server is on woody. Should I only need to update the nfs-common package on the workstations? I don't want to touch the server too much if I can help it :)

> Also, the rpc.statd manpage says, that the statd can be protected by
> the 'tcp_wrapper library'.

<nod> That's what I've done in /etc/hosts.allow - it's a moot point, really, because it's a firewalled LAN :)

Cheers,
Gavin.
* Re: lockd / statd fun (sorry)
From: Olaf Kirch @ 2004-04-23 11:57 UTC
To: Gavin Hamill
Cc: nfs

On Fri, Apr 23, 2004 at 12:48:30PM +0100, Gavin Hamill wrote:
> OK, I did see your message in the archives, but wasn't sure that it was
> relevant here (I'm getting error -13 when you got -5) but I'm game for a
> laugh :)

13 is EACCES, and is probably returned by statd. So I would say you are able to talk to statd, only there's something wrong with the setup.

statd wants the following files available and writable:

	/var/lib/nfs/state	(at start-up only; to store seq# number)
	/var/lib/nfs/sm		to store the NFS peer's address
	/var/lib/nfs/sm.bak	(at start-up only; for lock recovery)

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+
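
The three paths Olaf lists can be checked in one go. This is only a rough sketch: the function name check_statd_paths is made up for illustration, and os.access tests the permissions of the user running the script, so it would need to run as the same user as rpc.statd to mean anything.

```python
import os

# File and directory names taken from the list in the message above.
STATD_PATHS = ("state", "sm", "sm.bak")

def check_statd_paths(base="/var/lib/nfs"):
    """Return {path: status}, where status is 'ok', 'missing' or 'not writable'."""
    report = {}
    for name in STATD_PATHS:
        path = os.path.join(base, name)
        if not os.path.exists(path):
            report[path] = "missing"
        elif not os.access(path, os.W_OK):
            report[path] = "not writable"
        else:
            report[path] = "ok"
    return report

if __name__ == "__main__":
    for path, status in check_statd_paths().items():
        print(f"{status:13} {path}")
```

On a setup where much of the filesystem is symlinked to a ramdisk, it is probably worth running this on a client as well as on the server, since a dangling symlink shows up as "missing" here.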
* Re: lockd / statd fun (sorry)
From: Gavin Hamill @ 2004-04-27 13:43 UTC
To: nfs

On Fri, Apr 23, 2004 at 01:57:04PM +0200, Olaf Kirch wrote:
> On Fri, Apr 23, 2004 at 12:48:30PM +0100, Gavin Hamill wrote:
> > OK, I did see your message in the archives, but wasn't sure that it was
> > relevant here (I'm getting error -13 when you got -5) but I'm game for a
> > laugh :)
>
> 13 is EACCES, and is probably returned by statd. So I would say you
> are able to talk to statd, only there's something wrong with the setup.
>
> statd wants the following files available and writable:
>
>	/var/lib/nfs/state	(at start-up only; to store seq# number)
>	/var/lib/nfs/sm		to store the NFS peer's address
>	/var/lib/nfs/sm.bak	(at start-up only; for lock recovery)

OK, I'm assuming here that statd is only run by the machine running nfsd and exporting filesystems, and /usr/sbin/rpc.statd is definitely in the process list.

Moving on to the files in /var/lib, everything seems to be in order, but I never see any files written to /var/lib/nfs/sm. statd is running as root, and here's the structure of /var/lib/nfs:

drwxr-xr-x    4 root     root         4096 Apr 27 11:05 nfs
mop:/var/lib# ls -lR nfs/
nfs/:
total 24
-rw-r--r--    1 root     root          614 Apr 27 11:05 etab
-rw-r--r--    1 root     root           68 Apr 27 13:49 rmtab
drwxr-xr-x    2 root     root         4096 Jul  9  2003 sm
drwxr-xr-x    2 root     root         4096 Jul  9  2003 sm.bak
-rw-------    1 root     root            4 Apr 27 10:42 state
-rw-r--r--    1 root     root          288 Apr 27 11:10 xtab

nfs/sm:
total 0

nfs/sm.bak:
total 0
mop:/var/lib#

There is nothing in the log for 'statd' other than

Apr 27 10:42:53 mop rpc.statd[478]: Version 1.0 Starting

This is not a life-or-death issue because the network is working, but this is functionality that should work, and it's bugging me :)

As always, any advice warmly received!

Cheers,
Gavin
* Re: lockd / statd fun (sorry)
From: Olaf Kirch @ 2004-04-27 15:32 UTC
To: Gavin Hamill
Cc: nfs

On Tue, Apr 27, 2004 at 02:43:43PM +0100, Gavin Hamill wrote:
> OK, I'm assuming here that statd is only run by the machine running nfsd and
> exporting filesystems, and /usr/sbin/rpc.statd is definitely in the process list

No, statd needs to run on the client as well.

> Moving onto the files in /var/lib, everything seems to be in order, but I never see any
> files written to /var/lib/nfs/sm.

Files should appear in /var/lib/nfs/sm while a lock is held, and disappear shortly after the lock is released.

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+
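
One way to watch for the behaviour Olaf describes is to hold a lock for a while and list /var/lib/nfs/sm in another terminal. The following is a minimal sketch using Python's standard fcntl module; the function name hold_lock and the idea of pointing it at a file on the NFS mount are assumptions for illustration, not something from the thread.

```python
import fcntl
import os
import time

def hold_lock(path, seconds=0.0):
    """Take an exclusive POSIX lock on path, hold it for `seconds`, then release it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the lock is granted; raises OSError if the lock request fails.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        time.sleep(seconds)
        fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return True
```

Called as hold_lock("/path/on/nfs/mount/testfile", 30) on a client (the path is a placeholder), this should give enough time to look for an entry under /var/lib/nfs/sm. If monitoring is broken as in the kernel messages earlier in the thread, the lock request would be expected to fail with an OSError instead.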
* Re: lockd / statd fun (sorry)
From: Gavin Hamill @ 2004-04-27 15:47 UTC
To: nfs

On Tuesday 27 April 2004 16:32, Olaf Kirch wrote:
> On Tue, Apr 27, 2004 at 02:43:43PM +0100, Gavin Hamill wrote:
> > OK, I'm assuming here that statd is only run by the machine running nfsd
> > and exporting filesystems, and /usr/sbin/rpc.statd is definitely in the
> > process list
>
> No, statd needs to run on the client as well.

Right, now we're moving in the right direction. When I run statd on the clients, I see no errors on the console or system logs, but the statd process does not appear in 'ps fawx' output (as it does on the server).

An strace of '/sbin/rpc.statd' on one of the clients is:

execve("/sbin/rpc.statd", ["/sbin/rpc.statd"], [/* 11 vars */]) = 0
uname({sys="Linux", node="cc", ...}) = 0
brk(0) = 0x80505c8
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=30964, ...}) = 0
old_mmap(NULL, 30964, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libnsl.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000<\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=73528, ...}) = 0
old_mmap(NULL, 84864, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40020000
old_mmap(0x40032000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x11000) = 0x40032000
old_mmap(0x40033000, 7040, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40033000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200^\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1243792, ...}) = 0
old_mmap(NULL, 1253956, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40035000
old_mmap(0x4015d000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x127000) = 0x4015d000
old_mmap(0x40165000, 8772, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40165000
close(3) = 0
munmap(0x40018000, 30964) = 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
pipe([3, 4]) = 0
fork() = 598
close(4) = 0
read(3, 0xbffffdc7, 1) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
read(3, "", 1) = 0
exit_group(1) = ?

Not very revealing :/

The whole of /var on the clients is initially held on a ramdisk, so they have read/write access. Some of the larger 'less variable' parts of var are symlinked to the NFS-rootfs, but /var/lib/nfs isn't one of those parts :/

I'm just about out of ideas on this one, and am ready to just mount the NFS-root with the 'nolock' option - this works, and lets openoffice's setup run (it requires locking..) but again, it's just 'a fix' and not The Solution.

Cheers,
Gavin.
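
The tail of the trace (pipe, fork, a one-byte read that returns 0 after SIGCHLD, then exit_group(1)) is consistent with a common daemon start-up pattern: the parent waits on a pipe for the child to report success, and treats end-of-file as "the child died first". The sketch below only illustrates that pattern; it is not taken from the rpc.statd source, and the name start_daemon is hypothetical.

```python
import os

def start_daemon(child_main):
    """Fork, run child_main() in the child, and wait on a pipe for its status.

    Returns 0 if the child wrote a byte to signal successful start-up,
    or 1 if the pipe reached EOF first (the child exited without reporting).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: drop the read end, attempt start-up, report success with one byte.
        os.close(read_fd)
        status = 1
        try:
            if child_main():
                os.write(write_fd, b"\0")
                status = 0  # a real daemon would keep running here instead of exiting
        finally:
            os._exit(status)
    # Parent: drop the write end so EOF is seen once the child is gone.
    os.close(write_fd)
    data = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return 0 if data else 1
```

Read this way, the parent's exit status of 1 only says that the child gave up early. The reason for that would be in the child's own system calls, which a trace of the parent alone does not show.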
* Re: lockd / statd fun (sorry)
From: Olaf Kirch @ 2004-04-27 15:56 UTC
To: Gavin Hamill
Cc: nfs

On Tue, Apr 27, 2004 at 04:47:26PM +0100, Gavin Hamill wrote:
> An strace from '/sbin/rpc.statd' one of the clients is:

You should run "strace -f" to see what the child process does.

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+
* Re: lockd / statd fun (sorry)
From: Trond Myklebust @ 2004-04-27 16:49 UTC
To: Gavin Hamill
Cc: nfs

On Tue, 2004-04-27 at 11:47, Gavin Hamill wrote:

> I'm just about out of ideas on this one, and am ready to just mount the
> NFS-root with the 'nolock' option - this works, and lets openoffice's setup
> run (it requires locking..) but again, it's just 'a fix' and not The
> Solution.

Neither is keeping /var/lib/nfs on a ramdisk.

The problem then is that if you crash and reboot, rpc.statd will restart, but it will not notify your server that you rebooted (because /var/lib/nfs will have been wiped by the reboot). The server may then end up hanging on to a bunch of locks that it thinks you still own.

/var/lib/nfs should *always* be on permanent storage if you want to use locking.

Cheers,
  Trond
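
Trond's point can be turned into a quick check of where /var/lib/nfs actually lives. This is a rough sketch that parses mount-table text in the /proc/mounts format; the helper names find_mount and looks_volatile, the sample mount table, and the list of "volatile" markers (tmpfs, ramfs, /dev/ram*) are all assumptions for illustration.

```python
def find_mount(mounts_text, path):
    """Return (device, mount_point, fstype) for the longest mount point containing path."""
    best = None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fstype = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point win over an earlier one.
        if wanted.startswith(prefix) and (best is None or len(mount_point) >= len(best[1])):
            best = (device, mount_point, fstype)
    return best

def looks_volatile(mount):
    """Guess whether a mount is RAM-backed and therefore wiped on reboot."""
    device, _, fstype = mount
    return fstype in ("tmpfs", "ramfs") or device.startswith("/dev/ram")

# Hypothetical mount table resembling the diskless clients described in the thread.
sample = "/dev/root / nfs ro 0 0\n/dev/ram0 /var ext2 rw 0 0\n"
mount = find_mount(sample, "/var/lib/nfs")
print(mount, "volatile" if looks_volatile(mount) else "persistent")
```

On a live system the first argument would be the contents of /proc/mounts. A "volatile" result for /var/lib/nfs would mean statd's record of monitored peers does not survive a reboot, which is the situation Trond warns about.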
* Re: lockd / statd fun (sorry)
From: Gavin Hamill @ 2004-04-27 18:15 UTC
To: nfs

On Tue, Apr 27, 2004 at 12:49:17PM -0400, Trond Myklebust wrote:
> On Tue, 2004-04-27 at 11:47, Gavin Hamill wrote:
>
> > but again, it's just 'a fix' and not The Solution.
>
> Neither is keeping /var/lib/nfs on a ramdisk.
>
> /var/lib/nfs should *always* be on permanent storage if you want to use
> locking.

Ah, I'd certainly not taken that into consideration - I assumed (always bad, I know) it would reset any locks when the same client remounted the same export at reboot.

OK, given that the machines have no local permanent storage, it seems that's solved the problem for me - I must disable locking completely.

It's fortunate that the rootfs export is mounted read-only, and the home-directories export is never shared between multiple users. That, and the users are not running very demanding applications. Plus, if the worst happens, it would take only a few minutes to restore a corrupted file from backups, or re-create their userprofile entirely.

Thanks for the advice. NFS has been around for a long time, and I've known nothing about it until the last few weeks :)

Cheers,
Gavin.
end of thread

Thread overview: 10+ messages
2004-04-23 10:37 lockd / statd fun (sorry) Gavin Hamill
2004-04-23 11:32 ` Bernd Schubert
2004-04-23 11:48   ` Gavin Hamill
2004-04-23 11:57     ` Olaf Kirch
2004-04-27 13:43       ` Gavin Hamill
2004-04-27 15:32         ` Olaf Kirch
2004-04-27 15:47           ` Gavin Hamill
2004-04-27 15:56             ` Olaf Kirch
2004-04-27 16:49             ` Trond Myklebust
2004-04-27 18:15               ` Gavin Hamill