* NFS problems across reboot
From: David Ford @ 2002-04-04 19:27 UTC
To: nfs
After rebooting my NFS server, the clients can no longer access the mounts.
A trimmed df shows:
james:/home/james 0 1 0 0% /home/james
james:/home/hnc 0 1 0 0% /home/hnc
# su - xyz
su: warning: cannot change directory to /home/james/x/xyz: Permission denied
bash: /home/james/x/xyz/.bash_profile: Permission denied
However:
# rpcinfo -p james
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 10000 mountd
100005 1 tcp 10000 mountd
100005 2 udp 10000 mountd
100005 2 tcp 10000 mountd
100005 3 udp 10000 mountd
100005 3 tcp 10000 mountd
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100021 1 udp 10001 nlockmgr
100021 3 udp 10001 nlockmgr
100021 4 udp 10001 nlockmgr
100024 1 udp 10002 status
100024 1 tcp 10001 status
# umount /home/hnc
# mount /home/hnc
And the hnc mount is fine again. After repeating with /home/james, that
mount is also fine.
Why do I need to umount and remount? This is pretty brutal when I need
to shut down services so the mounts aren't in use.
These are 2.4.18+ kernels with nfs utils from about a month ago.
David
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: NFS problems across reboot
From: James Pearson @ 2002-04-05 9:12 UTC
To: David Ford; +Cc: nfs

Can't offer any solutions, but I've had the same problem (see
http://marc.theaimsgroup.com/?l=linux-nfs&m=101610950230938&w=2 ).

In my case, one particular application (accessing files via NFS on the
server at the time) was causing the server machine to crash. When the
server rebooted, Linux clients got "Permission denied" problems on the
mount points from the server (IRIX clients reported "I/O Error" on the
same mount points).

I thought I had fixed the problem by using a newer kernel on the server,
but all this fixed was the application in question crashing the
machine. I still see the "Permission denied" problem from time to
time (df reports output similar to yours as well). However, I don't
see it every time a 'server' reboots. Most of our Linux workstations
NFS-export local disks, which can be automounted by other workstations,
and the workstations tend to get rebooted 'fairly' frequently.

The workaround is to umount/mount ...

I'm using kernels 2.4.7 and 2.4.14 (both with XFS).

It doesn't seem to be a Linux NFS client problem: as I mentioned above,
SGI IRIX clients have a similar problem if the Linux server
crashes/reboots.

James Pearson
* Re: NFS problems across reboot
From: David Ford @ 2002-04-07 4:43 UTC
To: nfs; +Cc: James Pearson

Well, I'm an all-Linux shop here. Here are a few points to consider;
Trond, please help us out.

a) All machines are Linux.
b) This is 100% repeatable.
c) This is entirely unacceptable: every user (/home is mounted) has to
   log out of their machine in order to remount, and daemons need to be
   shut down.

Here's the scoop:

a) Linux 2.4.18-pre6 on one NFS server, 2.4.19-pre6 on the other.
b) 2.4.18-pre7 through 2.4.19-pre6 on clients.
c) Mount options are all like: defaults,rsize=8192,wsize=8192
d) Programs running: portmap, kmountd, knfsd 5, lockd, and statd.
e) No firewall rules. /etc/exports is set up using individual IPs since
   the friggen system can't deal with ranges or hostnames properly;
   "Hostname" is the same as "hostname" in DNS land, and it should be in
   NFS land as well. exports is like: /path 1.2.3.4(rw,no_root_squash).
f) hosts.allow is basically *:all

When the server reboots (cleanly), the clients get the following, from
rebooting to running:

nfs: server james not responding, still trying
nfs: server james not responding, still trying
nfs: server james OK
nfs: server james OK
nfs_statfs: statfs error = 13
nfs_statfs: statfs error = 13
nfs_statfs: statfs error = 13
<repeats until client is able to remount>

Basically, it's a totally simple NFS setup, but it's unusable in the
event of a server restart because any daemon or user that has files open
on that mount has to exit, sometimes not cleanly, and restart. It's
really a killer.

If this isn't fixable, can anyone recommend another network filesystem
that has user ids, symlinks, and unix-style permissions? smb looked all
great and dandy until I realized that symlinks don't exist and file
permissions are pretty wacky.

Thanks,
David
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-07 5:23 UTC
To: David Ford; +Cc: nfs, James Pearson

On Saturday April 6, david+cert@blue-labs.org wrote:
> When the server reboots (cleanly), the clients get the following, from
> rebooting to running:
> nfs: server james not responding, still trying
> nfs: server james not responding, still trying
> nfs: server james OK
> nfs: server james OK
> nfs_statfs: statfs error = 13
> nfs_statfs: statfs error = 13
> nfs_statfs: statfs error = 13

13 == EACCES. Looks like the filesystem isn't exported.

Show us the relevant /etc/init.d file.
My guess is that "exportfs -a" is being run *after* rpc.nfsd.
It must run *before* for correct operation.

NeilBrown
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-07 5:44 UTC
To: Neil Brown; +Cc: David Ford, nfs, James Pearson

On Sunday April 7, neilb@cse.unsw.edu.au wrote:
>
> 13 == EACCES. Looks like the filesystem isn't exported.
>
> Show us the relevant /etc/init.d file.
> My guess is that "exportfs -a" is being run *after* rpc.nfsd.
> It must run *before* for correct operation.
>

Actually, that would cause ESTALE, not EACCES.

EACCES seems to suggest an XFS problem.
Can you duplicate this without using XFS?

NeilBrown
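For readers following the error codes above, here is a small C sketch that
prints the symbolic name and message for an errno value. It is an
illustration only: the function name describe_nfs_errno is hypothetical, and
the claim that EACCES is 13 is assumed to hold on Linux (as the client log
and Neil's reply indicate), not on every platform.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "NAME (n): message" for the two errno
 * values discussed in this thread. Anything else is labelled "other".
 * Returns the snprintf() result. */
int describe_nfs_errno(int err, char *buf, size_t len)
{
    const char *name = (err == EACCES) ? "EACCES"
                     : (err == ESTALE) ? "ESTALE"
                     : "other";

    return snprintf(buf, len, "%s (%d): %s", name, err, strerror(err));
}
```

Passing 13 on a Linux system should yield a string starting with
"EACCES (13)", which matches the "statfs error = 13" lines in the client log.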
* Re: NFS problems across reboot
From: David Ford @ 2002-04-07 6:48 UTC
To: Neil Brown; +Cc: nfs, James Pearson

I'm not using XFS, I'm using NFS. Originating filesystems are reiserfs
and ext2, doesn't make a difference.

-d
* Re: NFS problems across reboot
From: James Pearson @ 2002-04-08 13:42 UTC
To: Neil Brown; +Cc: David Ford, nfs

I'm using XFS on my root file system. I'm using "stock" RedHat 7.1 or
7.2 initscripts, so exportfs -r is run before rpc.mountd and rpc.nfsd.

I can reproduce my problem by resetting an NFS server just after
automounting one of its exported disks. When the server comes up, the
client reports 'Permission denied' on that mount point.

If I repeat the test, but wait a short while before resetting the
server, everything is OK on reboot (and rmtab contains an entry for the
client).

It looks like the update to /var/lib/nfs/rmtab for the client's mount
hasn't been sync'd to disk before the server is reset.

In fact I can simulate this on a running server and client:

On the client, mount one of the server's exported file systems and cd to
it. Run 'ls', everything fine. Then:

On the server:

cd /var/lib/nfs
cp rmtab rmtab.save
cat /dev/null > rmtab
/etc/init.d/nfs stop
/etc/init.d/nfs start

Then on the client, an attempt to 'ls' etc. gives 'Permission denied'.

If I then copy back the saved rmtab file and restart nfs, file access on
the client is OK.

I've also put /var/lib/nfs on an ext2 file system, but that didn't make
any difference. It seems to be the fact that the 'crash' happens
before the file data has been flushed to disk.

The man page for rpc.mountd says the rmtab file is 'mostly ornamental';
however, in this case it seems to be a bit more important ...

So, is it possible to have a client 'survive' a server crash even if its
entry in rmtab hasn't been flushed? i.e. somehow allow the client access
even if there is no entry for the client in rmtab?

Thanks

James Pearson
* Re: NFS problems across reboot
From: James Pearson @ 2002-04-08 15:23 UTC
To: Neil Brown; +Cc: nfs

Having a quick look through the nfs-utils source, as a test, I changed
fendrmtabent() in nfs-utils-0.3.3/support/nfs/rmtab.c (used by mountd)
to:

void
fendrmtabent(FILE *fp)
{
        if (fp) {
                fflush(fp);
                fdatasync(fileno(fp));
                fclose(fp);
        }
}

which appears to help - I can reset my test server immediately after a
client mounts a disk - without getting the 'Permission denied' problems.

However, I have no idea if this is a 'sensible' thing to do ... if it is
OK to do this, would using fdatasync() elsewhere be helpful?

Thanks

James Pearson
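The flush-then-sync pattern in the patch above can be tried outside
nfs-utils with a standalone C sketch like the following. It is not the
nfs-utils code: append_durable is a hypothetical helper, and the sketch
assumes a POSIX system that provides fdatasync(). fflush() only moves data
from the stdio buffer to the kernel; fdatasync() is what asks the kernel to
push the file data to storage.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append one line to `path` and force it to stable
 * storage before returning. Returns 0 on success, -1 on any failure. */
int append_durable(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (!fp)
        return -1;

    int ok = fputs(line, fp) != EOF
          && fflush(fp) == 0               /* stdio buffer -> kernel */
          && fdatasync(fileno(fp)) == 0;   /* kernel cache -> storage */

    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Unlike the void fendrmtabent(), this sketch reports failures to the caller;
that is a choice made for the example, not something the thread proposes.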
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-08 23:10 UTC
To: James Pearson; +Cc: nfs

On Monday April 8, james-p@moving-picture.com wrote:
> Having a quick look through the nfs-utils source, as a test, I changed
> fendrmtabent() in nfs-utils-0.3.3/support/nfs/rmtab.c (used by mountd)
> to:
>
> void
> fendrmtabent(FILE *fp)
> {
>         if (fp) {
>                 fflush(fp);
>                 fdatasync(fileno(fp));
>                 fclose(fp);
>         }
> }
>
> which appears to help - I can reset my test server immediately after a
> client mounts a disk - without getting the 'Permission denied' problems.
>
> However, I have no idea if this is a 'sensible' thing to do ... if it is
> OK to do this, would using fdatasync() elsewhere be helpful?
>

Yes, this is a perfectly sensible thing to do.
I have just committed this change to the CVS, thanks.

NeilBrown

Index: ChangeLog
===================================================================
RCS file: /cvsroot/nfs/nfs-utils/ChangeLog,v
retrieving revision 1.167
diff -u -r1.167 ChangeLog
--- ChangeLog   8 Apr 2002 21:42:17 -0000       1.167
+++ ChangeLog   8 Apr 2002 23:04:55 -0000
@@ -1,3 +1,9 @@
+2002-04-09  NeilBrown <neilb@cse.unsw.edu.au>
+        James Pearson <james-p@moving-picture.com>
+
+        * support/nfs/rmtab.c(fendrmtabent): sync changes to
+        storage before returning, as this is critical state
+
 2002-04-08  H.J. Lu <hjl@lucon.org>
 
         * etc/redhat/nfs: New.
Index: support/nfs/rmtab.c
===================================================================
RCS file: /cvsroot/nfs/nfs-utils/support/nfs/rmtab.c,v
retrieving revision 1.2
diff -u -r1.2 rmtab.c
--- support/nfs/rmtab.c 1 Jun 2000 00:57:12 -0000       1.2
+++ support/nfs/rmtab.c 8 Apr 2002 23:04:56 -0000
@@ -114,8 +114,14 @@
 void
 fendrmtabent(FILE *fp)
 {
-        if (fp)
+        if (fp) {
+                /* If it was written to, we really want
+                 * to flush to disk before returning
+                 */
+                fflush(fp);
+                fdatasync(fileno(fp));
                 fclose(fp);
+        }
 }
 
 void
* Re: NFS problems across reboot
From: Daniel Freedman @ 2002-04-07 6:39 UTC
To: nfs

On Sun, Apr 07, 2002, Neil Brown wrote:
> Show us the relevant /etc/init.d file.
> My guess is that "exportfs -a" is being run *after* rpc.nfsd.
> It must run *before* for correct operation.
>
> NeilBrown

Hi,

I just checked the default NFS server init script from Debian stable (2.2)
as found in '/etc/init.d/nfs-kernel-server'. It appears to violate the
above order, by starting exportfs *after* rpc.nfsd. Am I missing something,
please?

**** Snippet from script listed below:

case "$1" in
  start)
        if grep -q '^/' /etc/exports; then
                printf "Starting $DESC:"
                printf " nfsd"
                start-stop-daemon --start --quiet \
                    --exec $PREFIX/sbin/rpc.nfsd -- $RPCNFSDCOUNT
                printf " mountd"
                start-stop-daemon --start --quiet \
                    --exec $PREFIX/sbin/rpc.mountd -- $RPCMOUNTDOPTS
                echo "."
                printf "Exporting directories for $DESC..."
                $PREFIX/sbin/exportfs -r
                echo "done."
        else
                echo "Not starting $DESC: No exports."
        fi
        ;;

Upon further inspection, I'm probably not missing something, since when I
checked a Debian woody (3.0) NFS server I'm also maintaining, it has
'exportfs' before the 'rpc.nfsd' call, as you taught me above (I'm grateful
to have learned more on this). I'm a little surprised this fix wasn't
backported to Debian potato (even though they are understandably cautious
about introducing non-security-related changes)... I guess I'll have to
change it myself in the initscript while still using the potato box.

Could this explain why I've seen the same behaviour as described in an
earlier post in this thread: in other words, when the NFS server cleanly
reboots, all clients with open file handles develop difficulty with NFS
access (sorry I can't remember the exact error, it hasn't happened in
months); come to think of it, though, I don't have this problem with my
Debian woody NFS server, which has the correct 'exportfs'/'rpc.nfsd' order
in the init-scripts.

Thanks so much, and take care,

Daniel

--
Daniel A. Freedman
Laboratory for Atomic and Solid State Physics
Department of Physics
Cornell University
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-08 2:50 UTC
To: Daniel Freedman; +Cc: nfs

On Sunday April 7, freedman@physics.cornell.edu wrote:
>
> Hi,
>
> I just checked the default NFS server init script from Debian stable (2.2)
> as found in '/etc/init.d/nfs-kernel-server'. It appears to violate the
> above order, by starting exportfs *after* rpc.nfsd. Am I missing something,
> please?

You are right. 2.2 has it wrong. 3.0 has it right.

I don't know the Debian policy on how serious a bug has to be to get
it fixed in "stable". This isn't a *very* serious bug, but it
certainly can cause some problems.

>
> Could this explain why I've seen the same behaviour as described in an
> earlier post in this thread: in other words, when the NFS server cleanly
> reboots, all clients with open file handles develop difficulty with NFS
> access (sorry I can't remember the exact error, it hasn't happened in
> months); come to think of it, though, I don't have this problem with my
> Debian woody NFS server, which has the correct 'exportfs'/'rpc.nfsd' order
> in the init-scripts.

Yes. This would explain exactly that behaviour.

NeilBrown
* Re: NFS problems across reboot
From: David Ford @ 2002-04-07 6:59 UTC
To: Neil Brown; +Cc: nfs, James Pearson

That was the problem. Everything mounts just fine normally. When the
server reboots, the client loses access. All the client has to do is
unmount and remount and everything is fine again. If it were a matter
of what was exported, then the remount should fail, but it doesn't. Why
does the client lose access and then, with no action on the server's
part, regain it? (To whoever knows the answer: please don't let
this thread just die away, ambiguity like this doesn't do anyone any good.)

The fix was to run exportfs. I wasn't running exportfs on boot. I
didn't realize it was needed because mounting worked just fine without
it. It is needed for already established connections.

David
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-08 2:48 UTC
To: David Ford; +Cc: nfs, James Pearson

On Sunday April 7, david+cert@blue-labs.org wrote:
> That was the problem. Everything mounts just fine normally. When the
> server reboots, the client loses access. All the client has to do is
> unmount and remount and everything is fine again. If it were a matter
> of what was exported, then the remount should fail, but it doesn't. Why
> does the client lose access and then, with no action on the server's
> part, regain it? (To whoever knows the answer: please don't let
> this thread just die away, ambiguity like this doesn't do anyone any good.)

"Exporting" a filesystem means doing two things:
 1/ telling mountd that it is allowed to respond positively to mount
    requests.
 2/ telling the kernel nfsd that it is allowed to respond to access
    requests.

When mountd responds to a mount request, it also tells the kernel to
respond to access requests.
But after a reboot, you need to run exportfs. It reads
/var/lib/nfs/rmtab to find out which clients have which filesystems
mounted, and it tells the kernel to allow those clients to access
those filesystems.

The manpage should probably be made a bit more clear on this.

NeilBrown
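The two-step description above can be pictured with a toy model in C. This
is only a sketch of the reasoning, not of how mountd, exportfs, or the
kernel are implemented: every struct and function name below is
hypothetical, the real exportfs also consults /etc/exports, and the model
tracks client names only, not filesystems.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Toy stand-in for the kernel's in-memory export table (lost on reboot). */
struct kernel_exports { const char *clients[MAX_ENTRIES]; int count; };

/* Toy stand-in for the on-disk rmtab (survives a reboot). */
struct rmtab { const char *clients[MAX_ENTRIES]; int count; };

static bool kernel_allows(const struct kernel_exports *k, const char *client)
{
    for (int i = 0; i < k->count; i++)
        if (strcmp(k->clients[i], client) == 0)
            return true;
    return false;
}

/* A mount request: record the client in rmtab and tell the kernel. */
void model_mount(struct rmtab *r, struct kernel_exports *k, const char *client)
{
    if (r->count < MAX_ENTRIES)
        r->clients[r->count++] = client;
    if (k->count < MAX_ENTRIES)
        k->clients[k->count++] = client;
}

/* A reboot: kernel state vanishes, rmtab stays as it was on disk. */
void model_reboot(struct kernel_exports *k)
{
    k->count = 0;
}

/* exportfs at boot: replay rmtab entries into the kernel table. */
void model_exportfs(const struct rmtab *r, struct kernel_exports *k)
{
    for (int i = 0; i < r->count; i++)
        if (!kernel_allows(k, r->clients[i]) && k->count < MAX_ENTRIES)
            k->clients[k->count++] = r->clients[i];
}

/* An access request from an already-mounted client: 0 or EACCES. */
int model_access(const struct kernel_exports *k, const char *client)
{
    return kernel_allows(k, client) ? 0 : EACCES;
}
```

In this model a client that mounted before model_reboot() gets EACCES until
model_exportfs() runs, while a fresh model_mount() works at any time. That
mirrors the symptom reported in the thread: remounting fixes access even
though nothing changed on the server.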
* Re: NFS problems across reboot
From: Andreas Unterluggauer @ 2002-04-08 12:15 UTC
To: Neil Brown; +Cc: nfs

>
> "Exporting" a filesystem means doing two things:
>  1/ telling mountd that it is allowed to respond positively to mount
>     requests.
>  2/ telling the kernel nfsd that it is allowed to respond to access
>     requests.
>
> When mountd responds to a mount request, it also tells the kernel to
> respond to access requests.
> But after a reboot, you need to run exportfs. It reads
> /var/lib/nfs/rmtab to find out which clients have which filesystems
> mounted, and it tells the kernel to allow those clients to access
> those filesystems.

Is it sufficient if exportfs -r is run before rpc.nfsd is started,
instead of exportfs -a?
The startscripts of redhat 6.2 use exportfs -r (and we sometimes have
trouble with stale mounts).

andi

--
Andreas Unterluggauer
Dokumentationsstelle für neuere österreichische Literatur
A-1070 Wien, Seidengasse 13
http://www.literaturhaus.at
Tel. +43/1/526 20 44-11, Fax -30, e-mail au@literaturhaus.at
* Re: NFS problems across reboot
From: Neil Brown @ 2002-04-08 22:58 UTC
To: Andreas Unterluggauer; +Cc: nfs

On Monday April 8, au@literaturhaus.at wrote:
>
> Is it sufficient if exportfs -r is run before rpc.nfsd is started,
> instead of exportfs -a?
> The startscripts of redhat 6.2 use exportfs -r (and we sometimes have
> trouble with stale mounts).

Yep, "exportfs -r" is fine.

exportfs -r preserves any changes you may have made with
  exportfs -o what,ever host:/path/to/export
whereas exportfs -a discards any such changes and just applies what
/etc/exports says.

Which one you want really depends on your sysadmin practices.

I personally like "-a" because it means I know that after a reboot the
machine will come back to a known-correct state.

NeilBrown
[parent not found: <E16uI1v-00067H-00@usw-sf-list1.sourceforge.net>]
* Re: NFS problems across reboot
[not found] <E16uI1v-00067H-00@usw-sf-list1.sourceforge.net>
@ 2002-04-08 0:43 ` Al Borchers
0 siblings, 0 replies; 16+ messages in thread
From: Al Borchers @ 2002-04-08 0:43 UTC (permalink / raw)
To: nfs

Neil --

> From: Neil Brown <neilb@cse.unsw.edu.au>
> My guess is that "exportfs -a" is being run *after* rpc.nfsd.
> It must run *before* for correct operation.

I did not know this. Can you explain why?

Our situation: we run a proprietary file system that holds the exported
directories, but it does not start up until _after_ nfs is started.
On a reboot, we run "exportfs -a" again after our file system is up, so
that the clients that held mounts before the reboot don't get stale file
handle errors. (The first run of exportfs, before rpc.nfsd, did not find
the exported directories, since the file system they are on was not up
yet.)

Will this be a problem? Should we delay starting nfs until _after_ our
file system is up?

We are still having some problems with stale file handles on reboot,
even after the second exportfs, but it only happens occasionally.
Sometimes it goes away after a minute or two, sometimes not. We are
running 2.4.17 on both clients and servers.

Thanks for any info on this,
-- Al

^ permalink raw reply [flat|nested] 16+ messages in thread
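The alternative Al raises, delaying NFS start-up until the file system holding the exports is available, could be expressed roughly as below. This is a sketch under stated assumptions, not an answer given in the thread: it assumes a marker file (hypothetically /export/.fs_ready) that exists only once the file system is mounted, and the function name and timeout are made up for the example.

```shell
#!/bin/sh
# Sketch only: poll for a marker file that lives on the exported file
# system, giving up after a timeout. Names and paths are hypothetical.

# wait_for_marker FILE [TIMEOUT_SECONDS]
# Returns 0 as soon as FILE exists, 1 if it is still missing after the
# timeout (default 60 seconds).
wait_for_marker() {
    marker=$1
    timeout=${2:-60}
    waited=0
    while [ ! -e "$marker" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A start script might then guard the NFS start-up with something like `wait_for_marker /export/.fs_ready 120 && exportfs -a`, followed by starting rpc.nfsd, so that exportfs only runs once the exported directories can be found.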
end of thread, other threads:[~2002-04-08 23:07 UTC | newest]
Thread overview: 16+ messages
2002-04-04 19:27 NFS problems across reboot David Ford
2002-04-05 9:12 ` James Pearson
2002-04-07 4:43 ` David Ford
2002-04-07 5:23 ` Neil Brown
2002-04-07 5:44 ` Neil Brown
2002-04-07 6:48 ` David Ford
2002-04-08 13:42 ` James Pearson
2002-04-08 15:23 ` James Pearson
2002-04-08 23:10 ` Neil Brown
2002-04-07 6:39 ` Daniel Freedman
2002-04-08 2:50 ` Neil Brown
2002-04-07 6:59 ` David Ford
2002-04-08 2:48 ` Neil Brown
2002-04-08 12:15 ` Andreas Unterluggauer
2002-04-08 22:58 ` Neil Brown
[not found] <E16uI1v-00067H-00@usw-sf-list1.sourceforge.net>
2002-04-08 0:43 ` Al Borchers