* nfsd stales when restarting too fast
@ 2004-08-10  9:06 Frank Steiner
  2004-08-18  3:24 ` Neil Brown
From: Frank Steiner @ 2004-08-10  9:06 UTC
  To: nfs; +Cc: shylendra.bhat

Hi,

I posted this on the kernel list already, but now that I'm subscribed here
I guess this is the better place :-) Neil already reacted to my mail on
LKML but the first proposal didn't help (order of exportfs and killall).

System is: SuSE 9.0 with 2.6.7 (tested up to 2.6.8rc3) and util-linux-2.12

Also tested with SuSE 9.1/SLES9 and SuSE's kernel 2.6.5.

When running "/etc/init.d/nfsserver restart" on the server, the clients
will react with "stale nfs handle" for all mounted directories that were
in use during the restart (e.g. if /var is mounted and syslogd is running,
or if some "find" is running on a mounted directory). The stale directories
will never come back to a sane state (except by restarting with a sleep, see below).

When using
/etc/init.d/nfsserver stop
sleep 2
/etc/init.d/nfsserver start

(or putting a "sleep 1" between the lines "$0 stop" and "$0 start" in the
init script), everything works fine. Restarting with sleep 2 also
brings back the client directories that went stale after an earlier
restart without the sleep.

Without the init script, it can be narrowed down to:

killall -9 nfsd
killall -9 /usr/sbin/rpc.mountd
/usr/sbin/exportfs -au
[sleep 2]
/usr/sbin/exportfs -r
/usr/sbin/rpc.nfsd
/usr/sbin/rpc.mountd

The mounts go stale without the sleep, and do not with the sleep. That
behaviour is independent of options like v3/v4, tcp/udp, lock/nolock,
and it did not happen with 2.4.
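
A minimal sketch of the workaround as a shell function, in case it helps
anyone reproduce this. The function name and the NFS_INIT and NFS_PAUSE
variables are made up for illustration; only the init script path and the
stop / sleep / start order come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: restart the NFS server with a pause between
# stop and start. NFS_INIT and NFS_PAUSE are illustrative names and
# are not part of any distribution's init script.
NFS_INIT=${NFS_INIT:-/etc/init.d/nfsserver}
NFS_PAUSE=${NFS_PAUSE:-2}

restart_nfs_with_pause() {
    "$NFS_INIT" stop
    # Workaround from the report: wait before starting again.
    sleep "$NFS_PAUSE"
    "$NFS_INIT" start
}

# Usage (as root): restart_nfs_with_pause
```

The function returns the exit status of the final "start" call, so a
caller can still detect a failed start.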

Unless this is something easy to fix in the kernel nfsd or client, it
might be a good idea to insert such a sleep statement in the distributors'
init scripts to keep people from running into this error. I assume the
problem in the mail "machine hangs - SLES9/NFS" was caused by the
same problem.


cu,
Frank

-- 
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik    Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17           Phone: +49 89 2180-4049
80333 Muenchen, Germany       Fax:   +49 89 2180-99-4049



-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: nfsd stales when restarting too fast
  2004-08-10  9:06 nfsd stales when restarting too fast Frank Steiner
@ 2004-08-18  3:24 ` Neil Brown
  2004-08-18  7:16   ` Frank Steiner
From: Neil Brown @ 2004-08-18  3:24 UTC
  To: Frank Steiner; +Cc: nfs, shylendra.bhat

On Tuesday August 10, fsteiner-mail@bio.ifi.lmu.de wrote:
> Hi,
> 
> I posted this on the kernel list already, but now that I'm subscribed here
> I guess this is the better place :-) Neil already reacted to my mail on
> LKML but the first proposal didn't help (order of exportfs and killall).
> 
> System is: SuSE 9.0 with 2.6.7 (tested up to 2.6.8rc3) and util-linux-2.12
> 
> Also tested with SuSE 9.1/SLES9 and SuSE's kernel 2.6.5.
> 
> When running "/etc/init.d/nfsserver restart" on the server, the clients
> will react with "stale nfs handle" for all mounted directories that were
> in use during the restart (e.g. if /var is mounted and syslogd is running,
> or if some "find" is running on a mounted directory). The stale directories
> will never come back to a sane state (except by restarting with a sleep, see below).
> 
> When using
> /etc/init.d/nfsserver stop
> sleep 2
> /etc/init.d/nfsserver start
> 
> (or putting a "sleep 1" between the lines "$0 stop" and "$0 start" in the
> init script), everything works fine. Restarting with sleep 2 also
> brings back the client directories that went stale after an earlier
> restart without the sleep.
> 
> Without the init script, it can be narrowed down to:
> 
> killall -9 nfsd
> killall -9 /usr/sbin/rpc.mountd
> /usr/sbin/exportfs -au
> [sleep 2]
> /usr/sbin/exportfs -r
> /usr/sbin/rpc.nfsd
> /usr/sbin/rpc.mountd
> 
> The mounts go stale without the sleep, and do not with the sleep. That
> behaviour is independent of options like v3/v4, tcp/udp, lock/nolock,
> and it did not happen with 2.4.

Probably the best solution is to "not do that" - why do you want to
stop and then restart the server anyway?  Why not just leave it
running?

However, there is a race there, and "sleep 1" would fix it.
Another fix would be to use "-1" instead of "-9" to kill nfsd.  This
causes it to exit without clearing the export table.
Another fix would be to apply the following patch to your 2.6 kernel.

NeilBrown


diff ./net/sunrpc/cache.c~current~ ./net/sunrpc/cache.c
--- ./net/sunrpc/cache.c~current~	2004-08-18 13:07:44.000000000 +1000
+++ ./net/sunrpc/cache.c	2004-08-18 13:12:10.000000000 +1000
@@ -400,9 +400,10 @@ void cache_flush(void)
 
 void cache_purge(struct cache_detail *detail)
 {
-	detail->flush_time = get_seconds()+1;
+	detail->flush_time = LONG_MAX;
 	detail->nextcheck = get_seconds();
 	cache_flush();
+	detail->flush_time = 1;
 }
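
To illustrate the race the patch above closes, here is a small shell
model. It is not kernel code. It assumes, as a simplification that this
thread does not spell out, that the cache treats an entry as flushed
when flush_time is greater than the entry's last refresh time, with both
measured in whole seconds. The function name and the timestamps are made
up for illustration.

```shell
#!/bin/sh
# Simplified model, NOT kernel code. Assumption: an entry counts as
# flushed when flush_time > last_refresh (both in whole seconds).
entry_survives() {  # usage: entry_survives LAST_REFRESH FLUSH_TIME
    [ "$2" -le "$1" ]
}

now=1000   # made-up second in which nfsd is killed and exportfs -r runs

# Old code: flush_time is left at now+1 after the purge, so an export
# re-added within the same second is treated as already flushed.
if entry_survives "$now" $((now + 1)); then
    echo "old: kept"
else
    echo "old: flushed"
fi

# Patched code: flush_time is reset to 1 after the purge, so the
# re-added export survives.
if entry_survives "$now" 1; then
    echo "patched: kept"
else
    echo "patched: flushed"
fi

# Old code with a 2 second pause: the export is re-added at now+2,
# which is no longer below flush_time.
if entry_survives $((now + 2)) $((now + 1)); then
    echo "old+sleep: kept"
else
    echo "old+sleep: flushed"
fi
```

Under that assumption the model prints "old: flushed", "patched: kept"
and "old+sleep: kept", which matches the behaviour reported in this
thread.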
 
 



* Re: nfsd stales when restarting too fast
  2004-08-18  3:24 ` Neil Brown
@ 2004-08-18  7:16   ` Frank Steiner
From: Frank Steiner @ 2004-08-18  7:16 UTC
  To: Neil Brown; +Cc: nfs

Neil Brown wrote:
> 
> Probably the best solution is to "not do that" - why do you want to
> stop and then restart the server anyway?  Why not just leave it
> running?

Aeh, well, yes. Of course I should not restart but just reload when
something has changed... That's because I'm not used to the reload
option, since it didn't exist in the early SuSE versions I was using, so
I always use restart. With reload (which just issues exportfs -r
in the SuSE init script) no problem occurs. I definitely should have
thought of that *feeling stupid* :-((
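
For the archive, a sketch of the reload path, assuming it amounts to no
more than the exportfs -r call mentioned above. The function name and
the EXPORTFS override are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: re-export everything without stopping nfsd.
# EXPORTFS is an illustrative override, not part of any init script.
EXPORTFS=${EXPORTFS:-/usr/sbin/exportfs}

reload_exports() {
    # -r re-exports all directories, bringing the kernel's export
    # table back in sync with /etc/exports.
    "$EXPORTFS" -r
}

# Usage (as root), after editing /etc/exports: reload_exports
```

Since nfsd keeps running, the kill and restart sequence that triggers
the stale handles is never executed.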

> 
> However, there is a race there, and "sleep 1" would fix it.
> Another fix would be to use "-1" instead of "-9" to kill nfsd.  This
> causes it to exit without clearing the export table.
> Another fix would be to apply the following patch to your 2.6 kernel.

Just in case I forget the reload again next time I will include your
patch in my kernel rpm, too. Just to be sure :-)

Thanks for your help!
cu,
Frank

-- 
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik    Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17           Phone: +49 89 2180-4049
80333 Muenchen, Germany       Fax:   +49 89 2180-99-4049



