* server locks up at boot
@ 2009-07-16 20:44 David Warren
From: David Warren @ 2009-07-16 20:44 UTC (permalink / raw)
To: linux-nfs
We have been having a problem for a while now with kernels ranging from
2.6.26 through 2.6.30. When a machine boots and starts the NFS server, it
reads the /var/lib/nfs/v4recovery directory. If this directory is empty,
everything is fine. If it is not, all NFS-related processes go into D wait
and never return. Any other attempt to read the directory results in that
process getting stuck in D wait as well. This happens on a variety of
systems, all running Debian Lenny (it happened with Etch also) with kernels
ranging from 2.6.26 to 2.6.31-rc2, serving disks via NFSv4. If you boot
single user and clear out the directory, everything is fine. Also, there
is never any problem logged anywhere.
So, my questions are:
What is this directory really used for? The server doesn't seem to need it
to reestablish connections.
Does anyone have any idea what is going on, or suggestions for finding out?
* Re: server locks up at boot
@ 2009-07-22 16:32 ` J. Bruce Fields
From: J. Bruce Fields @ 2009-07-22 16:32 UTC (permalink / raw)
To: David Warren; +Cc: linux-nfs
On Thu, Jul 16, 2009 at 01:44:00PM -0700, David Warren wrote:
> We have been having a problem for a while now with kernels ranging from
> 2.6.26 through 2.6.30. When a machine boots and starts the NFS server, it
> reads the /var/lib/nfs/v4recovery directory. If this directory is empty,
> everything is fine. If it is not, all NFS-related processes go into D wait
> and never return. Any other attempt to read the directory results in that
> process getting stuck in D wait as well. This happens on a variety of
> systems, all running Debian Lenny (it happened with Etch also) with kernels
> ranging from 2.6.26 to 2.6.31-rc2, serving disks via NFSv4. If you boot
> single user and clear out the directory, everything is fine. Also, there
> is never any problem logged anywhere.
What filesystem(s) is that directory on? When the processes get stuck
in D state, could you "echo w >/proc/sysrq-trigger" and see what's
dumped to the system logs? Also, when you boot with a nonempty
/var/lib/nfs/v4recovery directory, what do the contents of that
directory look like? Is SELinux (or something else that could make them
difficult to delete) enabled?
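For picking out which processes are stuck before triggering the sysrq
dump, something like the following may help. It is only a sketch: it
assumes a Linux /proc layout, and the function names parse_stat and
blocked_processes are made up for this example. It reads each
/proc/<pid>/stat and reports the processes whose state field is "D"
(uninterruptible sleep).

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    head, _, tail = stat_text.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def blocked_processes(proc_root="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the entry was unreadable
        if state == "D":
            yield pid, comm


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Run as root right after the hang starts; the pids it prints are the ones
to look for in the "echo w" output in the system logs.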
> So, my questions are:
>
> What is this directory really used for? The server doesn't seem to need it
> to reestablish connections.
> Does anyone have any idea what is going on, or suggestions for finding out?
It's essentially storing a list of clients that held some kind of state
on the previous boot. Clients on that list are permitted to reclaim
state (like locks) on the next boot.
So if you clear out that directory and then reboot while an application on
a client holds a lock, the client will not be allowed to reclaim that lock.
In the case of Linux clients I think the application will be allowed to
continue without being told of the situation, with the risk that another
client may get a conflicting lock without the application knowing.
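To make the consequence concrete, here is a toy model of the reclaim
decision. It is not how nfsd is implemented; ToyServer, its methods, and
the client and lock names are all invented for illustration, and it
assumes only what is described above: clients recorded on the previous
boot may reclaim during a grace period, and everyone else has to wait and
take new locks afterwards.

```python
class ToyServer:
    """Toy model of NFSv4 lock reclaim after a server reboot."""

    def __init__(self, recovery_records):
        # Clients recorded as holding state on the previous boot
        # (stands in for the contents of the v4recovery directory).
        self.known_clients = set(recovery_records)
        self.locks = {}       # lock name -> owning client
        self.in_grace = True  # grace period starts at boot

    def reclaim(self, client, lock):
        """A client asks to get back a lock it held before the reboot."""
        if not self.in_grace or client not in self.known_clients:
            return False
        if lock in self.locks:
            return False
        self.locks[lock] = client
        return True

    def end_grace(self):
        self.in_grace = False

    def new_lock(self, client, lock):
        """An ordinary (non-reclaim) lock request, refused during grace."""
        if self.in_grace or lock in self.locks:
            return False
        self.locks[lock] = client
        return True


def demo(recovery_records):
    """Client A tries to reclaim; after grace, client B asks for the same lock."""
    server = ToyServer(recovery_records)
    a_reclaimed = server.reclaim("client-a", "file1")
    server.end_grace()
    b_locked = server.new_lock("client-b", "file1")
    return a_reclaimed, b_locked


if __name__ == "__main__":
    print("records intact: ", demo({"client-a"}))
    print("records cleared:", demo(set()))
```

With the records intact, client A gets its lock back and client B is
refused. With the records cleared, A's reclaim is refused and B is granted
the lock that A's application still believes it holds.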
That v4recovery directory has been problematic. I've been promising for
a while to replace it with a different mechanism, and am pretty
embarrassed that I haven't yet....
--b.