NFS hang when writing to loopback file from VMWare ESX (kernel 2.6.30)

linux-nfs.vger.kernel.org archive mirror

* NFS hang when writing to loopback file from VMWare ESX (kernel 2.6.30)
@ 2010-05-10  9:20 Beast in Black
  2010-05-10 17:36 ` Chuck Lever
  0 siblings, 1 reply; 2+ messages in thread
From: Beast in Black @ 2010-05-10  9:20 UTC
  To: linux-nfs

Greetings.

Every so often, when I'm writing via NFS to a loopback-mounted file, I
find that about 10-15 nfsd threads (out of a total of 64) go into D
state, along with the loop file, and never recover from the D state.
My setup is as follows:

1. sparse file is created via dd and loopback-mounted onto a
/dev/loopX device (where 0 <= X <= 100)
2. sparse file is mke2fs'd and mounted on mount point "/volumes/localvol"
3. "/volumes/localvol" is exported with options
*(rw,no_root_squash,no_subtree_check,async,insecure,nohide,no_wdelay).
4. /volumes/localvol is set as a network datastore (NFS mount) in ESX
5. Virtual machine files for an ESX VM are copied into the NFS mount on ESX
6. Virtual machine is powered on and I do some activity in it...write files etc.
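
Step 1 depends on the backing file being sparse: its apparent size is large, but blocks are only allocated when something writes to them. The following is a minimal Python sketch of that property, not the dd command actually used; the file name localvol.img and the 1 GiB size are assumptions made for illustration.

```python
import os
import tempfile


def make_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing data."""
    with open(path, "wb") as f:
        # Extending the file with truncate() leaves a hole on filesystems
        # that support sparse files, so no data blocks are written.
        f.truncate(size_bytes)


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name and size, chosen only for this sketch.
        path = os.path.join(tmp, "localvol.img")
        make_sparse_file(path, 1 << 30)
        size, allocated = apparent_and_allocated(path)
        print(f"apparent={size} allocated={allocated}")
```

On a filesystem with sparse-file support, the allocated figure should stay far below the apparent size until the loop device starts writing into the file.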

At this point, the VM is running fine in ESX. After a while, however,
I notice that the VM freezes and that ESX reports the NFS mounted
datastore as unreachable. When I check the NFS server machine, I find
that 10-15 NFS threads are in D state, along with the associated
loopback-mounted file. The D states are never recovered from, and the
only way out is to reboot the NFS server machine.
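
One way to capture which tasks are stuck is to scan /proc for entries whose state field is "D" (uninterruptible sleep). The sketch below is only an illustration of that check, assuming a Linux /proc layout; the function names are made up for this example and are not part of any NFS tooling.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so the name is taken up to the LAST ')' on the line.
    """
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks in state 'D'."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited during the scan, or entry unreadable
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in find_d_state_tasks():
        print(pid, comm)
```

Running this periodically on the NFS server and logging the result would show whether the same nfsd and loop threads stay in D state across samples.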

I have also tried with specifying the export as "sync" instead of
"async" (and removing no_wdelay) but I still see the same behavior.

The NFS server is running the vanilla 2.6.30 kernel on Ubuntu 8.10.
The NFS exports are all NFSv3.

Does anyone have an idea of why this may be occurring? I would be glad
to provide any additional info required.

Regards.

--
Time flies like an arrow
Fruit flies like a banana

* Re: NFS hang when writing to loopback file from VMWare ESX (kernel 2.6.30)
  2010-05-10  9:20 NFS hang when writing to loopback file from VMWare ESX (kernel 2.6.30) Beast in Black
@ 2010-05-10 17:36 ` Chuck Lever
  0 siblings, 0 replies; 2+ messages in thread
From: Chuck Lever @ 2010-05-10 17:36 UTC
  To: Beast in Black; +Cc: linux-nfs

On 05/10/2010 05:20 AM, Beast in Black wrote:
> Greetings.
>
> Every so often, when i'm writing via NFS to a loopback-mounted file, i
> find that about 10-15 nfsd threads (out of a total of 64) go into D
> state, along with the loop file, and never recover from the D state.
> My setup is as follows:
>
> 1. sparse file is created via dd and loopback-mounted onto a
> /dev/loopX device (where 0<= X<= 100)
> 2. sparse file is mke2fs'd and mounted on mount point "/volumes/localvol"
> 3. "/volumes/localvol" is exported with options
> *(rw,no_root_squash,no_subtree_check,async,insecure,nohide,no_wdelay).
> 4. /volumes/localvol is set as a network datastore (NFS mount) in ESX
> 5. Virtual machine files for an ESX VM are copied into the NFS mount on ESX
> 6. Virtual machine is powered on and I do some activity in it...write files etc.
>
> At this point, the VM is running fine in ESX. After a while, however,
> I notice that the VM freezes and that ESX reports the NFS mounted
> datastore as unreachable. When I check the NFS server machine, I find
> that 10-15 NFS threads are in D state, along with the associated
> loopback-mounted file. The D states are never recovered from, and the
> only way out is to reboot the NFS server machine.
>
> I have also tried with specifying the export as "sync" instead of
> "async" (and removing no_wdelay) but I still see the same behavior.
>
> The NFS server is running the vanilla 2.6.30 kernel on Ubuntu 8.10.
> The NFS exports are all NFSv3.
>
> Does anyone have an idea of why this may be occurring? I would be glad
> to provide any additional info required.

There may be a deadlock due to memory pressure on the server.  When the
server gets into the hung state, you might get some information by doing
an "echo t | sudo tee /proc/sysrq-trigger", then looking in your syslog.
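
Alongside the task dump, it may help to record a few /proc/meminfo counters while the hang develops. The following Python sketch is only a rough illustration; the choice of MemFree, Dirty and Writeback as the fields to watch is an assumption made here, and high values are at most suggestive of memory pressure, not proof of a deadlock.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of name -> integer value.

    Most values are reported in kB; a few (e.g. HugePages_Total) are
    plain counts. Units are not converted here.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def writeback_summary(meminfo, keys=("MemFree", "Dirty", "Writeback")):
    """Pick out the counters of interest; missing keys map to None."""
    return {key: meminfo.get(key) for key in keys}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(writeback_summary(parse_meminfo(f.read())))
```

Sampling this every few seconds before and during the hang would show whether free memory drops and dirty or writeback pages pile up as the nfsd threads block.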

-- 
chuck[dot]lever[at]oracle[dot]com
