linux-nfs.vger.kernel.org archive mirror
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Florian Pritz <bluewind@xinu.at>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFS stalls when writing - linux 3.6.x
Date: Wed, 7 Nov 2012 14:07:24 -0500
Message-ID: <20121107190724.GD7421@fieldses.org>
In-Reply-To: <50957087.6050008@xinu.at>

On Sat, Nov 03, 2012 at 08:29:11PM +0100, Florian Pritz wrote:
> Hi,
> 
> Long text ahead.
> 
> 
> Since I have no idea what to look at/for, I tried to summarise all more
> or less relevant information. If you need any more, please tell me.
> 
> I've been trying to debug this for days now and might have mixed
> something up although I double checked as much as possible while writing
> this mail.
> 
> 
> # Overview
> 
> I've been experiencing stalls when trying to write big-ish files on my
> nfs mount for some time (few months) now. Rsync is also somewhat slow,
> transferring only like 1 file per second even if the files are only a
> few kilobytes in size. Sometimes it also stalls for a few seconds
> between files. I hardly run rsync over nfs so can't tell if this might
> be normal.
> 
> Sadly I don't know when this started happening.

It would be helpful to know that. Especially if you find an easy way to
reproduce this, it would be worth booting older kernels to see if you
can figure out when the problem started.
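
A rough sketch of a reproducer that could be run unchanged under each
kernel, assuming the dd test described below is what triggers the stall.
The function name time_write and the file name ddtest are made up for
illustration, and conv=fsync is my addition so that the elapsed time
includes flushing the data to the server:

```shell
# time_write DIR [MiB]: write MiB mebibytes of zeros into DIR, then print
# the running kernel version, elapsed seconds, and approximate MiB/s.
# Hypothetical helper, not part of nfs-utils or any other package.
time_write() {
    dir=$1
    mib=${2:-1024}
    start=$(date +%s)
    # conv=fsync makes dd flush the file before exiting, so a stalled
    # write shows up in the elapsed time.
    dd if=/dev/zero of="$dir/ddtest" bs=1M count="$mib" conv=fsync \
        2>/dev/null || return 1
    end=$(date +%s)
    rm -f "$dir/ddtest"
    secs=$((end - start))
    # Clamp to one second to avoid dividing by zero on very fast runs.
    [ "$secs" -gt 0 ] || secs=1
    echo "$(uname -r) ${secs}s $((mib / secs))MiB/s"
}
```

Something like "time_write /path/to/nfs/mount 8000" on the client, repeated
a few times per kernel, should make a stall visible as a sharp drop in the
reported rate.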

> Server and client are both running Arch Linux with linux 3.6.5 and
> nfs-utils 1.2.6.
> 
> The server is running on a striped raid10 array with 4 disks using the
> deadline scheduler and connected via Gbit ethernet. The CPU is an Intel
> i3-530 and it has 2GB RAM. The raid10 is part of an LVM which contains
> the actual XFS file system exported by nfsd.
> 
> At first I assumed a problem with file system, but I switched from ext3
> to XFS and still experience the issue. Transferring large amounts
> (>80GB) of data over samba + cifs didn't cause any problems so I'm
> ruling out network and disks.
> 
> # Description
> 
> dd if=/dev/zero of=test bs=1M count=8000 (writing a 1GB file is also
> enough, sometimes)
> 
> Watch the network traffic (with "vnstat -l" or conky) and wait until it
> drops from 110MB/s to 0-5MB/s (you might need to run dd multiple times,
> wait a few minutes/hours or reboot the server)
> 
> top on the server now shows lots of nfsd threads in D state.

Next time you find it in that state, could you try

	echo t >/proc/sysrq-trigger

on the server?  That will dump a bunch of data to the logs which we
might be able to use.
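
If it helps, here is a minimal sketch for catching that state and saving
the dump, to be run as root on the server. The function names and the
default output path are hypothetical. The task dump is large and can
overflow the kernel ring buffer, so the saved file may be missing its
beginning:

```shell
# count_blocked_nfsd [PROCDIR]: print how many tasks named "nfsd" are in
# uninterruptible sleep (state D), read from PROCDIR (default /proc).
count_blocked_nfsd() {
    procdir=${1:-/proc}
    n=0
    for f in "$procdir"/[0-9]*/status; do
        [ -r "$f" ] || continue
        # awk exits 0 only if the file has both "Name: nfsd" and "State: D".
        if awk '
            $1 == "Name:"  && $2 == "nfsd" { name = 1 }
            $1 == "State:" && $2 == "D"    { blocked = 1 }
            END { exit !(name && blocked) }' "$f" 2>/dev/null; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# dump_tasks [FILE]: ask the kernel to log every task's state and stack,
# then save the kernel log to FILE (hypothetical default path).
dump_tasks() {
    out=${1:-/tmp/nfsd-sysrq.log}
    echo t > /proc/sysrq-trigger
    dmesg > "$out"
}
```

For example, '[ "$(count_blocked_nfsd)" -gt 0 ] && dump_tasks' while the
client is stalled.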

--b.

> iostat only
> shows the 0-5MB/s of network traffic going to the disk.
> 
> A local dd job on the server manages to write 160MB/s while nfsd
> continues to hang. Reading from the nfs share while nfsd is hanging is
> possible, but has a delay of up to ~20-30 seconds.
> 
> After some time the client displays "nfs: server levant not responding,
> still trying" in dmesg followed by a "nfs: server levant OK" 0 or more
> seconds later (yes, zero). Both messages sometimes appear more than once
> at the same time.
> 
> Apart from those messages dmesg is clean on either system even after
> waiting for a few minutes.
> 
> # Environment
> 
> ## Mount options (from /proc/mounts)
> 
> rw,nosuid,nodev,noexec,relatime,vers=4.0,rsize=65536,wsize=65536,
> namlen=255,hard,proto=tcp,port=0,timeo=14,retrans=2,sec=sys,
> clientaddr=192.168.4.247,local_lock=none,addr=192.168.4.103,user
> 
> ## exportfs -v
> 
> /mnt/data/nfs
> 192.168.4.1/24(rw,wdelay,crossmnt,root_squash,all_squash,no_subtree_check,anonuid=999,anongid=999)
> 
> ## Program versions
> 
> Those are all the same on both client and server.
> 
> acl 2.2.51-2
> libgssglue 0.4-1
> libevent 2.0.20-1
> librpcsecgss 0.19-7
> nfs-utils 1.2.6-2
> util-linux 2.22.1-2
> 
> # Other notes
> 
> I tried reproducing the issue with a virtual machine and it somehow
> worked, but I'm not really sure if I actually hit the same issue because
> the vm sometimes locks up too.
> 
> The VM was set up in qemu with one virtio disk which was directly
> partitioned without the use of mdadm or lvm.
> 
> 
> Thank you for reading.
> 
> -- 
> Florian Pritz
> 


