From: "J. Bruce Fields" <bfields@fieldses.org>
To: Florian Pritz <bluewind@xinu.at>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFS stalls when writing - linux 3.6.x
Date: Wed, 7 Nov 2012 14:07:24 -0500
Message-ID: <20121107190724.GD7421@fieldses.org>
In-Reply-To: <50957087.6050008@xinu.at>
On Sat, Nov 03, 2012 at 08:29:11PM +0100, Florian Pritz wrote:
> Hi,
>
> Long text ahead.
>
>
> Since I have no idea what to look at/for, I tried to summarise all more
> or less relevant information. If you need any more, please tell me.
>
> I've been trying to debug this for days now and might have mixed
> something up although I double checked as much as possible while writing
> this mail.
>
>
> # Overview
>
> I've been experiencing stalls when trying to write big-ish files on my
> nfs mount for some time (few months) now. Rsync is also somewhat slow,
> transferring only like 1 file per second even if the files are only a
> few kilobytes in size. Sometimes it also stalls for a few seconds
> between files. I hardly ever run rsync over nfs, so I can't tell if
> this might be normal.
>
> Sadly I don't know when this started happening.
It would be helpful to know that. In particular, if you find an easy
way to reproduce this, it would be worth booting older kernels to see
if you can figure out when the problem started.
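A rough sketch of what such a reproducer might look like, so that each
kernel can be judged on the same test. The function names, the
/mnt/nfs path and the 20 MB/s threshold are all made up for
illustration (the threshold just sits between the 110MB/s and 0-5MB/s
figures quoted below); adjust them to your setup.

```shell
#!/bin/bash
# Hypothetical helpers for timing repeated dd writes to an NFS mount.

# mb_per_sec SIZE_MB SECONDS: print integer throughput in MB/s.
mb_per_sec() {
    local size_mb=$1 secs=$2
    # Treat sub-second runs as one second to avoid dividing by zero.
    if [ "$secs" -lt 1 ]; then
        secs=1
    fi
    echo $(( size_mb / secs ))
}

# timed_write FILE SIZE_MB: write SIZE_MB of zeroes and print MB/s.
timed_write() {
    local file=$1 size_mb=$2 start end
    start=$(date +%s)
    # conv=fsync makes dd flush the file before exiting, so the
    # measured time includes writeback to the server.
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fsync \
        2>/dev/null || return 1
    end=$(date +%s)
    mb_per_sec "$size_mb" $(( end - start ))
}

# is_stalled RATE THRESHOLD: succeed if RATE is below THRESHOLD.
is_stalled() {
    [ "$1" -lt "$2" ]
}

# Example loop (commented out; path and threshold are assumptions):
# for i in $(seq 1 20); do
#     rate=$(timed_write /mnt/nfs/test 1000) || break
#     if is_stalled "$rate" 20; then
#         echo "run $i: only ${rate} MB/s"
#     fi
# done
```

If a loop like that reliably reports slow runs on 3.6.5 but not on
some older kernel, that gives a starting point for narrowing things
down.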
> Server and client are both running Arch Linux with linux 3.6.5 and
> nfs-utils 1.2.6.
>
> The server is running on a striped raid10 array with 4 disks using the
> deadline scheduler and connected via Gbit ethernet. The CPU is an Intel
> i3-530 and it has 2GB RAM. The raid10 is part of an LVM which contains
> the actual XFS file system exported by nfsd.
>
> At first I assumed a problem with the file system, but I switched from ext3
> to XFS and still experience the issue. Transferring large amounts
> (>80GB) of data over samba + cifs didn't cause any problems so I'm
> ruling out network and disks.
>
> # Description
>
> dd if=/dev/zero of=test bs=1M count=8000 (writing a 1GB file is also
> enough, sometimes)
>
> Watch the network traffic (with "vnstat -l" or conky) and wait until it
> drops from 110MB/s to 0-5MB/s (you might need to run dd multiple times,
> wait a few minutes/hours or reboot the server)
>
> top on the server now shows lots of nfsd threads in D state.
Next time you find it in that state, could you try

	echo t >/proc/sysrq-trigger

on the server? That will dump a bunch of data to the logs which we
might be able to use.
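Something along these lines might make the capture easier to do at the
right moment. It is only a sketch: the function names and the output
path are made up, it needs root, and the kernel log buffer may be too
small to hold the whole task dump, in which case the syslog files are
the place to look.

```shell
#!/bin/bash
# Hypothetical helpers for catching the stall on the server.

# count_d_nfsd: read "STATE COMMAND" lines on stdin (the format that
# `ps -eo state,comm` prints) and print how many nfsd threads are in
# uninterruptible sleep (state D).
count_d_nfsd() {
    awk '$1 == "D" && $2 == "nfsd" { n++ } END { print n + 0 }'
}

# dump_tasks OUTFILE: ask the kernel to log the state of every task,
# then save the kernel log to OUTFILE. Must be run as root.
dump_tasks() {
    local out=$1
    echo t > /proc/sysrq-trigger || return 1
    dmesg > "$out"
}

# Example (commented out; the output path is an assumption):
# n=$(ps -eo state,comm | count_d_nfsd)
# if [ "$n" -gt 0 ]; then
#     dump_tasks /tmp/nfsd-stall.txt
# fi
```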
--b.
> iostat only
> shows the 0-5MB/s of network traffic going to the disk.
>
> A local dd job on the server manages to write 160MB/s while nfsd
> continues to hang. Reading from the nfs share while nfsd is hanging is
> possible, but has a delay of up to ~20-30 seconds.
>
> After some time the client displays "nfs: server levant not responding,
> still trying" in dmesg followed by a "nfs: server levant OK" 0 or more
> seconds later (yes, zero). Both messages sometimes appear more than once
> at the same time.
>
> Apart from those messages dmesg is clean on either system even after
> waiting for a few minutes.
>
> # Environment
>
> ## Mount options (from /proc/mounts)
>
> rw,nosuid,nodev,noexec,relatime,vers=4.0,rsize=65536,wsize=65536,
> namlen=255,hard,proto=tcp,port=0,timeo=14,retrans=2,sec=sys,
> clientaddr=192.168.4.247,local_lock=none,addr=192.168.4.103,user
>
> ## exportfs -v
>
> /mnt/data/nfs
> 192.168.4.1/24(rw,wdelay,crossmnt,root_squash,all_squash,no_subtree_check,anonuid=999,anongid=999)
>
> ## Program versions
>
> Those are all the same on both client and server.
>
> acl 2.2.51-2
> libgssglue 0.4-1
> libevent 2.0.20-1
> librpcsecgss 0.19-7
> nfs-utils 1.2.6-2
> util-linux 2.22.1-2
>
> # Other notes
>
> I tried reproducing the issue with a virtual machine and it somehow
> worked, but I'm not really sure if I actually hit the same issue because
> the vm sometimes locks up too.
>
> The VM was set up in qemu with one virtio disk which was directly
> partitioned without the use of mdadm or lvm.
>
>
> Thank you for reading.
>
> --
> Florian Pritz
>