From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "G.R." <firemeteor@users.sourceforge.net>
Cc: xen-devel <xen-devel@lists.xen.org>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: Unstable NFS mount at heavy load.
Date: Tue, 22 Jan 2013 15:29:29 -0500
Message-ID: <20130122202929.GA12371@phenom.dumpdata.com>
In-Reply-To: <CAKhsbWZrjhMj5KPs7dXWkDB6HYuiOVDQP18_riXRX6VNGgf62w@mail.gmail.com>

On Mon, Jan 21, 2013 at 12:01:43AM +0800, G.R. wrote:
> On Sat, Jan 19, 2013 at 12:14 AM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
> > On Wed, Jan 16, 2013 at 12:50:08AM +0800, G.R. wrote:
> >> Hi Konrad, do you have any suggestions on how to debug this?
> >
> > Is your dom0 32-bit or 64-bit? And what kind of network card are you
> > using for the NFS traffic?
> >
> Both my dom0 and domU are 64-bit.
> The physical card I have is an RTL8111/8168B (rev 06) (10ec:8168).
> The virtual card I used is e1000, but I guess this is not important,
> since I've seen this in the log:
> Jan  6 01:31:03 debvm kernel: [    0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
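
Since netfront unplugs the emulated NICs, the e1000 model indeed shouldn't matter. A quick sanity check from inside the domU (a sketch only - eth0 is assumed, adjust to the actual interface name) is to confirm which driver is really bound to the NIC:

  # 'driver:' should point at the Xen netfront driver, not e1000
  ethtool -i eth0
  # if the drivers are built as modules rather than built in:
  lsmod | grep -E 'xen_netfront|e1000'
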
> 
> I'm thinking of dumping the traffic to check when I get some spare time.
> Do you think this is a good idea, or do you have another suggestion?

Well, the thread on "Fatal crash on xen4.2 HVM + qemu-xen dm + NFS"
seems to imply that this is a problem with NFS TCP retransmits.

I've seen similar issues as well - but only on skge, tg3, and
r8169, and only when using a 32-bit dom0.
I don't know if the issue I am hitting is the same thing.
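
For the traffic dump, something along these lines might be a reasonable starting point (a rough sketch, not a tested recipe - the vif name depends on the guest's domain ID, and 192.168.1.8 / port 2049 are just the server address from the logs below and the standard NFS port):

  # On dom0: capture the guest's NFS traffic on its backend vif
  # (vif<domid>.0 - 'xl list' shows the domain ID; vif1.0 is assumed here).
  tcpdump -i vif1.0 -s 0 -w domU-nfs.pcap host 192.168.1.8 and port 2049

  # On the NFS server, capture the same flow for comparison:
  tcpdump -i eth0 -s 0 -w server-nfs.pcap port 2049

  # Then look for retransmissions and resets in Wireshark, e.g. with
  # the display filter: tcp.analysis.retransmission || tcp.flags.reset == 1

Comparing the two captures should show whether retransmitted segments are getting dropped on the dom0 side or never leave the guest at all.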

> 
> Thanks,
> Timothy
> 
> PS: I'm on Xen testing 4.2.1. The dom0 runs a Debian 3.6.6 kernel, and the
> domU runs a 3.6.9 kernel built from the Debian source package.
> >>
> >> Thanks,
> >> Timothy
> >>
> >> On Wed, Jan 9, 2013 at 4:47 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> >> > Hi Konrad,
> >> > Do you have any suggestions on how to troubleshoot the NFS mount issue
> >> > described below?
> >> > The broken connection is quite suspicious to me.
> >> >
> >> > Thanks,
> >> > Timothy
> >> >
> >> > On Wed, Jan 9, 2013 at 1:15 AM, Stefano Stabellini
> >> > <stefano.stabellini@eu.citrix.com> wrote:
> >> >> Do you mean the maintainer of the Linux PV network frontend and backend
> >> >> drivers (netfront and netback)?
> >> >> That would be Konrad.
> >> >>
> >> >> On Tue, 8 Jan 2013, G.R. wrote:
> >> >>> Nobody has responded...
> >> >>>
> >> >>> Stefano, could you point me to the PVNET owner?
> >> >>> I suspect this has something to do with the net emulation.
> >> >>>
> >> >>> Thanks,
> >> >>> Timothy
> >> >>>
> >> >>> On Sat, Jan 5, 2013 at 1:12 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> >> >>> > Forward this to the devel list.
> >> >>> >
> >> >>> >
> >> >>> > ---------- Forwarded message ----------
> >> >>> > From: G.R. <firemeteor@users.sourceforge.net>
> >> >>> > Date: Sat, Jan 5, 2013 at 1:12 AM
> >> >>> > Subject: Unstable NFS mount at heavy load.
> >> >>> > To: xen-users@lists.xen.org
> >> >>> >
> >> >>> >
> >> >>> > I was running an I/O performance benchmark using iozone3.
> >> >>> > In my setup, the dom0 resides on a small USB stick and all the storage
> >> >>> > comes from an NFS mount.
> >> >>> > I tested NFS performance on both dom0 and domU, mounting from the same server.
> >> >>> >
> >> >>> > The dom0 test works just fine, but the domU run suffers from an unstable NFS mount.
> >> >>> > Since this is an NFS root, the domU just appears to be frozen.
> >> >>> >
> >> >>> > The logs from both ends of the NFS mount show that the connection is broken:
> >> >>> > Note that the client timestamps are about 20 seconds ahead of the server's.
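
The exact iozone invocation isn't given in the thread; something of roughly this shape (flags, sizes, and paths are only illustrative) generates the kind of sustained NFS load being described:

  # Illustrative only - the real command line used in the test is unknown.
  # -a: automatic mode, -g 4g: cap the maximum file size,
  # -i 0 -i 1: write/rewrite and read/reread tests,
  # -f: put the test file on the NFS-mounted path.
  iozone -a -g 4g -i 0 -i 1 -f /mnt/nfs/iozone.tmp
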
> >> >>> >
> >> >>> > From the domU (client end):
> >> >>> > Jan  4 23:31:16 debvm kernel: [  371.008142] nfs: server 192.168.1.8 not responding, still trying   //(once)
> >> >>> > Jan  4 23:31:25 debvm kernel: [  379.928142] nfs: server 192.168.1.8 not responding, still trying   //(28 times within the same second)
> >> >>> > Jan  4 23:31:26 debvm kernel: [  381.396143] nfs: server 192.168.1.8 not responding, still trying   //(once)
> >> >>> > Jan  4 23:31:44 debvm kernel: [  399.452129] nfs: server 192.168.1.8 not responding, still trying   //(14 times within the same second)
> >> >>> > Jan  4 23:31:45 debvm kernel: [  399.524210] nfs: server 192.168.1.8 not responding, still trying   //(15 times within the same second)
> >> >>> > Jan  4 23:31:46 debvm kernel: [  400.964142] nfs: server 192.168.1.8 not responding, still trying   //(once)
> >> >>> > Jan  4 23:31:55 debvm kernel: [  410.468787] nfs: server 192.168.1.8 OK   //(25 times within the same second)
> >> >>> > Jan  4 23:31:56 debvm kernel: [  410.520202] nfs: server 192.168.1.8 OK   //(32 times within the same second)
> >> >>> > Jan  4 23:32:05 debvm kernel: [  420.208141] nfs: server 192.168.1.8 not responding, still trying   //(21 times within the same second)
> >> >>> > Jan  4 23:32:09 debvm kernel: [  424.367613] nfs: server 192.168.1.8 OK   //(25 times within the same second)
> >> >>> > Jan  4 23:32:11 debvm kernel: [  425.764143] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:32:11 debvm kernel: [  425.772031] nfs: server 192.168.1.8 OK
> >> >>> > Jan  4 23:32:11 debvm kernel: [  426.466328] nfs: server 192.168.1.8 OK
> >> >>> > Jan  4 23:33:32 debvm kernel: [  507.136150] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:34:20 debvm kernel: [  555.170556] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:37:28 debvm kernel: [  742.616155] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:39:39 debvm kernel: [  873.880200] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:40:15 debvm kernel: [  909.987313] nfs: server 192.168.1.8 OK   //(91 times within the same second)
> >> >>> > Jan  4 23:40:27 debvm kernel: [  921.776152] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:40:34 debvm kernel: [  929.314639] nfs: server 192.168.1.8 OK
> >> >>> > Jan  4 23:42:05 debvm kernel: [ 1019.584149] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:42:13 debvm kernel: [ 1028.504158] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:42:53 debvm kernel: [ 1067.565487] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:44:28 debvm kernel: [ 1163.368977] nfs: server 192.168.1.8 OK
> >> >>> > Jan  4 23:44:33 debvm kernel: [ 1168.337859] nfs: server 192.168.1.8 OK
> >> >>> > Jan  4 23:45:41 debvm kernel: [ 1236.448135] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:49:37 debvm kernel: [ 1471.960302] nfs: server 192.168.1.8 not responding, still trying
> >> >>> > Jan  4 23:51:00 debvm kernel: [ 1554.982479] nfs: server 192.168.1.8 OK
> >> >>> >
> >> >>> > From the server side:
> >> >>> > Jan  4 23:31:33 Hasim kernel: rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket
> >> >>> > Jan  4 23:31:33 Hasim kernel: nfsd: peername failed (err 107)!
> >> >>> > Jan  4 23:39:50 Hasim kernel: rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket
> >> >>> > Jan  4 23:39:50 Hasim kernel: nfsd: peername failed (err 107)!
> >> >>> > Jan  4 23:39:50 Hasim kernel: nfsd: peername failed (err 107)!
> >> >>> > Jan  4 23:40:10 Hasim kernel: rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket
> >> >>> > Jan  4 23:44:01 Hasim kernel: rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket
> >> >>> > Jan  4 23:44:01 Hasim kernel: net_ratelimit: 11 callbacks suppressed
> >> >>> > Jan  4 23:44:01 Hasim kernel: nfsd: peername failed (err 107)!
> >> >>> > Jan  4 23:50:38 Hasim kernel: rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket
> >> >>> > Jan  4 23:50:38 Hasim kernel: nfsd: peername failed (err 107)!
> >> >>> >
> >> >>> >
> >> >>> > Any suggestions on how to debug this issue?
> >> >>> > My Xen version is 4.2.1, the domU kernel is 3.6.9, and the domU is a PVHVM guest.
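
For what it's worth, error -104 on the server is ECONNRESET and err 107 is ENOTCONN, i.e. nfsd finds its TCP connection to the client already torn down. A quick way to see how much the client side is retransmitting while the benchmark runs (a sketch only; run inside the domU, with the standard NFS port 2049 assumed):

  # RPC-level retransmission counters for the NFS client:
  nfsstat -rc
  # TCP-level retransmission and reset counters:
  netstat -s | grep -iE 'retrans|reset'
  # State and timers of the TCP connection to the NFS server:
  ss -t -o state established '( dport = :2049 )'
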
> >> >>> >
> >> >>> > Thanks,
> >> >>> > Timothy
> >> >>>
> >>

