From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: unexpected NFS timeouts, related to sync/async soft mounts over TCP
Date: Wed, 16 Nov 2011 14:51:03 +0000
Message-ID: <4EC3CDD7.801@citrix.com>
In-Reply-To: <4EC2790A.80706@citrix.com>

Further debugging shows that the FINs are being sent because of a call
to xs_tcp_release_xprt() in which req->rq_bytes_sent !=
req->rq_snd_buf.len, i.e. the request has only been partially sent.
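
To make the condition concrete, here is a minimal userspace sketch of
the check as I understand it.  It is not the kernel source: struct
fake_rqst and release_forces_close() are hypothetical names that only
model the two fields mentioned above, and treating "nothing sent yet"
as a clean release is my assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct rpc_rqst fields named above. */
struct fake_rqst {
	size_t rq_bytes_sent;  /* bytes of the request already written to the socket */
	size_t rq_snd_buf_len; /* total length of the encoded request (rq_snd_buf.len) */
};

/*
 * Returns true when releasing the transport would leave a partially
 * sent request on the wire, which is the case that leads to the
 * connection being closed (and hence the FIN).
 */
static bool release_forces_close(const struct fake_rqst *req)
{
	if (req == NULL)
		return false;          /* no request attached: nothing to clean up */
	if (req->rq_bytes_sent == 0)
		return false;          /* assumption: nothing sent, stream still intact */
	if (req->rq_bytes_sent == req->rq_snd_buf_len)
		return false;          /* whole request sent: stream still intact */
	return true;                   /* partial send: record boundary is lost */
}
```

With a 128-byte request, 0 or 128 bytes sent leaves the connection
alone, whereas 64 bytes sent is the partial case that forces a close.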

Some of the time, the NetApp server FIN+ACKs and the TCP connection goes
down and back up without adversely affecting the NFS session.  However,
some of the time, the server does not FIN+ACK the client's FIN.  The
client then waits 15 seconds before it RSTs the TCP connection, which
causes the visible problems in the NFS session.

I would say that the NetApp not FIN+ACKing is a bug in itself, but I
would also say that it is a bug for the client to not be able to send
all of its send buffer.

Are there cases where not sending its send buffer is expected, or is it
a state which should be avoided?

~Andrew

On 15/11/11 14:36, Andrew Cooper wrote:
> Sorry for a slow reply - this is unfortunately not the only bug I am
> working on.
>
> After further testing, this problem does actually reproduce with
> synchronous mounts as well as asynchronous mounts.  It just takes some
> extreme stress testing to reproduce with synchronous mounts.
>
> After some debugging in xs_tcp_shutdown()  (a cheeky dump_stack()), it
> appears that periodically xprt_autoclose() is closing the TCP connection.
>
> It appears that some of the time, the server correctly FIN+ACKs the
> first FIN, at which point the TCP connection is torn down and set back
> up, with no interruption to the NFS session.  However, some of the time,
> the server does not FIN+ACK the client's FIN, at which point the client
> waits 15 seconds and RSTs the TCP connection, leading to the errors seen.
>
> What is the purpose of xprt_autoclose() ? I assume it is to
> automatically close idle connections.  Am I correct in assuming that it
> should not be attempting to close an active connection?
>
> Thanks,
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

