public inbox for linux-nfs@vger.kernel.org
From: Chuck Lever <chuck.lever@oracle.com>
To: "Batsakis, Alexandros" <Alexandros.Batsakis@netapp.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 6/6] RPC: adjust timeout for connect, bind, restablish so that they sensitive to the major time out value
Date: Mon, 08 Feb 2010 13:43:47 -0500
Message-ID: <4B705B63.8060604@oracle.com>
In-Reply-To: <2CDC4373-10AD-4F84-BA44-3C2106D590BE@netapp.com>

On 02/06/2010 11:11 PM, Batsakis, Alexandros wrote:
>
>
> On Feb 6, 2010, at 16:53, "Chuck Lever" <chuck.lever@oracle.com> wrote:
>
>> On 02/05/2010 11:05 PM, Batsakis, Alexandros wrote:
>>>
>>> My replies marked with the "AB" prefix
>>>
>>> -----Original Message-----
>>> From: Chuck Lever [mailto:chuck.lever@oracle.com]
>>> Sent: Fri 2/5/2010 4:11 PM
>>> To: Batsakis, Alexandros
>>> Cc: Batsakis, Alexandros; linux-nfs@vger.kernel.org; Myklebust, Trond
>>> Subject: Re: [PATCH 6/6] RPC: adjust timeout for connect, bind,
>>> restablish so that they sensitive to the major time out value
>>>
>>> On 02/05/2010 06:04 PM, Batsakis, Alexandros wrote:
>>> >
>>> >
>>> > On Feb 5, 2010, at 14:47, "Chuck Lever" <chuck.lever@oracle.com> wrote:
>>> >
>>> >> On 02/05/2010 05:14 PM, Batsakis, Alexandros wrote:
>>> >>> Yeah sure,
>>> >>>
>>> >>> So imagine that for a specific connection the remaining major timeo
>>> >>> value is 30secs. Xs_connect has a default timeout before attempting
>>> >>> to reconnect of 60secs. The user (NFS) expects to "hear back" from
>>> >>> the rpc layer within the timeout as in often cases e.g. lease
>>> >>> renewal, it's of no benefit for an operation to reach the server at
>>> >>> a later time and miss the critical time because it was sleeping for
>>> >>> an arbitrary amount of time.
>>>
>>> Maybe you want RPC_TASK_SOFTCONN for NFSv4 renewals instead of
>>> RPC_TASK_SOFT. This would cause the RENEW request to fail immediately
>>> if the transport can't connect.
>>>
>>> AB: is this a new flag ? I am not familiar with it. Or are you proposing
>>> to add such a flag?
>>> It's not an unreasonable thing to do
>>
>> The flag was added recently (maybe in 2.6.33-rc?). It causes an
>> individual RPC request to fail immediately if the underlying transport
>> cannot be connected. It bypasses the reconnect timeout if the
>> transport is not already connected.
>>
>
> Oh OK. Maybe then it's a reasonable workaround to the reconnection
> policy changes. I think though that the rest of the changes wrt the
> major timeout are still valid. Also IMHO the max of 5min seems a lot,
> especially for operations that are state-oriented like in v4.0 and v4.1.

I'm not averse to reducing the maximum reconnect delay to something like
60 seconds.  This might even be an acceptable workaround for some of
the issues you've raised.  A closer look at current reconnect behavior
may show that it no longer behaves as expected in the quick server
reboot cases, and that should also be addressed.
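
To make the effect of a lower cap concrete, here is a minimal sketch of a
capped exponential reconnect backoff.  It is not the sunrpc code: the
function names, the 3-second starting delay, and the doubling policy are
assumptions for illustration.  Only the 5-minute and 60-second caps come
from this thread.

```c
/*
 * Illustrative only: hypothetical helpers, not taken from sunrpc.
 * All delays are in seconds.
 */
#define INIT_REEST_SECS 3U   /* assumed starting delay */

/* Double the previous delay and clamp it to max_secs. */
unsigned int next_reest_delay(unsigned int cur_secs, unsigned int max_secs)
{
	unsigned int next = cur_secs ? cur_secs * 2 : INIT_REEST_SECS;

	return next > max_secs ? max_secs : next;
}

/* Total time spent waiting across the first 'attempts' failed connects. */
unsigned int total_wait(unsigned int attempts, unsigned int max_secs)
{
	unsigned int delay = 0, total = 0;
	unsigned int i;

	for (i = 0; i < attempts; i++) {
		delay = next_reest_delay(delay, max_secs);
		total += delay;
	}
	return total;
}
```

Under these assumptions, a 60-second cap is reached on the sixth failed
attempt, while a 300-second cap is not reached until the eighth, so a
lower cap mostly matters for servers that stay down for several minutes.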

However, the fact that _all_ NFSv4 state-changing operations now have 
additional delivery constraints makes this an issue larger than RENEWD 
(which is the subject line of your original postings).  IMO sunrpc.ko is 
not currently prepared to handle that kind of timing constraint 
adequately.  Adjusting the retransmit behavior is simply not sufficient 
to address these problems (and perhaps it is even orthogonal to them).

-- 
chuck[dot]lever[at]oracle[dot]com
