From: Chuck Lever <chuck.lever@oracle.com>
To: "Batsakis, Alexandros" <Alexandros.Batsakis@netapp.com>
Cc: Alexandros Batsakis <batsakis@netapp.com>,
linux-nfs@vger.kernel.org, trond@netapp.com
Subject: Re: [PATCH 6/6] RPC: adjust timeout for connect, bind, restablish so that they sensitive to the major time out value
Date: Fri, 05 Feb 2010 17:45:59 -0500 [thread overview]
Message-ID: <4B6C9FA7.2010702@oracle.com> (raw)
In-Reply-To: <383F4881-BD88-4155-B605-4D24F5B05BDD@netapp.com>
On 02/05/2010 05:14 PM, Batsakis, Alexandros wrote:
> Yeah sure,
>
> So imagine that for a specific connection the remaining major timeo
> value is 30secs. Xs_connect has a default timeout before attempting to
> reconnect of 60secs. The user (NFS) expects to "hear back" from the rpc
> layer within the timeout as in often cases e.g. lease renewal, it's of
> no benefit for an operation to reach the server at a later time and miss
> the critical time because it was sleeping for an arbitrary amount of time.
>
> My patch ensures that the reconnection attempts happen within the
> user-specified limits by replacing the 60secs default timeout with a
> value that is less than the time to expire. If this is not ideal the
> user should readjust the timeout value.
>
> Does it make sense?
That's better, but this explanation needs to accompany the patch, as the
long description. It's hard for reviewers who are not familiar with the
constraints on NFSv4 RENEW to understand what you're trying to fix.
Same comment applies to the other patches in your series, I think. (And
see further comments in the code below).
You are changing a generic part of the RPC client to fix an issue with
one specific NFSv4 procedure, it seems to me. Did you actually observe
a situation where the reconnect delay caused the server to miss a RENEW
request?
There are good reasons why there is a timeout before reestablishing a
connection. Have you tested this patch with NFSv3 servers that are
going up and down repeatedly, for example? I think skipping the
reconnect delay could have consequences for the cases which the original
xs_connect logic is supposed to address, and it's not something we want
in many other cases.
Perhaps a better idea would be to mark these particular RPCs with some
kind of indication that the RPC client has to connect immediately for
this request, if possible. Similar to RPC_TASK_SOFTCONN.
In general, sunrpc.ko has a problem with this kind of "urgent" RPC
request. If the RPC backlog queue is large for a particular rpc_clnt,
it can often take many seconds (like longer than the major timeout) for
a request to actually get down to the transport and get sent. I don't
see that these timeout changes necessarily address that at all.
I would suggest that the best solution might be a separate transport for
such requests, but that has other consequences. Maybe our RPC client
needs a separate "urgent" request queue that bypasses the normal backlog
queue for such time-dependent requests?
> -alexandros
>
> On Feb 5, 2010, at 12:14, "Chuck Lever" <chuck.lever@oracle.com> wrote:
>
>> Hi Alexandros-
>>
>> I think we need a little more than the short description for these
>> patches. Can you explain why the extra logic in xs_connect is
>> necessary? What exactly happens if the RENEW doesn't reach the server
>> because the transport can't be reconnected?
>>
>> In other words, what problem are you trying to address here?
>>
>> On 02/02/2010 07:06 PM, Alexandros Batsakis wrote:
>>> Signed-off-by: Alexandros Batsakis<batsakis@netapp.com>
>>> ---
>>> net/sunrpc/clnt.c | 2 +-
>>> net/sunrpc/sunrpc.h | 2 ++
>>> net/sunrpc/xprt.c | 18 +++++++++++++++---
>>> net/sunrpc/xprtsock.c | 12 ++++++++++--
>>> 4 files changed, 28 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>> index 154034b..8e552cd 100644
>>> --- a/net/sunrpc/clnt.c
>>> +++ b/net/sunrpc/clnt.c
>>> @@ -1019,7 +1019,7 @@ call_bind(struct rpc_task *task)
>>> task->tk_action = call_connect;
>>> if (!xprt_bound(xprt)) {
>>> task->tk_action = call_bind_status;
>>> - task->tk_timeout = xprt->bind_timeout;
>>> + xprt_set_timeout_sane(task, xprt->bind_timeout);
>>> xprt->ops->rpcbind(task);
>>> }
>>> }
>>> diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
>>> index 90c292e..87c7aaa 100644
>>> --- a/net/sunrpc/sunrpc.h
>>> +++ b/net/sunrpc/sunrpc.h
>>> @@ -37,6 +37,8 @@ struct rpc_buffer {
>>> char data[];
>>> };
>>>
>>> +void xprt_set_timeout_sane(struct rpc_task *task, unsigned long
>>> timeout);
>>> +
>>> static inline int rpc_reply_expected(struct rpc_task *task)
>>> {
>>> return (task->tk_msg.rpc_proc != NULL)&&
>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>> index 469de29..f6f3988 100644
>>> --- a/net/sunrpc/xprt.c
>>> +++ b/net/sunrpc/xprt.c
>>> @@ -482,7 +482,7 @@ void xprt_wait_for_buffer_space(struct rpc_task
>>> *task, rpc_action action)
>>> struct rpc_rqst *req = task->tk_rqstp;
>>> struct rpc_xprt *xprt = req->rq_xprt;
>>>
>>> - task->tk_timeout = req->rq_timeout;
>>> + xprt_set_timeout_sane(task, req->rq_timeout);
>>> rpc_sleep_on(&xprt->pending, task, action);
>>> }
>>> EXPORT_SYMBOL_GPL(xprt_wait_for_buffer_space);
>>> @@ -518,7 +518,7 @@ EXPORT_SYMBOL_GPL(xprt_write_space);
>>> */
>>> void xprt_set_retrans_timeout_def(struct rpc_task *task)
>>> {
>>> - task->tk_timeout = task->tk_rqstp->rq_timeout;
>>> + xprt_set_timeout_sane(task, task->tk_rqstp->rq_timeout);
>>> }
>>> EXPORT_SYMBOL_GPL(xprt_set_retrans_timeout_def);
>>>
>>> @@ -682,6 +682,18 @@ out_abort:
>>> spin_unlock(&xprt->transport_lock);
>>> }
>>>
>>> +void xprt_set_timeout_sane(struct rpc_task *task, unsigned long
>>> timeout)
>>
>> xprt_set_timeout_sane? That's amusing, but it isn't very descriptive
>> to someone who doesn't know why this extra function is needed.
>>
>>> +{
>>> + unsigned long majorto = task->tk_rqstp->rq_majortimeo;
>>> +
>>> + if (majorto<= jiffies)
>>> + task->tk_timeout = 5 * HZ;
>>> + else if (timeout> majorto - jiffies)
>>> + task->tk_timeout = 2 * (majorto - jiffies) / 3;
>>> + else
>>> + task->tk_timeout = timeout;
>>> +}
>>> +
>>> /**
>>> * xprt_connect - schedule a transport connect operation
>>> * @task: RPC task that is requesting the connect
>>> @@ -710,7 +722,7 @@ void xprt_connect(struct rpc_task *task)
>>> if (task->tk_rqstp)
>>> task->tk_rqstp->rq_bytes_sent = 0;
>>>
>>> - task->tk_timeout = xprt->connect_timeout;
>>> + xprt_set_timeout_sane(task, xprt->connect_timeout);
>>> rpc_sleep_on(&xprt->pending, task, xprt_connect_status);
>>> xprt->stat.connect_start = jiffies;
>>> xprt->ops->connect(task);
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 721bafd..7fcc97f 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -2020,9 +2020,17 @@ static void xs_connect(struct rpc_task *task)
>>> return;
>>>
>>> if (transport->sock != NULL&& !RPC_IS_SOFTCONN(task)) {
>>> + unsigned long majorto = task->tk_rqstp->rq_majortimeo;
>>> +
>>> + if (majorto<= jiffies)
>>> + xprt->reestablish_timeout = 5 * HZ;
>>> + else if (xprt->reestablish_timeout> majorto - jiffies)
>>> + xprt->reestablish_timeout = 2 * (majorto - jiffies)
>>> + / 3;
>>
>> This looks like the code you added above in _sane. Can you reuse that
>> code here?
>>
>>> +
>>> dprintk("RPC: xs_connect delayed xprt %p for %lu "
>>> - "seconds\n",
>>> - xprt, xprt->reestablish_timeout / HZ);
>>> + "seconds\n",
>>> + xprt, xprt->reestablish_timeout + (HZ - 1) / HZ);
>>> queue_delayed_work(rpciod_workqueue,
>>> &transport->connect_worker,
>>> xprt->reestablish_timeout);
>>
>>
>> --
>> chuck[dot]lever[at]oracle[dot]com
--
chuck[dot]lever[at]oracle[dot]com
next prev parent reply other threads:[~2010-02-05 22:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-03 0:06 [PATCH 0/6] nfs: renewd fixes Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 1/6] nfs: kill renewd before clearing client minor version Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 2/6] nfs: prevent backlogging of renewd requests Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 3/6] nfs41: fix race between umount and renewd sequence operations Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 4/6] nfs4: fix race between umount and renewd renew operations Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 5/6] nfs4: adjust rpc timeout for nfs_client rpc client based on the lease_time Alexandros Batsakis
2010-02-03 0:06 ` [PATCH 6/6] RPC: adjust timeout for connect, bind, restablish so that they sensitive to the major time out value Alexandros Batsakis
2010-02-05 20:12 ` Chuck Lever
2010-02-05 22:14 ` Batsakis, Alexandros
2010-02-05 22:45 ` Chuck Lever [this message]
2010-02-05 23:04 ` Batsakis, Alexandros
2010-02-06 0:11 ` Chuck Lever
[not found] ` <B9364369CA66BF45806C2CD86EAB8BA60259D23D@SACMVEXC3-PRD.hq.netapp.com>
2010-02-07 0:53 ` Chuck Lever
[not found] ` <2CDC4373-10AD-4F84-BA44-3C2106D590BE@netapp.com>
2010-02-08 18:43 ` Chuck Lever
2010-02-08 23:13 ` Batsakis, Alexandros
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B6C9FA7.2010702@oracle.com \
--to=chuck.lever@oracle.com \
--cc=Alexandros.Batsakis@netapp.com \
--cc=batsakis@netapp.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond@netapp.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox