From: Boaz Harrosh <bharrosh@panasas.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: NFS list <linux-nfs@vger.kernel.org>,
Stable Tree <stable@vger.kernel.org>
Subject: Re: [PATCH v2] pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
Date: Wed, 15 Jan 2014 01:41:56 +0200
Message-ID: <52D5CB44.6080605@panasas.com>
In-Reply-To: <153DAED6-461B-43A4-A5B2-A79C8E893285@primarydata.com>
On 01/15/2014 12:47 AM, Trond Myklebust wrote:
>
> On Jan 14, 2014, at 17:43, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>
>>
>> On Jan 14, 2014, at 17:21, Boaz Harrosh <bharrosh@panasas.com> wrote:
>>
>>> On 01/14/2014 09:05 PM, Trond Myklebust wrote:
>>>> On Tue, 2014-01-14 at 17:32 +0200, Boaz Harrosh wrote:
>>>>>
>>>>
>>>> For the default mount option of 'timeo=600', and the default #define
>>>> NFS4_POLL_RETRY_MIN==HZ/10, this means we can end up pounding the server
>>>> with 600 LAYOUTGET requests within the space of 1 minute, before giving
>>>> up. Is that reasonable?
>>>>
>>>
>>> It will never get there; it will always be one or two sends. Usually it is
>>> just so the sequence of layout_get_done is out of the way and the
>>> LAYOUT_RECALL sequence+1 can get through and the layout released. Then
>>> the next time it will all be good and the LAYOUT_GET will succeed.
>>>
>>> Worst case is when the client is very busy with a queue full of IO
>>> on the same busy layout that needs to be released by the recall. Personally
>>> I found that this never exceeds 40 IOPs in flight. Note that this is not
>>> the amount of total dirty memory but only the amount of already submitted
>>> IO. I guess that on a very slow connection these can take time but at
>>> regular line speeds I never observed more than 2 retries with this patch.
>>>
>>> It is all up to the client. NFS4ERR_RECALLCONFLICT means "the layouts you
>>> have need to be released" (I say released because the forgetful model does
>>> not actually return them). Can you see a critical time when layouts are
>>> held for longer than a second?
>>
>> That will probably depend on the workload and possibly on the layout type.
>>
>> My point was, however, about the potential for mischief due to the mismatch between the number of retries that the resulting code allows, and the fixed period between those retries of 1/10 seconds. Why not rather use something along the lines of "rpc_delay(rpc_task, min(giveup - jiffies, max(jiffies - lgp->args.timestamp, NFS4_POLL_RETRY_MIN)));"? That gives you an initially exponential backoff with a minimum period of NFS4_POLL_RETRY_MIN, and with an expiry date of 'timeo' jiffies after the first attempt.
>
> Whoops. That should probably be
>
> max(NFS4_POLL_RETRY_MIN, min(giveup - jiffies , jiffies - lgp->args.timestamp))
>
> so that the time interval is not < NFS4_POLL_RETRY_MIN.
OK, I'll try that.
Thanks
Boaz
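
For readers following the thread, here is a minimal user-space sketch of the
delay expression from Trond's corrected suggestion, compared with the fixed
NFS4_POLL_RETRY_MIN delay. It is not the kernel patch. HZ == 1000, the plain
signed "long" tick counter standing in for jiffies, and the helper names
(layoutget_retry_delay, count_sends_fixed, count_sends_backoff) are all
assumptions made for illustration. The give-up point is taken to be one minute
after the first attempt, as in the 'timeo=600' discussion above.

```c
/* Illustrative user-space model only; the kernel uses unsigned jiffies. */
#define HZ 1000L                        /* assumed tick rate */
#define NFS4_POLL_RETRY_MIN (HZ / 10)   /* 1/10 second, as quoted in the thread */

static long min_l(long a, long b) { return a < b ? a : b; }
static long max_l(long a, long b) { return a > b ? a : b; }

/*
 * Delay before the next LAYOUTGET retry, following the corrected expression:
 *   max(NFS4_POLL_RETRY_MIN, min(giveup - jiffies, jiffies - timestamp))
 * 'timestamp' is the tick of the first attempt, 'giveup' the expiry tick.
 */
static long layoutget_retry_delay(long now, long timestamp, long giveup)
{
    return max_l(NFS4_POLL_RETRY_MIN, min_l(giveup - now, now - timestamp));
}

/* Number of sends before 'giveup' when every retry waits NFS4_POLL_RETRY_MIN. */
static int count_sends_fixed(long timestamp, long giveup)
{
    int sends = 0;
    for (long now = timestamp; now < giveup; now += NFS4_POLL_RETRY_MIN)
        sends++;
    return sends;
}

/* Number of sends before 'giveup' when the delay grows with elapsed time. */
static int count_sends_backoff(long timestamp, long giveup)
{
    int sends = 0;
    long now = timestamp;
    while (now < giveup) {
        sends++;
        now += layoutget_retry_delay(now, timestamp, giveup);
    }
    return sends;
}
```

Under these assumptions, count_sends_fixed(0, 60 * HZ) gives the 600 requests
Trond describes, while count_sends_backoff(0, 60 * HZ) gives 11: the delay
doubles from 100 ticks upward and the last wait is clipped to the time left
before giveup. A server that keeps returning NFS4ERR_RECALLCONFLICT for the
whole window is the assumed worst case here, not the common one Boaz reports.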
Thread overview: 6+ messages
2014-01-14 15:32 [PATCH v2] pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done Boaz Harrosh
2014-01-14 19:05 ` Trond Myklebust
2014-01-14 22:21 ` Boaz Harrosh
2014-01-14 22:43 ` Trond Myklebust
2014-01-14 22:47 ` Trond Myklebust
2014-01-14 23:41 ` Boaz Harrosh [this message]