From: Boaz Harrosh <bharrosh@panasas.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: NFS list <linux-nfs@vger.kernel.org>,
Stable Tree <stable@vger.kernel.org>
Subject: Re: [PATCH v2] pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
Date: Wed, 15 Jan 2014 01:41:56 +0200
Message-ID: <52D5CB44.6080605@panasas.com>
In-Reply-To: <153DAED6-461B-43A4-A5B2-A79C8E893285@primarydata.com>
On 01/15/2014 12:47 AM, Trond Myklebust wrote:
>
> On Jan 14, 2014, at 17:43, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>
>>
>> On Jan 14, 2014, at 17:21, Boaz Harrosh <bharrosh@panasas.com> wrote:
>>
>>> On 01/14/2014 09:05 PM, Trond Myklebust wrote:
>>>> On Tue, 2014-01-14 at 17:32 +0200, Boaz Harrosh wrote:
>>>>>
>>>>
>>>> For the default mount option of 'timeo=600', and the default #define
>>>> NFS4_POLL_RETRY_MIN==HZ/10, this means we can end up pounding the server
>>>> with 600 LAYOUTGET requests within the space of 1 minute, before giving
>>>> up. Is that reasonable?
>>>>
>>>
>>> It will never get there; it will always be one or two sends. Usually it is
>>> just so the sequence of layout_get_done is out of the way and the
>>> LAYOUT_RECALL sequence+1 can get through and the layout released. Then
>>> the next time it will all be good and the LAYOUT_GET will succeed.
>>>
>>> Worst case is when the client is very busy with a queue full of IO
>>> on the same busy layout that needs to be released by the recall. Personally
>>> I found that this never exceeds 40 IOPs in flight. Note that this is not
>>> the amount of total dirty memory but only the amount of already submitted
>>> IO. I guess that on a very slow connection these can take time, but at
>>> regular line speeds I never observed more than 2 retries with this patch.
>>>
>>> It is all up to the client. NFS4ERR_RECALLCONFLICT means "the layouts you
>>> have need to be released" (I say released because the forgetful model does
>>> not actually return them). Can you see a critical time when layouts are
>>> held for longer than a second?
>>
>> That will probably depend on the workload and possibly on the layout type.
>>
>> My point was, however, about the potential for mischief due to the mismatch between the number of retries that the resulting code allows, and the fixed period between those retries of 1/10 second. Why not rather use something along the lines of "rpc_delay(rpc_task, min(giveup - jiffies, max(jiffies - lgp->args.timestamp, NFS4_POLL_RETRY_MIN)));"? That gives you an initially exponential backoff with a minimum period of NFS4_POLL_RETRY_MIN, and with an expiry date of 'timeo' jiffies after the first attempt.
>
> Whoops. That should probably be
>
> max(NFS4_POLL_RETRY_MIN, min(giveup - jiffies , jiffies - lgp->args.timestamp))
>
> so that the time interval is not < NFS4_POLL_RETRY_MIN.
OK I'll try that.
Thanks
Boaz
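
For reference, the corrected delay expression quoted above can be tried out in user space. The following is only a sketch under stated assumptions, not the kernel code: SIM_HZ, SIM_POLL_RETRY_MIN, backoff_delay() and count_sends() are hypothetical names, the tick rate is assumed to be 1000, times are plain signed longs rather than wrapping unsigned jiffies, and every LAYOUTGET is assumed to fail immediately with NFS4ERR_RECALLCONFLICT.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel values; not taken from the patch. */
#define SIM_HZ             1000L          /* assumed tick rate */
#define SIM_POLL_RETRY_MIN (SIM_HZ / 10)  /* mirrors NFS4_POLL_RETRY_MIN == HZ/10 */

static long min_l(long a, long b) { return a < b ? a : b; }
static long max_l(long a, long b) { return a > b ? a : b; }

/*
 * Corrected expression from the thread:
 *   max(NFS4_POLL_RETRY_MIN, min(giveup - jiffies, jiffies - timestamp))
 * 'now' plays the role of jiffies, 'timestamp' the time of the first attempt.
 */
long backoff_delay(long now, long timestamp, long giveup)
{
    assert(now >= timestamp);
    return max_l(SIM_POLL_RETRY_MIN, min_l(giveup - now, now - timestamp));
}

/*
 * Count how many LAYOUTGETs are sent before 'giveup', assuming each one
 * fails at once (zero round-trip time). use_backoff == 0 models the fixed
 * HZ/10 delay; use_backoff != 0 models the expression above.
 */
int count_sends(int use_backoff, long timeo_ticks)
{
    long timestamp = 0;
    long now = timestamp;
    long giveup = timestamp + timeo_ticks;
    int sends = 0;

    while (now < giveup) {
        sends++;
        now += use_backoff ? backoff_delay(now, timestamp, giveup)
                           : SIM_POLL_RETRY_MIN;
    }
    return sends;
}
```

Under these assumptions, count_sends(0, 60 * SIM_HZ) gives the 600 sends in one minute that Trond describes, while count_sends(1, 60 * SIM_HZ) gives 11, because the wait doubles after the first two attempts and the last wait is capped by the time remaining. Real kernel code would also have to deal with jiffies wraparound, which this sketch ignores.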
Thread overview: 6+ messages
2014-01-14 15:32 [PATCH v2] pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done Boaz Harrosh
2014-01-14 19:05 ` Trond Myklebust
2014-01-14 22:21 ` Boaz Harrosh
2014-01-14 22:43 ` Trond Myklebust
2014-01-14 22:47 ` Trond Myklebust
2014-01-14 23:41 ` Boaz Harrosh [this message]