linux-fsdevel.vger.kernel.org archive mirror
From: Mikael Davranche <mikael.davranche@free.fr>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
Date: Wed, 12 Mar 2008 12:26:21 +0100
Message-ID: <1205321181.47d7bddd9c869@imp.free.fr>
In-Reply-To: <1205277943.16687.14.camel@heimdal.trondhjem.org>

> You want to reduce the retransmission timeout on NLM because you receive
> more than 1 email per retransmission timeout? I can't see how the two
> are related.


To explain, let's look at two examples. They use the following notation:

lro: Lock Reclaim OK (the server didn't send NLM_BLOCKED)
lre: Lock RElease (the client doesn't want the lock anymore)
lrn: Lock Reclaim Not OK (the server sent NLM_BLOCKED and the client will
retry x seconds later)


Example 1

2 e-mail servers, 1 NAS, 1 e-mail every 10 seconds on each server, 2 seconds to
store an e-mail
retransmission timeout: 30 seconds (default)

time    000 001 002 010 012 020 022 030 031 032
server1 lro     lre lro lre lro lre lro     lre
server2     lrn                         lrn

Report:

Between t=0 and t=32,
e-mails processed by server1: 4
e-mails processed by server2: 0

t=32,
e-mails in server1's local queue: 0
e-mails in server2's local queue: 4


Example 2

2 e-mail servers, 1 NAS, 1 e-mail every 10 seconds on each server, 2 seconds to
store an e-mail
retransmission timeout: 3 seconds

time    000 001 002 005 007 010 011 012 013 015...
server1 lro     lre         lro     lre        ...
server2     lrn     lro lre     lrn     lro lre...

Report:

Between t=0 and t=15,
e-mails processed by server1: 2
e-mails processed by server2: 2

t=15,
e-mails in server1's local queue: 0
e-mails in server2's local queue: 0


Of course, a server never receives EXACTLY 1 e-mail every 10 seconds, but these
two examples summarize what we see in a production environment.
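
For anyone who wants to play with the numbers, here is a rough Python model of
the two scenarios. It is only a sketch under simplifying assumptions: time
advances in whole seconds, server1 gets an e-mail at t=0, 10, 20, ... and
server2 at t=1, 11, 21, ..., a blocked client simply asks again after the
retry delay, and no NLM_GRANTED callback ever arrives. The function name and
parameters are made up for illustration. Grant times come out slightly
different from the hand-drawn timelines above, but the trend is the same.

```python
def simulate(retry, horizon, period=10, store=2, offsets=(0, 1)):
    """Return (processed, queued) per server at time `horizon` (seconds)."""
    n = len(offsets)
    queued = [0] * n       # e-mails waiting in each server's local queue
    processed = [0] * n    # e-mails already stored on the NAS
    next_try = [0] * n     # earliest time each server may ask for the lock
    holder, release_at = None, None

    for t in range(horizon + 1):
        # Release first, so the lock can be taken again in the same second.
        if holder is not None and t >= release_at:
            processed[holder] += 1
            queued[holder] -= 1
            holder = None
        for s in range(n):
            # One new e-mail every `period` seconds, shifted by the offset.
            if t >= offsets[s] and (t - offsets[s]) % period == 0:
                queued[s] += 1
            if queued[s] == 0 or holder == s or t < next_try[s]:
                continue
            if holder is None:
                holder, release_at = s, t + store  # lock granted ("lro")
            else:
                next_try[s] = t + retry            # blocked ("lrn"), poll later
    return processed, queued


if __name__ == "__main__":
    for retry in (30, 3):
        processed, queued = simulate(retry=retry, horizon=32)
        print(f"retry={retry:2d}s processed={processed} queued={queued}")
```

In this model, a 30 second retry leaves server2 with all 4 of its e-mails still
queued at t=32, while a 3 second retry leaves only 1.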


> Normally, the server should call your client back using an NLM_GRANTED
> call as soon as the lock is available. If that isn't happening, then you
> need to look at why not. The retransmission+timeout is supposed to be a
> failsafe for when the NLM_GRANTED mechanism fails, not the main method
> for grabbing a lock.
>
> For instance, it may be that the server is unable to call the client
> back because you've hidden it behind a firewall or NAT, or perhaps your
> netfilter settings on either the client or the server are blocking the
> callback.


The only reason I propose this short series of patches is that there is one
case in which we cannot use the NLM_GRANTED mechanism and must always rely on
the retransmission+timeout failsafe: when the NFS server runs HPUX.

Let's have a look at the comment above the nlmclnt_lock function:

/*
 * LOCK: Try to create a lock
 *
 *                      Programmer Harassment Alert
 *
 * When given a blocking lock request in a sync RPC call, the HPUX lockd
 * will faithfully return LCK_BLOCKED but never cares to notify us when
 * the lock could be granted. This way, our local process could hang
 * around forever waiting for the callback.
 *
 *  Solution A: Implement busy-waiting
 *  Solution B: Use the async version of the call (NLM_LOCK_{MSG,RES})
 *
 * For now I am implementing solution A, because I hate the idea of
 * re-implementing lockd for a third time in two months. The async
 * calls shouldn't be too hard to do, however.
 *
 * This is one of the lovely things about standards in the NFS area:
 * they're so soft and squishy you can't really blame HP for doing this.
 */
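
To make "Solution A" concrete, the busy-waiting idea boils down to a loop like
the one below. This is a user-space Python illustration, not the lockd code:
lock_by_polling, try_lock, retry_interval and max_attempts are hypothetical
names, and try_lock stands for "send a lock request and report whether it was
granted".

```python
import time


def lock_by_polling(try_lock, retry_interval, sleep=time.sleep, max_attempts=None):
    """Call try_lock() until it succeeds, sleeping retry_interval between tries.

    Returns the number of attempts that were needed. No callback from the
    server is assumed: the only way to notice a free lock is to ask again.
    """
    attempts = 0
    while True:
        attempts += 1
        if try_lock():
            return attempts
        if max_attempts is not None and attempts >= max_attempts:
            raise TimeoutError(f"lock still blocked after {attempts} attempts")
        # Blocked: wait before asking again. The lock may sit free for up to
        # retry_interval seconds before the next request is sent.
        sleep(retry_interval)
```

The worst-case delay before the client notices a released lock is one full
retry interval, which is why that interval matters so much when the callback
never comes.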

Note that I ran my tests against a NetApp NAS ;) Data ONTAP only sends
NLM_GRANTED over UDP (not TCP), so we can reproduce the HPUX behaviour by
setting "nlm_udpport = 0" on the client.


Cheers, Mikael

-- 
Mikael Davranche
System Engineer
Atos Worldline, France

