* [PATCH 1/3] NLM: Proposal for a timeout setting on blocking locks
From: Mikael Davranche @ 2008-03-11 12:15 UTC
To: Trond Myklebust, linux-fsdevel
# diff -u a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
--- a/include/linux/lockd/lockd.h Tue Feb 26 01:20:20 2008
+++ b/include/linux/lockd/lockd.h Fri Mar 7 10:33:39 2008
@@ -34,6 +34,11 @@
#define LOCKD_DFLT_TIMEO 10
/*
+ * Default timeout for waiting on an NLM blocking lock (seconds)
+ */
+#define NLMCLNT_POLL_TIMEOUT 30
+
+/*
* Lockd host handle (used both by the client and server personality).
*/
struct nlm_host {
@@ -154,6 +159,7 @@
extern int nlmsvc_grace_period;
extern unsigned long nlmsvc_timeout;
extern int nsm_use_hostnames;
+extern unsigned long nlm_clnt_poll_timeout;
/*
* Lockd client functions
* [PATCH 2/3] NLM: Proposal for a timeout setting on blocking locks
From: Mikael Davranche @ 2008-03-11 12:16 UTC
To: Trond Myklebust, linux-fsdevel
# diff -u a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
--- a/fs/lockd/clntproc.c Tue Feb 26 01:20:20 2008
+++ b/fs/lockd/clntproc.c Fri Mar 7 18:08:58 2008
@@ -20,7 +20,6 @@
#define NLMDBG_FACILITY NLMDBG_CLIENT
#define NLMCLNT_GRACE_WAIT (5*HZ)
-#define NLMCLNT_POLL_TIMEOUT (30*HZ)
#define NLMCLNT_MAX_RETRIES 3
static int nlmclnt_test(struct nlm_rqst *, struct file_lock *);
@@ -525,7 +524,9 @@
if (resp->status != nlm_lck_blocked)
break;
/* Wait on an NLM blocking lock */
- status = nlmclnt_block(block, req, NLMCLNT_POLL_TIMEOUT);
+ if (!nlm_clnt_poll_timeout)
+ nlm_clnt_poll_timeout = NLMCLNT_POLL_TIMEOUT;
+ status = nlmclnt_block(block, req, nlm_clnt_poll_timeout * HZ);
/* if we were interrupted. Send a CANCEL request to the server
* and exit
*/
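For context, this call sits in the retry loop of nlmclnt_lock(). A simplified
sketch of that loop as it looks with the patch applied (error paths and
server-reboot handling elided; not a verbatim excerpt of the 2.6.24 source):

	/* Simplified nlmclnt_lock() retry loop. */
	for (;;) {
		/* Ask the server for the lock. */
		status = nlmclnt_call(req, NLMPROC_LOCK);
		if (status < 0)
			goto out_unblock;
		if (resp->status != nlm_lck_blocked)
			break;
		/* Blocked: sleep until the server's NLM_GRANTED callback
		 * arrives, or until the (now tunable) poll timeout expires,
		 * then retransmit the LOCK request. */
		if (!nlm_clnt_poll_timeout)
			nlm_clnt_poll_timeout = NLMCLNT_POLL_TIMEOUT;
		status = nlmclnt_block(block, req, nlm_clnt_poll_timeout * HZ);
		if (status < 0)
			goto out_unblock;
	}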
* [PATCH 3/3] NLM: Proposal for a timeout setting on blocking locks
From: Mikael Davranche @ 2008-03-11 12:18 UTC
To: Trond Myklebust, linux-fsdevel
# diff -u a/fs/lockd/svc.c b/fs/lockd/svc.c
--- a/fs/lockd/svc.c Tue Feb 26 01:20:20 2008
+++ b/fs/lockd/svc.c Mon Mar 10 16:21:58 2008
@@ -62,6 +62,7 @@
*/
static unsigned long nlm_grace_period;
static unsigned long nlm_timeout = LOCKD_DFLT_TIMEO;
+unsigned long nlm_clnt_poll_timeout = NLMCLNT_POLL_TIMEOUT;
static int nlm_udpport, nlm_tcpport;
int nsm_use_hostnames = 0;
@@ -72,6 +73,8 @@
static const unsigned long nlm_grace_period_max = 240;
static const unsigned long nlm_timeout_min = 3;
static const unsigned long nlm_timeout_max = 20;
+static const unsigned long nlm_clnt_poll_timeout_min = 1;
+static const unsigned long nlm_clnt_poll_timeout_max = 30;
static const int nlm_port_min = 0, nlm_port_max = 65535;
static struct ctl_table_header * nlm_sysctl_table;
@@ -391,6 +394,16 @@
},
{
.ctl_name = CTL_UNNUMBERED,
+ .procname = "nlm_clnt_poll_timeout",
+ .data = &nlm_clnt_poll_timeout,
+ .maxlen = sizeof(unsigned long),
+ .mode = 0644,
+ .proc_handler = &proc_doulongvec_minmax,
+ .extra1 = (unsigned long *) &nlm_clnt_poll_timeout_min,
+ .extra2 = (unsigned long *) &nlm_clnt_poll_timeout_max,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
.procname = "nlm_udpport",
.data = &nlm_udpport,
.maxlen = sizeof(int),
@@ -514,6 +527,8 @@
module_param_call(nlm_tcpport, param_set_port, param_get_int,
&nlm_tcpport, 0644);
module_param(nsm_use_hostnames, bool, 0644);
+module_param_call(nlm_clnt_poll_timeout, param_set_timeout, param_get_ulong,
+ &nlm_clnt_poll_timeout, 0644);
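With this applied, the timeout becomes tunable at run time. Assuming the new
entry registers under the same fs.nfs sysctl directory as the existing lockd
knobs (nlm_timeout, nlm_udpport, ...), usage would look like:

# echo 5 > /proc/sys/fs/nfs/nlm_clnt_poll_timeout

or, when lockd is built as a module, at load time via the new parameter:

# modprobe lockd nlm_clnt_poll_timeout=5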
/*
* Initialising and terminating the module.
* Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
From: Trond Myklebust @ 2008-03-11 23:25 UTC
To: Mikael Davranche; +Cc: linux-fsdevel
On Tue, 2008-03-11 at 13:14 +0100, Mikael Davranche wrote:
> Hi,
>
> When a lock blocks, the server sends us a BLOCKED message. When the lock is
> released, the server may send us an NLM callback. When it does not (this
> depends on the NLM implementation), the client waits 30 seconds before
> attempting to reclaim the lock.
>
> This 30-second timeout is hard-coded in fs/lockd/clntproc.c:
> #define NLMCLNT_POLL_TIMEOUT (30*HZ)
>
> 30 seconds is generally suitable, but in some cases it is too long and needs
> to be set lower. I hit this problem in my production environment when a
> mailbox receives more than one e-mail every 30 seconds. In that particular
> case, the nlm_blocked list grows and never shrinks. Setting this timeout to
> less than 30 seconds resolves the problem.
>
> This short series of patches makes this timeout configurable through a new
> /proc entry named nlm_clnt_poll_timeout (the name is based on the
> NLMCLNT_POLL_TIMEOUT define). The patches are based on 2.6.24.3 (is that a
> problem? Should I rebase them on 2.6.25-rc5?).
You want to reduce the retransmission timeout on NLM because you receive
more than 1 email per retransmission timeout? I can't see how the two
are related.
Normally, the server should call your client back using an NLM_GRANTED
call as soon as the lock is available. If that isn't happening, then you
need to look at why not. The retransmission+timeout is supposed to be a
failsafe for when the NLM_GRANTED mechanism fails, not the main method
for grabbing a lock.
For instance, it may be that the server is unable to call the client
back because you've hidden it behind a firewall or NAT, or perhaps your
netfilter settings on either the client or the server are blocking the
callback.
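In the client code, this is the wait inside nlmclnt_block(): a timed sleep
that the server's NLM_GRANTED callback cuts short. A simplified sketch of the
2.6.24 fs/lockd/clntlock.c logic (not a verbatim excerpt):

	/* Sleep until the NLM_GRANTED handler changes block->b_status,
	 * or until the poll timeout expires. The timeout is only the
	 * failsafe for lost or filtered callbacks. */
	ret = wait_event_interruptible_timeout(block->b_wait,
			block->b_status != nlm_lck_blocked,
			timeout);
	if (ret < 0)
		return -ERESTARTSYS;
	req->a_res.status = block->b_status;
	/* Still blocked here means no callback arrived; the caller
	 * in nlmclnt_lock() will retransmit the LOCK request. */
	if (block->b_status == nlm_lck_blocked)
		block->b_status = nlm_lck_denied;
	return 0;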
Cheers
Trond
--
Trond Myklebust
NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
From: Mikael Davranche @ 2008-03-12 11:26 UTC
To: Trond Myklebust; +Cc: linux-fsdevel
> You want to reduce the retransmission timeout on NLM because you receive
> more than 1 email per retransmission timeout? I can't see how the two
> are related.
To explain, let's look at two examples. In the examples below, I use the
following notation:
lro: Lock Reclaim OK (the server did not send NLM_BLOCKED)
lre: Lock RElease (the client no longer wants the lock)
lrn: Lock Reclaim Not OK (the server sent NLM_BLOCKED and the client will
retry after the timeout)
Example 1
2 e-mail servers, 1 NAS, 1 e-mail every 10 seconds on each server, 2 seconds
to store an e-mail
retransmission timeout: 30 seconds (default)

time     000 001 002 010 012 020 022 030 031 032
server1  lro     lre lro lre lro lre lro     lre
server2      lrn                         lrn

Report:
Between t=0 and t=32,
  e-mails processed by server1: 4
  e-mails processed by server2: 0
At t=32,
  e-mails in server1's local queue: 0
  e-mails in server2's local queue: 4
Example 2
2 e-mail servers, 1 NAS, 1 e-mail every 10 seconds on each server, 2 seconds
to store an e-mail
retransmission timeout: 3 seconds

time     000 001 002 005 007 010 011 012 013 015
server1  lro     lre         lro     lre
server2      lrn     lro lre     lrn     lro lre ...

Report:
Between t=0 and t=15,
  e-mails processed by server1: 2
  e-mails processed by server2: 2
At t=15,
  e-mails in server1's local queue: 0
  e-mails in server2's local queue: 0
Of course, a server never receives exactly one e-mail every 10 seconds, but
what we observe in our production environment can be summarized by these two
examples.
> Normally, the server should call your client back using an NLM_GRANTED
> call as soon as the lock is available. If that isn't happening, then you
> need to look at why not. The retransmission+timeout is supposed to be a
> failsafe for when the NLM_GRANTED mechanism fails, not the main method
> for grabbing a lock.
>
> For instance, it may be that the server is unable to call the client
> back because you've hidden it behind a firewall or NAT, or perhaps your
> netfilter settings on either the client or the server are blocking the
> callback.
The only reason I propose this short series of patches is that there is one
case in which we cannot use the NLM_GRANTED mechanism and must always rely on
the retransmission+timeout failsafe: when the NFS server runs HPUX.
Let's look at the comment above the nlmclnt_lock function:
/*
 * LOCK: Try to create a lock
 *
 * Programmer Harassment Alert
 *
 * When given a blocking lock request in a sync RPC call, the HPUX lockd
 * will faithfully return LCK_BLOCKED but never cares to notify us when
 * the lock could be granted. This way, our local process could hang
 * around forever waiting for the callback.
 *
 * Solution A: Implement busy-waiting
 * Solution B: Use the async version of the call (NLM_LOCK_{MSG,RES})
 *
 * For now I am implementing solution A, because I hate the idea of
 * re-implementing lockd for a third time in two months. The async
 * calls shouldn't be too hard to do, however.
 *
 * This is one of the lovely things about standards in the NFS area:
 * they're so soft and squishy you can't really blame HP for doing this.
 */
Note that I ran my tests against a NetApp NAS ;) Indeed, Data ONTAP only sends
NLM_GRANTED over UDP (not TCP), so we can reproduce the HPUX behaviour by
setting "nlm_udpport = 0" on the client.
Cheers, Mikael
--
Mikael Davranche
System Engineer
Atos Worldline, France
* Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
From: Trond Myklebust @ 2008-03-12 13:33 UTC
To: Mikael Davranche; +Cc: linux-fsdevel
On Wed, 2008-03-12 at 12:26 +0100, Mikael Davranche wrote:
> The only reason I propose this short series of patches is that there is one
> case in which we cannot use the NLM_GRANTED mechanism and must always rely
> on the retransmission+timeout failsafe: when the NFS server runs HPUX.
>
> Let's look at the comment above the nlmclnt_lock function:
>
> /*
>  * LOCK: Try to create a lock
>  *
>  * Programmer Harassment Alert
>  *
>  * When given a blocking lock request in a sync RPC call, the HPUX lockd
>  * will faithfully return LCK_BLOCKED but never cares to notify us when
>  * the lock could be granted. This way, our local process could hang
>  * around forever waiting for the callback.
>  *
>  * Solution A: Implement busy-waiting
>  * Solution B: Use the async version of the call (NLM_LOCK_{MSG,RES})
>  *
>  * For now I am implementing solution A, because I hate the idea of
>  * re-implementing lockd for a third time in two months. The async
>  * calls shouldn't be too hard to do, however.
>  *
>  * This is one of the lovely things about standards in the NFS area:
>  * they're so soft and squishy you can't really blame HP for doing this.
>  */
>
> Note that I ran my tests against a NetApp NAS ;) Indeed, Data ONTAP only
> sends NLM_GRANTED over UDP (not TCP), so we can reproduce the HPUX behaviour
> by setting "nlm_udpport = 0" on the client.
Yes, but that HPUX case is a very old server bug (at least 10 years
old), and I'd assume it has been fixed by now. Even if it hasn't, I'm
not going to bend over backwards in the client code to fix a server that
is in obvious violation of the NLM protocol.
We fix server bugs on the server and client bugs on the client.
Cheers
Trond
--
Trond Myklebust
NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
From: Mikael Davranche @ 2008-03-12 15:51 UTC
To: Trond Myklebust; +Cc: linux-fsdevel
Quoting Trond Myklebust <Trond.Myklebust@netapp.com>:
> Yes, but that HPUX case is a very old server bug (at least 10 years
> old), and I'd assume it has been fixed by now. Even if it hasn't, I'm
> not going to bend over backwards in the client code to fix a server that
> is in obvious violation of the NLM protocol.
>
> We fix server bugs on the server and client bugs on the client.
>
> Cheers
> Trond
Right Trond, thanks for your answers! =)
See you,
--
Mikael Davranche
System Engineer
Atos Worldline, France