From mboxrd@z Thu Jan 1 00:00:00 1970
From: Trond Myklebust
Subject: Re: [PATCH 0/3] NLM: Proposal for a timeout setting on blocking locks
Date: Tue, 11 Mar 2008 19:25:43 -0400
Message-ID: <1205277943.16687.14.camel@heimdal.trondhjem.org>
References: <1205237665.47d677a143941@imp.free.fr>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel
To: Mikael Davranche
Return-path:
Received: from mx2.netapp.com ([216.240.18.37]:15952 "EHLO mx2.netapp.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750870AbYCKXZp (ORCPT ); Tue, 11 Mar 2008 19:25:45 -0400
In-Reply-To: <1205237665.47d677a143941@imp.free.fr>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, 2008-03-11 at 13:14 +0100, Mikael Davranche wrote:
> Hi,
>
> When a lock blocks, the server sends us a BLOCKED message. When it releases, it
> may send us an NLM callback. When it does not (it depends on NLM
> implementations), the client waits for 30 seconds before attempting to reclaim
> the lock again.
>
> The 30 seconds time is hard-coded in fs/lockd/clntproc.c:
> #define NLMCLNT_POLL_TIMEOUT (30*HZ)
>
> 30 seconds is generally suitable, but in some cases, it is too much and it may
> be set to less than that. I have this problem in my production environment
> when an e-mail box receives more than 1 e-mail every 30 seconds. In that
> particular case, the nlm_blocked list grows and never shrinks. Setting this
> time to less than 30 seconds resolves the problem.
>
> This short series of patches makes this timeout configurable by adding a new
> /proc entry named nlm_clnt_poll_timeout (this name is based on the
> NLMCLNT_POLL_TIMEOUT define). Patches are based on the 2.6.24.3 version (is that a
> problem? may I base them on the 2.6.25-rc5 one?).

You want to reduce the retransmission timeout on NLM because you receive
more than 1 email per retransmission timeout? I can't see how the two
are related.

Normally, the server should call your client back using an NLM_GRANTED
call as soon as the lock is available. If that isn't happening, then you
need to look at why not. The retransmission+timeout is supposed to be a
failsafe for when the NLM_GRANTED mechanism fails, not the main method
for grabbing a lock.

For instance, it may be that the server is unable to call the client
back because you've hidden it behind a firewall or NAT, or perhaps your
netfilter settings on either the client or the server are blocking the
callback.

Cheers
  Trond

-- 
Trond Myklebust
NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com