From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wendy Cheng <wcheng@redhat.com>
Date: Thu, 03 Aug 2006 17:34:55 -0400
Subject: [Cluster-devel] Re: [NFS] [PATCH 0/3] NLM lock failover
In-Reply-To: <17617.30732.643539.353696@cse.unsw.edu.au>
References: <44A41246.2070106@redhat.com>	<1154397341.3378.10.camel@localhost.localdomain>
	<17617.30732.643539.353696@cse.unsw.edu.au>
Message-ID: <44D26BFF.9090506@redhat.com>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Neil Brown wrote:

>First note:  it helps a lot if the Subject line for each patch
>contains a distinctive short description of what the patch does.
>  
>
This is due to my inexperience with open source patch submission, plus
end-of-day fatigue :). It will be improved.

>>PATCH 1/3
>>---------
>>This patch makes an assumption that any given filehandle will only arrive at
>>one particular interface - never more.  This is implicit in the fact
>>that f_iaddr is stored in 'struct nlm_file' which is indexed by
>>filehandle.
>>
>>.....
>>
>>A consequence of this is that you cannot have a virtual server with
>>two (or more interfaces).  Is this likely to be a problem?
>>e.g. if you have 4 physical interfaces on your server, might you want
>>to bind a different IP to each for each virtual server?
>>If you did, then my change above would mean that you couldn't do
>>failover, and we might need to look at other options...
>>
>>Possibly (and maybe this is more work than is justified), lockd can
>>monitor interface usage and deduce interface pools based on seeing the
>>same filehandle on multiple interfaces.  Then when an unlock request
>>arrives on nlm_unlock, lockd would require all interfaces that touched
>>a file to be 'unlocked' before actually dropping the locks on the
>>file.
>>
>>As you can probably tell I was "thinking out loud" there and it may
>>not be particularly coherent or cohesive.   
>>
>>Do you have any thoughts on this issues?
>>    
>>
Another option is to drop the (NLM) locks based on "fsid" (which can be
retrieved from the filehandle) instead of the virtual IP address. Note
that "fsid" is a good fit for a cluster environment, compared to device
major/minor numbers, since different nodes may have different device
numbers. Do you see any problem with the fsid approach?
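
To make the idea concrete, here is a rough sketch of dropping locks by
fsid. It is only an illustration: "struct demo_lock" and
"demo_drop_locks_by_fsid" are hypothetical names, the flat array stands
in for lockd's real file/lock tables, and the fsid is assumed to have
already been extracted from the filehandle as a 32-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified lock record; NOT lockd's real struct nlm_file. */
struct demo_lock {
    uint32_t fsid;  /* filesystem id, assumed already taken from the filehandle */
    int      held;  /* nonzero while the lock is held */
};

/*
 * Release every held lock whose fsid matches the failed-over filesystem.
 * Returns the number of locks dropped. Locks on other filesystems are
 * left untouched, regardless of which interface the requests arrived on.
 */
size_t demo_drop_locks_by_fsid(struct demo_lock *locks, size_t n, uint32_t fsid)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (locks[i].held && locks[i].fsid == fsid) {
            locks[i].held = 0;
            dropped++;
        }
    }
    return dropped;
}
```

The point of the sketch is that the match key is a property of the
export itself, so it does not matter how many interfaces a virtual
server uses.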

One catch with fsid that I can think of is that it must be passed from
lockd to statd (and then to the ha-callout program). The current SM_MON
and SM_UNMON protocol doesn't have any extra field for us to do that.
Would adding one more field cause any issues? E.g.:

Current SM_MON arguments:

string<1024> mon_name;
string<1024> my_name;
uint32 my_prog;
uint32 my_vers;
uint32 my_proc;
opaque[16] priv;

Would adding "opaque[16] fsid" after "priv" be OK? Ditto for SM_UNMON.
Alternatively, the fsid could be passed as the 4th parameter to the
ha-callout program; that way we can avoid breaking any existing
ha-callout application.
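
As a rough sketch of both ideas, the C below mirrors the argument list
above with the proposed "fsid" field appended, and renders the fsid as
a string that could be handed to the ha-callout program as a 4th
argument. All names ("demo_sm_mon_args", "demo_set_fsid",
"demo_fsid_to_hex") are hypothetical, and the hex encoding is just one
assumed way to pass 16 opaque bytes on a command line; none of this is
existing lockd or statd code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SM_MAXSTRLEN 1024
#define DEMO_OPAQUE_SIZE  16

/* Hypothetical in-memory view of the SM_MON arguments plus the new field. */
struct demo_sm_mon_args {
    char          mon_name[DEMO_SM_MAXSTRLEN + 1];
    char          my_name[DEMO_SM_MAXSTRLEN + 1];
    uint32_t      my_prog;
    uint32_t      my_vers;
    uint32_t      my_proc;
    unsigned char priv[DEMO_OPAQUE_SIZE];
    unsigned char fsid[DEMO_OPAQUE_SIZE];  /* proposed addition, after priv */
};

/* Copy a 16-byte fsid into the proposed field. */
void demo_set_fsid(struct demo_sm_mon_args *args,
                   const unsigned char fsid[DEMO_OPAQUE_SIZE])
{
    memcpy(args->fsid, fsid, DEMO_OPAQUE_SIZE);
}

/*
 * Render the fsid as lowercase hex so it can be passed as an extra
 * command-line argument. 'out' must hold 2 * DEMO_OPAQUE_SIZE + 1 bytes.
 */
void demo_fsid_to_hex(const unsigned char fsid[DEMO_OPAQUE_SIZE], char *out)
{
    static const char digits[] = "0123456789abcdef";

    for (size_t i = 0; i < DEMO_OPAQUE_SIZE; i++) {
        out[2 * i]     = digits[fsid[i] >> 4];
        out[2 * i + 1] = digits[fsid[i] & 0x0f];
    }
    out[2 * DEMO_OPAQUE_SIZE] = '\0';
}
```

A callout program that only looks at its first three arguments would
simply ignore the extra one, which is why the 4th-parameter route looks
less disruptive than changing the wire format.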

Let's give it a few more days to think these issues over.

All the other comments (for PATCH 2/3 and 3/3) are helpful coding advice -
they are appreciated and changes will be made accordingly.

-- Wendy