From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wendy Cheng
Date: Thu, 03 Aug 2006 17:34:55 -0400
Subject: [Cluster-devel] Re: [NFS] [PATCH 0/3] NLM lock failover
In-Reply-To: <17617.30732.643539.353696@cse.unsw.edu.au>
References: <44A41246.2070106@redhat.com>
 <1154397341.3378.10.camel@localhost.localdomain>
 <17617.30732.643539.353696@cse.unsw.edu.au>
Message-ID: <44D26BFF.9090506@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Neil Brown wrote:

>First note: it helps a lot if the Subject line for each patch
>contains a distinctive short description of what the patch does.
>
>
This is due to inexperience with open source patch submission plus
end-of-day fatigue :) .. It will be improved.

>>PATCH 1/3
>>---------
>>This patch makes an assumption that any given filehandle will only arrive at
>>one particular interface - never more. This is implicit in the fact
>>that f_iaddr is stored in 'struct nlm_file' which is indexed by
>>filehandle.
>>
>>.....
>>
>>A consequence of this is that you cannot have a virtual server with
>>two (or more interfaces). Is this likely to be a problem?
>>e.g. if you have 4 physical interfaces on your server, might you want
>>to bind a different IP to each for each virtual server?
>>If you did, then my change above would mean that you couldn't do
>>failover, and we might need to look at other options...
>>
>>Possibly (and maybe this is more work than is justified), lockd can
>>monitor interface usage and deduce interface pools based on seeing the
>>same filehandle on multiple interfaces. Then when an unlock request
>>arrives on nlm_unlock, lockd would require all interfaces that touched
>>a file to be 'unlocked' before actually dropping the locks on the
>>file.
>>
>>As you can probably tell I was "thinking out loud" there and it may
>>not be particularly coherent or cohesive.
>>
>>Do you have any thoughts on this issues?
>>
>>

Another option is to drop the (NLM) locks based on "fsid" (which can be
retrieved from the filehandle) instead of the virtual IP address. Note that
"fsid" is well suited to a cluster environment (compared to device
major/minor, since different nodes may have different device numbers).
Do you see any problems with the fsid approach?

One catch (about fsid) I can think of is that it must be passed from
lockd to statd (and then to the ha-callout program). The current SM_MON and
SM_UNMON protocol doesn't have any extra field for us to do that. Would
adding one more field cause any issues? E.g., the current SM_MON argument is:

    string<1024> mon_name;
    string<1024> my_name;
    uint32 my_prog;
    uint32 my_vers;
    uint32 my_proc;
    opaque[16] priv;

Would adding "opaque[16] fsid" after "priv" be OK? Ditto for SM_UNMON.

On the other hand, the fsid could be passed as the 4th parameter to the
ha-callout program (that way we can avoid breaking any existing ha-callout
applications).

Let's give it a few more days to think these issues over.

All the other comments (for PATCH 2/3 and 3/3) are helpful coding advice.
They are appreciated, and changes will be made accordingly.

-- Wendy
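
For reference, the two fsid ideas above (an extra "opaque[16] fsid" field
after "priv", and fsid as a 4th ha-callout parameter) could look roughly
like the sketch below. This is only an illustration under stated
assumptions, not code from the patch set: the struct name
sm_mon_args_fsid, the helpers fsid_to_hex() and build_callout_argv(), the
hex encoding of the fsid, and the assumption that the first three callout
parameters are the action, the client name, and the server name are all
hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SM_MAXSTRLEN 1024  /* matches string<1024> in the message above */
#define SM_PRIV_SIZE 16    /* matches opaque[16] priv */
#define SM_FSID_SIZE 16    /* proposed opaque[16] fsid (hypothetical) */

/* Hypothetical in-memory view of the SM_MON argument with fsid appended. */
struct sm_mon_args_fsid {
    char     mon_name[SM_MAXSTRLEN + 1];
    char     my_name[SM_MAXSTRLEN + 1];
    uint32_t my_prog;
    uint32_t my_vers;
    uint32_t my_proc;
    uint8_t  priv[SM_PRIV_SIZE];
    uint8_t  fsid[SM_FSID_SIZE];  /* proposed new field, placed after priv */
};

/* Render the fsid as lowercase hex so it can travel as a program argument. */
static int fsid_to_hex(const uint8_t *fsid, char *out, size_t outlen)
{
    assert(fsid != NULL && out != NULL);
    if (outlen < 2 * SM_FSID_SIZE + 1)
        return -1;  /* caller's buffer is too small */
    for (size_t i = 0; i < SM_FSID_SIZE; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)fsid[i]);
    return 0;
}

/*
 * Fill argv for the callout program. The first three parameters are
 * assumed to be action, client name, server name; the fsid is appended
 * as a 4th parameter so that programs reading only three keep working.
 */
static int build_callout_argv(const char *prog, const char *action,
                              const struct sm_mon_args_fsid *args,
                              char *fsid_hex, size_t fsid_hex_len,
                              const char *argv[6])
{
    if (fsid_to_hex(args->fsid, fsid_hex, fsid_hex_len) != 0)
        return -1;
    argv[0] = prog;
    argv[1] = action;          /* 1st parameter */
    argv[2] = args->mon_name;  /* 2nd parameter: monitored client */
    argv[3] = args->my_name;   /* 3rd parameter: server name */
    argv[4] = fsid_hex;        /* 4th parameter: proposed fsid */
    argv[5] = NULL;            /* terminator, as execv() expects */
    return 0;
}
```

The resulting argv could be handed to execv() as-is. Passing the fsid on
the command line leaves the SM_MON/SM_UNMON wire format untouched, which
is the compatibility argument made above; extending the struct changes
the wire format and would affect every peer that speaks the protocol.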