* [Cluster-devel] [RFC PATCH 1/3] NLM lock failover - lock release
@ 2006-06-29 18:11 Wendy Cheng
2006-06-29 19:11 ` [Cluster-devel] Re: [NFS] " Chuck Lever
2006-06-29 23:06 ` Trond Myklebust
0 siblings, 2 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-06-29 18:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
where NFS server ip address is stored. Upon writing IPv4 address in
standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
examine NLM's global nlm_files list and subsequently unlock the
associated file if server ip address matches.
Due to the size of rq_daddr (u32), we would not be able to support IPV6
for this round of changes. Another to-do item is to enable client:server
ip pairs to allow NFS V4 failover.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: gfs_nlm_unlock.patch
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20060629/0e2f4e2b/attachment.ksh>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-29 18:11 [Cluster-devel] [RFC PATCH 1/3] NLM lock failover - lock release Wendy Cheng
@ 2006-06-29 19:11 ` Chuck Lever
2006-06-29 21:49 ` Wendy Cheng
2006-06-29 23:06 ` Trond Myklebust
1 sibling, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2006-06-29 19:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
On 6/29/06, Wendy Cheng <wcheng@redhat.com> wrote:
> The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
> where NFS server ip address is stored. Upon writing IPv4 address in
> standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
> examine NLM's global nlm_files list and subsequently unlock the
> associated file if server ip address matches.
>
> Due to the size of rq_daddr (u32), we would not be able to support IPV6
> for this round of changes. Another to-do item is to enable client:server
> ip pairs to allow NFS V4 failover.
FYI: I have a patch in my IPv6 patchset that increases the size of
this field specifically in order to hold an IPv6 address. See:
http://oss.oracle.com/~cel/linux-2.6/2.6.17/patches/52-svc-rq_daddr.diff
for the individual patch, and surrounding patches for more context.
--
"We who cut mere stones must always be envisioning cathedrals"
-- Quarry worker's creed
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-29 19:11 ` [Cluster-devel] Re: [NFS] " Chuck Lever
@ 2006-06-29 21:49 ` Wendy Cheng
0 siblings, 0 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-06-29 21:49 UTC (permalink / raw)
To: cluster-devel.redhat.com
Chuck Lever wrote:
>On 6/29/06, Wendy Cheng <wcheng@redhat.com> wrote:
>
>
>>The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
>>where NFS server ip address is stored. Upon writing IPv4 address in
>>standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
>>examine NLM's global nlm_files list and subsequently unlock the
>>associated file if server ip address matches.
>>
>>Due to the size of rq_daddr (u32), we would not be able to support IPV6
>>for this round of changes. Another to-do item is to enable client:server
>>ip pairs to allow NFS V4 failover.
>>
>>
>
>FYI: I have a patch in my IPv6 patchset that increases the size of
>this field specifically in order to hold an IPv6 address. See:
>
> http://oss.oracle.com/~cel/linux-2.6/2.6.17/patches/52-svc-rq_daddr.diff
>
>for the individual patch, and surrounding patches for more context.
>
>
>
ok, thanks ...
-- Wendy
--
S. Wendy Cheng
wcheng at redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-29 18:11 [Cluster-devel] [RFC PATCH 1/3] NLM lock failover - lock release Wendy Cheng
2006-06-29 19:11 ` [Cluster-devel] Re: [NFS] " Chuck Lever
@ 2006-06-29 23:06 ` Trond Myklebust
2006-06-30 3:57 ` Wendy Cheng
1 sibling, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2006-06-29 23:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Thu, 2006-06-29 at 14:11 -0400, Wendy Cheng wrote:
> The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
> where NFS server ip address is stored. Upon writing IPv4 address in
> standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
> examine NLM's global nlm_files list and subsequently unlock the
> associated file if server ip address matches.
>
> Due to the size of rq_daddr (u32), we would not be able to support IPV6
> for this round of changes. Another to-do item is to enable client:server
> ip pairs to allow NFS V4 failover.
>
>
>
>
> plain text document attachment (gfs_nlm_unlock.patch)
> fs/lockd/svcsubs.c | 58 ++++++++++++++++++++++++++++++++++++++++----
> fs/nfsd/nfsctl.c | 54 ++++++++++++++++++++++++++++++++++++++++
> include/linux/lockd/bind.h | 5 +++
> include/linux/lockd/lockd.h | 3 ++
> net/sunrpc/svcsock.c | 5 +++
> 5 files changed, 120 insertions(+), 5 deletions(-)
>
> --- linux-2.6.17/include/linux/lockd/lockd.h 2006-06-17 21:49:35.000000000 -0400
> +++ linux-2.6.17-1/include/linux/lockd/lockd.h 2006-06-27 10:58:47.000000000 -0400
> @@ -105,6 +105,7 @@ struct nlm_file {
> unsigned int f_count; /* reference count */
> struct semaphore f_sema; /* avoid concurrent access */
> int f_hash; /* hash of f_handle */
> + __u32 f_iaddr; /* server ip for failover */
> };
>
> /*
> @@ -133,6 +134,7 @@ struct nlm_block {
> #define NLM_ACT_CHECK 0 /* check for locks */
> #define NLM_ACT_MARK 1 /* mark & sweep */
> #define NLM_ACT_UNLOCK 2 /* release all locks */
> +#define NLM_ACT_FO_UNLOCK 3 /* failover release locks */
>
> /*
> * Global variables
> @@ -196,6 +198,7 @@ void nlm_release_file(struct nlm_file
> void nlmsvc_mark_resources(void);
> void nlmsvc_free_host_resources(struct nlm_host *);
> void nlmsvc_invalidate_all(void);
> +int nlmsvc_fo_unlock(struct in_addr *);
>
> static __inline__ struct inode *
> nlmsvc_file_inode(struct nlm_file *file)
> --- linux-2.6.17/net/sunrpc/svcsock.c 2006-06-17 21:49:35.000000000 -0400
> +++ linux-2.6.17-1/net/sunrpc/svcsock.c 2006-06-27 10:58:47.000000000 -0400
> @@ -454,6 +454,7 @@ svc_recvfrom(struct svc_rqst *rqstp, str
> struct msghdr msg;
> struct socket *sock;
> int len, alen;
> + struct sockaddr_in daddr;
>
> rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
> sock = rqstp->rq_sock->sk_sock;
> @@ -474,6 +475,10 @@ svc_recvfrom(struct svc_rqst *rqstp, str
> alen = sizeof(rqstp->rq_addr);
> sock->ops->getname(sock, (struct sockaddr *)&rqstp->rq_addr, &alen, 1);
>
> + /* add server ip for nlm lock failover */
> + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> +
Hmm.... Why would you want to do this on every receive when you could
just store the ip address in the struct svc_sock once and for all?
That said, how do you envisage this working in the cases where the
socket is bound to INADDR_ANY?
Cheers,
Trond
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-29 23:06 ` Trond Myklebust
@ 2006-06-30 3:57 ` Wendy Cheng
[not found] ` <message from Wendy Cheng on Thursday June 29>
2006-06-30 4:39 ` Trond Myklebust
0 siblings, 2 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-06-30 3:57 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Thu, 2006-06-29 at 19:06 -0400, Trond Myklebust wrote:
> >
> > + /* add server ip for nlm lock failover */
> > + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> > + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> > +
>
> Hmm.... Why would you want to do this on every receive when you could
> just store the ip address in the struct svc_sock once and for all?
ok, will do that - save latency. Thanks.
>
> That said, how do you envisage this working in the cases where the
> socket is bound to INADDR_ANY?
This is "our" (server's) address, not peer address - for this request to
arrive "here", it can't be INADDR_ANY. Can it ? Remember "rq_daddr" will
only be used during failover in a clustered NFS servers environment.
-- Wendy
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
[not found] ` <message from Wendy Cheng on Thursday June 29>
@ 2006-06-30 4:15 ` Neil Brown
2006-06-30 4:38 ` Trond Myklebust
0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2006-06-30 4:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Thursday June 29, wcheng at redhat.com wrote:
> On Thu, 2006-06-29 at 19:06 -0400, Trond Myklebust wrote:
>
> > >
> > > + /* add server ip for nlm lock failover */
> > > + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> > > + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> > > +
> >
> > Hmm.... Why would you want to do this on every receive when you could
> > just store the ip address in the struct svc_sock once and for all?
>
> ok, will do that - save latency. Thanks.
>
> >
> > That said, how do you envisage this working in the cases where the
> > socket is bound to INADDR_ANY?
>
> This is "our" (server's) address, not peer address - for this request to
> arrive "here", it can't be INADDR_ANY. Can it ? Remember "rq_daddr" will
> only be used during failover in a clustered NFS servers environment.
The socket can only be bound to INADDR_ANY for UDP, and in that case
we already set rq_daddr correctly.
For a TCP socket, it will be connected, so the local an remote
endpoints will be well defined.
So the code as it stands should work fine.
But yes, it would be best to record the local and remote addresses in
svc_tcp_accept rather than called ->getname twice in svc_recvfrom.
NeilBrown
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-30 4:15 ` Neil Brown
@ 2006-06-30 4:38 ` Trond Myklebust
[not found] ` <message from Trond Myklebust on Friday June 30>
0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2006-06-30 4:38 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Fri, 2006-06-30 at 14:15 +1000, Neil Brown wrote:
> The socket can only be bound to INADDR_ANY for UDP, and in that case
> we already set rq_daddr correctly.
As I understand it, Wendy is considering the case of a multi-homed
server. She wants to record the IP address on which we received the
datagram so that she knows which locks to invalidate in the case of a
migration of that particular IP address onto another server.
My point is that she won't get that information if the socket is bound
to INADDR_ANY, as would be the case for a UDP socket.
Cheers,
Trond
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
2006-06-30 3:57 ` Wendy Cheng
[not found] ` <message from Wendy Cheng on Thursday June 29>
@ 2006-06-30 4:39 ` Trond Myklebust
1 sibling, 0 replies; 9+ messages in thread
From: Trond Myklebust @ 2006-06-30 4:39 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Thu, 2006-06-29 at 23:57 -0400, Wendy Cheng wrote:
> This is "our" (server's) address, not peer address - for this request to
> arrive "here", it can't be INADDR_ANY. Can it ?
Consider the case of a UDP socket.
Cheers,
Trond
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Cluster-devel] Re: [NFS] [RFC PATCH 1/3] NLM lock failover - lock release
[not found] ` <message from Trond Myklebust on Friday June 30>
@ 2006-06-30 5:00 ` Neil Brown
0 siblings, 0 replies; 9+ messages in thread
From: Neil Brown @ 2006-06-30 5:00 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Friday June 30, trond.myklebust at fys.uio.no wrote:
> On Fri, 2006-06-30 at 14:15 +1000, Neil Brown wrote:
>
> > The socket can only be bound to INADDR_ANY for UDP, and in that case
> > we already set rq_daddr correctly.
>
> As I understand it, Wendy is considering the case of a multi-homed
> server. She wants to record the IP address on which we received the
> datagram so that she knows which locks to invalidate in the case of a
> migration of that particular IP address onto another server.
>
> My point is that she won't get that information if the socket is bound
> to INADDR_ANY, as would be the case for a UDP socket.
Yes. But that code fragment covered the TCP case only.
The extra called to ->getname was placed in svc_recvfrom which is only
called from svc_tcp_recvfrom, not from svc_udp_recvfrom (Yes, I agree
there is room for confusion there).
svc_udp_recvfrom already has
rqstp->rq_addr.sin_addr.s_addr = skb->nh.iph->saddr;
rqstp->rq_daddr = skb->nh.iph->daddr;
so it sets rq_daddr correctly as each packet is received.
rq_daddr was never set for requests received via TCP because it was
never used for tcp (it was used only to set the source address for UDP
replies).
NeilBrown
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-06-30 5:00 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-29 18:11 [Cluster-devel] [RFC PATCH 1/3] NLM lock failover - lock release Wendy Cheng
2006-06-29 19:11 ` [Cluster-devel] Re: [NFS] " Chuck Lever
2006-06-29 21:49 ` Wendy Cheng
2006-06-29 23:06 ` Trond Myklebust
2006-06-30 3:57 ` Wendy Cheng
[not found] ` <message from Wendy Cheng on Thursday June 29>
2006-06-30 4:15 ` Neil Brown
2006-06-30 4:38 ` Trond Myklebust
[not found] ` <message from Trond Myklebust on Friday June 30>
2006-06-30 5:00 ` Neil Brown
2006-06-30 4:39 ` Trond Myklebust
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).