All of lore.kernel.org
 help / color / mirror / Atom feed
* Use of delayed request information in nlmsvc_lookup_host
@ 2007-11-14 18:23 Chuck Lever
  2007-11-14 18:43 ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2007-11-14 18:23 UTC (permalink / raw)
  To: Neil Brown, Trond Myklebust; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 890 bytes --]

Hi-

Historical question here.

nlmsvc_lookup_host() has this:

struct nlm_host *
nlmsvc_lookup_host(struct svc_rqst *rqstp,
                         const char *hostname, unsigned int hostname_len)
{
         struct sockaddr_in ssin = {0};

 >>>     ssin.sin_addr = rqstp->rq_daddr.addr;
         return nlm_lookup_host(1, svc_addr_in(rqstp),
                                rqstp->rq_prot, rqstp->rq_vers,
                                hostname, hostname_len, &ssin);
}

Why is it using rq_daddr to construct the lookup target?

The problem here is that rq_daddr isn't a full address.  It doesn't have 
address family information.  So nlmsvc_lookup_host() just assumes that 
what's stored in rq_daddr is always AF_INET.

As I started adding support for AF_INET6 to lockd, I noticed that 
there's no way for nlmsvc_lookup_host() to know that it should construct 
a sockaddr_in6 instead.

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 18:23 Use of delayed request information in nlmsvc_lookup_host Chuck Lever
@ 2007-11-14 18:43 ` Trond Myklebust
  2007-11-14 19:00   ` Chuck Lever
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2007-11-14 18:43 UTC (permalink / raw)
  To: chuck.lever; +Cc: Neil Brown, nfs


On Wed, 2007-11-14 at 13:23 -0500, Chuck Lever wrote:
> Hi-
> 
> Historical question here.
> 
> nlmsvc_lookup_host() has this:
> 
> struct nlm_host *
> nlmsvc_lookup_host(struct svc_rqst *rqstp,
>                          const char *hostname, unsigned int hostname_len)
> {
>          struct sockaddr_in ssin = {0};
> 
>  >>>     ssin.sin_addr = rqstp->rq_daddr.addr;
>          return nlm_lookup_host(1, svc_addr_in(rqstp),
>                                 rqstp->rq_prot, rqstp->rq_vers,
>                                 hostname, hostname_len, &ssin);
> }
> 
> Why is it using rq_daddr to construct the lookup target?
> 
> The problem here is that rq_daddr isn't a full address.  It doesn't have 
> address family information.  So nlmsvc_lookup_host() just assumes that 
> what's stored in rq_daddr is always AF_INET.

Why can't rq_daddr hold address family information? AFAICS there is
nothing stopping you from setting that in svc_recvfrom() and in
svc_udp_get_dest_address().

In fact, why isn't rq_daddr defined to be a sockaddr_storage so that it
can be safely cast into either a sockaddr_in or a sockaddr_in6 after
checking the ss_family field?

Cheers
  Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 18:43 ` Trond Myklebust
@ 2007-11-14 19:00   ` Chuck Lever
  2007-11-14 19:32     ` J. Bruce Fields
  2007-11-14 19:41     ` Trond Myklebust
  0 siblings, 2 replies; 16+ messages in thread
From: Chuck Lever @ 2007-11-14 19:00 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 1816 bytes --]

Trond Myklebust wrote:
> On Wed, 2007-11-14 at 13:23 -0500, Chuck Lever wrote:
>> Hi-
>>
>> Historical question here.
>>
>> nlmsvc_lookup_host() has this:
>>
>> struct nlm_host *
>> nlmsvc_lookup_host(struct svc_rqst *rqstp,
>>                          const char *hostname, unsigned int hostname_len)
>> {
>>          struct sockaddr_in ssin = {0};
>>
>>  >>>     ssin.sin_addr = rqstp->rq_daddr.addr;
>>          return nlm_lookup_host(1, svc_addr_in(rqstp),
>>                                 rqstp->rq_prot, rqstp->rq_vers,
>>                                 hostname, hostname_len, &ssin);
>> }
>>
>> Why is it using rq_daddr to construct the lookup target?
>>
>> The problem here is that rq_daddr isn't a full address.  It doesn't have 
>> address family information.  So nlmsvc_lookup_host() just assumes that 
>> what's stored in rq_daddr is always AF_INET.
> 
> Why can't rq_daddr hold address family information? AFAICS there is
> nothing stopping you from setting that in svc_recvfrom() and in
> svc_udp_get_dest_address().

That's correct.  I'm just trying to understand why, historically, 
rq_daddr was just the 32-bit address and not a full sockaddr to begin 
with.  There may be something we're missing, like "we didn't want to add 
another large field to this structure due to memory alignment or 
allocation efficiency concerns".  :-)

> In fact, why isn't rq_daddr defined to be a sockaddr_storage so that it
> can be safely cast into either a sockaddr_in or a sockaddr_in6 after
> checking the ss_family field?

That isn't unreasonable, but see above.

Note also we have the same problem with NLM_REBOOT (see 
nlmsvc_decode_reboot() and nlmsvc_proc_sm_notify()).  These pass and 
store a 32-bit network-endian address rather than constructing a full 
sockaddr.  A similar solution invites itself.

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 19:00   ` Chuck Lever
@ 2007-11-14 19:32     ` J. Bruce Fields
  2007-11-14 19:41     ` Trond Myklebust
  1 sibling, 0 replies; 16+ messages in thread
From: J. Bruce Fields @ 2007-11-14 19:32 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Neil Brown, nfs, Trond Myklebust

On Wed, Nov 14, 2007 at 02:00:25PM -0500, Chuck Lever wrote:
> Trond Myklebust wrote:
>> On Wed, 2007-11-14 at 13:23 -0500, Chuck Lever wrote:
>>> Hi-
>>>
>>> Historical question here.
>>>
>>> nlmsvc_lookup_host() has this:
>>>
>>> struct nlm_host *
>>> nlmsvc_lookup_host(struct svc_rqst *rqstp,
>>>                          const char *hostname, unsigned int hostname_len)
>>> {
>>>          struct sockaddr_in ssin = {0};
>>>
>>>  >>>     ssin.sin_addr = rqstp->rq_daddr.addr;
>>>          return nlm_lookup_host(1, svc_addr_in(rqstp),
>>>                                 rqstp->rq_prot, rqstp->rq_vers,
>>>                                 hostname, hostname_len, &ssin);
>>> }
>>>
>>> Why is it using rq_daddr to construct the lookup target?
>>>
>>> The problem here is that rq_daddr isn't a full address.  It doesn't have 
>>> address family information.  So nlmsvc_lookup_host() just assumes that 
>>> what's stored in rq_daddr is always AF_INET.
>>
>> Why can't rq_daddr hold address family information? AFAICS there is
>> nothing stopping you from setting that in svc_recvfrom() and in
>> svc_udp_get_dest_address().
>
> That's correct.  I'm just trying to understand why, historically, rq_daddr 
> was just the 32-bit address and not a full sockaddr to begin with.  There 
> may be something we're missing, like "we didn't want to add another large 
> field to this structure due to memory alignment or allocation efficiency 
> concerns".  :-)

Maybe somebody should be thinking about it, but I don't think anyone
has.  And if they did, this would seem the wrong place to skimp.

--b.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 19:00   ` Chuck Lever
  2007-11-14 19:32     ` J. Bruce Fields
@ 2007-11-14 19:41     ` Trond Myklebust
  2007-11-14 19:50       ` Trond Myklebust
  2007-11-14 19:54       ` Chuck Lever
  1 sibling, 2 replies; 16+ messages in thread
From: Trond Myklebust @ 2007-11-14 19:41 UTC (permalink / raw)
  To: chuck.lever; +Cc: Neil Brown, nfs


On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
> That's correct.  I'm just trying to understand why, historically, 
> rq_daddr was just the 32-bit address and not a full sockaddr to begin 
> with.  There may be something we're missing, like "we didn't want to add 
> another large field to this structure due to memory alignment or 
> allocation efficiency concerns".  :-)

Actually, rq_daddr by definition pretty much has to be of the same
address family as rq_addr, since they are the two endpoints for the same
socket. 

However I can't see where rq_addr is being initialised for UDP sockets.
That is sort of worrying given that it is used among other things by the
nfsd duplicate reply cache...

Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 19:41     ` Trond Myklebust
@ 2007-11-14 19:50       ` Trond Myklebust
  2007-11-14 19:54       ` Chuck Lever
  1 sibling, 0 replies; 16+ messages in thread
From: Trond Myklebust @ 2007-11-14 19:50 UTC (permalink / raw)
  To: chuck.lever; +Cc: Neil Brown, nfs


On Wed, 2007-11-14 at 14:41 -0500, Trond Myklebust wrote:
> On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
> > That's correct.  I'm just trying to understand why, historically, 
> > rq_daddr was just the 32-bit address and not a full sockaddr to begin 
> > with.  There may be something we're missing, like "we didn't want to add 
> > another large field to this structure due to memory alignment or 
> > allocation efficiency concerns".  :-)
> 
> Actually, rq_daddr by definition pretty much has to be of the same
> address family as rq_addr, since they are the two endpoints for the same
> socket. 
> 
> However I can't see where rq_addr is being initialised for UDP sockets.
> That is sort of worrying given that it is used among other things by the
> nfsd duplicate reply cache...

Gah. Never mind... The wretched thing is wrapped by a useful svc_addr()
'helper' that returns its address.

Problem solved, then. Use the sa_family from rq_addr in order to figure
out how to interpret rq_daddr.

Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 19:41     ` Trond Myklebust
  2007-11-14 19:50       ` Trond Myklebust
@ 2007-11-14 19:54       ` Chuck Lever
  2007-11-15  1:41         ` Neil Brown
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2007-11-14 19:54 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 853 bytes --]

Trond Myklebust wrote:
> On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
>> That's correct.  I'm just trying to understand why, historically, 
>> rq_daddr was just the 32-bit address and not a full sockaddr to begin 
>> with.  There may be something we're missing, like "we didn't want to add 
>> another large field to this structure due to memory alignment or 
>> allocation efficiency concerns".  :-)
> 
> Actually, rq_daddr by definition pretty much has to be of the same
> address family as rq_addr, since they are the two endpoints for the same
> socket. 
> 
> However I can't see where rq_addr is being initialised for UDP sockets.
> That is sort of worrying given that it is used among other things by the
> nfsd duplicate reply cache...

That's the other half of my question.  Why isn't nlmsvc_lookup_host 
using rq_addr (without the d)?

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-14 19:54       ` Chuck Lever
@ 2007-11-15  1:41         ` Neil Brown
  2007-11-15 13:55           ` Chuck Lever
  0 siblings, 1 reply; 16+ messages in thread
From: Neil Brown @ 2007-11-15  1:41 UTC (permalink / raw)
  To: chuck.lever; +Cc: nfs, Trond Myklebust

On Wednesday November 14, chuck.lever@oracle.com wrote:
> Trond Myklebust wrote:
> > On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
> >> That's correct.  I'm just trying to understand why, historically, 
> >> rq_daddr was just the 32-bit address and not a full sockaddr to begin 
> >> with.  There may be something we're missing, like "we didn't want to add 
> >> another large field to this structure due to memory alignment or 
> >> allocation efficiency concerns".  :-)
> > 
> > Actually, rq_daddr by definition pretty much has to be of the same
> > address family as rq_addr, since they are the two endpoints for the same
> > socket. 
> > 
> > However I can't see where rq_addr is being initialised for UDP sockets.
> > That is sort of worrying given that it is used among other things by the
> > nfsd duplicate reply cache...
> 
> That's the other half of my question.  Why isn't nlmsvc_lookup_host 
> using rq_addr (without the d)?

Git is your friend.

commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
Author: Frank van Maarseveen <frankvm@frankvm.com>
Date:   Mon Jul 9 22:25:29 2007 +0200

    NLM: fix source address of callback to client
    
    Use the destination address of the original NLM request as the
    source address in callbacks to the client.
    
    Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


Also, other places that need to interpret rq_daddr use:

	struct svc_sock *svsk =
		container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
	switch (svsk->sk_sk->sk_family) {
	case AF_INET:...
        case AF_INET6: ....

(see svcsock.c).

Why wouldn't that work here?


NeilBrown


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15  1:41         ` Neil Brown
@ 2007-11-15 13:55           ` Chuck Lever
  2007-11-15 15:54             ` Trond Myklebust
  2007-11-15 16:00             ` Frank van Maarseveen
  0 siblings, 2 replies; 16+ messages in thread
From: Chuck Lever @ 2007-11-15 13:55 UTC (permalink / raw)
  To: Neil Brown; +Cc: nfs, Trond Myklebust

[-- Attachment #1: Type: text/plain, Size: 2299 bytes --]

Neil Brown wrote:
> On Wednesday November 14, chuck.lever@oracle.com wrote:
>> Trond Myklebust wrote:
>>> On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
>>>> That's correct.  I'm just trying to understand why, historically, 
>>>> rq_daddr was just the 32-bit address and not a full sockaddr to begin 
>>>> with.  There may be something we're missing, like "we didn't want to add 
>>>> another large field to this structure due to memory alignment or 
>>>> allocation efficiency concerns".  :-)
>>> Actually, rq_daddr by definition pretty much has to be of the same
>>> address family as rq_addr, since they are the two endpoints for the same
>>> socket. 
>>>
>>> However I can't see where rq_addr is being initialised for UDP sockets.
>>> That is sort of worrying given that it is used among other things by the
>>> nfsd duplicate reply cache...
>> That's the other half of my question.  Why isn't nlmsvc_lookup_host 
>> using rq_addr (without the d)?
> 
> Git is your friend.
> 
> commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
> Author: Frank van Maarseveen <frankvm@frankvm.com>
> Date:   Mon Jul 9 22:25:29 2007 +0200
> 
>     NLM: fix source address of callback to client
>     
>     Use the destination address of the original NLM request as the
>     source address in callbacks to the client.
>     
>     Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
>     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Silly me.  I had assumed that it had always been that way, and thus git 
would not be helpful (git's history truncates at 2.6.12).

Unfortunately Frank's patch description doesn't explain *why* this 
change was made.  I assume this fixes a bug with multi-homed servers?

> Also, other places that need to interpret rq_daddr use:
> 
> 	struct svc_sock *svsk =
> 		container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> 	switch (svsk->sk_sk->sk_family) {
> 	case AF_INET:...
>         case AF_INET6: ....
> 
> (see svcsock.c).
> 
> Why wouldn't that work here?

Should lockd be poking around in network layer data structures?  I would 
argue "no" especially because that would make lockd transport-dependent. 
  This approach wouldn't work at all for, say, RDMA transports.

What do you think of changing the rq_daddr field to be a sockaddr_storage?

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 13:55           ` Chuck Lever
@ 2007-11-15 15:54             ` Trond Myklebust
  2007-11-15 16:26               ` Chuck Lever
  2007-11-15 17:23               ` Chuck Lever
  2007-11-15 16:00             ` Frank van Maarseveen
  1 sibling, 2 replies; 16+ messages in thread
From: Trond Myklebust @ 2007-11-15 15:54 UTC (permalink / raw)
  To: chuck.lever; +Cc: Neil Brown, nfs


On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:

> What do you think of changing the rq_daddr field to be a sockaddr_storage?

Why? You've already got rq_addr.

Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 13:55           ` Chuck Lever
  2007-11-15 15:54             ` Trond Myklebust
@ 2007-11-15 16:00             ` Frank van Maarseveen
  1 sibling, 0 replies; 16+ messages in thread
From: Frank van Maarseveen @ 2007-11-15 16:00 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Neil Brown, nfs, Trond Myklebust

On Thu, Nov 15, 2007 at 08:55:46AM -0500, Chuck Lever wrote:
> Neil Brown wrote:
> >On Wednesday November 14, chuck.lever@oracle.com wrote:
> >>Trond Myklebust wrote:
> >>>On Wed, 2007-11-14 at 14:00 -0500, Chuck Lever wrote:
> >>>>That's correct.  I'm just trying to understand why, historically, 
> >>>>rq_daddr was just the 32-bit address and not a full sockaddr to begin 
> >>>>with.  There may be something we're missing, like "we didn't want to 
> >>>>add another large field to this structure due to memory alignment or 
> >>>>allocation efficiency concerns".  :-)
> >>>Actually, rq_daddr by definition pretty much has to be of the same
> >>>address family as rq_addr, since they are the two endpoints for the same
> >>>socket. 
> >>>
> >>>However I can't see where rq_addr is being initialised for UDP sockets.
> >>>That is sort of worrying given that it is used among other things by the
> >>>nfsd duplicate reply cache...
> >>That's the other half of my question.  Why isn't nlmsvc_lookup_host 
> >>using rq_addr (without the d)?
> >
> >Git is your friend.
> >
> >commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
> >Author: Frank van Maarseveen <frankvm@frankvm.com>
> >Date:   Mon Jul 9 22:25:29 2007 +0200
> >
> >    NLM: fix source address of callback to client
> >    
> >    Use the destination address of the original NLM request as the
> >    source address in callbacks to the client.
> >    
> >    Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
> >    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Silly me.  I had assumed that it had always been that way, and thus git 
> would not be helpful (git's history truncates at 2.6.12).
> 
> Unfortunately Frank's patch description doesn't explain *why* this 
> change was made.  I assume this fixes a bug with multi-homed servers?

Almost: Multiple NICs are probably not a problem (kernel should fill in
correct src address depending on outgoing interface) but for multiple
IP addresses on the same interface it definately is a problem.

When NFS clients A and B are contending for a lock the server needs
to do a callback to client B when the lock is released by client
A. Without above fix the kernel would pick an arbitrary source address
for callbacks. And (at least) the linux NFS client verifies the source
address of an NLM callback, see nlmclnt_grant().

IP aliases are very useful for virtualizing NFS servers (we run about
70 of them on 4 physical machines).

> What do you think of changing the rq_daddr field to be a sockaddr_storage?

One thing which still worries me is the sockaddr_storage size: this thing
is a bit large due to AF_UNIX support. AF_UNIX isn't used for NFS AFAIK.
Anyway, changing that is a separate issue.

-- 
Frank

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 15:54             ` Trond Myklebust
@ 2007-11-15 16:26               ` Chuck Lever
  2007-11-15 17:00                 ` Chuck Lever
  2007-11-15 17:23               ` Chuck Lever
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2007-11-15 16:26 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

Trond Myklebust wrote:
> On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:
> 
>> What do you think of changing the rq_daddr field to be a sockaddr_storage?
> 
> Why? You've already got rq_addr.

rq_daddr stores raw address bits, but nlm_host stores a whole sockaddr 
for the source address.  Should we change nlm_host to store just the raw 
address bits as well (svc_addr_u)?  AFAICS the only place that needs to 
construct a whole sockaddr out of the source address is nlm_bind_host.

In addition, NSM_NOTIFY gets a 4-byte address off the wire, and assumes 
it's AF_INET.  Should we store this in an in_addr/in_addr6 union as well?

I haven't looked closely at NSM_NOTIFY... is it even possible to send a 
non-AF_INET address with this procedure?

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 16:26               ` Chuck Lever
@ 2007-11-15 17:00                 ` Chuck Lever
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever @ 2007-11-15 17:00 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]



Chuck Lever wrote:
> Trond Myklebust wrote:
>> On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:
>>
>>> What do you think of changing the rq_daddr field to be a 
>>> sockaddr_storage?
>>
>> Why? You've already got rq_addr.
> 
> rq_daddr stores raw address bits, but nlm_host stores a whole sockaddr 
> for the source address.  Should we change nlm_host to store just the raw 
> address bits as well (svc_addr_u)?  AFAICS the only place that needs to 
> construct a whole sockaddr out of the source address is nlm_bind_host.

I'll answer my own e-mail question.

nlm_host{} has to store a full address because nlm_lookup_host() uses 
nlm_cmp_addr() to compare the source addresses.  nlm_cmp_addr() has to 
be address-family-aware.

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 15:54             ` Trond Myklebust
  2007-11-15 16:26               ` Chuck Lever
@ 2007-11-15 17:23               ` Chuck Lever
  2007-11-15 17:37                 ` Trond Myklebust
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2007-11-15 17:23 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 1271 bytes --]

Trond Myklebust wrote:
> On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:
> 
>> What do you think of changing the rq_daddr field to be a sockaddr_storage?
> 
> Why? You've already got rq_addr.

OK, here's what I've ended up with.  I think this is what you and Neil 
have been suggesting.  Comments?

/*
  * Find an NLM client handle in the cache. If there is none, create it.
  *
  * Manufacture a specific source address in case we're using multiple
  * IP addresses on a single NIC.  NB: the family of the address in
  * rq_daddr is guaranteed to be the same as the family of rq_addr.
  */
struct nlm_host *
nlmsvc_lookup_host(struct svc_rqst *rqstp,
			const char *hostname, unsigned int hostname_len)
{
	struct sockaddr_in6 source = {
		.sin6_family	= AF_INET6,
	};

	switch(svc_addr(rqstp)->sa_family) {
	case AF_INET: {
		struct sockaddr_in *sin = (struct sockaddr_in *)&source;
		sin->sin_family = AF_INET;
		sin->sin_addr.s_addr = rqstp->rq_daddr.addr.s_addr;
		break;
	}
	case AF_INET6:
		ipv6_addr_copy(&source.sin6_addr,
                                &rqstp->rq_daddr.addr6);
		break;
	}

	return nlm_lookup_host(1, svc_addr_in(rqstp),
			       rqstp->rq_prot, rqstp->rq_vers,
			       hostname, hostname_len,
			       (struct sockaddr *)&source);
}

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 17:23               ` Chuck Lever
@ 2007-11-15 17:37                 ` Trond Myklebust
  2007-11-15 18:02                   ` Chuck Lever
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2007-11-15 17:37 UTC (permalink / raw)
  To: chuck.lever; +Cc: Neil Brown, nfs


On Thu, 2007-11-15 at 12:23 -0500, Chuck Lever wrote:
> Trond Myklebust wrote:
> > On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:
> > 
> >> What do you think of changing the rq_daddr field to be a sockaddr_storage?
> > 
> > Why? You've already got rq_addr.
> 
> OK, here's what I've ended up with.  I think this is what you and Neil 
> have been suggesting.  Comments?
> 
> /*
>   * Find an NLM client handle in the cache. If there is none, create it.
>   *
>   * Manufacture a specific source address in case we're using multiple
>   * IP addresses on a single NIC.  NB: the family of the address in
>   * rq_daddr is guaranteed to be the same as the family of rq_addr.
>   */
> struct nlm_host *
> nlmsvc_lookup_host(struct svc_rqst *rqstp,
> 			const char *hostname, unsigned int hostname_len)
> {
> 	struct sockaddr_in6 source = {
> 		.sin6_family	= AF_INET6,
> 	};
> 
> 	switch(svc_addr(rqstp)->sa_family) {
> 	case AF_INET: {
> 		struct sockaddr_in *sin = (struct sockaddr_in *)&source;
> 		sin->sin_family = AF_INET;
> 		sin->sin_addr.s_addr = rqstp->rq_daddr.addr.s_addr;
> 		break;
> 	}
> 	case AF_INET6:
> 		ipv6_addr_copy(&source.sin6_addr,
>                                 &rqstp->rq_daddr.addr6);
> 		break;
> 	}
> 
> 	return nlm_lookup_host(1, svc_addr_in(rqstp),
> 			       rqstp->rq_prot, rqstp->rq_vers,
> 			       hostname, hostname_len,
> 			       (struct sockaddr *)&source);
> }

That works for me. Technically, I suppose we should really pass a
source_len to nlm_lookup_host() if only to ensure that the comparisons
don't overrun memory, however in practice I doubt this is a problem. We
always know that the kernel allocates enough space.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Use of delayed request information in nlmsvc_lookup_host
  2007-11-15 17:37                 ` Trond Myklebust
@ 2007-11-15 18:02                   ` Chuck Lever
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever @ 2007-11-15 18:02 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Neil Brown, nfs

[-- Attachment #1: Type: text/plain, Size: 2013 bytes --]

Trond Myklebust wrote:
> On Thu, 2007-11-15 at 12:23 -0500, Chuck Lever wrote:
>> Trond Myklebust wrote:
>>> On Thu, 2007-11-15 at 08:55 -0500, Chuck Lever wrote:
>>>
>>>> What do you think of changing the rq_daddr field to be a sockaddr_storage?
>>> Why? You've already got rq_addr.
>> OK, here's what I've ended up with.  I think this is what you and Neil 
>> have been suggesting.  Comments?
>>
>> /*
>>   * Find an NLM client handle in the cache. If there is none, create it.
>>   *
>>   * Manufacture a specific source address in case we're using multiple
>>   * IP addresses on a single NIC.  NB: the family of the address in
>>   * rq_daddr is guaranteed to be the same as the family of rq_addr.
>>   */
>> struct nlm_host *
>> nlmsvc_lookup_host(struct svc_rqst *rqstp,
>> 			const char *hostname, unsigned int hostname_len)
>> {
>> 	struct sockaddr_in6 source = {
>> 		.sin6_family	= AF_INET6,
>> 	};
>>
>> 	switch(svc_addr(rqstp)->sa_family) {
>> 	case AF_INET: {
>> 		struct sockaddr_in *sin = (struct sockaddr_in *)&source;
>> 		sin->sin_family = AF_INET;
>> 		sin->sin_addr.s_addr = rqstp->rq_daddr.addr.s_addr;
>> 		break;
>> 	}
>> 	case AF_INET6:
>> 		ipv6_addr_copy(&source.sin6_addr,
>>                                 &rqstp->rq_daddr.addr6);
>> 		break;
>> 	}
>>
>> 	return nlm_lookup_host(1, svc_addr_in(rqstp),
>> 			       rqstp->rq_prot, rqstp->rq_vers,
>> 			       hostname, hostname_len,
>> 			       (struct sockaddr *)&source);
>> }
> 
> That works for me. Technically, I suppose we should really pass a
> source_len to nlm_lookup_host() if only to ensure that the comparisons
> don't overrun memory, however in practice I doubt this is a problem. We
> always know that the kernel allocates enough space.

My IPv6-aware version of nlm_cmp_addr takes a pair of sockaddr *'s; no 
lengths are needed for the comparison of either AF_INET or AF_INET6 
addresses.

In fact, in most cases, it appears that the address length is unneeded 
for us.  Both INET families use fixed-size addresses.

[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 259 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #3: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2007-11-15 18:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-14 18:23 Use of delayed request information in nlmsvc_lookup_host Chuck Lever
2007-11-14 18:43 ` Trond Myklebust
2007-11-14 19:00   ` Chuck Lever
2007-11-14 19:32     ` J. Bruce Fields
2007-11-14 19:41     ` Trond Myklebust
2007-11-14 19:50       ` Trond Myklebust
2007-11-14 19:54       ` Chuck Lever
2007-11-15  1:41         ` Neil Brown
2007-11-15 13:55           ` Chuck Lever
2007-11-15 15:54             ` Trond Myklebust
2007-11-15 16:26               ` Chuck Lever
2007-11-15 17:00                 ` Chuck Lever
2007-11-15 17:23               ` Chuck Lever
2007-11-15 17:37                 ` Trond Myklebust
2007-11-15 18:02                   ` Chuck Lever
2007-11-15 16:00             ` Frank van Maarseveen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.