* [RFC,PATCH] svc: Fix the UDP address logic
@ 2007-10-22 21:01 Tom Tucker
2007-10-23 1:53 ` Greg Banks
2007-10-23 15:13 ` Chuck Lever
0 siblings, 2 replies; 9+ messages in thread
From: Tom Tucker @ 2007-10-22 21:01 UTC (permalink / raw)
To: nfs; +Cc: bfields, neilb, gnb
When the address information was moved to the svc_xprt structure, a bug
was introduced to UDP that caused an incorrect address to be used
when responding to RPC on the UDP transport. This was the result of
failing to completely implement the generic address logic for the UDP
transport.
Thanks to Greg for pointing this out...
Since I confused myself, it's probably a good idea to describe how
this is supposed to work. The transport is responsible for setting
the xpt_local and xpt_remote addresses in the svc_xprt structure as
part of xpo_recvfrom processing. This cannot be done in a generic way
and in fact varies between TCP, UDP and RDMA. A set of xpo_ functions
(e.g. getlocalname, getremotename) could have been added but this would
have resulted in additional caching and copying of the addresses around.
The generic svc_recv code copies the addresses from the svc_xprt
structure into the rqstp structure as part of svc_recv processing.
The xpt_local address should also be set on listening endpoints, for
tcp/rdma this is done as part of endpoint creation and for new
connections in xpo_accept processing.
This patch was tested with Connectathon over V3 on UDP.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
---
net/sunrpc/svcsock.c | 16 ++++++++++------
1 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index e1a27ee..0f8a3d5 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
struct cmsghdr *cmh)
{
- struct svc_sock *svsk =
- container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
+ struct svc_xprt *xprt = rqstp->rq_xprt;
+ struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
switch (svsk->sk_sk->sk_family) {
case AF_INET: {
struct in_pktinfo *pki = CMSG_DATA(cmh);
- rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
+ struct sockaddr_in *sin =
+ (struct sockaddr_in *)&xprt->xpt_local;
+ sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
break;
}
case AF_INET6: {
struct in6_pktinfo *pki = CMSG_DATA(cmh);
- ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
+ struct sockaddr_in6 *sin6 =
+ (struct sockaddr_in6 *)&xprt->xpt_local;
+ ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
break;
}
}
@@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
struct cmsghdr *cmh = &buffer.hdr;
int err, len;
struct msghdr msg = {
- .msg_name = svc_addr(rqstp),
+ .msg_name = &svsk->sk_xprt.xpt_remote,
.msg_control = cmh,
.msg_controllen = sizeof(buffer),
.msg_flags = MSG_DONTWAIT,
@@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN;
}
- rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
+ svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
if (skb->tstamp.tv64 == 0) {
skb->tstamp = ktime_get_real();
/* Don't enable netstamp, sunrpc doesn't
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-22 21:01 [RFC,PATCH] svc: Fix the UDP address logic Tom Tucker
@ 2007-10-23 1:53 ` Greg Banks
2007-10-23 2:13 ` Tom Tucker
2007-10-23 15:13 ` Chuck Lever
1 sibling, 1 reply; 9+ messages in thread
From: Greg Banks @ 2007-10-23 1:53 UTC (permalink / raw)
To: Tom Tucker; +Cc: bfields, neilb, nfs
On Mon, Oct 22, 2007 at 04:01:48PM -0500, Tom Tucker wrote:
>
> When the address information was moved to the svc_xprt structure, a bug
> was introduced to UDP that caused an incorrect address to be used
> when responding to RPC on the UDP transport. This was the result of
> failing to completely implement the generic address logic for the UDP
> transport.
ok.
> Since I confused myself, it's probably a good idea to describe how
> this is supposed to work. [...]
This is useful text, any chance you can put it into a comment
or a text file describing the xprt switch interface?
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 1:53 ` Greg Banks
@ 2007-10-23 2:13 ` Tom Tucker
2007-10-23 2:46 ` Greg Banks
0 siblings, 1 reply; 9+ messages in thread
From: Tom Tucker @ 2007-10-23 2:13 UTC (permalink / raw)
To: Greg Banks; +Cc: bfields, neilb, nfs
Greg Banks wrote:
> On Mon, Oct 22, 2007 at 04:01:48PM -0500, Tom Tucker wrote:
>
>> When the address information was moved to the svc_xprt structure, a bug
>> was introduced to UDP that caused an incorrect address to be used
>> when responding to RPC on the UDP transport. This was the result of
>> failing to completely implement the generic address logic for the UDP
>> transport.
>>
>
> ok.
>
>
>> Since I confused myself, it's probably a good idea to describe how
>> this is supposed to work. [...]
>>
>
> This is useful text, any chance you can put it into a comment
> or a text file describing the xprt switch interface?
>
>
I think it would be most useful if I wrote a document describing the
transport switch and the requirements of a transport driver. The text
file would be for the Documentation directory. I'll try to get to it in
the next week.
> Greg.
>
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 2:13 ` Tom Tucker
@ 2007-10-23 2:46 ` Greg Banks
0 siblings, 0 replies; 9+ messages in thread
From: Greg Banks @ 2007-10-23 2:46 UTC (permalink / raw)
To: Tom Tucker; +Cc: bfields, neilb, nfs
On Mon, Oct 22, 2007 at 09:13:20PM -0500, Tom Tucker wrote:
> Greg Banks wrote:
> >On Mon, Oct 22, 2007 at 04:01:48PM -0500, Tom Tucker wrote:
> >
> >This is useful text, any chance you can put it into a comment
> >or a text file describing the xprt switch interface?
> >
> >
> I think it would be most useful if I wrote a document describing the
> transport switch and the requirements of a transport driver. The text
> file would be for the Documentation directory. I'll try to get to it in
> the next week.
Great!
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-22 21:01 [RFC,PATCH] svc: Fix the UDP address logic Tom Tucker
2007-10-23 1:53 ` Greg Banks
@ 2007-10-23 15:13 ` Chuck Lever
2007-10-23 15:38 ` Tom Tucker
1 sibling, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2007-10-23 15:13 UTC (permalink / raw)
To: Tom Tucker; +Cc: bfields, neilb, nfs, gnb
[-- Attachment #1: Type: text/plain, Size: 3457 bytes --]
Tom Tucker wrote:
> When the address information was moved to the svc_xprt structure, a bug
> was introduced to UDP that caused an incorrect address to be used
> when responding to RPC on the UDP transport. This was the result of
> failing to completely implement the generic address logic for the UDP
> transport.
>
> Thanks to Greg for pointing this out...
>
> Since I confused myself, it's probably a good idea to describe how
> this is supposed to work. The transport is responsible for setting
> the xpt_local and xpt_remote addresses in the svc_xprt structure as
> part of xpo_recvfrom processing. This cannot be done in a generic way
> and in fact varies between TCP, UDP and RDMA. A set of xpo_ functions
> (e.g. getlocalname, getremotename) could have been added but this would
> have resulted in additional caching and copying of the addresses around.
>
> The generic svc_recv code copies the addresses from the svc_xprt
> structure into the rqstp structure as part of svc_recv processing.
>
> The xpt_local address should also be set on listening endpoints, for
> tcp/rdma this is done as part of endpoint creation and for new
> connections in xpo_accept processing.
>
> This patch was tested with Connectathon over V3 on UDP.
>
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> ---
>
> net/sunrpc/svcsock.c | 16 ++++++++++------
> 1 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index e1a27ee..0f8a3d5 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
> static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
> struct cmsghdr *cmh)
> {
> - struct svc_sock *svsk =
> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> + struct svc_xprt *xprt = rqstp->rq_xprt;
> + struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> switch (svsk->sk_sk->sk_family) {
> case AF_INET: {
> struct in_pktinfo *pki = CMSG_DATA(cmh);
> - rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
> + struct sockaddr_in *sin =
> + (struct sockaddr_in *)&xprt->xpt_local;
> + sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
> break;
> }
> case AF_INET6: {
> struct in6_pktinfo *pki = CMSG_DATA(cmh);
> - ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
> + struct sockaddr_in6 *sin6 =
> + (struct sockaddr_in6 *)&xprt->xpt_local;
> + ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
> break;
> }
> }
> @@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
> struct cmsghdr *cmh = &buffer.hdr;
> int err, len;
> struct msghdr msg = {
> - .msg_name = svc_addr(rqstp),
> + .msg_name = &svsk->sk_xprt.xpt_remote,
> .msg_control = cmh,
> .msg_controllen = sizeof(buffer),
> .msg_flags = MSG_DONTWAIT,
> @@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
> svc_xprt_received(&svsk->sk_xprt);
> return -EAGAIN;
> }
> - rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
> + svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
This worries me a bit. Why isn't rq_addrlen involved here? the
xpt_remote field should be a sockaddr_storage, and the remote_len field
should be the actual length of the address, not the size of the storage
space.
> if (skb->tstamp.tv64 == 0) {
> skb->tstamp = ktime_get_real();
> /* Don't enable netstamp, sunrpc doesn't
[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 327 bytes --]
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
email;internet:chuck dot lever at nospam oracle dot com
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard
[-- Attachment #3: Type: text/plain, Size: 314 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
[-- Attachment #4: Type: text/plain, Size: 140 bytes --]
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 15:13 ` Chuck Lever
@ 2007-10-23 15:38 ` Tom Tucker
2007-10-23 15:49 ` Chuck Lever
0 siblings, 1 reply; 9+ messages in thread
From: Tom Tucker @ 2007-10-23 15:38 UTC (permalink / raw)
To: chuck.lever; +Cc: bfields, neilb, nfs, gnb
Chuck Lever wrote:
> Tom Tucker wrote:
>> When the address information was moved to the svc_xprt structure, a bug
>> was introduced to UDP that caused an incorrect address to be used
>> when responding to RPC on the UDP transport. This was the result of
>> failing to completely implement the generic address logic for the UDP
>> transport.
>>
>> Thanks to Greg for pointing this out...
>>
>> Since I confused myself, it's probably a good idea to describe how
>> this is supposed to work. The transport is responsible for setting
>> the xpt_local and xpt_remote addresses in the svc_xprt structure as
>> part of xpo_recvfrom processing. This cannot be done in a generic way
>> and in fact varies between TCP, UDP and RDMA. A set of xpo_ functions
>> (e.g. getlocalname, getremotename) could have been added but this would
>> have resulted in additional caching and copying of the addresses around.
>> The generic svc_recv code copies the addresses from the svc_xprt
>> structure into the rqstp structure as part of svc_recv processing.
>> The xpt_local address should also be set on listening endpoints, for
>> tcp/rdma this is done as part of endpoint creation and for new
>> connections in xpo_accept processing.
>> This patch was tested with Connectathon over V3 on UDP.
>>
>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>> ---
>>
>> net/sunrpc/svcsock.c | 16 ++++++++++------
>> 1 files changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>> index e1a27ee..0f8a3d5 100644
>> --- a/net/sunrpc/svcsock.c
>> +++ b/net/sunrpc/svcsock.c
>> @@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
>> static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
>> struct cmsghdr *cmh)
>> {
>> - struct svc_sock *svsk =
>> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
>> + struct svc_xprt *xprt = rqstp->rq_xprt;
>> + struct svc_sock *svsk = container_of(xprt, struct svc_sock,
>> sk_xprt);
>> switch (svsk->sk_sk->sk_family) {
>> case AF_INET: {
>> struct in_pktinfo *pki = CMSG_DATA(cmh);
>> - rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
>> + struct sockaddr_in *sin =
>> + (struct sockaddr_in *)&xprt->xpt_local;
>> + sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
>> break;
>> }
>> case AF_INET6: {
>> struct in6_pktinfo *pki = CMSG_DATA(cmh);
>> - ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
>> + struct sockaddr_in6 *sin6 =
>> + (struct sockaddr_in6 *)&xprt->xpt_local;
>> + ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
>> break;
>> }
>> }
>> @@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>> struct cmsghdr *cmh = &buffer.hdr;
>> int err, len;
>> struct msghdr msg = {
>> - .msg_name = svc_addr(rqstp),
>> + .msg_name = &svsk->sk_xprt.xpt_remote,
>> .msg_control = cmh,
>> .msg_controllen = sizeof(buffer),
>> .msg_flags = MSG_DONTWAIT,
>> @@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>> svc_xprt_received(&svsk->sk_xprt);
>> return -EAGAIN;
>> }
>> - rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
>> + svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
>
> This worries me a bit. Why isn't rq_addrlen involved here?
rq_addrlen gets set in the generic code (see svc_recv). The source of
the address is the transport, not the rqstp structure. The rqstp
structure caches it for use when responding to the request in
svc_process. Ironically, I think the HA case that Greg referred to is
the only reason we can't get rid of rq_addr/rq_daddr altogether and
access the address data directly from svc_xprt. Since this address may
change for UDP, we have to store it in the rqstp structure and use it to
reply because another request could arrive in the interim and change the
address data in svc_xprt.
> the xpt_remote field should be a sockaddr_storage,
...it is.
> and the remote_len field should be the actual length of the address,
> not the size of the storage space.
Perhaps it should be, but a) kernel_recvmsg doesn't give us a length, b)
to determine it would require a switch, the degenerate case for which is
sizeof(storage), and c) in it's current usage, big-enough is
good-enough. A weaker point is "that's the way it's always been" ;-)
Tom
>
>> if (skb->tstamp.tv64 == 0) {
>> skb->tstamp = ktime_get_real();
>> /* Don't enable netstamp, sunrpc doesn't
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 15:38 ` Tom Tucker
@ 2007-10-23 15:49 ` Chuck Lever
2007-10-23 16:13 ` Tom Tucker
0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2007-10-23 15:49 UTC (permalink / raw)
To: Tom Tucker; +Cc: bfields, neilb, nfs, gnb
[-- Attachment #1: Type: text/plain, Size: 4988 bytes --]
Tom Tucker wrote:
> Chuck Lever wrote:
>> Tom Tucker wrote:
>>> When the address information was moved to the svc_xprt structure, a bug
>>> was introduced to UDP that caused an incorrect address to be used
>>> when responding to RPC on the UDP transport. This was the result of
>>> failing to completely implement the generic address logic for the UDP
>>> transport.
>>>
>>> Thanks to Greg for pointing this out...
>>>
>>> Since I confused myself, it's probably a good idea to describe how
>>> this is supposed to work. The transport is responsible for setting
>>> the xpt_local and xpt_remote addresses in the svc_xprt structure as
>>> part of xpo_recvfrom processing. This cannot be done in a generic way
>>> and in fact varies between TCP, UDP and RDMA. A set of xpo_ functions
>>> (e.g. getlocalname, getremotename) could have been added but this would
>>> have resulted in additional caching and copying of the addresses around.
>>> The generic svc_recv code copies the addresses from the svc_xprt
>>> structure into the rqstp structure as part of svc_recv processing.
>>> The xpt_local address should also be set on listening endpoints, for
>>> tcp/rdma this is done as part of endpoint creation and for new
>>> connections in xpo_accept processing.
>>> This patch was tested with Connectathon over V3 on UDP.
>>>
>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>> ---
>>>
>>> net/sunrpc/svcsock.c | 16 ++++++++++------
>>> 1 files changed, 10 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>>> index e1a27ee..0f8a3d5 100644
>>> --- a/net/sunrpc/svcsock.c
>>> +++ b/net/sunrpc/svcsock.c
>>> @@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
>>> static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
>>> struct cmsghdr *cmh)
>>> {
>>> - struct svc_sock *svsk =
>>> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
>>> + struct svc_xprt *xprt = rqstp->rq_xprt;
>>> + struct svc_sock *svsk = container_of(xprt, struct svc_sock,
>>> sk_xprt);
>>> switch (svsk->sk_sk->sk_family) {
>>> case AF_INET: {
>>> struct in_pktinfo *pki = CMSG_DATA(cmh);
>>> - rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
>>> + struct sockaddr_in *sin =
>>> + (struct sockaddr_in *)&xprt->xpt_local;
>>> + sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
>>> break;
>>> }
>>> case AF_INET6: {
>>> struct in6_pktinfo *pki = CMSG_DATA(cmh);
>>> - ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
>>> + struct sockaddr_in6 *sin6 =
>>> + (struct sockaddr_in6 *)&xprt->xpt_local;
>>> + ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
>>> break;
>>> }
>>> }
>>> @@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>> struct cmsghdr *cmh = &buffer.hdr;
>>> int err, len;
>>> struct msghdr msg = {
>>> - .msg_name = svc_addr(rqstp),
>>> + .msg_name = &svsk->sk_xprt.xpt_remote,
>>> .msg_control = cmh,
>>> .msg_controllen = sizeof(buffer),
>>> .msg_flags = MSG_DONTWAIT,
>>> @@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>> svc_xprt_received(&svsk->sk_xprt);
>>> return -EAGAIN;
>>> }
>>> - rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
>>> + svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
>>
>> This worries me a bit. Why isn't rq_addrlen involved here?
> rq_addrlen gets set in the generic code (see svc_recv). The source of
> the address is the transport, not the rqstp structure. The rqstp
> structure caches it for use when responding to the request in
> svc_process. Ironically, I think the HA case that Greg referred to is
> the only reason we can't get rid of rq_addr/rq_daddr altogether and
> access the address data directly from svc_xprt. Since this address may
> change for UDP, we have to store it in the rqstp structure and use it to
> reply because another request could arrive in the interim and change the
> address data in svc_xprt.
>> the xpt_remote field should be a sockaddr_storage,
> ...it is.
>> and the remote_len field should be the actual length of the address,
>> not the size of the storage space.
> Perhaps it should be, but a) kernel_recvmsg doesn't give us a length, b)
> to determine it would require a switch, the degenerate case for which is
> sizeof(storage),
I don't agree: there is no degenerate case, and sizeof(storage) should
never ever be used.
> and c) in it's current usage, big-enough is
> good-enough. A weaker point is "that's the way it's always been" ;-)
Please change it to use an appropriate address length. Using
sizeof(struct sockaddr_storage) is broken, or at the very least
inefficient (this struct is huge). The use of sizeof() here is
inconsistent with the design of the IPv6 changes I made last year.
[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 327 bytes --]
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
email;internet:chuck dot lever at nospam oracle dot com
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard
[-- Attachment #3: Type: text/plain, Size: 314 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
[-- Attachment #4: Type: text/plain, Size: 140 bytes --]
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 15:49 ` Chuck Lever
@ 2007-10-23 16:13 ` Tom Tucker
2007-10-23 16:24 ` Chuck Lever
0 siblings, 1 reply; 9+ messages in thread
From: Tom Tucker @ 2007-10-23 16:13 UTC (permalink / raw)
To: chuck.lever; +Cc: bfields, neilb, nfs, gnb
Chuck Lever wrote:
> Tom Tucker wrote:
>> Chuck Lever wrote:
>>> Tom Tucker wrote:
>>>> When the address information was moved to the svc_xprt structure, a
>>>> bug
>>>> was introduced to UDP that caused an incorrect address to be used
>>>> when responding to RPC on the UDP transport. This was the result of
>>>> failing to completely implement the generic address logic for the
>>>> UDP transport.
>>>>
>>>> Thanks to Greg for pointing this out...
>>>>
>>>> Since I confused myself, it's probably a good idea to describe how
>>>> this is supposed to work. The transport is responsible for setting
>>>> the xpt_local and xpt_remote addresses in the svc_xprt structure as
>>>> part of xpo_recvfrom processing. This cannot be done in a generic
>>>> way and in fact varies between TCP, UDP and RDMA. A set of xpo_
>>>> functions
>>>> (e.g. getlocalname, getremotename) could have been added but this
>>>> would
>>>> have resulted in additional caching and copying of the addresses
>>>> around.
>>>> The generic svc_recv code copies the addresses from the svc_xprt
>>>> structure into the rqstp structure as part of svc_recv processing.
>>>> The xpt_local address should also be set on listening endpoints, for
>>>> tcp/rdma this is done as part of endpoint creation and for new
>>>> connections in xpo_accept processing.
>>>> This patch was tested with Connectathon over V3 on UDP.
>>>>
>>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>>> ---
>>>>
>>>> net/sunrpc/svcsock.c | 16 ++++++++++------
>>>> 1 files changed, 10 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>>>> index e1a27ee..0f8a3d5 100644
>>>> --- a/net/sunrpc/svcsock.c
>>>> +++ b/net/sunrpc/svcsock.c
>>>> @@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
>>>> static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
>>>> struct cmsghdr *cmh)
>>>> {
>>>> - struct svc_sock *svsk =
>>>> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
>>>> + struct svc_xprt *xprt = rqstp->rq_xprt;
>>>> + struct svc_sock *svsk = container_of(xprt, struct svc_sock,
>>>> sk_xprt);
>>>> switch (svsk->sk_sk->sk_family) {
>>>> case AF_INET: {
>>>> struct in_pktinfo *pki = CMSG_DATA(cmh);
>>>> - rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
>>>> + struct sockaddr_in *sin =
>>>> + (struct sockaddr_in *)&xprt->xpt_local;
>>>> + sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
>>>> break;
>>>> }
>>>> case AF_INET6: {
>>>> struct in6_pktinfo *pki = CMSG_DATA(cmh);
>>>> - ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
>>>> + struct sockaddr_in6 *sin6 =
>>>> + (struct sockaddr_in6 *)&xprt->xpt_local;
>>>> + ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
>>>> break;
>>>> }
>>>> }
>>>> @@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>>> struct cmsghdr *cmh = &buffer.hdr;
>>>> int err, len;
>>>> struct msghdr msg = {
>>>> - .msg_name = svc_addr(rqstp),
>>>> + .msg_name = &svsk->sk_xprt.xpt_remote,
>>>> .msg_control = cmh,
>>>> .msg_controllen = sizeof(buffer),
>>>> .msg_flags = MSG_DONTWAIT,
>>>> @@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>>> svc_xprt_received(&svsk->sk_xprt);
>>>> return -EAGAIN;
>>>> }
>>>> - rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
>>>> + svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
>>>
>>> This worries me a bit. Why isn't rq_addrlen involved here?
>> rq_addrlen gets set in the generic code (see svc_recv). The source of
>> the address is the transport, not the rqstp structure. The rqstp
>> structure caches it for use when responding to the request in
>> svc_process. Ironically, I think the HA case that Greg referred to is
>> the only reason we can't get rid of rq_addr/rq_daddr altogether and
>> access the address data directly from svc_xprt. Since this address
>> may change for UDP, we have to store it in the rqstp structure and
>> use it to reply because another request could arrive in the interim
>> and change the address data in svc_xprt.
>>> the xpt_remote field should be a sockaddr_storage,
>> ...it is.
>>> and the remote_len field should be the actual length of the address,
>>> not the size of the storage space.
>> Perhaps it should be, but a) kernel_recvmsg doesn't give us a length,
>> b) to determine it would require a switch, the degenerate case for
>> which is sizeof(storage),
>
> I don't agree: there is no degenerate case, and sizeof(storage) should
> never ever be used.
>
>> and c) in it's current usage, big-enough is good-enough. A weaker
>> point is "that's the way it's always been" ;-)
>
> Please change it to use an appropriate address length. Using
> sizeof(struct sockaddr_storage) is broken, or at the very least
> inefficient (this struct is huge).
Yes, for TCP/UDP 128B is getting copied when it really only needs to be
16 or 28.
> The use of sizeof() here is inconsistent with the design of the IPv6
> changes I made last year.
Fair enough, but in practical terms that means if ss_family is not
AF_INET or AF_INET6, we'll return an error or is there something I'm
missing?
Tom
>
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC,PATCH] svc: Fix the UDP address logic
2007-10-23 16:13 ` Tom Tucker
@ 2007-10-23 16:24 ` Chuck Lever
0 siblings, 0 replies; 9+ messages in thread
From: Chuck Lever @ 2007-10-23 16:24 UTC (permalink / raw)
To: Tom Tucker; +Cc: bfields, neilb, nfs, gnb
[-- Attachment #1: Type: text/plain, Size: 6055 bytes --]
Tom Tucker wrote:
> Chuck Lever wrote:
>> Tom Tucker wrote:
>>> Chuck Lever wrote:
>>>> Tom Tucker wrote:
>>>>> When the address information was moved to the svc_xprt structure, a
>>>>> bug
>>>>> was introduced to UDP that caused an incorrect address to be used
>>>>> when responding to RPC on the UDP transport. This was the result of
>>>>> failing to completely implement the generic address logic for the
>>>>> UDP transport.
>>>>>
>>>>> Thanks to Greg for pointing this out...
>>>>>
>>>>> Since I confused myself, it's probably a good idea to describe how
>>>>> this is supposed to work. The transport is responsible for setting
>>>>> the xpt_local and xpt_remote addresses in the svc_xprt structure as
>>>>> part of xpo_recvfrom processing. This cannot be done in a generic
>>>>> way and in fact varies between TCP, UDP and RDMA. A set of xpo_
>>>>> functions
>>>>> (e.g. getlocalname, getremotename) could have been added but this
>>>>> would
>>>>> have resulted in additional caching and copying of the addresses
>>>>> around.
>>>>> The generic svc_recv code copies the addresses from the svc_xprt
>>>>> structure into the rqstp structure as part of svc_recv processing.
>>>>> The xpt_local address should also be set on listening endpoints, for
>>>>> tcp/rdma this is done as part of endpoint creation and for new
>>>>> connections in xpo_accept processing.
>>>>> This patch was tested with Connectathon over V3 on UDP.
>>>>>
>>>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>>>> ---
>>>>>
>>>>> net/sunrpc/svcsock.c | 16 ++++++++++------
>>>>> 1 files changed, 10 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>>>>> index e1a27ee..0f8a3d5 100644
>>>>> --- a/net/sunrpc/svcsock.c
>>>>> +++ b/net/sunrpc/svcsock.c
>>>>> @@ -473,17 +473,21 @@ svc_write_space(struct sock *sk)
>>>>> static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
>>>>> struct cmsghdr *cmh)
>>>>> {
>>>>> - struct svc_sock *svsk =
>>>>> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
>>>>> + struct svc_xprt *xprt = rqstp->rq_xprt;
>>>>> + struct svc_sock *svsk = container_of(xprt, struct svc_sock,
>>>>> sk_xprt);
>>>>> switch (svsk->sk_sk->sk_family) {
>>>>> case AF_INET: {
>>>>> struct in_pktinfo *pki = CMSG_DATA(cmh);
>>>>> - rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
>>>>> + struct sockaddr_in *sin =
>>>>> + (struct sockaddr_in *)&xprt->xpt_local;
>>>>> + sin->sin_addr.s_addr = pki->ipi_spec_dst.s_addr;
>>>>> break;
>>>>> }
>>>>> case AF_INET6: {
>>>>> struct in6_pktinfo *pki = CMSG_DATA(cmh);
>>>>> - ipv6_addr_copy(&rqstp->rq_daddr.addr6, &pki->ipi6_addr);
>>>>> + struct sockaddr_in6 *sin6 =
>>>>> + (struct sockaddr_in6 *)&xprt->xpt_local;
>>>>> + ipv6_addr_copy(&sin6->sin6_addr, &pki->ipi6_addr);
>>>>> break;
>>>>> }
>>>>> }
>>>>> @@ -506,7 +510,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>>>> struct cmsghdr *cmh = &buffer.hdr;
>>>>> int err, len;
>>>>> struct msghdr msg = {
>>>>> - .msg_name = svc_addr(rqstp),
>>>>> + .msg_name = &svsk->sk_xprt.xpt_remote,
>>>>> .msg_control = cmh,
>>>>> .msg_controllen = sizeof(buffer),
>>>>> .msg_flags = MSG_DONTWAIT,
>>>>> @@ -541,7 +545,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
>>>>> svc_xprt_received(&svsk->sk_xprt);
>>>>> return -EAGAIN;
>>>>> }
>>>>> - rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
>>>>> + svsk->sk_xprt.xpt_remotelen = sizeof(svsk->sk_xprt.xpt_remote);
>>>>
>>>> This worries me a bit. Why isn't rq_addrlen involved here?
>>> rq_addrlen gets set in the generic code (see svc_recv). The source of
>>> the address is the transport, not the rqstp structure. The rqstp
>>> structure caches it for use when responding to the request in
>>> svc_process. Ironically, I think the HA case that Greg referred to is
>>> the only reason we can't get rid of rq_addr/rq_daddr altogether and
>>> access the address data directly from svc_xprt. Since this address
>>> may change for UDP, we have to store it in the rqstp structure and
>>> use it to reply because another request could arrive in the interim
>>> and change the address data in svc_xprt.
>>>> the xpt_remote field should be a sockaddr_storage,
>>> ...it is.
>>>> and the remote_len field should be the actual length of the address,
>>>> not the size of the storage space.
>>> Perhaps it should be, but a) kernel_recvmsg doesn't give us a length,
>>> b) to determine it would require a switch, the degenerate case for
>>> which is sizeof(storage),
>>
>> I don't agree: there is no degenerate case, and sizeof(storage) should
>> never ever be used.
>>
>>> and c) in it's current usage, big-enough is good-enough. A weaker
>>> point is "that's the way it's always been" ;-)
>>
>> Please change it to use an appropriate address length. Using
>> sizeof(struct sockaddr_storage) is broken, or at the very least
>> inefficient (this struct is huge).
> Yes, for TCP/UDP 128B is getting copied when it really only needs to be
> 16 or 28.
I didn't think this was the case before the transport switch was
introduced, but perhaps there have been some patch mix-ups, or
oversights on my part.
The purpose of the address length field is to limit the number of bytes
that you need to copy. It's used when the address is treated as a blob
of bytes rather than when it is accessed as a union of structures
discriminated by ss_family.
>> The use of sizeof() here is inconsistent with the design of the IPv6
>> changes I made last year.
> Fair enough, but in practical terms that means if ss_family is not
> AF_INET or AF_INET6, we'll return an error or is there something I'm
> missing?
Yes, I think an error should be thrown in that case. I notice that in
some places, we have "default: BUG();" which I think is excessive.
[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 327 bytes --]
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
email;internet:chuck dot lever at nospam oracle dot com
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard
[-- Attachment #3: Type: text/plain, Size: 314 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
[-- Attachment #4: Type: text/plain, Size: 140 bytes --]
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-10-23 16:25 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-22 21:01 [RFC,PATCH] svc: Fix the UDP address logic Tom Tucker
2007-10-23 1:53 ` Greg Banks
2007-10-23 2:13 ` Tom Tucker
2007-10-23 2:46 ` Greg Banks
2007-10-23 15:13 ` Chuck Lever
2007-10-23 15:38 ` Tom Tucker
2007-10-23 15:49 ` Chuck Lever
2007-10-23 16:13 ` Tom Tucker
2007-10-23 16:24 ` Chuck Lever
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox