All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v2 9/13] sunrpc: Merge the xs_bind code
Date: Mon, 18 Oct 2010 19:55:09 -0400	[thread overview]
Message-ID: <20101018235509.GI4740@fieldses.org> (raw)
In-Reply-To: <C254B855-0C70-4989-8559-143EB388D2EA@oracle.com>

On Mon, Oct 18, 2010 at 06:37:01PM -0400, Chuck Lever wrote:
> 
> On Oct 18, 2010, at 5:57 PM, J. Bruce Fields wrote:
> 
> > On Mon, Oct 18, 2010 at 05:53:45PM -0400, Chuck Lever wrote:
> >> Can you post the full revised patch?  I'm wondering if it's OK to init the sockaddr's family that way, but I can't tell from just the snippet.
> > 
> > I end up modifying these two patches (then the change propagates through
> > the others without conflicts).
> > 
> > --b.
> > 
> > commit c923f8c579bd65e4d4096cdcc1ca2b17780143ce
> > Author: Pavel Emelyanov <xemul@parallels.com>
> > Date:   Tue Oct 5 15:53:08 2010 +0400
> > 
> >    sunrpc: Merge the xs_bind code
> > 
> >    There's the only difference betseen the xs_bind4 and the
> >    xs_bind6 - the size of sockaddr structure they use.
> > 
> >    Fortunatelly its size can be indirectly get from the transport.
> > 
> >    Change since v1:
> >    * use sockaddr_storage instead of sockaddr
> >    * use rpc_set_port instead of manual port assigning
> > 
> >    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> >    [bfields@redhat.com: fix address family initialization]
> >    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index 7fdf2bb..fc1e767 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -1534,23 +1534,18 @@ static unsigned short xs_next_srcport(struct sock_xprt *transport, unsigned shor
> > 		return xprt_max_resvport;
> > 	return --port;
> > }
> > -
> > -static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
> > +static int xs_bind(struct sock_xprt *transport, struct socket *sock)
> > {
> > -	struct sockaddr_in myaddr = {
> > -		.sin_family = AF_INET,
> > -	};
> > -	struct sockaddr_in *sa;
> > +	struct sockaddr_storage myaddr;
> > 	int err, nloop = 0;
> > 	unsigned short port = xs_get_srcport(transport);
> > 	unsigned short last;
> > 
> > -	sa = (struct sockaddr_in *)&transport->srcaddr;
> > -	myaddr.sin_addr = sa->sin_addr;
> > +	memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
> 
> This routine used to construct my_addr by hand in automatic storage by copying just the address part from srcaddr.  Since there was a separate xs_bind for IPv4 and one for IPv6, we already had the source address family planted in my_addr.
> 
> Now we're copying the full source address, including sa_family, from the passed in srcaddr.  But most ULPs don't actually set up srcaddr.  In fact, I think only lockd does.
> 
> The calling convention for rpc_create() allows this.  Notice in xs_setup_xprt() the source address is copied only if the args->srcaddr pointer is not NULL.  If it _is_ NULL, then the sock_xprt's srcaddr field should be filled with all zeros (which is the ANYADDR in both IPv4 and IPv6) because we allocate the new sock_xprt structure via kzalloc().  But the sa_family field is left zero as well in this case.
> 
> This is where the actual bug is, I think.  When srcaddr is NULL, instead of leaving that srcaddr field uninitialized in xs_setup_xprt(), it should plant an appropriate ANYADDR address there.  The family of that ANYADDR address is determined by the family of dstaddr, which is always initialized before this function is called.  I can send you a simple example patch that might apply before Pavel's work, if you like.
> 
> In the current code, the fact that ANYADDR just happens to be all zeroes saved our ass on this one.

It sounds like you're saying that before we ensured that fields other
tha .sin_family were zero in one way (with the automatic initialiation
of myaddr above), and now ensure it in another (kzalloc of the thing we
copy from)--I'm not sure the difference is as important as all that?

But, honestly, I'm not following all this--I'll happily take whatever
revision you and Pavel want to give me, either incremental or as a
replacement for the 17 patches.

--b.

> 
> > 	do {
> > -		myaddr.sin_port = htons(port);
> > -		err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> > -						sizeof(myaddr));
> > +		rpc_set_port((struct sockaddr *)&myaddr, port);
> > +		err = kernel_bind(sock, (struct sockaddr *)&myaddr,
> > +				transport->xprt.addrlen);
> > 		if (port == 0)
> > 			break;
> > 		if (err == 0) {
> > @@ -1562,44 +1557,19 @@ static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
> > 		if (port > last)
> > 			nloop++;
> > 	} while (err == -EADDRINUSE && nloop != 2);
> > -	dprintk("RPC:       %s %pI4:%u: %s (%d)\n",
> > -			__func__, &myaddr.sin_addr,
> > -			port, err ? "failed" : "ok", err);
> > -	return err;
> > -}
> > -
> > -static int xs_bind6(struct sock_xprt *transport, struct socket *sock)
> > -{
> > -	struct sockaddr_in6 myaddr = {
> > -		.sin6_family = AF_INET6,
> > -	};
> > -	struct sockaddr_in6 *sa;
> > -	int err, nloop = 0;
> > -	unsigned short port = xs_get_srcport(transport);
> > -	unsigned short last;
> > 
> > -	sa = (struct sockaddr_in6 *)&transport->srcaddr;
> > -	myaddr.sin6_addr = sa->sin6_addr;
> > -	do {
> > -		myaddr.sin6_port = htons(port);
> > -		err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> > -						sizeof(myaddr));
> > -		if (port == 0)
> > -			break;
> > -		if (err == 0) {
> > -			transport->srcport = port;
> > -			break;
> > -		}
> > -		last = port;
> > -		port = xs_next_srcport(transport, port);
> > -		if (port > last)
> > -			nloop++;
> > -	} while (err == -EADDRINUSE && nloop != 2);
> > -	dprintk("RPC:       xs_bind6 %pI6:%u: %s (%d)\n",
> > -		&myaddr.sin6_addr, port, err ? "failed" : "ok", err);
> > +	if (myaddr.ss_family == PF_INET)
> > +		dprintk("RPC:       %s %pI4:%u: %s (%d)\n", __func__,
> > +				&((struct sockaddr_in *)&myaddr)->sin_addr,
> > +				port, err ? "failed" : "ok", err);
> > +	else
> > +		dprintk("RPC:       %s %pI6:%u: %s (%d)\n", __func__,
> > +				&((struct sockaddr_in6 *)&myaddr)->sin6_addr,
> > +				port, err ? "failed" : "ok", err);
> > 	return err;
> > }
> > 
> > +
> > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > static struct lock_class_key xs_key[2];
> > static struct lock_class_key xs_slock_key[2];
> > @@ -1643,9 +1613,10 @@ static struct socket *xs_create_sock4(struct rpc_xprt *xprt,
> > 				protocol, -err);
> > 		goto out;
> > 	}
> > +	transport->srcaddr.ss_family = AF_INET;
> > 	xs_reclassify_socket4(sock);
> > 
> > -	if (xs_bind4(transport, sock)) {
> > +	if (xs_bind(transport, sock)) {
> > 		sock_release(sock);
> > 		goto out;
> > 	}
> > @@ -1667,9 +1638,10 @@ static struct socket *xs_create_sock6(struct rpc_xprt *xprt,
> > 				protocol, -err);
> > 		goto out;
> > 	}
> > +	transport->srcaddr.ss_family = AF_INET6;
> > 	xs_reclassify_socket6(sock);
> > 
> > -	if (xs_bind6(transport, sock)) {
> > +	if (xs_bind(transport, sock)) {
> > 		sock_release(sock);
> > 		goto out;
> > 	}
> > commit fa8045a09755f64f8a74c7a8f5046da9d632d4c5
> > Author: Pavel Emelyanov <xemul@parallels.com>
> > Date:   Mon Oct 4 16:56:38 2010 +0400
> > 
> >    sunrpc: Merge xs_create_sock code
> > 
> >    After xs_bind is merged it's easy to merge its callers.
> > 
> >    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> >    [bfields@redhat.com: fix address family initialization]
> >    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index fc1e767..324d97a 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -1591,6 +1591,14 @@ static inline void xs_reclassify_socket6(struct socket *sock)
> > 	sock_lock_init_class_and_name(sk, "slock-AF_INET6-RPC",
> > 		&xs_slock_key[1], "sk_lock-AF_INET6-RPC", &xs_key[1]);
> > }
> > +
> > +static inline void xs_reclassify_socket(int family, struct socket *sock)
> > +{
> > +	if (family == PF_INET)
> > +		xs_reclassify_socket4(sock);
> > +	else
> > +		xs_reclassify_socket6(sock);
> 
> 
> Just noticed: This should be a switch, just like this type of logic is everywhere else in this code.
> 
> 
> 
> > +}
> > #else
> > static inline void xs_reclassify_socket4(struct socket *sock)
> > {
> > @@ -1599,22 +1607,26 @@ static inline void xs_reclassify_socket4(struct socket *sock)
> > static inline void xs_reclassify_socket6(struct socket *sock)
> > {
> > }
> > +
> > +static inline void xs_reclassify_socket(int family, struct socket *sock)
> > +{
> > +}
> > #endif
> > 
> > -static struct socket *xs_create_sock4(struct rpc_xprt *xprt,
> > -		struct sock_xprt *transport, int type, int protocol)
> > +static struct socket *xs_create_sock(struct rpc_xprt *xprt,
> > +		struct sock_xprt *transport, int family, int type, int protocol)
> > {
> > 	struct socket *sock;
> > 	int err;
> > 
> > -	err = __sock_create(xprt->xprt_net, PF_INET, type, protocol, &sock, 1);
> > +	err = __sock_create(xprt->xprt_net, family, type, protocol, &sock, 1);
> > 	if (err < 0) {
> > 		dprintk("RPC:       can't create %d transport socket (%d).\n",
> > 				protocol, -err);
> > 		goto out;
> > 	}
> > -	transport->srcaddr.ss_family = AF_INET;
> > -	xs_reclassify_socket4(sock);
> > +	transport->srcaddr.ss_family = family;
> > +	xs_reclassify_socket(family, sock);
> 
> 
> See above.  I think this just covers up the problem, and the real fix is in xs_setup_xprt().
> 
> 
> 
> > 	if (xs_bind(transport, sock)) {
> > 		sock_release(sock);
> > @@ -1626,29 +1638,16 @@ out:
> > 	return ERR_PTR(err);
> > }
> > 
> > -static struct socket *xs_create_sock6(struct rpc_xprt *xprt,
> > +static struct socket *xs_create_sock4(struct rpc_xprt *xprt,
> > 		struct sock_xprt *transport, int type, int protocol)
> > {
> > -	struct socket *sock;
> > -	int err;
> > -
> > -	err = __sock_create(xprt->xprt_net, PF_INET6, type, protocol, &sock, 1);
> > -	if (err < 0) {
> > -		dprintk("RPC:       can't create %d transport socket (%d).\n",
> > -				protocol, -err);
> > -		goto out;
> > -	}
> > -	transport->srcaddr.ss_family = AF_INET6;
> > -	xs_reclassify_socket6(sock);
> > -
> > -	if (xs_bind(transport, sock)) {
> > -		sock_release(sock);
> > -		goto out;
> > -	}
> > +	return xs_create_sock(xprt, transport, PF_INET, type, protocol);
> > +}
> > 
> > -	return sock;
> > -out:
> > -	return ERR_PTR(err);
> > +static struct socket *xs_create_sock6(struct rpc_xprt *xprt,
> > +		struct sock_xprt *transport, int type, int protocol)
> > +{
> > +	return xs_create_sock(xprt, transport, PF_INET6, type, protocol);
> > }
> > 
> > static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> 
> -- 
> chuck[dot]lever[at]oracle[dot]com
> 
> 
> 
> 

  reply	other threads:[~2010-10-18 23:55 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-04 12:50 [PATCH 0/13] sunrpc: Compact the xprtsock code a bit Pavel Emelyanov
2010-10-04 12:51 ` [PATCH 1/13] sunrpc: Remove unused sock arg from xs_get_srcport Pavel Emelyanov
2010-10-04 12:51 ` [PATCH 2/13] sunrpc: Remove unused sock arg from xs_next_srcport Pavel Emelyanov
2010-10-04 12:52 ` [PATCH 3/13] sunrpc: Get xprt pointer once in xs_tcp_setup_socket Pavel Emelyanov
2010-10-04 12:52 ` [PATCH 4/13] sunrpc: Remove duplicate xprt/transport arguments from calls Pavel Emelyanov
2010-10-04 12:53 ` [PATCH 5/13] sunrpc: Factor out udp sockets creation Pavel Emelyanov
2010-10-04 12:54 ` [PATCH 6/13] sunrpc: Factor out v4 " Pavel Emelyanov
2010-10-04 12:54 ` [PATCH 7/13] sunrpc: Factor out v6 " Pavel Emelyanov
2010-10-04 12:55 ` [PATCH 8/13] sunrpc: Call xs_create_sockX directly from setup_socket Pavel Emelyanov
2010-10-04 12:56 ` [PATCH 9/13] sunrpc: Merge the xs_bind code Pavel Emelyanov
2010-10-05  3:20   ` Chuck Lever
2010-10-05  5:42     ` Pavel Emelyanov
2010-10-05 11:53       ` [PATCH v2 " Pavel Emelyanov
2010-10-06  3:04         ` Chuck Lever
2010-10-15 16:05         ` J. Bruce Fields
2010-10-15 16:24           ` Pavel Emelyanov
2010-10-15 16:39           ` Chuck Lever
2010-10-15 18:08             ` J. Bruce Fields
2010-10-15 18:09               ` J. Bruce Fields
2010-10-15 18:11               ` Chuck Lever
2010-10-18 21:15                 ` J. Bruce Fields
2010-10-18 21:43                   ` J. Bruce Fields
2010-10-18 21:53                     ` Chuck Lever
2010-10-18 21:57                       ` J. Bruce Fields
2010-10-18 22:37                         ` Chuck Lever
2010-10-18 23:55                           ` J. Bruce Fields [this message]
2010-10-19 14:35                             ` Chuck Lever
2010-10-19 15:20                               ` J. Bruce Fields
2010-10-20  9:19                                 ` Pavel Emelyanov
2010-10-04 12:56 ` [PATCH 10/13] sunrpc: Merge xs_create_sock code Pavel Emelyanov
2010-10-04 12:57 ` [PATCH 11/13] sunrpc: Pass family to setup_socket calls Pavel Emelyanov
2010-10-04 12:57 ` [PATCH 12/13] sunrpc: Remove TCP worker wrappers Pavel Emelyanov
2010-10-04 12:58 ` [PATCH 13/13] sunrpc: Remove UDP " Pavel Emelyanov
2010-10-11 23:47 ` [PATCH 0/13] sunrpc: Compact the xprtsock code a bit J. Bruce Fields
2010-10-12  7:51   ` Pavel Emelyanov
2010-10-12 13:50     ` J. Bruce Fields
2010-10-12 15:21       ` Pavel Emelyanov
2010-10-15 15:03         ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101018235509.GI4740@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=xemul@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.