From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: chucklever@gmail.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 0/7] Remaining rpcbind patches for 2.6.27
Date: Mon, 07 Jul 2008 18:13:42 -0400 [thread overview]
Message-ID: <1215468822.19512.55.camel@localhost> (raw)
In-Reply-To: <20080707211925.GB19523@fieldses.org>
On Mon, 2008-07-07 at 17:19 -0400, J. Bruce Fields wrote:
> On Mon, Jul 07, 2008 at 04:51:17PM -0400, Trond Myklebust wrote:
> > On Mon, 2008-07-07 at 15:44 -0400, Chuck Lever wrote:
> >
> > > If you would like connected UDP, I won't object to you implementing
> > > it. However, I never tested whether a connected UDP socket will give
> > > the desired semantics without extra code in the UDP transport (for
> > > example, an ->sk_error callback). I don't think it's worth the hassle
> > > if we have to add code to UDP that only this tiny use case would need.
> > >
> >
> > OK. I'll set these patches aside until I have time to look into adding
> > connected UDP support.
>
> I'm curious--why weren't you convinced by this argument?:
>
> "You're talking about the difference between supporting say 1358
> mounts at boot time versus 1357 mounts at boot time.
Where did you get those figures from? Firstly, the total number of
privileged ports is much smaller. Secondly, the number of _free_
privileged ports can vary wildly depending on the user's setup.
> "In most cases, a client with hundreds of mounts will use up
> exactly one extra privileged TCP port to register NLM during the
> first lockd_up() call. If these are all NFSv4 mounts, it will
> use exactly zero extra ports, since the NFSv4 callback service
> is not even registered.
>
> "Considering that _each_ mount operation can take between 2 and
> 5 privileged ports, while registering NFSD and NLM both would
> take exactly two ports at boot time, I think that registration
> is wrong place to optimize."
>
> I'll admit to not following this carefully, but that seemed reasonable
> to me.
Like it or not, this _is_ a user interface change: if someone has set up
their iptables firewall or is using the tcp_wrapper library to limit
access to the portmapper (a common practice), then this change is
forcing them to change that.
It is not as if UDP connections are prohibitively difficult to implement
either. The entire framework is already there for the TCP case, and so
the following patch (as yet untested) should be close...
--------------------------------------------------------------------------
commit 161c60bc13899b0def4251cffa492ae6faa00b93
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Mon Jul 7 17:43:12 2008 -0400
SUNRPC: Add connected sockets for UDP
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 4486c59..2e49f5a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -580,8 +580,8 @@ static int xs_udp_send_request(struct rpc_task *task)
req->rq_svec->iov_len);
status = xs_sendpages(transport->sock,
- xs_addr(xprt),
- xprt->addrlen, xdr,
+ NULL,
+ 0, xdr,
req->rq_bytes_sent);
dprintk("RPC: xs_udp_send_request(%u) = %d\n",
@@ -1445,13 +1445,13 @@ static inline void xs_reclassify_socket6(struct socket *sock)
}
#endif
-static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
+static int xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
{
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+ struct sock *sk = sock->sk;
+ int ret;
if (!transport->inet) {
- struct sock *sk = sock->sk;
-
write_lock_bh(&sk->sk_callback_lock);
sk->sk_user_data = xprt;
@@ -1463,8 +1463,6 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
sk->sk_no_check = UDP_CSUM_NORCV;
sk->sk_allocation = GFP_ATOMIC;
- xprt_set_connected(xprt);
-
/* Reset to new socket */
transport->sock = sock;
transport->inet = sk;
@@ -1472,6 +1470,39 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
write_unlock_bh(&sk->sk_callback_lock);
}
xs_udp_do_set_buffer_size(xprt);
+ ret = kernel_connect(sock, xs_addr(xprt), xprt->addrlen, 0);
+
+ if (ret == 0) {
+ spin_lock_bh(&xprt->transport_lock);
+ if (sk->sk_state == TCP_ESTABLISHED)
+ xprt_set_connected(xprt);
+ spin_unlock_bh(&xprt->transport_lock);
+ }
+ return ret;
+}
+
+/*
+ * We need to preserve the port number so the reply cache on the server can
+ * find our cached RPC replies when we get around to reconnecting.
+ */
+static void xs_sock_reuse_connection(struct rpc_xprt *xprt)
+{
+ int result;
+ struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+ struct sockaddr any;
+
+ dprintk("RPC: disconnecting xprt %p to reuse port\n", xprt);
+
+ /*
+ * Disconnect the transport socket by doing a connect operation
+ * with AF_UNSPEC. This should return immediately...
+ */
+ memset(&any, 0, sizeof(any));
+ any.sa_family = AF_UNSPEC;
+ result = kernel_connect(transport->sock, &any, sizeof(any), 0);
+ if (result)
+ dprintk("RPC: AF_UNSPEC connect return code %d\n",
+ result);
}
/**
@@ -1491,25 +1522,35 @@ static void xs_udp_connect_worker4(struct work_struct *work)
if (xprt->shutdown || !xprt_bound(xprt))
goto out;
- /* Start by resetting any existing state */
- xs_close(xprt);
-
- if ((err = sock_create_kern(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock)) < 0) {
- dprintk("RPC: can't create UDP transport socket (%d).\n", -err);
- goto out;
- }
- xs_reclassify_socket4(sock);
+ if (!sock) {
+ if ((err = sock_create_kern(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock)) < 0) {
+ dprintk("RPC: can't create UDP transport socket (%d).\n", -err);
+ goto out;
+ }
+ xs_reclassify_socket4(sock);
- if (xs_bind4(transport, sock)) {
- sock_release(sock);
- goto out;
- }
+ if (xs_bind4(transport, sock)) {
+ sock_release(sock);
+ goto out;
+ }
+ } else
+ xs_sock_reuse_connection(xprt);
dprintk("RPC: worker connecting xprt %p to address: %s\n",
xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
- xs_udp_finish_connecting(xprt, sock);
- status = 0;
+ status = xs_udp_finish_connecting(xprt, sock);
+ if (status < 0) {
+ switch (status) {
+ case -ECONNREFUSED:
+ case -ECONNRESET:
+ /* retry with existing socket, after a delay */
+ break;
+ default:
+ /* get rid of existing socket, and retry */
+ xs_close(xprt);
+ }
+ }
out:
xprt_wake_pending_tasks(xprt, status);
xprt_clear_connecting(xprt);
@@ -1532,54 +1573,40 @@ static void xs_udp_connect_worker6(struct work_struct *work)
if (xprt->shutdown || !xprt_bound(xprt))
goto out;
- /* Start by resetting any existing state */
- xs_close(xprt);
-
- if ((err = sock_create_kern(PF_INET6, SOCK_DGRAM, IPPROTO_UDP, &sock)) < 0) {
- dprintk("RPC: can't create UDP transport socket (%d).\n", -err);
- goto out;
- }
- xs_reclassify_socket6(sock);
+ if (!sock) {
+ if ((err = sock_create_kern(PF_INET6, SOCK_DGRAM, IPPROTO_UDP, &sock)) < 0) {
+ dprintk("RPC: can't create UDP transport socket (%d).\n", -err);
+ goto out;
+ }
+ xs_reclassify_socket6(sock);
- if (xs_bind6(transport, sock) < 0) {
- sock_release(sock);
- goto out;
- }
+ if (xs_bind6(transport, sock) < 0) {
+ sock_release(sock);
+ goto out;
+ }
+ } else
+ xs_sock_reuse_connection(xprt);
dprintk("RPC: worker connecting xprt %p to address: %s\n",
xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
- xs_udp_finish_connecting(xprt, sock);
- status = 0;
+ status = xs_udp_finish_connecting(xprt, sock);
+ if (status < 0) {
+ switch (status) {
+ case -ECONNREFUSED:
+ case -ECONNRESET:
+ /* retry with existing socket, after a delay */
+ break;
+ default:
+ /* get rid of existing socket, and retry */
+ xs_close(xprt);
+ }
+ }
out:
xprt_wake_pending_tasks(xprt, status);
xprt_clear_connecting(xprt);
}
-/*
- * We need to preserve the port number so the reply cache on the server can
- * find our cached RPC replies when we get around to reconnecting.
- */
-static void xs_tcp_reuse_connection(struct rpc_xprt *xprt)
-{
- int result;
- struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
- struct sockaddr any;
-
- dprintk("RPC: disconnecting xprt %p to reuse port\n", xprt);
-
- /*
- * Disconnect the transport socket by doing a connect operation
- * with AF_UNSPEC. This should return immediately...
- */
- memset(&any, 0, sizeof(any));
- any.sa_family = AF_UNSPEC;
- result = kernel_connect(transport->sock, &any, sizeof(any), 0);
- if (result)
- dprintk("RPC: AF_UNSPEC connect return code %d\n",
- result);
-}
-
static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
{
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
@@ -1650,7 +1677,7 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
}
} else
/* "close" the socket, preserving the local port */
- xs_tcp_reuse_connection(xprt);
+ xs_sock_reuse_connection(xprt);
dprintk("RPC: worker connecting xprt %p to address: %s\n",
xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
@@ -1710,7 +1737,7 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
}
} else
/* "close" the socket, preserving the local port */
- xs_tcp_reuse_connection(xprt);
+ xs_sock_reuse_connection(xprt);
dprintk("RPC: worker connecting xprt %p to address: %s\n",
xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
next prev parent reply other threads:[~2008-07-07 22:13 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-30 22:38 [PATCH 0/7] Remaining rpcbind patches for 2.6.27 Chuck Lever
[not found] ` <20080630223646.24534.74654.stgit-ewv44WTpT0t9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
2008-06-30 22:38 ` [PATCH 1/7] SUNRPC: Use correct XDR encoding procedure for rpcbind SET/UNSET Chuck Lever
2008-06-30 22:38 ` [PATCH 2/7] SUNRPC: Introduce a specific rpcb_create for contacting localhost Chuck Lever
2008-06-30 22:38 ` [PATCH 3/7] SUNRPC: None of rpcb_create's callers wants a privileged source port Chuck Lever
2008-06-30 22:39 ` [PATCH 4/7] SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier Chuck Lever
2008-06-30 22:39 ` [PATCH 5/7] SUNRPC: introduce new rpc_task flag that fails requests on xprt disconnect Chuck Lever
2008-06-30 22:39 ` [PATCH 6/7] SUNRPC: Quickly detect missing portmapper during RPC service registration Chuck Lever
2008-06-30 22:39 ` [PATCH 7/7] SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon Chuck Lever
2008-07-03 20:45 ` [PATCH 0/7] Remaining rpcbind patches for 2.6.27 J. Bruce Fields
2008-07-07 18:20 ` Trond Myklebust
2008-07-07 18:43 ` Chuck Lever
2008-07-07 18:51 ` Trond Myklebust
2008-07-07 19:44 ` Chuck Lever
[not found] ` <76bd70e30807071244v4db1c366uc7599d2dd806bf1b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-07 20:51 ` Trond Myklebust
2008-07-07 21:19 ` J. Bruce Fields
2008-07-07 22:13 ` Trond Myklebust [this message]
2008-07-07 22:56 ` J. Bruce Fields
2008-07-08 1:56 ` Chuck Lever
2008-07-10 17:27 ` Chuck Lever
2008-07-11 18:40 ` J. Bruce Fields
2008-07-11 19:11 ` Chuck Lever
[not found] ` <76bd70e30807111211m567e9f8cv38a975bbc9df5758-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-14 19:56 ` J. Bruce Fields
[not found] ` <76bd70e30807141430o783ef431pb61eae97b42e00b4@mail.gmail.com>
[not found] ` <76bd70e30807141430o783ef431pb61eae97b42e00b4-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-14 21:31 ` Fwd: " Chuck Lever
2008-07-14 20:03 ` [PATCH 1/5] SUNRPC: Use correct XDR encoding procedure for rpcbind SET/UNSET J. Bruce Fields
2008-07-14 20:03 ` [PATCH 2/5] SUNRPC: Introduce a specific rpcb_create for contacting localhost J. Bruce Fields
2008-07-14 20:03 ` [PATCH 3/5] SUNRPC: None of rpcb_create's callers wants a privileged source port J. Bruce Fields
2008-07-14 20:03 ` [PATCH 4/5] SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier J. Bruce Fields
2008-07-14 20:03 ` [PATCH 5/5] SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1215468822.19512.55.camel@localhost \
--to=trond.myklebust@netapp.com \
--cc=bfields@fieldses.org \
--cc=chucklever@gmail.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox