netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup
@ 2016-05-02 18:24 Sowmini Varadhan
  2016-05-02 18:24 ` [PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Sowmini Varadhan @ 2016-05-02 18:24 UTC (permalink / raw)
  To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem

This patch series ensures that the passive (accept) side of the
TCP connection used for RDS-TCP is correctly synchronized with
any concurrent active (connect) attempts for a given pair of peers.

Patch 1 in the series makes sure that the t_sock in struct 
rds_tcp_connection is only reset after any threads in rds_tcp_xmit
have completed (otherwise a null-ptr deref may be encountered). 
Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
path.

v2: review comments from Santosh Shilimkar, other spelling corrections

Sowmini Varadhan (2):
  RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when
    resetting t_sock
  RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.

 net/rds/tcp.c         |    3 +-
 net/rds/tcp.h         |    4 +++
 net/rds/tcp_connect.c |    8 +++++++
 net/rds/tcp_listen.c  |   54 ++++++++++++++++++++++++++++++++----------------
 4 files changed, 50 insertions(+), 19 deletions(-)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
  2016-05-02 18:24 [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
@ 2016-05-02 18:24 ` Sowmini Varadhan
  2016-05-02 18:24 ` [PATCH net v2 2/2] RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock Sowmini Varadhan
  2016-05-03 20:05 ` [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Sowmini Varadhan @ 2016-05-02 18:24 UTC (permalink / raw)
  To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem

There is a race condition between rds_send_xmit -> rds_tcp_xmit
and the code that deals with resolution of duelling syns added
by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()").

Specifically, we may end up derefencing a null pointer in rds_send_xmit
if we have the interleaving sequence:
           rds_tcp_accept_one                  rds_send_xmit

                                             conn is RDS_CONN_UP, so
    					 invoke rds_tcp_xmit

                                             tc = conn->c_transport_data
        rds_tcp_restore_callbacks
            /* reset t_sock */
    					 null ptr deref from tc->t_sock

The race condition can be avoided without adding the overhead of
additional locking in the xmit path: have rds_tcp_accept_one wait
for rds_tcp_xmit threads to complete before resetting callbacks.
The synchronization can be done in the same manner as rds_conn_shutdown().
First set the rds_conn_state to something other than RDS_CONN_UP
(so that new threads cannot get into rds_tcp_xmit()), then wait for
RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
threads in rds_tcp_xmit are done.

Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
---
v2: spelling edit in commit: s/sequencee/sequence
 net/rds/tcp.c        |    2 +-
 net/rds/tcp_listen.c |   40 ++++++++++++++++++++++++----------------
 2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 61ed2a8..9134544 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -127,7 +127,7 @@ void rds_tcp_restore_callbacks(struct socket *sock,
 
 /*
  * This is the only path that sets tc->t_sock.  Send and receive trust that
- * it is set.  The RDS_CONN_CONNECTED bit protects those paths from being
+ * it is set.  The RDS_CONN_UP bit protects those paths from being
  * called while it isn't set.
  */
 void rds_tcp_set_callbacks(struct socket *sock, struct rds_connection *conn)
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 0936a4a..0896187 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -115,24 +115,32 @@ int rds_tcp_accept_one(struct socket *sock)
 	 * rds_tcp_state_change() will do that cleanup
 	 */
 	rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
-	if (rs_tcp->t_sock &&
-	    ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
-		struct sock *nsk = new_sock->sk;
-
-		nsk->sk_user_data = NULL;
-		nsk->sk_prot->disconnect(nsk, 0);
-		tcp_done(nsk);
-		new_sock = NULL;
-		ret = 0;
-		goto out;
-	} else if (rs_tcp->t_sock) {
-		rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
-		conn->c_outgoing = 0;
-	}
-
 	rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
+	if (rs_tcp->t_sock) {
+		/* Need to resolve a duelling SYN between peers.
+		 * We have an outstanding SYN to this peer, which may
+		 * potentially have transitioned to the RDS_CONN_UP state,
+		 * so we must quiesce any send threads before resetting
+		 * c_transport_data.
+		 */
+		wait_event(conn->c_waitq,
+			   !test_bit(RDS_IN_XMIT, &conn->c_flags));
+		if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
+			struct sock *nsk = new_sock->sk;
+
+			nsk->sk_user_data = NULL;
+			nsk->sk_prot->disconnect(nsk, 0);
+			tcp_done(nsk);
+			new_sock = NULL;
+			ret = 0;
+			goto out;
+		} else if (rs_tcp->t_sock) {
+			rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
+			conn->c_outgoing = 0;
+		}
+	}
 	rds_tcp_set_callbacks(new_sock, conn);
-	rds_connect_complete(conn);
+	rds_connect_complete(conn); /* marks RDS_CONN_UP */
 	new_sock = NULL;
 	ret = 0;
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH net v2 2/2] RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
  2016-05-02 18:24 [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
  2016-05-02 18:24 ` [PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
@ 2016-05-02 18:24 ` Sowmini Varadhan
  2016-05-03 20:05 ` [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Sowmini Varadhan @ 2016-05-02 18:24 UTC (permalink / raw)
  To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem

An arbitration scheme for duelling SYNs is implemented as part of
commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
involved will arrive at the same arbitration decision. However, this
needs to be synchronized with an outgoing SYN to be generated by
rds_tcp_conn_connect(). This commit achieves the synchronization
through the t_conn_lock mutex in struct rds_tcp_connection.

The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
the t_conn_lock mutex.  A SYN is sent out only if the RDS connection is
not already UP (an UP would indicate that rds_tcp_accept_one() has
completed 3WH, so no SYN needs to be generated).

Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
acquiring the t_conn_lock mutex. The only acceptable states (to
allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
outgoing SYN before we saw incoming SYN).

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
---
v2: review comments from Santosh Shilimkar

 net/rds/tcp.c         |    1 +
 net/rds/tcp.h         |    4 ++++
 net/rds/tcp_connect.c |    8 ++++++++
 net/rds/tcp_listen.c  |   30 ++++++++++++++++++++----------
 4 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 9134544..86187da 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -216,6 +216,7 @@ static int rds_tcp_conn_alloc(struct rds_connection *conn, gfp_t gfp)
 	if (!tc)
 		return -ENOMEM;
 
+	mutex_init(&tc->t_conn_lock);
 	tc->t_sock = NULL;
 	tc->t_tinc = NULL;
 	tc->t_tinc_hdr_rem = sizeof(struct rds_header);
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 64f873c..41c2283 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -12,6 +12,10 @@ struct rds_tcp_connection {
 
 	struct list_head	t_tcp_node;
 	struct rds_connection   *conn;
+	/* t_conn_lock synchronizes the connection establishment between
+	 * rds_tcp_accept_one and rds_tcp_conn_connect
+	 */
+	struct mutex		t_conn_lock;
 	struct socket		*t_sock;
 	void			*t_orig_write_space;
 	void			*t_orig_data_ready;
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index 5cb1687..49a3fcf 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
 	struct socket *sock = NULL;
 	struct sockaddr_in src, dest;
 	int ret;
+	struct rds_tcp_connection *tc = conn->c_transport_data;
+
+	mutex_lock(&tc->t_conn_lock);
 
+	if (rds_conn_up(conn)) {
+		mutex_unlock(&tc->t_conn_lock);
+		return 0;
+	}
 	ret = sock_create_kern(rds_conn_net(conn), PF_INET,
 			       SOCK_STREAM, IPPROTO_TCP, &sock);
 	if (ret < 0)
@@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
 	}
 
 out:
+	mutex_unlock(&tc->t_conn_lock);
 	if (sock)
 		sock_release(sock);
 	return ret;
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 0896187..be263cd 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock)
 	struct rds_connection *conn;
 	int ret;
 	struct inet_sock *inet;
-	struct rds_tcp_connection *rs_tcp;
+	struct rds_tcp_connection *rs_tcp = NULL;
+	int conn_state;
+	struct sock *nsk;
 
 	ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family,
 			       sock->sk->sk_type, sock->sk->sk_protocol,
@@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock)
 	 */
 	rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
 	rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
+	mutex_lock(&rs_tcp->t_conn_lock);
+	conn_state = rds_conn_state(conn);
+	if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP)
+		goto rst_nsk;
 	if (rs_tcp->t_sock) {
 		/* Need to resolve a duelling SYN between peers.
 		 * We have an outstanding SYN to this peer, which may
@@ -126,14 +132,7 @@ int rds_tcp_accept_one(struct socket *sock)
 		wait_event(conn->c_waitq,
 			   !test_bit(RDS_IN_XMIT, &conn->c_flags));
 		if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
-			struct sock *nsk = new_sock->sk;
-
-			nsk->sk_user_data = NULL;
-			nsk->sk_prot->disconnect(nsk, 0);
-			tcp_done(nsk);
-			new_sock = NULL;
-			ret = 0;
-			goto out;
+			goto rst_nsk;
 		} else if (rs_tcp->t_sock) {
 			rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
 			conn->c_outgoing = 0;
@@ -143,8 +142,19 @@ int rds_tcp_accept_one(struct socket *sock)
 	rds_connect_complete(conn); /* marks RDS_CONN_UP */
 	new_sock = NULL;
 	ret = 0;
-
+	goto out;
+rst_nsk:
+	/* reset the newly returned accept sock and bail */
+	nsk = new_sock->sk;
+	rds_tcp_stats_inc(s_tcp_listen_closed_stale);
+	nsk->sk_user_data = NULL;
+	nsk->sk_prot->disconnect(nsk, 0);
+	tcp_done(nsk);
+	new_sock = NULL;
+	ret = 0;
 out:
+	if (rs_tcp)
+		mutex_unlock(&rs_tcp->t_conn_lock);
 	if (new_sock)
 		sock_release(new_sock);
 	return ret;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup
  2016-05-02 18:24 [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
  2016-05-02 18:24 ` [PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
  2016-05-02 18:24 ` [PATCH net v2 2/2] RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock Sowmini Varadhan
@ 2016-05-03 20:05 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2016-05-03 20:05 UTC (permalink / raw)
  To: sowmini.varadhan; +Cc: netdev, rds-devel, santosh.shilimkar

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Mon,  2 May 2016 11:24:50 -0700

> This patch series ensures that the passive (accept) side of the
> TCP connection used for RDS-TCP is correctly synchronized with
> any concurrent active (connect) attempts for a given pair of peers.
> 
> Patch 1 in the series makes sure that the t_sock in struct 
> rds_tcp_connection is only reset after any threads in rds_tcp_xmit
> have completed (otherwise a null-ptr deref may be encountered). 
> Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
> path.
> 
> v2: review comments from Santosh Shilimkar, other spelling corrections

Series applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-05-03 20:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-02 18:24 [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
2016-05-02 18:24 ` [PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
2016-05-02 18:24 ` [PATCH net v2 2/2] RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock Sowmini Varadhan
2016-05-03 20:05 ` [PATCH net v2 0/2] RDS: TCP: sychronization during connection startup David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).