* [PATCH net 0/2] RDS: TCP: sychronization during connection startup
@ 2016-05-01 23:10 Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan
0 siblings, 2 replies; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw)
To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem
This patch series ensures that the passive (accept) side of the
TCP connection used for RDS-TCP is correctly synchronized with
any concurrent active (connect) attempts for a given pair of peers.
Patch 1 in the series makes sure that the t_sock in struct
rds_tcp_connection is only reset after any threads in rds_tcp_xmit
have completed (otherwise a null-ptr deref may be encountered).
Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
path.
Sowmini Varadhan (2):
RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when
resetting t_sock
RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
net/rds/tcp.c | 3 +-
net/rds/tcp.h | 4 +++
net/rds/tcp_connect.c | 8 +++++++
net/rds/tcp_listen.c | 54 ++++++++++++++++++++++++++++++++----------------
4 files changed, 50 insertions(+), 19 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock 2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan @ 2016-05-01 23:10 ` Sowmini Varadhan 2016-05-02 16:20 ` Santosh Shilimkar 2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan 1 sibling, 1 reply; 9+ messages in thread From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw) To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem There is a race condition between rds_send_xmit -> rds_tcp_xmit and the code that deals with resolution of duelling syns added by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()"). Specifically, we may end up derefencing a null pointer in rds_send_xmit if we have the interleaving sequencee: rds_tcp_accept_one rds_send_xmit conn is RDS_CONN_UP, so invoke rds_tcp_xmit tc = conn->c_transport_data rds_tcp_restore_callbacks /* reset t_sock */ null ptr deref from tc->t_sock The race condition can be avoided without adding the overhead of additional locking in the xmit path: have rds_tcp_accept_one wait for rds_tcp_xmit threads to complete before resetting callbacks. The synchronization can be done in the same manner as rds_conn_shutdown(). First set the rds_conn_state to something other than RDS_CONN_UP (so that new threads cannot get into rds_tcp_xmit()), then wait for RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any threads in rds_tcp_xmit are done. Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> --- net/rds/tcp.c | 2 +- net/rds/tcp_listen.c | 40 ++++++++++++++++++++++++---------------- 2 files changed, 25 insertions(+), 17 deletions(-) diff --git a/net/rds/tcp.c b/net/rds/tcp.c index 61ed2a8..9134544 100644 --- a/net/rds/tcp.c +++ b/net/rds/tcp.c @@ -127,7 +127,7 @@ void rds_tcp_restore_callbacks(struct socket *sock, /* * This is the only path that sets tc->t_sock. Send and receive trust that - * it is set. The RDS_CONN_CONNECTED bit protects those paths from being + * it is set. The RDS_CONN_UP bit protects those paths from being * called while it isn't set. */ void rds_tcp_set_callbacks(struct socket *sock, struct rds_connection *conn) diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c index 0936a4a..0896187 100644 --- a/net/rds/tcp_listen.c +++ b/net/rds/tcp_listen.c @@ -115,24 +115,32 @@ int rds_tcp_accept_one(struct socket *sock) * rds_tcp_state_change() will do that cleanup */ rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data; - if (rs_tcp->t_sock && - ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) { - struct sock *nsk = new_sock->sk; - - nsk->sk_user_data = NULL; - nsk->sk_prot->disconnect(nsk, 0); - tcp_done(nsk); - new_sock = NULL; - ret = 0; - goto out; - } else if (rs_tcp->t_sock) { - rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp); - conn->c_outgoing = 0; - } - rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); + if (rs_tcp->t_sock) { + /* Need to resolve a duelling SYN between peers. + * We have an outstanding SYN to this peer, which may + * potentially have transitioned to the RDS_CONN_UP state, + * so we must quiesce any send threads before resetting + * c_transport_data. + */ + wait_event(conn->c_waitq, + !test_bit(RDS_IN_XMIT, &conn->c_flags)); + if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) { + struct sock *nsk = new_sock->sk; + + nsk->sk_user_data = NULL; + nsk->sk_prot->disconnect(nsk, 0); + tcp_done(nsk); + new_sock = NULL; + ret = 0; + goto out; + } else if (rs_tcp->t_sock) { + rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp); + conn->c_outgoing = 0; + } + } rds_tcp_set_callbacks(new_sock, conn); - rds_connect_complete(conn); + rds_connect_complete(conn); /* marks RDS_CONN_UP */ new_sock = NULL; ret = 0; -- 1.7.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock 2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan @ 2016-05-02 16:20 ` Santosh Shilimkar 2016-05-02 16:37 ` Sowmini Varadhan 0 siblings, 1 reply; 9+ messages in thread From: Santosh Shilimkar @ 2016-05-02 16:20 UTC (permalink / raw) To: Sowmini Varadhan, netdev, rds-devel; +Cc: davem On 5/1/2016 4:10 PM, Sowmini Varadhan wrote: > There is a race condition between rds_send_xmit -> rds_tcp_xmit > and the code that deals with resolution of duelling syns added > by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an > outgoing socket in rds_tcp_accept_one()"). > > Specifically, we may end up derefencing a null pointer in rds_send_xmit > if we have the interleaving sequencee: > rds_tcp_accept_one rds_send_xmit > > conn is RDS_CONN_UP, so > invoke rds_tcp_xmit > > tc = conn->c_transport_data > rds_tcp_restore_callbacks > /* reset t_sock */ > null ptr deref from tc->t_sock > > The race condition can be avoided without adding the overhead of > additional locking in the xmit path: have rds_tcp_accept_one wait > for rds_tcp_xmit threads to complete before resetting callbacks. > The synchronization can be done in the same manner as rds_conn_shutdown(). > First set the rds_conn_state to something other than RDS_CONN_UP > (so that new threads cannot get into rds_tcp_xmit()), then wait for > RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any > threads in rds_tcp_xmit are done. > > Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an > outgoing socket in rds_tcp_accept_one()") > > Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> > --- Mostly looks correct. A question below. > net/rds/tcp.c | 2 +- > net/rds/tcp_listen.c | 40 ++++++++++++++++++++++++---------------- > 2 files changed, 25 insertions(+), 17 deletions(-) > > diff --git a/net/rds/tcp.c b/net/rds/tcp.c > index 61ed2a8..9134544 100644 > --- a/net/rds/tcp.c > +++ b/net/rds/tcp.c > @@ -127,7 +127,7 @@ void rds_tcp_restore_callbacks(struct socket *sock, > > /* > * This is the only path that sets tc->t_sock. Send and receive trust that > - * it is set. The RDS_CONN_CONNECTED bit protects those paths from being > + * it is set. The RDS_CONN_UP bit protects those paths from being > * called while it isn't set. > */ > void rds_tcp_set_callbacks(struct socket *sock, struct rds_connection *conn) > diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c > index 0936a4a..0896187 100644 > --- a/net/rds/tcp_listen.c > +++ b/net/rds/tcp_listen.c > @@ -115,24 +115,32 @@ int rds_tcp_accept_one(struct socket *sock) > * rds_tcp_state_change() will do that cleanup > */ > rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data; > - if (rs_tcp->t_sock && > - ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) { > - struct sock *nsk = new_sock->sk; > - > - nsk->sk_user_data = NULL; > - nsk->sk_prot->disconnect(nsk, 0); > - tcp_done(nsk); > - new_sock = NULL; > - ret = 0; > - goto out; > - } else if (rs_tcp->t_sock) { > - rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp); > - conn->c_outgoing = 0; > - } > - > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); > + if (rs_tcp->t_sock) { > + /* Need to resolve a duelling SYN between peers. > + * We have an outstanding SYN to this peer, which may > + * potentially have transitioned to the RDS_CONN_UP state, > + * so we must quiesce any send threads before resetting > + * c_transport_data. > + */ > + wait_event(conn->c_waitq, > + !test_bit(RDS_IN_XMIT, &conn->c_flags)); Would it be good to check the return value of rds_conn_transition() since if CONN is already UP above will fail and then send message might again race and we will let message through even though passive hasn't finished its connection. > + if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) { > + struct sock *nsk = new_sock->sk; > + > + nsk->sk_user_data = NULL; > + nsk->sk_prot->disconnect(nsk, 0); > + tcp_done(nsk); > + new_sock = NULL; > + ret = 0; > + goto out; > + } else if (rs_tcp->t_sock) { > + rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp); > + conn->c_outgoing = 0; > + } > + } > rds_tcp_set_callbacks(new_sock, conn); > - rds_connect_complete(conn); > + rds_connect_complete(conn); /* marks RDS_CONN_UP */ > new_sock = NULL; > ret = 0; > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock 2016-05-02 16:20 ` Santosh Shilimkar @ 2016-05-02 16:37 ` Sowmini Varadhan 2016-05-02 18:05 ` Santosh Shilimkar 0 siblings, 1 reply; 9+ messages in thread From: Sowmini Varadhan @ 2016-05-02 16:37 UTC (permalink / raw) To: Santosh Shilimkar; +Cc: netdev, rds-devel, davem On (05/02/16 09:20), Santosh Shilimkar wrote: > > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); > >+ if (rs_tcp->t_sock) { > >+ /* Need to resolve a duelling SYN between peers. > >+ * We have an outstanding SYN to this peer, which may > >+ * potentially have transitioned to the RDS_CONN_UP state, > >+ * so we must quiesce any send threads before resetting > >+ * c_transport_data. > >+ */ > >+ wait_event(conn->c_waitq, > >+ !test_bit(RDS_IN_XMIT, &conn->c_flags)); > Would it be good to check the return value of rds_conn_transition() > since if CONN is already UP above will fail and then send message > might again race and we will let message through even though passive > hasn't finished its connection. no, that was the original issue that I was running into, which needed commit 241b2719 - prior to that commit, if the conn was already UP, we'd end up doing a rds_conn_drop on a good connection, and both sides would end up in a pair of infinite 3WH loops. Even if we dont do a rds_conn_drop on the UP connection, we've just (before rds_tcp_accept_one) sent out a syn-ack on the incoming syn, and now need to RST that syn-ac. The other side is going to receive the rst, and get confused about what to clean up (since there's already an UP connection going on). In short, when there is a duel, it's cleanest to have a deterministic arbitration- both sides use the numeric value of saddr and faddr to figure out which side is active, which side is passive. (Thus the basis on the BGP router-id based model for 241b2719) FWIW, much of this is actually a corner case- in practice, its not frequent to have syns crossing each other at "almost the same time". --Sowmini ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock 2016-05-02 16:37 ` Sowmini Varadhan @ 2016-05-02 18:05 ` Santosh Shilimkar 0 siblings, 0 replies; 9+ messages in thread From: Santosh Shilimkar @ 2016-05-02 18:05 UTC (permalink / raw) To: Sowmini Varadhan; +Cc: netdev, rds-devel, davem On 5/2/2016 9:37 AM, Sowmini Varadhan wrote: > On (05/02/16 09:20), Santosh Shilimkar wrote: >>> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); >>> + if (rs_tcp->t_sock) { >>> + /* Need to resolve a duelling SYN between peers. >>> + * We have an outstanding SYN to this peer, which may >>> + * potentially have transitioned to the RDS_CONN_UP state, >>> + * so we must quiesce any send threads before resetting >>> + * c_transport_data. >>> + */ >>> + wait_event(conn->c_waitq, >>> + !test_bit(RDS_IN_XMIT, &conn->c_flags)); >> Would it be good to check the return value of rds_conn_transition() >> since if CONN is already UP above will fail and then send message >> might again race and we will let message through even though passive >> hasn't finished its connection. > > no, that was the original issue that I was running into, which needed > commit 241b2719 - prior to that commit, if the conn was already UP, > we'd end up doing a rds_conn_drop on a good connection, and both sides > would end up in a pair of infinite 3WH loops. Even if we dont do > a rds_conn_drop on the UP connection, we've just (before > rds_tcp_accept_one) sent out a syn-ack on the incoming syn, and now > need to RST that syn-ac. The other side is going to receive the rst, > and get confused about what to clean up (since there's already an UP > connection going on). > > In short, when there is a duel, it's cleanest to have a deterministic > arbitration- both sides use the numeric value of saddr and faddr to > figure out which side is active, which side is passive. (Thus the > basis on the BGP router-id based model for 241b2719) > > FWIW, much of this is actually a corner case- in practice, its not > frequent to have syns crossing each other at "almost the same time". > Sounds good. Thanks for expanding it. Patch looks good to me. Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock. 2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan 2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan @ 2016-05-01 23:10 ` Sowmini Varadhan 2016-05-02 16:33 ` Santosh Shilimkar 1 sibling, 1 reply; 9+ messages in thread From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw) To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem An arbitration scheme for duelling SYNs is implemented as part of commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") which ensures that both nodes involved will arrive at the same arbitration decision. However, this needs to be synchronized with an outgoing SYN to be generated by rds_tcp_conn_connect(). This commit achieves the synchronization through the t_conn_lock mutex in struct rds_tcp_connection. The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring the t_conn_lock mutex. A SYN is sent out only if the RDS connection is not already UP (an UP would indicate that rds_tcp_accept_one() has completed 3WH, so no SYN needs to be generated). Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after acquiring the t_conn_lock mutex. The only acceptable states (to allow continuation of the arbitration logic) are UP (i.e., outgoing SYN was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent outgoing SYN before we saw incoming SYN). Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> --- net/rds/tcp.c | 1 + net/rds/tcp.h | 4 ++++ net/rds/tcp_connect.c | 8 ++++++++ net/rds/tcp_listen.c | 30 ++++++++++++++++++++---------- 4 files changed, 33 insertions(+), 10 deletions(-) diff --git a/net/rds/tcp.c b/net/rds/tcp.c index 9134544..86187da 100644 --- a/net/rds/tcp.c +++ b/net/rds/tcp.c @@ -216,6 +216,7 @@ static int rds_tcp_conn_alloc(struct rds_connection *conn, gfp_t gfp) if (!tc) return -ENOMEM; + mutex_init(&tc->t_conn_lock); tc->t_sock = NULL; tc->t_tinc = NULL; tc->t_tinc_hdr_rem = sizeof(struct rds_header); diff --git a/net/rds/tcp.h b/net/rds/tcp.h index 64f873c..41c2283 100644 --- a/net/rds/tcp.h +++ b/net/rds/tcp.h @@ -12,6 +12,10 @@ struct rds_tcp_connection { struct list_head t_tcp_node; struct rds_connection *conn; + /* t_conn_lock synchronizes the connection establishment between + * rds_tcp_accept_one and rds_tcp_conn_connect + */ + struct mutex t_conn_lock; struct socket *t_sock; void *t_orig_write_space; void *t_orig_data_ready; diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c index 5cb1687..49a3fcf 100644 --- a/net/rds/tcp_connect.c +++ b/net/rds/tcp_connect.c @@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn) struct socket *sock = NULL; struct sockaddr_in src, dest; int ret; + struct rds_tcp_connection *tc = conn->c_transport_data; + + mutex_lock(&tc->t_conn_lock); + if (rds_conn_up(conn)) { + mutex_unlock(&tc->t_conn_lock); + return 0; + } ret = sock_create_kern(rds_conn_net(conn), PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock); if (ret < 0) @@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn) } out: + mutex_unlock(&tc->t_conn_lock); if (sock) sock_release(sock); return ret; diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c index 0896187..cc8496f 100644 --- a/net/rds/tcp_listen.c +++ b/net/rds/tcp_listen.c @@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock) struct rds_connection *conn; int ret; struct inet_sock *inet; - struct rds_tcp_connection *rs_tcp; + struct rds_tcp_connection *rs_tcp = NULL; + int conn_state; + struct sock *nsk; ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family, sock->sk->sk_type, sock->sk->sk_protocol, @@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock) */ rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data; rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); + mutex_lock(&rs_tcp->t_conn_lock); + conn_state = rds_conn_state(conn); + if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP) + goto rst_nsk; if (rs_tcp->t_sock) { /* Need to resolve a duelling SYN between peers. * We have an outstanding SYN to this peer, which may @@ -126,14 +132,7 @@ int rds_tcp_accept_one(struct socket *sock) wait_event(conn->c_waitq, !test_bit(RDS_IN_XMIT, &conn->c_flags)); if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) { - struct sock *nsk = new_sock->sk; - - nsk->sk_user_data = NULL; - nsk->sk_prot->disconnect(nsk, 0); - tcp_done(nsk); - new_sock = NULL; - ret = 0; - goto out; + goto rst_nsk; } else if (rs_tcp->t_sock) { rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp); conn->c_outgoing = 0; @@ -143,8 +142,19 @@ int rds_tcp_accept_one(struct socket *sock) rds_connect_complete(conn); /* marks RDS_CONN_UP */ new_sock = NULL; ret = 0; - + goto out; +rst_nsk: + /* rest the newly returned accept sock and bail */ + nsk = new_sock->sk; + rds_tcp_stats_inc(s_tcp_listen_closed_stale); + nsk->sk_user_data = NULL; + nsk->sk_prot->disconnect(nsk, 0); + tcp_done(nsk); + new_sock = NULL; + ret = 0; out: + if (rs_tcp) + mutex_unlock(&rs_tcp->t_conn_lock); if (new_sock) sock_release(new_sock); return ret; -- 1.7.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock. 2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan @ 2016-05-02 16:33 ` Santosh Shilimkar 2016-05-02 16:43 ` Sowmini Varadhan 0 siblings, 1 reply; 9+ messages in thread From: Santosh Shilimkar @ 2016-05-02 16:33 UTC (permalink / raw) To: Sowmini Varadhan, netdev, rds-devel; +Cc: davem On 5/1/2016 4:10 PM, Sowmini Varadhan wrote: > An arbitration scheme for duelling SYNs is implemented as part of > commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an > outgoing socket in rds_tcp_accept_one()") which ensures that both nodes > involved will arrive at the same arbitration decision. However, this > needs to be synchronized with an outgoing SYN to be generated by > rds_tcp_conn_connect(). This commit achieves the synchronization > through the t_conn_lock mutex in struct rds_tcp_connection. > > The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring > the t_conn_lock mutex. A SYN is sent out only if the RDS connection is > not already UP (an UP would indicate that rds_tcp_accept_one() has > completed 3WH, so no SYN needs to be generated). > > Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after > acquiring the t_conn_lock mutex. The only acceptable states (to > allow continuation of the arbitration logic) are UP (i.e., outgoing SYN > was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent > outgoing SYN before we saw incoming SYN). > > Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> > --- > net/rds/tcp.c | 1 + > net/rds/tcp.h | 4 ++++ > net/rds/tcp_connect.c | 8 ++++++++ > net/rds/tcp_listen.c | 30 ++++++++++++++++++++---------- > 4 files changed, 33 insertions(+), 10 deletions(-) > [...] > diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c > index 5cb1687..49a3fcf 100644 > --- a/net/rds/tcp_connect.c > +++ b/net/rds/tcp_connect.c > @@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn) > struct socket *sock = NULL; > struct sockaddr_in src, dest; > int ret; > + struct rds_tcp_connection *tc = conn->c_transport_data; > + > + mutex_lock(&tc->t_conn_lock); > > + if (rds_conn_up(conn)) { > + mutex_unlock(&tc->t_conn_lock); > + return 0; > + } > ret = sock_create_kern(rds_conn_net(conn), PF_INET, > SOCK_STREAM, IPPROTO_TCP, &sock); > if (ret < 0) > @@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn) > } > > out: > + mutex_unlock(&tc->t_conn_lock); Just wondering whether the spin_lock() would better here considering entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be from softirq context. Ignore it if its not applicable. > if (sock) > sock_release(sock); > return ret; > diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c > index 0896187..cc8496f 100644 > --- a/net/rds/tcp_listen.c > +++ b/net/rds/tcp_listen.c > @@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock) > struct rds_connection *conn; > int ret; > struct inet_sock *inet; > - struct rds_tcp_connection *rs_tcp; > + struct rds_tcp_connection *rs_tcp = NULL; > + int conn_state; > + struct sock *nsk; > > ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family, > sock->sk->sk_type, sock->sk->sk_protocol, > @@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock) > */ > rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data; > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); Like patch 1/2, probably we can leverage return value of above. > + conn_state = rds_conn_state(conn); > + if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP) You probably don't need the local 'conn_state' and below should work. if (!rds_conn_connecting(conn) && !rds_conn_up(conn)) Regards, Santosh ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock. 2016-05-02 16:33 ` Santosh Shilimkar @ 2016-05-02 16:43 ` Sowmini Varadhan 2016-05-02 18:08 ` Santosh Shilimkar 0 siblings, 1 reply; 9+ messages in thread From: Sowmini Varadhan @ 2016-05-02 16:43 UTC (permalink / raw) To: Santosh Shilimkar; +Cc: netdev, rds-devel, davem On (05/02/16 09:33), Santosh Shilimkar wrote: > >+ mutex_unlock(&tc->t_conn_lock); > Just wondering whether the spin_lock() would better here considering > entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be > from softirq context. Ignore it if its not applicable. It's not from softirq context (both are workqs), but I used a mutex to follow c_cm_lock (which I considered reusing, given that it is only IB specific?) But spin_lock vs mutex may not be a big differentiator here- this is really a one-time start up (corner-case) issue in the control path. > > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); > Like patch 1/2, probably we can leverage return value of above. : > You probably don't need the local 'conn_state' and below should work. > if (!rds_conn_connecting(conn) && !rds_conn_up(conn)) see explanation for comment to 1/2. --Sowmini ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock. 2016-05-02 16:43 ` Sowmini Varadhan @ 2016-05-02 18:08 ` Santosh Shilimkar 0 siblings, 0 replies; 9+ messages in thread From: Santosh Shilimkar @ 2016-05-02 18:08 UTC (permalink / raw) To: Sowmini Varadhan; +Cc: netdev, rds-devel, davem On 5/2/2016 9:43 AM, Sowmini Varadhan wrote: > On (05/02/16 09:33), Santosh Shilimkar wrote: >>> + mutex_unlock(&tc->t_conn_lock); >> Just wondering whether the spin_lock() would better here considering >> entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be >> from softirq context. Ignore it if its not applicable. > > It's not from softirq context (both are workqs), but I used a mutex > to follow c_cm_lock (which I considered reusing, given that it > is only IB specific?) But spin_lock vs mutex may not be a big > differentiator here- this is really a one-time start up (corner-case) > issue in the control path. > That should be fine then. >>> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); >> Like patch 1/2, probably we can leverage return value of above. > : >> You probably don't need the local 'conn_state' and below should work. >> if (!rds_conn_connecting(conn) && !rds_conn_up(conn)) > > see explanation for comment to 1/2. > Yep. > +rst_nsk: > + /* rest the newly returned accept sock and bail */ s/rest/reset With typo fixed, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-05-02 18:08 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan 2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan 2016-05-02 16:20 ` Santosh Shilimkar 2016-05-02 16:37 ` Sowmini Varadhan 2016-05-02 18:05 ` Santosh Shilimkar 2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan 2016-05-02 16:33 ` Santosh Shilimkar 2016-05-02 16:43 ` Sowmini Varadhan 2016-05-02 18:08 ` Santosh Shilimkar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).