* [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races Sowmini Varadhan
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
It is incorrect for the rds_connection to piggyback on the
sock_net() refcount for the netns because this gives rise to
a chicken-and-egg problem during rds_conn_destroy. Instead explicitly
take a ref on the net, and hold the netns down till the connection
tear-down is complete.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/connection.c | 1 +
net/rds/rds.h | 6 +++---
net/rds/tcp.c | 4 ++--
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/rds/connection.c b/net/rds/connection.c
index 0e04dcc..1fa75ab 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -429,6 +429,7 @@ void rds_conn_destroy(struct rds_connection *conn)
*/
rds_cong_remove_conn(conn);
+ put_net(conn->c_net);
kmem_cache_free(rds_conn_slab, conn);
spin_lock_irqsave(&rds_conn_lock, flags);
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 6f523dd..5dd2565 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -147,7 +147,7 @@ struct rds_connection {
/* Protocol version */
unsigned int c_version;
- possible_net_t c_net;
+ struct net *c_net;
struct list_head c_map_item;
unsigned long c_map_queued;
@@ -162,13 +162,13 @@ struct rds_connection {
static inline
struct net *rds_conn_net(struct rds_connection *conn)
{
- return read_pnet(&conn->c_net);
+ return conn->c_net;
}
static inline
void rds_conn_net_set(struct rds_connection *conn, struct net *net)
{
- write_pnet(&conn->c_net, net);
+ conn->c_net = get_net(net);
}
#define RDS_FLAG_CONG_BITMAP 0x01
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index a973d3b..65c8e3b 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -529,7 +529,7 @@ static void rds_tcp_kill_sock(struct net *net)
flush_work(&rtn->rds_tcp_accept_w);
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
- struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
+ struct net *c_net = tc->t_cpath->cp_conn->c_net;
if (net != c_net || !tc->t_sock)
continue;
@@ -584,7 +584,7 @@ static void rds_tcp_sysctl_reset(struct net *net)
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
- struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
+ struct net *c_net = tc->t_cpath->cp_conn->c_net;
if (net != c_net || !tc->t_sock)
continue;
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets " Sowmini Varadhan
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
Order of initialization in rds_tcp_init needs to be done so
that resources are set up and destroyed in the correct synchronization
sequence with both the data path, as well as netns create/destroy
path. Specifically,
- we must call register_pernet_subsys and get the rds_tcp_netid
before calling register_netdevice_notifier, otherwise we risk
the sequence
1. register_netdevice_notifier sets up netdev notifier callback
2. rds_tcp_dev_event -> rds_tcp_kill_sock uses netid 0, and finds
the wrong rtn, resulting in a panic with string that is of the form:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000d
IP: rds_tcp_kill_sock+0x3a/0x1d0 [rds_tcp]
:
- the rds_tcp_incoming_slab kmem_cache must be initialized before the
datapath starts up. The latter can happen any time after the
pernet_subsys registration of rds_tcp_net_ops, whose -> init
function sets up the listen socket. If the rds_tcp_incoming_slab has
not been set up at that time, a panic of the form below may be
encountered
BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
IP: kmem_cache_alloc+0x90/0x1c0
:
rds_tcp_data_recv+0x1e7/0x370 [rds_tcp]
tcp_read_sock+0x96/0x1c0
rds_tcp_recv_path+0x65/0x80 [rds_tcp]
:
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 19 +++++++++----------
1 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 65c8e3b..fbf807a 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -638,19 +638,19 @@ static int rds_tcp_init(void)
goto out;
}
- ret = register_netdevice_notifier(&rds_tcp_dev_notifier);
- if (ret) {
- pr_warn("could not register rds_tcp_dev_notifier\n");
+ ret = rds_tcp_recv_init();
+ if (ret)
goto out_slab;
- }
ret = register_pernet_subsys(&rds_tcp_net_ops);
if (ret)
- goto out_notifier;
+ goto out_recv;
- ret = rds_tcp_recv_init();
- if (ret)
+ ret = register_netdevice_notifier(&rds_tcp_dev_notifier);
+ if (ret) {
+ pr_warn("could not register rds_tcp_dev_notifier\n");
goto out_pernet;
+ }
rds_trans_register(&rds_tcp_transport);
@@ -660,9 +660,8 @@ static int rds_tcp_init(void)
out_pernet:
unregister_pernet_subsys(&rds_tcp_net_ops);
-out_notifier:
- if (unregister_netdevice_notifier(&rds_tcp_dev_notifier))
- pr_warn("could not unregister rds_tcp_dev_notifier\n");
+out_recv:
+ rds_tcp_recv_exit();
out_slab:
kmem_cache_destroy(rds_tcp_conn_slab);
out:
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets to avoid races
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 1/3] rds: tcp: Take explicit refcounts on struct net Sowmini Varadhan
2017-03-04 16:57 ` [PATCH net 2/3] rds: tcp: Reorder initialization sequence in rds_tcp_init to avoid races Sowmini Varadhan
@ 2017-03-04 16:57 ` Sowmini Varadhan
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
2017-03-07 22:10 ` David Miller
4 siblings, 0 replies; 8+ messages in thread
From: Sowmini Varadhan @ 2017-03-04 16:57 UTC (permalink / raw)
To: sowmini.varadhan, dvyukov, santosh.shilimkar, davem, netdev
Cc: syzkaller, rds-devel
Commit a93d01f5777e ("RDS: TCP: avoid bad page reference in
rds_tcp_listen_data_ready") added the function
rds_tcp_listen_sock_def_readable() to handle the case when a
partially set-up acceptor socket drops into rds_tcp_listen_data_ready().
However, if the listen socket (rtn->rds_tcp_listen_sock) is itself going
through a tear-down via rds_tcp_listen_stop(), the (*ready)() will be
null and we would hit a panic of the form
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: (null)
:
? rds_tcp_listen_data_ready+0x59/0xb0 [rds_tcp]
tcp_data_queue+0x39d/0x5b0
tcp_rcv_established+0x2e5/0x660
tcp_v4_do_rcv+0x122/0x220
tcp_v4_rcv+0x8b7/0x980
:
In the above case, it is not fatal to encounter a NULL value for
ready- we should just drop the packet and let the flush of the
acceptor thread finish gracefully.
In general, the tear-down sequence for listen() and accept() socket
that is ensured by this commit is:
rtn->rds_tcp_listen_sock = NULL; /* prevent any new accepts */
In rds_tcp_listen_stop():
serialize with, and prevent, further callbacks using lock_sock()
flush rds_wq
flush acceptor workq
sock_release(listen socket)
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 15 ++++++++++-----
net/rds/tcp.h | 2 +-
net/rds/tcp_listen.c | 9 +++++++--
3 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index fbf807a..2256900 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -484,9 +484,10 @@ static void __net_exit rds_tcp_exit_net(struct net *net)
* we do need to clean up the listen socket here.
*/
if (rtn->rds_tcp_listen_sock) {
- rds_tcp_listen_stop(rtn->rds_tcp_listen_sock);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
+
rtn->rds_tcp_listen_sock = NULL;
- flush_work(&rtn->rds_tcp_accept_w);
+ rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
}
}
@@ -523,10 +524,10 @@ static void rds_tcp_kill_sock(struct net *net)
struct rds_tcp_connection *tc, *_tc;
LIST_HEAD(tmp_list);
struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
- rds_tcp_listen_stop(rtn->rds_tcp_listen_sock);
rtn->rds_tcp_listen_sock = NULL;
- flush_work(&rtn->rds_tcp_accept_w);
+ rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
spin_lock_irq(&rds_tcp_conn_lock);
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
struct net *c_net = tc->t_cpath->cp_conn->c_net;
@@ -546,8 +547,12 @@ static void rds_tcp_kill_sock(struct net *net)
void *rds_tcp_listen_sock_def_readable(struct net *net)
{
struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
+ struct socket *lsock = rtn->rds_tcp_listen_sock;
+
+ if (!lsock)
+ return NULL;
- return rtn->rds_tcp_listen_sock->sk->sk_user_data;
+ return lsock->sk->sk_user_data;
}
static int rds_tcp_dev_event(struct notifier_block *this,
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 9a1cc89..56ea662 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -66,7 +66,7 @@ void rds_tcp_restore_callbacks(struct socket *sock,
/* tcp_listen.c */
struct socket *rds_tcp_listen_init(struct net *);
-void rds_tcp_listen_stop(struct socket *);
+void rds_tcp_listen_stop(struct socket *sock, struct work_struct *acceptor);
void rds_tcp_listen_data_ready(struct sock *sk);
int rds_tcp_accept_one(struct socket *sock);
int rds_tcp_keepalive(struct socket *sock);
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 67d0929..2c69a50 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -223,6 +223,9 @@ void rds_tcp_listen_data_ready(struct sock *sk)
* before it has been accepted and the accepter has set up their
* data_ready.. we only want to queue listen work for our listening
* socket
+ *
+ * (*ready)() may be null if we are racing with netns delete, and
+ * the listen socket is being torn down.
*/
if (sk->sk_state == TCP_LISTEN)
rds_tcp_accept_work(sk);
@@ -231,7 +234,8 @@ void rds_tcp_listen_data_ready(struct sock *sk)
out:
read_unlock_bh(&sk->sk_callback_lock);
- ready(sk);
+ if (ready)
+ ready(sk);
}
struct socket *rds_tcp_listen_init(struct net *net)
@@ -271,7 +275,7 @@ struct socket *rds_tcp_listen_init(struct net *net)
return NULL;
}
-void rds_tcp_listen_stop(struct socket *sock)
+void rds_tcp_listen_stop(struct socket *sock, struct work_struct *acceptor)
{
struct sock *sk;
@@ -292,5 +296,6 @@ void rds_tcp_listen_stop(struct socket *sock)
/* wait for accepts to stop and close the socket */
flush_workqueue(rds_wq);
+ flush_work(acceptor);
sock_release(sock);
}
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
` (2 preceding siblings ...)
2017-03-04 16:57 ` [PATCH net 3/3] rds: tcp: Sequence teardown of listen and acceptor sockets " Sowmini Varadhan
@ 2017-03-07 1:04 ` santosh.shilimkar
2017-03-07 8:28 ` Dmitry Vyukov
2017-03-07 22:10 ` David Miller
4 siblings, 1 reply; 8+ messages in thread
From: santosh.shilimkar @ 2017-03-07 1:04 UTC (permalink / raw)
To: Sowmini Varadhan, dvyukov, davem, netdev; +Cc: syzkaller, rds-devel
On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>
> While I have not been able to reproduce those exact panics, my attempts
> to do so uncovered a few other problems, which are fixed patch 2 and
> patch 3 of this patch series. In addition, as mentioned in,
> https://www.spinics.net/lists/netdev/msg422997.html
> code-inspection points that the rds_connection needs to take an explicit
> refcnt on the struct net so that it is held down until all cleanup is
> completed for netns removal, and this is fixed by patch1.
>
Hopefully Dmitry can try the series and see if it fixes the issue(s).
The fixes looks good to me.
FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
@ 2017-03-07 8:28 ` Dmitry Vyukov
2017-03-07 16:29 ` Santosh Shilimkar
0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Vyukov @ 2017-03-07 8:28 UTC (permalink / raw)
To: santosh.shilimkar@oracle.com
Cc: Sowmini Varadhan, David Miller, netdev, syzkaller, rds-devel
On Tue, Mar 7, 2017 at 2:04 AM, santosh.shilimkar@oracle.com
<santosh.shilimkar@oracle.com> wrote:
> On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
>>
>> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>>
>> While I have not been able to reproduce those exact panics, my attempts
>> to do so uncovered a few other problems, which are fixed patch 2 and
>> patch 3 of this patch series. In addition, as mentioned in,
>> https://www.spinics.net/lists/netdev/msg422997.html
>> code-inspection points that the rds_connection needs to take an explicit
>> refcnt on the struct net so that it is held down until all cleanup is
>> completed for netns removal, and this is fixed by patch1.
>>
> Hopefully Dmitry can try the series and see if it fixes the issue(s).
> The fixes looks good to me.
>
> FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
I've applied the patches for testing. I've seen the reported crashes
only few times, so it won't provide a good testing. But at least it
can detect any regressions.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-07 8:28 ` Dmitry Vyukov
@ 2017-03-07 16:29 ` Santosh Shilimkar
0 siblings, 0 replies; 8+ messages in thread
From: Santosh Shilimkar @ 2017-03-07 16:29 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Sowmini Varadhan, David Miller, netdev, syzkaller, rds-devel
On 3/7/2017 12:28 AM, Dmitry Vyukov wrote:
> On Tue, Mar 7, 2017 at 2:04 AM, santosh.shilimkar@oracle.com
> <santosh.shilimkar@oracle.com> wrote:
>> On 3/4/17 8:57 AM, Sowmini Varadhan wrote:
>>>
>>> Dmitry Vyukov reported some syszkaller panics during netns deletion.
>>>
>>> While I have not been able to reproduce those exact panics, my attempts
>>> to do so uncovered a few other problems, which are fixed patch 2 and
>>> patch 3 of this patch series. In addition, as mentioned in,
>>> https://www.spinics.net/lists/netdev/msg422997.html
>>> code-inspection points that the rds_connection needs to take an explicit
>>> refcnt on the struct net so that it is held down until all cleanup is
>>> completed for netns removal, and this is fixed by patch1.
>>>
>> Hopefully Dmitry can try the series and see if it fixes the issue(s).
>> The fixes looks good to me.
>>
>> FWIW, Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
>
>
> I've applied the patches for testing. I've seen the reported crashes
> only few times, so it won't provide a good testing. But at least it
> can detect any regressions.
>
Thanks Dmitry !!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences
2017-03-04 16:57 [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences Sowmini Varadhan
` (3 preceding siblings ...)
2017-03-07 1:04 ` [PATCH net 0/3] rds: tcp: fix various rds-tcp issues during netns create/delete sequences santosh.shilimkar
@ 2017-03-07 22:10 ` David Miller
4 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2017-03-07 22:10 UTC (permalink / raw)
To: sowmini.varadhan; +Cc: dvyukov, santosh.shilimkar, netdev, syzkaller, rds-devel
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Sat, 4 Mar 2017 08:57:32 -0800
> Dmitry Vyukov reported some syszkaller panics during netns deletion.
...
Series applied, thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread