All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Sowmini Varadhan <sowmini.varadhan@oracle.com>,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.15 19/45] rds: tcp: correctly sequence cleanup on netns deletion.
Date: Fri, 23 Feb 2018 19:28:58 +0100	[thread overview]
Message-ID: <20180223170718.380458441@linuxfoundation.org> (raw)
In-Reply-To: <20180223170715.197760019@linuxfoundation.org>

4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>

commit 681648e67d43cf269c5590ecf021ed481f4551fc upstream.

Commit 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
introduces a regression in rds-tcp netns cleanup. The cleanup_net(),
(and thus rds_tcp_dev_event notification) is only called from put_net()
when all netns refcounts go to 0, but this cannot happen if the
rds_connection itself is holding a c_net ref that it expects to
release in rds_tcp_kill_sock.

Instead, the rds_tcp_kill_sock callback should make sure to
tear down state carefully, ensuring that the socket teardown
is only done after all data-structures and workqs that depend
on it are quiesced.

The original motivation for commit 8edc3affc077 ("rds: tcp: Take explicit
refcounts on struct net") was to resolve a race condition reported by
syzkaller where workqs for tx/rx/connect were triggered after the
namespace was deleted. Those worker threads should have been
cancelled/flushed before socket tear-down and indeed,
rds_conn_path_destroy() does try to sequence this by doing
     /* cancel cp_send_w */
     /* cancel cp_recv_w */
     /* flush cp_down_w */
     /* free data structures */
Here the "flush cp_down_w" will trigger rds_conn_shutdown and thus
invoke rds_tcp_conn_path_shutdown() to close the tcp socket, so that
we ought to have satisfied the requirement that "socket-close is
done after all other dependent state is quiesced". However,
rds_conn_shutdown has a bug in that it *always* triggers the reconnect
workq (and if connection is successful, we always restart tx/rx
workqs so with the right timing, we risk the race conditions reported
by syzkaller).

Netns deletion is like module teardown- no need to restart a
reconnect in this case. We can use the c_destroy_in_prog bit
to avoid restarting the reconnect.

Fixes: 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/rds/connection.c |    3 ++-
 net/rds/rds.h        |    6 +++---
 net/rds/tcp.c        |    4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -366,6 +366,8 @@ void rds_conn_shutdown(struct rds_conn_p
 	 * to the conn hash, so we never trigger a reconnect on this
 	 * conn - the reconnect is always triggered by the active peer. */
 	cancel_delayed_work_sync(&cp->cp_conn_w);
+	if (conn->c_destroy_in_prog)
+		return;
 	rcu_read_lock();
 	if (!hlist_unhashed(&conn->c_hash_node)) {
 		rcu_read_unlock();
@@ -445,7 +447,6 @@ void rds_conn_destroy(struct rds_connect
 	 */
 	rds_cong_remove_conn(conn);
 
-	put_net(conn->c_net);
 	kfree(conn->c_path);
 	kmem_cache_free(rds_conn_slab, conn);
 
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -150,7 +150,7 @@ struct rds_connection {
 
 	/* Protocol version */
 	unsigned int		c_version;
-	struct net		*c_net;
+	possible_net_t		c_net;
 
 	struct list_head	c_map_item;
 	unsigned long		c_map_queued;
@@ -165,13 +165,13 @@ struct rds_connection {
 static inline
 struct net *rds_conn_net(struct rds_connection *conn)
 {
-	return conn->c_net;
+	return read_pnet(&conn->c_net);
 }
 
 static inline
 void rds_conn_net_set(struct rds_connection *conn, struct net *net)
 {
-	conn->c_net = get_net(net);
+	write_pnet(&conn->c_net, net);
 }
 
 #define RDS_FLAG_CONG_BITMAP	0x01
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -528,7 +528,7 @@ static void rds_tcp_kill_sock(struct net
 	rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
 	spin_lock_irq(&rds_tcp_conn_lock);
 	list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
-		struct net *c_net = tc->t_cpath->cp_conn->c_net;
+		struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
 
 		if (net != c_net || !tc->t_sock)
 			continue;
@@ -587,7 +587,7 @@ static void rds_tcp_sysctl_reset(struct
 
 	spin_lock_irq(&rds_tcp_conn_lock);
 	list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
-		struct net *c_net = tc->t_cpath->cp_conn->c_net;
+		struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
 
 		if (net != c_net || !tc->t_sock)
 			continue;

  parent reply	other threads:[~2018-02-23 18:28 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-23 18:28 [PATCH 4.15 00/45] 4.15.6-stable review Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 01/45] tun: fix tun_napi_alloc_frags() frag allocator Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 02/45] ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 03/45] ptr_ring: try vmalloc() when kmalloc() fails Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 04/45] selinux: ensure the context is NUL terminated in security_context_to_sid_core() Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 05/45] selinux: skip bounded transition processing if the policy isnt loaded Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 06/45] media: pvrusb2: properly check endpoint types Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 07/45] crypto: x86/twofish-3way - Fix %rbp usage Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 08/45] staging: android: ion: Add __GFP_NOWARN for system contig heap Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 09/45] staging: android: ion: Switch from WARN to pr_warn Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 10/45] blk_rq_map_user_iov: fix error override Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 11/45] KVM: x86: fix escape of guest dr6 to the host Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 12/45] kcov: detect double association with a single task Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 13/45] netfilter: x_tables: fix int overflow in xt_alloc_table_info() Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 14/45] netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_{match|target} Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 15/45] netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check() Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 16/45] netfilter: on sockopt() acquire sock lock only in the required scope Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 17/45] netfilter: xt_cgroup: initialize info->priv in cgroup_mt_check_v1() Greg Kroah-Hartman
2018-02-23 18:28 ` [PATCH 4.15 18/45] netfilter: xt_RATEEST: acquire xt_rateest_mutex for hash insert Greg Kroah-Hartman
2018-02-23 18:28 ` Greg Kroah-Hartman [this message]
2018-02-23 18:28 ` [PATCH 4.15 20/45] rds: tcp: atomically purge entries from rds_tcp_conn_list during netns delete Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 21/45] net: avoid skb_warn_bad_offload on IS_ERR Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 22/45] net_sched: gen_estimator: fix lockdep splat Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 23/45] soc: qcom: rmtfs_mem: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 24/45] ASoC: ux500: add MODULE_LICENSE tag Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 25/45] video: fbdev/mmp: add MODULE_LICENSE Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 26/45] ARM: 8743/1: bL_switcher: add MODULE_LICENSE tag Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 27/45] arm64: dts: add #cooling-cells to CPU nodes Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 28/45] dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 29/45] ANDROID: binder: remove WARN() for redundant txn error Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 30/45] ANDROID: binder: synchronize_rcu() when using POLLFREE Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 31/45] staging: android: ashmem: Fix a race condition in pin ioctls Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 32/45] binder: check for binder_thread allocation failure in binder_poll() Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 33/45] binder: replace "%p" with "%pK" Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 34/45] staging: fsl-mc: fix build testing on x86 Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 35/45] staging: iio: adc: ad7192: fix external frequency setting Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 36/45] staging: iio: ad5933: switch buffer mode to software Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 37/45] xhci: Fix NULL pointer in xhci debugfs Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 38/45] xhci: Fix xhci debugfs devices node disappearance after hibernation Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 39/45] xhci: xhci debugfs device nodes werent removed after device plugged out Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 40/45] xhci: fix xhci debugfs errors in xhci_stop Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 41/45] usbip: keep usbip_device sockfd state in sync with tcp_socket Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 42/45] crypto: s5p-sss - Fix kernel Oops in AES-ECB mode Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 43/45] mei: me: add cannon point device ids Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 44/45] mei: me: add cannon point device ids for 4th device Greg Kroah-Hartman
2018-02-23 18:29 ` [PATCH 4.15 45/45] vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems Greg Kroah-Hartman
2018-02-23 23:57 ` [PATCH 4.15 00/45] 4.15.6-stable review kernelci.org bot
2018-02-24  0:38 ` Shuah Khan
2018-02-24  8:26   ` Greg Kroah-Hartman
2018-02-24 17:58 ` Guenter Roeck
2018-02-25  9:59   ` Greg Kroah-Hartman
2018-02-25  3:37 ` Dan Rue
2018-02-25  9:58   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180223170718.380458441@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=santosh.shilimkar@oracle.com \
    --cc=sowmini.varadhan@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.