public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Lars Persson <lars.persson@axis.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.13 23/65] tcp: tcp_release_cb() should release socket ownership
Date: Fri, 11 Apr 2014 09:10:55 -0700	[thread overview]
Message-ID: <20140411161000.897953190@linuxfoundation.org> (raw)
In-Reply-To: <20140411160957.714773410@linuxfoundation.org>

3.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.dumazet@gmail.com>

[ Upstream commit c3f9b01849ef3bc69024990092b9f42e20df7797 ]

Lars Persson reported following deadlock :

-000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
-001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
-002 |ip_local_deliver_finish(skb = 0x8BD527A0)
-003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
-004 |netif_receive_skb(skb = 0x8BD527A0)
-005 |elk_poll(napi = 0x8C770500, budget = 64)
-006 |net_rx_action(?)
-007 |__do_softirq()
-008 |do_softirq()
-009 |local_bh_enable()
-010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
-011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
-012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
-013 |tcp_release_cb(sk = 0x8BE6B2A0)
-014 |release_sock(sk = 0x8BE6B2A0)
-015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
-016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
-017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
-018 |smb_send_kvec()
-019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
-020 |cifs_call_async()
-021 |cifs_async_writev(wdata = 0x87FD6580)
-022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
-023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
-024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
-025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
-026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
-027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
-028 |bdi_writeback_workfn(work = 0x87E4A9CC)
-029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
-030 |worker_thread(__worker = 0x8B045880)
-031 |kthread(_create = 0x87CADD90)
-032 |ret_from_kernel_thread(asm)

Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
it is running from softirq context.

Lars trace involved a NIC without RX checksum support but other points
are problematic as well, like the prequeue stuff.

Problem is triggered by a timer, that found socket being owned by user.

tcp_release_cb() should call tcp_write_timer_handler() or
tcp_delack_timer_handler() in the appropriate context :

BH disabled and socket lock held, but 'owned' field cleared,
as if they were running from timer handlers.

Fixes: 6f458dfb4092 ("tcp: improve latencies of timer triggered events")
Reported-by: Lars Persson <lars.persson@axis.com>
Tested-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/sock.h    |    5 +++++
 net/core/sock.c       |    5 ++++-
 net/ipv4/tcp_output.c |   11 +++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1474,6 +1474,11 @@ static inline void sk_wmem_free_skb(stru
  */
 #define sock_owned_by_user(sk)	((sk)->sk_lock.owned)
 
+static inline void sock_release_ownership(struct sock *sk)
+{
+	sk->sk_lock.owned = 0;
+}
+
 /*
  * Macro so as to not evaluate some arguments when
  * lockdep is not enabled.
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2384,10 +2384,13 @@ void release_sock(struct sock *sk)
 	if (sk->sk_backlog.tail)
 		__release_sock(sk);
 
+	/* Warning : release_cb() might need to release sk ownership,
+	 * ie call sock_release_ownership(sk) before us.
+	 */
 	if (sk->sk_prot->release_cb)
 		sk->sk_prot->release_cb(sk);
 
-	sk->sk_lock.owned = 0;
+	sock_release_ownership(sk);
 	if (waitqueue_active(&sk->sk_lock.wq))
 		wake_up(&sk->sk_lock.wq);
 	spin_unlock_bh(&sk->sk_lock.slock);
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -765,6 +765,17 @@ void tcp_release_cb(struct sock *sk)
 	if (flags & (1UL << TCP_TSQ_DEFERRED))
 		tcp_tsq_handler(sk);
 
+	/* Here begins the tricky part :
+	 * We are called from release_sock() with :
+	 * 1) BH disabled
+	 * 2) sk_lock.slock spinlock held
+	 * 3) socket owned by us (sk->sk_lock.owned == 1)
+	 *
+	 * But following code is meant to be called from BH handlers,
+	 * so we should keep BH disabled, but early release socket ownership
+	 */
+	sock_release_ownership(sk);
+
 	if (flags & (1UL << TCP_WRITE_TIMER_DEFERRED)) {
 		tcp_write_timer_handler(sk);
 		__sock_put(sk);



  parent reply	other threads:[~2014-04-11 16:49 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-11 16:10 [PATCH 3.13 00/65] 3.13.10-stable review Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 01/65] selinux: correctly label /proc inodes in use before the policy is loaded Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 02/65] net: fix for a race condition in the inet frag code Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 03/65] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 04/65] bridge: multicast: add sanity check for query source addresses Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 05/65] tipc: allow connection shutdown callback to be invoked in advance Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 06/65] tipc: fix connection refcount leak Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 07/65] tipc: drop subscriber connection id invalidation Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 08/65] tipc: fix memory leak during module removal Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 09/65] tipc: dont log disabled tasklet handler errors Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 10/65] inet: frag: make sure forced eviction removes all frags Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 11/65] net: unix: non blocking recvmsg() should not return -EINTR Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 12/65] ipv6: Fix exthdrs offload registration Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 13/65] ipv6: dont set DST_NOCOUNT for remotely added routes Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 14/65] bnx2: Fix shutdown sequence Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 15/65] pkt_sched: fq: do not hold qdisc lock while allocating memory Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 16/65] Xen-netback: Fix issue caused by using gso_type wrongly Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 17/65] vlan: Set correct source MAC address with TX VLAN offload enabled Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 18/65] skbuff: skb_segment: s/frag/nskb_frag/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 19/65] skbuff: skb_segment: s/skb_frag/frag/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 20/65] skbuff: skb_segment: s/skb/head_skb/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 21/65] skbuff: skb_segment: s/fskb/list_skb/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 22/65] skbuff: skb_segment: orphan frags before copying Greg Kroah-Hartman
2014-04-11 16:10 ` Greg Kroah-Hartman [this message]
2014-04-11 16:10 ` [PATCH 3.13 24/65] bridge: multicast: add sanity check for general query destination Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 25/65] bridge: multicast: enable snooping on general queries only Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 26/65] net: socket: error on a negative msg_namelen Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 27/65] bonding: set correct vlan id for alb xmit path Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 28/65] eth: fec: Fix lost promiscuous mode after reconnecting cable Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 29/65] ipv6: Avoid unnecessary temporary addresses being generated Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 30/65] ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 31/65] net: cdc_ncm: fix control message ordering Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 32/65] vxlan: fix potential NULL dereference in arp_reduce() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 33/65] vxlan: fix nonfunctional neigh_reduce() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 34/65] tcp: syncookies: do not use getnstimeofday() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 35/65] rtnetlink: fix fdb notification flags Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 36/65] ipmr: fix mfc " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 37/65] ip6mr: " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 38/65] net: micrel : ks8851-ml: add vdd-supply support Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 39/65] netpoll: fix the skb check in pkt_is_ns Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 40/65] tipc: fix spinlock recursion bug for failed subscriptions Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 41/65] ip_tunnel: Fix dst ref-count Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 42/65] tg3: Do not include vlan acceleration features in vlan_features Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 43/65] virtio-net: correct error handling of virtqueue_kick() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 44/65] usbnet: include wait queue head in device structure Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 45/65] vlan: Set hard_header_len according to available acceleration Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 46/65] vhost: fix total length when packets are too short Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 47/65] vhost: validate vhost_get_vq_desc return value Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 48/65] tcp: fix get_timewait4_sock() delay computation on 64bit Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 49/65] xen-netback: remove pointless clause from if statement Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 50/65] ipv6: some ipv6 statistic counters failed to disable bh Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 51/65] netlink: dont compare the nul-termination in nla_strcmp Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 52/65] xen-netback: disable rogue vif in kthread context Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 53/65] Call efx_set_channels() before efx->type->dimension_resources() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 54/65] net: vxlan: fix crash when interface is created with no group Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 55/65] isdnloop: Validate NUL-terminated strings from user Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 56/65] isdnloop: several buffer overflows Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 57/65] rds: prevent dereference of a NULL device in rds_iw_laddr_check Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 58/65] powernow-k6: disable cache when changing frequency Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 59/65] powernow-k6: correctly initialize default parameters Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 60/65] powernow-k6: reorder frequencies Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 61/65] ARC: [nsimosci] Change .dts to use generic 8250 UART Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 62/65] ARC: [nsimosci] Unbork console Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 63/65] futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 64/65] m68k: Skip " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 65/65] crypto: ghash-clmulni-intel - use C implementation for setkey() Greg Kroah-Hartman
2014-04-11 21:45 ` [PATCH 3.13 00/65] 3.13.10-stable review Guenter Roeck
2014-04-11 23:46 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140411161000.897953190@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=lars.persson@axis.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox