All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Lars Persson <lars.persson@axis.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.10 15/41] tcp: tcp_release_cb() should release socket ownership
Date: Fri, 11 Apr 2014 09:09:46 -0700	[thread overview]
Message-ID: <20140411160933.912236670@linuxfoundation.org> (raw)
In-Reply-To: <20140411160932.865173041@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.dumazet@gmail.com>

[ Upstream commit c3f9b01849ef3bc69024990092b9f42e20df7797 ]

Lars Persson reported following deadlock :

-000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
-001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
-002 |ip_local_deliver_finish(skb = 0x8BD527A0)
-003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
-004 |netif_receive_skb(skb = 0x8BD527A0)
-005 |elk_poll(napi = 0x8C770500, budget = 64)
-006 |net_rx_action(?)
-007 |__do_softirq()
-008 |do_softirq()
-009 |local_bh_enable()
-010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
-011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
-012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
-013 |tcp_release_cb(sk = 0x8BE6B2A0)
-014 |release_sock(sk = 0x8BE6B2A0)
-015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
-016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
-017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
-018 |smb_send_kvec()
-019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
-020 |cifs_call_async()
-021 |cifs_async_writev(wdata = 0x87FD6580)
-022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
-023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
-024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
-025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
-026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
-027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
-028 |bdi_writeback_workfn(work = 0x87E4A9CC)
-029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
-030 |worker_thread(__worker = 0x8B045880)
-031 |kthread(_create = 0x87CADD90)
-032 |ret_from_kernel_thread(asm)

Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
it is running from softirq context.

Lars trace involved a NIC without RX checksum support but other points
are problematic as well, like the prequeue stuff.

Problem is triggered by a timer, that found socket being owned by user.

tcp_release_cb() should call tcp_write_timer_handler() or
tcp_delack_timer_handler() in the appropriate context :

BH disabled and socket lock held, but 'owned' field cleared,
as if they were running from timer handlers.

Fixes: 6f458dfb4092 ("tcp: improve latencies of timer triggered events")
Reported-by: Lars Persson <lars.persson@axis.com>
Tested-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/sock.h    |    5 +++++
 net/core/sock.c       |    5 ++++-
 net/ipv4/tcp_output.c |   11 +++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1437,6 +1437,11 @@ static inline void sk_wmem_free_skb(stru
  */
 #define sock_owned_by_user(sk)	((sk)->sk_lock.owned)
 
+static inline void sock_release_ownership(struct sock *sk)
+{
+	sk->sk_lock.owned = 0;
+}
+
 /*
  * Macro so as to not evaluate some arguments when
  * lockdep is not enabled.
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2309,10 +2309,13 @@ void release_sock(struct sock *sk)
 	if (sk->sk_backlog.tail)
 		__release_sock(sk);
 
+	/* Warning : release_cb() might need to release sk ownership,
+	 * ie call sock_release_ownership(sk) before us.
+	 */
 	if (sk->sk_prot->release_cb)
 		sk->sk_prot->release_cb(sk);
 
-	sk->sk_lock.owned = 0;
+	sock_release_ownership(sk);
 	if (waitqueue_active(&sk->sk_lock.wq))
 		wake_up(&sk->sk_lock.wq);
 	spin_unlock_bh(&sk->sk_lock.slock);
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -755,6 +755,17 @@ void tcp_release_cb(struct sock *sk)
 	if (flags & (1UL << TCP_TSQ_DEFERRED))
 		tcp_tsq_handler(sk);
 
+	/* Here begins the tricky part :
+	 * We are called from release_sock() with :
+	 * 1) BH disabled
+	 * 2) sk_lock.slock spinlock held
+	 * 3) socket owned by us (sk->sk_lock.owned == 1)
+	 *
+	 * But following code is meant to be called from BH handlers,
+	 * so we should keep BH disabled, but early release socket ownership
+	 */
+	sock_release_ownership(sk);
+
 	if (flags & (1UL << TCP_WRITE_TIMER_DEFERRED)) {
 		tcp_write_timer_handler(sk);
 		__sock_put(sk);



  parent reply	other threads:[~2014-04-11 16:09 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-11 16:09 [PATCH 3.10 00/41] 3.10.37-stable review Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 01/41] selinux: correctly label /proc inodes in use before the policy is loaded Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 02/41] powernow-k6: disable cache when changing frequency Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 03/41] powernow-k6: correctly initialize default parameters Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 04/41] powernow-k6: reorder frequencies Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 05/41] kbuild: fix make headers_install when path is too long Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 06/41] cpuidle: Check the result of cpuidle_get_driver() against NULL Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 07/41] net: fix for a race condition in the inet frag code Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 08/41] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 09/41] bridge: multicast: add sanity check for query source addresses Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 10/41] inet: frag: make sure forced eviction removes all frags Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 11/41] net: unix: non blocking recvmsg() should not return -EINTR Greg Kroah-Hartman
2014-04-11 16:21   ` Rainer Weikusat
2014-04-11 16:09 ` [PATCH 3.10 12/41] ipv6: Fix exthdrs offload registration Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 13/41] ipv6: dont set DST_NOCOUNT for remotely added routes Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 14/41] vlan: Set correct source MAC address with TX VLAN offload enabled Greg Kroah-Hartman
2014-04-11 16:09 ` Greg Kroah-Hartman [this message]
2014-04-11 16:09 ` [PATCH 3.10 16/41] net: socket: error on a negative msg_namelen Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 17/41] ipv6: Avoid unnecessary temporary addresses being generated Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 18/41] ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 19/41] vxlan: fix potential NULL dereference in arp_reduce() Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 20/41] rtnetlink: fix fdb notification flags Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 21/41] ipmr: fix mfc " Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 22/41] ip6mr: " Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 23/41] netpoll: fix the skb check in pkt_is_ns Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 24/41] tg3: Do not include vlan acceleration features in vlan_features Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 25/41] usbnet: include wait queue head in device structure Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 26/41] vlan: Set hard_header_len according to available acceleration Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 27/41] vhost: fix total length when packets are too short Greg Kroah-Hartman
2014-04-11 16:09 ` [PATCH 3.10 28/41] vhost: validate vhost_get_vq_desc return value Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 29/41] xen-netback: remove pointless clause from if statement Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 30/41] ipv6: some ipv6 statistic counters failed to disable bh Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 31/41] netlink: dont compare the nul-termination in nla_strcmp Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 32/41] isdnloop: Validate NUL-terminated strings from user Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 33/41] isdnloop: several buffer overflows Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 34/41] rds: prevent dereference of a NULL device in rds_iw_laddr_check Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 35/41] ARC: [nsimosci] Change .dts to use generic 8250 UART Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 36/41] ARC: [nsimosci] Unbork console Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 37/41] futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 38/41] m68k: Skip " Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 39/41] crypto: ghash-clmulni-intel - use C implementation for setkey() Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 40/41] cpufreq: Fix governor start/stop race condition Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.10 41/41] cpufreq: Fix timer/workqueue corruption due to double queueing Greg Kroah-Hartman
2014-04-11 21:44 ` [PATCH 3.10 00/41] 3.10.37-stable review Guenter Roeck
2014-04-11 23:45 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140411160933.912236670@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=lars.persson@axis.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.