From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Lars Persson <lars.persson@axis.com>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.13 23/65] tcp: tcp_release_cb() should release socket ownership
Date: Fri, 11 Apr 2014 09:10:55 -0700 [thread overview]
Message-ID: <20140411161000.897953190@linuxfoundation.org> (raw)
In-Reply-To: <20140411160957.714773410@linuxfoundation.org>
3.13-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <eric.dumazet@gmail.com>
[ Upstream commit c3f9b01849ef3bc69024990092b9f42e20df7797 ]
Lars Persson reported following deadlock :
-000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
-001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
-002 |ip_local_deliver_finish(skb = 0x8BD527A0)
-003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
-004 |netif_receive_skb(skb = 0x8BD527A0)
-005 |elk_poll(napi = 0x8C770500, budget = 64)
-006 |net_rx_action(?)
-007 |__do_softirq()
-008 |do_softirq()
-009 |local_bh_enable()
-010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
-011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
-012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
-013 |tcp_release_cb(sk = 0x8BE6B2A0)
-014 |release_sock(sk = 0x8BE6B2A0)
-015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
-016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
-017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
-018 |smb_send_kvec()
-019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
-020 |cifs_call_async()
-021 |cifs_async_writev(wdata = 0x87FD6580)
-022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
-023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
-024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
-025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
-026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
-027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
-028 |bdi_writeback_workfn(work = 0x87E4A9CC)
-029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
-030 |worker_thread(__worker = 0x8B045880)
-031 |kthread(_create = 0x87CADD90)
-032 |ret_from_kernel_thread(asm)
Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
it is running from softirq context.
Lars trace involved a NIC without RX checksum support but other points
are problematic as well, like the prequeue stuff.
Problem is triggered by a timer, that found socket being owned by user.
tcp_release_cb() should call tcp_write_timer_handler() or
tcp_delack_timer_handler() in the appropriate context :
BH disabled and socket lock held, but 'owned' field cleared,
as if they were running from timer handlers.
Fixes: 6f458dfb4092 ("tcp: improve latencies of timer triggered events")
Reported-by: Lars Persson <lars.persson@axis.com>
Tested-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/sock.h | 5 +++++
net/core/sock.c | 5 ++++-
net/ipv4/tcp_output.c | 11 +++++++++++
3 files changed, 20 insertions(+), 1 deletion(-)
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1474,6 +1474,11 @@ static inline void sk_wmem_free_skb(stru
*/
#define sock_owned_by_user(sk) ((sk)->sk_lock.owned)
+static inline void sock_release_ownership(struct sock *sk)
+{
+ sk->sk_lock.owned = 0;
+}
+
/*
* Macro so as to not evaluate some arguments when
* lockdep is not enabled.
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2384,10 +2384,13 @@ void release_sock(struct sock *sk)
if (sk->sk_backlog.tail)
__release_sock(sk);
+ /* Warning : release_cb() might need to release sk ownership,
+ * ie call sock_release_ownership(sk) before us.
+ */
if (sk->sk_prot->release_cb)
sk->sk_prot->release_cb(sk);
- sk->sk_lock.owned = 0;
+ sock_release_ownership(sk);
if (waitqueue_active(&sk->sk_lock.wq))
wake_up(&sk->sk_lock.wq);
spin_unlock_bh(&sk->sk_lock.slock);
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -765,6 +765,17 @@ void tcp_release_cb(struct sock *sk)
if (flags & (1UL << TCP_TSQ_DEFERRED))
tcp_tsq_handler(sk);
+ /* Here begins the tricky part :
+ * We are called from release_sock() with :
+ * 1) BH disabled
+ * 2) sk_lock.slock spinlock held
+ * 3) socket owned by us (sk->sk_lock.owned == 1)
+ *
+ * But following code is meant to be called from BH handlers,
+ * so we should keep BH disabled, but early release socket ownership
+ */
+ sock_release_ownership(sk);
+
if (flags & (1UL << TCP_WRITE_TIMER_DEFERRED)) {
tcp_write_timer_handler(sk);
__sock_put(sk);
next prev parent reply other threads:[~2014-04-11 16:49 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-11 16:10 [PATCH 3.13 00/65] 3.13.10-stable review Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 01/65] selinux: correctly label /proc inodes in use before the policy is loaded Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 02/65] net: fix for a race condition in the inet frag code Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 03/65] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 04/65] bridge: multicast: add sanity check for query source addresses Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 05/65] tipc: allow connection shutdown callback to be invoked in advance Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 06/65] tipc: fix connection refcount leak Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 07/65] tipc: drop subscriber connection id invalidation Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 08/65] tipc: fix memory leak during module removal Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 09/65] tipc: dont log disabled tasklet handler errors Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 10/65] inet: frag: make sure forced eviction removes all frags Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 11/65] net: unix: non blocking recvmsg() should not return -EINTR Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 12/65] ipv6: Fix exthdrs offload registration Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 13/65] ipv6: dont set DST_NOCOUNT for remotely added routes Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 14/65] bnx2: Fix shutdown sequence Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 15/65] pkt_sched: fq: do not hold qdisc lock while allocating memory Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 16/65] Xen-netback: Fix issue caused by using gso_type wrongly Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 17/65] vlan: Set correct source MAC address with TX VLAN offload enabled Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 18/65] skbuff: skb_segment: s/frag/nskb_frag/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 19/65] skbuff: skb_segment: s/skb_frag/frag/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 20/65] skbuff: skb_segment: s/skb/head_skb/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 21/65] skbuff: skb_segment: s/fskb/list_skb/ Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 22/65] skbuff: skb_segment: orphan frags before copying Greg Kroah-Hartman
2014-04-11 16:10 ` Greg Kroah-Hartman [this message]
2014-04-11 16:10 ` [PATCH 3.13 24/65] bridge: multicast: add sanity check for general query destination Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 25/65] bridge: multicast: enable snooping on general queries only Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 26/65] net: socket: error on a negative msg_namelen Greg Kroah-Hartman
2014-04-11 16:10 ` [PATCH 3.13 27/65] bonding: set correct vlan id for alb xmit path Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 28/65] eth: fec: Fix lost promiscuous mode after reconnecting cable Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 29/65] ipv6: Avoid unnecessary temporary addresses being generated Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 30/65] ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 31/65] net: cdc_ncm: fix control message ordering Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 32/65] vxlan: fix potential NULL dereference in arp_reduce() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 33/65] vxlan: fix nonfunctional neigh_reduce() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 34/65] tcp: syncookies: do not use getnstimeofday() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 35/65] rtnetlink: fix fdb notification flags Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 36/65] ipmr: fix mfc " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 37/65] ip6mr: " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 38/65] net: micrel : ks8851-ml: add vdd-supply support Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 39/65] netpoll: fix the skb check in pkt_is_ns Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 40/65] tipc: fix spinlock recursion bug for failed subscriptions Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 41/65] ip_tunnel: Fix dst ref-count Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 42/65] tg3: Do not include vlan acceleration features in vlan_features Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 43/65] virtio-net: correct error handling of virtqueue_kick() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 44/65] usbnet: include wait queue head in device structure Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 45/65] vlan: Set hard_header_len according to available acceleration Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 46/65] vhost: fix total length when packets are too short Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 47/65] vhost: validate vhost_get_vq_desc return value Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 48/65] tcp: fix get_timewait4_sock() delay computation on 64bit Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 49/65] xen-netback: remove pointless clause from if statement Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 50/65] ipv6: some ipv6 statistic counters failed to disable bh Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 51/65] netlink: dont compare the nul-termination in nla_strcmp Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 52/65] xen-netback: disable rogue vif in kthread context Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 53/65] Call efx_set_channels() before efx->type->dimension_resources() Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 54/65] net: vxlan: fix crash when interface is created with no group Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 55/65] isdnloop: Validate NUL-terminated strings from user Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 56/65] isdnloop: several buffer overflows Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 57/65] rds: prevent dereference of a NULL device in rds_iw_laddr_check Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 58/65] powernow-k6: disable cache when changing frequency Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 59/65] powernow-k6: correctly initialize default parameters Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 60/65] powernow-k6: reorder frequencies Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 61/65] ARC: [nsimosci] Change .dts to use generic 8250 UART Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 62/65] ARC: [nsimosci] Unbork console Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 63/65] futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 64/65] m68k: Skip " Greg Kroah-Hartman
2014-04-11 16:11 ` [PATCH 3.13 65/65] crypto: ghash-clmulni-intel - use C implementation for setkey() Greg Kroah-Hartman
2014-04-11 21:45 ` [PATCH 3.13 00/65] 3.13.10-stable review Guenter Roeck
2014-04-11 23:46 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140411161000.897953190@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=lars.persson@axis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.