From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, "David J. Wilder" <dwilder@us.ibm.com>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.1 26/43] net: fix IP early demux races
Date: Wed, 20 Jan 2016 15:10:33 -0800
Message-ID: <20160120215930.722423228@linuxfoundation.org>
In-Reply-To: <20160120215926.787430744@linuxfoundation.org>
4.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 ]
David Wilder reported crashes caused by dst reuse.
<quote David>
I am seeing a crash on a distro V4.2.3 kernel caused by a double
release of a dst_entry. In ipv4_dst_destroy() the call to
list_empty() finds a poisoned next pointer, indicating the dst_entry
has already been removed from the list and freed. The crash occurs
18 to 24 hours into a run of a network stress exerciser.
</quote>
Thanks to his detailed report and analysis, we were able to understand
the core issue.
IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.
When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.
The problem is that this acquisition simply uses atomic_inc(),
which works well unless the dst was already queued for destruction:
dst_release() does that when it notices the dst refcount went to zero
and DST_NOCACHE was set on the dst.
We need to make sure the current refcount is not zero before
incrementing it, or we risk a double free as David reported.
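As a rough illustration of the difference between an unconditional
increment and one that refuses to resurrect a zero refcount, here is a
minimal userspace sketch in C11. The names struct obj, obj_hold() and
obj_hold_safe() are hypothetical stand-ins, not kernel APIs; the kernel
side relies on atomic_inc_not_zero() on dst->__refcnt, as the diff below
shows.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted object.
 * This is NOT the kernel's struct dst_entry.
 */
struct obj {
	atomic_int refcnt;
};

/* Unconditional increment: unsafe if refcnt may already be zero,
 * because the object may be queued for destruction at that point.
 */
static inline void obj_hold(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Take a reference only while the count is still nonzero.
 * Returns false if the object was already on its way out.
 */
static inline bool obj_hold_safe(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

With a count of zero, obj_hold() would bump it back to one and a later
release would free the object a second time, whereas obj_hold_safe()
leaves the count untouched and reports failure to the caller.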
This patch, being a stable candidate, adds two new helpers and uses
them only from the problematic IP early demux paths.
It might be possible to merge skb_dst_force() and skb_dst_force_safe()
in net-next, but I prefer having the smallest patch for stable
kernels: some skb_dst_force() callers may not expect that skb->dst
can suddenly be cleared.
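To make the "dst can suddenly be cleared" concern concrete, here is a
small userspace sketch, under stated assumptions, of what a caller has
to cope with. struct route, struct pkt, the PKT_DST_* macros and
pkt_queue() are all hypothetical names; the low-bit "noref" tagging only
mirrors the idea behind skb->_skb_refdst, and the DST_NOCACHE test from
the real helper is left out for brevity.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: low pointer bit set means "borrowed, not refcounted". */
#define PKT_DST_NOREF   ((uintptr_t)1)
#define PKT_DST_PTRMASK (~PKT_DST_NOREF)

struct route { atomic_int refcnt; };	/* stand-in for a dst entry */
struct pkt   { uintptr_t refdst; };	/* stand-in for an skb */

static inline struct route *pkt_route(const struct pkt *p)
{
	return (struct route *)(p->refdst & PKT_DST_PTRMASK);
}

/* Increment only if the count is still nonzero. */
static inline bool route_hold_safe(struct route *r)
{
	int old = atomic_load(&r->refcnt);

	while (old != 0)
		if (atomic_compare_exchange_weak(&r->refcnt, &old, old + 1))
			return true;
	return false;
}

/* Convert a borrowed route into an owned one, or clear it on failure. */
static inline void pkt_route_force_safe(struct pkt *p)
{
	if (p->refdst & PKT_DST_NOREF) {
		struct route *r = pkt_route(p);

		if (!route_hold_safe(r))
			r = NULL;
		p->refdst = (uintptr_t)r;
	}
}

/* A caller must now tolerate a packet whose route has vanished. */
static inline bool pkt_queue(struct pkt *p)
{
	pkt_route_force_safe(p);
	return pkt_route(p) != NULL;
}
```

A caller written against the unconditional variant could dereference the
route right after forcing it; with the safe variant that pointer may be
NULL, which is why converting every caller at once is riskier than
touching only the early demux paths.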
This can probably be backported as far back as linux-3.6 kernels.
Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/dst.h | 33 +++++++++++++++++++++++++++++++++
include/net/sock.h | 2 +-
net/ipv4/tcp_ipv4.c | 5 ++---
net/ipv6/tcp_ipv6.c | 3 +--
4 files changed, 37 insertions(+), 6 deletions(-)
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -312,6 +312,39 @@ static inline void skb_dst_force(struct
 	}
 }
+/**
+ * dst_hold_safe - Take a reference on a dst if possible
+ * @dst: pointer to dst entry
+ *
+ * This helper returns false if it could not safely
+ * take a reference on a dst.
+ */
+static inline bool dst_hold_safe(struct dst_entry *dst)
+{
+	if (dst->flags & DST_NOCACHE)
+		return atomic_inc_not_zero(&dst->__refcnt);
+	dst_hold(dst);
+	return true;
+}
+
+/**
+ * skb_dst_force_safe - makes sure skb dst is refcounted
+ * @skb: buffer
+ *
+ * If dst is not yet refcounted and not destroyed, grab a ref on it.
+ */
+static inline void skb_dst_force_safe(struct sk_buff *skb)
+{
+	if (skb_dst_is_noref(skb)) {
+		struct dst_entry *dst = skb_dst(skb);
+
+		if (!dst_hold_safe(dst))
+			dst = NULL;
+
+		skb->_skb_refdst = (unsigned long)dst;
+	}
+}
+
/**
* __skb_tunnel_rx - prepare skb for rx reinsert
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -799,7 +799,7 @@ void sk_stream_write_space(struct sock *
 static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 {
 	/* dont let skb dst not refcounted, we are going to leave rcu lock */
-	skb_dst_force(skb);
+	skb_dst_force_safe(skb);
 	if (!sk->sk_backlog.tail)
 		sk->sk_backlog.head = skb;
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1509,7 +1509,7 @@ bool tcp_prequeue(struct sock *sk, struc
 	if (likely(sk->sk_rx_dst))
 		skb_dst_drop(skb);
 	else
-		skb_dst_force(skb);
+		skb_dst_force_safe(skb);
 	__skb_queue_tail(&tp->ucopy.prequeue, skb);
 	tp->ucopy.memory += skb->truesize;
@@ -1714,8 +1714,7 @@ void inet_sk_rx_dst_set(struct sock *sk,
 {
 	struct dst_entry *dst = skb_dst(skb);
-	if (dst) {
-		dst_hold(dst);
+	if (dst && dst_hold_safe(dst)) {
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 	}
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -93,10 +93,9 @@ static void inet6_sk_rx_dst_set(struct s
 {
 	struct dst_entry *dst = skb_dst(skb);
-	if (dst) {
+	if (dst && dst_hold_safe(dst)) {
 		const struct rt6_info *rt = (const struct rt6_info *)dst;
-		dst_hold(dst);
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 		if (rt->rt6i_node)