From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, "David J. Wilder" <dwilder@us.ibm.com>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.3 38/55] net: fix IP early demux races
Date: Wed, 20 Jan 2016 16:44:13 -0800
Message-ID: <20160120232229.371044441@linuxfoundation.org>
In-Reply-To: <20160120232227.417513468@linuxfoundation.org>
4.3-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 ]
David Wilder reported crashes caused by dst reuse.
<quote David>
I am seeing a crash on a distro V4.2.3 kernel caused by a double
release of a dst_entry. In ipv4_dst_destroy() the call to
list_empty() finds a poisoned next pointer, indicating the dst_entry
has already been removed from the list and freed. The crash occurs
18 to 24 hours into a run of a network stress exerciser.
</quote>
Thanks to his detailed report and analysis, we were able to understand
the core issue.
IP early demux can associate a dst with an skb after a lookup in the
TCP/UDP sockets.
When the socket's cached dst is not properly set, we want to store the
dst into sk->sk_rx_dst for future IP early demux lookups, by acquiring
a stable refcount on the dst.
The problem is that this acquisition simply uses atomic_inc(), which
works well unless the dst was already queued for destruction because
dst_release() noticed the dst refcount went to zero while DST_NOCACHE
was set on the dst.
We need to make sure the current refcount is not zero before
incrementing it, or we risk the double free David reported.
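To illustrate the difference between a plain increment and an increment that refuses to revive a dead object, here is a minimal userspace sketch using C11 atomics. It is not the kernel implementation: try_get_ref() is a hypothetical name standing in for the role atomic_inc_not_zero() plays in this patch, and the plain atomic_int stands in for dst->__refcnt.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_inc_not_zero():
 * take a reference only if the count has not already dropped to zero.
 * Returns true if a reference was taken, false if the object is dying.
 */
static bool try_get_ref(atomic_int *refcnt)
{
	int old = atomic_load(refcnt);

	/* Retry until we either observe zero or manage to store old + 1. */
	while (old != 0) {
		if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
			return true;
		/* On failure, old now holds the current value; loop again. */
	}

	/* Count is zero: the owner has already queued the object for free. */
	return false;
}
```

A plain increment on a count that is already zero would produce a count of one on an object whose destruction is in flight, which is the double release described above. With the conditional form, the caller learns that the object is gone and must not store or use the pointer.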
This patch, being a stable candidate, adds two new helpers and uses
them only in the problematic IP early demux paths.
It might be possible to merge skb_dst_force() and skb_dst_force_safe()
in net-next, but I prefer having the smallest patch for stable
kernels: some skb_dst_force() callers may not expect that skb->dst
can suddenly be cleared.
This can probably be backported to linux-3.6 kernels.
Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/dst.h | 33 +++++++++++++++++++++++++++++++++
include/net/sock.h | 2 +-
net/ipv4/tcp_ipv4.c | 5 ++---
net/ipv6/tcp_ipv6.c | 3 +--
4 files changed, 37 insertions(+), 6 deletions(-)
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -322,6 +322,39 @@ static inline void skb_dst_force(struct
 	}
 }
 
+/**
+ * dst_hold_safe - Take a reference on a dst if possible
+ * @dst: pointer to dst entry
+ *
+ * This helper returns false if it could not safely
+ * take a reference on a dst.
+ */
+static inline bool dst_hold_safe(struct dst_entry *dst)
+{
+	if (dst->flags & DST_NOCACHE)
+		return atomic_inc_not_zero(&dst->__refcnt);
+	dst_hold(dst);
+	return true;
+}
+
+/**
+ * skb_dst_force_safe - makes sure skb dst is refcounted
+ * @skb: buffer
+ *
+ * If dst is not yet refcounted and not destroyed, grab a ref on it.
+ */
+static inline void skb_dst_force_safe(struct sk_buff *skb)
+{
+	if (skb_dst_is_noref(skb)) {
+		struct dst_entry *dst = skb_dst(skb);
+
+		if (!dst_hold_safe(dst))
+			dst = NULL;
+
+		skb->_skb_refdst = (unsigned long)dst;
+	}
+}
+
 
 /**
  * __skb_tunnel_rx - prepare skb for rx reinsert
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -801,7 +801,7 @@ void sk_stream_write_space(struct sock *
 static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 {
 	/* dont let skb dst not refcounted, we are going to leave rcu lock */
-	skb_dst_force(skb);
+	skb_dst_force_safe(skb);
 
 	if (!sk->sk_backlog.tail)
 		sk->sk_backlog.head = skb;
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1509,7 +1509,7 @@ bool tcp_prequeue(struct sock *sk, struc
 	if (likely(sk->sk_rx_dst))
 		skb_dst_drop(skb);
 	else
-		skb_dst_force(skb);
+		skb_dst_force_safe(skb);
 
 	__skb_queue_tail(&tp->ucopy.prequeue, skb);
 	tp->ucopy.memory += skb->truesize;
@@ -1710,8 +1710,7 @@ void inet_sk_rx_dst_set(struct sock *sk,
 {
 	struct dst_entry *dst = skb_dst(skb);
 
-	if (dst) {
-		dst_hold(dst);
+	if (dst && dst_hold_safe(dst)) {
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 	}
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -93,10 +93,9 @@ static void inet6_sk_rx_dst_set(struct s
 {
 	struct dst_entry *dst = skb_dst(skb);
 
-	if (dst) {
+	if (dst && dst_hold_safe(dst)) {
 		const struct rt6_info *rt = (const struct rt6_info *)dst;
 
-		dst_hold(dst);
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 		inet6_sk(sk)->rx_dst_cookie = rt6_get_cookie(rt);