From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, "David J. Wilder" <dwilder@us.ibm.com>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.3 38/55] net: fix IP early demux races
Date: Wed, 20 Jan 2016 16:44:13 -0800
Message-ID: <20160120232229.371044441@linuxfoundation.org>
In-Reply-To: <20160120232227.417513468@linuxfoundation.org>

4.3-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 ]

David Wilder reported crashes caused by dst reuse.

<quote David>
  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.
</quote>

Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When the socket's cached dst is not yet set, we want to store the dst
into sk->sk_rx_dst for future IP early demux lookups, by acquiring a
stable refcount on the dst.

The problem is that this acquisition simply uses atomic_inc(), which
works well unless the dst has DST_NOCACHE set and was already queued
for destruction by dst_release() after its refcount dropped to zero.

We need to make sure the current refcount is not zero before
incrementing it, or we risk the double free David reported.

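To illustrate the "increment only if not zero" rule described above, here is a minimal user-space sketch using C11 atomics. The names struct obj and obj_get_not_zero() are hypothetical and exist only for this example; the patch itself relies on the kernel's atomic_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a refcounted object.
 * This is not the kernel's struct dst_entry. */
struct obj {
	atomic_int refcnt;
};

/* Take a reference only if the count is still non-zero.
 * Returns false when the last reference is already gone, in which
 * case the object may be queued for destruction and must not be used. */
static bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value
		 * and the loop re-checks it against zero. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

A plain atomic increment would move a zero count back to one and hand out a pointer to an object whose release is already in progress, which is the kind of double release reported here.
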
This patch, being a stable candidate, adds two new helpers and uses
them only in the problematic IP early demux paths.

It might be possible to merge skb_dst_force() and skb_dst_force_safe()
in net-next, but I prefer having the smallest patch for stable kernels:
some skb_dst_force() callers may not expect that skb->dst can suddenly
be cleared.

This can probably be backported as far back as linux-3.6 kernels.

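The concern about callers not expecting a cleared dst can be sketched in user space as follows. This is only a model under stated assumptions: struct route, struct packet, route_hold_safe(), packet_route_force_safe() and packet_route_refs() are hypothetical names for this example, not kernel APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space models, not the kernel structures. */
struct route {
	atomic_int refcnt;
};

struct packet {
	struct route *route;	/* borrowed until a reference is forced */
	bool route_is_ref;	/* true once the packet owns a reference */
};

/* Take a reference unless the count has already reached zero. */
static bool route_hold_safe(struct route *r)
{
	int old = atomic_load(&r->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&r->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Turn a borrowed route pointer into an owned reference,
 * or clear the pointer when the route is already dying. */
static void packet_route_force_safe(struct packet *p)
{
	if (p->route && !p->route_is_ref) {
		if (route_hold_safe(p->route))
			p->route_is_ref = true;
		else
			p->route = NULL;
	}
}

/* A caller of the "safe" variant has to tolerate a missing route.
 * Returns the current refcount, or -1 when no route is attached. */
static int packet_route_refs(const struct packet *p)
{
	return p->route ? atomic_load(&p->route->refcnt) : -1;
}
```

Code written against an unconditional "force" helper could dereference the route without the NULL check shown in packet_route_refs(), which is why the safe variant is introduced only where it is needed.
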
Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/dst.h   |   33 +++++++++++++++++++++++++++++++++
 include/net/sock.h  |    2 +-
 net/ipv4/tcp_ipv4.c |    5 ++---
 net/ipv6/tcp_ipv6.c |    3 +--
 4 files changed, 37 insertions(+), 6 deletions(-)

--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -322,6 +322,39 @@ static inline void skb_dst_force(struct
 	}
 }
 
+/**
+ * dst_hold_safe - Take a reference on a dst if possible
+ * @dst: pointer to dst entry
+ *
+ * This helper returns false if it could not safely
+ * take a reference on a dst.
+ */
+static inline bool dst_hold_safe(struct dst_entry *dst)
+{
+	if (dst->flags & DST_NOCACHE)
+		return atomic_inc_not_zero(&dst->__refcnt);
+	dst_hold(dst);
+	return true;
+}
+
+/**
+ * skb_dst_force_safe - makes sure skb dst is refcounted
+ * @skb: buffer
+ *
+ * If dst is not yet refcounted and not destroyed, grab a ref on it.
+ */
+static inline void skb_dst_force_safe(struct sk_buff *skb)
+{
+	if (skb_dst_is_noref(skb)) {
+		struct dst_entry *dst = skb_dst(skb);
+
+		if (!dst_hold_safe(dst))
+			dst = NULL;
+
+		skb->_skb_refdst = (unsigned long)dst;
+	}
+}
+
 
 /**
  *	__skb_tunnel_rx - prepare skb for rx reinsert
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -801,7 +801,7 @@ void sk_stream_write_space(struct sock *
 static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 {
 	/* dont let skb dst not refcounted, we are going to leave rcu lock */
-	skb_dst_force(skb);
+	skb_dst_force_safe(skb);
 
 	if (!sk->sk_backlog.tail)
 		sk->sk_backlog.head = skb;
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1509,7 +1509,7 @@ bool tcp_prequeue(struct sock *sk, struc
 	if (likely(sk->sk_rx_dst))
 		skb_dst_drop(skb);
 	else
-		skb_dst_force(skb);
+		skb_dst_force_safe(skb);
 
 	__skb_queue_tail(&tp->ucopy.prequeue, skb);
 	tp->ucopy.memory += skb->truesize;
@@ -1710,8 +1710,7 @@ void inet_sk_rx_dst_set(struct sock *sk,
 {
 	struct dst_entry *dst = skb_dst(skb);
 
-	if (dst) {
-		dst_hold(dst);
+	if (dst && dst_hold_safe(dst)) {
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 	}
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -93,10 +93,9 @@ static void inet6_sk_rx_dst_set(struct s
 {
 	struct dst_entry *dst = skb_dst(skb);
 
-	if (dst) {
+	if (dst && dst_hold_safe(dst)) {
 		const struct rt6_info *rt = (const struct rt6_info *)dst;
 
-		dst_hold(dst);
 		sk->sk_rx_dst = dst;
 		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 		inet6_sk(sk)->rx_dst_cookie = rt6_get_cookie(rt);