From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
syzbot <syzkaller@googlegroups.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.13 12/35] tcp: gso: avoid refcount_t warning from tcp_gso_segment()
Date: Wed, 22 Nov 2017 11:12:06 +0100 [thread overview]
Message-ID: <20171122101138.787191328@linuxfoundation.org> (raw)
In-Reply-To: <20171122101137.661212603@linuxfoundation.org>
4.13-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 7ec318feeed10a64c0359ec4d10889cb4defa39a ]
When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
and N2, we want to transfer socket ownership to the new fresh skbs.
In order to avoid expensive atomic operations on a cache line subject to
cache bouncing, we replace the sequence :
refcount_add(N1, &sk->sk_wmem_alloc);
refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments
refcount_sub(O, &sk->sk_wmem_alloc);
by a single
refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);
Problem is :
In some pathological cases, sum(N) - O might be a negative number, and
syzkaller bot was apparently able to trigger this trace [1]
atomic_t was ok with this construct, but we need to take care of the
negative delta with refcount_t
[1]
refcount_t: saturated; leaking memory.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1c4/0x1e0 kernel/panic.c:546
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
refcount_add+0x1b/0x60 lib/refcount.c:101
tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
__skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
skb_gso_segment include/linux/netdevice.h:3971 [inline]
validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
__dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
neigh_hh_output include/net/neighbour.h:471 [inline]
neigh_output include/net/neighbour.h:479 [inline]
ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
NF_HOOK_COND include/linux/netfilter.h:238 [inline]
ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
dst_output include/net/dst.h:459 [inline]
ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
__tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
tcp_push_pending_frames include/net/tcp.h:1722 [inline]
tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
sk_backlog_rcv include/net/sock.h:909 [inline]
__release_sock+0x124/0x360 net/core/sock.c:2264
release_sock+0xa4/0x2a0 net/core/sock.c:2776
tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
sock_sendmsg_nosec net/socket.c:632 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:642
___sys_sendmsg+0x31c/0x890 net/socket.c:2048
__sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138
Fixes: 14afee4b6092 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/tcp_offload.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -149,11 +149,19 @@ struct sk_buff *tcp_gso_segment(struct s
* is freed by GSO engine
*/
if (copy_destructor) {
+ int delta;
+
swap(gso_skb->sk, skb->sk);
swap(gso_skb->destructor, skb->destructor);
sum_truesize += skb->truesize;
- refcount_add(sum_truesize - gso_skb->truesize,
- &skb->sk->sk_wmem_alloc);
+ delta = sum_truesize - gso_skb->truesize;
+ /* In some pathological cases, delta can be negative.
+ * We need to either use refcount_add() or refcount_sub_and_test()
+ */
+ if (likely(delta >= 0))
+ refcount_add(delta, &skb->sk->sk_wmem_alloc);
+ else
+ WARN_ON_ONCE(refcount_sub_and_test(-delta, &skb->sk->sk_wmem_alloc));
}
delta = htonl(oldlen + (skb_tail_pointer(skb) -
next prev parent reply other threads:[~2017-11-22 10:15 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-22 10:11 [PATCH 4.13 00/35] 4.13.16-stable review Greg Kroah-Hartman
2017-11-22 10:11 ` [PATCH 4.13 01/35] tcp_nv: fix division by zero in tcpnv_acked() Greg Kroah-Hartman
2017-11-22 10:11 ` [PATCH 4.13 02/35] net: vrf: correct FRA_L3MDEV encode type Greg Kroah-Hartman
2017-11-22 10:11 ` [PATCH 4.13 03/35] tcp: do not mangle skb->cb[] in tcp_make_synack() Greg Kroah-Hartman
2017-11-22 10:11 ` [PATCH 4.13 04/35] net: systemport: Correct IPG length settings Greg Kroah-Hartman
2017-11-22 10:11 ` [PATCH 4.13 05/35] netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 06/35] l2tp: dont use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 07/35] bonding: discard lowest hash bit for 802.3ad layer3+4 Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 11/35] net: usb: asix: fill null-ptr-deref in asix_suspend Greg Kroah-Hartman
2017-11-22 10:12 ` Greg Kroah-Hartman [this message]
2017-11-22 10:12 ` [PATCH 4.13 13/35] tcp: fix tcp_fastretrans_alert warning Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 14/35] vlan: fix a use-after-free in vlan_device_event() Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 15/35] net/mlx5: Cancel health poll before sending panic teardown command Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 16/35] net/mlx5e: Set page to null in case dma mapping fails Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 17/35] af_netlink: ensure that NLMSG_DONE never fails in dumps Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 18/35] vxlan: fix the issue that neigh proxy blocks all icmpv6 packets Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 20/35] sctp: do not peel off an assoc from one netns to another one Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 21/35] fealnx: Fix building error on MIPS Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 22/35] net/sctp: Always set scope_id in sctp_inet6_skb_msgname Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 23/35] ima: do not update security.ima if appraisal status is not INTEGRITY_PASS Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 24/35] serial: omap: Fix EFR write on RTS deassertion Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 25/35] serial: 8250_fintek: Fix finding base_port with activated SuperIO Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 26/35] tpm-dev-common: Reject too short writes Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 27/35] rcu: Fix up pending cbs check in rcu_prepare_for_idle Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 28/35] mm/pagewalk.c: report holes in hugetlb ranges Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 29/35] ocfs2: fix cluster hang after a node dies Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 30/35] ocfs2: should wait dio before inode lock in ocfs2_setattr() Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 31/35] ipmi: fix unsigned long underflow Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 32/35] mm/page_alloc.c: broken deferred calculation Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 33/35] mm/page_ext.c: check if page_ext is not prepared Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 34/35] x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask Greg Kroah-Hartman
2017-11-22 10:12 ` [PATCH 4.13 35/35] coda: fix kernel memory exposure attempt in fsync Greg Kroah-Hartman
2017-11-22 16:49 ` [PATCH 4.13 00/35] 4.13.16-stable review Greg Kroah-Hartman
2017-11-22 21:33 ` Guenter Roeck
2017-11-23 14:48 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171122101138.787191328@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=syzkaller@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.