From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Juha-Matti Tilli <juha-matti.tilli@iki.fi>,
	Yuchung Cheng <ycheng@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.9 17/33] tcp: free batches of packets in tcp_prune_ofo_queue()
Date: Fri, 27 Jul 2018 12:08:58 +0200
Message-ID: <20180727100828.323222556@linuxfoundation.org>
In-Reply-To: <20180727100827.665729981@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 72cd43ba64fc172a443410ce01645895850844c8 ]

Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet. The out_of_order_queue rb-tree can contain
thousands of nodes, and iterating over all of them is not nice.

Before linux-4.9, we would have pruned all packets in ofo_queue
in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skb
truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB.

Since we plan to increase tcp_rmem[2] in the future to cope with
modern BDP, we cannot revert to the old behavior without great pain.

The strategy taken in this patch is to purge ~12.5 % of the queue capacity.

Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
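Note for reviewers: the batching described in the changelog can be sketched
as a small userspace simulation. This is only an illustration under stated
assumptions, not the kernel code: struct ofo_sim and prune_batched() are
hypothetical names, the queue is modeled as a plain array of truesize
values, and the tcp_under_memory_pressure() check is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the out-of-order queue (not a kernel type). */
struct ofo_sim {
	int *truesize;   /* memory cost of each queued packet */
	size_t len;      /* number of packets still queued */
	long rmem_alloc; /* receive memory currently charged to the socket */
};

/*
 * Drop packets from the tail of the queue. The receive-buffer limit is
 * only re-checked once a batch worth rcvbuf/8 bytes has been freed, or
 * when the queue is empty. Returns the number of packets dropped and
 * stores the number of limit checks in *checks.
 */
static size_t prune_batched(struct ofo_sim *q, int rcvbuf, size_t *checks)
{
	int goal = rcvbuf >> 3; /* 12.5 % of the receive buffer */
	size_t dropped = 0;

	assert(q && checks);
	*checks = 0;
	while (q->len > 0) {
		int ts = q->truesize[--q->len]; /* newest packet first */

		q->rmem_alloc -= ts;
		goal -= ts;
		dropped++;
		if (q->len == 0 || goal <= 0) {
			(*checks)++;
			if (q->rmem_alloc <= rcvbuf)
				break;
			goal = rcvbuf >> 3; /* start the next batch */
		}
	}
	return dropped;
}
```

With an assumed 900-byte truesize per packet, 7000 queued packets and a
6 MB rcvbuf, one call drops 874 packets and checks the limit once. A
check after every packet would have stopped after only a handful of
drops, leaving the queue just under the limit for the next tiny packet.
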
 include/linux/skbuff.h |    2 ++
 net/ipv4/tcp_input.c   |   15 +++++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2982,6 +2982,8 @@ static inline int __skb_grow_rcsum(struc
 	return __skb_grow(skb, len);
 }
 
+#define rb_to_skb(rb) rb_entry_safe(rb, struct sk_buff, rbnode)
+
 #define skb_queue_walk(queue, skb) \
 		for (skb = (queue)->next; \
 		     skb != (struct sk_buff *)(queue); \
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4965,6 +4965,7 @@ new_range:
  * 2) not add too big latencies if thousands of packets sit there.
  *    (But if application shrinks SO_RCVBUF, we could still end up
  *     freeing whole queue here)
+ * 3) Drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks.
  *
  * Return true if queue has shrunk.
  */
@@ -4972,20 +4973,26 @@ static bool tcp_prune_ofo_queue(struct s
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct rb_node *node, *prev;
+	int goal;
 
 	if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
 		return false;
 
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
+	goal = sk->sk_rcvbuf >> 3;
 	node = &tp->ooo_last_skb->rbnode;
 	do {
 		prev = rb_prev(node);
 		rb_erase(node, &tp->out_of_order_queue);
+		goal -= rb_to_skb(node)->truesize;
 		tcp_drop(sk, rb_entry(node, struct sk_buff, rbnode));
-		sk_mem_reclaim(sk);
-		if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
-		    !tcp_under_memory_pressure(sk))
-			break;
+		if (!prev || goal <= 0) {
+			sk_mem_reclaim(sk);
+			if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
+			    !tcp_under_memory_pressure(sk))
+				break;
+			goal = sk->sk_rcvbuf >> 3;
+		}
 		node = prev;
 	} while (node);
 	tp->ooo_last_skb = rb_entry(prev, struct sk_buff, rbnode);
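
The rb_to_skb() helper added to skbuff.h wraps rb_entry_safe(), which maps
an embedded rb_node pointer back to its containing structure and passes a
NULL pointer through unchanged. A rough userspace sketch of that idea
follows. The types and the node_to_pkt() function are hypothetical
stand-ins, not kernel definitions, and the member is deliberately placed
at a non-zero offset to make the pointer adjustment visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rb_node. */
struct node {
	struct node *left, *right;
};

/* Hypothetical stand-in for struct sk_buff. */
struct pkt {
	int truesize;
	struct node rbnode; /* embedded tree linkage */
};

/*
 * NULL-safe mapping from the embedded node to its container, in the
 * spirit of rb_entry_safe(): subtract the member offset, unless the
 * pointer is NULL, in which case NULL is returned.
 */
static struct pkt *node_to_pkt(struct node *n)
{
	if (!n)
		return NULL;
	return (struct pkt *)((char *)n - offsetof(struct pkt, rbnode));
}
```

Given a struct pkt p, node_to_pkt(&p.rbnode) yields &p, so a field such as
truesize can be read directly from a tree node, while node_to_pkt(NULL)
stays NULL instead of producing a bogus pointer.
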