From: Ben Hutchings <bhutchings@solarflare.com>
To: David Miller <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>, <linux-net-drivers@solarflare.com>
Subject: [PATCH net 1/2] tcp: Limit number of segments generated by GSO per skb
Date: Mon, 30 Jul 2012 18:16:42 +0100
Message-ID: <1343668602.2667.6.camel@bwh-desktop.uk.solarflarecom.com>
In-Reply-To: <1343668498.2667.5.camel@bwh-desktop.uk.solarflarecom.com>

A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps).  Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once.  This
results in a single skb that expands to 861 segments.

In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors: a substantial fraction of a TX ring, or even more than
a full ring.  The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset).  This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.  However, it may be that some
hardware or firmware also fails to handle such an extreme TSO request
correctly.

Therefore, limit the number of segments per skb to 100.  This should
make no difference to behaviour unless the actual MSS is less than
about 700.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
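For reference, the segment counts quoted in the commit message can be
reproduced with a small standalone sketch.  This is not kernel code.
ASSUMED_SIZE_GOAL is an assumption on my part (65535 bytes less a
20-byte IPv4 header and a 32-byte TCP header with timestamps), and
uncapped_segs(), capped_segs() and largest_affected_mss() are
hypothetical helper names used only for illustration.

```c
/* Limit introduced by this patch. */
#define TCP_MAX_GSO_SEGS 100u

/*
 * ASSUMPTION: payload budget for one skb, taken here as 65535 bytes
 * minus a 20-byte IPv4 header and a 32-byte TCP header (20 bytes plus
 * 12 bytes of timestamp option).  The real value depends on the
 * socket and device configuration.
 */
#define ASSUMED_SIZE_GOAL (65535u - 20u - 32u)

/* Segments GSO would have to generate without any limit. */
static unsigned int uncapped_segs(unsigned int size_goal, unsigned int mss)
{
	return size_goal / mss;
}

/* Segments after applying the TCP_MAX_GSO_SEGS limit. */
static unsigned int capped_segs(unsigned int size_goal, unsigned int mss)
{
	unsigned int segs = uncapped_segs(size_goal, mss);

	return segs < TCP_MAX_GSO_SEGS ? segs : TCP_MAX_GSO_SEGS;
}

/*
 * Largest MSS for which the limit changes the result.
 * size_goal / mss > TCP_MAX_GSO_SEGS holds exactly when
 * mss <= size_goal / (TCP_MAX_GSO_SEGS + 1), using integer division.
 */
static unsigned int largest_affected_mss(unsigned int size_goal)
{
	return size_goal / (TCP_MAX_GSO_SEGS + 1u);
}
```

Under that assumed budget, uncapped_segs(ASSUMED_SIZE_GOAL, 76) gives
the 861 segments mentioned above, and largest_affected_mss() comes out
somewhat below the "about 700" figure, which is consistent with that
figure being a rough bound.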
 include/net/tcp.h     |    3 +++
 net/ipv4/tcp.c        |    4 +++-
 net/ipv4/tcp_output.c |   17 +++++++++--------
 3 files changed, 15 insertions(+), 9 deletions(-)
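
The effect of the tcp_mss_split_point() change can also be tried out
in userspace.  The following is a minimal sketch, not the kernel
function: split_point() is a hypothetical stand-in that takes plain
integers, "window" stands in for tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq,
and "is_tail" stands in for skb == tcp_write_queue_tail(sk).

```c
#include <stdbool.h>

#define TCP_MAX_GSO_SEGS 100u

/* Smaller of two unsigned values (stand-in for the kernel's min()). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Userspace model of the patched split-point calculation.
 * Returns the number of bytes of the skb that may be sent now.
 */
static unsigned int split_point(unsigned int skb_len, unsigned int window,
				unsigned int mss_now, unsigned int max_segs,
				bool is_tail)
{
	unsigned int max_len = mss_now * max_segs;
	unsigned int needed;

	/* Common case: the segment limit fits in the window. */
	if (max_len <= window && !is_tail)
		return max_len;

	needed = min_uint(skb_len, window);

	if (max_len <= needed)
		return max_len;

	/* Otherwise send a whole number of MSS-sized segments. */
	return needed - needed % mss_now;
}

/* Model of the caller: clamp the cwnd quota before computing the split. */
static unsigned int limited_split_point(unsigned int skb_len,
					unsigned int window,
					unsigned int mss_now,
					unsigned int cwnd_quota, bool is_tail)
{
	return split_point(skb_len, window, mss_now,
			   min_uint(cwnd_quota, TCP_MAX_GSO_SEGS), is_tail);
}
```

With an MSS of 76 and a cwnd quota of 861 segments, the clamped limit
is 100 * 76 = 7600 bytes, so a large skb is split at that point rather
than being handed to the driver whole.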
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e19124b..098a2d0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -70,6 +70,9 @@ extern void tcp_time_wait(struct sock *sk, int state, int timeo);
 /* The least MTU to use for probing */
 #define TCP_BASE_MSS		512
 
+/* Maximum number of segments we may require GSO to generate from an skb. */
+#define TCP_MAX_GSO_SEGS	100
+
 /* After receiving this amount of duplicate ACKs fast retransmit starts. */
 #define TCP_FASTRETRANS_THRESH 3
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e7e6eea..51d8daf 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -811,7 +811,9 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now,
 			   old_size_goal + mss_now > xmit_size_goal)) {
 			xmit_size_goal = old_size_goal;
 		} else {
-			tp->xmit_size_goal_segs = xmit_size_goal / mss_now;
+			tp->xmit_size_goal_segs =
+				min_t(u32, xmit_size_goal / mss_now,
+				      TCP_MAX_GSO_SEGS);
 			xmit_size_goal = tp->xmit_size_goal_segs * mss_now;
 		}
 	}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 33cd065..c86c288 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1522,21 +1522,21 @@ static void tcp_cwnd_validate(struct sock *sk)
  * when we would be allowed to send the split-due-to-Nagle skb fully.
  */
 static unsigned int tcp_mss_split_point(const struct sock *sk, const struct sk_buff *skb,
-					unsigned int mss_now, unsigned int cwnd)
+					unsigned int mss_now, unsigned int max_segs)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
-	u32 needed, window, cwnd_len;
+	u32 needed, window, max_len;
 
 	window = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
-	cwnd_len = mss_now * cwnd;
+	max_len = mss_now * max_segs;
 
-	if (likely(cwnd_len <= window && skb != tcp_write_queue_tail(sk)))
-		return cwnd_len;
+	if (likely(max_len <= window && skb != tcp_write_queue_tail(sk)))
+		return max_len;
 
 	needed = min(skb->len, window);
 
-	if (cwnd_len <= needed)
-		return cwnd_len;
+	if (max_len <= needed)
+		return max_len;
 
 	return needed - needed % mss_now;
 }
@@ -1999,7 +1999,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		limit = mss_now;
 		if (tso_segs > 1 && !tcp_urg_mode(tp))
 			limit = tcp_mss_split_point(sk, skb, mss_now,
-						    cwnd_quota);
+						    min(cwnd_quota,
+							TCP_MAX_GSO_SEGS));
 
 		if (skb->len > limit &&
 		    unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.