From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: netdev <netdev@vger.kernel.org>,
"Perry Lorier" <perryl@google.com>,
"Matt Mathis" <mattmathis@google.com>,
"Yuchung Cheng" <ycheng@google.com>,
"Neal Cardwell" <ncardwell@google.com>,
"Tom Herbert" <therbert@google.com>,
"Wilmer van der Gaast" <wilmer@google.com>,
"Dave Täht" <dave.taht@bufferbloat.net>,
"Ankur Jain" <jankur@google.com>
Subject: [PATCH net-next] tcp: be more strict before accepting ECN negociation
Date: Fri, 04 May 2012 17:14:02 +0200 [thread overview]
Message-ID: <1336144442.3752.348.camel@edumazet-glaptop> (raw)
From: Eric Dumazet <edumazet@google.com>
It appears some networks play bad games with the two bits reserved for
ECN. This can trigger false congestion notifications and very slow
transferts.
Since RFC 3168 (6.1.1) forbids SYN packets to carry CT bits, we can
disable TCP ECN negociation if it happens we receive mangled CT bits in
the SYN packet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Perry Lorier <perryl@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Wilmer van der Gaast <wilmer@google.com>
Cc: Ankur Jain <jankur@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Dave Täht <dave.taht@bufferbloat.net>
---
include/net/tcp.h | 23 ++++++++++++++++-------
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
3 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index c826ed7..92faa6a 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -367,13 +367,6 @@ static inline void tcp_dec_quickack_mode(struct sock *sk,
#define TCP_ECN_DEMAND_CWR 4
#define TCP_ECN_SEEN 8
-static __inline__ void
-TCP_ECN_create_request(struct request_sock *req, struct tcphdr *th)
-{
- if (sysctl_tcp_ecn && th->ece && th->cwr)
- inet_rsk(req)->ecn_ok = 1;
-}
-
enum tcp_tw_status {
TCP_TW_SUCCESS = 0,
TCP_TW_RST = 1,
@@ -671,6 +664,22 @@ struct tcp_skb_cb {
#define TCP_SKB_CB(__skb) ((struct tcp_skb_cb *)&((__skb)->cb[0]))
+/* RFC3168 : 6.1.1 SYN packets must not have ECT/ECN bits set
+ *
+ * If we receive a SYN packet with these bits set, it means a network is
+ * playing bad games with TOS bits. In order to avoid possible false congestion
+ * notifications, we disable TCP ECN negociation.
+ */
+static inline void
+TCP_ECN_create_request(struct request_sock *req, const struct sk_buff *skb)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+
+ if (sysctl_tcp_ecn && th->ece && th->cwr &&
+ INET_ECN_is_not_ect(TCP_SKB_CB(skb)->ip_dsfield))
+ inet_rsk(req)->ecn_ok = 1;
+}
+
/* Due to TSO, an SKB can be composed of multiple actual
* packets. To keep these tracked properly, we use this.
*/
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index cf97e98..4ff5e1f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1368,7 +1368,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_free;
if (!want_cookie || tmp_opt.tstamp_ok)
- TCP_ECN_create_request(req, tcp_hdr(skb));
+ TCP_ECN_create_request(req, skb);
if (want_cookie) {
isn = cookie_v4_init_sequence(sk, skb, &req->mss);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 57b2109..078d039 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1140,7 +1140,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
treq->rmt_addr = ipv6_hdr(skb)->saddr;
treq->loc_addr = ipv6_hdr(skb)->daddr;
if (!want_cookie || tmp_opt.tstamp_ok)
- TCP_ECN_create_request(req, tcp_hdr(skb));
+ TCP_ECN_create_request(req, skb);
treq->iif = sk->sk_bound_dev_if;
next reply other threads:[~2012-05-04 15:14 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-04 15:14 Eric Dumazet [this message]
2012-05-04 15:54 ` [PATCH net-next] tcp: be more strict before accepting ECN negociation Neal Cardwell
2012-05-04 16:06 ` David Miller
2012-05-04 18:09 ` Rick Jones
2012-05-04 18:23 ` Eric Dumazet
2012-05-04 18:48 ` Rick Jones
2012-05-04 19:05 ` Eric Dumazet
2012-05-04 20:20 ` Rick Jones
2012-05-04 20:36 ` Rick Jones
2012-05-04 20:49 ` Eric Dumazet
2012-05-04 21:01 ` Rick Jones
2012-05-04 21:14 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1336144442.3752.348.camel@edumazet-glaptop \
--to=eric.dumazet@gmail.com \
--cc=dave.taht@bufferbloat.net \
--cc=davem@davemloft.net \
--cc=jankur@google.com \
--cc=mattmathis@google.com \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=perryl@google.com \
--cc=therbert@google.com \
--cc=wilmer@google.com \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox