* [PATCH -next v2 0/2] net: allow setting ecn via routing table
@ 2014-10-31 12:13 Florian Westphal
2014-10-31 12:13 ` [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp Florian Westphal
2014-10-31 12:13 ` [PATCH -next v2 2/2] net: allow setting ecn via routing table Florian Westphal
0 siblings, 2 replies; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 12:13 UTC (permalink / raw)
To: netdev
Here is the resend of the v1 patchset, as requested.
Thanks again for all the valuable comments and the feedback received for
the original submission.
I did a quick re-test, with a 'feature ecn' route ecn set and tcp_ecn != 1
and the various combinations look ok:
host with ecnroute is initiator
- connect to host with tcp_ecn != 0: connection uses ecn
- connect to host with tcp_ecn == 0: ecn is announced but not used
host with ecnroute is responder:
- connect from host with tcp_ecn == 1: connection uses ecn
- connect from host with tcp_ecn != 1: connection does not uses ecn
original cover letter below:
These two patches allow turning on explicit congestion notification
based on the destination network.
For example, assuming the default tcp_ecn sysctl '2', the following will
enable ecn (tcp_ecn=1 behaviour, i.e. request ecn to be enabled for a
tcp connection) for all connections to hosts inside the 192.168.2/24 network:
ip route change 192.168.2.0/24 dev eth0 features ecn
Having a more fine-grained per-route setting can be beneficial for
various reasons, for example 1) within data centers, or 2) local ISPs
may deploy ECN support for their own video/streaming services [1], etc.
Joint work with Daniel Borkmann, feature suggested by Hannes Frederic Sowa.
The patch to enable this in iproute2 will be posted shortly, it is currently
also available here:
http://git.breakpoint.cc/cgit/fw/iproute2.git/commit/?h=iproute_features&id=8843d2d8973fb81c78a7efe6d42e3a17d739003e
[1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 12:13 [PATCH -next v2 0/2] net: allow setting ecn via routing table Florian Westphal
@ 2014-10-31 12:13 ` Florian Westphal
2014-10-31 13:32 ` Eric Dumazet
2014-10-31 12:13 ` [PATCH -next v2 2/2] net: allow setting ecn via routing table Florian Westphal
1 sibling, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 12:13 UTC (permalink / raw)
To: netdev; +Cc: Florian Westphal
Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
allow ecn.
Just turn on ecn if the client ts says so.
This means that while syn cookies are in use clients can turn on ecn
even if it is off on the server.
However, there seems to be no harm in permitting this.
Alternatively one can extend the test to also perform route lookup and
check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
Changes since v1:
- reword commit message
include/net/tcp.h | 3 +--
net/ipv4/syncookies.c | 6 ++----
net/ipv6/syncookies.c | 2 +-
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3a35b15..57521de 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -493,8 +493,7 @@ __u32 cookie_v4_init_sequence(struct sock *sk, const struct sk_buff *skb,
#endif
__u32 cookie_init_timestamp(struct request_sock *req);
-bool cookie_check_timestamp(struct tcp_options_received *opt, struct net *net,
- bool *ecn_ok);
+bool cookie_check_timestamp(struct tcp_options_received *opt, bool *ecn_ok);
/* From net/ipv6/syncookies.c */
int __cookie_v6_check(const struct ipv6hdr *iph, const struct tcphdr *th,
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 4ac7bca..c4e5e2d 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -225,7 +225,7 @@ static inline struct sock *get_cookie_sock(struct sock *sk, struct sk_buff *skb,
* return false if we decode an option that should not be.
*/
bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
- struct net *net, bool *ecn_ok)
+ bool *ecn_ok)
{
/* echoed timestamp, lowest bits contain options */
u32 options = tcp_opt->rcv_tsecr & TSMASK;
@@ -240,8 +240,6 @@ bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
tcp_opt->sack_ok = (options & (1 << 4)) ? TCP_SACK_SEEN : 0;
*ecn_ok = (options >> 5) & 1;
- if (*ecn_ok && !net->ipv4.sysctl_tcp_ecn)
- return false;
if (tcp_opt->sack_ok && !sysctl_tcp_sack)
return false;
@@ -290,7 +288,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
memset(&tcp_opt, 0, sizeof(tcp_opt));
tcp_parse_options(skb, &tcp_opt, 0, NULL);
- if (!cookie_check_timestamp(&tcp_opt, sock_net(sk), &ecn_ok))
+ if (!cookie_check_timestamp(&tcp_opt, &ecn_ok))
goto out;
ret = NULL;
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index be291ba..a08062c 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -186,7 +186,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
memset(&tcp_opt, 0, sizeof(tcp_opt));
tcp_parse_options(skb, &tcp_opt, 0, NULL);
- if (!cookie_check_timestamp(&tcp_opt, sock_net(sk), &ecn_ok))
+ if (!cookie_check_timestamp(&tcp_opt, &ecn_ok))
goto out;
ret = NULL;
--
2.0.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH -next v2 2/2] net: allow setting ecn via routing table
2014-10-31 12:13 [PATCH -next v2 0/2] net: allow setting ecn via routing table Florian Westphal
2014-10-31 12:13 ` [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp Florian Westphal
@ 2014-10-31 12:13 ` Florian Westphal
1 sibling, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 12:13 UTC (permalink / raw)
To: netdev; +Cc: Florian Westphal, Daniel Borkmann
Allows to set ECN on a per-route basis in case the sysctl tcp_ecn is not set to 1.
IOW, when ECN is set for specific routes, it provides a tcp_ecn=1 behaviour for that
route while the rest of the stack acts according to the global settings.
Having a more fine-grained per-route setting can be beneficial for various reasons,
for example a) within data centers, or b) local ISPs may deploy ECN support for
their own video/streaming services [1], etc.
One can use 'ip route change dev $dev $net features ecn' to toggle this.
[1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15
Joint work with Daniel Borkmann.
Reference: http://thread.gmane.org/gmane.linux.network/335797
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
Changes since v1:
reword commit message and add reference to the discussion of the v1 patchset
net/ipv4/tcp_input.c | 25 +++++++++++++++----------
net/ipv4/tcp_output.c | 12 ++++++++++--
2 files changed, 25 insertions(+), 12 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4617e..9db942a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5876,20 +5876,22 @@ static inline void pr_drop_req(struct request_sock *req, __u16 port, int family)
*/
static void tcp_ecn_create_request(struct request_sock *req,
const struct sk_buff *skb,
- const struct sock *listen_sk)
+ const struct sock *listen_sk,
+ struct dst_entry *dst)
{
const struct tcphdr *th = tcp_hdr(skb);
const struct net *net = sock_net(listen_sk);
bool th_ecn = th->ece && th->cwr;
- bool ect, need_ecn;
+ bool ect, need_ecn, ecn_ok;
if (!th_ecn)
return;
ect = !INET_ECN_is_not_ect(TCP_SKB_CB(skb)->ip_dsfield);
need_ecn = tcp_ca_needs_ecn(listen_sk);
+ ecn_ok = net->ipv4.sysctl_tcp_ecn || dst_feature(dst, RTAX_FEATURE_ECN);
- if (!ect && !need_ecn && net->ipv4.sysctl_tcp_ecn)
+ if (!ect && !need_ecn && ecn_ok)
inet_rsk(req)->ecn_ok = 1;
else if (ect && need_ecn)
inet_rsk(req)->ecn_ok = 1;
@@ -5954,13 +5956,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
if (security_inet_conn_request(sk, skb, req))
goto drop_and_free;
- if (!want_cookie || tmp_opt.tstamp_ok)
- tcp_ecn_create_request(req, skb, sk);
-
- if (want_cookie) {
- isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
- req->cookie_ts = tmp_opt.tstamp_ok;
- } else if (!isn) {
+ if (!want_cookie && !isn) {
/* VJ's idea. We save last timestamp seen
* from the destination in peer table, when entering
* state TIME-WAIT, and check against it before
@@ -6008,6 +6004,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
goto drop_and_free;
}
+ tcp_ecn_create_request(req, skb, sk, dst);
+
+ if (want_cookie) {
+ isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
+ req->cookie_ts = tmp_opt.tstamp_ok;
+ if (!tmp_opt.tstamp_ok)
+ inet_rsk(req)->ecn_ok = 0;
+ }
+
tcp_rsk(req)->snt_isn = isn;
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 3af2129..b1c6296 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -333,10 +333,18 @@ static void tcp_ecn_send_synack(struct sock *sk, struct sk_buff *skb)
static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb)
{
struct tcp_sock *tp = tcp_sk(sk);
+ bool use_ecn = sock_net(sk)->ipv4.sysctl_tcp_ecn == 1 ||
+ tcp_ca_needs_ecn(sk);
+
+ if (!use_ecn) {
+ const struct dst_entry *dst = __sk_dst_get(sk);
+ if (dst && dst_feature(dst, RTAX_FEATURE_ECN))
+ use_ecn = true;
+ }
tp->ecn_flags = 0;
- if (sock_net(sk)->ipv4.sysctl_tcp_ecn == 1 ||
- tcp_ca_needs_ecn(sk)) {
+
+ if (use_ecn) {
TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ECE | TCPHDR_CWR;
tp->ecn_flags = TCP_ECN_OK;
if (tcp_ca_needs_ecn(sk))
--
2.0.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 12:13 ` [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp Florian Westphal
@ 2014-10-31 13:32 ` Eric Dumazet
2014-10-31 13:39 ` Florian Westphal
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2014-10-31 13:32 UTC (permalink / raw)
To: Florian Westphal; +Cc: netdev
On Fri, 2014-10-31 at 13:13 +0100, Florian Westphal wrote:
> Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
> allow ecn.
>
> Just turn on ecn if the client ts says so.
>
> This means that while syn cookies are in use clients can turn on ecn
> even if it is off on the server.
>
> However, there seems to be no harm in permitting this.
>
> Alternatively one can extend the test to also perform route lookup and
> check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
> Changes since v1:
> - reword commit message
Sorry.
Google chose to disable ecn on their GFE, so we set sysctl_tcp_ecn to 0
If I understand your patch, if a synflood is going on, some innocent
connections could get ECN enabled, while we do not want this to ever
happen. ECN really hurts our customers, this is a known fact.
You cannot change this like that, it would force us (and maybe others)
to either revert this patch, or add a knob.
If sysctl_tcp_ecn = 0, there is no way a connection should have ECN
enabled. This was documented years ago.
For the record :
vi +247 Documentation/networking/ip-sysctl.txt
tcp_ecn - INTEGER
Control use of Explicit Congestion Notification (ECN) by TCP.
ECN is used only when both ends of the TCP connection indicate
support for it. This feature is useful in avoiding losses due
to congestion by allowing supporting routers to signal
congestion before having to drop packets.
Possible values are:
0 Disable ECN. Neither initiate nor accept ECN.
1 Enable ECN when requested by incoming connections and
also request ECN on outgoing connection attempts.
2 Enable ECN when requested by incoming connections
but do not request ECN on outgoing connections.
Default: 2
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 13:32 ` Eric Dumazet
@ 2014-10-31 13:39 ` Florian Westphal
2014-10-31 14:04 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 13:39 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 13:13 +0100, Florian Westphal wrote:
> > Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
> > allow ecn.
> >
> > Just turn on ecn if the client ts says so.
> >
> > This means that while syn cookies are in use clients can turn on ecn
> > even if it is off on the server.
> >
> > However, there seems to be no harm in permitting this.
> >
> > Alternatively one can extend the test to also perform route lookup and
> > check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.
> >
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> > Changes since v1:
> > - reword commit message
>
> Sorry.
>
> Google chose to disable ecn on their GFE, so we set sysctl_tcp_ecn to 0
>
> If I understand your patch, if a synflood is going on, some innocent
> connections could get ECN enabled, while we do not want this to ever
> happen. ECN really hurts our customers, this is a known fact.
>
> You cannot change this like that, it would force us (and maybe others)
> to either revert this patch, or add a knob.
Mot needed, if you think its wrong to remove the check, I will respin
with a proper validation.
> If sysctl_tcp_ecn = 0, there is no way a connection should have ECN
> enabled. This was documented years ago.
It would only get enabled if the echoed timestamp (ie the timestamp we
sent in the synack) indicates that ecn was enabled, i.e. the client or
a middlebox would have to munge/modify it to set the 'ecn on' bit in the
timestamp.
If that is too fragile in your opinion I will respin the patch to include
the additional validation via dst. We already need to fetch the dst
object anyway to fetch certain route attributes not in the timestamp or
cookie, so its only a matter of reorganizing code first to avoid two lookups.
Let me know what you prefer.
Thanks,
Florian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 13:39 ` Florian Westphal
@ 2014-10-31 14:04 ` Eric Dumazet
2014-10-31 14:15 ` Florian Westphal
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2014-10-31 14:04 UTC (permalink / raw)
To: Florian Westphal; +Cc: netdev
On Fri, 2014-10-31 at 14:39 +0100, Florian Westphal wrote:
> It would only get enabled if the echoed timestamp (ie the timestamp we
> sent in the synack) indicates that ecn was enabled, i.e. the client or
> a middlebox would have to munge/modify it to set the 'ecn on' bit in the
> timestamp.
>
> If that is too fragile in your opinion I will respin the patch to include
> the additional validation via dst. We already need to fetch the dst
> object anyway to fetch certain route attributes not in the timestamp or
> cookie, so its only a matter of reorganizing code first to avoid two lookups.
Well, your changelog is so confusing, I have no idea what is your
intent.
I do not really understand why you need to change something.
Maybe this is because I have not yet took my coffee ;)
Thanks
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 14:04 ` Eric Dumazet
@ 2014-10-31 14:15 ` Florian Westphal
2014-10-31 15:47 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 14:15 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 14:39 +0100, Florian Westphal wrote:
>
> > It would only get enabled if the echoed timestamp (ie the timestamp we
> > sent in the synack) indicates that ecn was enabled, i.e. the client or
> > a middlebox would have to munge/modify it to set the 'ecn on' bit in the
> > timestamp.
> >
> > If that is too fragile in your opinion I will respin the patch to include
> > the additional validation via dst. We already need to fetch the dst
> > object anyway to fetch certain route attributes not in the timestamp or
> > cookie, so its only a matter of reorganizing code first to avoid two lookups.
>
> Well, your changelog is so confusing, I have no idea what is your
> intent.
Sorry :-/
So if you have a per route ecn setting, and syncookies are used,
and tcp_ecn sysctl is 0:
1. we receive syn with ecn on and timestamps
2. we send cookie synack, with timestamp and ecn (route allowed it),
the lower bits of the timestamp have a "magic" bit set that allows
us to infer that ecn was negotiated successfully.
3. we drop the ack from the client, since timestamp decoding sees
"ecn is on according to timestamp, but the tcp_ecn sysctl is off".
So to fix this, step 3 either has to check the dst setting
in addition to the global sysctl, or to rely on the timestamp alone
that ecn was requested by the original client and allowed by our host
at the time synack timestamp was generated/sent.
I hope that explains the reason behind patch #1 up.
> I do not really understand why you need to change something.
Yes, unfortunately you're not the first person saying that my
changelogs are not precise enough sometimes, I hope to do
a better job next time around.
> Maybe this is because I have not yet took my coffee ;)
Oh, well, that could also explain it 8-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 14:15 ` Florian Westphal
@ 2014-10-31 15:47 ` Eric Dumazet
2014-10-31 16:00 ` Florian Westphal
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2014-10-31 15:47 UTC (permalink / raw)
To: Florian Westphal; +Cc: netdev
On Fri, 2014-10-31 at 15:15 +0100, Florian Westphal wrote:
> So if you have a per route ecn setting, and syncookies are used,
> and tcp_ecn sysctl is 0:
This part I do not understand.
Why should tcp_ecn be 0 here, and not 2 (default value) ?
>
> 1. we receive syn with ecn on and timestamps
> 2. we send cookie synack, with timestamp and ecn (route allowed it),
> the lower bits of the timestamp have a "magic" bit set that allows
> us to infer that ecn was negotiated successfully.
> 3. we drop the ack from the client, since timestamp decoding sees
> "ecn is on according to timestamp, but the tcp_ecn sysctl is off".
>
> So to fix this, step 3 either has to check the dst setting
> in addition to the global sysctl, or to rely on the timestamp alone
> that ecn was requested by the original client and allowed by our host
> at the time synack timestamp was generated/sent.
>
> I hope that explains the reason behind patch #1 up.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp
2014-10-31 15:47 ` Eric Dumazet
@ 2014-10-31 16:00 ` Florian Westphal
0 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2014-10-31 16:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 15:15 +0100, Florian Westphal wrote:
>
> > So if you have a per route ecn setting, and syncookies are used,
> > and tcp_ecn sysctl is 0:
>
> This part I do not understand.
>
> Why should tcp_ecn be 0 here, and not 2 (default value) ?
Because admin might have changed it.
There is no problem if tcp_ecn sysctl is nonzero (1 or 2).
This problem will only manifest itself iff tcp_ecn sysctl was set to 0,
and the remote peer requests ecn and a route specific setting enabled
ecn for the source network and syncookies are used.
Current timestamp cookie validation will think "client is lying about
ecn in the timestamp as sysctl is off", since it does not consider a
per-route ecn knob.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-10-31 16:00 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-31 12:13 [PATCH -next v2 0/2] net: allow setting ecn via routing table Florian Westphal
2014-10-31 12:13 ` [PATCH -next v2 1/2] syncookies: remove ecn_ok validation when decoding option timestamp Florian Westphal
2014-10-31 13:32 ` Eric Dumazet
2014-10-31 13:39 ` Florian Westphal
2014-10-31 14:04 ` Eric Dumazet
2014-10-31 14:15 ` Florian Westphal
2014-10-31 15:47 ` Eric Dumazet
2014-10-31 16:00 ` Florian Westphal
2014-10-31 12:13 ` [PATCH -next v2 2/2] net: allow setting ecn via routing table Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).