[PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
@ 2013-12-20 13:08 Hannes Frederic Sowa
  2013-12-31  3:20 ` David Miller
  2013-12-31  4:28 ` Hannes Frederic Sowa
  0 siblings, 2 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-20 13:08 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, davem

Provide a mode where the forwarding path does not use the protocol path
MTU to calculate the maximum size for a forwarded packet but instead
uses the interface or the per-route locked MTU.

It is easy to inject bogus or malicious path mtu information which
will cause either unneeded fragmentation-needed icmp errors (in case
of DF-bit set) or unnecessary fragmentation of packets (by default down
to min_pmtu). This could be used to either create blackholes on routers
(if the generated DF-bit gets dropped later on) or to leverage attacks
on fragmentation.

Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
was introduced for multicast forwarding, but as it does not conflict with
our usage in the unicast code path it is perfect for reuse.

I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
along with the new ip_dst_mtu_secure to net/ip.h to fix circular
dependencies because of IPSKB_FORWARDED.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
v2:
* forwarding_uses_pmtu default to on
* reworded documentation and changelog

 Documentation/networking/ip-sysctl.txt | 13 +++++++++++++
 include/net/ip.h                       | 32 ++++++++++++++++++++++++++++++++
 include/net/netns/ipv4.h               |  1 +
 include/net/route.h                    | 19 +++----------------
 net/ipv4/ip_forward.c                  |  7 +++++--
 net/ipv4/ip_output.c                   |  9 ++++++---
 net/ipv4/route.c                       |  3 ---
 net/ipv4/sysctl_net_ipv4.c             |  7 +++++++
 8 files changed, 67 insertions(+), 24 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index d71afa8..1077c92 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -32,6 +32,19 @@ ip_no_pmtu_disc - INTEGER
 min_pmtu - INTEGER
 	default 552 - minimum discovered Path MTU
 
+forwarding_uses_pmtu - BOOLEAN
+	Don't trust path MTUs while forwarding. Path MTU notifications
+	can easily be forged and lead to unwanted and unnecessary
+	fragmentation or, in case DF-bit being set, dropping of a
+	packet and generation of fragmentation-needed ICMPs . If this
+	mode is enabled, only interface and per-route locked MTUs are
+	considered for fragmentation and fragmentation-needed
+	generation.
+	Default: 1 (enabled)
+	Possible values:
+	0 - disabled
+	1 - enabled
+
 route/max_size - INTEGER
 	Maximum number of routes allowed in the kernel.  Increase
 	this when using large numbers of interfaces and/or routes.
diff --git a/include/net/ip.h b/include/net/ip.h
index 5356644..58fbedc 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -263,6 +263,38 @@ int ip_dont_fragment(struct sock *sk, struct dst_entry *dst)
 		 !(dst_metric_locked(dst, RTAX_MTU)));
 }
 
+static inline bool ip_sk_accept_pmtu(const struct sock *sk)
+{
+	return inet_sk(sk)->pmtudisc != IP_PMTUDISC_INTERFACE;
+}
+
+static inline bool ip_sk_use_pmtu(const struct sock *sk)
+{
+	return inet_sk(sk)->pmtudisc < IP_PMTUDISC_PROBE;
+}
+static inline unsigned int ip_dst_mtu_secure(const struct dst_entry *dst,
+					     bool forwarding)
+{
+	struct net *net = dev_net(dst->dev);
+
+	if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
+	    dst_metric_locked(dst, RTAX_MTU) ||
+	    !forwarding)
+		return dst_mtu(dst);
+
+	return min(dst->dev->mtu, IP_MAX_MTU);
+}
+
+static inline unsigned int ip_skb_dst_mtu(const struct sk_buff *skb)
+{
+	if (!skb->sk || ip_sk_use_pmtu(skb->sk)) {
+		bool forwarding = !!(IPCB(skb)->flags & IPSKB_FORWARDED);
+		return ip_dst_mtu_secure(skb_dst(skb), forwarding);
+	} else {
+		return min(skb_dst(skb)->dev->mtu, IP_MAX_MTU);
+	}
+}
+
 void __ip_select_ident(struct iphdr *iph, struct dst_entry *dst, int more);
 
 static inline void ip_select_ident(struct sk_buff *skb, struct dst_entry *dst, struct sock *sk)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 929a668..80f500a 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -70,6 +70,7 @@ struct netns_ipv4 {
 
 	int sysctl_tcp_ecn;
 	int sysctl_ip_no_pmtu_disc;
+	int sysctl_ip_fwd_use_pmtu;
 
 	kgid_t sysctl_ping_group_range[2];
 
diff --git a/include/net/route.h b/include/net/route.h
index 638e3eb..9d1f423 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -36,6 +36,9 @@
 #include <linux/cache.h>
 #include <linux/security.h>
 
+/* IPv4 datagram length is stored into 16bit field (tot_len) */
+#define IP_MAX_MTU	0xFFFFU
+
 #define RTO_ONLINK	0x01
 
 #define RT_CONN_FLAGS(sk)   (RT_TOS(inet_sk(sk)->tos) | sock_flag(sk, SOCK_LOCALROUTE))
@@ -311,20 +314,4 @@ static inline int ip4_dst_hoplimit(const struct dst_entry *dst)
 	return hoplimit;
 }
 
-static inline bool ip_sk_accept_pmtu(const struct sock *sk)
-{
-	return inet_sk(sk)->pmtudisc != IP_PMTUDISC_INTERFACE;
-}
-
-static inline bool ip_sk_use_pmtu(const struct sock *sk)
-{
-	return inet_sk(sk)->pmtudisc < IP_PMTUDISC_PROBE;
-}
-
-static inline int ip_skb_dst_mtu(const struct sk_buff *skb)
-{
-	return (!skb->sk || ip_sk_use_pmtu(skb->sk)) ?
-	       dst_mtu(skb_dst(skb)) : skb_dst(skb)->dev->mtu;
-}
-
 #endif	/* _ROUTE_H */
diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
index 694de3b..66d027f 100644
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -54,6 +54,7 @@ static int ip_forward_finish(struct sk_buff *skb)
 
 int ip_forward(struct sk_buff *skb)
 {
+	u32 mtu;
 	struct iphdr *iph;	/* Our header */
 	struct rtable *rt;	/* Route we use */
 	struct ip_options *opt	= &(IPCB(skb)->opt);
@@ -88,11 +89,13 @@ int ip_forward(struct sk_buff *skb)
 	if (opt->is_strictroute && rt->rt_uses_gateway)
 		goto sr_failed;
 
-	if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
+	IPCB(skb)->flags |= IPSKB_FORWARDED;
+	mtu = ip_dst_mtu_secure(&rt->dst, true);
+	if (unlikely(skb->len > mtu && !skb_is_gso(skb) &&
 		     (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
 		IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-			  htonl(dst_mtu(&rt->dst)));
+			  htonl(mtu));
 		goto drop;
 	}
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 9124027..f2574d4 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -449,6 +449,7 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 	__be16 not_last_frag;
 	struct rtable *rt = skb_rtable(skb);
 	int err = 0;
+	bool forwarding = !!(IPCB(skb)->flags & IPSKB_FORWARDED);
 
 	dev = rt->dst.dev;
 
@@ -458,12 +459,13 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 
 	iph = ip_hdr(skb);
 
+	mtu = ip_dst_mtu_secure(&rt->dst, forwarding);
 	if (unlikely(((iph->frag_off & htons(IP_DF)) && !skb->local_df) ||
 		     (IPCB(skb)->frag_max_size &&
-		      IPCB(skb)->frag_max_size > dst_mtu(&rt->dst)))) {
+		      IPCB(skb)->frag_max_size > mtu))) {
 		IP_INC_STATS(dev_net(dev), IPSTATS_MIB_FRAGFAILS);
 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-			  htonl(ip_skb_dst_mtu(skb)));
+			  htonl(mtu));
 		kfree_skb(skb);
 		return -EMSGSIZE;
 	}
@@ -473,7 +475,7 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
 	 */
 
 	hlen = iph->ihl * 4;
-	mtu = dst_mtu(&rt->dst) - hlen;	/* Size of data space */
+	mtu = mtu - hlen;	/* Size of data space */
 #ifdef CONFIG_BRIDGE_NETFILTER
 	if (skb->nf_bridge)
 		mtu -= nf_bridge_mtu_reduction(skb);
@@ -1547,6 +1549,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 
 void __init ip_init(void)
 {
+	init_net.ipv4.sysctl_ip_fwd_use_pmtu = 1;
 	ip_rt_init();
 	inet_initpeers();
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f8da282..25071b4 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -112,9 +112,6 @@
 #define RT_FL_TOS(oldflp4) \
 	((oldflp4)->flowi4_tos & (IPTOS_RT_MASK | RTO_ONLINK))
 
-/* IPv4 datagram length is stored into 16bit field (tot_len) */
-#define IP_MAX_MTU	0xFFFF
-
 #define RT_GC_TIMEOUT (300*HZ)
 
 static int ip_rt_max_size;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index d7b63a6..56cc374 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -831,6 +831,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.procname	= "forwarding_uses_pmtu",
+		.data		= &init_net.ipv4.sysctl_ip_fwd_use_pmtu,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 	{ }
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-20 13:08 [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info Hannes Frederic Sowa
@ 2013-12-31  3:20 ` David Miller
  2013-12-31  3:52   ` Hannes Frederic Sowa
  2014-01-05 10:41   ` Hannes Frederic Sowa
  2013-12-31  4:28 ` Hannes Frederic Sowa
  1 sibling, 2 replies; 14+ messages in thread
From: David Miller @ 2013-12-31  3:20 UTC (permalink / raw)
  To: hannes; +Cc: netdev, eric.dumazet

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 20 Dec 2013 14:08:22 +0100

> Provide a mode where the forwarding path does not use the protocol path
> MTU to calculate the maximum size for a forwarded packet but instead
> uses the interface or the per-route locked MTU.
> 
> It is easy to inject bogus or malicious path mtu information which
> will cause either unneeded fragmentation-needed icmp errors (in case
> of DF-bit set) or unnecessary fragmentation of packets (by default down
> to min_pmtu). This could be used to either create blackholes on routers
> (if the generated DF-bit gets dropped later on) or to leverage attacks
> on fragmentation.
> 
> Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
> was introduced for multicast forwarding, but as it does not conflict with
> our usage in the unicast code path it is perfect for reuse.
> 
> I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
> along with the new ip_dst_mtu_secure to net/ip.h to fix circular
> dependencies because of IPSKB_FORWARDED.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: David Miller <davem@davemloft.net>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> v2:
> * forwarding_uses_pmtu default to on
> * reworded documentation and changelog

Sorry Hannes, I'm still not happy with this change at all.

First of all, you misunderstood what I said last time around.
I basically was trying to say that if you make this facility
available at all, distributions are going to turn it on by
default even if you make the kernel default to off.

That drives me nuts, and I don't want to give people this
opportunity to screw things up again just like they did
for rp_filter, which also defaults to off.

Secondly, I don't see how this is as big of an issue as you
present it to be.  At best, the language is way too strong.

How can PMTU information be injected into the kernel?

Local connections, well we want that and it's good.  And for TCP you
have to be able to inject a valid TCP packet that can be validated by
the stack for the PMTU information to stick.

Tunnels, well you said that tunnels don't apply to this change.

The only thing left are things like AH and ESP, which also perform
some level of validation making sure that some state exists.  And
frankly for non-tunneled IPSEC this PMTU information is absolutely
essential.

I really don't want to apply these changes, sorry.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  3:20 ` David Miller
@ 2013-12-31  3:52   ` Hannes Frederic Sowa
  2013-12-31  6:04     ` David Miller
  2014-01-05 10:41   ` Hannes Frederic Sowa
  1 sibling, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-31  3:52 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, eric.dumazet

On Mon, Dec 30, 2013 at 10:20:44PM -0500, David Miller wrote:
> Sorry Hannes, I'm still not happy with this change at all.
> 
> First of all, you misunderstood what I said last time around.
> I basically was trying to say that if you make this facility
> available at all, distributions are going to turn it on by
> default even if you make the kernel default to off.

Yes, I am sorry for that.

> That drives me nuts, and I don't want to give people this
> opportunity to screw things up again just like they did
> for rp_filter, which also defaults to off.
> 
> Secondly, I don't see how this is as big of an issue as you
> present it to be.  At best, the language is way too strong.
> 
> How can PMTU information be injected into the kernel?

I used icmp_err handler with type == ICMP_ECHOREPLY. It is pretty open
as it does not validate (and cannot) validate local socket state.

This also applies to ipv6 and raw handler could also be a problem.

> Local connections, well we want that and it's good.  And for TCP you
> have to be able to inject a valid TCP packet that can be validated by
> the stack for the PMTU information to stick.
> 
> Tunnels, well you said that tunnels don't apply to this change.
> 
> The only thing left are things like AH and ESP, which also perform
> some level of validation making sure that some state exists.  And
> frankly for non-tunneled IPSEC this PMTU information is absolutely
> essential.
>
> I really don't want to apply these changes, sorry.

I can understand.

I'll do more research on this and try to present more convincing reasons
why this could be a problem and explain and check more carefully how
this interacts with the xfrm subsystem. I am still convinced we should
suppress path mtu information in forwarding path.

I'll review the language of this change, too. ;)

Thank you for the review and a happy new year!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-20 13:08 [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info Hannes Frederic Sowa
  2013-12-31  3:20 ` David Miller
@ 2013-12-31  4:28 ` Hannes Frederic Sowa
  2014-01-06  9:05   ` Steffen Klassert
  1 sibling, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-31  4:28 UTC (permalink / raw)
  To: steffen.klassert; +Cc: netdev, eric.dumazet, davem

Hi Steffen!

On Fri, Dec 20, 2013 at 02:08:22PM +0100, Hannes Frederic Sowa wrote:
> Provide a mode where the forwarding path does not use the protocol path
> MTU to calculate the maximum size for a forwarded packet but instead
> uses the interface or the per-route locked MTU.
> 
> It is easy to inject bogus or malicious path mtu information which
> will cause either unneeded fragmentation-needed icmp errors (in case
> of DF-bit set) or unnecessary fragmentation of packets (by default down
> to min_pmtu). This could be used to either create blackholes on routers
> (if the generated DF-bit gets dropped later on) or to leverage attacks
> on fragmentation.
> 
> Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
> was introduced for multicast forwarding, but as it does not conflict with
> our usage in the unicast code path it is perfect for reuse.
> 
> I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
> along with the new ip_dst_mtu_secure to net/ip.h to fix circular
> dependencies because of IPSKB_FORWARDED.

IIRC you have a (semi-)automatic test suite to test for (p)mtu problems? Would
these checks cover such a change?

Maybe your test suite is publicly available so I could test these changes
myself or otherwise could you give them a testdrive? That would be very
helpful.

Thank you,

  Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  3:52   ` Hannes Frederic Sowa
@ 2013-12-31  6:04     ` David Miller
  2013-12-31  7:02       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2013-12-31  6:04 UTC (permalink / raw)
  To: hannes; +Cc: netdev, eric.dumazet

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 31 Dec 2013 04:52:47 +0100

> I am still convinced we should suppress path mtu information in
> forwarding path.

This is why it disappoints me that the person who felt differently,
you left out of the CC: when you reposted your patches.  He's
the person who made the code the way it is right now.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  6:04     ` David Miller
@ 2013-12-31  7:02       ` Hannes Frederic Sowa
  2013-12-31 17:59         ` John Heffner
  0 siblings, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-31  7:02 UTC (permalink / raw)
  To: johnwheffner; +Cc: netdev, eric.dumazet, davem

[added John Heffner to the discussion]

* Context
> Provide a mode where the forwarding path does not use the protocol path
> MTU to calculate the maximum size for a forwarded packet but instead
> uses the interface or the per-route locked MTU.
> 
> It is easy to inject bogus or malicious path mtu information which
> will cause either unneeded fragmentation-needed icmp errors (in case
> of DF-bit set) or unnecessary fragmentation of packets (by default down
> to min_pmtu). This could be used to either create blackholes on routers
> (if the generated DF-bit gets dropped later on) or to leverage attacks
> on fragmentation.
> 
> Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
> was introduced for multicast forwarding, but as it does not conflict with
> our usage in the unicast code path it is perfect for reuse.
> 
> I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
> along with the new ip_dst_mtu_secure to net/ip.h to fix circular
> dependencies because of IPSKB_FORWARDED.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>  
> Cc: David Miller <davem@davemloft.net>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> v2:
> * forwarding_uses_pmtu default to on
> * reworded documentation and changelog

On Tue, Dec 31, 2013 at 01:04:03AM -0500, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Tue, 31 Dec 2013 04:52:47 +0100
> 
> > I am still convinced we should suppress path mtu information in
> > forwarding path.
> 
> This is why it disappoints me that the person who felt differently,
> you left out of the CC: when you reposted your patches.  He's
> the person who made the code the way it is right now.

That was not on purpose. I am including him in the discussion right
now. I thought that after your last mail in the old thread you already
made a decision. I also failed to update the patch's cc list, I am sorry
for that.

So, I try again:

Jeff, I am sorry to not have included you in this resent. This resend patch
has as the only change that it enables forwarding_uses_pmtu by default.

I think we should find a consensus on this if it is reasonable to provide such
a knob or find a default mode the forwarding path always operates on.

In my opinion we must never use the path mtu to determine the size of packets
while forwarding as this would make it easier to try attacks on fragments. I
know we cannot fully prevent fragmentation but we should try as hard as
possible to prevent it.

Also, I don't see that anything should break because of this (with the
exception of user space daemons doing manually probing, that was the
only reason I included the knob). I also have an alternative to this. Like
multicast routing such a daemon could install a socket and set a specific
setsockopt. As long as this daemon is running and the socket is not closed we
use the path mtu. If the program exits and closes the socket we fall back to
just using the interface mtu.

Given the recent attention to a succesful implementation of an attack
on DNS using fragmentation as the root cause [1] I think this change is
justified. I also described another problem I thought about regarding
IPv6.  As we can shrink the path mtu the router would generate PtB errors
and would lead to fragmentation on the end system. Recent research [2]
showed that the failure rate of IPv6 fragments on the Alexa Top 1M sites
is about 40% or 47% depending on the test. If this is also true and one
can bring a host to fragment along such a path this could result in a
blackhole. It may cause struggle for some people. I don't think this
is as serious as the IPv4 problem. The problem could happen the other
way around, if the generated PtB errors get dropped back to the host,
which thus will never get informed about a mtu bottleneck on the path.

Given the problems with correctly filtering those icmp errors I propose
to always use the interface MTU for forwarding. In my understanding this
is also in consent with RFC 1191 "4. Router specification".

What do you think? I am looking forward to hear your opinion on that.

[1] <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>
[2] <http://www.iepg.org/2013-11-ietf88/fgont-iepg-ietf88-ipv6-frag-and-eh.pdf>

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  7:02       ` Hannes Frederic Sowa
@ 2013-12-31 17:59         ` John Heffner
  2013-12-31 18:41           ` Hannes Frederic Sowa
  0 siblings, 1 reply; 14+ messages in thread
From: John Heffner @ 2013-12-31 17:59 UTC (permalink / raw)
  To: Netdev, Eric Dumazet, David Miller

On Tue, Dec 31, 2013 at 2:02 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> [added John Heffner to the discussion]
>
> * Context
>> Provide a mode where the forwarding path does not use the protocol path
>> MTU to calculate the maximum size for a forwarded packet but instead
>> uses the interface or the per-route locked MTU.
>>
>> It is easy to inject bogus or malicious path mtu information which
>> will cause either unneeded fragmentation-needed icmp errors (in case
>> of DF-bit set) or unnecessary fragmentation of packets (by default down
>> to min_pmtu). This could be used to either create blackholes on routers
>> (if the generated DF-bit gets dropped later on) or to leverage attacks
>> on fragmentation.
>>
>> Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
>> was introduced for multicast forwarding, but as it does not conflict with
>> our usage in the unicast code path it is perfect for reuse.
>>
>> I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
>> along with the new ip_dst_mtu_secure to net/ip.h to fix circular
>> dependencies because of IPSKB_FORWARDED.
>>
>> Cc: Eric Dumazet <eric.dumazet@gmail.com>
>> Cc: David Miller <davem@davemloft.net>
>> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> ---
>> v2:
>> * forwarding_uses_pmtu default to on
>> * reworded documentation and changelog
>
> On Tue, Dec 31, 2013 at 01:04:03AM -0500, David Miller wrote:
>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Date: Tue, 31 Dec 2013 04:52:47 +0100
>>
>> > I am still convinced we should suppress path mtu information in
>> > forwarding path.
>>
>> This is why it disappoints me that the person who felt differently,
>> you left out of the CC: when you reposted your patches.  He's
>> the person who made the code the way it is right now.
>
> That was not on purpose. I am including him in the discussion right
> now. I thought that after your last mail in the old thread you already
> made a decision. I also failed to update the patch's cc list, I am sorry
> for that.
>
> So, I try again:
>
> Jeff, I am sorry to not have included you in this resent. This resend patch
> has as the only change that it enables forwarding_uses_pmtu by default.
>
> I think we should find a consensus on this if it is reasonable to provide such
> a knob or find a default mode the forwarding path always operates on.
>
> In my opinion we must never use the path mtu to determine the size of packets
> while forwarding as this would make it easier to try attacks on fragments. I
> know we cannot fully prevent fragmentation but we should try as hard as
> possible to prevent it.
>
> Also, I don't see that anything should break because of this (with the
> exception of user space daemons doing manually probing, that was the
> only reason I included the knob). I also have an alternative to this. Like
> multicast routing such a daemon could install a socket and set a specific
> setsockopt. As long as this daemon is running and the socket is not closed we
> use the path mtu. If the program exits and closes the socket we fall back to
> just using the interface mtu.
>
> Given the recent attention to a succesful implementation of an attack
> on DNS using fragmentation as the root cause [1] I think this change is
> justified. I also described another problem I thought about regarding
> IPv6.  As we can shrink the path mtu the router would generate PtB errors
> and would lead to fragmentation on the end system. Recent research [2]
> showed that the failure rate of IPv6 fragments on the Alexa Top 1M sites
> is about 40% or 47% depending on the test. If this is also true and one
> can bring a host to fragment along such a path this could result in a
> blackhole. It may cause struggle for some people. I don't think this
> is as serious as the IPv4 problem. The problem could happen the other
> way around, if the generated PtB errors get dropped back to the host,
> which thus will never get informed about a mtu bottleneck on the path.
>
> Given the problems with correctly filtering those icmp errors I propose
> to always use the interface MTU for forwarding. In my understanding this
> is also in consent with RFC 1191 "4. Router specification".
>
> What do you think? I am looking forward to hear your opinion on that.
>
> [1] <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>
> [2] <http://www.iepg.org/2013-11-ietf88/fgont-iepg-ietf88-ipv6-frag-and-eh.pdf>


I'm not completely sure I understand the attack.  Perhaps a more
thorough explanation or concrete example is in order.  I understand
that it might be possible to spoof ICMP packet-too-big messages to a
router, but that router would need a socket listening that would
handle these messages.  Is this a problem in practice?  Routers in
critical forwarding paths are usually pretty well hardened, and don't
have services running on the forwarding path interfaces (or use
firewall rules to control access to those services).  Are such routers
vulnerable?  Is there a reason we need to compromise and increase
complexity of the routing code to handle poorly-configured routers?

Thanks,
  -John

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31 17:59         ` John Heffner
@ 2013-12-31 18:41           ` Hannes Frederic Sowa
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-31 18:41 UTC (permalink / raw)
  To: John Heffner; +Cc: Netdev, Eric Dumazet, David Miller

Hi!

On Tue, Dec 31, 2013 at 12:59:06PM -0500, John Heffner wrote:
> I'm not completely sure I understand the attack.  Perhaps a more
> thorough explanation or concrete example is in order.  I understand
> that it might be possible to spoof ICMP packet-too-big messages to a
> router, but that router would need a socket listening that would
> handle these messages.

No, an open socket is not needed in case of icmp echo reply payloads
on frag_needed notifications. And you really have to be careful about
bound raw sockets to any addresses on such an interface (e.g. some dhcp
clients, or maybe not bound at all to an interface).

> Is this a problem in practice?  Routers in
> critical forwarding paths are usually pretty well hardened, and don't
> have services running on the forwarding path interfaces (or use
> firewall rules to control access to those services).  Are such routers
> vulnerable?  Is there a reason we need to compromise and increase
> complexity of the routing code to handle poorly-configured routers?

Given the whole complexity in forwarding path I don't think that this
change does add a lot more.

The essential part of the attack is, that there is a way to spray the
fragmentation queue with already made-up answers and correctly guessed
fragmentation ids and wait for a user to issue a DNS request whose answer
is gonna to be fragmented on the path (or on the DNS name system, that
is the reason for IP_PMTUDISC_INTERFACE). In the paper they also found
a way to precompute the checksums so that after reassembling the valid
packet with the spoofed one, the whole packet is still valid. Sorry, I
don't have access to working code, as this does not yet seemed published.

The trick to prevent this attack is to configure the DNS server to
not emit packets above a specified size and make sure they don't get
fragmented locally (the mtu value needs to be somewhat lower than the
average MTU size of the internet, some people even say default min_pmtu
is the right choice). Of course, packets on paths which have a mtu below the
max-udp-size mtu will still get (rightfully) fragmented. But there seem to be
not that many paths out there (I hope, I don't have numbers for that).

Bascially all forwarding hops could be used for that, not only the
routers in critical forwarding paths.

Also IMHO I think using the mtu of the next-hop forwarding network is
just the right thing to do besides the presented attack on DNS.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  3:20 ` David Miller
  2013-12-31  3:52   ` Hannes Frederic Sowa
@ 2014-01-05 10:41   ` Hannes Frederic Sowa
  2014-01-05 19:45     ` David Miller
  1 sibling, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-05 10:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, eric.dumazet, steffen.klassert

Hi David!

On Mon, Dec 30, 2013 at 10:20:44PM -0500, David Miller wrote:
> The only thing left are things like AH and ESP, which also perform
> some level of validation making sure that some state exists.  And
> frankly for non-tunneled IPSEC this PMTU information is absolutely
> essential.

I am trying to finish the testing of these patches with ipsec.

What do you mean here by non-tunneled IPSEC? Transport mode is not an issue
with this patch as we don't push packets for transport mode through
forwarding. Tunneled mode would invoke fragmentation again. It is always
ensured that mtu in ip_forward is higher than when xfrm layer checks with
dst_mtu, so nothing should break here.

Thank you,

  Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2014-01-05 10:41   ` Hannes Frederic Sowa
@ 2014-01-05 19:45     ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2014-01-05 19:45 UTC (permalink / raw)
  To: hannes; +Cc: netdev, eric.dumazet, steffen.klassert

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Sun, 5 Jan 2014 11:41:09 +0100

> What do you mean here by non-tunneled IPSEC? Transport mode is not an issue
> with this patch as we don't push packets for transport mode through
> forwarding.

Right you are.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2013-12-31  4:28 ` Hannes Frederic Sowa
@ 2014-01-06  9:05   ` Steffen Klassert
  2014-01-06  9:14     ` Hannes Frederic Sowa
  2014-01-06 13:18     ` Steffen Klassert
  0 siblings, 2 replies; 14+ messages in thread
From: Steffen Klassert @ 2014-01-06  9:05 UTC (permalink / raw)
  To: Hannes Frederic Sowa, netdev, eric.dumazet, davem

On Tue, Dec 31, 2013 at 05:28:40AM +0100, Hannes Frederic Sowa wrote:
> Hi Steffen!
> 
> On Fri, Dec 20, 2013 at 02:08:22PM +0100, Hannes Frederic Sowa wrote:
> > Provide a mode where the forwarding path does not use the protocol path
> > MTU to calculate the maximum size for a forwarded packet but instead
> > uses the interface or the per-route locked MTU.
> > 
> > It is easy to inject bogus or malicious path mtu information which
> > will cause either unneeded fragmentation-needed icmp errors (in case
> > of DF-bit set) or unnecessary fragmentation of packets (by default down
> > to min_pmtu). This could be used to either create blackholes on routers
> > (if the generated DF-bit gets dropped later on) or to leverage attacks
> > on fragmentation.
> > 
> > Forwarded skbs are marked with IPSKB_FORWARDED in ip_forward. This flag
> > was introduced for multicast forwarding, but as it does not conflict with
> > our usage in the unicast code path it is perfect for reuse.
> > 
> > I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
> > along with the new ip_dst_mtu_secure to net/ip.h to fix circular
> > dependencies because of IPSKB_FORWARDED.
> 
> IIRC you have a (semi-)automatic test suite to test for (p)mtu problems? Would
> these checks cover such a change?
> 

I'm currenlty testing these patches. ipv4 looks good but on
ipv6 with 'ping6' the packet size is not reduced according
to the pmtu when forward_use_pmtu is set to 0.

I'll run the tests again with your updated v3 patches.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2014-01-06  9:05   ` Steffen Klassert
@ 2014-01-06  9:14     ` Hannes Frederic Sowa
  2014-01-06 13:18     ` Steffen Klassert
  1 sibling, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-06  9:14 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, eric.dumazet, davem

On Mon, Jan 06, 2014 at 10:05:21AM +0100, Steffen Klassert wrote:
> I'm currenlty testing these patches. ipv4 looks good but on
> ipv6 with 'ping6' the packet size is not reduced according
> to the pmtu when forward_use_pmtu is set to 0.
> 
> I'll run the tests again with your updated v3 patches.

Thank you very much!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2014-01-06  9:05   ` Steffen Klassert
  2014-01-06  9:14     ` Hannes Frederic Sowa
@ 2014-01-06 13:18     ` Steffen Klassert
  2014-01-06 13:29       ` Hannes Frederic Sowa
  1 sibling, 1 reply; 14+ messages in thread
From: Steffen Klassert @ 2014-01-06 13:18 UTC (permalink / raw)
  To: Hannes Frederic Sowa, netdev, eric.dumazet, davem

On Mon, Jan 06, 2014 at 10:05:21AM +0100, Steffen Klassert wrote:
> > 
> > IIRC you have a (semi-)automatic test suite to test for (p)mtu problems? Would
> > these checks cover such a change?
> > 
> 
> I'm currenlty testing these patches. ipv4 looks good but on
> ipv6 with 'ping6' the packet size is not reduced according
> to the pmtu when forward_use_pmtu is set to 0.
> 
> I'll run the tests again with your updated v3 patches.
> 

The v3 version of your patches don't show any obvious regressions
on our pmtu test suite.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info
  2014-01-06 13:18     ` Steffen Klassert
@ 2014-01-06 13:29       ` Hannes Frederic Sowa
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-06 13:29 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, eric.dumazet, davem

On Mon, Jan 06, 2014 at 02:18:22PM +0100, Steffen Klassert wrote:
> On Mon, Jan 06, 2014 at 10:05:21AM +0100, Steffen Klassert wrote:
> > > 
> > > IIRC you have a (semi-)automatic test suite to test for (p)mtu problems? Would
> > > these checks cover such a change?
> > > 
> > 
> > I'm currenlty testing these patches. ipv4 looks good but on
> > ipv6 with 'ping6' the packet size is not reduced according
> > to the pmtu when forward_use_pmtu is set to 0.
> > 
> > I'll run the tests again with your updated v3 patches.
> > 
> 
> The v3 version of your patches don't show any obvious regressions
> on our pmtu test suite.

I am happy to hear that. Thanks again for testing!

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-01-06 13:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-20 13:08 [PATCH net-next 1/2] ipv4: add forwarding_uses_pmtu knob to protect forward path to use pmtu info Hannes Frederic Sowa
2013-12-31  3:20 ` David Miller
2013-12-31  3:52   ` Hannes Frederic Sowa
2013-12-31  6:04     ` David Miller
2013-12-31  7:02       ` Hannes Frederic Sowa
2013-12-31 17:59         ` John Heffner
2013-12-31 18:41           ` Hannes Frederic Sowa
2014-01-05 10:41   ` Hannes Frederic Sowa
2014-01-05 19:45     ` David Miller
2013-12-31  4:28 ` Hannes Frederic Sowa
2014-01-06  9:05   ` Steffen Klassert
2014-01-06  9:14     ` Hannes Frederic Sowa
2014-01-06 13:18     ` Steffen Klassert
2014-01-06 13:29       ` Hannes Frederic Sowa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).