netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Problematic commits in the ipsec tree
@ 2013-08-22 10:47 Steffen Klassert
  2013-08-22 11:21 ` Hannes Frederic Sowa
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Steffen Klassert @ 2013-08-22 10:47 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: David Miller, netdev

Hannes,

I have two problematic commits from you in the ipsec tree. The first one is:

commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)

This breakes pmtu discovery for IPv4 because now we use the device mtu
instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
socket is at the skb.

The second is:

commit 844d48746 (xfrm: choose protocol family by skb protocol)

This breaks pmtu discovery for IPv6 too because skb->protocol can be null
in __xfrm6_output().

We need a solution soon, or I have to revert or remove these commits from
the ipsec tree.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
@ 2013-08-22 11:21 ` Hannes Frederic Sowa
  2013-08-22 13:53 ` Hannes Frederic Sowa
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 11:21 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> I have two problematic commits from you in the ipsec tree. The first one is:
> 
> commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> 
> This breakes pmtu discovery for IPv4 because now we use the device mtu
> instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> socket is at the skb.
> 
> The second is:
> 
> commit 844d48746 (xfrm: choose protocol family by skb protocol)
> 
> This breaks pmtu discovery for IPv6 too because skb->protocol can be null
> in __xfrm6_output().
> 
> We need a solution soon, or I have to revert or remove these commits from
> the ipsec tree.

Thanks for letting me know. I'll work on it right away.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
  2013-08-22 11:21 ` Hannes Frederic Sowa
@ 2013-08-22 13:53 ` Hannes Frederic Sowa
  2013-08-23  8:58   ` Steffen Klassert
  2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
  2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
  3 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 13:53 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> Hannes,
> 
> I have two problematic commits from you in the ipsec tree. The first one is:
> 
> commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> 
> This breakes pmtu discovery for IPv4 because now we use the device mtu
> instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> socket is at the skb.

I am currently testing this following patch. It should restore old behavior
for ipv4 sockets.

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..65d3529 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
 
 	if (sk && skb->protocol == htons(ETH_P_IPV6))
 		return ip6_skb_dst_mtu(skb);
-	else if (sk && skb->protocol == htons(ETH_P_IP))
-		return ip_skb_dst_mtu(skb);
 	return dst_mtu(skb_dst(skb));
 }
 
Actually I would like to extend this check so that we only take the
dst_mtu(dst->path) in case of IP_PMTUDISC_PROBE. But that would be a
patch for ipsec-next. I was not sure if we always dispatch to xfrm_mtu
in this code-path.

> 
> The second is:
> 
> commit 844d48746 (xfrm: choose protocol family by skb protocol)
> 
> This breaks pmtu discovery for IPv6 too because skb->protocol can be null
> in __xfrm6_output().

I am currently checking if there are side-effects if we set skb->protocol in
raw sockets. This should solve this problem. Btw. this is also the case for
IPv4.

Hope to have tested patches later today.

Thanks,

  Hannes

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
  2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
  2013-08-22 11:21 ` Hannes Frederic Sowa
  2013-08-22 13:53 ` Hannes Frederic Sowa
@ 2013-08-22 19:53 ` Hannes Frederic Sowa
  2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
  3 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 19:53 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
helper for safe determination of mtu") I switched the determination of
ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
never correct for ipv4 ipsec.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/xfrm.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..65d3529 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
 
 	if (sk && skb->protocol == htons(ETH_P_IPV6))
 		return ip6_skb_dst_mtu(skb);
-	else if (sk && skb->protocol == htons(ETH_P_IP))
-		return ip_skb_dst_mtu(skb);
 	return dst_mtu(skb_dst(skb));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs
  2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
                   ` (2 preceding siblings ...)
  2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
@ 2013-08-22 19:54 ` Hannes Frederic Sowa
  2013-08-22 23:59   ` Eric Dumazet
  3 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 19:54 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev, eric.dumazet

Currently we don't initialize skb->protocol when transmitting data via
tcp, raw(with and without inclhdr) or udp+ufo or appending data directly
to the socket transmit queue (via ip6_append_data). This needs to be
done so that we can get the correct mtu in the xfrm layer.

Setting of skb->protocol happens only in functions where we also have
a transmitting socket and a new skb, so we don't overwrite old values.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---

I would love to have more reviews for this. I hope to have checked all
the code-paths down from these routines to not cause any side-effects.

Testing showed no problems here. Sorry for only testing with udp only
before.

 net/ipv6/ip6_output.c | 3 +++
 net/ipv6/raw.c        | 1 +
 2 files changed, 4 insertions(+)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6e3ddf8..e7ceb6c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -238,6 +238,7 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 	hdr->saddr = fl6->saddr;
 	hdr->daddr = *first_hop;
 
+	skb->protocol = htons(ETH_P_IPV6);
 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;
 
@@ -1057,6 +1058,7 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 		/* initialize protocol header pointer */
 		skb->transport_header = skb->network_header + fragheaderlen;
 
+		skb->protocol = htons(ETH_P_IPV6);
 		skb->ip_summed = CHECKSUM_PARTIAL;
 		skb->csum = 0;
 	}
@@ -1359,6 +1361,7 @@ alloc_new_skb:
 			/*
 			 *	Fill in the control structures
 			 */
+			skb->protocol = htons(ETH_P_IPV6);
 			skb->ip_summed = CHECKSUM_NONE;
 			skb->csum = 0;
 			/* reserve for fragmentation and ipsec header */
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index c45f7a5..cdaed47 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -628,6 +628,7 @@ static int rawv6_send_hdrinc(struct sock *sk, void *from, int length,
 		goto error;
 	skb_reserve(skb, hlen);
 
+	skb->protocol = htons(ETH_P_IPV6);
 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;
 	skb_dst_set(skb, &rt->dst);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs
  2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
@ 2013-08-22 23:59   ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2013-08-22 23:59 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Steffen Klassert, David Miller, netdev

On Thu, 2013-08-22 at 21:54 +0200, Hannes Frederic Sowa wrote:
> Currently we don't initialize skb->protocol when transmitting data via
> tcp, raw(with and without inclhdr) or udp+ufo or appending data directly
> to the socket transmit queue (via ip6_append_data). This needs to be
> done so that we can get the correct mtu in the xfrm layer.
> 
> Setting of skb->protocol happens only in functions where we also have
> a transmitting socket and a new skb, so we don't overwrite old values.
> 
> Cc: Steffen Klassert <steffen.klassert@secunet.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---

Seems good to me.

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-22 13:53 ` Hannes Frederic Sowa
@ 2013-08-23  8:58   ` Steffen Klassert
  2013-08-23 11:03     ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Steffen Klassert @ 2013-08-23  8:58 UTC (permalink / raw)
  To: Hannes Frederic Sowa, David Miller, netdev

On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > Hannes,
> > 
> > I have two problematic commits from you in the ipsec tree. The first one is:
> > 
> > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > 
> > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > socket is at the skb.
> 
> I am currently testing this following patch. It should restore old behavior
> for ipv4 sockets.
> 
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index ac5b025..65d3529 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
>  
>  	if (sk && skb->protocol == htons(ETH_P_IPV6))
>  		return ip6_skb_dst_mtu(skb);
> -	else if (sk && skb->protocol == htons(ETH_P_IP))
> -		return ip_skb_dst_mtu(skb);
>  	return dst_mtu(skb_dst(skb));
>  }

This looks still fragile. xfrm_skb_dst_mtu() is called from
__xfrm6_output() and from xfrm4_tunnel_check_size().
We will have the same bug again as soon as somebody thinks that
it is save to call it from xfrm6_tunnel_check_size() too. So I
think it is better not to call it from xfrm4_tunnel_check_size().

Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
from __xfrm6_output(), we know that this is IPv6. So we can call
ip6_skb_dst_mtu() directly as it was before your change.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-23  8:58   ` Steffen Klassert
@ 2013-08-23 11:03     ` Hannes Frederic Sowa
  2013-08-23 11:34       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 11:03 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

Hello!

On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > Hannes,
> > > 
> > > I have two problematic commits from you in the ipsec tree. The first one is:
> > > 
> > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > > 
> > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > socket is at the skb.
> > 
> > I am currently testing this following patch. It should restore old behavior
> > for ipv4 sockets.
> > 
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index ac5b025..65d3529 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> >  
> >  	if (sk && skb->protocol == htons(ETH_P_IPV6))
> >  		return ip6_skb_dst_mtu(skb);
> > -	else if (sk && skb->protocol == htons(ETH_P_IP))
> > -		return ip_skb_dst_mtu(skb);
> >  	return dst_mtu(skb_dst(skb));
> >  }
> 
> This looks still fragile. xfrm_skb_dst_mtu() is called from
> __xfrm6_output() and from xfrm4_tunnel_check_size().
> We will have the same bug again as soon as somebody thinks that
> it is save to call it from xfrm6_tunnel_check_size() too. So I
> think it is better not to call it from xfrm4_tunnel_check_size().

Hm, I don't think I can follow you completly here. I searched for allocations
of ipv6 skbs (where they originated from a socket) and checked these
allocations also initialize the skb->protocol field (the second patch).

I wonder if ip6_skb_dst_mtu was correct all along and if we should just
switch to dst_mtu(skb_dst(skb)) in all cases?

> Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
> from __xfrm6_output(), we know that this is IPv6. So we can call
> ip6_skb_dst_mtu() directly as it was before your change.

The problem was that ip6_skb_dst_mtu dereferences the ipv6_pinfo socket to
check for IPV6_PMTUDISC_PROBE. If IPv4 is encapsulated in IPv6 and the packet
originated from the same host (sk != NULL) we call down to ip6_skb_dst_mtu.
This function checks that ipv6_pinfo is not NULL, so it worked by accident
currently.

My plan would be to check that all IPv4 skb allocations use the correct
skb->protocol, too, or if we can use the skb->inner_protocol field to mark
packets in the tunnel drivers from which protocol they originated.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-23 11:03     ` Hannes Frederic Sowa
@ 2013-08-23 11:34       ` Hannes Frederic Sowa
  2013-08-23 12:49         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 11:34 UTC (permalink / raw)
  To: Steffen Klassert, David Miller, netdev

On Fri, Aug 23, 2013 at 01:03:23PM +0200, Hannes Frederic Sowa wrote:
> Hello!
> 
> On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> > On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > > Hannes,
> > > > 
> > > > I have two problematic commits from you in the ipsec tree. The first one is:
> > > > 
> > > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > > > 
> > > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > > socket is at the skb.
> > > 
> > > I am currently testing this following patch. It should restore old behavior
> > > for ipv4 sockets.
> > > 
> > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > > index ac5b025..65d3529 100644
> > > --- a/include/net/xfrm.h
> > > +++ b/include/net/xfrm.h
> > > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> > >  
> > >  	if (sk && skb->protocol == htons(ETH_P_IPV6))
> > >  		return ip6_skb_dst_mtu(skb);
> > > -	else if (sk && skb->protocol == htons(ETH_P_IP))
> > > -		return ip_skb_dst_mtu(skb);
> > >  	return dst_mtu(skb_dst(skb));
> > >  }
> > 
> > This looks still fragile. xfrm_skb_dst_mtu() is called from
> > __xfrm6_output() and from xfrm4_tunnel_check_size().
> > We will have the same bug again as soon as somebody thinks that
> > it is save to call it from xfrm6_tunnel_check_size() too. So I
> > think it is better not to call it from xfrm4_tunnel_check_size().
> 
> Hm, I don't think I can follow you completly here. I searched for allocations
> of ipv6 skbs (where they originated from a socket) and checked these
> allocations also initialize the skb->protocol field (the second patch).
> 
> I wonder if ip6_skb_dst_mtu was correct all along and if we should just
> switch to dst_mtu(skb_dst(skb)) in all cases?

Ok, I got it.

How about just checking in __xfrm6_output if we actually have a packet
originated from an IPv6 socket so that we only replace the original call to
ip6_skb_dst_mtu(skb)?

> 
> > Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
> > from __xfrm6_output(), we know that this is IPv6. So we can call
> > ip6_skb_dst_mtu() directly as it was before your change.
> 
> The problem was that ip6_skb_dst_mtu dereferences the ipv6_pinfo socket to
> check for IPV6_PMTUDISC_PROBE. If IPv4 is encapsulated in IPv6 and the packet
> originated from the same host (sk != NULL) we call down to ip6_skb_dst_mtu.
> This function checks that ipv6_pinfo is not NULL, so it worked by accident
> currently.
> 
> My plan would be to check that all IPv4 skb allocations use the correct
> skb->protocol, too, or if we can use the skb->inner_protocol field to mark
> packets in the tunnel drivers from which protocol they originated.

Btw. this also let's us refer to the wrong mtu in case of an v4-mapped v6
socket transmitting over a address family changing tunnel over ipsec.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-23 11:34       ` Hannes Frederic Sowa
@ 2013-08-23 12:49         ` Hannes Frederic Sowa
  2013-08-26  9:41           ` Steffen Klassert
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 12:49 UTC (permalink / raw)
  To: Steffen Klassert, David Miller, netdev

On Fri, Aug 23, 2013 at 01:34:35PM +0200, Hannes Frederic Sowa wrote:
> On Fri, Aug 23, 2013 at 01:03:23PM +0200, Hannes Frederic Sowa wrote:
> > Hello!
> > 
> > On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> > > On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > > > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > > > Hannes,
> > > > > 
> > > > > I have two problematic commits from you in the ipsec tree. The first one is:
> > > > > 
> > > > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > > > > 
> > > > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > > > socket is at the skb.
> > > > 
> > > > I am currently testing this following patch. It should restore old behavior
> > > > for ipv4 sockets.
> > > > 
> > > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > > > index ac5b025..65d3529 100644
> > > > --- a/include/net/xfrm.h
> > > > +++ b/include/net/xfrm.h
> > > > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> > > >  
> > > >  	if (sk && skb->protocol == htons(ETH_P_IPV6))
> > > >  		return ip6_skb_dst_mtu(skb);
> > > > -	else if (sk && skb->protocol == htons(ETH_P_IP))
> > > > -		return ip_skb_dst_mtu(skb);
> > > >  	return dst_mtu(skb_dst(skb));
> > > >  }
> > > 
> > > This looks still fragile. xfrm_skb_dst_mtu() is called from
> > > __xfrm6_output() and from xfrm4_tunnel_check_size().
> > > We will have the same bug again as soon as somebody thinks that
> > > it is save to call it from xfrm6_tunnel_check_size() too. So I
> > > think it is better not to call it from xfrm4_tunnel_check_size().
> > 
> > Hm, I don't think I can follow you completly here. I searched for allocations
> > of ipv6 skbs (where they originated from a socket) and checked these
> > allocations also initialize the skb->protocol field (the second patch).
> > 
> > I wonder if ip6_skb_dst_mtu was correct all along and if we should just
> > switch to dst_mtu(skb_dst(skb)) in all cases?
> 
> Ok, I got it.
> 
> How about just checking in __xfrm6_output if we actually have a packet
> originated from an IPv6 socket so that we only replace the original call to
> ip6_skb_dst_mtu(skb)?

This could be the replacement for patch 1/2 to reassemble old behaviour
without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.

I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
day and fallback to dst_mtu(dst->path) if possible. So I don't know if
removing xfrm_skb_dst_mtu is good style and would just make churn in the git
history. What do you think?

[PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu

In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
helper for safe determination of mtu") I switched the determination of
ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
never correct for ipv4 ipsec.

This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
("xfrm: introduce helper for safe determination of mtu").

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/xfrm.h      | 12 ------------
 net/ipv4/xfrm4_output.c |  2 +-
 net/ipv6/xfrm6_output.c |  8 +++++---
 3 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..e823786 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -20,7 +20,6 @@
 #include <net/route.h>
 #include <net/ipv6.h>
 #include <net/ip6_fib.h>
-#include <net/ip6_route.h>
 #include <net/flow.h>
 
 #include <linux/interrupt.h>
@@ -1724,15 +1723,4 @@ static inline int xfrm_mark_put(struct sk_buff *skb, const struct xfrm_mark *m)
 	return ret;
 }
 
-static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
-{
-	struct sock *sk = skb->sk;
-
-	if (sk && skb->protocol == htons(ETH_P_IPV6))
-		return ip6_skb_dst_mtu(skb);
-	else if (sk && skb->protocol == htons(ETH_P_IP))
-		return ip_skb_dst_mtu(skb);
-	return dst_mtu(skb_dst(skb));
-}
-
 #endif	/* _NET_XFRM_H */
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 80baf4a..baa0f63 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -28,7 +28,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 	if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) || skb->local_df)
 		goto out;
 
-	mtu = xfrm_skb_dst_mtu(skb);
+	mtu = dst_mtu(skb_dst(skb));
 	if (skb->len > mtu) {
 		if (skb->sk)
 			xfrm_local_error(skb, mtu);
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index e092e30..6cd625e 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -140,10 +140,12 @@ static int __xfrm6_output(struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb_dst(skb);
 	struct xfrm_state *x = dst->xfrm;
-	int mtu = xfrm_skb_dst_mtu(skb);
+	int mtu;
 
-	if (mtu < IPV6_MIN_MTU)
-		mtu = IPV6_MIN_MTU;
+	if (skb->protocol == htons(ETH_P_IPV6))
+		mtu = ip6_skb_dst_mtu(skb);
+	else
+		mtu = dst_mtu(skb_dst(skb));
 
 	if (skb->len > mtu && xfrm6_local_dontfrag(skb)) {
 		xfrm6_local_rxpmtu(skb, mtu);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-23 12:49         ` Hannes Frederic Sowa
@ 2013-08-26  9:41           ` Steffen Klassert
  2013-08-26 10:46             ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Steffen Klassert @ 2013-08-26  9:41 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: David Miller, netdev

On Fri, Aug 23, 2013 at 02:49:11PM +0200, Hannes Frederic Sowa wrote:
> 
> This could be the replacement for patch 1/2 to reassemble old behaviour
> without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.
> 
> I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
> day and fallback to dst_mtu(dst->path) if possible. So I don't know if
> removing xfrm_skb_dst_mtu is good style and would just make churn in the git
> history. What do you think?

Currently I think we can call dst_mtu() unconditionally from
__xfrm6_output(), then we would not need xfrm_skb_dst_mtu().
But this needs further investigation, IPsec pmtu discovery
was frequently broken in the past and I don't want to break
it again.

> 
> [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
> 
> In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
> helper for safe determination of mtu") I switched the determination of
> ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
> case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
> never correct for ipv4 ipsec.
> 
> This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
> ("xfrm: introduce helper for safe determination of mtu").
> 

I think with this and you other patch, we get the all the
interfamily tunnel problems fixed for now. Everything else
should be done in ipsec-next.

Please resend the whole patchset, so we can get it fixed soon.

Tanks a lot!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Problematic commits in the ipsec tree
  2013-08-26  9:41           ` Steffen Klassert
@ 2013-08-26 10:46             ` Hannes Frederic Sowa
  0 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-26 10:46 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

On Mon, Aug 26, 2013 at 11:41:45AM +0200, Steffen Klassert wrote:
> On Fri, Aug 23, 2013 at 02:49:11PM +0200, Hannes Frederic Sowa wrote:
> > 
> > This could be the replacement for patch 1/2 to reassemble old behaviour
> > without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.
> > 
> > I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
> > day and fallback to dst_mtu(dst->path) if possible. So I don't know if
> > removing xfrm_skb_dst_mtu is good style and would just make churn in the git
> > history. What do you think?
> 
> Currently I think we can call dst_mtu() unconditionally from
> __xfrm6_output(), then we would not need xfrm_skb_dst_mtu().
> But this needs further investigation, IPsec pmtu discovery
> was frequently broken in the past and I don't want to break
> it again.

My idea was something like

|	  struct ipv6_pinfo *np = ...;
|         int mtu = (np && np->pmtudisc == IPV6_PMTUDISC_PROBE) ?
|                   dst_mtu(skb_dst(skb)->path) : dst_mtu(skb_dst(skb));

But I don't know if this does actually anything good and where the dispatch of
dst_mtu goes to. My idea was to avoid the dst_metric_raw(dst, RTAX_MTU)
call in xfrm_mtu in case of IPV6_PMTUDISC_PROBE.

> > [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
> > 
> > In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
> > helper for safe determination of mtu") I switched the determination of
> > ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
> > case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
> > never correct for ipv4 ipsec.
> > 
> > This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
> > ("xfrm: introduce helper for safe determination of mtu").
> > 
> 
> I think with this and you other patch, we get the all the
> interfamily tunnel problems fixed for now. Everything else
> should be done in ipsec-next.

Fully ACK.

> Please resend the whole patchset, so we can get it fixed soon.
> 
> Tanks a lot!

Sorry for holding back your tree for so long to get merged.

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-08-26 10:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
2013-08-22 11:21 ` Hannes Frederic Sowa
2013-08-22 13:53 ` Hannes Frederic Sowa
2013-08-23  8:58   ` Steffen Klassert
2013-08-23 11:03     ` Hannes Frederic Sowa
2013-08-23 11:34       ` Hannes Frederic Sowa
2013-08-23 12:49         ` Hannes Frederic Sowa
2013-08-26  9:41           ` Steffen Klassert
2013-08-26 10:46             ` Hannes Frederic Sowa
2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
2013-08-22 23:59   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).