* Problematic commits in the ipsec tree
@ 2013-08-22 10:47 Steffen Klassert
2013-08-22 11:21 ` Hannes Frederic Sowa
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Steffen Klassert @ 2013-08-22 10:47 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: David Miller, netdev
Hannes,
I have two problematic commits from you in the ipsec tree. The first one is:
commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
This breakes pmtu discovery for IPv4 because now we use the device mtu
instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
socket is at the skb.
The second is:
commit 844d48746 (xfrm: choose protocol family by skb protocol)
This breaks pmtu discovery for IPv6 too because skb->protocol can be null
in __xfrm6_output().
We need a solution soon, or I have to revert or remove these commits from
the ipsec tree.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
@ 2013-08-22 11:21 ` Hannes Frederic Sowa
2013-08-22 13:53 ` Hannes Frederic Sowa
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 11:21 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev
On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> I have two problematic commits from you in the ipsec tree. The first one is:
>
> commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
>
> This breakes pmtu discovery for IPv4 because now we use the device mtu
> instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> socket is at the skb.
>
> The second is:
>
> commit 844d48746 (xfrm: choose protocol family by skb protocol)
>
> This breaks pmtu discovery for IPv6 too because skb->protocol can be null
> in __xfrm6_output().
>
> We need a solution soon, or I have to revert or remove these commits from
> the ipsec tree.
Thanks for letting me know. I'll work on it right away.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
2013-08-22 11:21 ` Hannes Frederic Sowa
@ 2013-08-22 13:53 ` Hannes Frederic Sowa
2013-08-23 8:58 ` Steffen Klassert
2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
3 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 13:53 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev
On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> Hannes,
>
> I have two problematic commits from you in the ipsec tree. The first one is:
>
> commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
>
> This breakes pmtu discovery for IPv4 because now we use the device mtu
> instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> socket is at the skb.
I am currently testing this following patch. It should restore old behavior
for ipv4 sockets.
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..65d3529 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
if (sk && skb->protocol == htons(ETH_P_IPV6))
return ip6_skb_dst_mtu(skb);
- else if (sk && skb->protocol == htons(ETH_P_IP))
- return ip_skb_dst_mtu(skb);
return dst_mtu(skb_dst(skb));
}
Actually I would like to extend this check so that we only take the
dst_mtu(dst->path) in case of IP_PMTUDISC_PROBE. But that would be a
patch for ipsec-next. I was not sure if we always dispatch to xfrm_mtu
in this code-path.
>
> The second is:
>
> commit 844d48746 (xfrm: choose protocol family by skb protocol)
>
> This breaks pmtu discovery for IPv6 too because skb->protocol can be null
> in __xfrm6_output().
I am currently checking if there are side-effects if we set skb->protocol in
raw sockets. This should solve this problem. Btw. this is also the case for
IPv4.
Hope to have tested patches later today.
Thanks,
Hannes
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
2013-08-22 11:21 ` Hannes Frederic Sowa
2013-08-22 13:53 ` Hannes Frederic Sowa
@ 2013-08-22 19:53 ` Hannes Frederic Sowa
2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
3 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 19:53 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev
In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
helper for safe determination of mtu") I switched the determination of
ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
never correct for ipv4 ipsec.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/xfrm.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..65d3529 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
if (sk && skb->protocol == htons(ETH_P_IPV6))
return ip6_skb_dst_mtu(skb);
- else if (sk && skb->protocol == htons(ETH_P_IP))
- return ip_skb_dst_mtu(skb);
return dst_mtu(skb_dst(skb));
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
` (2 preceding siblings ...)
2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
@ 2013-08-22 19:54 ` Hannes Frederic Sowa
2013-08-22 23:59 ` Eric Dumazet
3 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-22 19:54 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev, eric.dumazet
Currently we don't initialize skb->protocol when transmitting data via
tcp, raw(with and without inclhdr) or udp+ufo or appending data directly
to the socket transmit queue (via ip6_append_data). This needs to be
done so that we can get the correct mtu in the xfrm layer.
Setting of skb->protocol happens only in functions where we also have
a transmitting socket and a new skb, so we don't overwrite old values.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
I would love to have more reviews for this. I hope to have checked all
the code-paths down from these routines to not cause any side-effects.
Testing showed no problems here. Sorry for only testing with udp only
before.
net/ipv6/ip6_output.c | 3 +++
net/ipv6/raw.c | 1 +
2 files changed, 4 insertions(+)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6e3ddf8..e7ceb6c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -238,6 +238,7 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
hdr->saddr = fl6->saddr;
hdr->daddr = *first_hop;
+ skb->protocol = htons(ETH_P_IPV6);
skb->priority = sk->sk_priority;
skb->mark = sk->sk_mark;
@@ -1057,6 +1058,7 @@ static inline int ip6_ufo_append_data(struct sock *sk,
/* initialize protocol header pointer */
skb->transport_header = skb->network_header + fragheaderlen;
+ skb->protocol = htons(ETH_P_IPV6);
skb->ip_summed = CHECKSUM_PARTIAL;
skb->csum = 0;
}
@@ -1359,6 +1361,7 @@ alloc_new_skb:
/*
* Fill in the control structures
*/
+ skb->protocol = htons(ETH_P_IPV6);
skb->ip_summed = CHECKSUM_NONE;
skb->csum = 0;
/* reserve for fragmentation and ipsec header */
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index c45f7a5..cdaed47 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -628,6 +628,7 @@ static int rawv6_send_hdrinc(struct sock *sk, void *from, int length,
goto error;
skb_reserve(skb, hlen);
+ skb->protocol = htons(ETH_P_IPV6);
skb->priority = sk->sk_priority;
skb->mark = sk->sk_mark;
skb_dst_set(skb, &rt->dst);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs
2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
@ 2013-08-22 23:59 ` Eric Dumazet
0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2013-08-22 23:59 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Steffen Klassert, David Miller, netdev
On Thu, 2013-08-22 at 21:54 +0200, Hannes Frederic Sowa wrote:
> Currently we don't initialize skb->protocol when transmitting data via
> tcp, raw(with and without inclhdr) or udp+ufo or appending data directly
> to the socket transmit queue (via ip6_append_data). This needs to be
> done so that we can get the correct mtu in the xfrm layer.
>
> Setting of skb->protocol happens only in functions where we also have
> a transmitting socket and a new skb, so we don't overwrite old values.
>
> Cc: Steffen Klassert <steffen.klassert@secunet.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
Seems good to me.
Acked-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-22 13:53 ` Hannes Frederic Sowa
@ 2013-08-23 8:58 ` Steffen Klassert
2013-08-23 11:03 ` Hannes Frederic Sowa
0 siblings, 1 reply; 12+ messages in thread
From: Steffen Klassert @ 2013-08-23 8:58 UTC (permalink / raw)
To: Hannes Frederic Sowa, David Miller, netdev
On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > Hannes,
> >
> > I have two problematic commits from you in the ipsec tree. The first one is:
> >
> > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> >
> > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > socket is at the skb.
>
> I am currently testing this following patch. It should restore old behavior
> for ipv4 sockets.
>
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index ac5b025..65d3529 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
>
> if (sk && skb->protocol == htons(ETH_P_IPV6))
> return ip6_skb_dst_mtu(skb);
> - else if (sk && skb->protocol == htons(ETH_P_IP))
> - return ip_skb_dst_mtu(skb);
> return dst_mtu(skb_dst(skb));
> }
This looks still fragile. xfrm_skb_dst_mtu() is called from
__xfrm6_output() and from xfrm4_tunnel_check_size().
We will have the same bug again as soon as somebody thinks that
it is save to call it from xfrm6_tunnel_check_size() too. So I
think it is better not to call it from xfrm4_tunnel_check_size().
Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
from __xfrm6_output(), we know that this is IPv6. So we can call
ip6_skb_dst_mtu() directly as it was before your change.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-23 8:58 ` Steffen Klassert
@ 2013-08-23 11:03 ` Hannes Frederic Sowa
2013-08-23 11:34 ` Hannes Frederic Sowa
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 11:03 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev
Hello!
On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > Hannes,
> > >
> > > I have two problematic commits from you in the ipsec tree. The first one is:
> > >
> > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > >
> > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > socket is at the skb.
> >
> > I am currently testing this following patch. It should restore old behavior
> > for ipv4 sockets.
> >
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index ac5b025..65d3529 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> >
> > if (sk && skb->protocol == htons(ETH_P_IPV6))
> > return ip6_skb_dst_mtu(skb);
> > - else if (sk && skb->protocol == htons(ETH_P_IP))
> > - return ip_skb_dst_mtu(skb);
> > return dst_mtu(skb_dst(skb));
> > }
>
> This looks still fragile. xfrm_skb_dst_mtu() is called from
> __xfrm6_output() and from xfrm4_tunnel_check_size().
> We will have the same bug again as soon as somebody thinks that
> it is save to call it from xfrm6_tunnel_check_size() too. So I
> think it is better not to call it from xfrm4_tunnel_check_size().
Hm, I don't think I can follow you completly here. I searched for allocations
of ipv6 skbs (where they originated from a socket) and checked these
allocations also initialize the skb->protocol field (the second patch).
I wonder if ip6_skb_dst_mtu was correct all along and if we should just
switch to dst_mtu(skb_dst(skb)) in all cases?
> Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
> from __xfrm6_output(), we know that this is IPv6. So we can call
> ip6_skb_dst_mtu() directly as it was before your change.
The problem was that ip6_skb_dst_mtu dereferences the ipv6_pinfo socket to
check for IPV6_PMTUDISC_PROBE. If IPv4 is encapsulated in IPv6 and the packet
originated from the same host (sk != NULL) we call down to ip6_skb_dst_mtu.
This function checks that ipv6_pinfo is not NULL, so it worked by accident
currently.
My plan would be to check that all IPv4 skb allocations use the correct
skb->protocol, too, or if we can use the skb->inner_protocol field to mark
packets in the tunnel drivers from which protocol they originated.
Greetings,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-23 11:03 ` Hannes Frederic Sowa
@ 2013-08-23 11:34 ` Hannes Frederic Sowa
2013-08-23 12:49 ` Hannes Frederic Sowa
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 11:34 UTC (permalink / raw)
To: Steffen Klassert, David Miller, netdev
On Fri, Aug 23, 2013 at 01:03:23PM +0200, Hannes Frederic Sowa wrote:
> Hello!
>
> On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> > On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > > Hannes,
> > > >
> > > > I have two problematic commits from you in the ipsec tree. The first one is:
> > > >
> > > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > > >
> > > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > > socket is at the skb.
> > >
> > > I am currently testing this following patch. It should restore old behavior
> > > for ipv4 sockets.
> > >
> > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > > index ac5b025..65d3529 100644
> > > --- a/include/net/xfrm.h
> > > +++ b/include/net/xfrm.h
> > > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> > >
> > > if (sk && skb->protocol == htons(ETH_P_IPV6))
> > > return ip6_skb_dst_mtu(skb);
> > > - else if (sk && skb->protocol == htons(ETH_P_IP))
> > > - return ip_skb_dst_mtu(skb);
> > > return dst_mtu(skb_dst(skb));
> > > }
> >
> > This looks still fragile. xfrm_skb_dst_mtu() is called from
> > __xfrm6_output() and from xfrm4_tunnel_check_size().
> > We will have the same bug again as soon as somebody thinks that
> > it is save to call it from xfrm6_tunnel_check_size() too. So I
> > think it is better not to call it from xfrm4_tunnel_check_size().
>
> Hm, I don't think I can follow you completly here. I searched for allocations
> of ipv6 skbs (where they originated from a socket) and checked these
> allocations also initialize the skb->protocol field (the second patch).
>
> I wonder if ip6_skb_dst_mtu was correct all along and if we should just
> switch to dst_mtu(skb_dst(skb)) in all cases?
Ok, I got it.
How about just checking in __xfrm6_output if we actually have a packet
originated from an IPv6 socket so that we only replace the original call to
ip6_skb_dst_mtu(skb)?
>
> > Also, why do we need xfrm_skb_dst_mtu() at all? When it is called
> > from __xfrm6_output(), we know that this is IPv6. So we can call
> > ip6_skb_dst_mtu() directly as it was before your change.
>
> The problem was that ip6_skb_dst_mtu dereferences the ipv6_pinfo socket to
> check for IPV6_PMTUDISC_PROBE. If IPv4 is encapsulated in IPv6 and the packet
> originated from the same host (sk != NULL) we call down to ip6_skb_dst_mtu.
> This function checks that ipv6_pinfo is not NULL, so it worked by accident
> currently.
>
> My plan would be to check that all IPv4 skb allocations use the correct
> skb->protocol, too, or if we can use the skb->inner_protocol field to mark
> packets in the tunnel drivers from which protocol they originated.
Btw. this also let's us refer to the wrong mtu in case of an v4-mapped v6
socket transmitting over a address family changing tunnel over ipsec.
Greetings,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-23 11:34 ` Hannes Frederic Sowa
@ 2013-08-23 12:49 ` Hannes Frederic Sowa
2013-08-26 9:41 ` Steffen Klassert
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-23 12:49 UTC (permalink / raw)
To: Steffen Klassert, David Miller, netdev
On Fri, Aug 23, 2013 at 01:34:35PM +0200, Hannes Frederic Sowa wrote:
> On Fri, Aug 23, 2013 at 01:03:23PM +0200, Hannes Frederic Sowa wrote:
> > Hello!
> >
> > On Fri, Aug 23, 2013 at 10:58:07AM +0200, Steffen Klassert wrote:
> > > On Thu, Aug 22, 2013 at 03:53:42PM +0200, Hannes Frederic Sowa wrote:
> > > > On Thu, Aug 22, 2013 at 12:47:24PM +0200, Steffen Klassert wrote:
> > > > > Hannes,
> > > > >
> > > > > I have two problematic commits from you in the ipsec tree. The first one is:
> > > > >
> > > > > commit 0ea9d5e3e (xfrm: introduce helper for safe determination of mtu)
> > > > >
> > > > > This breakes pmtu discovery for IPv4 because now we use the device mtu
> > > > > instead of the reduced IPsec mtu in xfrm4_tunnel_check_size() if a IPv4
> > > > > socket is at the skb.
> > > >
> > > > I am currently testing this following patch. It should restore old behavior
> > > > for ipv4 sockets.
> > > >
> > > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > > > index ac5b025..65d3529 100644
> > > > --- a/include/net/xfrm.h
> > > > +++ b/include/net/xfrm.h
> > > > @@ -1730,8 +1730,6 @@ static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
> > > >
> > > > if (sk && skb->protocol == htons(ETH_P_IPV6))
> > > > return ip6_skb_dst_mtu(skb);
> > > > - else if (sk && skb->protocol == htons(ETH_P_IP))
> > > > - return ip_skb_dst_mtu(skb);
> > > > return dst_mtu(skb_dst(skb));
> > > > }
> > >
> > > This looks still fragile. xfrm_skb_dst_mtu() is called from
> > > __xfrm6_output() and from xfrm4_tunnel_check_size().
> > > We will have the same bug again as soon as somebody thinks that
> > > it is save to call it from xfrm6_tunnel_check_size() too. So I
> > > think it is better not to call it from xfrm4_tunnel_check_size().
> >
> > Hm, I don't think I can follow you completly here. I searched for allocations
> > of ipv6 skbs (where they originated from a socket) and checked these
> > allocations also initialize the skb->protocol field (the second patch).
> >
> > I wonder if ip6_skb_dst_mtu was correct all along and if we should just
> > switch to dst_mtu(skb_dst(skb)) in all cases?
>
> Ok, I got it.
>
> How about just checking in __xfrm6_output if we actually have a packet
> originated from an IPv6 socket so that we only replace the original call to
> ip6_skb_dst_mtu(skb)?
This could be the replacement for patch 1/2 to reassemble old behaviour
without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.
I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
day and fallback to dst_mtu(dst->path) if possible. So I don't know if
removing xfrm_skb_dst_mtu is good style and would just make churn in the git
history. What do you think?
[PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
helper for safe determination of mtu") I switched the determination of
ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
never correct for ipv4 ipsec.
This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
("xfrm: introduce helper for safe determination of mtu").
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/xfrm.h | 12 ------------
net/ipv4/xfrm4_output.c | 2 +-
net/ipv6/xfrm6_output.c | 8 +++++---
3 files changed, 6 insertions(+), 16 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac5b025..e823786 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -20,7 +20,6 @@
#include <net/route.h>
#include <net/ipv6.h>
#include <net/ip6_fib.h>
-#include <net/ip6_route.h>
#include <net/flow.h>
#include <linux/interrupt.h>
@@ -1724,15 +1723,4 @@ static inline int xfrm_mark_put(struct sk_buff *skb, const struct xfrm_mark *m)
return ret;
}
-static inline int xfrm_skb_dst_mtu(struct sk_buff *skb)
-{
- struct sock *sk = skb->sk;
-
- if (sk && skb->protocol == htons(ETH_P_IPV6))
- return ip6_skb_dst_mtu(skb);
- else if (sk && skb->protocol == htons(ETH_P_IP))
- return ip_skb_dst_mtu(skb);
- return dst_mtu(skb_dst(skb));
-}
-
#endif /* _NET_XFRM_H */
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 80baf4a..baa0f63 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -28,7 +28,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) || skb->local_df)
goto out;
- mtu = xfrm_skb_dst_mtu(skb);
+ mtu = dst_mtu(skb_dst(skb));
if (skb->len > mtu) {
if (skb->sk)
xfrm_local_error(skb, mtu);
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index e092e30..6cd625e 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -140,10 +140,12 @@ static int __xfrm6_output(struct sk_buff *skb)
{
struct dst_entry *dst = skb_dst(skb);
struct xfrm_state *x = dst->xfrm;
- int mtu = xfrm_skb_dst_mtu(skb);
+ int mtu;
- if (mtu < IPV6_MIN_MTU)
- mtu = IPV6_MIN_MTU;
+ if (skb->protocol == htons(ETH_P_IPV6))
+ mtu = ip6_skb_dst_mtu(skb);
+ else
+ mtu = dst_mtu(skb_dst(skb));
if (skb->len > mtu && xfrm6_local_dontfrag(skb)) {
xfrm6_local_rxpmtu(skb, mtu);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-23 12:49 ` Hannes Frederic Sowa
@ 2013-08-26 9:41 ` Steffen Klassert
2013-08-26 10:46 ` Hannes Frederic Sowa
0 siblings, 1 reply; 12+ messages in thread
From: Steffen Klassert @ 2013-08-26 9:41 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: David Miller, netdev
On Fri, Aug 23, 2013 at 02:49:11PM +0200, Hannes Frederic Sowa wrote:
>
> This could be the replacement for patch 1/2 to reassemble old behaviour
> without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.
>
> I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
> day and fallback to dst_mtu(dst->path) if possible. So I don't know if
> removing xfrm_skb_dst_mtu is good style and would just make churn in the git
> history. What do you think?
Currently I think we can call dst_mtu() unconditionally from
__xfrm6_output(), then we would not need xfrm_skb_dst_mtu().
But this needs further investigation, IPsec pmtu discovery
was frequently broken in the past and I don't want to break
it again.
>
> [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
>
> In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
> helper for safe determination of mtu") I switched the determination of
> ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
> case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
> never correct for ipv4 ipsec.
>
> This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
> ("xfrm: introduce helper for safe determination of mtu").
>
I think with this and you other patch, we get the all the
interfamily tunnel problems fixed for now. Everything else
should be done in ipsec-next.
Please resend the whole patchset, so we can get it fixed soon.
Tanks a lot!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Problematic commits in the ipsec tree
2013-08-26 9:41 ` Steffen Klassert
@ 2013-08-26 10:46 ` Hannes Frederic Sowa
0 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-26 10:46 UTC (permalink / raw)
To: Steffen Klassert; +Cc: David Miller, netdev
On Mon, Aug 26, 2013 at 11:41:45AM +0200, Steffen Klassert wrote:
> On Fri, Aug 23, 2013 at 02:49:11PM +0200, Hannes Frederic Sowa wrote:
> >
> > This could be the replacement for patch 1/2 to reassemble old behaviour
> > without touching ip6_skb_dst_mtu if the socket type is not an IPv6 one.
> >
> > I would still like to look if we could correctly handle *_PMTUDISC_PROBE one
> > day and fallback to dst_mtu(dst->path) if possible. So I don't know if
> > removing xfrm_skb_dst_mtu is good style and would just make churn in the git
> > history. What do you think?
>
> Currently I think we can call dst_mtu() unconditionally from
> __xfrm6_output(), then we would not need xfrm_skb_dst_mtu().
> But this needs further investigation, IPsec pmtu discovery
> was frequently broken in the past and I don't want to break
> it again.
My idea was something like
| struct ipv6_pinfo *np = ...;
| int mtu = (np && np->pmtudisc == IPV6_PMTUDISC_PROBE) ?
| dst_mtu(skb_dst(skb)->path) : dst_mtu(skb_dst(skb));
But I don't know if this does actually anything good and where the dispatch of
dst_mtu goes to. My idea was to avoid the dst_metric_raw(dst, RTAX_MTU)
call in xfrm_mtu in case of IPV6_PMTUDISC_PROBE.
> > [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu
> >
> > In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
> > helper for safe determination of mtu") I switched the determination of
> > ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
> > case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
> > never correct for ipv4 ipsec.
> >
> > This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
> > ("xfrm: introduce helper for safe determination of mtu").
> >
>
> I think with this and you other patch, we get the all the
> interfamily tunnel problems fixed for now. Everything else
> should be done in ipsec-next.
Fully ACK.
> Please resend the whole patchset, so we can get it fixed soon.
>
> Tanks a lot!
Sorry for holding back your tree for so long to get merged.
Thanks,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-08-26 10:46 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-22 10:47 Problematic commits in the ipsec tree Steffen Klassert
2013-08-22 11:21 ` Hannes Frederic Sowa
2013-08-22 13:53 ` Hannes Frederic Sowa
2013-08-23 8:58 ` Steffen Klassert
2013-08-23 11:03 ` Hannes Frederic Sowa
2013-08-23 11:34 ` Hannes Frederic Sowa
2013-08-23 12:49 ` Hannes Frederic Sowa
2013-08-26 9:41 ` Steffen Klassert
2013-08-26 10:46 ` Hannes Frederic Sowa
2013-08-22 19:53 ` [PATCH ipsec 1/2] xfrm: revert ipv4 mtu determination to dst_mtu Hannes Frederic Sowa
2013-08-22 19:54 ` [PATCH ipsec 2/2] ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs Hannes Frederic Sowa
2013-08-22 23:59 ` Eric Dumazet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).