netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
@ 2012-02-13  8:10 Li Wei
  2012-02-14 11:04 ` Steffen Klassert
  0 siblings, 1 reply; 16+ messages in thread
From: Li Wei @ 2012-02-13  8:10 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev

Hi, Steffen!

I found some problem while doing networking tests with IPSec that
the first fragment doesn't use the max MTU to fill payload, but with
20 bytes smaller. When I reverted your commit 299b0767(ipv6: Fix 
IPsec slowpath fragmentation problem), things goes well.

Would you so kindly to point me out what the commit did, because I
think the original implementation had taken IPSec header and tailer
into account.

Thanks!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
  2012-02-13  8:10 Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem) Li Wei
@ 2012-02-14 11:04 ` Steffen Klassert
  2012-02-15  7:00   ` Li Wei
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2012-02-14 11:04 UTC (permalink / raw)
  To: Li Wei; +Cc: netdev

On Mon, Feb 13, 2012 at 04:10:39PM +0800, Li Wei wrote:
> Hi, Steffen!
> 
> I found some problem while doing networking tests with IPSec that
> the first fragment doesn't use the max MTU to fill payload, but with
> 20 bytes smaller. When I reverted your commit 299b0767(ipv6: Fix 
> IPsec slowpath fragmentation problem), things goes well.
> 
> Would you so kindly to point me out what the commit did, because I
> think the original implementation had taken IPSec header and tailer
> into account.
> 

Without this patch we used always the slow path in ip6_fragment()
due to a miscalculation of the packet lenght in ip6_append_data().

This patch just makes use of the reduced IPsec mtu, and adapts
the IPsec header handling to have enought headroom on the skb.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
  2012-02-14 11:04 ` Steffen Klassert
@ 2012-02-15  7:00   ` Li Wei
  2012-02-15 10:40     ` Steffen Klassert
  0 siblings, 1 reply; 16+ messages in thread
From: Li Wei @ 2012-02-15  7:00 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev

Steffen Klassert wrote:
> On Mon, Feb 13, 2012 at 04:10:39PM +0800, Li Wei wrote:
>> Hi, Steffen!
>>
>> I found some problem while doing networking tests with IPSec that
>> the first fragment doesn't use the max MTU to fill payload, but with
>> 20 bytes smaller. When I reverted your commit 299b0767(ipv6: Fix 
>> IPsec slowpath fragmentation problem), things goes well.
>>
>> Would you so kindly to point me out what the commit did, because I
>> think the original implementation had taken IPSec header and tailer
>> into account.
>>
> 
> Without this patch we used always the slow path in ip6_fragment()
> due to a miscalculation of the packet lenght in ip6_append_data().
> 
> This patch just makes use of the reduced IPsec mtu, and adapts
> the IPsec header handling to have enought headroom on the skb.
> 
> 

Hi Steffen,

Thank you for your reply!

I see in your patch that you use the "mtu" of &rt->dst (which taken IPSec
into account) instead of rt->dst.path, but the "exthdrlen" and "dst_exthdrlen"
things process IPSec again. Does some duplication there?

After reverted the patch and put some "printk" things in the slow_path of
ip6_fragment(), setup IPSec transport mode between two hosts, when sending
some echo request which exceeds the MTU, I don't see any "printk" in slow_path
outputed. Could you tell me how to reproduce the slow_path things?

Thanks,
Wei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
  2012-02-15  7:00   ` Li Wei
@ 2012-02-15 10:40     ` Steffen Klassert
  2012-05-14  3:21       ` [PATCH] ipv6: fix incorrect ipsec transport mode fragment Gao feng
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2012-02-15 10:40 UTC (permalink / raw)
  To: Li Wei; +Cc: netdev

On Wed, Feb 15, 2012 at 03:00:04PM +0800, Li Wei wrote:
> 
> Hi Steffen,
> 
> Thank you for your reply!
> 
> I see in your patch that you use the "mtu" of &rt->dst (which taken IPSec
> into account) instead of rt->dst.path, but the "exthdrlen" and "dst_exthdrlen"
> things process IPSec again. Does some duplication there?
> 
> After reverted the patch and put some "printk" things in the slow_path of
> ip6_fragment(), setup IPSec transport mode between two hosts, when sending
> some echo request which exceeds the MTU, I don't see any "printk" in slow_path
> outputed. Could you tell me how to reproduce the slow_path things?
> 

Ok, I see what's going on. The slowpath fragmentation problem appeared
in tunnel mode. This is because commit ad0081e43a 
"ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
changed tunnel mode to do fragmentation before the transformation
while transport mode still does fragmentation after transformation.
Now, tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.

Not quite sure what to do here. We would have the information
to check for tunnel/transport mode when we calculate the packet
lenght in ip6_append_data(), but it would look quite ugly to
search through the xfrm state bundle to figure out which mode
this is using.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-02-15 10:40     ` Steffen Klassert
@ 2012-05-14  3:21       ` Gao feng
  2012-05-14 13:05         ` Steffen Klassert
  2012-05-26 11:30         ` [PATCH v2] ipv6: fix incorrect ipsec fragment Gao feng
  0 siblings, 2 replies; 16+ messages in thread
From: Gao feng @ 2012-05-14  3:21 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, davem, lw, Gao feng

Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
the fragment of ipsec transport mode packets is incorrect.
because tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.

so modify mtu and maxfraglen base on ipsec mode and if fragment is first
or last.

with my test,it work well and does not trigger slow fragment path.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/ipv6/ip6_output.c |   80 +++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index b7ca461..9416887 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1191,19 +1191,23 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct inet_cork *cork;
 	struct sk_buff *skb;
-	unsigned int maxfraglen, fragheaderlen;
+	unsigned int maxfraglen, maxfraglen_prev, fragheaderlen;
 	int exthdrlen;
 	int dst_exthdrlen;
 	int hh_len;
-	int mtu;
+	int mtu, mtu_prev;
 	int copy;
 	int err;
 	int offset = 0;
 	int csummode = CHECKSUM_NONE;
 	__u8 tx_flags = 0;
-
+	bool transport_mode = false;
+	struct xfrm_state *x = rt->dst.xfrm;
 	if (flags&MSG_PROBE)
 		return 0;
+	if (x && x->props.mode == XFRM_MODE_TRANSPORT)
+		transport_mode = true;
+
 	cork = &inet->cork.base;
 	if (skb_queue_empty(&sk->sk_write_queue)) {
 		/*
@@ -1248,13 +1252,17 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 		inet->cork.fl.u.ip6 = *fl6;
 		np->cork.hop_limit = hlimit;
 		np->cork.tclass = tclass;
-		mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
-		      rt->dst.dev->mtu : dst_mtu(&rt->dst);
+		if (transport_mode)
+			mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
+			      rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+		else
+			mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
+			      rt->dst.dev->mtu : dst_mtu(&rt->dst);
 		if (np->frag_size < mtu) {
 			if (np->frag_size)
 				mtu = np->frag_size;
 		}
-		cork->fragsize = mtu;
+		mtu_prev = cork->fragsize = mtu;
 		if (dst_allfrag(rt->dst.path))
 			cork->flags |= IPCORK_ALLFRAG;
 		cork->length = 0;
@@ -1271,14 +1279,15 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 		transhdrlen = 0;
 		exthdrlen = 0;
 		dst_exthdrlen = 0;
-		mtu = cork->fragsize;
+		mtu_prev = mtu = cork->fragsize;
 	}
 
 	hh_len = LL_RESERVED_SPACE(rt->dst.dev);
 
 	fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
 			(opt ? opt->opt_nflen : 0);
-	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr);
+	maxfraglen_prev = maxfraglen = ((mtu - fragheaderlen) & ~7)
+				       + fragheaderlen - sizeof(struct frag_hdr);
 
 	if (mtu <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) {
 		if (cork->length + length > sizeof(struct ipv6hdr) + IPV6_MAXPLEN - fragheaderlen) {
@@ -1329,15 +1338,27 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			return 0;
 		}
 	}
-
-	if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL)
+	skb = skb_peek_tail(&sk->sk_write_queue);
+	if (skb == NULL) {
+		if (transport_mode) {
+			/*
+			 * transport mode the first ipsec fragment should contain
+			 * ipsec header, so decrease dst_exthdrlen from mtu.
+			 */
+			mtu -= dst_exthdrlen;
+			mtu_prev = mtu;
+			maxfraglen = ((mtu - fragheaderlen) & ~7)
+				     + fragheaderlen - sizeof(struct frag_hdr);
+			maxfraglen_prev = maxfraglen;
+		}
 		goto alloc_new_skb;
+	}
 
 	while (length > 0) {
 		/* Check if the remaining data fits into current packet. */
-		copy = (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - skb->len;
+		copy = (cork->length <= mtu_prev && !(cork->flags & IPCORK_ALLFRAG) ? mtu_prev : maxfraglen_prev) - skb->len;
 		if (copy < length)
-			copy = maxfraglen - skb->len;
+			copy = maxfraglen_prev - skb->len;
 
 		if (copy <= 0) {
 			char *data;
@@ -1351,7 +1372,7 @@ alloc_new_skb:
 
 			/* There's no room in the current skb */
 			if (skb_prev)
-				fraggap = skb_prev->len - maxfraglen;
+				fraggap = skb_prev->len - maxfraglen_prev;
 			else
 				fraggap = 0;
 
@@ -1360,10 +1381,14 @@ alloc_new_skb:
 			 * we know we need more fragment(s).
 			 */
 			datalen = length + fraggap;
-			if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen)
-				datalen = maxfraglen - fragheaderlen;
+			if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen) {
+				/*
+				 * decrease ipsec trailer here,
+				 * if it's not the last fragment, add trailer latter
+				 */
+				datalen = maxfraglen - fragheaderlen - rt->dst.trailer_len;
+			}
 
-			fraglen = datalen + fragheaderlen;
 			if ((flags & MSG_MORE) &&
 			    !(rt->dst.dev->features&NETIF_F_SG))
 				alloclen = mtu;
@@ -1377,9 +1402,15 @@ alloc_new_skb:
 			 * Note: we overallocate on fragments with MSG_MODE
 			 * because we have no idea if we're the last one.
 			 */
-			if (datalen == length + fraggap)
-				alloclen += rt->dst.trailer_len;
-
+			alloclen += rt->dst.trailer_len;
+			if (datalen != length + fraggap) {
+				/*
+				 * this fragment is not the last fragment,
+				 * add trailer to datalen
+				 */
+				datalen += rt->dst.trailer_len;
+			}
+			fraglen = datalen + fragheaderlen;
 			/*
 			 * We just reserve space for fragment header.
 			 * Note: this may be overallocation if the message
@@ -1452,6 +1483,17 @@ alloc_new_skb:
 
 			offset += copy;
 			length -= datalen - fraggap;
+
+			mtu_prev = mtu;
+			maxfraglen_prev = maxfraglen;
+			if (skb_prev == NULL && transport_mode) {
+				/*
+				 * transport mode the middle skb should not have
+				 * ipsec header and trailer, so update the mtu.
+				 */
+				mtu = dst_mtu(rt->dst.path);
+				maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr);
+			}
 			transhdrlen = 0;
 			exthdrlen = 0;
 			dst_exthdrlen = 0;
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-14  3:21       ` [PATCH] ipv6: fix incorrect ipsec transport mode fragment Gao feng
@ 2012-05-14 13:05         ` Steffen Klassert
  2012-05-14 22:41           ` David Miller
                             ` (2 more replies)
  2012-05-26 11:30         ` [PATCH v2] ipv6: fix incorrect ipsec fragment Gao feng
  1 sibling, 3 replies; 16+ messages in thread
From: Steffen Klassert @ 2012-05-14 13:05 UTC (permalink / raw)
  To: Gao feng; +Cc: netdev, davem, lw

On Mon, May 14, 2012 at 11:21:00AM +0800, Gao feng wrote:
> Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
> the fragment of ipsec transport mode packets is incorrect.
> because tunnel mode needs IPsec headers and trailer for all fragments,
> while on transport mode it is sufficient to add the headers to the
> first fragment and the trailer to the last.

I mentioned this in an other thread some time ago,
this is due to commit ad0081e43a
"ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
changed tunnel mode to do fragmentation before the transformation
while transport mode still does fragmentation after transformation.
Now, tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.

> 
> so modify mtu and maxfraglen base on ipsec mode and if fragment is first
> or last.

There might be other opinions, but I don't like to see this IPsec mode
dependent stuff hacked into the generic ipv6 output path.

Basically we have two cases. One where we have to add rt->dst.header_len
to the first fragment and rt->dst.trailer_len to the last fragment,
and the other where we have to add both to all fragments. So perhaps we
could isolate this code and create two functions, one for each case.


> 
> with my test,it work well and does not trigger slow fragment path.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  net/ipv6/ip6_output.c |   80 +++++++++++++++++++++++++++++++++++++-----------
>  1 files changed, 61 insertions(+), 19 deletions(-)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index b7ca461..9416887 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1191,19 +1191,23 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
>  	struct ipv6_pinfo *np = inet6_sk(sk);
>  	struct inet_cork *cork;
>  	struct sk_buff *skb;
> -	unsigned int maxfraglen, fragheaderlen;
> +	unsigned int maxfraglen, maxfraglen_prev, fragheaderlen;
>  	int exthdrlen;
>  	int dst_exthdrlen;
>  	int hh_len;
> -	int mtu;
> +	int mtu, mtu_prev;
>  	int copy;
>  	int err;
>  	int offset = 0;
>  	int csummode = CHECKSUM_NONE;
>  	__u8 tx_flags = 0;
> -
> +	bool transport_mode = false;
> +	struct xfrm_state *x = rt->dst.xfrm;
>  	if (flags&MSG_PROBE)
>  		return 0;
> +	if (x && x->props.mode == XFRM_MODE_TRANSPORT)
> +		transport_mode = true;
> +

Btw. beet mode should behave like transport mode here, just tunnel
mode was changed to do fragmentation before the transformation.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-14 13:05         ` Steffen Klassert
@ 2012-05-14 22:41           ` David Miller
  2012-05-17  7:42             ` Gao feng
  2012-05-15  3:44           ` Gao feng
  2012-05-26  9:00           ` Gao feng
  2 siblings, 1 reply; 16+ messages in thread
From: David Miller @ 2012-05-14 22:41 UTC (permalink / raw)
  To: steffen.klassert; +Cc: gaofeng, netdev, lw

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Mon, 14 May 2012 15:05:28 +0200

> There might be other opinions, but I don't like to see this IPsec mode
> dependent stuff hacked into the generic ipv6 output path.

Completely agreed.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-14 13:05         ` Steffen Klassert
  2012-05-14 22:41           ` David Miller
@ 2012-05-15  3:44           ` Gao feng
  2012-05-15 11:48             ` Steffen Klassert
  2012-05-26  9:00           ` Gao feng
  2 siblings, 1 reply; 16+ messages in thread
From: Gao feng @ 2012-05-15  3:44 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, davem, lw

Hi steffen:

于 2012年05月14日 21:05, Steffen Klassert 写道:
> On Mon, May 14, 2012 at 11:21:00AM +0800, Gao feng wrote:
>> Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
>> the fragment of ipsec transport mode packets is incorrect.
>> because tunnel mode needs IPsec headers and trailer for all fragments,
>> while on transport mode it is sufficient to add the headers to the
>> first fragment and the trailer to the last.
> 
> I mentioned this in an other thread some time ago,
> this is due to commit ad0081e43a
> "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
> changed tunnel mode to do fragmentation before the transformation
> while transport mode still does fragmentation after transformation.
> Now, tunnel mode needs IPsec headers and trailer for all fragments,
> while on transport mode it is sufficient to add the headers to the
> first fragment and the trailer to the last.
> 
>>
>> so modify mtu and maxfraglen base on ipsec mode and if fragment is first
>> or last.
> 
> There might be other opinions, but I don't like to see this IPsec mode
> dependent stuff hacked into the generic ipv6 output path.
> 
> Basically we have two cases. One where we have to add rt->dst.header_len
> to the first fragment and rt->dst.trailer_len to the last fragment,
> and the other where we have to add both to all fragments. So perhaps we
> could isolate this code and create two functions, one for each case.
> 

how about add a function pointer append_data to the struct rt6_info?
so we can just call rt->append_data in ip6_append_data without conside
witch mode it is.

of course, we will set rt->append_data appropriatly in xfrm_lookup.

But the only problem is this will bloats up rt6_info,I don't konw if
it's worth doing it in this way.

> 
>>
>> with my test,it work well and does not trigger slow fragment path.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/ipv6/ip6_output.c |   80 +++++++++++++++++++++++++++++++++++++-----------
>>  1 files changed, 61 insertions(+), 19 deletions(-)
>>
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index b7ca461..9416887 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1191,19 +1191,23 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
>>  	struct ipv6_pinfo *np = inet6_sk(sk);
>>  	struct inet_cork *cork;
>>  	struct sk_buff *skb;
>> -	unsigned int maxfraglen, fragheaderlen;
>> +	unsigned int maxfraglen, maxfraglen_prev, fragheaderlen;
>>  	int exthdrlen;
>>  	int dst_exthdrlen;
>>  	int hh_len;
>> -	int mtu;
>> +	int mtu, mtu_prev;
>>  	int copy;
>>  	int err;
>>  	int offset = 0;
>>  	int csummode = CHECKSUM_NONE;
>>  	__u8 tx_flags = 0;
>> -
>> +	bool transport_mode = false;
>> +	struct xfrm_state *x = rt->dst.xfrm;
>>  	if (flags&MSG_PROBE)
>>  		return 0;
>> +	if (x && x->props.mode == XFRM_MODE_TRANSPORT)
>> +		transport_mode = true;
>> +
> 
> Btw. beet mode should behave like transport mode here, just tunnel
> mode was changed to do fragmentation before the transformation.
> 

thanks steffen,I miss it and CONFIG_XFRM...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-15  3:44           ` Gao feng
@ 2012-05-15 11:48             ` Steffen Klassert
  2012-05-16  2:59               ` Gao feng
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2012-05-15 11:48 UTC (permalink / raw)
  To: Gao feng; +Cc: netdev, davem, lw

On Tue, May 15, 2012 at 11:44:26AM +0800, Gao feng wrote:
> 
> how about add a function pointer append_data to the struct rt6_info?
> so we can just call rt->append_data in ip6_append_data without conside
> witch mode it is.
> 

If you want to use a function pointer, it should go to stuct xfrm_mode.
That's where the IPsec mode dependent functions reside.

A side note, I'll be off for three weeks starting from tomorrow.
I'll have no E-mail access most of the time, so I'll probaply not
respond for the next three weeks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-15 11:48             ` Steffen Klassert
@ 2012-05-16  2:59               ` Gao feng
  0 siblings, 0 replies; 16+ messages in thread
From: Gao feng @ 2012-05-16  2:59 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, davem, lw

于 2012年05月15日 19:48, Steffen Klassert 写道:
> On Tue, May 15, 2012 at 11:44:26AM +0800, Gao feng wrote:
>>
>> how about add a function pointer append_data to the struct rt6_info?
>> so we can just call rt->append_data in ip6_append_data without conside
>> witch mode it is.
>>
> 
> If you want to use a function pointer, it should go to stuct xfrm_mode.
> That's where the IPsec mode dependent functions reside.
> 

Yes,I will do it.

> A side note, I'll be off for three weeks starting from tomorrow.
> I'll have no E-mail access most of the time, so I'll probaply not
> respond for the next three weeks.

thanks for your response.

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-14 22:41           ` David Miller
@ 2012-05-17  7:42             ` Gao feng
  2012-05-17  8:48               ` David Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Gao feng @ 2012-05-17  7:42 UTC (permalink / raw)
  To: David Miller; +Cc: steffen.klassert, netdev, lw

于 2012年05月15日 06:41, David Miller 写道:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Mon, 14 May 2012 15:05:28 +0200
> 
>> There might be other opinions, but I don't like to see this IPsec mode
>> dependent stuff hacked into the generic ipv6 output path.
> 
> Completely agreed.

Hi David

how do you think about adding function pointer to struct xfrm_mode?
when prefering xfrm_mode,there must be some ipsec codes in the generic ipv6 output
path,just like below.it looks ugly.

int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
        int offset, int len, int odd, struct sk_buff *skb),
        void *from, int length, int transhdrlen,
        int hlimit, int tclass, struct ipv6_txoptions *opt, struct flowi6 *fl6,
        struct rt6_info *rt, unsigned int flags, int dontfrag)
{
#ifdef CONFIG_XFRM
        struct xfrm_state *x = rt->dst.xfrm;
        if (x && x->outer_mode && x->outer_mode->append_data) {
                x->outer_mode->append_data(...);
        } else
#endif
                __ip6_append_data(...);
}

I want to use one bit of rt6_info->rt6i_flags to identify the actions we should
do in ip6_append_data. BUT it seems not what the rt6i_flags should do.this may
make rt6i_flags in chaos.

What's your comment?

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-17  7:42             ` Gao feng
@ 2012-05-17  8:48               ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2012-05-17  8:48 UTC (permalink / raw)
  To: gaofeng; +Cc: steffen.klassert, netdev, lw

From: Gao feng <gaofeng@cn.fujitsu.com>
Date: Thu, 17 May 2012 15:42:00 +0800

> What's your comment?

I don't have any, you will need to work this out yourself.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-14 13:05         ` Steffen Klassert
  2012-05-14 22:41           ` David Miller
  2012-05-15  3:44           ` Gao feng
@ 2012-05-26  9:00           ` Gao feng
  2012-05-26 11:29             ` Gao feng
  2 siblings, 1 reply; 16+ messages in thread
From: Gao feng @ 2012-05-26  9:00 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, davem, lw

Hi Steffen

于 2012年05月14日 21:05, Steffen Klassert 写道:
> On Mon, May 14, 2012 at 11:21:00AM +0800, Gao feng wrote:
>> Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
>> the fragment of ipsec transport mode packets is incorrect.
>> because tunnel mode needs IPsec headers and trailer for all fragments,
>> while on transport mode it is sufficient to add the headers to the
>> first fragment and the trailer to the last.
> 
> I mentioned this in an other thread some time ago,
> this is due to commit ad0081e43a
> "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
> changed tunnel mode to do fragmentation before the transformation
> while transport mode still does fragmentation after transformation.
> Now, tunnel mode needs IPsec headers and trailer for all fragments,
> while on transport mode it is sufficient to add the headers to the
> first fragment and the trailer to the last.
> 
>>
>> so modify mtu and maxfraglen base on ipsec mode and if fragment is first
>> or last.
> 
> There might be other opinions, but I don't like to see this IPsec mode
> dependent stuff hacked into the generic ipv6 output path.
> 
> Basically we have two cases. One where we have to add rt->dst.header_len
> to the first fragment and rt->dst.trailer_len to the last fragment,
> and the other where we have to add both to all fragments. So perhaps we
> could isolate this code and create two functions, one for each case.
> 
> 

I thought this problem carefully,I think the important and troubled thing is
how to deal with transport mode.

we have to use different mtu and maxfraglen for checking if the prev_skb has
extra data or has free room. so mtu_prev and maxfraglen_prev have to be used.

And I also think it's not very well to create two function for the two cases,
it will create a lot of redundant codes.

I will add a dst_entry flag DST_XFRM_TUNNEL to avoid ipsec mode dependent stuff
hacked into the generic ipv6_output path, and send patch v2.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] ipv6: fix incorrect ipsec transport mode fragment
  2012-05-26  9:00           ` Gao feng
@ 2012-05-26 11:29             ` Gao feng
  0 siblings, 0 replies; 16+ messages in thread
From: Gao feng @ 2012-05-26 11:29 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, davem, lw

于 2012年05月26日 17:00, Gao feng 写道:
> Hi Steffen
> 
> 于 2012年05月14日 21:05, Steffen Klassert 写道:
>> On Mon, May 14, 2012 at 11:21:00AM +0800, Gao feng wrote:
>>> Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
>>> the fragment of ipsec transport mode packets is incorrect.
>>> because tunnel mode needs IPsec headers and trailer for all fragments,
>>> while on transport mode it is sufficient to add the headers to the
>>> first fragment and the trailer to the last.
>>
>> I mentioned this in an other thread some time ago,
>> this is due to commit ad0081e43a
>> "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
>> changed tunnel mode to do fragmentation before the transformation
>> while transport mode still does fragmentation after transformation.
>> Now, tunnel mode needs IPsec headers and trailer for all fragments,
>> while on transport mode it is sufficient to add the headers to the
>> first fragment and the trailer to the last.
>>
>>>
>>> so modify mtu and maxfraglen base on ipsec mode and if fragment is first
>>> or last.
>>
>> There might be other opinions, but I don't like to see this IPsec mode
>> dependent stuff hacked into the generic ipv6 output path.
>>
>> Basically we have two cases. One where we have to add rt->dst.header_len
>> to the first fragment and rt->dst.trailer_len to the last fragment,
>> and the other where we have to add both to all fragments. So perhaps we
>> could isolate this code and create two functions, one for each case.
>>
>>
> 
> I thought this problem carefully,I think the important and troubled thing is
> how to deal with transport mode.
> 
> we have to use different mtu and maxfraglen for checking if the prev_skb has
> extra data or has free room. so mtu_prev and maxfraglen_prev have to be used.
> 

I found we can delete the mtu_prev and maxfraglen prev ;)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] ipv6: fix incorrect ipsec fragment
  2012-05-14  3:21       ` [PATCH] ipv6: fix incorrect ipsec transport mode fragment Gao feng
  2012-05-14 13:05         ` Steffen Klassert
@ 2012-05-26 11:30         ` Gao feng
  2012-05-27  5:12           ` David Miller
  1 sibling, 1 reply; 16+ messages in thread
From: Gao feng @ 2012-05-26 11:30 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, lw, Gao feng

Since commit ad0081e43a
"ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
the fragment of packets is incorrect.
because tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.

so modify mtu and maxfraglen base on ipsec mode and if fragment is first
or last.

with my test,it work well(every fragment's size is the mtu)
and does not trigger slow fragment path.

Changes from v1:
	though optimization, mtu_prev and maxfraglen_prev can be delete.
	replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
	add fuction ip6_append_data_mtu to make codes clearer.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/net/dst.h      |    1 +
 net/ipv6/ip6_output.c  |   68 +++++++++++++++++++++++++++++++++++------------
 net/xfrm/xfrm_policy.c |    3 ++
 3 files changed, 54 insertions(+), 18 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index bed833d..8197ead 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -60,6 +60,7 @@ struct dst_entry {
 #define DST_NOCOUNT		0x0020
 #define DST_NOPEER		0x0040
 #define DST_FAKE_RTABLE		0x0080
+#define DST_XFRM_TUNNEL		0x0100
 
 	short			error;
 	short			obsolete;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d99fdc6..1d34f86 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1187,6 +1187,29 @@ static inline struct ipv6_rt_hdr *ip6_rthdr_dup(struct ipv6_rt_hdr *src,
 	return src ? kmemdup(src, (src->hdrlen + 1) * 8, gfp) : NULL;
 }
 
+static void ip6_append_data_mtu(int *mtu,
+				int *maxfraglen,
+				unsigned int fragheaderlen,
+				struct sk_buff *skb,
+				struct rt6_info *rt)
+{
+	if (!(rt->dst.flags & DST_XFRM_TUNNEL)) {
+		if (skb == NULL) {
+			/* first fragment, reserve header_len */
+			*mtu = *mtu - rt->dst.header_len;
+
+		} else {
+			/*
+			 * this fragment is not first, the headers
+			 * space is regarded as data space.
+			 */
+			*mtu = dst_mtu(rt->dst.path);
+		}
+		*maxfraglen = ((*mtu - fragheaderlen) & ~7)
+			      + fragheaderlen - sizeof(struct frag_hdr);
+	}
+}
+
 int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	int offset, int len, int odd, struct sk_buff *skb),
 	void *from, int length, int transhdrlen,
@@ -1196,7 +1219,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct inet_cork *cork;
-	struct sk_buff *skb;
+	struct sk_buff *skb, *skb_prev = NULL;
 	unsigned int maxfraglen, fragheaderlen;
 	int exthdrlen;
 	int dst_exthdrlen;
@@ -1253,8 +1276,12 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 		inet->cork.fl.u.ip6 = *fl6;
 		np->cork.hop_limit = hlimit;
 		np->cork.tclass = tclass;
-		mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
-		      rt->dst.dev->mtu : dst_mtu(&rt->dst);
+		if (rt->dst.flags & DST_XFRM_TUNNEL)
+			mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
+			      rt->dst.dev->mtu : dst_mtu(&rt->dst);
+		else
+			mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
+			      rt->dst.dev->mtu : dst_mtu(rt->dst.path);
 		if (np->frag_size < mtu) {
 			if (np->frag_size)
 				mtu = np->frag_size;
@@ -1350,25 +1377,27 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			unsigned int fraglen;
 			unsigned int fraggap;
 			unsigned int alloclen;
-			struct sk_buff *skb_prev;
 alloc_new_skb:
-			skb_prev = skb;
-
 			/* There's no room in the current skb */
-			if (skb_prev)
-				fraggap = skb_prev->len - maxfraglen;
+			if (skb)
+				fraggap = skb->len - maxfraglen;
 			else
 				fraggap = 0;
+			/* update mtu and maxfraglen if necessary */
+			if (skb == NULL || skb_prev == NULL)
+				ip6_append_data_mtu(&mtu, &maxfraglen,
+						    fragheaderlen, skb, rt);
+
+			skb_prev = skb;
 
 			/*
 			 * If remaining data exceeds the mtu,
 			 * we know we need more fragment(s).
 			 */
 			datalen = length + fraggap;
-			if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen)
-				datalen = maxfraglen - fragheaderlen;
 
-			fraglen = datalen + fragheaderlen;
+			if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen)
+				datalen = maxfraglen - fragheaderlen - rt->dst.trailer_len;
 			if ((flags & MSG_MORE) &&
 			    !(rt->dst.dev->features&NETIF_F_SG))
 				alloclen = mtu;
@@ -1377,13 +1406,16 @@ alloc_new_skb:
 
 			alloclen += dst_exthdrlen;
 
-			/*
-			 * The last fragment gets additional space at tail.
-			 * Note: we overallocate on fragments with MSG_MODE
-			 * because we have no idea if we're the last one.
-			 */
-			if (datalen == length + fraggap)
-				alloclen += rt->dst.trailer_len;
+			if (datalen != length + fraggap) {
+				/*
+				 * this is not the last fragment, the trailer
+				 * space is regarded as data space.
+				 */
+				datalen += rt->dst.trailer_len;
+			}
+
+			alloclen += rt->dst.trailer_len;
+			fraglen = datalen + fragheaderlen;
 
 			/*
 			 * We just reserve space for fragment header.
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index c53e8f4..7379eef 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1921,6 +1921,9 @@ no_transform:
 	}
 ok:
 	xfrm_pols_put(pols, drop_pols);
+	if (dst && dst->xfrm &&
+	    dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
+		dst->flags |= DST_XFRM_TUNNEL;
 	return dst;
 
 nopol:
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] ipv6: fix incorrect ipsec fragment
  2012-05-26 11:30         ` [PATCH v2] ipv6: fix incorrect ipsec fragment Gao feng
@ 2012-05-27  5:12           ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2012-05-27  5:12 UTC (permalink / raw)
  To: gaofeng; +Cc: netdev, steffen.klassert, lw

From: Gao feng <gaofeng@cn.fujitsu.com>
Date: Sat, 26 May 2012 19:30:53 +0800

> Since commit ad0081e43a
> "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
> the fragment of packets is incorrect.
> because tunnel mode needs IPsec headers and trailer for all fragments,
> while on transport mode it is sufficient to add the headers to the
> first fragment and the trailer to the last.
> 
> so modify mtu and maxfraglen base on ipsec mode and if fragment is first
> or last.
> 
> with my test,it work well(every fragment's size is the mtu)
> and does not trigger slow fragment path.
> 
> Changes from v1:
> 	though optimization, mtu_prev and maxfraglen_prev can be delete.
> 	replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
> 	add fuction ip6_append_data_mtu to make codes clearer.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>

Applied, and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-05-27  5:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-13  8:10 Question with commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem) Li Wei
2012-02-14 11:04 ` Steffen Klassert
2012-02-15  7:00   ` Li Wei
2012-02-15 10:40     ` Steffen Klassert
2012-05-14  3:21       ` [PATCH] ipv6: fix incorrect ipsec transport mode fragment Gao feng
2012-05-14 13:05         ` Steffen Klassert
2012-05-14 22:41           ` David Miller
2012-05-17  7:42             ` Gao feng
2012-05-17  8:48               ` David Miller
2012-05-15  3:44           ` Gao feng
2012-05-15 11:48             ` Steffen Klassert
2012-05-16  2:59               ` Gao feng
2012-05-26  9:00           ` Gao feng
2012-05-26 11:29             ` Gao feng
2012-05-26 11:30         ` [PATCH v2] ipv6: fix incorrect ipsec fragment Gao feng
2012-05-27  5:12           ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).