netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [NET]: Update frag_list in pskb_trim
@ 2006-07-13  9:03 Herbert Xu
  2006-07-14  2:22 ` David Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Herbert Xu @ 2006-07-13  9:03 UTC (permalink / raw)
  To: David S. Miller, netdev

Hi Dave:

This needs to go into stable as well.  In fact, there is another unrelated
bug with exactly the same symptoms which was inadvertently fixed by the
GSO patches.  So I'll send a simpler fix for that to stable.

[NET]: Update frag_list in pskb_trim

When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
the packet, the frag_list is not updated to reflect the trimming.  This
will usually work fine until you hit something that uses the packet length
or tail from the frag_list.

Examples include esp_output and ip_fragment.

Another problem caused by this is that you can end up with a linear packet
with a frag_list attached.

It is possible to get away with this if we audit everything to make sure
that they always consult skb->len before going down onto frag_list.  In
fact we can do the samething for the paged part as well to avoid copying
the data area of the skb.  For now though, let's do the conservative fix
and update frag_list.

Many thanks to Marco Berizzi for helping me to track down this bug.

This 4-year old bug took 3 months to track down.  Marco was very patient
indeed :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 44f6a18..476aa39 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -257,11 +257,11 @@ nodata:
 }
 
 
-static void skb_drop_fraglist(struct sk_buff *skb)
+static void skb_drop_list(struct sk_buff **listp)
 {
-	struct sk_buff *list = skb_shinfo(skb)->frag_list;
+	struct sk_buff *list = *listp;
 
-	skb_shinfo(skb)->frag_list = NULL;
+	*listp = NULL;
 
 	do {
 		struct sk_buff *this = list;
@@ -270,6 +270,11 @@ static void skb_drop_fraglist(struct sk_
 	} while (list);
 }
 
+static inline void skb_drop_fraglist(struct sk_buff *skb)
+{
+	skb_drop_list(&skb_shinfo(skb)->frag_list);
+}
+
 static void skb_clone_fraglist(struct sk_buff *skb)
 {
 	struct sk_buff *list;
@@ -830,41 +835,75 @@ free_skb:
 
 int ___pskb_trim(struct sk_buff *skb, unsigned int len)
 {
+	struct sk_buff **fragp;
+	struct sk_buff *frag;
 	int offset = skb_headlen(skb);
 	int nfrags = skb_shinfo(skb)->nr_frags;
 	int i;
+	int err;
+
+	if (skb_cloned(skb) &&
+	    unlikely((err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))))
+		return err;
 
 	for (i = 0; i < nfrags; i++) {
 		int end = offset + skb_shinfo(skb)->frags[i].size;
-		if (end > len) {
-			if (skb_cloned(skb)) {
-				if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-					return -ENOMEM;
-			}
-			if (len <= offset) {
-				put_page(skb_shinfo(skb)->frags[i].page);
-				skb_shinfo(skb)->nr_frags--;
-			} else {
-				skb_shinfo(skb)->frags[i].size = len - offset;
-			}
+
+		if (end < len) {
+			offset = end;
+			continue;
 		}
-		offset = end;
+
+		if (len > offset)
+			skb_shinfo(skb)->frags[i++].size = len - offset;
+
+		skb_shinfo(skb)->nr_frags = i;
+
+		for (; i < nfrags; i++)
+			put_page(skb_shinfo(skb)->frags[i].page);
+
+		if (skb_shinfo(skb)->frag_list)
+			skb_drop_fraglist(skb);
+		break;
 	}
 
-	if (offset < len) {
+	for (fragp = &skb_shinfo(skb)->frag_list; (frag = *fragp);
+	     fragp = &frag->next) {
+		int end = offset + frag->len;
+
+		if (skb_shared(frag)) {
+			struct sk_buff *nfrag;
+
+			nfrag = skb_clone(frag, GFP_ATOMIC);
+			if (unlikely(!nfrag))
+				return -ENOMEM;
+
+			nfrag->next = frag->next;
+			frag = nfrag;
+			*fragp = frag;
+		}
+
+		if (end < len) {
+			offset = end;
+			continue;
+		}
+
+		if (end > len &&
+		    unlikely((err = pskb_trim(frag, len - offset))))
+			return err;
+
+		if (frag->next)
+			skb_drop_list(&frag->next);
+		break;
+	}
+
+	if (len > skb_headlen(skb)) {
 		skb->data_len -= skb->len - len;
 		skb->len       = len;
 	} else {
-		if (len <= skb_headlen(skb)) {
-			skb->len      = len;
-			skb->data_len = 0;
-			skb->tail     = skb->data + len;
-			if (skb_shinfo(skb)->frag_list && !skb_cloned(skb))
-				skb_drop_fraglist(skb);
-		} else {
-			skb->data_len -= skb->len - len;
-			skb->len       = len;
-		}
+		skb->len       = len;
+		skb->data_len  = 0;
+		skb->tail      = skb->data + len;
 	}
 
 	return 0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [NET]: Update frag_list in pskb_trim
  2006-07-13  9:03 [NET]: Update frag_list in pskb_trim Herbert Xu
@ 2006-07-14  2:22 ` David Miller
  2006-07-14  5:05 ` David Miller
  2006-07-29 11:45 ` [NET]: Fix ___pskb_trim when entire frag_list needs dropping Herbert Xu
  2 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2006-07-14  2:22 UTC (permalink / raw)
  To: herbert; +Cc: netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 13 Jul 2006 19:03:41 +1000

> [NET]: Update frag_list in pskb_trim
> 
> When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
> the packet, the frag_list is not updated to reflect the trimming.  This
> will usually work fine until you hit something that uses the packet length
> or tail from the frag_list.
> 
> Examples include esp_output and ip_fragment.
> 
> Another problem caused by this is that you can end up with a linear packet
> with a frag_list attached.
> 
> It is possible to get away with this if we audit everything to make sure
> that they always consult skb->len before going down onto frag_list.  In
> fact we can do the samething for the paged part as well to avoid copying
> the data area of the skb.  For now though, let's do the conservative fix
> and update frag_list.
> 
> Many thanks to Marco Berizzi for helping me to track down this bug.
> 
> This 4-year old bug took 3 months to track down.  Marco was very patient
> indeed :)
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I happened to be looking at this function just the other day
and said to myself: "It only drops the fraglist if the whole
thing is trimmed, lazy and inefficient but seems to work"

Even the existing check for when to drop the whole fraglist
seems to be buggy, it only drops if the length drops below
headlen.  Paged and fraglist skbs are legal, so the proper
check would be to drop the entire fraglist when len drops
to headlen + the length of the paged portion (if any).

In any event leaving stale chunks there on the fraglist is
certainly a major bug, thanks for fixing this Herbert.

I bet a suitably placed assertion or two would have caught this
sooner.  One good spot would have been skb_cow_data(), where it has to
mince the fragments due to the presence of a frag_list.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [NET]: Update frag_list in pskb_trim
  2006-07-13  9:03 [NET]: Update frag_list in pskb_trim Herbert Xu
  2006-07-14  2:22 ` David Miller
@ 2006-07-14  5:05 ` David Miller
  2006-07-15  0:10   ` Herbert Xu
  2006-07-29 11:45 ` [NET]: Fix ___pskb_trim when entire frag_list needs dropping Herbert Xu
  2 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2006-07-14  5:05 UTC (permalink / raw)
  To: herbert; +Cc: netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 13 Jul 2006 19:03:41 +1000

> This needs to go into stable as well.  In fact, there is another unrelated
> bug with exactly the same symptoms which was inadvertently fixed by the
> GSO patches.  So I'll send a simpler fix for that to stable.

As I noted already, this is in my tree and will go off to
Linus soon.

Please toss this over to -stable under seperate cover, if
you haven't done so already.  Please add my signoff:

Signed-off-by: David S. Miller <davem@davemloft.net>

if you like.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [NET]: Update frag_list in pskb_trim
  2006-07-14  5:05 ` David Miller
@ 2006-07-15  0:10   ` Herbert Xu
  2006-07-17 15:22     ` [stable] " Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2006-07-15  0:10 UTC (permalink / raw)
  To: stable, David Miller; +Cc: netdev

On Thu, Jul 13, 2006 at 10:05:26PM -0700, David Miller wrote:
> 
> As I noted already, this is in my tree and will go off to
> Linus soon.
> 
> Please toss this over to -stable under seperate cover, if
> you haven't done so already.  Please add my signoff:
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>

Great.  Here we go:

[NET]: Update frag_list in pskb_trim

When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
the packet, the frag_list is not updated to reflect the trimming.  This
will usually work fine until you hit something that uses the packet length
or tail from the frag_list.

Examples include esp_output and ip_fragment.

Another problem caused by this is that you can end up with a linear packet
with a frag_list attached.

It is possible to get away with this if we audit everything to make sure
that they always consult skb->len before going down onto frag_list.  In
fact we can do the samething for the paged part as well to avoid copying
the data area of the skb.  For now though, let's do the conservative fix
and update frag_list.

Many thanks to Marco Berizzi for helping me to track down this bug.

This 4-year old bug took 3 months to track down.  Marco was very patient
indeed :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 44f6a18..476aa39 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -257,11 +257,11 @@ nodata:
 }
 
 
-static void skb_drop_fraglist(struct sk_buff *skb)
+static void skb_drop_list(struct sk_buff **listp)
 {
-	struct sk_buff *list = skb_shinfo(skb)->frag_list;
+	struct sk_buff *list = *listp;
 
-	skb_shinfo(skb)->frag_list = NULL;
+	*listp = NULL;
 
 	do {
 		struct sk_buff *this = list;
@@ -270,6 +270,11 @@ static void skb_drop_fraglist(struct sk_
 	} while (list);
 }
 
+static inline void skb_drop_fraglist(struct sk_buff *skb)
+{
+	skb_drop_list(&skb_shinfo(skb)->frag_list);
+}
+
 static void skb_clone_fraglist(struct sk_buff *skb)
 {
 	struct sk_buff *list;
@@ -830,41 +835,75 @@ free_skb:
 
 int ___pskb_trim(struct sk_buff *skb, unsigned int len)
 {
+	struct sk_buff **fragp;
+	struct sk_buff *frag;
 	int offset = skb_headlen(skb);
 	int nfrags = skb_shinfo(skb)->nr_frags;
 	int i;
+	int err;
+
+	if (skb_cloned(skb) &&
+	    unlikely((err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))))
+		return err;
 
 	for (i = 0; i < nfrags; i++) {
 		int end = offset + skb_shinfo(skb)->frags[i].size;
-		if (end > len) {
-			if (skb_cloned(skb)) {
-				if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-					return -ENOMEM;
-			}
-			if (len <= offset) {
-				put_page(skb_shinfo(skb)->frags[i].page);
-				skb_shinfo(skb)->nr_frags--;
-			} else {
-				skb_shinfo(skb)->frags[i].size = len - offset;
-			}
+
+		if (end < len) {
+			offset = end;
+			continue;
 		}
-		offset = end;
+
+		if (len > offset)
+			skb_shinfo(skb)->frags[i++].size = len - offset;
+
+		skb_shinfo(skb)->nr_frags = i;
+
+		for (; i < nfrags; i++)
+			put_page(skb_shinfo(skb)->frags[i].page);
+
+		if (skb_shinfo(skb)->frag_list)
+			skb_drop_fraglist(skb);
+		break;
 	}
 
-	if (offset < len) {
+	for (fragp = &skb_shinfo(skb)->frag_list; (frag = *fragp);
+	     fragp = &frag->next) {
+		int end = offset + frag->len;
+
+		if (skb_shared(frag)) {
+			struct sk_buff *nfrag;
+
+			nfrag = skb_clone(frag, GFP_ATOMIC);
+			if (unlikely(!nfrag))
+				return -ENOMEM;
+
+			nfrag->next = frag->next;
+			frag = nfrag;
+			*fragp = frag;
+		}
+
+		if (end < len) {
+			offset = end;
+			continue;
+		}
+
+		if (end > len &&
+		    unlikely((err = pskb_trim(frag, len - offset))))
+			return err;
+
+		if (frag->next)
+			skb_drop_list(&frag->next);
+		break;
+	}
+
+	if (len > skb_headlen(skb)) {
 		skb->data_len -= skb->len - len;
 		skb->len       = len;
 	} else {
-		if (len <= skb_headlen(skb)) {
-			skb->len      = len;
-			skb->data_len = 0;
-			skb->tail     = skb->data + len;
-			if (skb_shinfo(skb)->frag_list && !skb_cloned(skb))
-				skb_drop_fraglist(skb);
-		} else {
-			skb->data_len -= skb->len - len;
-			skb->len       = len;
-		}
+		skb->len       = len;
+		skb->data_len  = 0;
+		skb->tail      = skb->data + len;
 	}
 
 	return 0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [stable] [NET]: Update frag_list in pskb_trim
  2006-07-15  0:10   ` Herbert Xu
@ 2006-07-17 15:22     ` Greg KH
  2006-07-17 18:13       ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2006-07-17 15:22 UTC (permalink / raw)
  To: Herbert Xu; +Cc: stable, David Miller, netdev

On Sat, Jul 15, 2006 at 10:10:03AM +1000, Herbert Xu wrote:
> On Thu, Jul 13, 2006 at 10:05:26PM -0700, David Miller wrote:
> > 
> > As I noted already, this is in my tree and will go off to
> > Linus soon.
> > 
> > Please toss this over to -stable under seperate cover, if
> > you haven't done so already.  Please add my signoff:
> > 
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> Great.  Here we go:
> 
> [NET]: Update frag_list in pskb_trim
> 
> When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
> the packet, the frag_list is not updated to reflect the trimming.  This
> will usually work fine until you hit something that uses the packet length
> or tail from the frag_list.
> 
> Examples include esp_output and ip_fragment.

Ick, this doesn't apply to 2.6.17, care to rediff it?  I don't trust
myself to get it correct :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [stable] [NET]: Update frag_list in pskb_trim
  2006-07-17 15:22     ` [stable] " Greg KH
@ 2006-07-17 18:13       ` Herbert Xu
  2006-07-30 22:50         ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2006-07-17 18:13 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, David Miller, netdev

On Mon, Jul 17, 2006 at 08:22:44AM -0700, Greg KH wrote:
>
> Ick, this doesn't apply to 2.6.17, care to rediff it?  I don't trust
> myself to get it correct :)

Oops, I thought I rediffed against 2.6.17, but it must've been
something else.

Here is a second attempt:

[NET]: Update frag_list in pskb_trim

When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
the packet, the frag_list is not updated to reflect the trimming.  This
will usually work fine until you hit something that uses the packet length
or tail from the frag_list.

Examples include esp_output and ip_fragment.

Another problem caused by this is that you can end up with a linear packet
with a frag_list attached.

It is possible to get away with this if we audit everything to make sure
that they always consult skb->len before going down onto frag_list.  In
fact we can do the samething for the paged part as well to avoid copying
the data area of the skb.  For now though, let's do the conservative fix
and update frag_list.

Many thanks to Marco Berizzi for helping me to track down this bug.

This 4-year old bug took 3 months to track down.  Marco was very patient
indeed :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f8f2347..2c31bb0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -967,15 +967,16 @@ #ifndef NET_SKB_PAD
 #define NET_SKB_PAD	16
 #endif
 
-extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
+extern int ___pskb_trim(struct sk_buff *skb, unsigned int len);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data + len;
-	} else
-		___pskb_trim(skb, len, 0);
+	if (unlikely(skb->data_len)) {
+		WARN_ON(1);
+		return;
+	}
+	skb->len  = len;
+	skb->tail = skb->data + len;
 }
 
 /**
@@ -985,6 +986,7 @@ static inline void __skb_trim(struct sk_
  *
  *	Cut the length of a buffer down by removing data from the tail. If
  *	the buffer is already under the length specified it is not modified.
+ *	The skb must be linear.
  */
 static inline void skb_trim(struct sk_buff *skb, unsigned int len)
 {
@@ -995,12 +997,10 @@ static inline void skb_trim(struct sk_bu
 
 static inline int __pskb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data+len;
-		return 0;
-	}
-	return ___pskb_trim(skb, len, 1);
+	if (skb->data_len)
+		return ___pskb_trim(skb, len);
+	__skb_trim(skb, len);
+	return 0;
 }
 
 static inline int pskb_trim(struct sk_buff *skb, unsigned int len)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index fb3770f..40f108e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -250,11 +250,11 @@ nodata:
 }
 
 
-static void skb_drop_fraglist(struct sk_buff *skb)
+static void skb_drop_list(struct sk_buff **listp)
 {
-	struct sk_buff *list = skb_shinfo(skb)->frag_list;
+	struct sk_buff *list = *listp;
 
-	skb_shinfo(skb)->frag_list = NULL;
+	*listp = NULL;
 
 	do {
 		struct sk_buff *this = list;
@@ -263,6 +263,11 @@ static void skb_drop_fraglist(struct sk_
 	} while (list);
 }
 
+static inline void skb_drop_fraglist(struct sk_buff *skb)
+{
+	skb_drop_list(&skb_shinfo(skb)->frag_list);
+}
+
 static void skb_clone_fraglist(struct sk_buff *skb)
 {
 	struct sk_buff *list;
@@ -800,49 +805,80 @@ struct sk_buff *skb_pad(struct sk_buff *
 	return nskb;
 }	
  
-/* Trims skb to length len. It can change skb pointers, if "realloc" is 1.
- * If realloc==0 and trimming is impossible without change of data,
- * it is BUG().
+/* Trims skb to length len. It can change skb pointers.
  */
 
-int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc)
+int ___pskb_trim(struct sk_buff *skb, unsigned int len)
 {
+	struct sk_buff **fragp;
+	struct sk_buff *frag;
 	int offset = skb_headlen(skb);
 	int nfrags = skb_shinfo(skb)->nr_frags;
 	int i;
+	int err;
+
+	if (skb_cloned(skb) &&
+	    unlikely((err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))))
+		return err;
 
 	for (i = 0; i < nfrags; i++) {
 		int end = offset + skb_shinfo(skb)->frags[i].size;
-		if (end > len) {
-			if (skb_cloned(skb)) {
-				BUG_ON(!realloc);
-				if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-					return -ENOMEM;
-			}
-			if (len <= offset) {
-				put_page(skb_shinfo(skb)->frags[i].page);
-				skb_shinfo(skb)->nr_frags--;
-			} else {
-				skb_shinfo(skb)->frags[i].size = len - offset;
-			}
+
+		if (end < len) {
+			offset = end;
+			continue;
+		}
+
+		if (len > offset)
+			skb_shinfo(skb)->frags[i++].size = len - offset;
+
+		skb_shinfo(skb)->nr_frags = i;
+
+		for (; i < nfrags; i++)
+			put_page(skb_shinfo(skb)->frags[i].page);
+
+		if (skb_shinfo(skb)->frag_list)
+			skb_drop_fraglist(skb);
+		break;
+	}
+
+	for (fragp = &skb_shinfo(skb)->frag_list; (frag = *fragp);
+	     fragp = &frag->next) {
+		int end = offset + frag->len;
+
+		if (skb_shared(frag)) {
+			struct sk_buff *nfrag;
+
+			nfrag = skb_clone(frag, GFP_ATOMIC);
+			if (unlikely(!nfrag))
+				return -ENOMEM;
+
+			nfrag->next = frag->next;
+			frag = nfrag;
+			*fragp = frag;
 		}
-		offset = end;
+
+		if (end < len) {
+			offset = end;
+			continue;
+		}
+
+		if (end > len &&
+		    unlikely((err = pskb_trim(frag, len - offset))))
+			return err;
+
+		if (frag->next)
+			skb_drop_list(&frag->next);
+		break;
 	}
 
-	if (offset < len) {
+	if (len > skb_headlen(skb)) {
 		skb->data_len -= skb->len - len;
 		skb->len       = len;
 	} else {
-		if (len <= skb_headlen(skb)) {
-			skb->len      = len;
-			skb->data_len = 0;
-			skb->tail     = skb->data + len;
-			if (skb_shinfo(skb)->frag_list && !skb_cloned(skb))
-				skb_drop_fraglist(skb);
-		} else {
-			skb->data_len -= skb->len - len;
-			skb->len       = len;
-		}
+		skb->len       = len;
+		skb->data_len  = 0;
+		skb->tail      = skb->data + len;
 	}
 
 	return 0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [NET]: Fix ___pskb_trim when entire frag_list needs dropping
  2006-07-13  9:03 [NET]: Update frag_list in pskb_trim Herbert Xu
  2006-07-14  2:22 ` David Miller
  2006-07-14  5:05 ` David Miller
@ 2006-07-29 11:45 ` Herbert Xu
  2006-07-30 22:46   ` David Miller
  2006-08-02 15:01   ` Marco Berizzi
  2 siblings, 2 replies; 12+ messages in thread
From: Herbert Xu @ 2006-07-29 11:45 UTC (permalink / raw)
  To: David S. Miller, netdev; +Cc: Marco Berizzi

On Thu, Jul 13, 2006 at 07:03:41PM +1000, herbert wrote:
> 
> This needs to go into stable as well.  In fact, there is another unrelated
> bug with exactly the same symptoms which was inadvertently fixed by the
> GSO patches.  So I'll send a simpler fix for that to stable.
> 
> [NET]: Update frag_list in pskb_trim

Marco told me that he was still seeing the same problem.  Turns out that
my patch missed one important case.  I hope this is really the last time
I have to look at this bug :)

[NET]: Fix ___pskb_trim when entire frag_list needs dropping

When the trim point is within the head and there is no paged data,
___pskb_trim fails to drop the first element in the frag_list.
This patch fixes this by moving the len <= offset case out of the
page data loop.

This patch also adds a missing kfree_skb on the frag that we just
cloned.
 
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 476aa39..d236f02 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -846,7 +846,11 @@ int ___pskb_trim(struct sk_buff *skb, un
 	    unlikely((err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))))
 		return err;
 
-	for (i = 0; i < nfrags; i++) {
+	i = 0;
+	if (offset >= len)
+		goto drop_pages;
+
+	for (; i < nfrags; i++) {
 		int end = offset + skb_shinfo(skb)->frags[i].size;
 
 		if (end < len) {
@@ -854,9 +858,9 @@ int ___pskb_trim(struct sk_buff *skb, un
 			continue;
 		}
 
-		if (len > offset)
-			skb_shinfo(skb)->frags[i++].size = len - offset;
+		skb_shinfo(skb)->frags[i++].size = len - offset;
 
+drop_pages:
 		skb_shinfo(skb)->nr_frags = i;
 
 		for (; i < nfrags; i++)
@@ -864,7 +868,7 @@ int ___pskb_trim(struct sk_buff *skb, un
 
 		if (skb_shinfo(skb)->frag_list)
 			skb_drop_fraglist(skb);
-		break;
+		goto done;
 	}
 
 	for (fragp = &skb_shinfo(skb)->frag_list; (frag = *fragp);
@@ -879,6 +883,7 @@ int ___pskb_trim(struct sk_buff *skb, un
 				return -ENOMEM;
 
 			nfrag->next = frag->next;
+			kfree_skb(frag);
 			frag = nfrag;
 			*fragp = frag;
 		}
@@ -897,6 +902,7 @@ int ___pskb_trim(struct sk_buff *skb, un
 		break;
 	}
 
+done:
 	if (len > skb_headlen(skb)) {
 		skb->data_len -= skb->len - len;
 		skb->len       = len;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [NET]: Fix ___pskb_trim when entire frag_list needs dropping
  2006-07-29 11:45 ` [NET]: Fix ___pskb_trim when entire frag_list needs dropping Herbert Xu
@ 2006-07-30 22:46   ` David Miller
  2006-08-02 15:01   ` Marco Berizzi
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2006-07-30 22:46 UTC (permalink / raw)
  To: herbert; +Cc: netdev, pupilla

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 29 Jul 2006 21:45:25 +1000

> [NET]: Fix ___pskb_trim when entire frag_list needs dropping
> 
> When the trim point is within the head and there is no paged data,
> ___pskb_trim fails to drop the first element in the frag_list.
> This patch fixes this by moving the len <= offset case out of the
> page data loop.
> 
> This patch also adds a missing kfree_skb on the frag that we just
> cloned.
>  
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [stable] [NET]: Update frag_list in pskb_trim
  2006-07-17 18:13       ` Herbert Xu
@ 2006-07-30 22:50         ` Herbert Xu
  2006-08-03  7:19           ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2006-07-30 22:50 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, David Miller, netdev

For stable only:

On Tue, Jul 18, 2006 at 04:13:29AM +1000, herbert wrote:
> 
> Here is a second attempt:
> 
> [NET]: Update frag_list in pskb_trim

Unfortunately there was a bug in the fix which is now fixed.  I presume
this hasn't been included yet so I'm simply regenerating the whole patch.
Let me know if you want an incremental version instead.

[NET]: Update frag_list in pskb_trim

When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
the packet, the frag_list is not updated to reflect the trimming.  This
will usually work fine until you hit something that uses the packet length
or tail from the frag_list.

Examples include esp_output and ip_fragment.

Another problem caused by this is that you can end up with a linear packet
with a frag_list attached.

It is possible to get away with this if we audit everything to make sure
that they always consult skb->len before going down onto frag_list.  In
fact we can do the samething for the paged part as well to avoid copying
the data area of the skb.  For now though, let's do the conservative fix
and update frag_list.

Many thanks to Marco Berizzi for helping me to track down this bug.

This 4-year old bug took 3 months to track down.  Marco was very patient
indeed :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f8f2347..2c31bb0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -967,15 +967,16 @@ #ifndef NET_SKB_PAD
 #define NET_SKB_PAD	16
 #endif
 
-extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
+extern int ___pskb_trim(struct sk_buff *skb, unsigned int len);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data + len;
-	} else
-		___pskb_trim(skb, len, 0);
+	if (unlikely(skb->data_len)) {
+		WARN_ON(1);
+		return;
+	}
+	skb->len  = len;
+	skb->tail = skb->data + len;
 }
 
 /**
@@ -985,6 +986,7 @@ static inline void __skb_trim(struct sk_
  *
  *	Cut the length of a buffer down by removing data from the tail. If
  *	the buffer is already under the length specified it is not modified.
+ *	The skb must be linear.
  */
 static inline void skb_trim(struct sk_buff *skb, unsigned int len)
 {
@@ -995,12 +997,10 @@ static inline void skb_trim(struct sk_bu
 
 static inline int __pskb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data+len;
-		return 0;
-	}
-	return ___pskb_trim(skb, len, 1);
+	if (skb->data_len)
+		return ___pskb_trim(skb, len);
+	__skb_trim(skb, len);
+	return 0;
 }
 
 static inline int pskb_trim(struct sk_buff *skb, unsigned int len)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index fb3770f..f3cfb36 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -250,11 +250,11 @@ nodata:
 }
 
 
-static void skb_drop_fraglist(struct sk_buff *skb)
+static void skb_drop_list(struct sk_buff **listp)
 {
-	struct sk_buff *list = skb_shinfo(skb)->frag_list;
+	struct sk_buff *list = *listp;
 
-	skb_shinfo(skb)->frag_list = NULL;
+	*listp = NULL;
 
 	do {
 		struct sk_buff *this = list;
@@ -263,6 +263,11 @@ static void skb_drop_fraglist(struct sk_
 	} while (list);
 }
 
+static inline void skb_drop_fraglist(struct sk_buff *skb)
+{
+	skb_drop_list(&skb_shinfo(skb)->frag_list);
+}
+
 static void skb_clone_fraglist(struct sk_buff *skb)
 {
 	struct sk_buff *list;
@@ -800,49 +805,86 @@ struct sk_buff *skb_pad(struct sk_buff *
 	return nskb;
 }	
  
-/* Trims skb to length len. It can change skb pointers, if "realloc" is 1.
- * If realloc==0 and trimming is impossible without change of data,
- * it is BUG().
+/* Trims skb to length len. It can change skb pointers.
  */
 
-int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc)
+int ___pskb_trim(struct sk_buff *skb, unsigned int len)
 {
+	struct sk_buff **fragp;
+	struct sk_buff *frag;
 	int offset = skb_headlen(skb);
 	int nfrags = skb_shinfo(skb)->nr_frags;
 	int i;
+	int err;
 
-	for (i = 0; i < nfrags; i++) {
+	if (skb_cloned(skb) &&
+	    unlikely((err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))))
+		return err;
+
+	i = 0;
+	if (offset >= len)
+		goto drop_pages;
+
+	for (; i < nfrags; i++) {
 		int end = offset + skb_shinfo(skb)->frags[i].size;
-		if (end > len) {
-			if (skb_cloned(skb)) {
-				BUG_ON(!realloc);
-				if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-					return -ENOMEM;
-			}
-			if (len <= offset) {
-				put_page(skb_shinfo(skb)->frags[i].page);
-				skb_shinfo(skb)->nr_frags--;
-			} else {
-				skb_shinfo(skb)->frags[i].size = len - offset;
-			}
+
+		if (end < len) {
+			offset = end;
+			continue;
 		}
-		offset = end;
+
+		skb_shinfo(skb)->frags[i++].size = len - offset;
+
+drop_pages:
+		skb_shinfo(skb)->nr_frags = i;
+
+		for (; i < nfrags; i++)
+			put_page(skb_shinfo(skb)->frags[i].page);
+
+		if (skb_shinfo(skb)->frag_list)
+			skb_drop_fraglist(skb);
+		goto done;
+	}
+
+	for (fragp = &skb_shinfo(skb)->frag_list; (frag = *fragp);
+	     fragp = &frag->next) {
+		int end = offset + frag->len;
+
+		if (skb_shared(frag)) {
+			struct sk_buff *nfrag;
+
+			nfrag = skb_clone(frag, GFP_ATOMIC);
+			if (unlikely(!nfrag))
+				return -ENOMEM;
+
+			nfrag->next = frag->next;
+			kfree_skb(frag);
+			frag = nfrag;
+			*fragp = frag;
+		}
+
+		if (end < len) {
+			offset = end;
+			continue;
+		}
+
+		if (end > len &&
+		    unlikely((err = pskb_trim(frag, len - offset))))
+			return err;
+
+		if (frag->next)
+			skb_drop_list(&frag->next);
+		break;
 	}
 
-	if (offset < len) {
+done:
+	if (len > skb_headlen(skb)) {
 		skb->data_len -= skb->len - len;
 		skb->len       = len;
 	} else {
-		if (len <= skb_headlen(skb)) {
-			skb->len      = len;
-			skb->data_len = 0;
-			skb->tail     = skb->data + len;
-			if (skb_shinfo(skb)->frag_list && !skb_cloned(skb))
-				skb_drop_fraglist(skb);
-		} else {
-			skb->data_len -= skb->len - len;
-			skb->len       = len;
-		}
+		skb->len       = len;
+		skb->data_len  = 0;
+		skb->tail      = skb->data + len;
 	}
 
 	return 0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* RE: [NET]: Fix ___pskb_trim when entire frag_list needs dropping
  2006-07-29 11:45 ` [NET]: Fix ___pskb_trim when entire frag_list needs dropping Herbert Xu
  2006-07-30 22:46   ` David Miller
@ 2006-08-02 15:01   ` Marco Berizzi
  2006-08-02 21:15     ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Marco Berizzi @ 2006-08-02 15:01 UTC (permalink / raw)
  To: herbert; +Cc: davem, netdev

Herbert Xu wrote:

>On Thu, Jul 13, 2006 at 07:03:41PM +1000, herbert wrote:
> >
> > This needs to go into stable as well.  In fact, there is another 
>unrelated
> > bug with exactly the same symptoms which was inadvertently fixed by the
> > GSO patches.  So I'll send a simpler fix for that to stable.
> >
> > [NET]: Update frag_list in pskb_trim
>
>Marco told me that he was still seeing the same problem.  Turns out that
>my patch missed one important case.  I hope this is really the last time
>I have to look at this bug :)
>
>[NET]: Fix ___pskb_trim when entire frag_list needs dropping
>
>When the trim point is within the head and there is no paged data,
>___pskb_trim fails to drop the first element in the frag_list.
>This patch fixes this by moving the len <= offset case out of the
>page data loop.
>
>This patch also adds a missing kfree_skb on the frag that we just
>cloned.

The problem is fixed now. I have applied this
patch to 2.6.18-rc2
Many thanks Herbert for all the time spent
to debug this problem.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [NET]: Fix ___pskb_trim when entire frag_list needs dropping
  2006-08-02 15:01   ` Marco Berizzi
@ 2006-08-02 21:15     ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2006-08-02 21:15 UTC (permalink / raw)
  To: pupilla; +Cc: herbert, netdev

From: "Marco Berizzi" <pupilla@hotmail.com>
Date: Wed, 02 Aug 2006 17:01:17 +0200

> The problem is fixed now. I have applied this
> patch to 2.6.18-rc2
> Many thanks Herbert for all the time spent
> to debug this problem.

Thank you for testing.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [stable] [NET]: Update frag_list in pskb_trim
  2006-07-30 22:50         ` Herbert Xu
@ 2006-08-03  7:19           ` Greg KH
  0 siblings, 0 replies; 12+ messages in thread
From: Greg KH @ 2006-08-03  7:19 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, stable, David Miller

On Mon, Jul 31, 2006 at 08:50:37AM +1000, Herbert Xu wrote:
> For stable only:
> 
> On Tue, Jul 18, 2006 at 04:13:29AM +1000, herbert wrote:
> > 
> > Here is a second attempt:
> > 
> > [NET]: Update frag_list in pskb_trim
> 
> Unfortunately there was a bug in the fix which is now fixed.  I presume
> this hasn't been included yet so I'm simply regenerating the whole patch.
> Let me know if you want an incremental version instead.
> 
> [NET]: Update frag_list in pskb_trim


Queued to -stable.

thanks,

greg k-h


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2006-08-03  7:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-13  9:03 [NET]: Update frag_list in pskb_trim Herbert Xu
2006-07-14  2:22 ` David Miller
2006-07-14  5:05 ` David Miller
2006-07-15  0:10   ` Herbert Xu
2006-07-17 15:22     ` [stable] " Greg KH
2006-07-17 18:13       ` Herbert Xu
2006-07-30 22:50         ` Herbert Xu
2006-08-03  7:19           ` Greg KH
2006-07-29 11:45 ` [NET]: Fix ___pskb_trim when entire frag_list needs dropping Herbert Xu
2006-07-30 22:46   ` David Miller
2006-08-02 15:01   ` Marco Berizzi
2006-08-02 21:15     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).