netdev.vger.kernel.org archive mirror
* skb_segment() questions
@ 2009-03-05  0:08 James Huang
  0 siblings, 0 replies; 12+ messages in thread
From: James Huang @ 2009-03-05  0:08 UTC (permalink / raw)
  To: netdev

Hi all,

    After spending hours trying to understand how GSO and GRO work in the
latest Linux kernel (net-next-2.6.git), I am still quite confused about 
the implementation of skb_segment():

(1) Comments about the roles of some critical variables in the routine will 
help. Among them, len, hsize, and offset are not as confusing and I figured 
they have the following meanings:

len: amount of payload to "copy" into nskb
hsize: amount of payload to copy into nskb's head buffer 
offset: offset (from L2 header) of skb's payload to start "copy" into nskb

However, the variable "pos" is quite ambiguous. The value of "pos" at the 
beginning of each iteration of the do loop seems to depend on the current 
fragment being processed.  If current fragment is the head buffer of skb, pos 
is set to offset(end of the head buffer).  But if the current fragment is a 
page entry in skb or a skb in the frag_list, then pos is set to offset
(beginning of the current fragment).
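
A minimal user-space sketch may make the roles of len, hsize and offset more concrete. It is a model under stated assumptions, not the kernel code: the struct seg_sizes, the helper segment_sizes(), the parameter names and the exact clamping rules are hypothetical and only follow the meanings listed above. Here headlen stands for the number of bytes in skb's head buffer counted from the L2 header, and sg says whether the device supports scatter-gather.

```c
#include <assert.h>

/* Hypothetical model of the sizes chosen for one output segment (nskb). */
struct seg_sizes {
	unsigned int len;   /* payload bytes carried by this segment */
	unsigned int hsize; /* part of len copied into nskb's head buffer */
};

/*
 * skb_len: total length of skb, counted from the L2 header
 * headlen: bytes in skb's head buffer (headers plus any linear payload)
 * offset:  where this segment's payload starts, counted from the L2 header
 * mss:     maximum payload per segment
 * sg:      non-zero if the device can do scatter-gather
 */
static struct seg_sizes segment_sizes(unsigned int skb_len, unsigned int headlen,
				      unsigned int offset, unsigned int mss, int sg)
{
	struct seg_sizes s;

	assert(offset < skb_len);

	/* A segment carries at most mss bytes of the remaining payload. */
	s.len = skb_len - offset;
	if (s.len > mss)
		s.len = mss;

	/* Linear payload still available at 'offset', if any. */
	s.hsize = headlen > offset ? headlen - offset : 0;

	/* Assumed rule: without scatter-gather everything goes into the head. */
	if (s.hsize > s.len || !sg)
		s.hsize = s.len;

	return s;
}
```

With a 3054-byte skb, 54 bytes of headers and an mss of 1448, a layout that keeps only the headers in the head buffer (headlen 54) yields hsize 0 for every segment when sg is set, while a fully linear layout (headlen 3054) yields hsize equal to len.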

(2) What is the purpose of the following check?

    if (pos >= offset + len)
        continue;

     If the payload in the head buffer of skb has at least mss bytes, this
check will succeed and no payload in skb’s head buffer will be copied into nskb
through a call to skb_copy_from_linear_data_offset(). Something seems to be
wrong here.
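
The concern can be illustrated with a small user-space model of the segmentation loop. This is a sketch under assumptions, not the kernel routine: the function name segments_skipped_by_old_check() is hypothetical, only the head buffer is modelled (page frags and frag_list are ignored), and pos is assumed to start at the end of the head buffer as described in (1).

```c
#include <assert.h>

/*
 * Count the segments for which the check
 *     if (pos >= offset + len) continue;
 * fires, so the copy of payload from skb's head buffer is never reached.
 *
 * skb_len: total length of skb, counted from the L2 header
 * headlen: bytes in skb's head buffer (headers plus linear payload)
 * doffset: length of the headers
 * mss:     maximum payload per segment
 */
static unsigned int segments_skipped_by_old_check(unsigned int skb_len,
						  unsigned int headlen,
						  unsigned int doffset,
						  unsigned int mss)
{
	unsigned int offset = doffset; /* payload starts right after the headers */
	unsigned int pos = headlen;    /* assumed: end of the head buffer */
	unsigned int skipped = 0;
	unsigned int len;

	assert(mss > 0 && doffset <= headlen && headlen <= skb_len);

	while (offset < skb_len) {
		len = skb_len - offset;
		if (len > mss)
			len = mss;

		if (pos >= offset + len)
			skipped++; /* the head payload copy is bypassed */

		offset += len;
	}
	return skipped;
}
```

For a fully linear 3054-byte skb with 54 bytes of headers and an mss of 1448, the model reports that all three segments bypass the copy. When only the headers sit in the head buffer (headlen 54), the check never fires.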

(3) Variable "hsize" seems to have a new meaning within the following if 
statement:

       if (!hsize && i >= nfrags) {
                      :
                      :
	   hsize = skb_end_pointer(nskb) - nskb->head;
	   if (skb_cow_head(nskb, doffset + headroom)) {
	 	kfree_skb(nskb);
		goto err;
	   }

	   nskb->truesize += skb_end_pointer(nskb) - nskb->head -
			     hsize;
                      :
                      :
       }

    If so, it will be better to use a different variable here.

(4) When will the if condition (if (pos < offset + len)) just before
skip_fraglist become true?  When the if condition is true, nskb will have a
non-null frag_list. How do we know that the output interface's driver will
support such a skb?

(5) There are some assumptions about the input skb. These assumptions are 
asserted by BUG_ON() statements throughout the routine.  It will help to list 
those assumptions at the very beginning of skb_segment().


Thanks,
James Huang





* Re: skb_segment() questions
       [not found] <f0ed9b110903041912v6fba381fm4da792d49b1cbb91@mail.gmail.com>
@ 2009-03-29  2:07 ` Herbert Xu
  2009-03-29  6:39   ` David Miller
  2009-04-01  9:18   ` [PATCH] " Jarek Poplawski
  0 siblings, 2 replies; 12+ messages in thread
From: Herbert Xu @ 2009-03-29  2:07 UTC (permalink / raw)
  To: James Huang, David S. Miller; +Cc: netdev

On Wed, Mar 04, 2009 at 07:12:40PM -0800, James Huang wrote:
>
> (2) What is the purpose of this check?
> 
>     if (pos >= offset + len)
>         continue;
> 
>      If the payload in the head buffer of skb has at least mss bytes, this
> check will succeed and no payload in skb’s head buffer will be copied into
> nskb through a call to skb_copy_from_linear_data_offset(). Something seems
> to be wrong here.

Indeed.  This breaks linear packets, which unfortunately older
versions of tun like to construct.

gso: Fix support for linear packets

When GRO/frag_list support was added to GSO, I made an error
which broke the support for segmenting linear GSO packets (GSO
packets are normally non-linear in the payload).

These days most of these packets are constructed by the tun
driver, which prefers to allocate linear memory if possible.
This is fixed in the latest kernel, but for 2.6.29 and earlier
it is still the norm.

Therefore this bug causes failures with GSO when used with tun
in 2.6.29.

Reported-by: James Huang <jamesclhuang@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6acbf9e..ce6356c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2579,7 +2579,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 					  skb_network_header_len(skb));
 		skb_copy_from_linear_data(skb, nskb->data, doffset);
 
-		if (pos >= offset + len)
+		if (fskb != skb_shinfo(skb)->frag_list)
 			continue;
 
 		if (!sg) {
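
To illustrate why the replacement condition leaves linear packets alone, here is a small user-space model of the two conditions. It is only a sketch: struct fbuf and the helpers skip_copy_old(), skip_copy_new() and consume() are hypothetical names, and the model assumes that fskb starts at the head of skb's frag_list and advances only when a frag_list entry is consumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff chained on a frag_list. */
struct fbuf {
	struct fbuf *next;
};

/* Condition removed by the patch. */
static int skip_copy_old(unsigned int pos, unsigned int offset, unsigned int len)
{
	return pos >= offset + len;
}

/*
 * Condition added by the patch: skip the copy only once the cursor has
 * moved off the first frag_list entry, i.e. a frag_list skb was consumed.
 */
static int skip_copy_new(const struct fbuf *fskb, const struct fbuf *frag_list)
{
	return fskb != frag_list;
}

/* Advance the cursor after one frag_list entry has been consumed. */
static const struct fbuf *consume(const struct fbuf *fskb)
{
	assert(fskb != NULL);
	return fskb->next;
}
```

A linear packet has no frag_list, so both pointers are NULL and skip_copy_new() stays false for every segment, whereas skip_copy_old() is true whenever the head buffer already covers the segment (for example pos 3054, offset 54, len 1448).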

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: skb_segment() questions
  2009-03-29  2:07 ` skb_segment() questions Herbert Xu
@ 2009-03-29  6:39   ` David Miller
  2009-03-30  8:50     ` Mark McLoughlin
  2009-04-01  9:18   ` [PATCH] " Jarek Poplawski
  1 sibling, 1 reply; 12+ messages in thread
From: David Miller @ 2009-03-29  6:39 UTC (permalink / raw)
  To: herbert; +Cc: jamesclhuang, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 29 Mar 2009 10:07:01 +0800

> gso: Fix support for linear packets
> 
> When GRO/frag_list support was added to GSO, I made an error
> which broke the support for segmenting linear GSO packets (GSO
> packets are normally non-linear in the payload).
> 
> These days most of these packets are constructed by the tun
> driver, which prefers to allocate linear memory if possible.
> This is fixed in the latest kernel, but for 2.6.29 and earlier
> it is still the norm.
> 
> Therefore this bug causes failures with GSO when used with tun
> in 2.6.29.
> 
> Reported-by: James Huang <jamesclhuang@gmail.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks.


* Re: skb_segment() questions
  2009-03-29  6:39   ` David Miller
@ 2009-03-30  8:50     ` Mark McLoughlin
  2009-03-30 20:57       ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Mark McLoughlin @ 2009-03-30  8:50 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, jamesclhuang, netdev

On Sat, 2009-03-28 at 23:39 -0700, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Sun, 29 Mar 2009 10:07:01 +0800
> 
> > gso: Fix support for linear packets
> > 
> > When GRO/frag_list support was added to GSO, I made an error
> > which broke the support for segmenting linear GSO packets (GSO
> > packets are normally non-linear in the payload).
> > 
> > These days most of these packets are constructed by the tun
> > driver, which prefers to allocate linear memory if possible.
> > This is fixed in the latest kernel, but for 2.6.29 and earlier
> > it is still the norm.
> > 
> > Therefore this bug causes failures with GSO when used with tun
> > in 2.6.29.
> > 
> > Reported-by: James Huang <jamesclhuang@gmail.com>
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> Applied, thanks.

This is needed in -stable too, fwiw. Fixes e.g. virtio guest->remote
with 2.6.29 host:

  https://bugzilla.redhat.com/490266

Cheers,
Mark.



* Re: skb_segment() questions
  2009-03-30  8:50     ` Mark McLoughlin
@ 2009-03-30 20:57       ` David Miller
  2009-04-20 11:12         ` Mark McLoughlin
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2009-03-30 20:57 UTC (permalink / raw)
  To: markmc; +Cc: herbert, jamesclhuang, netdev

From: Mark McLoughlin <markmc@redhat.com>
Date: Mon, 30 Mar 2009 09:50:45 +0100

> On Sat, 2009-03-28 at 23:39 -0700, David Miller wrote:
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Sun, 29 Mar 2009 10:07:01 +0800
> > 
> > > gso: Fix support for linear packets
> > > 
> > > When GRO/frag_list support was added to GSO, I made an error
> > > which broke the support for segmenting linear GSO packets (GSO
> > > packets are normally non-linear in the payload).
> > > 
> > > These days most of these packets are constructed by the tun
> > > driver, which prefers to allocate linear memory if possible.
> > > This is fixed in the latest kernel, but for 2.6.29 and earlier
> > > it is still the norm.
> > > 
> > > Therefore this bug causes failures with GSO when used with tun
> > > in 2.6.29.
> > > 
> > > Reported-by: James Huang <jamesclhuang@gmail.com>
> > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> > 
> > Applied, thanks.
> 
> This is needed in -stable too, fwiw. Fixes e.g. virtio guest->remote
> with 2.6.29 host:
> 
>   https://bugzilla.redhat.com/490266

I know, I'll queue it up.


* [PATCH] Re: skb_segment() questions
  2009-03-29  2:07 ` skb_segment() questions Herbert Xu
  2009-03-29  6:39   ` David Miller
@ 2009-04-01  9:18   ` Jarek Poplawski
  2009-04-01  9:24     ` Herbert Xu
  1 sibling, 1 reply; 12+ messages in thread
From: Jarek Poplawski @ 2009-04-01  9:18 UTC (permalink / raw)
  To: Herbert Xu; +Cc: James Huang, David S. Miller, netdev

On 29-03-2009 03:07, Herbert Xu wrote:
> On Wed, Mar 04, 2009 at 07:12:40PM -0800, James Huang wrote:
>> (2) What is the purpose of this check?
>>
>>     if (pos >= offset + len)
>>         continue;
>>
>>      If the payload in the head buffer of skb has at least mss bytes, this
>> check will succeed and no payload in skb's head buffer will be copied into
>> nskb through a call to skb_copy_from_linear_data_offset(). Something seems
>> to be wrong here.
> 
> Indeed.  This breaks linear packets, which unfortunately older
> versions of tun like to construct.
> 
> gso: Fix support for linear packets
> 
> When GRO/frag_list support was added to GSO, I made an error
> which broke the support for segmenting linear GSO packets (GSO
> packets are normally non-linear in the payload).
> 
> These days most of these packets are constructed by the tun
> driver, which prefers to allocate linear memory if possible.
> This is fixed in the latest kernel, but for 2.6.29 and earlier
> it is still the norm.
> 
> Therefore this bug causes failures with GSO when used with tun
> in 2.6.29.
> 
> Reported-by: James Huang <jamesclhuang@gmail.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6acbf9e..ce6356c 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2579,7 +2579,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  					  skb_network_header_len(skb));
>  		skb_copy_from_linear_data(skb, nskb->data, doffset);
>  
> -		if (pos >= offset + len)
> +		if (fskb != skb_shinfo(skb)->frag_list)
>  			continue;
>  
>  		if (!sg) {
> 

----------------------->
gso: Fix support for linear packets 2

The previous fix removed a check, which should be useful, only a bit
later, by skipping at least two similar checks and three useless
assignments in case a header is (still) copied.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 net/core/skbuff.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ce6356c..2123a92 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2594,6 +2594,8 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 
 		skb_copy_from_linear_data_offset(skb, offset,
 						 skb_put(nskb, hsize), hsize);
+		if (pos >= offset + len)
+			continue;
 
 		while (pos < offset + len && i < nfrags) {
 			*frag = skb_shinfo(skb)->frags[i];


* Re: [PATCH] Re: skb_segment() questions
  2009-04-01  9:18   ` [PATCH] " Jarek Poplawski
@ 2009-04-01  9:24     ` Herbert Xu
  2009-04-01  9:50       ` Jarek Poplawski
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2009-04-01  9:24 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: James Huang, David S. Miller, netdev

On Wed, Apr 01, 2009 at 09:18:01AM +0000, Jarek Poplawski wrote:
>
> gso: Fix support for linear packets 2
> 
> The previous fix removed a check, which should be useful, only a bit
> later, by skipping at least two similar checks and three useless
> assignments in case a header is (still) copied.
> 
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> ---
> 
>  net/core/skbuff.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index ce6356c..2123a92 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2594,6 +2594,8 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  
>  		skb_copy_from_linear_data_offset(skb, offset,
>  						 skb_put(nskb, hsize), hsize);
> +		if (pos >= offset + len)
> +			continue;
>  
>  		while (pos < offset + len && i < nfrags) {
>  			*frag = skb_shinfo(skb)->frags[i];

The common case (non-linear skb) is going to fail that check so
I'm not sure if it's warranted.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] Re: skb_segment() questions
  2009-04-01  9:24     ` Herbert Xu
@ 2009-04-01  9:50       ` Jarek Poplawski
  2009-04-01  9:53         ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Jarek Poplawski @ 2009-04-01  9:50 UTC (permalink / raw)
  To: Herbert Xu; +Cc: James Huang, David S. Miller, netdev

On Wed, Apr 01, 2009 at 05:24:57PM +0800, Herbert Xu wrote:
> On Wed, Apr 01, 2009 at 09:18:01AM +0000, Jarek Poplawski wrote:
> >
> > gso: Fix support for linear packets 2
> > 
> > The previous fix removed a check, which should be useful, only a bit
> > later, by skipping at least two similar checks and three useless
> > assignments in case a header is (still) copied.
> > 
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> > ---
> > 
> >  net/core/skbuff.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index ce6356c..2123a92 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -2594,6 +2594,8 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
> >  
> >  		skb_copy_from_linear_data_offset(skb, offset,
> >  						 skb_put(nskb, hsize), hsize);
> > +		if (pos >= offset + len)
> > +			continue;
> >  
> >  		while (pos < offset + len && i < nfrags) {
> >  			*frag = skb_shinfo(skb)->frags[i];
> 
> The common case (non-linear skb) is going to fail that check so
> I'm not sure if it's warranted.

I guess you mean non-linear skb with a header smaller than mtu? Well,
if it's the most common case now, I agree.

Thanks,
Jarek P.


* Re: [PATCH] Re: skb_segment() questions
  2009-04-01  9:50       ` Jarek Poplawski
@ 2009-04-01  9:53         ` Herbert Xu
  2009-04-01 10:02           ` Jarek Poplawski
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2009-04-01  9:53 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: James Huang, David S. Miller, netdev

On Wed, Apr 01, 2009 at 09:50:50AM +0000, Jarek Poplawski wrote:
>
> I guess you mean non-linear skb with a header smaller than mtu? Well,
> if it's the most common case now, I agree.

The common case is the path stemming from the TCP stack, where
we always construct skb's with the entire payload in page frags,
and only the header is placed in skb->data.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] Re: skb_segment() questions
  2009-04-01  9:53         ` Herbert Xu
@ 2009-04-01 10:02           ` Jarek Poplawski
  0 siblings, 0 replies; 12+ messages in thread
From: Jarek Poplawski @ 2009-04-01 10:02 UTC (permalink / raw)
  To: Herbert Xu; +Cc: James Huang, David S. Miller, netdev

On Wed, Apr 01, 2009 at 05:53:52PM +0800, Herbert Xu wrote:
> On Wed, Apr 01, 2009 at 09:50:50AM +0000, Jarek Poplawski wrote:
> >
> > I guess you mean non-linear skb with a header smaller than mtu? Well,
> > if it's the most common case now, I agree.
> 
> The common case is the path stemming from the TCP stack, where
> we always construct skb's with the entire payload in page frags,
> and only the header is placed in skb->data.

OK, then let's forget about this patch.

Thanks for the explanation,
Jarek P.


* Re: skb_segment() questions
  2009-03-30 20:57       ` David Miller
@ 2009-04-20 11:12         ` Mark McLoughlin
  2009-04-20 11:57           ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Mark McLoughlin @ 2009-04-20 11:12 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, jamesclhuang, netdev

On Mon, 2009-03-30 at 13:57 -0700, David Miller wrote:
> From: Mark McLoughlin <markmc@redhat.com>
> Date: Mon, 30 Mar 2009 09:50:45 +0100
> 
> > On Sat, 2009-03-28 at 23:39 -0700, David Miller wrote:
> > > From: Herbert Xu <herbert@gondor.apana.org.au>
> > > Date: Sun, 29 Mar 2009 10:07:01 +0800
> > > 
> > > > gso: Fix support for linear packets
> > > > 
> > > > When GRO/frag_list support was added to GSO, I made an error
> > > > which broke the support for segmenting linear GSO packets (GSO
> > > > packets are normally non-linear in the payload).
> > > > 
> > > > These days most of these packets are constructed by the tun
> > > > driver, which prefers to allocate linear memory if possible.
> > > > This is fixed in the latest kernel, but for 2.6.29 and earlier
> > > > it is still the norm.
> > > > 
> > > > Therefore this bug causes failures with GSO when used with tun
> > > > in 2.6.29.
> > > > 
> > > > Reported-by: James Huang <jamesclhuang@gmail.com>
> > > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> > > 
> > > Applied, thanks.
> > 
> > This is needed in -stable too, fwiw. Fixes e.g. virtio guest->remote
> > with 2.6.29 host:
> > 
> >   https://bugzilla.redhat.com/490266
> 
> I know, I'll queue it up.

This hasn't made it yet? Just had another report on kvm list about it.

(Can't seem to find anywhere to check whether it's queued)

Cheers,
Mark.



* Re: skb_segment() questions
  2009-04-20 11:12         ` Mark McLoughlin
@ 2009-04-20 11:57           ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2009-04-20 11:57 UTC (permalink / raw)
  To: markmc; +Cc: herbert, jamesclhuang, netdev

From: Mark McLoughlin <markmc@redhat.com>
Date: Mon, 20 Apr 2009 12:12:10 +0100

> On Mon, 2009-03-30 at 13:57 -0700, David Miller wrote:
>> I know, I'll queue it up.
> 
> This hasn't made it yet? Just had another report on kvm list about it.
> 
> (Can't seem to find anywhere to check whether it's queued)

Sorry, it's in my backlog, I'll take care of it tomorrow.
