netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.6.36-rc3: Reported regressions from 2.6.35
@ 2010-08-29 22:24 Rafael J. Wysocki
       [not found] ` <4AUWBNzTkbD.A.ey.cGueMB@chimera>
  0 siblings, 1 reply; 21+ messages in thread
From: Rafael J. Wysocki @ 2010-08-29 22:24 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Maciej Rutecki, Andrew Morton, Linus Torvalds,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

This message contains a list of some regressions from 2.6.35,
for which there are no fixes in the mainline known to the tracking team.
If any of them have been fixed already, please let us know.

If you know of any other unresolved regressions from 2.6.35, please let us
know either and we'll add them to the list.  Also, please let us know
if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply
to this message with CCs to the people involved in reporting and handling
the issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2010-08-30       21       16          15


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17341
Subject		: kdump regression compared to v2.6.35
Submitter	: CAI Qian <caiqian@redhat.com>
Date		: 2010-08-27 12:35 (3 days old)
Message-ID	: <2136707099.1405541282912500148.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
References	: http://marc.info/?l=linux-kernel&m=128291252612135&w=2
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17331
Subject		: BUG: scheduling while atomic
Submitter	: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date		: 2010-08-27 7:59 (3 days old)
Message-ID	: <20100827075911.GA5966@swordfish.minsk.epam.com>
References	: http://marc.info/?l=linux-kernel&m=128289602925505&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17321
Subject		: i386 WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e
Submitter	: Arno Schuring <aelschuring@hotmail.com>
Date		: 2010-08-27 20:04 (3 days old)
Message-ID	: <4C781A3A.4010707@hotmail.com>
References	: http://marc.info/?l=linux-kernel&m=128294076822387&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17311
Subject		: 2.6.36-rc2-git4 - INFO: possible circular locking dependency detected
Submitter	: Miles Lane <miles.lane@gmail.com>
Date		: 2010-08-27 1:56 (3 days old)
First-Bad-Commit: http://git.kernel.org/linus/5a652052fedbd7869572c757dd2ffc2ed420c69d
Message-ID	: <AANLkTinHbEW36D5R9NSrGgfbOC0Hri3Tg-fA0iR92Udi@mail.gmail.com>
References	: http://marc.info/?l=linux-kernel&m=128287422106267&w=2
Handled-By	: Johannes Berg <johannes@sipsolutions.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17301
Subject		: i915: 2.6.36-rc2 wrong resolution on gdm start
Submitter	: Ivan Bulatovic <combuster@gmx.com>
Date		: 2010-08-24 1:00 (6 days old)
First-Bad-Commit: http://git.kernel.org/linus/9d0498a2bf7455159b317f19531a3e5db2ecc9c4
Message-ID	: <1282611655.2177.19.camel@localhost.localdomain>
References	: http://marc.info/?l=linux-kernel&m=128261168202306&w=2
Handled-By	: Chris Wilson <chris@chris-wilson.co.uk>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17151
Subject		: i915: 2.6.36-rc2 hoses my Intel display
Submitter	: Jonathan Corbet <corbet@lwn.net>
Date		: 2010-08-23 17:01 (7 days old)
First-Bad-Commit: http://git.kernel.org/linus/32aad86fe88e7323d4fc5e9e423abcee0d55a03d
Message-ID	: <20100823110145.08eb72fd@bike.lwn.net>
References	: http://marc.info/?l=linux-kernel&m=128258293327326&w=2
Handled-By	: Chris Wilson <chris@chris-wilson.co.uk>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17131
Subject		: WARN with 3c905 boomerang NIC
Submitter	: Doug Nazar <nazard.lkml@gmail.com>
Date		: 2010-08-22 6:35 (8 days old)
Message-ID	: <4C70C516.5020404@gmail.com>
References	: http://marc.info/?l=linux-kernel&m=128245894300623&w=2
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17061
Subject		: 2.6.36-rc1 on zaurus: bluetooth regression
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2010-08-21 15:24 (9 days old)
Message-ID	: <20100821152445.GA1536@ucw.cz>
References	: http://marc.info/?l=linux-kernel&m=128240433828087&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17041
Subject		: 2.6.36-rc1 hangs during XFS barrier test for /
Submitter	: Torsten Kaiser <just.for.lkml@googlemail.com>
Date		: 2010-08-20 (10 days old)
Message-ID	: <AANLkTim1PXibiY98GUdMj-UZLTav+n7GAnJ0Mjn6_5a3@mail.gmail.com>
References	: http://marc.info/?l=linux-kernel&m=128231691708710&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=17021
Subject		: [REGRESSION] [2.6.36-rc1] [DRM INTEL] [drm:intel_calculate_wm] *ERROR* Insufficient FIFO for plane, expect flickering: entries required = 36, available = 28.
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2010-08-18 18:46 (12 days old)
Message-ID	: <201008182046.37732.maciej.rutecki@gmail.com>
References	: http://marc.info/?l=linux-kernel&m=128215721507666&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16971
Subject		: qla4xxx compile failure on 32-bit PowerPC: missing readq and writeq
Submitter	: Meelis Roos <mroos@linux.ee>
Date		: 2010-08-19 21:03 (11 days old)
Message-ID	: <alpine.SOC.1.00.1008192359310.19654@math.ut.ee>
References	: http://marc.info/?l=linux-kernel&m=128225184900892&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16961
Subject		: kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978
Submitter	: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date		: 2010-08-19 9:54 (11 days old)
Message-ID	: <20100819095429.GA5201@swordfish.minsk.epam.com>
References	: http://marc.info/?l=linux-kernel&m=128221169606214&w=2
Handled-By	: Avi Kivity <avi@redhat.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16951
Subject		: hackbench regression with 2.6.36-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2010-08-18 6:18 (12 days old)
Message-ID	: <1282112318.21202.8.camel@ymzhang.sh.intel.com>
References	: http://marc.info/?l=linux-kernel&m=128211235904910&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16881
Subject		: [REGRESSION, Radeon-KMS] 2.6.36-rc1,2 - missing textures in 0 A.D.
Submitter	:  <trapdoor6@gmail.com>
Date		: 2010-08-24 12:20 (6 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16626
Subject		: Machine hangs with EIP at skb_copy_and_csum_dev
Submitter	: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Date		: 2010-08-19 09:57 (11 days old)
Handled-By	:  Eric Dumazet <eric.dumazet@gmail.com>


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16629
Subject		: fix BUG: using smp_processor_id() in preemptible code (resend)
Submitter	: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date		: 2010-08-18 9:11 (12 days old)
Message-ID	: <20100818091157.GA5238@swordfish.minsk.epam.com>
References	: http://marc.info/?l=linux-kernel&m=128212276618793&w=2
Handled-By	: H. Peter Anvin <hpa@zytor.com>
Patch		: http://marc.info/?l=linux-kernel&m=128212276618793&w=2


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.35,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=16444

Please let the tracking team know if there are any Bugzilla entries that
should be added to the list in there.

Thanks!


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]     ` <courier.4C7C99F2.00001F74-Xdw7EbNJKi3354cJYj5R/Q@public.gmane.org>
@ 2010-08-31 19:26       ` Jarek Poplawski
  2010-09-01 10:50         ` Eric Dumazet
  2018-08-29 21:39         ` Mitchell Erblich
  0 siblings, 2 replies; 21+ messages in thread
From: Jarek Poplawski @ 2010-08-31 19:26 UTC (permalink / raw)
  To: Plamen Petrov
  Cc: Rafael J. Wysocki, Kernel Testers List, Maciej Rutecki,
	David S. Miller, Eric Dumazet, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Tue, Aug 31, 2010 at 08:58:10AM +0300, Plamen Petrov wrote:
> Rafael J. Wysocki ????????????:
> 
> >This message has been generated automatically as a part of a summary report
> >of recent regressions.
> >
> >The following bug entry is on the current list of known regressions
> >from 2.6.35.  Please verify if it still should be listed and let the tracking team
> >know (either way).
> >
> >
> >Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16626
> >Subject		: Machine hangs with EIP at skb_copy_and_csum_dev
> >Submitter	: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> >Date		: 2010-08-19 09:57 (11 days old)
> >Handled-By	:  Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >
> >
> 
> Should "generic receive offload" work on a forwarding setup?
> If yes - then the bug should remain open.
> If not - then it's my mistake.

If/since it's on by default it should work and it definitely can't be
your mistake. (Unless we can't find the real reason... ;-)

Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-08-31 19:26       ` [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev Jarek Poplawski
@ 2010-09-01 10:50         ` Eric Dumazet
  2010-09-01 11:20           ` Jarek Poplawski
                             ` (4 more replies)
  2018-08-29 21:39         ` Mitchell Erblich
  1 sibling, 5 replies; 21+ messages in thread
From: Eric Dumazet @ 2010-09-01 10:50 UTC (permalink / raw)
  To: Jarek Poplawski, Plamen Petrov, Herbert Xu
  Cc: Rafael J. Wysocki, Kernel Testers List, Maciej Rutecki,
	David S. Miller, Linux Kernel Mailing List, netdev

Plamen, could you test following patch ?

I reproduced problem on a dev machine and following patch cured it.

Thanks

[PATCH] gro: fix different skb headrooms

packets entering GRO might have different headrooms, even for a given
flow (because of implementation details in drivers, like copybreak).
We cant force drivers to deliver packets with a fixed headroom.

1) fix skb_segment()

skb_segment() makes the false assumption headrooms of fragments are same
than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
errors, and crash later in skb_copy_and_csum_dev()

2) allocate a minimal skb for head of frag_list

skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
allocate a fresh skb. This adds NET_SKB_PAD to a padding already
provided by netdevice, depending on various things, like copybreak.

Use alloc_skb() to allocate an exact padding, to reduce cache line
needs:
NET_SKB_PAD + NET_IP_ALIGN

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626

Many thanks to Plamen Petrov, testing many debugging patches !
With help of Jarek Poplawski.

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
---
patch against linux-2.6 current tree

 net/core/skbuff.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3a2513f..26396ff 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 		__copy_skb_header(nskb, skb);
 		nskb->mac_len = skb->mac_len;
 
+		/* nskb and skb might have different headroom */
+		if (nskb->ip_summed == CHECKSUM_PARTIAL)
+			nskb->csum_start += skb_headroom(nskb) - headroom;
+
 		skb_reset_mac_header(nskb);
 		skb_set_network_header(nskb, skb->mac_len);
 		nskb->transport_header = (nskb->network_header +
@@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 	} else if (skb_gro_len(p) != pinfo->gso_size)
 		return -E2BIG;
 
-	headroom = skb_headroom(p);
-	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
+	headroom = NET_SKB_PAD + NET_IP_ALIGN;
+	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
 	if (unlikely(!nskb))
 		return -ENOMEM;
 



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 10:50         ` Eric Dumazet
@ 2010-09-01 11:20           ` Jarek Poplawski
       [not found]             ` <20100901112026.GA9468-8HppEYmqbBCE+EvaaNYduQ@public.gmane.org>
  2010-09-02  1:23           ` David Miller
                             ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2010-09-01 11:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Plamen Petrov, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> Plamen, could you test following patch ?
> 
> I reproduced problem on a dev machine and following patch cured it.
> 
> Thanks
> 
> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()

Eric, probably I missed something, but since the same test as in
skb_copy_and_csum_dev() gave different result a bit earlier on exactly
the same skb, I've suspected some sharing (or use after free)
problems, so I'm not sure your current diagnose can explain this.
(Unless this old test was dismissed later.)

Thanks,
Jarek P.

> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> patch against linux-2.6 current tree
> 
>  net/core/skbuff.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  		__copy_skb_header(nskb, skb);
>  		nskb->mac_len = skb->mac_len;
>  
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;
> +
>  		skb_reset_mac_header(nskb);
>  		skb_set_network_header(nskb, skb->mac_len);
>  		nskb->transport_header = (nskb->network_header +
> @@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>  	} else if (skb_gro_len(p) != pinfo->gso_size)
>  		return -E2BIG;
>  
> -	headroom = skb_headroom(p);
> -	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
> +	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>  	if (unlikely(!nskb))
>  		return -ENOMEM;
>  
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]             ` <20100901112026.GA9468-8HppEYmqbBCE+EvaaNYduQ@public.gmane.org>
@ 2010-09-01 13:57               ` Eric Dumazet
  2010-09-01 15:05                 ` Jarek Poplawski
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2010-09-01 13:57 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Plamen Petrov, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA

Le mercredi 01 septembre 2010 à 11:20 +0000, Jarek Poplawski a écrit :
> On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> > Plamen, could you test following patch ?
> > 
> > I reproduced problem on a dev machine and following patch cured it.
> > 
> > Thanks
> > 
> > [PATCH] gro: fix different skb headrooms
> > 
> > packets entering GRO might have different headrooms, even for a given
> > flow (because of implementation details in drivers, like copybreak).
> > We cant force drivers to deliver packets with a fixed headroom.
> > 
> > 1) fix skb_segment()
> > 
> > skb_segment() makes the false assumption headrooms of fragments are same
> > than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> > errors, and crash later in skb_copy_and_csum_dev()
> 
> Eric, probably I missed something, but since the same test as in
> skb_copy_and_csum_dev() gave different result a bit earlier on exactly
> the same skb, I've suspected some sharing (or use after free)
> problems, so I'm not sure your current diagnose can explain this.
> (Unless this old test was dismissed later.)

Oh, this is because your patch had an error for the gso part that read :

-               rc = ops->ndo_start_xmit(nskb, dev);
+               if (skb_csum_start_bug(skb, 50)) {
+                       kfree_skb(skb);
+                       rc = NETDEV_TX_OK;
+               } else
+                       rc = ops->ndo_start_xmit(nskb, dev);
+
                if (unlikely(rc != NETDEV_TX_OK)) {
                        if (rc & ~NETDEV_TX_MASK)
                                goto out_kfree_gso_skb;

You called skb_csum_start_bug(skb, 50) instead of
skb_csum_start_bug(nskb, 50)

Hope this clarify a bit ;)

Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 13:57               ` Eric Dumazet
@ 2010-09-01 15:05                 ` Jarek Poplawski
  0 siblings, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2010-09-01 15:05 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Plamen Petrov, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Wed, Sep 01, 2010 at 03:57:41PM +0200, Eric Dumazet wrote:
> Le mercredi 01 septembre 2010 ?? 11:20 +0000, Jarek Poplawski a écrit :
> > On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> > > Plamen, could you test following patch ?
> > > 
> > > I reproduced problem on a dev machine and following patch cured it.
> > > 
> > > Thanks
> > > 
> > > [PATCH] gro: fix different skb headrooms
> > > 
> > > packets entering GRO might have different headrooms, even for a given
> > > flow (because of implementation details in drivers, like copybreak).
> > > We cant force drivers to deliver packets with a fixed headroom.
> > > 
> > > 1) fix skb_segment()
> > > 
> > > skb_segment() makes the false assumption headrooms of fragments are same
> > > than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> > > errors, and crash later in skb_copy_and_csum_dev()
> > 
> > Eric, probably I missed something, but since the same test as in
> > skb_copy_and_csum_dev() gave different result a bit earlier on exactly
> > the same skb, I've suspected some sharing (or use after free)
> > problems, so I'm not sure your current diagnose can explain this.
> > (Unless this old test was dismissed later.)
> 
> Oh, this is because your patch had an error for the gso part that read :
> 
> -               rc = ops->ndo_start_xmit(nskb, dev);
> +               if (skb_csum_start_bug(skb, 50)) {
> +                       kfree_skb(skb);
> +                       rc = NETDEV_TX_OK;
> +               } else
> +                       rc = ops->ndo_start_xmit(nskb, dev);
> +
>                 if (unlikely(rc != NETDEV_TX_OK)) {
>                         if (rc & ~NETDEV_TX_MASK)
>                                 goto out_kfree_gso_skb;
> 
> You called skb_csum_start_bug(skb, 50) instead of
> skb_csum_start_bug(nskb, 50)
> 
> Hope this clarify a bit ;)

All clear! Sorry for the false alarm!

Thanks,
Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 10:50         ` Eric Dumazet
  2010-09-01 11:20           ` Jarek Poplawski
@ 2010-09-02  1:23           ` David Miller
  2010-09-03  8:00           ` Plamen Petrov
                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2010-09-02  1:23 UTC (permalink / raw)
  To: eric.dumazet
  Cc: jarkao2, pvp-lsts, herbert, rjw, kernel-testers, maciej.rutecki,
	linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Sep 2010 12:50:51 +0200

> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Jarek Poplawski <jarkao2@gmail.com>

Good spotting, applied, thanks!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 10:50         ` Eric Dumazet
  2010-09-01 11:20           ` Jarek Poplawski
  2010-09-02  1:23           ` David Miller
@ 2010-09-03  8:00           ` Plamen Petrov
  2010-09-03  9:06             ` Jarek Poplawski
  2010-09-03  8:30           ` Herbert Xu
  2010-09-04 20:34           ` Jarek Poplawski
  4 siblings, 1 reply; 21+ messages in thread
From: Plamen Petrov @ 2010-09-03  8:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jarek Poplawski, Herbert Xu, Rafael J. Wysocki,
	Kernel Testers List, Maciej Rutecki, David S. Miller,
	Linux Kernel Mailing List, netdev

На 01.9.2010 г. 13:50, Eric Dumazet написа:
> Plamen, could you test following patch ?
>
> I reproduced problem on a dev machine and following patch cured it.
>
> Thanks
>
> [PATCH] gro: fix different skb headrooms
>
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
>
> 1) fix skb_segment()
>
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
>
> 2) allocate a minimal skb for head of frag_list
>
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
>
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
>
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
>
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
>
> Reported-by: Plamen Petrov<pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
> CC: Jarek Poplawski<jarkao2@gmail.com>
> ---
> patch against linux-2.6 current tree
>
>   net/core/skbuff.c |    8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>   		__copy_skb_header(nskb, skb);
>   		nskb->mac_len = skb->mac_len;
>
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;
> +
>   		skb_reset_mac_header(nskb);
>   		skb_set_network_header(nskb, skb->mac_len);
>   		nskb->transport_header = (nskb->network_header +
> @@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>   	} else if (skb_gro_len(p) != pinfo->gso_size)
>   		return -E2BIG;
>
> -	headroom = skb_headroom(p);
> -	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
> +	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>   	if (unlikely(!nskb))
>   		return -ENOMEM;
>
>
>

I confirm that the above patch applied on top of v2.6.36-rc3 does not
show the problems that all the kernels since v2.6.35 (both stable
and Linus' tree) had.

My problematic machine has been running the patched 36-rc3 for 36 hours, 
and couning, with "generic receive offload" enabled only my tg3 nic.

Thank you very much for the wonderful job, Eric!
Thanks to you, too, Jarek!

Plamen Petrov

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 10:50         ` Eric Dumazet
                             ` (2 preceding siblings ...)
  2010-09-03  8:00           ` Plamen Petrov
@ 2010-09-03  8:30           ` Herbert Xu
  2010-09-04 20:34           ` Jarek Poplawski
  4 siblings, 0 replies; 21+ messages in thread
From: Herbert Xu @ 2010-09-03  8:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jarek Poplawski, Plamen Petrov, Rafael J. Wysocki,
	Kernel Testers List, Maciej Rutecki, David S. Miller,
	Linux Kernel Mailing List, netdev-u79uwXL29TY76Z2rM5mHXA

On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> Plamen, could you test following patch ?
> 
> I reproduced problem on a dev machine and following patch cured it.
> 
> Thanks
> 
> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Thanks for diagnosing and fixing this!

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  		__copy_skb_header(nskb, skb);
>  		nskb->mac_len = skb->mac_len;
>  
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;

This test is redundant since we require CHECKSUM_PARTIAL for
GSO packets.

Cheers,
-- 
Email: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-03  8:00           ` Plamen Petrov
@ 2010-09-03  9:06             ` Jarek Poplawski
  0 siblings, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2010-09-03  9:06 UTC (permalink / raw)
  To: Plamen Petrov
  Cc: Eric Dumazet, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev

On Fri, Sep 03, 2010 at 11:00:52AM +0300, Plamen Petrov wrote:
> ???? 01.9.2010 ??. 13:50, Eric Dumazet ????????????:
>
> I confirm that the above patch applied on top of v2.6.36-rc3 does not
> show the problems that all the kernels since v2.6.35 (both stable
> and Linus' tree) had.
>
> My problematic machine has been running the patched 36-rc3 for 36 hours,  
> and couning, with "generic receive offload" enabled only my tg3 nic.
>
> Thank you very much for the wonderful job, Eric!
> Thanks to you, too, Jarek!

Not at all! I only messed up a bit :-(

All credits to Eric and Plamen!

Thanks again,
Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-01 10:50         ` Eric Dumazet
                             ` (3 preceding siblings ...)
  2010-09-03  8:30           ` Herbert Xu
@ 2010-09-04 20:34           ` Jarek Poplawski
       [not found]             ` <20100904203429.GA4891-qLhEM4ewh5CNj9Bq2fkWzw@public.gmane.org>
  4 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2010-09-04 20:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Plamen Petrov, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev

Eric Dumazet wrote, On 09/01/2010 12:50 PM:

> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Jarek Poplawski <jarkao2@gmail.com>
> ---
> patch against linux-2.6 current tree
> 
>  net/core/skbuff.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
...
> @@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>  	} else if (skb_gro_len(p) != pinfo->gso_size)
>  		return -E2BIG;
>  
> -	headroom = skb_headroom(p);
> -	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
> +	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>  	if (unlikely(!nskb))
>  		return -ENOMEM;

Hi again,

Just had a second look, and unless I miss something...

Plamen, could you test this patch, too? (Without removing the previous
one.)

Thanks,
Jarek P.

------------------->

[PATCH] gro: Re-fix different skb headrooms

The patch: "gro: fix different skb headrooms" in its part:
"2) allocate a minimal skb for head of frag_list" is buggy. The copied
skb has p->data set at the ip header at the moment, and skb_gro_offset
is the length of ip + tcp headers. So, after the change the length of
mac header is skipped. Later skb_set_mac_header() sets it into the
NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
original skb was wrongly allocated, so let's copy it as it was.

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
---

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 26396ff..c83b421 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2706,7 +2706,7 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 	} else if (skb_gro_len(p) != pinfo->gso_size)
 		return -E2BIG;
 
-	headroom = NET_SKB_PAD + NET_IP_ALIGN;
+	headroom = skb_headroom(p);
 	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
 	if (unlikely(!nskb))
 		return -ENOMEM;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]             ` <20100904203429.GA4891-qLhEM4ewh5CNj9Bq2fkWzw@public.gmane.org>
@ 2010-09-05  7:49               ` Eric Dumazet
  2010-09-08  4:57                 ` Plamen Petrov
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2010-09-05  7:49 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Plamen Petrov, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA

Le samedi 04 septembre 2010 à 22:34 +0200, Jarek Poplawski a écrit :

> Hi again,
> 
> Just had a second look, and unless I miss something...
> 
> Plamen, could you test this patch, too? (Without removing the previous
> one.)
> 
> Thanks,
> Jarek P.
> 
> ------------------->
> 
> [PATCH] gro: Re-fix different skb headrooms
> 
> The patch: "gro: fix different skb headrooms" in its part:
> "2) allocate a minimal skb for head of frag_list" is buggy. The copied
> skb has p->data set at the ip header at the moment, and skb_gro_offset
> is the length of ip + tcp headers. So, after the change the length of
> mac header is skipped. Later skb_set_mac_header() sets it into the
> NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
> NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
> original skb was wrongly allocated, so let's copy it as it was.
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57
> 
> Reported-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 26396ff..c83b421 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2706,7 +2706,7 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>  	} else if (skb_gro_len(p) != pinfo->gso_size)
>  		return -E2BIG;
>  
> -	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	headroom = skb_headroom(p);
>  	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>  	if (unlikely(!nskb))
>  		return -ENOMEM;

You are right, thanks for reviewing this patch again :)

By the way, NET_IP_ALIGN is now 0 on x86, so technically speaking, your
patch un-aligns IP header on x86, but thats OK, since other arches might
want it being aligned, while x86 doesnt care.

Acked-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-05  7:49               ` Eric Dumazet
@ 2010-09-08  4:57                 ` Plamen Petrov
  2010-09-08  6:20                   ` Jarek Poplawski
  0 siblings, 1 reply; 21+ messages in thread
From: Plamen Petrov @ 2010-09-08  4:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jarek Poplawski, Herbert Xu, Rafael J. Wysocki,
	Kernel Testers List, Maciej Rutecki, David S. Miller,
	Linux Kernel Mailing List, netdev-u79uwXL29TY76Z2rM5mHXA

На 05.9.2010 г. 10:49, Eric Dumazet написа:
> Le samedi 04 septembre 2010 à 22:34 +0200, Jarek Poplawski a écrit :
>
>> Hi again,
>>
>> Just had a second look, and unless I miss something...
>>
>> Plamen, could you test this patch, too? (Without removing the previous
>> one.)
>>
>> Thanks,
>> Jarek P.
>>
>> ------------------->
>>
>> [PATCH] gro: Re-fix different skb headrooms
>>
>> The patch: "gro: fix different skb headrooms" in its part:
>> "2) allocate a minimal skb for head of frag_list" is buggy. The copied
>> skb has p->data set at the ip header at the moment, and skb_gro_offset
>> is the length of ip + tcp headers. So, after the change the length of
>> mac header is skipped. Later skb_set_mac_header() sets it into the
>> NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
>> NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
>> original skb was wrongly allocated, so let's copy it as it was.
>>
>> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
>> fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57
>>
>> Reported-by: Plamen Petrov<pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
>> Signed-off-by: Jarek Poplawski<jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> CC: Eric Dumazet<eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> ---
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 26396ff..c83b421 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -2706,7 +2706,7 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>>   	} else if (skb_gro_len(p) != pinfo->gso_size)
>>   		return -E2BIG;
>>
>> -	headroom = NET_SKB_PAD + NET_IP_ALIGN;
>> +	headroom = skb_headroom(p);
>>   	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>>   	if (unlikely(!nskb))
>>   		return -ENOMEM;
>
> You are right, thanks for reviewing this patch again :)
>
> By the way, NET_IP_ALIGN is now 0 on x86, so technically speaking, your
> patch un-aligns IP header on x86, but thats OK, since other arches might
> want it being aligned, while x86 doesnt care.
>
> Acked-by: Eric Dumazet<eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>
>
>

Well, now that I'm back at work, I'm glad to report
that I tested both variants of the patch, and as Eric
points out - it works both ways.

So, which ever fits you guys better.

Thanks a lot!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-08  4:57                 ` Plamen Petrov
@ 2010-09-08  6:20                   ` Jarek Poplawski
  2010-09-08 17:32                     ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2010-09-08  6:20 UTC (permalink / raw)
  To: Plamen Petrov
  Cc: Eric Dumazet, Herbert Xu, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Linux Kernel Mailing List,
	netdev

On Wed, Sep 08, 2010 at 07:57:31AM +0300, Plamen Petrov wrote:
> ???? 05.9.2010 ??. 10:49, Eric Dumazet ????????????:
>> Le samedi 04 septembre 2010 ?? 22:34 +0200, Jarek Poplawski a écrit :
>>
>>> Hi again,
>>>
>>> Just had a second look, and unless I miss something...
>>>
>>> Plamen, could you test this patch, too? (Without removing the previous
>>> one.)
>>>
>>> Thanks,
>>> Jarek P.
>>>
>>> ------------------->
>>>
>>> [PATCH] gro: Re-fix different skb headrooms
>>>
>>> The patch: "gro: fix different skb headrooms" in its part:
>>> "2) allocate a minimal skb for head of frag_list" is buggy. The copied
>>> skb has p->data set at the ip header at the moment, and skb_gro_offset
>>> is the length of ip + tcp headers. So, after the change the length of
>>> mac header is skipped. Later skb_set_mac_header() sets it into the
>>> NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
>>> NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
>>> original skb was wrongly allocated, so let's copy it as it was.
>>>
>>> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
>>> fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57
>>>
>>> Reported-by: Plamen Petrov<pvp-lsts@fs.uni-ruse.bg>
>>> Signed-off-by: Jarek Poplawski<jarkao2@gmail.com>
>>> CC: Eric Dumazet<eric.dumazet@gmail.com>
>>> ---
>>>
>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>> index 26396ff..c83b421 100644
>>> --- a/net/core/skbuff.c
>>> +++ b/net/core/skbuff.c
>>> @@ -2706,7 +2706,7 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>>>   	} else if (skb_gro_len(p) != pinfo->gso_size)
>>>   		return -E2BIG;
>>>
>>> -	headroom = NET_SKB_PAD + NET_IP_ALIGN;
>>> +	headroom = skb_headroom(p);
>>>   	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>>>   	if (unlikely(!nskb))
>>>   		return -ENOMEM;
>>
>> You are right, thanks for reviewing this patch again :)
>>
>> By the way, NET_IP_ALIGN is now 0 on x86, so technically speaking, your
>> patch un-aligns IP header on x86, but thats OK, since other arches might
>> want it being aligned, while x86 doesnt care.
>>
>> Acked-by: Eric Dumazet<eric.dumazet@gmail.com>
>>
>>
>>
>
> Well, now that I'm back at work, I'm glad to report
> that I tested both variants of the patch, and as Eric
> points out - it works both ways.
>
> So, which ever fits you guys better.

We need both of them. I hope David could add this too:

Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>

Thanks a lot everybody!
Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-09-08  6:20                   ` Jarek Poplawski
@ 2010-09-08 17:32                     ` David Miller
       [not found]                       ` <20100908.103256.112592448.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-09-08 17:32 UTC (permalink / raw)
  To: jarkao2
  Cc: pvp-lsts, eric.dumazet, herbert, rjw, kernel-testers,
	maciej.rutecki, linux-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 8 Sep 2010 06:20:04 +0000

> We need both of them. I hope David could add this too:
> 
> Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>

Done, and applied, thanks :-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]                       ` <20100908.103256.112592448.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2010-09-08 20:21                         ` Rafael J. Wysocki
       [not found]                           ` <201009082221.16356.rjw-KKrjLPT3xs0@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Rafael J. Wysocki @ 2010-09-08 20:21 UTC (permalink / raw)
  To: David Miller
  Cc: jarkao2-Re5JQEeQqe8AvxtiuMwx3w, pvp-lsts-s6OjJRe3oxUfI6EYonfoRA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Wednesday, September 08, 2010, David Miller wrote:
> From: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Wed, 8 Sep 2010 06:20:04 +0000
> 
> > We need both of them. I hope David could add this too:
> > 
> > Tested-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> 
> Done, and applied, thanks :-)

Please kindly let me know when Linus gets them, so that I can close the bug.

Rafael

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]                           ` <201009082221.16356.rjw-KKrjLPT3xs0@public.gmane.org>
@ 2010-09-12  6:55                             ` Plamen Petrov
       [not found]                               ` <4C8C7957.7080207-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Plamen Petrov @ 2010-09-12  6:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: David Miller, jarkao2-Re5JQEeQqe8AvxtiuMwx3w,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, stable-DgEjT+Ai2ygdnm+yROfE0A

На 08.9.2010 г. 23:21, Rafael J. Wysocki написа:
> On Wednesday, September 08, 2010, David Miller wrote:
>> From: Jarek Poplawski<jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date: Wed, 8 Sep 2010 06:20:04 +0000
>>
>>> We need both of them. I hope David could add this too:
>>>
>>> Tested-by: Plamen Petrov<pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
>>
>> Done, and applied, thanks :-)
>
> Please kindly let me know when Linus gets them, so that I can close the bug.
>
> Rafael

Now that both commits that fix my problems are in Linus' tree, the
bug can be closed, but these fixes should go in 2.6.35.y, too.
So, CCing -stable.

Fix 1:
> commit	3d3be4333fdf6faa080947b331a6a19bce1a4f57
>
> gro: fix different skb headrooms
>
> Packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
>
> 1) fix skb_segment()
>
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
>
> 2) allocate a minimal skb for head of frag_list
>
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
>
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
>
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
>
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
>
> Reported-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

Fix 2:
> commit	64289c8e6851bca0e589e064c9a5c9fbd6ae5dd4
>
> gro: Re-fix different skb headrooms
>
> The patch: "gro: fix different skb headrooms" in its part:
> "2) allocate a minimal skb for head of frag_list" is buggy. The copied
> skb has p->data set at the ip header at the moment, and skb_gro_offset
> is the length of ip + tcp headers. So, after the change the length of
> mac header is skipped. Later skb_set_mac_header() sets it into the
> NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
> NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
> original skb was wrongly allocated, so let's copy it as it was.
>
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57
>
> Reported-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Acked-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Tested-by: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Signed-off-by: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

Thanks,
Plamen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]                               ` <4C8C7957.7080207-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
@ 2010-09-12 15:55                                 ` David Miller
       [not found]                                   ` <20100912.085509.193705327.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2010-09-12 17:28                                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 21+ messages in thread
From: David Miller @ 2010-09-12 15:55 UTC (permalink / raw)
  To: pvp-lsts-s6OjJRe3oxUfI6EYonfoRA
  Cc: rjw-KKrjLPT3xs0, jarkao2-Re5JQEeQqe8AvxtiuMwx3w,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, stable-DgEjT+Ai2ygdnm+yROfE0A

From: Plamen Petrov <pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
Date: Sun, 12 Sep 2010 09:55:19 +0300

> Now that both commits that fix my problems are in Linus' tree, the
> bug can be closed, but these fixes should go in 2.6.35.y, too.
> So, CCing -stable.

You don't need to do this, I will submit whatever is appropriate for
networking to -stable as needed.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]                               ` <4C8C7957.7080207-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
  2010-09-12 15:55                                 ` David Miller
@ 2010-09-12 17:28                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 21+ messages in thread
From: Rafael J. Wysocki @ 2010-09-12 17:28 UTC (permalink / raw)
  To: Plamen Petrov
  Cc: David Miller, jarkao2-Re5JQEeQqe8AvxtiuMwx3w,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, stable-DgEjT+Ai2ygdnm+yROfE0A

On Sunday, September 12, 2010, Plamen Petrov wrote:
> На 08.9.2010 г. 23:21, Rafael J. Wysocki написа:
> > On Wednesday, September 08, 2010, David Miller wrote:
> >> From: Jarek Poplawski<jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> Date: Wed, 8 Sep 2010 06:20:04 +0000
> >>
> >>> We need both of them. I hope David could add this too:
> >>>
> >>> Tested-by: Plamen Petrov<pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> >>
> >> Done, and applied, thanks :-)
> >
> > Please kindly let me know when Linus gets them, so that I can close the bug.
> >
> > Rafael
> 
> Now that both commits that fix my problems are in Linus' tree, the
> bug can be closed, but these fixes should go in 2.6.35.y, too.
> So, CCing -stable.

Thanks, bug closed.

Rafael

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
       [not found]                                   ` <20100912.085509.193705327.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2010-09-13  6:49                                     ` Plamen Petrov
  0 siblings, 0 replies; 21+ messages in thread
From: Plamen Petrov @ 2010-09-13  6:49 UTC (permalink / raw)
  To: David Miller
  Cc: rjw-KKrjLPT3xs0, jarkao2-Re5JQEeQqe8AvxtiuMwx3w,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA,
	maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, stable-DgEjT+Ai2ygdnm+yROfE0A

На 12.9.2010 г. 18:55, David Miller написа:
> From: Plamen Petrov<pvp-lsts-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
> Date: Sun, 12 Sep 2010 09:55:19 +0300
>
>> Now that both commits that fix my problems are in Linus' tree, the
>> bug can be closed, but these fixes should go in 2.6.35.y, too.
>> So, CCing -stable.
>
> You don't need to do this, I will submit whatever is appropriate for
> networking to -stable as needed.

Sorry! I didn't know it will be taken care of.

Thanks!
Plamen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev
  2010-08-31 19:26       ` [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev Jarek Poplawski
  2010-09-01 10:50         ` Eric Dumazet
@ 2018-08-29 21:39         ` Mitchell Erblich
  1 sibling, 0 replies; 21+ messages in thread
From: Mitchell Erblich @ 2018-08-29 21:39 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Plamen Petrov, Rafael J. Wysocki, Kernel Testers List,
	Maciej Rutecki, David S. Miller, Eric Dumazet,
	Linux Kernel Mailing List, netdev

						OLD REPLY … and Correcting . Suggesting of Corrections of LONG TERM embedded problems 

Summary:

		Due to watermark awareness differences that is problematic in embedded systems, the GFP_ATOMIC which is not memory watermark aware is used in interrupt / atomic context.

		To properly monitor WATERMARK levels at suggested kernel locations, the “PROPER” GFP_FLAG SHOULD be GFP_NOWAIT. This is atomic/interupt friendly and is aware of memory watermarks, thus if memory is below the specified watermark level, it will then return a ENOMEM from that location.

					GFP_ATOMIC will not return ENOMEM and by the time the watermarks drop, ALL GFP_KERNEL callers are now SLEEPING.

					Please be embedded friendly with code patches…

		FYI: Embedded system due to no 2ndary drive can not clean PTEs (page frames), thus callers need to be aware of low memory issues and be able to reduce their memory consumption based on receiving ENOMEMs. GFP_KERNEL callers will just sleep until memory is back above the watermarks.

Mitchell Erblich
erblichs@earthlink.net



> On Aug 31, 2010, at 12:26 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> 
> On Tue, Aug 31, 2010 at 08:58:10AM +0300, Plamen Petrov wrote:
>> Rafael J. Wysocki ????????????:
>> 
>>> This message has been generated automatically as a part of a summary report
>>> of recent regressions.
>>> 
>>> The following bug entry is on the current list of known regressions
>>> from 2.6.35.  Please verify if it still should be listed and let the tracking team
>>> know (either way).
>>> 
>>> 
>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16626
>>> Subject		: Machine hangs with EIP at skb_copy_and_csum_dev
>>> Submitter	: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
>>> Date		: 2010-08-19 09:57 (11 days old)
>>> Handled-By	:  Eric Dumazet <eric.dumazet@gmail.com>
>>> 
>>> 
>> 
>> Should "generic receive offload" work on a forwarding setup?
>> If yes - then the bug should remain open.
>> If not - then it's my mistake.
> 
> If/since it's on by default it should work and it definitely can't be
> your mistake. (Unless we can't find the real reason... ;-)
> 
> Jarek P.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-08-29 21:39 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-29 22:24 2.6.36-rc3: Reported regressions from 2.6.35 Rafael J. Wysocki
     [not found] ` <4AUWBNzTkbD.A.ey.cGueMB@chimera>
     [not found]   ` <courier.4C7C99F2.00001F74@fs.ru.acad.bg>
     [not found]     ` <courier.4C7C99F2.00001F74-Xdw7EbNJKi3354cJYj5R/Q@public.gmane.org>
2010-08-31 19:26       ` [Bug #16626] Machine hangs with EIP at skb_copy_and_csum_dev Jarek Poplawski
2010-09-01 10:50         ` Eric Dumazet
2010-09-01 11:20           ` Jarek Poplawski
     [not found]             ` <20100901112026.GA9468-8HppEYmqbBCE+EvaaNYduQ@public.gmane.org>
2010-09-01 13:57               ` Eric Dumazet
2010-09-01 15:05                 ` Jarek Poplawski
2010-09-02  1:23           ` David Miller
2010-09-03  8:00           ` Plamen Petrov
2010-09-03  9:06             ` Jarek Poplawski
2010-09-03  8:30           ` Herbert Xu
2010-09-04 20:34           ` Jarek Poplawski
     [not found]             ` <20100904203429.GA4891-qLhEM4ewh5CNj9Bq2fkWzw@public.gmane.org>
2010-09-05  7:49               ` Eric Dumazet
2010-09-08  4:57                 ` Plamen Petrov
2010-09-08  6:20                   ` Jarek Poplawski
2010-09-08 17:32                     ` David Miller
     [not found]                       ` <20100908.103256.112592448.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2010-09-08 20:21                         ` Rafael J. Wysocki
     [not found]                           ` <201009082221.16356.rjw-KKrjLPT3xs0@public.gmane.org>
2010-09-12  6:55                             ` Plamen Petrov
     [not found]                               ` <4C8C7957.7080207-s6OjJRe3oxUfI6EYonfoRA@public.gmane.org>
2010-09-12 15:55                                 ` David Miller
     [not found]                                   ` <20100912.085509.193705327.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2010-09-13  6:49                                     ` Plamen Petrov
2010-09-12 17:28                                 ` Rafael J. Wysocki
2018-08-29 21:39         ` Mitchell Erblich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).