From: Jesse Brandeburg <jesse.brandeburg@intel.com>
To: "Boris B. Zhmurov" <bb@kernelpanic.ru>,
Herbert Xu <herbert@gondor.apana.org.au>
Cc: Phil Oester <kernel@linuxace.com>,
Mark Nipper <nipsy@bitgnome.net>,
"David S. Miller" <davem@davemloft.net>,
jrlundgren@gmail.com, cat@zip.com.au, djani22@dynamicweb.hu,
yoseph.basri@gmail.com, mykleb@no.ibm.com, olel@ans.pl,
michal@feix.cz, chris@scorpion.nl, netdev@vger.kernel.org,
jesse.brandeburg@gmail.com, E1000-devel@lists.sourceforge.net,
Andi Kleen <ak@suse.de>, Jeff Garzik <jgarzik@pobox.com>
Subject: Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
Date: Fri, 14 Apr 2006 13:28:10 -0700
Message-ID: <444005DA.4090606@intel.com>
In-Reply-To: <44350056.80501@kernelpanic.ru>

Boris B. Zhmurov wrote:
> Hello, Jesse Brandeburg.
>
> On 06.04.2006 04:42 you said the following:
>
>> I built and tested the driver with patches on 2.6.16, with PCI-X
>> adapters. I removed some workarounds for PCIe adapters, but I don't
>> think anyone having this problem has a PCIe adapter anyway. I saw no
>> TX hangs and ran some bi-directional tests, so I think the driver
>> should work okay. Just warning you I did minimal testing.
>>
>> *********************
>> e1000: transmit the old fashioned way
>>
>> It seems back in the day of 2.6.11, there were no sk_forward_alloc
>> assertions. Forward port that transmit code to see if it fixes the
>> issues in today's kernel. Unfortunately it doesn't have all the bug
>> fixes that the current code has, but if we get transmit timeouts we
>> can add in workarounds appropriately.
>>
>> this changes only the e1000_tso function
>
> With this one still having:
>
> TCP: Treason uncloaked! Peer 80.72.16.78:11460/80 shrinks window 2223569515:2223569516. Repaired.
> KERNEL: assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (283)
> KERNEL: assertion (!sk->sk_forward_alloc) failed at net/ipv4/af_inet.c (150)

This is a very important result. It shows that the changes to the
driver to call pskb_expand_head for TSO operations are not the cause of
this problem.

We also have some new data from the last couple of days. First, I think
that this problem is likely not just e1000's fault. We have multiple
reports, both in bugzilla.kernel.org and from a distro, showing that this
problem has occurred on (at least) tg3-driven adapters as well as e1000.

I've been able to reliably reproduce this issue in-house (finally),
thanks to one of our testers. The test uses the tbench application
from the dbench package at samba.org:

 - on the server, start tbench_srv
 - on the machine you're trying to repro the issue on, start
   tbench 500 <server ip>
 - on another client, start tbench 50 <server ip>
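
The reproduction steps above can be sketched as a small dry-run helper.
This is only an illustration, not a tested harness: it prints the command
for each machine's role and does not run tbench itself. The function name
repro_cmd, the role names, and the default SERVER_IP (192.0.2.1, a
documentation address) are placeholders I made up; only the tbench_srv,
tbench 500 and tbench 50 invocations come from the steps above.

```shell
#!/bin/sh
# Dry-run sketch of the three-machine reproduction described above.
# It only prints the command for a given role; it does not run tbench.
# SERVER_IP, repro_cmd and the role names are illustrative placeholders.

SERVER_IP="${SERVER_IP:-192.0.2.1}"   # 192.0.2.1 is a documentation address

repro_cmd() {
    case "$1" in
        server) echo "tbench_srv" ;;             # machine acting as the server
        target) echo "tbench 500 $SERVER_IP" ;;  # machine under test, heavy load
        client) echo "tbench 50 $SERVER_IP" ;;   # second, lighter client
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print the command for the role given on the command line, if any.
if [ "$#" -gt 0 ]; then
    repro_cmd "$1"
fi
```

Saved under a name of your choosing, running it with the argument
"target" and SERVER_IP set to the real server address prints the command
to start on the machine you expect to hit the assertion.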

I've seen sk_forward_alloc assertions on both the server and the client,
each running 2.6.16. We're trying to figure out where there might be a
stale pointer to an sk that accesses the data after it has been freed.
Maybe something writes ff ff ff ff 00 00 00 00 to the memory after it is
freed?

It does seem that the load (the 500 threads) is important to this
failure. I've also seen a report that a memory-poisoning kernel caught
the failure.

Any investigation hints for me?

>> e1000: implement old xmit_frame
>>
>> It seems back in the day of 2.6.11, there were no sk_forward_alloc
>> assertions. Forward port that transmit code to see if it fixes the
>> issues in today's kernel. Unfortunately it doesn't have all the bug
>> fixes that the current code has, but if we get transmit timeouts we
>> can add in workarounds appropriately.
>>
>> this changes the e1000_xmit_frame function, and some ancillaries
>>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>
>
>
> Can't apply this one:
>
> [zhmurov@builds linux-2.6.16]$ patch -p1 <
> ../../../SOURCES/linux-2.6.16-e1000-implement_old_xmit_frame.patch
> patching file drivers/net/e1000/e1000_main.c
> Hunk #1 succeeded at 2620 (offset -105 lines).
> Hunk #2 FAILED at 2695.
> Hunk #4 FAILED at 2837.
> Hunk #5 FAILED at 2868.
> Hunk #6 FAILED at 2899.
> 4 out of 6 hunks FAILED -- saving rejects to file
> drivers/net/e1000/e1000_main.c.rej

Well, that seems kind of lame, but I think we got the data that we needed
from the first patch.