* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Jesse Brandeburg @ 2006-03-31 17:22 UTC (permalink / raw)
To: Ingo Oeser
Cc: Herbert Xu, David S. Miller, jesse.brandeburg, nipsy, jrlundgren,
cat, djani22, yoseph.basri, bb, mykleb, olel, michal, chris,
netdev, jesse.brandeburg, E1000-devel
In-Reply-To: <200603311418.47626.netdev@axxeo.de>
On Fri, 31 Mar 2006, Ingo Oeser wrote:
> Hi,
>
> Herbert Xu wrote:
>> On Fri, Mar 31, 2006 at 01:35:40AM -0800, David S. Miller wrote:
>>> He does not have TSO enabled, e1000 disables TSO when on a link speed
>>> slower than gigabit.
>
> dmesg|grep eth0
> [4294671.426000] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
> [4294679.125000] e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
>
>
> # ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
>
> So this theory doesn't seem to hold :-(
>
>> Indeed. But I think that only happens on PCI Express and I don't think
>> Ingo is using PCI Express.
>
> Right. PCI-Express is not available in this machine.
>
First, thanks for all the responses.
The 6.3.9-k4 driver in 2.6.16 doesn't turn off TSO for 10/100 links; 7.0.33
in 2.6.17-pre does. I think that will help alleviate some of the confusion.
I've been working hard to try to reproduce here, no luck so far.
Herbert's fixes are interesting and appreciated. I'm going to try to
generate tests today that will show that the bugs he's mentioned could
occur.
Jesse
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 17:19 UTC (permalink / raw)
To: Mark Nipper
Cc: Christiaan den Besten, Herbert Xu, David S. Miller,
jesse.brandeburg, jrlundgren, cat, djani22, yoseph.basri, mykleb,
olel, michal, netdev, jesse.brandeburg, E1000-devel, Andi Kleen,
Jeff Garzik
In-Reply-To: <20060331160121.GA19110@king.bitgnome.net>
Hello, Mark Nipper.
On 31.03.2006 20:01 you said the following:
> On 31 Mar 2006, Boris B. Zhmurov wrote:
>
>>stream.c (279) -> stream.c (283)
>>af_inet.c (148) -> af_inet.c (150)
>
>
> That will be because the patches changed the line numbers
> in the source I believe. Nothing helpful unfortunately.
>
Ok. Anyway, as the assertion is 100% repeatable on my server, I'm ready to
try any patches to get rid of this.
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [PATCH]: e1000: prevent statistics from getting garbled during reset.
From: Linas Vepstas @ 2006-03-31 17:06 UTC (permalink / raw)
To: Greg KH
Cc: john.ronciak, jesse.brandeburg, jeffrey.t.kirsher, Jeff Garzik,
linux-kernel, netdev, linux-pci, linuxppc-dev
In-Reply-To: <20060331054654.GA6632@kroah.com>
On Thu, Mar 30, 2006 at 09:46:54PM -0800, Greg KH wrote:
>
> (hint, use a tab...)
glurg.
[PATCH]: e1000: prevent statistics from getting garbled during reset.
If a PCI bus error/fault triggers a PCI bus reset, attempts to get the
ethernet packet count statistics from the hardware will fail, returning
garbage data upstream. This patch skips statistics data collection
if the PCI device is not on the bus.
This patch presumes that an earlier patch,
[PATCH] PCI Error Recovery: e1000 network device driver
has already been applied.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
----
drivers/net/e1000/e1000_main.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletion(-)
Index: linux-2.6.16-git6/drivers/net/e1000/e1000_main.c
===================================================================
--- linux-2.6.16-git6.orig/drivers/net/e1000/e1000_main.c 2006-03-30 17:51:37.924162779 -0600
+++ linux-2.6.16-git6/drivers/net/e1000/e1000_main.c 2006-03-30 17:54:07.659188391 -0600
@@ -3069,14 +3069,18 @@ void
 e1000_update_stats(struct e1000_adapter *adapter)
 {
 	struct e1000_hw *hw = &adapter->hw;
+	struct pci_dev *pdev = adapter->pdev;
 	unsigned long flags;
 	uint16_t phy_tmp;
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
-	/* Prevent stats update while adapter is being reset */
+	/* Prevent stats update while adapter is being reset,
+	 * or if the pci connection is down. */
 	if (adapter->link_speed == 0)
 		return;
+	if (pdev->error_state && pdev->error_state != pci_channel_io_normal)
+		return;
 	spin_lock_irqsave(&adapter->stats_lock, flags);
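The guard added by the patch above can be tried outside the kernel with a small mock. This is only a sketch: `mock_adapter`, `mock_pci_channel_state` and `stats_update_allowed` are hypothetical names invented here, and the enum values are assumed to mirror the kernel's PCI channel states. The point it illustrates is that an `error_state` of zero (never initialised) is treated the same as "normal", so only a frozen or failed channel blocks the statistics read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PCI channel states.
 * The numeric values are an assumption made for this sketch. */
enum mock_pci_channel_state {
	MOCK_PCI_CHANNEL_UNSET = 0,           /* field never initialised */
	MOCK_PCI_CHANNEL_IO_NORMAL = 1,       /* I/O channel is working */
	MOCK_PCI_CHANNEL_IO_FROZEN = 2,       /* I/O blocked during recovery */
	MOCK_PCI_CHANNEL_IO_PERM_FAILURE = 3, /* device is gone for good */
};

/* Hypothetical, cut-down adapter: only the two fields the guard reads. */
struct mock_adapter {
	unsigned int link_speed;
	enum mock_pci_channel_state error_state;
};

/* Returns true when it is safe to read hardware statistics,
 * following the same two checks as the patched function. */
bool stats_update_allowed(const struct mock_adapter *adapter)
{
	assert(adapter != NULL);

	/* Adapter is being reset: link speed is reported as 0. */
	if (adapter->link_speed == 0)
		return false;

	/* PCI channel is down: a non-zero state other than "normal". */
	if (adapter->error_state &&
	    adapter->error_state != MOCK_PCI_CHANNEL_IO_NORMAL)
		return false;

	return true;
}
```

Under these assumptions, a frozen channel or a zero link speed skips the update, while an unset or normal state lets it proceed.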
^ permalink raw reply
* Re: [PATCH]: e1000: prevent statistics from getting garbled during reset.
From: Linas Vepstas @ 2006-03-31 17:03 UTC (permalink / raw)
To: Jeffrey V. Merkey
Cc: john.ronciak, jesse.brandeburg, jeffrey.t.kirsher, Jeff Garzik,
linux-kernel, netdev, linux-pci, linuxppc-dev
In-Reply-To: <442CACC0.1060308@wolfmountaingroup.com>
On Thu, Mar 30, 2006 at 09:14:56PM -0700, Jeffrey V. Merkey wrote:
> Yes, we need one. The adapter needs to maintain these stats from the
> registers in the kernel structure and not
> its own local variables.
Did you read the code to see what the adapter does with these stats?
Among other things, it uses them to adaptively modulate transmit rates
to avoid collisions. Just clearing the hardware-private stats will mess
up that function.
> That way, when someone calls to clear the stats
> for testing and analysis purposes,
> they zero out and are reset.
1) ifdown/ifup is guaranteed to clear things. Try that.
2) What's wrong with taking deltas? Typical throughput performance
measurement is done by pre-loading the pipes (i.e. running for
a few minutes without measuring, then starting the measurement).
I'd think that snapshotting the numbers would be easier, and is
trivially doable in user-space. I guess I don't understand why
you need a new kernel feature to implement this.
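The "taking deltas" approach from point 2 can be sketched in user space roughly as follows. This is an illustration, not code from the thread: `stats_snapshot`, `counter_delta` and `packets_per_second` are hypothetical names, and the counters are assumed to be 64 bits wide (on a system that exports 32-bit counters, the same idea works with `uint32_t`). How the snapshots are obtained, for example by reading the interface counters twice, is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of interface counters taken at one instant. */
struct stats_snapshot {
	uint64_t packets;   /* cumulative packet counter */
	uint64_t bytes;     /* cumulative byte counter */
	double   t_seconds; /* time the snapshot was taken */
};

/* Difference between two readings of a cumulative counter.
 * Unsigned subtraction is modulo 2^64, so the result is still
 * correct if the counter wrapped once between the readings. */
uint64_t counter_delta(uint64_t before, uint64_t after)
{
	return after - before;
}

/* Packet rate over the measurement window between two snapshots. */
double packets_per_second(const struct stats_snapshot *start,
			  const struct stats_snapshot *end)
{
	double elapsed = end->t_seconds - start->t_seconds;

	assert(elapsed > 0.0);
	return (double)counter_delta(start->packets, end->packets) / elapsed;
}
```

Taking the first snapshot only after the warm-up period gives the same effect as clearing the counters, without touching the values the driver itself relies on.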
--linas
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Mark Nipper @ 2006-03-31 16:01 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Christiaan den Besten, Mark Nipper, Herbert Xu, David S. Miller,
jesse.brandeburg, jrlundgren, cat, djani22, yoseph.basri, mykleb,
olel, michal, netdev, jesse.brandeburg, E1000-devel, Andi Kleen,
Jeff Garzik
In-Reply-To: <442D486D.909@kernelpanic.ru>
On 31 Mar 2006, Boris B. Zhmurov wrote:
> stream.c (279) -> stream.c (283)
> af_inet.c (148) -> af_inet.c (150)
That will be because the patches changed the line numbers
in the source I believe. Nothing helpful unfortunately.
--
Mark Nipper e-contacts:
832 Tanglewood Drive nipsy@bitgnome.net
Bryan, Texas 77802-4013 http://nipsy.bitgnome.net/
(979)575-3193 AIM/Yahoo: texasnipsy ICQ: 66971617
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GG/IT d- s++:+ a- C++$ UBL++++$ P--->+++ L+++$ !E---
W++(--) N+ o K++ w(---) O++ M V(--) PS+++(+) PE(--)
Y+ PGP t+ 5 X R tv b+++@ DI+(++) D+ G e h r++ y+(**)
------END GEEK CODE BLOCK------
---begin random quote of the moment---
"Whiskey-Tango-Foxtrot, over."
-- anonymous
----end random quote of the moment----
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 15:19 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Christiaan den Besten, Mark Nipper, Herbert Xu, David S. Miller,
jesse.brandeburg, jrlundgren, cat, djani22, yoseph.basri, mykleb,
olel, michal, netdev, jesse.brandeburg, E1000-devel, Andi Kleen,
Jeff Garzik
In-Reply-To: <442D45EA.9010309@kernelpanic.ru>
Hello, Boris B. Zhmurov.
On 31.03.2006 19:08 you said the following:
> Hmm... with the latest debug patch I can't see any debug info:
But wait a minute. Two days ago, without Herbert's patches, the assertion
errors looked like this:
Mar 29 20:03:23 msk4 kernel: KERNEL: assertion (!sk->sk_forward_alloc)
failed at net/core/stream.c (279)
Mar 29 20:03:23 msk4 kernel: KERNEL: assertion (!sk->sk_forward_alloc)
failed at net/ipv4/af_inet.c (148)
and after applying the patches, the errors look like this:
Mar 31 18:21:06 msk4 kernel: KERNEL: assertion (!sk->sk_forward_alloc)
failed at net/core/stream.c (283)
Mar 31 18:21:06 msk4 kernel: KERNEL: assertion (!sk->sk_forward_alloc)
failed at net/ipv4/af_inet.c (150)
stream.c (279) -> stream.c (283)
af_inet.c (148) -> af_inet.c (150)
Does it really matter?
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 15:08 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Christiaan den Besten, Mark Nipper, Herbert Xu, David S. Miller,
jesse.brandeburg, jrlundgren, cat, djani22, yoseph.basri, mykleb,
olel, michal, netdev, jesse.brandeburg, E1000-devel, Andi Kleen,
Jeff Garzik
In-Reply-To: <442D2EF6.1040703@kernelpanic.ru>
Hello, Boris B. Zhmurov.
On 31.03.2006 17:30 you said the following:
> Herbert, with your second patch still no luck. After an hour of uptime I
> have assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (283)
> again...
>
> Trying your debug patch.
Hmm... with the latest debug patch I can't see any debug info:
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
e1000: eth1: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
e1000: eth1: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
KERNEL: assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (283)
KERNEL: assertion (!sk->sk_forward_alloc) failed at net/ipv4/af_inet.c (150)
Is it normal?
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 13:30 UTC (permalink / raw)
To: Christiaan den Besten
Cc: Mark Nipper, Herbert Xu, David S. Miller, jesse.brandeburg,
jrlundgren, cat, djani22, yoseph.basri, mykleb, olel, michal,
netdev, jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <045601c654c4$c3dece80$3d64880a@speedy>
Hello, Christiaan den Besten.
On 31.03.2006 17:12 you said the following:
> Hi !
>
>> P.S. I have another high-load server as gateway. Same distro, same
>> kernels, but less memory (512Mb lowmem). eth0 up - e100, eth1 up -
>> e1000. No errors at all! It kinda looks like assertions happen on
>> systems where the _only_ interface _eth1_ e1000 is up.
>
>
> No, we have a couple of gateways asserting.
Yes, my mistake :( My server asserts with both eth0 and eth1 up...
Herbert, with your second patch still no luck. After an hour of uptime I
have assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (283)
again...
Trying your debug patch.
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Christiaan den Besten @ 2006-03-31 13:12 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Mark Nipper, Herbert Xu, David S. Miller, jesse.brandeburg,
jrlundgren, cat, djani22, yoseph.basri, mykleb, olel, michal,
netdev, jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <442D24AA.8080609@kernelpanic.ru>
Hi !
> P.S. I have another high-load server as gateway. Same distro, same kernels, but less memory (512Mb lowmem). eth0 up - e100, eth1
> up - e1000. No errors at all! It kinda looks like assertions happen on systems where the _only_ interface _eth1_ e1000 is up.
No, we have a couple of gateways asserting.
2x : Usenet feeder : Onboard eth0 and eth1 "Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)" ->
asserts (lots of disk activity (writes) as well, by the way ...). SMP, 4Gb RAM. (2.6.14-mm2)
4x : Usenet cache : PCI-X eth0 "Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 04)" -> no asserts
(no disk activity). Has 2 extra onboard e1000's, but are not used (Ethernet controller: Intel Corporation 82541GI/PI Gigabit
Ethernet Controller). SMP, 2Gb RAM (2.6.15.1)
bye,
Chris
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 12:46 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Mark Nipper, Herbert Xu, David S. Miller, jesse.brandeburg,
jrlundgren, cat, djani22, yoseph.basri, mykleb, olel, michal,
chris, netdev, jesse.brandeburg, E1000-devel, Andi Kleen,
Jeff Garzik
In-Reply-To: <442D1F26.8050601@kernelpanic.ru>
Hello, Boris B. Zhmurov.
On 31.03.2006 16:23 you said the following:
> Hello, Mark Nipper.
>
> On 31.03.2006 16:10 you said the following:
>
>> This unfortunately is not the case. I have two e1000
>> interfaces but only eth1 is up and in use. And I still had
>> assertions.
>
>
>
> Can you switch to eth0? There is no problem with _eth0_, my friend says.
P.S. I have another high-load server as gateway. Same distro, same
kernels, but less memory (512Mb lowmem). eth0 up - e100, eth1 up -
e1000. No errors at all! It kinda looks like assertions happen on
systems where the _only_ interface _eth1_ e1000 is up.
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: JaniD++ @ 2006-03-31 12:45 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: davem, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, "Andi Kleen",
"Jeff Garzik"
In-Reply-To: <442D1B67.8000804@kernelpanic.ru>
----- Original Message -----
From: "Boris B. Zhmurov" <bb@kernelpanic.ru>
To: "Herbert Xu" <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>; <jesse.brandeburg@intel.com>;
<nipsy@bitgnome.net>; <jrlundgren@gmail.com>; <cat@zip.com.au>;
<djani22@dynamicweb.hu>; <yoseph.basri@gmail.com>; <mykleb@no.ibm.com>;
<olel@ans.pl>; <michal@feix.cz>; <chris@scorpion.nl>;
<netdev@vger.kernel.org>; <jesse.brandeburg@gmail.com>;
<E1000-devel@lists.sourceforge.net>; "Andi Kleen" <ak@suse.de>; "Jeff
Garzik" <jgarzik@pobox.com>
Sent: Friday, March 31, 2006 2:07 PM
Subject: Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
> Hello, Herbert Xu.
>
> On 31.03.2006 14:39 you said the following:
>
> > On Fri, Mar 31, 2006 at 02:16:38PM +0400, Boris B. Zhmurov wrote:
> >
> >>And xdelta tells, that e1000.ko was modified :)
> >
> >
> > Thanks for checking again.
> >
> > Anyway, it didn't take long to find another bug in the same area.
> > I'm afraid this driver does seem to be full of them :)
> >
> > It sets last_tx_tso in between computing the number of descriptors and
> > calling e1000_tx_map. This is bad because e1000_tx_map gets the wrong
> > value for last_tx_tso and therefore may corrupt memory for every TSO
> > packet when the ring is almost full.
> >
> > This bug exists on UP as well as SMP.
> >
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> >
> > Please try this in conjunction with the previous patch.
> >
> > Cheers,
>
>
> David, Herbert - FYI. One of my colleagues confirmed that the idea "bug
> reproducible only if there is more than one e1000 adapter onboard" is
> true. He has 3 servers with double intel pro 1000 adapters, and that
> bug occurs. Also, he has 4 servers with double intel pro 1000 adapters
> onboard, but _only one_ of them is up. And there are no such messages in
> dmesg at all! Interesting...
This is not unique!
Only _one_ of my 2 identical NICs gets this message:
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
with the old 2.6.15.* e1000 driver!
Not all e1000 chips with the same part number are really identical!
Could this be a hardware-based problem that needs a workaround?
Cheers,
>
> --
> Boris B. Zhmurov
> mailto: bb@kernelpanic.ru
> "wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
>
> _____________ NOD32 1.584 (20031220) Information _____________
>
> This message was checked by the NOD32 Antivirus System.
> http://www.nod32.hu
>
>
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 12:36 UTC (permalink / raw)
To: Herbert Xu
Cc: Mark Nipper, David S. Miller, jesse.brandeburg, jrlundgren, cat,
djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <20060331123514.GA13500@gondor.apana.org.au>
Hello, Herbert Xu.
On 31.03.2006 16:35 you said the following:
> On Fri, Mar 31, 2006 at 04:23:02PM +0400, Boris B. Zhmurov wrote:
>
>>I'm already using kernel with second Herbert's patch. We'll see...
>
>
> If it still fails
Not yet. But give it time :)
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Herbert Xu @ 2006-03-31 12:35 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Mark Nipper, David S. Miller, jesse.brandeburg, jrlundgren, cat,
djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <442D1F26.8050601@kernelpanic.ru>
[-- Attachment #1: Type: text/plain, Size: 459 bytes --]
On Fri, Mar 31, 2006 at 04:23:02PM +0400, Boris B. Zhmurov wrote:
>
> I'm already using kernel with second Herbert's patch. We'll see...
If it still fails, here is a debugging patch which should tell us
whether we need to look elsewhere.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: e1000-debug.patch --]
[-- Type: text/plain, Size: 662 bytes --]
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 49cd096..64ac6f4 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2906,6 +2906,13 @@
 			e1000_tx_map(adapter, tx_ring, skb, first,
 			             max_per_txd, nr_frags, mss));
+	tso = tx_ring->next_to_use - first;
+	if (tso < 0)
+		tso += tx_ring->count;
+	if (unlikely(tso > count))
+		printk(KERN_ERR "e1000 bug: mss=%d, len=%d, frags=%d, est=%d, actual=%d\n",
+		       mss, skb->len, nr_frags, count, tso);
+
 	netdev->trans_start = jiffies;
 	/* Make sure there is space in the ring for the next send. */
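The check in the debugging patch above boils down to a wraparound-aware distance between two ring indices, compared against the earlier estimate. A standalone sketch of that arithmetic follows; `ring_descriptors_used` and `estimate_exceeded` are hypothetical helper names invented for illustration and do not exist in the driver.

```c
#include <assert.h>

/* Number of descriptors consumed between index `first` and index
 * `next_to_use` in a ring of `ring_count` slots. If next_to_use has
 * wrapped past the end of the ring, the raw difference is negative
 * and is corrected by adding the ring size. */
int ring_descriptors_used(int first, int next_to_use, int ring_count)
{
	int used;

	assert(ring_count > 0);
	used = next_to_use - first;
	if (used < 0)
		used += ring_count;
	return used;
}

/* Non-zero when more descriptors were used than were estimated,
 * which is the condition the debug patch reports. */
int estimate_exceeded(int estimated, int actual)
{
	return actual > estimated;
}
```

For example, with a 256-entry ring, starting at index 250 and ending at index 3 means 9 descriptors were used; if the estimate was 8, the condition would fire.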
^ permalink raw reply related
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 12:23 UTC (permalink / raw)
To: Mark Nipper
Cc: Herbert Xu, David S. Miller, jesse.brandeburg, jrlundgren, cat,
djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <20060331121007.GA2146@king.bitgnome.net>
Hello, Mark Nipper.
On 31.03.2006 16:10 you said the following:
> This unfortunately is not the case. I have two e1000
> interfaces but only eth1 is up and in use. And I still had
> assertions.
Can you switch to eth0? There is no problem with _eth0_, my friend says.
> And I still had
> assertions.
I'm already using kernel with second Herbert's patch. We'll see...
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Ingo Oeser @ 2006-03-31 12:18 UTC (permalink / raw)
To: Herbert Xu
Cc: David S. Miller, jesse.brandeburg, nipsy, jrlundgren, cat,
djani22, yoseph.basri, bb, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel
In-Reply-To: <20060331094240.GA11040@gondor.apana.org.au>
Hi,
Herbert Xu wrote:
> On Fri, Mar 31, 2006 at 01:35:40AM -0800, David S. Miller wrote:
> > He does not have TSO enabled, e1000 disables TSO when on a link speed
> > slower than gigabit.
dmesg|grep eth0
[4294671.426000] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
[4294679.125000] e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
So this theory doesn't seem to hold :-(
> Indeed. But I think that only happens on PCI Express and I don't think
> Ingo is using PCI Express.
Right. PCI-Express is not available in this machine.
Maybe the traffic is not enough to trigger it. The external connection is just a 6 Mbit DSL line.
Regards
Ingo Oeser
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Mark Nipper @ 2006-03-31 12:10 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Herbert Xu, David S. Miller, jesse.brandeburg, nipsy, jrlundgren,
cat, djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <442D1B67.8000804@kernelpanic.ru>
On 31 Mar 2006, Boris B. Zhmurov wrote:
> David, Herbert - FYI. One of my colleagues confirmed that the idea "bug
> reproducible only if there is more than one e1000 adapter onboard" is
> true. He has 3 servers with double intel pro 1000 adapters, and that
> bug occurs. Also, he has 4 servers with double intel pro 1000 adapters
> onboard, but _only one_ of them is up. And there are no such messages in
> dmesg at all! Interesting...
This unfortunately is not the case. I have two e1000
interfaces but only eth1 is up and in use. And I still had
assertions. Hopefully the two already discovered problems will
fix things up for everyone though.
--
Mark Nipper e-contacts:
832 Tanglewood Drive nipsy@bitgnome.net
Bryan, Texas 77802-4013 http://nipsy.bitgnome.net/
(979)575-3193 AIM/Yahoo: texasnipsy ICQ: 66971617
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GG/IT d- s++:+ a- C++$ UBL++++$ P--->+++ L+++$ !E---
W++(--) N+ o K++ w(---) O++ M V(--) PS+++(+) PE(--)
Y+ PGP t+ 5 X R tv b+++@ DI+(++) D+ G e h r++ y+(**)
------END GEEK CODE BLOCK------
---begin random quote of the moment---
Generalizations are usually flawed by exceptions.
-- seen at http://wunderland.com/
----end random quote of the moment----
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 12:07 UTC (permalink / raw)
To: Herbert Xu
Cc: David S. Miller, jesse.brandeburg, nipsy, jrlundgren, cat,
djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <20060331103956.GA12181@gondor.apana.org.au>
Hello, Herbert Xu.
On 31.03.2006 14:39 you said the following:
> On Fri, Mar 31, 2006 at 02:16:38PM +0400, Boris B. Zhmurov wrote:
>
>>And xdelta tells, that e1000.ko was modified :)
>
>
> Thanks for checking again.
>
> Anyway, it didn't take long to find another bug in the same area.
> I'm afraid this driver does seem to be full of them :)
>
> It sets last_tx_tso in between computing the number of descriptors and
> calling e1000_tx_map. This is bad because e1000_tx_map gets the wrong
> value for last_tx_tso and therefore may corrupt memory for every TSO
> packet when the ring is almost full.
>
> This bug exists on UP as well as SMP.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Please try this in conjunction with the previous patch.
>
> Cheers,
David, Herbert - FYI. One of my colleagues confirmed that the idea "bug
reproducible only if there is more than one e1000 adapter onboard" is
true. He has 3 servers with double intel pro 1000 adapters, and that
bug occurs. Also, he has 4 servers with double intel pro 1000 adapters
onboard, but _only one_ of them is up. And there are no such messages in
dmesg at all! Interesting...
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: JaniD++ @ 2006-03-31 12:02 UTC (permalink / raw)
To: Herbert Xu
Cc: netdev, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, bb, mykleb, olel, michal, chris, netdev,
Jesse Brandeburg, E1000-devel
In-Reply-To: <20060331094240.GA11040@gondor.apana.org.au>
----- Original Message -----
From: "Herbert Xu" <herbert@gondor.apana.org.au>
To: "David S. Miller" <davem@davemloft.net>
Cc: <netdev@axxeo.de>; <jesse.brandeburg@intel.com>; <nipsy@bitgnome.net>;
<jrlundgren@gmail.com>; <cat@zip.com.au>; <djani22@dynamicweb.hu>;
<yoseph.basri@gmail.com>; <bb@kernelpanic.ru>; <mykleb@no.ibm.com>;
<olel@ans.pl>; <michal@feix.cz>; <chris@scorpion.nl>;
<netdev@vger.kernel.org>; <jesse.brandeburg@gmail.com>;
<E1000-devel@lists.sourceforge.net>
Sent: Friday, March 31, 2006 11:42 AM
Subject: Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
> On Fri, Mar 31, 2006 at 01:35:40AM -0800, David S. Miller wrote:
> >
> > He does not have TSO enabled, e1000 disables TSO when on a link speed
> > slower than gigabit.
>
> Indeed. But I think that only happens on PCI Express and I don't think
> Ingo is using PCI Express.
No, my card is a "64-bit PCI-X Rev. 1.0 master interface" (from the
datasheet).
Part number: "82546GB"
This is not a PCI Express issue!
Cheers,
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Andi Kleen @ 2006-03-31 11:15 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: Herbert Xu, David S. Miller, jesse.brandeburg, nipsy, jrlundgren,
cat, djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Jeff Garzik
In-Reply-To: <442D1B67.8000804@kernelpanic.ru>
On Friday 31 March 2006 14:07, Boris B. Zhmurov wrote:
> David, Herbert - FYI. One of my colleagues confirmed that the idea "bug
> reproducible only if there is more than one e1000 adapter onboard" is
> true. He has 3 servers with double intel pro 1000 adapters, and that
> bug occurs. Also, he has 4 servers with double intel pro 1000 adapters
> onboard, but _only one_ of them is up. And there are no such messages in
> dmesg at all! Interesting...
At least all our systems with troubles seem to have more than one e1000.
Usually only one is active, though.
We're still not 100% sure it is actually the E1000; it is a bit hard to
reproduce the memory corruption :/
-Andi
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 11:02 UTC (permalink / raw)
To: Herbert Xu
Cc: davem, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, ak, jgarzik
In-Reply-To: <E1FPHFJ-0003Fq-00@gondolin.me.apana.org.au>
Hello, Herbert Xu.
On 31.03.2006 14:52 you said the following:
> BTW, if you kept the built tree it is possible to apply the patch and
> then do a make which should compile just the e1000 driver.
>
> Cheers,
Thanks for the tip, actually I knew that :) First off, I've already
applied some other new patches from bk-commits-head, not for the e1000
driver. And second, I didn't keep the tree; rpmbuild cleaned it up :)
That's why I'm recompiling the entire kernel.
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Herbert Xu @ 2006-03-31 10:52 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: davem, herbert, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, ak, jgarzik
In-Reply-To: <442D09A3.3020700@kernelpanic.ru>
Boris B. Zhmurov <bb@kernelpanic.ru> wrote:
>
> Recompiling the kernel. I need about 2 hours to get the answer...
BTW, if you kept the built tree it is possible to apply the patch and
then do a make which should compile just the e1000 driver.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Mark Nipper @ 2006-03-31 10:51 UTC (permalink / raw)
To: David S. Miller
Cc: herbert, netdev, jesse.brandeburg, nipsy, jrlundgren, cat,
djani22, yoseph.basri, bb, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel
In-Reply-To: <20060331.013540.95485284.davem@davemloft.net>
On 31 Mar 2006, David S. Miller wrote:
> He does not have TSO enabled, e1000 disables TSO when on a link speed
> slower than gigabit.
>
> You'll see something like the following in your logs:
>
> e1000: eth0: e1000_watchdog_task: 10/100 speed: disabling TSO
Um...
---
$ uname -a
Linux king 2.6.16.1 #1 SMP Thu Mar 30 06:11:33 CST 2006 i686 GNU/Linux
$ dmesg | grep -i task
e1000: eth1: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
$ ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
---
I know for a fact the link is 100Mbps (other than the
output from the driver itself) and I have been bitten by the
assertion.
I've been running the first patch for about the last 24
hours and have not seen any assertions yet (although they don't
occur that frequently on this server). I'll be adding the
second, most recent patch in a bit and rebooting again.
Hopefully between the two of them, that will have fixed the
problem.
--
Mark Nipper e-contacts:
832 Tanglewood Drive nipsy@bitgnome.net
Bryan, Texas 77802-4013 http://nipsy.bitgnome.net/
(979)575-3193 AIM/Yahoo: texasnipsy ICQ: 66971617
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GG/IT d- s++:+ a- C++$ UBL++++$ P--->+++ L+++$ !E---
W++(--) N+ o K++ w(---) O++ M V(--) PS+++(+) PE(--)
Y+ PGP t+ 5 X R tv b+++@ DI+(++) D+ G e h r++ y+(**)
------END GEEK CODE BLOCK------
---begin random quote of the moment---
And if I close my mind in fear, please pry it open.
----end random quote of the moment----
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Boris B. Zhmurov @ 2006-03-31 10:51 UTC (permalink / raw)
To: David S. Miller
Cc: herbert, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, ak, jgarzik
In-Reply-To: <20060331.024544.96296223.davem@davemloft.net>
Hello, David S. Miller.
On 31.03.2006 14:45 you said the following:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Fri, 31 Mar 2006 21:39:56 +1100
>
>
>>Anyway, it didn't take long to find another bug in the same area.
>>I'm afraid this driver does seem to be full of them :)
>
>
> Indeed.
>
> Thanks for picking through this some more Herbert. I hope we got it
> this time.
Recompiling the kernel. I need about 2 hours to get the answer...
--
Boris B. Zhmurov
mailto: bb@kernelpanic.ru
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: David S. Miller @ 2006-03-31 10:45 UTC (permalink / raw)
To: herbert
Cc: bb, jesse.brandeburg, nipsy, jrlundgren, cat, djani22,
yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, ak, jgarzik
In-Reply-To: <20060331103956.GA12181@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 31 Mar 2006 21:39:56 +1100
> Anyway, it didn't take long to find another bug in the same area.
> I'm afraid this driver does seem to be full of them :)
Indeed.
Thanks for picking through this some more Herbert. I hope we got it
this time.
^ permalink raw reply
* Re: [e1000 debug] KERNEL: assertion (!sk_forward_alloc) failed...
From: Herbert Xu @ 2006-03-31 10:39 UTC (permalink / raw)
To: Boris B. Zhmurov
Cc: David S. Miller, jesse.brandeburg, nipsy, jrlundgren, cat,
djani22, yoseph.basri, mykleb, olel, michal, chris, netdev,
jesse.brandeburg, E1000-devel, Andi Kleen, Jeff Garzik
In-Reply-To: <442D0186.8090705@kernelpanic.ru>
[-- Attachment #1: Type: text/plain, Size: 890 bytes --]
On Fri, Mar 31, 2006 at 02:16:38PM +0400, Boris B. Zhmurov wrote:
>
> And xdelta tells, that e1000.ko was modified :)
Thanks for checking again.
Anyway, it didn't take long to find another bug in the same area.
I'm afraid this driver does seem to be full of them :)
The driver sets last_tx_tso in between computing the number of descriptors and
calling e1000_tx_map. This is bad because e1000_tx_map gets the wrong
value for last_tx_tso and therefore may corrupt memory for every TSO
packet when the ring is almost full.
This bug exists on UP as well as SMP.
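
The ordering problem described above can be illustrated with a small toy model. This is only a sketch, not the e1000 code: the struct, the function names, and the rule "one extra descriptor when the flag is set" are all hypothetical, chosen just to show why a flag that changes between the counting step and the mapping step lets the two steps disagree.

```c
#include <stdio.h>

/* Hypothetical, heavily simplified ring state; not the driver's real struct. */
struct toy_ring {
	int last_tx_tso;	/* nonzero if the previously queued packet used TSO */
};

/* Toy rule shared by both steps (an assumption of this sketch): one extra
 * descriptor is needed whenever the flag says the previous packet used TSO. */
static int descriptors_for(const struct toy_ring *ring, int base)
{
	return base + (ring->last_tx_tso ? 1 : 0);
}

/* Buggy ordering: the flag is updated between the count and the map step.
 * Returns how many descriptors the map step uses beyond what was reserved. */
static int xmit_buggy(struct toy_ring *ring, int base, int tso)
{
	int reserved = descriptors_for(ring, base);	/* count step */

	if (tso)
		ring->last_tx_tso = 1;			/* updated too early */

	int used = descriptors_for(ring, base);		/* map step sees new flag */
	return used - reserved;
}

/* Fixed ordering: both steps see the same flag; it is updated afterwards. */
static int xmit_fixed(struct toy_ring *ring, int base, int tso)
{
	int reserved = descriptors_for(ring, base);	/* count step */
	int used = descriptors_for(ring, base);		/* map step, same flag */

	ring->last_tx_tso = tso;			/* updated after mapping */
	return used - reserved;
}

/* Send one TSO packet on a fresh ring with each ordering and report. */
static void demo(void)
{
	struct toy_ring a = { 0 }, b = { 0 };

	printf("buggy overrun: %d\n", xmit_buggy(&a, 3, 1));
	printf("fixed overrun: %d\n", xmit_fixed(&b, 3, 1));
}
```

Calling demo() from a main() prints an overrun of 1 for the buggy ordering and 0 for the fixed one. In the toy model, an overrun means the map step touched one more descriptor than was reserved, which only matters when the ring is almost full.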
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Please try this in conjunction with the previous patch.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: e1000-tso.patch --]
[-- Type: text/plain, Size: 645 bytes --]
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 49cd096..38aeff9 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2891,7 +2891,6 @@
 	}

 	if (likely(tso)) {
-		tx_ring->last_tx_tso = 1;
 		tx_flags |= E1000_TX_FLAGS_TSO;
 	} else if (likely(e1000_tx_csum(adapter, tx_ring, skb)))
 		tx_flags |= E1000_TX_FLAGS_CSUM;
@@ -2905,6 +2904,8 @@

 	e1000_tx_queue(adapter, tx_ring, tx_flags,
 	               e1000_tx_map(adapter, tx_ring, skb, first,
 	                            max_per_txd, nr_frags, mss));
+
+	tx_ring->last_tx_tso = tso;

 	netdev->trans_start = jiffies;
^ permalink raw reply related