From: Bernhard Schmidt <berni@birkenwald.de>
To: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>,
Andrew Morton <akpm@linux-foundation.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"bugme-daemon@bugzilla.kernel.org"
<bugme-daemon@bugzilla.kernel.org>
Subject: Re: [Bugme-new] [Bug 12877] New: tg3: eth0 transit timed out, resetting -> dead NIC
Date: Tue, 24 Mar 2009 01:35:46 +0100
Message-ID: <49C82AE2.3080206@birkenwald.de>
In-Reply-To: <20090323181859.GA5473@xw6200.broadcom.net>
On 23.03.2009 19:18, Matt Carlson wrote:
Hello Matt,
>> Mar 22 04:06:46 svr02 kernel: [1392136.468921] PCI Memory Mapped IO Disabled!!!!
[...]
>> Mar 22 04:07:14 svr02 kernel: [1392164.768266] PCI Memory Mapped IO Disabled!!!!
>> at this point the "watchdog" kicked in and did rmmod/modprobe, so I
>> think the only thing you can read out of this debugging log is that
>> there was no kernel message right before MMIO got disabled and it takes
>> quite a while to fire the Tx timeout.
> So traffic on this box must be pretty light for the watchdog to fire off
> 30 seconds after the MMIO problem was detected, right? Interesting.
Just to make sure I didn't confuse you: the "watchdog" I was talking
about here is a shell script like this, run from cron every minute:
---
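# Probe the default gateway; ping exits non-zero if no reply comes back.
/bin/ping -q -c 5 <defaultgw> > /dev/null
RC=$?
if [ ${RC} -ne 0 ]; then
    # Network looks dead: reload the driver and bring the interface back up.
    rmmod tg3; sleep 5; modprobe tg3; sleep 5; ifup --force eth0
fi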
---
At :46 MMIO was disabled. At :00 the cron job started; it took until
:15 to detect that the network was dead and reload the module.
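As a rough sketch of that timeline, the snippet below recomputes the gaps from the timestamps quoted in this message. `to_seconds` is a hypothetical helper introduced only for illustration, and the 04:07:00 cron start is an assumption (cron firing at the top of the minute); it does not appear in the log itself.

```shell
# Convert an HH:MM:SS syslog timestamp to seconds since midnight.
# Hypothetical helper, not part of the original check script.
to_seconds() {
    h=${1%%:*}          # hours
    rest=${1#*:}
    m=${rest%%:*}       # minutes
    s=${rest#*:}        # seconds
    # Strip one leading zero so values like "08" are not read as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

mmio_disabled=$(to_seconds 04:06:46)   # first "PCI Memory Mapped IO Disabled!!!!" line
cron_start=$(to_seconds 04:07:00)      # assumption: cron starts the script on the minute
module_unload=$(to_seconds 04:07:15)   # "PCI INT B disabled", logged as tg3 is removed

echo "MMIO disabled -> cron start: $(( cron_start - mmio_disabled ))s"
echo "cron start -> rmmod: $(( module_unload - cron_start ))s"
echo "MMIO disabled -> rmmod: $(( module_unload - mmio_disabled ))s"
```

Under these assumptions the three gaps come out to 14, 15 and 29 seconds, which matches the roughly 30 seconds mentioned in the quoted reply.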
>> Mar 22 04:07:15 svr02 kernel: [1392165.540078] tg3 0000:03:04.1: PCI INT B disabled
>> Mar 22 04:07:16 svr02 kernel: [1392166.817125] tg3: tg3_abort_hw timed out for eth0, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
>> Mar 22 04:07:18 svr02 kernel: [1392168.398844] tg3: eth0: No firmware running.
>> Mar 22 04:07:29 svr02 kernel: [1392179.793309] tg3: eth0: Link is down.
>> Mar 22 04:07:31 svr02 kernel: [1392181.896030] tg3 0000:03:04.0: PCI INT A disabled
>> Mar 22 04:07:33 svr02 kernel: [1392183.957132] tg3.c:v3.94 (August 14, 2008)
>> Mar 22 04:07:33 svr02 kernel: [1392184.020034] tg3 0000:03:04.0: enabling device (0000 -> 0002)
>> Mar 22 04:07:33 svr02 kernel: [1392184.086083] tg3 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
The tg3 watchdog (tg3: eth0: transmit timed out, resetting) did not
appear at all in this cycle, so I guess the check script removed the
module before it could fire.
Yes, the NIC is very lightly loaded, around 100 kbps / 70 pps in each
direction, with occasional spikes.
>> I'm now switching to eth1.
> O.K. I eagerly await your results.
So far so good, but it has only been running for ~36 hours, which is
not much of a stability record yet :-)
I'll keep you updated.
Bernhard