From: Bernhard Schmidt <berni@birkenwald.de>
To: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>,
Andrew Morton <akpm@linux-foundation.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"bugme-daemon@bugzilla.kernel.org"
<bugme-daemon@bugzilla.kernel.org>
Subject: Re: [Bugme-new] [Bug 12877] New: tg3: eth0 transit timed out, resetting -> dead NIC
Date: Tue, 24 Mar 2009 01:35:46 +0100
Message-ID: <49C82AE2.3080206@birkenwald.de>
In-Reply-To: <20090323181859.GA5473@xw6200.broadcom.net>
On 23.03.2009 19:18, Matt Carlson wrote:
Hello Matt,
>> Mar 22 04:06:46 svr02 kernel: [1392136.468921] PCI Memory Mapped IO Disabled!!!!
[...]
>> Mar 22 04:07:14 svr02 kernel: [1392164.768266] PCI Memory Mapped IO Disabled!!!!
>> at this point the "watchdog" kicked in and did rmmod/modprobe, so I
>> think the only thing you can read out of this debugging log is that
>> there was no kernel message right before MMIO got disabled and it takes
>> quite a while to fire the Tx timeout.
> So traffic on this box must be pretty light for the watchdog to fire off
> 30 seconds after the MMIO problem was detected, right? Interesting.
Just to make sure I didn't confuse you: the "watchdog" I was talking
about here is a shell script like this, run by cron every minute:
---
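# Ping the default gateway; <defaultgw> is a placeholder for its address.
/bin/ping -q -c 5 <defaultgw> > /dev/null
RC=$?
# If ping reports failure, reload the driver and bring eth0 back up.
if [ ${RC} -ne 0 ]; then
    rmmod tg3; sleep 5; modprobe tg3; sleep 5; ifup --force eth0
fi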
---
MMIO was disabled at :46. The cron job started at :00 and took until
:15 to detect that the network was dead, at which point it reloaded the
module.
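To make that timing concrete, here is a minimal sketch of the arithmetic. It assumes cron starts the script exactly on the minute boundary and that the failed ping run takes about 15 seconds, both read off the timestamps in the log. The function name is made up for illustration.

```shell
# Hypothetical helper: seconds from a fault until the check script reacts.
# $1 = second within the minute at which the fault occurred (0-59, no leading zero)
# $2 = seconds the ping run needs before it gives up (about 15 in the log)
detection_delay() {
    fault_sec=$1
    ping_secs=$2
    # Wait for the next minute boundary, when cron starts the script.
    wait_for_cron=$(( (60 - fault_sec) % 60 ))
    echo $(( wait_for_cron + ping_secs ))
}

detection_delay 46 15   # fault at :46, ping gives up after ~15 s; prints 29
```

That matches the 29 seconds between 04:06:46 and 04:07:15 in the log. Under the same assumptions the worst case would be a fault just after the minute boundary, roughly 59 + 15 seconds.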
>> Mar 22 04:07:15 svr02 kernel: [1392165.540078] tg3 0000:03:04.1: PCI INT B disabled
>> Mar 22 04:07:16 svr02 kernel: [1392166.817125] tg3: tg3_abort_hw timed out for eth0, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
>> Mar 22 04:07:18 svr02 kernel: [1392168.398844] tg3: eth0: No firmware running.
>> Mar 22 04:07:29 svr02 kernel: [1392179.793309] tg3: eth0: Link is down.
>> Mar 22 04:07:31 svr02 kernel: [1392181.896030] tg3 0000:03:04.0: PCI INT A disabled
>> Mar 22 04:07:33 svr02 kernel: [1392183.957132] tg3.c:v3.94 (August 14, 2008)
>> Mar 22 04:07:33 svr02 kernel: [1392184.020034] tg3 0000:03:04.0: enabling device (0000 -> 0002)
>> Mar 22 04:07:33 svr02 kernel: [1392184.086083] tg3 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
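As a side note on the "enabling device (0000 -> 0002)" line: it shows the PCI command register before and after the device is enabled, and bit 1 (0x0002) of that register is the Memory Space enable bit. Whether the earlier "PCI Memory Mapped IO Disabled" messages refer to exactly this bit is an assumption here, not something the log confirms. A minimal sketch for checking the bit in a hex value follows; the function name is hypothetical, and how the value is read from the device is left out.

```shell
# Hypothetical helper: succeed if the Memory Space enable bit (0x0002)
# is set in a PCI command register value given as hex digits.
mem_space_enabled() {
    [ $(( 0x$1 & 0x2 )) -ne 0 ]
}

# The two values from the "enabling device" log line.
for cmd in 0000 0002; do
    if mem_space_enabled "$cmd"; then
        echo "$cmd: memory space enabled"
    else
        echo "$cmd: memory space disabled"
    fi
done
```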
The tg3 watchdog (tg3: eth0: transmit timed out, resetting) did not
appear at all in this cycle, so I guess the check script unloaded the
module before it could fire.
Yes, the NIC is very lightly loaded, around 100kbps / 70pps in each
direction with a few occasional spikes.
>> I'm now switching to eth1.
> O.K. I eagerly await your results.
So far so good, but it has only been running for ~36 hours, so it is
too early to call it stable :-)
I'll keep you updated.
Bernhard