netdev.vger.kernel.org archive mirror
From: Philip Molter <philip@datafoundry.com>
To: Philip Molter <philip@datafoundry.com>
Cc: Michael Chan <mchan@broadcom.com>,
	Bernd Schubert <bernd-schubert@gmx.de>,
	netdev@vger.kernel.org
Subject: Re: tg3: tg3_stop_block timed out
Date: Sun, 03 Sep 2006 17:35:47 -0500
Message-ID: <44FB58C3.2060209@datafoundry.com>
In-Reply-To: <44D9F4EB.8050809@datafoundry.com>

Philip Molter wrote:
> Michael Chan wrote:
>> On Tue, 2006-08-08 at 01:24 +0200, Bernd Schubert wrote:
>>
>>> tg3.c:v3.49 (Feb 2, 2006)
>>> acpi_bus-0201 [01] bus_set_power         : Device is not power 
>>> manageable
>>> eth1: Tigon3 [partno(BCM95704A6) rev 2003 PHY(5704)] 
>>> (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2b:aa:28
>>> eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] 
>>> TSOcap[0]
>>> eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
>>> eth2: Tigon3 [partno(BCM95704A6) rev 2003 PHY(5704)] 
>>> (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2b:aa:29
>>> eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
>>> TSOcap[1]
>>> eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
>>>
>>
>> You have ASF enabled on eth1 but not on eth2 so I wonder if ASF is
>> causing the problem.  Can you run the same traffic on eth2 and see if
>> you get the same timeout problem?  Thanks.
> 
> I'm also having this same problem:

Is there any additional information that I can give to help get some 
more work targeted at this bug?  I've been getting this lockup three or 
four times a week per server (I have four of them exhibiting this behavior).

The network setup is fairly complicated and, unfortunately, these are 
production machines pushing multi-gigabit traffic loads.  We're using 
VLANs on top of bonding on top of anywhere from two to six Broadcom NICs, 
but the problem appears to be unrelated to the bonding and VLANs, 
as others are reporting similar problems without those enabled.

Any assistance would be appreciated.  I've left the original information 
below for reference.

If anyone could even explain what this error means, that would be 
helpful.  Maybe we can change something to work around it.
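
When gathering more data for a report like this, it can help to tally the
console messages per register offset and per interface.  The following is
a minimal Python sketch, not part of the tg3 driver or any existing tool.
The function name summarize and the SAMPLE text are illustrative
assumptions, and the patterns only cover the two message shapes quoted
below.

```python
import re
from collections import Counter

# Patterns for the two console message shapes quoted in this thread.
# The ofs value is assumed to be printed in hex without a 0x prefix.
STOP_BLOCK = re.compile(
    r"tg3: tg3_stop_block timed out, ofs=([0-9a-fA-F]+) enable_bit=([0-9a-fA-F]+)"
)
TX_TIMEOUT = re.compile(r"tg3: (\w+): transmit timed out, resetting")


def summarize(log_text):
    """Count stop-block timeouts per offset and resets per interface."""
    offsets = Counter()
    resets = Counter()
    for line in log_text.splitlines():
        m = STOP_BLOCK.search(line)
        if m:
            offsets[m.group(1).lower()] += 1
            continue
        m = TX_TIMEOUT.search(line)
        if m:
            resets[m.group(1)] += 1
    return offsets, resets


# Hypothetical sample: the first few lines of the console output below.
SAMPLE = """\
tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
tg3: eth1: transmit timed out, resetting
tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
tg3: eth0: transmit timed out, resetting
"""

if __name__ == "__main__":
    offsets, resets = summarize(SAMPLE)
    for ofs, count in sorted(offsets.items()):
        print(f"ofs={ofs}: {count} timeout(s)")
    for dev, count in sorted(resets.items()):
        print(f"{dev}: {count} reset(s)")
```

Because the patterns use search rather than match, lines carrying a "> "
quote prefix or a syslog timestamp are still counted, so the script can be
fed saved console or dmesg output directly.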

Philip

> divert: allocating divert_blk for bond0
> tg3.c:v3.14 (November 15, 2004)
> divert: allocating divert_blk for eth0
> eth0: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] 
> (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:82:1a
> eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
> TSOcap[1]
> divert: allocating divert_blk for eth1
> eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] 
> (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:82:1b
> eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
> TSOcap[1]
> divert: freeing divert_blk for bond0
> divert: freeing divert_blk for eth0
> divert: freeing divert_blk for eth1
> 
> 02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 
> Gigabit Ethernet (rev 03)
>         Subsystem: Broadcom Corporation: Unknown device 1644
>         Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 161
>         Memory at fc8c0000 (64-bit, non-prefetchable) [size=fc8a0000]
>         Memory at fc8b0000 (64-bit, non-prefetchable) [size=64K]
>         Expansion ROM at 00010000 [disabled]
>         Capabilities: [40] PCI-X non-bridge device.
>         Capabilities: [48] Power Management version 2
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] Message Signalled Interrupts: 64bit+ 
> Queue=0/3 Enable-
> 
> 02:09.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 
> Gigabit Ethernet (rev 03)
>         Subsystem: Broadcom Corporation: Unknown device 1644
>         Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 169
>         Memory at fc8f0000 (64-bit, non-prefetchable) [size=fc8d0000]
>         Memory at fc8e0000 (64-bit, non-prefetchable) [size=64K]
>         Expansion ROM at 00010000 [disabled]
>         Capabilities: [40] PCI-X non-bridge device.
>         Capabilities: [48] Power Management version 2
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] Message Signalled Interrupts: 64bit+ 
> Queue=0/3 Enable-
> 
> I run these things with jumbo frames and bonding.  In the case last 
> night, our machine completely locked up because both interfaces stopped 
> working and the channel bond between them went down.  These guys are 
> pushing a little over 1Gb/s total traffic between them (500Mb/s each) 
> and one of them will take in about 300Mb/s.  Outgoing packets average 
> 20kpkts/s and incoming packets on the one interface average about 
> 45kpkts/s (most incoming traffic is not jumbo).
> 
> This was on console:
> 
> tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
> tg3: eth1: transmit timed out, resetting
> tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
> tg3: eth0: transmit timed out, resetting
> tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
> tg3: eth1: transmit timed out, resetting
> tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
> tg3: eth0: transmit timed out, resetting
> tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
> tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
> 
> We tried restarting networking.  We tried unloading all network-related 
> modules and reloading them.  We eventually had to reboot the box to get 
> networking started again.  The kernel is 2.6.10, via FC2 
> (2.6.10-2.3.legacy).  We've also had the problem with the latest FC4 
> kernel.
> 
> Any information would be greatly appreciated.
> 
> Philip



Thread overview: 10+ messages
2006-08-07 22:43 tg3: tg3_stop_block timed out Bernd Schubert
2006-08-07 23:07 ` Michael Chan
2006-08-07 23:24   ` Bernd Schubert
2006-08-07 23:46     ` Michael Chan
2006-08-09 14:44       ` Philip Molter
2006-09-03 22:35         ` Philip Molter [this message]
2006-09-04 18:25           ` Michael Chan
2006-09-04 21:27             ` Philip Molter
2006-09-12 17:22             ` Philip Molter
2006-08-09 15:20       ` Bernd Schubert
