netdev.vger.kernel.org archive mirror
* Re: panic in tg3 driver
       [not found] <4D2334B5.1060408@earthlink.net>
@ 2011-01-09 22:30 ` Stephen Clark
  2011-01-10 19:22   ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-09 22:30 UTC (permalink / raw)
  To: Linux Kernel Network Developers; +Cc: Michael Chan, Matt Carlson

On 01/04/2011 09:54 AM, Stephen Clark wrote:
> Hello,
>
>
> The hardware is an Acrosser AR-M0898B micro box.
>  lspci
> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge
> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA Controller (rev 20)
> 00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 07)
> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge
> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge
> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge
> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
>
> Kernel 2.6.36-2.el5.elrepo on an i686
>
> When I try to ifconfig either of the BCM5906M ports the system panics.
>
> Ideas, fixes ?
>
> [root@Z1010 ~]# modprobe tg3
> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24
> ------------[ cut here ]------------
> kernel BUG at drivers/net/tg3.c:4365!
> invalid opcode: 0000 [#1] PREEMPT SMP
> last sysfs file: /sys/class/net/eth3/address
> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state iptable_mangle af_ke]
>
> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 CN700-8251/
> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0
> EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff
> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30
>  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 task.ti=dee62000)
> Stack:
>  dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000
> <0> df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 c1801f94
> <0> e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 dfab4000
> Call Trace:
>  [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3]
>  [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3]
>  [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44]
>  [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3]
>  [<c0674058>] ? net_rx_action+0x7e/0x11c
>  [<c04409c9>] ? __do_softirq+0x85/0x10c
>  [<c0440944>] ? __do_softirq+0x0/0x10c
> <IRQ>
>  [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87
>  [<c044051b>] ? local_bh_enable_ip+0xd/0xf
>  [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e
>  [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf
>  [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3]
>  [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3]
>  [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3]
>  [<c044ec37>] ? process_one_work+0x10b/0x1bc
>  [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3]
>  [<c044fd41>] ? worker_thread+0x77/0xf9
>  [<c0453048>] ? kthread+0x60/0x65
>  [<c044fcca>] ? worker_thread+0x0/0xf9
>  [<c0452fe8>] ? kthread+0x0/0x65
>  [<c040337e>] ? kernel_thread_helper+0x6/0x10
> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44 00 00 f6 8
> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30
> ---[ end trace 82381e9b93e397ad ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 20303, comm: kworker/0:2 Tainted: G      D 2.6.36-2.el5.elrepo #1
> Call Trace:
>  [<c043b3cd>] panic+0x62/0x15d
>  [<c06fb7d1>] oops_end+0x99/0xa8
>  [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3]
>  [<c0405a62>] die+0x58/0x5e
>
> Thanks,
> Steve
>
Additional info: I compiled the latest kernel, 2.6.37-rc8+, and still have the
problem. Also, when I boot with noapic I see this in the dmesg log, and the
interrupt count keeps increasing like crazy:
tg3.c:v3.115 (October 14, 2010)
tg3 0000:81:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
tg3 0000:81:00.0: setting latency timer to 64
tg3 0000:81:00.0: PCI: Disallowing DAC for device
tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
tg3 0000:82:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
tg3 0000:82:00.0: setting latency timer to 64
tg3 0000:82:00.0: PCI: Disallowing DAC for device
tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]
tg3 0000:81:00.0: irq 40 for MSI/MSI-X
tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
ADDRCONF(NETDEV_UP): eth2: link is not ready
[root@Z1010 ~]# cat /proc/interrupts
            CPU0
   0:        162    XT-PIC-XT-PIC    timer
   1:          2    XT-PIC-XT-PIC    i8042
   2:          0    XT-PIC-XT-PIC    cascade
   3:          1    XT-PIC-XT-PIC
   4:       4863    XT-PIC-XT-PIC    serial
   6:          2    XT-PIC-XT-PIC    floppy
   7:          5    XT-PIC-XT-PIC    ehci_hcd:usb1, uhci_hcd:usb3
   8:          0    XT-PIC-XT-PIC    rtc0
   9:          0    XT-PIC-XT-PIC    acpi
  10:    2334234    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2

[root@Z1010 ~]# cat /proc/interrupts |grep eth2
  10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
[root@Z1010 ~]# cat /proc/interrupts |grep eth2
  10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2

-- 

"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety."  (Ben Franklin)

"The course of history shows that as a government grows, liberty
decreases."  (Thomas Jefferson)





* Re: panic in tg3 driver
  2011-01-09 22:30 ` panic in tg3 driver Stephen Clark
@ 2011-01-10 19:22   ` Matt Carlson
  2011-01-10 20:04     ` Stephen Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-10 19:22 UTC (permalink / raw)
  To: Stephen Clark
  Cc: Linux Kernel Network Developers, Michael Chan, Matthew Carlson

I think drivers/net/tg3.c:4365 is at the line that reads
"spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?



* Re: panic in tg3 driver
  2011-01-10 19:22   ` Matt Carlson
@ 2011-01-10 20:04     ` Stephen Clark
  2011-01-11  2:00       ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-10 20:04 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/10/2011 02:22 PM, Matt Carlson wrote:
> I think drivers/net/tg3.c:4365 is at the line that reads
> "spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?
>
>    


         tg3_readphy(tp, MII_TG3_DSP_RW_PORT, &phy2);

in static void tg3_serdes_parallel_detect(struct tg3 *tp)

The driver version is:
#define DRV_MODULE_NAME        "tg3"
#define TG3_MAJ_NUM            3
#define TG3_MIN_NUM            115






* Re: panic in tg3 driver
  2011-01-10 20:04     ` Stephen Clark
@ 2011-01-11  2:00       ` Matt Carlson
  2011-01-11 14:10         ` Stephen Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-11  2:00 UTC (permalink / raw)
  To: Stephen Clark
  Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan

On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote:
> > I think drivers/net/tg3.c:4365 is at the line that reads
> > "spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?
> >
> >    
> 
> 
>          tg3_readphy(tp, MII_TG3_DSP_RW_PORT, &phy2);
> 
> in static void tg3_serdes_parallel_detect(struct tg3 *tp)
> 
> The driver version is:
> #define DRV_MODULE_NAME        "tg3"
> #define TG3_MAJ_NUM            3
> #define TG3_MIN_NUM            115


That doesn't look right.  The line number I quoted came from the kernel
panic output from 2.6.36-2.el5.elrepo.  I'm guessing you quoted me the
sources from the tg3.c file in 2.6.37-rc8+.  If you don't have the
2.6.36-2.el5.elrepo sources readily available, can you give me the line
the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+
sources?

It looks like there are a lot of devices on IRQ 10.  Does the interrupt
count drop if you bring down eth0 (which I'm guessing is the b44 device)?
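
One way to answer this is to sample /proc/interrupts twice and compare the
count on the shared line. The following is only a minimal sketch in Python,
not part of any kernel tooling; the helper name irq_count is hypothetical,
and the two sample lines are the IRQ 10 readings quoted earlier in this
thread (noapic boot).

```python
def irq_count(proc_interrupts_text, irq):
    """Return the total count (summed over CPU columns) for one IRQ label.

    Returns None if the label does not appear in the text.
    """
    for line in proc_interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        total = 0
        for token in rest.split():
            # Per-CPU counters come first; stop at the controller name.
            if not token.isdigit():
                break
            total += int(token)
        return total
    return None


# The two IRQ 10 samples quoted earlier in the thread.
first = "   10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2"
second = "   10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2"

delta = irq_count(second, 10) - irq_count(first, 10)
print(f"IRQ 10 fired {delta} times between the two samples")
```

Repeating the same comparison after 'ifconfig eth0 down' would show whether
the count keeps climbing at a similar pace. The time between the two quoted
samples is not recorded, so only the difference can be computed, not a rate.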

Can you tell me if you saw the following message in the syslogs?

"The system may be re-ordering memory-mapped I/O cycles to the network
 device, attempting to recover.  Please report the problem to the driver
 maintainer and include system chipset information."


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: panic in tg3 driver
  2011-01-11  2:00       ` Matt Carlson
@ 2011-01-11 14:10         ` Stephen Clark
  2011-01-12  3:06           ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-11 14:10 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/10/2011 09:00 PM, Matt Carlson wrote:
> On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote:
>    
>> On 01/10/2011 02:22 PM, Matt Carlson wrote:
>>      
>>> On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote:
>>>
>>>        
>>>> On 01/04/2011 09:54 AM, Stephen Clark wrote:
>>>>
>>>>          
>>>>> Hello,
>>>>>
>>>>>
>>>>> The hardware is an Acrosser AR-M0898B micro box.
>>>>>    lspci
>>>>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
>>>>> Host Bridge
>>>>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
>>>>> Host Bridge
>>>>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
>>>>> Host Bridge
>>>>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge
>>>>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
>>>>> Host Bridge
>>>>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
>>>>> Host Bridge
>>>>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
>>>>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA
>>>>> Controller (rev
>>>>> 20)
>>>>> 00:0f.1 IDE interface: VIA Technologies, Inc.
>>>>> VT82C586A/B/VT82C686/A/B/VT823x/A/
>>>>> C PIPC Bus Master IDE (rev 07)
>>>>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
>>>>> Controller
>>>>>    (rev 91)
>>>>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
>>>>> Controller
>>>>>    (rev 91)
>>>>> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
>>>>> Controller
>>>>>    (rev 91)
>>>>> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
>>>>> Controller
>>>>>    (rev 91)
>>>>> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
>>>>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge
>>>>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
>>>>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge
>>>>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge
>>>>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T
>>>>> (rev 02)
>>>>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T
>>>>> (rev 02)
>>>>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
>>>>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
>>>>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M
>>>>> Fast Ethernet
>>>>>    PCI Express (rev 02)
>>>>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M
>>>>> Fast Ethernet
>>>>>    PCI Express (rev 02)
>>>>>
>>>>> Kernel 2.6.36-2.el5.elrepo on an i686
>>>>>
>>>>> When I try to ifconfig either of the BCM5906M ports the system panics.
>>>>>
>>>>> Ideas, fixes ?
>>>>>
>>>>> [root@Z1010 ~]# modprobe tg3
>>>>> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24
>>>>> ------------[ cut here ]------------
>>>>> kernel BUG at drivers/net/tg3.c:4365!
>>>>> invalid opcode: 0000 [#1] PREEMPT SMP
>>>>> last sysfs file: /sys/class/net/eth3/address
>>>>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state
>>>>> iptable_mangle af_ke]
>>>>>
>>>>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1
>>>>> CN700-8251/
>>>>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0
>>>>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
>>>>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff
>>>>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30
>>>>>    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>>>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0
>>>>> task.ti=dee62000)
>>>>> Stack:
>>>>>    dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000
>>>>> <0>   df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040
>>>>> c1801f94
>>>>> <0>   e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500
>>>>> dfab4000
>>>>> Call Trace:
>>>>>    [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3]
>>>>>    [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3]
>>>>>    [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44]
>>>>>    [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3]
>>>>>    [<c0674058>] ? net_rx_action+0x7e/0x11c
>>>>>    [<c04409c9>] ? __do_softirq+0x85/0x10c
>>>>>    [<c0440944>] ? __do_softirq+0x0/0x10c
>>>>> <IRQ>
>>>>>    [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87
>>>>>    [<c044051b>] ? local_bh_enable_ip+0xd/0xf
>>>>>    [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e
>>>>>    [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf
>>>>>    [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3]
>>>>>    [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3]
>>>>>    [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3]
>>>>>    [<c044ec37>] ? process_one_work+0x10b/0x1bc
>>>>>    [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3]
>>>>>    [<c044fd41>] ? worker_thread+0x77/0xf9
>>>>>    [<c0453048>] ? kthread+0x60/0x65
>>>>>    [<c044fcca>] ? worker_thread+0x0/0xf9
>>>>>    [<c0452fe8>] ? kthread+0x0/0x65
>>>>>    [<c040337e>] ? kernel_thread_helper+0x6/0x10
>>>>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44
>>>>> 00 00 f6 8
>>>>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30
>>>>> ---[ end trace 82381e9b93e397ad ]---
>>>>> Kernel panic - not syncing: Fatal exception in interrupt
>>>>> Pid: 20303, comm: kworker/0:2 Tainted: G      D
>>>>> 2.6.36-2.el5.elrepo #1
>>>>> Call Trace:
>>>>>    [<c043b3cd>] panic+0x62/0x15d
>>>>>    [<c06fb7d1>] oops_end+0x99/0xa8
>>>>>    [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3]
>>>>>    [<c0405a62>] die+0x58/0x5e
>>>>>
>>>>> Thanks,
>>>>> Steve
>>>>>
>>>>>
>>>>>            
>>>> Additional info: I compiled the latest kernel, 2.6.37-rc8+, and still have the problem.
>>>> Also, when I boot with noapic I see this in the dmesg log, and interrupts are increasing
>>>> like crazy:
>>>> tg3.c:v3.115 (October 14, 2010)
>>>> tg3 0000:81:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
>>>> tg3 0000:81:00.0: setting latency timer to 64
>>>> tg3 0000:81:00.0: PCI: Disallowing DAC for device
>>>> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
>>>> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
>>>> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
>>>> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
>>>> tg3 0000:82:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
>>>> tg3 0000:82:00.0: setting latency timer to 64
>>>> tg3 0000:82:00.0: PCI: Disallowing DAC for device
>>>> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
>>>> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
>>>> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
>>>> tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]
>>>> tg3 0000:81:00.0: irq 40 for MSI/MSI-X
>>>> tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode.
>>>> Please report this failure to the PCI maintainer and include system chipset information
>>>> ADDRCONF(NETDEV_UP): eth2: link is not ready
>>>> [root@Z1010 ~]# cat /proc/interrupts
>>>>               CPU0
>>>>      0:        162    XT-PIC-XT-PIC    timer
>>>>      1:          2    XT-PIC-XT-PIC    i8042
>>>>      2:          0    XT-PIC-XT-PIC    cascade
>>>>      3:          1    XT-PIC-XT-PIC
>>>>      4:       4863    XT-PIC-XT-PIC    serial
>>>>      6:          2    XT-PIC-XT-PIC    floppy
>>>>      7:          5    XT-PIC-XT-PIC    ehci_hcd:usb1, uhci_hcd:usb3
>>>>      8:          0    XT-PIC-XT-PIC    rtc0
>>>>      9:          0    XT-PIC-XT-PIC    acpi
>>>>     10:    2334234    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>>
>>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2
>>>>     10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2
>>>>     10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>>
>>>> -- 
>>>>
>>>> "They that give up essential liberty to obtain temporary safety,
>>>> deserve neither liberty nor safety."  (Ben Franklin)
>>>>
>>>> "The course of history shows that as a government grows, liberty
>>>> decreases."  (Thomas Jefferson)
>>>>
>>>>          
>>> I think drivers/net/tg3.c:4365 is at the line that reads
>>> "spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?
>>>
>>>
>>>        
>>
>>           tg3_readphy(tp, MII_TG3_DSP_RW_PORT,&phy2);
>>
>> in static void tg3_serdes_parallel_detect(struct tg3 *tp)
>>
>> The driver version is:
>> #define DRV_MODULE_NAME        "tg3"
>> #define TG3_MAJ_NUM            3
>> #define TG3_MIN_NUM            115
>>      
>
> That doesn't look right.  The line number I quoted came from the kernel
> panic output from 2.6.36-2.el5.elrepo.  I'm guessing you quoted me the
> sources from the tg3.c file in 2.6.37-rc8+.  If you don't have the
> 2.6.36-2.el5.elrepo sources readily available, can you give me the line
> the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+
> sources?
>
>    
Oops, you are correct. The problem is that most of the time I don't get a
panic on the screen; the box simply reboots.

I'll see if I can get the 2.6.36-2 sources, though they are supposed to be the
unmodified kernel.org sources simply recompiled for CentOS.

static void tg3_tx_recover(struct tg3 *tp)
{
     BUG_ON((tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
4365:           tp->write32_tx_mbox == tg3_write_indirect_mbox);


> It looks like there are a lot of devices on IRQ 10.  Does the interrupt
> count drop if you bring down eth0 (which I'm guessing is the b44 device)?
>    
This happens when I boot with noapic, which I only did as a test. With the
noapic option the system doesn't panic, but it gets all these extra
interrupts as soon as I ifconfig one of the 5906 ports.


> Can you tell me if you saw the following message in the syslogs?
>
> "The system may be re-ordering memory-mapped I/O cycles to the network
>   device, attempting to recover.  Please report the problem to the driver
>   maintainer and include system chipset information."
>
>    
Couldn't find this in the messages file.




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: panic in tg3 driver
  2011-01-11 14:10         ` Stephen Clark
@ 2011-01-12  3:06           ` Matt Carlson
  2011-01-12 13:53             ` Stephen Clark
  2011-01-13 13:12             ` Stephen Clark
  0 siblings, 2 replies; 12+ messages in thread
From: Matt Carlson @ 2011-01-12  3:06 UTC (permalink / raw)
  To: Stephen Clark
  Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan

On Tue, Jan 11, 2011 at 06:10:55AM -0800, Stephen Clark wrote:
> On 01/10/2011 09:00 PM, Matt Carlson wrote:
> > On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote:
> >    
> >> On 01/10/2011 02:22 PM, Matt Carlson wrote:
> >>      
> >>> On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote:
> >>>
> >>>        
> >>>> On 01/04/2011 09:54 AM, Stephen Clark wrote:
> >>>>
> >>>>          
> >>>>> Hello,
> >>>>>
> >>>>>
> >>>>> The hardware is an Acrosser AR-M0898B micro box.
> >>>>>    lspci
> >>>>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
> >>>>> Host Bridge
> >>>>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
> >>>>> Host Bridge
> >>>>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
> >>>>> Host Bridge
> >>>>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge
> >>>>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
> >>>>> Host Bridge
> >>>>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro
> >>>>> Host Bridge
> >>>>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
> >>>>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA
> >>>>> Controller (rev
> >>>>> 20)
> >>>>> 00:0f.1 IDE interface: VIA Technologies, Inc.
> >>>>> VT82C586A/B/VT82C686/A/B/VT823x/A/
> >>>>> C PIPC Bus Master IDE (rev 07)
> >>>>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> >>>>> Controller
> >>>>>    (rev 91)
> >>>>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> >>>>> Controller
> >>>>>    (rev 91)
> >>>>> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> >>>>> Controller
> >>>>>    (rev 91)
> >>>>> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> >>>>> Controller
> >>>>>    (rev 91)
> >>>>> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
> >>>>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge
> >>>>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
> >>>>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge
> >>>>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge
> >>>>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T
> >>>>> (rev 02)
> >>>>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T
> >>>>> (rev 02)
> >>>>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> >>>>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> >>>>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M
> >>>>> Fast Ethernet
> >>>>>    PCI Express (rev 02)
> >>>>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M
> >>>>> Fast Ethernet
> >>>>>    PCI Express (rev 02)
> >>>>>
> >>>>> Kernel 2.6.36-2.el5.elrepo on an i686
> >>>>>
> >>>>> When I try to ifconfig either of the BCM5906M ports the system panics.
> >>>>>
> >>>>> Ideas, fixes ?
> >>>>>
> >>>>> [root@Z1010 ~]# modprobe tg3
> >>>>> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24
> >>>>> ------------[ cut here ]------------
> >>>>> kernel BUG at drivers/net/tg3.c:4365!
> >>>>> invalid opcode: 0000 [#1] PREEMPT SMP
> >>>>> last sysfs file: /sys/class/net/eth3/address
> >>>>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state
> >>>>> iptable_mangle af_ke]
> >>>>>
> >>>>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1
> >>>>> CN700-8251/
> >>>>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0
> >>>>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
> >>>>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff
> >>>>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30
> >>>>>    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> >>>>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0
> >>>>> task.ti=dee62000)
> >>>>> Stack:
> >>>>>    dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000
> >>>>> <0>   df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040
> >>>>> c1801f94
> >>>>> <0>   e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500
> >>>>> dfab4000
> >>>>> Call Trace:
> >>>>>    [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3]
> >>>>>    [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3]
> >>>>>    [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44]
> >>>>>    [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3]
> >>>>>    [<c0674058>] ? net_rx_action+0x7e/0x11c
> >>>>>    [<c04409c9>] ? __do_softirq+0x85/0x10c
> >>>>>    [<c0440944>] ? __do_softirq+0x0/0x10c
> >>>>> <IRQ>
> >>>>>    [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87
> >>>>>    [<c044051b>] ? local_bh_enable_ip+0xd/0xf
> >>>>>    [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e
> >>>>>    [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf
> >>>>>    [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3]
> >>>>>    [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3]
> >>>>>    [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3]
> >>>>>    [<c044ec37>] ? process_one_work+0x10b/0x1bc
> >>>>>    [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3]
> >>>>>    [<c044fd41>] ? worker_thread+0x77/0xf9
> >>>>>    [<c0453048>] ? kthread+0x60/0x65
> >>>>>    [<c044fcca>] ? worker_thread+0x0/0xf9
> >>>>>    [<c0452fe8>] ? kthread+0x0/0x65
> >>>>>    [<c040337e>] ? kernel_thread_helper+0x6/0x10
> >>>>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44
> >>>>> 00 00 f6 8
> >>>>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30
> >>>>> ---[ end trace 82381e9b93e397ad ]---
> >>>>> Kernel panic - not syncing: Fatal exception in interrupt
> >>>>> Pid: 20303, comm: kworker/0:2 Tainted: G      D
> >>>>> 2.6.36-2.el5.elrepo #1
> >>>>> Call Trace:
> >>>>>    [<c043b3cd>] panic+0x62/0x15d
> >>>>>    [<c06fb7d1>] oops_end+0x99/0xa8
> >>>>>    [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3]
> >>>>>    [<c0405a62>] die+0x58/0x5e
> >>>>>
> >>>>> Thanks,
> >>>>> Steve
> >>>>>
> >>>>>
> >>>>>            
> >>>> Additional info: I compiled the latest kernel, 2.6.37-rc8+, and still have the problem.
> >>>> Also, when I boot with noapic I see this in the dmesg log, and interrupts are increasing
> >>>> like crazy:
> >>>> tg3.c:v3.115 (October 14, 2010)
> >>>> tg3 0000:81:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
> >>>> tg3 0000:81:00.0: setting latency timer to 64
> >>>> tg3 0000:81:00.0: PCI: Disallowing DAC for device
> >>>> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
> >>>> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
> >>>> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> >>>> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
> >>>> tg3 0000:82:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
> >>>> tg3 0000:82:00.0: setting latency timer to 64
> >>>> tg3 0000:82:00.0: PCI: Disallowing DAC for device
> >>>> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
> >>>> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
> >>>> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> >>>> tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]
> >>>> tg3 0000:81:00.0: irq 40 for MSI/MSI-X
> >>>> tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode.
> >>>> Please report this failure to the PCI maintainer and include system chipset information
> >>>> ADDRCONF(NETDEV_UP): eth2: link is not ready
> >>>> [root@Z1010 ~]# cat /proc/interrupts
> >>>>               CPU0
> >>>>      0:        162    XT-PIC-XT-PIC    timer
> >>>>      1:          2    XT-PIC-XT-PIC    i8042
> >>>>      2:          0    XT-PIC-XT-PIC    cascade
> >>>>      3:          1    XT-PIC-XT-PIC
> >>>>      4:       4863    XT-PIC-XT-PIC    serial
> >>>>      6:          2    XT-PIC-XT-PIC    floppy
> >>>>      7:          5    XT-PIC-XT-PIC    ehci_hcd:usb1, uhci_hcd:usb3
> >>>>      8:          0    XT-PIC-XT-PIC    rtc0
> >>>>      9:          0    XT-PIC-XT-PIC    acpi
> >>>>     10:    2334234    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
> >>>>
> >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2
> >>>>     10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
> >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2
> >>>>     10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
> >>>>
> >>>> -- 
> >>>>
> >>>> "They that give up essential liberty to obtain temporary safety,
> >>>> deserve neither liberty nor safety."  (Ben Franklin)
> >>>>
> >>>> "The course of history shows that as a government grows, liberty
> >>>> decreases."  (Thomas Jefferson)
> >>>>
> >>>>          
> >>> I think drivers/net/tg3.c:4365 is at the line that reads
> >>> "spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?
> >>>
> >>>
> >>>        
> >>
> >>           tg3_readphy(tp, MII_TG3_DSP_RW_PORT,&phy2);
> >>
> >> in static void tg3_serdes_parallel_detect(struct tg3 *tp)
> >>
> >> The driver version is:
> >> #define DRV_MODULE_NAME        "tg3"
> >> #define TG3_MAJ_NUM            3
> >> #define TG3_MIN_NUM            115
> >>      
> >
> > That doesn't look right.  The line number I quoted came from the kernel
> > panic output from 2.6.36-2.el5.elrepo.  I'm guessing you quoted me the
> > sources from the tg3.c file in 2.6.37-rc8+.  If you don't have the
> > 2.6.36-2.el5.elrepo sources readily available, can you give me the line
> > the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+
> > sources?
> >
> >    
> Oops, you are correct. The problem is that most of the time I don't get a
> panic on the screen; the box simply reboots.
> 
> I'll see if I can get the 2.6.36-2 sources, though they are supposed to be the
> unmodified kernel.org sources simply recompiled for CentOS.
> 
> static void tg3_tx_recover(struct tg3 *tp)
> {
>      BUG_ON((tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
> 4365:           tp->write32_tx_mbox == tg3_write_indirect_mbox);
> 
> 
> > It looks like there are a lot of devices on IRQ 10.  Does the interrupt
> > count drop if you bring down eth0 (which I'm guessing is the b44 device)?
> >    
> This happens when I boot with noapic, which I only did as a test. With the
> noapic option the system doesn't panic, but it gets all these extra
> interrupts as soon as I ifconfig one of the 5906 ports.

I was wondering if the b44 device is having a problem with shared
interrupts.

> > Can you tell me if you saw the following message in the syslogs?
> >
> > "The system may be re-ordering memory-mapped I/O cycles to the network
> >   device, attempting to recover.  Please report the problem to the driver
> >   maintainer and include system chipset information."
> >
> >    
> Couldn't find this in the messages file.

Can you give me the output of 'lspci -vvv -xxx -s 81:00.0' and
'ethtool -i eth2'?

I'm wondering if this BUG_ON is a symptom of a different problem
masquerading as a write-reordering bug.  Do you have IPv6 configured?
If not, what happens if you just run 'ifconfig eth2 up', without
assigning an IP address?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: panic in tg3 driver
  2011-01-12  3:06           ` Matt Carlson
@ 2011-01-12 13:53             ` Stephen Clark
  2011-01-13 13:12             ` Stephen Clark
  1 sibling, 0 replies; 12+ messages in thread
From: Stephen Clark @ 2011-01-12 13:53 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/11/2011 10:06 PM, Matt Carlson wrote:
> lspci -vvv -xxx -s 81:00.0

Linux Z1010.netwolves.com 2.6.37 #9 SMP PREEMPT Wed Jan 5 11:14:46 EST 2011 i686 i686 i386 GNU/Linux

[root@Z1010 ~]# lspci -vvv -xxx -s 81:00.0
81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
         Subsystem: Broadcom Corporation Unknown device 9713
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 10
         Region 0: Memory at cfbf0000 (64-bit, non-prefetchable) [size=64K]
         Capabilities: [48] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] Vital Product Data
         Capabilities: [58] Vendor Specific Information
         Capabilities: [e8] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                 Address: 00000000fee0100c  Data: 4169
         Capabilities: [d0] Express Endpoint IRQ 0
                 Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag+
                 Device: Latency L0s <4us, L1 unlimited
                 Device: AtnBtn- AtnInd- PwrInd-
                 Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                 Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                 Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                 Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0
                 Link: Latency L0s <4us, L1 <64us
                 Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                 Link: Speed 2.5Gb/s, Width x1
         Capabilities: [100] Advanced Error Reporting
         Capabilities: [13c] Virtual Channel
         Capabilities: [160] Device Serial Number 39-d1-36-fe-ff-b6-02-00
00: e4 14 13 17 06 00 10 00 02 00 00 02 10 00 00 00
10: 04 00 bf cf 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 e4 14 13 97
30: 00 00 00 00 48 00 00 00 00 00 00 00 0a 01 00 00
40: 00 00 00 00 00 00 00 00 01 50 03 c0 08 00 00 00
50: 03 58 fc 00 78 00 00 00 09 e8 78 00 64 2f 72 64
60: 00 00 00 00 00 00 00 00 00 00 02 c0 00 00 00 00
70: 12 12 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 fe 50 08 24
90: 01 92 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 10 00 01 00 a0 8f 64 00 00 20 10 00 11 6c 03 00
e0: 00 00 11 10 00 00 00 00 05 d0 80 00 0c 10 e0 fe
f0: 00 00 00 00 69 41 00 00 00 00 00 00 00 00 00 00
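
The raw bytes above can be cross-checked against the decoded lspci text.
The following is a minimal sketch in Python, assuming the standard PCI
configuration-space layout (IDs at offset 0x00, capability pointer at 0x34,
each capability holding an ID byte followed by a next-pointer byte). The
names parse_dump, capabilities and CAP_NAMES are hypothetical helpers, not
part of lspci or the kernel.

```python
DUMP = """\
00: e4 14 13 17 06 00 10 00 02 00 00 02 10 00 00 00
10: 04 00 bf cf 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 e4 14 13 97
30: 00 00 00 00 48 00 00 00 00 00 00 00 0a 01 00 00
40: 00 00 00 00 00 00 00 00 01 50 03 c0 08 00 00 00
50: 03 58 fc 00 78 00 00 00 09 e8 78 00 64 2f 72 64
60: 00 00 00 00 00 00 00 00 00 00 02 c0 00 00 00 00
70: 12 12 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 fe 50 08 24
90: 01 92 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 10 00 01 00 a0 8f 64 00 00 20 10 00 11 6c 03 00
e0: 00 00 11 10 00 00 00 00 05 d0 80 00 0c 10 e0 fe
f0: 00 00 00 00 69 41 00 00 00 00 00 00 00 00 00 00
"""

# Standard PCI capability IDs that appear in this dump.
CAP_NAMES = {
    0x01: "Power Management",
    0x03: "Vital Product Data",
    0x05: "MSI",
    0x09: "Vendor Specific",
    0x10: "PCI Express",
}


def parse_dump(text):
    """Turn 'offset: b0 b1 ...' lines into a 256-byte config-space image."""
    cfg = bytearray(256)
    for line in text.splitlines():
        offset, _, data = line.partition(":")
        if not data.strip():
            continue
        base = int(offset, 16)
        for i, byte in enumerate(data.split()):
            cfg[base + i] = int(byte, 16)
    return cfg


def capabilities(cfg):
    """Walk the capability list and return (offset, capability ID) pairs."""
    caps, seen = [], set()
    ptr = cfg[0x34] & 0xFC          # low two pointer bits are reserved
    while ptr and ptr not in seen:  # 'seen' guards against a looping list
        seen.add(ptr)
        caps.append((ptr, cfg[ptr]))
        ptr = cfg[ptr + 1] & 0xFC
    return caps


cfg = parse_dump(DUMP)
vendor = cfg[0x00] | (cfg[0x01] << 8)   # little-endian 16-bit fields
device = cfg[0x02] | (cfg[0x03] << 8)
caps = capabilities(cfg)
msi_offset = next(off for off, cap_id in caps if cap_id == 0x05)
msi_enabled = bool(cfg[msi_offset + 2] & 0x01)  # Message Control bit 0

print(f"vendor {vendor:04x} device {device:04x}")
for off, cap_id in caps:
    print(f"[{off:02x}] {CAP_NAMES.get(cap_id, 'unknown')}")
print("MSI enabled:", msi_enabled)
```

The walk visits the capabilities in the same order lspci lists them
(48, 50, 58, e8, d0), and the MSI enable bit reads as clear, which agrees
with the "Enable-" shown in the decoded output above.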

modprobe tg3

dmesg output:
tg3.c:v3.115 (October 14, 2010)
tg3 0000:81:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
tg3 0000:81:00.0: setting latency timer to 64
tg3 0000:81:00.0: PCI: Disallowing DAC for device
tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
tg3 0000:82:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
tg3 0000:82:00.0: setting latency timer to 64
tg3 0000:82:00.0: PCI: Disallowing DAC for device
tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]

[root@Z1010 ~]# ethtool -i eth2
driver: tg3
version: 3.115
firmware-version: sb v3.03
bus-info: 0000:81:00.0

[root@Z1010 ~]# cat /proc/interrupts
            CPU0
   0:        173   IO-APIC-edge      timer
   1:          2   IO-APIC-edge      i8042
   4:       2864   IO-APIC-edge      serial
   6:          2   IO-APIC-edge      floppy
   8:          0   IO-APIC-edge      rtc0
   9:          0   IO-APIC-fasteoi   acpi
  14:          0   IO-APIC-edge      pata_via
  15:       8100   IO-APIC-edge      pata_via
  16:        984   IO-APIC-fasteoi   eth0
  17:        104   IO-APIC-fasteoi   eth1
  20:          0   IO-APIC-fasteoi   uhci_hcd:usb2
  21:          0   IO-APIC-fasteoi   uhci_hcd:usb4, sata_via
  22:          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3
  23:          0   IO-APIC-fasteoi   uhci_hcd:usb5
NMI:          0   Non-maskable interrupts
LOC:     101963   Local timer interrupts
SPU:          0   Spurious interrupts
PMI:          0   Performance monitoring interrupts
IWI:          0   IRQ work interrupts
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
MCE:          0   Machine check exceptions
MCP:          0   Machine check polls
ERR:          0
MIS:          0

The b44 interfaces are working great.

[root@Z1010 ~]# ifconfig eth2 up
do_IRQ: 0.64 No irq handler for vector (irq -1)

The system becomes unresponsive, then usually reboots.

It didn't reboot this last time, though; it has just become really sluggish
in responding.
[root@Z1010 ~]#
[root@Z1010 ~]#
[root@Z1010 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:02:B6:36:D1:37
           inet addr:10.0.129.4  Bcast:10.0.255.255  Mask:255.255.128.0
           inet6 addr: fe80::202:b6ff:fe36:d137/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:1025 errors:0 dropped:12 overruns:0 frame:0
           TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:185675 (181.3 KiB)  TX bytes:492 (492.0 b)
           Interrupt:16

eth1      Link encap:Ethernet  HWaddr 00:02:B6:36:D1:38
           inet6 addr: fe80::202:b6ff:fe36:d138/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:35 errors:0 dropped:0 overruns:0 frame:0
           TX packets:41 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:2612 (2.5 KiB)  TX bytes:4014 (3.9 KiB)
           Interrupt:17

eth2      Link encap:Ethernet  HWaddr 00:02:B6:36:D1:39
           UP BROADCAST MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
           Interrupt:16

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:5298 errors:0 dropped:0 overruns:0 frame:0
           TX packets:5298 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:475525 (464.3 KiB)  TX bytes:475525 (464.3 KiB)


Message from syslogd@ at Wed Jan 12 08:44:17 2011 ...
localhost kernel: do_IRQ: 0.192 No irq handler for vector (irq -1)

Message from syslogd@ at Wed Jan 12 08:44:17 2011 ...
localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
[root@Z1010 ~]# cat /proc/interrupts
            CPU0
   0:        173   IO-APIC-edge      timer
   1:          2   IO-APIC-edge      i8042
   4:        821   IO-APIC-edge      serial
   6:          2   IO-APIC-edge      floppy
   8:          0   IO-APIC-edge      rtc0
   9:          2   IO-APIC-fasteoi   acpi
  14:          0   IO-APIC-edge      pata_via
  15:      19522   IO-APIC-edge      pata_via
  16:        256   IO-APIC-fasteoi   eth0, eth2
  17:         54   IO-APIC-fasteoi   eth1
  20:          0   IO-APIC-fasteoi   uhci_hcd:usb2
  21:          0   IO-APIC-fasteoi   uhci_hcd:usb4, sata_via
  22:          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3
  23:          0   IO-APIC-fasteoi   uhci_hcd:usb5
NMI:          0   Non-maskable interrupts
LOC:     116090   Local timer interrupts
SPU:          0   Spurious interrupts
PMI:          0   Performance monitoring interrupts
IWI:          0   IRQ work interrupts
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
MCE:          0   Machine check exceptions
MCP:          0   Machine check polls
ERR:         38
MIS:          2
[root@Z1010 ~]# arp -an

The system has now lost Ethernet connectivity via the b44 ports.


This is a test system and I can recompile the kernel if there are
any patches you would like me to try out.
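
For anyone comparing the two /proc/interrupts snapshots above by hand, here is a minimal sketch of how the comparison could be scripted. It assumes the layout shown in this thread (a label, a colon, one count per CPU, then free text). The function names parse_interrupts and counter_growth are made up for illustration and are not part of any kernel tool.

```python
def parse_interrupts(text):
    """Map each row label ("16", "ERR", ...) to the sum of its per-CPU counts."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0" has no colon
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # first non-numeric field starts the description
            total += int(field)
        counts[label.strip()] = total
    return counts


def counter_growth(before, after, labels=("ERR", "MIS")):
    """Return how much each selected counter grew between two snapshots."""
    return {label: after.get(label, 0) - before.get(label, 0) for label in labels}
```

Fed the snapshot taken before `ifconfig eth2 up` and the one taken after, this reports the ERR and MIS rows moving from 0 to 38 and from 0 to 2.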









* Re: panic in tg3 driver
  2011-01-12  3:06           ` Matt Carlson
  2011-01-12 13:53             ` Stephen Clark
@ 2011-01-13 13:12             ` Stephen Clark
  2011-01-16 18:11               ` Stephen Clark
  1 sibling, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-13 13:12 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/11/2011 10:06 PM, Matt Carlson wrote:
> lspci -vvv -xxx -s 81:00.0



Further information: I found these messages in /var/log/messages. It looks
like interrupts for other devices were hosed after tg3 switched to INTx mode.

Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50)
Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA
Jan 12 08:38:50 localhost kernel: ata2.01: cmd ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out
Jan 12 08:38:50 localhost kernel:          res 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout)
Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY }
Jan 12 08:38:50 localhost kernel: ata2: soft resetting link
Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33
Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests
Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something wicked happened on session 3363
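
The "do_IRQ: 0.64 No irq handler for vector" lines can be tallied from a log with a few lines of Python. This is only a sketch: it assumes the two numbers are the CPU and the vector, in that order, and that each message sits on one unwrapped line. The name unhandled_vectors is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "do_IRQ: 0.64 No irq handler for vector (irq -1)".
# Assumption: the two numbers are <cpu>.<vector>.
PATTERN = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def unhandled_vectors(log_text):
    """Count how often each (cpu, vector) pair appears in the log text."""
    return Counter((int(cpu), int(vec)) for cpu, vec in PATTERN.findall(log_text))
```

Run over the messages quoted in this thread, it would show vector 64 recurring on CPU 0, plus a single hit on vector 192.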





* Re: panic in tg3 driver
  2011-01-13 13:12             ` Stephen Clark
@ 2011-01-16 18:11               ` Stephen Clark
  2011-01-25  0:59                 ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-16 18:11 UTC (permalink / raw)
  To: sclark46; +Cc: Matt Carlson, Linux Kernel Network Developers, Michael Chan

Just checking to make sure you have everything you need?

Thanks,
Steve


* Re: panic in tg3 driver
  2011-01-16 18:11               ` Stephen Clark
@ 2011-01-25  0:59                 ` Matt Carlson
  2011-01-25  2:25                   ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-25  0:59 UTC (permalink / raw)
  To: Stephen Clark
  Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan

On Sun, Jan 16, 2011 at 10:11:50AM -0800, Stephen Clark wrote:
> Just checking to make sure you have everything you need?

Sorry for the delay, Stephen.

It looks to me like interrupts aren't being set up correctly on this
system.  I tested MSI and INTx interrupt modes locally and they both
work.  I'm guessing one of two things could be happening:

1) The 2nd parameter of the low-level ISR (tg3_interrupt_tagged()) is
   not correct.  The ISR tries to tell the hardware the interrupt is
   acknowledged, but the message goes unheard.  (This might also explain
   why other devices are also afflicted.)

2) Something is blocking the delivery of the interrupt to the tg3 driver
   altogether.

In both cases, the hardware persistently nags the host to ack the
interrupt, hence the interrupt storm.
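
The storm described above can be illustrated with a toy model. This is not tg3 or kernel code. It is a minimal sketch that assumes a level-triggered line which stays asserted until the device sees an acknowledgement, and every name in it (Device, deliveries, good_handler, lost_ack_handler) is made up for illustration.

```python
class Device:
    """Toy device whose interrupt line stays asserted until it is acked."""

    def __init__(self):
        self.asserted = True

    def ack(self):
        self.asserted = False


def deliveries(device, handler, limit=1000):
    """Re-deliver the interrupt while the line is asserted (capped at limit)."""
    count = 0
    while device.asserted and count < limit:
        handler(device)
        count += 1
    return count


def good_handler(device):
    device.ack()  # the ack reaches the device, so the line drops


def lost_ack_handler(device):
    pass  # the ack never reaches the device, so the line stays asserted
```

With good_handler the interrupt is delivered once. With lost_ack_handler the loop only stops at the artificial cap, which is the storm in miniature.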



* Re: panic in tg3 driver
  2011-01-25  0:59                 ` Matt Carlson
@ 2011-01-25  2:25                   ` Matt Carlson
  2011-04-13 18:23                     ` Stephen Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-25  2:25 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Stephen Clark, Linux Kernel Network Developers, Michael Chan

Just curious, is the problem still there if you add pci=nomsi to the
kernel command line?
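
To confirm that the option actually took effect after a reboot, the booted command line can be checked. A minimal sketch, assuming the command line is read from /proc/cmdline and that pci= may carry several comma-separated options. The helper name has_pci_option is hypothetical.

```python
def has_pci_option(cmdline, option="nomsi"):
    """Return True if any pci= parameter on the command line lists the option."""
    for token in cmdline.split():
        if token.startswith("pci="):
            if option in token[len("pci="):].split(","):
                return True
    return False
```

On the test box this would be called as has_pci_option(open("/proc/cmdline").read()).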



* Re: panic in tg3 driver
  2011-01-25  2:25                   ` Matt Carlson
@ 2011-04-13 18:23                     ` Stephen Clark
  0 siblings, 0 replies; 12+ messages in thread
From: Stephen Clark @ 2011-04-13 18:23 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/24/2011 09:25 PM, Matt Carlson wrote:
> Just curious, is the problem still there if you add pci=nomsi to the
> kernel command line?
>
>    
Sorry, I have been tied up.

With kernel 2.6.32-44.1.el6.i686 and pci=nomsi on the kernel command
line, it seems to work great.

[root@Z1010 ~]# ping -f 3.3.3.2
PING 3.3.3.2 (3.3.3.2) 56(84) bytes of data.
.^
--- 3.3.3.2 ping statistics ---
20562 packets transmitted, 20562 received, 0% packet loss, time 4408ms
rtt min/avg/max/mdev = 0.141/0.163/1.021/0.034 ms, ipg/ewma 0.214/0.161 ms









end of thread, other threads:[~2011-04-13 18:23 UTC | newest]

Thread overview: 12+ messages
     [not found] <4D2334B5.1060408@earthlink.net>
2011-01-09 22:30 ` panic in tg3 driver Stephen Clark
2011-01-10 19:22   ` Matt Carlson
2011-01-10 20:04     ` Stephen Clark
2011-01-11  2:00       ` Matt Carlson
2011-01-11 14:10         ` Stephen Clark
2011-01-12  3:06           ` Matt Carlson
2011-01-12 13:53             ` Stephen Clark
2011-01-13 13:12             ` Stephen Clark
2011-01-16 18:11               ` Stephen Clark
2011-01-25  0:59                 ` Matt Carlson
2011-01-25  2:25                   ` Matt Carlson
2011-04-13 18:23                     ` Stephen Clark
