* Re: panic in tg3 driver
       [not found] <4D2334B5.1060408@earthlink.net>
@ 2011-01-09 22:30 ` Stephen Clark
  2011-01-10 19:22   ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-09 22:30 UTC
  To: Linux Kernel Network Developers; +Cc: Michael Chan, Matt Carlson

On 01/04/2011 09:54 AM, Stephen Clark wrote:
> Hello,
>
> The hardware is an Acrosser AR-M0898B micro box.
> lspci
> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge
> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA Controller (rev 20)
> 00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 07)
> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge
> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge
> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge
> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
>
> Kernel 2.6.36-2.el5.elrepo on an i686
>
> When I try to ifconfig either of the BCM5906M ports the system panics.
>
> Ideas, fixes ?
>
> [root@Z1010 ~]# modprobe tg3
> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24
> ------------[ cut here ]------------
> kernel BUG at drivers/net/tg3.c:4365!
> invalid opcode: 0000 [#1] PREEMPT SMP
> last sysfs file: /sys/class/net/eth3/address
> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state iptable_mangle af_ke]
>
> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 CN700-8251/
> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0
> EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff
> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 task.ti=dee62000)
> Stack:
> dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000
> <0> df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 c1801f94
> <0> e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 dfab4000
> Call Trace:
> [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3]
> [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3]
> [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44]
> [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3]
> [<c0674058>] ? net_rx_action+0x7e/0x11c
> [<c04409c9>] ? __do_softirq+0x85/0x10c
> [<c0440944>] ? __do_softirq+0x0/0x10c
> <IRQ>
> [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87
> [<c044051b>] ? local_bh_enable_ip+0xd/0xf
> [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e
> [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf
> [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3]
> [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3]
> [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3]
> [<c044ec37>] ? process_one_work+0x10b/0x1bc
> [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3]
> [<c044fd41>] ? worker_thread+0x77/0xf9
> [<c0453048>] ? kthread+0x60/0x65
> [<c044fcca>] ? worker_thread+0x0/0xf9
> [<c0452fe8>] ? kthread+0x0/0x65
> [<c040337e>] ? kernel_thread_helper+0x6/0x10
> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44 00 00 f6 8
> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30
> ---[ end trace 82381e9b93e397ad ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 20303, comm: kworker/0:2 Tainted: G D 2.6.36-2.el5.elrepo #1
> Call Trace:
> [<c043b3cd>] panic+0x62/0x15d
> [<c06fb7d1>] oops_end+0x99/0xa8
> [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3]
> [<c0405a62>] die+0x58/0x5e
>
> Thanks,
> Steve
>
Additional info: I compiled the latest kernel, 2.6.37-rc8+, and still have the
problem.

Also, when booting with noapic, I see this in the dmesg log, and interrupts are
increasing like crazy:

tg3.c:v3.115 (October 14, 2010)
tg3 0000:81:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
tg3 0000:81:00.0: setting latency timer to 64
tg3 0000:81:00.0: PCI: Disallowing DAC for device
tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
tg3 0000:82:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
tg3 0000:82:00.0: setting latency timer to 64
tg3 0000:82:00.0: PCI: Disallowing DAC for device
tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]
tg3 0000:81:00.0: irq 40 for MSI/MSI-X
tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
ADDRCONF(NETDEV_UP): eth2: link is not ready

[root@Z1010 ~]# cat /proc/interrupts
           CPU0
  0:        162    XT-PIC-XT-PIC    timer
  1:          2    XT-PIC-XT-PIC    i8042
  2:          0    XT-PIC-XT-PIC    cascade
  3:          1    XT-PIC-XT-PIC
  4:       4863    XT-PIC-XT-PIC    serial
  6:          2    XT-PIC-XT-PIC    floppy
  7:          5    XT-PIC-XT-PIC    ehci_hcd:usb1, uhci_hcd:usb3
  8:          0    XT-PIC-XT-PIC    rtc0
  9:          0    XT-PIC-XT-PIC    acpi
 10:    2334234    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2

[root@Z1010 ~]# cat /proc/interrupts |grep eth2
 10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
[root@Z1010 ~]# cat /proc/interrupts |grep eth2
 10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2

--

"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety." (Ben Franklin)

"The course of history shows that as a government grows, liberty
decreases." (Thomas Jefferson)

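The growth on IRQ 10 reported in this message can be put into numbers by differencing two readings. What follows is only a minimal Python sketch, assuming the /proc/interrupts layout seen in the output (IRQ number, one count per CPU, controller name, then a comma-separated device list). The helper name irq_count is made up for illustration, and the two sample strings are the grep results from this message.

```python
def irq_count(interrupts_text, device):
    """Return the summed per-CPU count of the first IRQ row that lists `device`.

    Returns None when no row mentions the device. Assumes the row layout
    "<irq>: <count per CPU>... <controller> <dev>, <dev>, ...".
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "CPU0 CPU1 ..." header row.
        if not fields or not fields[0].endswith(":"):
            continue
        names = [field.rstrip(",") for field in fields]
        if device not in names:
            continue
        total = 0
        for field in fields[1:]:
            if not field.isdigit():
                break  # first non-numeric field is the controller name
            total += int(field)
        return total
    return None


if __name__ == "__main__":
    # The two readings quoted in the message (taken an unknown interval apart).
    first = "10: 18388914 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2"
    second = "10: 18901627 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2"
    delta = irq_count(second, "eth2") - irq_count(first, "eth2")
    print("IRQ 10 grew by", delta, "interrupts between the two readings")
```

On a live system the same function could be fed the contents of /proc/interrupts read twice with a known sleep in between, which would turn the delta into a rate. Because IRQ 10 is shared, the number alone does not say which of uhci_hcd:usb2, eth0, or eth2 is raising the line.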
* Re: panic in tg3 driver
  2011-01-09 22:30 ` panic in tg3 driver Stephen Clark
@ 2011-01-10 19:22   ` Matt Carlson
  2011-01-10 20:04     ` Stephen Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-10 19:22 UTC
  To: Stephen Clark
  Cc: Linux Kernel Network Developers, Michael Chan, Matthew Carlson

On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote:
> On 01/04/2011 09:54 AM, Stephen Clark wrote:
> > [...]
> > kernel BUG at drivers/net/tg3.c:4365!
> > [...]
> > EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
> > [...]

I think drivers/net/tg3.c:4365 is at the line that reads
"spin_lock(&tp->lock);" in tg3_tx_recover. Can you verify?

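Checking which source line a BUG banner points at is mechanical once the matching tg3.c is at hand. The following is a rough Python sketch, not anything taken from the thread: bug_location and source_context are hypothetical helper names, and the regular expression assumes the banner has the form "kernel BUG at <path>:<line>!" as in the oops quoted in this thread.

```python
import re

# Matches the banner printed at the top of the oops,
# e.g. "kernel BUG at drivers/net/tg3.c:4365!"
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")


def bug_location(oops_text):
    """Return (path, line_number) from the first BUG banner, or None."""
    match = BUG_RE.search(oops_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def source_context(source_text, lineno, radius=2):
    """Return [(line_number, text), ...] for the lines around `lineno`.

    Line numbers are 1-based, as in the BUG banner. An out-of-range
    `lineno` yields an empty list.
    """
    lines = source_text.splitlines()
    first = max(lineno - radius, 1)
    last = min(lineno + radius, len(lines))
    return [(n, lines[n - 1]) for n in range(first, last + 1)]
```

The lookup is only meaningful when source_text comes from the same kernel tree that produced the oops; line numbers in tg3.c shift between releases.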
* Re: panic in tg3 driver
  2011-01-10 19:22   ` Matt Carlson
@ 2011-01-10 20:04     ` Stephen Clark
  2011-01-11  2:00       ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-10 20:04 UTC
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/10/2011 02:22 PM, Matt Carlson wrote:
> [...]
> I think drivers/net/tg3.c:4365 is at the line that reads
> "spin_lock(&tp->lock);" in tg3_tx_recover. Can you verify?
>
tg3_readphy(tp, MII_TG3_DSP_RW_PORT, &phy2);

in static void tg3_serdes_parallel_detect(struct tg3 *tp)

The driver version is:
#define DRV_MODULE_NAME "tg3"
#define TG3_MAJ_NUM 3
#define TG3_MIN_NUM 115

* Re: panic in tg3 driver 2011-01-10 20:04 ` Stephen Clark @ 2011-01-11 2:00 ` Matt Carlson 2011-01-11 14:10 ` Stephen Clark 0 siblings, 1 reply; 12+ messages in thread From: Matt Carlson @ 2011-01-11 2:00 UTC (permalink / raw) To: Stephen Clark Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote: > On 01/10/2011 02:22 PM, Matt Carlson wrote: > > On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote: > > > >> On 01/04/2011 09:54 AM, Stephen Clark wrote: > >> > >>> Hello, > >>> > >>> > >>> The hardware is an Acrosser AR-M0898B micro box. > >>> lspci > >>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>> Host Bridge > >>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>> Host Bridge > >>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>> Host Bridge > >>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge > >>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>> Host Bridge > >>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>> Host Bridge > >>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge > >>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA > >>> Controller (rev > >>> 20) > >>> 00:0f.1 IDE interface: VIA Technologies, Inc. > >>> VT82C586A/B/VT82C686/A/B/VT823x/A/ > >>> C PIPC Bus Master IDE (rev 07) > >>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>> Controller > >>> (rev 91) > >>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>> Controller > >>> (rev 91) > >>> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>> Controller > >>> (rev 91) > >>> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>> Controller > >>> (rev 91) > >>> 00:10.4 USB Controller: VIA Technologies, Inc. 
USB 2.0 (rev 90) > >>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge > >>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller > >>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge > >>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge > >>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T > >>> (rev 02) > >>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T > >>> (rev 02) > >>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port > >>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port > >>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M > >>> Fast Ethernet > >>> PCI Express (rev 02) > >>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M > >>> Fast Ethernet > >>> PCI Express (rev 02) > >>> > >>> Kernel 2.6.36-2.el5.elrepo on an i686 > >>> > >>> When I try to ifconfig either of the BCM5906M ports the system panics. > >>> > >>> Ideas, fixes ? > >>> > >>> [root@Z1010 ~]# modprobe tg3 > >>> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24 > >>> ------------[ cut here ]------------ > >>> kernel BUG at drivers/net/tg3.c:4365! 
> >>> invalid opcode: 0000 [#1] PREEMPT SMP > >>> last sysfs file: /sys/class/net/eth3/address > >>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state > >>> iptable_mangle af_ke] > >>> > >>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 > >>> CN700-8251/ > >>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0 > >>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3] > >>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff > >>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30 > >>> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 > >>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 > >>> task.ti=dee62000) > >>> Stack: > >>> dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000 > >>> <0> df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 > >>> c1801f94 > >>> <0> e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 > >>> dfab4000 > >>> Call Trace: > >>> [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3] > >>> [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3] > >>> [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44] > >>> [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3] > >>> [<c0674058>] ? net_rx_action+0x7e/0x11c > >>> [<c04409c9>] ? __do_softirq+0x85/0x10c > >>> [<c0440944>] ? __do_softirq+0x0/0x10c > >>> <IRQ> > >>> [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87 > >>> [<c044051b>] ? local_bh_enable_ip+0xd/0xf > >>> [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e > >>> [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf > >>> [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3] > >>> [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3] > >>> [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3] > >>> [<c044ec37>] ? process_one_work+0x10b/0x1bc > >>> [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3] > >>> [<c044fd41>] ? worker_thread+0x77/0xf9 > >>> [<c0453048>] ? kthread+0x60/0x65 > >>> [<c044fcca>] ? worker_thread+0x0/0xf9 > >>> [<c0452fe8>] ? kthread+0x0/0x65 > >>> [<c040337e>] ? 
kernel_thread_helper+0x6/0x10 > >>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44 > >>> 00 00 f6 8 > >>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30 > >>> ---[ end trace 82381e9b93e397ad ]--- > >>> Kernel panic - not syncing: Fatal exception in interrupt > >>> Pid: 20303, comm: kworker/0:2 Tainted: G D > >>> 2.6.36-2.el5.elrepo #1 > >>> Call Trace: > >>> [<c043b3cd>] panic+0x62/0x15d > >>> [<c06fb7d1>] oops_end+0x99/0xa8 > >>> [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3] > >>> [<c0405a62>] die+0x58/0x5e > >>> > >>> Thanks, > >>> Steve > >>> > >>> > >> Additonal info I compiled the latest kernel 2.6.37-rc8+ and still have the problem. > >> Also boot with noapic I see this in the dmesg log and interrupts are increasing > >> like crazy: > >> tg3.c:v3.115 (October 14, 2010) > >> tg3 0000:81:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 > >> tg3 0000:81:00.0: setting latency timer to 64 > >> tg3 0000:81:00.0: PCI: Disallowing DAC for device > >> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add > >> ress 00:02:b6:36:d1:39 > >> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed > >> [0]) > >> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] > >> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit] > >> tg3 0000:82:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 > >> tg3 0000:82:00.0: setting latency timer to 64 > >> tg3 0000:82:00.0: PCI: Disallowing DAC for device > >> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add > >> ress 00:02:b6:36:d1:3a > >> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed > >> [0]) > >> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] > >> tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit] > >> tg3 0000:81:00.0: irq 40 for MSI/MSI-X > >> tg3 0000:81:00.0: eth2: No 
interrupt was generated using MSI. Switching to INTx > >> mode. Please report this failure to the PCI maintainer and include system chipse > >> t information > >> ADDRCONF(NETDEV_UP): eth2: link is not ready > >> [root@Z1010 ~]# cat /proc/interrupts > >> CPU0 > >> 0: 162 XT-PIC-XT-PIC timer > >> 1: 2 XT-PIC-XT-PIC i8042 > >> 2: 0 XT-PIC-XT-PIC cascade > >> 3: 1 XT-PIC-XT-PIC > >> 4: 4863 XT-PIC-XT-PIC serial > >> 6: 2 XT-PIC-XT-PIC floppy > >> 7: 5 XT-PIC-XT-PIC ehci_hcd:usb1, uhci_hcd:usb3 > >> 8: 0 XT-PIC-XT-PIC rtc0 > >> 9: 0 XT-PIC-XT-PIC acpi > >> 10: 2334234 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >> > >> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 > >> 10: 18388914 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 > >> 10: 18901627 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >> > >> -- > >> > >> "They that give up essential liberty to obtain temporary safety, > >> deserve neither liberty nor safety." (Ben Franklin) > >> > >> "The course of history shows that as a government grows, liberty > >> decreases." (Thomas Jefferson) > >> > > I think drivers/net/tg3.c:4365 is at the line that reads > > "spin_lock(&tp->lock);" in tg3_tx_recover. Can you verify? > > > > > > > tg3_readphy(tp, MII_TG3_DSP_RW_PORT, &phy2); > > in static void tg3_serdes_parallel_detect(struct tg3 *tp) > > The driver version is: > #define DRV_MODULE_NAME "tg3" > #define TG3_MAJ_NUM 3 > #define TG3_MIN_NUM 115 That doesn't look right. The line number I quoted came from the kernel panic output from 2.6.36-2.el5.elrepo. I'm guessing you quoted me the sources from the tg3.c file in 2.6.37-rc8+. If you don't have the 2.6.36-2.el5.elrepo sources readily available, can you give me the line the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+ sources? It looks like there are a lot of devices on IRQ 10. Does the interrupt count drop if you bring down eth0 (which I'm guessing is the b44 device)? 
Can you tell me if you saw the following message in the syslogs?

"The system may be re-ordering memory-mapped I/O cycles to the network device, attempting to recover. Please report the problem to the driver maintainer and include system chipset information."
* Re: panic in tg3 driver 2011-01-11 2:00 ` Matt Carlson @ 2011-01-11 14:10 ` Stephen Clark 2011-01-12 3:06 ` Matt Carlson 0 siblings, 1 reply; 12+ messages in thread From: Stephen Clark @ 2011-01-11 14:10 UTC (permalink / raw) To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan On 01/10/2011 09:00 PM, Matt Carlson wrote: > On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote: > >> On 01/10/2011 02:22 PM, Matt Carlson wrote: >> >>> On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote: >>> >>> >>>> On 01/04/2011 09:54 AM, Stephen Clark wrote: >>>> >>>> >>>>> Hello, >>>>> >>>>> >>>>> The hardware is an Acrosser AR-M0898B micro box. >>>>> lspci >>>>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro >>>>> Host Bridge >>>>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro >>>>> Host Bridge >>>>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro >>>>> Host Bridge >>>>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge >>>>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro >>>>> Host Bridge >>>>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro >>>>> Host Bridge >>>>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge >>>>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA >>>>> Controller (rev >>>>> 20) >>>>> 00:0f.1 IDE interface: VIA Technologies, Inc. >>>>> VT82C586A/B/VT82C686/A/B/VT823x/A/ >>>>> C PIPC Bus Master IDE (rev 07) >>>>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 >>>>> Controller >>>>> (rev 91) >>>>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 >>>>> Controller >>>>> (rev 91) >>>>> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 >>>>> Controller >>>>> (rev 91) >>>>> 00:10.3 USB Controller: VIA Technologies, Inc. 
VT82xxxxx UHCI USB 1.1 >>>>> Controller >>>>> (rev 91) >>>>> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90) >>>>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge >>>>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller >>>>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge >>>>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge >>>>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T >>>>> (rev 02) >>>>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T >>>>> (rev 02) >>>>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port >>>>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port >>>>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M >>>>> Fast Ethernet >>>>> PCI Express (rev 02) >>>>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M >>>>> Fast Ethernet >>>>> PCI Express (rev 02) >>>>> >>>>> Kernel 2.6.36-2.el5.elrepo on an i686 >>>>> >>>>> When I try to ifconfig either of the BCM5906M ports the system panics. >>>>> >>>>> Ideas, fixes ? >>>>> >>>>> [root@Z1010 ~]# modprobe tg3 >>>>> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24 >>>>> ------------[ cut here ]------------ >>>>> kernel BUG at drivers/net/tg3.c:4365! 
>>>>> invalid opcode: 0000 [#1] PREEMPT SMP >>>>> last sysfs file: /sys/class/net/eth3/address >>>>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state >>>>> iptable_mangle af_ke] >>>>> >>>>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 >>>>> CN700-8251/ >>>>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0 >>>>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3] >>>>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff >>>>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30 >>>>> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 >>>>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 >>>>> task.ti=dee62000) >>>>> Stack: >>>>> dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000 >>>>> <0> df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 >>>>> c1801f94 >>>>> <0> e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 >>>>> dfab4000 >>>>> Call Trace: >>>>> [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3] >>>>> [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3] >>>>> [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44] >>>>> [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3] >>>>> [<c0674058>] ? net_rx_action+0x7e/0x11c >>>>> [<c04409c9>] ? __do_softirq+0x85/0x10c >>>>> [<c0440944>] ? __do_softirq+0x0/0x10c >>>>> <IRQ> >>>>> [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87 >>>>> [<c044051b>] ? local_bh_enable_ip+0xd/0xf >>>>> [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e >>>>> [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf >>>>> [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3] >>>>> [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3] >>>>> [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3] >>>>> [<c044ec37>] ? process_one_work+0x10b/0x1bc >>>>> [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3] >>>>> [<c044fd41>] ? worker_thread+0x77/0xf9 >>>>> [<c0453048>] ? kthread+0x60/0x65 >>>>> [<c044fcca>] ? worker_thread+0x0/0xf9 >>>>> [<c0452fe8>] ? kthread+0x0/0x65 >>>>> [<c040337e>] ? 
kernel_thread_helper+0x6/0x10 >>>>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44 >>>>> 00 00 f6 8 >>>>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30 >>>>> ---[ end trace 82381e9b93e397ad ]--- >>>>> Kernel panic - not syncing: Fatal exception in interrupt >>>>> Pid: 20303, comm: kworker/0:2 Tainted: G D >>>>> 2.6.36-2.el5.elrepo #1 >>>>> Call Trace: >>>>> [<c043b3cd>] panic+0x62/0x15d >>>>> [<c06fb7d1>] oops_end+0x99/0xa8 >>>>> [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3] >>>>> [<c0405a62>] die+0x58/0x5e >>>>> >>>>> Thanks, >>>>> Steve >>>>> >>>>> >>>>> >>>> Additonal info I compiled the latest kernel 2.6.37-rc8+ and still have the problem. >>>> Also boot with noapic I see this in the dmesg log and interrupts are increasing >>>> like crazy: >>>> tg3.c:v3.115 (October 14, 2010) >>>> tg3 0000:81:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 >>>> tg3 0000:81:00.0: setting latency timer to 64 >>>> tg3 0000:81:00.0: PCI: Disallowing DAC for device >>>> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add >>>> ress 00:02:b6:36:d1:39 >>>> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed >>>> [0]) >>>> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] >>>> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit] >>>> tg3 0000:82:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 >>>> tg3 0000:82:00.0: setting latency timer to 64 >>>> tg3 0000:82:00.0: PCI: Disallowing DAC for device >>>> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add >>>> ress 00:02:b6:36:d1:3a >>>> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed >>>> [0]) >>>> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] >>>> tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit] >>>> tg3 0000:81:00.0: irq 40 for MSI/MSI-X >>>> tg3 0000:81:00.0: eth2: No 
interrupt was generated using MSI. Switching to INTx >>>> mode. Please report this failure to the PCI maintainer and include system chipse >>>> t information >>>> ADDRCONF(NETDEV_UP): eth2: link is not ready >>>> [root@Z1010 ~]# cat /proc/interrupts >>>> CPU0 >>>> 0: 162 XT-PIC-XT-PIC timer >>>> 1: 2 XT-PIC-XT-PIC i8042 >>>> 2: 0 XT-PIC-XT-PIC cascade >>>> 3: 1 XT-PIC-XT-PIC >>>> 4: 4863 XT-PIC-XT-PIC serial >>>> 6: 2 XT-PIC-XT-PIC floppy >>>> 7: 5 XT-PIC-XT-PIC ehci_hcd:usb1, uhci_hcd:usb3 >>>> 8: 0 XT-PIC-XT-PIC rtc0 >>>> 9: 0 XT-PIC-XT-PIC acpi >>>> 10: 2334234 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 >>>> >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 >>>> 10: 18388914 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 >>>> 10: 18901627 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 >>>> >>>> -- >>>> >>>> "They that give up essential liberty to obtain temporary safety, >>>> deserve neither liberty nor safety." (Ben Franklin) >>>> >>>> "The course of history shows that as a government grows, liberty >>>> decreases." (Thomas Jefferson) >>>> >>>> >>> I think drivers/net/tg3.c:4365 is at the line that reads >>> "spin_lock(&tp->lock);" in tg3_tx_recover. Can you verify? >>> >>> >>> >> >> tg3_readphy(tp, MII_TG3_DSP_RW_PORT,&phy2); >> >> in static void tg3_serdes_parallel_detect(struct tg3 *tp) >> >> The driver version is: >> #define DRV_MODULE_NAME "tg3" >> #define TG3_MAJ_NUM 3 >> #define TG3_MIN_NUM 115 >> > > That doesn't look right. The line number I quoted came from the kernel > panic output from 2.6.36-2.el5.elrepo. I'm guessing you quoted me the > sources from the tg3.c file in 2.6.37-rc8+. If you don't have the > 2.6.36-2.el5.elrepo sources readily available, can you give me the line > the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+ > sources? > > Oops - You are correct. The problem is most of the time I don't get a panic on the screen the box simply reboots. 
I'll see if I can get the 2.6.36-2 sources - though they are supposed to be the virgin kernel.org sources simply recompiled for CentOS.

static void tg3_tx_recover(struct tg3 *tp)
{
        BUG_ON((tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
4365:          tp->write32_tx_mbox == tg3_write_indirect_mbox);

> It looks like there are a lot of devices on IRQ 10. Does the interrupt
> count drop if you bring down eth0 (which I'm guessing is the b44 device)?
>

This happens when I boot with noapic, which I only did as a test. With the noapic option the system doesn't panic, but it gets all these extra interrupts as soon as I ifconfig one of the 5906 ports.

> Can you tell me if you saw the following message in the syslogs?
>
> "The system may be re-ordering memory-mapped I/O cycles to the network
> device, attempting to recover. Please report the problem to the driver
> maintainer and include system chipset information."
>
>

Couldn't find this in the messages file.
* Re: panic in tg3 driver 2011-01-11 14:10 ` Stephen Clark @ 2011-01-12 3:06 ` Matt Carlson 2011-01-12 13:53 ` Stephen Clark 2011-01-13 13:12 ` Stephen Clark 0 siblings, 2 replies; 12+ messages in thread From: Matt Carlson @ 2011-01-12 3:06 UTC (permalink / raw) To: Stephen Clark Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan On Tue, Jan 11, 2011 at 06:10:55AM -0800, Stephen Clark wrote: > On 01/10/2011 09:00 PM, Matt Carlson wrote: > > On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote: > > > >> On 01/10/2011 02:22 PM, Matt Carlson wrote: > >> > >>> On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote: > >>> > >>> > >>>> On 01/04/2011 09:54 AM, Stephen Clark wrote: > >>>> > >>>> > >>>>> Hello, > >>>>> > >>>>> > >>>>> The hardware is an Acrosser AR-M0898B micro box. > >>>>> lspci > >>>>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>>>> Host Bridge > >>>>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>>>> Host Bridge > >>>>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>>>> Host Bridge > >>>>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge > >>>>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>>>> Host Bridge > >>>>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro > >>>>> Host Bridge > >>>>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge > >>>>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA > >>>>> Controller (rev > >>>>> 20) > >>>>> 00:0f.1 IDE interface: VIA Technologies, Inc. > >>>>> VT82C586A/B/VT82C686/A/B/VT823x/A/ > >>>>> C PIPC Bus Master IDE (rev 07) > >>>>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>>>> Controller > >>>>> (rev 91) > >>>>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>>>> Controller > >>>>> (rev 91) > >>>>> 00:10.2 USB Controller: VIA Technologies, Inc. 
VT82xxxxx UHCI USB 1.1 > >>>>> Controller > >>>>> (rev 91) > >>>>> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > >>>>> Controller > >>>>> (rev 91) > >>>>> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90) > >>>>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge > >>>>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller > >>>>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge > >>>>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge > >>>>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T > >>>>> (rev 02) > >>>>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T > >>>>> (rev 02) > >>>>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port > >>>>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port > >>>>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M > >>>>> Fast Ethernet > >>>>> PCI Express (rev 02) > >>>>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M > >>>>> Fast Ethernet > >>>>> PCI Express (rev 02) > >>>>> > >>>>> Kernel 2.6.36-2.el5.elrepo on an i686 > >>>>> > >>>>> When I try to ifconfig either of the BCM5906M ports the system panics. > >>>>> > >>>>> Ideas, fixes ? > >>>>> > >>>>> [root@Z1010 ~]# modprobe tg3 > >>>>> [root@Z1010 ~]# ifconfig eth2 2.2.2.2/24 > >>>>> ------------[ cut here ]------------ > >>>>> kernel BUG at drivers/net/tg3.c:4365! 
> >>>>> invalid opcode: 0000 [#1] PREEMPT SMP > >>>>> last sysfs file: /sys/class/net/eth3/address > >>>>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state > >>>>> iptable_mangle af_ke] > >>>>> > >>>>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 > >>>>> CN700-8251/ > >>>>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0 > >>>>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3] > >>>>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff > >>>>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30 > >>>>> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 > >>>>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 > >>>>> task.ti=dee62000) > >>>>> Stack: > >>>>> dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000 > >>>>> <0> df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 > >>>>> c1801f94 > >>>>> <0> e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 > >>>>> dfab4000 > >>>>> Call Trace: > >>>>> [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3] > >>>>> [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3] > >>>>> [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44] > >>>>> [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3] > >>>>> [<c0674058>] ? net_rx_action+0x7e/0x11c > >>>>> [<c04409c9>] ? __do_softirq+0x85/0x10c > >>>>> [<c0440944>] ? __do_softirq+0x0/0x10c > >>>>> <IRQ> > >>>>> [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87 > >>>>> [<c044051b>] ? local_bh_enable_ip+0xd/0xf > >>>>> [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e > >>>>> [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf > >>>>> [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3] > >>>>> [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3] > >>>>> [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3] > >>>>> [<c044ec37>] ? process_one_work+0x10b/0x1bc > >>>>> [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3] > >>>>> [<c044fd41>] ? worker_thread+0x77/0xf9 > >>>>> [<c0453048>] ? kthread+0x60/0x65 > >>>>> [<c044fcca>] ? worker_thread+0x0/0xf9 > >>>>> [<c0452fe8>] ? 
kthread+0x0/0x65 > >>>>> [<c040337e>] ? kernel_thread_helper+0x6/0x10 > >>>>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44 > >>>>> 00 00 f6 8 > >>>>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30 > >>>>> ---[ end trace 82381e9b93e397ad ]--- > >>>>> Kernel panic - not syncing: Fatal exception in interrupt > >>>>> Pid: 20303, comm: kworker/0:2 Tainted: G D > >>>>> 2.6.36-2.el5.elrepo #1 > >>>>> Call Trace: > >>>>> [<c043b3cd>] panic+0x62/0x15d > >>>>> [<c06fb7d1>] oops_end+0x99/0xa8 > >>>>> [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3] > >>>>> [<c0405a62>] die+0x58/0x5e > >>>>> > >>>>> Thanks, > >>>>> Steve > >>>>> > >>>>> > >>>>> > >>>> Additonal info I compiled the latest kernel 2.6.37-rc8+ and still have the problem. > >>>> Also boot with noapic I see this in the dmesg log and interrupts are increasing > >>>> like crazy: > >>>> tg3.c:v3.115 (October 14, 2010) > >>>> tg3 0000:81:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 > >>>> tg3 0000:81:00.0: setting latency timer to 64 > >>>> tg3 0000:81:00.0: PCI: Disallowing DAC for device > >>>> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add > >>>> ress 00:02:b6:36:d1:39 > >>>> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed > >>>> [0]) > >>>> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] > >>>> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit] > >>>> tg3 0000:82:00.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10 > >>>> tg3 0000:82:00.0: setting latency timer to 64 > >>>> tg3 0000:82:00.0: PCI: Disallowing DAC for device > >>>> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC add > >>>> ress 00:02:b6:36:d1:3a > >>>> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed > >>>> [0]) > >>>> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] > >>>> tg3 0000:82:00.0: 
eth3: dma_rwctrl[76180000] dma_mask[32-bit] > >>>> tg3 0000:81:00.0: irq 40 for MSI/MSI-X > >>>> tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx > >>>> mode. Please report this failure to the PCI maintainer and include system chipse > >>>> t information > >>>> ADDRCONF(NETDEV_UP): eth2: link is not ready > >>>> [root@Z1010 ~]# cat /proc/interrupts > >>>> CPU0 > >>>> 0: 162 XT-PIC-XT-PIC timer > >>>> 1: 2 XT-PIC-XT-PIC i8042 > >>>> 2: 0 XT-PIC-XT-PIC cascade > >>>> 3: 1 XT-PIC-XT-PIC > >>>> 4: 4863 XT-PIC-XT-PIC serial > >>>> 6: 2 XT-PIC-XT-PIC floppy > >>>> 7: 5 XT-PIC-XT-PIC ehci_hcd:usb1, uhci_hcd:usb3 > >>>> 8: 0 XT-PIC-XT-PIC rtc0 > >>>> 9: 0 XT-PIC-XT-PIC acpi > >>>> 10: 2334234 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >>>> > >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 > >>>> 10: 18388914 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >>>> [root@Z1010 ~]# cat /proc/interrupts |grep eth2 > >>>> 10: 18901627 XT-PIC-XT-PIC uhci_hcd:usb2, eth0, eth2 > >>>> > >>>> -- > >>>> > >>>> "They that give up essential liberty to obtain temporary safety, > >>>> deserve neither liberty nor safety." (Ben Franklin) > >>>> > >>>> "The course of history shows that as a government grows, liberty > >>>> decreases." (Thomas Jefferson) > >>>> > >>>> > >>> I think drivers/net/tg3.c:4365 is at the line that reads > >>> "spin_lock(&tp->lock);" in tg3_tx_recover. Can you verify? > >>> > >>> > >>> > >> > >> tg3_readphy(tp, MII_TG3_DSP_RW_PORT,&phy2); > >> > >> in static void tg3_serdes_parallel_detect(struct tg3 *tp) > >> > >> The driver version is: > >> #define DRV_MODULE_NAME "tg3" > >> #define TG3_MAJ_NUM 3 > >> #define TG3_MIN_NUM 115 > >> > > > > That doesn't look right. The line number I quoted came from the kernel > > panic output from 2.6.36-2.el5.elrepo. I'm guessing you quoted me the > > sources from the tg3.c file in 2.6.37-rc8+. 
If you don't have the
> > 2.6.36-2.el5.elrepo sources readily available, can you give me the line
> > the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+
> > sources?
> >
> >
> Oops - You are correct. The problem is most of the time I don't get a
> panic on the
> screen the box simply reboots.
>
> I'll see if I can get the 2.6.36-2 sources - though they are suppose to
> be the virgin
> kernel.org sources simply recompiled for Centos.
>
> static void tg3_tx_recover(struct tg3 *tp)
> {
>         BUG_ON((tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
> 4365:          tp->write32_tx_mbox == tg3_write_indirect_mbox);
>
>
> > It looks like there are a lot of devices on IRQ 10. Does the interrupt
> > count drop if you bring down eth0 (which I'm guessing is the b44 device)?
> >
> This happens when I boot with noapic. Which I only did as a test. With
> the noapic option
> the system doesn't panic - but gets all these extra interrupts as soon
> as I ifconfig one of
> the 5906 ports.

I was wondering if the b44 device is having a problem with shared interrupts.

> > Can you tell me if you saw the following message in the syslogs?
> >
> > "The system may be re-ordering memory-mapped I/O cycles to the network
> > device, attempting to recover. Please report the problem to the driver
> > maintainer and include system chipset information."
> >
> >
> Couldn't find this in the messages file.

Can you give me the output of 'lspci -vvv -xxx -s 81:00.0' and 'ethtool -i eth2'?

I'm wondering if this BUG_ON is a symptom of a different problem masquerading as a write-reordering bug.

Do you have IPv6 configured? If not, what happens if you just run 'ifconfig eth2 up', without assigning an IP address?
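The BUG_ON fragment quoted above can be read as a guard: recovery from mailbox write reordering only makes sense if the workaround is not already active. Below is a minimal userspace sketch of that reading, not the driver's code. Only TG3_FLAG_MBOX_WRITE_REORDER, write32_tx_mbox and tg3_write_indirect_mbox come from the quoted fragment; struct tg3_model, the flag's value, model_write_direct_mbox and tx_recover_would_bug are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; the real flag's bit position is not shown in the thread. */
#define TG3_FLAG_MBOX_WRITE_REORDER 0x1u

struct tg3_model;
typedef void (*mbox_writer)(struct tg3_model *tp, uint32_t off, uint32_t val);

/* Hypothetical stand-in for the two fields the quoted BUG_ON inspects. */
struct tg3_model {
	uint32_t tg3_flags;
	mbox_writer write32_tx_mbox;
};

/* Placeholder writers: only their addresses matter for the comparison. */
void tg3_write_indirect_mbox(struct tg3_model *tp, uint32_t off, uint32_t val)
{
	(void)tp; (void)off; (void)val;
}

void model_write_direct_mbox(struct tg3_model *tp, uint32_t off, uint32_t val)
{
	(void)tp; (void)off; (void)val;
}

/*
 * Returns true when the condition from the quoted BUG_ON holds, i.e. the
 * reorder workaround is already flagged or TX mailbox writes are already
 * indirect, so "write reordering" cannot explain the failure.
 */
bool tx_recover_would_bug(const struct tg3_model *tp)
{
	assert(tp != NULL);
	return (tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
	       tp->write32_tx_mbox == tg3_write_indirect_mbox;
}
```

Under this reading, a panic at line 4365 means one of the two conditions was already true when the TX path asked for recovery, which would fit the suspicion that something other than write reordering is at work. The thread does not show the rest of the function, so where the syslog warning is emitted relative to this check is left open here.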
* Re: panic in tg3 driver 2011-01-12 3:06 ` Matt Carlson @ 2011-01-12 13:53 ` Stephen Clark 2011-01-13 13:12 ` Stephen Clark 1 sibling, 0 replies; 12+ messages in thread From: Stephen Clark @ 2011-01-12 13:53 UTC (permalink / raw) To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan On 01/11/2011 10:06 PM, Matt Carlson wrote: > lspci -vvv -xxx -s 81:00.0 Linux Z1010.netwolves.com 2.6.37 #9 SMP PREEMPT Wed Jan 5 11:14:46 EST 2011 i686 i686 i386 GNU/Linux [root@Z1010 ~]# lspci -vvv -xxx -s 81:00.0 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02) Subsystem: Broadcom Corporation Unknown device 9713 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepp ing- SERR- FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- < MAbort- >SERR- <PERR- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 10 Region 0: Memory at cfbf0000 (64-bit, non-prefetchable) [size=64K] Capabilities: [48] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+ ,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] Vital Product Data Capabilities: [58] Vendor Specific Information Capabilities: [e8] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 00000000fee0100c Data: 4169 Capabilities: [d0] Express Endpoint IRQ 0 Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag+ Device: Latency L0s <4us, L1 unlimited Device: AtnBtn- AtnInd- PwrInd- Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported- Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- Device: MaxPayload 128 bytes, MaxReadReq 512 bytes Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0 Link: Latency L0s <4us, L1 <64us Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch- Link: Speed 2.5Gb/s, Width x1 Capabilities: [100] Advanced Error Reporting Capabilities: [13c] Virtual Channel Capabilities: [160] Device Serial Number 
39-d1-36-fe-ff-b6-02-00 00: e4 14 13 17 06 00 10 00 02 00 00 02 10 00 00 00 10: 04 00 bf cf 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 e4 14 13 97 30: 00 00 00 00 48 00 00 00 00 00 00 00 0a 01 00 00 40: 00 00 00 00 00 00 00 00 01 50 03 c0 08 00 00 00 50: 03 58 fc 00 78 00 00 00 09 e8 78 00 64 2f 72 64 60: 00 00 00 00 00 00 00 00 00 00 02 c0 00 00 00 00 70: 12 12 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 fe 50 08 24 90: 01 92 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 10 00 01 00 a0 8f 64 00 00 20 10 00 11 6c 03 00 e0: 00 00 11 10 00 00 00 00 05 d0 80 00 0c 10 e0 fe f0: 00 00 00 00 69 41 00 00 00 00 00 00 00 00 00 00 modprobe tg3 dmesg output: tg3.c:v3.115 (October 14, 2010) tg3 0000:81:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 tg3 0000:81:00.0: setting latency timer to 64 tg3 0000:81:00.0: PCI: Disallowing DAC for device tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC addr ess 00:02:b6:36:d1:39 tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[ 0]) tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit] tg3 0000:82:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 tg3 0000:82:00.0: setting latency timer to 64 tg3 0000:82:00.0: PCI: Disallowing DAC for device tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC addr ess 00:02:b6:36:d1:3a tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[ 0]) tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit] [root@Z1010 ~]# ethtool -i eth2 driver: tg3 version: 3.115 firmware-version: sb v3.03 bus-info: 0000:81:00.0 
[root@Z1010 ~]# cat /proc/interrupts CPU0 0: 173 IO-APIC-edge timer 1: 2 IO-APIC-edge i8042 4: 2864 IO-APIC-edge serial 6: 2 IO-APIC-edge floppy 8: 0 IO-APIC-edge rtc0 9: 0 IO-APIC-fasteoi acpi 14: 0 IO-APIC-edge pata_via 15: 8100 IO-APIC-edge pata_via 16: 984 IO-APIC-fasteoi eth0 17: 104 IO-APIC-fasteoi eth1 20: 0 IO-APIC-fasteoi uhci_hcd:usb2 21: 0 IO-APIC-fasteoi uhci_hcd:usb4, sata_via 22: 0 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb3 23: 0 IO-APIC-fasteoi uhci_hcd:usb5 NMI: 0 Non-maskable interrupts LOC: 101963 Local timer interrupts SPU: 0 Spurious interrupts PMI: 0 Performance monitoring interrupts IWI: 0 IRQ work interrupts RES: 0 Rescheduling interrupts CAL: 0 Function call interrupts TLB: 0 TLB shootdowns TRM: 0 Thermal event interrupts THR: 0 Threshold APIC interrupts MCE: 0 Machine check exceptions MCP: 0 Machine check polls ERR: 0 MIS: 0 The b44 interfaces are working great. [root@Z1010 ~]# ifconfig eth2 up do_IRQ: 0.64 No irq handler for vector (irq -1) system becomes unresponsive then ususally reboots. 
but it didn't this last time just has become really doggy in responding [root@Z1010 ~]# [root@Z1010 ~]# [root@Z1010 ~]# ifconfig eth0 Link encap:Ethernet HWaddr 00:02:B6:36:D1:37 inet addr:10.0.129.4 Bcast:10.0.255.255 Mask:255.255.128.0 inet6 addr: fe80::202:b6ff:fe36:d137/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1025 errors:0 dropped:12 overruns:0 frame:0 TX packets:6 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:185675 (181.3 KiB) TX bytes:492 (492.0 b) Interrupt:16 eth1 Link encap:Ethernet HWaddr 00:02:B6:36:D1:38 inet6 addr: fe80::202:b6ff:fe36:d138/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:35 errors:0 dropped:0 overruns:0 frame:0 TX packets:41 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:2612 (2.5 KiB) TX bytes:4014 (3.9 KiB) Interrupt:17 eth2 Link encap:Ethernet HWaddr 00:02:B6:36:D1:39 UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Interrupt:16 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:5298 errors:0 dropped:0 overruns:0 frame:0 TX packets:5298 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:475525 (464.3 KiB) TX bytes:475525 (464.3 KiB) Message from syslogd@ at Wed Jan 12 08:44:17 2011 ... localhost kernel: do_IRQ: 0.192 No irq handler for vector (irq -1) Message from syslogd@ at Wed Jan 12 08:44:17 2011 ... 
localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1) [root@Z1010 ~]# cat /proc/interrupts CPU0 0: 173 IO-APIC-edge timer 1: 2 IO-APIC-edge i8042 4: 821 IO-APIC-edge serial 6: 2 IO-APIC-edge floppy 8: 0 IO-APIC-edge rtc0 9: 2 IO-APIC-fasteoi acpi 14: 0 IO-APIC-edge pata_via 15: 19522 IO-APIC-edge pata_via 16: 256 IO-APIC-fasteoi eth0, eth2 17: 54 IO-APIC-fasteoi eth1 20: 0 IO-APIC-fasteoi uhci_hcd:usb2 21: 0 IO-APIC-fasteoi uhci_hcd:usb4, sata_via 22: 0 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb3 23: 0 IO-APIC-fasteoi uhci_hcd:usb5 NMI: 0 Non-maskable interrupts LOC: 116090 Local timer interrupts SPU: 0 Spurious interrupts PMI: 0 Performance monitoring interrupts IWI: 0 IRQ work interrupts RES: 0 Rescheduling interrupts CAL: 0 Function call interrupts TLB: 0 TLB shootdowns TRM: 0 Thermal event interrupts THR: 0 Threshold APIC interrupts MCE: 0 Machine check exceptions MCP: 0 Machine check polls ERR: 38 MIS: 2 [root@Z1010 ~]# arp -an the system has now lost ethernet connectivity via the b44 ports This is a test system and I can recompile the kernel if there are any patches you would like me to try out. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: panic in tg3 driver 2011-01-12 3:06 ` Matt Carlson 2011-01-12 13:53 ` Stephen Clark @ 2011-01-13 13:12 ` Stephen Clark 2011-01-16 18:11 ` Stephen Clark 1 sibling, 1 reply; 12+ messages in thread
From: Stephen Clark @ 2011-01-13 13:12 UTC (permalink / raw)
To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/11/2011 10:06 PM, Matt Carlson wrote:
> lspci -vvv -xxx -s 81:00.0

Further information - I found these messages in /var/log/messages. It looks like after it switched to INTx mode interrupts for other devices were hosed.

Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50)
Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA
Jan 12 08:38:50 localhost kernel: ata2.01: cmd ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out
Jan 12 08:38:50 localhost kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout)
Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY }
Jan 12 08:38:50 localhost kernel: ata2: soft resetting link
Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33
Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests
Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something wicked happened on session 3363
* Re: panic in tg3 driver 2011-01-13 13:12 ` Stephen Clark @ 2011-01-16 18:11 ` Stephen Clark 2011-01-25 0:59 ` Matt Carlson 0 siblings, 1 reply; 12+ messages in thread From: Stephen Clark @ 2011-01-16 18:11 UTC (permalink / raw) To: sclark46; +Cc: Matt Carlson, Linux Kernel Network Developers, Michael Chan On 01/13/2011 08:12 AM, Stephen Clark wrote: > On 01/11/2011 10:06 PM, Matt Carlson wrote: >> lspci -vvv -xxx -s 81:00.0 > > > > Further information - I found these messages in /var/log/messages. It > looks > like after it switched to INTx mode interrupts for other devices were > hosed. > > Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt > was gener > ated using MSI. Switching to INTx mode. Please report this failure to > the PCI ma > intainer and include system chipset information > Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is > not ready > Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50) > Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct > 0x0 SErr 0x0 > action 0x6 frozen > Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA > Jan 12 08:38:50 localhost kernel: ata2.01: cmd > ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out > Jan 12 08:38:50 localhost kernel: res > 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout) > Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY } > Jan 12 08:38:50 localhost kernel: ata2: soft resetting link > Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for > vector (irq -1) > Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33 > Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests > Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something > wicked happened on session 3363 Just checking to make sure you have everything you need? Thanks, Steve ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: panic in tg3 driver
  2011-01-16 18:11               ` Stephen Clark
@ 2011-01-25  0:59                 ` Matt Carlson
  2011-01-25  2:25                   ` Matt Carlson
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-25  0:59 UTC (permalink / raw)
  To: Stephen Clark
  Cc: Matthew Carlson, Linux Kernel Network Developers, Michael Chan

On Sun, Jan 16, 2011 at 10:11:50AM -0800, Stephen Clark wrote:
> On 01/13/2011 08:12 AM, Stephen Clark wrote:
> > On 01/11/2011 10:06 PM, Matt Carlson wrote:
> >> lspci -vvv -xxx -s 81:00.0
> >
> > Further information - I found these messages in /var/log/messages. It looks
> > like after it switched to INTx mode interrupts for other devices were hosed.
> >
> > Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
> > Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
> > Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50)
> > Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> > Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA
> > Jan 12 08:38:50 localhost kernel: ata2.01: cmd ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out
> > Jan 12 08:38:50 localhost kernel:          res 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout)
> > Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY }
> > Jan 12 08:38:50 localhost kernel: ata2: soft resetting link
> > Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
> > Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33
> > Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests
> > Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something wicked happened on session 3363
> Just checking to make sure you have everything you need?

Sorry for the delay Stephen.

It looks to me like interrupts aren't being setup correctly on this
system.  I tested MSI and INTx interrupt modes locally and they both
work.  I'm guessing one of two things could be happening:

1) The 2nd parameter of the low-level ISR (tg3_interrupt_tagged()) is
   not correct.  The ISR tries to tell the hardware the interrupt is
   acknowledged, but the message goes unheard.  (This might also explain
   why other devices are also afflicted.)

2) Something is blocking the delivery of the interrupt to the tg3 driver
   altogether.

In both cases, the hardware persistently nags the host to ack the
interrupt, hence the interrupt storm.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: panic in tg3 driver
  2011-01-25  0:59                 ` Matt Carlson
@ 2011-01-25  2:25                   ` Matt Carlson
  2011-04-13 18:23                     ` Stephen Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Carlson @ 2011-01-25  2:25 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Stephen Clark, Linux Kernel Network Developers, Michael Chan

On Mon, Jan 24, 2011 at 04:59:22PM -0800, Matt Carlson wrote:
> On Sun, Jan 16, 2011 at 10:11:50AM -0800, Stephen Clark wrote:
> > On 01/13/2011 08:12 AM, Stephen Clark wrote:
> > > On 01/11/2011 10:06 PM, Matt Carlson wrote:
> > >> lspci -vvv -xxx -s 81:00.0
> > >
> > > Further information - I found these messages in /var/log/messages. It looks
> > > like after it switched to INTx mode interrupts for other devices were hosed.
> > >
> > > Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
> > > Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
> > > Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50)
> > > Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> > > Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA
> > > Jan 12 08:38:50 localhost kernel: ata2.01: cmd ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out
> > > Jan 12 08:38:50 localhost kernel:          res 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout)
> > > Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY }
> > > Jan 12 08:38:50 localhost kernel: ata2: soft resetting link
> > > Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
> > > Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33
> > > Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests
> > > Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something wicked happened on session 3363
> > Just checking to make sure you have everything you need?
>
> Sorry for the delay Stephen.
>
> It looks to me like interrupts aren't being setup correctly on this
> system.  I tested MSI and INTx interrupt modes locally and they both
> work.  I'm guessing one of two things could be happening:
>
> 1) The 2nd parameter of the low-level ISR (tg3_interrupt_tagged()) is
>    not correct.  The ISR tries to tell the hardware the interrupt is
>    acknowledged, but the message goes unheard.  (This might also explain
>    why other devices are also afflicted.)
>
> 2) Something is blocking the delivery of the interrupt to the tg3 driver
>    altogether.
>
> In both cases, the hardware persistently nags the host to ack the
> interrupt, hence the interrupt storm.

Just curious, is the problem still there if you add pci=nomsi to the
kernel command line?

^ permalink raw reply	[flat|nested] 12+ messages in thread
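[Editorial sketch, not part of the original message] Before re-running the test suggested above, it can help to confirm that the option really reached the running kernel. The snippet below is a minimal illustration of that check; the helper names `has_kernel_param` and `running_kernel_cmdline` and the sample command line are assumptions made for this example, not anything posted in the thread.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on the command line."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than substrings (avoids matching e.g. "pci=nomsix").
    return param in cmdline.split()


def running_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path(path).read_text()


# Made-up sample command line, used only to demonstrate the check.
sample = "ro root=/dev/sda1 pci=nomsi quiet"
print(has_kernel_param(sample, "pci=nomsi"))
```

On the affected machine, `has_kernel_param(running_kernel_cmdline(), "pci=nomsi")` would be the equivalent check after a reboot.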
* Re: panic in tg3 driver
  2011-01-25  2:25                   ` Matt Carlson
@ 2011-04-13 18:23                     ` Stephen Clark
  0 siblings, 0 replies; 12+ messages in thread
From: Stephen Clark @ 2011-04-13 18:23 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Linux Kernel Network Developers, Michael Chan

On 01/24/2011 09:25 PM, Matt Carlson wrote:
> On Mon, Jan 24, 2011 at 04:59:22PM -0800, Matt Carlson wrote:
>> On Sun, Jan 16, 2011 at 10:11:50AM -0800, Stephen Clark wrote:
>>> On 01/13/2011 08:12 AM, Stephen Clark wrote:
>>>> On 01/11/2011 10:06 PM, Matt Carlson wrote:
>>>>> lspci -vvv -xxx -s 81:00.0
>>>>
>>>> Further information - I found these messages in /var/log/messages. It looks
>>>> like after it switched to INTx mode interrupts for other devices were hosed.
>>>>
>>>> Jan 12 08:37:49 localhost kernel: tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
>>>> Jan 12 08:37:49 localhost kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
>>>> Jan 12 08:38:50 localhost kernel: ata2: lost interrupt (Status 0x50)
>>>> Jan 12 08:38:50 localhost kernel: ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
>>>> Jan 12 08:38:50 localhost kernel: ata2.01: failed command: WRITE DMA
>>>> Jan 12 08:38:50 localhost kernel: ata2.01: cmd ca/00:08:e0:bc:51/00:00:00:00:00/f0 tag 0 dma 4096 out
>>>> Jan 12 08:38:50 localhost kernel:          res 40/00:01:00:4f:c2/00:00:00:00:00/b0 Emask 0x4 (timeout)
>>>> Jan 12 08:38:50 localhost kernel: ata2.01: status: { DRDY }
>>>> Jan 12 08:38:50 localhost kernel: ata2: soft resetting link
>>>> Jan 12 08:38:50 localhost kernel: do_IRQ: 0.64 No irq handler for vector (irq -1)
>>>> Jan 12 08:38:50 localhost kernel: ata2.01: configured for UDMA/33
>>>> Jan 12 08:38:54 localhost pppd[1983]: No response to 3 echo-requests
>>>> Jan 12 08:39:55 localhost pppoe[1988]: Inactivity timeout... something wicked happened on session 3363
>>>>
>>> Just checking to make sure you have everything you need?
>>>
>> Sorry for the delay Stephen.
>>
>> It looks to me like interrupts aren't being setup correctly on this
>> system.  I tested MSI and INTx interrupt modes locally and they both
>> work.  I'm guessing one of two things could be happening:
>>
>> 1) The 2nd parameter of the low-level ISR (tg3_interrupt_tagged()) is
>>    not correct.  The ISR tries to tell the hardware the interrupt is
>>    acknowledged, but the message goes unheard.  (This might also explain
>>    why other devices are also afflicted.)
>>
>> 2) Something is blocking the delivery of the interrupt to the tg3 driver
>>    altogether.
>>
>> In both cases, the hardware persistently nags the host to ack the
>> interrupt, hence the interrupt storm.
>>
> Just curious, is the problem still there if you add pci=nomsi to the
> kernel command line?
>
>
Sorry I have been tied up. With kernel 2.6.32-44.1.el6.i686 and
pci=nomsi on the kernel command line it seems to work great.

[root@Z1010 ~]# ping -f 3.3.3.2
PING 3.3.3.2 (3.3.3.2) 56(84) bytes of data.
.^
--- 3.3.3.2 ping statistics ---
20562 packets transmitted, 20562 received, 0% packet loss, time 4408ms
rtt min/avg/max/mdev = 0.141/0.163/1.021/0.034 ms, ipg/ewma 0.214/0.161 ms

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

-- 

"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety."  (Ben Franklin)

"The course of history shows that as a government grows, liberty
decreases."  (Thomas Jefferson)

^ permalink raw reply	[flat|nested] 12+ messages in thread
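[Editorial sketch, not part of the original message] The result above reports that the NIC works once MSI is disabled. One way to double-check which interrupt mode a device actually ended up in is to look at its row in /proc/interrupts. The snippet below is a rough illustration only: the function name `interrupt_mode` and the sample text are invented for this example, and it assumes the controller column reads `PCI-MSI-...` for MSI and something like `IO-APIC-fasteoi` for legacy INTx, as is typical on x86 kernels of that era. Any row without a `PCI-MSI` label is simply treated as INTx.

```python
from typing import Optional


def interrupt_mode(interrupts_text: str, device: str) -> Optional[str]:
    """Classify `device` as "MSI" or "INTx" from /proc/interrupts-style text.

    Returns None when no row mentions the device.
    """
    for line in interrupts_text.splitlines():
        # Device names on a shared IRQ are comma-separated; split them apart.
        fields = line.replace(",", " ").split()
        # Skip the CPU header row and anything that is not "<irq>: ..." shaped.
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        if device not in fields:
            continue
        if any(f.startswith("PCI-MSI") for f in fields):
            return "MSI"
        return "INTx"  # assumption: every non-MSI row is a legacy interrupt
    return None


# Invented sample, loosely modelled on 2.6.3x-era output; not from this system.
SAMPLE = """\
           CPU0
 16:       1200   IO-APIC-fasteoi   uhci_hcd:usb1, eth2
 40:         35   PCI-MSI-edge      eth3
"""

print(interrupt_mode(SAMPLE, "eth2"))
print(interrupt_mode(SAMPLE, "eth3"))
```

On a live system the text would come from reading /proc/interrupts; with pci=nomsi in effect, no row would be expected to carry a `PCI-MSI` label.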
end of thread, other threads:[~2011-04-13 18:23 UTC | newest]

Thread overview: 12+ messages
     [not found] <4D2334B5.1060408@earthlink.net>
2011-01-09 22:30 ` panic in tg3 driver Stephen Clark
2011-01-10 19:22   ` Matt Carlson
2011-01-10 20:04     ` Stephen Clark
2011-01-11  2:00       ` Matt Carlson
2011-01-11 14:10         ` Stephen Clark
2011-01-12  3:06           ` Matt Carlson
2011-01-12 13:53             ` Stephen Clark
2011-01-13 13:12             ` Stephen Clark
2011-01-16 18:11               ` Stephen Clark
2011-01-25  0:59                 ` Matt Carlson
2011-01-25  2:25                   ` Matt Carlson
2011-04-13 18:23                     ` Stephen Clark