* Strange errors from e1000 driver (2.6.18)
@ 2006-10-22 18:46 Martin J. Bligh
  2006-10-22 19:05 ` Martin J. Bligh
  0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2006-10-22 18:46 UTC
To: Linux Kernel Mailing List, netdev

I'm getting a lot of these type of errors if I run 2.6.18. If
I run the standard Ubuntu Dapper kernel, I don't get them.
What do they indicate?

Oct 21 18:48:28 localhost kernel: buffer_info[next_to_clean]
Oct 21 18:48:28 localhost kernel: time_stamp <7b79d33>
Oct 21 18:48:28 localhost kernel: next_to_watch <3d>
Oct 21 18:48:28 localhost kernel: jiffies <7b7a0c1>
Oct 21 18:48:28 localhost kernel: next_to_watch.status <0>
Oct 21 18:48:30 localhost kernel: Tx Queue <0>
Oct 21 18:48:30 localhost kernel: TDH <3d>
Oct 21 18:48:30 localhost kernel: TDT <44>
Oct 21 18:48:30 localhost kernel: next_to_use <44>
Oct 21 18:48:30 localhost kernel: next_to_clean <39>
Oct 21 18:48:30 localhost kernel: buffer_info[next_to_clean]
Oct 21 18:48:30 localhost kernel: time_stamp <7b79d33>
Oct 21 18:48:30 localhost kernel: next_to_watch <3d>
Oct 21 18:48:30 localhost kernel: jiffies <7b7a2b5>
Oct 21 18:48:30 localhost kernel: next_to_watch.status <0>
Oct 21 18:48:32 localhost kernel: Tx Queue <0>
Oct 21 18:48:32 localhost kernel: TDH <3d>
Oct 21 18:48:32 localhost kernel: TDT <44>
Oct 21 18:48:32 localhost kernel: next_to_use <44>
Oct 21 18:48:32 localhost kernel: next_to_clean <39>
Oct 21 18:48:32 localhost kernel: buffer_info[next_to_clean]
Oct 21 18:48:32 localhost kernel: time_stamp <7b79d33>
Oct 21 18:48:32 localhost kernel: next_to_watch <3d>
Oct 21 18:48:32 localhost kernel: jiffies <7b7a4a9>
Oct 21 18:48:32 localhost kernel: next_to_watch.status <0>
Oct 21 18:48:34 localhost kernel: Tx Queue <0>
Oct 21 18:48:34 localhost kernel: TDH <3d>
Oct 21 18:48:34 localhost kernel: TDT <44>
Oct 21 18:48:34 localhost kernel: next_to_use <44>
Oct 21 18:48:34 localhost kernel: next_to_clean <39>
Oct 21 18:48:34 localhost kernel: buffer_info[next_to_clean]
Oct 21 18:48:34 localhost kernel: time_stamp <7b79d33>
Oct 21 18:48:34 localhost kernel: next_to_watch <3d>
Oct 21 18:48:34 localhost kernel: jiffies <7b7a69d>
Oct 21 18:48:34 localhost kernel: next_to_watch.status <0>
Oct 21 18:48:35 localhost kernel: NETDEV WATCHDOG: eth0: transmit timed out
Oct 21 18:48:36 localhost kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex

* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 18:46 Strange errors from e1000 driver (2.6.18) Martin J. Bligh
@ 2006-10-22 19:05 ` Martin J. Bligh
  2006-10-22 20:21   ` Jesse Brandeburg
  0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2006-10-22 19:05 UTC
To: Martin J. Bligh; +Cc: Linux Kernel Mailing List, netdev

Martin J. Bligh wrote:
> I'm getting a lot of these type of errors if I run 2.6.18. If
> I run the standard Ubuntu Dapper kernel, I don't get them.
> What do they indicate?
>
> Oct 21 18:48:28 localhost kernel: buffer_info[next_to_clean]
> Oct 21 18:48:28 localhost kernel: time_stamp <7b79d33>
> Oct 21 18:48:28 localhost kernel: next_to_watch <3d>
> Oct 21 18:48:28 localhost kernel: jiffies <7b7a0c1>
> Oct 21 18:48:28 localhost kernel: next_to_watch.status <0>
> Oct 21 18:48:30 localhost kernel: Tx Queue <0>
> Oct 21 18:48:30 localhost kernel: TDH <3d>
> Oct 21 18:48:30 localhost kernel: TDT <44>
> Oct 21 18:48:30 localhost kernel: next_to_use <44>
> Oct 21 18:48:30 localhost kernel: next_to_clean <39>
> Oct 21 18:48:30 localhost kernel: buffer_info[next_to_clean]
> Oct 21 18:48:30 localhost kernel: time_stamp <7b79d33>
> Oct 21 18:48:30 localhost kernel: next_to_watch <3d>
> Oct 21 18:48:30 localhost kernel: jiffies <7b7a2b5>
> Oct 21 18:48:30 localhost kernel: next_to_watch.status <0>
> Oct 21 18:48:32 localhost kernel: Tx Queue <0>
> Oct 21 18:48:32 localhost kernel: TDH <3d>
> Oct 21 18:48:32 localhost kernel: TDT <44>
> Oct 21 18:48:32 localhost kernel: next_to_use <44>
> Oct 21 18:48:32 localhost kernel: next_to_clean <39>
> Oct 21 18:48:32 localhost kernel: buffer_info[next_to_clean]
> Oct 21 18:48:32 localhost kernel: time_stamp <7b79d33>
> Oct 21 18:48:32 localhost kernel: next_to_watch <3d>
> Oct 21 18:48:32 localhost kernel: jiffies <7b7a4a9>
> Oct 21 18:48:32 localhost kernel: next_to_watch.status <0>
> Oct 21 18:48:34 localhost kernel: Tx Queue <0>
> Oct 21 18:48:34 localhost kernel: TDH <3d>
> Oct 21 18:48:34 localhost kernel: TDT <44>
> Oct 21 18:48:34 localhost kernel: next_to_use <44>
> Oct 21 18:48:34 localhost kernel: next_to_clean <39>
> Oct 21 18:48:34 localhost kernel: buffer_info[next_to_clean]
> Oct 21 18:48:34 localhost kernel: time_stamp <7b79d33>
> Oct 21 18:48:34 localhost kernel: next_to_watch <3d>
> Oct 21 18:48:34 localhost kernel: jiffies <7b7a69d>
> Oct 21 18:48:34 localhost kernel: next_to_watch.status <0>
> Oct 21 18:48:35 localhost kernel: NETDEV WATCHDOG: eth0: transmit timed out
> Oct 21 18:48:36 localhost kernel: e1000: eth0: e1000_watchdog: NIC Link
> is Up 100 Mbps Full Duplex

Actually, maybe this set is more helpful:

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <6>
TDT <1f>
next_to_use <1f>
next_to_clean <2>
buffer_info[next_to_clean]
time_stamp <2de8b54>
next_to_watch <6>
jiffies <2de8db7>
next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <6>
TDT <1f>
next_to_use <1f>
next_to_clean <2>
buffer_info[next_to_clean]
time_stamp <2de8b54>
next_to_watch <6>
jiffies <2de8fab>
next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <6>
TDT <1f>
next_to_use <1f>
next_to_clean <2>
buffer_info[next_to_clean]
time_stamp <2de8b54>
next_to_watch <6>
jiffies <2de919f>
next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex

* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 19:05 ` Martin J. Bligh
@ 2006-10-22 20:21   ` Jesse Brandeburg
  2006-10-22 20:27     ` Martin J. Bligh
  0 siblings, 1 reply; 8+ messages in thread
From: Jesse Brandeburg @ 2006-10-22 20:21 UTC
To: Martin J. Bligh; +Cc: Martin J. Bligh, Linux Kernel Mailing List, netdev

On 10/22/06, Martin J. Bligh <mbligh@mbligh.org> wrote:
> Martin J. Bligh wrote:
> > I'm getting a lot of these type of errors if I run 2.6.18. If
> > I run the standard Ubuntu Dapper kernel, I don't get them.
> > What do they indicate?

Hi Martin, they indicate that you're getting transmit hangs. Means
your hardware is having issues with some of the buffers it is being
handed. Because the TDH and TDT noted below are not equal, it means
the hardware is hung processing buffers that the driver gave to it.

We need the standard bug report particulars, lspci -vv, cat
/proc/interrupts, dmesg, ethtool -e eth0, and maybe output of
dmidecode, etc. I'm pretty sure you know the drill.

> > Oct 21 18:48:28 localhost kernel: buffer_info[next_to_clean]
> > Oct 21 18:48:28 localhost kernel: time_stamp <7b79d33>
> > Oct 21 18:48:28 localhost kernel: next_to_watch <3d>
> > Oct 21 18:48:28 localhost kernel: jiffies <7b7a0c1>
> > Oct 21 18:48:28 localhost kernel: next_to_watch.status <0>
> > Oct 21 18:48:30 localhost kernel: Tx Queue <0>
> > Oct 21 18:48:30 localhost kernel: TDH <3d>
> > Oct 21 18:48:30 localhost kernel: TDT <44>
> > Oct 21 18:48:30 localhost kernel: next_to_use <44>
> > Oct 21 18:48:30 localhost kernel: next_to_clean <39>
<snip>
> Actually, maybe this set is more helpful:
>
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> Tx Queue <0>
> TDH <6>
> TDT <1f>
> next_to_use <1f>
> next_to_clean <2>
> buffer_info[next_to_clean]
> time_stamp <2de8b54>
> next_to_watch <6>
> jiffies <2de8db7>
> next_to_watch.status <0>
<snip>
> NETDEV WATCHDOG: eth0: transmit timed out
> e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex

only a little. There are so many different pieces of e1000 hardware
and so few specifics in this report that I'll be able to tell you lots
more when you get us the info requested.

Jesse

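The reading of the dump given above (TDH and TDT not equal, so the
hardware still owns descriptors it has not finished) can be sketched in
a few lines of Python. This is only an illustration, not code from the
e1000 driver: the helper names are made up, and the ring size of 256
descriptors is an assumption (the real value depends on how the adapter
was configured). The sample text is the second dump posted in this
thread.

```python
import re

# Fields of interest in a "Detected Tx Unit Hang" dump; all values are hex.
FIELD_RE = re.compile(
    r"\b(TDH|TDT|next_to_use|next_to_clean|time_stamp|next_to_watch|jiffies)"
    r"\s+<([0-9a-fA-F]+)>"
)

# One dump copied from the thread.
SAMPLE = """\
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <6>
TDT <1f>
next_to_use <1f>
next_to_clean <2>
buffer_info[next_to_clean]
time_stamp <2de8b54>
next_to_watch <6>
jiffies <2de8db7>
next_to_watch.status <0>
"""


def parse_hang_dump(text):
    """Return the hex fields of the first dump in `text` as integers."""
    fields = {}
    for name, value in FIELD_RE.findall(text):
        # Keep the first occurrence so repeated dumps do not overwrite it.
        fields.setdefault(name, int(value, 16))
    return fields


def pending_descriptors(fields, ring_size=256):
    """Descriptors between head (TDH) and tail (TDT).

    ring_size is an assumed value, used only to handle wrap-around.
    """
    return (fields["TDT"] - fields["TDH"]) % ring_size


def stalled_jiffies(fields):
    """Jiffies elapsed since the oldest uncleaned buffer was queued."""
    return fields["jiffies"] - fields["time_stamp"]


if __name__ == "__main__":
    dump = parse_hang_dump(SAMPLE)
    print("hardware still has work queued:", dump["TDH"] != dump["TDT"])
    print("descriptors pending:", pending_descriptors(dump))
    print("jiffies since time_stamp:", stalled_jiffies(dump))
```

For this dump the head and tail differ and the same values repeat in
every later dump, which is the pattern described above as a hang.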
* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 20:21 ` Jesse Brandeburg
@ 2006-10-22 20:27   ` Martin J. Bligh
  2006-10-22 22:15     ` Jesse Brandeburg
  2006-10-22 23:02     ` Dumitru Ciobarcianu
  0 siblings, 2 replies; 8+ messages in thread
From: Martin J. Bligh @ 2006-10-22 20:27 UTC
To: Jesse Brandeburg; +Cc: Martin J. Bligh, Linux Kernel Mailing List, netdev

[-- Attachment #1: Type: text/plain, Size: 5795 bytes --]

Jesse Brandeburg wrote:
> On 10/22/06, Martin J. Bligh <mbligh@mbligh.org> wrote:
>> Martin J. Bligh wrote:
>> > I'm getting a lot of these type of errors if I run 2.6.18. If
>> > I run the standard Ubuntu Dapper kernel, I don't get them.
>> > What do they indicate?
>
> Hi Martin, they indicate that you're getting transmit hangs. Means
> your hardware is having issues with some of the buffers it is being
> handed. Because the TDH and TDT noted below are not equal, it means
> the hardware is hung processing buffers that the driver gave to it.
>
> We need the standard bug report particulars,

Sure.

> lspci -vv, 0000:00:0a.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Con troller (Copper) (rev 01) Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Step ping- SERR- FastB2B- Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort - <MAbort- >SERR- <PERR- Latency: 32 (63750ns min), Cache Line Size: 0x08 (32 bytes) Interrupt: pin A routed to IRQ 5 Region 0: Memory at ef020000 (64-bit, non-prefetchable) [size=128K] Region 4: I/O ports at a000 [size=64] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot +,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=1 PME- Capabilities: [e4] Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 0000:00:0a.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Con troller (Copper) (rev 01) Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Step ping- SERR- FastB2B- Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort - <MAbort- >SERR- <PERR- Latency: 32 (63750ns min), Cache Line Size: 0x08 (32 bytes) Interrupt: pin B routed to IRQ 11 Region 0: Memory at ef000000 (64-bit, non-prefetchable) [size=128K] Region 4: I/O ports at a400 [size=64] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot +,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=1 PME- Capabilities: [e4] Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 > cat /proc/interrupts, CPU0 0: 146271373 XT-PIC timer 1: 179459 XT-PIC i8042 2: 0 XT-PIC cascade 5: 1975991 XT-PIC ehci_hcd:usb2, VIA8237, eth0 6: 2 XT-PIC floppy 10: 0 XT-PIC uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb6 11: 0 XT-PIC ehci_hcd:usb1, uhci_hcd:usb3, 
uhci_hcd:usb7, uhci_hcd:usb8 12: 2758142 XT-PIC i8042 14: 6344745 XT-PIC ide0 15: 20014468 XT-PIC ide1 NMI: 0 LOC: 146264664 ERR: 52805 > dmesg Did that bit already. > ethtool -e eth0, root@titus:/usr/local/autotest/bin # ethtool -e eth0 Offset Values ------ ------ 0x0000 00 07 e9 09 0b 08 30 05 ff ff ff ff ff ff ff ff 0x0010 44 a9 03 98 0b 46 11 10 86 80 10 10 86 80 68 34 0x0020 0c 00 10 10 00 00 02 21 c8 18 ff ff ff ff ff ff 0x0030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0040 0c c3 61 78 04 50 02 21 c8 08 ff ff ff ff ff ff 0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06 0x0060 2c 00 00 40 07 11 00 00 2c 00 00 40 ff ff ff ff 0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 4f 29 0x0080 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0090 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00f0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0100 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0110 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0120 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0130 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0140 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0150 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0160 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0170 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0180 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0190 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01f0 ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff ff > and maybe output of > dmidecode, etc. Attached. > only a little. There are so many different pieces of e1000 hardware > and so few specifics in this report that I'll be able to tell you lots > more when you get us the info requested. Thanks. Not sure if the bug wasn't there in earlier kernels, or if we just weren't printing anything. M. [-- Attachment #2: dmidecode --] [-- Type: text/plain, Size: 10736 bytes --] # dmidecode 2.7 SMBIOS 2.2 present. 37 structures occupying 959 bytes. Table at 0x000F0800. Handle 0x0000, DMI type 0, 19 bytes. BIOS Information Vendor: Phoenix Technologies, LTD Version: 6.00 PG Release Date: 09/13/2004 Address: 0xE0000 Runtime Size: 128 kB ROM Size: 256 kB Characteristics: ISA is supported PCI is supported PNP is supported APM is supported BIOS is upgradeable BIOS shadowing is allowed ESCD support is available Boot from CD is supported Selectable boot is supported BIOS ROM is socketed EDD is supported 5.25"/360 KB floppy services are supported (int 13h) 5.25"/1.2 MB floppy services are supported (int 13h) 3.5"/720 KB floppy services are supported (int 13h) 3.5"/2.88 MB floppy services are supported (int 13h) Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) Printer services are supported (int 17h) CGA/mono video services are supported (int 10h) ACPI is supported USB legacy is supported AGP is supported LS-120 boot is supported ATAPI Zip drive boot is supported Handle 0x0001, DMI type 1, 25 bytes. System Information Manufacturer: VIA Technologies, Inc. Product Name: KT600-8237 Version: Serial Number: UUID: Not Present Wake-up Type: Power Switch Handle 0x0002, DMI type 2, 8 bytes. Base Board Information Manufacturer: Product Name: KT600-8237 Version: Serial Number: Handle 0x0003, DMI type 3, 13 bytes. 
Chassis Information Manufacturer: Type: Desktop Lock: Not Present Version: Serial Number: Asset Tag: Boot-up State: Unknown Power Supply State: Unknown Thermal State: Unknown Security Status: Unknown Handle 0x0004, DMI type 4, 32 bytes. Processor Information Socket Designation: Socket A Type: Central Processor Family: Athlon XP Manufacturer: AMD ID: A0 06 00 00 FF FB 83 03 Signature: Family 6, Model A, Stepping 0 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) MMX (MMX technology supported) FXSR (Fast floating-point save and restore) SSE (Streaming SIMD extensions) Version: AMD Athlon(tm) XP Voltage: 1.5 V External Clock: 100 MHz Max Speed: 3000 MHz Current Speed: 1100 MHz Status: Populated, Enabled Upgrade: ZIF Socket L1 Cache Handle: 0x0009 L2 Cache Handle: 0x000A L3 Cache Handle: No L3 Cache Handle 0x0005, DMI type 5, 22 bytes. Memory Controller Information Error Detecting Method: None Error Correcting Capabilities: None Supported Interleave: One-way Interleave Current Interleave: Four-way Interleave Maximum Memory Module Size: 32 MB Maximum Total Memory Size: 96 MB Supported Speeds: 70 ns 60 ns Supported Memory Types: Standard EDO Memory Module Voltage: 5.0 V Associated Memory Slots: 3 0x0006 0x0007 0x0008 Enabled Error Correcting Capabilities: None Handle 0x0006, DMI type 6, 12 bytes. 
Memory Module Information Socket Designation: A0 Bank Connections: 0 1 Current Speed: 60 ns Type: Other SDRAM Installed Size: 512 MB (Double-bank Connection) Enabled Size: 512 MB (Double-bank Connection) Error Status: OK Handle 0x0007, DMI type 6, 12 bytes. Memory Module Information Socket Designation: A1 Bank Connections: 2 3 Current Speed: 60 ns Type: Other SDRAM Installed Size: 512 MB (Double-bank Connection) Enabled Size: 512 MB (Double-bank Connection) Error Status: OK Handle 0x0008, DMI type 6, 12 bytes. Memory Module Information Socket Designation: A2 Bank Connections: None Current Speed: 60 ns Type: Unknown Installed Size: Not Installed Enabled Size: Not Installed Error Status: OK Handle 0x0009, DMI type 7, 19 bytes. Cache Information Socket Designation: Internal Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 128 KB Maximum Size: 128 KB Supported SRAM Types: Synchronous Installed SRAM Type: Synchronous Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: Unknown Handle 0x000A, DMI type 7, 19 bytes. Cache Information Socket Designation: External Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Write Back Location: External Installed Size: 512 KB Maximum Size: 512 KB Supported SRAM Types: Synchronous Installed SRAM Type: Synchronous Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: Unknown Handle 0x000B, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: PRIMARY IDE Internal Connector Type: On Board IDE External Reference Designator: Not Specified External Connector Type: None Port Type: Other Handle 0x000C, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: SECONDARY IDE Internal Connector Type: On Board IDE External Reference Designator: Not Specified External Connector Type: None Port Type: Other Handle 0x000D, DMI type 8, 9 bytes. 
Port Connector Information Internal Reference Designator: FDD Internal Connector Type: On Board Floppy External Reference Designator: Not Specified External Connector Type: None Port Type: 8251 FIFO Compatible Handle 0x000E, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: COM1 Internal Connector Type: 9 Pin Dual Inline (pin 10 cut) External Reference Designator: External Connector Type: DB-9 male Port Type: Serial Port 16450 Compatible Handle 0x000F, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: COM2 Internal Connector Type: 9 Pin Dual Inline (pin 10 cut) External Reference Designator: External Connector Type: DB-9 male Port Type: Serial Port 16450 Compatible Handle 0x0010, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: LPT1 Internal Connector Type: DB-25 female External Reference Designator: External Connector Type: DB-25 female Port Type: Parallel Port ECP/EPP Handle 0x0011, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Keyboard Internal Connector Type: PS/2 External Reference Designator: External Connector Type: PS/2 Port Type: Keyboard Port Handle 0x0012, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: PS/2 Mouse Internal Connector Type: PS/2 External Reference Designator: External Connector Type: PS/2 Port Type: Mouse Port Handle 0x0013, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB0 External Connector Type: Other Port Type: USB Handle 0x0014, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: AUDIO External Connector Type: None Port Type: Audio Port Handle 0x0015, DMI type 9, 13 bytes. 
System Slot Information Designation: PCI0 Type: 32-bit PCI Current Usage: In Use Length: Long ID: 1 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0016, DMI type 9, 13 bytes. System Slot Information Designation: PCI1 Type: 32-bit PCI Current Usage: In Use Length: Long ID: 2 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0017, DMI type 9, 13 bytes. System Slot Information Designation: PCI2 Type: 32-bit PCI Current Usage: In Use Length: Long ID: 3 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0018, DMI type 9, 13 bytes. System Slot Information Designation: PCI3 Type: 32-bit PCI Current Usage: In Use Length: Long ID: 4 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0019, DMI type 9, 13 bytes. System Slot Information Designation: AGP Type: 32-bit AGP Current Usage: Available Length: Long ID: 8 Characteristics: 5.0 V is provided Handle 0x001A, DMI type 13, 22 bytes. BIOS Language Information Installable Languages: 3 n|US|iso8859-1 n|US|iso8859-1 r|CA|iso8859-1 Currently Installed Language: n|US|iso8859-1 Handle 0x001B, DMI type 16, 15 bytes. Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: None Maximum Capacity: 3 GB Error Information Handle: Not Provided Number Of Devices: 3 Handle 0x001C, DMI type 17, 21 bytes. Memory Device Array Handle: 0x001B Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: 512 MB Form Factor: DIMM Set: None Locator: A0 Bank Locator: Bank0/1 Type: Unknown Type Detail: None Handle 0x001D, DMI type 17, 21 bytes. Memory Device Array Handle: 0x001B Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: 512 MB Form Factor: DIMM Set: None Locator: A1 Bank Locator: Bank2/3 Type: Unknown Type Detail: None Handle 0x001E, DMI type 17, 21 bytes. 
Memory Device Array Handle: 0x001B Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: None Locator: A2 Bank Locator: Bank4/5 Type: Unknown Type Detail: None Handle 0x001F, DMI type 19, 15 bytes. Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x0003FFFFFFF Range Size: 1 GB Physical Array Handle: 0x001B Partition Width: 0 Handle 0x0020, DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x0001FFFFFFF Range Size: 512 MB Physical Device Handle: 0x001C Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0021, DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00020000000 Ending Address: 0x0003FFFFFFF Range Size: 512 MB Physical Device Handle: 0x001D Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0022, DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000000003FF Range Size: 1 kB Physical Device Handle: 0x001E Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0023, DMI type 32, 11 bytes. System Boot Information Status: No errors detected Handle 0x0024, DMI type 127, 4 bytes. End Of Table ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 20:27 ` Martin J. Bligh
@ 2006-10-22 22:15   ` Jesse Brandeburg
  2006-10-22 22:32     ` Martin J. Bligh
  2006-10-22 23:02   ` Dumitru Ciobarcianu
  1 sibling, 1 reply; 8+ messages in thread
From: Jesse Brandeburg @ 2006-10-22 22:15 UTC
To: Martin J. Bligh; +Cc: Martin J. Bligh, Linux Kernel Mailing List, netdev

Analysis follows, but I wanted to ask you to bisect back if you can to
find the apparent patch to make the difference. Basically at this
point I'd say its not likely to be an e1000 issue, but I'd like to
follow up and make sure.

On 10/22/06, Martin J. Bligh <mbligh@google.com> wrote:
> 0000:00:0a.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
>         Latency: 32 (63750ns min), Cache Line Size: 0x08 (32 bytes)
>         Interrupt: pin A routed to IRQ 5
>         Region 0: Memory at ef020000 (64-bit, non-prefetchable) [size=128K]
>         Region 4: I/O ports at a000 [size=64]
>         Capabilities: [dc] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [e4]
>         Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>                 Address: 0000000000000000 Data: 0000
>
> 0000:00:0a.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
>         Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 32 (63750ns min), Cache Line Size: 0x08 (32 bytes)
>         Interrupt: pin B routed to IRQ 11
>         Region 0: Memory at ef000000 (64-bit, non-prefetchable) [size=128K]
>         Region 4: I/O ports at a400 [size=64]
>         Capabilities: [dc] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [e4]
>         Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>                 Address: 0000000000000000 Data: 0000

Nothing seems out of order, but the latency may be low, I'd be curious
what these looked like before with the old kernel. Some of the other
things to compare would have been the lspci -vv output from your
chipset with old/new kernel, in case the bridge/system configuration
changed. There are no known problems right now with this chipset
82546EB

> > cat /proc/interrupts,
>
>            CPU0
>   5:    1975991          XT-PIC  ehci_hcd:usb2, VIA8237, eth0
> NMI:          0
> LOC:  146264664
> ERR:      52805

shared int, fine, but whats with the ERR: ?

> > dmesg
>
> Did that bit already.

except you didn't include any of the e1000 load information nor the
system's boot information as it came up.

> > ethtool -e eth0,
>
> root@titus:/usr/local/autotest/bin # ethtool -e eth0
> Offset  Values
> ------  ------
> 0x0000  00 07 e9 09 0b 08 30 05 ff ff ff ff ff ff ff ff
> 0x0010  44 a9 03 98 0b 46 11 10 86 80 10 10 86 80 68 34
> 0x0020  0c 00 10 10 00 00 02 21 c8 18 ff ff ff ff ff ff
> 0x0030  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0040  0c c3 61 78 04 50 02 21 c8 08 ff ff ff ff ff ff
> 0x0050  ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
> 0x0060  2c 00 00 40 07 11 00 00 2c 00 00 40 ff ff ff ff

nothing out of order here that I can see immediately.

> > and maybe output of
> > dmidecode, etc.
>
> Attached.
>
> > only a little. There are so many different pieces of e1000 hardware
> > and so few specifics in this report that I'll be able to tell you lots
> > more when you get us the info requested.
>
> Thanks. Not sure if the bug wasn't there in earlier kernels, or if we
> just weren't printing anything.

I think it may not have been in earlier kernels, but I also don't
think this is an e1000 problem, at least initially.

<snip>
> BIOS Information
>         Vendor: Phoenix Technologies, LTD
>         Version: 6.00 PG
>         Release Date: 09/13/2004
>         Address: 0xE0000
>         Runtime Size: 128 kB
>         ROM Size: 256 kB
<snip>
> Handle 0x0001, DMI type 1, 25 bytes.
> System Information
>         Manufacturer: VIA Technologies, Inc.
>         Product Name: KT600-8237
>
> Handle 0x0002, DMI type 2, 8 bytes.
> Base Board Information
>         Manufacturer:
>         Product Name: KT600-8237
>         Version:
>         Serial Number:

This chipset is one of the most frequent common elements in problem
reports of TX hangs for e1000. My current theory (we've bought a
bunch of these systems and never reproduced the issue) is that there
is something either design specific or BIOS specific that causes this
chipset to interact very badly with e1000 hardware. Some systems have
the issue and some don't. If you could bisect back to a working point
it would be interesting to see where that pointed.

> Handle 0x0004, DMI type 4, 32 bytes.
> Processor Information
>         Socket Designation: Socket A
>         Type: Central Processor
>         Family: Athlon XP
>         Manufacturer: AMD
>         ID: A0 06 00 00 FF FB 83 03
>         Signature: Family 6, Model A, Stepping 0
>         Version: AMD Athlon(tm) XP
>         Voltage: 1.5 V
>         External Clock: 100 MHz
>         Max Speed: 3000 MHz
>         Current Speed: 1100 MHz

doesn't seem you're overclocked. Good.

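The two things picked out of /proc/interrupts above (eth0 sharing its
interrupt line, and the ERR counter) can be extracted with a short
script. The following is a minimal sketch under stated assumptions: it
handles only the single-CPU layout seen in this thread, the function
names are hypothetical, and the sample rows are copied from the output
posted earlier.

```python
# Rows copied from the /proc/interrupts output posted in this thread.
SAMPLE = """\
           CPU0
  5:    1975991          XT-PIC  ehci_hcd:usb2, VIA8237, eth0
 11:          0          XT-PIC  ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb7, uhci_hcd:usb8
NMI:          0
LOC:  146264664
ERR:      52805
"""


def parse_interrupts(text):
    """Map each row label to (count, [handlers]) for one-CPU output."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0"
        parts = rest.split(None, 2)
        if not parts or not parts[0].isdigit():
            continue
        count = int(parts[0])
        # Third column, when present, is the comma-separated handler list.
        handlers = [h.strip() for h in parts[2].split(",")] if len(parts) == 3 else []
        rows[label.strip()] = (count, handlers)
    return rows


def shared_with(rows, device):
    """Return (irq, other handlers on that line) for `device`, or None."""
    for irq, (_count, handlers) in rows.items():
        if device in handlers:
            return irq, [h for h in handlers if h != device]
    return None


if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    print("eth0 shares its IRQ with:", shared_with(table, "eth0"))
    print("ERR count:", table["ERR"][0])
```

Running the same script against output captured under the old and new
kernels would make it easy to compare the ERR count and the IRQ
routing between the two.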
* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 22:15   ` Jesse Brandeburg
@ 2006-10-22 22:32     ` Martin J. Bligh
  2006-10-22 23:29       ` Jesse Brandeburg
  0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2006-10-22 22:32 UTC
To: Jesse Brandeburg; +Cc: Martin J. Bligh, Linux Kernel Mailing List, netdev

[-- Attachment #1: Type: text/plain, Size: 2532 bytes --]

Jesse Brandeburg wrote:
> Analysis follows, but I wanted to ask you to bisect back if you can to
> find the apparent patch to make the difference. Basically at this
> point I'd say its not likely to be an e1000 issue, but I'd like to
> follow up and make sure.

That's going to be ugly, since I can't reproduce it at will. Maybe if
I netperf it to the other box I can push it over.

> Nothing seems out of order, but the latency may be low, I'd be curious
> what these looked like before with the old kernel. Some of the other
> things to compare would have been the lspci -vv output from your
> chipset with old/new kernel, in case the bridge/system configuration
> changed. There are no known problems right now with this chipset
> 82546EB

OK. will try later when I have more time. For now I switched to the
onboard via rhine controller.

> shared int, fine, but whats with the ERR: ?

Hmm. Having rebooted they look rather lower. but might be a time thing.

           CPU0
  0:    1405995          XT-PIC  timer
  1:       5910          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  uhci_hcd:usb3
  7:      27135          XT-PIC  ehci_hcd:usb2, VIA8237, eth0
 10:          0          XT-PIC  uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb6
 11:          0          XT-PIC  ehci_hcd:usb1, uhci_hcd:usb7, uhci_hcd:usb8
 12:     157547          XT-PIC  i8042
 14:      36296          XT-PIC  ide0
 15:     196690          XT-PIC  ide1
NMI:          0
LOC:    1406006
ERR:         26

> except you didn't include any of the e1000 load information nor the
> system's boot information as it came up.

OK, it had gone since reboot, but I rebooted just now .... new info
attached.

> This chipset is one of the most frequent common elements in problem
> reports of TX hangs for e1000. My current theory (we've bought a
> bunch of these systems and never reproduced the issue) is that there
> is something either design specific or BIOS specific that causes this
> chipset to interact very badly with e1000 hardware. Some systems have
> the issue and some don't. If you could bisect back to a working point
> it would be interesting to see where that pointed.

OK, is going to be hard to bisect, since the other one was an Ubuntu
kernel, but I guess I can give 2.6.15 virgin a shot, at least.

> doesn't seem you're overclocked. Good.

Nah, I'm pretty conservative with hardware, get enough problems when
it's all running within specs ;-)

Thanks for looking at all this.

M.

[-- Attachment #2: dmesg --]
[-- Type: text/plain, Size: 14155 bytes --]

Linux version 2.6.18 (mbligh@titus) (gcc version 3.4.6 (Ubuntu 3.4.6-1ubuntu2)) #2 Sun Oct 22 13:45:39 PDT 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
 BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
 BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
Warning only 896MB will be used.
Use a HIGHMEM enabled kernel.
896MB LOWMEM available.
found SMP MP-table at 000f52b0
On node 0 totalpages: 229376
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 225280 pages, LIFO batch:31
DMI 2.2 present.
Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 6:10 APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Enabling APIC mode: Flat.
Using 1 I/O APICs Processors: 1 Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) Detected 1098.980 MHz processor. Built 1 zonelists. Total pages: 229376 Kernel command line: root=/dev/hda1 ro lapic profile=2 kernel profiling enabled (shift: 2) mapped APIC to ffffd000 (fee00000) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 CPU 0 irqstacks, hard=c04e4000 soft=c04e3000 PID hash table entries: 4096 (order: 12, 16384 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 902328k/917504k available (2647k kernel code, 14784k reserved, 1144k data, 160k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 2200.00 BogoMIPS (lpj=4400011) Mount-cache hash table entries: 512 CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 512K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000420 00000000 00000000 00000000 Compat vDSO mapped to ffffe000. CPU: AMD Athlon(tm) stepping 00 Checking 'hlt' instruction... OK. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfb360, last bus=1 PCI: Using configuration type 1 Setting up standard PCI resources SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) Boot video device is 0000:00:09.0 PCI: Using IRQ router VIA [1106/3227] at 0000:00:11.0 spurious 8259A interrupt: IRQ7. PCI: Bridge: 0000:00:01.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled. 
PCI: Setting latency timer of device 0000:00:01.0 to 64 NET: Registered protocol family 2 IP route cache hash table entries: 32768 (order: 5, 131072 bytes) TCP established hash table entries: 131072 (order: 7, 524288 bytes) TCP bind hash table entries: 65536 (order: 6, 262144 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac) Installing knfsd (copyright (C) 1996 okir@monad.swb.de). io scheduler noop registered io scheduler cfq registered (default) lp: driver loaded but no devices found Linux agpgart interface v0.101 (c) Dave Jones Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A floppy0: no floppy controllers found loop: loaded (max 8 devices) Intel(R) PRO/1000 Network Driver - version 7.1.9-k4 Copyright (c) 1999-2006 Intel Corporation. PCI: setting IRQ 7 as level-triggered PCI: Found IRQ 7 for device 0000:00:0a.0 PCI: Sharing IRQ 7 with 0000:00:10.4 PCI: Sharing IRQ 7 with 0000:00:11.5 e1000: 0000:00:0a.0: e1000_probe: (PCI:33MHz:32-bit) 00:07:e9:09:0b:08 e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: setting IRQ 5 as level-triggered PCI: Found IRQ 5 for device 0000:00:0a.1 PCI: Sharing IRQ 5 with 0000:00:0b.0 e1000: 0000:00:0a.1: e1000_probe: (PCI:33MHz:32-bit) 00:07:e9:09:0b:09 e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection via-rhine.c:v1.10-LK1.4.1 July-24-2006 Written by Donald Becker PCI: setting IRQ 10 as level-triggered PCI: Found IRQ 10 for device 0000:00:12.0 PCI: Sharing IRQ 10 with 0000:00:0b.1 PCI: Sharing IRQ 10 with 0000:00:10.0 PCI: Sharing IRQ 10 with 0000:00:10.1 PCI: Sharing IRQ 10 with 0000:00:0c.0 eth2: VIA Rhine II at 0x1e000, 00:11:5b:a4:70:4d, IRQ 10. eth2: MII PHY found at address 1, status 0x7849 advertising 05e1 Link 0000. 
Linux video capture interface: v2.00 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:0f.1 PCI: VIA IRQ fixup for 0000:00:0f.1, from 255 to 0 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1 ide0: BM-DMA at 0xc800-0xc807, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xc808-0xc80f, BIOS settings: hdc:DMA, hdd:DMA Probing IDE interface ide0... hda: ST3200822A, ATA DISK drive hdb: ST3400832A, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... hdc: _NEC DVD_RW ND-3550A, ATAPI CD/DVD-ROM drive hdd: DVD-ROM BDV316B, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 512KiB hda: 390721968 sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 hda3 hdb: max request size: 512KiB hdb: 781422768 sectors (400088 MB) w/8192KiB Cache, CHS=48641/255/63, UDMA(100) hdb: cache flushes supported hdb: hdb1 hdc: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 hdd: ATAPI 48X DVD-ROM drive, 512kB Cache, UDMA(33) ide-floppy driver 0.99.newide usbmon: debugfs is not available PCI: setting IRQ 11 as level-triggered PCI: Found IRQ 11 for device 0000:00:0b.2 PCI: Sharing IRQ 11 with 0000:00:09.0 PCI: Sharing IRQ 11 with 0000:00:10.2 PCI: Sharing IRQ 11 with 0000:00:10.3 ehci_hcd 0000:00:0b.2: EHCI Host Controller ehci_hcd 0000:00:0b.2: new USB bus registered, assigned bus number 1 ehci_hcd 0000:00:0b.2: irq 11, io mem 0xef040000 ehci_hcd 0000:00:0b.2: USB 2.0 started, EHCI 0.95, driver 10 Dec 2004 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 4 ports detected PCI: Found IRQ 7 for device 0000:00:10.4 PCI: Sharing IRQ 7 with 0000:00:0a.0 PCI: Sharing IRQ 7 with 
0000:00:11.5 ehci_hcd 0000:00:10.4: EHCI Host Controller ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 2 ehci_hcd 0000:00:10.4: irq 7, io mem 0xef041000 ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 usb usb2: configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 8 ports detected ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) USB Universal Host Controller Interface driver v3.0 PCI: Found IRQ 5 for device 0000:00:0b.0 PCI: Sharing IRQ 5 with 0000:00:0a.1 uhci_hcd 0000:00:0b.0: UHCI Host Controller uhci_hcd 0000:00:0b.0: new USB bus registered, assigned bus number 3 uhci_hcd 0000:00:0b.0: irq 5, io base 0x0000a800 usb usb3: configuration #1 chosen from 1 choice hub 3-0:1.0: USB hub found hub 3-0:1.0: 2 ports detected PCI: Found IRQ 10 for device 0000:00:0b.1 PCI: Sharing IRQ 10 with 0000:00:10.0 PCI: Sharing IRQ 10 with 0000:00:10.1 PCI: Sharing IRQ 10 with 0000:00:0c.0 PCI: Sharing IRQ 10 with 0000:00:12.0 uhci_hcd 0000:00:0b.1: UHCI Host Controller uhci_hcd 0000:00:0b.1: new USB bus registered, assigned bus number 4 uhci_hcd 0000:00:0b.1: irq 10, io base 0x0000ac00 usb usb4: configuration #1 chosen from 1 choice hub 4-0:1.0: USB hub found hub 4-0:1.0: 2 ports detected PCI: Found IRQ 10 for device 0000:00:10.0 PCI: Sharing IRQ 10 with 0000:00:0b.1 PCI: Sharing IRQ 10 with 0000:00:10.1 PCI: Sharing IRQ 10 with 0000:00:0c.0 PCI: Sharing IRQ 10 with 0000:00:12.0 uhci_hcd 0000:00:10.0: UHCI Host Controller uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 5 uhci_hcd 0000:00:10.0: irq 10, io base 0x0000cc00 usb usb5: configuration #1 chosen from 1 choice hub 5-0:1.0: USB hub found hub 5-0:1.0: 2 ports detected PCI: Found IRQ 10 for device 0000:00:10.1 PCI: Sharing IRQ 10 with 0000:00:0b.1 PCI: Sharing IRQ 10 with 0000:00:10.0 PCI: Sharing IRQ 10 with 0000:00:0c.0 PCI: Sharing IRQ 10 with 0000:00:12.0 uhci_hcd 0000:00:10.1: UHCI Host Controller uhci_hcd 
0000:00:10.1: new USB bus registered, assigned bus number 6 uhci_hcd 0000:00:10.1: irq 10, io base 0x0000d000 usb usb6: configuration #1 chosen from 1 choice hub 6-0:1.0: USB hub found hub 6-0:1.0: 2 ports detected PCI: Found IRQ 11 for device 0000:00:10.2 PCI: Sharing IRQ 11 with 0000:00:09.0 PCI: Sharing IRQ 11 with 0000:00:10.3 PCI: Sharing IRQ 11 with 0000:00:0b.2 uhci_hcd 0000:00:10.2: UHCI Host Controller uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 7 uhci_hcd 0000:00:10.2: irq 11, io base 0x0000d400 usb usb7: configuration #1 chosen from 1 choice hub 7-0:1.0: USB hub found hub 7-0:1.0: 2 ports detected PCI: Found IRQ 11 for device 0000:00:10.3 PCI: Sharing IRQ 11 with 0000:00:09.0 PCI: Sharing IRQ 11 with 0000:00:10.2 PCI: Sharing IRQ 11 with 0000:00:0b.2 uhci_hcd 0000:00:10.3: UHCI Host Controller uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 8 uhci_hcd 0000:00:10.3: irq 11, io base 0x0000d800 usb usb8: configuration #1 chosen from 1 choice hub 8-0:1.0: USB hub found hub 8-0:1.0: 2 ports detected Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. usbcore: registered new driver usbserial drivers/usb/serial/usb-serial.c: USB Serial support registered for generic usbcore: registered new driver usbserial_generic drivers/usb/serial/usb-serial.c: USB Serial Driver core drivers/usb/serial/usb-serial.c: USB Serial support registered for Handspring Visor / Palm OS drivers/usb/serial/usb-serial.c: USB Serial support registered for Sony Clie 3.5 drivers/usb/serial/usb-serial.c: USB Serial support registered for Sony Clie 5.0 usbcore: registered new driver visor drivers/usb/serial/visor.c: USB HandSpring Visor / Palm OS driver serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 mice: PS/2 mouse device common for all mice Advanced Linux Sound Architecture Driver Version 1.0.12rc1 (Thu Jun 22 13:55:50 2006 UTC). 
PCI: Found IRQ 7 for device 0000:00:11.5 PCI: Sharing IRQ 7 with 0000:00:0a.0 PCI: Sharing IRQ 7 with 0000:00:10.4 PCI: Setting latency timer of device 0000:00:11.5 to 64 input: AT Translated Set 2 keyboard as /class/input/input0 logips2pp: Detected unknown logitech mouse model 11 ALSA device list: #0: VIA 8237 with ALC655 at 0xdc00, irq 7 oprofile: using NMI interrupt. ip_conntrack version 2.4 (7168 buckets, 57344 max) - 172 bytes per conntrack TCP bic registered NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI Shortcut mode Time: tsc clocksource has been installed. kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 160k freed input: PS/2 Logitech Mouse as /class/input/input1 e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex Unable to find swap-space signature EXT3 FS on hda1, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on hda3, internal journal EXT3-fs: mounted filesystem with ordered data mode. kjournald starting. Commit interval 5 seconds EXT3 FS on hdb1, internal journal EXT3-fs: mounted filesystem with ordered data mode. 
Unable to find swap-space signature
hdc: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdc: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec
hdd: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdd: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec
device eth0 entered promiscuous mode
device eth0 left promiscuous mode
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <26>
  TDT                  <26>
  next_to_use          <26>
  next_to_clean        <39>
buffer_info[next_to_clean]
  time_stamp           <77145>
  next_to_watch        <3b>
  jiffies              <7734f>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <26>
  TDT                  <26>
  next_to_use          <26>
  next_to_clean        <39>
buffer_info[next_to_clean]
  time_stamp           <77145>
  next_to_watch        <3b>
  jiffies              <77543>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <26>
  TDT                  <26>
  next_to_use          <26>
  next_to_clean        <39>
buffer_info[next_to_clean]
  time_stamp           <77145>
  next_to_watch        <3b>
  jiffies              <77737>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <26>
  TDT                  <26>
  next_to_use          <26>
  next_to_clean        <39>
buffer_info[next_to_clean]
  time_stamp           <77145>
  next_to_watch        <3b>
  jiffies              <7792b>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
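A note for anyone reading these dumps: time_stamp and jiffies are both printed in hex, so their difference shows how long the buffer at next_to_clean has been waiting. Below is a rough Python sketch of that arithmetic, not anything from the driver. The function names are made up for illustration, and HZ=250 is an assumption inferred from the "2200.00 BogoMIPS (lpj=4400011)" line in the dmesg (lpj / (500000 / HZ) only gives 2200 for HZ=250); nobody in the thread states the HZ value.

```python
import re

# Assumption: HZ=250, inferred from "2200.00 BogoMIPS (lpj=4400011)".
# It is not stated anywhere in the thread.
HZ = 250

# Matches "name <hex>" pairs; the name is the last token before "<...>".
FIELD_RE = re.compile(r"([^\s<>]+)\s*<([0-9a-fA-F]+)>")


def parse_tx_hang(dump: str) -> dict:
    """Return {field name: integer value} for one Tx Unit Hang dump."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(dump)}


def stalled_seconds(dump: str, hz: int = HZ) -> float:
    """Seconds between time_stamp and jiffies (ignores jiffies wraparound)."""
    fields = parse_tx_hang(dump)
    return (fields["jiffies"] - fields["time_stamp"]) / hz


# First and last dump from the dmesg attachment, reduced to the two fields used.
FIRST_DUMP = "time_stamp <77145> next_to_watch <3b> jiffies <7734f>"
LAST_DUMP = "time_stamp <77145> next_to_watch <3b> jiffies <7792b>"

if __name__ == "__main__":
    print(stalled_seconds(FIRST_DUMP))  # 522 jiffies  -> 2.088
    print(stalled_seconds(LAST_DUMP))   # 2022 jiffies -> 8.088
```

Under that HZ assumption, consecutive dumps are 500 jiffies (2 seconds) apart, which matches the two-second spacing of the syslog timestamps in the first report.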
* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 22:32         ` Martin J. Bligh
@ 2006-10-22 23:29           ` Jesse Brandeburg
  0 siblings, 0 replies; 8+ messages in thread
From: Jesse Brandeburg @ 2006-10-22 23:29 UTC
  To: Martin J. Bligh; +Cc: Martin J. Bligh, Linux Kernel Mailing List, netdev

On 10/22/06, Martin J. Bligh <mbligh@google.com> wrote:
> Jesse Brandeburg wrote:
> > Analysis follows, but I wanted to ask you to bisect back if you can to
> > find the apparent patch to make the difference. Basically at this
> > point I'd say its not likely to be an e1000 issue, but I'd like to
> > follow up and make sure.
>
> That's going to be ugly, since I can't reproduce it at will. Maybe if
> I netperf it to the other box I can push it over.

try tbench with 100 sessions (from dbench package) and see if that hurts.

> > Nothing seems out of order, but the latency may be low, I'd be curious
> > what these looked like before with the old kernel. Some of the other
> > things to compare would have been the lspci -vv output from your
> > chipset with old/new kernel, in case the bridge/system configuration
> > changed. There are no known problems right now with this chipset
> > 82546EB
>
> OK. will try later when I have more time. For now I switched to the
> onboard via rhine controller.

ouch.

> > shared int, fine, but whats with the ERR: ?
>
> Hmm. Having rebooted they look rather lower. but might be a time thing.
>
>            CPU0
>   0:    1405995    XT-PIC  timer
>   1:       5910    XT-PIC  i8042
>   2:          0    XT-PIC  cascade
>   5:          0    XT-PIC  uhci_hcd:usb3
>   7:      27135    XT-PIC  ehci_hcd:usb2, VIA8237, eth0
>  10:          0    XT-PIC  uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb6
>  11:          0    XT-PIC  ehci_hcd:usb1, uhci_hcd:usb7, uhci_hcd:usb8
>  12:     157547    XT-PIC  i8042
>  14:      36296    XT-PIC  ide0
>  15:     196690    XT-PIC  ide1
> NMI:          0
> LOC:    1406006
> ERR:         26
>
> > except you didn't include any of the e1000 load information nor the
> > system's boot information as it came up.
>
> OK, it had gone since reboot, but I rebooted just now .... new info
> attached.
>
> > This chipset is one of the most frequent common elements in problem
> > reports of TX hangs for e1000. My current theory (we've bought a
> > bunch of these systems and never reproduced the issue) is that there
> > is something either design specific or BIOS specific that causes this
> > chipset to interact very badly with e1000 hardware. Some systems have
> > the issue and some don't. If you could bisect back to a working point
> > it would be interesting to see where that pointed.
>
> OK, is going to be hard to bisect, since the other one was an Ubuntu
> kernel, but I guess I can give 2.6.15 virgin a shot, at least.

thanks, I know how difficult and time consuming bisecting is.

> > doesn't seem you're overclocked. Good.
>
> Nah, I'm pretty conservative with hardware, get enough problems when
> it's all running within specs ;-)
>
> Thanks for looking at all this.

welcome, like to help when I can.

> Linux version 2.6.18 (mbligh@titus) (gcc version 3.4.6 (Ubuntu 3.4.6-1ubuntu2)) #2 Sun
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <26>
>   TDT                  <26>
>   next_to_use          <26>
>   next_to_clean        <39>
> buffer_info[next_to_clean]
>   time_stamp           <77145>
>   next_to_watch        <3b>
>   jiffies              <7734f>
>   next_to_watch.status <0>
> NETDEV WATCHDOG: eth0: transmit timed out
> e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex

hey, this one is different. It is actually the common tx hang
signature (TDH == TDT) for these kinds of systems. I've come up with
a workaround driver, code is still in development. you can try it if
you would like.

http://sourceforge.net/tracker/download.php?group_id=42302&atid=447449&file_id=198849&aid=1463045

Thanks,
Jesse
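The thread distinguishes two kinds of dump: the first report had TDH <3d> and TDT <44> (not equal, described earlier as the hardware being hung processing buffers the driver gave it), while the dump after the reboot has TDH == TDT, which is called the common tx hang signature here. A minimal Python sketch that sorts a dump into one of those two buckets is below. The function name and the returned labels are hypothetical and only restate the wording used in this thread; this is not driver code.

```python
import re

# Matches "name <hex>" pairs; the name is the last token before "<...>".
FIELD_RE = re.compile(r"([^\s<>]+)\s*<([0-9a-fA-F]+)>")


def classify_tx_hang(dump: str) -> str:
    """Label a Tx Unit Hang dump by comparing the TDH and TDT values."""
    fields = {name: int(value, 16) for name, value in FIELD_RE.findall(dump)}
    if fields["TDH"] == fields["TDT"]:
        # Wording from the reply above: "the common tx hang signature".
        return "TDH == TDT: common tx hang signature"
    # Wording from the earlier reply: hardware hung processing buffers.
    return "TDH != TDT: hardware hung processing buffers"


if __name__ == "__main__":
    first_report = "Tx Queue <0> TDH <3d> TDT <44> next_to_use <44> next_to_clean <39>"
    after_reboot = "Tx Queue <0> TDH <26> TDT <26> next_to_use <26> next_to_clean <39>"
    print(classify_tx_hang(first_report))
    print(classify_tx_hang(after_reboot))
```

The values are parsed as hex before comparing, so differently padded or differently cased output (for example <3D> versus <3d>) still compares equal.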
* Re: Strange errors from e1000 driver (2.6.18)
  2006-10-22 20:27     ` Martin J. Bligh
  2006-10-22 22:15       ` Jesse Brandeburg
@ 2006-10-22 23:02       ` Dumitru Ciobarcianu
  1 sibling, 0 replies; 8+ messages in thread
From: Dumitru Ciobarcianu @ 2006-10-22 23:02 UTC
  To: Martin J. Bligh
  Cc: Jesse Brandeburg, Martin J. Bligh, Linux Kernel Mailing List, netdev

On Sun, 2006-10-22 at 13:27 -0700, Martin J. Bligh wrote:
> Jesse Brandeburg wrote:
> > On 10/22/06, Martin J. Bligh <mbligh@mbligh.org> wrote:
> >> Martin J. Bligh wrote:
> >> > I'm getting a lot of these type of errors if I run 2.6.18. If
> >> > I run the standard Ubuntu Dapper kernel, I don't get them.
> >> > What do they indicate?
> >
> > Hi Martin, they indicate that you're getting transmit hangs. Means
> > your hardware is having issues with some of the buffers it is being
> > handed. Because the TDH and TDT noted below are not equal, it means
> > the hardware is hung processing buffers that the driver gave to it.
> >
> > We need the standard bug report particulars,
>
> Sure.
>
> Handle 0x0001, DMI type 1, 25 bytes.
> System Information
>         Manufacturer: VIA Technologies, Inc.
>         Product Name: KT600-8237
>         Version:
>         Serial Number:
>         UUID: Not Present
>         Wake-up Type: Power Switch

If this matters: I've got the same errors with the fc5 kernel sometime
around january, also on an VIA-based motherboard.
I only got around to fix it by changing the motherboard... (worked fine
with an intel-based mb).

-- 
Cioby
end of thread, other threads:[~2006-10-22 23:29 UTC | newest]

Thread overview: 8+ messages
2006-10-22 18:46 Strange errors from e1000 driver (2.6.18) Martin J. Bligh
2006-10-22 19:05 ` Martin J. Bligh
2006-10-22 20:21   ` Jesse Brandeburg
2006-10-22 20:27     ` Martin J. Bligh
2006-10-22 22:15       ` Jesse Brandeburg
2006-10-22 22:32         ` Martin J. Bligh
2006-10-22 23:29           ` Jesse Brandeburg
2006-10-22 23:02       ` Dumitru Ciobarcianu