* rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
@ 2006-06-30 9:23 Marcus Better
2006-06-30 21:16 ` Francois Romieu
0 siblings, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-06-30 9:23 UTC (permalink / raw)
To: netdev
I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
Realtek 8139: The ethernet stops working, usually after at most a few
minutes operation. The problem appears in kernel 2.6.16 and 2.6.17, but not
in 2.6.15.
It prints the following in the syslog:
Jun 28 07:50:36 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:50:39 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:50:51 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:50:54 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:51:06 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:51:09 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:51:21 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:51:24 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
I've exchanged the hub and network cables, to no avail.
Results of lspci -vvv:
01:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (8000ns min, 16000ns max)
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at 9000 [size=256]
        Region 1: Memory at a0100000 (32-bit, non-prefetchable) [size=512]
        Capabilities: [60] Vital Product Data
Marcus
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-06-30 9:23 rtl8139: NETDEV WATCHDOG: eth0: transmit timed out Marcus Better
@ 2006-06-30 21:16 ` Francois Romieu
2006-07-04 19:38 ` Marcus Better
2006-07-07 16:32 ` Marcus Better
0 siblings, 2 replies; 11+ messages in thread
From: Francois Romieu @ 2006-06-30 21:16 UTC (permalink / raw)
To: Marcus Better; +Cc: netdev
Marcus Better <marcus@better.se> :
> I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
> Realtek 8139: The ethernet stops working, usually after at most a few
> minutes operation. The problem appears in kernel 2.6.16 and 2.6.17, but not
> in 2.6.15.
Broken again :o(
In a better world, you would narrow the suspect with a git bisect [1]
between v2.6.15 and v2.6.16.
As an alternate solution, you may try the patches available at
http://www.fr.zoreil.com/people/francois/misc/8139 against v2.6.15,
starting from 0001 and going to 0006 (of course you can revert the
patches against v2.6.16 from 0006 to 0001 too). I can't claim that
each step will compile though.
[1] http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html
--
Ueimor
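
The bisection suggested above is a binary search over the commits between a known-good and a known-bad kernel. A minimal sketch of the idea follows, assuming a linear history (real kernel history is a graph, which git bisect handles itself) and a hypothetical range of 5000 commits; `first_bad` and the culprit position are illustrative names and values, not anything taken from the kernel tree.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad. Returns (index, tests run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one build, boot, and test cycle per iteration
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return bad, tests


# Hypothetical linear history: 5000 commits, the bug arrives at index 3210.
commits = list(range(5000))
culprit, tests = first_bad(commits, lambda c: c >= 3210)
print(culprit, tests)
```

Each test halves the remaining range, so a range of a few thousand commits needs roughly a dozen build-and-boot cycles.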
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-06-30 21:16 ` Francois Romieu
@ 2006-07-04 19:38 ` Marcus Better
2006-07-07 16:32 ` Marcus Better
1 sibling, 0 replies; 11+ messages in thread
From: Marcus Better @ 2006-07-04 19:38 UTC (permalink / raw)
To: netdev
Francois Romieu wrote:
> In a better world, you would narrow the suspect with a git bisect [1]
> between v2.6.15 and v2.6.16.
Will try. It may take some time...
Marcus
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-06-30 21:16 ` Francois Romieu
2006-07-04 19:38 ` Marcus Better
@ 2006-07-07 16:32 ` Marcus Better
2006-07-08 16:15 ` Thomas Hellström
1 sibling, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-07 16:32 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev, Alan Hourihane, dri-devel, Dave Airlie
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
(For those who haven't followed, this is about
http://permalink.gmane.org/gmane.linux.network/38493
)
Francois Romieu wrote:
> Marcus Better <marcus@better.se> :
>> I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
>> Realtek 8139: The ethernet stops working, usually after at most a few
>> minutes operation.
> In a better world, you would narrow the suspect with a git bisect [1]
> between v2.6.15 and v2.6.16.
I did, and the winner after 13 reboots is...
commit de227f5f32775d86e5c780a7cffdd2e08574f7fb
Author: Dave Airlie <airlied@starflyer.(none)>
Date: Wed Jan 25 15:31:43 2006 +1100
drm: i915 patches from Tungsten Graphics
Fix CMDBUFFER path, add heap destroy and flesh out sarea for rotation
(Tungsten Graphics)
From: Alan Hourihane <alanh@tungstengraphics.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
I didn't believe it at first either, but blacklisting the i915 module
actually fixes the problem. Now that I know what to look for, I notice
that the network errors always started cropping up after X11 started.
Wonder what's going on here. Why is the graphics driver killing my network?
Marcus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
iD8DBQFEroy7XjXn6TzcAQkRAp7bAJ9F7HgWg+VsvQ0fwkK3+b4Ne+tASwCg8+m3
8i5BoW+ujUjoX3DLW0QKAPQ=
=MDAc
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-07 16:32 ` Marcus Better
@ 2006-07-08 16:15 ` Thomas Hellström
2006-07-09 7:23 ` Marcus Better
0 siblings, 1 reply; 11+ messages in thread
From: Thomas Hellström @ 2006-07-08 16:15 UTC (permalink / raw)
To: Marcus Better
Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane
Hi!
Marcus Better wrote:
>(For those who haven't followed, this is about
> http://permalink.gmane.org/gmane.linux.network/38493
>)
>
>Francois Romieu wrote:
>
>>Marcus Better <marcus@better.se> :
>>
>>>I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
>>>Realtek 8139: The ethernet stops working, usually after at most a few
>>>minutes operation.
>>
>>In a better world, you would narrow the suspect with a git bisect [1]
>>between v2.6.15 and v2.6.16.
>
>I did, and the winner after 13 reboots is...
>
>commit de227f5f32775d86e5c780a7cffdd2e08574f7fb
>Author: Dave Airlie <airlied@starflyer.(none)>
>Date: Wed Jan 25 15:31:43 2006 +1100
>
> drm: i915 patches from Tungsten Graphics
>
> Fix CMDBUFFER path, add heap destroy and flesh out sarea for rotation
> (Tungsten Graphics)
>
> From: Alan Hourihane <alanh@tungstengraphics.com>
> Signed-off-by: Dave Airlie <airlied@linux.ie>
>
>I didn't believe it at first either, but blacklisting the i915 module
>actually fixes the problem. Now that I know what to look for, I notice
>that the network errors always started cropping up after X11 started.
>
>Wonder what's going on here. Why is the graphics driver killing my network?
>
I guess you got the wrong commit, and the correct one should be the one
where Dave adds vblank interrupts. It should be close to the one you listed.
Some i915 chips have buggy interrupts. What happens is that the display
interrupts are duplicated on the sound interrupt channel. Usually
there's no interrupt handler there that recognizes them, and after a
while, the kernel detects too many spurious interrupts and disables the
sound IRQ line. If the network sits on the same IRQ line, it will be
disabled as well. If you check your kernel logs, you will probably have
messages about disabled IRQs.
A workaround is to add the "noirqdebug" (I hope I remember the spelling
correctly) kernel option at boot time.
/Thomas
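
The mechanism described above (unrecognized interrupts piling up until the kernel shuts off the whole line) can be illustrated with a toy model. This is only a sketch: the window of 100,000 interrupts and the 99,900 unhandled threshold are assumptions recalled from the kernel's spurious-interrupt accounting and should be checked against the source of the kernel in question; `IrqLine` and the handler functions are hypothetical names.

```python
class IrqLine:
    """Toy model of one shared interrupt line with spurious-IRQ accounting."""

    WINDOW = 100_000        # interrupts per accounting window (assumed value)
    MAX_UNHANDLED = 99_900  # unhandled count that trips the disable (assumed)

    def __init__(self, handlers):
        self.handlers = handlers  # callables: source -> True if they own it
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def fire(self, source):
        """Deliver one interrupt; return True if some handler claimed it."""
        if self.disabled:
            return False  # line is masked, so even genuine interrupts are lost
        # Every handler registered on a shared line gets called.
        handled = any([handler(source) for handler in self.handlers])
        if not handled:
            self.unhandled += 1
        self.count += 1
        if self.count >= self.WINDOW:
            if self.unhandled > self.MAX_UNHANDLED:
                self.disabled = True  # corresponds to "Disabling IRQ #N"
            self.count = 0
            self.unhandled = 0
        return handled


def usb_handler(source):
    return source == "usb"


def eth0_handler(source):
    return source == "eth0"


line = IrqLine([usb_handler, eth0_handler])
for _ in range(IrqLine.WINDOW):
    line.fire("stray")    # interrupts that no handler on this line recognizes
print(line.disabled)      # whether the line has been shut off
print(line.fire("eth0"))  # whether a genuine NIC interrupt still gets through
```

In this model, once the line is disabled the network handler never runs again, which is consistent with the transmit timeouts reported at the start of the thread.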
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-08 16:15 ` Thomas Hellström
@ 2006-07-09 7:23 ` Marcus Better
2006-07-09 7:34 ` Thomas Hellström
0 siblings, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-09 7:23 UTC (permalink / raw)
To: Thomas Hellström
Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane
Thomas Hellström wrote:
> I guess you got the wrong commit, and the correct one should be the one
> where Dave adds vblank interrupts. It should be close to the one you
> listed.
I thought I double-checked that it was the right commit, but will check
again.
> If the network sits on the same IRQ line, it will be disabled as well.
It appears to be on a different IRQ:
~$ cat /proc/interrupts
           CPU0
  0:     202574          XT-PIC  timer
  1:       2649          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:       1059          XT-PIC  yenta, Intel 82801CA-ICH3 Modem, Intel 82801CA-ICH3
 11:      40776          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0
 12:     100433          XT-PIC  i8042
 14:      63563          XT-PIC  ide0
 15:       6773          XT-PIC  ide1
NMI:          0
ERR:          0
(This is without i915 loaded though.)
> If you check your kernel logs, you will probably have
> messages about disabled IRQs.
Correct. Here's an example:
Jul 7 16:43:39 kelev kernel: irq 11: nobody cared (try booting with the "irqpoll" option)
Jul 7 16:43:39 kelev kernel: [<c0138b74>] __report_bad_irq+0x24/0x90
Jul 7 16:43:39 kelev kernel: [<c0138c82>] note_interrupt+0x72/0xc0
Jul 7 16:43:39 kelev kernel: [<c013862e>] __do_IRQ+0xae/0xc0
Jul 7 16:43:39 kelev kernel: [<c0104a69>] do_IRQ+0x19/0x30
Jul 7 16:43:39 kelev kernel: [<c01030c2>] common_interrupt+0x1a/0x20
Jul 7 16:43:39 kelev kernel: [<c011c4fe>] __do_softirq+0x2e/0xa0
Jul 7 16:43:39 kelev kernel: [<c011c597>] do_softirq+0x27/0x30
Jul 7 16:43:39 kelev kernel: [<c0104a6e>] do_IRQ+0x1e/0x30
Jul 7 16:43:39 kelev kernel: [<c01030c2>] common_interrupt+0x1a/0x20
Jul 7 16:43:39 kelev kernel: [<c0102ef5>] syscall_call+0x7/0xb
Jul 7 16:43:39 kelev kernel: handlers:
Jul 7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul 7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul 7 16:43:39 kelev kernel: [<e0044160>] (rtl8139_interrupt+0x0/0x1d0 [8139too])
Jul 7 16:43:39 kelev kernel: Disabling IRQ #11
Jul 7 16:44:03 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jul 7 16:44:03 kelev kernel: eth0: Transmit timeout, status 0c 0005 c07f media 90.
Jul 7 16:44:03 kelev kernel: eth0: Tx queue start entry 48 dirty entry 44.
Jul 7 16:44:03 kelev kernel: eth0: Tx descriptor 0 is 0008a062. (queue head)
Jul 7 16:44:03 kelev kernel: eth0: Tx descriptor 1 is 0008a062.
Jul 7 16:44:03 kelev kernel: eth0: Tx descriptor 2 is 0008a062.
Jul 7 16:44:03 kelev kernel: eth0: Tx descriptor 3 is 0008a062.
Jul 7 16:44:03 kelev kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
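
The ordering in the log above (IRQ 11 disabled first, watchdog timeouts afterwards) is the pattern worth searching for in a longer syslog. A minimal sketch of such a scan, assuming the message wording shown in this thread; the `correlate` helper is a hypothetical name and the sample is abbreviated from the excerpt above.

```python
import re

DISABLED = re.compile(r"Disabling IRQ #(\d+)")
TIMEOUT = re.compile(r"NETDEV WATCHDOG: (\w+): transmit timed out")


def correlate(lines):
    """Return (IRQs that were disabled, interfaces that timed out afterwards)."""
    disabled, timed_out = [], []
    for line in lines:
        match = DISABLED.search(line)
        if match:
            disabled.append(int(match.group(1)))
            continue
        match = TIMEOUT.search(line)
        if match and disabled:
            # Only count timeouts that follow an IRQ shutdown.
            timed_out.append(match.group(1))
    return disabled, timed_out


LOG = """\
Jul 7 16:43:39 kelev kernel: irq 11: nobody cared (try booting with the "irqpoll" option)
Jul 7 16:43:39 kelev kernel: Disabling IRQ #11
Jul 7 16:44:03 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
""".splitlines()

print(correlate(LOG))
```

A timeout that appears only after a "Disabling IRQ" line for the NIC's interrupt points at the interrupt line rather than at the network driver itself.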
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-09 7:23 ` Marcus Better
@ 2006-07-09 7:34 ` Thomas Hellström
2006-07-09 16:34 ` Marcus Better
2006-07-10 13:02 ` Marcus Better
0 siblings, 2 replies; 11+ messages in thread
From: Thomas Hellström @ 2006-07-09 7:34 UTC (permalink / raw)
To: Marcus Better
Cc: Dave Airlie, netdev, dri-devel, Alan Hourihane, Francois Romieu
Marcus Better wrote:
>It appears to be on a different IRQ:
>~$ cat /proc/interrupts
>           CPU0
>  0:     202574          XT-PIC  timer
>  1:       2649          XT-PIC  i8042
>  2:          0          XT-PIC  cascade
>  8:          0          XT-PIC  rtc
>  9:          0          XT-PIC  acpi
> 10:       1059          XT-PIC  yenta, Intel 82801CA-ICH3 Modem, Intel 82801CA-ICH3
> 11:      40776          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0
> 12:     100433          XT-PIC  i8042
> 14:      63563          XT-PIC  ide0
> 15:       6773          XT-PIC  ide1
>NMI:          0
>ERR:          0
>
>(This is without i915 loaded though.)
>
Strange. I've also seen the i915 sending false interrupts on its own
line, though.
Does the "noirqdebug" option fix the problem?
/Thomas
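
To confirm that a test boot really ran with the option, the kernel command line can be inspected. A minimal sketch, assuming the usual /proc/cmdline location; `has_boot_option` and `read_cmdline` are hypothetical helper names and the example command line is made up for illustration.

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Hypothetical command line; on a live system pass read_cmdline() instead.
example = "root=/dev/hda1 ro noirqdebug"
print(has_boot_option(example, "noirqdebug"))
```

As far as I understand it, noirqdebug only switches off the kernel's detection of unhandled interrupts, so it hides the symptom without stopping the stray interrupts themselves.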
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-09 7:34 ` Thomas Hellström
@ 2006-07-09 16:34 ` Marcus Better
2006-07-10 13:02 ` Marcus Better
1 sibling, 0 replies; 11+ messages in thread
From: Marcus Better @ 2006-07-09 16:34 UTC (permalink / raw)
To: Thomas Hellström
Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Thomas Hellström wrote:
> Strange. I've also seen the i915 sending false interrupts on its own
> line, though.
Here's the interrupt table with i915 loaded:
~$ cat /proc/interrupts
           CPU0
  0:     401031          XT-PIC  timer
  1:       3681          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:        997          XT-PIC  yenta, Intel 82801CA-ICH3, Intel 82801CA-ICH3 Modem
 11:      93823          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0, i915@pci:0000:00:02.0
 12:      75631          XT-PIC  i8042
 14:      18284          XT-PIC  ide0
 15:      13901          XT-PIC  ide1
NMI:          0
ERR:          0
> Does the "noirqdebug" option fix the problem?
Yes, it appears to fix it.
Marcus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
iD8DBQFEsTArXjXn6TzcAQkRAn/vAKCZUAVd45xQae4FthvNr68x/jTS4QCgyE7N
CzPv0R9okmIjrsGykMXrfPk=
=gU6D
-----END PGP SIGNATURE-----
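
The table in this message shows i915 on the same line as eth0. A short sketch of how that check could be scripted follows, assuming the single-CPU layout seen in this thread (one count column, then the controller name, then a comma-separated device list); `parse_interrupts` and `shared_with` are hypothetical helpers and the sample is abbreviated from the output above.

```python
def parse_interrupts(text):
    """Map IRQ number -> list of handler names from /proc/interrupts text.

    Assumes a single CPU column, as in the output quoted in this thread.
    """
    table = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip the CPU header and the NMI/ERR summary rows
        fields = rest.split(None, 2)  # count, controller, device list
        devices = fields[2] if len(fields) == 3 else ""
        table[int(label)] = [d.strip() for d in devices.split(",") if d.strip()]
    return table


def shared_with(table, device):
    """Return (irq, other devices on that line) for the given device."""
    for irq, devices in table.items():
        if device in devices:
            return irq, [d for d in devices if d != device]
    return None, []


SAMPLE = """\
           CPU0
  0:     401031          XT-PIC  timer
 10:        997          XT-PIC  yenta, Intel 82801CA-ICH3, Intel 82801CA-ICH3 Modem
 11:      93823          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0, i915@pci:0000:00:02.0
NMI:          0
ERR:          0
"""

irq, others = shared_with(parse_interrupts(SAMPLE), "eth0")
print(irq, others)
```

Anything that shares the line with eth0 can take the NIC down with it if the kernel disables that IRQ.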
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-09 7:34 ` Thomas Hellström
2006-07-09 16:34 ` Marcus Better
@ 2006-07-10 13:02 ` Marcus Better
2006-07-10 13:10 ` Thomas Hellström
1 sibling, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-10 13:02 UTC (permalink / raw)
To: Thomas Hellström
Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane
Thomas Hellström wrote:
> Does the "noirqdebug" option fix the problem?
Yes... but it breaks switching to a text console. I get an interesting
"fluid" effect on the screen (a bright static pattern), and the keyboard
locks up.
Marcus
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
2006-07-10 13:02 ` Marcus Better
@ 2006-07-10 13:10 ` Thomas Hellström
2006-07-10 13:17 ` Marcus Better
0 siblings, 1 reply; 11+ messages in thread
From: Thomas Hellström @ 2006-07-10 13:10 UTC (permalink / raw)
To: Marcus Better; +Cc: Francois Romieu, netdev, dri-devel
Marcus Better wrote:
>Thomas Hellström wrote:
>
>>Does the "noirqdebug" option fix the problem?
>
>Yes... but it breaks switching to a text console. I get an interesting
>"fluid" effect on the screen (a bright static pattern), and the keyboard
>locks up.
>
>Marcus
>
Hi!
Are you _sure_ these are related?
If you don't use the noirqdebug option, do you have the same problem?
/Thomas
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-07-10 13:17 UTC | newest]
Thread overview: 11+ messages
2006-06-30 9:23 rtl8139: NETDEV WATCHDOG: eth0: transmit timed out Marcus Better
2006-06-30 21:16 ` Francois Romieu
2006-07-04 19:38 ` Marcus Better
2006-07-07 16:32 ` Marcus Better
2006-07-08 16:15 ` Thomas Hellström
2006-07-09 7:23 ` Marcus Better
2006-07-09 7:34 ` Thomas Hellström
2006-07-09 16:34 ` Marcus Better
2006-07-10 13:02 ` Marcus Better
2006-07-10 13:10 ` Thomas Hellström
2006-07-10 13:17 ` Marcus Better