netdev.vger.kernel.org archive mirror
* rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
@ 2006-06-30  9:23 Marcus Better
  2006-06-30 21:16 ` Francois Romieu
  0 siblings, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-06-30  9:23 UTC (permalink / raw)
  To: netdev

I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
Realtek 8139: The ethernet stops working, usually after at most a few
minutes operation. The problem appears in kernel 2.6.16 and 2.6.17, but not
in 2.6.15.

It prints the following in the syslog:

Jun 28 07:50:36 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:50:39 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:50:51 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:50:54 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:51:06 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:51:09 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1
Jun 28 07:51:21 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 28 07:51:24 kelev kernel: eth0: link up, 100Mbps, half-duplex, lpa 0x40A1

I've exchanged the hub and network cables, to no avail.

Results of lspci -vvv:

01:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (8000ns min, 16000ns max)
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at 9000 [size=256]
        Region 1: Memory at a0100000 (32-bit, non-prefetchable) [size=512]
        Capabilities: [60] Vital Product Data


Marcus




* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-06-30  9:23 rtl8139: NETDEV WATCHDOG: eth0: transmit timed out Marcus Better
@ 2006-06-30 21:16 ` Francois Romieu
  2006-07-04 19:38   ` Marcus Better
  2006-07-07 16:32   ` Marcus Better
  0 siblings, 2 replies; 11+ messages in thread
From: Francois Romieu @ 2006-06-30 21:16 UTC (permalink / raw)
  To: Marcus Better; +Cc: netdev

Marcus Better <marcus@better.se> :
> I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
> Realtek 8139: The ethernet stops working, usually after at most a few
> minutes operation. The problem appears in kernel 2.6.16 and 2.6.17, but not
> in 2.6.15.

Broken again :o(

In a better world, you would narrow the suspect with a git bisect [1]
between v2.6.15 and v2.6.16.

As an alternate solution, you may try the patches available at
http://www.fr.zoreil.com/people/francois/misc/8139 against v2.6.15,
starting from 0001 and going to 0006 (of course you can revert the
patches against v2.6.16 from 0006 to 0001 too). I can't claim that
each step will compile though.

[1] http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html
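What bisection does can be sketched as a binary search over the commit
history. This is only an illustration in Python, not how git implements
it: it assumes a linear history (real kernel history has merges), that
the first commit is known good and the last known bad, and that every
commit after the culprit is also bad. The function name first_bad, the
5000-commit history and the culprit position are all made up.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of builds tested).

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is bad as well.
    """
    good, bad = 0, len(commits) - 1  # indices of the known-good / known-bad ends
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one build-and-reboot cycle per probe
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history: 5000 commits, the bug arrives with commit number 3210.
history = [f"c{i}" for i in range(5000)]
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 3210)
print(culprit, tests)
```

Each probe halves the remaining range, so even a history of several
thousand commits needs only about a dozen test boots.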

-- 
Ueimor


* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-06-30 21:16 ` Francois Romieu
@ 2006-07-04 19:38   ` Marcus Better
  2006-07-07 16:32   ` Marcus Better
  1 sibling, 0 replies; 11+ messages in thread
From: Marcus Better @ 2006-07-04 19:38 UTC (permalink / raw)
  To: netdev

Francois Romieu wrote:
> In a better world, you would narrow the suspect with a git bisect [1]
> between v2.6.15 and v2.6.16.

Will try. It may take some time...

Marcus




* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-06-30 21:16 ` Francois Romieu
  2006-07-04 19:38   ` Marcus Better
@ 2006-07-07 16:32   ` Marcus Better
  2006-07-08 16:15     ` Thomas Hellström
  1 sibling, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-07 16:32 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev, Alan Hourihane, dri-devel, Dave Airlie

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

(For those who haven't followed, this is about
  http://permalink.gmane.org/gmane.linux.network/38493
)

Francois Romieu wrote:
> Marcus Better <marcus@better.se> :
>> I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
>> Realtek 8139: The ethernet stops working, usually after at most a few
>> minutes operation.

> In a better world, you would narrow the suspect with a git bisect [1]
> between v2.6.15 and v2.6.16.

I did, and the winner after 13 reboots is...

commit de227f5f32775d86e5c780a7cffdd2e08574f7fb
Author: Dave Airlie <airlied@starflyer.(none)>
Date:   Wed Jan 25 15:31:43 2006 +1100

    drm: i915 patches from Tungsten Graphics

    Fix CMDBUFFER path, add heap destroy and flesh out sarea for rotation
    (Tungsten Graphics)

    From: Alan Hourihane <alanh@tungstengraphics.com>
    Signed-off-by: Dave Airlie <airlied@linux.ie>


I didn't believe it at first either, but blacklisting the i915 module
actually fixes the problem. Now that I know what to look for, I notice
that the network errors always started cropping up after X11 started.

Wonder what's going on here. Why is the graphics driver killing my network?

Marcus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)

iD8DBQFEroy7XjXn6TzcAQkRAp7bAJ9F7HgWg+VsvQ0fwkK3+b4Ne+tASwCg8+m3
8i5BoW+ujUjoX3DLW0QKAPQ=
=MDAc
-----END PGP SIGNATURE-----


* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-07 16:32   ` Marcus Better
@ 2006-07-08 16:15     ` Thomas Hellström
  2006-07-09  7:23       ` Marcus Better
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Hellström @ 2006-07-08 16:15 UTC (permalink / raw)
  To: Marcus Better
  Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane


Hi!

Marcus Better wrote:

> (For those who haven't followed, this is about
>   http://permalink.gmane.org/gmane.linux.network/38493
> )
>
> Francois Romieu wrote:
>> Marcus Better <marcus@better.se> :
>>> I'm seeing this problem on my Acer Travelmate 223X laptop with built-in
>>> Realtek 8139: The ethernet stops working, usually after at most a few
>>> minutes operation.
>>
>> In a better world, you would narrow the suspect with a git bisect [1]
>> between v2.6.15 and v2.6.16.
>
> I did, and the winner after 13 reboots is...
>
> commit de227f5f32775d86e5c780a7cffdd2e08574f7fb
> Author: Dave Airlie <airlied@starflyer.(none)>
> Date:   Wed Jan 25 15:31:43 2006 +1100
>
>     drm: i915 patches from Tungsten Graphics
>
>     Fix CMDBUFFER path, add heap destroy and flesh out sarea for rotation
>     (Tungsten Graphics)
>
>     From: Alan Hourihane <alanh@tungstengraphics.com>
>     Signed-off-by: Dave Airlie <airlied@linux.ie>
>
>
> I didn't believe it at first either, but blacklisting the i915 module
> actually fixes the problem. Now that I know what to look for, I notice
> that the network errors always started cropping up after X11 started.
>
> Wonder what's going on here. Why is the graphics driver killing my network?

I guess you got the wrong commit, and the correct one should be the one 
where Dave adds vblank interrupts. It should be close to the one you listed.

Some i915 chips have buggy interrupts. What happens is that the display 
interrupts are duplicated on the sound interrupt channel. Usually 
there's no interrupt handler there that recognizes them, and after a 
while, the kernel detects too many spurious interrupts and disables the 
sound IRQ line. If the network sits on the same IRQ line, it will be 
disabled as well. If you check your kernel logs, you will probably have 
messages about disabled IRQs.

A workaround is to add the "noirqdebug" (I hope I remember the spelling 
correctly) kernel option at boot time.

/Thomas
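The mechanism described above can be modelled in a few lines. This is a
toy sketch, not kernel code: the class IrqLine, the function simulate
and the variable names are invented for illustration, and the window
size and threshold are the values I recall from kernel/irq/spurious.c
of that era, so treat them as assumptions rather than a reference.

```python
class IrqLine:
    """Toy model of the kernel's per-line spurious-interrupt accounting."""

    WINDOW = 100_000        # interrupts per accounting window (assumed value)
    MAX_UNHANDLED = 99_900  # more unhandled than this disables the line (assumed value)

    def __init__(self, noirqdebug: bool = False) -> None:
        self.noirqdebug = noirqdebug
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled: bool) -> None:
        """Account for one interrupt; `handled` is True if any handler claimed it."""
        if self.noirqdebug or self.disabled:
            return  # accounting switched off, or line already disabled
        if not handled:
            self.unhandled += 1
        self.count += 1
        if self.count < self.WINDOW:
            return
        if self.unhandled > self.MAX_UNHANDLED:
            self.disabled = True  # every device sharing the line goes silent
        self.count = 0
        self.unhandled = 0


def simulate(line: IrqLine, interrupts: int, claimed_every: int) -> bool:
    """Deliver `interrupts` interrupts; one in `claimed_every` is claimed (0 = none)."""
    for i in range(interrupts):
        line.note_interrupt(claimed_every > 0 and i % claimed_every == 0)
    return line.disabled


stuck = simulate(IrqLine(), 100_000, 0)                     # nobody claims anything
busy = simulate(IrqLine(), 100_000, 100)                    # 1 in 100 claimed by a real device
unchecked = simulate(IrqLine(noirqdebug=True), 100_000, 0)  # accounting switched off
print(stuck, busy, unchecked)
```

In this model "noirqdebug" only switches the accounting off. The stray
interrupts still arrive; the line is simply never disabled because of
them.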





* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-08 16:15     ` Thomas Hellström
@ 2006-07-09  7:23       ` Marcus Better
  2006-07-09  7:34         ` Thomas Hellström
  0 siblings, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-09  7:23 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane

Thomas Hellström wrote:
> I guess you got the wrong commit, and the correct one should be the one
> where Dave adds vblank interrupts. It should be close to the one you
> listed.

I thought I double-checked that it was the right commit, but will check
again.

> If the network sits on the same IRQ line, it will be disabled as well.

It appears to be on a different IRQ:
~$ cat /proc/interrupts
           CPU0
  0:     202574          XT-PIC  timer
  1:       2649          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:       1059          XT-PIC  yenta, Intel 82801CA-ICH3 Modem, Intel 82801CA-ICH3
 11:      40776          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0
 12:     100433          XT-PIC  i8042
 14:      63563          XT-PIC  ide0
 15:       6773          XT-PIC  ide1
NMI:          0
ERR:          0

(This is without i915 loaded though.)

> If you check your kernel logs, you will probably have
> messages about disabled IRQs.

Correct. Here's an example:

Jul  7 16:43:39 kelev kernel: irq 11: nobody cared (try booting with the "irqpoll" option)
Jul  7 16:43:39 kelev kernel:  [<c0138b74>] __report_bad_irq+0x24/0x90
Jul  7 16:43:39 kelev kernel:  [<c0138c82>] note_interrupt+0x72/0xc0
Jul  7 16:43:39 kelev kernel:  [<c013862e>] __do_IRQ+0xae/0xc0
Jul  7 16:43:39 kelev kernel:  [<c0104a69>] do_IRQ+0x19/0x30
Jul  7 16:43:39 kelev kernel:  [<c01030c2>] common_interrupt+0x1a/0x20
Jul  7 16:43:39 kelev kernel:  [<c011c4fe>] __do_softirq+0x2e/0xa0
Jul  7 16:43:39 kelev kernel:  [<c011c597>] do_softirq+0x27/0x30
Jul  7 16:43:39 kelev kernel:  [<c0104a6e>] do_IRQ+0x1e/0x30
Jul  7 16:43:39 kelev kernel:  [<c01030c2>] common_interrupt+0x1a/0x20
Jul  7 16:43:39 kelev kernel:  [<c0102ef5>] syscall_call+0x7/0xb
Jul  7 16:43:39 kelev kernel: handlers:
Jul  7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul  7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul  7 16:43:39 kelev kernel: [<e0044160>] (rtl8139_interrupt+0x0/0x1d0 [8139too])
Jul  7 16:43:39 kelev kernel: Disabling IRQ #11

Jul  7 16:44:03 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jul  7 16:44:03 kelev kernel: eth0: Transmit timeout, status 0c 0005 c07f media 90.
Jul  7 16:44:03 kelev kernel: eth0: Tx queue start entry 48  dirty entry 44.
Jul  7 16:44:03 kelev kernel: eth0:  Tx descriptor 0 is 0008a062. (queue head)
Jul  7 16:44:03 kelev kernel: eth0:  Tx descriptor 1 is 0008a062.
Jul  7 16:44:03 kelev kernel: eth0:  Tx descriptor 2 is 0008a062.
Jul  7 16:44:03 kelev kernel: eth0:  Tx descriptor 3 is 0008a062.
Jul  7 16:44:03 kelev kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
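A log like the one above can be scanned mechanically for disabled IRQ
lines and the handlers that were registered on them. The following is
a rough Python sketch: the helper name disabled_irqs is made up, the
regular expressions are fitted to the message layout seen in this
thread only, and the sample is an excerpt of the log above with the
wrapped lines rejoined and the stack trace left out.

```python
import re
from typing import Dict, List, Tuple

LOG = """\
Jul  7 16:43:39 kelev kernel: irq 11: nobody cared (try booting with the "irqpoll" option)
Jul  7 16:43:39 kelev kernel: handlers:
Jul  7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul  7 16:43:39 kelev kernel: [<e01270c0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Jul  7 16:43:39 kelev kernel: [<e0044160>] (rtl8139_interrupt+0x0/0x1d0 [8139too])
Jul  7 16:43:39 kelev kernel: Disabling IRQ #11
Jul  7 16:44:03 kelev kernel: NETDEV WATCHDOG: eth0: transmit timed out
"""

# "(function+0xOFF/0xLEN [module])" as printed in the handlers list
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def disabled_irqs(log: str) -> Dict[int, List[Tuple[str, str]]]:
    """Map each disabled IRQ to the (function, module) handlers listed before it."""
    result: Dict[int, List[Tuple[str, str]]] = {}
    handlers: List[Tuple[str, str]] = []
    in_handlers = False
    for line in log.splitlines():
        if line.endswith("kernel: handlers:"):
            handlers, in_handlers = [], True
            continue
        disabled = DISABLED.search(line)
        if disabled:
            result[int(disabled.group(1))] = handlers
            handlers, in_handlers = [], False
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                handlers.append((handler.group(1), handler.group(2)))
    return result


for irq, handlers in disabled_irqs(LOG).items():
    modules = sorted({module for _, module in handlers})
    print(f"IRQ {irq} disabled; modules with handlers on it: {', '.join(modules)}")
```

Any driver whose module shows up for a disabled line stops receiving
interrupts, which is consistent with the transmit timeout that follows
in the log.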




* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-09  7:23       ` Marcus Better
@ 2006-07-09  7:34         ` Thomas Hellström
  2006-07-09 16:34           ` Marcus Better
  2006-07-10 13:02           ` Marcus Better
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Hellström @ 2006-07-09  7:34 UTC (permalink / raw)
  To: Marcus Better
  Cc: Dave Airlie, netdev, dri-devel, Alan Hourihane, Francois Romieu

Marcus Better wrote:

>It appears to be on a different IRQ:
>~$ cat /proc/interrupts
>           CPU0
>  0:     202574          XT-PIC  timer
>  1:       2649          XT-PIC  i8042
>  2:          0          XT-PIC  cascade
>  8:          0          XT-PIC  rtc
>  9:          0          XT-PIC  acpi
> 10:       1059          XT-PIC  yenta, Intel 82801CA-ICH3 Modem, Intel 82801CA-ICH3
> 11:      40776          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0
> 12:     100433          XT-PIC  i8042
> 14:      63563          XT-PIC  ide0
> 15:       6773          XT-PIC  ide1
>NMI:          0
>ERR:          0
>
>(This is without i915 loaded though.)
>
Strange. I've also seen the i915 sending false interrupts on its own 
line, though.

Does the "noirqdebug" option fix the problem?

/Thomas




* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-09  7:34         ` Thomas Hellström
@ 2006-07-09 16:34           ` Marcus Better
  2006-07-10 13:02           ` Marcus Better
  1 sibling, 0 replies; 11+ messages in thread
From: Marcus Better @ 2006-07-09 16:34 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Hellström wrote:
> Strange. I've also seen the i915 sending false interrupts on its own
> line, though.

Here's the interrupt table with i915 loaded:

~$ cat /proc/interrupts
           CPU0
  0:     401031          XT-PIC  timer
  1:       3681          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:        997          XT-PIC  yenta, Intel 82801CA-ICH3, Intel 82801CA-ICH3 Modem
 11:      93823          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0, i915@pci:0000:00:02.0
 12:      75631          XT-PIC  i8042
 14:      18284          XT-PIC  ide0
 15:      13901          XT-PIC  ide1
NMI:          0
ERR:          0
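Shared lines can be picked out of such a table automatically. The
sketch below is a minimal Python example under stated assumptions: the
helper name shared_irqs is invented, the parser expects the
single-CPU layout shown here (one count column, then the controller
name, then a comma-separated handler list), and the sample is the
table above with wrapped lines rejoined.

```python
from typing import Dict, List

SAMPLE = """\
           CPU0
  0:     401031          XT-PIC  timer
  1:       3681          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:        997          XT-PIC  yenta, Intel 82801CA-ICH3, Intel 82801CA-ICH3 Modem
 11:      93823          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, eth0, i915@pci:0000:00:02.0
 12:      75631          XT-PIC  i8042
 14:      18284          XT-PIC  ide0
 15:      13901          XT-PIC  ide1
NMI:          0
ERR:          0
"""


def shared_irqs(text: str) -> Dict[int, List[str]]:
    """Return {irq: [handler names]} for lines with more than one handler."""
    shared: Dict[int, List[str]] = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # CPU header, NMI and ERR rows
        fields = rest.split(None, 2)  # count, controller, handler list
        if len(fields) < 3:
            continue
        handlers = [name.strip() for name in fields[2].split(",")]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


for irq, names in shared_irqs(SAMPLE).items():
    print(f"IRQ {irq}: {', '.join(names)}")
```

On the sample this reports IRQ 10 and IRQ 11 as shared, with eth0 and
the i915 device both on line 11. On a live system the same function
could be fed the contents of /proc/interrupts, provided the machine
has a single CPU column as assumed.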

> Does the "noirqdebug" option fix the problem?

Yes, it appears to fix it.

Marcus

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)

iD8DBQFEsTArXjXn6TzcAQkRAn/vAKCZUAVd45xQae4FthvNr68x/jTS4QCgyE7N
CzPv0R9okmIjrsGykMXrfPk=
=gU6D
-----END PGP SIGNATURE-----


* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-09  7:34         ` Thomas Hellström
  2006-07-09 16:34           ` Marcus Better
@ 2006-07-10 13:02           ` Marcus Better
  2006-07-10 13:10             ` Thomas Hellström
  1 sibling, 1 reply; 11+ messages in thread
From: Marcus Better @ 2006-07-10 13:02 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Francois Romieu, Dave Airlie, netdev, dri-devel, Alan Hourihane


Thomas Hellström wrote:
> Does the "noirqdebug" option fix the problem?

Yes... but it breaks switching to a text console. I get an interesting
"fluid" effect on the screen (a bright static pattern), and the keyboard
locks up.

Marcus




* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-10 13:02           ` Marcus Better
@ 2006-07-10 13:10             ` Thomas Hellström
  2006-07-10 13:17               ` Marcus Better
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Hellström @ 2006-07-10 13:10 UTC (permalink / raw)
  To: Marcus Better; +Cc: Francois Romieu, netdev, dri-devel

Marcus Better wrote:

> Thomas Hellström wrote:
>> Does the "noirqdebug" option fix the problem?
>
> Yes... but it breaks switching to a text console. I get an interesting
> "fluid" effect on the screen (a bright static pattern), and the keyboard
> locks up.
>
> Marcus

Hi!
Are you _sure_ these are related?
If you don't use the noirqdebug option, do you have the same problem?

/Thomas



* Re: rtl8139: NETDEV WATCHDOG: eth0: transmit timed out
  2006-07-10 13:10             ` Thomas Hellström
@ 2006-07-10 13:17               ` Marcus Better
  0 siblings, 0 replies; 11+ messages in thread
From: Marcus Better @ 2006-07-10 13:17 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: Francois Romieu, netdev, dri-devel


Thomas Hellström wrote:
>>> Does the "noirqdebug" option fix the problem?

>> Yes... but it breaks switching to a text console.

> Are you _sure_ these are related?

Yes. (I tried a few times and it always crashed, whereas without
noirqdebug I've switched mode successfully hundreds of times.)

Without the i915 module both network and console switching work.

Marcus


