netdev.vger.kernel.org archive mirror
* r8169 transmit queue time outs
@ 2010-05-06 14:17 Kyle McMartin
  2010-05-06 20:10 ` Francois Romieu
  0 siblings, 1 reply; 7+ messages in thread
From: Kyle McMartin @ 2010-05-06 14:17 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Hi Francois,

Some of our users have been seeing their r8169 cards just up and stop
transmitting packets pretty quickly after boot with recent kernels.

After trying a few things:
 1- Updating the driver to the latest upstream (and -next).
 2- Downgrading the driver to the last known working release version.

Nothing seems to have helped.

I take the fact that the last known working driver no longer works on a
newer kernel to mean that something has changed in the core net code
that makes the r8169 driver particularly unhappy?

The WARN_ON getting triggered is:
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0xc6/0x12d() (Not tainted)
Hardware name: VX800
NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Modules linked in: sunrpc ip6_tables cpufreq_ondemand acpi_cpufreq dm_multipath uinput snd_hda_codec_via snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer i2c_viapro snd soundcore snd_page_alloc r8169 serio_raw i2c_core mii pata_acpi ata_generic pata_via [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1
Call Trace:
 [<c043db03>] warn_slowpath_common+0x70/0x87
 [<c06ff9d0>] ? dev_watchdog+0xc6/0x12d
 [<c043db58>] warn_slowpath_fmt+0x29/0x2c
 [<c06ff9d0>] dev_watchdog+0xc6/0x12d
 [<c04471bd>] ? mod_timer+0x20/0x27
 [<c0680021>] ? usb_hcd_poll_rh_status+0x126/0x12e
 [<c06ff90a>] ? dev_watchdog+0x0/0x12d
 [<c0446e91>] run_timer_softirq+0x14e/0x1af
 [<c0442daa>] __do_softirq+0xb1/0x157
 [<c0442e86>] do_softirq+0x36/0x41
 [<c0442f78>] irq_exit+0x2e/0x61
 [<c041cf17>] smp_apic_timer_interrupt+0x6d/0x7b
 [<c04099b5>] apic_timer_interrupt+0x31/0x38
 [<c040f34b>] ? mwait_idle+0x67/0x85
 [<c040811f>] cpu_idle+0x96/0xaf
 [<c0765784>] rest_init+0x58/0x5a
 [<c09a78c3>] start_kernel+0x32b/0x330
 [<c09a7081>] i386_start_kernel+0x70/0x77
---[ end trace 8eb83276b856939b ]---  
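
For anyone collecting reports like the one above, here is a minimal sketch of pulling the interface, driver and queue number out of the "NETDEV WATCHDOG" line in a saved kernel log. It is not part of the original report; the helper names (`TxTimeout`, `find_tx_timeouts`) are made up for illustration, and the pattern assumes the message format shown in this trace.

```python
import re
from typing import List, NamedTuple

# Matches the message format seen in the trace above, e.g.
# "NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out".
# Assumption: other kernel versions may word this line differently.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


class TxTimeout(NamedTuple):
    """One watchdog report (hypothetical helper type)."""
    iface: str
    driver: str
    queue: int


def find_tx_timeouts(log_text: str) -> List[TxTimeout]:
    """Return every transmit-timeout report found in log_text, in order."""
    return [
        TxTimeout(m.group("iface"), m.group("driver"), int(m.group("queue")))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out"
    for hit in find_tx_timeouts(sample):
        print(hit.iface, hit.driver, hit.queue)
```

Feeding it the output of dmesg or a bugzilla attachment gives a quick count of how often, and on which interface, the watchdog fired.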

And the RH bugzilla ref is:
https://bugzilla.redhat.com/show_bug.cgi?id=538920

I'd be happy to do whatever is necessary to help figure this out.

regards, Kyle


* Re: r8169 transmit queue time outs
  2010-05-06 14:17 r8169 transmit queue time outs Kyle McMartin
@ 2010-05-06 20:10 ` Francois Romieu
  2010-05-07  4:51   ` Kyle McMartin
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Francois Romieu @ 2010-05-06 20:10 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: netdev

Kyle McMartin <kmcmartin@redhat.com> :
[...]
> Some of our users have been seeing their r8169 cards just up and stop
> transmitting packets pretty quickly after boot with recent kernels.
[...]
> Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1

Can they upgrade to 2.6.32.11-99.fc12.i686 and try an out-of-tree build
of the driver at http://userweb.kernel.org/~romieu/r8169/2.6.32.11-99.fc12/ ?

It should be quite close to the current git kernel.

-- 
Ueimor


* Re: r8169 transmit queue time outs
  2010-05-06 20:10 ` Francois Romieu
@ 2010-05-07  4:51   ` Kyle McMartin
  2010-05-19 13:43   ` Kyle McMartin
  2010-09-09 17:32   ` Kyle McMartin
  2 siblings, 0 replies; 7+ messages in thread
From: Kyle McMartin @ 2010-05-07  4:51 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Kyle McMartin, netdev

On Thu, May 06, 2010 at 10:10:24PM +0200, Francois Romieu wrote:
> Kyle McMartin <kmcmartin@redhat.com> :
> [...]
> > Some of our users have been seeing their r8169 cards just up and stop
> > transmitting packets pretty quickly after boot with recent kernels.
> [...]
> > Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1
> 
> Can they upgrade to 2.6.32.11-99.fc12.i686 and try an out-of-tree build
> of the driver at http://userweb.kernel.org/~romieu/r8169/2.6.32.11-99.fc12/ ?
> 
> It should be quite close to the current git kernel.
> 

Thanks Francois, I've done a build for F-12 and F-13 with that driver
for the users, and updated the bugs with links to the builds.

I'll let you know if it helps things.

Thanks again, Kyle.


* Re: r8169 transmit queue time outs
  2010-05-06 20:10 ` Francois Romieu
  2010-05-07  4:51   ` Kyle McMartin
@ 2010-05-19 13:43   ` Kyle McMartin
  2010-05-19 13:48     ` Kyle McMartin
  2010-05-19 14:18     ` Eric Dumazet
  2010-09-09 17:32   ` Kyle McMartin
  2 siblings, 2 replies; 7+ messages in thread
From: Kyle McMartin @ 2010-05-19 13:43 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Kyle McMartin, netdev, dgilmore

On Thu, May 06, 2010 at 10:10:24PM +0200, Francois Romieu wrote:
> Kyle McMartin <kmcmartin@redhat.com> :
> [...]
> > Some of our users have been seeing their r8169 cards just up and stop
> > transmitting packets pretty quickly after boot with recent kernels.
> [...]
> > Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1
> 
> Can they upgrade to 2.6.32.11-99.fc12.i686 and try an out-of-tree build
> of the driver at http://userweb.kernel.org/~romieu/r8169/2.6.32.11-99.fc12/ ?
> 
> It should be quite close to the current git kernel.
> 

I provided a bunch of testers with a backport of the current git head
r8169 driver, and sadly, they report the TX timeout issues still occur.
:/

Any other ideas?

regards, Kyle


* Re: r8169 transmit queue time outs
  2010-05-19 13:43   ` Kyle McMartin
@ 2010-05-19 13:48     ` Kyle McMartin
  2010-05-19 14:18     ` Eric Dumazet
  1 sibling, 0 replies; 7+ messages in thread
From: Kyle McMartin @ 2010-05-19 13:48 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: Francois Romieu, netdev, dgilmore

On Wed, May 19, 2010 at 09:43:44AM -0400, Kyle McMartin wrote:
> I provided a bunch of testers with a backport of the current git head
> r8169 driver, and sadly, they report the TX timeout issues still occur.
> :/
> 
> Any other ideas?
> 

They note the vendor driver seems to work without these transmit
timeouts... Would it be worthwhile for me to take a look and compare the
tx setup between the git head and the latest vendor driver?

regards, Kyle


* Re: r8169 transmit queue time outs
  2010-05-19 13:43   ` Kyle McMartin
  2010-05-19 13:48     ` Kyle McMartin
@ 2010-05-19 14:18     ` Eric Dumazet
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-05-19 14:18 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: Francois Romieu, netdev, dgilmore

On Wednesday, 19 May 2010 at 09:43 -0400, Kyle McMartin wrote:
> On Thu, May 06, 2010 at 10:10:24PM +0200, Francois Romieu wrote:
> > Kyle McMartin <kmcmartin@redhat.com> :
> > [...]
> > > Some of our users have been seeing their r8169 cards just up and stop
> > > transmitting packets pretty quickly after boot with recent kernels.
> > [...]
> > > Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1
> > 
> > Can they upgrade to 2.6.32.11-99.fc12.i686 and try an out-of-tree build
> > of the driver at http://userweb.kernel.org/~romieu/r8169/2.6.32.11-99.fc12/ ?
> > 
> > It should be quite close to the current git kernel.
> > 
> 
> I provided a bunch of testers with a backport of the current git head
> r8169 driver, and sadly, they report the TX timeout issues still occur.
> :/
> 
> Any other ideas?

Scratch the NIC ?

Normally not related, but I once mentioned the following patch, which
could be tried. (Do not reset the NIC if we receive too many frames in a row.)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 217e709..c4dbb15 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4520,10 +4520,8 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 				dev->stats.rx_length_errors++;
 			if (status & RxCRC)
 				dev->stats.rx_crc_errors++;
-			if (status & RxFOVF) {
-				rtl8169_schedule_work(dev, rtl8169_reset_task);
+			if (status & RxFOVF)
 				dev->stats.rx_fifo_errors++;
-			}
 			rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
 		} else {
 			struct sk_buff *skb = tp->Rx_skbuff[entry];




* Re: r8169 transmit queue time outs
  2010-05-06 20:10 ` Francois Romieu
  2010-05-07  4:51   ` Kyle McMartin
  2010-05-19 13:43   ` Kyle McMartin
@ 2010-09-09 17:32   ` Kyle McMartin
  2 siblings, 0 replies; 7+ messages in thread
From: Kyle McMartin @ 2010-09-09 17:32 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Kyle McMartin, netdev

On Thu, May 06, 2010 at 10:10:24PM +0200, Francois Romieu wrote:
> Kyle McMartin <kmcmartin@redhat.com> :
> [...]
> > Some of our users have been seeing their r8169 cards just up and stop
> > transmitting packets pretty quickly after boot with recent kernels.
> [...]
> > Pid: 0, comm: swapper Not tainted 2.6.31.5-127.fc12.i686.PAE #1
> 
> Can they upgrade to 2.6.32.11-99.fc12.i686 and try an out-of-tree build
> of the driver at http://userweb.kernel.org/~romieu/r8169/2.6.32.11-99.fc12/ ?
> 
> It should be quite close to the current git kernel.

Just fyi, we finally figured out what this problem was, or at least, a
workaround for it. Turning off PCI Express ASPM seemed to work around
the problem.

regards, Kyle
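
The message above does not say how ASPM was turned off. One common way is the `pcie_aspm=off` kernel boot parameter. As a rough sketch, assuming a kernel built with PCIe ASPM support that exposes /sys/module/pcie_aspm/parameters/policy, the active policy can be read as below. The function names are made up for illustration, and the file may be absent on some systems, in which case the reader returns None.

```python
from pathlib import Path
from typing import Optional

# Assumption: this sysfs file exists only when the kernel has ASPM support.
ASPM_POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(policy_text: str) -> Optional[str]:
    """Return the bracketed entry of a policy string.

    The file lists all policies and marks the active one with brackets,
    for example "[default] performance powersave".
    """
    for token in policy_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_aspm_policy(path: Path = ASPM_POLICY_PATH) -> Optional[str]:
    """Read the active policy from sysfs, or None if it cannot be read."""
    try:
        return active_aspm_policy(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("active ASPM policy:", read_aspm_policy())
```

Recording this value next to each timeout report makes it easier to tell whether the workaround was actually in effect on a given tester's machine.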
