* atl1: WARNING at net/sched/sch_generic.c:221
@ 2008-08-21 11:58 adobriyan
2008-08-21 11:59 ` David Miller
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: adobriyan @ 2008-08-21 11:58 UTC (permalink / raw)
To: jcliburn, csnook; +Cc: netdev
This message happens more or less every reboot, sometimes cable unplug/plug is needed
to restore connectivity, otherwise card is working fine.
[ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 26.570011] NET: Registered protocol family 10
[ 37.551934] eth0: no IPv6 routers present
[rebooted box which is directly connected to a box with atl1]
[ 2078.740004] atl1 0000:03:00.0: eth0 link is down
[ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
[ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
[ 2086.050004] ------------[ cut here ]------------
[ 2086.050004] WARNING: at net/sched/sch_generic.c:221 dev_watchdog+0x205/0x220()
[ 2086.050004] Modules linked in: ipv6 ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables
[ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
[ 2086.052503]
[ 2086.052503] Call Trace:
[ 2086.052503] <IRQ> [<ffffffff8023455f>] warn_on_slowpath+0x5f/0x80
[ 2086.052503] [<ffffffff80259c00>] ? trace_hardirqs_on_caller+0x130/0x160
[ 2086.052503] [<ffffffff80259b61>] ? trace_hardirqs_on_caller+0x91/0x160
[ 2086.052503] [<ffffffff80259c3d>] ? trace_hardirqs_on+0xd/0x10
[ 2086.052503] [<ffffffff80431ab5>] ? _spin_unlock_irqrestore+0x75/0x80
[ 2086.052503] [<ffffffff8024676c>] ? __queue_work+0x3c/0x50
[ 2086.052503] [<ffffffff80246814>] ? queue_work_on+0x44/0x60
[ 2086.052503] [<ffffffff80246993>] ? queue_work+0x53/0x60
[ 2086.052503] [<ffffffff803d4c35>] dev_watchdog+0x205/0x220
[ 2086.052503] [<ffffffff80259c3d>] ? trace_hardirqs_on+0xd/0x10
[ 2086.052503] [<ffffffff80259b61>] ? trace_hardirqs_on_caller+0x91/0x160
[ 2086.052503] [<ffffffff803d4a30>] ? dev_watchdog+0x0/0x220
[ 2086.052503] [<ffffffff8023eacc>] run_timer_softirq+0x18c/0x200
[ 2086.052503] [<ffffffff80239e67>] __do_softirq+0x67/0xe0
[ 2086.052503] [<ffffffff8020cc4c>] call_softirq+0x1c/0x30
[ 2086.052503] [<ffffffff8020f2b5>] do_softirq+0x65/0xa0
[ 2086.052503] [<ffffffff80239cf9>] irq_exit+0x99/0xb0
[ 2086.052503] [<ffffffff8021d187>] smp_apic_timer_interrupt+0x97/0xf0
[ 2086.052503] [<ffffffff8020c69b>] apic_timer_interrupt+0x6b/0x70
[ 2086.052503] <EOI> [<ffffffff802136ec>] ? mwait_idle+0x4c/0x60
[ 2086.052503] [<ffffffff802136e3>] ? mwait_idle+0x43/0x60
[ 2086.052503] [<ffffffff8020a276>] ? cpu_idle+0x46/0x90
[ 2086.052503] [<ffffffff804266d0>] ? rest_init+0x70/0x80
[ 2086.052503]
[ 2086.052503] ---[ end trace 4ccf372e8f6b84c3 ]---
[ 2086.070003] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
[ 2090.390003] atl1 0000:03:00.0: eth0 link is down
[ 2092.550003] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 11:58 atl1: WARNING at net/sched/sch_generic.c:221 adobriyan
@ 2008-08-21 11:59 ` David Miller
2008-08-21 12:04 ` adobriyan
2008-08-22 2:00 ` Jay Cliburn
2008-09-14 23:17 ` Jay Cliburn
2 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2008-08-21 11:59 UTC (permalink / raw)
To: adobriyan; +Cc: jcliburn, csnook, netdev
From: adobriyan@gmail.com
Date: Thu, 21 Aug 2008 15:58:49 +0400
> This message happens more or less every reboot, sometimes cable unplug/plug is needed
> to restore connectivity, otherwise card is working fine.
What kernel version.... no, I can figure it out via osmosis never mind!
:-)
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 11:59 ` David Miller
@ 2008-08-21 12:04 ` adobriyan
2008-08-21 12:08 ` David Miller
0 siblings, 1 reply; 15+ messages in thread
From: adobriyan @ 2008-08-21 12:04 UTC (permalink / raw)
To: David Miller; +Cc: jcliburn, csnook, netdev
On Thu, Aug 21, 2008 at 04:59:10AM -0700, David Miller wrote:
> From: adobriyan@gmail.com
> Date: Thu, 21 Aug 2008 15:58:49 +0400
>
> > This message happens more or less every reboot, sometimes cable unplug/plug is needed
> > to restore connectivity, otherwise card is working fine.
>
> What kernel version.... no, I can figure it out via osmosis never mind!
> :-)
WARN already prints kernel version. ;-)
[ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
^^^^^^^^^^^^^^^^^^^
And before you ask, "-netns-nf" part doesn't matter, it happens with
strictly mainline kernels too and started around multiqueue TX changes,
IIRC.
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 12:04 ` adobriyan
@ 2008-08-21 12:08 ` David Miller
2008-09-14 19:26 ` Jay Cliburn
0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2008-08-21 12:08 UTC (permalink / raw)
To: adobriyan; +Cc: jcliburn, csnook, netdev
From: adobriyan@gmail.com
Date: Thu, 21 Aug 2008 16:04:09 +0400
> On Thu, Aug 21, 2008 at 04:59:10AM -0700, David Miller wrote:
> > From: adobriyan@gmail.com
> > Date: Thu, 21 Aug 2008 15:58:49 +0400
> >
> > > This message happens more or less every reboot, sometimes cable unplug/plug is needed
> > > to restore connectivity, otherwise card is working fine.
> >
> > What kernel version.... no, I can figure it out via osmosis never mind!
> > :-)
>
> WARN already prints kernel version. ;-)
>
> [ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
> ^^^^^^^^^^^^^^^^^^^
>
> And before you ask, "-netns-nf" part doesn't matter, it happens with
> strictly mainline kernels too and started around multiqueue TX changes,
> IIRC.
It's a simple transmit timeout error.
Perhaps atl1 doesn't call netif_carrier_off() in all the places that it
should.
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 11:58 atl1: WARNING at net/sched/sch_generic.c:221 adobriyan
2008-08-21 11:59 ` David Miller
@ 2008-08-22 2:00 ` Jay Cliburn
2008-08-22 21:50 ` Jay Cliburn
2008-09-14 23:17 ` Jay Cliburn
2 siblings, 1 reply; 15+ messages in thread
From: Jay Cliburn @ 2008-08-22 2:00 UTC (permalink / raw)
To: adobriyan; +Cc: csnook, netdev
Hi Alexey,
On Thu, 21 Aug 2008 15:58:49 +0400
adobriyan@gmail.com wrote:
> This message happens more or less every reboot, sometimes cable
> unplug/plug is needed to restore connectivity, otherwise card is
> working fine.
>
>
> [ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> [ 26.570011] NET: Registered protocol family 10
> [ 37.551934] eth0: no IPv6 routers present
>
> [rebooted box which is directly connected to a box with atl1]
>
> [ 2078.740004] atl1 0000:03:00.0: eth0 link is down
> [ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> [ 2086.050004] ------------[ cut here ]------------
> [ 2086.050004] WARNING: at net/sched/sch_generic.c:221 dev_watchdog+0x205/0x220()
> [ 2086.050004] Modules linked in: ipv6 ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables
> [ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
> [ 2086.052503]
> [ 2086.052503] Call Trace:
> [ 2086.052503] <IRQ> [<ffffffff8023455f>] warn_on_slowpath+0x5f/0x80
> [ 2086.052503] [<ffffffff80259c00>] ? trace_hardirqs_on_caller+0x130/0x160
> [ 2086.052503] [<ffffffff80259b61>] ? trace_hardirqs_on_caller+0x91/0x160
> [ 2086.052503] [<ffffffff80259c3d>] ? trace_hardirqs_on+0xd/0x10
> [ 2086.052503] [<ffffffff80431ab5>] ? _spin_unlock_irqrestore+0x75/0x80
> [ 2086.052503] [<ffffffff8024676c>] ? __queue_work+0x3c/0x50
> [ 2086.052503] [<ffffffff80246814>] ? queue_work_on+0x44/0x60
> [ 2086.052503] [<ffffffff80246993>] ? queue_work+0x53/0x60
> [ 2086.052503] [<ffffffff803d4c35>] dev_watchdog+0x205/0x220
> [ 2086.052503] [<ffffffff80259c3d>] ? trace_hardirqs_on+0xd/0x10
> [ 2086.052503] [<ffffffff80259b61>] ? trace_hardirqs_on_caller+0x91/0x160
> [ 2086.052503] [<ffffffff803d4a30>] ? dev_watchdog+0x0/0x220
> [ 2086.052503] [<ffffffff8023eacc>] run_timer_softirq+0x18c/0x200
> [ 2086.052503] [<ffffffff80239e67>] __do_softirq+0x67/0xe0
> [ 2086.052503] [<ffffffff8020cc4c>] call_softirq+0x1c/0x30
> [ 2086.052503] [<ffffffff8020f2b5>] do_softirq+0x65/0xa0
> [ 2086.052503] [<ffffffff80239cf9>] irq_exit+0x99/0xb0
> [ 2086.052503] [<ffffffff8021d187>] smp_apic_timer_interrupt+0x97/0xf0
> [ 2086.052503] [<ffffffff8020c69b>] apic_timer_interrupt+0x6b/0x70
> [ 2086.052503] <EOI> [<ffffffff802136ec>] ? mwait_idle+0x4c/0x60
> [ 2086.052503] [<ffffffff802136e3>] ? mwait_idle+0x43/0x60
> [ 2086.052503] [<ffffffff8020a276>] ? cpu_idle+0x46/0x90
> [ 2086.052503] [<ffffffff804266d0>] ? rest_init+0x70/0x80
> [ 2086.052503]
> [ 2086.052503] ---[ end trace 4ccf372e8f6b84c3 ]---
> [ 2086.070003] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> [ 2090.390003] atl1 0000:03:00.0: eth0 link is down
> [ 2092.550003] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
Does this patch fix it?
diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
index e23ce77..4816c6d 100644
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -1307,7 +1307,6 @@ static u32 atl1_check_link(struct atl1_adapter *adapter)
if (netif_msg_link(adapter))
dev_info(&adapter->pdev->dev, "link is down\n");
adapter->link_speed = SPEED_0;
- netif_carrier_off(netdev);
}
return 0;
}
@@ -1364,8 +1363,6 @@ static u32 atl1_check_link(struct atl1_adapter *adapter)
/* change original link status */
if (netif_carrier_ok(netdev)) {
adapter->link_speed = SPEED_0;
- netif_carrier_off(netdev);
- netif_stop_queue(netdev);
}
if (hw->media_type != MEDIA_TYPE_AUTO_SENSOR &&
@@ -2654,8 +2651,6 @@ static void atl1_down(struct atl1_adapter *adapter)
adapter->link_speed = SPEED_0;
adapter->link_duplex = -1;
- netif_carrier_off(netdev);
- netif_stop_queue(netdev);
atl1_clean_tx_ring(adapter);
atl1_clean_rx_ring(adapter);
@@ -3063,8 +3058,6 @@ static int __devinit atl1_probe(struct pci_dev *pdev,
atl1_pcie_patch(adapter);
/* assume we have no link for now */
- netif_carrier_off(netdev);
- netif_stop_queue(netdev);
init_timer(&adapter->watchdog_timer);
adapter->watchdog_timer.function = &atl1_watchdog;
--
1.5.5.1
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-22 2:00 ` Jay Cliburn
@ 2008-08-22 21:50 ` Jay Cliburn
0 siblings, 0 replies; 15+ messages in thread
From: Jay Cliburn @ 2008-08-22 21:50 UTC (permalink / raw)
To: adobriyan; +Cc: csnook, netdev
On Thu, 21 Aug 2008 21:00:07 -0500
Jay Cliburn <jcliburn@gmail.com> wrote:
>
> Does this patch fix it?
>
> diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
> index e23ce77..4816c6d 100644
> --- a/drivers/net/atlx/atl1.c
> +++ b/drivers/net/atlx/atl1.c
> @@ -1307,7 +1307,6 @@ static u32 atl1_check_link(struct atl1_adapter *adapter)
> if (netif_msg_link(adapter))
> dev_info(&adapter->pdev->dev, "link is down\n");
> adapter->link_speed = SPEED_0;
> - netif_carrier_off(netdev);
> }
> return 0;
> }
> @@ -1364,8 +1363,6 @@ static u32 atl1_check_link(struct atl1_adapter *adapter)
> /* change original link status */
> if (netif_carrier_ok(netdev)) {
> adapter->link_speed = SPEED_0;
> - netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
> }
>
> if (hw->media_type != MEDIA_TYPE_AUTO_SENSOR &&
> @@ -2654,8 +2651,6 @@ static void atl1_down(struct atl1_adapter *adapter)
> adapter->link_speed = SPEED_0;
> adapter->link_duplex = -1;
> - netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
>
> atl1_clean_tx_ring(adapter);
> atl1_clean_rx_ring(adapter);
> @@ -3063,8 +3058,6 @@ static int __devinit atl1_probe(struct pci_dev *pdev,
> atl1_pcie_patch(adapter);
> /* assume we have no link for now */
> - netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
>
> init_timer(&adapter->watchdog_timer);
> adapter->watchdog_timer.function = &atl1_watchdog;
Alexey,
Please ignore this patch. It is unsavory fruit borne of abject
ignorance.
Jay
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 12:08 ` David Miller
@ 2008-09-14 19:26 ` Jay Cliburn
2008-09-14 19:58 ` Stephen Hemminger
2008-09-14 23:56 ` David Miller
0 siblings, 2 replies; 15+ messages in thread
From: Jay Cliburn @ 2008-09-14 19:26 UTC (permalink / raw)
To: David Miller; +Cc: adobriyan, csnook, netdev
On Thu, 21 Aug 2008 05:08:57 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: adobriyan@gmail.com
> Date: Thu, 21 Aug 2008 16:04:09 +0400
>
> > On Thu, Aug 21, 2008 at 04:59:10AM -0700, David Miller wrote:
> > > From: adobriyan@gmail.com
> > > Date: Thu, 21 Aug 2008 15:58:49 +0400
> > >
> > > > This message happens more or less every reboot, sometimes cable
> > > > unplug/plug is needed to restore connectivity, otherwise card
> > > > is working fine.
> > >
> > > What kernel version.... no, I can figure it out via osmosis never
> > > mind! :-)
> >
> > WARN already prints kernel version. ;-)
> >
> > [ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
> > ^^^^^^^^^^^^^^^^^^^
> >
> > And before you ask, "-netns-nf" part doesn't matter, it happens with
> > strictly mainline kernels too and started around multiqueue TX
> > changes, IIRC.
>
> It's a simple transmit timeout error.
>
> Perhaps atl1 doesn't call netif_carrier_off() in all the places that
> it should.
For reference, the original report from Alexey on this matter is here:
http://marc.info/?l=linux-netdev&m=121931988219314&w=2
To which Dave responded above, "It's a simple transmit timeout error."
Should a netdev driver be coded such that a watchdog transmit timeout
never occurs?
[ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
Or is a watchdog timeout an expected occurrence if a cable is
unplugged/plugged?
Thanks,
Jay
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-14 19:26 ` Jay Cliburn
@ 2008-09-14 19:58 ` Stephen Hemminger
2008-09-14 23:56 ` David Miller
1 sibling, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2008-09-14 19:58 UTC (permalink / raw)
To: Jay Cliburn; +Cc: David Miller, adobriyan, csnook, netdev
On Sun, 14 Sep 2008 14:26:54 -0500
Jay Cliburn <jacliburn@bellsouth.net> wrote:
> On Thu, 21 Aug 2008 05:08:57 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
> > From: adobriyan@gmail.com
> > Date: Thu, 21 Aug 2008 16:04:09 +0400
> >
> > > On Thu, Aug 21, 2008 at 04:59:10AM -0700, David Miller wrote:
> > > > From: adobriyan@gmail.com
> > > > Date: Thu, 21 Aug 2008 15:58:49 +0400
> > > >
> > > > > This message happens more or less every reboot, sometimes cable
> > > > > unplug/plug is needed to restore connectivity, otherwise card
> > > > > is working fine.
> > > >
> > > > What kernel version.... no, I can figure it out via osmosis never
> > > > mind! :-)
> > >
> > > WARN already prints kernel version. ;-)
> > >
> > > [ 2086.052503] Pid: 0, comm: swapper Not tainted 2.6.27-rc4-netns-nf #4
> > > ^^^^^^^^^^^^^^^^^^^
> > >
> > > And before you ask, "-netns-nf" part doesn't matter, it happens with
> > > strictly mainline kernels too and started around multiqueue TX
> > > changes, IIRC.
> >
> > It's a simple transmit timeout error.
> >
> > Perhaps atl1 doesn't call netif_carrier_off() in all the places that
> > it should.
>
> For reference, the original report from Alexey on this matter is here:
>
> http://marc.info/?l=linux-netdev&m=121931988219314&w=2
>
> To which Dave responded above, "It's a simple transmit timeout error."
>
> Should a netdev driver be coded such that a watchdog transmit timeout
> never occurs?
>
> [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
>
> Or is a watchdog timeout an expected occurrence if a cable is
> unplugged/plugged?
Any transmit timeout is a driver or hardware bug. The driver should be
shutting down transmit correctly on cable pull.
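One way to picture the link handling described above is the user-space sketch below. It is only an illustration under stated assumptions: `struct fake_netdev` and the `fake_*` helpers are hypothetical stand-ins for the real `netif_carrier_off()`, `netif_carrier_on()` and queue helpers, and nothing here is claimed to be what atl1 actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of net_device state that matter
 * here; this is not a kernel structure. */
struct fake_netdev {
	bool carrier_ok;    /* models what netif_carrier_ok() would report */
	bool queue_stopped; /* models what netif_queue_stopped() would report */
};

/* Link lost: drop carrier so the stack stops expecting TX progress. */
static void fake_link_down(struct fake_netdev *dev)
{
	assert(dev != NULL);
	dev->carrier_ok = false;
}

/* Link restored: raise carrier and make sure the TX queue is not left
 * stopped from before the link went away. */
static void fake_link_up(struct fake_netdev *dev)
{
	assert(dev != NULL);
	dev->carrier_ok = true;
	dev->queue_stopped = false;
}

/* The combination a TX watchdog would eventually complain about:
 * carrier present, but the queue stuck in the stopped state. */
static bool fake_tx_stuck(const struct fake_netdev *dev)
{
	assert(dev != NULL);
	return dev->carrier_ok && dev->queue_stopped;
}
```

Calling `fake_link_down()` and then `fake_link_up()` on a device leaves `fake_tx_stuck()` false. Setting `queue_stopped` while carrier is on is the state that, in this model, would lead to a timeout.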
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-08-21 11:58 atl1: WARNING at net/sched/sch_generic.c:221 adobriyan
2008-08-21 11:59 ` David Miller
2008-08-22 2:00 ` Jay Cliburn
@ 2008-09-14 23:17 ` Jay Cliburn
2008-09-15 22:45 ` Alexey Dobriyan
2 siblings, 1 reply; 15+ messages in thread
From: Jay Cliburn @ 2008-09-14 23:17 UTC (permalink / raw)
To: adobriyan; +Cc: csnook, netdev
On Thu, 21 Aug 2008 15:58:49 +0400
adobriyan@gmail.com wrote:
> This message happens more or less every reboot, sometimes cable
> unplug/plug is needed to restore connectivity, otherwise card is
> working fine.
>
>
> [ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> [ 26.570011] NET: Registered protocol family 10
> [ 37.551934] eth0: no IPv6 routers present
>
> [rebooted box which is directly connected to a box with atl1]
>
> [ 2078.740004] atl1 0000:03:00.0: eth0 link is down
> [ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> [ 2086.050004] ------------[ cut here ]------------
> [ 2086.050004] WARNING: at net/sched/sch_generic.c:221
[...]
Alexey,
Can you please try this patch?
diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
index e23ce77..e00a986 100644
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -2642,6 +2642,7 @@ static void atl1_down(struct atl1_adapter *adapter)
{
struct net_device *netdev = adapter->netdev;
+ netif_stop_queue(netdev);
del_timer_sync(&adapter->watchdog_timer);
del_timer_sync(&adapter->phy_config_timer);
adapter->phy_timer_pending = false;
@@ -2655,7 +2656,6 @@ static void atl1_down(struct atl1_adapter *adapter)
adapter->link_speed = SPEED_0;
adapter->link_duplex = -1;
netif_carrier_off(netdev);
- netif_stop_queue(netdev);
atl1_clean_tx_ring(adapter);
atl1_clean_rx_ring(adapter);
@@ -2724,6 +2724,8 @@ static int atl1_open(struct net_device *netdev)
struct atl1_adapter *adapter = netdev_priv(netdev);
int err;
+ netif_carrier_off(netdev);
+
/* allocate transmit descriptors */
err = atl1_setup_ring_resources(adapter);
if (err)
diff --git a/drivers/net/atlx/atlx.c b/drivers/net/atlx/atlx.c
index b3e7fcf..3cc9d10 100644
--- a/drivers/net/atlx/atlx.c
+++ b/drivers/net/atlx/atlx.c
@@ -105,7 +105,6 @@ static void atlx_check_for_link(struct atlx_adapter *adapter)
netdev->name);
adapter->link_speed = SPEED_0;
netif_carrier_off(netdev);
- netif_stop_queue(netdev);
}
}
schedule_work(&adapter->link_chg_task);
--
1.5.5.1
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-14 19:26 ` Jay Cliburn
2008-09-14 19:58 ` Stephen Hemminger
@ 2008-09-14 23:56 ` David Miller
2008-09-15 0:11 ` Jay Cliburn
2008-09-15 3:14 ` Jeff Garzik
1 sibling, 2 replies; 15+ messages in thread
From: David Miller @ 2008-09-14 23:56 UTC (permalink / raw)
To: jacliburn; +Cc: adobriyan, csnook, netdev
From: Jay Cliburn <jacliburn@bellsouth.net>
Date: Sun, 14 Sep 2008 14:26:54 -0500
> Should a netdev driver be coded such that a watchdog transmit timeout
> never occurs?
>
> [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
>
> Or is a watchdog timeout an expected occurrence if a cable is
> unplugged/plugged?
If the cable is unplugged, netif_carrier_off() will be (or should
be) invoked by the driver, and that cancels the watchdog timer.
This is all from memory since I'm travelling and don't have the
time to check this directly. You should investigate this yourself
instead of asking me if you want a truly definitive answer :)
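For illustration, the interaction between carrier state and the transmit watchdog described above can be modelled in user space. This is a sketch from memory of the kind of conditions `dev_watchdog()` checks, not the kernel code itself. `struct wd_model` and `wd_would_time_out()` are hypothetical names, and the model shows only the effect of carrier being off, not the mechanism by which the real timer is cancelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; field names are illustrative and do
 * not mirror the kernel's struct net_device exactly. */
struct wd_model {
	bool present;                 /* device is present */
	bool running;                 /* interface is up */
	bool carrier_ok;              /* false once the driver reports carrier off */
	bool queue_stopped;           /* TX queue stopped by the driver */
	unsigned long trans_start;    /* tick of the last transmit */
	unsigned long watchdog_timeo; /* ticks allowed without TX progress */
};

/* Returns true when the modelled watchdog would report a transmit
 * timeout at tick 'now'. With carrier off the check never fires. */
static bool wd_would_time_out(const struct wd_model *dev, unsigned long now)
{
	assert(dev != NULL);
	if (!dev->present || !dev->running || !dev->carrier_ok)
		return false;
	/* Unsigned subtraction keeps the comparison valid across wraparound. */
	return dev->queue_stopped &&
	       (now - dev->trans_start) > dev->watchdog_timeo;
}
```

In this model a device with carrier up, a stopped queue and a stale `trans_start` times out, while the same device with `carrier_ok` cleared does not, however much time has passed.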
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-14 23:56 ` David Miller
@ 2008-09-15 0:11 ` Jay Cliburn
2008-09-15 3:14 ` Jeff Garzik
1 sibling, 0 replies; 15+ messages in thread
From: Jay Cliburn @ 2008-09-15 0:11 UTC (permalink / raw)
To: David Miller; +Cc: adobriyan, csnook, netdev
On Sun, 14 Sep 2008 16:56:55 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: Jay Cliburn <jacliburn@bellsouth.net>
> Date: Sun, 14 Sep 2008 14:26:54 -0500
>
> > Should a netdev driver be coded such that a watchdog transmit
> > timeout never occurs?
> >
> > [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> >
> > Or is a watchdog timeout an expected occurrence if a cable is
> > unplugged/plugged?
>
> If the cable is unplugged, netif_carrier_off() will be (or should
> be) invoked by the driver, and that cancels the watchdog timer.
Thanks, I think I found the problem. It seems to fix it for me, but
I'd like Alexey to test the patch before I submit it to mainline. He
seems to be able to hit the issue more frequently than I can.
>
> This is all from memory since I'm travelling and don't have the
> time to check this directly. You should investigate this yourself
> instead of asking me if you want a truly definitive answer :)
I've been investigating it myself in my spare time since Alexey pointed
out the problem August 21.
Happy travels.
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-14 23:56 ` David Miller
2008-09-15 0:11 ` Jay Cliburn
@ 2008-09-15 3:14 ` Jeff Garzik
1 sibling, 0 replies; 15+ messages in thread
From: Jeff Garzik @ 2008-09-15 3:14 UTC (permalink / raw)
To: David Miller; +Cc: jacliburn, adobriyan, csnook, netdev
On Sun, Sep 14, 2008 at 04:56:55PM -0700, David Miller wrote:
> From: Jay Cliburn <jacliburn@bellsouth.net>
> Date: Sun, 14 Sep 2008 14:26:54 -0500
>
> > Should a netdev driver be coded such that a watchdog transmit timeout
> > never occurs?
> >
> > [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> >
> > Or is a watchdog timeout an expected occurrence if a cable is
> > unplugged/plugged?
>
> If the cable is unplugged, netif_carrier_off() will be (or should
> be) invoked by the driver, and that cancels the watchdog timer.
100% correct :)
As Stephen noted, a transmit timeout is an unhandled condition plain and
simple -- either unhandled by the driver or unhandled by the hardware.
Jeff
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-14 23:17 ` Jay Cliburn
@ 2008-09-15 22:45 ` Alexey Dobriyan
2008-09-16 1:44 ` Jay Cliburn
0 siblings, 1 reply; 15+ messages in thread
From: Alexey Dobriyan @ 2008-09-15 22:45 UTC (permalink / raw)
To: Jay Cliburn; +Cc: csnook, netdev
On Sun, Sep 14, 2008 at 06:17:14PM -0500, Jay Cliburn wrote:
> On Thu, 21 Aug 2008 15:58:49 +0400
> adobriyan@gmail.com wrote:
>
> > This message happens more or less every reboot, sometimes cable
> > unplug/plug is needed to restore connectivity, otherwise card is
> > working fine.
> >
> >
> > [ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> > [ 26.570011] NET: Registered protocol family 10
> > [ 37.551934] eth0: no IPv6 routers present
> >
> > [rebooted box which is directly connected to a box with atl1]
> >
> > [ 2078.740004] atl1 0000:03:00.0: eth0 link is down
> > [ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> > [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> > [ 2086.050004] ------------[ cut here ]------------
> > [ 2086.050004] WARNING: at net/sched/sch_generic.c:221
> [...]
>
> Alexey,
>
> Can you please try this patch?
Seems to help (no more messages), still sometimes no networking until after
cable replug. Stay tuned.
> --- a/drivers/net/atlx/atl1.c
> +++ b/drivers/net/atlx/atl1.c
> @@ -2642,6 +2642,7 @@ static void atl1_down(struct atl1_adapter *adapter)
> {
> struct net_device *netdev = adapter->netdev;
>
> + netif_stop_queue(netdev);
> del_timer_sync(&adapter->watchdog_timer);
> del_timer_sync(&adapter->phy_config_timer);
> adapter->phy_timer_pending = false;
> @@ -2655,7 +2656,6 @@ static void atl1_down(struct atl1_adapter *adapter)
> adapter->link_speed = SPEED_0;
> adapter->link_duplex = -1;
> netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
>
> atl1_clean_tx_ring(adapter);
> atl1_clean_rx_ring(adapter);
> @@ -2724,6 +2724,8 @@ static int atl1_open(struct net_device *netdev)
> struct atl1_adapter *adapter = netdev_priv(netdev);
> int err;
>
> + netif_carrier_off(netdev);
> +
> /* allocate transmit descriptors */
> err = atl1_setup_ring_resources(adapter);
> if (err)
> diff --git a/drivers/net/atlx/atlx.c b/drivers/net/atlx/atlx.c
> index b3e7fcf..3cc9d10 100644
> --- a/drivers/net/atlx/atlx.c
> +++ b/drivers/net/atlx/atlx.c
> @@ -105,7 +105,6 @@ static void atlx_check_for_link(struct atlx_adapter *adapter)
> netdev->name);
> adapter->link_speed = SPEED_0;
> netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
> }
> }
> schedule_work(&adapter->link_chg_task);
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-15 22:45 ` Alexey Dobriyan
@ 2008-09-16 1:44 ` Jay Cliburn
2008-09-17 7:44 ` Alexey Dobriyan
0 siblings, 1 reply; 15+ messages in thread
From: Jay Cliburn @ 2008-09-16 1:44 UTC (permalink / raw)
To: Alexey Dobriyan; +Cc: csnook, netdev
On Tue, 16 Sep 2008 02:45:22 +0400
Alexey Dobriyan <adobriyan@gmail.com> wrote:
> On Sun, Sep 14, 2008 at 06:17:14PM -0500, Jay Cliburn wrote:
> > On Thu, 21 Aug 2008 15:58:49 +0400
> > adobriyan@gmail.com wrote:
> >
> > > This message happens more or less every reboot, sometimes cable
> > > unplug/plug is needed to restore connectivity, otherwise card is
> > > working fine.
> > >
> > >
> > > [ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> > > [ 26.570011] NET: Registered protocol family 10
> > > [ 37.551934] eth0: no IPv6 routers present
> > >
> > > > [rebooted box which is directly connected to a box with atl1]
> > > >
> > > > [ 2078.740004] atl1 0000:03:00.0: eth0 link is down
> > > > [ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> > > > [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> > > > [ 2086.050004] ------------[ cut here ]------------
> > > [ 2086.050004] WARNING: at net/sched/sch_generic.c:221
> > [...]
> >
> > Alexey,
> >
> > Can you please try this patch?
>
> Seems to help (no more messages), still sometimes no networking until
> after cable replug. Stay tuned.
Does this "stay tuned" mean, "Jay, hold off while I (Alexey) troubleshoot
this further"? Or does it mean, "Jay, you've not completely solved the
problem yet and you need to continue working on it"? If it's the
latter, can you please describe your setup and how you reproduce the
problem? Is the box you have connected directly to the atl1 box used
as a netconsole?
Thanks for your help.
Jay
* Re: atl1: WARNING at net/sched/sch_generic.c:221
2008-09-16 1:44 ` Jay Cliburn
@ 2008-09-17 7:44 ` Alexey Dobriyan
0 siblings, 0 replies; 15+ messages in thread
From: Alexey Dobriyan @ 2008-09-17 7:44 UTC (permalink / raw)
To: Jay Cliburn; +Cc: csnook, netdev
On Mon, Sep 15, 2008 at 08:44:31PM -0500, Jay Cliburn wrote:
> On Tue, 16 Sep 2008 02:45:22 +0400
> Alexey Dobriyan <adobriyan@gmail.com> wrote:
>
> > On Sun, Sep 14, 2008 at 06:17:14PM -0500, Jay Cliburn wrote:
> > > On Thu, 21 Aug 2008 15:58:49 +0400
> > > adobriyan@gmail.com wrote:
> > >
> > > > This message happens more or less every reboot, sometimes cable
> > > > unplug/plug is needed to restore connectivity, otherwise card is
> > > > working fine.
> > > >
> > > >
> > > > [ 22.570010] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> > > > [ 26.570011] NET: Registered protocol family 10
> > > > [ 37.551934] eth0: no IPv6 routers present
> > > >
> > > > > [rebooted box which is directly connected to a box with atl1]
> > > > >
> > > > > [ 2078.740004] atl1 0000:03:00.0: eth0 link is down
> > > > > [ 2080.790004] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
> > > > > [ 2086.049998] NETDEV WATCHDOG: eth0 (atl1): transmit timed out
> > > > > [ 2086.050004] ------------[ cut here ]------------
> > > > [ 2086.050004] WARNING: at net/sched/sch_generic.c:221
> > > [...]
> > >
> > > Alexey,
> > >
> > > Can you please try this patch?
> >
> > Seems to help (no more messages), still sometimes no networking until
> > after cable replug. Stay tuned.
>
> Does this stay tuned mean, "Jay, hold off while I (Alexey) troubleshoot
> this further." Or does it mean, "Jay, you've not completely solved the
> problem yet and you need to continue working on it."
I don't know.
Transmit timeout messages have definitely disappeared. The "no network until
cable replug" problem a) happened earlier, b) is much rarer than the transmit timeouts.
> If it's the latter, can you please describe your setup and how you reproduce
> the problem? Is the box you have connected directly to the atl1 box used
> as a netconsole?
atl1 netconsoles to r8169, connected directly without any switches.
It usually happens after one of the boxes reboots (well, obviously)