* Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts.
@ 2006-10-24 14:31 Michael Buesch
2006-10-24 14:32 ` Michael Buesch
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Michael Buesch @ 2006-10-24 14:31 UTC
To: Greg KH, John Linville
Cc: netdev, bcm43xx-dev, stable, Larry Finger
This fixes a netdev watchdog timeout problem.
The problem is caused by the netif_tx_disable call that the
hardware calibration code requires, and is illustrated by the
following timegraph.
|---5secs - ~10 jiffies time---|---|OOPS
^                              ^
last real TX                   periodic work stops netif
At OOPS, the following happens:
The watchdog timer fires because the 5 second timeout has
expired. The watchdog first checks whether TX is stopped.
_Usually_ TX is only stopped from the TX handler to indicate
a full TX queue. But this case is different: we need to stop TX
here regardless of the TX queue state. So the watchdog sees
the stopped device and assumes it is stopped due to full
TX queues (which is a _wrong_ assumption in this case). It then
checks how long ago the last TX started. If that is more than
5 seconds ago (which is the case with little or no traffic), it
fires a TX timeout.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
--
John, please apply this bugfix to wireless-2.6.
Greg, as the -stable maintainer, please consider putting this
into 2.6.18.2
Index: linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- linux-2.6.18.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-10-19 21:30:42.000000000 +0200
+++ linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-10-19 21:33:28.000000000 +0200
@@ -3165,7 +3165,15 @@ static void bcm43xx_periodic_work_handle
 
 	badness = estimate_periodic_work_badness(bcm->periodic_state);
 	mutex_lock(&bcm->mutex);
+
+	/* We must fake a started transmission here, as we are going to
+	 * disable TX. If we wouldn't fake a TX, it would be possible to
+	 * trigger the netdev watchdog, if the last real TX is already
+	 * some time on the past (slightly less than 5secs)
+	 */
+	bcm->net_dev->trans_start = jiffies;
 	netif_tx_disable(bcm->net_dev);
+
 	spin_lock_irqsave(&bcm->irq_lock, flags);
 	if (badness > BADNESS_LIMIT) {
 		/* Periodic work will take a long time, so we want it to
--
Greetings Michael.
* Re: Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts.
2006-10-24 14:31 Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts Michael Buesch
@ 2006-10-24 14:32 ` Michael Buesch
[not found] ` <200610241631.18911.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
2006-10-25 0:37 ` John W. Linville
2 siblings, 0 replies; 6+ messages in thread
From: Michael Buesch @ 2006-10-24 14:32 UTC
To: Greg KH, John Linville; +Cc: stable, Larry Finger, bcm43xx-dev, netdev
Oh, damn crap. Please remove the words "fwd" and "hopefully"
from the subject.
Sorry for the inconvenience.
--
Greetings Michael.
[parent not found: <200610241631.18911.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>]
* Re: Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts.
[not found] ` <200610241631.18911.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
@ 2006-10-25 0:28 ` John W. Linville
0 siblings, 0 replies; 6+ messages in thread
From: John W. Linville @ 2006-10-25 0:28 UTC
To: Michael Buesch
Cc: netdev, Greg KH, bcm43xx-dev, stable, Larry Finger
Michael,
It looks like you have a patch that I don't have, one that moves the
netif_tx_disable and spin_lock_irqsave outside of the "if (badness >
BADNESS_LIMIT)" conditional.
Could you pass that one along as well, or correct this patch to match
what is in Linus' tree?
Thanks,
John
On Tue, Oct 24, 2006 at 04:31:18PM +0200, Michael Buesch wrote:
> This fixes a netdev watchdog timeout problem.
> The problem is caused by a needed netif_tx_disable
> in the hardware calibration code and can be shown by the
> following timegraph.
>
> |---5secs - ~10 jiffies time---|---|OOPS
> ^                              ^
> last real TX                   periodic work stops netif
>
> At OOPS, the following happens:
> The watchdog timer triggers, because the timeout of 5secs
> is over. The watchdog first checks for stopped TX.
> _Usually_ TX is only stopped from the TX handler to indicate
> a full TX queue. But this is different. We need to stop TX here,
> regardless of the TX queue state. So the watchdog recognizes
> the stopped device and assumes it is stopped due to full
> TX queues (Which is a _wrong_ assumption in this case). It then
> tests how far the last TX has been in the past. If it's more than
> 5secs (which is the case for low or no traffic), it will fire
> a TX timeout.
>
> Signed-off-by: Michael Buesch <mb@bu3sch.de>
>
> --
>
> John, please apply this bugfix to wireless-2.6.
> Greg, as the -stable maintainer, please consider putting this
> into 2.6.18.2
>
> Index: linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> ===================================================================
> --- linux-2.6.18.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-10-19 21:30:42.000000000 +0200
> +++ linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-10-19 21:33:28.000000000 +0200
> @@ -3165,7 +3165,15 @@ static void bcm43xx_periodic_work_handle
>  
>  	badness = estimate_periodic_work_badness(bcm->periodic_state);
>  	mutex_lock(&bcm->mutex);
> +
> +	/* We must fake a started transmission here, as we are going to
> +	 * disable TX. If we wouldn't fake a TX, it would be possible to
> +	 * trigger the netdev watchdog, if the last real TX is already
> +	 * some time on the past (slightly less than 5secs)
> +	 */
> +	bcm->net_dev->trans_start = jiffies;
>  	netif_tx_disable(bcm->net_dev);
> +
>  	spin_lock_irqsave(&bcm->irq_lock, flags);
>  	if (badness > BADNESS_LIMIT) {
>  		/* Periodic work will take a long time, so we want it to
>
>
>
> --
> Greetings Michael.
* Re: Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts.
2006-10-24 14:31 Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts Michael Buesch
2006-10-24 14:32 ` Michael Buesch
[not found] ` <200610241631.18911.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
@ 2006-10-25 0:37 ` John W. Linville
[not found] ` <20061025003726.GC7340-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
2 siblings, 1 reply; 6+ messages in thread
From: John W. Linville @ 2006-10-25 0:37 UTC
To: Michael Buesch; +Cc: Greg KH, stable, Larry Finger, bcm43xx-dev, netdev
Michael,
It looks like you have a patch that I don't have, one that moves the
netif_tx_disable and spin_lock_irqsave outside of the "if (badness >
BADNESS_LIMIT)" conditional.
Could you pass that one along as well, or correct this patch to match
what is in Linus' tree?
Thanks,
John
Thread overview: 6+ messages
2006-10-24 14:31 Fwd: [PATCH] bcm43xx: (hopefully) fix watchdog timeouts Michael Buesch
2006-10-24 14:32 ` Michael Buesch
[not found] ` <200610241631.18911.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
2006-10-25 0:28 ` John W. Linville
2006-10-25 0:37 ` John W. Linville
[not found] ` <20061025003726.GC7340-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
2006-10-25 9:38 ` Michael Buesch
2006-10-26 4:03 ` Greg KH