* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
@ 2006-11-21 6:43 Ray Lee
2006-11-21 8:52 ` Michael Buesch
0 siblings, 1 reply; 8+ messages in thread
From: Ray Lee @ 2006-11-21 6:43 UTC (permalink / raw)
To: Larry Finger, LKML, Bcm43xx-dev, Michael Buesch, John Linville
On 11/18/06, Larry Finger <Larry.Finger@lwfinger.net> wrote:
> The regression turns out to be a locking problem involving bcm43xx,
> wpa_supplicant, and NetworkManager. The exact cause is unknown; however,
> this patch is clearly not the problem. Please
> reinstate it for inclusion in -stable.
This patch is different from the patch that I claim is wrong. The patch that I
think is bad *actually changes locking* (merge error?); it is different than
the -stable patch that was submitted.
Michael et al please read the patch (diff portion) below as if you're seeing
it for the first time. Please note the locking changes introduced.
user: Michael Buesch <mb@bu3sch.de>
date: Wed Nov 01 08:15:40 2006 +0500
files: drivers/net/wireless/bcm43xx/bcm43xx_main.c
description:
[PATCH] bcm43xx: Fix low-traffic netdev watchdog TX timeouts
This fixes a netdev watchdog timeout problem.
The software needs to call netif_tx_disable before running the
hardware calibration code. The problem condition can be shown by the
following timegraph.
|---5secs - ~10 jiffies time---|---|OOPS
^ ^
last real TX periodic work stops netif
At OOPS, the following happens:
The watchdog timer triggers, because the timeout of 5secs
is over. The watchdog first checks for stopped TX.
_Usually_ TX is only stopped from the TX handler to indicate
a full TX queue. But this is different. We need to stop TX here,
regardless of the TX queue state. So the watchdog recognizes
the stopped device and assumes it is stopped due to full
TX queues (Which is a _wrong_ assumption in this case). It then
tests how far the last TX has been in the past. If it's more than
5secs (which is the case for low or no traffic), it will fire
a TX timeout.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
committer: John W. Linville <linville@laptop.(none)> 1162350940 -0500
diff -r 41ff0150cbadd56e692f148adb1bfd4ca420e3e0 -r
ca97546422bd9a52a7000607d657ca2915f31104
drivers/net/wireless/bcm43xx/bcm43xx_main.c
--- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c Wed Nov 01 08:15:39
2006 +0500
+++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c Wed Nov 01 08:15:40
2006 +0500
@@ -3163,9 +3163,11 @@ static void bcm43xx_periodic_work_handle
static void bcm43xx_periodic_work_handler(void *d)
{
struct bcm43xx_private *bcm = d;
+ struct net_device *net_dev = bcm->net_dev;
unsigned long flags;
u32 savedirqs = 0;
int badness;
+ unsigned long orig_trans_start = 0;
mutex_lock(&bcm->mutex);
badness = estimate_periodic_work_badness(bcm->periodic_state);
@@ -3173,7 +3175,18 @@ static void bcm43xx_periodic_work_handle
/* Periodic work will take a long time, so we want it to
* be preemtible.
*/
- netif_tx_disable(bcm->net_dev);
+
+ netif_tx_lock_bh(net_dev);
+ /* We must fake a started transmission here, as we are going to
+ * disable TX. If we wouldn't fake a TX, it would be possible to
+ * trigger the netdev watchdog, if the last real TX is already
+ * some time on the past (slightly less than 5secs)
+ */
+ orig_trans_start = net_dev->trans_start;
+ net_dev->trans_start = jiffies;
+ netif_stop_queue(net_dev);
+ netif_tx_unlock_bh(net_dev);
+
spin_lock_irqsave(&bcm->irq_lock, flags);
bcm43xx_mac_suspend(bcm);
if (bcm43xx_using_pio(bcm))
@@ -3198,6 +3211,7 @@ static void bcm43xx_periodic_work_handle
bcm43xx_pio_thaw_txqueues(bcm);
bcm43xx_mac_enable(bcm);
netif_wake_queue(bcm->net_dev);
+ net_dev->trans_start = orig_trans_start;
}
mmiowb();
spin_unlock_irqrestore(&bcm->irq_lock, flags);
or the patch via gitweb:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=81e171b95d2d06a64465a1e6ab1e2fb864ea2448
Ray
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-21 6:43 [patch 07/30] bcm43xx: Drain TX status before starting IRQs Ray Lee
@ 2006-11-21 8:52 ` Michael Buesch
0 siblings, 0 replies; 8+ messages in thread
From: Michael Buesch @ 2006-11-21 8:52 UTC (permalink / raw)
To: Ray Lee; +Cc: Larry Finger, LKML, Bcm43xx-dev, John Linville
On Tuesday 21 November 2006 07:43, Ray Lee wrote:
> On 11/18/06, Larry Finger <Larry.Finger@lwfinger.net> wrote:
> > The regression turns out to be a locking problem involving bcm43xx,
> > wpa_supplicant, and NetworkManager. The exact cause is unknown; however,
> > this patch is clearly not the problem. Please
> > reinstate it for inclusion in -stable.
>
> This patch is different from the patch that I claim is wrong. The patch that I
> think is bad *actually changes locking* (merge error?); it is different than
> the -stable patch that was submitted.
>
> Michael et al please read the patch (diff portion) below as if you're seeing
> it for the first time. Please note the locking changes introduced.
If you read it carefully, you will notice that there aren't locking changes.
Only a few loads and stores to net_dev->trans_start are added.
> user: Michael Buesch <mb@bu3sch.de>
> date: Wed Nov 01 08:15:40 2006 +0500
> files: drivers/net/wireless/bcm43xx/bcm43xx_main.c
> description:
> [PATCH] bcm43xx: Fix low-traffic netdev watchdog TX timeouts
>
> This fixes a netdev watchdog timeout problem.
> The software needs to call netif_tx_disable before running the
> hardware calibration code. The problem condition can be shown by the
> following timegraph.
>
> |---5secs - ~10 jiffies time---|---|OOPS
> ^ ^
> last real TX periodic work stops netif
>
> At OOPS, the following happens:
> The watchdog timer triggers, because the timeout of 5secs
> is over. The watchdog first checks for stopped TX.
> _Usually_ TX is only stopped from the TX handler to indicate
> a full TX queue. But this is different. We need to stop TX here,
> regardless of the TX queue state. So the watchdog recognizes
> the stopped device and assumes it is stopped due to full
> TX queues (Which is a _wrong_ assumption in this case). It then
> tests how far the last TX has been in the past. If it's more than
> 5secs (which is the case for low or no traffic), it will fire
> a TX timeout.
>
> Signed-off-by: Michael Buesch <mb@bu3sch.de>
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> Signed-off-by: John W. Linville <linville@tuxdriver.com>
>
> committer: John W. Linville <linville@laptop.(none)> 1162350940 -0500
>
>
> diff -r 41ff0150cbadd56e692f148adb1bfd4ca420e3e0 -r
> ca97546422bd9a52a7000607d657ca2915f31104
> drivers/net/wireless/bcm43xx/bcm43xx_main.c
> --- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c Wed Nov 01 08:15:39
> 2006 +0500
> +++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c Wed Nov 01 08:15:40
> 2006 +0500
> @@ -3163,9 +3163,11 @@ static void bcm43xx_periodic_work_handle
> static void bcm43xx_periodic_work_handler(void *d)
> {
> struct bcm43xx_private *bcm = d;
> + struct net_device *net_dev = bcm->net_dev;
> unsigned long flags;
> u32 savedirqs = 0;
> int badness;
> + unsigned long orig_trans_start = 0;
>
> mutex_lock(&bcm->mutex);
> badness = estimate_periodic_work_badness(bcm->periodic_state);
> @@ -3173,7 +3175,18 @@ static void bcm43xx_periodic_work_handle
> /* Periodic work will take a long time, so we want it to
> * be preemtible.
> */
> - netif_tx_disable(bcm->net_dev);
> +
> + netif_tx_lock_bh(net_dev);
> + /* We must fake a started transmission here, as we are going to
> + * disable TX. If we wouldn't fake a TX, it would be possible to
> + * trigger the netdev watchdog, if the last real TX is already
> + * some time on the past (slightly less than 5secs)
> + */
> + orig_trans_start = net_dev->trans_start;
> + net_dev->trans_start = jiffies;
> + netif_stop_queue(net_dev);
> + netif_tx_unlock_bh(net_dev);
> +
> spin_lock_irqsave(&bcm->irq_lock, flags);
> bcm43xx_mac_suspend(bcm);
> if (bcm43xx_using_pio(bcm))
> @@ -3198,6 +3211,7 @@ static void bcm43xx_periodic_work_handle
> bcm43xx_pio_thaw_txqueues(bcm);
> bcm43xx_mac_enable(bcm);
> netif_wake_queue(bcm->net_dev);
> + net_dev->trans_start = orig_trans_start;
> }
> mmiowb();
> spin_unlock_irqrestore(&bcm->irq_lock, flags);
>
> or the patch via gitweb:
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=81e171b95d2d06a64465a1e6ab1e2fb864ea2448
>
> Ray
>
>
>
--
Greetings Michael.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [patch 00/30] -stable review
@ 2006-11-16 2:43 Chris Wright
2006-11-16 2:43 ` [patch 07/30] bcm43xx: Drain TX status before starting IRQs Chris Wright
0 siblings, 1 reply; 8+ messages in thread
From: Chris Wright @ 2006-11-16 2:43 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
torvalds, akpm, alan
This is the start of the stable review cycle for the 2.6.18.3 release.
There are 30 patches in this series, all will be posted as a response to
this one. If anyone has any issues with these being applied, please let
us know. If anyone is a maintainer of the proper subsystem, and wants
to add a Signed-off-by: line to the patch, please respond with it.
These patches are sent out with a number of different people on the
Cc: line. If you wish to be a reviewer, please email stable@kernel.org
to add your name to the list. If you want to be off the reviewer list,
also email us.
Responses should be made by Sat Nov 18 02:35 UTC. Anything received
after that time might be too late.
thanks,
the -stable release team
--
^ permalink raw reply [flat|nested] 8+ messages in thread
* [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-16 2:43 [patch 00/30] -stable review Chris Wright
@ 2006-11-16 2:43 ` Chris Wright
2006-11-16 16:09 ` Larry Finger
2006-11-18 19:06 ` Larry Finger
0 siblings, 2 replies; 8+ messages in thread
From: Chris Wright @ 2006-11-16 2:43 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
torvalds, akpm, alan, Larry Finger, netdev, mb, greg,
John W. Linville
[-- Attachment #1: bcm43xx-drain-tx-status-before-starting-irqs.patch --]
[-- Type: text/plain, Size: 1772 bytes --]
-stable review patch. If anyone has any objections, please let us know.
------------------
From: Michael Buesch <mb@bu3sch.de>
Drain the Microcode TX-status-FIFO before we enable IRQs.
This is required, because the FIFO may still have entries left
from a previous run. Those would immediately fire after enabling
IRQs and would lead to an oops in the DMA TXstatus handling code.
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
drivers/net/wireless/bcm43xx/bcm43xx_main.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
--- linux-2.6.18.2.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6.18.2/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -1463,6 +1463,23 @@ static void handle_irq_transmit_status(s
}
}
+static void drain_txstatus_queue(struct bcm43xx_private *bcm)
+{
+ u32 dummy;
+
+ if (bcm->current_core->rev < 5)
+ return;
+ /* Read all entries from the microcode TXstatus FIFO
+ * and throw them away.
+ */
+ while (1) {
+ dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_0);
+ if (!dummy)
+ break;
+ dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_1);
+ }
+}
+
static void bcm43xx_generate_noise_sample(struct bcm43xx_private *bcm)
{
bcm43xx_shm_write16(bcm, BCM43xx_SHM_SHARED, 0x408, 0x7F7F);
@@ -3517,6 +3534,7 @@ int bcm43xx_select_wireless_core(struct
bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr));
bcm43xx_security_init(bcm);
+ drain_txstatus_queue(bcm);
ieee80211softmac_start(bcm->net_dev);
/* Let's go! Be careful after enabling the IRQs.
--
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-16 2:43 ` [patch 07/30] bcm43xx: Drain TX status before starting IRQs Chris Wright
@ 2006-11-16 16:09 ` Larry Finger
2006-11-16 18:38 ` Chris Wright
2006-11-18 19:06 ` Larry Finger
1 sibling, 1 reply; 8+ messages in thread
From: Larry Finger @ 2006-11-16 16:09 UTC (permalink / raw)
To: Chris Wright
Cc: linux-kernel, stable, Justin Forbes, Zwane Mwaikambo,
Theodore Ts'o, Randy Dunlap, Dave Jones, Chuck Wolber,
Chris Wedgwood, Michael Krufky, torvalds, akpm, alan, netdev, mb,
greg, John W. Linville
Chris Wright wrote:
> -stable review patch. If anyone has any objections, please let us know.
> ------------------
>
> From: Michael Buesch <mb@bu3sch.de>
>
> Drain the Microcode TX-status-FIFO before we enable IRQs.
> This is required, because the FIFO may still have entries left
> from a previous run. Those would immediately fire after enabling
> IRQs and would lead to an oops in the DMA TXstatus handling code.
>
> Cc: "John W. Linville" <linville@tuxdriver.com>
> Signed-off-by: Michael Buesch <mb@bu3sch.de>
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
> ---
Chris,
We have a report of a regression between 2.6.19-rc3 and -rc5. As this patch seems to be the only one
that could cause the problem, please pull it from -stable while we sort out the difficulty.
Thanks,
Larry
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-16 16:09 ` Larry Finger
@ 2006-11-16 18:38 ` Chris Wright
0 siblings, 0 replies; 8+ messages in thread
From: Chris Wright @ 2006-11-16 18:38 UTC (permalink / raw)
To: Larry Finger
Cc: Chris Wright, linux-kernel, stable, Justin Forbes,
Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap, Dave Jones,
Chuck Wolber, Chris Wedgwood, Michael Krufky, torvalds, akpm,
alan, netdev, mb, greg, John W. Linville
* Larry Finger (Larry.Finger@lwfinger.net) wrote:
> We have a report of a regression between 2.6.19-rc3 and -rc5. As this patch
> seems to be the only one that could cause the problem, please pull it from
> -stable while we sort out the difficulty.
Thanks a lot for the heads up Larry, dropping this one.
-chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-16 2:43 ` [patch 07/30] bcm43xx: Drain TX status before starting IRQs Chris Wright
2006-11-16 16:09 ` Larry Finger
@ 2006-11-18 19:06 ` Larry Finger
2006-11-19 4:11 ` Chris Wright
2006-11-19 23:51 ` Dan Williams
1 sibling, 2 replies; 8+ messages in thread
From: Larry Finger @ 2006-11-18 19:06 UTC (permalink / raw)
To: Chris Wright
Cc: linux-kernel, stable, Justin Forbes, Zwane Mwaikambo,
Theodore Ts'o, Randy Dunlap, Dave Jones, Chuck Wolber,
Chris Wedgwood, Michael Krufky, torvalds, akpm, alan, netdev, mb,
greg, John W. Linville
Chris Wright wrote:
> -stable review patch. If anyone has any objections, please let us know.
> ------------------
>
> From: Michael Buesch <mb@bu3sch.de>
>
> Drain the Microcode TX-status-FIFO before we enable IRQs.
> This is required, because the FIFO may still have entries left
> from a previous run. Those would immediately fire after enabling
> IRQs and would lead to an oops in the DMA TXstatus handling code.
>
> Cc: "John W. Linville" <linville@tuxdriver.com>
> Signed-off-by: Michael Buesch <mb@bu3sch.de>
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
> ---
Chris,
The regression turns out to be a locking problem involving bcm43xx, wpa_supplicant, and
NetworkManager. The exact cause is unknown; however, this patch is clearly not the problem. Please
reinstate it for inclusion in -stable.
Thanks,
Larry
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-18 19:06 ` Larry Finger
@ 2006-11-19 4:11 ` Chris Wright
2006-11-19 23:51 ` Dan Williams
1 sibling, 0 replies; 8+ messages in thread
From: Chris Wright @ 2006-11-19 4:11 UTC (permalink / raw)
To: Larry Finger
Cc: Chris Wright, linux-kernel, stable, Justin Forbes,
Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap, Dave Jones,
Chuck Wolber, Chris Wedgwood, Michael Krufky, torvalds, akpm,
alan, netdev, mb, greg, John W. Linville
* Larry Finger (Larry.Finger@lwfinger.net) wrote:
> The regression turns out to be a locking problem involving bcm43xx,
> wpa_supplicant, and NetworkManager. The exact cause is unknown; however,
> this patch is clearly not the problem. Please reinstate it for inclusion in
> -stable.
Thanks for the follow-up, Larry. It's queued for next -stable.
thanks,
-chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 07/30] bcm43xx: Drain TX status before starting IRQs
2006-11-18 19:06 ` Larry Finger
2006-11-19 4:11 ` Chris Wright
@ 2006-11-19 23:51 ` Dan Williams
1 sibling, 0 replies; 8+ messages in thread
From: Dan Williams @ 2006-11-19 23:51 UTC (permalink / raw)
To: Larry Finger
Cc: Chris Wright, linux-kernel, stable, Justin Forbes,
Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap, Dave Jones,
Chuck Wolber, Chris Wedgwood, Michael Krufky, torvalds, akpm,
alan, netdev, mb, greg, John W. Linville
On Sat, 2006-11-18 at 13:06 -0600, Larry Finger wrote:
> Chris Wright wrote:
> > -stable review patch. If anyone has any objections, please let us know.
> > ------------------
> >
> > From: Michael Buesch <mb@bu3sch.de>
> >
> > Drain the Microcode TX-status-FIFO before we enable IRQs.
> > This is required, because the FIFO may still have entries left
> > from a previous run. Those would immediately fire after enabling
> > IRQs and would lead to an oops in the DMA TXstatus handling code.
> >
> > Cc: "John W. Linville" <linville@tuxdriver.com>
> > Signed-off-by: Michael Buesch <mb@bu3sch.de>
> > Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> > Signed-off-by: Chris Wright <chrisw@sous-sol.org>
> > ---
>
> Chris,
>
> The regression turns out to be a locking problem involving bcm43xx, wpa_supplicant, and
> NetworkManager. The exact cause is unknown; however, this patch is clearly not the problem. Please
> reinstate it for inclusion in -stable.
NM should be using wpa_supplicant underneath. But depending on the NM
version, NM may be issuing any one of SIWENCODE (only to clear keys),
[S|G]IWSCAN, GIWRANGE, GIWAP, [S|G]IWMODE, [S|G]IWFREQ. Mainly, NM
cleans up after wpa_supplicant, gets information about the current
connection, and does scans. All other connection setup and handling is
done by wpa_supplicant. But note that NM will do any of the above
operations at any time, no matter what wpa_supplicant is doing at that
time. So the locking in the driver needs to be right, but it should be
right anyway regardless of whether either one or both of NM and
wpa_supplicant is in the picture...
Dan
> Thanks,
>
> Larry
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-11-21 8:54 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-11-21 6:43 [patch 07/30] bcm43xx: Drain TX status before starting IRQs Ray Lee
2006-11-21 8:52 ` Michael Buesch
-- strict thread matches above, loose matches on Subject: below --
2006-11-16 2:43 [patch 00/30] -stable review Chris Wright
2006-11-16 2:43 ` [patch 07/30] bcm43xx: Drain TX status before starting IRQs Chris Wright
2006-11-16 16:09 ` Larry Finger
2006-11-16 18:38 ` Chris Wright
2006-11-18 19:06 ` Larry Finger
2006-11-19 4:11 ` Chris Wright
2006-11-19 23:51 ` Dan Williams
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.