* ath5k: scanning while transmitting causes oops on 802.11a capable card
@ 2009-05-06 16:14 Pavel Roskin
2009-05-06 16:19 ` Johannes Berg
2009-05-06 16:36 ` [ath5k-devel] " Bob Copeland
From: Pavel Roskin @ 2009-05-06 16:14 UTC
To: linux-wireless; +Cc: ath5k-devel
Hello!
If I scan by "iw dev wlan0 scan" while sending data through the
interface, I get a BUG in net/mac80211/tx.c:
		/* RC is busted */
		if (WARN_ON_ONCE(info->control.rates[i].idx >=
				 sband->n_bitrates)) {
			info->control.rates[i].idx = -1;
			continue;
		}
I added this statement inside the condition:
printk("idx = %d, bitrates = %d, i = %d\n", info->control.rates[i].idx,
       sband->n_bitrates, i);
The result is:
idx = 9, bitrates = 8, i = 0
idx = 10, bitrates = 8, i = 1
idx = 9, bitrates = 8, i = 2
The card is 802.11a capable. My interpretation is that scanning
switches to the 802.11a band temporarily, but doesn't stop transmission.
When transmitting, the rate indices for 2.4 GHz band are checked against
the number of rates in the 5 GHz band, which is indeed 8. There are 12
rates in the 2.4 GHz band.
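
The mismatch described above can be modeled outside the kernel. The following is a minimal sketch, not mac80211 code: `struct band_model`, `band_2ghz`, `band_5ghz` and `check_rate_idx` are hypothetical names, and the rate counts (12 and 8) are the ones quoted in this message. It only shows why an index that is valid for the 2.4 GHz table is rejected when it is checked against the 5 GHz table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-band rate table. */
struct band_model {
	const char *name;
	int n_bitrates;
};

/* Rate counts as reported in the message above. */
static const struct band_model band_2ghz = { "2.4 GHz", 12 };
static const struct band_model band_5ghz = { "5 GHz", 8 };

/*
 * Same shape as the quoted check: an index outside the band's rate
 * table is replaced by -1, otherwise it is passed through unchanged.
 */
static int check_rate_idx(const struct band_model *sband, int idx)
{
	assert(sband != NULL);
	if (idx >= sband->n_bitrates)
		return -1;
	return idx;
}
```

With these assumptions, `check_rate_idx(&band_2ghz, 9)` passes the index through, while `check_rate_idx(&band_5ghz, 9)` yields -1, which corresponds to the "idx = 9, bitrates = 8" lines in the output above.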
ath5k 0000:0b:00.0: PCI INT A disabled
ath5k 0000:0b:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ath5k 0000:0b:00.0: setting latency timer to 64
ath5k 0000:0b:00.0: registered as 'phy2'
ath: Country alpha2 being used: US
ath: Regpair detected: 0x3a
phy2: Selected rate control algorithm 'minstrel'
ath5k phy2: Atheros AR5414 chip found (MAC: 0xa3, PHY: 0x61)
I actually had to patch the kernel, or the oops would escalate to a
panic. Perhaps it's a good idea to have that check:
--- a/drivers/net/wireless/ath/ath5k/base.c
+++ b/drivers/net/wireless/ath/ath5k/base.c
@@ -1246,6 +1246,8 @@ ath5k_txbuf_setup(struct ath5k_softc *sc, struct ath5k_buf *bf)
 			PCI_DMA_TODEVICE);
 
 	rate = ieee80211_get_tx_rate(sc->hw, info);
+	if (!rate)
+		return -EIO;
 
 	if (info->flags & IEEE80211_TX_CTL_NO_ACK)
 		flags |= AR5K_TXDESC_NOACK;
--
Regards,
Pavel Roskin

* Re: ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Johannes Berg @ 2009-05-06 16:19 UTC
To: Pavel Roskin; +Cc: linux-wireless, ath5k-devel

> If I scan by "iw dev wlan0 scan" while sending data through the
> interface, I get a BUG in net/mac80211/tx.c:
>
> 		/* RC is busted */
> 		if (WARN_ON_ONCE(info->control.rates[i].idx >=
> 				 sband->n_bitrates)) {
> 			info->control.rates[i].idx = -1;
> 			continue;
> 		}
>
> I added this statement inside the condition:
>
> printk("idx = %d, bitrates = %d, i = %d\n", info->control.rates[i].idx,
>        sband->n_bitrates, i);
>
> The result is:
>
> idx = 9, bitrates = 8, i = 0
> idx = 10, bitrates = 8, i = 1
> idx = 9, bitrates = 8, i = 2
>
> The card is 802.11a capable. My interpretation is that scanning
> switches to the 802.11a band temporarily, but doesn't stop transmission.
> When transmitting, the rate indices for 2.4 GHz band are checked against
> the number of rates in the 5 GHz band, which is indeed 8. There are 12
> rates in the 2.4 GHz band.
>
> ath5k 0000:0b:00.0: PCI INT A disabled
> ath5k 0000:0b:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> ath5k 0000:0b:00.0: setting latency timer to 64
> ath5k 0000:0b:00.0: registered as 'phy2'
> ath: Country alpha2 being used: US
> ath: Regpair detected: 0x3a
> phy2: Selected rate control algorithm 'minstrel'
> ath5k phy2: Atheros AR5414 chip found (MAC: 0xa3, PHY: 0x61)
>
> I actually had to patch the kernel, or the oops would escalate to a
> panic. Perhaps it's a good idea to have that check:

I don't think it's just ath5k... I had an oops too. I think some of the
pid/minstrel "fixes" broke it but don't know yet.

johannes

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Bob Copeland @ 2009-05-06 16:36 UTC
To: Pavel Roskin; +Cc: linux-wireless, ath5k-devel

On Wed, May 6, 2009 at 12:14 PM, Pavel Roskin <proski@gnu.org> wrote:
> Hello!
>
> If I scan by "iw dev wlan0 scan" while sending data through the
> interface, I get a BUG in net/mac80211/tx.c:

Agreed... Also I think the same thing happens for rx for ath5k,
explaining the 'unknown rate index' warnings (sc->curband changes
during scan but we process a beacon from 2ghz band, that one at
least just needs some synchronization in the driver).

> 		/* RC is busted */
> 		if (WARN_ON_ONCE(info->control.rates[i].idx >=
> 				 sband->n_bitrates)) {
> 			info->control.rates[i].idx = -1;
> 			continue;
> 		}

I had a patch here to return rate_lowest_index(). But it still crashed
eventually.

> 	rate = ieee80211_get_tx_rate(sc->hw, info);
> +	if (!rate)
> +		return -EIO;

There are a few more rates here for MRR and RTS/CTS etc.

--
Bob Copeland %% www.bobcopeland.com

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: John W. Linville @ 2009-05-06 17:25 UTC
To: Bob Copeland; +Cc: Pavel Roskin, linux-wireless, ath5k-devel

On Wed, May 06, 2009 at 12:36:13PM -0400, Bob Copeland wrote:
> On Wed, May 6, 2009 at 12:14 PM, Pavel Roskin <proski@gnu.org> wrote:
> > Hello!
> >
> > If I scan by "iw dev wlan0 scan" while sending data through the
> > interface, I get a BUG in net/mac80211/tx.c:
>
> Agreed... Also I think the same thing happens for rx for ath5k,
> explaining the 'unknown rate index' warnings (sc->curband changes
> during scan but we process a beacon from 2ghz band, that one at
> least just needs some synchronization in the driver).

Ah, that could be -- I sure am tired of reading bug reports about
that...

John
--
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com		might be all we have. Be ready.

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Pavel Roskin @ 2009-05-06 20:12 UTC
To: John W. Linville; +Cc: Bob Copeland, linux-wireless, ath5k-devel

On Wed, 2009-05-06 at 13:25 -0400, John W. Linville wrote:
> On Wed, May 06, 2009 at 12:36:13PM -0400, Bob Copeland wrote:
> > On Wed, May 6, 2009 at 12:14 PM, Pavel Roskin <proski@gnu.org> wrote:
> > > Hello!
> > >
> > > If I scan by "iw dev wlan0 scan" while sending data through the
> > > interface, I get a BUG in net/mac80211/tx.c:
> >
> > Agreed... Also I think the same thing happens for rx for ath5k,
> > explaining the 'unknown rate index' warnings (sc->curband changes
> > during scan but we process a beacon from 2ghz band, that one at
> > least just needs some synchronization in the driver).
>
> Ah, that could be -- I sure am tired of reading bug reports about
> that...

I've bisected it. The problem is introduced by the commit
2038ccfbb5f7fc7d8bca26bf53bdd6c7778136ff:

Author:     Johannes Berg <johannes@sipsolutions.net>
AuthorDate: Wed Apr 29 12:26:17 2009 +0200
Commit:     John W. Linville <linville@tuxdriver.com>
CommitDate: Thu Apr 30 15:06:34 2009 -0400

    mac80211: tell driver when idle

    When we aren't doing anything in mac80211, we can turn off much
    of the hardware, depending on the driver/hw. Not doing anything,
    aka being idle, means:
    ...

--
Regards,
Pavel Roskin

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Johannes Berg @ 2009-05-06 20:26 UTC
To: Pavel Roskin; +Cc: John W. Linville, Bob Copeland, linux-wireless, ath5k-devel

On Wed, 2009-05-06 at 16:12 -0400, Pavel Roskin wrote:

> > > > If I scan by "iw dev wlan0 scan" while sending data through the
> > > > interface, I get a BUG in net/mac80211/tx.c:
> > >
> > > Agreed... Also I think the same thing happens for rx for ath5k,
> > > explaining the 'unknown rate index' warnings (sc->curband changes
> > > during scan but we process a beacon from 2ghz band, that one at
> > > least just needs some synchronization in the driver).
> >
> > Ah, that could be -- I sure am tired of reading bug reports about
> > that...
>
> I've bisected it. The problem is introduced by the commit
> 2038ccfbb5f7fc7d8bca26bf53bdd6c7778136ff:
>
> Author:     Johannes Berg <johannes@sipsolutions.net>
> AuthorDate: Wed Apr 29 12:26:17 2009 +0200
> Commit:     John W. Linville <linville@tuxdriver.com>
> CommitDate: Thu Apr 30 15:06:34 2009 -0400
>
>     mac80211: tell driver when idle

Huh? That's confusing. Also, you say you get a BUG but point out a
WARN_ON_ONCE, was that an oversight or does something crash there?

OTOH, I can see one thing happening -- it would access scan_channel.
Patch should fix that, does it help?

johannes

--- wireless-testing.orig/net/mac80211/iface.c	2009-05-06 22:25:45.000000000 +0200
+++ wireless-testing/net/mac80211/iface.c	2009-05-06 22:25:53.000000000 +0200
@@ -964,5 +964,6 @@ void ieee80211_recalc_idle(struct ieee80
 	mutex_lock(&local->iflist_mtx);
 	chg = __ieee80211_recalc_idle(local);
 	mutex_unlock(&local->iflist_mtx);
-	ieee80211_hw_config(local, chg);
+	if (chg)
+		ieee80211_hw_config(local, chg);
 }
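
The effect of the `if (chg)` guard in the patch above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not mac80211 code: `struct hw_model`, `hw_config_model` and `recalc_idle_model` are hypothetical names, and the "hardware" is reduced to a counter of reconfiguration calls. The point is simply that a recalculation that reports no change no longer triggers a reconfiguration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: counts how often the hardware gets reconfigured. */
struct hw_model {
	unsigned int config_calls;
};

/* Stand-in for a (potentially disruptive) hardware reconfiguration. */
static void hw_config_model(struct hw_model *hw, uint32_t changed)
{
	assert(hw != NULL);
	(void)changed;
	hw->config_calls++;
}

/*
 * Guarded variant, as in the patch: reconfigure only when the
 * change mask is non-zero.
 */
static void recalc_idle_model(struct hw_model *hw, uint32_t chg)
{
	if (chg)
		hw_config_model(hw, chg);
}
```

Calling `recalc_idle_model(&hw, 0)` leaves `config_calls` untouched in this model; only a non-zero mask reaches `hw_config_model`.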

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Pavel Roskin @ 2009-05-06 21:29 UTC
To: Johannes Berg; +Cc: John W. Linville, Bob Copeland, linux-wireless, ath5k-devel

On Wed, 2009-05-06 at 22:26 +0200, Johannes Berg wrote:

> Huh? That's confusing. Also, you say you get a BUG but point out a
> WARN_ON_ONCE, was that an oversight or does something crash there?

Sorry, I meant WARN_ON_ONCE.

> OTOH, I can see one thing happening -- it would access scan_channel.
> Patch should fix that, does it help?

Yes, that does the trick! Several scans with iw and iwlist while flood
pinging the AP don't cause any kernel messages. Thank you!

--
Regards,
Pavel Roskin

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Johannes Berg @ 2009-05-07 6:01 UTC
To: Pavel Roskin; +Cc: John W. Linville, Bob Copeland, linux-wireless, ath5k-devel

On Wed, 2009-05-06 at 17:29 -0400, Pavel Roskin wrote:
> On Wed, 2009-05-06 at 22:26 +0200, Johannes Berg wrote:
>
> > Huh? That's confusing. Also, you say you get a BUG but point out a
> > WARN_ON_ONCE, was that an oversight or does something crash there?
>
> Sorry, I meant WARN_ON_ONCE.

Ok, no worries, just got confused for a second what you meant.

> > OTOH, I can see one thing happening -- it would access scan_channel.
> > Patch should fix that, does it help?
>
> Yes, that does the trick! Several scans with iw and iwlist while flood
> pinging the AP don't cause any kernel messages. Thank you!

Ok, thanks for testing. I'll figure out the real problem and fix it.

johannes

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Bob Copeland @ 2009-05-07 8:56 UTC
To: John W. Linville; +Cc: Pavel Roskin, linux-wireless, ath5k-devel

On Wed, May 06, 2009 at 01:25:13PM -0400, John W. Linville wrote:
> On Wed, May 06, 2009 at 12:36:13PM -0400, Bob Copeland wrote:
> > Agreed... Also I think the same thing happens for rx for ath5k,
> > explaining the 'unknown rate index' warnings (sc->curband changes
> > during scan but we process a beacon from 2ghz band, that one at
> > least just needs some synchronization in the driver).
>
> Ah, that could be -- I sure am tired of reading bug reports about
> that...

It's not conceptually hard to fix but I can't think of an easy 1-liner
patch. Here are the races in the rx path (note TX status processing
has a similar race with hw_to_driver_rix, which is separate from Pavel's
get_tx_rate report). It should be pretty easy to reproduce by setting
up scans of one channel in each band and running them continuously under
load.

Race 1 (single cpu, unfortunately placed interrupt):

CPU 1:
    drv_config
        sc->curchan = xxx
        sc->curband = yyy
    [intr]
    rx_tasklet
        rxs.rate = hw_to_driver_rix    [now reports wrong band/channel]
    [end tasklet]
    reset    [changes channel]

The following (untested!) patch fixes this race in a cheesy way for
both RX and TX. However, it doesn't fix the following, smaller race
caused by deferred processing on a separate CPU. A proper fix, as
indicated, would both rework the reset order and flush out any
unprocessed data under the appropriate spinlocks, forcing that stuff
to run after the tasklets are completed.

CPU 1:                              CPU 2:
    drv_config
        next_chan = xxx
        next_band = yyy
    reset                           [intr]
        disable intr                rx_tasklet
        stop dma
        [flush should go here]
        curchan = next_chan
        curband = next_band
                                        rxs.rate = hw_to_driver_rix

Cheesy untested patch (actually, I know it is broken for ath5k_init,
but you get the idea):

diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c
index 6789c5d..6264d49 100644
--- a/drivers/net/wireless/ath/ath5k/base.c
+++ b/drivers/net/wireless/ath/ath5k/base.c
@@ -1076,8 +1076,8 @@ ath5k_chan_set(struct ath5k_softc *sc, struct ieee80211_channel *chan)
 	if (chan->center_freq != sc->curchan->center_freq ||
 		chan->hw_value != sc->curchan->hw_value) {
 
-		sc->curchan = chan;
-		sc->curband = &sc->sbands[chan->band];
+		sc->nextchan = chan;
+		sc->nextband = &sc->sbands[chan->band];
 
 		/*
 		 * To switch channels clear any pending DMA operations;
@@ -2648,6 +2648,8 @@ ath5k_reset(struct ath5k_softc *sc, bool stop, bool change_channel)
 		ath5k_txq_cleanup(sc);
 		ath5k_rx_stop(sc);
 	}
+	sc->curchan = sc->nextchan;
+	sc->curband = sc->nextband;
 	ret = ath5k_hw_reset(ah, sc->opmode, sc->curchan, true);
 	if (ret) {
 		ATH5K_ERR(sc, "can't reset hardware (%d)\n", ret);
diff --git a/drivers/net/wireless/ath/ath5k/base.h b/drivers/net/wireless/ath/ath5k/base.h
index 852b2c1..ce8a5d4 100644
--- a/drivers/net/wireless/ath/ath5k/base.h
+++ b/drivers/net/wireless/ath/ath5k/base.h
@@ -116,6 +116,7 @@ struct ath5k_softc {
 	struct ath5k_hw		*ah;		/* Atheros HW */
 
 	struct ieee80211_supported_band		*curband;
+	struct ieee80211_supported_band		*nextband;
 
 #ifdef CONFIG_ATH5K_DEBUG
 	struct ath5k_dbg_info	debug;		/* debug info */
@@ -137,6 +138,7 @@ struct ath5k_softc {
 	unsigned int		filter_flags;	/* HW flags, AR5K_RX_FILTER_* */
 	unsigned int		curmode;	/* current phy mode */
 	struct ieee80211_channel *curchan;	/* current h/w channel */
+	struct ieee80211_channel *nextchan;	/* next h/w channel */
 
 	struct ieee80211_vif *vif;
 

--
Bob Copeland %% www.bobcopeland.com
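
The staging idea behind the patch above (record the requested channel first, commit it only inside reset) can be sketched as a standalone model. This is an illustration under stated assumptions, not driver code: `struct chan_model`, `struct softc_model`, `chan_set_model` and `reset_model` are hypothetical names that loosely mirror `curchan`/`nextchan` and `curband`/`nextband` from the diff, and the model does not address the second, cross-CPU race described in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified channel description. */
struct chan_model {
	int center_freq;
	int band;
};

/* Hypothetical driver state with a staged ("next") channel and band. */
struct softc_model {
	const struct chan_model *curchan;
	const struct chan_model *nextchan;
	int curband;
	int nextband;
};

/* Stage the requested channel; the current values stay untouched. */
static void chan_set_model(struct softc_model *sc,
			   const struct chan_model *chan)
{
	assert(sc != NULL && chan != NULL);
	sc->nextchan = chan;
	sc->nextband = chan->band;
}

/*
 * Commit the staged values. In the driver, TX cleanup and RX stop
 * would run before this point, so in-flight frames are still
 * interpreted against the old band.
 */
static void reset_model(struct softc_model *sc)
{
	assert(sc != NULL && sc->nextchan != NULL);
	sc->curchan = sc->nextchan;
	sc->curband = sc->nextband;
}
```

In this model, code that reads `curband` between `chan_set_model()` and `reset_model()` still sees the old band, which is the property the patch relies on for Race 1.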

* Re: [ath5k-devel] ath5k: scanning while transmitting causes oops on 802.11a capable card
From: Bob Copeland @ 2009-05-07 14:04 UTC
To: John W. Linville; +Cc: ath5k-devel, linux-wireless

On Thu, May 7, 2009 at 4:56 AM, Bob Copeland <me@bobcopeland.com> wrote:
> Cheesy untested patch (actually, I know it is broken for ath5k_init,
> but you get the idea):

For what it's worth, I have a better version that passes the channel
struct directly to ath5k_reset, after I give it some testing I'll
post with a decent changelog.

--
Bob Copeland %% www.bobcopeland.com