From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pavel Roskin
Date: Wed, 8 Feb 2012 17:54:00 -0500
Subject: [ath9k-devel] kernel panic with ath9k
In-Reply-To: <4F32CCB8.8060600@openwrt.org>
References: <20120204104848.28186td6kg8w88cg-cebfxv@webmail.spamcop.net> <20120207170009.612f8d65@mj> <4F32CCB8.8060600@openwrt.org>
Message-ID: <20120208175400.40c00ca8@mj>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ath9k-devel@lists.ath9k.org

On Wed, 08 Feb 2012 20:27:52 +0100
Felix Fietkau wrote:

> On 2012-02-08 8:10 PM, Adrian Chadd wrote:
> > Has anyone figured out why -1 is appearing?
> Maybe it's fixed by the mac80211 patch I submitted today. mac80211 was
> calling .tx_status on the rate control module before .rate_init.
> Just a wild guess...

I can reproduce the problem reliably with just one router (some stock
Linksys router with standard firmware) and one laptop (Asus Eee PC
1005PE with AR9285). 802.11n is not used. I just leave the laptop
turned on overnight, and it catches the warning at some point.

The bug might be "stimulated" by closing the lid and opening it again
while transmitting data (e.g. watching a video). But it doesn't happen
most of the time. And if it does happen, it takes about a couple of
minutes after the connection is reestablished.

This night, I caught two events less than a millisecond apart:

[18311.688446] wlan0: moving STA 00:21:29:a0:c7:b2 to state 1
[18311.688453] wlan0: moving STA 00:21:29:a0:c7:b2 to state 2
[18311.711759] wlan0: moving STA 00:21:29:a0:c7:b2 to state 3
[19634.009730] ath_rc_get_rateindex: rate->idx = -1
[19634.009762] ath_rc_get_rateindex: rate->idx = -1
[23124.212050] wlan0: moving STA 00:21:29:a0:c7:b2 to state 2
[23124.212061] wlan0: moving STA 00:21:29:a0:c7:b2 to state 1
[23124.212068] wlan0: moving STA 00:21:29:a0:c7:b2 to state 0

That's all with compat-wireless-3.3-rc1-2, which is used by Fedora.
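
The gap between the two ath_rc_get_rateindex warnings can be read
straight off the timestamps. The following is only a minimal sketch of
that arithmetic, assuming the bracketed prefix is the usual printk
"[seconds.microseconds]" stamp; the helper names stamp_us and gap_us
are made up for illustration and are not part of any kernel tool.

```python
import re

# Assumption: each line starts with a printk-style "[seconds.microseconds]" stamp.
STAMP = re.compile(r"^\[\s*(\d+)\.(\d{6})\]")


def stamp_us(line):
    """Return the timestamp of a log line in whole microseconds."""
    m = STAMP.match(line)
    if m is None:
        raise ValueError("no timestamp in: %r" % line)
    # Integer arithmetic avoids floating-point rounding on the subtraction.
    return int(m.group(1)) * 1_000_000 + int(m.group(2))


def gap_us(first, second):
    """Microseconds elapsed between two log lines (hypothetical helper)."""
    return stamp_us(second) - stamp_us(first)


# The two warning lines quoted in the log above.
first = "[19634.009730] ath_rc_get_rateindex: rate->idx = -1"
second = "[19634.009762] ath_rc_get_rateindex: rate->idx = -1"

print(gap_us(first, second), "us between the two warnings")
```

For the two lines above this works out to 32 microseconds, which is
consistent with "less than a millisecond apart".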

Indeed, the memory corruption happens in ath_tx_status(), the
.tx_status function for ath9k rate control, so it's plausible.

I tried applying your patch to compat-wireless-3.3-rc1-2, but it
depends on the patch that introduced WLAN_STA_RATE_CONTROL. And that
patch doesn't apply cleanly because WLAN_STA_INSERTED is not in
compat-wireless-3.3-rc1-2.

The current bleeding edge compat-wireless is for 2012/02/06 and it
needs both the patch that introduces WLAN_STA_RATE_CONTROL and the one
that uses it. At least they can be applied and everything compiles.

I'm going to leave the patched bleeding edge compat-wireless overnight
(with the sanity check I added to the code). I'll report the results.

I believe the patch needs to be backported to Linux 3.3 and older
versions. The Fedora bug 768639 has 74 comments, mostly messages about
duplicate bugs.

-- 
Regards,
Pavel Roskin