From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Greear
Date: Wed, 01 May 2013 09:57:29 -0700
Subject: [ath9k-devel] 3.9.0-rc8+ (hacked) splat.
In-Reply-To: <51813E15.2040908@openwrt.org>
References: <517B11F3.1090700@candelatech.com> <517B933F.5030309@openwrt.org> <517D3A9E.6060807@candelatech.com> <518007FE.9050007@candelatech.com> <51813C46.2020102@candelatech.com> <51813E15.2040908@openwrt.org>
Message-ID: <51814979.5030106@candelatech.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ath9k-devel@lists.ath9k.org

On 05/01/2013 09:08 AM, Felix Fietkau wrote:
> On 2013-05-01 6:01 PM, Ben Greear wrote:
>> On 04/30/2013 11:05 AM, Ben Greear wrote:
>>> On 04/28/2013 08:05 AM, Ben Greear wrote:
>>>> On 04/27/2013 01:58 AM, Felix Fietkau wrote:
>>>>> On 2013-04-27 1:46 AM, Ben Greear wrote:
>>>>>> Was running around 200 stations against a VAP on this system, and
>>>>>> then changed the channel from 1 to 36 (by restarting hostapd with new
>>>>>> config).
>>>>>>
>>>>>> Looks like null-pointer de-ref... Anyone seen anything similar?
>>>>> I've never seen this one. Please use gdb to figure out the source code
>>>>> line that the NULL pointer deref happens in.
>>>>> As for the 'keycache entry 228 out of range' stuff, I'm going to send a
>>>>> patch for that now.
>>>>
>>>> Thanks.
>>>>
>>>> I'm away from the office for a bit, but will build a debugging kernel
>>>> and crank on this early next week.
>>>
>>> Ok, this is against a modified 3.9.0 tree. My patches are here:
>>>
>>> http://dmz2.candelatech.com/git/gitweb.cgi?p=linux-3.9.dev.y/.git;a=summary
>>>
>>> I'm going to try reproducing against upstream 3.9.0 (using a smaller number of
>>> stations since upstream doesn't have needed optimizations to make it work on
>>> my hardware...)
>>
>> With the wpa_supplicant optimizations I posted yesterday, I can
>> reproduce the crash on a standard 3.9.0 kernel with the regdomain
>> patch AND the "mac80211: Add per-sdata station hash, and sdata hash."
>>
>> https://patchwork.kernel.org/patch/2482351/
>>
>> I was not able to reproduce this without the hash optimization patch,
>> so either it's buggy, or it just makes things a lot faster and that
>> triggers bugs in ath9k more easily.....
> It's buggy. Take a look at this part:
>
>> diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
>> index 238a0cc..6b0fe74 100644
>> --- a/net/mac80211/sta_info.c
>> +++ b/net/mac80211/sta_info.c
>> @@ -965,6 +1018,13 @@ struct ieee80211_sta *ieee80211_find_sta_by_ifaddr(struct ieee80211_hw *hw,
>>  {
>>  	struct sta_info *sta, *nxt;
>>
>> +	if (localaddr) {
>> +		sta = sta_info_get_by_vif(hw_to_local(hw), localaddr, addr);
>> +		if (sta && !sta->uploaded)
>> +			return NULL;
>> +		return &sta->sta;
>> +	}
> If sta is NULL, it'll return &sta->sta, which is non-NULL. It matches
> the null-pointer crash on dereferencing the driver's tid struct inside
> sta->drv_priv.

Ahh, thanks so much! That does appear to be the cause...

Thanks,
Ben

>
> - Felix
>

-- 
Ben Greear
Candela Technologies Inc http://www.candelatech.com
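
For anyone reading this thread later, here is a minimal C sketch of why
the quoted hunk can hand a non-NULL pointer back to the caller when the
lookup fails. The struct names (sta_info_model, ieee80211_sta_model) and
all three functions are hypothetical stand-ins, not the mac80211
definitions, and lookup_fixed() is only one possible guard, not
necessarily the fix that went upstream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the mac80211 structures. */
struct ieee80211_sta_model {
	unsigned char addr[6];
};

struct sta_info_model {
	int uploaded;                   /* precedes `sta`, so `sta` has a nonzero offset */
	struct ieee80211_sta_model sta; /* embedded member handed to the driver */
};

/*
 * What `&sta->sta` typically works out to when sta is NULL: zero plus
 * the offset of the embedded member. It is computed with offsetof()
 * here so the sketch never evaluates the expression on a NULL pointer.
 */
static uintptr_t member_address_for_null(void)
{
	return (uintptr_t)offsetof(struct sta_info_model, sta);
}

/*
 * One possible guard (an assumption, not the upstream fix): treat a
 * failed lookup the same way as a station that is not uploaded yet.
 */
static struct ieee80211_sta_model *lookup_fixed(struct sta_info_model *sta)
{
	if (!sta || !sta->uploaded)
		return NULL;
	return &sta->sta;
}

/* Self-check covering the failed, not-uploaded, and uploaded cases. */
static void check_lookups(void)
{
	struct sta_info_model s = { .uploaded = 0 };

	assert(member_address_for_null() != 0);
	assert(lookup_fixed(NULL) == NULL);
	assert(lookup_fixed(&s) == NULL);
	s.uploaded = 1;
	assert(lookup_fixed(&s) == &s.sta);
}
```

A caller's `if (!sta)` test cannot catch the small non-NULL value that
the unguarded version returns, so the failure only shows up later, when
the driver dereferences data reached through that pointer (in this
thread, the tid struct inside sta->drv_priv).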