From: Ben Greear <greearb@candelatech.com>
To: "Luis R. Rodriguez" <mcgrof@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>,
"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: Re: memory clobber in rx path, maybe related to ath9k.
Date: Wed, 13 Oct 2010 10:48:04 -0700 [thread overview]
Message-ID: <4CB5F0D4.4020907@candelatech.com> (raw)
In-Reply-To: <AANLkTi=c=w8MvXZ=vqxkRgbXasacQTYXUX8o_zXoZgxC@mail.gmail.com>
On 10/13/2010 10:29 AM, Luis R. Rodriguez wrote:
> On Wed, Oct 13, 2010 at 10:12 AM, Ben Greear<greearb@candelatech.com> wrote:
>> On 10/12/2010 11:40 AM, Luis R. Rodriguez wrote:
>>>
>>> On Tue, Oct 12, 2010 at 11:35 AM, Ben Greear<greearb@candelatech.com>
>>> wrote:
>>>>
>>>> On 10/11/2010 11:10 PM, Luis R. Rodriguez wrote:
>>>>>
>>>>> On Mon, Oct 11, 2010 at 8:27 PM, Ben Greear<greearb@candelatech.com>
>>>>> wrote:
>>>>
>>>>>> Another thing I was thinking about: Maybe the queue of skbs and dma
>>>>>> addresses
>>>>>> in ath9k is getting corrupted by multiple VIFs trying to write at once?
>>>>>> Maybe
>>>>>> some locking is needed in the xmit path?
>>>>>
>>>>> That was my second hunch. My first shot was to use spin_lock_irqsave()
>>>>> over the the uses of the rxbuf list and that seemed to help but I
>>>>> still managed to get a poison eventually. My next item to check for is
>>>>> of the permissibility of creating too much pressure to the point we
>>>>> end up looping over the rxbuf list and race against mac80211 free'ing
>>>>> a buffer. Will test that tomorrow if nothing else comes up creeping my
>>>>> priority queue.
>>>>
>>>> This code looks weird to me. One of the paprd branches
>>>> deletes the skb, the other doesn't appear to. Neither
>>>> null out bf->bf_mpdu, which would appear to leave a dangling
>>>> pointer in at least the dev_kfree_skb_any() branch.
>>>>
>>>> ath_tx_complete frees it's skb in all cases, so another
>>>> bf->bf_mpdu dangling pointer issue.
>>>>
>>>> Maybe at the least we should null out bf->bf_mpdu when
>>>> skb is consumed?
>>>
>>> You're reading my mind, that was what I was going to test today. Still
>>> doing e-mail sweep though.
>>
>> At least in the xmit path, it seems cards that have EDMA support do
>> things a bit different. Out of curiosity, on the system(s), you reproduce
>> this, are any of yours supporting EDMA? Mine appear to not support EDMA.
>
> EDMA is used on>= AR9003 families by Atheros. And no, I am not
> testing with an EDMA card, I am testing with an AR9002 family card,
> the AR9280 card. I am going to disregard the TX stuff as the bug is an
> RX issue :) I was able to more easily reproduce by doing an skb_copy()
> and free'ing the buffer right afterwards on the ath_send_to_mac80211()
> thingy, So it does appear that the poison check just happens more
> often when we do an skb_copy(). One reason this is easy to reproduce
> with multiple STAs is mac80211 uses skb_copy() to process each
> received skb for each STA.
>
> In my tests so far, protecting the rxbuf list with spin_lock_irqsave()
> did not help, and the wmb(); didn't either, something else is going on
> here. It would be nice to hack slab to keep an entire trace of the
> place the buffer was last free'd at instead of just the caller that
> freed it.
I instrumented slub a while back and got the backtrace. It
was always in the same place for my testing.
Here's the slub patch if you are interested in using it yourself:
https://patchwork.kernel.org/patch/236921/
Are you able to reproduce this with a single STA interface? If so, we
should be able to somewhat tie-break mac80211 by using another /n NIC,
hopefully with similar AMPDU support, etc.
[From a mail I sent on 10/7 in this thread]
In case it helps, here is a dump of where the corrupted SKB was deleted.
I added debugging to slub to get this information, but it looks like
it's correct to me.
Reading symbols from /home/greearb/kernel/2.6/wireless-testing-dbg.p4s/net/mac80211/mac80211.ko...done.
(gdb) l *(ieee80211_rx+0x74d)
0x13751 is in ieee80211_rx (/home/greearb/git/linux.wireless-testing/include/linux/rcupdate.h:346).
341 *
342 * See rcu_read_lock() for more information.
343 */
344 static inline void rcu_read_unlock(void)
345 {
346 rcu_read_release();
347 __release(RCU);
348 __rcu_read_unlock();
349 }
350
(gdb)
# I don't really know what that second address means, but just in case it's useful,
# I printed it out here:
(gdb) l *(ieee80211_rx+0x7b4)
0x137b8 is in ieee80211_process_measurement_req (/home/greearb/git/linux.wireless-testing/net/mac80211/spectmgmt.c:74).
69 }
70
71 void ieee80211_process_measurement_req(struct ieee80211_sub_if_data *sdata,
72 struct ieee80211_mgmt *mgmt,
73 size_t len)
74 {
75 /*
76 * Ignoring measurement request is spec violation.
77 * Mandatory measurements must be reported optional
78 * measurements might be refused or reported incapable
INFO: Freed in skb_release_data+0x8c/0x90 age=122 cpu=1 pid=0
set_track+0x3c/0x89
__slab_free+0x17f/0x1ba
skb_release_data+0x8c/0x90
kfree+0xaf/0xdf
skb_release_data+0x8c/0x90
skb_release_data+0x8c/0x90
skb_release_data+0x8c/0x90
__kfree_skb+0x12/0x6d
consume_skb+0x2a/0x2c
ieee80211_rx+0x74d/0x7b4 [mac80211]
__kmalloc_track_caller+0xcd/0xf2
trace_hardirqs_on_caller+0xeb/0x125
ath_rx_send_to_mac80211+0x5a/0x60 [ath9k]
trace_hardirqs_on+0xb/0xd
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
next prev parent reply other threads:[~2010-10-13 17:48 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-05 17:00 memory clobber in rx path, maybe related to ath9k Ben Greear
2010-10-05 17:16 ` Luis R. Rodriguez
2010-10-05 17:24 ` Ben Greear
2010-10-05 17:36 ` Luis R. Rodriguez
2010-10-05 17:38 ` Ben Greear
2010-10-05 17:43 ` Luis R. Rodriguez
2010-10-05 17:47 ` Ben Greear
2010-10-05 17:55 ` Luis R. Rodriguez
2010-10-05 18:14 ` Ben Greear
2010-10-05 21:12 ` Ben Greear
2010-10-07 17:33 ` Ben Greear
2010-10-07 18:14 ` Johannes Berg
2010-10-07 18:29 ` Luis R. Rodriguez
2010-10-07 18:39 ` Ben Greear
2010-10-07 18:42 ` Luis R. Rodriguez
2010-10-07 18:45 ` Ben Greear
2010-10-07 19:14 ` Ben Greear
2010-10-07 19:17 ` Johannes Berg
2010-10-07 19:22 ` Ben Greear
2010-10-07 19:27 ` Johannes Berg
2010-10-07 21:31 ` Luis R. Rodriguez
2010-10-07 21:36 ` Luis R. Rodriguez
2010-10-07 21:59 ` Luis R. Rodriguez
2010-10-11 20:51 ` Ben Greear
2010-10-12 1:03 ` Luis R. Rodriguez
2010-10-12 3:27 ` Ben Greear
2010-10-12 6:10 ` Luis R. Rodriguez
2010-10-12 18:35 ` Ben Greear
2010-10-12 18:40 ` Luis R. Rodriguez
2010-10-12 18:43 ` Ben Greear
2010-10-12 19:51 ` Ben Greear
2010-10-13 17:12 ` Ben Greear
2010-10-13 17:29 ` Luis R. Rodriguez
2010-10-13 17:48 ` Ben Greear [this message]
2010-10-14 21:25 ` Luis R. Rodriguez
2010-10-14 21:31 ` Ben Greear
2010-10-14 21:32 ` Luis R. Rodriguez
2010-10-14 21:39 ` Ben Greear
2010-10-14 21:45 ` Johannes Berg
2010-10-14 21:47 ` Ben Greear
2010-10-13 5:31 ` Vasanthakumar Thiagarajan
2010-10-13 16:39 ` Ben Greear
2010-10-13 19:56 ` Björn Smedman
2010-10-13 20:03 ` Luis R. Rodriguez
2010-10-14 19:15 ` Ben Greear
2010-10-14 19:17 ` Luis R. Rodriguez
2010-10-14 21:52 ` Björn Smedman
2010-10-14 22:05 ` Ben Greear
2010-10-14 22:16 ` Luis R. Rodriguez
2010-10-14 22:29 ` Luis R. Rodriguez
2010-10-14 22:35 ` Luis R. Rodriguez
2010-10-14 22:44 ` Ben Greear
2010-10-14 22:54 ` Luis R. Rodriguez
2010-10-14 22:51 ` Luis R. Rodriguez
2010-10-14 23:19 ` Luis R. Rodriguez
2010-10-14 23:30 ` Ben Greear
2010-10-14 23:39 ` Luis R. Rodriguez
2010-10-14 23:48 ` Luis R. Rodriguez
2010-10-15 16:51 ` Ben Greear
2010-10-15 18:47 ` Luis R. Rodriguez
2010-10-15 19:36 ` Ben Greear
2010-10-15 21:07 ` Luis R. Rodriguez
2010-10-15 23:21 ` Luis R. Rodriguez
2010-10-15 23:33 ` Ben Greear
2010-10-15 23:38 ` Luis R. Rodriguez
2010-10-15 23:41 ` Luis R. Rodriguez
2010-10-16 0:07 ` Ben Greear
2010-10-15 23:42 ` Ben Greear
2010-10-15 23:57 ` Luis R. Rodriguez
2010-10-17 19:44 ` Ben Greear
2010-10-18 22:46 ` Luis R. Rodriguez
2010-10-15 23:39 ` Ben Greear
2010-10-14 23:51 ` Ben Greear
2010-10-14 22:47 ` Ben Greear
2010-10-14 23:46 ` Björn Smedman
2010-10-18 13:48 ` Björn Smedman
2010-10-18 17:24 ` Luis R. Rodriguez
2010-10-18 22:34 ` Björn Smedman
2010-10-18 22:41 ` Luis R. Rodriguez
2010-10-14 5:37 ` Vasanthakumar Thiagarajan
2010-10-07 21:52 ` Ben Greear
2010-10-08 0:42 ` Bruno Randolf
2010-10-08 2:30 ` Ben Greear
2010-10-05 17:22 ` Johannes Berg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CB5F0D4.4020907@candelatech.com \
--to=greearb@candelatech.com \
--cc=johannes@sipsolutions.net \
--cc=linux-wireless@vger.kernel.org \
--cc=mcgrof@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.