* WARNING in agg-tx.c (3.5.7+, not tainted)
@ 2012-10-31 18:54 Ben Greear
2012-10-31 19:10 ` Johannes Berg
From: Ben Greear @ 2012-10-31 18:54 UTC
To: linux-wireless@vger.kernel.org
While trying to debug another kernel splat, I saw this one. System was
in process of creating and getting DHCP for 400 virtual stations.
Code in question:
    mutex_lock(&sta->ampdu_mlme.mtx);
    tid_tx = rcu_dereference_protected_tid_tx(sta, tid);

    if (WARN_ON(!tid_tx)) {
#ifdef CONFIG_MAC80211_HT_DEBUG
        printk(KERN_DEBUG "addBA was not requested!\n");
#endif
        goto unlock;
    }
------------[ cut here ]------------
WARNING: at /home/greearb/git/linux-3.5.dev.y/net/mac80211/agg-tx.c:650 ieee80211_start_tx_ba_cb+0x9d/0xd4 [mac80211]()
wiphy0: start_sw_scan: running-other-vifs: 0 running-station-vifs: 201, associated-stations: 42 scanning current channel: 2412 MHz
Hardware name: To be filled by O.E.M.
Modules linked in: ath5k ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfs nfs_acl auth_rpcgss fscache 8021q garp stp llc macvlan pktgen lockd sunrpc
gpio_ich joydev coretemp hwmon ppdev kvm snd_hda_codec_realtek microcode snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw pcspkr snd_pcm
snd_page_alloc snd_timer snd i2c_i801 lpc_ich mfd_core soundcore e1000e mei parport_pc parport uinput ipv6 i915 video i2c_algo_bit drm_kms_helper drm i2c_core
[last unloaded: nf_nat]
Pid: 146, comm: kworker/u:2 Tainted: G O 3.5.7+ #1
Call Trace:
[<ffffffff8105d94c>] warn_slowpath_common+0x80/0x98
[<ffffffff8105d979>] warn_slowpath_null+0x15/0x17
[<ffffffffa04bab87>] ieee80211_start_tx_ba_cb+0x9d/0xd4 [mac80211]
[<ffffffffa04c09c6>] ieee80211_iface_work+0xaa/0x2c7 [mac80211]
[<ffffffff81072a84>] process_one_work+0x20e/0x345
[<ffffffff81072a23>] ? process_one_work+0x1ad/0x345
[<ffffffffa04c091c>] ? ieee80211_teardown_sdata+0xd9/0xd9 [mac80211]
[<ffffffff81074d44>] worker_thread+0x136/0x255
[<ffffffff81074c0e>] ? manage_workers+0x191/0x191
[<ffffffff81078aeb>] kthread+0x84/0x8c
[<ffffffff815185b4>] kernel_thread_helper+0x4/0x10
[<ffffffff815121f4>] ? retint_restore_args+0x13/0x13
[<ffffffff81078a67>] ? __init_kthread_worker+0x56/0x56
[<ffffffff815185b0>] ? gs_change+0x13/0x13
---[ end trace 03bb877f0a1bdfc6 ]---
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: WARNING in agg-tx.c (3.5.7+, not tainted)
2012-10-31 18:54 WARNING in agg-tx.c (3.5.7+, not tainted) Ben Greear
@ 2012-10-31 19:10 ` Johannes Berg
2012-10-31 19:18 ` Ben Greear
From: Johannes Berg @ 2012-10-31 19:10 UTC
To: Ben Greear; +Cc: linux-wireless@vger.kernel.org
On Wed, 2012-10-31 at 11:54 -0700, Ben Greear wrote:
> While trying to debug another kernel splat, I saw this one. System was
> in process of creating and getting DHCP for 400 virtual stations.
>
> Code in question:
>
>
>
> mutex_lock(&sta->ampdu_mlme.mtx);
> tid_tx = rcu_dereference_protected_tid_tx(sta, tid);
>
> if (WARN_ON(!tid_tx)) {
> #ifdef CONFIG_MAC80211_HT_DEBUG
> printk(KERN_DEBUG "addBA was not requested!\n");
> #endif
Hm should probably be a WARN(), but ...
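For illustration only, here is a minimal user-space sketch of that idea: one WARN()-style call that carries its own message, in place of a bare WARN_ON() plus a printk() that only exists under a debug config option. The helper warn_sketch(), the counter warn_hits, the function start_tx_ba_cb_sketch() and the extra tid argument in the message are all hypothetical stand-ins. This is not the mac80211 code and not a proposed patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how many times the mock warning fired (hypothetical helper state). */
static int warn_hits;

/*
 * User-space stand-in for a WARN()-style check: when cond is true, print the
 * formatted message to stderr and count the hit. Returns 1 if it warned and
 * 0 otherwise, so it can be used directly inside an if ().
 */
static int warn_sketch(int cond, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    if (!cond)
        return 0;

    warn_hits++;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}

/*
 * Mirrors the shape of the quoted check: bail out when no session state
 * exists for this TID. Returns 0 when tid_tx is present, -1 when it warned.
 */
int start_tx_ba_cb_sketch(const void *tid_tx, int tid)
{
    if (warn_sketch(tid_tx == NULL,
                    "addBA was not requested for tid %d!\n", tid))
        return -1;
    return 0;
}
```

The point of the sketch is only that the message travels with the warning, so it shows up regardless of any debug config option.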
From the backtrace it looks like maybe you were tearing down the
interface? So maybe it's possible that you were removing it or something,
and the driver had just accepted the session? Hmm.
johannes
* Re: WARNING in agg-tx.c (3.5.7+, not tainted)
2012-10-31 19:10 ` Johannes Berg
@ 2012-10-31 19:18 ` Ben Greear
2012-10-31 19:20 ` Johannes Berg
From: Ben Greear @ 2012-10-31 19:18 UTC
To: Johannes Berg; +Cc: linux-wireless@vger.kernel.org
On 10/31/2012 12:10 PM, Johannes Berg wrote:
> On Wed, 2012-10-31 at 11:54 -0700, Ben Greear wrote:
>> While trying to debug another kernel splat, I saw this one. System was
>> in process of creating and getting DHCP for 400 virtual stations.
>>
>> Code in question:
>>
>>
>>
>> mutex_lock(&sta->ampdu_mlme.mtx);
>> tid_tx = rcu_dereference_protected_tid_tx(sta, tid);
>>
>> if (WARN_ON(!tid_tx)) {
>> #ifdef CONFIG_MAC80211_HT_DEBUG
>> printk(KERN_DEBUG "addBA was not requested!\n");
>> #endif
>
> Hm should probably be a WARN(), but ...
>
> From the backtrace it looks like maybe you were tearing down the
> interface? So maybe it's possible that you were removing it or something,
> and the driver had just accepted the session? Hmm.
I've got 400 interfaces churning, some being reset due to lack of
fast enough DHCP response, etc. Could easily be related to that
drv-remove-interface bug as well..it is much more easily reproduced
in this scenario. Will re-run some tests with your suggested patch
applied...
Thanks,
Ben
>
> johannes
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: WARNING in agg-tx.c (3.5.7+, not tainted)
2012-10-31 19:18 ` Ben Greear
@ 2012-10-31 19:20 ` Johannes Berg
2012-10-31 23:44 ` Ben Greear
From: Johannes Berg @ 2012-10-31 19:20 UTC
To: Ben Greear; +Cc: linux-wireless@vger.kernel.org
On Wed, 2012-10-31 at 12:18 -0700, Ben Greear wrote:
> >> mutex_lock(&sta->ampdu_mlme.mtx);
> >> tid_tx = rcu_dereference_protected_tid_tx(sta, tid);
> >>
> >> if (WARN_ON(!tid_tx)) {
> >> #ifdef CONFIG_MAC80211_HT_DEBUG
> >> printk(KERN_DEBUG "addBA was not requested!\n");
> >> #endif
> >
> > Hm should probably be a WARN(), but ...
> >
> > From the backtrace it looks like maybe you were tearing down the
> > interface? So maybe it's possible that you were removing it or something,
> > and the driver had just accepted the session? Hmm.
>
> I've got 400 interfaces churning, some being reset due to lack of
> fast enough DHCP response, etc. Could easily be related to that
> drv-remove-interface bug as well..
I didn't even think of that, but yeah, that seems possible.
> it is much more easily reproduced
> in this scenario. Will re-run some tests with your suggested patch
> applied...
Ok cool.
johannes
* Re: WARNING in agg-tx.c (3.5.7+, not tainted)
2012-10-31 19:20 ` Johannes Berg
@ 2012-10-31 23:44 ` Ben Greear
2012-11-14 9:37 ` Johannes Berg
From: Ben Greear @ 2012-10-31 23:44 UTC
To: Johannes Berg; +Cc: linux-wireless@vger.kernel.org
On 10/31/2012 12:20 PM, Johannes Berg wrote:
> On Wed, 2012-10-31 at 12:18 -0700, Ben Greear wrote:
>
>>>> mutex_lock(&sta->ampdu_mlme.mtx);
>>>> tid_tx = rcu_dereference_protected_tid_tx(sta, tid);
>>>>
>>>> if (WARN_ON(!tid_tx)) {
>>>> #ifdef CONFIG_MAC80211_HT_DEBUG
>>>> printk(KERN_DEBUG "addBA was not requested!\n");
>>>> #endif
>>>
>>> Hm should probably be a WARN(), but ...
>>>
>>> From the backtrace it looks like maybe you were tearing down the
>>> interface? So maybe it's possible that you were removing it or something,
>>> and the driver had just accepted the session? Hmm.
>>
>> I've got 400 interfaces churning, some being reset due to lack of
>> fast enough DHCP response, etc. Could easily be related to that
>> drv-remove-interface bug as well..
>
> I didn't even think of that, but yeah, that seems possible.
>
>> it is much more easily reproduced
>> in this scenario. Will re-run some tests with your suggested patch
>> applied...
>
> Ok cool.
Unfortunately, it still happens even with the other patch applied.
I have only seen it once in several hours of testing, and it doesn't
seem to cause any lasting harm.
But, I have at least some sort of test case for it, so
if you have a suggested patch, I'll be happy to test it.
Thanks,
Ben
>
> johannes
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: WARNING in agg-tx.c (3.5.7+, not tainted)
2012-10-31 23:44 ` Ben Greear
@ 2012-11-14 9:37 ` Johannes Berg
From: Johannes Berg @ 2012-11-14 9:37 UTC
To: Ben Greear; +Cc: linux-wireless@vger.kernel.org
On Wed, 2012-10-31 at 16:44 -0700, Ben Greear wrote:
> >>>> mutex_lock(&sta->ampdu_mlme.mtx);
> >>>> tid_tx = rcu_dereference_protected_tid_tx(sta, tid);
> >>>>
> >>>> if (WARN_ON(!tid_tx)) {
> >>>> #ifdef CONFIG_MAC80211_HT_DEBUG
> >>>> printk(KERN_DEBUG "addBA was not requested!\n");
> >>>> #endif
> >>>
> >>> Hm should probably be a WARN(), but ...
> >>>
> >>> From the backtrace it looks like maybe you were tearing down the
> >>> interface? So maybe it's possible that you were removing it or something,
> >>> and the driver had just accepted the session? Hmm.
> >>
> >> I've got 400 interfaces churning, some being reset due to lack of
> >> fast enough DHCP response, etc. Could easily be related to that
> >> drv-remove-interface bug as well..
> Unfortunately, it still happens even with the other patch applied.
Ok.
> I have only seen it once in several hours of testing, and it doesn't
> seem to cause any lasting harm.
>
> But, I have at least some sort of test case for it, so
> if you have a suggested patch, I'll be happy to test it.
I don't, sorry, and I haven't really had the time to investigate. It's
probably a race condition with the driver, something like this:
- BA session start request
- frame sent to the peer
- driver asked
- driver says OK, which gets queued on the workqueue
- meanwhile, workqueue is doing something else
  (since you have lots of interfaces)
- now you ifdown before the workqueue comes around to the new item
- removing the interface removes the station, which in turn removes the
  aggregation session data
- now the workqueue flush (which you found slow in your other thread)
  comes and runs the new item, which warns because the station's
  aggregation data is long gone
Something like that, I suspect.
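
To make the suspected ordering concrete, here is a rough single-threaded user-space sketch that replays the steps in sequence. It is a simplified model, not mac80211 code: struct sta_sketch, struct tid_tx_sketch, struct work_item, NUM_TIDS and all function names are hypothetical, and the station struct is kept alive so that only the loss of the per-TID session data is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_TIDS 16  /* assumed number of TIDs per station, for the sketch */

/* Hypothetical stand-in for the per-TID aggregation session state. */
struct tid_tx_sketch { int tid; };

/* Hypothetical stand-in for a station entry. */
struct sta_sketch { struct tid_tx_sketch *tid_tx[NUM_TIDS]; };

/* One deferred "driver said OK" item as it would sit on the workqueue. */
struct work_item { struct sta_sketch *sta; int tid; };

static int warnings;  /* how often the WARN_ON-like check fired */

/* BA session start request: allocate session state for this TID. */
static void start_ba_session(struct sta_sketch *sta, int tid)
{
    struct tid_tx_sketch *t = malloc(sizeof(*t));

    assert(t != NULL);
    t->tid = tid;
    sta->tid_tx[tid] = t;
}

/* ifdown path: removing the station drops all aggregation session data. */
static void teardown_sessions(struct sta_sketch *sta)
{
    for (int i = 0; i < NUM_TIDS; i++) {
        free(sta->tid_tx[i]);
        sta->tid_tx[i] = NULL;
    }
}

/* Deferred callback: same NULL check as in the quoted code. */
static bool start_tx_ba_cb_sketch(const struct work_item *w)
{
    if (w->sta->tid_tx[w->tid] == NULL) {
        warnings++;  /* this is where the real code would WARN */
        return false;
    }
    return true;
}

/* Replays the ordering above; returns the number of warnings seen. */
int replay_race(bool teardown_before_work)
{
    struct sta_sketch sta = { { NULL } };
    struct work_item queued;

    warnings = 0;
    start_ba_session(&sta, 0);       /* session requested, driver asked */
    queued.sta = &sta;               /* driver says OK, item is queued  */
    queued.tid = 0;
    if (teardown_before_work)
        teardown_sessions(&sta);     /* ifdown wins the race            */
    start_tx_ba_cb_sketch(&queued);  /* workqueue finally runs the item */
    teardown_sessions(&sta);         /* normal cleanup                  */
    return warnings;
}
```

In this model replay_race(true) returns 1 (the deferred item finds no session data and warns) and replay_race(false) returns 0, which is the shape of the race I suspect, nothing more.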
johannes