* [PATCH] mac80211: hoist sta->lock from reorder release timer @ 2010-10-06 10:00 Christian Lamparter 2010-10-06 10:10 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: Christian Lamparter @ 2010-10-06 10:00 UTC (permalink / raw) To: linux-wireless; +Cc: linville, Ben Greear, Ming Lei The patch "mac80211: AMPDU rx reorder timeout timer" clashes with "mac80211: use netif_receive_skb in ieee80211_rx callpath" The timer itself is part of the station's private struct. The clean-up routine will deactivate the timer as soon as the station is removed. Therefore the extra sta->lock protection should not be necessary. Cc: Ben Greear <greearb@candelatech.com> Reported-by: Ming Lei<tom.leiming@gmail.com> Signed-off-by: Christian Lamparter<chunkeey@googlemail.com> --- reference from Ben Greear: (no tested-by, because no-one could actually reproduce the original lockdep warning) http://www.spinics.net/lists/linux-wireless/msg56900.html --- diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c index 58eab9e..309ed70 100644 --- a/net/mac80211/agg-rx.c +++ b/net/mac80211/agg-rx.c @@ -129,9 +129,7 @@ static void sta_rx_agg_reorder_timer_expired(unsigned long data) timer_to_tid[0]); rcu_read_lock(); - spin_lock(&sta->lock); ieee80211_release_reorder_timeout(sta, *ptid); - spin_unlock(&sta->lock); rcu_read_unlock(); } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 10:00 [PATCH] mac80211: hoist sta->lock from reorder release timer Christian Lamparter @ 2010-10-06 10:10 ` Johannes Berg 2010-10-06 10:20 ` Christian Lamparter 0 siblings, 1 reply; 12+ messages in thread From: Johannes Berg @ 2010-10-06 10:10 UTC (permalink / raw) To: Christian Lamparter; +Cc: linux-wireless, linville, Ben Greear, Ming Lei On Wed, 2010-10-06 at 12:00 +0200, Christian Lamparter wrote: > The timer itself is part of the station's private struct. > The clean-up routine will deactivate the timer as soon as > the station is removed. Therefore the extra sta->lock > protection should not be necessary. > rcu_read_lock(); > - spin_lock(&sta->lock); > ieee80211_release_reorder_timeout(sta, *ptid); > - spin_unlock(&sta->lock); > rcu_read_unlock(); There's a comment on ieee80211_release_reorder_timeout() saying that the lock must be held -- which is probably not true? We don't generally hold that lock on the RX path...? johannes ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 10:10 ` Johannes Berg @ 2010-10-06 10:20 ` Christian Lamparter 2010-10-06 10:41 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: Christian Lamparter @ 2010-10-06 10:20 UTC (permalink / raw) To: Johannes Berg; +Cc: linux-wireless, linville, Ben Greear, Ming Lei On Wednesday 06 October 2010 12:10:26 Johannes Berg wrote: > On Wed, 2010-10-06 at 12:00 +0200, Christian Lamparter wrote: > > > The timer itself is part of the station's private struct. > > The clean-up routine will deactivate the timer as soon as > > the station is removed. Therefore the extra sta->lock > > protection should not be necessary. > > > rcu_read_lock(); > > - spin_lock(&sta->lock); > > ieee80211_release_reorder_timeout(sta, *ptid); > > - spin_unlock(&sta->lock); > > rcu_read_unlock(); > > There's a comment on ieee80211_release_reorder_timeout() saying that the > lock must be held -- which is probably not true? We don't generally hold > that lock on the RX path...? That comment is more or less a 1:1 copy from the comment about struct tid_ampdu_rx (in sta_info.h). > * This structure is protected by RCU and the per-station > * spinlock. Assignments to the array holding it must hold > * the spinlock, only the RX path can access it under RCU > * lock-free. thing is: we now have the reorder_lock which protects the reorder buffer against "destructive access". So, is it "ok" to trim the comments a bit? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 10:20 ` Christian Lamparter @ 2010-10-06 10:41 ` Johannes Berg 2010-10-06 11:43 ` Christian Lamparter 0 siblings, 1 reply; 12+ messages in thread From: Johannes Berg @ 2010-10-06 10:41 UTC (permalink / raw) To: Christian Lamparter; +Cc: linux-wireless, linville, Ben Greear, Ming Lei On Wed, 2010-10-06 at 12:20 +0200, Christian Lamparter wrote: > On Wednesday 06 October 2010 12:10:26 Johannes Berg wrote: > > On Wed, 2010-10-06 at 12:00 +0200, Christian Lamparter wrote: > > > > > The timer itself is part of the station's private struct. > > > The clean-up routine will deactivate the timer as soon as > > > the station is removed. Therefore the extra sta->lock > > > protection should not be necessary. > > > > > rcu_read_lock(); > > > - spin_lock(&sta->lock); > > > ieee80211_release_reorder_timeout(sta, *ptid); > > > - spin_unlock(&sta->lock); > > > rcu_read_unlock(); > > > > There's a comment on ieee80211_release_reorder_timeout() saying that the > > lock must be held -- which is probably not true? We don't generally hold > > that lock on the RX path...? > > That comment is more or less a 1:1 copy from the comment about > struct tid_ampdu_rx (in sta_info.h). Kinda. > > * This structure is protected by RCU and the per-station > > * spinlock. Assignments to the array holding it must hold > > * the spinlock, only the RX path can access it under RCU > > * lock-free. > > thing is: we now have the reorder_lock which protects the > reorder buffer against "destructive access". So, is it "ok" > to trim the comments a bit? Well, so if this patch is OK, it would be, but looking at tid_agg_rx->head_seq_num and buf_size for example, they're not always protected by the reorder lock (though they could easily be). In fact, there are more races, like for example ieee80211_release_reorder_frames not being invoked with the reorder lock held from ieee80211_rx_h_ctrl, which could lead to issues? Basically the thing is that until your patch, the data in the struct didn't actually need locking because it was accessed by the RX path only which is not concurrent. So now we have to carefully analyse what we can do. The comment about the sta->lock has probably never been really correct... johannes ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 10:41 ` Johannes Berg @ 2010-10-06 11:43 ` Christian Lamparter 2010-10-06 11:46 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: Christian Lamparter @ 2010-10-06 11:43 UTC (permalink / raw) To: Johannes Berg; +Cc: linux-wireless, linville, Ben Greear, Ming Lei On Wednesday 06 October 2010 12:41:45 Johannes Berg wrote: > On Wed, 2010-10-06 at 12:20 +0200, Christian Lamparter wrote: > > On Wednesday 06 October 2010 12:10:26 Johannes Berg wrote: > > > On Wed, 2010-10-06 at 12:00 +0200, Christian Lamparter wrote: > > > > > > > The timer itself is part of the station's private struct. > > > > The clean-up routine will deactivate the timer as soon as > > > > the station is removed. Therefore the extra sta->lock > > > > protection should not be necessary. > > > > > > > rcu_read_lock(); > > > > - spin_lock(&sta->lock); > > > > ieee80211_release_reorder_timeout(sta, *ptid); > > > > - spin_unlock(&sta->lock); > > > > rcu_read_unlock(); > > > > > > There's a comment on ieee80211_release_reorder_timeout() saying that the > > > lock must be held -- which is probably not true? We don't generally hold > > > that lock on the RX path...? > > > > That comment is more or less a 1:1 copy from the comment about > > struct tid_ampdu_rx (in sta_info.h). > > Kinda. > > > > * This structure is protected by RCU and the per-station > > > * spinlock. Assignments to the array holding it must hold > > > * the spinlock, only the RX path can access it under RCU > > > * lock-free. > > > > thing is: we now have the reorder_lock which protects the > > reorder buffer against "destructive access". So, is it "ok" > > to trim the comments a bit? > > Well, so if this patch is OK, it would be, but looking at > tid_agg_rx->head_seq_num and buf_size for example, they're not always > protected by the reorder lock (though they could easily be). > > In fact, there are more races, like for example > ieee80211_release_reorder_frames not being invoked with the reorder lock > held from ieee80211_rx_h_ctrl, which could lead to issues? > > Basically the thing is that until your patch, the data in the struct > didn't actually need locking because it was accessed by the RX path only > which is not concurrent. > I see. So basically all rx handlers are affected by these rx->sta races. John, can you please revert (or at least drop from the upcoming 2.6.37-rcX cycle): (mac80211: fix release_reorder_timeout in scan) mac80211: fix rcu-unsafe pointer dereference mac80211: AMPDU rx reorder timeout timer (mac80211: remove unused rate function parameter) mac80211: put rx handlers into separate functions ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 11:43 ` Christian Lamparter @ 2010-10-06 11:46 ` Johannes Berg 2010-10-06 20:21 ` John W. Linville 0 siblings, 1 reply; 12+ messages in thread From: Johannes Berg @ 2010-10-06 11:46 UTC (permalink / raw) To: Christian Lamparter; +Cc: linux-wireless, linville, Ben Greear, Ming Lei On Wed, 2010-10-06 at 13:43 +0200, Christian Lamparter wrote: > > Basically the thing is that until your patch, the data in the struct > > didn't actually need locking because it was accessed by the RX path only > > which is not concurrent. > > > I see. So basically all rx handlers are affected by these rx->sta races. > > John, can you please revert (or at least drop from the upcoming 2.6.37-rcX cycle): > > (mac80211: fix release_reorder_timeout in scan) > mac80211: fix rcu-unsafe pointer dereference > mac80211: AMPDU rx reorder timeout timer > (mac80211: remove unused rate function parameter) > mac80211: put rx handlers into separate functions I think it's probably easier to fix than to revert now? There are only a handful of fields, and it seemed to me that most of them can easily be moved under the reorder lock. johannes ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 11:46 ` Johannes Berg @ 2010-10-06 20:21 ` John W. Linville 2010-10-07 21:03 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: John W. Linville @ 2010-10-06 20:21 UTC (permalink / raw) To: Johannes Berg; +Cc: Christian Lamparter, linux-wireless, Ben Greear, Ming Lei On Wed, Oct 06, 2010 at 01:46:18PM +0200, Johannes Berg wrote: > On Wed, 2010-10-06 at 13:43 +0200, Christian Lamparter wrote: > > > > Basically the thing is that until your patch, the data in the struct > > > didn't actually need locking because it was accessed by the RX path only > > > which is not concurrent. > > > > > I see. So basically all rx handlers are affected by these rx->sta races. > > > > John, can you please revert (or at least drop from the upcoming 2.6.37-rcX cycle): > > > > (mac80211: fix release_reorder_timeout in scan) > > mac80211: fix rcu-unsafe pointer dereference > > mac80211: AMPDU rx reorder timeout timer > > (mac80211: remove unused rate function parameter) > > mac80211: put rx handlers into separate functions > > I think it's probably easier to fix than to revert now? There are only a > handful of fields, and it seemed to me that most of them can easily be > moved under the reorder lock. I would prefer a fix on top rather than a series of reverts... John -- John W. Linville Someday the world will need a hero, and you linville@tuxdriver.com might be all we have. Be ready. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-06 20:21 ` John W. Linville @ 2010-10-07 21:03 ` Johannes Berg 2010-10-08 16:42 ` Christian Lamparter 0 siblings, 1 reply; 12+ messages in thread From: Johannes Berg @ 2010-10-07 21:03 UTC (permalink / raw) To: John W. Linville Cc: Christian Lamparter, linux-wireless, Ben Greear, Ming Lei On Wed, 2010-10-06 at 16:21 -0400, John W. Linville wrote: > > I think it's probably easier to fix than to revert now? There are only a > > handful of fields, and it seemed to me that most of them can easily be > > moved under the reorder lock. > > I would prefer a fix on top rather than a series of reverts... I think this should fix it. Somebody review please? johannes --- net/mac80211/agg-rx.c | 8 +++----- net/mac80211/debugfs_sta.c | 29 +++++++++++++++-------------- net/mac80211/rx.c | 17 +++++++++++++---- net/mac80211/sta_info.h | 29 ++++++++++++++--------------- 4 files changed, 45 insertions(+), 38 deletions(-) --- wireless-testing.orig/net/mac80211/agg-rx.c 2010-10-07 22:44:04.000000000 +0200 +++ wireless-testing/net/mac80211/agg-rx.c 2010-10-07 22:53:33.000000000 +0200 @@ -129,9 +129,7 @@ static void sta_rx_agg_reorder_timer_exp timer_to_tid[0]); rcu_read_lock(); - spin_lock(&sta->lock); ieee80211_release_reorder_timeout(sta, *ptid); - spin_unlock(&sta->lock); rcu_read_unlock(); } @@ -256,7 +254,7 @@ void ieee80211_process_addba_request(str } /* prepare A-MPDU MLME for Rx aggregation */ - tid_agg_rx = kmalloc(sizeof(struct tid_ampdu_rx), GFP_ATOMIC); + tid_agg_rx = kmalloc(sizeof(struct tid_ampdu_rx), GFP_KERNEL); if (!tid_agg_rx) { #ifdef CONFIG_MAC80211_HT_DEBUG if (net_ratelimit()) @@ -280,9 +278,9 @@ void ieee80211_process_addba_request(str /* prepare reordering buffer */ tid_agg_rx->reorder_buf = - kcalloc(buf_size, sizeof(struct sk_buff *), GFP_ATOMIC); + kcalloc(buf_size, sizeof(struct sk_buff *), GFP_KERNEL); tid_agg_rx->reorder_time = - kcalloc(buf_size, sizeof(unsigned long), GFP_ATOMIC); + kcalloc(buf_size, sizeof(unsigned long), GFP_KERNEL); if (!tid_agg_rx->reorder_buf || !tid_agg_rx->reorder_time) { #ifdef CONFIG_MAC80211_HT_DEBUG if (net_ratelimit()) --- wireless-testing.orig/net/mac80211/debugfs_sta.c 2010-10-07 22:47:41.000000000 +0200 +++ wireless-testing/net/mac80211/debugfs_sta.c 2010-10-07 22:50:03.000000000 +0200 @@ -116,34 +116,35 @@ static ssize_t sta_agg_status_read(struc char buf[71 + STA_TID_NUM * 40], *p = buf; int i; struct sta_info *sta = file->private_data; + struct tid_ampdu_rx *tid_rx; + struct tid_ampdu_tx *tid_tx; + + rcu_read_lock(); - spin_lock_bh(&sta->lock); p += scnprintf(p, sizeof(buf) + buf - p, "next dialog_token: %#02x\n", sta->ampdu_mlme.dialog_token_allocator + 1); p += scnprintf(p, sizeof(buf) + buf - p, "TID\t\tRX active\tDTKN\tSSN\t\tTX\tDTKN\tpending\n"); + for (i = 0; i < STA_TID_NUM; i++) { + tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[i]); + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[i]); + p += scnprintf(p, sizeof(buf) + buf - p, "%02d", i); - p += scnprintf(p, sizeof(buf) + buf - p, "\t\t%x", - !!sta->ampdu_mlme.tid_rx[i]); + p += scnprintf(p, sizeof(buf) + buf - p, "\t\t%x", !!tid_rx); p += scnprintf(p, sizeof(buf) + buf - p, "\t%#.2x", - sta->ampdu_mlme.tid_rx[i] ? - sta->ampdu_mlme.tid_rx[i]->dialog_token : 0); + tid_rx ? tid_rx->dialog_token : 0); p += scnprintf(p, sizeof(buf) + buf - p, "\t%#.3x", - sta->ampdu_mlme.tid_rx[i] ? - sta->ampdu_mlme.tid_rx[i]->ssn : 0); + tid_rx ? tid_rx->ssn : 0); - p += scnprintf(p, sizeof(buf) + buf - p, "\t\t%x", - !!sta->ampdu_mlme.tid_tx[i]); + p += scnprintf(p, sizeof(buf) + buf - p, "\t\t%x", !!tid_tx); p += scnprintf(p, sizeof(buf) + buf - p, "\t%#.2x", - sta->ampdu_mlme.tid_tx[i] ? - sta->ampdu_mlme.tid_tx[i]->dialog_token : 0); + tid_tx ? tid_tx->dialog_token : 0); p += scnprintf(p, sizeof(buf) + buf - p, "\t%03d", - sta->ampdu_mlme.tid_tx[i] ? - skb_queue_len(&sta->ampdu_mlme.tid_tx[i]->pending) : 0); + tid_tx ? skb_queue_len(&tid_tx->pending) : 0); p += scnprintf(p, sizeof(buf) + buf - p, "\n"); } - spin_unlock_bh(&sta->lock); + rcu_read_unlock(); return simple_read_from_buffer(userbuf, count, ppos, buf, p - buf); } --- wireless-testing.orig/net/mac80211/sta_info.h 2010-10-07 22:44:16.000000000 +0200 +++ wireless-testing/net/mac80211/sta_info.h 2010-10-07 23:01:53.000000000 +0200 @@ -81,13 +81,14 @@ enum ieee80211_sta_info_flags { * @stop_initiator: initiator of a session stop * @tx_stop: TX DelBA frame when stopping * - * This structure is protected by RCU and the per-station - * spinlock. Assignments to the array holding it must hold - * the spinlock, only the TX path can access it under RCU - * lock-free if, and only if, the state has the flag - * %HT_AGG_STATE_OPERATIONAL set. Otherwise, the TX path - * must also acquire the spinlock and re-check the state, - * see comments in the tx code touching it. + * This structure's lifetime is managed by RCU, assignments to + * the array holding it must hold the aggregation mutex. + * + * The TX path can access it under RCU lock-free if, and + * only if, the state has the flag %HT_AGG_STATE_OPERATIONAL + * set. Otherwise, the TX path must also acquire the spinlock + * and re-check the state, see comments in the tx code + * touching it. */ struct tid_ampdu_tx { struct rcu_head rcu_head; @@ -115,15 +116,13 @@ struct tid_ampdu_tx { * @rcu_head: RCU head used for freeing this struct * @reorder_lock: serializes access to reorder buffer, see below. * - * This structure is protected by RCU and the per-station - * spinlock. Assignments to the array holding it must hold - * the spinlock. + * This structure's lifetime is managed by RCU, assignments to + * the array holding it must hold the aggregation mutex. * - * The @reorder_lock is used to protect the variables and - * arrays such as @reorder_buf, @reorder_time, @head_seq_num, - * @stored_mpdu_num and @reorder_time from being corrupted by - * concurrent access of the RX path and the expired frame - * release timer. + * The @reorder_lock is used to protect the members of this + * struct, except for @timeout, @buf_size and @dialog_token, + * which are constant across the lifetime of the struct (the + * dialog token being used only for debugging). */ struct tid_ampdu_rx { struct rcu_head rcu_head; --- wireless-testing.orig/net/mac80211/rx.c 2010-10-07 22:52:03.000000000 +0200 +++ wireless-testing/net/mac80211/rx.c 2010-10-07 22:58:17.000000000 +0200 @@ -538,6 +538,8 @@ static void ieee80211_release_reorder_fr { struct sk_buff *skb = tid_agg_rx->reorder_buf[index]; + lockdep_assert_held(&tid_agg_rx->reorder_lock); + if (!skb) goto no_frame; @@ -557,6 +559,8 @@ static void ieee80211_release_reorder_fr { int index; + lockdep_assert_held(&tid_agg_rx->reorder_lock); + while (seq_less(tid_agg_rx->head_seq_num, head_seq_num)) { index = seq_sub(tid_agg_rx->head_seq_num, tid_agg_rx->ssn) % tid_agg_rx->buf_size; @@ -581,6 +585,8 @@ static void ieee80211_sta_reorder_releas { int index, j; + lockdep_assert_held(&tid_agg_rx->reorder_lock); + /* release the buffer until next missing frame */ index = seq_sub(tid_agg_rx->head_seq_num, tid_agg_rx->ssn) % tid_agg_rx->buf_size; @@ -659,10 +665,11 @@ static bool ieee80211_sta_manage_reorder int index; bool ret = true; + spin_lock(&tid_agg_rx->reorder_lock); + buf_size = tid_agg_rx->buf_size; head_seq_num = tid_agg_rx->head_seq_num; - spin_lock(&tid_agg_rx->reorder_lock); /* frame with out of date sequence number */ if (seq_less(mpdu_seq_num, head_seq_num)) { dev_kfree_skb(skb); @@ -1899,9 +1906,12 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_ mod_timer(&tid_agg_rx->session_timer, TU_TO_EXP_TIME(tid_agg_rx->timeout)); + spin_lock(&tid_agg_rx->reorder_lock); /* release stored frames up to start of BAR */ ieee80211_release_reorder_frames(hw, tid_agg_rx, start_seq_num, frames); + spin_unlock(&tid_agg_rx->reorder_lock); + kfree_skb(skb); return RX_QUEUED; } @@ -2493,9 +2503,8 @@ static void ieee80211_invoke_rx_handlers } /* - * This function makes calls into the RX path. Therefore the - * caller must hold the sta_info->lock and everything has to - * be under rcu_read_lock protection as well. + * This function makes calls into the RX path, therefore + * it has to be invoked under RCU read lock. */ void ieee80211_release_reorder_timeout(struct sta_info *sta, int tid) { ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-07 21:03 ` Johannes Berg @ 2010-10-08 16:42 ` Christian Lamparter 2010-10-08 16:53 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: Christian Lamparter @ 2010-10-08 16:42 UTC (permalink / raw) To: Johannes Berg; +Cc: John W. Linville, linux-wireless, Ben Greear, Ming Lei On Thursday 07 October 2010 23:03:13 Johannes Berg wrote: > On Wed, 2010-10-06 at 16:21 -0400, John W. Linville wrote: > > > > I think it's probably easier to fix than to revert now? There are only a > > > handful of fields, and it seemed to me that most of them can easily be > > > moved under the reorder lock. > > > > I would prefer a fix on top rather than a series of reverts... > > I think this should fix it. Somebody review please? > > johannes > Sure, a little bit. The code itself is fine but as you said the rx_handler code wasn't written for concurrent/delayed release timer mechanism. for example: Because we can't set IEEE80211_RX_RA_MATCH (since it interferes with scanning (as explained in "mac80211: fix release_reorder_timeout in scan"). We will experience strange results with "ieee80211_rx_h_decrypt": line: 878 > /* > * No point in finding a key and decrypting if the frame is neither > * addressed to us nor a multicast frame. > */ > if (!(status->rx_flags & IEEE80211_RX_RA_MATCH)) > return RX_CONTINUE; > > /* start without a key */ > rx->key = NULL; no software decryption there - not nice but the HW probably does the decryption for us. - That being said, the stack should be able to do the software decryption "just in case". Things are a little bit better with ieee80211_rx_h_sta_process. It updates some statistics and takes care of sta->last_rx (which is currently not that important giving HT BA is only supported for AP/STA operation). In ieee80211_rx_h_data, there could be another potential problem: > if (ieee80211_is_data(hdr->frame_control) && > !is_multicast_ether_addr(hdr->addr1) && > local->hw.conf.dynamic_ps_timeout > 0 && local->ps_sdata) { > mod_timer(&local->dynamic_ps_timer, jiffies + > msecs_to_jiffies(local->hw.conf.dynamic_ps_timeout)); > } I reckon there could be a "hidden" problem. "jiffies" is now approx 100ms after the packet was received from the interface. (Sure, a similar issue was also present in the original reorder release implementation.) In order the fix this/my mess we would need to: 1. move the software decryption before the reordering (802.11n-spec (page 11, Figure 6-1) allows this) (Or: 1. introduce an additional rx_flag for the reorder release case?) (2. maybe cache the original skb jiffie at some place?) (3. make a few counters atomic_t, so concurrent tasklets can update the stats. Or disable the BHs while processing, any rx frames (which is probably what we're going to do, right?)) Regards, Christian Unfortunately, I have to do some other "high priority" right now, so I'm short of time to do "that" now :-/. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-08 16:42 ` Christian Lamparter @ 2010-10-08 16:53 ` Johannes Berg 2010-10-08 18:12 ` Christian Lamparter 0 siblings, 1 reply; 12+ messages in thread From: Johannes Berg @ 2010-10-08 16:53 UTC (permalink / raw) To: Christian Lamparter Cc: John W. Linville, linux-wireless, Ben Greear, Ming Lei On Fri, 2010-10-08 at 18:42 +0200, Christian Lamparter wrote: > Sure, a little bit. The code itself is fine but as you said > the rx_handler code wasn't written for concurrent/delayed > release timer mechanism. But it should be fine now, no? What data does it still access that's not safe? > for example: > > Because we can't set IEEE80211_RX_RA_MATCH (since > it interferes with scanning (as explained in > "mac80211: fix release_reorder_timeout in scan"). That I don't understand. > We will experience strange results with "ieee80211_rx_h_decrypt": > > line: 878 > > /* > > * No point in finding a key and decrypting if the frame is neither > > * addressed to us nor a multicast frame. > > */ > > if (!(status->rx_flags & IEEE80211_RX_RA_MATCH)) > no software decryption there - not nice but the HW probably does > the decryption for us. - That being said, the stack should be able > to do the software decryption "just in case". But note that the rx_flags are in the *status* now, which is part of the SKB, and no longer on the stack. > Things are a little bit better with ieee80211_rx_h_sta_process. > It updates some statistics and takes care of sta->last_rx > (which is currently not that important giving HT BA is only supported > for AP/STA operation). > > In ieee80211_rx_h_data, there could be another potential problem: > > if (ieee80211_is_data(hdr->frame_control) && > > !is_multicast_ether_addr(hdr->addr1) && > > local->hw.conf.dynamic_ps_timeout > 0 && local->ps_sdata) { > > mod_timer(&local->dynamic_ps_timer, jiffies + > > msecs_to_jiffies(local->hw.conf.dynamic_ps_timeout)); > > } > I reckon there could be a "hidden" problem. "jiffies" is now > approx 100ms after the packet was received from the interface. > (Sure, a similar issue was also present in the original > reorder release implementation.) This one's more interesting. I guess we need to bypass these things somehow, maybe setting a flag if this was a "recovered" frame? > In order the fix this/my mess we would need to: > 1. move the software decryption before the reordering > (802.11n-spec (page 11, Figure 6-1) allows this) > > (Or: > 1. introduce an additional rx_flag for the reorder release case?) > > (2. maybe cache the original skb jiffie at some place?) > > (3. make a few counters atomic_t, so concurrent tasklets > can update the stats. Or disable the BHs while processing, > any rx frames (which is probably what we're going to do, right?)) BHs are disabled while processing RX -- and timer is a BH itself so they're also disabled, right? johannes ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-08 16:53 ` Johannes Berg @ 2010-10-08 18:12 ` Christian Lamparter 2010-10-08 18:45 ` Johannes Berg 0 siblings, 1 reply; 12+ messages in thread From: Christian Lamparter @ 2010-10-08 18:12 UTC (permalink / raw) To: Johannes Berg; +Cc: John W. Linville, linux-wireless, Ben Greear, Ming Lei On Friday 08 October 2010 18:53:22 Johannes Berg wrote: > On Fri, 2010-10-08 at 18:42 +0200, Christian Lamparter wrote: > > > Sure, a little bit. The code itself is fine but as you said > > the rx_handler code wasn't written for concurrent/delayed > > release timer mechanism. > > But it should be fine now, no? What data does it still access that's not > safe? that's what I'm asking myself. > > for example: > > > > Because we can't set IEEE80211_RX_RA_MATCH (since > > it interferes with scanning (as explained in > > "mac80211: fix release_reorder_timeout in scan"). > > That I don't understand. > > > We will experience strange results with "ieee80211_rx_h_decrypt": > > > > line: 878 > > > /* > > > * No point in finding a key and decrypting if the frame is neither > > > * addressed to us nor a multicast frame. > > > */ > > > if (!(status->rx_flags & IEEE80211_RX_RA_MATCH)) > > > no software decryption there - not nice but the HW probably does > > the decryption for us. - That being said, the stack should be able > > to do the software decryption "just in case". > > But note that the rx_flags are in the *status* now, which is part of the > SKB, and no longer on the stack. oops, you are right, my fault. But hey, wait a sec. (This one is about AP mode - It's related to IEEE80211_RX_RA_MATCH, but now in a different handler) NULLFUNCs (set/clear PM) are not reordered and they get processed right away, right? So what if the reorder release triggers and ap_sta_ps_end (called by ieee80211_rx_h_sta_process) accidentally resets the "sleeping" flag (because some old frames with a "stale" PSM bit was released after 100ms)? > > Things are a little bit better with ieee80211_rx_h_sta_process. > > It updates some statistics and takes care of sta->last_rx > > (which is currently not that important giving HT BA is only supported > > for AP/STA operation). > > > > In ieee80211_rx_h_data, there could be another potential problem: > > > if (ieee80211_is_data(hdr->frame_control) && > > > !is_multicast_ether_addr(hdr->addr1) && > > > local->hw.conf.dynamic_ps_timeout > 0 && local->ps_sdata) { > > > mod_timer(&local->dynamic_ps_timer, jiffies + > > > msecs_to_jiffies(local->hw.conf.dynamic_ps_timeout)); > > > } > > I reckon there could be a "hidden" problem. "jiffies" is now > > approx 100ms after the packet was received from the interface. > > (Sure, a similar issue was also present in the original > > reorder release implementation.) > > This one's more interesting. I guess we need to bypass these things > somehow, maybe setting a flag if this was a "recovered" frame? (and check the same flag for ap_sta_ps_end/ap_sta_ps_start). Ok, that's doable (even for me :D) > > In order the fix this/my mess we would need to: > > 1. move the software decryption before the reordering > > (802.11n-spec (page 11, Figure 6-1) allows this) > > > > (Or: > > 1. introduce an additional rx_flag for the reorder release case?) > > > > (2. maybe cache the original skb jiffie at some place?) > > > > (3. make a few counters atomic_t, so concurrent tasklets > > can update the stats. Or disable the BHs while processing, > > any rx frames (which is probably what we're going to do, right?)) > > BHs are disabled while processing RX -- and timer is a BH itself so > they're also disabled, right? hmm, are we talking about BH or tasklets? I read something about the occurrence of simultaneous tasklets/timers on multi-core systems? And from a point that all made sense: --- from kernel-hacking.DocBook: "For this reason, tasklets are more often used: they are dynamically-registrable (meaning you can have as many as you want), and they also guarantee that any tasklet will only run on one CPU at any time, although different tasklets can run simultaneously." --- and kernel-locking.DocBook: "Different Tasklets/Timers: If another tasklet/timer wants to share data with your tasklet or timer, you will both need to use spin_lock() and spin_unlock() calls. spin_lock_bh() is unnecessary here, as you are already in a tasklet, and none will be run on the same CPU." <-- "same" CPU. --- http://www.makelinux.net/ldd3/chp-5-sect-7.shtml: "Normally, even a simple operation such as: n_op++; would require locking. Some processors might perform that sort of increment in an atomic manner, but you can't count on it." --- So according to statements above, we need a lock for the stats too. (and I was wrong about "converting" them all to atomic.) * ieee80211_rx_h_sta_process sta->rx_packets++; sta->rx_fragments++; sta->rx_bytes += rx->skb->len; * ieee80211_rx_h_data: dev->stats.rx_packets++; dev->stats.rx_bytes += rx->skb->len; Regards, Chr ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mac80211: hoist sta->lock from reorder release timer 2010-10-08 18:12 ` Christian Lamparter @ 2010-10-08 18:45 ` Johannes Berg 0 siblings, 0 replies; 12+ messages in thread From: Johannes Berg @ 2010-10-08 18:45 UTC (permalink / raw) To: Christian Lamparter Cc: John W. Linville, linux-wireless, Ben Greear, Ming Lei On Fri, 2010-10-08 at 20:12 +0200, Christian Lamparter wrote: > But hey, wait a sec. (This one is about AP mode - It's related to > IEEE80211_RX_RA_MATCH, but now in a different handler) Heh. > NULLFUNCs (set/clear PM) are not reordered and they get > processed right away, right? Yeah, I don't think they can be in A-MPDUs. At least not in any scenario that actually makes sense. > So what if the reorder release triggers and ap_sta_ps_end > (called by ieee80211_rx_h_sta_process) accidentally resets > the "sleeping" flag (because some old frames with a "stale" > PSM bit was released after 100ms)? Yeah... that can happen. > > > Things are a little bit better with ieee80211_rx_h_sta_process. > > > It updates some statistics and takes care of sta->last_rx > > > (which is currently not that important giving HT BA is only supported > > > for AP/STA operation). > > > > > > In ieee80211_rx_h_data, there could be another potential problem: > > > > if (ieee80211_is_data(hdr->frame_control) && > > > > !is_multicast_ether_addr(hdr->addr1) && > > > > local->hw.conf.dynamic_ps_timeout > 0 && local->ps_sdata) { > > > > mod_timer(&local->dynamic_ps_timer, jiffies + > > > > msecs_to_jiffies(local->hw.conf.dynamic_ps_timeout)); > > > > } > > > I reckon there could be a "hidden" problem. "jiffies" is now > > > approx 100ms after the packet was received from the interface. > > > (Sure, a similar issue was also present in the original > > > reorder release implementation.) > > > > This one's more interesting. I guess we need to bypass these things > > somehow, maybe setting a flag if this was a "recovered" frame? > (and check the same flag for ap_sta_ps_end/ap_sta_ps_start). > Ok, that's doable (even for me :D) Yeah, something like that. I guess there are more things like that and we have to go through the RX path once -- but it shouldn't be all that hard. > > BHs are disabled while processing RX -- and timer is a BH itself so > > they're also disabled, right? > hmm, are we talking about BH or tasklets? RX is currently always processed in a tasklet. > I read something about the > occurrence of simultaneous tasklets/timers on multi-core systems? You're right, it's local_bh_disable ... the local is there for a reason :-) > And from a point that all made sense: > --- > from kernel-hacking.DocBook: > > "For this reason, tasklets are more often used: they are > dynamically-registrable (meaning you can have as many as you want), > and they also guarantee that any tasklet will only run on one CPU > at any time, although different tasklets can run simultaneously." Yeah. > and kernel-locking.DocBook: > "Different Tasklets/Timers: > If another tasklet/timer wants to share data with your tasklet or timer, > you will both need to use spin_lock() and spin_unlock() calls. > spin_lock_bh() is unnecessary here, as you are already in a tasklet, and > none will be run on the same CPU." <-- "same" CPU. Indeed. > So according to statements above, we need a lock for the stats > too. (and I was wrong about "converting" them all to atomic.) > > * ieee80211_rx_h_sta_process > sta->rx_packets++; > sta->rx_fragments++; > sta->rx_bytes += rx->skb->len; > > * ieee80211_rx_h_data: > dev->stats.rx_packets++; > dev->stats.rx_bytes += rx->skb->len; Yeah. It's too bad we can't just disable the tasklet while processing the timer -- but we can't because we might also be processing from another context, even process context with BHs disabled, from driver calls to ieee80211_rx() (without _irqsafe). So we need a lock. Question then is, which one do we use? We could use the sta lock (and get rid of the reorder lock again), that would allow processing RX frames for STA A while we're doing timeouts for STA B -- as long as we don't also process a frame for B which would block, but presumably something is wrong when the timeout happens ... But then using the sta lock could be fairly expensive. It might actually be cheaper to simply remove the distinction between _rx() and _rx_irqsafe() and make it all go through the tasklet -- then we can simply disable the tasklet while doing the timer processing.... I don't like reverting these patches, but maybe we should simply comment out the code that arms the timer, thereby disabling all of it, while we work on this? johannes ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2010-10-08 18:45 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-10-06 10:00 [PATCH] mac80211: hoist sta->lock from reorder release timer Christian Lamparter 2010-10-06 10:10 ` Johannes Berg 2010-10-06 10:20 ` Christian Lamparter 2010-10-06 10:41 ` Johannes Berg 2010-10-06 11:43 ` Christian Lamparter 2010-10-06 11:46 ` Johannes Berg 2010-10-06 20:21 ` John W. Linville 2010-10-07 21:03 ` Johannes Berg 2010-10-08 16:42 ` Christian Lamparter 2010-10-08 16:53 ` Johannes Berg 2010-10-08 18:12 ` Christian Lamparter 2010-10-08 18:45 ` Johannes Berg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).