linux-wireless.vger.kernel.org archive mirror
* QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
@ 2013-11-20  8:41 Blaise Gassend
  2013-11-20  8:48 ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Blaise Gassend @ 2013-11-20  8:41 UTC
  To: linux-wireless; +Cc: Catalin Drula, blaise@suitabletech.com, Alap Modi

Hi,

I have been trying to debug massive packet loss that our product
experiences with recent Aruba access points. The basic symptoms are
that within a few seconds after you start sending significant data,
you start getting 100% RX loss. A few seconds later, RX recovers for a
few seconds before the cycle repeats. The higher the packet send rate,
the faster this cycle repeats.

I have been tracing the packets through the code, and it appears that
the loss happens in ieee80211_sta_manage_reorder_buf. It appears that
when there are broadcast QoS Data packets, their sequence numbers get
mixed with non-broadcast QoS Data sequence numbers causing out-of-date
sequence number conditions to get triggered spuriously.

As far as I can tell broadcast QoS Data packets coming from the AP are
pretty rare (the other networks I have access to seem to use Data packets
for broadcast traffic from the AP), but are legal. So I'm suspecting
that the AP is behaving correctly, but is triggering a so-far rare bug
in mac80211.
But this problem is likely to become much more widespread if Aruba's
802.11ac firmware triggers it.

I'm not a deep 802.11 or mac80211 expert, so I could certainly use
some help here. I am putting the details I have gathered below, and
would love any suggestions/advice. Currently, my impression is that we
might need a special tid_rx for broadcast packets similar to the
special handling of broadcast packets in ieee80211_parse_qos.

Best regards,
Blaise


The condition that causes the loss is:

        /* frame with out of date sequence number */
        if (ieee80211_sn_less(mpdu_seq_num, head_seq_num)) {
                dev_kfree_skb(skb);
                goto out;
        }

Adding the following printk statements near the top

        printk("wlan: ieee80211_sta_manage_reorder_buf  %u %u %u\n",
               skb->len, mpdu_seq_num, head_seq_num);

and bottom

out:
        printk("wlan: ieee80211_sta_manage_reorder_buf  end %u\n",
               tid_agg_rx->head_seq_num);

of ieee80211_sta_manage_reorder_buf, I get the following output at the
time when loss starts (the comments were added manually):

Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  206 552 552
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 553
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  206 553 553
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 554
# The two packets above got through fine.
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  96 2551 554
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 2488
# The broadcast packet above causes the head_seq_num to jump to whatever
# the current broadcast sequence number is.
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  206 554 2488
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 2488
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  206 555 2488
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 2488
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  206 556 2488
Nov 19 21:55:29 localhost kernel: wlan:
ieee80211_sta_manage_reorder_buf  end 2488
# The three packets above are dropped. And there are plenty more drops
# until sequence numbers wrap around.

The corresponding tshark output (I'd be happy to provide a pcap file
on demand, but I'm not sure what linux-wireless will accept) shows the
frames that were traced above, and a few others that aren't related to
my adapter.
17309   5.577576 JuniperN_99:37:0e -> Sparklan_47:57:16 802.11   250
QoS Data, SN=552, FN=0, Flags=.p....F.C
17310   5.577651 ArubaNet_f0:b7:56 (TA) -> Apple_31:89:b6 (RA) 802.11
 46 Request-to-send, Flags=........C
17311   5.579743 ArubaNet_ae:65:78 -> Broadcast    802.11   215 Beacon
frame, SN=1757, FN=0, Flags=........C, BI=100, SSID=workday-corp
17312   5.579790 ArubaNet_ae:65:79 -> Broadcast    802.11   209 Beacon
frame, SN=1757, FN=0, Flags=........C, BI=100, SSID=workday-guest
17313   5.579831 ArubaNet_ec:0d:f0 -> Broadcast    802.11   262 Beacon
frame, SN=397, FN=0, Flags=........C, BI=100, SSID=ethersphere-wpa2
17314   5.579885 ArubaNet_ec:0d:f1 -> Broadcast    802.11   237 Beacon
frame, SN=398, FN=0, Flags=........C, BI=100, SSID=ARUBA-VISITOR
17315   5.579934 IntelCor_bf:5f:f8 -> Broadcast    802.11   576 Data,
SN=399, FN=0, Flags=.p....F.C
17316   5.579952 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:12 (RA)
802.11   46 Request-to-send, Flags=........C
17317   5.579968              -> ArubaNet_f0:b7:55 (RA) 802.11   40
Clear-to-send, Flags=........C
17318   5.579975              -> Sparklan_47:57:16 (RA) 802.11   40
Acknowledgement, Flags=........C
17319   5.579989 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:12 (RA)
802.11   46 Request-to-send, Flags=........C
17320   5.579997              -> ArubaNet_f0:b7:55 (RA) 802.11   40
Clear-to-send, Flags=........C
17321   5.580016 Sparklan_47:57:16 -> JuniperN_99:37:0e 802.11   212
QoS Data, SN=909, FN=0, Flags=.p.....T
17322   5.581854 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:12 (RA)
802.11   46 Request-to-send, Flags=........C
17323   5.581872              -> ArubaNet_f0:b7:55 (RA) 802.11   40
Clear-to-send, Flags=........C
17324   5.581881 JuniperN_99:37:0e -> Sparklan_47:57:12 802.11   140
QoS Data, SN=470, FN=0, Flags=.p..R.F.C
17325   5.581888 Sparklan_47:57:12 (TA) -> ArubaNet_f0:b7:55 (RA)
802.11   58 802.11 Block Ack, Flags=........C
17326   5.581893 ArubaNet_ae:61:28 -> Broadcast    802.11   314 Beacon
frame, SN=429, FN=0, Flags=........C, BI=100
17327   5.581935 ArubaNet_ae:61:2a -> Broadcast    802.11   269 Beacon
frame, SN=421, FN=0, Flags=........C, BI=100, SSID=employee200-8
17328   5.581967 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:16 (RA)
802.11   46 Request-to-send, Flags=........C
17329   5.581974 JuniperN_99:37:0e -> Sparklan_47:57:16 802.11   250
QoS Data, SN=553, FN=0, Flags=.p....F.C
17330   5.582038 Sparklan_47:57:12 -> Broadcast    802.11   126 QoS
Data, SN=2551, FN=0, Flags=.p....F.C
17331   5.583623 ArubaNet_f0:b7:56 (TA) -> 84:38:35:5d:f2:aa (RA)
802.11   46 Request-to-send, Flags=........C
17332   5.583635 ArubaNet_f0:b7:56 (TA) -> Apple_31:74:f0 (RA) 802.11
 46 Request-to-send, Flags=........C
17333   5.584426              -> Sparklan_47:57:16 (RA) 802.11   40
Acknowledgement, Flags=........C
17334   5.584465 Sparklan_47:57:16 -> JuniperN_99:37:0e 802.11   212
QoS Data, SN=910, FN=0, Flags=.p.....T
17335   5.585022 ArubaNet_f0:b7:56 (TA) -> b8:e8:56:0a:4a:de (RA)
802.11   46 Request-to-send, Flags=........C
17336   5.587968              -> Apple_31:89:b6 (RA) 802.11   40
Clear-to-send, Flags=........C
17337   5.587984 ArubaNet_f0:b7:56 (TA) -> Apple_31:89:b6 (RA) 802.11
 58 802.11 Block Ack, Flags=........C
17338   5.587990              -> 84:38:35:5d:f2:aa (RA) 802.11   40
Clear-to-send, Flags=........C
17339   5.587993 ArubaNet_f0:b7:56 (TA) -> 84:38:35:5d:f2:aa (RA)
802.11   58 802.11 Block Ack, Flags=........C
17340   5.587997 ArubaNet_f0:b7:56 (TA) -> Apple_31:89:b6 (RA) 802.11
 46 Request-to-send, Flags=........C
17341   5.588001 ArubaNet_f0:b7:56 (TA) -> Apple_31:89:b6 (RA) 802.11
 46 Request-to-send, Flags=........C
17342   5.588004              -> ArubaNet_f0:b7:56 (RA) 802.11   40
Clear-to-send, Flags=........C
17343   5.589312 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:16 (RA)
802.11   46 Request-to-send, Flags=........C
17344   5.589331              -> Sparklan_47:57:16 (RA) 802.11   40
Acknowledgement, Flags=........C
17345   5.589348 Sparklan_47:57:16 -> JuniperN_99:37:0e 802.11   212
QoS Data, SN=911, FN=0, Flags=.p.....T
17346   5.590768              -> Apple_31:89:b6 (RA) 802.11   40
Clear-to-send, Flags=........C
17347   5.590787 ArubaNet_f0:b7:56 (TA) -> Apple_31:89:b6 (RA) 802.11
 58 802.11 Block Ack, Flags=........C
17348   5.590794 ArubaNet_f0:b7:55 (TA) -> Sparklan_47:57:16 (RA)
802.11   46 Request-to-send, Flags=........C
17349   5.590805 JuniperN_99:37:0e -> Sparklan_47:57:16 802.11   250
QoS Data, SN=554, FN=0, Flags=.p..R.F.C
17350   5.590837 JuniperN_99:37:0e -> Sparklan_47:57:16 802.11   250
QoS Data, SN=555, FN=0, Flags=.p..R.F.C

Regards,
Blaise Gassend


* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20  8:41 QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf Blaise Gassend
@ 2013-11-20  8:48 ` Johannes Berg
  2013-11-20 10:15   ` Blaise Gassend
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2013-11-20  8:48 UTC
  To: Blaise Gassend; +Cc: linux-wireless, Catalin Drula, Alap Modi

Hi,

> I have been tracing the packets through the code, and it appears that
> the loss happens in ieee80211_sta_manage_reorder_buf. It appears that
> when there are broadcast QoS Data packets, their sequence numbers get
> mixed with non-broadcast QoS Data sequence numbers causing out-of-date
> sequence number conditions to get triggered spuriously.
> 
> As far as I can tell broadcast QoS Data packets coming from the AP are
> pretty rare (the other networks I have access to seem to use Data packets
> for broadcast traffic from the AP), but are legal. So I'm suspecting
> that the AP is behaving correctly, but is triggering a so-far rare bug
> in mac80211.
> But this problem is likely to become much more widespread if Aruba's
> 802.11ac firmware triggers it.
> 
> I'm not a deep 802.11 or mac80211 expert, so I could certainly use
> some help here. I am putting the details I have gathered below, and
> would love any suggestions/advice. Currently, my impression is that we
> might need a special tid_rx for broadcast packets similar to the
> special handling of broadcast packets in ieee80211_parse_qos.

I think we just need to skip reorder processing for multicast, since
they won't be aggregated anyway?

http://p.sipsolutions.net/d00799dd2201676a.txt

Then again I'm not really sure why we didn't do this before??

johannes



* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20  8:48 ` Johannes Berg
@ 2013-11-20 10:15   ` Blaise Gassend
  2013-11-20 11:01     ` Karl Beldan
  0 siblings, 1 reply; 8+ messages in thread
From: Blaise Gassend @ 2013-11-20 10:15 UTC
  To: Johannes Berg; +Cc: linux-wireless, Catalin Drula, Alap Modi

Hi Johannes,

Thanks for the quick reply!

> I think we just need to skip reorder processing for multicast, since
> they won't be aggregated anyway?
>
> http://p.sipsolutions.net/d00799dd2201676a.txt

This patch works like a charm for my current predicament. But is it
actually written somewhere that multicast packets can't be aggregated?
I can't find any place that says they can't, but I'm not authoritative
by any means.

If aggregated multicast packets were allowed, would a special tid_rx
for multicast packets be the right way to go (similar to what happens
in ieee80211_parse_qos)?

In any case, this patch seems like a huge net improvement over the
current situation and is probably worth merging.

Regards,
Blaise


* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20 10:15   ` Blaise Gassend
@ 2013-11-20 11:01     ` Karl Beldan
  2013-11-20 11:06       ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Karl Beldan @ 2013-11-20 11:01 UTC
  To: Blaise Gassend; +Cc: Johannes Berg, linux-wireless, Catalin Drula, Alap Modi

On Wed, Nov 20, 2013 at 02:15:27AM -0800, Blaise Gassend wrote:
> Hi Johannes,
> 
> Thanks for the quick reply!
> 
> > I think we just need to skip reorder processing for multicast, since
> > they won't be aggregated anyway?
> >
> > http://p.sipsolutions.net/d00799dd2201676a.txt
> 
> This patch works like a charm for my current predicament. But is it
> actually written somewhere that multicast packets can't be aggregated?
> I can't find any place that says they can't, but I'm not authoritative
> by any means.
> 
There's a chapter "A-MPDU aggregation of group addressed data frames" in
the specs, however I haven't seen this yet.

 
Karl


* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20 11:01     ` Karl Beldan
@ 2013-11-20 11:06       ` Johannes Berg
  2013-11-20 11:16         ` Karl Beldan
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2013-11-20 11:06 UTC
  To: Karl Beldan; +Cc: Blaise Gassend, linux-wireless, Catalin Drula, Alap Modi

On Wed, 2013-11-20 at 12:01 +0100, Karl Beldan wrote:
> On Wed, Nov 20, 2013 at 02:15:27AM -0800, Blaise Gassend wrote:
> > Hi Johannes,
> > 
> > Thanks for the quick reply!
> > 
> > > I think we just need to skip reorder processing for multicast, since
> > > they won't be aggregated anyway?
> > >
> > > http://p.sipsolutions.net/d00799dd2201676a.txt
> > 
> > This patch works like a charm for my current predicament. But is it
> > actually written somewhere that multicast packets can't be aggregated?
> > I can't find any place that says they can't, but I'm not authoritative
> > by any means.
> > 
> There's a chapter "A-MPDU aggregation of group addressed data frames" in
> the specs, however I haven't seen this yet.

Even then though, I don't think there would be any block-ack session,
and thus you wouldn't be able to use the reorder buffer anyway, right?

johannes



* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20 11:06       ` Johannes Berg
@ 2013-11-20 11:16         ` Karl Beldan
  2013-11-20 11:25           ` Karl Beldan
  0 siblings, 1 reply; 8+ messages in thread
From: Karl Beldan @ 2013-11-20 11:16 UTC
  To: Johannes Berg; +Cc: Blaise Gassend, linux-wireless, Catalin Drula, Alap Modi

On Wed, Nov 20, 2013 at 12:06:09PM +0100, Johannes Berg wrote:
> On Wed, 2013-11-20 at 12:01 +0100, Karl Beldan wrote:
> > On Wed, Nov 20, 2013 at 02:15:27AM -0800, Blaise Gassend wrote:
> > > Hi Johannes,
> > > 
> > > Thanks for the quick reply!
> > > 
> > > > I think we just need to skip reorder processing for multicast, since
> > > > they won't be aggregated anyway?
> > > >
> > > > http://p.sipsolutions.net/d00799dd2201676a.txt
> > > 
> > > This patch works like a charm for my current predicament. But is it
> > > actually written somewhere that multicast packets can't be aggregated?
> > > I can't find any place that says they can't, but I'm not authoritative
> > > by any means.
> > > 
> > There's a chapter "A-MPDU aggregation of group addressed data frames" in
> > the specs, however I haven't seen this yet.
> 
> Even then though, I don't think there would be any block-ack session,
> and thus you wouldn't be able to use the reorder buffer anyway, right?
> 
I think so.
 
Karl


* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20 11:16         ` Karl Beldan
@ 2013-11-20 11:25           ` Karl Beldan
  2013-11-20 11:39             ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Karl Beldan @ 2013-11-20 11:25 UTC
  To: Johannes Berg; +Cc: Blaise Gassend, linux-wireless, Catalin Drula, Alap Modi

On Wed, Nov 20, 2013 at 12:16:54PM +0100, Karl Beldan wrote:
> On Wed, Nov 20, 2013 at 12:06:09PM +0100, Johannes Berg wrote:
> > On Wed, 2013-11-20 at 12:01 +0100, Karl Beldan wrote:
> > > On Wed, Nov 20, 2013 at 02:15:27AM -0800, Blaise Gassend wrote:
> > > > Hi Johannes,
> > > > 
> > > > Thanks for the quick reply!
> > > > 
> > > > > I think we just need to skip reorder processing for multicast, since
> > > > > they won't be aggregated anyway?
> > > > >
> > > > > http://p.sipsolutions.net/d00799dd2201676a.txt
> > > > 
> > > > This patch works like a charm for my current predicament. But is it
> > > > actually written somewhere that multicast packets can't be aggregated?
> > > > I can't find any place that says they can't, but I'm not authoritative
> > > > by any means.
> > > > 
> > > There's a chapter "A-MPDU aggregation of group addressed data frames" in
> > > the specs, however I haven't seen this yet.
> > 
> > Even then though, I don't think there would be any block-ack session,
> > and thus you wouldn't be able to use the reorder buffer anyway, right?
> > 
> I think so.
>  
Except maybe for 802.11aa GCR (groupcast with retries) ..
 
Karl


* Re: QoS Data packets causing massive packet loss in ieee80211_sta_manage_reorder_buf.
  2013-11-20 11:25           ` Karl Beldan
@ 2013-11-20 11:39             ` Johannes Berg
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Berg @ 2013-11-20 11:39 UTC
  To: Karl Beldan; +Cc: Blaise Gassend, linux-wireless, Catalin Drula, Alap Modi

On Wed, 2013-11-20 at 12:25 +0100, Karl Beldan wrote:

> > > > > > http://p.sipsolutions.net/d00799dd2201676a.txt
> > > > > 
> > > > > This patch works like a charm for my current predicament. But is it
> > > > > actually written somewhere that multicast packets can't be aggregated?
> > > > > I can't find any place that says they can't, but I'm not authoritative
> > > > > by any means.
> > > > > 
> > > > There's a chapter "A-MPDU aggregation of group addressed data frames" in
> > > > the specs, however I haven't seen this yet.
> > > 
> > > Even then though, I don't think there would be any block-ack session,
> > > and thus you wouldn't be able to use the reorder buffer anyway, right?
> > > 
> > I think so.
> >  
> Except maybe for 802.11aa GCR (groupcast with retries) ..

But that will probably need much more work anyway ... :)

johannes

