* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Dave Taht @ 2016-03-21 17:10 UTC
To: Michal Kazior
Cc: Network Development, linux-wireless, ath10k@lists.infradead.org,
    Jasmine Strong, codel@lists.bufferbloat.net, make-wifi-fast

thx.

a lot to digest.

A) quick notes on "flent-gui bursts_11e-2016-03-21T09*.gz"

1) the new bursts_11e test *should* have stuck stuff in the VI and VO
queues, and there *should* have been some sort of difference shown on
the plots with it. There wasn't.

For diffserv markings I used BE=CS0, BK=CS1, VI=CS5, and VO=EF.
CS6/CS7 should also land in VO (at least with the soft mac handler
last I looked). Is there a way to check if you are indeed exercising
all four 802.11e hardware queues in this test? in ath9k it is the
"xmit" sysfs var....

2) In all the old cases the BE UDP_RR flow died on the first burst
(why?), and the fullpatch preserved it. (I would have kind of hoped to
have seen the BK flow die, actually, in the fullpatch)

3) I am also confused on 802.11ac - can VO aggregate? (it can't in 802.11n).

[-- Attachment #2: vivosame.png --]
[-- Type: image/png, Size: 276624 bytes --]

_______________________________________________
Codel mailing list
Codel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/codel

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Michal Kazior @ 2016-03-22 8:05 UTC
To: Dave Taht
Cc: Network Development, linux-wireless, ath10k@lists.infradead.org,
    Jasmine Strong, codel@lists.bufferbloat.net, make-wifi-fast

On 21 March 2016 at 18:10, Dave Taht <dave.taht@gmail.com> wrote:
> thx.
>
> a lot to digest.
>
> A) quick notes on "flent-gui bursts_11e-2016-03-21T09*.gz"
>
> 1) the new bursts_11e test *should* have stuck stuff in the VI and VO
> queues, and there *should* have been some sort of difference shown on
> the plots with it. There wasn't.

traffic-gen generates only BE traffic. Everything else runs UDP_RR
which doesn't generate a lot of traffic.

> For diffserv markings I used BE=CS0, BK=CS1, VI=CS5, and VO=EF.
> CS6/CS7 should also land in VO (at least with the soft mac handler
> last I looked). Is there a way to check if you are indeed exercising
> all four 802.11e hardware queues in this test? in ath9k it is the
> "xmit" sysfs var....

Hmm.. there are no txq stats. I guess it makes sense to have them?

There is /sys/kernel/debug/ieee80211/phy*/fq which dumps the state of
all queues, which will be mostly empty with UDP_RR. You can run a
netperf UDP stream with diffserv marking to see which tid it is
mapped onto. You can see tid-AC mappings here:

  https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/queues

I just checked and EF ends up as tid5 which is VI. It's actually the
same as CS5. You can use CS7 to run on tid7 which is VO.

> 2) In all the old cases the BE UDP_RR flow died on the first burst
> (why?), and the fullpatch preserved it.

I think it's related to my setup which involves veth pairs. I use them
to simulate bridging/AP behavior but maybe it's not doing the job
right, hmm..

> (I would have kind of hoped to
> have seen the BK flow die, actually, in the fullpatch)

There's no extra weight priority to BK. The difference between BE and
BK in 802.11 is contention window access time, so BK statistically
gets fewer txops. Both share the same txop limit, which is 5.484ms in
most cases.

> 3) I am also confused on 802.11ac - can VO aggregate? (it can't in 802.11n).

Yes, it should be able to, albeit VI and VO have a shorter txop than
BE/BK: 3.008ms and 1.504ms respectively.

UDP_RR doesn't really create a lot of opportunities for aggregation.
If you want to see how different queues behave when loaded you'll need
to modify traffic-gen and add bursts across different ACs in the
bursts_11e test.


Michał
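
A rough way to sanity-check which access category a given diffserv
marking lands in. This is only a sketch: it assumes the user priority
(tid) is taken from the top three bits of the DSCP and that the usual
802.1D priority-to-AC table applies. Both are assumptions about the
defaults (a QoS map or a driver can override them), and the function
names are made up for illustration. Under those assumptions the result
agrees with what Michal reports above: EF and CS5 share tid5 (VI), and
CS7 lands on tid7 (VO).

```python
# Sketch: DSCP codepoint -> tid (user priority) -> 802.11e access category.
# Assumption: tid = top three bits of the 6-bit DSCP value.
# Assumption: the usual 802.1D user-priority-to-AC table is in effect.

DSCP = {"CS0": 0, "CS1": 8, "CS5": 40, "EF": 46, "CS6": 48, "CS7": 56}

TID_TO_AC = {
    0: "BE", 3: "BE",
    1: "BK", 2: "BK",
    4: "VI", 5: "VI",
    6: "VO", 7: "VO",
}


def dscp_to_tid(dscp):
    """Return the tid (0-7) for a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp >> 3


def dscp_to_ac(dscp):
    """Return the access category name for a 6-bit DSCP value."""
    return TID_TO_AC[dscp_to_tid(dscp)]


for name, value in DSCP.items():
    print(f"{name}: dscp={value} tid={dscp_to_tid(value)} ac={dscp_to_ac(value)}")
```

So with these defaults, marking a flow EF does not exercise the VO
queue; CS6 or CS7 would.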

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Toke Høiland-Jørgensen @ 2016-03-22 9:51 UTC
To: Michal Kazior
Cc: Network Development, linux-wireless, ath10k@lists.infradead.org,
    Jasmine Strong, codel@lists.bufferbloat.net, make-wifi-fast

Michal Kazior <michal.kazior@tieto.com> writes:

> traffic-gen generates only BE traffic. Everything else runs UDP_RR
> which doesn't generate a lot of traffic.

Good point. Fixed that: the newest git version of traffic-gen supports a
-t parameter which will be set as the TOS byte on outgoing traffic
(literal; no smart diffserv handling, so you can override the ECN bits
as well). Added support for a burst-tos test parameter in the Flent
burst test configs which will use this new parameter if set.

-Toke
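
Because the -t value is a literal TOS byte rather than a DSCP name, the
codepoint has to be shifted into the upper six bits by hand; the lower
two bits are the ECN field. A minimal sketch of that arithmetic follows
(the helper name is hypothetical and is not part of traffic-gen or
Flent):

```python
# Sketch: build a literal TOS byte from a DSCP codepoint and ECN bits.
# The DSCP occupies the upper six bits, ECN the lower two.

def tos_byte(dscp, ecn=0):
    """Return the 8-bit TOS value for a 6-bit DSCP and 2-bit ECN field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must be in 0..3")
    return (dscp << 2) | ecn


for name, dscp in (("CS1", 8), ("CS5", 40), ("EF", 46), ("CS7", 56)):
    print(f"{name}: tos=0x{tos_byte(dscp):02x}")
```

For example, CS5 (DSCP 40) becomes 0xa0 and EF (DSCP 46) becomes 0xb8.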

* [RFCv2 0/3] mac80211: implement fq codel
From: Michal Kazior @ 2016-03-16 10:17 UTC
To: linux-wireless
Cc: ath10k, johannes, netdev, dave.taht, emmanuel.grumbach, nbd,
    Tim Shepard, make-wifi-fast, codel, Michal Kazior

Hi,

Most notable changes:
 * fixes (duh); fairness should work better now,
 * EWMA codel target based on estimated service
   time,
 * new tx scheduling helper with in-flight
   duration limiting (same idea Emmanuel had for
   iwlwifi),
 * added a few debugfs hooks,
 * ath10k proof-of-concept that uses the new tx
   scheduling (will post results in separate
   email)

The patch grew pretty big and I plan on splitting
it before next submission. Any suggestions?

The tx scheduling probably needs more work and
testing. I didn't evaluate how CPU intensive it is
nor how it influences things like peak throughput
(lab conditions et al) yet.

I've uploaded a branch for convenience:

  https://github.com/kazikcz/linux/tree/fqmac-rfc-v2

This is based on Kalle's ath tree.


Michal Kazior (3):
  mac80211: implement fq_codel for software queuing
  ath10k: report per-station tx/rate rates to mac80211
  ath10k: use ieee80211_tx_schedule()

 drivers/net/wireless/ath/ath10k/core.c  |   2 -
 drivers/net/wireless/ath/ath10k/core.h  |   8 +-
 drivers/net/wireless/ath/ath10k/debug.c |  61 ++-
 drivers/net/wireless/ath/ath10k/mac.c   | 126 +++---
 drivers/net/wireless/ath/ath10k/wmi.h   |   2 +-
 include/net/mac80211.h                  |  96 ++++-
 net/mac80211/agg-tx.c                   |   8 +-
 net/mac80211/cfg.c                      |   2 +-
 net/mac80211/codel.h                    | 264 +++++++++++++
 net/mac80211/codel_i.h                  |  89 +++++
 net/mac80211/debugfs.c                  | 267 +++++++++++++
 net/mac80211/ieee80211_i.h              |  45 ++-
 net/mac80211/iface.c                    |  25 +-
 net/mac80211/main.c                     |   9 +-
 net/mac80211/rx.c                       |   2 +-
 net/mac80211/sta_info.c                 |  10 +-
 net/mac80211/sta_info.h                 |  27 ++
 net/mac80211/status.c                   |  64 ++++
 net/mac80211/tx.c                       | 658 ++++++++++++++++++++++++++++++--
 net/mac80211/util.c                     |  21 +-
 20 files changed, 1629 insertions(+), 157 deletions(-)
 create mode 100644 net/mac80211/codel.h
 create mode 100644 net/mac80211/codel_i.h

-- 
2.1.4
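
For readers unfamiliar with the "EWMA ... estimated service time" item
in the change list above, here is a small illustration of what an
exponentially weighted moving average of service-time samples looks
like. It is not taken from the patch: the class name, the weight, and
the sample values are all assumptions, and how the series derives a
CoDel target from the average is not shown here.

```python
# Sketch: exponentially weighted moving average (EWMA) of service time.
# Illustrative only; the weight and the units are assumptions.

class Ewma:
    def __init__(self, weight=0.125):
        if not 0.0 < weight <= 1.0:
            raise ValueError("weight must be in (0, 1]")
        self.weight = weight
        self.value = None  # no estimate until the first sample arrives

    def update(self, sample):
        """Fold one new sample into the average and return the estimate."""
        if self.value is None:
            self.value = float(sample)
        else:
            self.value += self.weight * (sample - self.value)
        return self.value


avg = Ewma(weight=0.5)
for service_time_us in (10, 20, 20):
    print(avg.update(service_time_us))
```

A larger weight tracks recent samples more closely; a smaller one
smooths out short spikes at the cost of reacting more slowly.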

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Michal Kazior @ 2016-03-16 10:26 UTC
To: linux-wireless
Cc: ath10k@lists.infradead.org, Johannes Berg, Network Development,
    Dave Taht, Emmanuel Grumbach, Felix Fietkau, Tim Shepard,
    make-wifi-fast, codel, Michal Kazior

On 16 March 2016 at 11:17, Michal Kazior <michal.kazior@tieto.com> wrote:
> Hi,
>
> Most notable changes:
[...]
>  * ath10k proof-of-concept that uses the new tx
>    scheduling (will post results in separate
>    email)

I'm attaching a bunch of tests I've done using flent. They are all
"burst" tests with burst-ports=1 and burst-length=2. The testing
topology is:

                   AP ----> STA
                   AP )) (( STA
 [veth]--[br]--[wlan] )) (( [wlan]

You can notice that in some tests plot data gets cut off. There are 2
problems I've identified:
 - excess drops (not a problem with the patchset and can be seen when
   there's no codel-in-mac or scheduling isn't used)
 - UDP_RR hangs (apparently the QCA99X0 I have sometimes hangs for a
   few hundred ms and doesn't Rx frames, causing UDP_RR to stop
   mid-way; confirmed with logs and sniffer; I haven't figured out
   *why* exactly, could be some hw/fw quirk)

Let me know if you have questions or comments regarding my testing/results.


Michał

[-- Attachment #2: fq.tar.gz --]
[-- Type: application/x-gzip, Size: 63753 bytes --]

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Dave Taht @ 2016-03-16 15:37 UTC
To: Michal Kazior
Cc: Felix Fietkau, Emmanuel Grumbach, Network Development,
    linux-wireless, ath10k@lists.infradead.org,
    codel@lists.bufferbloat.net, make-wifi-fast, Johannes Berg,
    Tim Shepard

it is helpful to name the test files coherently in the flent tests, in
addition to using a directory structure and timestamp. It makes doing
comparison plots in data->add-other-open-data-files simpler. "-t
patched-mac-300mbps", for example.

Also netperf from svn (maybe 2.7, don't remember) will restart udp_rr
after a packet loss in 250ms. Seeing a loss on UDP_RR and it stop for
a while is "ok".

Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi


On Wed, Mar 16, 2016 at 3:26 AM, Michal Kazior <michal.kazior@tieto.com> wrote:
> On 16 March 2016 at 11:17, Michal Kazior <michal.kazior@tieto.com> wrote:
>> Hi,
>>
>> Most notable changes:
> [...]
>>  * ath10k proof-of-concept that uses the new tx
>>    scheduling (will post results in separate
>>    email)
>
> I'm attaching a bunch of tests I've done using flent. They are all
> "burst" tests with burst-ports=1 and burst-length=2. The testing
> topology is:
>
>                    AP ----> STA
>                    AP )) (( STA
>  [veth]--[br]--[wlan] )) (( [wlan]
>
> You can notice that in some tests plot data gets cut-off. There are 2
> problems I've identified:
>  - excess drops (not a problem with the patchset and can be seen when
> there's no codel-in-mac or scheduling isn't used)
>  - UDP_RR hangs (apparently QCA99X0 I have hangs for a few hundred ms
> sometimes at times and doesn't Rx frames causing UDP_RR to stop
> mid-way; confirmed with logs and sniffer; I haven't figured out *why*
> exactly, could be some hw/fw quirk)
>
> Let me know if you have questions or comments regarding my testing/results.
>
>
> Michał

[-- Attachment #2: cdf_comparison.png --]
[-- Type: image/png, Size: 87203 bytes --]

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Dave Taht @ 2016-03-16 18:36 UTC
To: Michal Kazior
Cc: Network Development, codel@lists.bufferbloat.net, linux-wireless,
    ath10k@lists.infradead.org, make-wifi-fast

That is the sanest 802.11e queue behavior I have ever seen! (at both
6 and 300mbit! in the ath10k patched mac test)

It would be good to add a flow to this test that exercises the VI
queue (CS5 diffserv marking?), and to repeat this test with wmm
disabled for comparison.


Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi


On Wed, Mar 16, 2016 at 8:37 AM, Dave Taht <dave.taht@gmail.com> wrote:
> it is helpful to name the test files coherently in the flent tests, in
> addition to using a directory structure and timestamp. It makes doing
> comparison plots in data->add-other-open-data-files simpler. "-t
> patched-mac-300mbps", for example.
>
> Also netperf from svn (maybe 2.7, don't remember) will restart udp_rr
> after a packet loss in 250ms. Seeing a loss on UDP_RR and it stop for
> a while is "ok".
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> https://www.gofundme.com/savewifi
>
>
> On Wed, Mar 16, 2016 at 3:26 AM, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 16 March 2016 at 11:17, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> Hi,
>>>
>>> Most notable changes:
>> [...]
>>>  * ath10k proof-of-concept that uses the new tx
>>>    scheduling (will post results in separate
>>>    email)
>>
>> I'm attaching a bunch of tests I've done using flent. They are all
>> "burst" tests with burst-ports=1 and burst-length=2. The testing
>> topology is:
>>
>>                    AP ----> STA
>>                    AP )) (( STA
>>  [veth]--[br]--[wlan] )) (( [wlan]
>>
>> You can notice that in some tests plot data gets cut-off. There are 2
>> problems I've identified:
>>  - excess drops (not a problem with the patchset and can be seen when
>> there's no codel-in-mac or scheduling isn't used)
>>  - UDP_RR hangs (apparently QCA99X0 I have hangs for a few hundred ms
>> sometimes at times and doesn't Rx frames causing UDP_RR to stop
>> mid-way; confirmed with logs and sniffer; I haven't figured out *why*
>> exactly, could be some hw/fw quirk)
>>
>> Let me know if you have questions or comments regarding my testing/results.
>>
>>
>> Michał

[-- Attachment #2: sanest_802.11eresult_i_have_ever_seen.png --]
[-- Type: image/png, Size: 146956 bytes --]

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Bob Copeland @ 2016-03-16 18:55 UTC
To: Dave Taht
Cc: Michal Kazior, linux-wireless, ath10k@lists.infradead.org,
    Network Development, make-wifi-fast, codel@lists.bufferbloat.net

On Wed, Mar 16, 2016 at 11:36:31AM -0700, Dave Taht wrote:
> That is the sanest 802.11e queue behavior I have ever seen! (at both
> 6 and 300mbit! in the ath10k patched mac test)

Out of curiosity, why does BE have larger latency than BK in that chart?
I'd have expected the opposite.

-- 
Bob Copeland %% http://bobcopeland.com/
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Jasmine Strong @ 2016-03-16 19:48 UTC
To: Bob Copeland
Cc: Network Development, linux-wireless, ath10k@lists.infradead.org,
    codel@lists.bufferbloat.net, make-wifi-fast

BK usually has 0 txop, so it doesn't do aggregation.

On Wed, Mar 16, 2016 at 11:55 AM, Bob Copeland <me@bobcopeland.com> wrote:

> On Wed, Mar 16, 2016 at 11:36:31AM -0700, Dave Taht wrote:
> > That is the sanest 802.11e queue behavior I have ever seen! (at both
> > 6 and 300mbit! in the ath10k patched mac test)
>
> Out of curiosity, why does BE have larger latency than BK in that chart?
> I'd have expected the opposite.
>
> --
> Bob Copeland %% http://bobcopeland.com/
>
> _______________________________________________
> ath10k mailing list
> ath10k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k
>

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Michal Kazior @ 2016-03-17 8:55 UTC
To: Jasmine Strong
Cc: Network Development, linux-wireless, ath10k@lists.infradead.org,
    codel@lists.bufferbloat.net, make-wifi-fast, Bob Copeland

TxOP 0 has a special meaning in the standard. For HT/VHT it means the
txop is actually limited to 5484us (mixed-mode) or 10000us (greenfield).

I suspect the BK/BE latency difference has to do with the fact that
there's bulk traffic going on BE queues (this isn't reflected
explicitly in the plots). The `bursts` flent test includes short
bursts of traffic on tid0 (BE) which is shared with ICMP and BE UDP_RR
(seen as green and blue lines on the plot). Due to (intended) limited
outflow (6mbps) BE queues build up and don't drain for the duration of
the entire test, creating more opportunities for aggregating BE traffic
while other queues are near-empty and very short (time-wise as well).
If you consider that Wi-Fi is half-duplex and that latency in the
entire stack (for processing ICMP and UDP_RR) is greater than 11e
contention window timings, you can get your BE flow responses with
extra delay (since other queues might have responses ready quicker).

I've modified traffic-gen and re-run tests with bursts on all tested
tids/ACs (tid0, tid1, tid5). I'm attaching the results.

With bursts on all tids you can clearly see BK has much higher latency than BE.

(Note, I've changed my AP to QCA988X with oldie firmware 10.1.467 for
this test; it doesn't have the weird hiccups I was seeing on QCA99X0,
and newer QCA988X firmware reports bogus expected throughput, which is
most likely a result of my sloppy proof-of-concept change in ath10k).


Michał

On 16 March 2016 at 20:48, Jasmine Strong <jas@eero.com> wrote:
> BK usually has 0 txop, so it doesn't do aggregation.
>
> On Wed, Mar 16, 2016 at 11:55 AM, Bob Copeland <me@bobcopeland.com> wrote:
>>
>> On Wed, Mar 16, 2016 at 11:36:31AM -0700, Dave Taht wrote:
>> > That is the sanest 802.11e queue behavior I have ever seen! (at both
>> > 6 and 300mbit! in the ath10k patched mac test)
>>
>> Out of curiosity, why does BE have larger latency than BK in that chart?
>> I'd have expected the opposite.
>>
>> --
>> Bob Copeland %% http://bobcopeland.com/
>>
>> _______________________________________________
>> ath10k mailing list
>> ath10k@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/ath10k
>
>

[-- Attachment #2: bursts-2016-03-17T083932.549858.qca988x_10_1_467_fqmac_ath10k_with_tx_sched_6mbps_.flent.gz --]
[-- Type: application/x-gzip, Size: 14649 bytes --]

[-- Attachment #3: bursts-2016-03-17T083803.348752.qca988x_10_1_467_fqmac_ath10k_with_tx_sched_6mbps_.flent.gz --]
[-- Type: application/x-gzip, Size: 15029 bytes --]
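
To put the txop limits mentioned in this thread into perspective, the
following back-of-the-envelope sketch computes an upper bound on how
many bytes fit into one txop at a given PHY rate. It deliberately
ignores preambles, MAC headers, interframe spacing and block acks, so
real numbers are lower; the function name is made up for illustration.
The 5484us figure is the HT/VHT mixed-mode limit described above, and
the VI/VO values are the ones quoted elsewhere in the thread.

```python
# Sketch: upper bound on bytes per txop, ignoring all framing overhead.
# rate in Mbit/s times duration in microseconds gives bits directly.

TXOP_LIMIT_US = {"BE/BK": 5484, "VI": 3008, "VO": 1504}


def max_bytes_per_txop(rate_mbps, txop_us):
    """Return the most bytes that could be sent in txop_us at rate_mbps."""
    if rate_mbps <= 0 or txop_us <= 0:
        raise ValueError("rate and txop duration must be positive")
    return rate_mbps * txop_us // 8


for rate in (6, 300):
    for ac, limit in TXOP_LIMIT_US.items():
        print(f"{rate} Mbit/s, {ac}: {max_bytes_per_txop(rate, limit)} bytes")
```

At the fixed 6 Mbit/s rate used in these tests a full BE txop carries
at most 4113 bytes, i.e. fewer than three full-size Ethernet frames,
while at 300 Mbit/s the same txop could hold roughly 200 kB.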

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Bob Copeland @ 2016-03-17 11:12 UTC
To: Michal Kazior
Cc: Jasmine Strong, Dave Taht, Network Development, linux-wireless,
    ath10k@lists.infradead.org, codel@lists.bufferbloat.net,
    make-wifi-fast

On Thu, Mar 17, 2016 at 09:55:03AM +0100, Michal Kazior wrote:
> If you consider Wi-Fi is half-duplex and latency in the entire stack
> (for processing ICMP and UDP_RR) is greater than 11e contention window
> timings you can get your BE flow responses with extra delay (since
> other queues might have responses ready quicker).

Got it, that makes sense. Thanks for the explanation!

-- 
Bob Copeland %% http://bobcopeland.com/

* Re: [RFCv2 0/3] mac80211: implement fq codel
From: Dave Taht @ 2016-03-17 17:00 UTC
To: Michal Kazior
Cc: Jasmine Strong, Network Development, linux-wireless,
    ath10k@lists.infradead.org, codel@lists.bufferbloat.net,
    make-wifi-fast

On Thu, Mar 17, 2016 at 1:55 AM, Michal Kazior <michal.kazior@tieto.com> wrote:

> I suspect the BK/BE latency difference has to do with the fact that
> there's bulk traffic going on BE queues (this isn't reflected
> explicitly in the plots). The `bursts` flent test includes short
> bursts of traffic on tid0 (BE) which is shared with ICMP and BE UDP_RR
> (seen as green and blue lines on the plot). Due to (intended) limited
> outflow (6mbps) BE queues build up and don't drain for the duration of
> the entire test creating more opportunities for aggregating BE traffic
> while other queues are near-empty and very short (time wise as well).

I agree with your explanation. Access to the media and queue length
are the two variables at play here.

I just committed a new flent test that should exercise the vo, vi, be,
and bk queues, "bursts_11e". I dropped the conventional ping from it
and just rely on netperf's udp_rr for each queue. It seems to "do the
right thing" on the ath9k....

And while I'm all in favor of getting 802.11e's behaviors more right,
and this seems like a good way to get there...

netperf's udp_rr is not how most traffic conventionally behaves. It
doesn't do tcp slow start or congestion control in particular...

In the case of the VO queue, for example, the (2004) intended behavior
was 1 isochronous packet per 10ms per voice sending station and one
from the ap, not a "ping". And at the time, VI was intended to be
unicast video. TCP was an afterthought. (wifi's original (1993) mac
was actually designed for ipx/spx!)

I long for regular "rrul" and "rrul_be" tests against the new stuff to
blow it up thoroughly as references along the way.
(tcp_upload, tcp_download, (and several of the rtt_fair tests also
between stations)). Will get formal about it here as soon as we end up
on the same kernel trees....

Furthermore 802.11e is not widely used - in particular, not much
internet bound/sourced traffic falls into more than BE and BK,
presently. and in some cases weirder - comcast re-marks a very large
percentage of to-the-home inbound traffic as CS1 (BK), btw, and
stations tend to use CS0. Data comes in on BK, acks go out on BE.

I/we will try to come up with intermediate tests between the burst
tests and the rrul tests as we go along the way.

> If you consider Wi-Fi is half-duplex and latency in the entire stack

In the context of this test regime...

<pedantry>
Saying wifi is "half"-duplex is a misleading way to think about it in
many respects. it is a shared medium more like early, non-switched
ethernet, with a weird mac that governs what sort of packets get
access to (a txop) the medium first, across all stations co-operating
within EDCA.

Half or full duplex is something that mostly applied to p2p serial
connections (or p2p wifi), not P2MP. Additionally characteristics like
exponential backoff make no sense were wifi any form of duplex, full
or half.

Certainly much stuff within a txop (block acks for example) can be
considered half duplex in a microcosmic context.

I wish we actually had words that accurately described wifi's actual behavior.
</pedantry>

> (for processing ICMP and UDP_RR) is greater than 11e contention window
> timings you can get your BE flow responses with extra delay (since
> other queues might have responses ready quicker).

yes. always having a request pending for each of the 802.11e queues is
actually not the best idea, it is better to take advantage of better
aggregation afforded by 802.11n/ac, to only have one or two of the
queues in use against any given station and promote or demote traffic
into a more-right queue.

simple example of the damage having all 4 queues always contending is
exemplified by running the rrul and rrul_be tests against nearly any
given AP.

>
> I've modified traffic-gen and re-run tests with bursts on all tested
> tids/ACs (tid0, tid1, tid5). I'm attaching the results.
>
> With bursts on all tids you can clearly see BK has much higher latency than BE.

The long term goal here, of course, is for BK (or the other queues) to
not have seconds of queuing latency but something more bounded to 2x
media access time...

> (Note, I've changed my AP to QCA988X with oldie firmware 10.1.467 for
> this test; it doesn't have the weird hiccups I was seeing on QCA99X0
> and newer QCA988X firmware reports bogus expected throughput which is
> most likely a result of my sloppy proof-of-concept change in ath10k).

So I should avoid Ben Greear's firmware for now?

>
>
> Michał
>
> On 16 March 2016 at 20:48, Jasmine Strong <jas@eero.com> wrote:
>> BK usually has 0 txop, so it doesn't do aggregation.
>>
>> On Wed, Mar 16, 2016 at 11:55 AM, Bob Copeland <me@bobcopeland.com> wrote:
>>>
>>> On Wed, Mar 16, 2016 at 11:36:31AM -0700, Dave Taht wrote:
>>> > That is the sanest 802.11e queue behavior I have ever seen! (at both
>>> > 6 and 300mbit! in the ath10k patched mac test)
>>>
>>> Out of curiosity, why does BE have larger latency than BK in that chart?
>>> I'd have expected the opposite.
>>>
>>> --
>>> Bob Copeland %% http://bobcopeland.com/
>>>
>>> _______________________________________________
>>> ath10k mailing list
>>> ath10k@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/ath10k
>>
>>
* Re: [RFCv2 0/3] mac80211: implement fq codel 2016-03-17 17:00 ` Dave Taht @ 2016-03-21 11:57 ` Michal Kazior 0 siblings, 0 replies; 15+ messages in thread From: Michal Kazior @ 2016-03-21 11:57 UTC (permalink / raw) To: Dave Taht Cc: Jasmine Strong, Network Development, linux-wireless, ath10k@lists.infradead.org, codel@lists.bufferbloat.net, make-wifi-fast [-- Attachment #1: Type: text/plain, Size: 5516 bytes --] On 17 March 2016 at 18:00, Dave Taht <dave.taht@gmail.com> wrote: > On Thu, Mar 17, 2016 at 1:55 AM, Michal Kazior <michal.kazior@tieto.com> wrote: > >> I suspect the BK/BE latency difference has to do with the fact that >> there's bulk traffic going on BE queues (this isn't reflected >> explicitly in the plots). The `bursts` flent test includes short >> bursts of traffic on tid0 (BE) which is shared with ICMP and BE UDP_RR >> (seen as green and blue lines on the plot). Due to (intended) limited >> outflow (6mbps) BE queues build up and don't drain for the duration of >> the entire test creating more opportunities for aggregating BE traffic >> while other queues are near-empty and very short (time wise as well). > > I agree with your explanation. Access to the media and queue length > are the two variables at play here. > > I just committed a new flent test that should exercise the vo,vi,be, > and bk queues, "bursts_11e". I dropped the conventional ping from it > and just rely on netperf's udp_rr for each queue. It seems to "do the > right thing" on the ath9k.... [...] > I long for regular "rrul" and "rrul_be" tests against the new stuff to > blow it up thoroughly as references along the way. > (tcp_upload, tcp_download, (and several of the rtt_fair tests also > between stations)). Will get formal about it here as soon as we end up > on the same kernel trees.... [...] > simple example of the damage having all 4 queues always contending is > exemplified by running the rrul and rrul_be tests against nearly any > given AP. Thanks! 
I've run more tests and am attaching results. A couple of words on the test naming: - "fast" means 1x1 station with good RF conditions - "slow" means 1x1 station with bad RF conditions (antenna unplugged) - "fast+slow" means traffic is directed to both "fast" and "slow" stations - "verfast" means 4x4 station for peak tput measurement - "autorate" means rate control is enabled - "rate6m" means 6mbps fixed tx rate on DUT - the DUT is acting as AP in all tests - other devices in the setup *do not* have any extra patches (so bidirectional tests must be carefully analyzed) - 4 sets of software patches: - fullpatch contains all codel patches (mac80211+ath10k) - macpatch contains only mac80211 changes (so ath10k at least gets to use per-txq fq-codel like queuing) - pre-waketx is ath10k with some patches reverted (before pull-push/wake-tx-queue stuff was applied) - waketx is current ath10k (i.e. with simple wake_tx_queue implementation) Observations/ notes: - "slow" case proves my naive get_expected_throughput() for ath10k is highly inaccurate due to not considering retries. because of that latency gets bad as mac80211's tx scheduling is queuing up more than necessary; ath9k should do a lot better with minstrel - i kept netperf2.6 (which has no udp-rr recovery) for now as it's easier to spot glitches Please let me know if you see anything interesting or worrying in these plots. >> I've modified traffic-gen and re-run tests with bursts on all tested >> tids/ACs (tid0, tid1, tid5). I'm attaching the results. >> >> With bursts on all tids you can clearly see BK has much higher latency than BE. > > The long term goal here, of course, is for BK (or the other queues) to > not have seconds of queuing latency but something more bounded to 2x > media access time... My patch already tries to maintain txop-based in-flight tx queue depth. Current defaults are to keep between 3-4 txops per hardware and roughly 2txops per tid. 
You could argue these are too big but I wanted to keep them conservative, at least initially, to make sure to not affect peak throughput badly. All of these are knobs you can play with via debugfs. This requires drivers to use ieee80211_tx_schedule(). If driver merely uses wake_tx_queue it will only benefit from flow fairness (albeit limited) but it will not keep queues at N txop fill level (unless driver does that on it's own). This means that Tim's ath9k patch will need to be adjusted a bit to make use of this new API prototype for full effect. Unfortunately I didn't have time to play on this front yet. >> (Note, I've changed my AP to QCA988X with oldie firmware 10.1.467 for >> this test; it doesn't have the weird hiccups I was seeing on QCA99X0 >> and newer QCA988X firmware reports bogus expected throughput which is >> most likely a result of my sloppy proof-of-concept change in ath10k). > > So I should avoid ben greer's firmware for now? I'm guessing his 10.1 fork should work fine. Not sure about the 10.2.4 though. Anyway, keep in mind you'll get mixed results with ath10k. The throughput estimation I've done for now is an ugly hack. It works in fixed-rate conditions (which I use to prove a point that given adequate rate estimation you can keep fw/hw tx queues at a reasonable latency). It doesn't consider tx retries and unstable RF conditions (rate control is in firmware and there's limited information available to the driver) though which leads to more frames being queued than necessary (and therefore increasing latency). This becomes apparent with real-life interference and tx retries (just compare "autorate,slow" against "rate6m,fast"). ath9k should do a lot better job at this (although that requires Tim's patches; I haven't tested that myself) because it uses minstrel which and should predict throughput a lot more reliably. 

Michał

[-- Attachment #2: flent-2016-03-21.tar.gz --]
[-- Type: application/x-gzip, Size: 2029295 bytes --]
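
The relationship described in the message above, between a txop-based queue depth and the accuracy of the throughput estimate, can be illustrated with some simple arithmetic. This is only a rough sketch and not the patch's actual code: the function names, the 4 ms txop duration and the 60 Mbps overestimate are assumptions made for illustration. Only the "roughly 2 txops per tid" figure and the 6 Mbps fixed rate come from the message.

```python
# Illustrative arithmetic only; not taken from mac80211 or ath10k.
# Assumption: one txop lasts about 4 ms (hypothetical value).

def txop_budget_bytes(expected_tput_bps, txops, txop_us=4000):
    """Bytes to keep in flight so that `txops` transmit opportunities
    are covered at the *expected* throughput."""
    return expected_tput_bps * txop_us * txops // (8 * 1_000_000)


def queue_delay_ms(queued_bytes, actual_tput_bps):
    """Time needed to drain `queued_bytes` at the *actual* throughput."""
    return queued_bytes * 8 * 1000 / actual_tput_bps


# Accurate estimate: 6 Mbps expected and 6 Mbps delivered.
good_budget = txop_budget_bytes(6_000_000, txops=2)
good_delay = queue_delay_ms(good_budget, 6_000_000)

# Overestimate: the driver expects 60 Mbps, but retries mean only
# 6 Mbps is actually delivered, so ten times more data is queued.
bad_budget = txop_budget_bytes(60_000_000, txops=2)
bad_delay = queue_delay_ms(bad_budget, 6_000_000)

print(good_budget, good_delay)  # 6000 bytes, 8.0 ms
print(bad_budget, bad_delay)    # 60000 bytes, 80.0 ms
```

Under these assumptions, a tenfold overestimate of throughput turns a two-txop budget into roughly ten times the intended queuing delay, which matches the qualitative behaviour reported for the "autorate,slow" case.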
* Re: [RFCv2 0/3] mac80211: implement fq codel
  2016-03-16 18:36 ` Dave Taht
  [not found] ` <CAA93jw6tDdiYuginPbUY1DFJLiDxofHMFN6j2BvQPabPmBtuRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-17 9:43 ` Michal Kazior
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Kazior @ 2016-03-17 9:43 UTC (permalink / raw)
  To: Dave Taht
  Cc: Network Development, codel@lists.bufferbloat.net, linux-wireless, ath10k@lists.infradead.org, make-wifi-fast

[-- Attachment #1: Type: text/plain, Size: 2810 bytes --]

I've re-tested selected cases with wmm_enabled=0 set on the DUT AP.
I'm attaching results.

Naming:
 * "old-" is without mac/ath10k changes (referred to as kvalo-reverts
   previously) and fq_codel on qdiscs,
 * "patched-" is all patches applied (both mac and ath),
 * "-be-bursts" is stock "bursts" flent test,
 * "-all-bursts" is modified "bursts" flent test to burst on all 3 tids
   simultaneously: tid0(BE), tid1(BK), tid5(VI).

Michał

On 16 March 2016 at 19:36, Dave Taht <dave.taht@gmail.com> wrote:
> That is the sanest 802.11e queue behavior I have ever seen! (at both
> 6 and 300mbit! in the ath10k patched mac test)
>
> It would be good to add a flow to this test that exercises the VI
> queue (CS5 diffserv marking?), and to repeat this test with wmm
> disabled for comparison.
>
>
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> https://www.gofundme.com/savewifi
>
>
> On Wed, Mar 16, 2016 at 8:37 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> it is helpful to name the test files coherently in the flent tests, in
>> addition to using a directory structure and timestamp. It makes doing
>> comparison plots in data->add-other-open-data-files simpler. "-t
>> patched-mac-300mbps", for example.
>>
>> Also netperf from svn (maybe 2.7, don't remember) will restart udp_rr
>> after a packet loss in 250ms. Seeing a loss on UDP_RR and it stop for
>> a while is "ok".
>> Dave Täht
>> Let's go make home routers and wifi faster! With better software!
>> https://www.gofundme.com/savewifi
>>
>>
>> On Wed, Mar 16, 2016 at 3:26 AM, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> On 16 March 2016 at 11:17, Michal Kazior <michal.kazior@tieto.com> wrote:
>>>> Hi,
>>>>
>>>> Most notable changes:
>>> [...]
>>>> * ath10k proof-of-concept that uses the new tx
>>>> scheduling (will post results in separate
>>>> email)
>>>
>>> I'm attaching a bunch of tests I've done using flent. They are all
>>> "burst" tests with burst-ports=1 and burst-length=2. The testing
>>> topology is:
>>>
>>> AP ----> STA
>>> AP )) (( STA
>>> [veth]--[br]--[wlan] )) (( [wlan]
>>>
>>> You can notice that in some tests plot data gets cut-off. There are 2
>>> problems I've identified:
>>> - excess drops (not a problem with the patchset and can be seen when
>>> there's no codel-in-mac or scheduling isn't used)
>>> - UDP_RR hangs (apparently QCA99X0 I have hangs for a few hundred ms
>>> sometimes at times and doesn't Rx frames causing UDP_RR to stop
>>> mid-way; confirmed with logs and sniffer; I haven't figured out *why*
>>> exactly, could be some hw/fw quirk)
>>>
>>> Let me know if you have questions or comments regarding my testing/results.
>>>
>>>
>>> Michał

[-- Attachment #2: bursts-2016-03-17T093033.443115.patched_all_bursts.flent.gz --]
[-- Type: application/x-gzip, Size: 13841 bytes --]

[-- Attachment #3: bursts-2016-03-17T092946.721003.patched_be_bursts.flent.gz --]
[-- Type: application/x-gzip, Size: 13786 bytes --]

[-- Attachment #4: bursts-2016-03-17T092445.132728.old_be_bursts.flent.gz --]
[-- Type: application/x-gzip, Size: 6349 bytes --]

[-- Attachment #5: bursts-2016-03-17T091952.053950.old_all_bursts.flent.gz --]
[-- Type: application/x-gzip, Size: 5458 bytes --]

[-- Attachment #6: patched-be-bursts.gif --]
[-- Type: image/gif, Size: 17961 bytes --]
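
The tid and AC names used in the test naming above (tid0 as BE, tid1 as BK, tid5 as VI) follow from how a diffserv marking is turned into a tid. A minimal sketch of that mapping is shown here, assuming the default behaviour where the tid is simply the top three bits of the DSCP value and no QoS map overrides it. The function and table names are hypothetical and chosen for illustration; they are not kernel identifiers.

```python
# Sketch of the default diffserv -> tid -> access category mapping.
# Assumption: tid = top 3 bits of the 6-bit DSCP, no QoS map in use.

# Standard 802.1d priority (tid 0..7) to WMM access category table.
TID_TO_AC = ["BE", "BK", "BK", "BE", "VI", "VI", "VO", "VO"]

# Well-known DSCP code points (6-bit values).
DSCP = {"CS0": 0, "CS1": 8, "CS5": 40, "EF": 46, "CS6": 48, "CS7": 56}


def dscp_to_tid(dscp):
    """Return the tid for a 6-bit DSCP value (top three bits)."""
    return (dscp & 0x3F) >> 3


def dscp_to_ac(dscp):
    """Return the access category name for a 6-bit DSCP value."""
    return TID_TO_AC[dscp_to_tid(dscp)]


for name, value in DSCP.items():
    print(f"{name}: tid{dscp_to_tid(value)} -> {dscp_to_ac(value)}")
```

With this mapping EF and CS5 both land on tid5 (VI), while CS6 and CS7 are needed to reach VO, so a test that marks its "voice" flow with EF would not exercise the VO queue.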
* Re: [RFCv2 0/3] mac80211: implement fq codel
  2016-03-16 15:37 ` Dave Taht
  2016-03-16 18:36 ` Dave Taht
@ 2016-03-17 9:03 ` Michal Kazior
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Kazior @ 2016-03-17 9:03 UTC (permalink / raw)
  To: Dave Taht
  Cc: Felix Fietkau, Emmanuel Grumbach, Network Development, linux-wireless, ath10k@lists.infradead.org, codel@lists.bufferbloat.net, make-wifi-fast, Johannes Berg, Tim Shepard

On 16 March 2016 at 16:37, Dave Taht <dave.taht@gmail.com> wrote:
> it is helpful to name the test files coherently in the flent tests, in
> addition to using a directory structure and timestamp. It makes doing
> comparison plots in data->add-other-open-data-files simpler. "-t
> patched-mac-300mbps", for example.

Sorry. I'm still trying to figure out what variables are worth
considering for comparison purposes.

> Also netperf from svn (maybe 2.7, don't remember) will restart udp_rr
> after a packet loss in 250ms. Seeing a loss on UDP_RR and it stop for
> a while is "ok".

I'm using 2.6 straight out of the Debian repos, so yeah. I guess I'll
try a more recent netperf if I can't figure out the hiccups.

Michał

> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> https://www.gofundme.com/savewifi
>
>
> On Wed, Mar 16, 2016 at 3:26 AM, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 16 March 2016 at 11:17, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> Hi,
>>>
>>> Most notable changes:
>> [...]
>>> * ath10k proof-of-concept that uses the new tx
>>> scheduling (will post results in separate
>>> email)
>>
>> I'm attaching a bunch of tests I've done using flent. They are all
>> "burst" tests with burst-ports=1 and burst-length=2. The testing
>> topology is:
>>
>> AP ----> STA
>> AP )) (( STA
>> [veth]--[br]--[wlan] )) (( [wlan]
>>
>> You can notice that in some tests plot data gets cut-off. There are 2
>> problems I've identified:
>> - excess drops (not a problem with the patchset and can be seen when
>> there's no codel-in-mac or scheduling isn't used)
>> - UDP_RR hangs (apparently QCA99X0 I have hangs for a few hundred ms
>> sometimes at times and doesn't Rx frames causing UDP_RR to stop
>> mid-way; confirmed with logs and sniffer; I haven't figured out *why*
>> exactly, could be some hw/fw quirk)
>>
>> Let me know if you have questions or comments regarding my testing/results.
>>
>>
>> Michał
Thread overview: 15+ messages
2016-03-21 17:10 [RFCv2 0/3] mac80211: implement fq codel Dave Taht
2016-03-22 8:05 ` Michal Kazior
2016-03-22 9:51 ` Toke Høiland-Jørgensen
-- strict thread matches above, loose matches on Subject: below --
2016-03-16 10:17 Michal Kazior
2016-03-16 10:26 ` Michal Kazior
2016-03-16 15:37 ` Dave Taht
2016-03-16 18:36 ` Dave Taht
[not found] ` <CAA93jw6tDdiYuginPbUY1DFJLiDxofHMFN6j2BvQPabPmBtuRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-16 18:55 ` Bob Copeland
2016-03-16 19:48 ` Jasmine Strong
2016-03-17 8:55 ` Michal Kazior
2016-03-17 11:12 ` Bob Copeland
[not found] ` <CA+BoTQnFN6VANn=5EUvHc0Dbfh4Zv0HraOto2ySN3_HdOpD7Sg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-17 17:00 ` Dave Taht
2016-03-21 11:57 ` Michal Kazior
2016-03-17 9:43 ` Michal Kazior
2016-03-17 9:03 ` Michal Kazior