From: John Fastabend
Subject: [RFC PATCH 00/17] lockless qdisc
Date: Mon, 13 Nov 2017 12:07:38 -0800
Message-ID: <20171113195256.6245.64676.stgit@john-Precision-Tower-5810>
To: willemdebruijn.kernel@gmail.com, daniel@iogearbox.net, eric.dumazet@gmail.com
Cc: make0818@gmail.com, netdev@vger.kernel.org, jiri@resnulli.us, xiyou.wangcong@gmail.com

Multiple folks asked me about this series at net(dev)conf, so with a 10+ hour flight and a bit of testing once back home I think these are ready to be submitted. Net-next is closed at the moment (http://vger.kernel.org/~davem/net-next.html) but, once it opens up, we can get these in first thing and have plenty of time to resolve any fallout, although I haven't seen any issues in my latest testing.

My first test case uses multiple containers (via cilium), where multiple client containers use 'wrk' to benchmark connections against a server container running lighttpd, which is configured to use multiple threads, one per core. Additionally, this test has a proxy agent running, so all traffic takes an extra hop through a proxy container. In these cases each TCP packet traverses the egress qdisc layer at least four times and the ingress qdisc layer an additional four times. This makes for a good stress test IMO; perf details below.

The other micro-benchmark I run is injecting packets directly into the qdisc layer using pktgen.
This uses the benchmark script ./pktgen_bench_xmit_mode_queue_xmit.sh.

Benchmarks were taken in two cases: "base", running the latest net-next with no changes to the qdisc layer, and "qdisc", run with the lockless qdisc updates. Numbers are reported in req/sec. All virtual 'veth' devices run with pfifo_fast in the qdisc test case.

`wrk -t16 -c $conns -d30 "http://[$SERVER_IP4]:80"`

conns       16     32     64   1024
-----------------------------------------------
base:    18831  20201  21393  29151
qdisc:   19309  21063  23899  29265

Notice that in all cases we see a performance improvement when running with the qdisc case.

Microbenchmarks using pktgen are as follows:

`pktgen_bench_xmit_mode_queue_xmit.sh -t 1 -i eth2 -c 20000000`

base(mq):          2.1Mpps
base(pfifo_fast):  2.1Mpps
qdisc(mq):         2.6Mpps
qdisc(pfifo_fast): 2.6Mpps

Notice the numbers are the same for mq and pfifo_fast because only a single thread is tested here.

Comments and feedback welcome. Anyone willing to do additional testing would be greatly appreciated. The patches can be pulled here,

https://github.com/cilium/linux/tree/qdisc

Thanks,
John

---

John Fastabend (17):
      net: sched: cleanup qdisc_run and __qdisc_run semantics
      net: sched: allow qdiscs to handle locking
      net: sched: remove remaining uses for qdisc_qlen in xmit path
      net: sched: provide per cpu qstat helpers
      net: sched: a dflt qdisc may be used with per cpu stats
      net: sched: explicit locking in gso_cpu fallback
      net: sched: drop qdisc_reset from dev_graft_qdisc
      net: sched: use skb list for skb_bad_tx
      net: sched: check for frozen queue before skb_bad_txq check
      net: sched: qdisc_qlen for per cpu logic
      net: sched: helper to sum qlen
      net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq
      net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio
      net: skb_array: expose peek API
      net: sched: pfifo_fast use skb_array
      net: skb_array additions for unlocked consumer
      net: sched: lock once per bulk dequeue

 0 files changed
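As a quick arithmetic sanity check on the benchmark numbers above, the relative improvements of the lockless runs over base work out as follows (this is just a sketch over the reported figures, not part of the test harness):

```python
# Relative improvement of the lockless ("qdisc") runs over "base",
# computed from the req/sec table and the pktgen Mpps numbers above.
base  = {16: 18831, 32: 20201, 64: 21393, 1024: 29151}
qdisc = {16: 19309, 32: 21063, 64: 23899, 1024: 29265}

for conns in sorted(base):
    gain = (qdisc[conns] - base[conns]) / base[conns] * 100
    print(f"wrk {conns:>4} conns: +{gain:.1f}%")

# pktgen single-thread case: 2.1 Mpps (base) vs 2.6 Mpps (qdisc)
pktgen_gain = (2.6 - 2.1) / 2.1 * 100
print(f"pktgen single thread: +{pktgen_gain:.1f}%")
```

The largest relative win in the wrk test shows up in the mid-range connection counts, while the single-thread pktgen injection sees the biggest jump overall.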