From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Fastabend
Subject: Re: [RFC PATCH 09/12] net: sched: pfifo_fast use alf_queue
Date: Wed, 13 Jan 2016 10:18:09 -0800
Message-ID: <569694E1.6000109@gmail.com>
References: <20151230175000.26257.41532.stgit@john-Precision-Tower-5810>
 <20151230175420.26257.868.stgit@john-Precision-Tower-5810>
 <20160113.112459.133519495867760481.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: daniel@iogearbox.net, eric.dumazet@gmail.com, jhs@mojatatu.com,
 aduyck@mirantis.com, brouer@redhat.com, john.r.fastabend@intel.com,
 netdev@vger.kernel.org
To: David Miller
Return-path:
Received: from mail-pf0-f195.google.com ([209.85.192.195]:35275 "EHLO
 mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753060AbcAMSSY (ORCPT ); Wed, 13 Jan 2016 13:18:24 -0500
Received: by mail-pf0-f195.google.com with SMTP id 65so6355705pff.2 for ;
 Wed, 13 Jan 2016 10:18:24 -0800 (PST)
In-Reply-To: <20160113.112459.133519495867760481.davem@davemloft.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 16-01-13 08:24 AM, David Miller wrote:
> From: John Fastabend
> Date: Wed, 30 Dec 2015 09:54:20 -0800
>
>> This also removes the logic used to pick the next band to dequeue
>> from and instead just checks each alf_queue for packets from
>> top priority to lowest. This might need to be a bit more clever
>> but seems to work for now.
>
> I suspect we won't need to be more clever, there's only 3 bands
> after all and the head/tail tests should be fast enough.
>

Even with alf_dequeue operation dequeueing a single skb at a time and
iterating over the bands as I did here I see a perf improvement on my
desktop here,

threads		mq + pfifo_fast
		before		after
1		1.70 Mpps	2.00 Mpps
2		3.15 Mpps	3.90 Mpps
4		4.70 Mpps	6.98 Mpps
8		9.57 Mpps	11.62 Mpps

This is using my pktgen patch previously posted and bulking set to 0 in
both cases.
This doesn't really say anything about the contention cases, etc., so
I'll do some more testing before the merge window opens. Also, my kernel
isn't really optimized; I had some of the kernel hacking stuff enabled,
etc.

It at least looks promising though, and dequeueing more than a single skb
out of pfifo_fast should help. Something like,

static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
{
	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
	struct sk_buff *skb[8+1];
	int band, n = 0, i;

	skb[0] = NULL;

	for (band = 0; band < PFIFO_FAST_BANDS && !skb[0]; band++) {
		struct alf_queue *q = band2list(priv, band);

		if (alf_queue_empty(q))
			continue;

		n = alf_mc_dequeue(q, skb, 8); <-- 4, 8, or something
	}

.John
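
For readers who want to try the band scan outside the kernel, here is a
minimal user-space sketch of the same idea: check each band from highest
priority to lowest and pull a burst from the first non-empty one. It is
only a single-threaded model. The names ring, ring_enqueue,
ring_dequeue_bulk, and prio_dequeue_bulk are hypothetical stand-ins and
not the alf_queue API, and none of the lock-free or multi-consumer
behaviour is modelled.

```c
#define BANDS 3      /* pfifo_fast has three priority bands */
#define RING_SIZE 16 /* power of two, so index masking works */
#define BULK 8       /* burst size; the mail suggests 4, 8, or something */

/* Hypothetical single-threaded stand-in for one per-band queue. */
struct ring {
	unsigned int head;     /* count of items read so far */
	unsigned int tail;     /* count of items written so far */
	void *slot[RING_SIZE];
};

static int ring_empty(const struct ring *r)
{
	return r->head == r->tail;
}

/* Enqueue one item; returns 1 on success, 0 if the ring is full. */
static int ring_enqueue(struct ring *r, void *item)
{
	if (r->tail - r->head == RING_SIZE)
		return 0;
	r->slot[r->tail++ & (RING_SIZE - 1)] = item;
	return 1;
}

/* Dequeue up to max items into out[]; returns how many were taken. */
static int ring_dequeue_bulk(struct ring *r, void **out, int max)
{
	int n = 0;

	while (n < max && !ring_empty(r))
		out[n++] = r->slot[r->head++ & (RING_SIZE - 1)];
	return n;
}

/*
 * Scan bands from highest priority (0) to lowest and take a burst of
 * up to max items from the first band that has anything queued.
 */
static int prio_dequeue_bulk(struct ring bands[BANDS], void **out, int max)
{
	int band;

	for (band = 0; band < BANDS; band++) {
		if (ring_empty(&bands[band]))
			continue;
		return ring_dequeue_bulk(&bands[band], out, max);
	}
	return 0;
}
```

A burst never mixes bands in this sketch, so a lower band is only
touched once every higher band is empty at the time of the scan.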