From: John Fastabend <john.fastabend@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
fw@strlen.de, jhs@mojatatu.com, eric.dumazet@gmail.com,
netdev@vger.kernel.org
Subject: Re: [RFC PATCH v2 08/10] net: sched: pfifo_fast use alf_queue
Date: Fri, 15 Jul 2016 15:18:12 -0700 [thread overview]
Message-ID: <57896124.6090402@gmail.com> (raw)
In-Reply-To: <20160715132329.0d04ac42@redhat.com>
On 16-07-15 04:23 AM, Jesper Dangaard Brouer wrote:
> On Thu, 14 Jul 2016 17:07:33 -0700
> John Fastabend <john.fastabend@gmail.com> wrote:
>
>> On 16-07-14 04:42 PM, Alexei Starovoitov wrote:
>>> On Wed, Jul 13, 2016 at 11:23:12PM -0700, John Fastabend wrote:
>>>> This converts the pfifo_fast qdisc to use the alf_queue enqueue and
>>>> dequeue routines then sets the NOLOCK bit.
>>>>
>>>> This also removes the logic used to pick the next band to dequeue from
>>>> and instead just checks each alf_queue for packets from top priority
>>>> to lowest. This might need to be a bit more clever but seems to work
>>>> for now.
>>>>
>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>> ---
>>>> net/sched/sch_generic.c | 131 +++++++++++++++++++++++++++--------------------
>>>
>>>> static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
>>>> struct sk_buff **to_free)
>>>> {
>>>> - return qdisc_drop(skb, qdisc, to_free);
>>>> + err = skb_array_produce_bh(q, skb);
>>> ..
>>>> static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>>> {
>>>> + skb = skb_array_consume_bh(q);
>>>
>>> For this particular qdisc the performance gain should come from
>>> granularityof spin_lock, right?
>>
>> And the fact that the consumer and producer are using different
>> locks now.
>
> Yes. Splitting up enqueue'ers (producer's) from the dequeuer (consumer)
> is an important step, because today the qdisc layer have this problem
> that enqueue'ers can starve the single dequeuer. The current
> mitigation tricks are the enq busy_lock and bulk dequeue.
>
> As John says, using skb_array cause producers and consumer to use
> different locks.
>
>>> Before we were taking the lock much earlier. Here we keep the lock,
>>> but for the very short time.
>>> original pps lockless diff
>>> 1 1418168 1269450 -148718
>>> 2 1587390 1553408 -33982
>>> 4 1084961 1683639 +598678
>>> 8 989636 1522723 +533087
>>> 12 1014018 1348172 +334154
>>>
>
I was able to recover the performance loss here and actually improve it
by fixing a few things in the patchset. Namely qdisc_run was
being called in a few places unnecessarily creating a fairly large per
packet cost overhead and then using the _bh locks was costing quite a
bit and is not needed as Jesper pointer out.
So new pps data here in somewhat raw format. I ran five iterations of
each thread count (1,2,4,8,12)
nolock (pfifo_fast)
1: 1440293 1421602 1409553 1393469 1424543
2: 1754890 1819292 1727948 1797711 1743427
4: 3282665 3344095 3315220 3332777 3348972
8: 2940079 1644450 2950777 2922085 2946310
12: 2042084 2610060 2857581 3493162 3104611
lock (pfifo_fast)
1: 1471479 1469142 1458825 1456788 1453952
2: 1746231 1749490 1753176 1753780 1755959
4: 1119626 1120515 1121478 1119220 1121115
8: 1001471 999308 1000318 1000776 1000384
12: 989269 992122 991590 986581 990430
nolock (mq)
1: 1435952 1459523 1448860 1385451 1435031
2: 2850662 2855702 2859105 2855443 2843382
4: 5288135 5271192 5252242 5270192 5311642
8: 10042731 10018063 9891813 9968382 9956727
12: 13265277 13384199 13438955 13363771 13436198
lock (mq)
1: 1448374 1444208 1437459 1437088 1452453
2: 2687963 2679221 2651059 2691630 2667479
4: 5153884 4684153 5091728 4635261 4902381
8: 9292395 9625869 9681835 9711651 9660498
12: 13553918 13682410 14084055 13946138 13724726
So then if we just use the first test example because I'm being a
bit lazy and don't want to calculate the avg/mean/whatever we get
a pfifo_fast chart like,
locked nolock diff
---------------------------------------------------
1 1471479 1440293 − 31186
2 1746231 1754890 + 8659
4 1119626 3282665 +2163039
8 1119626 2940079 +1820453
12 989269 2857581* +1868312
[*] I pulled the 3rd iteration here as the 1st one seems off
And the mq chart looks reasonable again with these changes,
locked nolock diff
---------------------------------------------------
1 1448374 1435952 - 12422
2 2687963 2850662 + 162699
4 5153884 5288135 + 134251
8 9292395 10042731 + 750336
12 13553918 13265277 - 288641
So the mq case is a bit of a wash from my point of view which I sort
of expected seeing in this test case there is no contention on the
enqueue()/producer or dequeue()/consumer case when running pktgen
at 1 thread per qdisc/queue. A better test would be to fire up a few
thousand udp sessions and bang on the qdiscs to get contention on the
enqueue side. I'll try this next. On another note the variance is a
touch concerning in the data above for the no lock case so might look
into that a bit more to see why we can get 1mpps swing in one of those
cases I sort of wonder if something kicked off on my test machine
to cause that.
Also I'm going to take a look at Jesper's microbenchmark numbers but I
think if I can convince myself that using skb_array helps or at least
does no harm I might push to have this include with skb_array and then
work on optimizing the ring type/kind/etc. as a follow up patch.
Additionally it does seem to provide goodness on the pfifo_fast single
queue case.
Final point is there are more optimizations we can do once the enqueue
and dequeue is separated. For example two fairly easy things include
removing HARD_TX_LOCK nn NICs with a ring per core and adding bulk
dequeue() to the skb_array or alf queue or whatever object we end up
on. And I expect this will provide additional perf boost.
Thanks,
John
next prev parent reply other threads:[~2016-07-15 22:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-14 6:19 [RFC PATCH v2 00/10] running qdiscs without qdisc_lock John Fastabend
2016-07-14 6:19 ` [RFC PATCH v2 01/10] net: sched: allow qdiscs to handle locking John Fastabend
2016-07-14 6:44 ` John Fastabend
2016-07-14 6:20 ` [RFC PATCH v2 02/10] net: sched: qdisc_qlen for per cpu logic John Fastabend
2016-07-14 6:20 ` [RFC PATCH v2 03/10] net: sched: provide per cpu qstat helpers John Fastabend
2016-07-14 6:21 ` [RFC PATCH v2 04/10] net: sched: a dflt qdisc may be used with per cpu stats John Fastabend
2016-07-14 6:21 ` [RFC PATCH v2 05/10] net: sched: per cpu gso handlers John Fastabend
2016-07-14 6:22 ` [RFC PATCH v2 06/10] net: sched: support qdisc_reset on NOLOCK qdisc John Fastabend
2016-07-14 6:22 ` [RFC PATCH v2 07/10] net: sched: support skb_bad_tx with lockless qdisc John Fastabend
2016-07-14 6:23 ` [RFC PATCH v2 08/10] net: sched: pfifo_fast use alf_queue John Fastabend
2016-07-14 15:11 ` Jesper Dangaard Brouer
2016-07-15 0:09 ` John Fastabend
2016-07-15 10:09 ` Jesper Dangaard Brouer
2016-07-15 17:29 ` John Fastabend
2016-07-14 23:42 ` Alexei Starovoitov
2016-07-15 0:07 ` John Fastabend
2016-07-15 11:23 ` Jesper Dangaard Brouer
2016-07-15 22:18 ` John Fastabend [this message]
2016-07-15 22:48 ` Alexei Starovoitov
2016-07-14 6:23 ` [RFC PATCH v2 09/10] net: sched: helper to sum qlen John Fastabend
2016-07-14 6:24 ` [RFC PATCH v2 10/10] net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57896124.6090402@gmail.com \
--to=john.fastabend@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=brouer@redhat.com \
--cc=eric.dumazet@gmail.com \
--cc=fw@strlen.de \
--cc=jhs@mojatatu.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).