From: John Fastabend <john.fastabend@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
fw@strlen.de, jhs@mojatatu.com, eric.dumazet@gmail.com,
netdev@vger.kernel.org
Subject: Re: [RFC PATCH v2 08/10] net: sched: pfifo_fast use alf_queue
Date: Fri, 15 Jul 2016 15:18:12 -0700 [thread overview]
Message-ID: <57896124.6090402@gmail.com> (raw)
In-Reply-To: <20160715132329.0d04ac42@redhat.com>
On 16-07-15 04:23 AM, Jesper Dangaard Brouer wrote:
> On Thu, 14 Jul 2016 17:07:33 -0700
> John Fastabend <john.fastabend@gmail.com> wrote:
>
>> On 16-07-14 04:42 PM, Alexei Starovoitov wrote:
>>> On Wed, Jul 13, 2016 at 11:23:12PM -0700, John Fastabend wrote:
>>>> This converts the pfifo_fast qdisc to use the alf_queue enqueue and
>>>> dequeue routines then sets the NOLOCK bit.
>>>>
>>>> This also removes the logic used to pick the next band to dequeue from
>>>> and instead just checks each alf_queue for packets from top priority
>>>> to lowest. This might need to be a bit more clever but seems to work
>>>> for now.
>>>>
>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>> ---
>>>> net/sched/sch_generic.c | 131 +++++++++++++++++++++++++++--------------------
>>>
>>>> static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
>>>> struct sk_buff **to_free)
>>>> {
>>>> - return qdisc_drop(skb, qdisc, to_free);
>>>> + err = skb_array_produce_bh(q, skb);
>>> ..
>>>> static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>>> {
>>>> + skb = skb_array_consume_bh(q);
>>>
>>> For this particular qdisc the performance gain should come from
>>> granularityof spin_lock, right?
>>
>> And the fact that the consumer and producer are using different
>> locks now.
>
> Yes. Splitting up enqueue'ers (producer's) from the dequeuer (consumer)
> is an important step, because today the qdisc layer have this problem
> that enqueue'ers can starve the single dequeuer. The current
> mitigation tricks are the enq busy_lock and bulk dequeue.
>
> As John says, using skb_array cause producers and consumer to use
> different locks.
>
>>> Before we were taking the lock much earlier. Here we keep the lock,
>>> but for the very short time.
>>> original pps lockless diff
>>> 1 1418168 1269450 -148718
>>> 2 1587390 1553408 -33982
>>> 4 1084961 1683639 +598678
>>> 8 989636 1522723 +533087
>>> 12 1014018 1348172 +334154
>>>
>
I was able to recover the performance loss here and actually improve it
by fixing a few things in the patchset. Namely qdisc_run was
being called in a few places unnecessarily creating a fairly large per
packet cost overhead and then using the _bh locks was costing quite a
bit and is not needed as Jesper pointer out.
So new pps data here in somewhat raw format. I ran five iterations of
each thread count (1,2,4,8,12)
nolock (pfifo_fast)
1: 1440293 1421602 1409553 1393469 1424543
2: 1754890 1819292 1727948 1797711 1743427
4: 3282665 3344095 3315220 3332777 3348972
8: 2940079 1644450 2950777 2922085 2946310
12: 2042084 2610060 2857581 3493162 3104611
lock (pfifo_fast)
1: 1471479 1469142 1458825 1456788 1453952
2: 1746231 1749490 1753176 1753780 1755959
4: 1119626 1120515 1121478 1119220 1121115
8: 1001471 999308 1000318 1000776 1000384
12: 989269 992122 991590 986581 990430
nolock (mq)
1: 1435952 1459523 1448860 1385451 1435031
2: 2850662 2855702 2859105 2855443 2843382
4: 5288135 5271192 5252242 5270192 5311642
8: 10042731 10018063 9891813 9968382 9956727
12: 13265277 13384199 13438955 13363771 13436198
lock (mq)
1: 1448374 1444208 1437459 1437088 1452453
2: 2687963 2679221 2651059 2691630 2667479
4: 5153884 4684153 5091728 4635261 4902381
8: 9292395 9625869 9681835 9711651 9660498
12: 13553918 13682410 14084055 13946138 13724726
So then if we just use the first test example because I'm being a
bit lazy and don't want to calculate the avg/mean/whatever we get
a pfifo_fast chart like,
locked nolock diff
---------------------------------------------------
1 1471479 1440293 − 31186
2 1746231 1754890 + 8659
4 1119626 3282665 +2163039
8 1119626 2940079 +1820453
12 989269 2857581* +1868312
[*] I pulled the 3rd iteration here as the 1st one seems off
And the mq chart looks reasonable again with these changes,
locked nolock diff
---------------------------------------------------
1 1448374 1435952 - 12422
2 2687963 2850662 + 162699
4 5153884 5288135 + 134251
8 9292395 10042731 + 750336
12 13553918 13265277 - 288641
So the mq case is a bit of a wash from my point of view which I sort
of expected seeing in this test case there is no contention on the
enqueue()/producer or dequeue()/consumer case when running pktgen
at 1 thread per qdisc/queue. A better test would be to fire up a few
thousand udp sessions and bang on the qdiscs to get contention on the
enqueue side. I'll try this next. On another note the variance is a
touch concerning in the data above for the no lock case so might look
into that a bit more to see why we can get 1mpps swing in one of those
cases I sort of wonder if something kicked off on my test machine
to cause that.
Also I'm going to take a look at Jesper's microbenchmark numbers but I
think if I can convince myself that using skb_array helps or at least
does no harm I might push to have this include with skb_array and then
work on optimizing the ring type/kind/etc. as a follow up patch.
Additionally it does seem to provide goodness on the pfifo_fast single
queue case.
Final point is there are more optimizations we can do once the enqueue
and dequeue is separated. For example two fairly easy things include
removing HARD_TX_LOCK nn NICs with a ring per core and adding bulk
dequeue() to the skb_array or alf queue or whatever object we end up
on. And I expect this will provide additional perf boost.
Thanks,
John
next prev parent reply other threads:[~2016-07-15 22:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-14 6:19 [RFC PATCH v2 00/10] running qdiscs without qdisc_lock John Fastabend
2016-07-14 6:19 ` [RFC PATCH v2 01/10] net: sched: allow qdiscs to handle locking John Fastabend
2016-07-14 6:44 ` John Fastabend
2016-07-14 6:20 ` [RFC PATCH v2 02/10] net: sched: qdisc_qlen for per cpu logic John Fastabend
2016-07-14 6:20 ` [RFC PATCH v2 03/10] net: sched: provide per cpu qstat helpers John Fastabend
2016-07-14 6:21 ` [RFC PATCH v2 04/10] net: sched: a dflt qdisc may be used with per cpu stats John Fastabend
2016-07-14 6:21 ` [RFC PATCH v2 05/10] net: sched: per cpu gso handlers John Fastabend
2016-07-14 6:22 ` [RFC PATCH v2 06/10] net: sched: support qdisc_reset on NOLOCK qdisc John Fastabend
2016-07-14 6:22 ` [RFC PATCH v2 07/10] net: sched: support skb_bad_tx with lockless qdisc John Fastabend
2016-07-14 6:23 ` [RFC PATCH v2 08/10] net: sched: pfifo_fast use alf_queue John Fastabend
2016-07-14 15:11 ` Jesper Dangaard Brouer
2016-07-15 0:09 ` John Fastabend
2016-07-15 10:09 ` Jesper Dangaard Brouer
2016-07-15 17:29 ` John Fastabend
2016-07-14 23:42 ` Alexei Starovoitov
2016-07-15 0:07 ` John Fastabend
2016-07-15 11:23 ` Jesper Dangaard Brouer
2016-07-15 22:18 ` John Fastabend [this message]
2016-07-15 22:48 ` Alexei Starovoitov
2016-07-14 6:23 ` [RFC PATCH v2 09/10] net: sched: helper to sum qlen John Fastabend
2016-07-14 6:24 ` [RFC PATCH v2 10/10] net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57896124.6090402@gmail.com \
--to=john.fastabend@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=brouer@redhat.com \
--cc=eric.dumazet@gmail.com \
--cc=fw@strlen.de \
--cc=jhs@mojatatu.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.