From: Michael Ma <make0818@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
    Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: qdisc spin lock
Date: Thu, 21 Apr 2016 15:12:28 -0700
Message-ID: <CAAmHdhx_Db3GMCmwn3UJajP7_se6tRHPGk_fQUDgDWDq5hN34A@mail.gmail.com>
In-Reply-To: <1461242518.7627.8.camel@edumazet-glaptop3.roam.corp.google.com>

2016-04-21 5:41 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
> On Wed, 2016-04-20 at 22:51 -0700, Michael Ma wrote:
>> 2016-04-20 15:34 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> > On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:
>> >> 2016-04-08 7:19 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> >> > On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:
>> >> >> I didn't really know that multiple qdiscs can be isolated using MQ so
>> >> >> that each txq can be associated with a particular qdisc. Also we don't
>> >> >> really have multiple interfaces...
>> >> >>
>> >> >> With this MQ solution we'll still need to assign transmit queues to
>> >> >> different classes by doing some math on the bandwidth limit if I
>> >> >> understand correctly, which seems to be less convenient compared with
>> >> >> a solution purely within HTB.
>> >> >>
>> >> >> I assume that with this solution I can still share qdisc among
>> >> >> multiple transmit queues - please let me know if this is not the case.
>> >> >
>> >> > Note that this MQ + HTB thing works well, unless you use a bonding
>> >> > device. (Or you need the MQ+HTB on the slaves, with no way of sharing
>> >> > tokens between the slaves)
>> >>
>> >> Actually MQ+HTB works well for small packets - like flow of 512 byte
>> >> packets can be throttled by HTB using one txq without being affected
>> >> by other flows with small packets. However I found using this solution
>> >> large packets (10k for example) will only achieve very limited
>> >> bandwidth. In my test I used MQ to assign one txq to a HTB which sets
>> >> rate at 1Gbit/s, 512 byte packets can achieve the ceiling rate by
>> >> using 30 threads. But sending 10k packets using 10 threads has only 10
>> >> Mbit/s with the same TC configuration. If I increase burst and cburst
>> >> of HTB to some extreme large value (like 50MB) the ceiling rate can be
>> >> hit.
>> >>
>> >> The strange thing is that I don't see this problem when using HTB as
>> >> the root. So txq number seems to be a factor here - however it's
>> >> really hard to understand why would it only affect larger packets. Is
>> >> this a known issue? Any suggestion on how to investigate the issue
>> >> further? Profiling shows that the cpu utilization is pretty low.
>> >
>> > You could try
>> >
>> > perf record -a -g -e skb:kfree_skb sleep 5
>> > perf report
>> >
>> > So that you see where the packets are dropped.
>> >
>> > Chances are that your UDP sockets SO_SNDBUF is too big, and packets are
>> > dropped at qdisc enqueue time, instead of having backpressure.
>> >
>>
>> Thanks for the hint - how should I read the perf report? Also we're
>> using TCP socket in this testing - TCP window size is set to 70kB.
>
> But how are you telling TCP to send 10k packets ?
>
We just write a 10k buffer to the socket and wait for a response
from the server (using read()) before the next write. Using tcpdump I
can see that each 10k write is actually sent as 3 packets
(7.3k/1.5k/1.3k).

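A minimal sketch of the request/response pattern described above, for
anyone who wants to reproduce it. This is not the actual test program
(which is not shown in this thread). It assumes "10k" means 10,000 bytes
and that the server answers with a short fixed-size reply. PAYLOAD_SIZE,
REPLY, recv_exact(), ack_server() and request_response_loop() are all
hypothetical names, and socketpair() stands in for a real TCP connection
so the sketch runs on its own.

```python
import socket
import threading

PAYLOAD_SIZE = 10_000  # assumption: "10k" read as 10,000 bytes
REPLY = b"ok"          # hypothetical fixed-size server response


def recv_exact(sock, n):
    """Read exactly n bytes; raise ConnectionError if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def ack_server(sock):
    """Stand-in server: consume one full payload, answer with REPLY, repeat."""
    with sock:
        while True:
            try:
                recv_exact(sock, PAYLOAD_SIZE)
            except ConnectionError:
                return  # client is done
            sock.sendall(REPLY)


def request_response_loop(sock, rounds):
    """Write one payload, block until the reply arrives, then write again."""
    payload = b"x" * PAYLOAD_SIZE
    sent = 0
    for _ in range(rounds):
        sock.sendall(payload)         # one application-level 10k write
        recv_exact(sock, len(REPLY))  # wait for the response before the next write
        sent += len(payload)
    return sent


if __name__ == "__main__":
    # socketpair() keeps the demo self-contained; a real test would use
    # socket.create_connection((host, port)) to a TCP server behind eth0.
    client, server = socket.socketpair()
    worker = threading.Thread(target=ack_server, args=(server,))
    worker.start()
    print(request_response_loop(client, rounds=5), "bytes sent")
    client.close()
    worker.join()
```

With this pattern each thread has at most one write in flight, so its
throughput is bounded by the payload size divided by the round-trip time
of one write plus its reply.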
> AFAIK you can not : TCP happily aggregates packets in write queue
> (see current MSG_EOR discussion)
>
> I suspect a bug in your tc settings.
>
>
Could you help check my tc settings?

sudo tc qdisc add dev eth0 root mqprio num_tc 6 map 0 1 2 3 4 5 0 0 \
    queues 19@0 1@19 1@20 1@21 1@22 1@23 hw 0
sudo tc qdisc add dev eth0 parent 805a:1a handle 8001:0 htb default 10
sudo tc class add dev eth0 parent 8001: classid 8001:10 htb rate 1000Mbit

I didn't set r2q/burst/cburst/mtu/mpu, so the default values should be used.
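
To make the mqprio line easier to check, here is a small sketch that
expands the "map" and "queues" arguments into explicit tables. It assumes
the semantics described in tc-mqprio(8): "map" lists the traffic class
for priority 0, 1, 2, ... in order, and each "queues" entry is
count@offset for traffic class 0, 1, 2, ... in order. parse_mqprio() is a
hypothetical helper written only for this illustration; it is not part of
tc and does not query the kernel.

```python
def parse_mqprio(map_spec, queues_spec):
    """Expand mqprio 'map' and 'queues' strings into lookup tables.

    Returns (prio_to_tc, tc_to_queues), where tc_to_queues maps each
    traffic class to the list of tx queue indices assigned to it.
    """
    prio_to_tc = {prio: int(tc) for prio, tc in enumerate(map_spec.split())}
    tc_to_queues = {}
    for tc, item in enumerate(queues_spec.split()):
        count, offset = (int(x) for x in item.split("@"))
        tc_to_queues[tc] = list(range(offset, offset + count))
    return prio_to_tc, tc_to_queues


# Values copied from the tc command above.
prio_to_tc, tc_to_queues = parse_mqprio(
    "0 1 2 3 4 5 0 0",
    "19@0 1@19 1@20 1@21 1@22 1@23",
)

for tc, queues in tc_to_queues.items():
    prios = [p for p, t in prio_to_tc.items() if t == tc]
    print(f"tc {tc}: txq {queues[0]}..{queues[-1]} "
          f"({len(queues)} queue(s)), priorities {prios}")
```

Read this way, traffic class 0 covers txq 0 through 18 and classes 1 to 5
get one txq each (19 through 23), 24 queues in total.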