Re: qdisc spin lock - Michael Ma

From: Michael Ma <make0818@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: qdisc spin lock
Date: Thu, 21 Apr 2016 15:12:28 -0700
Message-ID: <CAAmHdhx_Db3GMCmwn3UJajP7_se6tRHPGk_fQUDgDWDq5hN34A@mail.gmail.com>
In-Reply-To: <1461242518.7627.8.camel@edumazet-glaptop3.roam.corp.google.com>

2016-04-21 5:41 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
> On Wed, 2016-04-20 at 22:51 -0700, Michael Ma wrote:
>> 2016-04-20 15:34 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> > On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:
>> >> 2016-04-08 7:19 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> >> > On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:
>> >> >> I didn't really know that multiple qdiscs can be isolated using MQ so
>> >> >> that each txq can be associated with a particular qdisc. Also we don't
>> >> >> really have multiple interfaces...
>> >> >>
>> >> >> With this MQ solution we'll still need to assign transmit queues to
>> >> >> different classes by doing some math on the bandwidth limit if I
>> >> >> understand correctly, which seems to be less convenient compared with
>> >> >> a solution purely within HTB.
>> >> >>
>> >> >> I assume that with this solution I can still share qdisc among
>> >> >> multiple transmit queues - please let me know if this is not the case.
>> >> >
>> >> > Note that this MQ + HTB thing works well, unless you use a bonding
>> >> > device. (Or you need the MQ+HTB on the slaves, with no way of sharing
>> >> > tokens between the slaves)
>> >>
>> >> Actually MQ+HTB works well for small packets - like flow of 512 byte
>> >> packets can be throttled by HTB using one txq without being affected
>> >> by other flows with small packets. However I found using this solution
>> >> large packets (10k for example) will only achieve very limited
>> >> bandwidth. In my test I used MQ to assign one txq to a HTB which sets
>> >> rate at 1Gbit/s, 512 byte packets can achieve the ceiling rate by
>> >> using 30 threads. But sending 10k packets using 10 threads has only 10
>> >> Mbit/s with the same TC configuration. If I increase burst and cburst
>> >> of HTB to some extreme large value (like 50MB) the ceiling rate can be
>> >> hit.
>> >>
>> >> The strange thing is that I don't see this problem when using HTB as
>> >> the root. So txq number seems to be a factor here - however it's
>> >> really hard to understand why would it only affect larger packets. Is
>> >> this a known issue? Any suggestion on how to investigate the issue
>> >> further? Profiling shows that the cpu utilization is pretty low.
>> >
>> > You could try
>> >
>> > perf record -a -g -e skb:kfree_skb sleep 5
>> > perf report
>> >
>> > So that you see where the packets are dropped.
>> >
>> > Chances are that your UDP sockets SO_SNDBUF is too big, and packets are
>> > dropped at qdisc enqueue time, instead of having backpressure.
>> >
>>
>> Thanks for the hint - how should I read the perf report? Also we're
>> using TCP socket in this testing - TCP window size is set to 70kB.
>
> But how are you telling TCP to send 10k packets ?
>
We just write a 10k buffer to the socket and wait for a response
from the server (using read()) before the next write. With tcpdump I
can see that each 10k write is actually sent as 3 packets
(7.3k/1.5k/1.3k).
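
Because each thread waits for the response before issuing its next write, throughput in this test is bounded by per-exchange latency as well as by the link rate. The following is a rough Python sketch of that arithmetic, not part of the original test. It derives the time per write/response cycle implied by the figures quoted earlier in the thread (10 threads, about 10 Mbit/s in total). The function name is hypothetical, and it assumes that "10k" means 10,000 bytes, that load is spread evenly across threads, and that the response size is negligible.

```python
def implied_round_trip_ms(total_rate_bps: float, threads: int, write_bytes: int) -> float:
    """Time per write/response cycle implied by an aggregate throughput,
    assuming each thread has exactly one write in flight at a time."""
    per_thread_bps = total_rate_bps / threads
    writes_per_second = per_thread_bps / (write_bytes * 8)
    return 1000.0 / writes_per_second


# Assumption: "10k" is taken as 10,000 bytes.
WRITE_BYTES = 10_000
THREADS = 10

observed_ms = implied_round_trip_ms(10e6, THREADS, WRITE_BYTES)  # ~10 Mbit/s observed
ceiling_ms = implied_round_trip_ms(1e9, THREADS, WRITE_BYTES)    # 1 Gbit/s class rate

print(f"observed: {observed_ms:.1f} ms per write/response cycle")
print(f"needed to reach the ceiling: {ceiling_ms:.1f} ms per cycle")
```

Under these assumptions the observed rate corresponds to roughly 80 ms per cycle, whereas 10 threads would need about 0.8 ms per cycle to reach 1 Gbit/s, so each exchange appears to be delayed by tens of milliseconds somewhere along the path.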

> AFAIK you can not : TCP happily aggregates packets in write queue
> (see current MSG_EOR discussion)
>
> I suspect a bug in your tc settings.
>
>

Could you help check my tc settings?

sudo tc qdisc add dev eth0 root mqprio num_tc 6 map 0 1 2 3 4 5 0 0 \
    queues 19@0 1@19 1@20 1@21 1@22 1@23 hw 0
sudo tc qdisc add dev eth0 parent 805a:1a handle 8001:0 htb default 10
sudo tc class add dev eth0 parent 8001: classid 8001:10 htb rate 1000Mbit

I didn't set r2q/burst/cburst/mtu/mpu, so the default values should be used.
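
For reference, here is a rough sketch of what the r2q default would imply for the class above. It assumes, following the tc-htb man page and what sch_htb.c appears to implement, that tc uses r2q = 10 when none is given, that quantum is derived as the class rate in bytes per second divided by r2q, and that a derived quantum is clamped to the range 1000..200000 bytes. The function and constant names are hypothetical, and the effective values are best confirmed on the running system.

```python
DEFAULT_R2Q = 10        # assumed tc default when r2q is not given
QUANTUM_MIN = 1_000     # assumed lower clamp on a derived quantum (bytes)
QUANTUM_MAX = 200_000   # assumed upper clamp on a derived quantum (bytes)


def derived_quantum(rate_bits_per_sec: int, r2q: int = DEFAULT_R2Q) -> int:
    """Quantum in bytes derived from the class rate under the assumptions above."""
    rate_bytes_per_sec = rate_bits_per_sec // 8
    quantum = rate_bytes_per_sec // r2q
    return max(QUANTUM_MIN, min(quantum, QUANTUM_MAX))


# "1000Mbit" is taken as 1,000,000,000 bits per second.
unclamped = (1_000_000_000 // 8) // DEFAULT_R2Q
print(unclamped)
print(derived_quantum(1_000_000_000))
```

Under these assumptions rate/r2q comes to 12,500,000 bytes, well above the upper clamp, so the class would end up with a 200,000-byte quantum. Quantum governs round-robin sharing between classes, so with a single class it probably does not explain the throughput difference by itself.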
