Re: [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler

From: Jason Wang <jasowang@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler
Date: Wed, 04 Sep 2013 13:26:56 +0800
Message-ID: <5226C4A0.6040709@redhat.com>
In-Reply-To: <1377816595.8277.54.camel@edumazet-glaptop>

On 08/30/2013 06:49 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> - Uses perfect flow match (not stochastic hash like SFQ/FQ_codel)
> - Uses the new_flow/old_flow separation from FQ_codel
> - New flows get an initial credit allowing IW10 without added delay.
> - Special FIFO queue for high prio packets (no need for PRIO + FQ)
> - Uses a hash table of RB trees to locate the flows at enqueue() time
> - Smart on demand gc (at enqueue() time, RB tree lookup evicts old
>   unused flows)
> - Dynamic memory allocations.
> - Designed to allow millions of concurrent flows per Qdisc.
> - Small memory footprint : ~8K per Qdisc, and 104 bytes per flow.
> - Single high resolution timer for throttled flows (if any).
> - One RB tree to link throttled flows.
> - Ability to have a max rate per flow. We might add a socket option
>   to add per socket limitation.
>
> Attempts have been made to add TCP pacing in TCP stack, but this
> seems to add complex code to an already complex stack.
>
> TCP pacing is welcomed for flows having idle times, as the cwnd
> permits TCP stack to queue a possibly large number of packets.
>
[...]
>
> FQ gets a bunch of tunables as :
>
>   limit : max number of packets on whole Qdisc (default 10000)
>
>   flow_limit : max number of packets per flow (default 100)
>
>   quantum : the credit per RR round (default is 2 MTU)
>
>   initial_quantum : initial credit for new flows (default is 10 MTU)
>
>   maxrate : max per flow rate (default : unlimited)
>
>   buckets : number of RB trees (default : 1024) in hash table.
>                (consumes 8 bytes per bucket)
>
>   [no]pacing : disable/enable pacing (default is enable)
>
> All of them can be changed on a live qdisc.
>
> $ tc qd add dev eth0 root fq help
> Usage: ... fq [ limit PACKETS ] [ flow_limit PACKETS ]
>               [ quantum BYTES ] [ initial_quantum BYTES ]
>               [ maxrate RATE  ] [ buckets NUMBER ]
>               [ [no]pacing ]
>
> $ tc -s -d qd
> qdisc fq 8002: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 256 quantum 3028 initial_quantum 15140
>  Sent 216532416 bytes 148395 pkt (dropped 0, overlimits 0 requeues 14)
>  backlog 0b 0p requeues 14
>   511 flows, 511 inactive, 0 throttled
>   110 gc, 0 highprio, 0 retrans, 1143 throttled, 0 flows_plimit
>
>
> [1] Except if initial srtt is overestimated, as if using
> cached srtt in tcp metrics. We'll provide a fix for this issue.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---
> v2: added initial_quantum support
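
As a quick sanity check on the numbers quoted above, the sketch below
recomputes the default quantum values and the bucket-table size. The
1514-byte frame size is an assumption inferred from the sample output
(quantum 3028 = 2 x 1514), and the helper names are hypothetical; none
of this is taken from the patch itself.

```python
# Assumption: "MTU" here means a 1514-byte Ethernet frame
# (1500-byte payload plus 14-byte header), inferred from "quantum 3028".
ETH_FRAME_BYTES = 1514


def fq_default_credits(frame_bytes: int = ETH_FRAME_BYTES) -> dict:
    """Default per-round and initial credits as described in the changelog."""
    return {
        "quantum": 2 * frame_bytes,           # "default is 2 MTU"
        "initial_quantum": 10 * frame_bytes,  # "default is 10 MTU"
    }


def fq_table_and_flow_bytes(buckets: int, flows: int,
                            bytes_per_bucket: int = 8,
                            bytes_per_flow: int = 104) -> int:
    """Bucket table plus per-flow state, using the sizes quoted in the changelog."""
    return buckets * bytes_per_bucket + flows * bytes_per_flow


if __name__ == "__main__":
    print(fq_default_credits())
    print(fq_table_and_flow_bytes(buckets=1024, flows=0))
    print(fq_table_and_flow_bytes(buckets=1024, flows=1_000_000))
```

With the default 1024 buckets the table alone is 8192 bytes, which is
consistent with the "~8K per Qdisc" figure, and one million flows add
roughly 104 MB on top of that.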

I see both throughput degradation and jitter when using fq with
virtio-net. Guest to guest performance drops from 8Gb/s to 3Gb/s-7Gb/s.
Guest to local host drops from 8Gb/s to 4Gb/s-6Gb/s. Guest to external
host with ixgbe drops from 9Gb/s to 7Gb/s.

I don't see the issue when using sfq, or when using fq with pacing
disabled.

So it looks like it is caused by the inaccuracy and jitter of the
pacing estimation in a virt guest?

Thread overview: 18+ messages
2013-08-29 22:49 [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler Eric Dumazet
2013-08-30  1:47 ` David Miller
2013-08-30  2:30   ` [PATCH iproute2] " Eric Dumazet
2013-09-03 15:49     ` Stephen Hemminger
2013-09-04  5:26 ` Jason Wang [this message]
2013-09-04  5:59   ` [PATCH v2 net-next] " Eric Dumazet
2013-09-04  6:30     ` Jason Wang
2013-09-04 10:30       ` Eric Dumazet
2013-09-04 11:27         ` Eric Dumazet
2013-09-04 11:59           ` Daniel Borkmann
2013-09-05  3:39             ` Jason Wang
2013-09-05  0:50           ` Eric Dumazet
2013-09-05  1:23             ` Eric Dumazet
2013-09-05  3:43             ` Jason Wang
2013-09-05  3:34           ` Jason Wang
2013-09-05  3:07         ` Jason Wang
2013-09-05  3:41           ` Eric Dumazet
2013-09-05  5:16             ` Jason Wang
