From: Peilin Ye <yepeilin.cs@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	Peilin Ye <peilin.ye@bytedance.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Cong Wang <cong.wang@bytedance.com>
Subject: Re: [PATCH RFC v1 net-next 0/4] net: Qdisc backpressure infrastructure
Date: Tue, 10 May 2022 16:03:47 -0700
Message-ID: <20220510230347.GA11152@bytedance>
In-Reply-To: <2dbd5e38-b748-0c16-5b8b-b32bc0cc43b0@gmail.com>

Hi Eric,

On Mon, May 09, 2022 at 08:26:27PM -0700, Eric Dumazet wrote:
> On 5/6/22 12:43, Peilin Ye wrote:
> > From: Peilin Ye <peilin.ye@bytedance.com>
> > 
> > Hi all,
> > 
> > Currently sockets (especially UDP ones) can drop a lot of skbs at TC
> > egress when rate limited by shaper Qdiscs like HTB.  This experimental
> > patchset tries to improve this by introducing a backpressure mechanism, so
> > that sockets are temporarily throttled when they "send too much".
> > 
> > For now it takes care of TBF, HTB and CBQ, for UDP and TCP sockets.  Any
> > comments, suggestions would be much appreciated.  Thanks!
> 
> This very much looks like trying to solve an old problem to me.
> 
> If you were using the EDT model, a simple eBPF program could get rid
> of the HTB/TBF qdisc, and you could use MQ+FQ as the packet
> schedulers, with true multiqueue sharding.
> 
> FQ provides fairness, so a single flow can no longer consume the
> whole qdisc limit.

This RFC tries to solve the "when UDP starts to drop (whether because of
per-flow or per-Qdisc limit), it drops a lot" issue described in [I] of
the cover letter; its main goal is not to improve fairness.
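
For illustration, here is a minimal sketch of the EDT pacing program
Eric describes, loosely modeled on Cilium's bandwidth manager.  The
file name, the 1 Gbit/sec rate, and the single-bucket state are
assumptions for illustration, not code from this thread.  Keyed per
flow (e.g. by skb->hash), the same idea can also rate limit
unconnected UDP sockets, as Eric notes below.

  /* edt_pace.c -- hypothetical EDT pacing sketch (not from this
   * thread).
   *
   * Attach (also a sketch):
   *   tc qdisc add dev eth0 clsact
   *   tc filter add dev eth0 egress bpf da obj edt_pace.o sec tc
   *
   * sch_fq then delays each packet until skb->tstamp.
   */
  #include <linux/bpf.h>
  #include <linux/pkt_cls.h>
  #include <bpf/bpf_helpers.h>

  #define NSEC_PER_SEC       1000000000ULL
  #define RATE_BYTES_PER_SEC (1000000000ULL / 8)  /* assumed: 1 Gbit/sec */

  /* Next allowed departure time, in ns.  A single global bucket for
   * brevity; a real program would key this per flow and update it
   * atomically. */
  struct {
      __uint(type, BPF_MAP_TYPE_ARRAY);
      __uint(max_entries, 1);
      __type(key, __u32);
      __type(value, __u64);
  } next_tstamp SEC(".maps");

  SEC("tc")
  int edt_pace(struct __sk_buff *skb)
  {
      __u32 key = 0;
      __u64 now = bpf_ktime_get_ns();
      __u64 *next, delay, tstamp;

      next = bpf_map_lookup_elem(&next_tstamp, &key);
      if (!next)
          return TC_ACT_OK;

      /* Wire time this packet occupies at the target rate. */
      delay = (__u64)skb->len * NSEC_PER_SEC / RATE_BYTES_PER_SEC;

      tstamp = *next;
      if (tstamp < now)
          tstamp = now;

      /* Stamp the earliest departure time; sch_fq paces on it. */
      skb->tstamp = tstamp;
      *next = tstamp + delay;

      return TC_ACT_OK;
  }

  char _license[] SEC("license") = "GPL";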

> (If your UDP sockets send packets all over the place (not connected
> sockets), then the eBPF program can also be used to rate limit them.)

I was able to reproduce the same issue using EDT: default sch_fq
flow_limit (100 packets), with a 1 Gbit/sec rate limit.  Now if I run
this:

  $ iperf -u -c 1.2.3.4 -p 5678 -l 3K -i 0.5 -t 30 -b 3g

  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0- 0.5 sec   137 MBytes  2.29 Gbits/sec
  [  3]  0.5- 1.0 sec   142 MBytes  2.38 Gbits/sec
  [  3]  1.0- 1.5 sec   117 MBytes  1.96 Gbits/sec
  [  3]  1.5- 2.0 sec   105 MBytes  1.77 Gbits/sec
  [  3]  2.0- 2.5 sec   132 MBytes  2.22 Gbits/sec
  <...>                             ^^^^^^^^^^^^^^

On average it tries to send 2.31 Gbits per second, dropping 56.71% of
the traffic:

  $ tc -s qdisc show dev eth0
  <...>
  qdisc fq 5: parent 1:4 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 18030 initial_quantum 90150 low_rate_threshold 550Kbit refill_delay 40.0ms
   Sent 16356556 bytes 14159 pkt (dropped 2814461, overlimits 0 requeues 0)
                                  ^^^^^^^^^^^^^^^
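
(As a sanity check, these numbers are consistent: at a 1 Gbit/sec
shaped rate and a 2.31 Gbit/sec offered load, the expected drop
fraction is 1 - 1.00/2.31 ≈ 56.7%, matching the 56.71% measured
above.)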

This RFC does not cover EDT, though, since it does not use the Qdisc
watchdog or friends.

Thanks,
Peilin Ye


Thread overview: 28+ messages
2022-05-06 19:43 [PATCH RFC v1 net-next 0/4] net: Qdisc backpressure infrastructure Peilin Ye
2022-05-06 19:44 ` [PATCH RFC v1 net-next 1/4] net: Introduce " Peilin Ye
2022-05-06 20:31   ` Stephen Hemminger
2022-05-06 23:34     ` Peilin Ye
2022-05-09  7:53   ` Dave Taht
2022-05-10  2:23     ` Peilin Ye
2022-05-06 19:44 ` [PATCH RFC v1 net-next 2/4] net/sched: sch_tbf: Use " Peilin Ye
2022-05-06 19:45 ` [PATCH RFC v1 net-next 3/4] net/sched: sch_htb: " Peilin Ye
2022-05-06 19:45 ` [PATCH RFC v1 net-next 4/4] net/sched: sch_cbq: " Peilin Ye
2022-05-10  3:26 ` [PATCH RFC v1 net-next 0/4] net: " Eric Dumazet
2022-05-10 23:03   ` Peilin Ye [this message]
2022-05-10 23:27     ` Peilin Ye
2022-08-22  9:10 ` [PATCH RFC v2 net-next 0/5] " Peilin Ye
2022-08-22  9:11   ` [PATCH RFC v2 net-next 1/5] net: Introduce " Peilin Ye
2022-08-22  9:12   ` [PATCH RFC v2 net-next 2/5] net/udp: Implement Qdisc backpressure algorithm Peilin Ye
2022-08-22  9:12   ` [PATCH RFC v2 net-next 3/5] net/sched: sch_tbf: Use Qdisc backpressure infrastructure Peilin Ye
2022-08-22  9:12   ` [PATCH RFC v2 net-next 4/5] net/sched: sch_htb: " Peilin Ye
2022-08-22  9:12   ` [PATCH RFC v2 net-next 5/5] net/sched: sch_cbq: " Peilin Ye
2022-08-22 16:17   ` [PATCH RFC v2 net-next 0/5] net: " Jakub Kicinski
2022-08-29 16:53     ` Cong Wang
2022-08-30  0:21       ` Jakub Kicinski
2022-09-19 17:00         ` Cong Wang
2022-08-22 16:22   ` Eric Dumazet
2022-08-29 16:47     ` Cong Wang
2022-08-29 16:53       ` Eric Dumazet
2022-09-19 17:06         ` Cong Wang
2022-08-30  2:28     ` Yafang Shao
2022-09-19 17:04       ` Cong Wang
