From: David Miller <davem@davemloft.net>
To: stephen@networkplumber.org
Cc: netdev@vger.kernel.org, njunger@uwaterloo.ca
Subject: Re: [PATCH net-next] netem: apply correct delay when rate throttling
Date: Thu, 16 Mar 2017 20:15:54 -0700 (PDT)
Message-ID: <20170316.201554.1318233586227395731.davem@davemloft.net>
In-Reply-To: <20170313171658.18606-1-sthemmin@microsoft.com>
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Mon, 13 Mar 2017 10:16:58 -0700
> From: Nik Unger <njunger@uwaterloo.ca>
>
> I recently reported on the netem list that iperf network benchmarks
> show unexpected results when a bandwidth throttling rate has been
> configured for netem. Specifically:
>
> 1) The measured link bandwidth *increases* when a higher delay is added
> 2) The measured link bandwidth appears higher than the specified limit
> 3) The measured link bandwidth for the same very slow settings varies
> significantly across machines
>
> The issue can be reproduced by using tc to configure netem with a
> 512kbit rate and various (none, 1us, 50ms, 100ms, 200ms) delays on a
> veth pair between network namespaces, and then using iperf (or any
> other network benchmarking tool) to test throughput. Complete detailed
> instructions are in the original email chain here:
> https://lists.linuxfoundation.org/pipermail/netem/2017-February/001672.html
>
> There appear to be two underlying bugs causing these effects:
>
> - The first issue causes long delays when the rate is slow and no
> delay is configured (e.g., "rate 512kbit"). This is because SKBs are
> not orphaned when no delay is configured, so orphaning does not
> occur until *after* the rate-induced delay has been applied. For
> this reason, adding a tiny delay (e.g., "rate 512kbit delay 1us")
> dramatically increases the measured bandwidth.
>
> - The second issue is that rate-induced delays are not correctly
> applied, allowing SKB delays to occur in parallel. The intended
> approach is to compute the delay for an SKB and to add this delay to
> the end of the current queue. However, the code does not detect
> existing SKBs in the queue due to improperly testing sch->q.qlen,
> which is nonzero even when packets exist only in the
> rbtree. Consequently, new SKBs do not wait for the current queue to
> empty. When packet delays vary significantly (e.g., if packet sizes
> are different), then this also causes unintended reordering.
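
The second issue can be illustrated with a small user-space model. This is
only a sketch of the scheduling arithmetic described above, not the kernel
code: the names tx_time, schedule and throughput_bps are hypothetical, no
extra delay is configured, and every packet is assumed to be enqueued at
t = 0. With serialize=False each packet's rate-induced delay starts at its
arrival time (the buggy, parallel behaviour); with serialize=True the delay
is appended to the end of the current queue (the intended behaviour).

```python
def tx_time(size_bytes, rate_bps):
    """Seconds needed to send one packet at the configured rate."""
    return size_bytes * 8 / rate_bps


def schedule(packets, rate_bps, serialize):
    """Return the delivery time of each packet.

    packets   -- list of (arrival_time_s, size_bytes), in enqueue order
    serialize -- True: wait for the previously scheduled packet first
                 False: ignore the queue (models the reported bug)
    """
    times = []
    last = None  # delivery time of the most recently scheduled packet
    for arrival, size in packets:
        start = arrival
        if serialize and last is not None:
            start = max(arrival, last)
        last = start + tx_time(size, rate_bps)
        times.append(last)
    return times


def throughput_bps(packets, times):
    """Apparent throughput, assuming the first packet arrived at t = 0."""
    total_bits = sum(size * 8 for _, size in packets)
    return total_bits / max(times)


if __name__ == "__main__":
    rate = 512_000  # "rate 512kbit"
    burst = [(0.0, 1500)] * 10
    mixed = [(0.0, 1500), (0.0, 100)]
    for mode in (False, True):
        t = schedule(burst, rate, serialize=mode)
        print("serialize =", mode, "throughput:", throughput_bps(burst, t))
        print("  mixed sizes delivered at:", schedule(mixed, rate, mode))
```

In this model a 1500-byte packet needs about 23.4 ms at 512kbit. Without
serialization all ten packets of the burst are due at the same instant, so
the apparent throughput is ten times the configured rate, and in the
mixed-size case the 100-byte packet is due before the 1500-byte packet
that was enqueued ahead of it.
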
>
> I modified the code to expect a delay (and orphan the SKB) when a rate
> is configured. I also added some defensive tests that correctly find
> the latest scheduled delivery time, even if it is (unexpectedly) for a
> packet in sch->q. I have tested these changes on the latest kernel
> (4.11.0-rc1+) and the iperf / ping test results are as expected.
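
The "defensive tests" mentioned above can be sketched in the same
user-space style. This is an illustration of the idea only, not the patch
itself: fifo_tail stands in for the scheduled time of the last packet in
sch->q (None if that list is empty), tree stands in for the scheduled
times held in the rbtree, and both function names are hypothetical. The
first function is one reading of the old check as the description states
it; the second picks the latest scheduled delivery time wherever it is.

```python
from typing import Optional, Sequence


def last_by_qlen(qlen: int, fifo_tail: Optional[float],
                 tree: Sequence[float]) -> Optional[float]:
    """Assumed old logic: trust qlen to decide where the last packet is.

    qlen counts packets in the rbtree as well, so it is nonzero even when
    the FIFO is empty, and the (missing) FIFO tail is then returned.
    """
    if qlen:
        return fifo_tail
    return max(tree) if tree else None


def latest_time_to_send(fifo_tail: Optional[float],
                        tree: Sequence[float]) -> Optional[float]:
    """Return the latest scheduled delivery time from either container."""
    candidates = []
    if fifo_tail is not None:
        candidates.append(fifo_tail)
    if tree:
        candidates.append(max(tree))
    return max(candidates) if candidates else None


if __name__ == "__main__":
    tree = [0.010, 0.030]  # two packets waiting in the tree only
    print(last_by_qlen(len(tree), None, tree))  # no packet found
    print(latest_time_to_send(None, tree))      # finds the 0.030 entry
```

With packets only in the tree, the qlen-based variant reports no earlier
packet at all, which is why a new SKB would not wait for the queue to
drain in the model shown earlier.
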
>
> Signed-off-by: Nik Unger <njunger@uwaterloo.ca>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Applied.