From: Simon Horman <horms@verge.net.au>
To: netdev@vger.kernel.org
Cc: David Miller <davem@davemloft.net>, Jarek Poplawski <jarkao2@gmail.com>
Subject: Re: Possible regression in HTB
Date: Tue, 7 Oct 2008 15:51:47 +1100
Message-ID: <20081007045145.GA23883@verge.net.au>
In-Reply-To: <20081007011551.GA28408@verge.net.au>
On Tue, Oct 07, 2008 at 12:15:52PM +1100, Simon Horman wrote:
> Hi Dave, Hi Jarek,
>
> I know that you guys were/are playing around a lot in here, but
> unfortunately I think that "pkt_sched: Always use q->requeue in
> dev_requeue_skb()" (f0876520b0b721bedafd9cec3b1b0624ae566eee) has
> introduced a performance regression for HTB.
>
> My tc rules are below, but in a nutshell I have 3 leaf classes.
> One with a rate of 500Mbit/s and the other two with 100Mbit/s.
> The ceiling for all classes is 1Gb/s and that is also both
> the rate and ceiling for the parent class.
>
>                        [ rate=1Gbit/s ]
>                        [ ceil=1Gbit/s ]
>                               |
>          +--------------------+--------------------+
>          |                    |                    |
>  [ rate=500Mbit/s ]   [ rate=100Mbit/s ]   [ rate=100Mbit/s ]
>  [ ceil=  1Gbit/s ]   [ ceil=100Mbit/s ]   [ ceil=  1Gbit/s ]
>
> The tc rules have an extra class for all other traffic,
> but it's idle, so I left it out of the diagram.
>
> In order to test this I set up filters so that traffic to
> each of ports 10194, 10196 and 10197 is directed to one of the leaf-classes.
> I then set up a process on the same host for each port sending
> UDP as fast as it could in a while() { send(); } loop. On another
> host I set up processes listening for the UDP traffic in a
> while () { recv(); } loop. And I measured the results.
>
> ( I should be able to provide the code used for testing,
> but it's not mine and my colleague who wrote it is off
> with the flu today. )
>
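The test code itself is not part of this message. As a rough illustration of the kind of sender described above, here is a minimal sketch. The function name `udp_blast`, the 1472-byte payload cap and the bounded loop are assumptions made for illustration (the test described above looped forever); this is not the code that produced the numbers below.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to send `count` UDP datagrams of `len` bytes
 * to dst_ip:port as fast as send() allows.
 * Returns the number of datagrams sent in full, or -1 on setup failure.
 */
long udp_blast(const char *dst_ip, uint16_t port, size_t len, long count)
{
	char buf[1472];		/* assumed cap: fits in a 1500-byte MTU */
	struct sockaddr_in dst;
	long sent = 0;
	long i;
	int fd;

	assert(dst_ip != NULL);
	if (len == 0 || len > sizeof(buf))
		return -1;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		close(fd);
		return -1;
	}

	memset(buf, 0, len);
	for (i = 0; i < count; i++) {
		/* Send errors are ignored; only full sends are counted. */
		if (send(fd, buf, len, 0) == (ssize_t)len)
			sent++;
	}

	close(fd);
	return sent;
}
```

One such sender per destination port (10194, 10196, 10197) would roughly match the setup described, with the receiving host counting bytes per port.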
> Prior to this patch the result looks like this:
>
> 10194: 545134589bits/s 545Mbits/s
> 10197: 205358520bits/s 205Mbits/s
> 10196: 205311416bits/s 205Mbits/s
> -----------------------------------
> total: 955804525bits/s 955Mbits/s
>
> And after the patch the result looks like this:
>
> 10194: 384248522bits/s 384Mbits/s
> 10197: 284706778bits/s 284Mbits/s
> 10196: 288119464bits/s 288Mbits/s
> -----------------------------------
> total: 957074765bits/s 957Mbits/s
>
> There is some noise in these results, but I think that it's clear
> that before the patch all leaf-classes received at least their rate,
> and after the patch the rate=500Mbit/s class received much less than
> its rate. This I believe is a regression.
>
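The comparison above can be checked mechanically. The sketch below counts how many leaf classes fell short of their configured rate, using the measured numbers from this message. The port-to-class mapping (10194 on the 500Mbit/s class, 10196 and 10197 on the 100Mbit/s classes) is an assumption inferred from the results, and the struct and function names are made up for illustration. Rates use 1Mbit = 1,000,000 bits.

```c
#include <assert.h>
#include <stddef.h>

/* One leaf class: configured HTB rate and measured throughput, in bits/s. */
struct leaf_result {
	unsigned int port;
	unsigned long long rate_bps;
	unsigned long long measured_bps;
};

/* Measurements before the patch (port-to-class mapping is assumed). */
const struct leaf_result before_patch[3] = {
	{ 10194, 500000000ULL, 545134589ULL },
	{ 10197, 100000000ULL, 205358520ULL },
	{ 10196, 100000000ULL, 205311416ULL },
};

/* Measurements after the patch (same assumed mapping). */
const struct leaf_result after_patch[3] = {
	{ 10194, 500000000ULL, 384248522ULL },
	{ 10197, 100000000ULL, 284706778ULL },
	{ 10196, 100000000ULL, 288119464ULL },
};

/* Returns how many classes received less than their configured rate. */
size_t classes_below_rate(const struct leaf_result *r, size_t n)
{
	size_t below = 0;
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (r[i].measured_bps < r[i].rate_bps)
			below++;
	}
	return below;
}
```

With these inputs, `classes_below_rate(before_patch, 3)` is 0 and `classes_below_rate(after_patch, 3)` is 1: only the 500Mbit/s class misses its rate after the patch.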
> I do not believe that this happens at lower bit rates, for instance
> if you reduce the ceiling and rate of all classes by a factor of 10.
> I can produce some numbers on that if you want them.
>
> The test machine with the tc rules and udp-sending processes
> has two Intel Xeon Quad-cores running at 1.86GHz. The kernel
> is SMP x86_64.

With the following patch (basically a reversal of "pkt_sched: Always use
q->requeue in dev_requeue_skb()", forward-ported to the current
net-next-2.6 tree, "tcp: Respect SO_RCVLOWAT in tcp_poll()"), I get some
rather nice numbers (IMHO).

10194: 666780666bits/s 666Mbits/s
10197: 141154197bits/s 141Mbits/s
10196: 141023090bits/s 141Mbits/s
-----------------------------------
total: 948957954bits/s 948Mbits/s

I'm not sure what evil things this patch does to other aspects
of the qdisc code.

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 31f6b61..d2e0da6 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -44,7 +44,10 @@ static inline int qdisc_qlen(struct Qdisc *q)
 static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
 {
-	q->gso_skb = skb;
+	if (unlikely(skb->next))
+		q->gso_skb = skb;
+	else
+		q->ops->requeue(skb, q);
 	__netif_schedule(q);
 	return 0;
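
For readers who want to poke at the branch this patch restores outside the kernel, here is a minimal userspace model. Everything in it is a mock: the structs carry only the fields the branch touches, `mock_requeue`, `mock_ops`, the `requeued` field and `model_dev_requeue_skb` are hypothetical names, and the `unlikely()` hint and the `__netif_schedule()` call are left out. It is a sketch of the decision only, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Mock skb: only the `next` link the patch tests. */
struct sk_buff {
	struct sk_buff *next;
};

struct Qdisc;

/* Mock ops table: only the requeue hook. */
struct Qdisc_ops {
	int (*requeue)(struct sk_buff *skb, struct Qdisc *q);
};

/* Mock qdisc; `requeued` is mock-only and records the last requeued skb. */
struct Qdisc {
	const struct Qdisc_ops *ops;
	struct sk_buff *gso_skb;
	struct sk_buff *requeued;
};

/* Stand-in for a qdisc's own requeue, e.g. the one HTB would provide. */
static int mock_requeue(struct sk_buff *skb, struct Qdisc *q)
{
	q->requeued = skb;
	return 0;
}

const struct Qdisc_ops mock_ops = { .requeue = mock_requeue };

/*
 * Model of the patched decision: an skb with a `next` chain is parked
 * in gso_skb, anything else goes back through the qdisc's requeue hook.
 */
int model_dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
{
	assert(skb != NULL && q != NULL && q->ops != NULL);
	if (skb->next)
		q->gso_skb = skb;
	else
		q->ops->requeue(skb, q);
	return 0;
}
```

In this model a lone skb ends up in `requeued` and leaves `gso_skb` untouched, while an skb with a non-NULL `next` ends up in `gso_skb`. The version being reverted would park both in `gso_skb`.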