From: Jason Wang <jasowang@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Yuchung Cheng <ycheng@google.com>,
Neal Cardwell <ncardwell@google.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler
Date: Thu, 05 Sep 2013 11:43:56 +0800
Message-ID: <5227FDFC.5090007@redhat.com>
In-Reply-To: <1378342226.11205.1.camel@edumazet-glaptop>

On 09/05/2013 08:50 AM, Eric Dumazet wrote:
> On Wed, 2013-09-04 at 04:27 -0700, Eric Dumazet wrote:
>> On Wed, 2013-09-04 at 03:30 -0700, Eric Dumazet wrote:
>>> On Wed, 2013-09-04 at 14:30 +0800, Jason Wang wrote:
>>>
>>>>> And tcpdump would certainly help ;)
>>>> See attachment.
>>>>
>>> Nothing obvious in the tcpdump (only that a lot of frames are missing)
>>>
>>> 1) Are you capturing only part of the payload (like tcpdump -s 128)?
>>>
>>> 2) What is the setup?
>>>
>>> 3) What does tc -s -d qdisc show?
>> If you use FQ in the guest, then it could be that high resolution timers
>> have high latency?
>>
>> So FQ arms short timers, but their effective duration could be much longer.
>>
>> Here I get a smoothed latency of up to ~3 us
>>
>> lpq83:~# ./netperf -H lpq84 ; ./tc -s -d qd ; dmesg | tail -n1
>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
>> Recv   Send    Send
>> Socket Socket  Message  Elapsed
>> Size   Size    Size     Time     Throughput
>> bytes  bytes   bytes    secs.    10^6bits/sec
>>
>>  87380  16384  16384    10.00    9410.82
>> qdisc fq 8005: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 1024 quantum 3028 initial_quantum 15140
>> Sent 50545633991 bytes 33385894 pkt (dropped 0, overlimits 0 requeues 19)
>> rate 9258Mbit 764335pps backlog 0b 0p requeues 19
>> 117 flow, 115 inactive, 0 throttled
>> 0 gc, 0 highprio, 0 retrans, 96861 throttled, 0 flows_plimit
>> [ 572.551664] latency = 3035 ns
>>
>>
>> What do you get with this debugging patch?
>>
>> diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
>> index 32ad015..c1312a0 100644
>> --- a/net/sched/sch_fq.c
>> +++ b/net/sched/sch_fq.c
>> @@ -103,6 +103,7 @@ struct fq_sched_data {
>>  	u64 stat_internal_packets;
>>  	u64 stat_tcp_retrans;
>>  	u64 stat_throttled;
>> +	s64 slatency;
>>  	u64 stat_flows_plimit;
>>  	u64 stat_pkts_too_long;
>>  	u64 stat_allocation_errors;
>> @@ -393,6 +394,7 @@ static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>>  static void fq_check_throttled(struct fq_sched_data *q, u64 now)
>>  {
>>  	struct rb_node *p;
>> +	bool first = true;
>>
>>  	if (q->time_next_delayed_flow > now)
>>  		return;
>> @@ -405,6 +407,13 @@ static void fq_check_throttled(struct fq_sched_data *q, u64 now)
>>  			q->time_next_delayed_flow = f->time_next_packet;
>>  			break;
>>  		}
>> +		if (first) {
>> +			s64 delay = now - f->time_next_packet;
>> +
>> +			first = false;
>> +			delay -= q->slatency >> 3;
>> +			q->slatency += delay;
>> +		}
>>  		rb_erase(p, &q->delayed);
>>  		q->throttled_flows--;
>>  		fq_flow_add_tail(&q->old_flows, f);
>> @@ -711,6 +720,7 @@ static int fq_dump(struct Qdisc *sch, struct sk_buff *skb)
>>  	if (opts == NULL)
>>  		goto nla_put_failure;
>>
>> +	pr_err("latency = %lld ns\n", q->slatency >> 3);
>>  	if (nla_put_u32(skb, TCA_FQ_PLIMIT, sch->limit) ||
>>  	    nla_put_u32(skb, TCA_FQ_FLOW_PLIMIT, q->flow_plimit) ||
>>  	    nla_put_u32(skb, TCA_FQ_QUANTUM, q->quantum) ||
>>
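For reference, the smoothing done by the debugging patch above can be
sketched in user space. This is only an illustration, assuming the same
arithmetic as the patch: the accumulator holds 8x the smoothed value, each
sample moves the average by 1/8 of the error, and the reported value is the
accumulator shifted right by 3. The names lat_ewma, lat_ewma_update() and
lat_ewma_read() are hypothetical and do not exist in sch_fq.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for q->slatency: stores 8x the smoothed latency. */
struct lat_ewma {
	int64_t scaled;
};

/*
 * Fold one latency sample (in ns) into the average.
 * Mirrors the patch: delay -= slatency >> 3; slatency += delay;
 * Note: right-shifting a negative signed value is implementation-defined
 * in ISO C; gcc performs an arithmetic shift, which the kernel relies on.
 */
static void lat_ewma_update(struct lat_ewma *e, int64_t sample_ns)
{
	int64_t delta = sample_ns - (e->scaled >> 3);

	e->scaled += delta;
}

/* Smoothed latency in ns, as printed by the pr_err() in the patch. */
static int64_t lat_ewma_read(const struct lat_ewma *e)
{
	return e->scaled >> 3;
}
```

Starting from zero, a constant sample of 8000 ns reads back as 1000 ns after
the first update and 1875 ns after the second, so the printed value needs a
few dozen samples before it settles near the real latency.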
>
> BTW, what is your HZ value?
Guest HZ is 1000.
>
> We have a problem in the TCP stack, because srtt is in jiffies (1/HZ) units.
>
> Before we change to us units, I guess tcp_update_pacing_rate() should be
> changed a bit if HZ=250.
>
>
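To make the HZ concern concrete, here is a rough sketch of how jiffy
granularity can distort a pacing rate derived from srtt. It is not the
kernel's tcp_update_pacing_rate(); it assumes a simplified model in which the
rate is 2 * mss * cwnd / rtt and the RTT is rounded up to whole jiffies, with
a floor of one jiffy. The function names jiffy_usec() and pacing_rate_Bps()
are hypothetical.

```c
#include <stdint.h>

/* Length of one jiffy in microseconds for a given HZ. */
static uint32_t jiffy_usec(uint32_t hz)
{
	return 1000000u / hz;
}

/*
 * Pacing rate in bytes per second under the simplified model:
 *   rate = 2 * mss * cwnd / rtt
 * where rtt is first rounded up to a whole number of jiffies
 * (assumption: an RTT below one jiffy is seen as one jiffy).
 */
static uint64_t pacing_rate_Bps(uint32_t mss, uint32_t cwnd,
				uint32_t rtt_us, uint32_t hz)
{
	uint32_t tick = jiffy_usec(hz);
	uint32_t rtt_jiffies = (rtt_us + tick - 1) / tick;

	if (rtt_jiffies == 0)
		rtt_jiffies = 1;

	/* rtt in seconds is rtt_jiffies / hz, so dividing by it multiplies by hz. */
	return (uint64_t)2 * mss * cwnd * hz / rtt_jiffies;
}
```

With mss = 1448, cwnd = 10 and a real RTT of 200 us, this model gives
28960000 bytes/s at HZ=1000 and 7240000 bytes/s at HZ=250, while the exact
RTT would give 144800000 bytes/s. Under these assumptions a lower HZ makes the
computed rate more pessimistic for short RTTs.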