From: Jason Wang <jasowang@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Yuchung Cheng <ycheng@google.com>,
Neal Cardwell <ncardwell@google.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler
Date: Thu, 05 Sep 2013 11:43:56 +0800
Message-ID: <5227FDFC.5090007@redhat.com>
In-Reply-To: <1378342226.11205.1.camel@edumazet-glaptop>

On 09/05/2013 08:50 AM, Eric Dumazet wrote:
> On Wed, 2013-09-04 at 04:27 -0700, Eric Dumazet wrote:
>> On Wed, 2013-09-04 at 03:30 -0700, Eric Dumazet wrote:
>>> On Wed, 2013-09-04 at 14:30 +0800, Jason Wang wrote:
>>>
>>>>> And tcpdump would certainly help ;)
>>>> See attachment.
>>>>
>>> Nothing obvious on tcpdump (only that a lot of frames are missing)
>>>
>>> 1) Are you capturing part of the payload only (like tcpdump -s 128)?
>>>
>>> 2) What is the setup?
>>>
>>> 3) tc -s -d qdisc
>> If you use FQ in the guest, then it could be that high resolution timers
>> have high latency?
>>
>> So FQ arms short timers, but effective duration could be much longer.
>>
>> Here I get a smooth latency of up to ~3 us
>>
>> lpq83:~# ./netperf -H lpq84 ; ./tc -s -d qd ; dmesg | tail -n1
>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
>> Recv   Send    Send
>> Socket Socket  Message  Elapsed
>> Size   Size    Size     Time     Throughput
>> bytes  bytes   bytes    secs.    10^6bits/sec
>>
>>  87380  16384  16384    10.00    9410.82
>> qdisc fq 8005: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 1024 quantum 3028 initial_quantum 15140
>>  Sent 50545633991 bytes 33385894 pkt (dropped 0, overlimits 0 requeues 19)
>>  rate 9258Mbit 764335pps backlog 0b 0p requeues 19
>>   117 flow, 115 inactive, 0 throttled
>>   0 gc, 0 highprio, 0 retrans, 96861 throttled, 0 flows_plimit
>> [ 572.551664] latency = 3035 ns
>>
>>
>> What do you get with this debugging patch ?
>>
>> diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
>> index 32ad015..c1312a0 100644
>> --- a/net/sched/sch_fq.c
>> +++ b/net/sched/sch_fq.c
>> @@ -103,6 +103,7 @@ struct fq_sched_data {
>>  	u64 stat_internal_packets;
>>  	u64 stat_tcp_retrans;
>>  	u64 stat_throttled;
>> +	s64 slatency;
>>  	u64 stat_flows_plimit;
>>  	u64 stat_pkts_too_long;
>>  	u64 stat_allocation_errors;
>> @@ -393,6 +394,7 @@ static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>>  static void fq_check_throttled(struct fq_sched_data *q, u64 now)
>>  {
>>  	struct rb_node *p;
>> +	bool first = true;
>>
>>  	if (q->time_next_delayed_flow > now)
>>  		return;
>> @@ -405,6 +407,13 @@ static void fq_check_throttled(struct fq_sched_data *q, u64 now)
>>  			q->time_next_delayed_flow = f->time_next_packet;
>>  			break;
>>  		}
>> +		if (first) {
>> +			s64 delay = now - f->time_next_packet;
>> +
>> +			first = false;
>> +			delay -= q->slatency >> 3;
>> +			q->slatency += delay;
>> +		}
>>  		rb_erase(p, &q->delayed);
>>  		q->throttled_flows--;
>>  		fq_flow_add_tail(&q->old_flows, f);
>> @@ -711,6 +720,7 @@ static int fq_dump(struct Qdisc *sch, struct sk_buff *skb)
>>  	if (opts == NULL)
>>  		goto nla_put_failure;
>>
>> +	pr_err("latency = %lld ns\n", q->slatency >> 3);
>>  	if (nla_put_u32(skb, TCA_FQ_PLIMIT, sch->limit) ||
>>  	    nla_put_u32(skb, TCA_FQ_FLOW_PLIMIT, q->flow_plimit) ||
>>  	    nla_put_u32(skb, TCA_FQ_QUANTUM, q->quantum) ||
>>
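
For anyone following the patch: the slatency update looks like the usual shift-based moving average, with a gain of 1/8 and the stored value kept scaled by 8 (the same trick TCP uses for srtt). Below is a minimal user-space sketch of that arithmetic only, assuming non-negative samples. The names ewma_update(), ewma_read() and ewma_after() are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fold one sample into a moving average that is
 * stored scaled by 8, as the debugging patch does with q->slatency.
 * In effect: avg += (sample - avg) / 8.
 */
static void ewma_update(int64_t *scaled, int64_t sample)
{
	int64_t delta = sample - (*scaled >> 3);	/* sample minus current average */

	*scaled += delta;	/* scaled grows by delta, so the average moves by delta / 8 */
}

/* Hypothetical helper: read back the unscaled average (what pr_err() prints). */
static int64_t ewma_read(int64_t scaled)
{
	return scaled >> 3;
}

/* Hypothetical helper: feed the same sample 'rounds' times, starting from 0. */
static int64_t ewma_after(int64_t sample, int rounds)
{
	int64_t scaled = 0;
	int i;

	for (i = 0; i < rounds; i++)
		ewma_update(&scaled, sample);
	return ewma_read(scaled);
}
```

With a steady sample the reported value settles on that sample, so a single dmesg line only means something after the qdisc has throttled flows for a while. One caveat: the sketch relies on >> being an arithmetic shift if the scaled value ever goes negative, which C leaves implementation-defined.
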
>
> BTW what is your HZ value ?
Guest HZ is 1000.
>
> We have a problem in TCP stack, because srtt is in HZ units.
>
> Before we change to us units, I guess tcp_update_pacing_rate() should be
> changed a bit if HZ=250
>
>
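
To make the HZ concern concrete, here is a rough sketch of how the tick length limits an RTT kept in jiffies. It is not the kernel's tcp_update_pacing_rate(); jiffy_usec() and rate_bytes_per_sec() are hypothetical helpers, and the only assumption is that one jiffy lasts 1/HZ seconds and that the rate is estimated as cwnd * mss / srtt.

```c
#include <stdint.h>

/* Hypothetical helper: duration of one jiffy in microseconds for a given HZ. */
static uint32_t jiffy_usec(uint32_t hz)
{
	return 1000000u / hz;
}

/*
 * Hypothetical helper: bytes per second implied by sending cwnd segments
 * of mss bytes every srtt, with srtt expressed in jiffies.
 * An srtt of 0 is treated as 1 jiffy to avoid dividing by zero.
 */
static uint64_t rate_bytes_per_sec(uint32_t mss, uint32_t cwnd,
				   uint32_t srtt_jiffies, uint32_t hz)
{
	if (srtt_jiffies == 0)
		srtt_jiffies = 1;
	return (uint64_t)mss * cwnd * hz / srtt_jiffies;
}
```

For the same cwnd and an srtt of one jiffy, this estimate is four times lower at HZ=250 than at HZ=1000, because one jiffy is 4000 us in the first case and 1000 us in the second. Any real RTT shorter than a tick cannot be told apart from a full tick.
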