netdev.vger.kernel.org archive mirror
* SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23  6:00 UTC (permalink / raw)
  To: netdev

Hello, all.  I have an experimental HFSC setup with three leaf classes
each with SFQ as the final qdisc.  One queue is for ssh on port 822, one
is for tcp traffic on port 443, the third is the default.

If I flood the 443 queue with netcat, my ssh sessions are responsive and
my continuous ping shows a round trip time of around 50ms in keeping
with my netem settings.

However, if I flood the default queue with netcat on port 80, ssh is
still responsive but my ping round trip times shoot up over 3000ms.

I thought it might be the bufferbloat phenomenon so I reduced the
txqueuelen on both sides of the ping to 0.  Both sides use old 10BaseT
NICs and have no ring buffer.  I also set the SFQ limit on the default
queue to 2 just in case.  Still no difference.
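
For the record, the limit change was something along these lines (the
handle matches the default-class sfq in the ruleset below):
tc qdisc change dev eth1 parent 1:20 handle 1201 sfq perturb 10 limit 2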

The default queue is dequeuing at roughly 400 kbits which matches my
HFSC configuration.  A full sized packet should take 30 ms to pass at
that rate ((1514 * 8)/400,000) so, if I am round robining the queues, I
would expect latency on a default sized ping to be only 30 ms plus the
netem delay.

Where might this 3000 ms delay be coming from?

Here is the rule set:

tc qdisc add dev eth1 root handle 1: hfsc default 20
tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60
tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60
iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK --set-mark 0x10
iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK --set-mark 0x11
iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK --restore-mark
modprobe ifb
ifconfig ifb0 up
ifconfig ifb1 up
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 0x11 fw flowid 1:30 action mirred egress redirect dev ifb1
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 0x10 fw flowid 1:10 action mirred egress redirect dev ifb1
tc filter add dev eth1 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:20 action mirred egress redirect dev ifb1
tc qdisc add dev eth1 ingress
tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
tc qdisc add dev ifb0 root handle 1: hfsc default 20
tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0x00ff flowid 1:10
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xff00 flowid 1:30
tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%

Thanks - John


* Re: SFQ on HFSC leaf does not seem to work
From: Dave Taht @ 2011-12-23  6:32 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Fri, Dec 23, 2011 at 7:00 AM, John A. Sullivan III
<jsullivan@opensourcedevel.com> wrote:
> Hello, all.  I have an experimental HFSC setup with three leaf classes
> each with SFQ as the final qdisc.  One queue is for ssh on port 822, one
> is for tcp traffic on port 443, the third is the default.
>
> If I flood the 443 queue with netcat, my ssh sessions are responsive and
> my continuous ping shows a round trip time of around 50ms in keeping
> with my netem settings.
>
> However, if I flood the default queue with netcat on port 80, ssh is
> still responsive but my ping round trip times shoot up over 3000ms.
>
> I thought it might be the bufferbloat phenomenon so I reduced the
> txqueuelen on both sides of the ping to 0.  Both sides use old 10BaseT
> NICs and have no ring buffer.  I also set the SFQ limit on the default
> queue to 2 just in case.  Still no difference.

Your txqueuelen on the ifb devices is probably 1000.
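
Worth checking, and shrinking if so - a sketch, assuming the ifb0/ifb1
names from your script:
ip link show ifb0                  # the qlen field is the txqueuelen
ip link set dev ifb0 txqueuelen 5
ip link set dev ifb1 txqueuelen 5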



>
> The default queue is dequeuing at roughly 400 kbits which matches my
> HFSC configuration.  A full sized packet should take 30 ms to pass at
> that rate ((1514 * 8)/400,000) so, if I am round robining the queues, I
> would expect latency on a default sized ping to be only 30 ms plus the
> netem delay.
>
> Where might this 3000 ms delay be coming from?
>
> Here is the rule set:
>
> tc qdisc add dev eth1 root handle 1: hfsc default 20
> tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
> tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
> tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60
> tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
> tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60
> iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK --set-mark 0x10
> iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK --set-mark 0x11
> iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK --restore-mark
> modprobe ifb
> ifconfig ifb0 up
> ifconfig ifb1 up
> tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 0x11 fw flowid 1:30 action mirred egress redirect dev ifb1
> tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 0x10 fw flowid 1:10 action mirred egress redirect dev ifb1
> tc filter add dev eth1 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:20 action mirred egress redirect dev ifb1
> tc qdisc add dev eth1 ingress
> tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
> tc qdisc add dev ifb0 root handle 1: hfsc default 20
> tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
> tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
> tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
> tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0x00ff flowid 1:10
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xff00 flowid 1:30
> tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%
>
> Thanks - John
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net


* Re: SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23  6:40 UTC (permalink / raw)
  To: Dave Taht; +Cc: netdev

On Fri, 2011-12-23 at 07:32 +0100, Dave Taht wrote:
> On Fri, Dec 23, 2011 at 7:00 AM, John A. Sullivan III
> <jsullivan@opensourcedevel.com> wrote:
> > Hello, all.  I have an experimental HFSC setup with three leaf classes
> > each with SFQ as the final qdisc.  One queue is for ssh on port 822, one
> > is for tcp traffic on port 443, the third is the default.
> >
> > If I flood the 443 queue with netcat, my ssh sessions are responsive and
> > my continuous ping shows a round trip time of around 50ms in keeping
> > with my netem settings.
> >
> > However, if I flood the default queue with netcat on port 80, ssh is
> > still responsive but my ping round trip times shoot up over 3000ms.
> >
> > I thought it might be the bufferbloat phenomenon so I reduced the
> > txqueuelen on both sides of the ping to 0.  Both sides use old 10BaseT
> > NICs and have no ring buffer.  I also set the SFQ limit on the default
> > queue to 2 just in case.  Still no difference.
> 
> Your txqueuelen on the ifb devices is probably 1000.
Alas, not.  They defaulted to 32 and I reset them to 0 :(

I'll also paste in a slightly optimized rule set but it still made no
difference.
> 
> 
> 
> >
> > The default queue is dequeuing at roughly 400 kbits which matches my
> > HFSC configuration.  A full sized packet should take 30 ms to pass at
> > that rate ((1514 * 8)/400,000) so, if I am round robining the queues, I
> > would expect latency on a default sized ping to be only 30 ms plus the
> > netem delay.
> >
> > Where might this 3000 ms delay be coming from?
> >
> > Here is the rule set:
> >
> #!/bin/sh

tc qdisc add dev eth1 root handle 1: hfsc default 20
tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60
tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60
iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK --set-mark 0x10
iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK --set-mark 0x11
iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK --restore-mark
modprobe ifb
ifconfig ifb0 up
ifconfig ifb1 up
tc qdisc add dev ifb0 root handle 1: hfsc default 20
tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0x00ff flowid 1:10
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xff00 flowid 1:30
tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc qdisc add dev eth1 ingress
tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x11 fw flowid 1:30
tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x10 fw flowid 1:10
tc filter add dev eth1 parent 1:1 protocol ip prio 2 u32 match u32 0 0 flowid 1:20
tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb1

> >
> > Thanks - John
> >
> 
> 
> 


* Re: SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23  7:08 UTC (permalink / raw)
  To: Dave Taht; +Cc: netdev

On Fri, 2011-12-23 at 01:40 -0500, John A. Sullivan III wrote:
> On Fri, 2011-12-23 at 07:32 +0100, Dave Taht wrote:
> > On Fri, Dec 23, 2011 at 7:00 AM, John A. Sullivan III
> > <jsullivan@opensourcedevel.com> wrote:
> > > Hello, all.  I have an experimental HFSC setup with three leaf classes
> > > each with SFQ as the final qdisc.  One queue is for ssh on port 822, one
> > > is for tcp traffic on port 443, the third is the default.
> > >
> > > If I flood the 443 queue with netcat, my ssh sessions are responsive and
> > > my continuous ping shows a round trip time of around 50ms in keeping
> > > with my netem settings.
> > >
> > > However, if I flood the default queue with netcat on port 80, ssh is
> > > still responsive but my ping round trip times shoot up over 3000ms.
> > >
> > > I thought it might be the bufferbloat phenomenon so I reduced the
> > > txqueuelen on both sides of the ping to 0.  Both sides use old 10BaseT
> > > NICs and have no ring buffer.  I also set the SFQ limit on the default
> > > queue to 2 just in case.  Still no difference.
> > 
> > Your txqueuelen on the ifb devices is probably 1000.
> Alas, not.  They defaulted to 32 and I reset them to 0 :(
> 
> I'll also paste in a slightly optimized rule set but it still made no
> difference.
> > 
> > 
> > 
> > >
> > > The default queue is dequeuing at roughly 400 kbits which matches my
> > > HFSC configuration.  A full sized packet should take 30 ms to pass at
> > > that rate ((1514 * 8)/400,000) so, if I am round robining the queues, I
> > > would expect latency on a default sized ping to be only 30 ms plus the
> > > netem delay.
> > >
> > > Where might this 3000 ms delay be coming from?
> > >
> > > Here is the rule set:
> > >
> > #!/bin/sh
> 
> tc qdisc add dev eth1 root handle 1: hfsc default 20
> tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
> tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
> tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60
> tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
> tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60
> iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK --set-mark 0x10
> iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK --set-mark 0x11
> iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK --restore-mark
> modprobe ifb
> ifconfig ifb0 up
> ifconfig ifb1 up
> tc qdisc add dev ifb0 root handle 1: hfsc default 20
> tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
> tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
> tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
> tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0x00ff flowid 1:10
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xff00 flowid 1:30
> tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc qdisc add dev eth1 ingress
> tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
> tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x11 fw flowid 1:30
> tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x10 fw flowid 1:10
> tc filter add dev eth1 parent 1:1 protocol ip prio 2 u32 match u32 0 0 flowid 1:20
> tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb1
<snip>
I just tried setting the txqueuelen to 5 instead of 0, per Dave's
off-list recommendation, but it made no difference.  When I start the bulk
traffic, the ICMP response time builds over about 10 seconds from 60 ms to
3100 ms.  Any ideas where it is coming from?  Thanks - John


* Re: SFQ on HFSC leaf does not seem to work
From: Eric Dumazet @ 2011-12-23  8:10 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Fri, 2011-12-23 at 01:00 -0500, John A. Sullivan III wrote:

> Where might this 3000 ms delay be coming from?
> 

Certainly not from SFQ

You could use tcpdump to check if delay is at egress or ingress side.
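
For instance (a sketch only, with the interface names from your script):
tcpdump -n -ttt -i eth1 icmp     # timestamps at the physical interface
tcpdump -n -ttt -i ifb0 icmp     # timestamps after the ingress redirect
Whichever capture shows the inter-packet gaps growing is the side where
the queueing delay builds.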


* Re: SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23 13:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 09:10 +0100, Eric Dumazet wrote:
> On Fri, 2011-12-23 at 01:00 -0500, John A. Sullivan III wrote:
> 
> > Where might this 3000 ms delay be coming from?
> > 
> 
> Certainly not from SFQ
> 
> You could use tcpdump to check if delay is at egress or ingress side.
> 
> 
> 
That's perplexing as well.  Tracing on the eth1 interface of the test
devices, we see the packets going out with near perfect regularity at
one second intervals and replies returning immediately.  They are also
completely in sequence so it is not as if we are matching packet 4 with
packet 7 and seeing an immediate reply that is really offset by three
seconds.

Thus, the delay seems to be registered only in the ICMP application
itself.  The CPU is virtually idle - I am very impressed with Debian's
handling of interrupts on this old system, which would otherwise have been
spending almost 60% of its time on hardware interrupts.

Another weird characteristic is that the delay is introduced gradually
over several seconds:
64 bytes from 192.168.223.84: icmp_req=4 ttl=64 time=58.0 ms
64 bytes from 192.168.223.84: icmp_req=5 ttl=64 time=52.4 ms
64 bytes from 192.168.223.84: icmp_req=6 ttl=64 time=48.7 ms
64 bytes from 192.168.223.84: icmp_req=7 ttl=64 time=118 ms
64 bytes from 192.168.223.84: icmp_req=8 ttl=64 time=834 ms
64 bytes from 192.168.223.84: icmp_req=9 ttl=64 time=896 ms
64 bytes from 192.168.223.84: icmp_req=10 ttl=64 time=897 ms
64 bytes from 192.168.223.84: icmp_req=11 ttl=64 time=1081 ms
64 bytes from 192.168.223.84: icmp_req=12 ttl=64 time=1257 ms
64 bytes from 192.168.223.84: icmp_req=13 ttl=64 time=1744 ms
64 bytes from 192.168.223.84: icmp_req=14 ttl=64 time=2107 ms
64 bytes from 192.168.223.84: icmp_req=15 ttl=64 time=2532 ms
64 bytes from 192.168.223.84: icmp_req=16 ttl=64 time=2948 ms
64 bytes from 192.168.223.84: icmp_req=17 ttl=64 time=3191 ms
64 bytes from 192.168.223.84: icmp_req=18 ttl=64 time=3163 ms

While the delay builds, we can see the replies noticeably delayed;
however, once we hit 3000ms, the display updates at one packet per second.
I would expect one packet every three seconds unless we were interleaving
packets, but we are not, according to the packet trace.

So I am guessing an inbound queue, but where? Ah, netem.  I pulled out
netem and I see vastly different results.  I get a very occasional lost
packet but no impact on latency:

64 bytes from 192.168.223.84: icmp_req=48 ttl=64 time=0.802 ms
64 bytes from 192.168.223.84: icmp_req=49 ttl=64 time=0.843 ms
64 bytes from 192.168.223.84: icmp_req=50 ttl=64 time=0.739 ms
64 bytes from 192.168.223.84: icmp_req=51 ttl=64 time=0.769 ms
64 bytes from 192.168.223.84: icmp_req=52 ttl=64 time=0.833 ms
64 bytes from 192.168.223.84: icmp_req=53 ttl=64 time=0.872 ms
64 bytes from 192.168.223.84: icmp_req=54 ttl=64 time=0.786 ms
64 bytes from 192.168.223.84: icmp_req=55 ttl=64 time=0.766 ms
64 bytes from 192.168.223.84: icmp_req=56 ttl=64 time=0.715 ms
64 bytes from 192.168.223.84: icmp_req=57 ttl=64 time=0.710 ms
64 bytes from 192.168.223.84: icmp_req=58 ttl=64 time=0.784 ms
64 bytes from 192.168.223.84: icmp_req=59 ttl=64 time=0.766 ms
64 bytes from 192.168.223.84: icmp_req=60 ttl=64 time=0.748 ms


So netem has bufferbloat ;)  Thanks - John
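
P.S. If the buffering really is inside netem, its queue limit (which I
believe iproute2 defaults to 1000 packets) could presumably be capped,
e.g. for the ifb1 instance - a sketch, untested:
tc qdisc change dev ifb1 root netem limit 20 delay 25ms 5ms distribution normal loss 0.1% 30%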


* Re: SFQ on HFSC leaf does not seem to work
From: Eric Dumazet @ 2011-12-23 13:45 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Fri, 2011-12-23 at 08:13 -0500, John A. Sullivan III wrote:
> On Fri, 2011-12-23 at 09:10 +0100, Eric Dumazet wrote:
> > On Fri, 2011-12-23 at 01:00 -0500, John A. Sullivan III wrote:
> > 
> > > Where might this 3000 ms delay be coming from?
> > > 
> > 
> > Certainly not from SFQ
> > 
> > You could use tcpdump to check if delay is at egress or ingress side.
> > 
> > 
> > 
> That's perplexing as well.  Tracing on the eth1 interface of the test
> devices, we see the packets going out with near perfect regularity at
> one second intervals and replies returning immediately.  They are also
> completely in sequence so it is not as if we are matching packet 4 with
> packet 7 and seeing an immediate reply that is really offset by three
> seconds.
> 
> Thus, the delay seems to be registered only in the ICMP application
> itself.  The CPU is virtually idle - very impressed with Debian's
> handling of interrupts on this old system would have been spending
> almost 60% on hardware interrupts.
> 
> Another weird characteristic is that the delay is introduced gradually
> over several seconds:
> 64 bytes from 192.168.223.84: icmp_req=4 ttl=64 time=58.0 ms
> 64 bytes from 192.168.223.84: icmp_req=5 ttl=64 time=52.4 ms
> 64 bytes from 192.168.223.84: icmp_req=6 ttl=64 time=48.7 ms
> 64 bytes from 192.168.223.84: icmp_req=7 ttl=64 time=118 ms
> 64 bytes from 192.168.223.84: icmp_req=8 ttl=64 time=834 ms
> 64 bytes from 192.168.223.84: icmp_req=9 ttl=64 time=896 ms
> 64 bytes from 192.168.223.84: icmp_req=10 ttl=64 time=897 ms
> 64 bytes from 192.168.223.84: icmp_req=11 ttl=64 time=1081 ms
> 64 bytes from 192.168.223.84: icmp_req=12 ttl=64 time=1257 ms
> 64 bytes from 192.168.223.84: icmp_req=13 ttl=64 time=1744 ms
> 64 bytes from 192.168.223.84: icmp_req=14 ttl=64 time=2107 ms
> 64 bytes from 192.168.223.84: icmp_req=15 ttl=64 time=2532 ms
> 64 bytes from 192.168.223.84: icmp_req=16 ttl=64 time=2948 ms
> 64 bytes from 192.168.223.84: icmp_req=17 ttl=64 time=3191 ms
> 64 bytes from 192.168.223.84: icmp_req=18 ttl=64 time=3163 ms
> 
> While the delay builds, we can see the replied noticeably delayed
> however, once we hit 3000ms, the display updates one packet per second.
> I would expect one packet per three seconds unless we were interleaving
> packets but we are not according to the packet trace.
> 
> So I am guessing an inbound queue but where? Ah, netem.  I pulled out
> netem and I seem vastly different results.  I get a very occasional lost
> packet but no impact to latency:
> 
> 64 bytes from 192.168.223.84: icmp_req=48 ttl=64 time=0.802 ms
> 64 bytes from 192.168.223.84: icmp_req=49 ttl=64 time=0.843 ms
> 64 bytes from 192.168.223.84: icmp_req=50 ttl=64 time=0.739 ms
> 64 bytes from 192.168.223.84: icmp_req=51 ttl=64 time=0.769 ms
> 64 bytes from 192.168.223.84: icmp_req=52 ttl=64 time=0.833 ms
> 64 bytes from 192.168.223.84: icmp_req=53 ttl=64 time=0.872 ms
> 64 bytes from 192.168.223.84: icmp_req=54 ttl=64 time=0.786 ms
> 64 bytes from 192.168.223.84: icmp_req=55 ttl=64 time=0.766 ms
> 64 bytes from 192.168.223.84: icmp_req=56 ttl=64 time=0.715 ms
> 64 bytes from 192.168.223.84: icmp_req=57 ttl=64 time=0.710 ms
> 64 bytes from 192.168.223.84: icmp_req=58 ttl=64 time=0.784 ms
> 64 bytes from 192.168.223.84: icmp_req=59 ttl=64 time=0.766 ms
> 64 bytes from 192.168.223.84: icmp_req=60 ttl=64 time=0.748 ms
> 
> 
> So netem has bufferbloat ;)  Thanks - John
> 

1) What kernel version do you use ?

2) How many concurrent flows are running (number of netperf/netcat)

3) Remember that 'perturb xxx' introduces a temporary doubling of the
number of flows.

4) Had you disabled tso on eth1 ?
   (If not, you might send 64Kbytes packets, and at 400kbit, they take a
lot of time to transmit : more than one second ...)
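   For instance, a 64KB GSO packet is about 64 * 1024 * 8 = 524288 bits,
which at 400kbit is roughly 1.3 seconds of transmit time.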


* Re: SFQ on HFSC leaf does not seem to work
From: Eric Dumazet @ 2011-12-23 14:00 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Fri, 2011-12-23 at 14:45 +0100, Eric Dumazet wrote:

> 1) What kernel version do you use ?
> 
> 2) How many concurrent flows are running (number of netperf/netcat)
> 
> 3) Remind that 'perturb xxx' introduces a temporary doubling of the
> number of flows.
> 
> 4) Had you disabled tso on eth1 ?
>    (If not, you might send 64Kbytes packets, and at 400kbit, they take a
> lot of time to transmit : more than one second ...)
> 
> 

Using your script on net-next (only using eth3 instead of eth1) and

ethtool -K eth3 tso off
ethtool -K eth3 gso off
ip ro flush cache

one ssh : dd if=/dev/zero | ssh 192.168.0.1 "dd of=/dev/null"
my ping is quite good :


$ ping -c 20 192.168.0.1
PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
2011/11/23 14:57:01.106 64 bytes from 192.168.0.1: icmp_seq=1 ttl=64 time=59.4 ms
2011/11/23 14:57:02.121 64 bytes from 192.168.0.1: icmp_seq=2 ttl=64 time=72.7 ms
2011/11/23 14:57:03.109 64 bytes from 192.168.0.1: icmp_seq=3 ttl=64 time=60.3 ms
2011/11/23 14:57:04.108 64 bytes from 192.168.0.1: icmp_seq=4 ttl=64 time=57.8 ms
2011/11/23 14:57:05.115 64 bytes from 192.168.0.1: icmp_seq=5 ttl=64 time=62.6 ms
2011/11/23 14:57:06.116 64 bytes from 192.168.0.1: icmp_seq=6 ttl=64 time=62.6 ms
2011/11/23 14:57:07.112 64 bytes from 192.168.0.1: icmp_seq=7 ttl=64 time=57.6 ms
2011/11/23 14:57:08.127 64 bytes from 192.168.0.1: icmp_seq=8 ttl=64 time=70.9 ms
2011/11/23 14:57:09.123 64 bytes from 192.168.0.1: icmp_seq=9 ttl=64 time=65.4 ms
2011/11/23 14:57:10.113 64 bytes from 192.168.0.1: icmp_seq=10 ttl=64 time=53.5 ms
2011/11/23 14:57:11.127 64 bytes from 192.168.0.1: icmp_seq=11 ttl=64 time=66.7 ms
2011/11/23 14:57:12.129 64 bytes from 192.168.0.1: icmp_seq=12 ttl=64 time=67.4 ms
2011/11/23 14:57:13.119 64 bytes from 192.168.0.1: icmp_seq=13 ttl=64 time=56.3 ms
2011/11/23 14:57:14.127 64 bytes from 192.168.0.1: icmp_seq=14 ttl=64 time=64.0 ms
2011/11/23 14:57:15.116 64 bytes from 192.168.0.1: icmp_seq=15 ttl=64 time=51.9 ms
2011/11/23 14:57:16.127 64 bytes from 192.168.0.1: icmp_seq=16 ttl=64 time=61.2 ms
2011/11/23 14:57:17.127 64 bytes from 192.168.0.1: icmp_seq=17 ttl=64 time=60.4 ms
2011/11/23 14:57:18.135 64 bytes from 192.168.0.1: icmp_seq=18 ttl=64 time=68.2 ms
2011/11/23 14:57:19.137 64 bytes from 192.168.0.1: icmp_seq=19 ttl=64 time=69.1 ms
2011/11/23 14:57:20.136 64 bytes from 192.168.0.1: icmp_seq=20 ttl=64 time=67.0 ms

--- 192.168.0.1 ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 19022ms
rtt min/avg/max/mdev = 51.909/62.796/72.751/5.579 ms

$ tc -s -d class show dev eth3
class hfsc 1: root 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
 period 0 level 2 

class hfsc 1:1 parent 1: sc m1 0bit d 0us m2 1490Kbit ul m1 0bit d 0us m2 1490Kbit 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
 period 69 work 38559740 bytes level 1 

class hfsc 1:10 parent 1:1 leaf 1101: rt m1 327680bit d 50.0ms m2 200000bit ls m1 0bit d 0us m2 1000Kbit 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
 period 0 level 0 

class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit 
 Sent 38587058 bytes 27022 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 19p requeues 0 
 period 69 work 38559740 bytes rtwork 10358780 bytes level 0 

class hfsc 1:30 parent 1:1 leaf 1301: rt m1 605600bit d 20.0ms m2 20000bit 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
 period 0 level 0 

class sfq 1201:f7 parent 1201: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 25804b 18p requeues 0 
 allot -1336 


Hmm... we could probably fill the hfsc class information with a non-null bytes backlog...
I'll take a look.


* Re: SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23 14:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 15:00 +0100, Eric Dumazet wrote:
> On Fri, 2011-12-23 at 14:45 +0100, Eric Dumazet wrote:
> 
> > 1) What kernel version do you use ?
> > 
> > 2) How many concurrent flows are running (number of netperf/netcat)
> > 
> > 3) Remind that 'perturb xxx' introduces a temporary doubling of the
> > number of flows.
> > 
> > 4) Had you disabled tso on eth1 ?
> >    (If not, you might send 64Kbytes packets, and at 400kbit, they take a
> > lot of time to transmit : more than one second ...)
> > 
> > 
> 
> Using your script on net-next, (only using eth3 instead of eth1) and
> 
> ethtool -K eth3 tso off
> ethtool -K eth3 gso off
> ip ro flush cache
> 
> one ssh : dd if=/dev/zero | ssh 192.168.0.1 "dd of=/dev/null"
> my ping is quite good :
> 
> 
> $ ping -c 20 192.168.0.1
> PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
> 2011/11/23 14:57:01.106 64 bytes from 192.168.0.1: icmp_seq=1 ttl=64 time=59.4 ms
> 2011/11/23 14:57:02.121 64 bytes from 192.168.0.1: icmp_seq=2 ttl=64 time=72.7 ms
> 2011/11/23 14:57:03.109 64 bytes from 192.168.0.1: icmp_seq=3 ttl=64 time=60.3 ms
> 2011/11/23 14:57:04.108 64 bytes from 192.168.0.1: icmp_seq=4 ttl=64 time=57.8 ms
> 2011/11/23 14:57:05.115 64 bytes from 192.168.0.1: icmp_seq=5 ttl=64 time=62.6 ms
> 2011/11/23 14:57:06.116 64 bytes from 192.168.0.1: icmp_seq=6 ttl=64 time=62.6 ms
> 2011/11/23 14:57:07.112 64 bytes from 192.168.0.1: icmp_seq=7 ttl=64 time=57.6 ms
> 2011/11/23 14:57:08.127 64 bytes from 192.168.0.1: icmp_seq=8 ttl=64 time=70.9 ms
> 2011/11/23 14:57:09.123 64 bytes from 192.168.0.1: icmp_seq=9 ttl=64 time=65.4 ms
> 2011/11/23 14:57:10.113 64 bytes from 192.168.0.1: icmp_seq=10 ttl=64 time=53.5 ms
> 2011/11/23 14:57:11.127 64 bytes from 192.168.0.1: icmp_seq=11 ttl=64 time=66.7 ms
> 2011/11/23 14:57:12.129 64 bytes from 192.168.0.1: icmp_seq=12 ttl=64 time=67.4 ms
> 2011/11/23 14:57:13.119 64 bytes from 192.168.0.1: icmp_seq=13 ttl=64 time=56.3 ms
> 2011/11/23 14:57:14.127 64 bytes from 192.168.0.1: icmp_seq=14 ttl=64 time=64.0 ms
> 2011/11/23 14:57:15.116 64 bytes from 192.168.0.1: icmp_seq=15 ttl=64 time=51.9 ms
> 2011/11/23 14:57:16.127 64 bytes from 192.168.0.1: icmp_seq=16 ttl=64 time=61.2 ms
> 2011/11/23 14:57:17.127 64 bytes from 192.168.0.1: icmp_seq=17 ttl=64 time=60.4 ms
> 2011/11/23 14:57:18.135 64 bytes from 192.168.0.1: icmp_seq=18 ttl=64 time=68.2 ms
> 2011/11/23 14:57:19.137 64 bytes from 192.168.0.1: icmp_seq=19 ttl=64 time=69.1 ms
> 2011/11/23 14:57:20.136 64 bytes from 192.168.0.1: icmp_seq=20 ttl=64 time=67.0 ms
> 
> --- 192.168.0.1 ping statistics ---
> 20 packets transmitted, 20 received, 0% packet loss, time 19022ms
> rtt min/avg/max/mdev = 51.909/62.796/72.751/5.579 ms
> 
> $ tc -s -d class show dev eth3
> class hfsc 1: root 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
>  period 0 level 2 
> 
> class hfsc 1:1 parent 1: sc m1 0bit d 0us m2 1490Kbit ul m1 0bit d 0us m2 1490Kbit 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
>  period 69 work 38559740 bytes level 1 
> 
> class hfsc 1:10 parent 1:1 leaf 1101: rt m1 327680bit d 50.0ms m2 200000bit ls m1 0bit d 0us m2 1000Kbit 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
>  period 0 level 0 
> 
> class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit 
>  Sent 38587058 bytes 27022 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 19p requeues 0 
>  period 69 work 38559740 bytes rtwork 10358780 bytes level 0 
> 
> class hfsc 1:30 parent 1:1 leaf 1301: rt m1 605600bit d 20.0ms m2 20000bit 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
>  period 0 level 0 
> 
> class sfq 1201:f7 parent 1201: 
>  (dropped 0, overlimits 0 requeues 0) 
>  backlog 25804b 18p requeues 0 
>  allot -1336 
> 
> 
> Hmm... we probably could fill hfsc class information with non null bytes backlog...
> I'll take a look.
> 
> 
> 
Thanks very much, Eric.  gso, and only gso, was enabled, but disabling it
does not seem to have solved the problem when I activate netem:

root@testswitch01:~# ./tcplay
root@testswitch01:~# man ethtool
root@testswitch01:~# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
root@testswitch01:~# ethtool -K eth1 gso off
root@testswitch01:~# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
ip ro flush cache

64 bytes from 192.168.223.84: icmp_req=16 ttl=64 time=42.6 ms
64 bytes from 192.168.223.84: icmp_req=17 ttl=64 time=39.1 ms
64 bytes from 192.168.223.84: icmp_req=18 ttl=64 time=45.5 ms
64 bytes from 192.168.223.84: icmp_req=19 ttl=64 time=406 ms
64 bytes from 192.168.223.84: icmp_req=20 ttl=64 time=919 ms
64 bytes from 192.168.223.84: icmp_req=21 ttl=64 time=920 ms
64 bytes from 192.168.223.84: icmp_req=22 ttl=64 time=1013 ms
64 bytes from 192.168.223.84: icmp_req=23 ttl=64 time=1158 ms
64 bytes from 192.168.223.84: icmp_req=24 ttl=64 time=1521 ms
64 bytes from 192.168.223.84: icmp_req=25 ttl=64 time=1915 ms
64 bytes from 192.168.223.84: icmp_req=26 ttl=64 time=2371 ms
64 bytes from 192.168.223.84: icmp_req=27 ttl=64 time=2797 ms
64 bytes from 192.168.223.84: icmp_req=28 ttl=64 time=3161 ms
64 bytes from 192.168.223.84: icmp_req=29 ttl=64 time=3162 ms
64 bytes from 192.168.223.84: icmp_req=30 ttl=64 time=3163 ms

Just in case something is amiss in my methodology, I have four ssh
sessions open to the test firewall; ssh is in a separate prioritized
queue.  In one session I run:
	ping 192.168.223.84
Then, in another, I do:
	nc 192.168.223.100 443 >/dev/null - this should go into a non-default, prioritized queue.
Pings are OK at this point.
Then, in a third, I do:
	nc 192.168.223.100 80 >/dev/null - this goes into the default queue, the same as ping, and is when the trouble starts.

I did alter the queue lengths in a recommendation from Dave Taht.  Here
is my current script with netem:

tc qdisc add dev eth1 root handle 1: hfsc default 20
tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 60 limit 30
tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60 limit 30
tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60 limit 30
iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK --set-mark 0x10
iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK --set-mark 0x11
iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK --restore-mark
modprobe ifb
ifconfig ifb0 up
ifconfig ifb1 up
tc qdisc add dev ifb0 root handle 1: hfsc default 20
tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul rate 1490kbit
tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls rate 200kbit
tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax 50ms rate 200kbit ls rate 1000kbit
tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax 20ms rate 20kbit
tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0x00ff flowid 1:10
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xff00 flowid 1:30
tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc qdisc add dev eth1 ingress
tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x11 fw flowid 1:30
tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x10 fw flowid 1:10
tc filter add dev eth1 parent 1:1 protocol ip prio 2 u32 match u32 0 0 flowid 1:20
tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb1
ip link set eth1 txqueuelen 100
ip link set ifb1 txqueuelen 100
ip link set ifb0 txqueuelen 100

I'd love to solve this.  Just when I thought I was all finished having
cracked the multiple filter problem to add netem to hfsc, I hit this.
Thanks again - John


* Re: SFQ on HFSC leaf does not seem to work
From: Eric Dumazet @ 2011-12-23 14:59 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Fri, 2011-12-23 at 09:38 -0500, John A. Sullivan III wrote:

> Thanks very much, Eric.  gso and gso only was enabled but disabling it
> does not seem to have solved the problem when I activate netem:
> 

And your kernel version is ?

> root@testswitch01:~# ./tcplay
> root@testswitch01:~# man ethtool
> root@testswitch01:~# ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: on
> generic-receive-offload: off
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
> root@testswitch01:~# ethtool -K eth1 gso off
> root@testswitch01:~# ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: off
> generic-receive-offload: off
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
> ip ro flush cache
> 
> 64 bytes from 192.168.223.84: icmp_req=16 ttl=64 time=42.6 ms
> 64 bytes from 192.168.223.84: icmp_req=17 ttl=64 time=39.1 ms
> 64 bytes from 192.168.223.84: icmp_req=18 ttl=64 time=45.5 ms
> 64 bytes from 192.168.223.84: icmp_req=19 ttl=64 time=406 ms
> 64 bytes from 192.168.223.84: icmp_req=20 ttl=64 time=919 ms
> 64 bytes from 192.168.223.84: icmp_req=21 ttl=64 time=920 ms
> 64 bytes from 192.168.223.84: icmp_req=22 ttl=64 time=1013 ms
> 64 bytes from 192.168.223.84: icmp_req=23 ttl=64 time=1158 ms
> 64 bytes from 192.168.223.84: icmp_req=24 ttl=64 time=1521 ms
> 64 bytes from 192.168.223.84: icmp_req=25 ttl=64 time=1915 ms
> 64 bytes from 192.168.223.84: icmp_req=26 ttl=64 time=2371 ms
> 64 bytes from 192.168.223.84: icmp_req=27 ttl=64 time=2797 ms
> 64 bytes from 192.168.223.84: icmp_req=28 ttl=64 time=3161 ms
> 64 bytes from 192.168.223.84: icmp_req=29 ttl=64 time=3162 ms
> 64 bytes from 192.168.223.84: icmp_req=30 ttl=64 time=3163 ms
> 
> Just in case something is amiss in my methodology, I have four ssh
> sessions open to the test firewall; ssh is in a separate prioritized
> queue.  In one session I run:
> 	ping 192.168.223.84
> Then, in another, I do:
> 	nc 192.168.223.100 443 >/dev/null - this should go into a non-default,

So you _receive_ traffic?

Are you aware you don't have SFQ in your ingress setup, only egress?

> prioritized queue.
> Pings are OK at this point.
> Then, in a third, I do:
> 	nc 192.168.223.100 80 >/dev/null - this goes into the default queue,

same here ?

> the same as ping, and is when the trouble starts.
> 
> I did alter the queue lengths in a recommendation from Dave Taht.  Here
> is my current script with netem:
> 
> tc qdisc add dev eth1 root handle 1: hfsc default 20
> tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul
> rate 1490kbit
> tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls
> rate 200kbit
> tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 60 limit 30
> tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax
> 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60 limit 30
> tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax
> 20ms rate 20kbit
> tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60 limit 30
> iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK
> --set-mark 0x10
> iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK
> --set-mark 0x11
> iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK
> --restore-mark
> modprobe ifb
> ifconfig ifb0 up
> ifconfig ifb1 up
> tc qdisc add dev ifb0 root handle 1: hfsc default 20
> tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul
> rate 1490kbit
> tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls
> rate 200kbit
> tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms
> distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax
> 50ms rate 200kbit ls rate 1000kbit
> tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms
> distribution normal loss 0.1% 30%
> tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax
> 20ms rate 20kbit
> tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms
> distribution normal loss 0.1% 30%
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32
> divisor 1
> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip
> protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat


> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> tcp src 443 0x00ff flowid 1:10

why "src 443 0x00ff" ? It should be "src 443 0xffff"

> tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> tcp dst 822 0xff00 flowid 1:30

same here : "dst 822 0xffff"

> tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution
> normal loss 0.1% 30%
> tc qdisc add dev eth1 ingress
> tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0
> 0 action mirred egress redirect dev ifb0
> tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x11 fw
> flowid 1:30
> tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x10 fw
> flowid 1:10
> tc filter add dev eth1 parent 1:1 protocol ip prio 2 u32 match u32 0 0
> flowid 1:20
> tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0
> flowid 1:1 action mirred egress redirect dev ifb1
> ip link set eth1 txqueuelen 100
> ip link set ifb1 txqueuelen 100
> ip link set ifb0 txqueuelen 100
> 
> I'd love to solve this.  Just when I thought I was all finished having
> cracked the multiple filter problem to add netem to hfsc, I hit this.
> Thanks again - John
> 

Add some SFQ to your ingress too...
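
A minimal sketch, assuming the netem leaves on ifb0 could be swapped out
(handles as in your script):
tc qdisc replace dev ifb0 parent 1:20 handle 1201 sfq perturb 60 limit 30
tc qdisc replace dev ifb0 parent 1:10 handle 1101 sfq perturb 60 limit 30
tc qdisc replace dev ifb0 parent 1:30 handle 1301 sfq perturb 60 limit 30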


* [PATCH net-next] sch_hfsc: report backlog information
From: Eric Dumazet @ 2011-12-23 15:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, John A. Sullivan III

Add backlog (byte count) information in hfsc classes and qdisc, so that
"tc -s" can report it to user, instead of 0 values :

qdisc hfsc 1: root refcnt 6 default 20 
 Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0) 
 rate 1492Kbit 126pps backlog 103226b 74p requeues 0 
...
class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit 
 Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 81822b 56p requeues 0 
 period 23 work 49451576 bytes rtwork 13277552 bytes level 0 
...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: John A. Sullivan III <jsullivan@opensourcedevel.com>
---
 net/sched/sch_hfsc.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 6488e64..9bdca2e 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1368,6 +1368,7 @@ hfsc_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	struct tc_hfsc_stats xstats;
 
 	cl->qstats.qlen = cl->qdisc->q.qlen;
+	cl->qstats.backlog = cl->qdisc->qstats.backlog;
 	xstats.level   = cl->level;
 	xstats.period  = cl->cl_vtperiod;
 	xstats.work    = cl->cl_total;
@@ -1561,6 +1562,15 @@ hfsc_dump_qdisc(struct Qdisc *sch, struct sk_buff *skb)
 	struct hfsc_sched *q = qdisc_priv(sch);
 	unsigned char *b = skb_tail_pointer(skb);
 	struct tc_hfsc_qopt qopt;
+	struct hfsc_class *cl;
+	struct hlist_node *n;
+	unsigned int i;
+
+	sch->qstats.backlog = 0;
+	for (i = 0; i < q->clhash.hashsize; i++) {
+		hlist_for_each_entry(cl, n, &q->clhash.hash[i], cl_common.hnode)
+			sch->qstats.backlog += cl->qdisc->qstats.backlog;
+	}
 
 	qopt.defcls = q->defcls;
 	NLA_PUT(skb, TCA_OPTIONS, sizeof(qopt), &qopt);


* Re: SFQ on HFSC leaf does not seem to work
From: John A. Sullivan III @ 2011-12-23 15:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

Thanks for asking those questions, as I had questions about them myself but
felt I was already doing enough spamming of the list.  I am also just in the
midst of gathering some of the other information you requested.  I'll respond
inline - John

On Fri, 2011-12-23 at 15:59 +0100, Eric Dumazet wrote:
> On Fri, 2011-12-23 at 09:38 -0500, John A. Sullivan III wrote:
> 
> > Thanks very much, Eric.  gso and gso only was enabled but disabling it
> > does not seem to have solved the problem when I activate netem:
> > 
> 
> And your kernel version is ?
root@testswitch01:~# uname -a
Linux testswitch01 2.6.32-5-686 #1 SMP Mon Oct 3 04:15:24 UTC 2011 i686
GNU/Linux

This is Debian Squeeze i386

As described, I'm running two netcats - one going to the default queue
along with the pings and the other going to a different queue.
> 
> > root@testswitch01:~# ./tcplay
> > root@testswitch01:~# man ethtool
> > root@testswitch01:~# ethtool -k eth1
> > Offload parameters for eth1:
> > rx-checksumming: on
> > tx-checksumming: on
> > scatter-gather: on
> > tcp-segmentation-offload: off
> > udp-fragmentation-offload: off
> > generic-segmentation-offload: on
> > generic-receive-offload: off
> > large-receive-offload: off
> > ntuple-filters: off
> > receive-hashing: off
> > root@testswitch01:~# ethtool -K eth1 gso off
> > root@testswitch01:~# ethtool -k eth1
> > Offload parameters for eth1:
> > rx-checksumming: on
> > tx-checksumming: on
> > scatter-gather: on
> > tcp-segmentation-offload: off
> > udp-fragmentation-offload: off
> > generic-segmentation-offload: off
> > generic-receive-offload: off
> > large-receive-offload: off
> > ntuple-filters: off
> > receive-hashing: off
> > ip ro flush cache
> > 
> > 64 bytes from 192.168.223.84: icmp_req=16 ttl=64 time=42.6 ms
> > 64 bytes from 192.168.223.84: icmp_req=17 ttl=64 time=39.1 ms
> > 64 bytes from 192.168.223.84: icmp_req=18 ttl=64 time=45.5 ms
> > 64 bytes from 192.168.223.84: icmp_req=19 ttl=64 time=406 ms
> > 64 bytes from 192.168.223.84: icmp_req=20 ttl=64 time=919 ms
> > 64 bytes from 192.168.223.84: icmp_req=21 ttl=64 time=920 ms
> > 64 bytes from 192.168.223.84: icmp_req=22 ttl=64 time=1013 ms
> > 64 bytes from 192.168.223.84: icmp_req=23 ttl=64 time=1158 ms
> > 64 bytes from 192.168.223.84: icmp_req=24 ttl=64 time=1521 ms
> > 64 bytes from 192.168.223.84: icmp_req=25 ttl=64 time=1915 ms
> > 64 bytes from 192.168.223.84: icmp_req=26 ttl=64 time=2371 ms
> > 64 bytes from 192.168.223.84: icmp_req=27 ttl=64 time=2797 ms
> > 64 bytes from 192.168.223.84: icmp_req=28 ttl=64 time=3161 ms
> > 64 bytes from 192.168.223.84: icmp_req=29 ttl=64 time=3162 ms
> > 64 bytes from 192.168.223.84: icmp_req=30 ttl=64 time=3163 ms
> > 
> > Just in case something is amiss in my methodology, I have four ssh
> > sessions open to the test firewall; ssh is in a separate prioritized
> > queue.  In one session I run:
> > 	ping 192.168.223.84
> > Then, in another, I do:
> > 	nc 192.168.223.100 443 >/dev/null - this should go into a non-default,
> 
> So you _receive_ trafic ?
Yes
> 
> Are you aware you dont have SFQ in your ingress setup, only egress  ?
Yes.  This is a problem I have with netem on ingress traffic.  I use the
filter on ffff: to redirect to ifb0 for the ingress traffic shaping.  I
cannot figure out a way to redirect a second time to ifb1 for the netem
qdisc.  I tried putting two action mirred statements in the filter but
that did not work.  Unlike eth1, I cannot attach a filter further down
the ifb0 hfsc hierarchy because one can't redirect one ifb into another
ifb.  Thus, the only way I could figure out how to do inbound netem was
to replace the terminal qdisc with netem rather than SFQ.  I'd love to
be able to do that differently.  I tried attaching netem to the SFQ but
that failed (I assume because SFQ is classless) and I tried the other
way around, attaching SFQ to netem since you mentioned netem could take
a class but that did not work either.
> 
> > prioritized queue.
> > Pings are OK at this point.
> > Then, in a third, I do:
> > 	nc 192.168.223.100 80 >/dev/null - this goes into the default queue,
> 
> same here ?
Yes.
> 
> > the same as ping, and is when the trouble starts.
> > 
> > I did alter the queue lengths in a recommendation from Dave Taht.  Here
> > is my current script with netem:
> > 
> > tc qdisc add dev eth1 root handle 1: hfsc default 20
> > tc class add dev eth1 parent 1: classid 1:1 hfsc sc rate 1490kbit ul
> > rate 1490kbit
> > tc class add dev eth1 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls
> > rate 200kbit
> > tc qdisc add dev eth1 parent 1:20 handle 1201 sfq perturb 60 limit 30
> > tc class add dev eth1 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax
> > 50ms rate 200kbit ls rate 1000kbit
> > tc qdisc add dev eth1 parent 1:10 handle 1101 sfq perturb 60 limit 30
> > tc class add dev eth1 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax
> > 20ms rate 20kbit
> > tc qdisc add dev eth1 parent 1:30 handle 1301 sfq perturb 60 limit 30
> > iptables -t mangle -A POSTROUTING -p 6 --syn --dport 443 -j CONNMARK
> > --set-mark 0x10
> > iptables -t mangle -A PREROUTING -p 6 --syn --dport 822 -j CONNMARK
> > --set-mark 0x11
> > iptables -t mangle -A POSTROUTING -o eth1 -p 6 -j CONNMARK
> > --restore-mark
> > modprobe ifb
> > ifconfig ifb0 up
> > ifconfig ifb1 up
> > tc qdisc add dev ifb0 root handle 1: hfsc default 20
> > tc class add dev ifb0 parent 1: classid 1:1 hfsc sc rate 1490kbit ul
> > rate 1490kbit
> > tc class add dev ifb0 parent 1:1 classid 1:20 hfsc rt rate 400kbit ls
> > rate 200kbit
> > tc qdisc add dev ifb0 parent 1:20 handle 1201 netem delay 25ms 5ms
> > distribution normal loss 0.1% 30%
> > tc class add dev ifb0 parent 1:1 classid 1:10 hfsc rt umax 16kbit dmax
> > 50ms rate 200kbit ls rate 1000kbit
> > tc qdisc add dev ifb0 parent 1:10 handle 1101 netem delay 25ms 5ms
> > distribution normal loss 0.1% 30%
> > tc class add dev ifb0 parent 1:1 classid 1:30 hfsc rt umax 1514b dmax
> > 20ms rate 20kbit
> > tc qdisc add dev ifb0 parent 1:30 handle 1301 netem delay 25ms 5ms
> > distribution normal loss 0.1% 30%
> > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32
> > divisor 1
> > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip
> > protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0 eat
> 
> 
> > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> > tcp src 443 0x00ff flowid 1:10
> 
> why "src 443 0x00ff" ? It should be "src 443 0xffff"
That's what I tried at first but nothing matched the filter.  I assumed
it was because it objected to a value in the dst field so I masked it
off and it worked.
> 
> > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> > tcp dst 822 0xff00 flowid 1:30
> 
> same here : "dst 822 0xffff"
Same as above.  No filter matches when using that mask.
> 
> > tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution
> > normal loss 0.1% 30%
> > tc qdisc add dev eth1 ingress
> > tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0
> > 0 action mirred egress redirect dev ifb0
> > tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x11 fw
> > flowid 1:30
> > tc filter add dev eth1 parent 1:1 protocol ip prio 1 handle 0x10 fw
> > flowid 1:10
> > tc filter add dev eth1 parent 1:1 protocol ip prio 2 u32 match u32 0 0
> > flowid 1:20
> > tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0
> > flowid 1:1 action mirred egress redirect dev ifb1
> > ip link set eth1 txqueuelen 100
> > ip link set ifb1 txqueuelen 100
> > ip link set ifb0 txqueuelen 100
> > 
> > I'd love to solve this.  Just when I thought I was all finished having
> > cracked the multiple filter problem to add netem to hfsc, I hit this.
> > Thanks again - John
> > 
> 
> Add some SFQ to your ingress too...
<grin> how with netem?  Thanks very much again - John
> 
> 
> 
PS - I also manually disabled gro in case that was a problem, even though
it was already showing as off.  It made no difference.

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 15:26             ` John A. Sullivan III
@ 2011-12-23 16:16               ` Eric Dumazet
  2011-12-23 16:44                 ` John A. Sullivan III
  2011-12-31 22:17               ` John A. Sullivan III
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2011-12-23 16:16 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Friday, 23 December 2011 at 10:26 -0500, John A. Sullivan III wrote:

> Yes.  This is a problem I have with netem on ingress traffic.  I use the
> filter on ffff: to redirect to ifb0 for the ingress traffic shaping.  I
> cannot figure out a way to redirect a second time to ifb1 for the netem
> qdisc.  I tried putting two action mirred statements in the filter but
> that did not work.  Unlike eth1, I cannot attach a filter further down
> the ifb0 hfsc hierarchy because one can't redirect one ifb into another
> ifb.  Thus, the only way I could figure out how to do inbound netem was
> to replace the terminal qdisc with netem rather than SFQ.  I'd love to
> be able to do that differently.  I tried attaching netem to the SFQ but
> that failed (I assume because SFQ is classless) and I tried the other
> way around, attaching SFQ to netem since you mentioned netem could take
> a class but that did not work either.

Unfortunately, netem wants to control skbs itself, in a fifo queue.

To implement what you want, we would need to setup a second qdisc,
and when packets are dequeued from internal netem fifo, queue them in
second qdisc.
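
Purely as a sketch of what that could then look like from userspace,
assuming netem one day exposes its internal queue as a graftable child
class (the stock kernel at the time of this thread rejects the second
command, and the hfsc tree on ifb0 is ignored here for brevity):

tc qdisc add dev ifb0 root handle 1: netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc qdisc add dev ifb0 parent 1:1 handle 10: sfq perturb 10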

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 16:16               ` Eric Dumazet
@ 2011-12-23 16:44                 ` John A. Sullivan III
  2011-12-23 17:06                   ` Eric Dumazet
  0 siblings, 1 reply; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-23 16:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 17:16 +0100, Eric Dumazet wrote:
> On Friday, 23 December 2011 at 10:26 -0500, John A. Sullivan III wrote:
> 
> > Yes.  This is a problem I have with netem on ingress traffic.  I use the
> > filter on ffff: to redirect to ifb0 for the ingress traffic shaping.  I
> > cannot figure out a way to redirect a second time to ifb1 for the netem
> > qdisc.  I tried putting two action mirred statements in the filter but
> > that did not work.  Unlike eth1, I cannot attach a filter further down
> > the ifb0 hfsc hierarchy because one can't redirect one ifb into another
> > ifb.  Thus, the only way I could figure out how to do inbound netem was
> > to replace the terminal qdisc with netem rather than SFQ.  I'd love to
> > be able to do that differently.  I tried attaching netem to the SFQ but
> > that failed (I assume because SFQ is classless) and I tried the other
> > way around, attaching SFQ to netem since you mentioned netem could take
> > a class but that did not work either.
> 
> Unfortunately, netem wants to control skbs itself, in a fifo queue.
> 
> To implement what you want, we would need to setup a second qdisc,
> and when packets are dequeued from internal netem fifo, queue them in
> second qdisc.
> 
> 
> 
I thought I tried to do that but I must have done it incorrectly.  I
would think something like:

tc qdisc add dev eth1 ingress
tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
tc qdisc add  dev ifb0 root handle 4 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20

but I get:
root@testswitch01:~# tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20
RTNETLINK answers: Operation not supported

What did I do wrong? Thanks - John

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 16:44                 ` John A. Sullivan III
@ 2011-12-23 17:06                   ` Eric Dumazet
  2011-12-23 17:17                     ` Eric Dumazet
  2011-12-23 17:20                     ` John A. Sullivan III
  0 siblings, 2 replies; 23+ messages in thread
From: Eric Dumazet @ 2011-12-23 17:06 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Friday, 23 December 2011 at 11:44 -0500, John A. Sullivan III wrote:
> On Fri, 2011-12-23 at 17:16 +0100, Eric Dumazet wrote:
> > On Friday, 23 December 2011 at 10:26 -0500, John A. Sullivan III wrote:
> > 
> > > Yes.  This is a problem I have with netem on ingress traffic.  I use the
> > > filter on ffff: to redirect to ifb0 for the ingress traffic shaping.  I
> > > cannot figure out a way to redirect a second time to ifb1 for the netem
> > > qdisc.  I tried putting two action mirred statements in the filter but
> > > that did not work.  Unlike eth1, I cannot attach a filter further down
> > > the ifb0 hfsc hierarchy because one can't redirect one ifb into another
> > > ifb.  Thus, the only way I could figure out how to do inbound netem was
> > > to replace the terminal qdisc with netem rather than SFQ.  I'd love to
> > > be able to do that differently.  I tried attaching netem to the SFQ but
> > > that failed (I assume because SFQ is classless) and I tried the other
> > > way around, attaching SFQ to netem since you mentioned netem could take
> > > a class but that did not work either.
> > 
> > Unfortunately, netem wants to control skbs itself, in a fifo queue.
> > 
> > To implement what you want, we would need to setup a second qdisc,
> > and when packets are dequeued from internal netem fifo, queue them in
> > second qdisc.
> > 
> > 
> > 
> I thought I tried to do that but I must have done it incorrectly.  I
> would think something like:
> 
> tc qdisc add dev eth1 ingress
> tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
> tc qdisc add  dev ifb0 root handle 4 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20
> 
> but I get:
> root@testswitch01:~# tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20
> RTNETLINK answers: Operation not supported
> 
> What did I do wrong? Thanks - John
> 

Maybe I was not clear:

netem currently uses an internal fifo queue; you can't change this
without patching the kernel.

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 17:06                   ` Eric Dumazet
@ 2011-12-23 17:17                     ` Eric Dumazet
  2011-12-23 17:33                       ` John A. Sullivan III
  2011-12-23 17:20                     ` John A. Sullivan III
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2011-12-23 17:17 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Friday, 23 December 2011 at 18:06 +0100, Eric Dumazet wrote:

> Maybe I was not clear :
> 
> netem currently uses a fifo queue, you cant change this, without
> patching kernel.
> 

Another way would be to patch sch_tbf, adding delay, if that's all you
want to do.

(Or adding delay capability to ifb)

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 17:06                   ` Eric Dumazet
  2011-12-23 17:17                     ` Eric Dumazet
@ 2011-12-23 17:20                     ` John A. Sullivan III
  1 sibling, 0 replies; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-23 17:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 18:06 +0100, Eric Dumazet wrote:
> On Friday, 23 December 2011 at 11:44 -0500, John A. Sullivan III wrote:
> > On Fri, 2011-12-23 at 17:16 +0100, Eric Dumazet wrote:
> > On Friday, 23 December 2011 at 10:26 -0500, John A. Sullivan III wrote:
> > > 
> > > > Yes.  This is a problem I have with netem on ingress traffic.  I use the
> > > > filter on ffff: to redirect to ifb0 for the ingress traffic shaping.  I
> > > > cannot figure out a way to redirect a second time to ifb1 for the netem
> > > > qdisc.  I tried putting two action mirred statements in the filter but
> > > > that did not work.  Unlike eth1, I cannot attach a filter further down
> > > > the ifb0 hfsc hierarchy because one can't redirect one ifb into another
> > > > ifb.  Thus, the only way I could figure out how to do inbound netem was
> > > > to replace the terminal qdisc with netem rather than SFQ.  I'd love to
> > > > be able to do that differently.  I tried attaching netem to the SFQ but
> > > > that failed (I assume because SFQ is classless) and I tried the other
> > > > way around, attaching SFQ to netem since you mentioned netem could take
> > > > a class but that did not work either.
> > > 
> > > Unfortunately, netem wants to control skbs itself, in a fifo queue.
> > > 
> > > To implement what you want, we would need to setup a second qdisc,
> > > and when packets are dequeued from internal netem fifo, queue them in
> > > second qdisc.
> > > 
> > > 
> > > 
> > I thought I tried to do that but I must have done it incorrectly.  I
> > would think something like:
> > 
> > tc qdisc add dev eth1 ingress
> > tc filter add dev eth1 parent ffff: protocol ip prio 50 u32 match u32 0 0 action mirred egress redirect dev ifb0
> > tc qdisc add  dev ifb0 root handle 4 netem delay 25ms 5ms distribution normal loss 0.1% 30%
> > tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20
> > 
> > but I get:
> > root@testswitch01:~# tc qdisc add dev ifb0 parent 4:0 handle 1: hfsc default 20
> > RTNETLINK answers: Operation not supported
> > 
> > What did I do wrong? Thanks - John
> > 
> 
> Maybe I was not clear :
> 
> netem currently uses a fifo queue, you cant change this, without
> patching kernel.
> 
> 
> 
OK - that makes sense and explains what I was seeing, but once they are
out of the netem fifo, aren't they headed for the NIC driver? How do we
queue them in a second qdisc? I'll gladly give it a try if I know how.
Thanks - John

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 17:17                     ` Eric Dumazet
@ 2011-12-23 17:33                       ` John A. Sullivan III
  2011-12-23 17:35                         ` John A. Sullivan III
  0 siblings, 1 reply; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-23 17:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 18:17 +0100, Eric Dumazet wrote:
> On Friday, 23 December 2011 at 18:06 +0100, Eric Dumazet wrote:
> 
> > Maybe I was not clear :
> > 
> > netem currently uses a fifo queue, you cant change this, without
> > patching kernel.
> > 
> 
> An other way would be to patch sch_tbf, adding delay, if its all you
> want to do.
> 
> (Or adding delay capability to ifb)
> 
> 
> 
> 
<grin> that's just a little outside my skill set ;) - John

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 17:33                       ` John A. Sullivan III
@ 2011-12-23 17:35                         ` John A. Sullivan III
  2011-12-23 21:10                           ` John A. Sullivan III
  0 siblings, 1 reply; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-23 17:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 12:33 -0500, John A. Sullivan III wrote:
> On Fri, 2011-12-23 at 18:17 +0100, Eric Dumazet wrote:
> > On Friday, 23 December 2011 at 18:06 +0100, Eric Dumazet wrote:
> > 
> > > Maybe I was not clear :
> > > 
> > > netem currently uses a fifo queue, you cant change this, without
> > > patching kernel.
> > > 
> > 
> > An other way would be to patch sch_tbf, adding delay, if its all you
> > want to do.
> > 
> > (Or adding delay capability to ifb)
> > 
> > 
> > 
> > 
> <grin> that's just a little outside by skill set ;) - John
> 
<snip>
And I should mention, seriously, that we view what we are learning here
as something we will use in production, so we'd like to accomplish our
goals using stock distribution code.  Thanks, though - John

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 17:35                         ` John A. Sullivan III
@ 2011-12-23 21:10                           ` John A. Sullivan III
  2011-12-23 22:24                             ` Eric Dumazet
  0 siblings, 1 reply; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-23 21:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 12:35 -0500, John A. Sullivan III wrote:
> On Fri, 2011-12-23 at 12:33 -0500, John A. Sullivan III wrote:
> > On Fri, 2011-12-23 at 18:17 +0100, Eric Dumazet wrote:
> > > On Friday, 23 December 2011 at 18:06 +0100, Eric Dumazet wrote:
> > > 
> > > > Maybe I was not clear :
> > > > 
> > > > netem currently uses a fifo queue, you cant change this, without
> > > > patching kernel.
> > > > 
> > > 
> > > An other way would be to patch sch_tbf, adding delay, if its all you
> > > want to do.
> > > 
> > > (Or adding delay capability to ifb)
> > > 
> > > 
> > > 
> > > 
> > <grin> that's just a little outside by skill set ;) - John
> > 
> <snip>
> And I should mention seriously, that we are viewing our learning curve
> as something we will use in production so we'd like to accomplish our
> goals using stock distribution code.  Thanks, though - John
<snip>
Should I guess, from the flood of subsequent emails about patching
netem, that it is not currently possible to do netem and hfsc/sfq on
ingress traffic using the currently available tools?

It's definitely netem on the ingress.  When I run it on egress and
disable it on ingress, I do not have the problem.  But I see no way of
getting the netem traffic into or out of SFQ.  Thanks - John
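
For contrast, the egress path that does behave is the one already in
the script above - the eth1 hfsc/sfq tree plus a second redirect of the
shaped traffic into ifb1, whose root qdisc is the netem:

tc qdisc add dev ifb1 root handle 2 netem delay 25ms 5ms distribution normal loss 0.1% 30%
tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb1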

* Re: [PATCH net-next] sch_hfsc: report backlog information
  2011-12-23 15:19         ` [PATCH net-next] sch_hfsc: report backlog information Eric Dumazet
@ 2011-12-23 21:52           ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2011-12-23 21:52 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, jsullivan

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 23 Dec 2011 16:19:20 +0100

> Add backlog (byte count) information in hfsc classes and qdisc, so that
> "tc -s" can report it to user, instead of 0 values :
> 
> qdisc hfsc 1: root refcnt 6 default 20 
>  Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0) 
>  rate 1492Kbit 126pps backlog 103226b 74p requeues 0 
> ...
> class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit 
>  Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 81822b 56p requeues 0 
>  period 23 work 49451576 bytes rtwork 13277552 bytes level 0 
> ...
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 21:10                           ` John A. Sullivan III
@ 2011-12-23 22:24                             ` Eric Dumazet
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2011-12-23 22:24 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netdev

On Friday, 23 December 2011 at 16:10 -0500, John A. Sullivan III wrote:

> Should I guess that, from the flood of subsequent emails about patching
> netem that it is not currently possible to do netem and hfsc/sfq on
> ingress traffic using the currently available tools? 
> 

Yep... current netem is a bit limited.

> It's definitely netem on the ingress.  When I run it on egress and
> disable it on ingress, I do not have the problem.  But, I see no what of
> getting the netem traffic into or out of SFQ.  Thanks - John
> 

It'll be possible after a few patches, but only using net-next, or by
waiting for a backport to a 3.2 kernel somehow...

* Re: SFQ on HFSC leaf does not seem to work
  2011-12-23 15:26             ` John A. Sullivan III
  2011-12-23 16:16               ` Eric Dumazet
@ 2011-12-31 22:17               ` John A. Sullivan III
  1 sibling, 0 replies; 23+ messages in thread
From: John A. Sullivan III @ 2011-12-31 22:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Fri, 2011-12-23 at 10:26 -0500, John A. Sullivan III wrote:
> <snip>> 
> > > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> > > tcp src 443 0x00ff flowid 1:10
> > 
> > why "src 443 0x00ff" ? It should be "src 443 0xffff"
> That's what I tried at first but nothing matched the filter.  I assumed
> it was because it objected to a value in the dst field so I masked it
> off and it worked.
> > 
> > > tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match
> > > tcp dst 822 0xff00 flowid 1:30
> > 
> > same here : "dst 822 0xffff"
> Same as above.  No filter matches when using that mask.
> > 
<snip>
Oops! It must not have matched for some other reason - this was a clear
binary brain cramp! This now appears to be working:

tc filter add dev ifb0 parent 1:0 protocol ip prio 1 handle 6: u32 divisor 1
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 match ip protocol 6 0xff link 6: offset at 0 mask 0x0f00 shift 6 plus 0
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp dst 822 0xffff at nexthdr+2 flowid 1:30
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 822 0xffff at nexthdr+0 flowid 1:30
# Send packets <64 bytes (u16 0 0xffc0 at 2) with only the ACK flag set (match u8 16 0xff at nexthdr+13) to the low latency queue
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match u16 0 0xffc0 at 2 match u8 16 0xff at nexthdr+13 flowid 1:30
tc filter add dev ifb0 parent 1:0 protocol ip prio 1 u32 ht 6:0 match tcp src 443 0xffff at nexthdr+0 flowid 1:10
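
(One quick way to confirm packets are landing in the intended classes
is to watch the per-class counters, e.g.:

tc -s class show dev ifb0
tc -s qdisc show dev ifb0

and check that the byte/packet counts grow on 1:10 and 1:30 rather
than only on the 1:20 default.)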

Thanks - John

end of thread

Thread overview: 23+ messages
2011-12-23  6:00 SFQ on HFSC leaf does not seem to work John A. Sullivan III
2011-12-23  6:32 ` Dave Taht
2011-12-23  6:40   ` John A. Sullivan III
2011-12-23  7:08     ` John A. Sullivan III
2011-12-23  8:10 ` Eric Dumazet
2011-12-23 13:13   ` John A. Sullivan III
2011-12-23 13:45     ` Eric Dumazet
2011-12-23 14:00       ` Eric Dumazet
2011-12-23 14:38         ` John A. Sullivan III
2011-12-23 14:59           ` Eric Dumazet
2011-12-23 15:26             ` John A. Sullivan III
2011-12-23 16:16               ` Eric Dumazet
2011-12-23 16:44                 ` John A. Sullivan III
2011-12-23 17:06                   ` Eric Dumazet
2011-12-23 17:17                     ` Eric Dumazet
2011-12-23 17:33                       ` John A. Sullivan III
2011-12-23 17:35                         ` John A. Sullivan III
2011-12-23 21:10                           ` John A. Sullivan III
2011-12-23 22:24                             ` Eric Dumazet
2011-12-23 17:20                     ` John A. Sullivan III
2011-12-31 22:17               ` John A. Sullivan III
2011-12-23 15:19         ` [PATCH net-next] sch_hfsc: report backlog information Eric Dumazet
2011-12-23 21:52           ` David Miller
