netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* SFQ: Reordering?
@ 2005-05-06 21:53 Asim Shankar
  2005-05-06 22:07 ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: Asim Shankar @ 2005-05-06 21:53 UTC (permalink / raw)
  To: netdev

Hi,

I was looking through sch_sfq.c. From what I could make out, if the
perturbation period is non-zero (say Xseconds), then ever X seconds,
sfq_perturbation() is invoked. This changes the perturbation value
that will be used by the hash function, however, packets already
existing in the queue aren't rehashed.

As a result, new packets being enqueued will have a different hash
value and thus packet re-ordering will take place. I ran a quick test
using netperf and tcpdump and seem to notice this re-ordering.

Should complete rehashing take place in sfq_perturbation(), or am I
missing something? (I was looking at 2.6.9 and also took a cursory
glance at 2.6.11 on lxr.linux.no)

Thanks,

-- Asim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-06 21:53 SFQ: Reordering? Asim Shankar
@ 2005-05-06 22:07 ` Patrick McHardy
  2005-05-06 22:46   ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2005-05-06 22:07 UTC (permalink / raw)
  To: Asim Shankar; +Cc: netdev

Asim Shankar wrote:
> Hi,
> 
> I was looking through sch_sfq.c. From what I could make out, if the
> perturbation period is non-zero (say Xseconds), then ever X seconds,
> sfq_perturbation() is invoked. This changes the perturbation value
> that will be used by the hash function, however, packets already
> existing in the queue aren't rehashed.
> 
> As a result, new packets being enqueued will have a different hash
> value and thus packet re-ordering will take place. I ran a quick test
> using netperf and tcpdump and seem to notice this re-ordering.
> 
> Should complete rehashing take place in sfq_perturbation(), or am I
> missing something? (I was looking at 2.6.9 and also took a cursory
> glance at 2.6.11 on lxr.linux.no)

I think we should rehash. Can you send a patch?

Regards
Patrick

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-06 22:07 ` Patrick McHardy
@ 2005-05-06 22:46   ` Patrick McHardy
  2005-05-06 23:02     ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2005-05-06 22:46 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Asim Shankar, netdev

Patrick McHardy wrote:
> Asim Shankar wrote:
> 
>>Should complete rehashing take place in sfq_perturbation(), or am I
>>missing something? (I was looking at 2.6.9 and also took a cursory
>>glance at 2.6.11 on lxr.linux.no)
> 
> 
> I think we should rehash. Can you send a patch?

Hmm wait, this is not so easy. We can't rehash by going over
the buckets one by one. If we do so and we have a new clash of
two flows previously contained in different buckets the packets
will afterwards be sorted by flow in their new bucket. To retain
fairness we need to iterate over all buckets containing packets
and rehash them one packet per a bucket at a time. But this means
we need lots of temporary storage to store the queues while
rehashing. Can anyone thing of a better solution?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-06 22:46   ` Patrick McHardy
@ 2005-05-06 23:02     ` Thomas Graf
  2005-05-06 23:19       ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Graf @ 2005-05-06 23:02 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Asim Shankar, netdev

* Patrick McHardy <427BF3C4.1030105@trash.net> 2005-05-07 00:46
> Hmm wait, this is not so easy. We can't rehash by going over
> the buckets one by one. If we do so and we have a new clash of
> two flows previously contained in different buckets the packets
> will afterwards be sorted by flow in their new bucket. To retain
> fairness we need to iterate over all buckets containing packets
> and rehash them one packet per a bucket at a time. But this means
> we need lots of temporary storage to store the queues while
> rehashing. Can anyone thing of a better solution?

We can maintain a second hash table and switch a pointer over to the
new table but keep on dequeueing from the old one until it is empty.
Anyways, any such behaviour should be made optional via a rtnetlink
flag.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-06 23:02     ` Thomas Graf
@ 2005-05-06 23:19       ` Patrick McHardy
  2005-05-07  0:58         ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2005-05-06 23:19 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Asim Shankar, netdev

Thomas Graf wrote:
> We can maintain a second hash table and switch a pointer over to the
> new table but keep on dequeueing from the old one until it is empty.
> Anyways, any such behaviour should be made optional via a rtnetlink
> flag.

This also introduces unfairness. Packets of a flow could be only in
the new table while we're still working on the active table.
A proper solution to avoid reordering shouldn't be optional IMO,
perturbation is already optional.

Regards
Patrick

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-06 23:19       ` Patrick McHardy
@ 2005-05-07  0:58         ` Thomas Graf
  2005-05-07  1:28           ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Graf @ 2005-05-07  0:58 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Asim Shankar, netdev

* Patrick McHardy <427BFB72.7080407@trash.net> 2005-05-07 01:19
> Thomas Graf wrote:
> > We can maintain a second hash table and switch a pointer over to the
> > new table but keep on dequeueing from the old one until it is empty.
> > Anyways, any such behaviour should be made optional via a rtnetlink
> > flag.
> 
> This also introduces unfairness. Packets of a flow could be only in
> the new table while we're still working on the active table.
> A proper solution to avoid reordering shouldn't be optional IMO,
> perturbation is already optional.

Yes, that's true but we can't reach perfect fairness anyway. What
is your primary goal? No reordering inside flows while staying
as fair as possible? I'm not sure whether it's worth to rehash
every single enqueued packet, especially not at a perturbation
interval of just a few seconds. Many scripts set it to a value
of 1..5 to have a higher theoreticaly fairness. That's why I think
this feature should be made optional

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-07  0:58         ` Thomas Graf
@ 2005-05-07  1:28           ` Patrick McHardy
  2005-05-08 11:51             ` Thomas Graf
  2005-05-09 23:14             ` Andy Furniss
  0 siblings, 2 replies; 11+ messages in thread
From: Patrick McHardy @ 2005-05-07  1:28 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Asim Shankar, netdev

Thomas Graf wrote:
> * Patrick McHardy <427BFB72.7080407@trash.net> 2005-05-07 01:19
> 
>>This also introduces unfairness. Packets of a flow could be only in
>>the new table while we're still working on the active table.
>>A proper solution to avoid reordering shouldn't be optional IMO,
>>perturbation is already optional.
> 
> 
> Yes, that's true but we can't reach perfect fairness anyway. What
> is your primary goal? No reordering inside flows while staying
> as fair as possible? I'm not sure whether it's worth to rehash
> every single enqueued packet, especially not at a perturbation
> interval of just a few seconds. Many scripts set it to a value
> of 1..5 to have a higher theoreticaly fairness. That's why I think
> this feature should be made optional

You stated my goal precisely :) I know many people set the interval
to too low values, but because of the tight limits, it shouldn't be
very expensive anyways. Table switching OTOH would introduce frequently
occuring unfairness, and the time to work through a full table is
a lot longer, especially in the environments where SFQ is used.
It would be interesting to see this in real-life, when a single flow
is hashed to multiple buckets (it can be even more than two) and
each bucket has some packets queued, the result should look pretty
chaotic.

Regards
Patrick

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-07  1:28           ` Patrick McHardy
@ 2005-05-08 11:51             ` Thomas Graf
  2005-05-08 16:03               ` Patrick McHardy
  2005-05-09 23:14             ` Andy Furniss
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Graf @ 2005-05-08 11:51 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Asim Shankar, netdev

* Patrick McHardy <427C19C3.5030304@trash.net> 2005-05-07 03:28
> You stated my goal precisely :) I know many people set the interval
> to too low values, but because of the tight limits, it shouldn't be
> very expensive anyways. Table switching OTOH would introduce frequently
> occuring unfairness, and the time to work through a full table is
> a lot longer, especially in the environments where SFQ is used.

I think you are right on this so forget about my thought. Maybe as
an additional input it might be worth mentioning that I have patches
ready to a) make the sfq depth adjustable and b) hash algorithm
selection to a few builtin ones and additional to read theh hash
from tc_classid set by an action. A real world example where this
is useful is for routers primarly doing SNAT without any own local
traffic where hashing algorithm primarly based on the source port
makes a lot more sense.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-08 11:51             ` Thomas Graf
@ 2005-05-08 16:03               ` Patrick McHardy
  2005-05-08 18:33                 ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2005-05-08 16:03 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Asim Shankar, netdev

Thomas Graf wrote:
> * Patrick McHardy <427C19C3.5030304@trash.net> 2005-05-07 03:28
> 
>>You stated my goal precisely :) I know many people set the interval
>>to too low values, but because of the tight limits, it shouldn't be
>>very expensive anyways. Table switching OTOH would introduce frequently
>>occuring unfairness, and the time to work through a full table is
>>a lot longer, especially in the environments where SFQ is used.
> 
> I think you are right on this so forget about my thought.

Actually I was beginning to think you're right about having this
feature optional. Paul McKenney's paper on SFQ states multiple times
that perturbation can cause reordering if implemented the easy way,
the Debian sfq manpage mentions this as well. So it appears to be a
design-choice. Anyways, I suggest to make the decision when we know
what the costs are.

BTW, this is the URL for the paper:
http://rdrop.com/users/paulmck/paper/sfq.2002.06.04.pdf

> Maybe as
> an additional input it might be worth mentioning that I have patches
> ready to a) make the sfq depth adjustable and b) hash algorithm
> selection to a few builtin ones and additional to read theh hash
> from tc_classid set by an action. A real world example where this
> is useful is for routers primarly doing SNAT without any own local
> traffic where hashing algorithm primarly based on the source port
> makes a lot more sense.

I agree both make sense. Are you talking about run-time adjustable
or compile-time adjustable? For SNAT it would also be nice to be
able to use the original address, unfortunately this isn't possible
anymore since we now drop the conntrack reference early.

Regards
Patrick

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-08 16:03               ` Patrick McHardy
@ 2005-05-08 18:33                 ` Thomas Graf
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Graf @ 2005-05-08 18:33 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Asim Shankar, netdev

* Patrick McHardy <427E3847.3050709@trash.net> 2005-05-08 18:03
> Actually I was beginning to think you're right about having this
> feature optional. Paul McKenney's paper on SFQ states multiple times
> that perturbation can cause reordering if implemented the easy way,
> the Debian sfq manpage mentions this as well. So it appears to be a
> design-choice. Anyways, I suggest to make the decision when we know
> what the costs are.
> 
> BTW, this is the URL for the paper:
> http://rdrop.com/users/paulmck/paper/sfq.2002.06.04.pdf

Indeed, well I think it depends on the perturbation interval,
complexity and cost of the hashing function and the average
queue length so having a flag to toggle this feature is
certainly not a bad idea. Assuming a considerable queue length
and a perturbation level of 1-5 seconds it probably doesn't
make sense to rehash.

> I agree both make sense. Are you talking about run-time adjustable
> or compile-time adjustable? For SNAT it would also be nice to be
> able to use the original address, unfortunately this isn't possible
> anymore since we now drop the conntrack reference early.

We can compute the hash at ingress or just before we drop the
reference, store it in tc_classid and then use it at the egress
sfq. Needs some more thinking but should be doable.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: SFQ: Reordering?
  2005-05-07  1:28           ` Patrick McHardy
  2005-05-08 11:51             ` Thomas Graf
@ 2005-05-09 23:14             ` Andy Furniss
  1 sibling, 0 replies; 11+ messages in thread
From: Andy Furniss @ 2005-05-09 23:14 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Thomas Graf, Asim Shankar, netdev

Patrick McHardy wrote:

> It would be interesting to see this in real-life, when a single flow
> is hashed to multiple buckets (it can be even more than two) and
> each bucket has some packets queued, the result should look pretty
> chaotic.

It certainly hurts latency if used for ingress shaping, perturb nearly 
always makes the sender backoff and then burst.

Andy.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-05-09 23:14 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-06 21:53 SFQ: Reordering? Asim Shankar
2005-05-06 22:07 ` Patrick McHardy
2005-05-06 22:46   ` Patrick McHardy
2005-05-06 23:02     ` Thomas Graf
2005-05-06 23:19       ` Patrick McHardy
2005-05-07  0:58         ` Thomas Graf
2005-05-07  1:28           ` Patrick McHardy
2005-05-08 11:51             ` Thomas Graf
2005-05-08 16:03               ` Patrick McHardy
2005-05-08 18:33                 ` Thomas Graf
2005-05-09 23:14             ` Andy Furniss

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).