* [PATCH 0/6] Add qdisc->ops->peek() support.
@ 2008-10-16 9:46 Jarek Poplawski
2008-10-16 12:38 ` Patrick McHardy
0 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2008-10-16 9:46 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu
[Subject: Re: [PATCH 00/14]: Killing qdisc->ops->requeue().]
On Tue, Oct 14, 2008 at 10:44:13PM +0200, Patrick McHardy wrote:
...
> I think we really need a peek operation since most qdiscs do have
> some internal priorization. The question is whether all qdiscs need
> it; I tend to think no. Plugging two non-work-conserving qdiscs
> together doesn't make much sense, so we could just prevent this.
> This means we would only need to add ->peek support to the
> work-conserving qdiscs, which looks pretty easy in all cases.
> We actually don't even have to prevent plugging two non-work-conserving
> qdiscs, the ones that need the peek operation could just check whether
> the inner qdisc supports it.
>
> Just as a demonstration how easy adding a peek operation to the
> work-conserving qdiscs actually is. It doesn't need to keep or change
> any internal state in many cases thanks to the guarantee that the
> packet will either be dequeued or, if another packet arrives, the
> upper qdisc will immediately ->peek again to reevaluate the state.
So here is my try (I hope I didn't miss Patrick's intentions to much).
PATCH 6/6 adds my simple idea not to prevent even nonsense configs.
It is optional.
Thanks,
Jarek P.
PS: after this patchset only netem_enqueue() needs qdisc->requeue(),
but I hope this won't take too long.
include/net/sch_generic.h | 29 +++++++++++++++++++++++++++++
net/sched/sch_api.c | 6 ++++++
net/sched/sch_atm.c | 23 ++++++++++++++++++-----
net/sched/sch_blackhole.c | 1 +
net/sched/sch_cbq.c | 1 +
net/sched/sch_dsmark.c | 10 ++++++++++
net/sched/sch_fifo.c | 2 ++
net/sched/sch_generic.c | 16 ++++++++++++++++
net/sched/sch_gred.c | 1 +
net/sched/sch_hfsc.c | 15 ++++-----------
net/sched/sch_htb.c | 1 +
net/sched/sch_multiq.c | 29 +++++++++++++++++++++++++++++
net/sched/sch_netem.c | 15 +++++++--------
net/sched/sch_prio.c | 14 ++++++++++++++
net/sched/sch_red.c | 9 +++++++++
net/sched/sch_sfq.c | 12 ++++++++++++
net/sched/sch_tbf.c | 13 ++++++-------
net/sched/sch_teql.c | 9 +++++++++
18 files changed, 175 insertions(+), 31 deletions(-)
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-16 9:46 [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski @ 2008-10-16 12:38 ` Patrick McHardy 2008-10-16 13:08 ` Jarek Poplawski 2008-10-16 22:09 ` Jarek Poplawski 0 siblings, 2 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-16 12:38 UTC (permalink / raw) To: Jarek Poplawski; +Cc: David Miller, netdev, Herbert Xu Jarek Poplawski wrote: > So here is my try (I hope I didn't miss Patrick's intentions to much). > > PATCH 6/6 adds my simple idea not to prevent even nonsense configs. > It is optional. These patches look good to me. About 6/6, I'm still thinking we might want to prevent certain kinds of nonsensical configurations for simplicity. Its a bit similar to the multiq stuff, a lot of these complications came from the fact that non-work-conserving qdiscs, which require a global view, are treated similar to work- conserving ones. I don't have a good scheme worked out though, so for now using your patch seems OK too. > PS: after this patchset only netem_enqueue() needs qdisc->requeue(), > but I hope this won't take too long. Assuming work-conserving qdiscs are used with netem, the currently code will always send out a reorder packet immediately. This behaviour is trivial to implement without ->requeue. The problematic case is non-work-conserving inner qdiscs, but that doesn't seem important at all since you'd usually add it as parent of netem, which still works. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-16 12:38 ` Patrick McHardy @ 2008-10-16 13:08 ` Jarek Poplawski 2008-10-16 22:09 ` Jarek Poplawski 1 sibling, 0 replies; 21+ messages in thread From: Jarek Poplawski @ 2008-10-16 13:08 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu On Thu, Oct 16, 2008 at 02:38:31PM +0200, Patrick McHardy wrote: > Jarek Poplawski wrote: >> So here is my try (I hope I didn't miss Patrick's intentions to much). >> >> PATCH 6/6 adds my simple idea not to prevent even nonsense configs. >> It is optional. > > These patches look good to me. About 6/6, I'm still thinking we > might want to prevent certain kinds of nonsensical configurations > for simplicity. Its a bit similar to the multiq stuff, a lot of > these complications came from the fact that non-work-conserving > qdiscs, which require a global view, are treated similar to work- > conserving ones. I don't have a good scheme worked out though, > so for now using your patch seems OK too. > >> PS: after this patchset only netem_enqueue() needs qdisc->requeue(), >> but I hope this won't take too long. > > Assuming work-conserving qdiscs are used with netem, the currently > code will always send out a reorder packet immediately. This behaviour > is trivial to implement without ->requeue. The problematic case is > non-work-conserving inner qdiscs, but that doesn't seem important > at all since you'd usually add it as parent of netem, which still > works. Patrick, thanks for the feedback so far. I have to have some break now so I'll try to respond or redo this, if needed, in the evening. Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-16 12:38 ` Patrick McHardy 2008-10-16 13:08 ` Jarek Poplawski @ 2008-10-16 22:09 ` Jarek Poplawski 2008-10-17 12:33 ` Patrick McHardy 1 sibling, 1 reply; 21+ messages in thread From: Jarek Poplawski @ 2008-10-16 22:09 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu Patrick McHardy wrote, On 10/16/2008 02:38 PM: > Jarek Poplawski wrote: ... >> PS: after this patchset only netem_enqueue() needs qdisc->requeue(), >> but I hope this won't take too long. > > Assuming work-conserving qdiscs are used with netem, the currently > code will always send out a reorder packet immediately. This behaviour > is trivial to implement without ->requeue. The problematic case is > non-work-conserving inner qdiscs, but that doesn't seem important > at all since you'd usually add it as parent of netem, which still > works. How about something like this (example only)? Jarek P. --- include/net/sch_generic.h | 1 + net/sched/sch_netem.c | 7 ++++++- 2 files changed, 7 insertions(+), 1 deletions(-) diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 9dcb5bf..9157766 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -45,6 +45,7 @@ struct Qdisc #define TCQ_F_BUILTIN 1 #define TCQ_F_THROTTLED 2 #define TCQ_F_INGRESS 4 +#define TCQ_F_REQUEUE 8 int padded; struct Qdisc_ops *ops; struct qdisc_size_table *stab; diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 3080bd6..6ac8efc 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -233,7 +233,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) */ cb->time_to_send = psched_get_time(); q->counter = 0; - ret = q->qdisc->ops->requeue(skb, q->qdisc); + q->qdisc->flags |= TCQ_F_REQUEUE; + ret = qdisc_equeue(skb, q->qdisc); + q->qdisc->flags &= ~TCQ_F_REQUEUE; } if (likely(ret == NET_XMIT_SUCCESS)) { @@ -478,6 +480,9 @@ static int tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch) psched_time_t tnext = netem_skb_cb(nskb)->time_to_send; struct sk_buff *skb; + if (unlikely(sch->flags & TCQ_F_REQUEUE)) + return qdisc_requeue(nskb, sch); + if (likely(skb_queue_len(list) < q->limit)) { /* Optimize for add at tail */ if (likely(skb_queue_empty(list) || tnext >= q->oldest)) { ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-16 22:09 ` Jarek Poplawski @ 2008-10-17 12:33 ` Patrick McHardy 2008-10-17 13:03 ` Jarek Poplawski 0 siblings, 1 reply; 21+ messages in thread From: Patrick McHardy @ 2008-10-17 12:33 UTC (permalink / raw) To: Jarek Poplawski; +Cc: David Miller, netdev, Herbert Xu Jarek Poplawski wrote: > Patrick McHardy wrote, On 10/16/2008 02:38 PM: > >> Jarek Poplawski wrote: > ... >>> PS: after this patchset only netem_enqueue() needs qdisc->requeue(), >>> but I hope this won't take too long. >> Assuming work-conserving qdiscs are used with netem, the currently >> code will always send out a reorder packet immediately. This behaviour >> is trivial to implement without ->requeue. The problematic case is >> non-work-conserving inner qdiscs, but that doesn't seem important >> at all since you'd usually add it as parent of netem, which still >> works. > > How about something like this (example only)? > @@ -233,7 +233,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) > */ > cb->time_to_send = psched_get_time(); > q->counter = 0; > - ret = q->qdisc->ops->requeue(skb, q->qdisc); > + q->qdisc->flags |= TCQ_F_REQUEUE; > + ret = qdisc_equeue(skb, q->qdisc); > + q->qdisc->flags &= ~TCQ_F_REQUEUE; Well, the inner qdisc would still need to logic to order packets apprioriately. Its probably not that hard, but as I said, I don't think its necessary at all. It only makes a difference with a non-work-conserving inner qdisc, but a lot of the functionality of netem requires the inner tfifo anyways and rate-limiting is usually done on top of netem. So I would suggest so either hard-wire the tfifo qdisc or at least make the assumption that inner qdiscs are work-conserving. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-17 12:33 ` Patrick McHardy @ 2008-10-17 13:03 ` Jarek Poplawski 2008-10-17 14:12 ` Patrick McHardy 0 siblings, 1 reply; 21+ messages in thread From: Jarek Poplawski @ 2008-10-17 13:03 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu On Fri, Oct 17, 2008 at 02:33:23PM +0200, Patrick McHardy wrote: > Jarek Poplawski wrote: >> Patrick McHardy wrote, On 10/16/2008 02:38 PM: >> >>> Jarek Poplawski wrote: >> ... >>>> PS: after this patchset only netem_enqueue() needs qdisc->requeue(), >>>> but I hope this won't take too long. >>> Assuming work-conserving qdiscs are used with netem, the currently >>> code will always send out a reorder packet immediately. This behaviour >>> is trivial to implement without ->requeue. The problematic case is >>> non-work-conserving inner qdiscs, but that doesn't seem important >>> at all since you'd usually add it as parent of netem, which still >>> works. >> >> How about something like this (example only)? >> @@ -233,7 +233,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) >> */ >> cb->time_to_send = psched_get_time(); >> q->counter = 0; >> - ret = q->qdisc->ops->requeue(skb, q->qdisc); >> + q->qdisc->flags |= TCQ_F_REQUEUE; >> + ret = qdisc_equeue(skb, q->qdisc); >> + q->qdisc->flags &= ~TCQ_F_REQUEUE; > > Well, the inner qdisc would still need to logic to order packets > apprioriately. I'm not sure I was understood: the idea is to do something like in this example in tfifo_enqueue() in all leaf qdiscs like fifo etc. too, so to redirect their ->enqueue() to their ->requeue() which usually is qdisc_requeue() (or to it directly if needed). > Its probably not that hard, but as I said, I don't > think its necessary at all. It only makes a difference with a > non-work-conserving inner qdisc, but a lot of the functionality of > netem requires the inner tfifo anyways and rate-limiting is usually > done on top of netem. So I would suggest so either hard-wire the > tfifo qdisc or at least make the assumption that inner qdiscs are > work-conserving. Of course, I can do it like this, but wouldn't it break backward compatibility for some users? Thanks, Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-17 13:03 ` Jarek Poplawski @ 2008-10-17 14:12 ` Patrick McHardy 2008-10-17 20:12 ` [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc Jarek Poplawski 2008-10-17 20:45 ` [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski 0 siblings, 2 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-17 14:12 UTC (permalink / raw) To: Jarek Poplawski; +Cc: David Miller, netdev, Herbert Xu Jarek Poplawski wrote: > On Fri, Oct 17, 2008 at 02:33:23PM +0200, Patrick McHardy wrote: >>> @@ -233,7 +233,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) >>> */ >>> cb->time_to_send = psched_get_time(); >>> q->counter = 0; >>> - ret = q->qdisc->ops->requeue(skb, q->qdisc); >>> + q->qdisc->flags |= TCQ_F_REQUEUE; >>> + ret = qdisc_equeue(skb, q->qdisc); >>> + q->qdisc->flags &= ~TCQ_F_REQUEUE; >> Well, the inner qdisc would still need to logic to order packets >> apprioriately. > > I'm not sure I was understood: the idea is to do something like > in this example in tfifo_enqueue() in all leaf qdiscs like fifo > etc. too, so to redirect their ->enqueue() to their ->requeue() > which usually is qdisc_requeue() (or to it directly if needed). Yes, I misunderstood this, I though the intention was to get rid of requeue entirely. >> Its probably not that hard, but as I said, I don't >> think its necessary at all. It only makes a difference with a >> non-work-conserving inner qdisc, but a lot of the functionality of >> netem requires the inner tfifo anyways and rate-limiting is usually >> done on top of netem. So I would suggest so either hard-wire the >> tfifo qdisc or at least make the assumption that inner qdiscs are >> work-conserving. > > Of course, I can do it like this, but wouldn't it break backward > compatibility for some users? Some general thoughts ... We've never had any systematic checks for useful and non-useful combination of qdiscs, which is causing a lot of these complications. Think of all the multiq work that was required to make it work properly with non-work-conserving qdiscs - while at the same time, using a non-work-conserving qdisc (which require a global view) defeats basically all of the benefits. So it would be really useful to come up with a systematic definition of valid combinations instead of trying handling lots of purely theoretical case that don't make sense. One more example - all the qdiscs implement ->drop(), yet its only needed by CBQ and it doesn't make any sense at all to use lets say HFSC as child of CBQ. About this specific case - yes, it would break compatibility for users using f.i. TBF as child of netem. But if you look at the netem_enqueue() function, it in fact assumes that the inner qdisc is a tfifo, so we'd be breaking an already broken case. We can of course be nice and warn about it for a few releases, but I believe there is some real potential for simplification that makes it worth it. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-17 14:12 ` Patrick McHardy @ 2008-10-17 20:12 ` Jarek Poplawski 2008-10-21 23:36 ` David Miller 2008-10-17 20:45 ` [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski 1 sibling, 1 reply; 21+ messages in thread From: Jarek Poplawski @ 2008-10-17 20:12 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu, Stephen Hemminger On Fri, Oct 17, 2008 at 04:12:03PM +0200, Patrick McHardy wrote: > Jarek Poplawski wrote: >> On Fri, Oct 17, 2008 at 02:33:23PM +0200, Patrick McHardy wrote: >>>> @@ -233,7 +233,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) >>>> */ >>>> cb->time_to_send = psched_get_time(); >>>> q->counter = 0; >>>> - ret = q->qdisc->ops->requeue(skb, q->qdisc); >>>> + q->qdisc->flags |= TCQ_F_REQUEUE; >>>> + ret = qdisc_equeue(skb, q->qdisc); >>>> + q->qdisc->flags &= ~TCQ_F_REQUEUE; >>> Well, the inner qdisc would still need to logic to order packets >>> apprioriately. >> >> I'm not sure I was understood: the idea is to do something like >> in this example in tfifo_enqueue() in all leaf qdiscs like fifo >> etc. too, so to redirect their ->enqueue() to their ->requeue() >> which usually is qdisc_requeue() (or to it directly if needed). > > Yes, I misunderstood this, I though the intention was to get > rid of requeue entirely. I was less ambitious and thought about simplifying this at least, but if you think we can go further, it's OK with me. Then we can do it only in tfifo. If qdisc_requeue() does the proper logic for it now, I guess it should be enough to open code this into tfifo_enqueue() (so we could kill qdisc_requeue() later). Using this TCQ_F_REQUEUE flag only for this looks a bit wasteful, but I can't see anything smarter at the moment. >>> Its probably not that hard, but as I said, I don't >>> think its necessary at all. It only makes a difference with a >>> non-work-conserving inner qdisc, but a lot of the functionality of >>> netem requires the inner tfifo anyways and rate-limiting is usually >>> done on top of netem. So I would suggest so either hard-wire the >>> tfifo qdisc or at least make the assumption that inner qdiscs are >>> work-conserving. I think Stephen could be interested with this change so I added him to Cc. Thanks, Jarek P. -------------------> pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. After introducing qdisc->ops->peek() method the only remaining user of qdisc->ops->requeue() is netem_enqueue() using this for packet re-ordering. According to Patrick McHardy: "a lot of the functionality of netem requires the inner tfifo anyways and rate-limiting is usually done on top of netem. So I would suggest so either hard-wire the tfifo qdisc or at least make the assumption that inner qdiscs are work- conserving." This patch tries the former. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> --- include/net/sch_generic.h | 1 + net/sched/sch_netem.c | 18 +++++++++++++++++- 2 files changed, 18 insertions(+), 1 deletions(-) diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 9dcb5bf..9157766 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -45,6 +45,7 @@ struct Qdisc #define TCQ_F_BUILTIN 1 #define TCQ_F_THROTTLED 2 #define TCQ_F_INGRESS 4 +#define TCQ_F_REQUEUE 8 int padded; struct Qdisc_ops *ops; struct qdisc_size_table *stab; diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 3080bd6..a30f5b6 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -233,7 +233,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) */ cb->time_to_send = psched_get_time(); q->counter = 0; - ret = q->qdisc->ops->requeue(skb, q->qdisc); + q->qdisc->flags |= TCQ_F_REQUEUE; + ret = qdisc_enqueue(skb, q->qdisc); + if (unlikely(q->qdisc->flags & TCQ_F_REQUEUE)) { + q->qdisc->flags &= ~TCQ_F_REQUEUE; + if (net_ratelimit()) + printk(KERN_WARNING "netem_enqueue: re-ordering" + " unsupported; use default (tfifo) qdisc."); + } } if (likely(ret == NET_XMIT_SUCCESS)) { @@ -478,6 +485,15 @@ static int tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch) psched_time_t tnext = netem_skb_cb(nskb)->time_to_send; struct sk_buff *skb; + if (sch->flags & TCQ_F_REQUEUE) { + sch->flags &= ~TCQ_F_REQUEUE; + __skb_queue_head(list, nskb); + sch->qstats.backlog += qdisc_pkt_len(nskb); + sch->qstats.requeues++; + + return NET_XMIT_SUCCESS; + } + if (likely(skb_queue_len(list) < q->limit)) { /* Optimize for add at tail */ if (likely(skb_queue_empty(list) || tnext >= q->oldest)) { ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-17 20:12 ` [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc Jarek Poplawski @ 2008-10-21 23:36 ` David Miller 2008-10-21 23:51 ` Stephen Hemminger ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: David Miller @ 2008-10-21 23:36 UTC (permalink / raw) To: jarkao2; +Cc: kaber, netdev, herbert, shemminger From: Jarek Poplawski <jarkao2@gmail.com> Date: Fri, 17 Oct 2008 22:12:10 +0200 > pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. > > After introducing qdisc->ops->peek() method the only remaining user of > qdisc->ops->requeue() is netem_enqueue() using this for packet > re-ordering. According to Patrick McHardy: "a lot of the functionality > of netem requires the inner tfifo anyways and rate-limiting is usually > done on top of netem. So I would suggest so either hard-wire the tfifo > qdisc or at least make the assumption that inner qdiscs are work- > conserving." This patch tries the former. > > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> This is an interesting patch. But the thing that strikes me is this: Why don't we just let sch_netem do the reordering inside of itself entirely and just get rid of all of this ->requeue() business? sch_netem is just a black box, like any other packet scheduler node in the tree, and so it can internally do the reordering with a self managed packet list or similar. All of this can be hidden inside of it's ->dequeue() with some pkt_sch watchdog timer that fires to prevent stale packets sitting in the reorder queue forever. Anyways, just and idea and RFC, just like this patch ;-) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-21 23:36 ` David Miller @ 2008-10-21 23:51 ` Stephen Hemminger 2008-10-22 5:37 ` Jarek Poplawski 2008-10-22 15:57 ` [PATCH] " Patrick McHardy 2008-10-22 16:00 ` Patrick McHardy 2 siblings, 1 reply; 21+ messages in thread From: Stephen Hemminger @ 2008-10-21 23:51 UTC (permalink / raw) To: David Miller; +Cc: jarkao2, kaber, netdev, herbert On Tue, 21 Oct 2008 16:36:05 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Jarek Poplawski <jarkao2@gmail.com> > Date: Fri, 17 Oct 2008 22:12:10 +0200 > > > pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. > > > > After introducing qdisc->ops->peek() method the only remaining user of > > qdisc->ops->requeue() is netem_enqueue() using this for packet > > re-ordering. According to Patrick McHardy: "a lot of the functionality > > of netem requires the inner tfifo anyways and rate-limiting is usually > > done on top of netem. So I would suggest so either hard-wire the tfifo > > qdisc or at least make the assumption that inner qdiscs are work- > > conserving." This patch tries the former. > > > > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > This is an interesting patch. > > But the thing that strikes me is this: Why don't we just let sch_netem do > the reordering inside of itself entirely and just get rid of all of this > ->requeue() business? > > sch_netem is just a black box, like any other packet scheduler node in > the tree, and so it can internally do the reordering with a self managed > packet list or similar. All of this can be hidden inside of it's ->dequeue() > with some pkt_sch watchdog timer that fires to prevent stale packets sitting > in the reorder queue forever. > > Anyways, just and idea and RFC, just like this patch ;-) The problem is that jamal talked me into having netem as a classful qdisc, instead of doing its own rate control. People like to do use TBF as inner qdisc, and do reordering. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-21 23:51 ` Stephen Hemminger @ 2008-10-22 5:37 ` Jarek Poplawski 2008-10-22 16:00 ` Patrick McHardy 0 siblings, 1 reply; 21+ messages in thread From: Jarek Poplawski @ 2008-10-22 5:37 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David Miller, kaber, netdev, herbert On Tue, Oct 21, 2008 at 04:51:29PM -0700, Stephen Hemminger wrote: > On Tue, 21 Oct 2008 16:36:05 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > > > From: Jarek Poplawski <jarkao2@gmail.com> > > Date: Fri, 17 Oct 2008 22:12:10 +0200 > > > > > pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. > > > > > > After introducing qdisc->ops->peek() method the only remaining user of > > > qdisc->ops->requeue() is netem_enqueue() using this for packet > > > re-ordering. According to Patrick McHardy: "a lot of the functionality > > > of netem requires the inner tfifo anyways and rate-limiting is usually > > > done on top of netem. So I would suggest so either hard-wire the tfifo > > > qdisc or at least make the assumption that inner qdiscs are work- > > > conserving." This patch tries the former. > > > > > > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > > > This is an interesting patch. > > > > But the thing that strikes me is this: Why don't we just let sch_netem do > > the reordering inside of itself entirely and just get rid of all of this > > ->requeue() business? We just let for this here with sch_netem's default tfifo qdisc. > > > > sch_netem is just a black box, like any other packet scheduler node in > > the tree, and so it can internally do the reordering with a self managed > > packet list or similar. All of this can be hidden inside of it's ->dequeue() > > with some pkt_sch watchdog timer that fires to prevent stale packets sitting > > in the reorder queue forever. > > > > Anyways, just and idea and RFC, just like this patch ;-) > > The problem is that jamal talked me into having netem as a classful qdisc, > instead of doing its own rate control. People like to do use TBF as inner qdisc, > and do reordering. If it's only this kind of usage we could export tfifo and let use this as a TBF's (etc.) leaf. Of course, this would require changes in those people scripts. Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-22 5:37 ` Jarek Poplawski @ 2008-10-22 16:00 ` Patrick McHardy 2008-10-22 16:49 ` Jarek Poplawski 0 siblings, 1 reply; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 16:00 UTC (permalink / raw) To: Jarek Poplawski; +Cc: Stephen Hemminger, David Miller, netdev, herbert Jarek Poplawski wrote: > On Tue, Oct 21, 2008 at 04:51:29PM -0700, Stephen Hemminger wrote: >> On Tue, 21 Oct 2008 16:36:05 -0700 (PDT) >> David Miller <davem@davemloft.net> wrote: >> >>> sch_netem is just a black box, like any other packet scheduler node in >>> the tree, and so it can internally do the reordering with a self managed >>> packet list or similar. All of this can be hidden inside of it's ->dequeue() >>> with some pkt_sch watchdog timer that fires to prevent stale packets sitting >>> in the reorder queue forever. >>> >>> Anyways, just and idea and RFC, just like this patch ;-) >> The problem is that jamal talked me into having netem as a classful qdisc, >> instead of doing its own rate control. People like to do use TBF as inner qdisc, >> and do reordering. > > If it's only this kind of usage we could export tfifo and let use this > as a TBF's (etc.) leaf. Of course, this would require changes in those > people scripts. In that case we might as well teach them to use TBF as *parent* of netem (and I'd vote to do that and kill requeue). But we can argue about this forever without any progress. The question is simple - should we enforce a reasonable qdisc structure and kill ->requeue or keep it around forever. Keep in mind that there is no loss of functionality by using TBF as parent and that we can do this gradually so users have a chance to fix their scripts, should anyone really use TBF as inner qdisc. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-22 16:00 ` Patrick McHardy @ 2008-10-22 16:49 ` Jarek Poplawski 2008-10-22 17:32 ` Patrick McHardy 0 siblings, 1 reply; 21+ messages in thread From: Jarek Poplawski @ 2008-10-22 16:49 UTC (permalink / raw) To: Patrick McHardy; +Cc: Stephen Hemminger, David Miller, netdev, herbert On Wed, Oct 22, 2008 at 06:00:56PM +0200, Patrick McHardy wrote: > Jarek Poplawski wrote: >> On Tue, Oct 21, 2008 at 04:51:29PM -0700, Stephen Hemminger wrote: >>> On Tue, 21 Oct 2008 16:36:05 -0700 (PDT) >>> David Miller <davem@davemloft.net> wrote: >>> >>>> sch_netem is just a black box, like any other packet scheduler node in >>>> the tree, and so it can internally do the reordering with a self managed >>>> packet list or similar. All of this can be hidden inside of it's ->dequeue() >>>> with some pkt_sch watchdog timer that fires to prevent stale packets sitting >>>> in the reorder queue forever. >>>> >>>> Anyways, just and idea and RFC, just like this patch ;-) >>> The problem is that jamal talked me into having netem as a classful qdisc, >>> instead of doing its own rate control. People like to do use TBF as inner qdisc, >>> and do reordering. >> >> If it's only this kind of usage we could export tfifo and let use this >> as a TBF's (etc.) leaf. Of course, this would require changes in those >> people scripts. > > In that case we might as well teach them to use TBF as *parent* > of netem (and I'd vote to do that and kill requeue). > > But we can argue about this forever without any progress. The > question is simple - should we enforce a reasonable qdisc structure > and kill ->requeue or keep it around forever. Keep in mind that there > is no loss of functionality by using TBF as parent and that we > can do this gradually so users have a chance to fix their scripts, > should anyone really use TBF as inner qdisc. > I'm not sure we think about the same: this tfifo idea doesn't need ->requeue() at all. This would go through TBF's or prio's (etc.) ->enqueue(), and only tfifo's ->enqueue(), if it's used as a leaf, checks the qdisc flag and can reorder. But if it's useless, no problem. I can redo this patch without this qdisc flag. Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-22 16:49 ` Jarek Poplawski @ 2008-10-22 17:32 ` Patrick McHardy 2008-10-22 17:53 ` [RFC] " Jarek Poplawski 0 siblings, 1 reply; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 17:32 UTC (permalink / raw) To: Jarek Poplawski; +Cc: Stephen Hemminger, David Miller, netdev, herbert Jarek Poplawski wrote: > On Wed, Oct 22, 2008 at 06:00:56PM +0200, Patrick McHardy wrote: >> Jarek Poplawski wrote: >>> If it's only this kind of usage we could export tfifo and let use this >>> as a TBF's (etc.) leaf. Of course, this would require changes in those >>> people scripts. >> In that case we might as well teach them to use TBF as *parent* >> of netem (and I'd vote to do that and kill requeue). >> >> But we can argue about this forever without any progress. The >> question is simple - should we enforce a reasonable qdisc structure >> and kill ->requeue or keep it around forever. Keep in mind that there >> is no loss of functionality by using TBF as parent and that we >> can do this gradually so users have a chance to fix their scripts, >> should anyone really use TBF as inner qdisc. >> > > I'm not sure we think about the same: this tfifo idea doesn't need > ->requeue() at all. This would go through TBF's or prio's (etc.) > ->enqueue(), and only tfifo's ->enqueue(), if it's used as a leaf, > checks the qdisc flag and can reorder. I see. So both ways would work fine to get rid of requeue. The flag doesn't seem to be necessary though since tfifo already does reordering based on time_to_send. > But if it's useless, no problem. I can redo this patch without this > qdisc flag. Well, you're doing the work, so you decide. I'm undecided myself, the main issues I see are: - we might have to reeducate users twice if we decide to enforce more structure later on - a lot of other qdiscs still don't work as inner qdiscs of netem: - any reordering qdisc can cause stalls since netem_dequeue expects to get packets ordered by time_to_send. This means we can't use cbq, hfsc, htb, prio, sfq, leaving atm, dsmark, netem, red, gred, tbf and the fifos. I guess we can strike atm as "makes no sense" and dsmark as "obsolete since 10(?) years". - netem can't be used as inner qdisc since it would corrupt the skb's cb So the usable inner qdiscs are: tbf, red, gred, *fifo. The fact that over 50% of the qdiscs you could use can cause misbehaviour and TBF, red and gred can be used as upper qdiscs without any loss of functionality makes me think netem simply shouldn't be classful. So actually I am decided :) I think netem shouldn't be classful. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-22 17:32 ` Patrick McHardy @ 2008-10-22 17:53 ` Jarek Poplawski 0 siblings, 0 replies; 21+ messages in thread From: Jarek Poplawski @ 2008-10-22 17:53 UTC (permalink / raw) To: Patrick McHardy; +Cc: Stephen Hemminger, David Miller, netdev, herbert On Wed, Oct 22, 2008 at 07:32:02PM +0200, Patrick McHardy wrote: ... > Well, you're doing the work, so you decide. I'm undecided myself, > the main issues I see are: Well, I've only tried to finish David's "Killing qdisc->ops->requeue()", which I had broken before. Then somebody framed me into this ->peek() /netem thing... ... > So actually I am decided :) I think netem shouldn't be classful. OK, I added RFC to the subject. Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-21 23:36 ` David Miller 2008-10-21 23:51 ` Stephen Hemminger @ 2008-10-22 15:57 ` Patrick McHardy 2008-10-22 16:00 ` Patrick McHardy 2 siblings, 0 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 15:57 UTC (permalink / raw) To: David Miller; +Cc: jarkao2, netdev, herbert, shemminger David Miller wrote: > From: Jarek Poplawski <jarkao2@gmail.com> > Date: Fri, 17 Oct 2008 22:12:10 +0200 > >> pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. >> >> After introducing qdisc->ops->peek() method the only remaining user of >> qdisc->ops->requeue() is netem_enqueue() using this for packet >> re-ordering. According to Patrick McHardy: "a lot of the functionality >> of netem requires the inner tfifo anyways and rate-limiting is usually >> done on top of netem. So I would suggest so either hard-wire the tfifo >> qdisc or at least make the assumption that inner qdiscs are work- >> conserving." This patch tries the former. >> >> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > This is an interesting patch. > > But the thing that strikes me is this: Why don't we just let sch_netem do > the reordering inside of itself entirely and just get rid of all of this > ->requeue() business? I fully agree, keeping all the ->requeue crap around just for this cornercase doesn't seem like a good decision. Most of the ->requeu > sch_netem is just a black box, like any other packet scheduler node in > the tree, and so it can internally do the reordering with a self managed > packet list or similar. All of this can be hidden inside of it's ->dequeue() > with some pkt_sch watchdog timer that fires to prevent stale packets sitting > in the reorder queue forever. > > Anyways, just and idea and RFC, just like this patch ;-) > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. 2008-10-21 23:36 ` David Miller 2008-10-21 23:51 ` Stephen Hemminger 2008-10-22 15:57 ` [PATCH] " Patrick McHardy @ 2008-10-22 16:00 ` Patrick McHardy 2 siblings, 0 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 16:00 UTC (permalink / raw) To: David Miller; +Cc: jarkao2, netdev, herbert, shemminger [Damn hotkeys for sending incomplete mails :)] David Miller wrote: > From: Jarek Poplawski <jarkao2@gmail.com> > Date: Fri, 17 Oct 2008 22:12:10 +0200 > >> pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc. >> >> After introducing qdisc->ops->peek() method the only remaining user of >> qdisc->ops->requeue() is netem_enqueue() using this for packet >> re-ordering. According to Patrick McHardy: "a lot of the functionality >> of netem requires the inner tfifo anyways and rate-limiting is usually >> done on top of netem. So I would suggest so either hard-wire the tfifo >> qdisc or at least make the assumption that inner qdiscs are work- >> conserving." This patch tries the former. >> >> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > This is an interesting patch. > > But the thing that strikes me is this: Why don't we just let sch_netem do > the reordering inside of itself entirely and just get rid of all of this > ->requeue() business? I fully agree, keeping all the ->requeue crap around just for this cornercase doesn't seem like a good decision. Most of the ->requeue functions are completely inconsistent in their behaviour anyways: some undo ->enqueue state changes (which is obviously broken for netem), some don't, some redo checks from ->enqueue (necessary for netem), some don't (TBF), ... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-17 14:12 ` Patrick McHardy 2008-10-17 20:12 ` [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc Jarek Poplawski @ 2008-10-17 20:45 ` Jarek Poplawski 2008-10-21 23:43 ` David Miller 2008-10-22 16:04 ` Patrick McHardy 1 sibling, 2 replies; 21+ messages in thread From: Jarek Poplawski @ 2008-10-17 20:45 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, netdev, Herbert Xu On Fri, Oct 17, 2008 at 04:12:03PM +0200, Patrick McHardy wrote: ... > Some general thoughts ... > > We've never had any systematic checks for useful and non-useful > combination of qdiscs, which is causing a lot of these complications. > Think of all the multiq work that was required to make it work > properly with non-work-conserving qdiscs - while at the same time, > using a non-work-conserving qdisc (which require a global view) > defeats basically all of the benefits. > > So it would be really useful to come up with a systematic definition > of valid combinations instead of trying handling lots of purely > theoretical case that don't make sense. One more example - all the > qdiscs implement ->drop(), yet its only needed by CBQ and it doesn't > make any sense at all to use lets say HFSC as child of CBQ. > > About this specific case - yes, it would break compatibility for > users using f.i. TBF as child of netem. But if you look at the > netem_enqueue() function, it in fact assumes that the inner qdisc > is a tfifo, so we'd be breaking an already broken case. We can > of course be nice and warn about it for a few releases, but I believe > there is some real potential for simplification that makes it > worth it. I'm not sure this is all right: at least until there is not too much problems with some code (like with requeuing). We probably should try before this official ways like feature-removal-schedule.txt, and/or maybe some CONFIG_XXX_DEPRECATED things. But I don't persist with this. BTW, I'm not sure if I'm expected to redo any patches in this thread. (Probably some things like this teql_peek() could be removed with ->requeue() killing.) Jarek P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-17 20:45 ` [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski @ 2008-10-21 23:43 ` David Miller 2008-10-22 16:01 ` Patrick McHardy 2008-10-22 16:04 ` Patrick McHardy 1 sibling, 1 reply; 21+ messages in thread From: David Miller @ 2008-10-21 23:43 UTC (permalink / raw) To: jarkao2; +Cc: kaber, netdev, herbert From: Jarek Poplawski <jarkao2@gmail.com> Date: Fri, 17 Oct 2008 22:45:43 +0200 > BTW, I'm not sure if I'm expected to redo any patches in this thread. > (Probably some things like this teql_peek() could be removed with > ->requeue() killing.) In any event, you should resubmit these when I open the net-next-2.6 tree up in about a week or so. It looks like Patrick is in agreement about these changes and they look mostly fine to me too, FWIW. Thanks. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-21 23:43 ` David Miller @ 2008-10-22 16:01 ` Patrick McHardy 0 siblings, 0 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 16:01 UTC (permalink / raw) To: David Miller; +Cc: jarkao2, netdev, herbert David Miller wrote: > From: Jarek Poplawski <jarkao2@gmail.com> > Date: Fri, 17 Oct 2008 22:45:43 +0200 > >> BTW, I'm not sure if I'm expected to redo any patches in this thread. >> (Probably some things like this teql_peek() could be removed with >> ->requeue() killing.) > > In any event, you should resubmit these when I open the net-next-2.6 > tree up in about a week or so. > > It looks like Patrick is in agreement about these changes and they > look mostly fine to me too, FWIW. I think I lost the overview :) I'll review them again once the complete batch is resent. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/6] Add qdisc->ops->peek() support. 2008-10-17 20:45 ` [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski 2008-10-21 23:43 ` David Miller @ 2008-10-22 16:04 ` Patrick McHardy 1 sibling, 0 replies; 21+ messages in thread From: Patrick McHardy @ 2008-10-22 16:04 UTC (permalink / raw) To: Jarek Poplawski; +Cc: David Miller, netdev, Herbert Xu Jarek Poplawski wrote: > On Fri, Oct 17, 2008 at 04:12:03PM +0200, Patrick McHardy wrote: > ... >> Some general thoughts ... >> >> We've never had any systematic checks for useful and non-useful >> combination of qdiscs, which is causing a lot of these complications. >> Think of all the multiq work that was required to make it work >> properly with non-work-conserving qdiscs - while at the same time, >> using a non-work-conserving qdisc (which require a global view) >> defeats basically all of the benefits. >> >> So it would be really useful to come up with a systematic definition >> of valid combinations instead of trying handling lots of purely >> theoretical case that don't make sense. One more example - all the >> qdiscs implement ->drop(), yet its only needed by CBQ and it doesn't >> make any sense at all to use lets say HFSC as child of CBQ. >> >> About this specific case - yes, it would break compatibility for >> users using f.i. TBF as child of netem. But if you look at the >> netem_enqueue() function, it in fact assumes that the inner qdisc >> is a tfifo, so we'd be breaking an already broken case. We can >> of course be nice and warn about it for a few releases, but I believe >> there is some real potential for simplification that makes it >> worth it. > > I'm not sure this is all right: at least until there is not too much > problems with some code (like with requeuing). We probably should try > before this official ways like feature-removal-schedule.txt, and/or > maybe some CONFIG_XXX_DEPRECATED things. But I don't persist with this. Sure. Nobody reads feature-removal-schedule.txt though, so we should also have a runtime warning :) > BTW, I'm not sure if I'm expected to redo any patches in this thread. > (Probably some things like this teql_peek() could be removed with > ->requeue() killing.) Just do the changes you think are right, I'm not expecting you to do anything I suggested :) ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2008-10-22 17:52 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-10-16 9:46 [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski 2008-10-16 12:38 ` Patrick McHardy 2008-10-16 13:08 ` Jarek Poplawski 2008-10-16 22:09 ` Jarek Poplawski 2008-10-17 12:33 ` Patrick McHardy 2008-10-17 13:03 ` Jarek Poplawski 2008-10-17 14:12 ` Patrick McHardy 2008-10-17 20:12 ` [PATCH] pkt_sched: sch_netem: Limit packet re-ordering functionality to tfifo qdisc Jarek Poplawski 2008-10-21 23:36 ` David Miller 2008-10-21 23:51 ` Stephen Hemminger 2008-10-22 5:37 ` Jarek Poplawski 2008-10-22 16:00 ` Patrick McHardy 2008-10-22 16:49 ` Jarek Poplawski 2008-10-22 17:32 ` Patrick McHardy 2008-10-22 17:53 ` [RFC] " Jarek Poplawski 2008-10-22 15:57 ` [PATCH] " Patrick McHardy 2008-10-22 16:00 ` Patrick McHardy 2008-10-17 20:45 ` [PATCH 0/6] Add qdisc->ops->peek() support Jarek Poplawski 2008-10-21 23:43 ` David Miller 2008-10-22 16:01 ` Patrick McHardy 2008-10-22 16:04 ` Patrick McHardy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).