* [PATCH 00/14]: Killing qdisc->ops->requeue().
@ 2008-10-14 9:52 Jarek Poplawski
2008-10-14 11:39 ` Patrick McHardy
2008-10-14 16:41 ` Alexander Duyck
0 siblings, 2 replies; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 9:52 UTC (permalink / raw)
To: David Miller; +Cc: netdev
The aim of this patch-set is to finish the changes proposed by David S.
Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
The first two patches were applied with some modifications, so some
changes were needed to apply the rest.

David's original patches include additional info, but the signed-off-by
lines were removed because of the changed context. I expect they will
be merged and signed off with David as the author anyway.

The idea of the qdisc->requeue list is to limit requeuing to one level,
so a parent can requeue only to its child. This list is tried first
while dequeuing (qdisc_dequeue()), except at the top level, so packets
can be requeued only by qdiscs, not by qdisc_restart() after xmit
errors.
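For illustration, the one-level scheme can be modelled in plain userspace C. This is a minimal sketch with toy stand-ins for struct Qdisc and sk_buff, not the kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (hypothetical model). */
struct sk_buff { int id; };

struct Qdisc {
        struct sk_buff *requeue;                /* one-level requeue slot */
        struct sk_buff *(*dequeue)(struct Qdisc *);
        struct sk_buff *backlog[8];             /* toy internal queue */
        int head, tail;
};

static struct sk_buff *toy_dequeue(struct Qdisc *q)
{
        return q->head == q->tail ? NULL : q->backlog[q->head++];
}

/* A parent dequeues from its child via this helper: the requeue slot is
 * tried first, so a packet the parent puts back never sinks deeper than
 * one level down. */
static struct sk_buff *qdisc_dequeue(struct Qdisc *q)
{
        struct sk_buff *skb = q->requeue;

        if (skb) {
                q->requeue = NULL;
                return skb;
        }
        return q->dequeue(q);
}
```

Here qdisc_dequeue() plays the role of the helper a parent uses on its child: the requeue slot always wins over the child's own ->dequeue.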
Thanks,
Jarek P.
include/net/sch_generic.h | 23 +++++++++------
net/sched/sch_api.c | 7 -----
net/sched/sch_atm.c | 24 ++--------------
net/sched/sch_cbq.c | 37 +------------------------
net/sched/sch_dsmark.c | 23 +---------------
net/sched/sch_fifo.c | 2 -
net/sched/sch_generic.c | 21 +-------------
net/sched/sch_gred.c | 21 --------------
net/sched/sch_hfsc.c | 30 ++------------------
net/sched/sch_htb.c | 46 +------------------------------
net/sched/sch_multiq.c | 38 +------------------------
net/sched/sch_netem.c | 29 +++-----------------
net/sched/sch_prio.c | 31 +--------------------
net/sched/sch_red.c | 20 +-------------
net/sched/sch_sfq.c | 66 ---------------------------------------------
net/sched/sch_tbf.c | 23 +--------------
net/sched/sch_teql.c | 11 -------
17 files changed, 36 insertions(+), 416 deletions(-)
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 9:52 [PATCH 00/14]: Killing qdisc->ops->requeue() Jarek Poplawski
@ 2008-10-14 11:39 ` Patrick McHardy
2008-10-14 12:26 ` Jarek Poplawski
2008-10-14 16:41 ` Alexander Duyck
1 sibling, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2008-10-14 11:39 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: David Miller, netdev
Jarek Poplawski wrote:
> The aim of this patch-set is to finish the changes proposed by David S.
> Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
> The first two patches were applied with some modifications, so some
> changes were needed to apply the rest.
>
> David's original patches include additional info, but the signed-off-by
> lines were removed because of the changed context. I expect they will
> be merged and signed off with David as the author anyway.
>
> The idea of the qdisc->requeue list is to limit requeuing to one level,
> so a parent can requeue only to its child. This list is tried first
> while dequeuing (qdisc_dequeue()), except at the top level, so packets
> can be requeued only by qdiscs, not by qdisc_restart() after xmit
> errors.
I didn't follow the original discussion, but I'm wondering about the
reasoning why these patches won't have a negative impact on latency.
Consider these two scenarios with HFSC or TBF:
current situation:
- packet is dequeued and sent
- next packet is peeked at for calculating the deadline
- watchdog is scheduled
- higher priority packet arrives and is queued to inner qdisc
- dequeue is called again, qdisc is overlimit, so peeks again
- watchdog is rescheduled based on higher priority packet
without ->requeue:
- packet is dequeued and sent
- next packet is peeked at for calculating the deadline and
put into private "requeue" queue
- watchdog is scheduled
- higher priority packet arrives and is queued to inner qdisc
- dequeue is called again, qdisc is overlimit, so peeks again
- higher priority packet doesn't affect watchdog rescheduling
since we still have one in the private queue
- lower priority packet is sent; assuming the qdisc is still overlimit,
the watchdog is then rescheduled based on the higher priority packet
The end result is that the worst-case latency for a packet increases
by a full packet transmission time. This may not matter much for high
bandwidth connections, but for instance on a 1mbit connection it adds
a full 12ms for an MTU of 1500, which is clearly in the noticeable
range.
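The 12ms figure is just the serialization delay of one MTU-sized packet; a quick sanity check (hypothetical helper, plain C):

```c
#include <assert.h>

/* Milliseconds needed to serialize one packet of `bytes` bytes onto a
 * link of `mbit` Mbit/s; mbit * 1000 is the link rate in bits per ms. */
static double xmit_delay_ms(int bytes, double mbit)
{
        return bytes * 8 / (mbit * 1000.0);
}
```

For 1500 bytes at 1 Mbit/s this gives 12000 bits / 1000 bits-per-ms = 12 ms, and the delay scales linearly with packet size and inversely with bandwidth, as described.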
I'm not opposed to killing the top-level ->requeue, since in that case
the qdisc has already decided to send the packet, and if this affects
latency, the qdisc is misconfigured to use too much bandwidth.
Removing the qdiscs' own use of ->requeue without bad side effects is
only possible for the CBQ case of overlimit handling, where it
shouldn't matter much since CBQ is not very accurate anyway. For the
->peek case (HFSC, TBF, I think also netem) we really need the peek
semantics to avoid these side effects.
It should actually be pretty easy because for every ->enqueue call,
there is at least one immediately following ->dequeue call, which
gives an upper qdisc a chance to reschedule the watchdog when
conditions change. So what should work is having the requeue-queue
(actually, just an skb pointer) within the innermost qdisc instead
of one level higher, as in your patches.
On a ->peek operation, the qdisc would simply do what is currently
done in ->dequeue, but instead of removing the packet from its
private queues, it would set the pointer to point to the chosen
packet and return it to the upper qdisc. The upper qdisc can use
this for watchdog scheduling. If the next event is a dequeue event
(meaning the watchdog expired), it removes the peeked packet from
the private queues and returns it to the upper qdisc again. If the
next event is an enqueue event, it can replace the pointer
unconditionally since the upper qdisc will immediately call
->dequeue or ->peek again, giving it a chance to reschedule based
on the changed conditions.
So the implementation would probably roughly look like this:
- split ->dequeue into a queue and packet selection operation, setting
the above mentioned pointer, and an actual dequeue operation to
remove the selected packet from the queue.
- the queue and packet selection operation is at the same time the
->peek operation
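As I read the proposal, it could be modelled roughly like this (a userspace sketch with hypothetical names, not an implementation): ->peek selects a packet and caches a pointer to it, ->enqueue invalidates the cache so the next peek re-evaluates, and ->dequeue commits the cached choice (or selects anew):

```c
#include <assert.h>
#include <stddef.h>

struct sk_buff { int prio; };

struct toy_qdisc {
        struct sk_buff *slots[8];       /* toy backlog, arrival order */
        int len;
        struct sk_buff *peeked;         /* packet chosen by last ->peek */
};

/* Packet selection: highest priority wins (smaller value = higher). */
static struct sk_buff *toy_peek(struct toy_qdisc *q)
{
        struct sk_buff *best = NULL;

        for (int i = 0; i < q->len; i++)
                if (!best || q->slots[i]->prio < best->prio)
                        best = q->slots[i];
        q->peeked = best;
        return best;
}

static void toy_enqueue(struct toy_qdisc *q, struct sk_buff *skb)
{
        q->slots[q->len++] = skb;
        q->peeked = NULL;               /* conditions changed: re-select */
}

/* Commit: remove the previously peeked packet (or select one now). */
static struct sk_buff *toy_dequeue(struct toy_qdisc *q)
{
        struct sk_buff *skb = q->peeked ? q->peeked : toy_peek(q);

        if (!skb)
                return NULL;
        for (int i = 0; i < q->len; i++)
                if (q->slots[i] == skb) {
                        q->slots[i] = q->slots[--q->len];
                        break;
                }
        q->peeked = NULL;
        return skb;
}
```

The point of the invalidation in toy_enqueue is exactly the preemption property: a higher priority arrival replaces the cached choice before the upper qdisc peeks or dequeues again.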
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 11:39 ` Patrick McHardy
@ 2008-10-14 12:26 ` Jarek Poplawski
2008-10-14 12:32 ` Patrick McHardy
0 siblings, 1 reply; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 12:26 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev
On Tue, Oct 14, 2008 at 01:39:23PM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> The aim of this patch-set is to finish the changes proposed by David S.
>> Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
>> The first two patches were applied with some modifications, so some
>> changes were needed to apply the rest.
>>
>> David's original patches include additional info, but the signed-off-by
>> lines were removed because of the changed context. I expect they will
>> be merged and signed off with David as the author anyway.
>>
>> The idea of the qdisc->requeue list is to limit requeuing to one level,
>> so a parent can requeue only to its child. This list is tried first
>> while dequeuing (qdisc_dequeue()), except at the top level, so packets
>> can be requeued only by qdiscs, not by qdisc_restart() after xmit
>> errors.
>
> I didn't follow the original discussion, but I'm wondering about the
> reasoning why these patches won't have a negative impact on latency.
> Consider these two scenarios with HFSC or TBF:
Hmm... It's a pity you couldn't write this a bit earlier, because I
held this part back waiting for some concrete cons of this solution.
Alas, I can't analyze your concerns at the moment; I'll try to reply
this evening. My idea was that all of this shouldn't make (almost) any
visible change, at least for HFSC, so if it does, something went
wrong. IMHO, with this solution the hfsc_requeue() approach is simply
made the standard now, but I may be missing something.
Thanks for now,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 12:26 ` Jarek Poplawski
@ 2008-10-14 12:32 ` Patrick McHardy
2008-10-14 17:56 ` Jarek Poplawski
0 siblings, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2008-10-14 12:32 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: David Miller, netdev
Jarek Poplawski wrote:
> On Tue, Oct 14, 2008 at 01:39:23PM +0200, Patrick McHardy wrote:
>> Jarek Poplawski wrote:
>>> The aim of this patch-set is to finish the changes proposed by David S.
>>> Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
>>> The first two patches were applied with some modifications, so some
>>> changes were needed to apply the rest.
>>>
>>> David's original patches include additional info, but the signed-off-by
>>> lines were removed because of the changed context. I expect they will
>>> be merged and signed off with David as the author anyway.
>>>
>>> The idea of the qdisc->requeue list is to limit requeuing to one level,
>>> so a parent can requeue only to its child. This list is tried first
>>> while dequeuing (qdisc_dequeue()), except at the top level, so packets
>>> can be requeued only by qdiscs, not by qdisc_restart() after xmit
>>> errors.
>> I didn't follow the original discussion, but I'm wondering about the
>> reasoning why these patches won't have a negative impact on latency.
>> Consider these two scenarios with HFSC or TBF:
>
> Hmm... It's a pity you couldn't write this a bit earlier, because I
> held this part back waiting for some concrete cons of this solution.
I'm sorry for any wasted effort. I think I mentioned this one or
two months ago, but wasn't able to participate further in the
discussion because I was busy with other things.
> Alas, I can't analyze your concerns at the moment; I'll try to reply
> this evening. My idea was that all of this shouldn't make (almost) any
> visible change, at least for HFSC, so if it does, something went
> wrong. IMHO, with this solution the hfsc_requeue() approach is simply
> made the standard now, but I may be missing something.
The main difference results from the fact that higher priority
packets can't preempt "peeked" lower priority packets anymore.
The difference this causes obviously depends on the bandwidth,
but in my opinion the impact (as mentioned, a 12ms delay for
1500b at 1mbit, rising linearly with bigger packets/less bandwidth)
is large enough to speak against these changes.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 9:52 [PATCH 00/14]: Killing qdisc->ops->requeue() Jarek Poplawski
2008-10-14 11:39 ` Patrick McHardy
@ 2008-10-14 16:41 ` Alexander Duyck
2008-10-14 18:37 ` Jarek Poplawski
2008-10-14 19:15 ` Jarek Poplawski
1 sibling, 2 replies; 16+ messages in thread
From: Alexander Duyck @ 2008-10-14 16:41 UTC (permalink / raw)
To: Jarek Poplawski, netdev@vger.kernel.org
Jarek Poplawski wrote:
> The aim of this patch-set is to finish the changes proposed by David S.
> Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
> The first two patches were applied with some modifications, so some
> changes were needed to apply the rest.
>
> David's original patches include additional info, but the signed-off-by
> lines were removed because of the changed context. I expect they will
> be merged and signed off with David as the author anyway.
>
> The idea of the qdisc->requeue list is to limit requeuing to one level,
> so a parent can requeue only to its child. This list is tried first
> while dequeuing (qdisc_dequeue()), except at the top level, so packets
> can be requeued only by qdiscs, not by qdisc_restart() after xmit
> errors.
I still do not agree with this change. It essentially breaks multiq as
we are right back to all queues being stopped because of one packet in
the qdisc->requeue list.
I think if anything it seems like you guys actually found the cpu
performance issue a while back in the fact that the dev_requeue_skb was
calling __netif_schedule when requeuing on a stopped queue. That is the
one piece I would say needs to be changed so that you only call
__netif_schedule if the skb is not going to a stopped queue.
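The rule Alex describes (in the kernel presumably a netif_tx_queue_stopped() test guarding the __netif_schedule() call in dev_requeue_skb) can be modelled in a few lines; everything below is a hypothetical toy stand-in, not the real code path:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins: only the rescheduling decision is modelled. */
struct txq { bool stopped; };
struct qdisc { int sched_count; };      /* counts "reschedule" events */

static void netif_schedule_model(struct qdisc *q)
{
        q->sched_count++;               /* stands in for __netif_schedule() */
}

/* Requeue path: keep the packet either way, but only reschedule the
 * qdisc when the target tx queue is running; a stopped queue will be
 * rescheduled later by the driver's wake_queue once it has freed
 * descriptors. */
static void requeue_model(struct qdisc *q, struct txq *txq)
{
        if (!txq->stopped)
                netif_schedule_model(q);
}
```

The design point is avoiding a busy loop: rescheduling against a stopped queue just burns cpu dequeuing and requeuing the same skb until the driver wakes the queue anyway.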
Thanks,
Alex
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 12:32 ` Patrick McHardy
@ 2008-10-14 17:56 ` Jarek Poplawski
2008-10-14 20:18 ` David Miller
2008-10-14 20:44 ` Patrick McHardy
0 siblings, 2 replies; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 17:56 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev
David,
I acknowledge Patrick's concerns, so please consider this
patch-set withdrawn. Thanks.
A little comment below:
Patrick McHardy wrote, On 10/14/2008 02:32 PM:
> Jarek Poplawski wrote:
...
> I'm sorry for any wasted effort. I think I mentioned this one or
> two months ago, but wasn't able to participate further in the
> discussion because I was busy with other things.
No problem, it seems I missed your point.
>> Alas, I can't analyze your concerns at the moment; I'll try to reply
>> this evening. My idea was that all of this shouldn't make (almost) any
>> visible change, at least for HFSC, so if it does, something went
>> wrong. IMHO, with this solution the hfsc_requeue() approach is simply
>> made the standard now, but I may be missing something.
>
> The main difference results from the fact that higher priority
> packets can't preempt "peeked" lower priority packets anymore.
> The difference this causes obviously depends on the bandwidth,
> but in my opinion the impact (as mentioned, a 12ms delay for
> 1500b at 1mbit, rising linearly with bigger packets/less bandwidth)
> is large enough to speak against these changes.
Actually, I think I had similar doubts one or two months ago, but I
mainly remember your discussion on peek with Herbert, and not much
about something like the above; probably I didn't pay enough
attention.

Anyway, the main source of my confusion seems to be HFSC... So do you
mean it's OK to use this kind of requeuing in hfsc_requeue(), but not
elsewhere? Maybe I'm missing something, but if some other qdisc is
used as a parent of HFSC and does something like qdisc_peek_len(),
doesn't it get just what you're arguing against above?
Anyway, I'm glad you've found some time for these explanations,
especially since they're in accordance with my previous suspicions.
Thanks again,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 16:41 ` Alexander Duyck
@ 2008-10-14 18:37 ` Jarek Poplawski
2008-10-14 18:41 ` Jarek Poplawski
2008-10-14 19:15 ` Jarek Poplawski
1 sibling, 1 reply; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 18:37 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev@vger.kernel.org
On Tue, Oct 14, 2008 at 09:41:30AM -0700, Alexander Duyck wrote:
...
> I still do not agree with this change. It essentially breaks multiq as
> we are right back to all queues being stopped because of one packet in
> the qdisc->requeue list.
Good news! I've withdrawn this patch-set (though for another reason).
> I think if anything it seems like you guys actually found the cpu
> performance issue a while back in the fact that the dev_requeue_skb was
> calling __netif_schedule when requeuing on a stopped queue. That is the
> one piece I would say needs to be changed so that you only call
> __netif_schedule if the skb is not going to a stopped queue.
Alexender, actually I've remembered your concerns, and I waited with
this mostly for you. Since you didn't write, I thought you'd changed
your mind in the meantime. Anyway, the current changes are really
minimal and easy to revert, so I wonder why you don't try to convince
us with some patches and/or test results.

IMHO, even if you are right about some downsides of the current
"simplistic" solution, I think they are overestimated: multiq does its
own checks beforehand, and especially when used as root it should
almost never have its packets requeued (otherwise, according to David,
it's a buggy driver). Besides, it seems it wasn't only me thinking
your proposal adds too much code duplication.

And, after all, you seem to think everything should be adapted for the
sake of sch_multiq, which could be true in the future, but maybe we
should wait for users' requests?
Thanks,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 18:37 ` Jarek Poplawski
@ 2008-10-14 18:41 ` Jarek Poplawski
0 siblings, 0 replies; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 18:41 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev@vger.kernel.org
On Tue, Oct 14, 2008 at 08:37:30PM +0200, Jarek Poplawski wrote:
...
> Alexender, actually I've remembered your concerns, and I waited with
Sorry for this misspelling,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 16:41 ` Alexander Duyck
2008-10-14 18:37 ` Jarek Poplawski
@ 2008-10-14 19:15 ` Jarek Poplawski
2008-10-14 20:37 ` Alexander Duyck
1 sibling, 1 reply; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-14 19:15 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev@vger.kernel.org
On Tue, Oct 14, 2008 at 09:41:30AM -0700, Alexander Duyck wrote:
...
> I think if anything it seems like you guys actually found the cpu
> performance issue a while back in the fact that the dev_requeue_skb was
> calling __netif_schedule when requeuing on a stopped queue. That is the
> one piece I would say needs to be changed so that you only call
> __netif_schedule if the skb is not going to a stopped queue.
BTW, since one of the other "dreams" of killing requeuing failed, I
think this proposal is worth checking, but David was concerned about
some issues with buggy drivers after __netif_schedule() removing.
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 17:56 ` Jarek Poplawski
@ 2008-10-14 20:18 ` David Miller
2008-10-14 20:44 ` Patrick McHardy
1 sibling, 0 replies; 16+ messages in thread
From: David Miller @ 2008-10-14 20:18 UTC (permalink / raw)
To: jarkao2; +Cc: kaber, netdev
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 14 Oct 2008 19:56:49 +0200
> I acknowledge Patrick's concerns, so please, consider this
> patch-set as withdrawn. Thanks.
Ok, will scratch it from patchwork, thanks for the update.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 19:15 ` Jarek Poplawski
@ 2008-10-14 20:37 ` Alexander Duyck
2008-10-15 6:45 ` Jarek Poplawski
0 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2008-10-14 20:37 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev@vger.kernel.org
Jarek Poplawski wrote:
> On Tue, Oct 14, 2008 at 09:41:30AM -0700, Alexander Duyck wrote:
> ...
>> I think if anything it seems like you guys actually found the cpu
>> performance issue a while back in the fact that the dev_requeue_skb was
>> calling __netif_schedule when requeuing on a stopped queue. That is the
>> one piece I would say needs to be changed so that you only call
>> __netif_schedule if the skb is not going to a stopped queue.
>
> BTW, since one of the other "dreams" of killing requeuing failed, I
> think this proposal is worth checking, but David was concerned about
> some issues with buggy drivers after __netif_schedule() removing.
>
> Jarek P.
I would say not to remove it. Just add a check to verify that the queue
the packet is bound for is not stopped prior to calling it. If the
queue is stopped it should be rescheduled by the wake_queue call when
the device has cleared the queue.
Unfortunately I am currently working on other projects so I don't really
have the time to create and test a patch for this. Hopefully my input
has proven useful.
Thanks,
Alex
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 17:56 ` Jarek Poplawski
2008-10-14 20:18 ` David Miller
@ 2008-10-14 20:44 ` Patrick McHardy
2008-10-15 8:27 ` Jarek Poplawski
1 sibling, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2008-10-14 20:44 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: David Miller, netdev
[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]
Jarek Poplawski wrote:
>> The main difference results from the fact that higher priority
>> packets can't preempt "peeked" lower priority packets anymore.
>> The difference this causes obviously depends on the bandwidth,
>> but in my opinion the impact (as mentioned, a 12ms delay for
>> 1500b at 1mbit, rising linearly with bigger packets/less bandwidth)
>> is large enough to speak against these changes.
>
> Actually, I think I had similar doubts one or two months ago, but I
> mainly remember your discussion on peek with Herbert, and not much
> about something like the above; probably I didn't pay enough
> attention.
>
> Anyway, the main source of my confusion seems to be HFSC... So do you
> mean it's OK to use this kind of requeuing in hfsc_requeue(), but not
> elsewhere? Maybe I'm missing something, but if some other qdisc is
> used as a parent of HFSC and does something like qdisc_peek_len(),
> doesn't it get just what you're arguing against above?
I think we really need a peek operation since most qdiscs do have
some internal prioritization. The question is whether all qdiscs need
it; I tend to think no. Plugging two non-work-conserving qdiscs
together doesn't make much sense, so we could just prevent this.
This means we would only need to add ->peek support to the
work-conserving qdiscs, which looks pretty easy in all cases.
We actually don't even have to prevent plugging two non-work-conserving
qdiscs; the ones that need the peek operation could just check whether
the inner qdisc supports it.
Attached is a demonstration of how easy adding a peek operation to
the work-conserving qdiscs actually is. In many cases it doesn't need
to keep or change any internal state, thanks to the guarantee that the
packet will either be dequeued or, if another packet arrives, the
upper qdisc will immediately ->peek again to reevaluate the state.
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2916 bytes --]
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index e556962..5462c50 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -431,6 +431,11 @@ static inline struct sk_buff *qdisc_dequeue_tail(struct Qdisc *sch)
         return __qdisc_dequeue_tail(sch, &sch->q);
 }
 
+static inline struct sk_buff *qdisc_peek_head(struct Qdisc *sch)
+{
+        return skb_peek(&sch->q);
+}
+
 static inline int __qdisc_requeue(struct sk_buff *skb, struct Qdisc *sch,
                                   struct sk_buff_head *list)
 {
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index 23d258b..8825e88 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -83,6 +83,7 @@ struct Qdisc_ops pfifo_qdisc_ops __read_mostly = {
         .priv_size = sizeof(struct fifo_sched_data),
         .enqueue   = pfifo_enqueue,
         .dequeue   = qdisc_dequeue_head,
+        .peek      = qdisc_peek_head,
         .requeue   = qdisc_requeue,
         .drop      = qdisc_queue_drop,
         .init      = fifo_init,
@@ -98,6 +99,7 @@ struct Qdisc_ops bfifo_qdisc_ops __read_mostly = {
         .priv_size = sizeof(struct fifo_sched_data),
         .enqueue   = bfifo_enqueue,
         .dequeue   = qdisc_dequeue_head,
+        .peek      = qdisc_peek_head,
         .requeue   = qdisc_requeue,
         .drop      = qdisc_queue_drop,
         .init      = fifo_init,
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index a6697c6..8411472 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -120,6 +120,19 @@ prio_requeue(struct sk_buff *skb, struct Qdisc* sch)
         return ret;
 }
 
+static struct sk_buff *prio_peek(struct Qdisc *sch)
+{
+        struct prio_sched_data *q = qdisc_priv(sch);
+        int prio;
+
+        for (prio = 0; prio < q->bands; prio++) {
+                struct Qdisc *qdisc = q->queues[prio];
+                struct sk_buff *skb = qdisc->peek(qdisc);
+                if (skb)
+                        return skb;
+        }
+        return NULL;
+}
 
 static struct sk_buff *prio_dequeue(struct Qdisc* sch)
 {
@@ -425,6 +438,7 @@ static struct Qdisc_ops prio_qdisc_ops __read_mostly = {
         .priv_size = sizeof(struct prio_sched_data),
         .enqueue   = prio_enqueue,
         .dequeue   = prio_dequeue,
+        .peek      = prio_peek,
         .requeue   = prio_requeue,
         .drop      = prio_drop,
         .init      = prio_init,
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 6e041d1..282a96c 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -391,8 +391,20 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc *sch)
         return NET_XMIT_CN;
 }
 
+static struct sk_buff *
+sfq_peek(struct Qdisc *sch)
+{
+        struct sfq_sched_data *q = qdisc_priv(sch);
+        struct sk_buff *skb;
+        sfq_index a;
+        /* No active slots */
+        if (q->tail == SFQ_DEPTH)
+                return NULL;
+        a = q->next[q->tail];
+        return skb_peek(&q->qs[a]);
+}
 
 static struct sk_buff *
 sfq_dequeue(struct Qdisc *sch)
@@ -624,6 +636,7 @@ static struct Qdisc_ops sfq_qdisc_ops __read_mostly = {
         .priv_size = sizeof(struct sfq_sched_data),
         .enqueue   = sfq_enqueue,
         .dequeue   = sfq_dequeue,
+        .peek      = sfq_peek,
         .requeue   = sfq_requeue,
         .drop      = sfq_drop,
         .init      = sfq_init,
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 20:37 ` Alexander Duyck
@ 2008-10-15 6:45 ` Jarek Poplawski
2008-10-15 7:19 ` Jarek Poplawski
0 siblings, 1 reply; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-15 6:45 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev@vger.kernel.org
On Tue, Oct 14, 2008 at 01:37:14PM -0700, Alexander Duyck wrote:
> Jarek Poplawski wrote:
>> On Tue, Oct 14, 2008 at 09:41:30AM -0700, Alexander Duyck wrote:
>> ...
>>> I think if anything it seems like you guys actually found the cpu
>>> performance issue a while back in the fact that the dev_requeue_skb was
>>> calling __netif_schedule when requeuing on a stopped queue. That is the
>>> one piece I would say needs to be changed so that you only call
>>> __netif_schedule if the skb is not going to a stopped queue.
>>
>> BTW, since one of the other "dreams" of killing requeuing failed, I
>> think this proposal is worth checking, but David was concerned about
>> some issues with buggy drivers after __netif_schedule() removing.
>>
>> Jarek P.
>
> I would say not to remove it. Just add a check to verify that the queue
> the packet is bound for is not stopped prior to calling it. If the
> queue is stopped it should be rescheduled by the wake_queue call when
> the device has cleared the queue.
Sure, I meant "after __netif_schedule() call removing". Anyway, there
was a discussion about something like this:
http://marc.info/?l=linux-netdev&m=122197536025956&w=2
> Unfortunately I am currently working on other projects so I don't really
> have the time to create and test a patch for this. Hopefully my input
> has proven useful.
Without this __netif_schedule() call we should expect at least higher
cpu use here while packets are xmitted to a stopped queue, compared to
older kernels or the current one (especially with multi qdisc/class/
filter configs), so first we should know whether this requeuing from
the top is really necessary, and I haven't seen anything convincing
about this yet.
Thanks,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-15 6:45 ` Jarek Poplawski
@ 2008-10-15 7:19 ` Jarek Poplawski
0 siblings, 0 replies; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-15 7:19 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev@vger.kernel.org
On Wed, Oct 15, 2008 at 06:45:22AM +0000, Jarek Poplawski wrote:
...
> Without this __netif_schedule() call we should expect at least higher
Should be:
"Even without this __netif_schedule() call..."
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-14 20:44 ` Patrick McHardy
@ 2008-10-15 8:27 ` Jarek Poplawski
2008-10-15 9:45 ` Patrick McHardy
0 siblings, 1 reply; 16+ messages in thread
From: Jarek Poplawski @ 2008-10-15 8:27 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev
On Tue, Oct 14, 2008 at 10:44:13PM +0200, Patrick McHardy wrote:
...
> I think we really need a peek operation since most qdiscs do have
> some internal prioritization. The question is whether all qdiscs need
> it; I tend to think no.
Looking at qdisc_peek_len() seems to confirm a peek would be useful.
But since this all started from the question of whether ->requeue()
could be killed or simplified, I wonder if adding this peek could
really help with that problem without too much reworking. It looks
like at least sch_netem would still need more.

But if it's rather about adding something new and useful, I'm OK with
this. Anyway, I need some time to rethink this one and your previous
description.
Thanks,
Jarek P.
* Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
2008-10-15 8:27 ` Jarek Poplawski
@ 2008-10-15 9:45 ` Patrick McHardy
0 siblings, 0 replies; 16+ messages in thread
From: Patrick McHardy @ 2008-10-15 9:45 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: David Miller, netdev
Jarek Poplawski wrote:
> On Tue, Oct 14, 2008 at 10:44:13PM +0200, Patrick McHardy wrote:
> ...
>> I think we really need a peek operation since most qdiscs do have
>> some internal prioritization. The question is whether all qdiscs need
>> it; I tend to think no.
>
> Looking at qdisc_peek_len() seems to confirm a peek would be useful.
> But since this all started from the question of whether ->requeue()
> could be killed or simplified, I wonder if adding this peek could
> really help with that problem without too much reworking. It looks
> like at least sch_netem would still need more.
Indeed. netem is really a special case; it would be good if we could
treat it as such without requiring this kind of support from all
qdiscs. An idea would be to use a second queue for reordering that is
always tried first on dequeue. That should behave identically to what
netem does now, except when the inner qdisc does reordering or rate
limiting itself.
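A sketch of that second-queue idea (a hypothetical userspace model; two toy FIFOs stand in for netem's reorder queue and its inner qdisc):

```c
#include <assert.h>
#include <stddef.h>

struct sk_buff { int id; };

/* Minimal toy FIFO. */
struct fifo { struct sk_buff *q[8]; int head, tail; };

static void fifo_push(struct fifo *f, struct sk_buff *s)
{
        f->q[f->tail++] = s;
}

static struct sk_buff *fifo_pop(struct fifo *f)
{
        return f->head == f->tail ? NULL : f->q[f->head++];
}

struct netem_model {
        struct fifo reorder;    /* packets pulled forward for reordering */
        struct fifo inner;      /* stands in for the inner qdisc */
};

/* Dequeue tries the reorder queue first, so the inner qdisc never has
 * to take a packet back via ->requeue. */
static struct sk_buff *netem_dequeue_model(struct netem_model *n)
{
        struct sk_buff *skb = fifo_pop(&n->reorder);

        return skb ? skb : fifo_pop(&n->inner);
}
```

As noted above, this only matches current behavior when the inner qdisc doesn't reorder or rate-limit on its own, since a packet parked in the reorder queue is no longer subject to the inner qdisc's scheduling.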
> But if it's rather about adding something new and useful I'm OK with
> this. Anyway, I need some time to rethink this one and your previous
> description.
The main idea was to get rid of ->requeue, but it would also simplify
things slightly for HFSC, TBF and DRR (unsubmitted patch of mine).