netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick McHardy <kaber@trash.net>
To: Jarek Poplawski <jarkao2@gmail.com>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH 00/14]: Killing qdisc->ops->requeue().
Date: Tue, 14 Oct 2008 13:39:23 +0200	[thread overview]
Message-ID: <48F484EB.8000201@trash.net> (raw)
In-Reply-To: <20081014095246.GA10804@ff.dom.local>

Jarek Poplawski wrote:
> The aim of this patch-set is to finish changes proposed by David S.
> Miller in his patch-set with the same subject from Mon, 18 Aug 2008.
> The first two patches were applied with some modifications, so, to
> apply the rest, there were needed some changes.
> 
> Original David's patches include additional info, but signed-off-by
> is removed because of changed context. I expect they will be merged
> and signed off by David as an author, anyway.
> 
> The qdisc->requeue list idea is to limit requeuing to one level only,
> so a parent can requeue to its child only. This list is then tried
> first while dequeuing (qdisc_dequeue()), except at the top level,
> so packets could be requeued only by qdiscs, not by qdisc_restart()
> after xmit errors.

I didn't follow the original discussion, but I'm wondering what
the reasoning is why these patches won't have negative impact
on latency. Consider these two scenarios with HFSC or TBF:

current situation:

- packet is dequeued and sent
- next packet is peeked at for calculating the deadline
- watchdog is scheduled
- higher priority packet arrives and is queued to inner qdisc
- dequeue is called again, qdisc is overlimit, so peeks again
- watchdog is rescheduled based on higher priority packet

without ->requeue:

- packet is dequeued and sent
- next packet is peeked at for calculating the deadline and
   put into private "requeue" queue
- watchdog is scheduled
- higher priority packet arrives and is queued to inner qdisc
- dequeue is called again, qdisc is overlimit, so peeks again
- higher priority packet doesn't affect watchdog rescheduling
   since we still have one in the private queue
- lower priority packet is sent, assuming qdisc is overlimit
   watchdog is then rescheduled based on higher priority packet

The end result is that the worst case latency for a packet increases
by a full packet transmission time. This may not matter much for high
bandwidth connections, but for f.i. with a 1mbit connection it adds a
full 12ms for a MTU of 1500, which is clearly in the noticable range.

I'm not opposed to killing top-level ->requeue since in that case
the qdisc has already decided to send the packet and if it affects
latency, the qdisc is misconfigured to use too much bandwidth.
Qdisc' use of ->requeue can only be removed without bad side effects
for the CBQ case of overlimit handling, it shouldn't matter much
since CBQ is not very accurate anyways. For the ->peek case (HFSC,
TBF, I think also netem) we really need the peek semantic to avoid
these side effects.

It should actually be pretty easy because for every ->enqueue call,
there is at least one immediately following ->dequeue call, which
gives an upper qdisc a chance to reschedule the watchdog when
conditions change. So what should work is having the requeue-queue
(actually, just an skb pointer) within the innermost qdisc instead
of one level higher, as in your patches.

On a ->peek operation, the qdisc would simply do what is currently
done in ->dequeue, but instead of removing the packet from its
private queues, it would set the pointer to point to the chosen
packet and return it to the upper qdisc. The upper qdisc can use
this for watchdog scheduling. If the next event is a dequeue event
(meaning the watchdog expired), it removes the peeked packet from
the private queues and returns it to the upper qdisc again. If the
next event is an enqueue event, it can replace the pointer
unconditionally since the upper qdisc will immediately call
->dequeue or ->peek again, giving it a chance to reschedule based
on the changed conditions.

So the implementation would probably roughly look like this:

- split ->dequeue into a queue and packet selection operation, setting
   the above mentioned pointer, and an actual dequeue operation to
   remove the selected packet from the queue.

- the queue and packet selection operation is at the same time the
   ->peek operation


  reply	other threads:[~2008-10-14 11:39 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-14  9:52 [PATCH 00/14]: Killing qdisc->ops->requeue() Jarek Poplawski
2008-10-14 11:39 ` Patrick McHardy [this message]
2008-10-14 12:26   ` Jarek Poplawski
2008-10-14 12:32     ` Patrick McHardy
2008-10-14 17:56       ` Jarek Poplawski
2008-10-14 20:18         ` David Miller
2008-10-14 20:44         ` Patrick McHardy
2008-10-15  8:27           ` Jarek Poplawski
2008-10-15  9:45             ` Patrick McHardy
2008-10-14 16:41 ` Alexander Duyck
2008-10-14 18:37   ` Jarek Poplawski
2008-10-14 18:41     ` Jarek Poplawski
2008-10-14 19:15   ` Jarek Poplawski
2008-10-14 20:37     ` Alexander Duyck
2008-10-15  6:45       ` Jarek Poplawski
2008-10-15  7:19         ` Jarek Poplawski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48F484EB.8000201@trash.net \
    --to=kaber@trash.net \
    --cc=davem@davemloft.net \
    --cc=jarkao2@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).