* [PATCH 0/14]: Make packet scheduler multiqueue aware.
From: David Miller @ 2008-07-14 22:59 UTC
To: netdev
Ok, here are the fruits of my weekend in a nutshell, it's
pushed into the net-tx-2.6 tree:
1) Every configuration operation provided by qdiscs and
classifiers is split into two or three operations:
Either:
->validate_foo()
->commit_foo()
->cancel_foo()
Or just:
->validate_foo()
->commit_foo()
The deciding factor is whether validate_foo() allocates
some resources which need to be liberated if we decide
not to perform the commit operation.
2) The generic top-level operation code allocates an array
of info blobs that record the progress of things.
3) First, we iterate over all queues and validate the operation.
4) If all validations pass, we iterate and commit the operation.
5) Else, we iterate and cancel the operation if necessary (see the
sketch below).
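Concretely, the generic code is shaped roughly like the sketch
below. This is a minimal illustration, not the actual net-tx-2.6
code: struct qdisc_change_op, struct op_state and
apply_to_all_queues() are hypothetical names, while
netdev_get_tx_queue() and num_tx_queues are existing kernel symbols
used only to keep the example concrete.

#include <linux/netdevice.h>

struct op_state {
	void *priv;		/* per-queue progress recorded by validate */
};

struct qdisc_change_op {
	int  (*validate)(struct netdev_queue *txq, struct op_state *st);
	void (*commit)(struct netdev_queue *txq, struct op_state *st);
	void (*cancel)(struct netdev_queue *txq, struct op_state *st);
};

static int apply_to_all_queues(struct net_device *dev,
			       const struct qdisc_change_op *op,
			       struct op_state *st)	/* one blob per queue */
{
	unsigned int i;
	int err;

	/* Pass 1: validate against every queue, recording progress. */
	for (i = 0; i < dev->num_tx_queues; i++) {
		err = op->validate(netdev_get_tx_queue(dev, i), &st[i]);
		if (err)
			goto cancel;
	}

	/* Pass 2: every validation succeeded, so commit everywhere. */
	for (i = 0; i < dev->num_tx_queues; i++)
		op->commit(netdev_get_tx_queue(dev, i), &st[i]);
	return 0;

cancel:
	/* Pass 3: release whatever the successful validations allocated. */
	if (op->cancel)
		while (i-- > 0)
			op->cancel(netdev_get_tx_queue(dev, i), &st[i]);
	return err;
}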
As you can see, this required surgery to a lot of the
packet scheduler layer. I'm sure I got something wrong,
but conceptually I think everything is sound.
At this point I've tested basic qdisc changes. For example
on my workstation I'm using a PRIO with 3 bands pointing to
various SFQ and TBF qdiscs.
So I haven't given the filter/classifier stuff much testing yet.
My next plans are:
1) Integrate the feedback received over the weekend which I
didn't have time to process yet.
2) Respin the tree against a fresh net-next-2.6
3) Implement "QUEUE_NUM" netlink attribute so that the user can
request that packet scheduler operations apply to a specific
queue only.
I think, frankly, this is ready to merge upstream even in its current
state. We'll have 3 months to test things more thoroughly and sort
out any thinkos I added to the code.
* Re: [PATCH 0/14]: Make packet scheduler multiqueue aware.
From: Patrick McHardy @ 2008-07-15 0:25 UTC
To: David Miller; +Cc: netdev
David Miller wrote:
> My next plans are:
>
> 1) Integrate the feedback received over the weekend which I
> didn't have time to process yet.
>
> 2) Respin the tree against a fresh net-next-2.6
>
> 3) Implement "QUEUE_NUM" netlink attribute so that the user can
> request that packet scheduler operations apply to a specific
> queue only.
I've looked over the API changes and a few qdiscs and everything
looks good to me. My only concern is the automatic replication
when using non-work-conserving qdiscs. They usually enforce an
upper rate to keep control of the queue in software. Replicating
them and distributing packets over multiple queues in combination
with automatic enabling of driver multiqueue features will break
existing configs.
Fundamentally I don't think transparent replication of non-work-
conserving qdiscs is the right thing to do since the only way
they can work correctly is when the administrator explicitly
assigns a smaller amount of bandwidth to the individual queues.
Unfortunately it's a bit tricky to determine whether a qdisc is
non-work-conserving since at least HFSC can work in both modes,
and it depends on how the root class is configured (with or
without an upper limit service curve).
Actually on second thought, it will even have unexpected effects
on work-conserving qdiscs since the device's queue classifier
(or simple_tx_hash) effectively overrides the manually configured
classifiers by distributing over multiple device qdiscs first.
This means that, for example, in a configuration that previously
had certain packets sharing a qdisc, they might now end up in
different replicas.
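To make this concrete, here is a tiny userspace illustration (purely
hypothetical code, not kernel source): pick_txq() stands in for a
simple_tx_hash-style selector and classify() for a manually
configured filter. Two flows that the filter would put in the same
class can still be hashed to different TX queues, and thus to
different qdisc replicas.

#include <stdio.h>

struct flow {
	unsigned int   saddr, daddr;
	unsigned short sport, dport;
};

/* Stand-in for a simple_tx_hash-style TX queue selector. */
static unsigned int pick_txq(const struct flow *f, unsigned int nr_queues)
{
	unsigned int h = f->saddr ^ f->daddr ^ f->sport ^ f->dport;
	return h % nr_queues;
}

/* Stand-in for the administrator's filter: all port-80 traffic -> class 1. */
static unsigned int classify(const struct flow *f)
{
	return (f->dport == 80) ? 1 : 2;
}

int main(void)
{
	struct flow a = { 0x0a000001, 0x0a000002, 40000, 80 };
	struct flow b = { 0x0a000001, 0x0a000002, 40001, 80 };

	/* Same class according to the filter... */
	printf("class: %u vs %u\n", classify(&a), classify(&b));
	/* ...but potentially different per-queue qdisc replicas. */
	printf("txq:   %u vs %u\n", pick_txq(&a, 4), pick_txq(&b, 4));
	return 0;
}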
Unfortunately that seems to imply that anything but the really
simple qdiscs can't be used transparently in combination with
multiqueue without breaking configurations.
* Re: [PATCH 0/14]: Make packet scheduler multiqueue aware.
From: David Miller @ 2008-07-15 1:48 UTC
To: kaber; +Cc: netdev
From: Patrick McHardy <kaber@trash.net>
Date: Tue, 15 Jul 2008 02:25:32 +0200
> Unfortunately that seems to imply that anything but the really
> simple qdiscs can't be used transparently in combination with
> multiqueue without breaking configurations.
I think we can fix this after the merge window :)
One idea is to allow all of the queues to point at a
single qdisc. We'd just need to work out how to do the
locking.
For example, the Qdisc has a lock member, and a pointer.
For simple qdiscs the pointer points at the netdev_queue
lock. But when sharing, we use the in-Qdisc static lock.
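A minimal sketch of that lock indirection, assuming hypothetical
names (Qdisc_sketch, lock_ptr); only the locking-related members
are shown, this is not the actual structure layout:

#include <linux/spinlock.h>

struct Qdisc_sketch {
	spinlock_t  lock;	/* embedded lock, used when shared */
	spinlock_t *lock_ptr;	/* what enqueue/dequeue actually take */
	/* ... the rest of the qdisc ... */
};

/* Per-TXQ qdisc: serialize on the netdev_queue's existing lock. */
static void qdisc_sketch_attach(struct Qdisc_sketch *q, spinlock_t *txq_lock)
{
	q->lock_ptr = txq_lock;
}

/* Shared across all TXQs: fall back to the qdisc's own embedded lock. */
static void qdisc_sketch_share(struct Qdisc_sketch *q)
{
	spin_lock_init(&q->lock);
	q->lock_ptr = &q->lock;
}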
* Re: [PATCH 0/14]: Make packet scheduler multiqueue aware.
From: Patrick McHardy @ 2008-07-15 2:08 UTC
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue, 15 Jul 2008 02:25:32 +0200
>
>
>> Unfortunately that seems to imply that anything but the really
>> simple qdiscs can't be used transparently in combination with
>> multiqueue without breaking configurations.
>>
>
> I think we can fix this after the merge window :)
>
Yes, that shouldn't be a problem.
>
> One idea is to allow all of the queues to point at a
> single qdisc. We'd just need to work out how to do the
> locking.
>
> For example, the Qdisc has a lock member, and a pointer.
> For simple qdiscs the pointer points at the netdev_queue
> lock. But when sharing, we use the in-Qdisc static lock
That sounds nice and easy.
* Re: [PATCH 0/14]: Make packet scheduler multiqueue aware.
From: David Miller @ 2008-07-16 4:14 UTC
To: kaber; +Cc: netdev
From: David Miller <davem@davemloft.net>
Date: Mon, 14 Jul 2008 18:48:49 -0700 (PDT)
> One idea is to allow all of the queues to point at a
> single qdisc. We'd just need to work out how to do the
> locking.
>
> For example, the Qdisc has a lock member, and a pointer.
> For simple qdiscs the pointer points at the netdev_queue
> lock. But when sharing, we use the in-Qdisc static lock.
I've investigated several avenues to implement this, and I've hit
a brick wall with each approach. It's not easy.
Much of the locking, and some of the semantics, can be trivially
sorted out:
1) I have a change which moves gso_skb into the Qdisc. That
definitely would be needed for qdisc sharing amongst netdev_queue
objects.
2) Qdiscs want to lock the qdisc tree and do several things, for
example, unlink and walk up the parents to adjust the qlen.
For that, locking the root qdisc ought to be sufficient, I suppose.
3) The netdev_queue lock also synchronizes the ->qdisc linkage of the
root. In order to combat some of this, I wrote up a change that moved
the bulk of qdisc_destroy()'s work into the qdisc destroy RCU handler.
On top of that I passed the first seen qdisc in a qdisc_run() call all
the way down into qdisc_restart() and removed all of the "recheck
txq->qdisc" checks scattered all over these parts.
Basically this works because we can just enqueue or whatever, since
BHs are disabled the whole time and thus the RCU callback is held
off. The RCU handler will then reset and destroy the queue, freeing
up any packets therein.
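Schematically, the deferral looks something like the sketch below
(hypothetical code, not the actual change; qdisc_destroy_ctx and the
function names are made up):

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct Qdisc;	/* the real kernel type; members elided here */

struct qdisc_destroy_ctx {
	struct rcu_head	 rcu;
	struct Qdisc	*qdisc;
};

/* Runs only after a grace period, i.e. once no softirq can still be
 * walking this qdisc, since qdisc_run() executes with BHs disabled. */
static void qdisc_destroy_rcu(struct rcu_head *head)
{
	struct qdisc_destroy_ctx *ctx =
		container_of(head, struct qdisc_destroy_ctx, rcu);

	/* The bulk of the destroy work would go here: reset the qdisc,
	 * free any packets still queued in it, drop references, ... */
	(void)ctx->qdisc;
	kfree(ctx);
}

static void qdisc_destroy_deferred(struct qdisc_destroy_ctx *ctx)
{
	/* Unlink immediately, defer the teardown past the grace period. */
	call_rcu(&ctx->rcu, qdisc_destroy_rcu);
}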
But then we get into the issue of which netdev_queue backpointer
should be used. This is especially important for a shared qdisc.
At the top level we enqueue an SKB for a particular TXQ. But as we
call down into qdisc_run(), the SKB we get out of q->dequeue() could
be for another TXQ.
This gets even more interesting when a qdisc_watchdog() fires. Which
TXQ should we call netif_schedule for? :-)
I guess this means qdisc_run() and netif_schedule() should operate
on root qdiscs, rather than netdev_queue objects.
Then, as qdisc_restart() pulls packets out of the qdisc, it determines
the TXQ for that SKB and uses that to lock and call down into the
->hard_start_xmit() handler.
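As a rough sketch of what that could look like (hypothetical code:
skb_tx_queue() is an invented helper, and the kernel calls are used
loosely here rather than matching the exact net-tx-2.6 tree):

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <net/sch_generic.h>

/* Hypothetical: map a dequeued skb back to the TXQ it was hashed to. */
static struct netdev_queue *skb_tx_queue(struct net_device *dev,
					 struct sk_buff *skb)
{
	return netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
}

/* Caller runs with BHs disabled, operating on the root qdisc. */
static int qdisc_restart_sketch(struct net_device *dev, struct Qdisc *q)
{
	struct sk_buff *skb = q->dequeue(q);
	struct netdev_queue *txq;
	int ret;

	if (!skb)
		return 0;

	/* The skb may belong to a different TXQ than the one whose
	 * enqueue kicked off this run; lock the queue it maps to. */
	txq = skb_tx_queue(dev, skb);

	__netif_tx_lock(txq, smp_processor_id());
	ret = dev->hard_start_xmit(skb, dev);	/* pre-ndo era signature */
	__netif_tx_unlock(txq);

	return ret;
}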
Just FYI and I'll try to explore this further.