netdev.vger.kernel.org archive mirror
* [PATCH 00/39]: New multiqueue TX implementation.
@ 2008-07-03  7:01 David Miller
  2008-07-03 14:54 ` Stefanik Gábor
       [not found] ` <20080703.000146.168093325.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 2 replies; 6+ messages in thread
From: David Miller @ 2008-07-03  7:01 UTC (permalink / raw)
  To: netdev; +Cc: vinay, krkumar2, mchan, Matheos.Worku, linux-wireless


I'm finally at the point where I can post a patch series that actually
does something and I know works for at least one card :-)

The backlog and batching bits are sidelined for the time being.
Don't worry, we'll get back to that soon enough :)

This can all be found in:

	kernel.org:/pub/scm/linux/kernel/git/davem/net-tx-2.6.git

which uses net-next-2.6 as its origin.

The summarized state is:

1) Everything in the transmit path is multiqueue aware.

2) Qdisc and classification are not.  We hook up default
   qdiscs to each TX queue when the device comes up,
   but the configuration infrastructure still hardcodes
   its operations to TX queue zero.  This is a temporary
   situation.

3) I rewrote the mac80211 QoS support using the new
   netdev hook added for TX hashing.  I know it is broken
   and not as fully functional as the qdisc implementation.
   It can and will be fixed to match existing functionality.
   The broken parts are:

   a) It no longer does dropping.
   b) Requeueing is not implemented.

   Dropping is easy to add, and we need to investigate
   whether the requeueing is really even useful.

I have tested basic TCP stream functionality on the NIU
driver multiqueue support.  I verified that different
TCP streams end up on different TX queues.

I went through the existing multiqueue capable drivers and
made sure they use the new interfaces properly.  I anticipate
that they will largely still work properly.

For the qdisc/cls issues, I intend to simply use replication as a
first step.  So if a qdisc or classifier config change comes in,
we just replicate that change to all of the TX queues.  The biggest
pain in the butt will be rolling things back if the first few
queues succeed but then one fails.
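
To make the rollback problem concrete, here is a rough userspace
sketch of replicate-then-undo.  It is not code from this series: the
txq_model structure, the single "limit" field standing in for a qdisc
config, and the fail_at knob used to simulate an error are all made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TXQ 16

/* Hypothetical per-queue state: a single "limit" stands in for a whole
 * qdisc/classifier configuration.  None of these names are kernel APIs. */
struct txq_model {
	int limit;
};

/* Apply the change to one queue.  Fails for a negative limit, or when
 * fail_at names this queue (used only to simulate a mid-way error). */
static int apply_one(struct txq_model *q, size_t idx, int new_limit,
		     long fail_at)
{
	if (new_limit < 0 || (long)idx == fail_at)
		return -1;
	q->limit = new_limit;
	return 0;
}

/* Replicate new_limit to all n queues.  If any queue fails, restore the
 * queues that were already changed.  Returns 0 on success, -1 on error. */
static int replicate_limit(struct txq_model *qs, size_t n, int new_limit,
			   long fail_at)
{
	int saved[MAX_TXQ];
	size_t i;

	assert(qs != NULL);
	if (n > MAX_TXQ)
		return -1;

	for (i = 0; i < n; i++) {
		saved[i] = qs[i].limit;
		if (apply_one(&qs[i], i, new_limit, fail_at) < 0) {
			while (i-- > 0)		/* undo queues i-1 .. 0 */
				qs[i].limit = saved[i];
			return -1;
		}
	}
	return 0;
}
```

Calling replicate_limit(qs, 4, 30, 2) on four queues changes queues 0
and 1, fails on queue 2, and then puts queues 0 and 1 back.  The undo
loop is only this easy because the model state is a single integer.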

The observant will note that egress and ingress qdisc handling is now
more consolidated than ever.  I expect many more simplifications in this
area.

Some of these config changes make non-trivial things happen, so what I
might do is split qdisc/cls config into two passes.  The first pass
implements the allocation of resources (memory, etc.), the second
pass commits the changes and cannot fail.  So if anything in the
first pass fails, we simply release everything, clean up, and return
an error.
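
A minimal userspace sketch of the two-pass idea, under the assumption
that the only resource a config change needs is one heap buffer per
queue.  The names (txq_cfg, cfg_prepare, cfg_commit, cfg_change) and
the fail_at parameter that simulates an allocation failure are
hypothetical and do not appear in the series.

```c
#include <assert.h>
#include <stdlib.h>

#define NQUEUES 4

/* Hypothetical model: each queue's config owns one heap-allocated table. */
struct txq_cfg {
	int *table;
	size_t len;
};

/* Pass 1: allocate everything the new config needs.  May fail; on
 * failure every buffer allocated so far is released again. */
static int cfg_prepare(int *staged[NQUEUES], size_t new_len, int fail_at)
{
	int i;

	if (new_len == 0)
		return -1;
	for (i = 0; i < NQUEUES; i++) {
		/* fail_at simulates an allocation failure on one queue */
		staged[i] = (i == fail_at) ? NULL
					   : calloc(new_len, sizeof(int));
		if (!staged[i]) {
			while (i-- > 0) {
				free(staged[i]);
				staged[i] = NULL;
			}
			return -1;
		}
	}
	return 0;
}

/* Pass 2: swap the staged buffers in.  Only pointer assignments and
 * free(), so there is no failure path. */
static void cfg_commit(struct txq_cfg qs[NQUEUES], int *staged[NQUEUES],
		       size_t new_len)
{
	int i;

	for (i = 0; i < NQUEUES; i++) {
		free(qs[i].table);
		qs[i].table = staged[i];
		qs[i].len = new_len;
		staged[i] = NULL;
	}
}

/* Full config change: nothing live is touched unless pass 1 succeeded. */
static int cfg_change(struct txq_cfg qs[NQUEUES], size_t new_len,
		      int fail_at)
{
	int *staged[NQUEUES] = { NULL };

	assert(qs != NULL);
	if (cfg_prepare(staged, new_len, fail_at) < 0)
		return -1;
	cfg_commit(qs, staged, new_len);
	return 0;
}
```

Compared with replicate-and-roll-back, the live per-queue state is
never touched until every allocation has succeeded, so there is no
partially applied state to undo.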

Quick 'n dirty multiqueue driver port:

1) alloc_etherdev() --> alloc_etherdev_mq().  Specify the maximum
   number of TX queues that the device might be using.

2) Once you know how many TX queues will be in use, set
   netdev->real_num_tx_queues to that value.

   Do not modify this value when the device is up.  It may only be
   changed while the device is down.

3) In ->hard_start_xmit(), skb_get_queue_mapping() tells you which
   TX queue to use.  It will always be in the range 0 --> real_num_tx_queues - 1

4) When operating on a specific TX queue, use netif_tx_{start,stop,wake}_queue()

5) When you want to operate on all queues (bringing the device down,
   bringing it up, resetting, changing some MAC configuration that
   requires full device quiesce) use netif_tx_{start,stop,wake}_all_queues().

And then you're done.  Really, it's as simple as that.  The final
patch in this series that implements TX multiqueue for NIU is a good
guide for other driver authors.
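
For readers without the NIU patch at hand, the steps above can be
modelled in plain userspace C.  This is only a toy: every model_*
name, MAX_TXQ and RING_SIZE are invented here and merely stand in for
the real netdev interfaces named in steps 1-5.  It shows the rules
(queue count fixed while up, per-queue stop/wake, all-queue stop on
shutdown), not actual driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TXQ   8
#define RING_SIZE 2	/* tiny ring so a queue fills up quickly */

struct model_txq {
	bool stopped;
	unsigned int in_flight;
};

struct model_dev {
	unsigned int num_tx_queues;	 /* maximum, fixed at allocation (step 1) */
	unsigned int real_num_tx_queues; /* actually in use (step 2) */
	bool up;
	struct model_txq txq[MAX_TXQ];
};

/* Step 1: allocate with the maximum number of TX queues. */
static void model_alloc(struct model_dev *dev, unsigned int max_queues)
{
	assert(max_queues >= 1 && max_queues <= MAX_TXQ);
	*dev = (struct model_dev){ .num_tx_queues = max_queues,
				   .real_num_tx_queues = 1 };
}

/* Step 2: the in-use count may only change while the device is down. */
static int model_set_real_queues(struct model_dev *dev, unsigned int n)
{
	if (dev->up || n == 0 || n > dev->num_tx_queues)
		return -1;
	dev->real_num_tx_queues = n;
	return 0;
}

/* Step 5: operate on every queue at once. */
static void model_set_all_stopped(struct model_dev *dev, bool stopped)
{
	unsigned int i;

	for (i = 0; i < dev->num_tx_queues; i++)
		dev->txq[i].stopped = stopped;
}

static void model_open(struct model_dev *dev)
{
	dev->up = true;
	model_set_all_stopped(dev, false);
}

static void model_stop(struct model_dev *dev)
{
	model_set_all_stopped(dev, true);
	dev->up = false;
}

/* Steps 3 and 4: transmit on the queue the stack picked, and stop only
 * that queue when its ring fills.  Returns 0 if accepted, -1 if busy. */
static int model_xmit(struct model_dev *dev, unsigned int queue_mapping)
{
	struct model_txq *q;

	assert(queue_mapping < dev->real_num_tx_queues);
	q = &dev->txq[queue_mapping];
	if (!dev->up || q->stopped)
		return -1;
	if (++q->in_flight == RING_SIZE)
		q->stopped = true;
	return 0;
}

/* TX completion on one queue: reclaim a slot and wake only that queue. */
static void model_tx_done(struct model_dev *dev, unsigned int idx)
{
	struct model_txq *q = &dev->txq[idx];

	if (q->in_flight)
		q->in_flight--;
	if (dev->up)
		q->stopped = false;
}
```

In this model a full ring on queue 0 makes model_xmit() fail for
queue 0 only; queue 1 keeps accepting packets until its own ring
fills or model_stop() quiesces the whole device.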

net/core/dev.c:simple_tx_hash() implements the current hashing
algorithm.  This is just to get things going, and will in the end be
augmented with a user-configurable algorithm selection.
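
To illustrate what a flow hash of this kind does, here is a userspace
sketch.  It is deliberately NOT the algorithm in simple_tx_hash(): the
flow_key layout, the FNV-1a style mixing and the multiply-shift range
reduction are assumptions picked to keep the example short.  The only
property it demonstrates is that packets of one flow always map to the
same queue index below real_num_tx_queues.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: addresses and ports of one TCP/UDP flow. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* FNV-1a style mixing of one 32-bit word, byte by byte.  A stand-in
 * hash chosen for brevity, not the one used by the kernel. */
static uint32_t mix32(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* Map a flow to a TX queue index in [0, real_num_tx_queues). */
static uint16_t pick_tx_queue(const struct flow_key *k, uint32_t seed,
			      uint16_t real_num_tx_queues)
{
	uint32_t h = 2166136261u ^ seed;

	assert(real_num_tx_queues > 0);
	h = mix32(h, k->saddr);
	h = mix32(h, k->daddr);
	h = mix32(h, ((uint32_t)k->sport << 16) | k->dport);

	/* Scale the 32-bit hash into [0, n) without a modulo. */
	return (uint16_t)(((uint64_t)h * real_num_tx_queues) >> 32);
}
```

Two packets with the same addresses, ports and seed always land on
the same queue, which keeps a flow's packets in order.  Different
flows are spread over the queues only statistically.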

There is a lot to clean up, fix up, and flesh out.  But at least
we're this far along.  More details are in the commit log messages.

You'll notice that a lot of it is just moving things around, making
interfaces work with queue objects instead of net devices, and
finally deciding at each spot "what does this operation mean in
a TX multiqueue setting?"


* Re: [PATCH 00/39]: New multiqueue TX implementation.
  2008-07-03  7:01 [PATCH 00/39]: New multiqueue TX implementation David Miller
@ 2008-07-03 14:54 ` Stefanik Gábor
  2008-07-03 14:56   ` Johannes Berg
       [not found] ` <20080703.000146.168093325.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: Stefanik Gábor @ 2008-07-03 14:54 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, vinay, krkumar2, mchan, Matheos.Worku, linux-wireless

On Thu, Jul 3, 2008 at 9:01 AM, David Miller <davem@davemloft.net> wrote:
>
> I'm finally at the point where I can post a patch series that actually
> does something and I know works for at least one card :-)
>
> The backlog and batching bits are sidelined for the time being.
> Don't worry, we'll get back to that soon enough :)
>
> This can all be found in:
>
>        kernel.org:/pub/scm/linux/kernel/git/davem/net-tx-2.6.git
>
> which uses net-next-2.6 as its origin.
>
> The summarized state is:
>
> 1) Everything in the transmit path is multiqueue aware.
>
> 2) Qdisc and classification are not.  We hook up default
>   qdiscs to each TX queue when the device comes up,
>   but the configuration infrastructure still hardcodes
>   its operations to TX queue zero.  This is a temporary
>   situation.
>
> 3) I rewrote the mac80211 QoS support using the new
>   netdev hook added for TX hashing.  I know it is broken
>   and not as fully functional as the qdisc implementation.
>   It can and will be fixed to match existing functionality.
>   The broken parts are:
>
>   a) It no longer does dropping.
>   b) Requeueing is not implemented.
>
>   Dropping is easy to add, and we need to investigate
>   whether the requeueing is really even useful.
>
> I have tested basic TCP stream functionality on the NIU
> driver multiqueue support.  I verified that different
> TCP streams end up on different TX queues.
>
> I went through the existing multiqueue capable drivers and
> made sure they use the new interfaces properly.  I anticipate
> that they will largely still work properly.
>
> For the qdisc/cls issues, I intend to simply use replication as a
> first step.  So if a qdisc or classifier config change comes in,
> we just replicate that change to all of the TX queues.  The biggest
> pain in the butt will be rolling things back if the first few
> queues succeed but then one fails.
>
> The observant will note that egress and ingress qdisc handling is now
> more consolidated than ever.  I expect many more simplifications in this
> area.
>
> Some of these config changes make non-trivial things happen, so what I
> might do is split qdisc/cls config into two passes.  The first pass
> implements the allocation of resources (memory, etc.), the second
> pass commits the changes and cannot fail.  So if anything in the
> first pass fails, we simply release everything, clean up, and return
> an error.
>
> Quick 'n dirty multiqueue driver port:
>
> 1) alloc_etherdev() --> alloc_etherdev_mq().  Specify the maximum
>   number of TX queues that the device might be using.
>
> 2) Once you know how many TX queues will be in use, set
>   netdev->real_num_tx_queues to that value.
>
>   Do not modify this value when the device is up.  It may only be
>   changed while the device is down.
>
> 3) In ->hard_start_xmit(), skb_get_queue_mapping() tells you which
>   TX queue to use.  It will always be in the range 0 --> real_num_tx_queues - 1
>
> 4) When operating on a specific TX queue, use netif_tx_{start,stop,wake}_queue()
>
> 5) When you want to operate on all queues (bringing the device down,
>   bringing it up, resetting, changing some MAC configuration that
>   requires full device quiesce) use netif_tx_{start,stop,wake}_all_queues().
>
> And then you're done.  Really, it's as simple as that.  The final
> patch in this series that implements TX multiqueue for NIU is a good
> guide for other driver authors.
>
> net/core/dev.c:simple_tx_hash() implements the current hashing
> algorithm.  This is just to get things going, and will in the end be
> augmented with a user-configurable algorithm selection.
>
> There is a lot to clean up, fix up, and flesh out.  But at least
> we're this far along.  More details are in the commit log messages.
>
> You'll notice that a lot of it is just moving things around, making
> interfaces work with queue objects instead of net devices, and
> finally deciding at each spot "what does this operation mean in
> a TX multiqueue setting?"
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Patch 15/39 appears to be missing from the series.

-- 
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)


* Re: [PATCH 00/39]: New multiqueue TX implementation.
  2008-07-03 14:54 ` Stefanik Gábor
@ 2008-07-03 14:56   ` Johannes Berg
       [not found]     ` <1215097009.9975.11.camel-YfaajirXv214zXjbi5bjpg@public.gmane.org>
  2008-07-05  7:37     ` Jarek Poplawski
  0 siblings, 2 replies; 6+ messages in thread
From: Johannes Berg @ 2008-07-03 14:56 UTC (permalink / raw)
  To: Stefanik Gábor
  Cc: David Miller, netdev, vinay, krkumar2, mchan, Matheos.Worku,
	linux-wireless


Can you please quote less?

> Patch 15/39 appears to be missing from the series.

I have it. If you don't you can look at it in the git tree.

johannes



* Re: [PATCH 00/39]: New multiqueue TX implementation.
       [not found]     ` <1215097009.9975.11.camel-YfaajirXv214zXjbi5bjpg@public.gmane.org>
@ 2008-07-03 15:00       ` Stefanik Gábor
  0 siblings, 0 replies; 6+ messages in thread
From: Stefanik Gábor @ 2008-07-03 15:00 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Miller, netdev, vinay, krkumar2, mchan, Matheos.Worku,
	linux-wireless

On Thu, Jul 3, 2008 at 4:56 PM, Johannes Berg wrote:
> Can you please quote less?

OK, thanks... I always forget that not everyone uses Gmail. :-)

> I have it. If you don't you can look at it in the git tree.

Looks like it got caught by my spam filter, then.

-- 
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)


* Re: [PATCH 00/39]: New multiqueue TX implementation.
       [not found] ` <20080703.000146.168093325.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2008-07-04  9:33   ` Herbert Xu
  0 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2008-07-04  9:33 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, vinay, krkumar2, mchan, Matheos.Worku, linux-wireless

David Miller <davem@davemloft.net> wrote:
> 
> I'm finally at the point where I can post a patch series that actually
> does something and I know works for at least one card :-)
> 
> The backlog and batching bits are sidelined for the time being.
> Don't worry, we'll get back to that soon enough :)
> 
> This can all be found in:
> 
>        kernel.org:/pub/scm/linux/kernel/git/davem/net-tx-2.6.git
> 
> which uses net-next-2.6 as its origin.

I've only had a quick look but it is clear that this stuff is
pretty awesome and fills in a major missing piece of infrastructure
that's needed to bring Linux networking into the age of multi-core.

Nice work!

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 00/39]: New multiqueue TX implementation.
  2008-07-03 14:56   ` Johannes Berg
       [not found]     ` <1215097009.9975.11.camel-YfaajirXv214zXjbi5bjpg@public.gmane.org>
@ 2008-07-05  7:37     ` Jarek Poplawski
  1 sibling, 0 replies; 6+ messages in thread
From: Jarek Poplawski @ 2008-07-05  7:37 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Stefanik Gábor, David Miller, netdev, vinay, krkumar2, mchan,
	Matheos.Worku, linux-wireless

Johannes Berg wrote, On 07/03/2008 04:56 PM:

> Can you please quote less?
> 
>> Patch 15/39 appears to be missing from the series.
> 
> I have it. If you don't you can look at it in the git tree.

...or at the patch 14/39, which looks almost the same?!

Regards,
Jarek P.

