From: John Fastabend <john.fastabend@gmail.com>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.r.fastabend@intel.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
Tom Herbert <therbert@google.com>,
netdev <netdev@vger.kernel.org>
Subject: Re: locating the 'tc actions' hook
Date: Mon, 05 Aug 2013 09:11:31 -0700 [thread overview]
Message-ID: <51FFCEB3.3010404@gmail.com> (raw)
In-Reply-To: <51FCEDCF.8010107@mojatatu.com>
On 08/03/2013 04:47 AM, Jamal Hadi Salim wrote:
> On 13-08-01 07:18 PM, John Fastabend wrote:
>
> [..]
>
>> The first being directly related to the previous per queue rate limiter
>> patch. With rate limiters per queue on a multiqueue device using mq or
>> mqprio I need some mechanism to steer packets to queues. One way to do
>> this is to use mqprio and create a 'tc' with a single queue in it.
>> And then use iptables or netprio_cgroup to steer packets. Another way
>> to do this would be to use 'skbedit queue_mapping' to set the queue from
>> 'tc' but unfortunately with the existing flows the queue has already
>> been selected by the time the classifiers are called.
>
> I am assuming the skb (mark) will be tagged with a proper meaningful
> tag maybe by the driver. Such a tag can be used later up-stack.
>
Actually in the first case here I was considering egress traffic.
Separating the classifier chain from the qdisc allows using the
classifiers to pick a qdisc via queue_mapping because qdisc's are
attached to queues when using sch_mq.
>> Calling into the
>> classifier chain before picking the qdisc would fix this. For flow based
>> QOS with multiqueue devices this type of functionality would be useful.
>>
>
> From your description this seems to be ingress side; so we should be
> able to use that tag if you set it.
>
Sorry I mucked two examples one egress and one ingress together without
clearly stating it. The second case here was ingress the first egress.
>
>> The second thought that I've been piecing together would be to populate
>> the rxhash (or maybe some other field) using the hardware flow
>> classifier in some meaningful way for the ingress qdisc.
>
>
> The rxhash would be useful for further classification or avoiding
> further classification.
>
>> Some of the
>> existing Intel NICs can do this and I believe other vendors have similar
>> capabilities. Although currently with the qdisc lock running around the
>> ingress qdisc the multiqueue devices take a perf hit just by
>> instantiating the ingress qdisc which really is only using the lock to
>> guard some stats and keep the classifier/action chains sane.
>>
>
> Instantiating any qdisc is not a problem. You do it once and are done.
> There are penalties with using qdiscs in terms of locking.
> The locks penalties really have to do with the design of the netdev
> not qdiscs i.e queues are centered around netdevs and netdevs
> are shared across processors.
>
But currently we have a situation on egress where multiqueue devices
use mq or mqprio which aligns qdisc's with queues. This fixes the
performance penalties but you lose the ability to have state across
multiple queues and the ability to bind flows to queues.
For example a SW rate limiter that applies to a set of queues or a set
of tuned low latency queues. To fix this we can attach a qdisc (htb,
whatever) to the netdev but this has performance penalties. So we are
left with a trade off between functionality and performance.
On the ingress side we may have many queues but as soon as we add a
qdisc/classifier/action we have a single lock.
>
>> If you have some good examples it would be great to see them and drop
>> them in my testbed. Go ahead and send them to me offlist if you can.
>>
>
> will do.
>
Thanks. I'll see if I can draw up what I'm thinking here a bit more
clearly and send it your way.
> cheers,
> jamal
>
>
>> .John
>>
--
John Fastabend Intel Corporation
next prev parent reply other threads:[~2013-08-05 16:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-31 21:19 locating the 'tc actions' hook John Fastabend
2013-08-01 11:40 ` Jamal Hadi Salim
2013-08-01 23:18 ` John Fastabend
2013-08-02 18:46 ` John Fastabend
2013-08-03 11:49 ` Jamal Hadi Salim
2013-08-03 11:47 ` Jamal Hadi Salim
2013-08-05 16:11 ` John Fastabend [this message]
2013-08-12 0:55 ` Jamal Hadi Salim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51FFCEB3.3010404@gmail.com \
--to=john.fastabend@gmail.com \
--cc=eric.dumazet@gmail.com \
--cc=jhs@mojatatu.com \
--cc=john.r.fastabend@intel.com \
--cc=netdev@vger.kernel.org \
--cc=stephen@networkplumber.org \
--cc=therbert@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).