From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: John Fastabend <john.fastabend@gmail.com>,
	jiri@resnulli.us, daniel@iogearbox.net,
	simon.horman@netronome.com
Cc: netdev@vger.kernel.org, alexei.starovoitov@gmail.com,
	davem@davemloft.net, jhs@mojatatu.com
Subject: Re: [net-next PATCH v2 3/3] net: sched: cls_u32 add bit to specify software only rules
Date: Thu, 25 Feb 2016 23:02:54 -0800
Message-ID: <56CFF89E.8070602@intel.com>
In-Reply-To: <20160225232045.9820.6694.stgit@john-Precision-Tower-5810>
On 2/25/2016 3:20 PM, John Fastabend wrote:
> In the initial implementation the only way to stop a rule from being
> inserted into the hardware table was via the device feature flag.
> However this doesn't work well when working on an end host system
> where packets are expected to hit both the hardware and software
> datapaths.
>
> For example we can imagine a rule that will match an IP address and
> increment a field. If we install this rule in both hardware and
> software we may increment the field twice. To date we have only
> added support for the drop action so we have been able to ignore
> these cases. But as we extend the action support we will hit this
> example plus more such cases. Arguably these are not even corner
> cases; in many working systems these cases will be common.
>
> To avoid forcing the driver to always abort (i.e. the above example)
> this patch adds a flag to add a rule in software only. A careful
> user can use this flag to build software and hardware datapaths
> that work together. One example we have found particularly useful
> is to use hardware resources to set the skb->mark on the skb when
> the match may be expensive to run in software but a mark lookup
> in a hash table is cheap. The idea here is hardware can do in one
> lookup what the u32 classifier may need to traverse multiple lists
> and hash tables to compute. The flag is only passed down on inserts
> on deletion to avoid stale references in hardware we always try
I think 'On deletion' is supposed to start a new sentence here, i.e.
'... passed down on inserts. On deletion, to avoid stale references ...'.
> to remove a rule if it exists.
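
To check my reading of the paragraph above, here is a minimal user-space
sketch of the three behaviours the commit message describes: the insert
path honours the software-only flag, the delete path ignores it, and an
update must not change the flags. Every name below (the ex_ prefix, both
bit definitions, struct ex_dev) is a hypothetical stand-in that I made up
for illustration; the real definitions sit in the snipped hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the identifiers used by the patch. */
#define EX_DEV_HW_TC    (1u << 0)  /* device advertises tc offload */
#define EX_FLAG_SKIP_HW (1u << 0)  /* user asked for a software-only rule */

struct ex_dev {
	uint32_t features;
};

/* Insert path: offload only if the device allows it and the rule
 * did not opt out of hardware. */
static bool ex_should_offload(const struct ex_dev *dev, uint32_t flags)
{
	if (!(dev->features & EX_DEV_HW_TC))
		return false;
	if (flags & EX_FLAG_SKIP_HW)
		return false;
	return true;
}

/* Delete path: the per-rule flag is deliberately not consulted, so a
 * rule is always removed from hardware when the device could hold it. */
static bool ex_should_remove_from_hw(const struct ex_dev *dev)
{
	return ex_should_offload(dev, 0);
}

/* Update path: the flags must match the ones the rule was created with. */
static bool ex_update_allowed(uint32_t old_flags, uint32_t new_flags)
{
	return old_flags == new_flags;
}
```

With an offload-capable device, ex_should_offload() is true for flags 0
and false once EX_FLAG_SKIP_HW is set, while ex_should_remove_from_hw()
stays true in both cases.
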
>
> The flags field is part of the classifier specific options. Although
> it is tempting to lift this into the generic structure doing this
> proves difficult due to how the tc netlink attributes are implemented
> along with how the dump/change routines are called. There is also
> precedent for putting seemingly generic pieces in the specific
> classifier options such as TCA_U32_POLICE, TCA_U32_ACT, etc. So
> although not ideal I've left FLAGS in the u32 options as well as it
> simplifies the code greatly and user space has already learned how
> to manage these bits, as in the 'tc' tool.
>
> Another thing: when updating a rule we require the flags to
> be unchanged. This is to force user space, software u32 and
> the hardware u32 to keep in sync. Thanks to Simon Horman for
> catching this case.
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
> include/net/pkt_cls.h | 13 +++++++++++--
> include/uapi/linux/pkt_cls.h | 1 +
> net/sched/cls_u32.c | 37 +++++++++++++++++++++++++++----------
> 3 files changed, 39 insertions(+), 12 deletions(-)
<snip>
>
> @@ -482,7 +485,9 @@ static void u32_clear_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h)
> }
> }
>
> -static void u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n)
> +static void u32_replace_hw_knode(struct tcf_proto *tp,
> + struct tc_u_knode *n,
> + u32 flags)
> {
> struct net_device *dev = tp->q->dev_queue->dev;
> struct tc_cls_u32_offload u32_offload = {0};
> @@ -491,7 +496,7 @@ static void u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n)
> offload.type = TC_SETUP_CLSU32;
> offload.cls_u32 = &u32_offload;
>
> - if (tc_should_offload(dev)) {
> + if (tc_should_offload(dev, flags)) {
> offload.cls_u32->command = TC_CLSU32_REPLACE_KNODE;
> offload.cls_u32->knode.handle = n->handle;
> offload.cls_u32->knode.fshift = n->fshift;
> @@ -679,6 +684,7 @@ static const struct nla_policy u32_policy[TCA_U32_MAX + 1] = {
> [TCA_U32_SEL] = { .len = sizeof(struct tc_u32_sel) },
> [TCA_U32_INDEV] = { .type = NLA_STRING, .len = IFNAMSIZ },
> [TCA_U32_MARK] = { .len = sizeof(struct tc_u32_mark) },
> + [TCA_U32_FLAGS] = { .len = NLA_U32 },
This should be .type = NLA_U32, not .len.
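
To spell out why this matters, here is a simplified user-space model of a
policy entry. It is only loosely patterned on how netlink attribute
validation treats .type and .len; the enum, struct and function are
hypothetical (ex_ prefix), and the enum values are chosen for the example
rather than copied from the kernel headers. The point is that an enum
constant assigned to .len becomes a small byte count, and the type stays
at its zero default, so the attribute is no longer checked as a u32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model; these are not the kernel's definitions. */
enum ex_type { EX_UNSPEC = 0, EX_U8, EX_U16, EX_U32 };

struct ex_policy {
	enum ex_type type;  /* defaults to EX_UNSPEC when omitted */
	size_t len;         /* for EX_UNSPEC: minimum payload length */
};

/* Accept or reject a payload of 'attrlen' bytes for one policy entry. */
static bool ex_validate(const struct ex_policy *pt, size_t attrlen)
{
	switch (pt->type) {
	case EX_U8:
		return attrlen >= 1;
	case EX_U16:
		return attrlen >= 2;
	case EX_U32:
		return attrlen >= 4;
	default:
		/* Untyped attribute: only the minimum length is enforced. */
		return attrlen >= pt->len;
	}
}

/* As posted: the enum value lands in .len, the type stays unspecified. */
static const struct ex_policy ex_as_posted = { .len = EX_U32 };

/* As suggested: the attribute is typed, so a short payload is rejected. */
static const struct ex_policy ex_as_fixed = { .type = EX_U32 };
```

In this model a 3-byte payload passes ex_validate() with ex_as_posted but
is rejected with ex_as_fixed, which is the kind of gap the .type form
closes.
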
> <snip>
Thread overview: 9+ messages
2016-02-25 23:19 [net-next PATCH v2 0/3] tc software only flag John Fastabend
2016-02-25 23:19 ` [net-next PATCH v2 1/3] net: sched: consolidate offload decision in cls_u32 John Fastabend
2016-02-25 23:20 ` [net-next PATCH v2 2/3] net: cls_u32: move TC offload feature bit into cls_u32 offload logic John Fastabend
2016-02-25 23:20 ` [net-next PATCH v2 3/3] net: sched: cls_u32 add bit to specify software only rules John Fastabend
2016-02-26 7:02 ` Samudrala, Sridhar [this message]
2016-02-26 7:21 ` John Fastabend
2016-02-26 10:29 ` Jiri Pirko
2016-02-26 14:29 ` John Fastabend
2016-02-25 23:24 ` [net-next PATCH v2 0/3] tc software only flag John Fastabend