netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: davem@davemloft.net, john.fastabend@gmail.com, ast@plumgrid.com,
	netdev@vger.kernel.org,
	John Fastabend <john.r.fastabend@intel.com>
Subject: Re: [PATCH net-next 1/4] net: qdisc: add op to run filters/actions before enqueue
Date: Tue, 01 Sep 2015 20:50:30 +0200	[thread overview]
Message-ID: <55E5F376.80600@iogearbox.net> (raw)
In-Reply-To: <1441128083.8932.177.camel@edumazet-glaptop2.roam.corp.google.com>

On 09/01/2015 07:21 PM, Eric Dumazet wrote:
> On Tue, 2015-09-01 at 18:34 +0200, Daniel Borkmann wrote:
>> From: John Fastabend <john.r.fastabend@intel.com>
>>
>> Add a new ->preclassify() op to allow multiqueue queuing disciplines
>> to call tc_classify() or perform other work before dev_pick_tx().
>>
>> This helps, for example, with mqprio queueing discipline that has
>> offload support by most popular 10G NICs, where the txq effectively
>> picks the qdisc.
>>
>> Once traffic is being directed to a specific queue then hardware TX
>> rings may be tuned to support this traffic type. mqprio already
>> gives the ability to do this via skb->priority where the ->preclassify()
>> provides more control over packet steering, it can classify the skb
>> and set the priority, for example, from an eBPF classifier (or action).
>>
>> Also this allows traffic classifiers to be run without holding the
>> qdisc lock and gives one place to attach filters when mqprio is
>> in use. ->preclassify() could also be added to other mq qdiscs later
>> on: f.e. most classful qdiscs first check major/minor numbers of
>> skb->priority before actually consulting a more complex classifier.
>>
>> For mqprio case today, a filter has to be attached to each txq qdisc
>> to have all traffic hit the filter. Since ->preclassify() is currently
>> only used by mqprio, the __dev_queue_xmit() fast path is guarded by
>> a generic, hidden Kconfig option (NET_CLS_PRECLASSIFY) that is only
>> selected by mqprio,
>
> So all distros will select it, basically.
>
> ...
>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 877c848..b768bca 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3052,6 +3052,23 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
>>   	rcu_read_lock_bh();
>>
>>   	skb_update_prio(skb);
>> +#ifdef CONFIG_NET_CLS_PRECLASSIFY
>> +	q = rcu_dereference_bh(dev->qdisc);
>> +	if (q && q->preclassify) {
>> +		switch (q->preclassify(skb, q)) {
>> +		default:
>> +			break;
>> +#ifdef CONFIG_NET_CLS_ACT
>> +		case TC_ACT_SHOT:
>> +		case TC_ACT_STOLEN:
>> +		case TC_ACT_QUEUED:
>> +			kfree_skb(skb);
>> +			rc = NET_XMIT_SUCCESS;
>> +			goto out;
>> +#endif
>> +		}
>> +	}
>> +#endif
>>
>
> Since its a device attribute after all, why are you storing it in
> dev->qdisc->preclassify, adding a cache line miss for moderate load ?
>
> (mqprio/mq root qdisc is normally not fetched in fast path ?)
>
> dev->preclassify would be better IMO, close to dev->_tx

Yes, makes sense, as this cacheline is accessed anyway few cycles later.
I'll look into how well this approach integrates into tc's configuration
path. I think mqprio/mq/... could just install/remove dev->preclassify()
handler as soon as first filter is attached/detached.

Thanks,
Daniel

  reply	other threads:[~2015-09-01 18:50 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-01 16:34 [PATCH net-next 0/4] BPF updates Daniel Borkmann
2015-09-01 16:34 ` [PATCH net-next 1/4] net: qdisc: add op to run filters/actions before enqueue Daniel Borkmann
2015-09-01 17:21   ` Eric Dumazet
2015-09-01 18:50     ` Daniel Borkmann [this message]
2015-09-02  6:22   ` Cong Wang
2015-09-02 12:44     ` Daniel Borkmann
2015-09-02 20:29     ` Jamal Hadi Salim
2015-09-01 16:34 ` [PATCH net-next 2/4] ebpf: migrate bpf_prog's flags to bitfield Daniel Borkmann
2015-09-01 16:34 ` [PATCH net-next 3/4] net: {cls,act}_bpf: add helper for retrieving routing realms Daniel Borkmann
2015-09-01 16:34 ` [PATCH net-next 4/4] net: {cls,act}_bpf: make skb->priority writable Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55E5F376.80600@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=ast@plumgrid.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=john.r.fastabend@intel.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).