From: John Fastabend <john.fastabend@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC PATCH 00/12] RCU'ify the net:sched classifier chains
Date: Sat, 11 Jan 2014 15:33:17 -0800
Message-ID: <52D1D4BD.9080906@gmail.com>
In-Reply-To: <CAM_iQpU6WWB72zBdU4mh-sB+8B8Vz2gPvH7Hxhjhkr+vMjwdpw@mail.gmail.com>

On 01/11/2014 11:43 AM, Cong Wang wrote:
> On Fri, Jan 10, 2014 at 1:36 AM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> There appears to be some interest in a few topics around the qdisc
>> layer which could benefit from having the ability to run the
>> filters and actions without holding the qdisc lock.
>>
>> Recently Cong Wang proposed a patch series to drop the ingress
>> qdisc and asked for comments. This series I think gets closer to
>> that goal.
>>
>> The ingress qdisc is a simple qdisc which doesn't maintain any
>> actual list of skb's and is primarily a hook to attach filters.
>> Further the only qdisc that can be attached to the ingress qdisc
>> is sch_ingress. The qdisc lock is currently serializing two
>> operations (1) tc_classify which is addressed here and (2)
>> statistics accounting. The second point is not solved here but
>> it could be a matter of making the bstats and qstats per cpu
>> stats.
>
>
> Yeah, actually I tried to make bstats percpu, but I still doubt
> if it is necessary, since increasing a 32bit counter doesn't
> sound dangerous on SMP?
>

Well, what happens when multiple CPUs increment the counter at the
same time? You can't assume all archs have a fetch-and-add
instruction (addl on x86), and I'm fairly certain there is no
guarantee that the compiler, even on x86, will do it that way. At a
minimum we need to use the atomic operations, but then it's a
cache-thrashing problem. And because in the worst case every CPU is
going to be touching those bstats, you really need to make them
per-cpu. Look around the kernel at other counters; it's a common
pattern.
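
To illustrate the per-cpu pattern mentioned above, here is a minimal
userspace sketch, not the kernel implementation. Each CPU gets its own
cache-line-aligned slot, the fast path touches only that slot, and a
reader sums all slots when it needs a total. Every name in it
(bstats_slot, bstats_percpu, bstats_update, bstats_read,
NR_CPUS_SKETCH, CACHE_LINE) is hypothetical, and passing the cpu index
explicitly is an assumption made to keep the example self-contained.
In the kernel the same idea is normally expressed with alloc_percpu()
and this_cpu_ptr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical fixed CPU count */
#define CACHE_LINE     64  /* assumed cache line size in bytes */

/* One slot per CPU, aligned so two CPUs never share a cache line. */
struct bstats_slot {
	_Alignas(CACHE_LINE) uint64_t bytes;
	uint32_t packets;
};

struct bstats_percpu {
	struct bstats_slot slot[NR_CPUS_SKETCH];
};

/* Fast path: only the slot owned by 'cpu' is written, so no atomic
 * operation and no cross-CPU cache line bouncing is needed. */
static void bstats_update(struct bstats_percpu *s, unsigned int cpu,
			  uint32_t len)
{
	assert(cpu < NR_CPUS_SKETCH);
	s->slot[cpu].bytes += len;
	s->slot[cpu].packets++;
}

/* Slow path: a reader (e.g. a stats dump) folds all slots together. */
static void bstats_read(const struct bstats_percpu *s,
			uint64_t *bytes, uint64_t *packets)
{
	uint64_t b = 0, p = 0;

	for (size_t i = 0; i < NR_CPUS_SKETCH; i++) {
		b += s->slot[i].bytes;
		p += s->slot[i].packets;
	}
	*bytes = b;
	*packets = p;
}
```

The trade-off is that reads become more expensive and only
approximately consistent, which is usually acceptable for statistics
that are updated per packet and read rarely.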

Similarly, the qstats need to be per-cpu. I might have a patch for
that piece around here somewhere; I'll look later.

Send me your patch so I can integrate it with the rest.

>>
>> This is an RFC for now and needs some more work. Some items
>> I know about are (a) an audit of the ematch code paths, (b) resolving
>> the checkpatch errors, which mostly come from moving code
>> around, (c) run smatch, (d) audit u32 code
>> for correctness, (e) do a lot more testing so far only very
>> basic testing has been done. I tried to put some reasonable
>> comments in the commit logs but yes they need more work.
>>
>> Cong, if it's not too much to ask, can we use this as a base
>> set of patches for this work? I think it's reasonably close to
>> correct as is.
>>
>
> Sure, just that:
>
> 1) I myself don't like playing RCU list without using list_head API
> it is still hard for me to read.

I think it's a reasonably common practice, and if we don't need the
prev pointer we can save a pointer.
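
As a rough illustration of a singly linked chain with only a next
pointer, here is a userspace sketch using C11 atomics. It is not the
code from the series: struct filter, filter_publish() and
filter_find() are hypothetical names, the release store and acquire
loads stand in for what rcu_assign_pointer() and rcu_dereference()
provide in the kernel, and grace periods, removal, and writer-side
locking are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical chain element: one forward pointer, no prev pointer. */
struct filter {
	int prio;
	_Atomic(struct filter *) next;
};

/* Writer side (assumed to be serialized by some outer lock): link the
 * new element in front of the chain, then publish it with a release
 * store so readers never see a half-initialized element. */
static void filter_publish(_Atomic(struct filter *) *head,
			   struct filter *f)
{
	assert(f != NULL);
	atomic_store_explicit(&f->next,
			      atomic_load_explicit(head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(head, f, memory_order_release);
}

/* Reader side: walk the chain without taking any lock. */
static struct filter *filter_find(_Atomic(struct filter *) *head,
				  int prio)
{
	struct filter *f;

	for (f = atomic_load_explicit(head, memory_order_acquire); f;
	     f = atomic_load_explicit(&f->next, memory_order_acquire)) {
		if (f->prio == prio)
			return f;
	}
	return NULL;
}
```

Each element carries one pointer instead of the two in a list_head,
which is the saving referred to above.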

>
> 2) The first patch in your series seems completely irrelevant to
> $subject. :)

If the intent is to drop the qdisc lock around the ingress qdisc and
use the RCU APIs, I want to be sure to annotate it so we can use
the analysis tools to catch any errors. Smatch and others really are
pretty good at catching dumb mistakes or missed call sites.

>
> Thanks.
>

-- 
John Fastabend         Intel Corporation

Thread overview: 20+ messages
2014-01-10  9:36 [RFC PATCH 00/12] RCU'ify the net:sched classifier chains John Fastabend
2014-01-10  9:37 ` [RFC PATCH 01/12] net: qdisc: use rcu prefix and silence sparse warnings John Fastabend
2014-01-10  9:37 ` [RFC PATCH 02/12] net: rcu-ify tcf_proto John Fastabend
2014-01-10  9:38 ` [RFC PATCH 03/12] net: sched: cls_basic use RCU John Fastabend
2014-01-10  9:38 ` [RFC PATCH 04/12] net: sched: cls_cgroup " John Fastabend
2014-01-10  9:39 ` [RFC PATCH 05/12] net: sched: cls_flow " John Fastabend
2014-01-10  9:39 ` [RFC PATCH 06/12] net: sched: fw " John Fastabend
2014-01-10  9:41 ` [RFC PATCH 07/12] net: sched: RCU cls_route John Fastabend
2014-01-10  9:42 ` [RFC PATCH 08/12] net: sched: RCU cls_tcindex John Fastabend
2014-01-10  9:42 ` [RFC PATCH 09/12] net: sched: make cls_u32 lockless John Fastabend
2014-01-10  9:43 ` [RFC PATCH 10/12] net: sched: rcu'ify cls_rsvp John Fastabend
2014-01-10  9:43 ` [RFC PATCH 11/12] net: make cls_bpf rcu safe John Fastabend
2014-01-10  9:44 ` [RFC PATCH 12/12] net: sched: make tc_action safe to walk under RCU John Fastabend
2014-01-11 19:43 ` [RFC PATCH 00/12] RCU'ify the net:sched classifier chains Cong Wang
2014-01-11 23:33   ` John Fastabend [this message]
2014-04-24 23:51     ` Cong Wang
2014-04-30 16:36       ` John Fastabend
2014-01-12 13:28 ` Jamal Hadi Salim
2014-01-12 13:57   ` Jamal Hadi Salim
2014-01-12 14:18     ` Jamal Hadi Salim
