All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>
Cc: Roi Dayan <roid@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jiri Pirko <jiri@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Cong Wang <cwang@twopensource.com>
Subject: Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it
Date: Tue, 22 Nov 2016 15:36:16 -0800	[thread overview]
Message-ID: <5834D670.2050301@gmail.com> (raw)
In-Reply-To: <5834AD97.1020600@iogearbox.net>

On 16-11-22 12:41 PM, Daniel Borkmann wrote:
> On 11/22/2016 08:28 PM, Cong Wang wrote:
>> On Tue, Nov 22, 2016 at 8:11 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Tue, Nov 22, 2016 at 05:04:11PM CET, daniel@iogearbox.net wrote:
>>>> Hmm, I don't think we want to have such an additional test in fast
>>>> path for each and every classifier. Can we think of ways to avoid that?
>>>>
>>>> My question is, since we unlink individual instances from such
>>>> tp-internal
>>>> lists through RCU and release the instance through call_rcu() as
>>>> well as
>>>> the head (tp->root) via kfree_rcu() eventually, against what are we
>>>> protecting
>>>> setting RCU_INIT_POINTER(tp->root, NULL) in ->destroy() callback?
>>>> Something
>>>> not respecting grace period?
>>>
>>> If you call tp->ops->destroy in call_rcu, you don't have to set tp->root
>>> to null.
> 
> But that's not really an answer to my question. ;)
> 
>> We do need to respect the grace period if we touch the globally visible
>> data structure tp in tcf_destroy(). Therefore Roi's patch is not
>> fixing the
>> right place.
> 
> I think there may be multiple issues actually.
> 
> At the time we go into tc_classify(), from ingress as well as egress side,
> we're under RCU, but BH variant. In cls delete()/destroy() callbacks, we
> everywhere use call_rcu() and kfree_rcu(), same as for tcf_destroy() where
> we use kfree_rcu() on tp, although we iterate tps (and implicitly inner
> filters)
> via rcu_dereference_bh() from reader side. Is there a reason why we don't
> use call_rcu_bh() variant on destruction for all this instead?

I can't think of any if its all under _bh we can convert the call_rcu to
call_rcu_bh it just needs an audit.

> 
> Just looking at cls_bpf and others, what protects
> RCU_INIT_POINTER(tp->root,
> NULL) against? The tp is unlinked in tc_ctl_tfilter() from the tp chain in
> tcf_destroy() cases. Still active readers under RCU BH can race against
> this
> (tp->root being NULL), as the commit identified. Only the get() callback
> checks
> for head against NULL, but both are serialized under rtnl, and the only
> place
> we call this is tc_ctl_tfilter(). Even if we create a new tp, head
> should not
> be NULL there, if it was assigned during the init() cb, but contains an
> empty
> list. (It's different for things like cls_cgroup, though.) So, I'm
> wondering
> if the RCU_INIT_POINTER(tp->root, NULL) can just be removed instead
> (unless I'm
> missing something obvious)?


Just took a look at this I think there are a couple possible solutions.
The easiest is likely to fix all the call sites so that 'tp' is unlinked
before calling the destroy() handlers AND not doing the NULL set. I only
see one such call site where destroy is called before unlinking at the
moment. This should enforce that after a grace period there is no path
to reach the classifiers because 'tp' is unlinked. Calling destroy
before unlinking 'tp' however could cause a small race between grace
period of 'tp' and grace period of the filter.

Another would be to only call the destroy path from the call_rcu path
of the 'tp' object so that destroy is only ever called after the object
is guaranteed to be unlinked from the tc_filter path.

I think both solutions would be fine.

Cong were you working on one of these? Or do you have another idea?


> 
>> Also I don't know why you blame my commit, this problem should already
>> exist prior to my commit, probably date back to John's RCU patches.
> 
> It seems so.

  reply	other threads:[~2016-11-22 23:36 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-22 14:25 [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it Roi Dayan
2016-11-22 14:48 ` Jiri Pirko
2016-11-22 15:37   ` David Miller
2016-11-22 16:13     ` Jiri Pirko
2016-11-22 16:04   ` Daniel Borkmann
2016-11-22 16:11     ` Jiri Pirko
2016-11-22 19:28       ` Cong Wang
2016-11-22 20:41         ` Daniel Borkmann
2016-11-22 23:36           ` John Fastabend [this message]
2016-11-23  5:24             ` Cong Wang
2016-11-23 11:29               ` Daniel Borkmann
2016-11-23 19:29                 ` Cong Wang
2016-11-24 20:25         ` David Miller
2016-11-25  0:17           ` Daniel Borkmann
2016-11-25  6:23             ` Cong Wang
2016-11-25  9:43               ` matchall race (was: Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it) Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5834D670.2050301@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=cwang@twopensource.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=jiri@mellanox.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=ogerlitz@mellanox.com \
    --cc=roid@mellanox.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.