Re: [PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Simon Horman <horms@verge.net.au>, Julian Anastasov <ja@ssi.bg>,
	Ingo Molnar <mingo@redhat.com>,
	lvs-devel@vger.kernel.org, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Dipankar Sarma <dipankar@in.ibm.com>,
	dhaval.giani@gmail.com
Subject: Re: [PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections
Date: Fri, 26 Apr 2013 09:30:46 -0700	[thread overview]
Message-ID: <20130426163046.GG3860@linux.vnet.ibm.com> (raw)
In-Reply-To: <1366991947.8964.233.camel@edumazet-glaptop>

On Fri, Apr 26, 2013 at 08:59:07AM -0700, Eric Dumazet wrote:
> On Fri, 2013-04-26 at 08:45 -0700, Paul E. McKenney wrote:
> 
> > I have done some crude coccinelle patterns in the past, but they are
> > subject to false positives (from when you transfer the pointer from
> > RCU protection to reference-count protection) and also false negatives
> > (when you atomically increment some statistic unrelated to protection).
> > 
> > I could imagine maintaining a per-thread count of the number of outermost
> > RCU read-side critical sections at runtime, and then associating that
> > counter with a given pointer at rcu_dereference() time, but this would
> > require either compiler magic or an API for using a pointer returned
> > by rcu_dereference().  This API could in theory be enforced by sparse.
> > 
> > Dhaval Giani might have some ideas as well, adding him to CC.
> 
> 
> We had this fix the otherday, because tcp prequeue code hit this check :
> 
> static inline struct dst_entry *skb_dst(const struct sk_buff *skb)
> {
>         /* If refdst was not refcounted, check we still are in a 
>          * rcu_read_lock section
>          */
>         WARN_ON((skb->_skb_refdst & SKB_DST_NOREF) &&
>                 !rcu_read_lock_held() &&
>                 !rcu_read_lock_bh_held());
>         return (struct dst_entry *)(skb->_skb_refdst & SKB_DST_PTRMASK);
> }
> 
> ( http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=093162553c33e9479283e107b4431378271c735d )
> 
> Problem is the rcu protected pointer was escaping the rcu lock and was
> then used in another thread.

All the way to some other thread?  That is a serious escape!  ;-)

> What would be cool (but expensive maybe) , would be to get a cookie from
> rcu_read_lock(), and check the cookie at rcu_dereference(). These
> cookies would have system wide scope to catch any kind of errors.

I suspect that your cookie and my counter are quite similar.

> Because a per thread counter would not catch following problem :
> 
> rcu_read_lock();
> 
> ptr = rcu_dereference(x);
> if (!ptr)
> 	return NULL;
> ...
> 
> 
> rcu_read_unlock();
> 
> ...
> rcu_read_lock();
> /* no reload of x, ptr might be now stale/freed */
> if (ptr->field) { ... }

Well, that is why I needed to appeal to compiler magic or an API
extension.  Either way, the pointer returned from rcu_dereference()
must be tagged with the ID of the outermost rcu_read_lock() that
covers it.  Then that tag must be checked against that of the outermost
rcu_read_lock() when it is dereferenced.  If we introduced yet another
set of RCU API members (shudder!) with which to wrapper your
ptr->field use, we could associate additional storage with the
pointer to hold the cookie/counter.  And then use sparse to enforce
use of the new API.

Of course, if we were using C++, we could define a template for pointers
returned by rcu_dereference() in order to make this work.  Now, where
did I put that asbestos suit, you know, the one with the titanium
pinstripes?  ;-)

But even that has some "interesting" issues.  To see this, let's
modify your example a bit:

	rcu_read_lock();

	...

	rcu_read_lock_bh();

	ptr = rcu_dereference(x);
	if (!ptr)
		return NULL;
	...


	rcu_read_unlock_bh();

	...
	/* no reload of x, ptr might be now stale/freed */
	if (ptr->field) { ... }

	...

	rcu_read_unlock();


Now, is it the rcu_read_lock() or the rcu_read_lock_bh() that protects
the rcu_dereference()?

							Thanx, Paul

next prev parent reply	other threads:[~2013-04-26 16:30 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-26  1:45 [PATCH 0/2] sched: Add cond_resched_rcu_lock() helper Simon Horman
2013-04-26  1:45 ` [PATCH 1/2] " Simon Horman
2013-04-26  6:13   ` Ingo Molnar
2013-04-26  1:45 ` [PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections Simon Horman
2013-04-26  6:15   ` Ingo Molnar
2013-04-30  2:45     ` Simon Horman
2013-04-26  8:03   ` Peter Zijlstra
2013-04-26 15:45     ` Paul E. McKenney
2013-04-26 15:59       ` Eric Dumazet
2013-04-26 16:30         ` Paul E. McKenney [this message]
2013-04-26 17:19       ` Peter Zijlstra
2013-04-26 17:48         ` Paul E. McKenney
2013-04-26 18:26           ` Eric Dumazet
2013-04-26 19:04             ` Paul E. McKenney
2013-04-27  7:18               ` Peter Zijlstra
2013-04-27 16:17                 ` Paul E. McKenney
2013-04-27 11:32             ` Julian Anastasov
2013-04-27 16:20               ` Paul E. McKenney
2013-04-29 21:08                 ` Julian Anastasov
2013-04-29 21:30                   ` Paul E. McKenney
2013-04-29 23:12                     ` Julian Anastasov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130426163046.GG3860@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=dhaval.giani@gmail.com \
    --cc=dipankar@in.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=horms@verge.net.au \
    --cc=ja@ssi.bg \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lvs-devel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).