From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Simon Horman <horms@verge.net.au>, Julian Anastasov <ja@ssi.bg>,
Ingo Molnar <mingo@redhat.com>,
lvs-devel@vger.kernel.org, netdev@vger.kernel.org,
netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
Pablo Neira Ayuso <pablo@netfilter.org>,
Dipankar Sarma <dipankar@in.ibm.com>,
dhaval.giani@gmail.com
Subject: Re: [PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections
Date: Fri, 26 Apr 2013 09:30:46 -0700 [thread overview]
Message-ID: <20130426163046.GG3860@linux.vnet.ibm.com> (raw)
In-Reply-To: <1366991947.8964.233.camel@edumazet-glaptop>
On Fri, Apr 26, 2013 at 08:59:07AM -0700, Eric Dumazet wrote:
> On Fri, 2013-04-26 at 08:45 -0700, Paul E. McKenney wrote:
>
> > I have done some crude coccinelle patterns in the past, but they are
> > subject to false positives (from when you transfer the pointer from
> > RCU protection to reference-count protection) and also false negatives
> > (when you atomically increment some statistic unrelated to protection).
> >
> > I could imagine maintaining a per-thread count of the number of outermost
> > RCU read-side critical sections at runtime, and then associating that
> > counter with a given pointer at rcu_dereference() time, but this would
> > require either compiler magic or an API for using a pointer returned
> > by rcu_dereference(). This API could in theory be enforced by sparse.
> >
> > Dhaval Giani might have some ideas as well, adding him to CC.
>
>
> We had this fix the otherday, because tcp prequeue code hit this check :
>
> static inline struct dst_entry *skb_dst(const struct sk_buff *skb)
> {
> /* If refdst was not refcounted, check we still are in a
> * rcu_read_lock section
> */
> WARN_ON((skb->_skb_refdst & SKB_DST_NOREF) &&
> !rcu_read_lock_held() &&
> !rcu_read_lock_bh_held());
> return (struct dst_entry *)(skb->_skb_refdst & SKB_DST_PTRMASK);
> }
>
> ( http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=093162553c33e9479283e107b4431378271c735d )
>
> Problem is the rcu protected pointer was escaping the rcu lock and was
> then used in another thread.
All the way to some other thread? That is a serious escape! ;-)
> What would be cool (but expensive maybe) , would be to get a cookie from
> rcu_read_lock(), and check the cookie at rcu_dereference(). These
> cookies would have system wide scope to catch any kind of errors.
I suspect that your cookie and my counter are quite similar.
> Because a per thread counter would not catch following problem :
>
> rcu_read_lock();
>
> ptr = rcu_dereference(x);
> if (!ptr)
> return NULL;
> ...
>
>
> rcu_read_unlock();
>
> ...
> rcu_read_lock();
> /* no reload of x, ptr might be now stale/freed */
> if (ptr->field) { ... }
Well, that is why I needed to appeal to compiler magic or an API
extension. Either way, the pointer returned from rcu_dereference()
must be tagged with the ID of the outermost rcu_read_lock() that
covers it. Then that tag must be checked against that of the outermost
rcu_read_lock() when it is dereferenced. If we introduced yet another
set of RCU API members (shudder!) with which to wrapper your
ptr->field use, we could associate additional storage with the
pointer to hold the cookie/counter. And then use sparse to enforce
use of the new API.
Of course, if we were using C++, we could define a template for pointers
returned by rcu_dereference() in order to make this work. Now, where
did I put that asbestos suit, you know, the one with the titanium
pinstripes? ;-)
But even that has some "interesting" issues. To see this, let's
modify your example a bit:
rcu_read_lock();
...
rcu_read_lock_bh();
ptr = rcu_dereference(x);
if (!ptr)
return NULL;
...
rcu_read_unlock_bh();
...
/* no reload of x, ptr might be now stale/freed */
if (ptr->field) { ... }
...
rcu_read_unlock();
Now, is it the rcu_read_lock() or the rcu_read_lock_bh() that protects
the rcu_dereference()?
Thanx, Paul
next prev parent reply other threads:[~2013-04-26 16:30 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-26 1:45 [PATCH 0/2] sched: Add cond_resched_rcu_lock() helper Simon Horman
2013-04-26 1:45 ` [PATCH 1/2] " Simon Horman
2013-04-26 6:13 ` Ingo Molnar
2013-04-26 1:45 ` [PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections Simon Horman
2013-04-26 6:15 ` Ingo Molnar
2013-04-30 2:45 ` Simon Horman
2013-04-26 8:03 ` Peter Zijlstra
2013-04-26 15:45 ` Paul E. McKenney
2013-04-26 15:59 ` Eric Dumazet
2013-04-26 16:30 ` Paul E. McKenney [this message]
2013-04-26 17:19 ` Peter Zijlstra
2013-04-26 17:48 ` Paul E. McKenney
2013-04-26 18:26 ` Eric Dumazet
2013-04-26 19:04 ` Paul E. McKenney
2013-04-27 7:18 ` Peter Zijlstra
2013-04-27 16:17 ` Paul E. McKenney
2013-04-27 11:32 ` Julian Anastasov
2013-04-27 16:20 ` Paul E. McKenney
2013-04-29 21:08 ` Julian Anastasov
2013-04-29 21:30 ` Paul E. McKenney
2013-04-29 23:12 ` Julian Anastasov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130426163046.GG3860@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=dhaval.giani@gmail.com \
--cc=dipankar@in.ibm.com \
--cc=eric.dumazet@gmail.com \
--cc=horms@verge.net.au \
--cc=ja@ssi.bg \
--cc=linux-kernel@vger.kernel.org \
--cc=lvs-devel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.