All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	a.p.zijlstra@chello.nl, paulus@samba.org, acme@redhat.com
Subject: Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
Date: Thu, 13 May 2010 22:48:58 +0200	[thread overview]
Message-ID: <20100513204855.GD5377@nowhere> (raw)
In-Reply-To: <20100513194605.GG2879@linux.vnet.ibm.com>

On Thu, May 13, 2010 at 12:46:05PM -0700, Paul E. McKenney wrote:
> On Thu, May 13, 2010 at 09:03:28PM +0200, Frederic Weisbecker wrote:
> > On Thu, May 13, 2010 at 11:25:56AM -0700, Paul E. McKenney wrote:
> > > This commit guesses at the perf_cpu_context locking design and deploys
> > > an rcu_dereference_check() accordingly.  The design appears to require
> > > that a given CPU be accessing its own per_cpu_context or that it be
> > > traversing under RCU protection.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > > 
> > >  perf_event.c |    4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > > index a4fa381..002791c 100644
> > > --- a/kernel/perf_event.c
> > > +++ b/kernel/perf_event.c
> > > @@ -4074,7 +4074,9 @@ find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> > >  
> > >  	hash = swevent_hash(type, event_id);
> > >  
> > > -	hlist = rcu_dereference(ctx->swevent_hlist);
> > > +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> > > +				      rcu_read_lock_held() ||
> > > +				      ctx == &__get_cpu_var(perf_cpu_context));
> > >  	if (!hlist)
> > >  		return NULL;
> > >  
> > 
> > 
> > Hmm, that's not exactly that. It will always be the ctx of this cpu
> > but not always under rcu read lock. I mean touching the current cpu
> > ctx is not inherently safe.
> > 
> > In fact we have two paths:
> > 
> > perf_swevent_enable() gets the hlist and if it is called it means
> > that this hlist is not supposed to be NULL. If it is, it's a bug.
> > 
> > If we have created a software event, the hlist has been allocated
> > and perf_swevent_enable() is called later to activate this event.
> > May be I shouldn't use rcu_dereference() here but a simple dereference.
> > And the hlist can't be freed under us at this time so we don't need
> > rcu_read_lock().
> > 
> > OTOH, do_perf_sw_event() can be called anytime so it need this
> > rcu_read_lock().
> > 
> > 
> > On the perf_swevent_enable() path, what prevents the hlist to be
> > freed under us is the ctx->lock. Because we won't ever remove
> > an event from its context list outside this lock, and we might only
> > release the hlist after a software event gets removed from its
> > context list.
> > 
> > So either we do this:
> > 
> > hlist = rcu_dereference_check(ctx->swevent_hlist,
> >                               rcu_read_lock_held() ||
> >                               raw_spin_lock_is_held(&ctx->lock));
> 
> Something very similar to the above was in fact my first guess, but as
> Ingo can attest, this results in build errors.  The problem is that
> perf_cpu_context does not have a ->lock field.  Hmmm...  But it does
> contain a struct perf_event_context (ctx) and a pointer to another
> (task_ctx).  So, would one of the following work?
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->ctx.lock));
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->task_ctx->lock));
> 
> If the latter, can ->task_ctx ever be NULL, and if so, what should
> I do then?
> 
> > or:
> > 
> > 
> > hlist = ctx->swevent_hlist;
> 
> This will be flagged as an error by Arnd's sparse-based checking.  :-(
> 
> 							Thanx, Paul


Paul, does that look good to you?
I've only compile tested. But if it's fine for you, I'll test
it for real and provide you a sane patch.

This version won't use more parameters than necessary in the
off case.


diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index a4fa381..d765e48 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4066,19 +4066,36 @@ static inline u64 swevent_hash(u64 type, u32 event_id)
 	return hash_64(val, SWEVENT_HLIST_BITS);
 }
 
-static struct hlist_head *
-find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
+static inline struct hlist_head *
+__find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id)
 {
 	u64 hash;
-	struct swevent_hlist *hlist;
 
 	hash = swevent_hash(type, event_id);
 
+	return &hlist->heads[hash];
+}
+
+static inline struct hlist_head *
+find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id)
+{
+	struct swevent_hlist *hlist;
+
 	hlist = rcu_dereference(ctx->swevent_hlist);
-	if (!hlist)
-		return NULL;
 
-	return &hlist->heads[hash];
+	return __find_swevent_head(hlist, type, event_id);
+}
+
+static inline struct hlist_head *
+find_swevent_head(struct perf_cpu_context *ctx, u64 type,
+		  u32 event_id, struct perf_event *event)
+{
+	struct swevent_hlist *hlist;
+
+	hlist = rcu_dereference_check(ctx->swevent_hlist,
+				      lockdep_is_held(&event->ctx->lock));
+
+	return __find_swevent_head(hlist, type, event_id);
 }
 
 static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
@@ -4095,7 +4112,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
 
 	rcu_read_lock();
 
-	head = find_swevent_head(cpuctx, type, event_id);
+	head = find_swevent_head_rcu(cpuctx, type, event_id);
 
 	if (!head)
 		goto end;
@@ -4178,7 +4195,8 @@ static int perf_swevent_enable(struct perf_event *event)
 		perf_swevent_set_period(event);
 	}
 
-	head = find_swevent_head(cpuctx, event->attr.type, event->attr.config);
+	head = find_swevent_head(cpuctx, event->attr.type,
+				 event->attr.config, event);
 	if (WARN_ON_ONCE(!head))
 		return -EINVAL;
 



  parent reply	other threads:[~2010-05-13 20:49 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-13 18:25 [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat Paul E. McKenney
2010-05-13 19:03 ` Frederic Weisbecker
2010-05-13 19:46   ` Paul E. McKenney
2010-05-13 20:26     ` Frederic Weisbecker
2010-05-13 20:48     ` Frederic Weisbecker [this message]
2010-05-13 20:59       ` Frederic Weisbecker
2010-05-13 23:43         ` Paul E. McKenney
2010-05-20  7:01           ` Frederic Weisbecker
2010-05-14  7:47       ` Peter Zijlstra
2010-05-20  7:04         ` Frederic Weisbecker
2010-05-14  7:44   ` Peter Zijlstra
2010-05-20  7:04     ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100513204855.GD5377@nowhere \
    --to=fweisbec@gmail.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.