linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
@ 2010-05-13 18:25 Paul E. McKenney
  2010-05-13 19:03 ` Frederic Weisbecker
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2010-05-13 18:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo, fweisbec, a.p.zijlstra, paulus, acme

This commit guesses at the perf_cpu_context locking design and deploys
an rcu_dereference_check() accordingly.  The design appears to require
that a given CPU be accessing its own per_cpu_context or that it be
traversing under RCU protection.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

 perf_event.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index a4fa381..002791c 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4074,7 +4074,9 @@ find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
 
 	hash = swevent_hash(type, event_id);
 
-	hlist = rcu_dereference(ctx->swevent_hlist);
+	hlist = rcu_dereference_check(ctx->swevent_hlist,
+				      rcu_read_lock_held() ||
+				      ctx == &__get_cpu_var(perf_cpu_context));
 	if (!hlist)
 		return NULL;
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 18:25 [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat Paul E. McKenney
@ 2010-05-13 19:03 ` Frederic Weisbecker
  2010-05-13 19:46   ` Paul E. McKenney
  2010-05-14  7:44   ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-13 19:03 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 11:25:56AM -0700, Paul E. McKenney wrote:
> This commit guesses at the perf_cpu_context locking design and deploys
> an rcu_dereference_check() accordingly.  The design appears to require
> that a given CPU be accessing its own per_cpu_context or that it be
> traversing under RCU protection.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> 
>  perf_event.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index a4fa381..002791c 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4074,7 +4074,9 @@ find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
>  
>  	hash = swevent_hash(type, event_id);
>  
> -	hlist = rcu_dereference(ctx->swevent_hlist);
> +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> +				      rcu_read_lock_held() ||
> +				      ctx == &__get_cpu_var(perf_cpu_context));
>  	if (!hlist)
>  		return NULL;
>  


Hmm, that's not exactly that. It will always be the ctx of this cpu
but not always under rcu read lock. I mean touching the current cpu
ctx is not inherently safe.

In fact we have two paths:

perf_swevent_enable() gets the hlist and if it is called it means
that this hlist is not supposed to be NULL. If it is, it's a bug.

If we have created a software event, the hlist has been allocated
and perf_swevent_enable() is called later to activate this event.
May be I shouldn't use rcu_dereference() here but a simple dereference.
And the hlist can't be freed under us at this time so we don't need
rcu_read_lock().

OTOH, do_perf_sw_event() can be called anytime so it need this
rcu_read_lock().


On the perf_swevent_enable() path, what prevents the hlist to be
freed under us is the ctx->lock. Because we won't ever remove
an event from its context list outside this lock, and we might only
release the hlist after a software event gets removed from its
context list.

So either we do this:

hlist = rcu_dereference_check(ctx->swevent_hlist,
                              rcu_read_lock_held() ||
                              raw_spin_lock_is_held(&ctx->lock));

or:


hlist = ctx->swevent_hlist;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 19:03 ` Frederic Weisbecker
@ 2010-05-13 19:46   ` Paul E. McKenney
  2010-05-13 20:26     ` Frederic Weisbecker
  2010-05-13 20:48     ` Frederic Weisbecker
  2010-05-14  7:44   ` Peter Zijlstra
  1 sibling, 2 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-05-13 19:46 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 09:03:28PM +0200, Frederic Weisbecker wrote:
> On Thu, May 13, 2010 at 11:25:56AM -0700, Paul E. McKenney wrote:
> > This commit guesses at the perf_cpu_context locking design and deploys
> > an rcu_dereference_check() accordingly.  The design appears to require
> > that a given CPU be accessing its own per_cpu_context or that it be
> > traversing under RCU protection.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > 
> >  perf_event.c |    4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > index a4fa381..002791c 100644
> > --- a/kernel/perf_event.c
> > +++ b/kernel/perf_event.c
> > @@ -4074,7 +4074,9 @@ find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> >  
> >  	hash = swevent_hash(type, event_id);
> >  
> > -	hlist = rcu_dereference(ctx->swevent_hlist);
> > +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> > +				      rcu_read_lock_held() ||
> > +				      ctx == &__get_cpu_var(perf_cpu_context));
> >  	if (!hlist)
> >  		return NULL;
> >  
> 
> 
> Hmm, that's not exactly that. It will always be the ctx of this cpu
> but not always under rcu read lock. I mean touching the current cpu
> ctx is not inherently safe.
> 
> In fact we have two paths:
> 
> perf_swevent_enable() gets the hlist and if it is called it means
> that this hlist is not supposed to be NULL. If it is, it's a bug.
> 
> If we have created a software event, the hlist has been allocated
> and perf_swevent_enable() is called later to activate this event.
> May be I shouldn't use rcu_dereference() here but a simple dereference.
> And the hlist can't be freed under us at this time so we don't need
> rcu_read_lock().
> 
> OTOH, do_perf_sw_event() can be called anytime so it need this
> rcu_read_lock().
> 
> 
> On the perf_swevent_enable() path, what prevents the hlist to be
> freed under us is the ctx->lock. Because we won't ever remove
> an event from its context list outside this lock, and we might only
> release the hlist after a software event gets removed from its
> context list.
> 
> So either we do this:
> 
> hlist = rcu_dereference_check(ctx->swevent_hlist,
>                               rcu_read_lock_held() ||
>                               raw_spin_lock_is_held(&ctx->lock));

Something very similar to the above was in fact my first guess, but as
Ingo can attest, this results in build errors.  The problem is that
perf_cpu_context does not have a ->lock field.  Hmmm...  But it does
contain a struct perf_event_context (ctx) and a pointer to another
(task_ctx).  So, would one of the following work?

list = rcu_dereference_check(ctx->swevent_hlist,
			     rcu_read_lock_held() ||
			     lockdep_is_held(&ctx->ctx.lock));

list = rcu_dereference_check(ctx->swevent_hlist,
			     rcu_read_lock_held() ||
			     lockdep_is_held(&ctx->task_ctx->lock));

If the latter, can ->task_ctx ever be NULL, and if so, what should
I do then?

> or:
> 
> 
> hlist = ctx->swevent_hlist;

This will be flagged as an error by Arnd's sparse-based checking.  :-(

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 19:46   ` Paul E. McKenney
@ 2010-05-13 20:26     ` Frederic Weisbecker
  2010-05-13 20:48     ` Frederic Weisbecker
  1 sibling, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-13 20:26 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 12:46:05PM -0700, Paul E. McKenney wrote:
> On Thu, May 13, 2010 at 09:03:28PM +0200, Frederic Weisbecker wrote:
> > On the perf_swevent_enable() path, what prevents the hlist to be
> > freed under us is the ctx->lock. Because we won't ever remove
> > an event from its context list outside this lock, and we might only
> > release the hlist after a software event gets removed from its
> > context list.
> > 
> > So either we do this:
> > 
> > hlist = rcu_dereference_check(ctx->swevent_hlist,
> >                               rcu_read_lock_held() ||
> >                               raw_spin_lock_is_held(&ctx->lock));
> 
> Something very similar to the above was in fact my first guess, but as
> Ingo can attest, this results in build errors.  The problem is that
> perf_cpu_context does not have a ->lock field.  Hmmm...  But it does
> contain a struct perf_event_context (ctx) and a pointer to another
> (task_ctx).  So, would one of the following work?
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->ctx.lock));
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->task_ctx->lock));



Ah right. Oh, and it depends where the current event to be enabled
come from: a task context or a cpu context, it can be one of these
two locks. The following check wouldn't be 100 % reliable:

	lockdep_is_held(&ctx->ctx.lock) || lockdep_is_held(&ctx->task_ctx->lock))

because it has a small hole: the cpu context lock can be validated while
the current event needs the task one, or opposite. But at least it narrows
down.

Yet we could get the event in the perf_swevent_enable() and do:

	event && lockdep_is_held(event->ctx->lock)

(do_perf_sw_event() can pass NULL as the event, because it doesn't have
any yet).

That has a drawback though: find_swevent_head() would require one more
parameter, for something used in a fastpath.

That's sad but the only solution I can find.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 19:46   ` Paul E. McKenney
  2010-05-13 20:26     ` Frederic Weisbecker
@ 2010-05-13 20:48     ` Frederic Weisbecker
  2010-05-13 20:59       ` Frederic Weisbecker
  2010-05-14  7:47       ` Peter Zijlstra
  1 sibling, 2 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-13 20:48 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 12:46:05PM -0700, Paul E. McKenney wrote:
> On Thu, May 13, 2010 at 09:03:28PM +0200, Frederic Weisbecker wrote:
> > On Thu, May 13, 2010 at 11:25:56AM -0700, Paul E. McKenney wrote:
> > > This commit guesses at the perf_cpu_context locking design and deploys
> > > an rcu_dereference_check() accordingly.  The design appears to require
> > > that a given CPU be accessing its own per_cpu_context or that it be
> > > traversing under RCU protection.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > > 
> > >  perf_event.c |    4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > > index a4fa381..002791c 100644
> > > --- a/kernel/perf_event.c
> > > +++ b/kernel/perf_event.c
> > > @@ -4074,7 +4074,9 @@ find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> > >  
> > >  	hash = swevent_hash(type, event_id);
> > >  
> > > -	hlist = rcu_dereference(ctx->swevent_hlist);
> > > +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> > > +				      rcu_read_lock_held() ||
> > > +				      ctx == &__get_cpu_var(perf_cpu_context));
> > >  	if (!hlist)
> > >  		return NULL;
> > >  
> > 
> > 
> > Hmm, that's not exactly that. It will always be the ctx of this cpu
> > but not always under rcu read lock. I mean touching the current cpu
> > ctx is not inherently safe.
> > 
> > In fact we have two paths:
> > 
> > perf_swevent_enable() gets the hlist and if it is called it means
> > that this hlist is not supposed to be NULL. If it is, it's a bug.
> > 
> > If we have created a software event, the hlist has been allocated
> > and perf_swevent_enable() is called later to activate this event.
> > May be I shouldn't use rcu_dereference() here but a simple dereference.
> > And the hlist can't be freed under us at this time so we don't need
> > rcu_read_lock().
> > 
> > OTOH, do_perf_sw_event() can be called anytime so it need this
> > rcu_read_lock().
> > 
> > 
> > On the perf_swevent_enable() path, what prevents the hlist to be
> > freed under us is the ctx->lock. Because we won't ever remove
> > an event from its context list outside this lock, and we might only
> > release the hlist after a software event gets removed from its
> > context list.
> > 
> > So either we do this:
> > 
> > hlist = rcu_dereference_check(ctx->swevent_hlist,
> >                               rcu_read_lock_held() ||
> >                               raw_spin_lock_is_held(&ctx->lock));
> 
> Something very similar to the above was in fact my first guess, but as
> Ingo can attest, this results in build errors.  The problem is that
> perf_cpu_context does not have a ->lock field.  Hmmm...  But it does
> contain a struct perf_event_context (ctx) and a pointer to another
> (task_ctx).  So, would one of the following work?
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->ctx.lock));
> 
> list = rcu_dereference_check(ctx->swevent_hlist,
> 			     rcu_read_lock_held() ||
> 			     lockdep_is_held(&ctx->task_ctx->lock));
> 
> If the latter, can ->task_ctx ever be NULL, and if so, what should
> I do then?
> 
> > or:
> > 
> > 
> > hlist = ctx->swevent_hlist;
> 
> This will be flagged as an error by Arnd's sparse-based checking.  :-(
> 
> 							Thanx, Paul


Paul, does that look good to you?
I've only compile tested. But if it's fine for you, I'll test
it for real and provide you a sane patch.

This version won't use more parameters than necessary in the
off case.


diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index a4fa381..d765e48 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4066,19 +4066,36 @@ static inline u64 swevent_hash(u64 type, u32 event_id)
 	return hash_64(val, SWEVENT_HLIST_BITS);
 }
 
-static struct hlist_head *
-find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
+static inline struct hlist_head *
+__find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id)
 {
 	u64 hash;
-	struct swevent_hlist *hlist;
 
 	hash = swevent_hash(type, event_id);
 
+	return &hlist->heads[hash];
+}
+
+static inline struct hlist_head *
+find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id)
+{
+	struct swevent_hlist *hlist;
+
 	hlist = rcu_dereference(ctx->swevent_hlist);
-	if (!hlist)
-		return NULL;
 
-	return &hlist->heads[hash];
+	return __find_swevent_head(hlist, type, event_id);
+}
+
+static inline struct hlist_head *
+find_swevent_head(struct perf_cpu_context *ctx, u64 type,
+		  u32 event_id, struct perf_event *event)
+{
+	struct swevent_hlist *hlist;
+
+	hlist = rcu_dereference_check(ctx->swevent_hlist,
+				      lockdep_is_held(&event->ctx->lock));
+
+	return __find_swevent_head(hlist, type, event_id);
 }
 
 static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
@@ -4095,7 +4112,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
 
 	rcu_read_lock();
 
-	head = find_swevent_head(cpuctx, type, event_id);
+	head = find_swevent_head_rcu(cpuctx, type, event_id);
 
 	if (!head)
 		goto end;
@@ -4178,7 +4195,8 @@ static int perf_swevent_enable(struct perf_event *event)
 		perf_swevent_set_period(event);
 	}
 
-	head = find_swevent_head(cpuctx, event->attr.type, event->attr.config);
+	head = find_swevent_head(cpuctx, event->attr.type,
+				 event->attr.config, event);
 	if (WARN_ON_ONCE(!head))
 		return -EINVAL;
 



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 20:48     ` Frederic Weisbecker
@ 2010-05-13 20:59       ` Frederic Weisbecker
  2010-05-13 23:43         ` Paul E. McKenney
  2010-05-14  7:47       ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-13 20:59 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 10:48:58PM +0200, Frederic Weisbecker wrote:
> Paul, does that look good to you?
> I've only compile tested. But if it's fine for you, I'll test
> it for real and provide you a sane patch.
> 
> This version won't use more parameters than necessary in the
> off case.
> 
> 
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index a4fa381..d765e48 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4066,19 +4066,36 @@ static inline u64 swevent_hash(u64 type, u32 event_id)
>  	return hash_64(val, SWEVENT_HLIST_BITS);
>  }
>  
> -static struct hlist_head *
> -find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> +static inline struct hlist_head *
> +__find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id)
>  {
>  	u64 hash;
> -	struct swevent_hlist *hlist;



+	if (!hlist)
+		return NULL;



>  
>  	hash = swevent_hash(type, event_id);
>  
> +	return &hlist->heads[hash];
> +}
> +
> +static inline struct hlist_head *
> +find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> +{
> +	struct swevent_hlist *hlist;
> +
>  	hlist = rcu_dereference(ctx->swevent_hlist);
> -	if (!hlist)
> -		return NULL;
>  
> -	return &hlist->heads[hash];
> +	return __find_swevent_head(hlist, type, event_id);
> +}
> +
> +static inline struct hlist_head *
> +find_swevent_head(struct perf_cpu_context *ctx, u64 type,
> +		  u32 event_id, struct perf_event *event)
> +{
> +	struct swevent_hlist *hlist;
> +
> +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> +				      lockdep_is_held(&event->ctx->lock));
> +
> +	return __find_swevent_head(hlist, type, event_id);
>  }
>  
>  static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
> @@ -4095,7 +4112,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
>  
>  	rcu_read_lock();
>  
> -	head = find_swevent_head(cpuctx, type, event_id);
> +	head = find_swevent_head_rcu(cpuctx, type, event_id);
>  
>  	if (!head)
>  		goto end;
> @@ -4178,7 +4195,8 @@ static int perf_swevent_enable(struct perf_event *event)
>  		perf_swevent_set_period(event);
>  	}
>  
> -	head = find_swevent_head(cpuctx, event->attr.type, event->attr.config);
> +	head = find_swevent_head(cpuctx, event->attr.type,
> +				 event->attr.config, event);
>  	if (WARN_ON_ONCE(!head))
>  		return -EINVAL;
>  
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 20:59       ` Frederic Weisbecker
@ 2010-05-13 23:43         ` Paul E. McKenney
  2010-05-20  7:01           ` Frederic Weisbecker
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2010-05-13 23:43 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 10:59:26PM +0200, Frederic Weisbecker wrote:
> On Thu, May 13, 2010 at 10:48:58PM +0200, Frederic Weisbecker wrote:
> > Paul, does that look good to you?
> > I've only compile tested. But if it's fine for you, I'll test
> > it for real and provide you a sane patch.

That would be very good!!!  I believe that I have already more than proven
my ignorance of the perf_event code.  ;-)

> > This version won't use more parameters than necessary in the
> > off case.
> > 
> > 
> > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > index a4fa381..d765e48 100644
> > --- a/kernel/perf_event.c
> > +++ b/kernel/perf_event.c
> > @@ -4066,19 +4066,36 @@ static inline u64 swevent_hash(u64 type, u32 event_id)
> >  	return hash_64(val, SWEVENT_HLIST_BITS);
> >  }
> >  
> > -static struct hlist_head *
> > -find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> > +static inline struct hlist_head *
> > +__find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id)
> >  {
> >  	u64 hash;
> > -	struct swevent_hlist *hlist;
> 
> 
> 
> +	if (!hlist)
> +		return NULL;
> 
> 
> 
> >  
> >  	hash = swevent_hash(type, event_id);
> >  
> > +	return &hlist->heads[hash];
> > +}
> > +
> > +static inline struct hlist_head *
> > +find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> > +{
> > +	struct swevent_hlist *hlist;
> > +
> >  	hlist = rcu_dereference(ctx->swevent_hlist);

This is appropriate if find_swevent_head_rcu() is always invoked in an
RCU read-side critical section.  (Which at first glance does appear to
be the intent, just checking.)

> > -	if (!hlist)
> > -		return NULL;
> >  
> > -	return &hlist->heads[hash];
> > +	return __find_swevent_head(hlist, type, event_id);
> > +}
> > +
> > +static inline struct hlist_head *
> > +find_swevent_head(struct perf_cpu_context *ctx, u64 type,
> > +		  u32 event_id, struct perf_event *event)
> > +{
> > +	struct swevent_hlist *hlist;
> > +
> > +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> > +				      lockdep_is_held(&event->ctx->lock));

This could be invoked with either the event->ctx->lock held or in
an RCU read-side critical section.  If this is always called with
the update-side lock held, you can (but don't need to) instead say:

	hlist = rcu_dereference_protected(ctx->swevent_hlist,
				          lockdep_is_held(&event->ctx->lock));

This is slightly faster, as it drops the volatile casts.  In many cases,
you won't care, but in case this code path needs ultimate performance.

Also I thought it was event->ctx.lock, but at this point I trust your eyes
more than my own.  ;-)

							Thanx, Paul

> > +
> > +	return __find_swevent_head(hlist, type, event_id);
> >  }
> >  
> >  static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
> > @@ -4095,7 +4112,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
> >  
> >  	rcu_read_lock();
> >  
> > -	head = find_swevent_head(cpuctx, type, event_id);
> > +	head = find_swevent_head_rcu(cpuctx, type, event_id);
> >  
> >  	if (!head)
> >  		goto end;
> > @@ -4178,7 +4195,8 @@ static int perf_swevent_enable(struct perf_event *event)
> >  		perf_swevent_set_period(event);
> >  	}
> >  
> > -	head = find_swevent_head(cpuctx, event->attr.type, event->attr.config);
> > +	head = find_swevent_head(cpuctx, event->attr.type,
> > +				 event->attr.config, event);
> >  	if (WARN_ON_ONCE(!head))
> >  		return -EINVAL;
> >  
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 19:03 ` Frederic Weisbecker
  2010-05-13 19:46   ` Paul E. McKenney
@ 2010-05-14  7:44   ` Peter Zijlstra
  2010-05-20  7:04     ` Frederic Weisbecker
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2010-05-14  7:44 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Paul E. McKenney, linux-kernel, mingo, paulus, acme

On Thu, 2010-05-13 at 21:03 +0200, Frederic Weisbecker wrote:
> hlist = rcu_dereference_check(ctx->swevent_hlist,
>                               rcu_read_lock_held() ||
>                               raw_spin_lock_is_held(&ctx->lock)); 

raw_spin_lock_is_held() is wrong, that should read:
lockdep_is_held(&ctx->lock).

raw_spin_lock_is_held() only checks if the lock is locked,
lockdep_is_held() ensures we are the one holding it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 20:48     ` Frederic Weisbecker
  2010-05-13 20:59       ` Frederic Weisbecker
@ 2010-05-14  7:47       ` Peter Zijlstra
  2010-05-20  7:04         ` Frederic Weisbecker
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2010-05-14  7:47 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Paul E. McKenney, linux-kernel, mingo, paulus, acme

On Thu, 2010-05-13 at 22:48 +0200, Frederic Weisbecker wrote:
> +       head = find_swevent_head(cpuctx, event->attr.type,
> +                                event->attr.config, event); 

might as well only pass event and have the function do those derefs.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-13 23:43         ` Paul E. McKenney
@ 2010-05-20  7:01           ` Frederic Weisbecker
  0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-20  7:01 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, mingo, a.p.zijlstra, paulus, acme

On Thu, May 13, 2010 at 04:43:27PM -0700, Paul E. McKenney wrote:
> On Thu, May 13, 2010 at 10:59:26PM +0200, Frederic Weisbecker wrote:
> > >  	hash = swevent_hash(type, event_id);
> > >  
> > > +	return &hlist->heads[hash];
> > > +}
> > > +
> > > +static inline struct hlist_head *
> > > +find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id)
> > > +{
> > > +	struct swevent_hlist *hlist;
> > > +
> > >  	hlist = rcu_dereference(ctx->swevent_hlist);
> 
> This is appropriate if find_swevent_head_rcu() is always invoked in an
> RCU read-side critical section.  (Which at first glance does appear to
> be the intent, just checking.)



Exactly!


 
> > > -	if (!hlist)
> > > -		return NULL;
> > >  
> > > -	return &hlist->heads[hash];
> > > +	return __find_swevent_head(hlist, type, event_id);
> > > +}
> > > +
> > > +static inline struct hlist_head *
> > > +find_swevent_head(struct perf_cpu_context *ctx, u64 type,
> > > +		  u32 event_id, struct perf_event *event)
> > > +{
> > > +	struct swevent_hlist *hlist;
> > > +
> > > +	hlist = rcu_dereference_check(ctx->swevent_hlist,
> > > +				      lockdep_is_held(&event->ctx->lock));
> 
> This could be invoked with either the event->ctx->lock held or in
> an RCU read-side critical section.  If this is always called with
> the update-side lock held, you can (but don't need to) instead say:
> 
> 	hlist = rcu_dereference_protected(ctx->swevent_hlist,
> 				          lockdep_is_held(&event->ctx->lock));
> 
> This is slightly faster, as it drops the volatile casts.  In many cases,
> you won't care, but in case this code path needs ultimate performance.
> 
> Also I thought it was event->ctx.lock, but at this point I trust your eyes
> more than my own.  ;-)
> 
> 							Thanx, Paul



This is indeed never called from the read side.

In fact the situation is a bit complicated.
The true update side is:

	creation_of_perf_event() {
		mutex_lock(ctx->mutex);
		alloc/rcu_assign_pointer the hlist
		mutex_unlock(ctx->mutex);
	}
	/* the event for which we've allocated the hlist
	   can be scheduled only once the above is
	   finished */

And, in a serialized way, we can have:

	schedule_perf_event() {
		spin_lock(ctx->lock);
		rcu_deref hlist
		spin_unlock(ctx->lock);
	}

Scheduling an event that derefs the hlist will not happen if we haven't
created and assigned the hlist before.
And also scheduling an event that derefs the hlist won't happen after
the hlist has been destroyed (released), it also can't be destroyed
concurrently with the scheduling.

This is why I consider the event scheduling as part of the update side,
since it is serialized with it, the ctx->lock guarantees that.

OTOH, we can have the read side concurrently anytime, hence the need
for rcu_read_lock() there.
	
So, I guess it's justified to use rcu_dereference_protected() here.

Thanks.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-14  7:47       ` Peter Zijlstra
@ 2010-05-20  7:04         ` Frederic Weisbecker
  0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-20  7:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, mingo, paulus, acme

On Fri, May 14, 2010 at 09:47:05AM +0200, Peter Zijlstra wrote:
> On Thu, 2010-05-13 at 22:48 +0200, Frederic Weisbecker wrote:
> > +       head = find_swevent_head(cpuctx, event->attr.type,
> > +                                event->attr.config, event); 
> 
> might as well only pass event and have the function do those derefs.


Yeah, why not. But now that I see we can optimize the scheduling
side with rcu_dereference_protected, it's probably better to keep
the two derefs protected.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat
  2010-05-14  7:44   ` Peter Zijlstra
@ 2010-05-20  7:04     ` Frederic Weisbecker
  0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2010-05-20  7:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, mingo, paulus, acme

On Fri, May 14, 2010 at 09:44:24AM +0200, Peter Zijlstra wrote:
> On Thu, 2010-05-13 at 21:03 +0200, Frederic Weisbecker wrote:
> > hlist = rcu_dereference_check(ctx->swevent_hlist,
> >                               rcu_read_lock_held() ||
> >                               raw_spin_lock_is_held(&ctx->lock)); 
> 
> raw_spin_lock_is_held() is wrong, that should read:
> lockdep_is_held(&ctx->lock).
> 
> raw_spin_lock_is_held() only checks if the lock is locked,
> lockdep_is_held() ensures we are the one holding it.


Right!

Thanks.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-05-20  7:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-13 18:25 [PATCH RFC] perf: fix find_swevent_head() RCU lockdep splat Paul E. McKenney
2010-05-13 19:03 ` Frederic Weisbecker
2010-05-13 19:46   ` Paul E. McKenney
2010-05-13 20:26     ` Frederic Weisbecker
2010-05-13 20:48     ` Frederic Weisbecker
2010-05-13 20:59       ` Frederic Weisbecker
2010-05-13 23:43         ` Paul E. McKenney
2010-05-20  7:01           ` Frederic Weisbecker
2010-05-14  7:47       ` Peter Zijlstra
2010-05-20  7:04         ` Frederic Weisbecker
2010-05-14  7:44   ` Peter Zijlstra
2010-05-20  7:04     ` Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).