public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [RFC][PATCH 0/5] perf/tracing/cpuhotplug: Fix locking order
Date: Tue, 16 May 2017 05:46:06 -0700	[thread overview]
Message-ID: <20170516124606.GC3956@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170516081923.fxg67gawc44eg6i6@hirez.programming.kicks-ass.net>

On Tue, May 16, 2017 at 10:19:23AM +0200, Peter Zijlstra wrote:
> On Mon, May 15, 2017 at 11:40:43AM -0700, Paul E. McKenney wrote:
> 
> > Given that you acquire the global pmus_lock when doing the
> > get_online_cpus(), and given that CPU hotplug is rare, is it possible
> > to momentarily acquire the global pmus_lock in perf_event_init_cpu()
> > and perf_event_exit_cpu() and interact directly with that?  Then perf
> > would presumably leave alone any outgoing CPU that had already executed
> > perf_event_exit_cpu(), and also any incoming CPU that had not already
> > executed perf_event_init_cpu().
> > 
> > What prevents this approach from working?
> 
> Lack of sleep probably ;-)

I know that feeling...

> I'd blame the kids, but those have actually been very good lately.

I don't get that excuse anymore, all are on their own.  So I need
to come up with some fresh excuses.  ;-)

> You're suggesting the below on top, right? I'll run it with lockdep
> enabled after I chase some regression..

Something like this, yes.  Maybe even exactly like this.  ;-)

> ---
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8997,7 +8997,6 @@ int perf_pmu_register(struct pmu *pmu, c
>  {
>  	int cpu, ret;
> 
> -	get_online_cpus();
>  	mutex_lock(&pmus_lock);
>  	ret = -ENOMEM;
>  	pmu->pmu_disable_count = alloc_percpu(int);

There is usually also some state check in here somewhere for the CPU
being offline from a perf perspective.  Such a check might already exist,
but I must plead ignorance of perf.

> @@ -9093,7 +9092,6 @@ int perf_pmu_register(struct pmu *pmu, c
>  	ret = 0;
>  unlock:
>  	mutex_unlock(&pmus_lock);
> -	put_online_cpus();
> 
>  	return ret;
> 
> @@ -11002,10 +11000,9 @@ static void perf_event_exit_cpu_context(
>  	struct perf_cpu_context *cpuctx;
>  	struct perf_event_context *ctx;
>  	struct pmu *pmu;
> -	int idx;
> 
> -	idx = srcu_read_lock(&pmus_srcu);
> -	list_for_each_entry_rcu(pmu, &pmus, entry) {
> +	mutex_lock(&pmus_lock);

If the state change checked for by perf_pmu_register() needs to be also
guarded by ctx->mutex, this looks right to me.

Just for completeness, the other style is to maintain separate per-CPU
state, in which case you would instead acquire pmus_lock, mark this
CPU off-limits to more perf_pmu_register() usage, release pmus_lock,
then clean up any old usage.

The approach you have here seems to work best when the cleanup
and initialization naturally mark the CPU as off limits and ready,
respectively.  The other style seems to work best when you need a separate
indication of which CPUs are off limits and usable.

RCU is an example of the other style, with the rcu_node structure's
->qsmaskinitnext mask serving to mark which CPUs usable.  One reason
that the other style works so well for RCU is that a CPU coming online
has no effect on the current grace period, so rcu_cpu_starting() just
sets the CPU's bit in ->qsmaskinitnext, which takes effect only once
the the next grace period starts.

It is quite possible that many of the other use cases instead need to
use something like what you have here.  I suspect that the common case
is that a CPU appearing or disappearing must have some immediate effect.

> +	list_for_each_entry(pmu, &pmus, entry) {
>  		cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
>  		ctx = &cpuctx->ctx;
> 
> @@ -11014,7 +11011,7 @@ static void perf_event_exit_cpu_context(
>  		cpuctx->online = 0;
>  		mutex_unlock(&ctx->mutex);
>  	}
> -	srcu_read_unlock(&pmus_srcu, idx);
> +	mutex_unlock(&pmus_lock);
>  }
>  #else
> 
> @@ -11027,12 +11024,11 @@ int perf_event_init_cpu(unsigned int cpu
>  	struct perf_cpu_context *cpuctx;
>  	struct perf_event_context *ctx;
>  	struct pmu *pmu;
> -	int idx;
> 
>  	perf_swevent_init_cpu(cpu);
> 
> -	idx = srcu_read_lock(&pmus_srcu);
> -	list_for_each_entry_rcu(pmu, &pmus, entry) {
> +	mutex_lock(&pmus_lock);
> +	list_for_each_entry(pmu, &pmus, entry) {
>  		cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
>  		ctx = &cpuctx->ctx;
> 
> @@ -11040,7 +11036,7 @@ int perf_event_init_cpu(unsigned int cpu
>  		cpuctx->online = 1;
>  		mutex_unlock(&ctx->mutex);
>  	}
> -	srcu_read_unlock(&pmus_srcu, idx);
> +	mutex_unlock(&pmus_lock);

And same here.

Again for completeness, the other style would be to mark this CPU
as ready for perf usage at the very end, protected by pmus_lock.

							Thanx, Paul

>  	return 0;
>  }
> 

  reply	other threads:[~2017-05-16 12:46 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-12 17:15 [RFC][PATCH 0/5] perf/tracing/cpuhotplug: Fix locking order Steven Rostedt
2017-05-12 17:15 ` [RFC][PATCH 1/5] tracing: Make sure RCU is watching before calling a stack trace Steven Rostedt
2017-05-12 18:25   ` Paul E. McKenney
2017-05-12 18:36     ` Steven Rostedt
2017-05-12 18:50       ` Paul E. McKenney
2017-05-12 20:05         ` Steven Rostedt
2017-05-12 20:31           ` Paul E. McKenney
2017-05-17 16:46             ` Steven Rostedt
2017-05-12 17:15 ` [RFC][PATCH 2/5] cpu-hotplug: Allow get_online_cpus() to nest Steven Rostedt
2017-05-12 18:35   ` Paul E. McKenney
2017-05-12 18:40     ` Steven Rostedt
2017-05-12 18:52       ` Paul E. McKenney
2017-05-12 22:15   ` Thomas Gleixner
2017-05-13  0:23     ` Steven Rostedt
2017-05-12 17:15 ` [RFC][PATCH 3/5] kprobes: Take get_online_cpus() before taking jump_label_lock() Steven Rostedt
2017-05-12 18:39   ` Paul E. McKenney
2017-05-12 18:44     ` Steven Rostedt
2017-05-17 17:50   ` Masami Hiramatsu
2017-05-12 17:15 ` [RFC][PATCH 4/5] tracepoints: Grab get_online_cpus() before taking tracepoints_mutex Steven Rostedt
2017-05-12 17:15 ` [RFC][PATCH 5/5] perf: Grab event_mutex before taking get_online_cpus() Steven Rostedt
2017-05-12 18:13 ` [RFC][PATCH 0/5] perf/tracing/cpuhotplug: Fix locking order Paul E. McKenney
2017-05-12 19:49 ` Peter Zijlstra
2017-05-12 20:14   ` Steven Rostedt
2017-05-12 21:34   ` Steven Rostedt
2017-05-13 13:40     ` Paul E. McKenney
2017-05-15  9:03       ` Peter Zijlstra
2017-05-15 18:40         ` Paul E. McKenney
2017-05-16  8:19           ` Peter Zijlstra
2017-05-16 12:46             ` Paul E. McKenney [this message]
2017-05-16 14:27               ` Paul E. McKenney
2017-05-17 10:40                 ` Peter Zijlstra
2017-05-17 14:55                   ` Paul E. McKenney
2017-05-18  3:58                     ` Paul E. McKenney
2017-05-15 19:06   ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170516124606.GC3956@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox