From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Sebastian Siewior <bigeasy@linutronix.de>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [patch V3 23/32] perf/tracing/cpuhotplug: Fix locking order
Date: Wed, 24 May 2017 11:30:18 -0700
Message-ID: <20170524183018.GH3956@linux.vnet.ibm.com>
In-Reply-To: <20170524081548.930941109@linutronix.de>

On Wed, May 24, 2017 at 10:15:34AM +0200, Thomas Gleixner wrote:
> perf, tracing, kprobes and jump_labels have a gazillion of ways to create
> dependency lock chains. Some of those involve nested invocations of
> get_online_cpus().
>
> The conversion of the hotplug locking to a percpu rwsem requires to avoid
> such nested calls. sys_perf_event_open() protects most of the syscall logic
> against cpu hotplug. This causes nested calls and lock inversions versus
> ftrace and kprobes in various interesting ways.
>
> It's impossible to move the hotplug locking to the outer end of all call
> chains in the involved facilities, so the hotplug protection in
> sys_perf_event_open() needs to be solved differently.
>
> Introduce 'pmus_mutex' which protects a perf private online cpumask. This
> mutex is taken when the mask is updated in the cpu hotplug callbacks and
> can be taken in sys_perf_event_open() to protect the swhash setup/teardown
> code and when the final judgement about a valid event has to be made.
>
> [ tglx: Produced changelog and fixed the swhash interaction ]
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

One question below about use of cpus_read_lock().

Thanx, Paul

> ---
> include/linux/perf_event.h | 2
> kernel/events/core.c | 106 ++++++++++++++++++++++++++++++++-------------
> 2 files changed, 78 insertions(+), 30 deletions(-)
>
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -801,6 +801,8 @@ struct perf_cpu_context {
>
> struct list_head sched_cb_entry;
> int sched_cb_usage;
> +
> + int online;
> };
>
> struct perf_output_handle {
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -389,6 +389,7 @@ static atomic_t nr_switch_events __read_
> static LIST_HEAD(pmus);
> static DEFINE_MUTEX(pmus_lock);
> static struct srcu_struct pmus_srcu;
> +static cpumask_var_t perf_online_mask;
>
> /*
> * perf event paranoia level:
> @@ -3812,14 +3813,6 @@ find_get_context(struct pmu *pmu, struct
> if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
> return ERR_PTR(-EACCES);
>
> - /*
> - * We could be clever and allow to attach a event to an
> - * offline CPU and activate it when the CPU comes up, but
> - * that's for later.
> - */
> - if (!cpu_online(cpu))
> - return ERR_PTR(-ENODEV);
> -
> cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
> ctx = &cpuctx->ctx;
> get_ctx(ctx);
> @@ -7703,7 +7696,8 @@ static int swevent_hlist_get_cpu(int cpu
> int err = 0;
>
> mutex_lock(&swhash->hlist_mutex);
> - if (!swevent_hlist_deref(swhash) && cpu_online(cpu)) {
> + if (!swevent_hlist_deref(swhash) &&
> + cpumask_test_cpu(cpu, perf_online_mask)) {
> struct swevent_hlist *hlist;
>
> hlist = kzalloc(sizeof(*hlist), GFP_KERNEL);
> @@ -7724,7 +7718,7 @@ static int swevent_hlist_get(void)
> {
> int err, cpu, failed_cpu;
>
> - get_online_cpus();
> + mutex_lock(&pmus_lock);
> for_each_possible_cpu(cpu) {
> err = swevent_hlist_get_cpu(cpu);
> if (err) {
> @@ -7732,8 +7726,7 @@ static int swevent_hlist_get(void)
> goto fail;
> }
> }
> - put_online_cpus();
> -
> + mutex_unlock(&pmus_lock);
> return 0;
> fail:
> for_each_possible_cpu(cpu) {
> @@ -7741,8 +7734,7 @@ static int swevent_hlist_get(void)
> break;
> swevent_hlist_put_cpu(cpu);
> }
> -
> - put_online_cpus();
> + mutex_unlock(&pmus_lock);
> return err;
> }
>
> @@ -8920,7 +8912,7 @@ perf_event_mux_interval_ms_store(struct
> pmu->hrtimer_interval_ms = timer;
>
> /* update all cpuctx for this PMU */
> - get_online_cpus();
> + cpus_read_lock();

OK, I'll bite...

Why is this piece using cpus_read_lock() instead of pmus_lock?

My guess is for the benefit of the cpu_function_call() below, but if
the code instead cycled through the perf_online_mask, wouldn't any
CPU selected be guaranteed to be online?

Or is there some reason that it would be necessary to specially handle
CPUs that perf does not consider to be active, but that are still at
least partway online?

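To make the question concrete, here is a userspace toy model of the
alternative. It is not kernel code and not anything from the patch: the
toy_ names are invented for illustration, a pthread mutex stands in for
pmus_lock, an unsigned long stands in for perf_online_mask, and a counter
bump stands in for cpu_function_call(). It only shows that a walk over
the private mask, done under the same mutex the hotplug callbacks take,
visits exactly the CPUs that perf currently considers online.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_NR_CPUS 8

/* Assumption: stands in for pmus_lock. */
static pthread_mutex_t toy_pmus_lock = PTHREAD_MUTEX_INITIALIZER;
/* Assumption: stands in for perf_online_mask; bit n set means CPU n. */
static unsigned long toy_perf_online_mask;
/* Counts how often the stand-in for cpu_function_call() hit each CPU. */
static int toy_restart_count[TOY_NR_CPUS];

/* Models the hotplug online callback: set the bit under the mutex. */
static void toy_perf_event_init_cpu(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	pthread_mutex_lock(&toy_pmus_lock);
	toy_perf_online_mask |= 1UL << cpu;
	pthread_mutex_unlock(&toy_pmus_lock);
}

/* Models the hotplug offline callback: clear the bit under the mutex. */
static void toy_perf_event_exit_cpu(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	pthread_mutex_lock(&toy_pmus_lock);
	toy_perf_online_mask &= ~(1UL << cpu);
	pthread_mutex_unlock(&toy_pmus_lock);
}

/*
 * Models the mux-interval update loop, walking the private mask under
 * the mutex instead of the global online mask under cpus_read_lock().
 * Returns the number of CPUs visited.
 */
static int toy_mux_interval_store(void)
{
	unsigned int cpu;
	int visited = 0;

	pthread_mutex_lock(&toy_pmus_lock);
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(toy_perf_online_mask & (1UL << cpu)))
			continue;
		toy_restart_count[cpu]++;	/* stand-in for cpu_function_call() */
		visited++;
	}
	pthread_mutex_unlock(&toy_pmus_lock);
	return visited;
}
```

Whether this would be acceptable in the real code depends on what may be
called with pmus_lock held, which the toy model says nothing about.
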
> for_each_online_cpu(cpu) {
> struct perf_cpu_context *cpuctx;
> cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
> @@ -8929,7 +8921,7 @@ perf_event_mux_interval_ms_store(struct
> cpu_function_call(cpu,
> (remote_function_f)perf_mux_hrtimer_restart, cpuctx);
> }
> - put_online_cpus();
> + cpus_read_unlock();
> mutex_unlock(&mux_interval_mutex);
>
> return count;
> @@ -9059,6 +9051,7 @@ int perf_pmu_register(struct pmu *pmu, c
> lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
> lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
> cpuctx->ctx.pmu = pmu;
> + cpuctx->online = cpumask_test_cpu(cpu, perf_online_mask);
>
> __perf_mux_hrtimer_init(cpuctx, cpu);
> }
> @@ -9882,12 +9875,10 @@ SYSCALL_DEFINE5(perf_event_open,
> goto err_task;
> }
>
> - get_online_cpus();
> -
> if (task) {
> err = mutex_lock_interruptible(&task->signal->cred_guard_mutex);
> if (err)
> - goto err_cpus;
> + goto err_cred;
>
> /*
> * Reuse ptrace permission checks for now.
> @@ -10073,6 +10064,23 @@ SYSCALL_DEFINE5(perf_event_open,
> goto err_locked;
> }
>
> + if (!task) {
> + /*
> + * Check if the @cpu we're creating an event for is online.
> + *
> + * We use the perf_cpu_context::ctx::mutex to serialize against
> + * the hotplug notifiers. See perf_event_{init,exit}_cpu().
> + */
> + struct perf_cpu_context *cpuctx =
> + container_of(ctx, struct perf_cpu_context, ctx);
> +
> + if (!cpuctx->online) {
> + err = -ENODEV;
> + goto err_locked;
> + }
> + }
> +
> +
> /*
> * Must be under the same ctx::mutex as perf_install_in_context(),
> * because we need to serialize with concurrent event creation.
> @@ -10162,8 +10170,6 @@ SYSCALL_DEFINE5(perf_event_open,
> put_task_struct(task);
> }
>
> - put_online_cpus();
> -
> mutex_lock(&current->perf_event_mutex);
> list_add_tail(&event->owner_entry, &current->perf_event_list);
> mutex_unlock(&current->perf_event_mutex);
> @@ -10197,8 +10203,6 @@ SYSCALL_DEFINE5(perf_event_open,
> err_cred:
> if (task)
> mutex_unlock(&task->signal->cred_guard_mutex);
> -err_cpus:
> - put_online_cpus();
> err_task:
> if (task)
> put_task_struct(task);
> @@ -10253,6 +10257,21 @@ perf_event_create_kernel_counter(struct
> goto err_unlock;
> }
>
> + if (!task) {
> + /*
> + * Check if the @cpu we're creating an event for is online.
> + *
> + * We use the perf_cpu_context::ctx::mutex to serialize against
> + * the hotplug notifiers. See perf_event_{init,exit}_cpu().
> + */
> + struct perf_cpu_context *cpuctx =
> + container_of(ctx, struct perf_cpu_context, ctx);
> + if (!cpuctx->online) {
> + err = -ENODEV;
> + goto err_unlock;
> + }
> + }
> +
> if (!exclusive_event_installable(event, ctx)) {
> err = -EBUSY;
> goto err_unlock;
> @@ -10920,6 +10939,8 @@ static void __init perf_event_init_all_c
> struct swevent_htable *swhash;
> int cpu;
>
> + zalloc_cpumask_var(&perf_online_mask, GFP_KERNEL);
> +
> for_each_possible_cpu(cpu) {
> swhash = &per_cpu(swevent_htable, cpu);
> mutex_init(&swhash->hlist_mutex);
> @@ -10935,7 +10956,7 @@ static void __init perf_event_init_all_c
> }
> }
>
> -int perf_event_init_cpu(unsigned int cpu)
> +void perf_swevent_init_cpu(unsigned int cpu)
> {
> struct swevent_htable *swhash = &per_cpu(swevent_htable, cpu);
>
> @@ -10948,7 +10969,6 @@ int perf_event_init_cpu(unsigned int cpu
> rcu_assign_pointer(swhash->swevent_hlist, hlist);
> }
> mutex_unlock(&swhash->hlist_mutex);
> - return 0;
> }
>
> #if defined CONFIG_HOTPLUG_CPU || defined CONFIG_KEXEC_CORE
> @@ -10966,19 +10986,22 @@ static void __perf_event_exit_context(vo
>
> static void perf_event_exit_cpu_context(int cpu)
> {
> + struct perf_cpu_context *cpuctx;
> struct perf_event_context *ctx;
> struct pmu *pmu;
> - int idx;
>
> - idx = srcu_read_lock(&pmus_srcu);
> - list_for_each_entry_rcu(pmu, &pmus, entry) {
> - ctx = &per_cpu_ptr(pmu->pmu_cpu_context, cpu)->ctx;
> + mutex_lock(&pmus_lock);
> + list_for_each_entry(pmu, &pmus, entry) {
> + cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
> + ctx = &cpuctx->ctx;
>
> mutex_lock(&ctx->mutex);
> smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1);
> + cpuctx->online = 0;
> mutex_unlock(&ctx->mutex);
> }
> - srcu_read_unlock(&pmus_srcu, idx);
> + cpumask_clear_cpu(cpu, perf_online_mask);
> + mutex_unlock(&pmus_lock);
> }
> #else
>
> @@ -10986,6 +11009,29 @@ static void perf_event_exit_cpu_context(
>
> #endif
>
> +int perf_event_init_cpu(unsigned int cpu)
> +{
> + struct perf_cpu_context *cpuctx;
> + struct perf_event_context *ctx;
> + struct pmu *pmu;
> +
> + perf_swevent_init_cpu(cpu);
> +
> + mutex_lock(&pmus_lock);
> + cpumask_set_cpu(cpu, perf_online_mask);
> + list_for_each_entry(pmu, &pmus, entry) {
> + cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
> + ctx = &cpuctx->ctx;
> +
> + mutex_lock(&ctx->mutex);
> + cpuctx->online = 1;
> + mutex_unlock(&ctx->mutex);
> + }
> + mutex_unlock(&pmus_lock);
> +
> + return 0;
> +}
> +
> int perf_event_exit_cpu(unsigned int cpu)
> {
> perf_event_exit_cpu_context(cpu);
>
>