linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch V3 00/32] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem
@ 2017-05-24  8:15 Thomas Gleixner
  2017-05-24  8:15 ` [patch V3 01/32] cpu/hotplug: Provide cpus_read|write_[un]lock() Thomas Gleixner
                   ` (33 more replies)
  0 siblings, 34 replies; 83+ messages in thread
From: Thomas Gleixner @ 2017-05-24  8:15 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Paul McKenney

get_online_cpus() is used in hot pathes in mainline and even more so in
RT. That can show up badly under certain conditions because every locker
contends on a global mutex. RT has it's own homebrewn mitigation which is
a (badly done) open coded implementation of percpu_rwsems with recursion
support.

The proper replacement for that are percpu_rwsems, but that requires to
remove recursion support.

The conversion unearthed real locking issues which were previously not
visible because the get_online_cpus() lockdep annotation was implemented
with recursion support which prevents lockdep from tracking full dependency
chains. These potential deadlocks are not related to recursive calls, they
trigger on the first invocation because lockdep now has the full dependency
chains available.

The following patch series addresses this by

 - Cleaning up places which call get_online_cpus() nested

 - Replacing a few instances with cpu_hotplug_disable() to prevent circular
   locking dependencies.

The series is on top of 4.12-rc2. It's available in git from

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.hotplug

Changes since V2:

  - Reworked the approach vs. perf/ftrace/kprobes, which simplified the lot
  
  - Renamed get_online_cpus() to cpus_read_lock() to reflect the nature of
    the interface

  - Link the lockchains between hotplug control task and per cpu hotplug
    threads and fixed the fallout of that.

Thanks,

        tglx

---
 arch/arm/kernel/hw_breakpoint.c               |   11 -
 arch/arm/kernel/patch.c                       |    2 
 arch/arm/probes/kprobes/core.c                |    3 
 arch/arm64/include/asm/insn.h                 |    1 
 arch/arm64/kernel/insn.c                      |    5 
 arch/mips/kernel/jump_label.c                 |    2 
 arch/powerpc/kvm/book3s_hv.c                  |   14 -
 arch/powerpc/platforms/powernv/subcore.c      |    7 
 arch/s390/kernel/jump_label.c                 |    2 
 arch/s390/kernel/kprobes.c                    |    4 
 arch/s390/kernel/time.c                       |    6 
 arch/x86/events/core.c                        |    1 
 arch/x86/events/intel/cqm.c                   |   16 -
 arch/x86/kernel/cpu/mtrr/main.c               |    2 
 b/arch/sparc/kernel/jump_label.c              |    2 
 b/arch/tile/kernel/jump_label.c               |    2 
 b/arch/x86/events/intel/core.c                |   11 -
 b/arch/x86/kernel/jump_label.c                |    2 
 b/kernel/jump_label.c                         |   20 +-
 drivers/acpi/processor_driver.c               |    4 
 drivers/acpi/processor_throttling.c           |   16 -
 drivers/cpufreq/cpufreq.c                     |   21 +-
 drivers/hwtracing/coresight/coresight-etm3x.c |   20 +-
 drivers/hwtracing/coresight/coresight-etm4x.c |   20 +-
 drivers/pci/pci-driver.c                      |   47 +++--
 include/linux/cpu.h                           |   34 ++--
 include/linux/cpuhotplug.h                    |   38 ++++
 include/linux/padata.h                        |    3 
 include/linux/pci.h                           |    1 
 include/linux/perf_event.h                    |    2 
 include/linux/sched.h                         |   10 +
 include/linux/stop_machine.h                  |   26 ++-
 kernel/cpu.c                                  |  213 +++++++++++---------------
 kernel/events/core.c                          |  106 +++++++++---
 kernel/kprobes.c                              |   59 +++----
 kernel/padata.c                               |   43 ++---
 kernel/stop_machine.c                         |   11 -
 37 files changed, 444 insertions(+), 343 deletions(-)

^ permalink raw reply	[flat|nested] 83+ messages in thread
* [patch V2 24/24] cpu/hotplug: Convert hotplug locking to percpu rwsem
@ 2017-04-18 17:05 Thomas Gleixner
  2017-04-20 11:30 ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
  0 siblings, 1 reply; 83+ messages in thread
From: Thomas Gleixner @ 2017-04-18 17:05 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

[-- Attachment #1: cpu-hotplug--Convert-hotplug-locking-to-percpu-rwsem.patch --]
[-- Type: text/plain, Size: 7514 bytes --]

There are no more (known) nested calls to get_online_cpus() so it's
possible to remove the nested call magic and convert the mutex to a
percpu-rwsem, which speeds up get/put_online_cpus() significantly for the
uncontended case.

The contended case (write locked for hotplug operations) is slow anyway, so
the slightly more expensive down_write of the percpu rwsem does not matter.

[ peterz: Add lockdep assertions ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/cpu.h   |    2 
 kernel/cpu.c          |  110 +++++++-------------------------------------------
 kernel/jump_label.c   |    2 
 kernel/padata.c       |    1 
 kernel/stop_machine.c |    2 
 5 files changed, 23 insertions(+), 94 deletions(-)

--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -105,6 +105,7 @@ extern void cpu_hotplug_begin(void);
 extern void cpu_hotplug_done(void);
 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void lockdep_assert_hotplug_held(void);
 extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
 void clear_tasks_mm_cpumask(int cpu);
@@ -118,6 +119,7 @@ static inline void cpu_hotplug_done(void
 #define put_online_cpus()	do { } while (0)
 #define cpu_hotplug_disable()	do { } while (0)
 #define cpu_hotplug_enable()	do { } while (0)
+static inline void lockdep_assert_hotplug_held(void) {}
 #endif		/* CONFIG_HOTPLUG_CPU */
 
 #ifdef CONFIG_PM_SLEEP_SMP
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -27,6 +27,7 @@
 #include <linux/smpboot.h>
 #include <linux/relay.h>
 #include <linux/slab.h>
+#include <linux/percpu-rwsem.h>
 
 #include <trace/events/power.h>
 #define CREATE_TRACE_POINTS
@@ -196,121 +197,41 @@ void cpu_maps_update_done(void)
 	mutex_unlock(&cpu_add_remove_lock);
 }
 
-/* If set, cpu_up and cpu_down will return -EBUSY and do nothing.
+/*
+ * If set, cpu_up and cpu_down will return -EBUSY and do nothing.
  * Should always be manipulated under cpu_add_remove_lock
  */
 static int cpu_hotplug_disabled;
 
 #ifdef CONFIG_HOTPLUG_CPU
 
-static struct {
-	struct task_struct *active_writer;
-	/* wait queue to wake up the active_writer */
-	wait_queue_head_t wq;
-	/* verifies that no writer will get active while readers are active */
-	struct mutex lock;
-	/*
-	 * Also blocks the new readers during
-	 * an ongoing cpu hotplug operation.
-	 */
-	atomic_t refcount;
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
-} cpu_hotplug = {
-	.active_writer = NULL,
-	.wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq),
-	.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	.dep_map = STATIC_LOCKDEP_MAP_INIT("cpu_hotplug.dep_map", &cpu_hotplug.dep_map),
-#endif
-};
-
-/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/end() */
-#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
-#define cpuhp_lock_acquire_tryread() \
-				  lock_map_acquire_tryread(&cpu_hotplug.dep_map)
-#define cpuhp_lock_acquire()      lock_map_acquire(&cpu_hotplug.dep_map)
-#define cpuhp_lock_release()      lock_map_release(&cpu_hotplug.dep_map)
-
+DEFINE_STATIC_PERCPU_RWSEM(cpu_hotplug_lock);
 
 void get_online_cpus(void)
 {
-	might_sleep();
-	if (cpu_hotplug.active_writer == current)
-		return;
-	cpuhp_lock_acquire_read();
-	mutex_lock(&cpu_hotplug.lock);
-	atomic_inc(&cpu_hotplug.refcount);
-	mutex_unlock(&cpu_hotplug.lock);
+	percpu_down_read(&cpu_hotplug_lock);
 }
 EXPORT_SYMBOL_GPL(get_online_cpus);
 
 void put_online_cpus(void)
 {
-	int refcount;
-
-	if (cpu_hotplug.active_writer == current)
-		return;
-
-	refcount = atomic_dec_return(&cpu_hotplug.refcount);
-	if (WARN_ON(refcount < 0)) /* try to fix things up */
-		atomic_inc(&cpu_hotplug.refcount);
-
-	if (refcount <= 0 && waitqueue_active(&cpu_hotplug.wq))
-		wake_up(&cpu_hotplug.wq);
-
-	cpuhp_lock_release();
-
+	percpu_up_read(&cpu_hotplug_lock);
 }
 EXPORT_SYMBOL_GPL(put_online_cpus);
 
-/*
- * This ensures that the hotplug operation can begin only when the
- * refcount goes to zero.
- *
- * Note that during a cpu-hotplug operation, the new readers, if any,
- * will be blocked by the cpu_hotplug.lock
- *
- * Since cpu_hotplug_begin() is always called after invoking
- * cpu_maps_update_begin(), we can be sure that only one writer is active.
- *
- * Note that theoretically, there is a possibility of a livelock:
- * - Refcount goes to zero, last reader wakes up the sleeping
- *   writer.
- * - Last reader unlocks the cpu_hotplug.lock.
- * - A new reader arrives at this moment, bumps up the refcount.
- * - The writer acquires the cpu_hotplug.lock finds the refcount
- *   non zero and goes to sleep again.
- *
- * However, this is very difficult to achieve in practice since
- * get_online_cpus() not an api which is called all that often.
- *
- */
 void cpu_hotplug_begin(void)
 {
-	DEFINE_WAIT(wait);
-
-	cpu_hotplug.active_writer = current;
-	cpuhp_lock_acquire();
-
-	for (;;) {
-		mutex_lock(&cpu_hotplug.lock);
-		prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE);
-		if (likely(!atomic_read(&cpu_hotplug.refcount)))
-				break;
-		mutex_unlock(&cpu_hotplug.lock);
-		schedule();
-	}
-	finish_wait(&cpu_hotplug.wq, &wait);
+	percpu_down_write(&cpu_hotplug_lock);
 }
 
 void cpu_hotplug_done(void)
 {
-	cpu_hotplug.active_writer = NULL;
-	mutex_unlock(&cpu_hotplug.lock);
-	cpuhp_lock_release();
+	percpu_up_write(&cpu_hotplug_lock);
+}
+
+void lockdep_assert_hotplug_held(void)
+{
+	percpu_rwsem_assert_held(&cpu_hotplug_lock);
 }
 
 /*
@@ -344,8 +265,6 @@ void cpu_hotplug_enable(void)
 EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
 #endif	/* CONFIG_HOTPLUG_CPU */
 
-/* Notifier wrappers for transitioning to state machine */
-
 static int bringup_wait_for_ap(unsigned int cpu)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
@@ -1482,6 +1401,8 @@ int __cpuhp_setup_state_cpuslocked(enum
 	int cpu, ret = 0;
 	bool dynstate;
 
+	lockdep_assert_hotplug_held();
+
 	if (cpuhp_cb_check(state) || !name)
 		return -EINVAL;
 
@@ -1600,6 +1521,7 @@ void __cpuhp_remove_state_cpuslocked(enu
 	int cpu;
 
 	BUG_ON(cpuhp_cb_check(state));
+	lockdep_assert_hotplug_held();
 
 	mutex_lock(&cpuhp_state_mutex);
 	if (sp->multi_instance) {
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -130,6 +130,7 @@ void __static_key_slow_inc(struct static
 	 * the all CPUs, for that to be serialized against CPU hot-plug
 	 * we need to avoid CPUs coming online.
 	 */
+	lockdep_assert_hotplug_held();
 	jump_label_lock();
 	if (atomic_read(&key->enabled) == 0) {
 		atomic_set(&key->enabled, -1);
@@ -158,6 +159,7 @@ EXPORT_SYMBOL_GPL(static_key_slow_inc_cp
 static void __static_key_slow_dec(struct static_key *key,
 		unsigned long rate_limit, struct delayed_work *work)
 {
+	lockdep_assert_hotplug_held();
 	/*
 	 * The negative count check is valid even when a negative
 	 * key->enabled is in use by static_key_slow_inc(); a
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1013,6 +1013,7 @@ static struct padata_instance *padata_al
  */
 struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq)
 {
+	lockdep_assert_hotplug_held();
 	return padata_alloc(wq, cpu_possible_mask, cpu_possible_mask);
 }
 EXPORT_SYMBOL(padata_alloc_possible);
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -562,6 +562,8 @@ int stop_machine_cpuslocked(cpu_stop_fn_
 		.active_cpus = cpus,
 	};
 
+	lockdep_assert_hotplug_held();
+
 	if (!stop_machine_initialized) {
 		/*
 		 * Handle the case where stop_machine() is called

^ permalink raw reply	[flat|nested] 83+ messages in thread

end of thread, other threads:[~2017-05-30 16:25 UTC | newest]

Thread overview: 83+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-24  8:15 [patch V3 00/32] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
2017-05-24  8:15 ` [patch V3 01/32] cpu/hotplug: Provide cpus_read|write_[un]lock() Thomas Gleixner
2017-05-24 16:25   ` Paul E. McKenney
2017-05-26  8:31   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 02/32] cpu/hotplug: Provide lockdep_assert_cpus_held() Thomas Gleixner
2017-05-24 16:26   ` Paul E. McKenney
2017-05-26  8:32   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 03/32] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_cpuslocked() Thomas Gleixner
2017-05-26  8:32   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 04/32] cpu/hotplug: Add __cpuhp_state_add_instance_cpuslocked() Thomas Gleixner
2017-05-26  8:33   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 05/32] stop_machine: Provide stop_machine_cpuslocked() Thomas Gleixner
2017-05-24 17:42   ` Paul E. McKenney
2017-05-26  8:33   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 06/32] padata: Make padata_alloc() static Thomas Gleixner
2017-05-26  8:34   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 07/32] padata: Avoid nested calls to cpus_read_lock() in pcrypt_init_padata() Thomas Gleixner
2017-05-26  8:35   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 08/32] x86/mtrr: Remove get_online_cpus() from mtrr_save_state() Thomas Gleixner
2017-05-26  8:35   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 09/32] cpufreq: Use cpuhp_setup_state_nocalls_cpuslocked() Thomas Gleixner
2017-05-26  8:36   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 10/32] KVM/PPC/Book3S HV: " Thomas Gleixner
2017-05-26  8:36   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 11/32] hwtracing/coresight-etm3x: " Thomas Gleixner
2017-05-25 16:46   ` Mathieu Poirier
2017-05-26  8:37   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 12/32] hwtracing/coresight-etm4x: " Thomas Gleixner
2017-05-25 16:47   ` Mathieu Poirier
2017-05-26  8:37   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 13/32] perf/x86/intel/cqm: Use cpuhp_setup_state_cpuslocked() Thomas Gleixner
2017-05-26  8:38   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 14/32] ARM/hw_breakpoint: " Thomas Gleixner
2017-05-26  8:38   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 15/32] s390/kernel: Use stop_machine_cpuslocked() Thomas Gleixner
2017-05-24 10:57   ` Heiko Carstens
2017-05-26  8:39   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 16/32] powerpc/powernv: " Thomas Gleixner
2017-05-26  8:40   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 17/32] cpu/hotplug: Use stop_machine_cpuslocked() in takedown_cpu() Thomas Gleixner
2017-05-26  8:40   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 18/32] x86/perf: Drop EXPORT of perf_check_microcode Thomas Gleixner
2017-05-26  8:41   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 19/32] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() Thomas Gleixner
2017-05-26  8:41   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2017-05-24  8:15 ` [patch V3 20/32] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-05-26  8:42   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 21/32] PCI: Replace the racy recursion prevention Thomas Gleixner
2017-05-26  8:42   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 22/32] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-05-26  8:43   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 23/32] perf/tracing/cpuhotplug: Fix locking order Thomas Gleixner
2017-05-24 18:30   ` Paul E. McKenney
2017-05-24 18:47     ` Thomas Gleixner
2017-05-24 21:10       ` Paul E. McKenney
2017-05-30 11:22     ` Peter Zijlstra
2017-05-30 16:25       ` Paul E. McKenney
2017-05-26  8:43   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 24/32] jump_label: Reorder hotplug lock and jump_label_lock Thomas Gleixner
2017-05-24 12:50   ` David Miller
2017-05-26  8:44   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 25/32] kprobes: Cure hotplug lock ordering issues Thomas Gleixner
2017-05-24 15:54   ` Masami Hiramatsu
2017-05-26  7:47     ` Thomas Gleixner
2017-05-26  8:45   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 26/32] arm64: Prevent cpu hotplug rwsem recursion Thomas Gleixner
2017-05-26  8:45   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 27/32] arm: Prevent " Thomas Gleixner
2017-05-26  8:46   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 28/32] s390: " Thomas Gleixner
2017-05-24 10:57   ` Heiko Carstens
2017-05-26  8:46   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 29/32] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
2017-05-26  8:47   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 30/32] sched: Provide is_percpu_thread() helper Thomas Gleixner
2017-05-26  8:47   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 31/32] acpi/processor: Prevent cpu hotplug deadlock Thomas Gleixner
2017-05-26  8:48   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24  8:15 ` [patch V3 32/32] cpuhotplug: Link lock stacks for hotplug callbacks Thomas Gleixner
2017-05-26  8:48   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2017-05-24 16:22 ` [patch V3 00/32] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Paul E. McKenney
2017-05-26  7:03 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2017-04-18 17:05 [patch V2 24/24] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
2017-04-20 11:30 ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).