[patch v4 0/2] enhance stop machine infrastructure for MTRR rendezvous sequence

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [patch v4 0/2] enhance stop machine infrastructure for MTRR rendezvous sequence
@ 2011-06-13 17:58 Suresh Siddha
  2011-06-13 17:58 ` [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path Suresh Siddha
  2011-06-13 17:58 ` [patch v4 2/2] x86, mtrr: use __stop_machine() for doing MTRR rendezvous Suresh Siddha
  0 siblings, 2 replies; 10+ messages in thread
From: Suresh Siddha @ 2011-06-13 17:58 UTC (permalink / raw)
  To: mingo, tglx, hpa, trenn, prarit, tj, rusty
  Cc: linux-kernel, suresh.b.siddha, youquan.song

First patch enhance the stop machine infrastructure so that we
can call __stop_machine() from the cpu hotplug path, where the calling
cpu is not yet online. We do allow this for already for !CONFIG_SMP. So this
patch brings the CONFIG_SMP behavior inline with !CONFIG_SMP

Second patch uses the enhanced __stop_machine() to implement the x86 MTRR
init rendezvous sequence and thus remove the duplicate implementation
of stop machine using stop_one_cpu_nowait(). This duplicate implementation
of stop machine can potentially lead to deadlock if there is a parallel
system wide rendezvous using __stop_machine().

Both these address one of the deadlocks mentioned in the
https://bugzilla.novell.com/show_bug.cgi?id=672008

Changes from v1:
* Use stop_cpu thread enabled status to find out if the cpu is online/offline,
  instead of using cpu_online(smp_processor_id()). This avoids a false
  positive of using smp_processor_id() from preemptible section.

Changes from v2:
* Use cpu_all_mask, instead of NULL mask to include the calling cpu that is
  offline.
* Make __stop_cpus static
* Conslidate wait_for_completion/polling code into
  cpu_stop_wait_for_completion() based on Tejun's review.

Changes from v3:
* Make only __stop_machine() to be called from the cpu that is not yet online
* No changes to stop_cpus() API, that can be called only from an online cpu
* Take the stop_cpus_mutex using mutex_trylock when called from the not yet
  online cpu.
* Go back to cpu_online() checks using raw_smp_processor_id() instead
  of stop_cpu thread status to determine whether calling cpu is online/offline.
  This is safe and easier to understand.

thanks,
suresh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 17:58 [patch v4 0/2] enhance stop machine infrastructure for MTRR rendezvous sequence Suresh Siddha
@ 2011-06-13 17:58 ` Suresh Siddha
  2011-06-13 19:56   ` Ingo Molnar
  2011-06-13 17:58 ` [patch v4 2/2] x86, mtrr: use __stop_machine() for doing MTRR rendezvous Suresh Siddha
  1 sibling, 1 reply; 10+ messages in thread
From: Suresh Siddha @ 2011-06-13 17:58 UTC (permalink / raw)
  To: mingo, tglx, hpa, trenn, prarit, tj, rusty
  Cc: linux-kernel, suresh.b.siddha, youquan.song, stable

[-- Attachment #1: stop_machine_from_cpu_hotplug_path.patch --]
[-- Type: text/plain, Size: 7179 bytes --]

Currently stop machine infrastructure can be called only from a cpu that is
online. But for !CONFIG_SMP, we do allow calling __stop_machine() before the
cpu is online.

x86 for example requires stop machine infrastructure in the cpu online path
and currently implements its own stop machine (using stop_one_cpu_nowait())
for MTRR initialization in the cpu online path.

Enhance the __stop_machine() so that it can be called before the cpu
is onlined. This will pave the way for code consolidation and address potential
deadlocks caused by multiple mechanisms of doing system wide rendezvous.

This will also address the behavioral differences of __stop_machine()
between SMP and UP builds.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org    # v2.6.35+
---
 include/linux/stop_machine.h |   11 +++--
 kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 93 insertions(+), 9 deletions(-)

Index: linux-2.6-tip/kernel/stop_machine.c
===================================================================
--- linux-2.6-tip.orig/kernel/stop_machine.c
+++ linux-2.6-tip/kernel/stop_machine.c
@@ -28,6 +28,7 @@
 struct cpu_stop_done {
 	atomic_t		nr_todo;	/* nr left to execute */
 	bool			executed;	/* actually executed? */
+	bool			caller_offline; /* stop_cpu caller cpu status */
 	int			ret;		/* collected return value */
 	struct completion	completion;	/* fired if nr_todo reaches 0 */
 };
@@ -47,15 +48,24 @@ static void cpu_stop_init_done(struct cp
 	memset(done, 0, sizeof(*done));
 	atomic_set(&done->nr_todo, nr_todo);
 	init_completion(&done->completion);
+	done->caller_offline = !cpu_online(raw_smp_processor_id());
 }
 
 /* signal completion unless @done is NULL */
 static void cpu_stop_signal_done(struct cpu_stop_done *done, bool executed)
 {
 	if (done) {
+		bool caller_offline;
+
+		/*
+		 * Get the offline status before we do the atomic_dec_and_test
+		 * below. This will avoid unsafe references to the 'done'
+		 * memory that is present on the stop_cpu caller's stack.
+		 */
+		caller_offline = done->caller_offline;
 		if (executed)
 			done->executed = true;
-		if (atomic_dec_and_test(&done->nr_todo))
+		if (atomic_dec_and_test(&done->nr_todo) && !caller_offline)
 			complete(&done->completion);
 	}
 }
@@ -136,6 +146,56 @@ void stop_one_cpu_nowait(unsigned int cp
 static DEFINE_MUTEX(stop_cpus_mutex);
 static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);
 
+/* stop all the online cpu's aswell as the calling cpu that is offline */
+static int __stop_all_cpus(cpu_stop_fn_t fn, void *arg)
+{
+	struct cpu_stop_work *work;
+	struct cpu_stop_done done;
+	unsigned int cpu;
+	int ret;
+
+	/*
+	 * This cpu is not yet online, so loop till we get the mutex.
+	 */
+	while (!mutex_trylock(&stop_cpus_mutex))
+		cpu_relax();
+
+	/* initialize works and done */
+	for_each_cpu(cpu, cpu_online_mask) {
+		work = &per_cpu(stop_cpus_work, cpu);
+		work->fn = fn;
+		work->arg = arg;
+		work->done = &done;
+	}
+
+	cpu_stop_init_done(&done, num_online_cpus());
+
+	/*
+	 * Queue the work on all the online cpu's first.
+	 */
+	for_each_cpu(cpu, cpu_online_mask)
+		cpu_stop_queue_work(&per_cpu(cpu_stopper, cpu),
+				    &per_cpu(stop_cpus_work, cpu));
+
+	/*
+	 * This cpu is not yet online. @fn needs to be run on this
+	 * cpu, run it now.
+	 */
+	ret = fn(arg);
+	if (ret)
+		done.ret = ret;
+
+	/*
+	 * This cpu is not yet online. Wait synchronously for others to
+	 * complete the stop cpu work.
+	 */
+	while (atomic_read(&done.nr_todo))
+		cpu_relax();
+
+	mutex_unlock(&stop_cpus_mutex);
+	return done.ret;
+}
+
 int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
 {
 	struct cpu_stop_work *work;
@@ -431,6 +491,7 @@ static int stop_machine_cpu_stop(void *d
 	struct stop_machine_data *smdata = data;
 	enum stopmachine_state curstate = STOPMACHINE_NONE;
 	int cpu = smp_processor_id(), err = 0;
+	unsigned long flags = 0;
 	bool is_active;
 
 	if (!smdata->active_cpus)
@@ -446,7 +507,13 @@ static int stop_machine_cpu_stop(void *d
 			curstate = smdata->state;
 			switch (curstate) {
 			case STOPMACHINE_DISABLE_IRQ:
-				local_irq_disable();
+				/*
+				 * In most of the cases this gets called from
+				 * the stop_cpu thread context. But this can
+				 * also be called directly from the cpu online
+				 * path where interrupts are disabled.
+				 */
+				local_irq_save(flags);
 				hard_irq_disable();
 				break;
 			case STOPMACHINE_RUN:
@@ -460,7 +527,7 @@ static int stop_machine_cpu_stop(void *d
 		}
 	} while (curstate != STOPMACHINE_EXIT);
 
-	local_irq_enable();
+	local_irq_restore(flags);
 	return err;
 }
 
@@ -470,9 +537,23 @@ int __stop_machine(int (*fn)(void *), vo
 					    .num_threads = num_online_cpus(),
 					    .active_cpus = cpus };
 
-	/* Set the initial state and stop all online cpus. */
+	/*
+	 * Include the calling cpu that might not be online yet. We are
+	 * checking only for the online status here, so no need to worry about
+	 * preemption status (can't be preempted to a cpu that is not
+	 * yet online). So raw_smp_processor_id() is safe here.
+	 */
+	if (!cpu_online(raw_smp_processor_id()))
+		smdata.num_threads++;
+
+	/* Set the initial state and stop all cpus. */
 	set_state(&smdata, STOPMACHINE_PREPARE);
-	return stop_cpus(cpu_online_mask, stop_machine_cpu_stop, &smdata);
+
+	if (!cpu_online(raw_smp_processor_id()))
+		return __stop_all_cpus(stop_machine_cpu_stop, &smdata);
+	else
+		return stop_cpus(cpu_online_mask, stop_machine_cpu_stop,
+				 &smdata);
 }
 
 int stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
Index: linux-2.6-tip/include/linux/stop_machine.h
===================================================================
--- linux-2.6-tip.orig/include/linux/stop_machine.h
+++ linux-2.6-tip/include/linux/stop_machine.h
@@ -104,7 +104,7 @@ static inline int try_stop_cpus(const st
  * @data: the data ptr for the @fn()
  * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
  *
- * Description: This causes a thread to be scheduled on every cpu,
+ * Description: This causes a thread to be scheduled on every online cpu,
  * each of which disables interrupts.  The result is that no one is
  * holding a spinlock or inside any other preempt-disabled region when
  * @fn() runs.
@@ -120,7 +120,9 @@ int stop_machine(int (*fn)(void *), void
  * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
  *
  * Description: This is a special version of the above, which assumes cpus
- * won't come or go while it's being called.  Used by hotplug cpu.
+ * won't come or go while it's being called.  Used by hotplug cpu. This can
+ * be also called from the cpu online path, where the calling cpu is not
+ * yet online.
  */
 int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus);
 
@@ -129,10 +131,11 @@ int __stop_machine(int (*fn)(void *), vo
 static inline int __stop_machine(int (*fn)(void *), void *data,
 				 const struct cpumask *cpus)
 {
+	unsigned long flags;
 	int ret;
-	local_irq_disable();
+	local_irq_save(flags);
 	ret = fn(data);
-	local_irq_enable();
+	local_irq_restore(flags);
 	return ret;
 }
 



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch v4 2/2] x86, mtrr: use __stop_machine() for doing MTRR rendezvous
  2011-06-13 17:58 [patch v4 0/2] enhance stop machine infrastructure for MTRR rendezvous sequence Suresh Siddha
  2011-06-13 17:58 ` [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path Suresh Siddha
@ 2011-06-13 17:58 ` Suresh Siddha
  1 sibling, 0 replies; 10+ messages in thread
From: Suresh Siddha @ 2011-06-13 17:58 UTC (permalink / raw)
  To: mingo, tglx, hpa, trenn, prarit, tj, rusty
  Cc: linux-kernel, suresh.b.siddha, youquan.song, stable

[-- Attachment #1: use_stop_machine_for_mtrr_rendezvous.patch --]
[-- Type: text/plain, Size: 7633 bytes --]

MTRR rendezvous seqeuence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemened using stop_machine() before, as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).

Now that __stop_machine() works even when the calling cpu is not online,
use __stop_machine() to implement the MTRR rendezvous sequence. This
will consolidate the code aswell as avoid the above mentioned deadlock.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org    # v2.6.35+
---
 arch/x86/kernel/cpu/mtrr/main.c |  154 +++++++---------------------------------
 1 file changed, 27 insertions(+), 127 deletions(-)

Index: linux-2.6-tip/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-2.6-tip.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-2.6-tip/arch/x86/kernel/cpu/mtrr/main.c
@@ -137,18 +137,15 @@ static void __init init_table(void)
 }
 
 struct set_mtrr_data {
-	atomic_t	count;
-	atomic_t	gate;
 	unsigned long	smp_base;
 	unsigned long	smp_size;
 	unsigned int	smp_reg;
 	mtrr_type	smp_type;
 };
 
-static DEFINE_PER_CPU(struct cpu_stop_work, mtrr_work);
-
 /**
- * mtrr_work_handler - Synchronisation handler. Executed by "other" CPUs.
+ * mtrr_work_handler - Work done in the synchronisation handler. Executed by
+ * all the CPUs.
  * @info: pointer to mtrr configuration data
  *
  * Returns nothing.
@@ -157,35 +154,26 @@ static int mtrr_work_handler(void *info)
 {
 #ifdef CONFIG_SMP
 	struct set_mtrr_data *data = info;
-	unsigned long flags;
-
-	atomic_dec(&data->count);
-	while (!atomic_read(&data->gate))
-		cpu_relax();
-
-	local_irq_save(flags);
 
-	atomic_dec(&data->count);
-	while (atomic_read(&data->gate))
-		cpu_relax();
-
-	/*  The master has cleared me to execute  */
+	/*
+	 * We use this same function to initialize the mtrrs during boot,
+	 * resume, runtime cpu online and on an explicit request to set a
+	 * specific MTRR.
+	 *
+	 * During boot or suspend, the state of the boot cpu's mtrrs has been
+	 * saved, and we want to replicate that across all the cpus that come
+	 * online (either at the end of boot or resume or during a runtime cpu
+	 * online). If we're doing that, @reg is set to something special and on
+	 * all the cpu's we do mtrr_if->set_all() (On the logical cpu that
+	 * started the boot/resume sequence, this might be a duplicate
+	 * set_all()).
+	 */
 	if (data->smp_reg != ~0U) {
 		mtrr_if->set(data->smp_reg, data->smp_base,
 			     data->smp_size, data->smp_type);
-	} else if (mtrr_aps_delayed_init) {
-		/*
-		 * Initialize the MTRRs inaddition to the synchronisation.
-		 */
+	} else if (mtrr_aps_delayed_init || !cpu_online(smp_processor_id())) {
 		mtrr_if->set_all();
 	}
-
-	atomic_dec(&data->count);
-	while (!atomic_read(&data->gate))
-		cpu_relax();
-
-	atomic_dec(&data->count);
-	local_irq_restore(flags);
 #endif
 	return 0;
 }
@@ -223,20 +211,11 @@ static inline int types_compatible(mtrr_
  * 14. Wait for buddies to catch up
  * 15. Enable interrupts.
  *
- * What does that mean for us? Well, first we set data.count to the number
- * of CPUs. As each CPU announces that it started the rendezvous handler by
- * decrementing the count, We reset data.count and set the data.gate flag
- * allowing all the cpu's to proceed with the work. As each cpu disables
- * interrupts, it'll decrement data.count once. We wait until it hits 0 and
- * proceed. We clear the data.gate flag and reset data.count. Meanwhile, they
- * are waiting for that flag to be cleared. Once it's cleared, each
- * CPU goes through the transition of updating MTRRs.
- * The CPU vendors may each do it differently,
- * so we call mtrr_if->set() callback and let them take care of it.
- * When they're done, they again decrement data->count and wait for data.gate
- * to be set.
- * When we finish, we wait for data.count to hit 0 and toggle the data.gate flag
- * Everyone then enables interrupts and we all continue on.
+ * What does that mean for us? Well, __stop_machine() will ensure that
+ * the rendezvous handler is started on each CPU. And in lockstep they
+ * do the state transition of disabling interrupts, updating MTRR's
+ * (the CPU vendors may each do it differently, so we call mtrr_if->set()
+ * callback and let them take care of it.) and enabling interrupts.
  *
  * Note that the mechanism is the same for UP systems, too; all the SMP stuff
  * becomes nops.
@@ -244,92 +223,13 @@ static inline int types_compatible(mtrr_
 static void
 set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type type)
 {
-	struct set_mtrr_data data;
-	unsigned long flags;
-	int cpu;
-
-	preempt_disable();
-
-	data.smp_reg = reg;
-	data.smp_base = base;
-	data.smp_size = size;
-	data.smp_type = type;
-	atomic_set(&data.count, num_booting_cpus() - 1);
-
-	/* Make sure data.count is visible before unleashing other CPUs */
-	smp_wmb();
-	atomic_set(&data.gate, 0);
-
-	/* Start the ball rolling on other CPUs */
-	for_each_online_cpu(cpu) {
-		struct cpu_stop_work *work = &per_cpu(mtrr_work, cpu);
-
-		if (cpu == smp_processor_id())
-			continue;
-
-		stop_one_cpu_nowait(cpu, mtrr_work_handler, &data, work);
-	}
-
-
-	while (atomic_read(&data.count))
-		cpu_relax();
-
-	/* Ok, reset count and toggle gate */
-	atomic_set(&data.count, num_booting_cpus() - 1);
-	smp_wmb();
-	atomic_set(&data.gate, 1);
-
-	local_irq_save(flags);
-
-	while (atomic_read(&data.count))
-		cpu_relax();
-
-	/* Ok, reset count and toggle gate */
-	atomic_set(&data.count, num_booting_cpus() - 1);
-	smp_wmb();
-	atomic_set(&data.gate, 0);
-
-	/* Do our MTRR business */
-
-	/*
-	 * HACK!
-	 *
-	 * We use this same function to initialize the mtrrs during boot,
-	 * resume, runtime cpu online and on an explicit request to set a
-	 * specific MTRR.
-	 *
-	 * During boot or suspend, the state of the boot cpu's mtrrs has been
-	 * saved, and we want to replicate that across all the cpus that come
-	 * online (either at the end of boot or resume or during a runtime cpu
-	 * online). If we're doing that, @reg is set to something special and on
-	 * this cpu we still do mtrr_if->set_all(). During boot/resume, this
-	 * is unnecessary if at this point we are still on the cpu that started
-	 * the boot/resume sequence. But there is no guarantee that we are still
-	 * on the same cpu. So we do mtrr_if->set_all() on this cpu aswell to be
-	 * sure that we are in sync with everyone else.
-	 */
-	if (reg != ~0U)
-		mtrr_if->set(reg, base, size, type);
-	else
-		mtrr_if->set_all();
-
-	/* Wait for the others */
-	while (atomic_read(&data.count))
-		cpu_relax();
-
-	atomic_set(&data.count, num_booting_cpus() - 1);
-	smp_wmb();
-	atomic_set(&data.gate, 1);
-
-	/*
-	 * Wait here for everyone to have seen the gate change
-	 * So we're the last ones to touch 'data'
-	 */
-	while (atomic_read(&data.count))
-		cpu_relax();
+	struct set_mtrr_data data = { .smp_reg = reg,
+				      .smp_base = base,
+				      .smp_size = size,
+				      .smp_type = type
+				    };
 
-	local_irq_restore(flags);
-	preempt_enable();
+	__stop_machine(mtrr_work_handler, &data, cpu_callout_mask);
 }
 
 /**



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 17:58 ` [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path Suresh Siddha
@ 2011-06-13 19:56   ` Ingo Molnar
  2011-06-13 20:49     ` Suresh Siddha
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2011-06-13 19:56 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, hpa, trenn, prarit, tj, rusty, linux-kernel, youquan.song,
	stable


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> Currently stop machine infrastructure can be called only from a cpu that is
> online. But for !CONFIG_SMP, we do allow calling __stop_machine() before the
> cpu is online.
> 
> x86 for example requires stop machine infrastructure in the cpu online path
> and currently implements its own stop machine (using stop_one_cpu_nowait())
> for MTRR initialization in the cpu online path.
> 
> Enhance the __stop_machine() so that it can be called before the cpu
> is onlined. This will pave the way for code consolidation and address potential
> deadlocks caused by multiple mechanisms of doing system wide rendezvous.
> 
> This will also address the behavioral differences of __stop_machine()
> between SMP and UP builds.
> 
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: stable@kernel.org    # v2.6.35+
> ---
>  include/linux/stop_machine.h |   11 +++--
>  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 93 insertions(+), 9 deletions(-)

Btw., this is *way* too risky for a -stable backport.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 19:56   ` Ingo Molnar
@ 2011-06-13 20:49     ` Suresh Siddha
  2011-06-13 21:05       ` Suresh Siddha
                         ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Suresh Siddha @ 2011-06-13 20:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: tglx@linutronix.de, hpa@zytor.com, trenn@novell.com,
	prarit@redhat.com, tj@kernel.org, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org

On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> 
> >  include/linux/stop_machine.h |   11 +++--
> >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> >  2 files changed, 93 insertions(+), 9 deletions(-)
> 
> Btw., this is *way* too risky for a -stable backport.
> 

Ingo, we can have a smaller patch (appended) for the -stable. How do you
want to go ahead? Take this small patch for both mainline and -stable
and the two code cleanup/consolidation patches for -tip (to go into
3.1?). Thanks.

---
From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86, mtrr: lock stop machine during MTRR rendezvous sequence

MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_cpus mutex in the MTRR rendezvous sequence that
prevents the above mentioned deadlock. If this gets called in the cpu
online path, then we loop using try_lock. Otherwise we wait till the
mutex is available.

    TBD: Extend the stop_machine() infrastructure to handle the case where
         the calling cpu is still not online and use this for MTRR rendezvous
         sequence (rather than implementing own stop machine using
         stop_one_cpu_nowait()). This will consolidate the code.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org    # v2.6.35 - v2.6.39
---
 arch/x86/kernel/cpu/mtrr/main.c |   15 ++++++++++++++-
 include/linux/stop_machine.h    |   17 +++++++++++++++++
 kernel/stop_machine.c           |   15 +++++++++++++++
 3 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 929739a..1e99048 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -244,9 +244,21 @@ static inline int types_compatible(mtrr_type type1, mtrr_type type2)
 static void
 set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type type)
 {
+	int cpu = raw_smp_processor_id();
+	int online = cpu_online(cpu);
 	struct set_mtrr_data data;
 	unsigned long flags;
-	int cpu;
+
+	/*
+	 * If we are not yet online, then loop using the try lock, as we
+	 * can't sleep.
+	 */
+	if (online) {
+		lock_stop_cpus();
+	} else {
+		while (!try_lock_stop_cpus())
+			cpu_relax();
+	}
 
 	preempt_disable();
 
@@ -330,6 +342,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ
 
 	local_irq_restore(flags);
 	preempt_enable();
+	unlock_stop_cpus();
 }
 
 /**
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 092dc9b..8a28d4c 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
 int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
 int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
 
+void lock_stop_cpus(void);
+int try_lock_stop_cpus(void);
+void unlock_stop_cpus(void);
+
 #else	/* CONFIG_SMP */
 
 #include <linux/workqueue.h>
@@ -88,6 +92,19 @@ static inline int try_stop_cpus(const struct cpumask *cpumask,
 	return stop_cpus(cpumask, fn, arg);
 }
 
+static inline void lock_stop_cpus(void)
+{
+}
+
+static inline try_lock_stop_cpus(void)
+{
+	return 1;
+}
+
+static inline void unlock_stop_cpus(void)
+{
+}
+
 #endif	/* CONFIG_SMP */
 
 /*
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e3516b2..d239a48 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -136,6 +136,21 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
 static DEFINE_MUTEX(stop_cpus_mutex);
 static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);
 
+void lock_stop_cpus(void)
+{
+	mutex_lock(&stop_cpus_mutex);
+}
+
+int try_lock_stop_cpus(void)
+{
+	return mutex_trylock(&stop_cpus_mutex);
+}
+
+void unlock_stop_cpus(void)
+{
+	mutex_unlock(&stop_cpus_mutex);
+}
+
 int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
 {
 	struct cpu_stop_work *work;



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 20:49     ` Suresh Siddha
@ 2011-06-13 21:05       ` Suresh Siddha
  2011-06-14  8:09         ` Ingo Molnar
  2011-06-14  8:08       ` Ingo Molnar
  2011-06-14 10:10       ` tj
  2 siblings, 1 reply; 10+ messages in thread
From: Suresh Siddha @ 2011-06-13 21:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: tglx@linutronix.de, hpa@zytor.com, trenn@novell.com,
	prarit@redhat.com, tj@kernel.org, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org

On Mon, 2011-06-13 at 13:49 -0700, Suresh Siddha wrote:
> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > 
> > >  include/linux/stop_machine.h |   11 +++--
> > >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> > >  2 files changed, 93 insertions(+), 9 deletions(-)
> > 
> > Btw., this is *way* too risky for a -stable backport.
> > 
> 
> Ingo, we can have a smaller patch (appended) for the -stable. How do you
> want to go ahead? Take this small patch for both mainline and -stable
> and the two code cleanup/consolidation patches for -tip (to go into
> 3.1?). Thanks.

oops. sent it too soon. fixed a !CONFIG_SMP compilation issue. Updated
simple patch appended.
---

From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86, mtrr: lock stop machine during MTRR rendezvous sequence

MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_cpus mutex in the MTRR rendezvous sequence that
prevents the above mentioned deadlock. If this gets called in the cpu
online path, then we loop using try_lock. Otherwise we wait till the
mutex is available.

    TBD: Extend the stop_machine() infrastructure to handle the case where
         the calling cpu is still not online and use this for MTRR rendezvous
         sequence (rather than implementing own stop machine using
         stop_one_cpu_nowait()). This will consolidate the code.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org    # v2.6.35+
---
 arch/x86/kernel/cpu/mtrr/main.c |   14 +++++++++++++-
 include/linux/stop_machine.h    |   17 +++++++++++++++++
 kernel/stop_machine.c           |   15 +++++++++++++++
 3 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 929739a..3d15798 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -244,9 +244,20 @@ static inline int types_compatible(mtrr_type type1, mtrr_type type2)
 static void
 set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type type)
 {
+	int cpu = raw_smp_processor_id();
 	struct set_mtrr_data data;
 	unsigned long flags;
-	int cpu;
+
+	/*
+	 * If we are not yet online, then loop using the try lock, as we
+	 * can't sleep.
+	 */
+	if (cpu_online(cpu)) {
+		lock_stop_cpus();
+	} else {
+		while (!try_lock_stop_cpus())
+			cpu_relax();
+	}
 
 	preempt_disable();
 
@@ -330,6 +341,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ
 
 	local_irq_restore(flags);
 	preempt_enable();
+	unlock_stop_cpus();
 }
 
 /**
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 092dc9b..7731e57 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
 int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
 int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
 
+void lock_stop_cpus(void);
+int try_lock_stop_cpus(void);
+void unlock_stop_cpus(void);
+
 #else	/* CONFIG_SMP */
 
 #include <linux/workqueue.h>
@@ -88,6 +92,19 @@ static inline int try_stop_cpus(const struct cpumask *cpumask,
 	return stop_cpus(cpumask, fn, arg);
 }
 
+static inline void lock_stop_cpus(void)
+{
+}
+
+static inline int try_lock_stop_cpus(void)
+{
+	return 1;
+}
+
+static inline void unlock_stop_cpus(void)
+{
+}
+
 #endif	/* CONFIG_SMP */
 
 /*
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e3516b2..d239a48 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -136,6 +136,21 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
 static DEFINE_MUTEX(stop_cpus_mutex);
 static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);
 
+void lock_stop_cpus(void)
+{
+	mutex_lock(&stop_cpus_mutex);
+}
+
+int try_lock_stop_cpus(void)
+{
+	return mutex_trylock(&stop_cpus_mutex);
+}
+
+void unlock_stop_cpus(void)
+{
+	mutex_unlock(&stop_cpus_mutex);
+}
+
 int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
 {
 	struct cpu_stop_work *work;



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 20:49     ` Suresh Siddha
  2011-06-13 21:05       ` Suresh Siddha
@ 2011-06-14  8:08       ` Ingo Molnar
  2011-06-14 10:10       ` tj
  2 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2011-06-14  8:08 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx@linutronix.de, hpa@zytor.com, trenn@novell.com,
	prarit@redhat.com, tj@kernel.org, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > 
> > >  include/linux/stop_machine.h |   11 +++--
> > >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> > >  2 files changed, 93 insertions(+), 9 deletions(-)
> > 
> > Btw., this is *way* too risky for a -stable backport.
> > 
> 
> Ingo, we can have a smaller patch (appended) for the -stable. How 
> do you want to go ahead? Take this small patch for both mainline 
> and -stable and the two code cleanup/consolidation patches for -tip 
> (to go into 3.1?). Thanks.

this:

>  arch/x86/kernel/cpu/mtrr/main.c |   15 ++++++++++++++-
>  include/linux/stop_machine.h    |   17 +++++++++++++++++
>  kernel/stop_machine.c           |   15 +++++++++++++++
>  3 files changed, 46 insertions(+), 1 deletions(-)

looks pretty risky as well, this is core kernel code that is 
relatively rarely used and if it breaks it causes various high impact 
regressions.

Once Tejun is fine with the code we can do the larger patch upstream 
but not mark it for -stable backport. Once it's been upstream for a 
couple of weeks, once we are sure it does not regress, can we perhaps 
forward it to -stable ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 21:05       ` Suresh Siddha
@ 2011-06-14  8:09         ` Ingo Molnar
  0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2011-06-14  8:09 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx@linutronix.de, hpa@zytor.com, trenn@novell.com,
	prarit@redhat.com, tj@kernel.org, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Mon, 2011-06-13 at 13:49 -0700, Suresh Siddha wrote:
> > On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > > * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > > 
> > > >  include/linux/stop_machine.h |   11 +++--
> > > >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> > > >  2 files changed, 93 insertions(+), 9 deletions(-)
> > > 
> > > Btw., this is *way* too risky for a -stable backport.
> > > 
> > 
> > Ingo, we can have a smaller patch (appended) for the -stable. How do you
> > want to go ahead? Take this small patch for both mainline and -stable
> > and the two code cleanup/consolidation patches for -tip (to go into
> > 3.1?). Thanks.
> 
> oops. sent it too soon. fixed a !CONFIG_SMP compilation issue. 
> Updated simple patch appended.

See? It's by definition not a "simple" patch.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-13 20:49     ` Suresh Siddha
  2011-06-13 21:05       ` Suresh Siddha
  2011-06-14  8:08       ` Ingo Molnar
@ 2011-06-14 10:10       ` tj
  2011-06-14 21:17         ` Ingo Molnar
  2 siblings, 1 reply; 10+ messages in thread
From: tj @ 2011-06-14 10:10 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Ingo Molnar, tglx@linutronix.de, hpa@zytor.com, trenn@novell.com,
	prarit@redhat.com, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org

Hello, Ingo, Suresh.

On Mon, Jun 13, 2011 at 01:49:18PM -0700, Suresh Siddha wrote:
> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > 
> > >  include/linux/stop_machine.h |   11 +++--
> > >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> > >  2 files changed, 93 insertions(+), 9 deletions(-)
> > 
> > Btw., this is *way* too risky for a -stable backport.
> > 
> 
> Ingo, we can have a smaller patch (appended) for the -stable. How do you
> want to go ahead? Take this small patch for both mainline and -stable
> and the two code cleanup/consolidation patches for -tip (to go into
> 3.1?). Thanks.

So, here's what I think we should do.

* Polish up this simpler patch and send it for 3.0 through -tip.  It's
  slightly scary but not too much and fixes a real bug.  After a
  while, we can ask -stable to pull the simple version.

* Work on proper update which drops custom implementation from mtrr
  code for 3.1 window.  BTW, even after the recent revisions, I think
  the stop machine change is a bit too hacky.  I'll reply to that
  separately.

> diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
> index 092dc9b..8a28d4c 100644
> --- a/include/linux/stop_machine.h
> +++ b/include/linux/stop_machine.h
> @@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
>  int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
>  int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
>  
> +void lock_stop_cpus(void);
> +int try_lock_stop_cpus(void);
> +void unlock_stop_cpus(void);

Ugh... Can you please just export stop_cpus_mutex and have CONFIG_SMP
in mtrr code.  After all, it's a temporary workaround for mtrr.  No
reason to add three functions for that.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path
  2011-06-14 10:10       ` tj
@ 2011-06-14 21:17         ` Ingo Molnar
  0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2011-06-14 21:17 UTC (permalink / raw)
  To: tj@kernel.org
  Cc: Suresh Siddha, tglx@linutronix.de, hpa@zytor.com,
	trenn@novell.com, prarit@redhat.com, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, Song, Youquan, stable@kernel.org


* tj@kernel.org <tj@kernel.org> wrote:

> Hello, Ingo, Suresh.
> 
> On Mon, Jun 13, 2011 at 01:49:18PM -0700, Suresh Siddha wrote:
> > On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > > * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > > 
> > > >  include/linux/stop_machine.h |   11 +++--
> > > >  kernel/stop_machine.c        |   91 ++++++++++++++++++++++++++++++++++++++++---
> > > >  2 files changed, 93 insertions(+), 9 deletions(-)
> > > 
> > > Btw., this is *way* too risky for a -stable backport.
> > > 
> > 
> > Ingo, we can have a smaller patch (appended) for the -stable. How do you
> > want to go ahead? Take this small patch for both mainline and -stable
> > and the two code cleanup/consolidation patches for -tip (to go into
> > 3.1?). Thanks.
> 
> So, here's what I think we should do.
> 
> * Polish up this simpler patch and send it for 3.0 through -tip.  It's
>   slightly scary but not too much and fixes a real bug.  After a
>   while, we can ask -stable to pull the simple version.

Looks good to me.

> * Work on proper update which drops custom implementation from mtrr
>   code for 3.1 window.  BTW, even after the recent revisions, I think
>   the stop machine change is a bit too hacky.  I'll reply to that
>   separately.

ok.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-06-14 21:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-13 17:58 [patch v4 0/2] enhance stop machine infrastructure for MTRR rendezvous sequence Suresh Siddha
2011-06-13 17:58 ` [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path Suresh Siddha
2011-06-13 19:56   ` Ingo Molnar
2011-06-13 20:49     ` Suresh Siddha
2011-06-13 21:05       ` Suresh Siddha
2011-06-14  8:09         ` Ingo Molnar
2011-06-14  8:08       ` Ingo Molnar
2011-06-14 10:10       ` tj
2011-06-14 21:17         ` Ingo Molnar
2011-06-13 17:58 ` [patch v4 2/2] x86, mtrr: use __stop_machine() for doing MTRR rendezvous Suresh Siddha

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox