Linux-ARM-Kernel Archive on lore.kernel.org


* [PATCH v5 13/45] sched/rt: Use get/put_online_cpus_atomic() to prevent CPU offline
From: Srivatsa S. Bhat @ 2013-01-22  7:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() or local_irq_disable() to prevent CPUs from
going offline from under us.

Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline
when invoked from atomic context.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/sched/rt.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 418feb0..2a637be 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -6,6 +6,7 @@
 #include "sched.h"
 
 #include <linux/slab.h>
+#include <linux/cpu.h>
 
 static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
 
@@ -26,7 +27,9 @@ static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
 		if (!overrun)
 			break;
 
+		get_online_cpus_atomic();
 		idle = do_sched_rt_period_timer(rt_b, overrun);
+		put_online_cpus_atomic();
 	}
 
 	return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;


* [PATCH v5 12/45] sched/migration: Use raw_spin_lock/unlock since interrupts are already disabled
From: Srivatsa S. Bhat @ 2013-01-22  7:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

We need not use the raw_spin_lock_irqsave/restore primitives because
all CPU_DYING notifiers run with interrupts disabled. So just use
raw_spin_lock/unlock.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/sched/core.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c1596ac..c2cec88 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4869,9 +4869,7 @@ static void calc_load_migrate(struct rq *rq)
  * Migrate all tasks from the rq, sleeping tasks will be migrated by
  * try_to_wake_up()->select_task_rq().
  *
- * Called with rq->lock held even though we'er in stop_machine() and
- * there's no concurrency possible, we hold the required locks anyway
- * because of lock validation efforts.
+ * Called with rq->lock held.
  */
 static void migrate_tasks(unsigned int dead_cpu)
 {
@@ -4883,8 +4881,8 @@ static void migrate_tasks(unsigned int dead_cpu)
 	 * Fudge the rq selection such that the below task selection loop
 	 * doesn't get stuck on the currently eligible stop task.
 	 *
-	 * We're currently inside stop_machine() and the rq is either stuck
-	 * in the stop_machine_cpu_stop() loop, or we're executing this code,
+	 * We're currently inside stop_one_cpu() and the rq is either stuck
+	 * in the cpu_stopper_thread(), or we're executing this code,
 	 * either way we should never end up calling schedule() until we're
 	 * done here.
 	 */
@@ -5153,14 +5151,14 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 	case CPU_DYING:
 		sched_ttwu_pending();
 		/* Update our root-domain */
-		raw_spin_lock_irqsave(&rq->lock, flags);
+		raw_spin_lock(&rq->lock); /* Interrupts already disabled */
 		if (rq->rd) {
 			BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
 			set_rq_offline(rq);
 		}
 		migrate_tasks(cpu);
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
-		raw_spin_unlock_irqrestore(&rq->lock, flags);
+		raw_spin_unlock(&rq->lock);
 		break;
 
 	case CPU_DEAD:


* [PATCH v5 11/45] sched/timer: Use get/put_online_cpus_atomic() to prevent CPU offline
From: Srivatsa S. Bhat @ 2013-01-22  7:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() or local_irq_disable() to prevent CPUs from going
offline from under us.

Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline
when invoked from atomic context.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/sched/core.c |   24 +++++++++++++++++++++---
 kernel/sched/fair.c |    5 ++++-
 kernel/timer.c      |    2 ++
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 257002c..c1596ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1117,11 +1117,11 @@ void kick_process(struct task_struct *p)
 {
 	int cpu;
 
-	preempt_disable();
+	get_online_cpus_atomic();
 	cpu = task_cpu(p);
 	if ((cpu != smp_processor_id()) && task_curr(p))
 		smp_send_reschedule(cpu);
-	preempt_enable();
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL_GPL(kick_process);
 #endif /* CONFIG_SMP */
@@ -1129,6 +1129,10 @@ EXPORT_SYMBOL_GPL(kick_process);
 #ifdef CONFIG_SMP
 /*
  * ->cpus_allowed is protected by both rq->lock and p->pi_lock
+ *
+ *  Must be called under get/put_online_cpus_atomic() or
+ *  equivalent, to avoid CPUs from going offline from underneath
+ *  us.
  */
 static int select_fallback_rq(int cpu, struct task_struct *p)
 {
@@ -1192,6 +1196,9 @@ out:
 
 /*
  * The caller (fork, wakeup) owns p->pi_lock, ->cpus_allowed is stable.
+ *
+ * Must be called under get/put_online_cpus_atomic(), to prevent
+ * CPUs from going offline from underneath us.
  */
 static inline
 int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
@@ -1432,6 +1439,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	int cpu, success = 0;
 
 	smp_wmb();
+	get_online_cpus_atomic();
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
 	if (!(p->state & state))
 		goto out;
@@ -1472,6 +1480,7 @@ stat:
 	ttwu_stat(p, cpu, wake_flags);
 out:
 	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+	put_online_cpus_atomic();
 
 	return success;
 }
@@ -1692,6 +1701,7 @@ void wake_up_new_task(struct task_struct *p)
 	unsigned long flags;
 	struct rq *rq;
 
+	get_online_cpus_atomic();
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
 #ifdef CONFIG_SMP
 	/*
@@ -1712,6 +1722,7 @@ void wake_up_new_task(struct task_struct *p)
 		p->sched_class->task_woken(rq, p);
 #endif
 	task_rq_unlock(rq, p, &flags);
+	put_online_cpus_atomic();
 }
 
 #ifdef CONFIG_PREEMPT_NOTIFIERS
@@ -2609,6 +2620,7 @@ void sched_exec(void)
 	unsigned long flags;
 	int dest_cpu;
 
+	get_online_cpus_atomic();
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
 	dest_cpu = p->sched_class->select_task_rq(p, SD_BALANCE_EXEC, 0);
 	if (dest_cpu == smp_processor_id())
@@ -2618,11 +2630,13 @@ void sched_exec(void)
 		struct migration_arg arg = { p, dest_cpu };
 
 		raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+		put_online_cpus_atomic();
 		stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
 		return;
 	}
 unlock:
 	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+	put_online_cpus_atomic();
 }
 
 #endif
@@ -4372,6 +4386,7 @@ bool __sched yield_to(struct task_struct *p, bool preempt)
 	unsigned long flags;
 	bool yielded = 0;
 
+	get_online_cpus_atomic();
 	local_irq_save(flags);
 	rq = this_rq();
 
@@ -4399,13 +4414,14 @@ again:
 		 * Make p's CPU reschedule; pick_next_entity takes care of
 		 * fairness.
 		 */
-		if (preempt && rq != p_rq)
+		if (preempt && rq != p_rq && cpu_online(task_cpu(p)))
 			resched_task(p_rq->curr);
 	}
 
 out:
 	double_rq_unlock(rq, p_rq);
 	local_irq_restore(flags);
+	put_online_cpus_atomic();
 
 	if (yielded)
 		schedule();
@@ -4810,9 +4826,11 @@ static int migration_cpu_stop(void *data)
 	 * The original target cpu might have gone down and we might
 	 * be on another cpu but it doesn't matter.
 	 */
+	get_online_cpus_atomic();
 	local_irq_disable();
 	__migrate_task(arg->task, raw_smp_processor_id(), arg->dest_cpu);
 	local_irq_enable();
+	put_online_cpus_atomic();
 	return 0;
 }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5eea870..a846028 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5695,8 +5695,11 @@ void trigger_load_balance(struct rq *rq, int cpu)
 	    likely(!on_null_domain(cpu)))
 		raise_softirq(SCHED_SOFTIRQ);
 #ifdef CONFIG_NO_HZ
-	if (nohz_kick_needed(rq, cpu) && likely(!on_null_domain(cpu)))
+	if (nohz_kick_needed(rq, cpu) && likely(!on_null_domain(cpu))) {
+		get_online_cpus_atomic();
 		nohz_balancer_kick(cpu);
+		put_online_cpus_atomic();
+	}
 #endif
 }
 
diff --git a/kernel/timer.c b/kernel/timer.c
index 367d008..b1820e3 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -924,6 +924,7 @@ void add_timer_on(struct timer_list *timer, int cpu)
 
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(timer_pending(timer) || !timer->function);
+	get_online_cpus_atomic();
 	spin_lock_irqsave(&base->lock, flags);
 	timer_set_base(timer, base);
 	debug_activate(timer, timer->expires);
@@ -938,6 +939,7 @@ void add_timer_on(struct timer_list *timer, int cpu)
 	 */
 	wake_up_idle_cpu(cpu);
 	spin_unlock_irqrestore(&base->lock, flags);
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL_GPL(add_timer_on);
 


* [PATCH v5 10/45] smp, cpu hotplug: Fix on_each_cpu_*() to prevent CPU offline properly
From: Srivatsa S. Bhat @ 2013-01-22  7:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.

Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline
when invoked from atomic context.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index f421bcc..d870bfe 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -688,12 +688,12 @@ int on_each_cpu(void (*func) (void *info), void *info, int wait)
 	unsigned long flags;
 	int ret = 0;
 
-	preempt_disable();
+	get_online_cpus_atomic();
 	ret = smp_call_function(func, info, wait);
 	local_irq_save(flags);
 	func(info);
 	local_irq_restore(flags);
-	preempt_enable();
+	put_online_cpus_atomic();
 	return ret;
 }
 EXPORT_SYMBOL(on_each_cpu);
@@ -715,7 +715,11 @@ EXPORT_SYMBOL(on_each_cpu);
 void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
 			void *info, bool wait)
 {
-	int cpu = get_cpu();
+	int cpu;
+
+	get_online_cpus_atomic();
+
+	cpu = smp_processor_id();
 
 	smp_call_function_many(mask, func, info, wait);
 	if (cpumask_test_cpu(cpu, mask)) {
@@ -723,7 +727,7 @@ void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
 		func(info);
 		local_irq_enable();
 	}
-	put_cpu();
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL(on_each_cpu_mask);
 
@@ -748,8 +752,9 @@ EXPORT_SYMBOL(on_each_cpu_mask);
  * The function might sleep if the GFP flags indicates a non
  * atomic allocation is allowed.
  *
- * Preemption is disabled to protect against CPUs going offline but not online.
- * CPUs going online during the call will not be seen or sent an IPI.
+ * We use get/put_online_cpus_atomic() to prevent CPUs from going
+ * offline in-between our operation. CPUs coming online during the
+ * call will not be seen or sent an IPI.
  *
  * You must not call this function with disabled interrupts or
  * from a hardware interrupt handler or from a bottom half handler.
@@ -764,26 +769,26 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
 	might_sleep_if(gfp_flags & __GFP_WAIT);
 
 	if (likely(zalloc_cpumask_var(&cpus, (gfp_flags|__GFP_NOWARN)))) {
-		preempt_disable();
+		get_online_cpus_atomic();
 		for_each_online_cpu(cpu)
 			if (cond_func(cpu, info))
 				cpumask_set_cpu(cpu, cpus);
 		on_each_cpu_mask(cpus, func, info, wait);
-		preempt_enable();
+		put_online_cpus_atomic();
 		free_cpumask_var(cpus);
 	} else {
 		/*
 		 * No free cpumask, bother. No matter, we'll
 		 * just have to IPI them one by one.
 		 */
-		preempt_disable();
+		get_online_cpus_atomic();
 		for_each_online_cpu(cpu)
 			if (cond_func(cpu, info)) {
 				ret = smp_call_function_single(cpu, func,
 								info, wait);
 				WARN_ON_ONCE(!ret);
 			}
-		preempt_enable();
+		put_online_cpus_atomic();
 	}
 }
 EXPORT_SYMBOL(on_each_cpu_cond);


* [PATCH v5 09/45] smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly
From: Srivatsa S. Bhat @ 2013-01-22  7:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.

Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline
when invoked from atomic context.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 29dd40a..f421bcc 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -310,7 +310,8 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	 * prevent preemption and reschedule on another processor,
 	 * as well as CPU removal
 	 */
-	this_cpu = get_cpu();
+	get_online_cpus_atomic();
+	this_cpu = smp_processor_id();
 
 	/*
 	 * Can deadlock when called with interrupts disabled.
@@ -342,7 +343,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 		}
 	}
 
-	put_cpu();
+	put_online_cpus_atomic();
 
 	return err;
 }
@@ -371,8 +372,10 @@ int smp_call_function_any(const struct cpumask *mask,
 	const struct cpumask *nodemask;
 	int ret;
 
+	get_online_cpus_atomic();
 	/* Try for same CPU (cheapest) */
-	cpu = get_cpu();
+	cpu = smp_processor_id();
+
 	if (cpumask_test_cpu(cpu, mask))
 		goto call;
 
@@ -388,7 +391,7 @@ int smp_call_function_any(const struct cpumask *mask,
 	cpu = cpumask_any_and(mask, cpu_online_mask);
 call:
 	ret = smp_call_function_single(cpu, func, info, wait);
-	put_cpu();
+	put_online_cpus_atomic();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(smp_call_function_any);
@@ -409,25 +412,28 @@ void __smp_call_function_single(int cpu, struct call_single_data *data,
 	unsigned int this_cpu;
 	unsigned long flags;
 
-	this_cpu = get_cpu();
+	get_online_cpus_atomic();
+
+	this_cpu = smp_processor_id();
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
 	 * send smp call function interrupt to this cpu and as such deadlocks
 	 * can't happen.
 	 */
-	WARN_ON_ONCE(cpu_online(smp_processor_id()) && wait && irqs_disabled()
+	WARN_ON_ONCE(cpu_online(this_cpu) && wait && irqs_disabled()
 		     && !oops_in_progress);
 
 	if (cpu == this_cpu) {
 		local_irq_save(flags);
 		data->func(data->info);
 		local_irq_restore(flags);
-	} else {
+	} else if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) {
 		csd_lock(data);
 		generic_exec_single(cpu, data, wait);
 	}
-	put_cpu();
+	put_online_cpus_atomic();
 }
 
 /**
@@ -451,6 +457,8 @@ void smp_call_function_many(const struct cpumask *mask,
 	unsigned long flags;
 	int refs, cpu, next_cpu, this_cpu = smp_processor_id();
 
+	get_online_cpus_atomic();
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
@@ -467,17 +475,18 @@ void smp_call_function_many(const struct cpumask *mask,
 
 	/* No online cpus?  We're done. */
 	if (cpu >= nr_cpu_ids)
-		return;
+		goto out_unlock;
 
 	/* Do we have another CPU which isn't us? */
 	next_cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
 	if (next_cpu == this_cpu)
-		next_cpu = cpumask_next_and(next_cpu, mask, cpu_online_mask);
+		next_cpu = cpumask_next_and(next_cpu, mask,
+						cpu_online_mask);
 
 	/* Fastpath: do that cpu by itself. */
 	if (next_cpu >= nr_cpu_ids) {
 		smp_call_function_single(cpu, func, info, wait);
-		return;
+		goto out_unlock;
 	}
 
 	data = &__get_cpu_var(cfd_data);
@@ -523,7 +532,7 @@ void smp_call_function_many(const struct cpumask *mask,
 	/* Some callers race with other cpus changing the passed mask */
 	if (unlikely(!refs)) {
 		csd_unlock(&data->csd);
-		return;
+		goto out_unlock;
 	}
 
 	raw_spin_lock_irqsave(&call_function.lock, flags);
@@ -554,6 +563,9 @@ void smp_call_function_many(const struct cpumask *mask,
 	/* Optionally wait for the CPUs to complete */
 	if (wait)
 		csd_lock_wait(&data->csd);
+
+out_unlock:
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL(smp_call_function_many);
 
@@ -574,9 +586,9 @@ EXPORT_SYMBOL(smp_call_function_many);
  */
 int smp_call_function(smp_call_func_t func, void *info, int wait)
 {
-	preempt_disable();
+	get_online_cpus_atomic();
 	smp_call_function_many(cpu_online_mask, func, info, wait);
-	preempt_enable();
+	put_online_cpus_atomic();
 
 	return 0;
 }


* [PATCH v5 08/45] CPU hotplug: Convert preprocessor macros to static inline functions
From: Srivatsa S. Bhat @ 2013-01-22  7:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

On 12/05/2012 06:10 AM, Andrew Morton wrote:
"static inline C functions would be preferred if possible.  Feel free to
fix up the wrong crufty surrounding code as well ;-)"

Convert the macros in the CPU hotplug code to static inline C functions.
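
One practical difference can be sketched in plain userspace C: a static
inline stub is a real function, so its name can be passed around as a
function pointer and its call sites are type-checked, whereas a
function-like macro only expands at call sites. The names stub_*,
sample_body and run_bracketed below are hypothetical and exist only for
this illustration; they are not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the !CONFIG_HOTPLUG_CPU stubs. */
static inline void stub_get_online_cpus(void) {}
static inline void stub_put_online_cpus(void) {}

typedef void (*hotplug_hook_t)(void);

/* Hypothetical payload that runs inside the bracketed region. */
int sample_body(void)
{
	return 42;
}

/*
 * Run body() between an enter/leave pair. Passing the stubs by name
 * works because they are functions; a macro defined as
 * "#define stub_get_online_cpus() do { } while (0)" could not be
 * passed here, since the bare name would not be a declared symbol.
 */
int run_bracketed(hotplug_hook_t enter, hotplug_hook_t leave,
		  int (*body)(void))
{
	int ret;

	assert(enter && leave && body);
	enter();
	ret = body();
	leave();
	return ret;
}
```

Calling run_bracketed(stub_get_online_cpus, stub_put_online_cpus,
sample_body) simply returns whatever the body returns; the stubs compile
away to nothing.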

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/cpu.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index cf24da1..eb79f47 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -198,10 +198,10 @@ static inline void cpu_hotplug_driver_unlock(void)
 
 #else		/* CONFIG_HOTPLUG_CPU */
 
-#define get_online_cpus()	do { } while (0)
-#define put_online_cpus()	do { } while (0)
-#define get_online_cpus_atomic()	do { } while (0)
-#define put_online_cpus_atomic()	do { } while (0)
+static inline void get_online_cpus(void) {}
+static inline void put_online_cpus(void) {}
+static inline void get_online_cpus_atomic(void) {}
+static inline void put_online_cpus_atomic(void) {}
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })


* [PATCH v5 07/45] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2013-01-22  7:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

There are places where preempt_disable() or local_irq_disable() is used
to prevent any CPU from going offline during the critical section. Let us
call them "atomic hotplug readers" ("atomic" because they run in atomic,
non-preemptible contexts).

Today, preempt_disable() or its equivalent works because the hotplug writer
uses stop_machine() to take CPUs offline. But once stop_machine() is gone
from the CPU hotplug offline path, the readers won't be able to prevent
CPUs from going offline using preempt_disable().

So the intent here is to provide synchronization APIs for such atomic hotplug
readers, to prevent (any) CPUs from going offline, without depending on
stop_machine() at the writer-side. The new APIs will look something like
this:  get_online_cpus_atomic() and put_online_cpus_atomic()

Some important design requirements and considerations:
-----------------------------------------------------

1. Scalable synchronization at the reader-side, especially in the fast-path

   Any synchronization at the atomic hotplug reader side must be highly
   scalable: avoid global single-holder locks, counters, etc. These paths
   currently use the extremely fast preempt_disable(), so our replacement
   for preempt_disable() should not become ridiculously costly, nor should
   it needlessly serialize the readers among themselves.

   At a minimum, the new APIs must be extremely fast at the reader side,
   at least in the fast-path, when no CPU offline writers are active.

2. preempt_disable() was recursive. The replacement should also be recursive.

3. No (new) lock-ordering restrictions

   preempt_disable() was super-flexible. It didn't impose any ordering
   restrictions or rules for nesting. Our replacement should also be equally
   flexible and usable.

4. No deadlock possibilities

   Regular per-cpu locking is not the way to go if we want to have relaxed
   rules for lock-ordering, because we can end up in circular-locking
   dependencies as explained in https://lkml.org/lkml/2012/12/6/290

   So, avoid the usual per-cpu locking schemes (per-cpu locks/per-cpu atomic
   counters with spin-on-contention etc) as much as possible, to keep
   numerous deadlock possibilities from creeping in.


Implementation of the design:
----------------------------

We use per-CPU reader-writer locks for synchronization because:

  a. They are quite fast and scalable in the fast-path (when no writers are
     active), since they use fast per-cpu counters in those paths.

  b. They are recursive at the reader side.

  c. They provide a good amount of safety against deadlocks; they don't
     spring new deadlock possibilities on us from out of nowhere. As a
     result, they have relaxed locking rules and are quite flexible, and
     thus are best suited for replacing usages of preempt_disable() or
     local_irq_disable() at the reader side.

Together, these satisfy all the requirements mentioned above.
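
As a rough illustration of properties (a) to (c), here is a
single-threaded userspace model of the reader and writer paths. It is a
sketch under stated assumptions, not the kernel implementation in
lib/percpu-rwlock.c: the names pcpu_rwlock_model, model_* and
MODEL_NR_CPUS are hypothetical, the "CPU" is just an array index passed
by the caller, and the fallback to the global rwlock is reduced to a
"would have to wait" return value.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

struct pcpu_rwlock_model {
	int  reader_refcnt[MODEL_NR_CPUS];	/* per-CPU read nesting depth */
	bool writer_active;			/* writer has announced itself */
};

/*
 * Reader fast path: touch only this CPU's counter. A nested reader
 * (refcnt already non-zero) always proceeds, even if a writer has
 * announced itself in the meantime. Returns false where the real lock
 * would fall back to waiting on the global rwlock.
 */
bool model_read_lock(struct pcpu_rwlock_model *l, int cpu)
{
	if (l->reader_refcnt[cpu] > 0 || !l->writer_active) {
		l->reader_refcnt[cpu]++;
		return true;
	}
	return false;
}

void model_read_unlock(struct pcpu_rwlock_model *l, int cpu)
{
	assert(l->reader_refcnt[cpu] > 0);
	l->reader_refcnt[cpu]--;
}

/* Writer step 1: announce intent so that new readers leave the fast path. */
void model_writer_announce(struct pcpu_rwlock_model *l)
{
	l->writer_active = true;
}

/* Writer step 2: it may proceed only once every per-CPU count has drained. */
bool model_writer_may_proceed(const struct pcpu_rwlock_model *l)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (l->reader_refcnt[cpu] > 0)
			return false;
	return true;
}

void model_write_unlock(struct pcpu_rwlock_model *l)
{
	l->writer_active = false;
}
```

In this model a reader that is already inside its critical section can
re-enter after the writer announces itself, while a fresh reader on
another CPU is told to wait. That is the recursion property in (b)
without readers on different CPUs ever sharing a counter.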

I'm indebted to Michael Wang and Xiao Guangrong for their numerous thoughtful
suggestions and ideas, which inspired and influenced many of the decisions in
this as well as previous designs. Thanks a lot Michael and Xiao!

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: uclinux-dist-devel@blackfin.uclinux.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-am33-list@redhat.com
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 arch/arm/Kconfig      |    1 +
 arch/blackfin/Kconfig |    1 +
 arch/ia64/Kconfig     |    1 +
 arch/mips/Kconfig     |    1 +
 arch/mn10300/Kconfig  |    1 +
 arch/parisc/Kconfig   |    1 +
 arch/powerpc/Kconfig  |    1 +
 arch/s390/Kconfig     |    1 +
 arch/sh/Kconfig       |    1 +
 arch/sparc/Kconfig    |    1 +
 arch/x86/Kconfig      |    1 +
 include/linux/cpu.h   |    4 +++
 kernel/cpu.c          |   57 ++++++++++++++++++++++++++++++++++++++++++++++---
 13 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 67874b8..cb6b94b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1616,6 +1616,7 @@ config NR_CPUS
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP && HOTPLUG
+	select PERCPU_RWLOCK
 	help
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index b6f3ad5..83d9882 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -261,6 +261,7 @@ config NR_CPUS
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP && HOTPLUG
+	select PERCPU_RWLOCK
 	default y
 
 config BF_REV_MIN
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 3279646..c246772 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -378,6 +378,7 @@ config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
 	depends on SMP && EXPERIMENTAL
 	select HOTPLUG
+	select PERCPU_RWLOCK
 	default n
 	---help---
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 2ac626a..f97c479 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -956,6 +956,7 @@ config SYS_HAS_EARLY_PRINTK
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP && HOTPLUG && SYS_SUPPORTS_HOTPLUG_CPU
+	select PERCPU_RWLOCK
 	help
 	  Say Y here to allow turning CPUs off and on. CPUs can be
 	  controlled through /sys/devices/system/cpu.
diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig
index e70001c..a64e488 100644
--- a/arch/mn10300/Kconfig
+++ b/arch/mn10300/Kconfig
@@ -60,6 +60,7 @@ config ARCH_HAS_ILOG2_U32
 
 config HOTPLUG_CPU
 	def_bool n
+	select PERCPU_RWLOCK
 
 source "init/Kconfig"
 
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index b77feff..6f55cd4 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -226,6 +226,7 @@ config HOTPLUG_CPU
 	bool
 	default y if SMP
 	select HOTPLUG
+	select PERCPU_RWLOCK
 
 config ARCH_SELECT_MEMORY_MODEL
 	def_bool y
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 17903f1..56b1f15 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -336,6 +336,7 @@ config HOTPLUG_CPU
 	bool "Support for enabling/disabling CPUs"
 	depends on SMP && HOTPLUG && EXPERIMENTAL && (PPC_PSERIES || \
 	PPC_PMAC || PPC_POWERNV || (PPC_85xx && !PPC_E500MC))
+	select PERCPU_RWLOCK
 	---help---
 	  Say Y here to be able to disable and re-enable individual
 	  CPUs at runtime on SMP machines.
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index b5ea38c..a9aafb4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -299,6 +299,7 @@ config HOTPLUG_CPU
 	prompt "Support for hot-pluggable CPUs"
 	depends on SMP
 	select HOTPLUG
+	select PERCPU_RWLOCK
 	help
 	  Say Y here to be able to turn CPUs off and on. CPUs
 	  can be controlled through /sys/devices/system/cpu/cpu#.
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index babc2b8..8c92eef 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -765,6 +765,7 @@ config NR_CPUS
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
 	depends on SMP && HOTPLUG && EXPERIMENTAL
+	select PERCPU_RWLOCK
 	help
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 9f2edb5..e2bd573 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -253,6 +253,7 @@ config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SPARC64 && SMP
 	select HOTPLUG
+	select PERCPU_RWLOCK
 	help
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu/cpu#.
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 79795af..a225d12 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1689,6 +1689,7 @@ config PHYSICAL_ALIGN
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP && HOTPLUG
+	select PERCPU_RWLOCK
 	---help---
 	  Say Y here to allow turning CPUs off and on. CPUs can be
 	  controlled through /sys/devices/system/cpu.
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index ce7a074..cf24da1 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;
 
 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void get_online_cpus_atomic(void);
+extern void put_online_cpus_atomic(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+#define get_online_cpus_atomic()	do { } while (0)
+#define put_online_cpus_atomic()	do { } while (0)
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 3046a50..1c84138 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1,6 +1,18 @@
 /* CPU control.
  * (C) 2001, 2002, 2003, 2004 Rusty Russell
  *
+ * Rework of the CPU hotplug offline mechanism to remove its dependence on
+ * the heavy-weight stop_machine() primitive, by Srivatsa S. Bhat and
+ * Paul E. McKenney.
+ *
+ * Copyright (C) IBM Corporation, 2012-2013
+ * Authors: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
+ *          Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+ *
+ * With lots of invaluable suggestions from:
+ *	    Oleg Nesterov <oleg@redhat.com>
+ *	    Tejun Heo <tj@kernel.org>
+ *
  * This code is licenced under the GPL.
  */
 #include <linux/proc_fs.h>
@@ -19,6 +31,7 @@
 #include <linux/mutex.h>
 #include <linux/gfp.h>
 #include <linux/suspend.h>
+#include <linux/percpu-rwlock.h>
 
 #include "smpboot.h"
 
@@ -133,6 +146,38 @@ static void cpu_hotplug_done(void)
 	mutex_unlock(&cpu_hotplug.lock);
 }
 
+/*
+ * Per-CPU Reader-Writer lock to synchronize between atomic hotplug
+ * readers and the CPU offline hotplug writer.
+ */
+DEFINE_STATIC_PERCPU_RWLOCK(hotplug_pcpu_rwlock);
+
+/*
+ * Invoked by atomic hotplug reader (a task which wants to prevent
+ * CPU offline, but which can't afford to sleep), to prevent CPUs from
+ * going offline. So, you can call this function from atomic contexts
+ * (including interrupt handlers).
+ *
+ * Note: This does NOT prevent CPUs from coming online! It only prevents
+ * CPUs from going offline.
+ *
+ * You can call this function recursively.
+ *
+ * Returns with preemption disabled (but interrupts remain as they are;
+ * they are not disabled).
+ */
+void get_online_cpus_atomic(void)
+{
+	percpu_read_lock_irqsafe(&hotplug_pcpu_rwlock);
+}
+EXPORT_SYMBOL_GPL(get_online_cpus_atomic);
+
+void put_online_cpus_atomic(void)
+{
+	percpu_read_unlock_irqsafe(&hotplug_pcpu_rwlock);
+}
+EXPORT_SYMBOL_GPL(put_online_cpus_atomic);
+
 #else /* #if CONFIG_HOTPLUG_CPU */
 static void cpu_hotplug_begin(void) {}
 static void cpu_hotplug_done(void) {}
@@ -246,15 +291,21 @@ struct take_cpu_down_param {
 static int __ref take_cpu_down(void *_param)
 {
 	struct take_cpu_down_param *param = _param;
-	int err;
+	unsigned long flags;
+	int err = 0;
+
+	percpu_write_lock_irqsave(&hotplug_pcpu_rwlock, &flags);
 
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();
 	if (err < 0)
-		return err;
+		goto out;
 
 	cpu_notify(CPU_DYING | param->mod, param->hcpu);
-	return 0;
+
+out:
+	percpu_write_unlock_irqrestore(&hotplug_pcpu_rwlock, &flags);
+	return err;
 }
 
 /* Requires cpu_add_remove_lock to be held */


* [PATCH v5 06/45] percpu_rwlock: Allow writers to be readers, and add lockdep annotations
From: Srivatsa S. Bhat @ 2013-01-22  7:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

CPU hotplug (which will be the first user of per-CPU rwlocks) has a special
requirement with respect to locking: the writer, after acquiring the per-CPU
rwlock for write, must be allowed to take the same lock for read, without
deadlocking and without getting complaints from lockdep. This is similar to
what get_online_cpus()/put_online_cpus() do today: they allow a hotplug writer
(who holds the cpu_hotplug.lock mutex) to invoke them without locking issues,
because they silently return if the caller is the hotplug writer itself.

This can be easily achieved with per-CPU rwlocks as well (even without an
"is this a writer?" check) by incrementing the per-CPU refcount of the writer
immediately after taking the global rwlock for write, and then decrementing
the per-CPU refcount before releasing the global rwlock.
This ensures that any reader that comes along on that CPU while the writer is
active (on that same CPU) notices the non-zero value of the nested counter,
assumes that it is a nested read-side critical section and proceeds by
just incrementing the refcount. Thus we prevent the reader from taking the
global rwlock for read, which prevents the writer from deadlocking itself.
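
As a rough illustration of the paragraph above, here is a single-CPU,
user-space model of the idea. It is only a sketch: struct model_lock and the
model_*() helpers are hypothetical names invented for this example, there is
no real concurrency, and "would spin forever" is modelled by returning false.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical single-CPU model of the scheme; all names here are
 * illustrative and none of this is the kernel implementation.
 */
struct model_lock {
	unsigned long reader_refcnt;	/* this CPU's nesting counter */
	bool writer_signal;		/* a writer has announced itself */
	bool global_write_held;		/* global rwlock is held for write */
	unsigned int global_readers;	/* readers holding the global rwlock */
};

/*
 * Returns true if the read lock is acquired. Returns false where a real
 * reader would spin forever on the global rwlock (i.e. self-deadlock).
 */
static bool model_read_lock(struct model_lock *l)
{
	if (l->reader_refcnt > 0 || !l->writer_signal) {
		/* Nested section, or no writer around: use the refcount. */
		l->reader_refcnt++;
		return true;
	}

	/* Writer is active: the reader must use the global rwlock. */
	if (l->global_write_held)
		return false;

	l->global_readers++;
	return true;
}

static void model_write_lock(struct model_lock *l, bool writer_as_reader)
{
	assert(!l->global_write_held);
	l->writer_signal = true;
	l->global_write_held = true;
	if (writer_as_reader)
		l->reader_refcnt++;	/* the increment described above */
}

static void model_write_unlock(struct model_lock *l, bool writer_as_reader)
{
	if (writer_as_reader)
		l->reader_refcnt--;	/* undo it before releasing the lock */
	l->writer_signal = false;
	l->global_write_held = false;
}
```

With writer_as_reader set to false, a read lock taken by the writer on its
own CPU falls into the global-rwlock branch and gets stuck; with it set to
true, the same call sees a non-zero refcount and just increments it.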

Add that support and teach lockdep about this special locking scheme so
that it knows that this sort of usage is valid. Also add the required lockdep
annotations to enable it to detect common locking problems with per-CPU
rwlocks.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 lib/percpu-rwlock.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
index a8d177a..054a50a 100644
--- a/lib/percpu-rwlock.c
+++ b/lib/percpu-rwlock.c
@@ -84,6 +84,10 @@ void percpu_read_lock_irqsafe(struct percpu_rwlock *pcpu_rwlock)

 		if (likely(!writer_active(pcpu_rwlock))) {
 			this_cpu_inc(*pcpu_rwlock->reader_refcnt);
+
+			/* Pretend that we take global_rwlock for lockdep */
+			rwlock_acquire_read(&pcpu_rwlock->global_rwlock.dep_map,
+					    0, 0, _RET_IP_);
 		} else {
 			/* Writer is active, so switch to global rwlock. */

@@ -108,6 +112,12 @@ void percpu_read_lock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
 			if (!writer_active(pcpu_rwlock)) {
 				this_cpu_inc(*pcpu_rwlock->reader_refcnt);
 				read_unlock(&pcpu_rwlock->global_rwlock);
+
+				/*
+				 * Pretend that we take global_rwlock for lockdep
+				 */
+				rwlock_acquire_read(&pcpu_rwlock->global_rwlock.dep_map,
+						    0, 0, _RET_IP_);
 			}
 		}
 	}
@@ -128,6 +138,14 @@ void percpu_read_unlock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
 	if (reader_nested_percpu(pcpu_rwlock)) {
 		this_cpu_dec(*pcpu_rwlock->reader_refcnt);
 		smp_wmb(); /* Paired with smp_rmb() in sync_reader() */
+
+		/*
+		 * If this is the last decrement, then it is time to pretend
+		 * to lockdep that we are releasing the read lock.
+		 */
+		if (!reader_nested_percpu(pcpu_rwlock))
+			rwlock_release(&pcpu_rwlock->global_rwlock.dep_map,
+				       1, _RET_IP_);
 	} else {
 		read_unlock(&pcpu_rwlock->global_rwlock);
 	}
@@ -205,11 +223,14 @@ void percpu_write_lock_irqsave(struct percpu_rwlock *pcpu_rwlock,
 	announce_writer_active(pcpu_rwlock);
 	sync_all_readers(pcpu_rwlock);
 	write_lock_irqsave(&pcpu_rwlock->global_rwlock, *flags);
+	this_cpu_inc(*pcpu_rwlock->reader_refcnt);
 }

 void percpu_write_unlock_irqrestore(struct percpu_rwlock *pcpu_rwlock,
 			 unsigned long *flags)
 {
+	this_cpu_dec(*pcpu_rwlock->reader_refcnt);
+
 	/*
 	 * Inform all readers that we are done, so that they can switch back
 	 * to their per-cpu refcounts. (We don't need to wait for them to


* [PATCH v5 05/45] percpu_rwlock: Make percpu-rwlocks IRQ-safe, optimally
From: Srivatsa S. Bhat @ 2013-01-22  7:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

If interrupt handlers can also be readers, then one of the ways to make
per-CPU rwlocks safe is to disable interrupts at the reader side before
trying to acquire the per-CPU rwlock and keep them disabled throughout the
duration of the read-side critical section.

The goal is to avoid cases such as:

  1. writer is active and it holds the global rwlock for write

  2. a regular reader comes in and marks itself as present (by incrementing
     its per-CPU refcount) before checking whether writer is active.

  3. an interrupt hits the reader;
     [If it had not hit, the reader would have noticed that the writer is
      active and would have decremented its refcount and would have tried
      to acquire the global rwlock for read].
     Since the interrupt handler also happens to be a reader, it notices
     the non-zero refcount (which was due to the reader who got interrupted)
     and thinks that this is a nested read-side critical section and
     proceeds to take the fastpath, which is wrong. The interrupt handler
     should have noticed that the writer is active and taken the rwlock
     for read.

So, disabling interrupts can help avoid this problem (at the cost of keeping
the interrupts disabled for quite long).

But Oleg had a brilliant idea by which we can do much better than that:
we can manage with disabling interrupts _just_ during the updates (writes to
per-CPU refcounts) to safeguard against races with interrupt handlers.
Beyond that, we can keep the interrupts enabled and still be safe w.r.t.
interrupt handlers that can act as readers.

Basically the idea is that we differentiate between the *part* of the
per-CPU refcount that we use for reference counting vs the part that we use
merely to make the writer wait for us to switch over to the right
synchronization scheme.

The scheme involves splitting the per-CPU refcounts into 2 parts:
e.g., the lower 16 bits are used to track the nesting depth of the reader
(a "nested-counter"), and the remaining (upper) bits are used to merely mark
the presence of the reader.

As long as the overall reader_refcnt is non-zero, the writer waits for the
reader (assuming that the reader is still actively using per-CPU refcounts for
synchronization).

The reader first sets one of the higher bits to mark its presence, and then
uses the lower 16 bits to manage the nesting depth. So, an interrupt handler
coming in as illustrated above will be able to distinguish between "this is
a nested read-side critical section" vs "we have merely marked our presence
to make the writer wait for us to switch" by looking at the same refcount.
Thus, it makes it unnecessary to keep interrupts disabled throughout the
read-side critical section, despite having the possibility of interrupt
handlers being readers themselves.
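
To make the split concrete, here is a small user-space sketch of how the two
parts of the refcount are interpreted. READER_PRESENT and READER_REFCNT_MASK
have the values used in this patch; the model_*() helpers are hypothetical
names for this example only, the refcount is a plain variable rather than a
per-CPU one, and the slowpath (taking the global rwlock) is modelled by
returning false.

```c
#include <assert.h>
#include <stdbool.h>

#define READER_PRESENT		(1UL << 16)
#define READER_REFCNT_MASK	(READER_PRESENT - 1)

/* Lower 16 bits non-zero: a read-side critical section is in progress. */
static bool model_reader_nested(unsigned long refcnt)
{
	return (refcnt & READER_REFCNT_MASK) != 0;
}

/* The writer waits as long as *any* bit of the refcount is set. */
static bool model_writer_must_wait(unsigned long refcnt)
{
	return refcnt != 0;
}

/*
 * Model of the read-lock decision on one CPU. Returns true if the reader
 * stays on the per-CPU refcount (fastpath), false if it would have to take
 * the global rwlock instead.
 */
static bool model_read_lock(unsigned long *refcnt, bool writer_active)
{
	bool fastpath;

	assert(refcnt);

	*refcnt += READER_PRESENT;	/* mark presence first */

	if (model_reader_nested(*refcnt) || !writer_active) {
		(*refcnt)++;		/* bump the nesting depth */
		fastpath = true;
	} else {
		fastpath = false;	/* would use the global rwlock */
	}

	*refcnt -= READER_PRESENT;	/* presence marker no longer needed */
	return fastpath;
}
```

In the problematic scenario, the interrupted reader has only added
READER_PRESENT, so an interrupt handler calling model_read_lock() with the
writer active sees zero in the lower bits and does not mistake the situation
for nesting. A genuinely nested call (lower bits already non-zero) still
takes the fastpath.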


Implement this logic and rename the locking functions appropriately, to
reflect what they do.

Based-on-idea-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/percpu-rwlock.h |   15 ++++++++++-----
 lib/percpu-rwlock.c           |   41 +++++++++++++++++++++++++++--------------
 2 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
index 6819bb8..856ba6b 100644
--- a/include/linux/percpu-rwlock.h
+++ b/include/linux/percpu-rwlock.h
@@ -34,11 +34,13 @@ struct percpu_rwlock {
 	rwlock_t		global_rwlock;
 };
 
-extern void percpu_read_lock(struct percpu_rwlock *);
-extern void percpu_read_unlock(struct percpu_rwlock *);
+extern void percpu_read_lock_irqsafe(struct percpu_rwlock *);
+extern void percpu_read_unlock_irqsafe(struct percpu_rwlock *);
 
-extern void percpu_write_lock(struct percpu_rwlock *);
-extern void percpu_write_unlock(struct percpu_rwlock *);
+extern void percpu_write_lock_irqsave(struct percpu_rwlock *,
+				      unsigned long *flags);
+extern void percpu_write_unlock_irqrestore(struct percpu_rwlock *,
+					   unsigned long *flags);
 
 extern int __percpu_init_rwlock(struct percpu_rwlock *,
 				const char *, struct lock_class_key *);
@@ -68,11 +70,14 @@ extern void percpu_free_rwlock(struct percpu_rwlock *);
 	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\
 })
 
+#define READER_PRESENT		(1UL << 16)
+#define READER_REFCNT_MASK	(READER_PRESENT - 1)
+
 #define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
 		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
 
 #define reader_nested_percpu(pcpu_rwlock)				\
-			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
+	(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) & READER_REFCNT_MASK)
 
 #define writer_active(pcpu_rwlock)					\
 			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
index 992da5c..a8d177a 100644
--- a/lib/percpu-rwlock.c
+++ b/lib/percpu-rwlock.c
@@ -62,19 +62,19 @@ void percpu_free_rwlock(struct percpu_rwlock *pcpu_rwlock)
 	pcpu_rwlock->writer_signal = NULL;
 }
 
-void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
+void percpu_read_lock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
 {
 	preempt_disable();
 
 	/* First and foremost, let the writer know that a reader is active */
-	this_cpu_inc(*pcpu_rwlock->reader_refcnt);
+	this_cpu_add(*pcpu_rwlock->reader_refcnt, READER_PRESENT);
 
 	/*
 	 * If we are already using per-cpu refcounts, it is not safe to switch
 	 * the synchronization scheme. So continue using the refcounts.
 	 */
 	if (reader_nested_percpu(pcpu_rwlock)) {
-		goto out;
+		this_cpu_inc(*pcpu_rwlock->reader_refcnt);
 	} else {
 		/*
 		 * The write to 'reader_refcnt' must be visible before we
@@ -83,9 +83,19 @@ void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
 		smp_mb(); /* Paired with smp_rmb() in sync_reader() */
 
 		if (likely(!writer_active(pcpu_rwlock))) {
-			goto out;
+			this_cpu_inc(*pcpu_rwlock->reader_refcnt);
 		} else {
 			/* Writer is active, so switch to global rwlock. */
+
+			/*
+			 * While we are spinning on ->global_rwlock, an
+			 * interrupt can hit us, and the interrupt handler
+			 * might call this function. The distinction between
+			 * READER_PRESENT and the refcnt helps ensure that the
+			 * interrupt handler also takes this branch and spins
+			 * on the ->global_rwlock, as long as the writer is
+			 * active.
+			 */
 			read_lock(&pcpu_rwlock->global_rwlock);
 
 			/*
@@ -95,26 +105,27 @@ void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
 			 * back to per-cpu refcounts. (This also helps avoid
 			 * heterogeneous nesting of readers).
 			 */
-			if (writer_active(pcpu_rwlock))
-				this_cpu_dec(*pcpu_rwlock->reader_refcnt);
-			else
+			if (!writer_active(pcpu_rwlock)) {
+				this_cpu_inc(*pcpu_rwlock->reader_refcnt);
 				read_unlock(&pcpu_rwlock->global_rwlock);
+			}
 		}
 	}
 
-out:
+	this_cpu_sub(*pcpu_rwlock->reader_refcnt, READER_PRESENT);
+
 	/* Prevent reordering of any subsequent reads */
 	smp_rmb();
 }
 
-void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
+void percpu_read_unlock_irqsafe(struct percpu_rwlock *pcpu_rwlock)
 {
 	/*
 	 * We never allow heterogeneous nesting of readers. So it is trivial
 	 * to find out the kind of reader we are, and undo the operation
 	 * done by our corresponding percpu_read_lock().
 	 */
-	if (__this_cpu_read(*pcpu_rwlock->reader_refcnt)) {
+	if (reader_nested_percpu(pcpu_rwlock)) {
 		this_cpu_dec(*pcpu_rwlock->reader_refcnt);
 		smp_wmb(); /* Paired with smp_rmb() in sync_reader() */
 	} else {
@@ -184,7 +195,8 @@ static void sync_all_readers(struct percpu_rwlock *pcpu_rwlock)
 		sync_reader(pcpu_rwlock, cpu);
 }
 
-void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
+void percpu_write_lock_irqsave(struct percpu_rwlock *pcpu_rwlock,
+			       unsigned long *flags)
 {
 	/*
 	 * Tell all readers that a writer is becoming active, so that they
@@ -192,10 +204,11 @@ void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
 	 */
 	announce_writer_active(pcpu_rwlock);
 	sync_all_readers(pcpu_rwlock);
-	write_lock(&pcpu_rwlock->global_rwlock);
+	write_lock_irqsave(&pcpu_rwlock->global_rwlock, *flags);
 }
 
-void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
+void percpu_write_unlock_irqrestore(struct percpu_rwlock *pcpu_rwlock,
+			 unsigned long *flags)
 {
 	/*
 	 * Inform all readers that we are done, so that they can switch back
@@ -203,6 +216,6 @@ void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
 	 * see it).
 	 */
 	announce_writer_inactive(pcpu_rwlock);
-	write_unlock(&pcpu_rwlock->global_rwlock);
+	write_unlock_irqrestore(&pcpu_rwlock->global_rwlock, *flags);
 }
 


* [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Srivatsa S. Bhat @ 2013-01-22  7:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Using global rwlocks as the backend for per-CPU rwlocks helps us avoid many
lock-ordering related problems (unlike per-cpu locks). However, global
rwlocks lead to unnecessary cache-line bouncing even when there are no
writers present, which can slow down the system needlessly.

Per-cpu counters can help solve the cache-line bouncing problem. So we
actually use the best of both: per-cpu counters (no-waiting) at the reader
side in the fast-path, and global rwlocks in the slowpath.

[ Fastpath = no writer is active; Slowpath = a writer is active ]

IOW, the readers just increment/decrement their per-cpu refcounts (disabling
interrupts during the updates, if necessary) when no writer is active.
When a writer becomes active, he signals all readers to switch to global
rwlocks for the duration of his activity. The readers switch over when it
is safe for them (i.e., when they are about to start a fresh, non-nested
read-side critical section) and start using (holding) the global rwlock for
read in their subsequent critical sections.

The writer waits for every existing reader to switch, and then acquires the
global rwlock for write and enters his critical section. Later, the writer
signals all readers that he is done, and that they can go back to using their
per-cpu refcounts again.
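
The handshake described above can be sketched as a sequential user-space
model. This is not the patch's code: the per-CPU variables are replaced by
plain arrays, MODEL_NR_CPUS and the model_*() names are hypothetical, and
instead of actually spinning the writer side just reports how many CPUs it
would still have to wait for.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* hypothetical CPU count for the model */

struct model_pcpu_rwlock {
	unsigned long reader_refcnt[MODEL_NR_CPUS];
	bool writer_signal[MODEL_NR_CPUS];
};

/* Writer, step 1: ask the readers on every CPU to switch over. */
static void model_announce_writer_active(struct model_pcpu_rwlock *l)
{
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		l->writer_signal[cpu] = true;
}

/* Writer, step 2: CPUs whose readers are still on per-CPU refcounts. */
static unsigned int model_cpus_to_wait_for(const struct model_pcpu_rwlock *l)
{
	unsigned int n = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (l->reader_refcnt[cpu] != 0)
			n++;
	return n;
}

/*
 * Reader entry on 'cpu'. Returns true if the reader stays on the per-CPU
 * refcount, false if it switches to the global rwlock (not modelled here).
 */
static bool model_reader_uses_refcnt(struct model_pcpu_rwlock *l,
				     unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);

	l->reader_refcnt[cpu]++;		/* announce ourselves first */

	if (l->reader_refcnt[cpu] > 1)		/* nested: must not switch */
		return true;
	if (!l->writer_signal[cpu])		/* no writer: fastpath */
		return true;

	l->reader_refcnt[cpu]--;		/* fresh reader: switch over */
	return false;
}
```

A reader that is already inside a critical section keeps using its refcount
even after the writer's signal, which is why the writer has to wait for that
CPU; a fresh reader on another CPU sees the signal and switches.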

Note that the lock-safety (despite the per-cpu scheme) comes from the fact
that the readers can *choose* _when_ to switch to rwlocks upon the writer's
signal. And the readers don't wait on anybody based on the per-cpu counters.
The only true synchronization that involves waiting at the reader-side in this
scheme, is the one arising from the global rwlock, which is safe from circular
locking dependency issues.

Reader-writer locks and per-cpu counters are recursive, so they can be
used in a nested fashion in the reader-path, which makes per-CPU rwlocks also
recursive. Also, this design of switching the synchronization scheme ensures
that you can safely nest and use these locks in a very flexible manner.

I'm indebted to Michael Wang and Xiao Guangrong for their numerous thoughtful
suggestions and ideas, which inspired and influenced many of the decisions in
this as well as previous designs. Thanks a lot Michael and Xiao!

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/percpu-rwlock.h |   10 +++
 lib/percpu-rwlock.c           |  128 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
index 8dec8fe..6819bb8 100644
--- a/include/linux/percpu-rwlock.h
+++ b/include/linux/percpu-rwlock.h
@@ -68,4 +68,14 @@ extern void percpu_free_rwlock(struct percpu_rwlock *);
 	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\
 })
 
+#define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
+		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
+
+#define reader_nested_percpu(pcpu_rwlock)				\
+			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
+
+#define writer_active(pcpu_rwlock)					\
+			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
+
 #endif
+
diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
index 80dad93..992da5c 100644
--- a/lib/percpu-rwlock.c
+++ b/lib/percpu-rwlock.c
@@ -64,21 +64,145 @@ void percpu_free_rwlock(struct percpu_rwlock *pcpu_rwlock)
 
 void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
 {
-	read_lock(&pcpu_rwlock->global_rwlock);
+	preempt_disable();
+
+	/* First and foremost, let the writer know that a reader is active */
+	this_cpu_inc(*pcpu_rwlock->reader_refcnt);
+
+	/*
+	 * If we are already using per-cpu refcounts, it is not safe to switch
+	 * the synchronization scheme. So continue using the refcounts.
+	 */
+	if (reader_nested_percpu(pcpu_rwlock)) {
+		goto out;
+	} else {
+		/*
+		 * The write to 'reader_refcnt' must be visible before we
+		 * read 'writer_signal'.
+		 */
+		smp_mb(); /* Paired with smp_rmb() in sync_reader() */
+
+		if (likely(!writer_active(pcpu_rwlock))) {
+			goto out;
+		} else {
+			/* Writer is active, so switch to global rwlock. */
+			read_lock(&pcpu_rwlock->global_rwlock);
+
+			/*
+			 * We might have raced with a writer going inactive
+			 * before we took the read-lock. So re-evaluate whether
+			 * we still need to hold the rwlock or if we can switch
+			 * back to per-cpu refcounts. (This also helps avoid
+			 * heterogeneous nesting of readers).
+			 */
+			if (writer_active(pcpu_rwlock))
+				this_cpu_dec(*pcpu_rwlock->reader_refcnt);
+			else
+				read_unlock(&pcpu_rwlock->global_rwlock);
+		}
+	}
+
+out:
+	/* Prevent reordering of any subsequent reads */
+	smp_rmb();
 }
 
 void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
 {
-	read_unlock(&pcpu_rwlock->global_rwlock);
+	/*
+	 * We never allow heterogeneous nesting of readers. So it is trivial
+	 * to find out the kind of reader we are, and undo the operation
+	 * done by our corresponding percpu_read_lock().
+	 */
+	if (__this_cpu_read(*pcpu_rwlock->reader_refcnt)) {
+		this_cpu_dec(*pcpu_rwlock->reader_refcnt);
+		smp_wmb(); /* Paired with smp_rmb() in sync_reader() */
+	} else {
+		read_unlock(&pcpu_rwlock->global_rwlock);
+	}
+
+	preempt_enable();
+}
+
+static inline void raise_writer_signal(struct percpu_rwlock *pcpu_rwlock,
+				       unsigned int cpu)
+{
+	per_cpu(*pcpu_rwlock->writer_signal, cpu) = true;
+}
+
+static inline void drop_writer_signal(struct percpu_rwlock *pcpu_rwlock,
+				      unsigned int cpu)
+{
+	per_cpu(*pcpu_rwlock->writer_signal, cpu) = false;
+}
+
+static void announce_writer_active(struct percpu_rwlock *pcpu_rwlock)
+{
+	unsigned int cpu;
+
+	for_each_online_cpu(cpu)
+		raise_writer_signal(pcpu_rwlock, cpu);
+
+	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
+}
+
+static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
+{
+	unsigned int cpu;
+
+	drop_writer_signal(pcpu_rwlock, smp_processor_id());
+
+	for_each_online_cpu(cpu)
+		drop_writer_signal(pcpu_rwlock, cpu);
+
+	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
+}
+
+/*
+ * Wait for the reader to see the writer's signal and switch from percpu
+ * refcounts to global rwlock.
+ *
+ * If the reader is still using percpu refcounts, wait for him to switch.
+ * Else, we can safely go ahead, because either the reader has already
+ * switched over, or the next reader that comes along on that CPU will
+ * notice the writer's signal and will switch over to the rwlock.
+ */
+static inline void sync_reader(struct percpu_rwlock *pcpu_rwlock,
+			       unsigned int cpu)
+{
+	smp_rmb(); /* Paired with smp_[w]mb() in percpu_read_[un]lock() */
+
+	while (reader_uses_percpu_refcnt(pcpu_rwlock, cpu))
+		cpu_relax();
+}
+
+static void sync_all_readers(struct percpu_rwlock *pcpu_rwlock)
+{
+	unsigned int cpu;
+
+	for_each_online_cpu(cpu)
+		sync_reader(pcpu_rwlock, cpu);
 }
 
 void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
 {
+	/*
+	 * Tell all readers that a writer is becoming active, so that they
+	 * start switching over to the global rwlock.
+	 */
+	announce_writer_active(pcpu_rwlock);
+	sync_all_readers(pcpu_rwlock);
 	write_lock(&pcpu_rwlock->global_rwlock);
 }
 
 void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
 {
+	/*
+	 * Inform all readers that we are done, so that they can switch back
+	 * to their per-cpu refcounts. (We don't need to wait for them to
+	 * see it).
+	 */
+	announce_writer_inactive(pcpu_rwlock);
 	write_unlock(&pcpu_rwlock->global_rwlock);
 }
 


* [PATCH v5 03/45] percpu_rwlock: Provide a way to define and init percpu-rwlocks at compile time
From: Srivatsa S. Bhat @ 2013-01-22  7:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Add the support for defining and initializing percpu-rwlocks at compile time
for those users who would like to use percpu-rwlocks really early in the boot
process (even before dynamic per-CPU allocations can begin).
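
For readers unfamiliar with this style of static initializer, the following
user-space analogue shows the same pattern (token pasting to name the backing
storage, plus a designated initializer pointing at it). It is a sketch only:
plain arrays stand in for DEFINE_PER_CPU variables, and every MODEL_/model_
name is hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* hypothetical CPU count for the model */

struct model_percpu_rwlock {
	unsigned long *reader_refcnt;	/* one slot per CPU */
	bool *writer_signal;		/* one slot per CPU */
};

/* Point the lock at storage whose names are derived from the lock's name. */
#define MODEL_PERCPU_RWLOCK_INIT(name)					\
	{								\
		.reader_refcnt = name##_reader_refcnt,			\
		.writer_signal = name##_writer_signal,			\
	}

/* Define the backing storage and the lock itself, with no runtime init. */
#define DEFINE_MODEL_PERCPU_RWLOCK(name)				\
	static unsigned long name##_reader_refcnt[MODEL_NR_CPUS];	\
	static bool name##_writer_signal[MODEL_NR_CPUS];		\
	struct model_percpu_rwlock name = MODEL_PERCPU_RWLOCK_INIT(name)

DEFINE_MODEL_PERCPU_RWLOCK(demo_lock);

/* True if no CPU has a reader or a writer signal recorded. */
static bool model_lock_is_idle(const struct model_percpu_rwlock *l)
{
	assert(l->reader_refcnt && l->writer_signal);

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (l->reader_refcnt[cpu] || l->writer_signal[cpu])
			return false;
	return true;
}
```

Because everything is resolved at compile/link time, demo_lock is usable
before any allocator is available, which is the property the real macros are
after.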

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/percpu-rwlock.h |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
index cd5eab5..8dec8fe 100644
--- a/include/linux/percpu-rwlock.h
+++ b/include/linux/percpu-rwlock.h
@@ -45,6 +45,24 @@ extern int __percpu_init_rwlock(struct percpu_rwlock *,
 
 extern void percpu_free_rwlock(struct percpu_rwlock *);
 
+
+#define __PERCPU_RWLOCK_INIT(name)					\
+	{								\
+		.reader_refcnt = &name##_reader_refcnt,			\
+		.writer_signal = &name##_writer_signal,			\
+		.global_rwlock = __RW_LOCK_UNLOCKED(name.global_rwlock) \
+	}
+
+#define DEFINE_PERCPU_RWLOCK(name)					\
+	static DEFINE_PER_CPU(unsigned long, name##_reader_refcnt);	\
+	static DEFINE_PER_CPU(bool, name##_writer_signal);		\
+	struct percpu_rwlock (name) = __PERCPU_RWLOCK_INIT(name);
+
+#define DEFINE_STATIC_PERCPU_RWLOCK(name)				\
+	static DEFINE_PER_CPU(unsigned long, name##_reader_refcnt);	\
+	static DEFINE_PER_CPU(bool, name##_writer_signal);		\
+	static struct percpu_rwlock(name) = __PERCPU_RWLOCK_INIT(name);
+
 #define percpu_init_rwlock(pcpu_rwlock)					\
 ({	static struct lock_class_key rwlock_key;			\
 	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\


* [PATCH v5 02/45] percpu_rwlock: Introduce per-CPU variables for the reader and the writer
From: Srivatsa S. Bhat @ 2013-01-22  7:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

Per-CPU rwlocks ought to give better performance than global rwlocks.
That is where the "per-CPU" component comes in. So introduce the per-CPU
variables needed at the reader and the writer sides, and add support for
dynamically initializing per-CPU rwlocks. These per-CPU variables will be
used subsequently to implement the core algorithm behind per-CPU rwlocks.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/percpu-rwlock.h |    4 ++++
 lib/percpu-rwlock.c           |   21 +++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
index 45620d0..cd5eab5 100644
--- a/include/linux/percpu-rwlock.h
+++ b/include/linux/percpu-rwlock.h
@@ -29,6 +29,8 @@
 #include <linux/spinlock.h>
 
 struct percpu_rwlock {
+	unsigned long __percpu	*reader_refcnt;
+	bool __percpu		*writer_signal;
 	rwlock_t		global_rwlock;
 };
 
@@ -41,6 +43,8 @@ extern void percpu_write_unlock(struct percpu_rwlock *);
 extern int __percpu_init_rwlock(struct percpu_rwlock *,
 				const char *, struct lock_class_key *);
 
+extern void percpu_free_rwlock(struct percpu_rwlock *);
+
 #define percpu_init_rwlock(pcpu_rwlock)					\
 ({	static struct lock_class_key rwlock_key;			\
 	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\
diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
index af0c714..80dad93 100644
--- a/lib/percpu-rwlock.c
+++ b/lib/percpu-rwlock.c
@@ -31,6 +31,17 @@
 int __percpu_init_rwlock(struct percpu_rwlock *pcpu_rwlock,
 			 const char *name, struct lock_class_key *rwlock_key)
 {
+	pcpu_rwlock->reader_refcnt = alloc_percpu(unsigned long);
+	if (unlikely(!pcpu_rwlock->reader_refcnt))
+		return -ENOMEM;
+
+	pcpu_rwlock->writer_signal = alloc_percpu(bool);
+	if (unlikely(!pcpu_rwlock->writer_signal)) {
+		free_percpu(pcpu_rwlock->reader_refcnt);
+		pcpu_rwlock->reader_refcnt = NULL;
+		return -ENOMEM;
+	}
+
 	/* ->global_rwlock represents the whole percpu_rwlock for lockdep */
 #ifdef CONFIG_DEBUG_SPINLOCK
 	__rwlock_init(&pcpu_rwlock->global_rwlock, name, rwlock_key);
@@ -41,6 +52,16 @@ int __percpu_init_rwlock(struct percpu_rwlock *pcpu_rwlock,
 	return 0;
 }
 
+void percpu_free_rwlock(struct percpu_rwlock *pcpu_rwlock)
+{
+	free_percpu(pcpu_rwlock->reader_refcnt);
+	free_percpu(pcpu_rwlock->writer_signal);
+
+	/* Catch use-after-free bugs */
+	pcpu_rwlock->reader_refcnt = NULL;
+	pcpu_rwlock->writer_signal = NULL;
+}
+
 void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
 {
 	read_lock(&pcpu_rwlock->global_rwlock);


* [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend
From: Srivatsa S. Bhat @ 2013-01-22  7:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

A straight-forward (and obvious) algorithm to implement Per-CPU Reader-Writer
locks can lead to so many deadlock possibilities that the locks become very
hard, if not impossible, to use. This is explained in the example below, which helps
justify the need for a different algorithm to implement flexible Per-CPU
Reader-Writer locks.

We can use global rwlocks as shown below safely, without fear of deadlocks:

Readers:

         CPU 0                                CPU 1
         ------                               ------

1.    spin_lock(&random_lock);             read_lock(&my_rwlock);


2.    read_lock(&my_rwlock);               spin_lock(&random_lock);


Writer:

         CPU 2:
         ------

       write_lock(&my_rwlock);


We can observe that there is no possibility of deadlocks or circular locking
dependencies here. It's perfectly safe.

Now consider a blind/straight-forward conversion of global rwlocks to per-CPU
rwlocks like this:

The reader locks its own per-CPU rwlock for read, and proceeds.

Something like: read_lock(per-cpu rwlock of this cpu);

The writer acquires all per-CPU rwlocks for write and only then proceeds.

Something like:

  for_each_online_cpu(cpu)
	write_lock(per-cpu rwlock of 'cpu');


Now let's say that for performance reasons, the above scenario (which was
perfectly safe when using global rwlocks) was converted to use per-CPU rwlocks.


         CPU 0                                CPU 1
         ------                               ------

1.    spin_lock(&random_lock);             read_lock(my_rwlock of CPU 1);


2.    read_lock(my_rwlock of CPU 0);       spin_lock(&random_lock);


Writer:

         CPU 2:
         ------

      for_each_online_cpu(cpu)
        write_lock(my_rwlock of 'cpu');


Consider what happens if the writer begins his operation in between steps 1
and 2 at the reader side. It becomes evident that we end up in a (previously
non-existent) deadlock due to a circular locking dependency between the 3
entities, like this:


(holds              Waiting for
 random_lock) CPU 0 -------------> CPU 2  (holds my_rwlock of CPU 0
                                               for write)
               ^                   |
               |                   |
        Waiting|                   | Waiting
          for  |                   |  for
               |                   V
                ------ CPU 1 <------

                (holds my_rwlock of
                 CPU 1 for read)



So obviously this "straight-forward" way of implementing percpu rwlocks is
deadlock-prone. One simple measure for (or characteristic of) a safe percpu
rwlock should be that if a user replaces global rwlocks with per-CPU rwlocks
(for performance reasons), he shouldn't suddenly end up with numerous deadlock
possibilities which never existed before. The replacement should continue to
remain safe, and perhaps improve the performance.
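
The two scenarios can be checked mechanically by writing down who waits for
whom and looking for a cycle. The sketch below does that for three CPUs; the
helper and the array names are hypothetical, and the global-rwlock table
assumes (as the text above does) that CPU 0's read_lock() succeeds while the
writer is still only waiting for the lock.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_WAIT (-1)	/* this CPU is not blocked on anyone */

/*
 * waits_for[i] is the CPU that CPU i is blocked on, or NO_WAIT.
 * Returns true if following the "waits for" edges from some CPU leads
 * back to that same CPU, i.e. there is a circular dependency.
 */
static bool has_wait_cycle(const int *waits_for, int n)
{
	assert(n > 0);

	for (int start = 0; start < n; start++) {
		int cur = start;

		for (int hops = 0; hops < n && cur != NO_WAIT; hops++) {
			cur = waits_for[cur];
			if (cur == start)
				return true;
		}
	}
	return false;
}

/*
 * Global rwlock: CPU 0 gets the read lock, CPU 1 waits for random_lock
 * (held by CPU 0), and the writer on CPU 2 waits for the reader on CPU 1.
 */
static const int global_rwlock_case[3] = { NO_WAIT, 0, 1 };

/*
 * Naive per-CPU rwlocks: CPU 0 waits for the writer on CPU 2 (which holds
 * CPU 0's lock for write), CPU 1 waits for random_lock (held by CPU 0), and
 * CPU 2 waits for CPU 1 (which holds its own lock for read).
 */
static const int percpu_rwlock_case[3] = { 2, 0, 1 };
```

has_wait_cycle(global_rwlock_case, 3) finds no cycle, whereas
has_wait_cycle(percpu_rwlock_case, 3) finds the CPU 0 -> CPU 2 -> CPU 1 ->
CPU 0 loop drawn in the diagram.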

Observing the robustness of global rwlocks in providing a fair amount of
deadlock safety, we implement per-CPU rwlocks as nothing but global rwlocks,
as a first step.


Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/percpu-rwlock.h |   49 ++++++++++++++++++++++++++++++++
 lib/Kconfig                   |    3 ++
 lib/Makefile                  |    1 +
 lib/percpu-rwlock.c           |   63 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 116 insertions(+)
 create mode 100644 include/linux/percpu-rwlock.h
 create mode 100644 lib/percpu-rwlock.c

diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
new file mode 100644
index 0000000..45620d0
--- /dev/null
+++ b/include/linux/percpu-rwlock.h
@@ -0,0 +1,49 @@
+/*
+ * Flexible Per-CPU Reader-Writer Locks
+ * (with relaxed locking rules and reduced deadlock-possibilities)
+ *
+ * Copyright (C) IBM Corporation, 2012-2013
+ * Author: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
+ *
+ * With lots of invaluable suggestions from:
+ * 	   Oleg Nesterov <oleg@redhat.com>
+ * 	   Tejun Heo <tj@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _LINUX_PERCPU_RWLOCK_H
+#define _LINUX_PERCPU_RWLOCK_H
+
+#include <linux/percpu.h>
+#include <linux/lockdep.h>
+#include <linux/spinlock.h>
+
+struct percpu_rwlock {
+	rwlock_t		global_rwlock;
+};
+
+extern void percpu_read_lock(struct percpu_rwlock *);
+extern void percpu_read_unlock(struct percpu_rwlock *);
+
+extern void percpu_write_lock(struct percpu_rwlock *);
+extern void percpu_write_unlock(struct percpu_rwlock *);
+
+extern int __percpu_init_rwlock(struct percpu_rwlock *,
+				const char *, struct lock_class_key *);
+
+#define percpu_init_rwlock(pcpu_rwlock)					\
+({	static struct lock_class_key rwlock_key;			\
+	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\
+})
+
+#endif
diff --git a/lib/Kconfig b/lib/Kconfig
index 75cdb77..32fb0b9 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -45,6 +45,9 @@ config STMP_DEVICE
 config PERCPU_RWSEM
 	boolean
 
+config PERCPU_RWLOCK
+	boolean
+
 config CRC_CCITT
 	tristate "CRC-CCITT functions"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index 02ed6c0..1854b5e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock_debug.o
 lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += rwsem-spinlock.o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_PERCPU_RWSEM) += percpu-rwsem.o
+lib-$(CONFIG_PERCPU_RWLOCK) += percpu-rwlock.o
 
 CFLAGS_hweight.o = $(subst $(quote),,$(CONFIG_ARCH_HWEIGHT_CFLAGS))
 obj-$(CONFIG_GENERIC_HWEIGHT) += hweight.o
diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
new file mode 100644
index 0000000..af0c714
--- /dev/null
+++ b/lib/percpu-rwlock.c
@@ -0,0 +1,63 @@
+/*
+ * Flexible Per-CPU Reader-Writer Locks
+ * (with relaxed locking rules and reduced deadlock-possibilities)
+ *
+ * Copyright (C) IBM Corporation, 2012-2013
+ * Author: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
+ *
+ * With lots of invaluable suggestions from:
+ * 	   Oleg Nesterov <oleg@redhat.com>
+ * 	   Tejun Heo <tj@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/spinlock.h>
+#include <linux/percpu.h>
+#include <linux/lockdep.h>
+#include <linux/percpu-rwlock.h>
+#include <linux/errno.h>
+
+
+int __percpu_init_rwlock(struct percpu_rwlock *pcpu_rwlock,
+			 const char *name, struct lock_class_key *rwlock_key)
+{
+	/* ->global_rwlock represents the whole percpu_rwlock for lockdep */
+#ifdef CONFIG_DEBUG_SPINLOCK
+	__rwlock_init(&pcpu_rwlock->global_rwlock, name, rwlock_key);
+#else
+	pcpu_rwlock->global_rwlock =
+			__RW_LOCK_UNLOCKED(&pcpu_rwlock->global_rwlock);
+#endif
+	return 0;
+}
+
+void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
+{
+	read_lock(&pcpu_rwlock->global_rwlock);
+}
+
+void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
+{
+	read_unlock(&pcpu_rwlock->global_rwlock);
+}
+
+void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
+{
+	write_lock(&pcpu_rwlock->global_rwlock);
+}
+
+void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
+{
+	write_unlock(&pcpu_rwlock->global_rwlock);
+}
+

^ permalink raw reply related

* [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
From: Srivatsa S. Bhat @ 2013-01-22  7:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
offline path and provides an alternative (set of APIs) to preempt_disable() to
prevent CPUs from going offline, which can be invoked from atomic context.
The motivation behind the removal of stop_machine() is to avoid its ill-effects
and thus improve the design of CPU hotplug. (More description regarding this
is available in the patches).

All users of preempt_disable()/local_irq_disable() that relied on them to
prevent CPU offline have been converted to the new primitives introduced in
this patchset. Also, the CPU_DYING notifiers have been audited to check whether
they can cope with the removal of stop_machine() or whether they need to
use new locks for synchronization (all CPU_DYING notifiers looked OK, without
the need for any new locks).

Applies on v3.8-rc4. It currently has some locking issues with cpu idle (on
which even lockdep didn't provide any insight unfortunately). So for now, it
works with CONFIG_CPU_IDLE=n.

Overview of the patches:
-----------------------

Patches 1 to 6 introduce a generic, flexible Per-CPU Reader-Writer Locking
scheme.

Patch 7 uses this synchronization mechanism to build the
get/put_online_cpus_atomic() APIs which can be used from atomic context, to
prevent CPUs from going offline.

Patch 8 is a cleanup; it converts preprocessor macros to static inline
functions.

Patches 9 to 42 convert various call-sites to use the new APIs.

Patch 43 is the one which actually removes stop_machine() from the CPU
offline path.

Patch 44 decouples stop_machine() and CPU hotplug from Kconfig.

Patch 45 updates the documentation to reflect the new APIs.


Changes in v5:
--------------
  Exposed a new generic locking scheme: Flexible Per-CPU Reader-Writer locks,
  based on the synchronization schemes already discussed in the previous
  versions, and used it in CPU hotplug, to implement the new APIs.

  Audited the CPU_DYING notifiers in the kernel source tree and replaced
  usages of preempt_disable() with the new get/put_online_cpus_atomic() APIs
  where necessary.


Changes in v4:
--------------
  The synchronization scheme has been simplified quite a bit, which makes it
  look a lot less complex than before. Some highlights:

* Implicit ACKs:

  The earlier design required the readers to explicitly ACK the writer's
  signal. The new design uses implicit ACKs instead. The reader switching
  over to rwlock implicitly tells the writer to stop waiting for that reader.

* No atomic operations:

  Since we got rid of explicit ACKs, we no longer have the need for a reader
  and a writer to update the same counter. So we can get rid of atomic ops
  too.

Changes in v3:
--------------
* Dropped the _light() and _full() variants of the APIs. Provided a single
  interface: get/put_online_cpus_atomic().

* Completely redesigned the synchronization mechanism again, to make it
  fast and scalable at the reader-side in the fast-path (when no hotplug
  writers are active). This new scheme also ensures that there is no
  possibility of deadlocks due to circular locking dependency.
  In summary, this provides the scalability and speed of per-cpu rwlocks
  (without actually using them), while avoiding the downside (deadlock
  possibilities) which is inherent in any per-cpu locking scheme that is
  meant to compete with preempt_disable()/enable() in terms of flexibility.

  The problem with using per-cpu locking to replace preempt_disable()/enable()
  was explained here:
  https://lkml.org/lkml/2012/12/6/290

  Basically we use per-cpu counters (for scalability) when no writers are
  active, and then switch to global rwlocks (for lock-safety) when a writer
  becomes active. It is a slightly complex scheme, but it is based on
  standard principles of distributed algorithms.
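
  As a rough illustration of the idea above, here is a minimal user-space
  sketch in C11. It is not the implementation from this series: the
  sketch_* names, the fixed slot count standing in for CPUs, and the tiny
  spinning rwlock are all assumptions made for the example, and it
  deliberately leaves out nested readers, IRQ safety and lockdep. Readers
  bump a per-slot counter and only fall back to the global lock when they
  see an active writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct sketch_percpu_rwlock {
	atomic_int reader_refcnt[SKETCH_NR_SLOTS]; /* one counter per slot */
	atomic_bool writer_active;		   /* the writer's signal */
	atomic_int global_rw;	/* >0: readers, 0: free, -1: writer */
};

/* Global fallback: a minimal spinning reader-writer lock. */
static void sketch_global_read_lock(atomic_int *s)
{
	for (;;) {
		int v = atomic_load(s);

		if (v >= 0 && atomic_compare_exchange_weak(s, &v, v + 1))
			return;
	}
}

static void sketch_global_write_lock(atomic_int *s)
{
	int expected = 0;

	while (!atomic_compare_exchange_weak(s, &expected, -1))
		expected = 0;
}

/* Returns true if the fast path (per-slot counter) was taken. */
static bool sketch_read_lock(struct sketch_percpu_rwlock *l, int slot)
{
	assert(slot >= 0 && slot < SKETCH_NR_SLOTS);

	/* Announce the reader first, then check for a writer. */
	atomic_fetch_add(&l->reader_refcnt[slot], 1);
	if (!atomic_load(&l->writer_active))
		return true;

	/* Writer is active: drop the counter and use the global lock. */
	atomic_fetch_sub(&l->reader_refcnt[slot], 1);
	sketch_global_read_lock(&l->global_rw);
	return false;
}

static void sketch_read_unlock(struct sketch_percpu_rwlock *l, int slot,
			       bool fast)
{
	if (fast)
		atomic_fetch_sub(&l->reader_refcnt[slot], 1);
	else
		atomic_fetch_sub(&l->global_rw, 1);
}

static void sketch_write_lock(struct sketch_percpu_rwlock *l)
{
	/* Exclude other writers and slow-path readers, then signal. */
	sketch_global_write_lock(&l->global_rw);
	atomic_store(&l->writer_active, true);

	/* Wait for fast-path readers that got in before the signal. */
	for (int i = 0; i < SKETCH_NR_SLOTS; i++)
		while (atomic_load(&l->reader_refcnt[i]) != 0)
			;
}

static void sketch_write_unlock(struct sketch_percpu_rwlock *l)
{
	atomic_store(&l->writer_active, false);
	atomic_store(&l->global_rw, 0);
}
```

  The sketch relies on the default sequentially consistent atomics: either
  the reader observes the writer's signal and falls back to the global lock,
  or the writer observes the reader's counter and waits for it to drop.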

Changes in v2:
--------------
* Completely redesigned the synchronization scheme to avoid using any extra
  cpumasks.

* Provided APIs for 2 types of atomic hotplug readers: "light" (for
  light-weight) and "full". We wish to have more "light" readers than
  the "full" ones, to avoid indirectly inducing the "stop_machine effect"
  without even actually using stop_machine().

  And the patches show that it _is_ generally true: 5 patches deal with
  "light" readers, whereas only 1 patch deals with a "full" reader.

  Also, the "light" readers happen to be in very hot paths. So it makes a
  lot of sense to have such a distinction and a corresponding light-weight
  API.

Links to previous versions:
v4: https://lkml.org/lkml/2012/12/11/209
v3: https://lkml.org/lkml/2012/12/7/287
v2: https://lkml.org/lkml/2012/12/5/322
v1: https://lkml.org/lkml/2012/12/4/88

--

Paul E. McKenney (1):
      cpu: No more __stop_machine() in _cpu_down()

Srivatsa S. Bhat (44):
      percpu_rwlock: Introduce the global reader-writer lock backend
      percpu_rwlock: Introduce per-CPU variables for the reader and the writer
      percpu_rwlock: Provide a way to define and init percpu-rwlocks at compile time
      percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
      percpu_rwlock: Make percpu-rwlocks IRQ-safe, optimally
      percpu_rwlock: Allow writers to be readers, and add lockdep annotations
      CPU hotplug: Provide APIs to prevent CPU offline from atomic context
      CPU hotplug: Convert preprocessor macros to static inline functions
      smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly
      smp, cpu hotplug: Fix on_each_cpu_*() to prevent CPU offline properly
      sched/timer: Use get/put_online_cpus_atomic() to prevent CPU offline
      sched/migration: Use raw_spin_lock/unlock since interrupts are already disabled
      sched/rt: Use get/put_online_cpus_atomic() to prevent CPU offline
      rcu, CPU hotplug: Fix comment referring to stop_machine()
      tick: Use get/put_online_cpus_atomic() to prevent CPU offline
      time/clocksource: Use get/put_online_cpus_atomic() to prevent CPU offline
      softirq: Use get/put_online_cpus_atomic() to prevent CPU offline
      irq: Use get/put_online_cpus_atomic() to prevent CPU offline
      net: Use get/put_online_cpus_atomic() to prevent CPU offline
      block: Use get/put_online_cpus_atomic() to prevent CPU offline
      crypto: pcrypt - Protect access to cpu_online_mask with get/put_online_cpus()
      infiniband: ehca: Use get/put_online_cpus_atomic() to prevent CPU offline
      [SCSI] fcoe: Use get/put_online_cpus_atomic() to prevent CPU offline
      staging: octeon: Use get/put_online_cpus_atomic() to prevent CPU offline
      x86: Use get/put_online_cpus_atomic() to prevent CPU offline
      perf/x86: Use get/put_online_cpus_atomic() to prevent CPU offline
      KVM: Use get/put_online_cpus_atomic() to prevent CPU offline from atomic context
      kvm/vmx: Use get/put_online_cpus_atomic() to prevent CPU offline
      x86/xen: Use get/put_online_cpus_atomic() to prevent CPU offline
      alpha/smp: Use get/put_online_cpus_atomic() to prevent CPU offline
      blackfin/smp: Use get/put_online_cpus_atomic() to prevent CPU offline
      cris/smp: Use get/put_online_cpus_atomic() to prevent CPU offline
      hexagon/smp: Use get/put_online_cpus_atomic() to prevent CPU offline
      ia64: Use get/put_online_cpus_atomic() to prevent CPU offline
      m32r: Use get/put_online_cpus_atomic() to prevent CPU offline
      MIPS: Use get/put_online_cpus_atomic() to prevent CPU offline
      mn10300: Use get/put_online_cpus_atomic() to prevent CPU offline
      parisc: Use get/put_online_cpus_atomic() to prevent CPU offline
      powerpc: Use get/put_online_cpus_atomic() to prevent CPU offline
      sh: Use get/put_online_cpus_atomic() to prevent CPU offline
      sparc: Use get/put_online_cpus_atomic() to prevent CPU offline
      tile: Use get/put_online_cpus_atomic() to prevent CPU offline
      CPU hotplug, stop_machine: Decouple CPU hotplug from stop_machine() in Kconfig
      Documentation/cpu-hotplug: Remove references to stop_machine()


 Documentation/cpu-hotplug.txt                 |   17 +-
 arch/alpha/kernel/smp.c                       |   19 +-
 arch/arm/Kconfig                              |    1 
 arch/blackfin/Kconfig                         |    1 
 arch/blackfin/mach-common/smp.c               |    6 -
 arch/cris/arch-v32/kernel/smp.c               |    8 +
 arch/hexagon/kernel/smp.c                     |    5 +
 arch/ia64/Kconfig                             |    1 
 arch/ia64/kernel/irq_ia64.c                   |   13 +
 arch/ia64/kernel/perfmon.c                    |    6 +
 arch/ia64/kernel/smp.c                        |   23 ++
 arch/ia64/mm/tlb.c                            |    6 -
 arch/m32r/kernel/smp.c                        |   12 +
 arch/mips/Kconfig                             |    1 
 arch/mips/kernel/cevt-smtc.c                  |    8 +
 arch/mips/kernel/smp.c                        |   16 +-
 arch/mips/kernel/smtc.c                       |    3 
 arch/mips/mm/c-octeon.c                       |    4 
 arch/mn10300/Kconfig                          |    1 
 arch/mn10300/kernel/smp.c                     |    2 
 arch/mn10300/mm/cache-smp.c                   |    5 +
 arch/mn10300/mm/tlb-smp.c                     |   15 +-
 arch/parisc/Kconfig                           |    1 
 arch/parisc/kernel/smp.c                      |    4 
 arch/powerpc/Kconfig                          |    1 
 arch/powerpc/mm/mmu_context_nohash.c          |    2 
 arch/s390/Kconfig                             |    1 
 arch/sh/Kconfig                               |    1 
 arch/sh/kernel/smp.c                          |   12 +
 arch/sparc/Kconfig                            |    1 
 arch/sparc/kernel/leon_smp.c                  |    2 
 arch/sparc/kernel/smp_64.c                    |    9 +
 arch/sparc/kernel/sun4d_smp.c                 |    2 
 arch/sparc/kernel/sun4m_smp.c                 |    3 
 arch/tile/kernel/smp.c                        |    4 
 arch/x86/Kconfig                              |    1 
 arch/x86/include/asm/ipi.h                    |    5 +
 arch/x86/kernel/apic/apic_flat_64.c           |   10 +
 arch/x86/kernel/apic/apic_numachip.c          |    5 +
 arch/x86/kernel/apic/es7000_32.c              |    5 +
 arch/x86/kernel/apic/io_apic.c                |    7 +
 arch/x86/kernel/apic/ipi.c                    |   10 +
 arch/x86/kernel/apic/x2apic_cluster.c         |    4 
 arch/x86/kernel/apic/x2apic_uv_x.c            |    4 
 arch/x86/kernel/cpu/mcheck/therm_throt.c      |    4 
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    5 +
 arch/x86/kvm/vmx.c                            |    8 +
 arch/x86/mm/tlb.c                             |   14 +
 arch/x86/xen/mmu.c                            |   11 +
 arch/x86/xen/smp.c                            |    9 +
 block/blk-softirq.c                           |    4 
 crypto/pcrypt.c                               |    4 
 drivers/infiniband/hw/ehca/ehca_irq.c         |    8 +
 drivers/scsi/fcoe/fcoe.c                      |    7 +
 drivers/staging/octeon/ethernet-rx.c          |    3 
 include/linux/cpu.h                           |    8 +
 include/linux/percpu-rwlock.h                 |   86 +++++++++
 include/linux/stop_machine.h                  |    2 
 init/Kconfig                                  |    2 
 kernel/cpu.c                                  |   61 ++++++
 kernel/irq/manage.c                           |    7 +
 kernel/rcutree.c                              |    9 -
 kernel/sched/core.c                           |   36 +++-
 kernel/sched/fair.c                           |    5 -
 kernel/sched/rt.c                             |    3 
 kernel/smp.c                                  |   65 ++++---
 kernel/softirq.c                              |    3 
 kernel/time/clocksource.c                     |    5 +
 kernel/time/tick-broadcast.c                  |    2 
 kernel/timer.c                                |    2 
 lib/Kconfig                                   |    3 
 lib/Makefile                                  |    1 
 lib/percpu-rwlock.c                           |  242 +++++++++++++++++++++++++
 net/core/dev.c                                |    9 +
 virt/kvm/kvm_main.c                           |   10 +
 75 files changed, 776 insertions(+), 129 deletions(-)
 create mode 100644 include/linux/percpu-rwlock.h
 create mode 100644 lib/percpu-rwlock.c



Thanks,
Srivatsa S. Bhat
IBM Linux Technology Center

^ permalink raw reply

* [PATCH v2 1/2] ARM: shmobile: sh73a0: Use generic irqchip_init()
From: Olof Johansson @ 2013-01-22  7:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130121070301.GB15508@avionic-0098.adnet.avionic-design.de>

On Mon, Jan 21, 2013 at 08:03:01AM +0100, Thierry Reding wrote:
> On Mon, Jan 21, 2013 at 09:54:39AM +0900, Simon Horman wrote:
> > On Fri, Jan 18, 2013 at 08:16:12AM +0100, Thierry Reding wrote:
> > > The asm/hardware/gic.h header no longer exists and the corresponding
> > > functionality was moved to linux/irqchip.h and linux/irqchip/arm-gic.h
> > > respectively. gic_handle_irq() and of_irq_init() are no longer available
> > > either and have been replaced by irqchip_init().
> > 
> > asm/hardware/gic.h seems to still exist in Linus's tree.
> > Could you let me know which branch of which tree I should depend on
> > in order to apply this change?
> 
> I found this when doing an automated build over all ARM defconfigs on
> linux-next.
> 
> Commit 520f7bd73354f003a9a59937b28e4903d985c420 "irqchip: Move ARM gic.h
> to include/linux/irqchip/arm-gic.h" moved the file and was merged
> through Olof Johansson's next/cleanup and for-next branches.
> 
> Adding Olof on Cc since I'm not quite sure myself about how this is
> handled.

The way to handle this is to base the branch where you are adding new shmobile
code on top of the cleanup branches that change the underlying infrastructure.
This is why we merge them early during the release, so that new code for various
platforms can be based on them to avoid a bunch of conflicts in the end.

In this case, you might need to base your branch onto a merge of both
the irqchip/gic-vic-move and timer/cleanup branches from arm-soc.


-Olof

^ permalink raw reply

* [PATCH] net: fec: Add support for multiple phys on mdiobus
From: Wolfgang Grandegger @ 2013-01-22  7:22 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130121120602.GZ1906@pengutronix.de>

On 01/21/2013 01:06 PM, Sascha Hauer wrote:
> On Mon, Jan 21, 2013 at 12:07:30PM +0100, Wolfgang Grandegger wrote:
>> On 01/21/2013 11:07 AM, Sascha Hauer wrote:
>>> On Mon, Jan 21, 2013 at 09:56:24AM +0100, Wolfgang Grandegger wrote:
>>>> On 01/21/2013 09:37 AM, Sascha Hauer wrote:
>>>>> There may be multiple phys on an mdio bus. This series adds support
>>>>> for this to the fec driver. I recently had a board which has a switch
>>>>> connected to the fec's mdio bus, so I had to pick the correct phy.
>>>>
>>>> Pick one PHY from a switch port? Well, does a PHY-less (or fixed-link)
>>>> configuration for a switch not make more sense?
>>>
>>> Yes, you're probably right.
>>>
>>>> Various ARM Ethernet
>>>> controller drivers do not support it. I recently needed a hack for an
>>>> AT91 board.
>>>
>>> I wonder how we want to proceed. Should there be a devicetree property
>>> 'fixed-link' like done for fs_enet (and not recommended for new code,
>>> stated in the comment above of_phy_connect_fixed_link)?
>>
>> Also the gianfar and ucc_geth drivers use this interface (via fixed
>> link phy). I tried to use it for the AT91 macb driver but stopped
>> quickly because the usage was not straightforward (too much code)...
>> even if the idea of using a fake fixed-link phy is not bad.
>>
>>> Currently I have a property 'phy' in the fec binding which has a phandle
>>> to a phy provided by the fec's mdio bus, but this could equally well
>>
>> But then the cable must be connected to the associated switch port.
>>
>>> point to a fixed dummy phy:
>>>
>>> 	phy = &fixed-phy;
>>
>> The link speed, full/half duplex and maybe some more parameters should
>> be configurable via device tree.
> 
> Well this could be done when the fixed phy driver could be registered
> with the devicetree, maybe like this:
> 
> 	fixed-phy: mdiophy {
> 		compatible = "mdio-fixed-phy";
> 		link = "100FD";
> 	};

I find that confusing. There is *no* phy but just a fixed link to the
switch...

> The good thing about this would be that every ethernet driver could just
> use such a fixed phy, any external mdio phy (like on Marvell Armada) or
> just a phy connected to the internal mdio interface provided by the ethernet
> core.

What is wrong with the existing "fixed-link" property of the *ethernet*
node? The fixed-link handling should/could be done in the phy layer, and
not in the driver as it is currently implemented. Maybe that's the
reason why the current code is regarded as a hack!

> I probably should write a RFC to devicetree-discuss.

Yep,

Wolfgang.

^ permalink raw reply

* [PATCH] dts: vt8500: Add initial dts support for WM8850
From: Olof Johansson @ 2013-01-22  7:21 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1358577868-29042-1-git-send-email-linux@prisktech.co.nz>

On Sat, Jan 19, 2013 at 07:44:28PM +1300, Tony Prisk wrote:
> This patch adds a soc dtsi for the Wondermedia WM8850.
> 
> A board dts file is also included for the W70v2 tablet, with support
> for all the drivers currently in mainline.
> 
> Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
> ---
> Hi Olof,
> 
> Sorry this is a bit late.

For 3.9? No worries, not late yet.

I've applied this to the same branch as the other wm8x50 patches. I also fixed
up three <space><tab> occurrences in the dtsi that git am complained about.


-Olof

^ permalink raw reply

* [PATCH 7/7] ARM: sunxi: olinuxino: Add muxing for the uart
From: Olof Johansson @ 2013-01-22  7:17 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CACRpkdbUbD-VnkdSAKp=GyRiqub8iZf90nwPHPFtWZHwSMZiAA@mail.gmail.com>

On Mon, Jan 21, 2013 at 11:15:32PM +0100, Linus Walleij wrote:
> On Fri, Jan 18, 2013 at 10:30 PM, Maxime Ripard
> <maxime.ripard@free-electrons.com> wrote:
> 
> > Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> 
> All pinctrl and device tree patches applied to my allwinner branch in the
> pinctrl tree. Hope the ARM SoC can accept me poking around in your
> device trees.

Ah, my just-now sent reply was to an older version of this thread.

Sure, it's not a big deal that you're picking up the DT changes, but there
might end up being add/add conflicts if they are adding more stuff to the same
files. Not a huge deal, ideally we want to avoid them but a couple are ok.

-Olof

^ permalink raw reply

* [PATCHv3 0/7] Add pinctrl driver for Allwinner A1X SoCs
From: Olof Johansson @ 2013-01-22  7:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50F9BD09.1030107@free-electrons.com>

On Fri, Jan 18, 2013 at 10:22:17PM +0100, Maxime Ripard wrote:
> Hi Linus,
> 
> Thanks for your review.
> 
> On 18/01/2013 20:24, Linus Walleij wrote:
> > On Thu, Jan 17, 2013 at 2:31 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> >> On Tue, Jan 15, 2013 at 11:19 AM, Maxime Ripard
> > 
> >>> Are you ok with this version or do you have additional comments?
> >>
> >> I'm probably OK with it, I've only just now reached this point in my
> >> mail backlog.
> > 
> > As noted I had more comments, they should be quick to address.
> > 
> > When finished, shall this be applied to the pinctrl tree or some other
> > subtree, like ARM SoC?
> 
> Maybe the easiest thing to do would be to push the pinctrl driver in
> itself through your tree, and the dt additions through the arm-soc one?

As long as things work without the DT pieces, that's a good way to do it for
new code (conversion of existing platforms gets more complicated most of the
time).


-Olof

^ permalink raw reply

* [PATCH 3/4] ARM: mach-omap2: apply the errata at run time rather
From: Srinidhi Kasagar @ 2013-01-22  6:49 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50FE350C.6080504@ti.com>

On Tue, Jan 22, 2013 at 07:43:24 +0100, Santosh Shilimkar wrote:
> On Tuesday 22 January 2013 11:31 AM, Srinidhi Kasagar wrote:
> > On Mon, Jan 21, 2013 at 19:22:37 +0100, Tony Lindgren wrote:
> >> * srinidhi kasagar <srinidhi.kasagar@stericsson.com> [130121 05:19]:
> >>
> >> Forgot to complete the subject and add the description?
> >
> > No :) It has the subject, but description is intentionally skipped
> > because subject has all the meaning and it is a part of the series
> > of patches I sent..
> >
> "ARM: mach-omap2: apply the errata at run time rather"
> 
> Subject doesn't have all the meaning till you read the series ;)
> Like, this is Pl310 XYZ errata etc. Anyway once you sort
> out the concerns from RMK and with a few lines of
> description on subject patch, you can add my ack on OMAP
> patches.

thanks, the description part will be fixed.
Can you check RMK's comment on set_debug patch, patch 1/4? I need
your comment on that to roll it over because set_debug is something
which is OMAP specific..

> 
> Thanks a lot for doing this.
> 
> Regards,
> Santosh

^ permalink raw reply

* [PATCH 3/4] ARM: mach-omap2: apply the errata at run time rather
From: Santosh Shilimkar @ 2013-01-22  6:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122060145.GB22429@bnru10>

On Tuesday 22 January 2013 11:31 AM, Srinidhi Kasagar wrote:
> On Mon, Jan 21, 2013 at 19:22:37 +0100, Tony Lindgren wrote:
>> * srinidhi kasagar <srinidhi.kasagar@stericsson.com> [130121 05:19]:
>>
>> Forgot to complete the subject and add the description?
>
> No :) It has the subject, but description is intentionally skipped
> because subject has all the meaning and it is a part of the series
> of patches I sent..
>
"ARM: mach-omap2: apply the errata at run time rather"

Subject doesn't have all the meaning till you read the series ;)
Like, this is Pl310 XYZ errata etc. Anyway once you sort
out the concerns from RMK and with a few lines of
description on subject patch, you can add my ack on OMAP
patches.

Thanks a lot for doing this.

Regards,
Santosh

^ permalink raw reply

* [PATCH 05/15] ASoC: fsl: fiq and dma cannot both be modules
From: Mark Brown @ 2013-01-22  6:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122035028.GB29677@S2100-06.ap.freescale.net>

On Tue, Jan 22, 2013 at 11:50:30AM +0800, Shawn Guo wrote:
> On Mon, Jan 21, 2013 at 05:15:58PM +0000, Arnd Bergmann wrote:

> > Without this patch, we cannot build the ARM 'allmodconfig', or
> > we get this error:

> > sound/soc/fsl/imx-pcm-dma.o: In function `init_module':
> > sound/soc/fsl/imx-pcm-dma.c:177: multiple definition of `init_module'
> > sound/soc/fsl/imx-pcm-fiq.o:sound/soc/fsl/imx-pcm-fiq.c:334: first defined here
> > sound/soc/fsl/imx-pcm-dma.o: In function `cleanup_module':
> > sound/soc/fsl/imx-pcm-dma.c:177: multiple definition of `cleanup_module'
> > sound/soc/fsl/imx-pcm-fiq.o:sound/soc/fsl/imx-pcm-fiq.c:334: first defined here

> I sent a fix [1] for that queued by Mark.

> Mark,

> Is the patch on the way to 3.8-rc?

Yes, should be.

^ permalink raw reply

* One of these things (CONFIG_HZ) is not like the others..
From: Santosh Shilimkar @ 2013-01-22  6:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130121232322.GK15361@atomide.com>

On Tuesday 22 January 2013 04:53 AM, Tony Lindgren wrote:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [130121 13:07]:
>>
>> As for Samsung and the rest I can't comment.  The original reason OMAP
>> used this though was because the 32768Hz counter can't produce 100Hz
>> without a .1% error - too much error under pre-clocksource
>> implementations for timekeeping.  Whether that's changed with the
>> clocksource/clockevent support needs to be checked.
>
> Yes that's why HZ was originally set to 128. That value (or some multiple)
> still makes sense when the 32 KiHZ clock source is being used. Of course
> we should rely on the local timer when running for the SoCs that have
> them.
>
This is right. It was only because of the drift associated with clocking
from 32KHz. Even on SoCs where local timers are available, for power
management reasons we need to switch to a 32KHz-clocked device in
low power states. Hence the HZ value should be a multiple of 32 on
OMAP.
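
To put numbers on the drift argument, here is a small sketch in C. The
sketch_* names are made up for illustration, and it assumes the timer is
programmed with a whole number of 32768Hz counter ticks per jiffy, rounded
to the nearest integer. It reports how far the achievable tick rate is from
the requested HZ, in parts per million.

```c
#include <assert.h>

#define SKETCH_CLK_HZ 32768LL	/* the 32768Hz counter discussed above */

/* Counter ticks per jiffy, rounded to the nearest whole tick. */
static long long sketch_ticks_per_jiffy(long long hz)
{
	assert(hz > 0 && hz <= SKETCH_CLK_HZ);
	return (SKETCH_CLK_HZ + hz / 2) / hz;
}

/*
 * Relative error of the achievable rate versus the requested hz, in ppm
 * (truncated toward zero). The achievable rate is CLK / ticks, so
 * (actual - hz) / hz == (CLK - ticks * hz) / (ticks * hz).
 */
static long long sketch_hz_error_ppm(long long hz)
{
	long long ticks = sketch_ticks_per_jiffy(hz);

	return (SKETCH_CLK_HZ - ticks * hz) * 1000000LL / (ticks * hz);
}
```

With HZ=128 the counter divides evenly (256 ticks per jiffy) and the error
is zero. With HZ=100 the nearest divider is 328 ticks, which comes out a
little under 1000 ppm slow, in line with the roughly .1% figure quoted above.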

Regards
Santosh

^ permalink raw reply

* [PATCH 13/15] USB: ehci: make orion and mxc bus glues coexist
From: Shawn Guo @ 2013-01-22  6:14 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20130122061116.GD29677@S2100-06.ap.freescale.net>

On Tue, Jan 22, 2013 at 02:11:18PM +0800, Shawn Guo wrote:
> Alan,
> 
> Thanks for the patch.  I just gave it a try.  The USB Host port still
> works for me with a couple of fixes on your changes integrated (one
> for compiling and the other for probing).  So you have my ACK with
> the changes below rolled into your patch.
> 
> Acked-by: Shawn Guo <shawn.guo@linaro.org>
> 
Sorry.  I meant a Test tag.

Tested-by: Shawn Guo <shawn.guo@linaro.org>

^ permalink raw reply

* [PATCH 3/3] arm: sunxi: Add useful information about sunxi clocks
From: Emilio López @ 2013-01-22  6:12 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1358835176-7197-1-git-send-email-emilio@elopez.com.ar>

This patch contains useful bits of information about the sunxi clocks
that may help and/or be interesting for current and future developers.

Signed-off-by: Emilio López <emilio@elopez.com.ar>
---
 Documentation/arm/sunxi/clocks.txt | 56 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100644 Documentation/arm/sunxi/clocks.txt

diff --git a/Documentation/arm/sunxi/clocks.txt b/Documentation/arm/sunxi/clocks.txt
new file mode 100644
index 0000000..e09a88a
--- /dev/null
+++ b/Documentation/arm/sunxi/clocks.txt
@@ -0,0 +1,56 @@
+Frequently asked questions about the sunxi clock system
+=======================================================
+
+This document contains useful bits of information that people tend to ask
+about the sunxi clock system, as well as accompanying ASCII art when adequate.
+
+Q: Why is the main 24MHz oscillator gatable? Wouldn't that break the
+   system?
+
+A: The 24MHz oscillator allows gating to save power. Indeed, if gated
+   carelessly the system would stop functioning, but with the right
+   steps, one can gate it and keep the system running. Consider this
+   simplified suspend example:
+
+   While the system is operational, you would see something like
+
+      24MHz         32kHz
+       |
+      PLL1
+       \
+        \_ CPU Mux
+             |
+           [CPU]
+
+   When you are about to suspend, you switch the CPU Mux to the 32kHz
+   oscillator:
+
+      24MHz         32kHz
+       |              |
+      PLL1            |
+                     /
+           CPU Mux _/
+             |
+           [CPU]
+
+    Finally you can gate the main oscillator
+
+                    32kHz
+                      |
+                      |
+                     /
+           CPU Mux _/
+             |
+           [CPU]
+
+Q: Where can I learn more about the sunxi clocks?
+
+A: The linux-sunxi wiki contains a page documenting the clock registers,
+   you can find it at
+
+        http://linux-sunxi.org/A10/CCM
+
+   The authoritative source for information at this time is the ccmu driver
+   released by Allwinner, you can find it at
+
+        https://github.com/linux-sunxi/linux-sunxi/tree/sunxi-3.0/arch/arm/mach-sun4i/clock/ccmu
-- 
1.8.1.1

^ permalink raw reply related
