public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	john stultz <johnstul@us.ibm.com>,
	Roman Zippel <zippel@linux-m68k.org>,
	Christian Borntraeger <cborntra@de.ibm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [patch] timer/hrtimer: take per cpu locks in sane order
Date: Fri, 2 Mar 2007 20:08:36 +0100	[thread overview]
Message-ID: <20070302190836.GA7942@osiris.ibm.com> (raw)
In-Reply-To: <20070302084833.732d09dd.akpm@linux-foundation.org>

From: Heiko Carstens <heiko.carstens@de.ibm.com>

Doing something like this on a two cpu system

# echo 0 > /sys/devices/system/cpu/cpu0/online 
# echo 1 > /sys/devices/system/cpu/cpu0/online 
# echo 0 > /sys/devices/system/cpu/cpu1/online 

will give me this:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.21-rc2-g562aa1d4-dirty #7
-------------------------------------------------------
bash/1282 is trying to acquire lock:
 (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240

but task is already holding lock:
 (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240

which lock already depends on the new lock.

This happens because we have the following code in kernel/hrtimer.c:

migrate_hrtimers(int cpu)
[...]
old_base = &per_cpu(hrtimer_bases, cpu);
new_base = &get_cpu_var(hrtimer_bases);
[...]
spin_lock(&new_base->lock);
spin_lock(&old_base->lock);

Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a
different cpu and therefore the locking should be ok.

The same problem exists in kernel/timer.c: migrate_timers().

As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first. AFIACS this should be safe and that is what this
patch does.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Christian Borntraeger <cborntra@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 kernel/hrtimer.c |   39 ++++++++++++++++++++++++++++++++++-----
 kernel/timer.c   |   38 ++++++++++++++++++++++++++++++++++----
 2 files changed, 68 insertions(+), 9 deletions(-)

Index: linux-2.6/kernel/hrtimer.c
===================================================================
--- linux-2.6.orig/kernel/hrtimer.c
+++ linux-2.6/kernel/hrtimer.c
@@ -1343,6 +1343,38 @@ static void migrate_hrtimer_list(struct 
 	}
 }
 
+/*
+ * double_hrtimer_lock/unlock are used to ensure that on cpu hotplug the
+ * per cpu timer locks are always taken in the same order.
+ */
+static void double_hrtimer_lock(struct hrtimer_cpu_base *base1,
+				struct hrtimer_cpu_base *base2, int ind)
+	__acquires(base1->lock)
+	__acquires(base2->lock)
+{
+	if (ind < 0) {
+		spin_lock(&base1->lock);
+		spin_lock(&base2->lock);
+	} else {
+		spin_lock(&base2->lock);
+		spin_lock(&base1->lock);
+	}
+}
+
+static void double_hrtimer_unlock(struct hrtimer_cpu_base *base1,
+				  struct hrtimer_cpu_base *base2, int ind)
+	__releases(base1->lock)
+	__releases(base2->lock)
+{
+	if (ind < 0) {
+		spin_unlock(&base2->lock);
+		spin_unlock(&base1->lock);
+	} else {
+		spin_unlock(&base1->lock);
+		spin_unlock(&base2->lock);
+	}
+}
+
 static void migrate_hrtimers(int cpu)
 {
 	struct hrtimer_cpu_base *old_base, *new_base;
@@ -1355,17 +1387,14 @@ static void migrate_hrtimers(int cpu)
 	tick_cancel_sched_timer(cpu);
 
 	local_irq_disable();
-
-	spin_lock(&new_base->lock);
-	spin_lock(&old_base->lock);
+	double_hrtimer_lock(new_base, old_base, smp_processor_id() - cpu);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		migrate_hrtimer_list(&old_base->clock_base[i],
 				     &new_base->clock_base[i]);
 	}
-	spin_unlock(&old_base->lock);
-	spin_unlock(&new_base->lock);
 
+	double_hrtimer_unlock(new_base, old_base, smp_processor_id() - cpu);
 	local_irq_enable();
 	put_cpu_var(hrtimer_bases);
 }
Index: linux-2.6/kernel/timer.c
===================================================================
--- linux-2.6.orig/kernel/timer.c
+++ linux-2.6/kernel/timer.c
@@ -1640,6 +1640,38 @@ static void migrate_timer_list(tvec_base
 	}
 }
 
+/*
+ * double_timer_lock/unlock are used to ensure that on cpu hotplug the
+ * per cpu timer locks are always taken in the same order.
+ */
+static void __devinit double_timer_lock(tvec_base_t *base1,
+					tvec_base_t *base2, int ind)
+	__acquires(base1->lock)
+	__acquires(base2->lock)
+{
+	if (ind < 0) {
+		spin_lock(&base1->lock);
+		spin_lock(&base2->lock);
+	} else {
+		spin_lock(&base2->lock);
+		spin_lock(&base1->lock);
+	}
+}
+
+static void __devinit double_timer_unlock(tvec_base_t *base1,
+					  tvec_base_t *base2, int ind)
+	__releases(base1->lock)
+	__releases(base2->lock)
+{
+	if (ind < 0) {
+		spin_unlock(&base2->lock);
+		spin_unlock(&base1->lock);
+	} else {
+		spin_unlock(&base1->lock);
+		spin_unlock(&base2->lock);
+	}
+}
+
 static void __devinit migrate_timers(int cpu)
 {
 	tvec_base_t *old_base;
@@ -1651,8 +1683,7 @@ static void __devinit migrate_timers(int
 	new_base = get_cpu_var(tvec_bases);
 
 	local_irq_disable();
-	spin_lock(&new_base->lock);
-	spin_lock(&old_base->lock);
+	double_timer_lock(new_base, old_base, smp_processor_id() - cpu);
 
 	BUG_ON(old_base->running_timer);
 
@@ -1665,8 +1696,7 @@ static void __devinit migrate_timers(int
 		migrate_timer_list(new_base, old_base->tv5.vec + i);
 	}
 
-	spin_unlock(&old_base->lock);
-	spin_unlock(&new_base->lock);
+	double_timer_unlock(new_base, old_base, smp_processor_id() - cpu);
 	local_irq_enable();
 	put_cpu_var(tvec_bases);
 }

  reply	other threads:[~2007-03-02 19:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-27 15:30 timer/hrtimer & cpu hotplug lockdep complaints Heiko Carstens
2007-03-02 12:58 ` [patch] timer/hrtimer: take per cpu locks in sane order Heiko Carstens
2007-03-02 13:04   ` Ingo Molnar
2007-03-02 14:23     ` Heiko Carstens
2007-03-02 16:48       ` Andrew Morton
2007-03-02 19:08         ` Heiko Carstens [this message]
2007-03-02 19:45           ` Andrew Morton
2007-03-02 21:39             ` Heiko Carstens
2007-03-02 22:47               ` Heiko Carstens
2007-03-04 23:24                 ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070302190836.GA7942@osiris.ibm.com \
    --to=heiko.carstens@de.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=cborntra@de.ibm.com \
    --cc=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=zippel@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox