From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
dsterba@novell.com, seto.hidetoshi@jp.fujitsu.com,
suresh.b.siddha@intel.com, Nagananda.Chumbalkar@hp.com,
jdelvare@novell.com, tglx@linutronix.de,
Thomas Renninger <trenn@suse.de>
Subject: [16/24] clockevent: Prevent dead lock on clockevents_lock
Date: Mon, 24 May 2010 15:28:12 -0700 [thread overview]
Message-ID: <20100524223017.910661964@clark.site> (raw)
In-Reply-To: <20100524223544.GA13721@kroah.com>
2.6.27-stable review patch. If anyone has any objections, please let us know.
------------------
From: Suresh Siddha <suresh.b.siddha@intel.com>
This is a merge of two mainline commits, intended
for stable@kernel.org submission for 2.6.27 kernel.
commit f833bab87fca5c3ce13778421b1365845843b976
and
commit 918aae42aa9b611a3663b16ae849fdedc67c2292
Changelog of both:
Currently clockevents_notify() is called with interrupts enabled at
some places and interrupts disabled at some other places.
This results in a deadlock in this scenario.
cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
cpu C doing set_mtrr() which will try to rendezvous of all the cpus.
This will result in C and A come to the rendezvous point and waiting
for B. B is stuck forever waiting for the spinlock and thus not
reaching the rendezvous point.
Fix the clockevents code so that clockevents_lock is taken with
interrupts disabled and thus avoid the above deadlock.
Also call lapic_timer_propagate_broadcast() on the destination cpu so
that we avoid calling smp_call_function() in the clockevents notifier
chain.
This issue left us wondering if we need to change the MTRR rendezvous
logic to use stop machine logic (instead of smp_call_function) or add
a check in spinlock debug code to see if there are other spinlocks
which gets taken under both interrupts enabled/disabled conditions.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Brown Len" <len.brown@intel.com>
Cc: stable@kernel.org
LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I got following warning on ia64 box:
In function 'acpi_processor_power_verify':
642: warning: passing argument 2 of 'smp_call_function_single' from
incompatible pointer type
This smp_call_function_single() was introduced by a commit
f833bab87fca5c3ce13778421b1365845843b976:
The problem is that the lapic_timer_propagate_broadcast() has 2 versions:
One is real code that modified in the above commit, and the other is NOP
code that used when !ARCH_APICTIMER_STOPS_ON_C3:
static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { }
So I got warning because of !ARCH_APICTIMER_STOPS_ON_C3.
We really want to do nothing here on !ARCH_APICTIMER_STOPS_ON_C3, so
modify lapic_timer_propagate_broadcast() of real version to use
smp_call_function_single() in it.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
arch/x86/kernel/process.c | 6 +-----
drivers/acpi/processor_idle.c | 13 ++++++++++---
kernel/time/clockevents.c | 16 ++++++++++------
kernel/time/tick-broadcast.c | 7 +++----
4 files changed, 24 insertions(+), 18 deletions(-)
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -283,16 +283,12 @@ static void c1e_idle(void)
if (!cpu_isset(cpu, c1e_mask)) {
cpu_set(cpu, c1e_mask);
/*
- * Force broadcast so ACPI can not interfere. Needs
- * to run with interrupts enabled as it uses
- * smp_function_call.
+ * Force broadcast so ACPI can not interfere.
*/
- local_irq_enable();
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
&cpu);
printk(KERN_INFO "Switch to broadcast mode on CPU%d\n",
cpu);
- local_irq_disable();
}
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -317,8 +317,9 @@ static void acpi_timer_check_state(int s
pr->power.timer_broadcast_on_state = state;
}
-static void acpi_propagate_timer_broadcast(struct acpi_processor *pr)
+static void __lapic_timer_propagate_broadcast(void *arg)
{
+ struct acpi_processor *pr = (struct acpi_processor *) arg;
unsigned long reason;
reason = pr->power.timer_broadcast_on_state < INT_MAX ?
@@ -327,6 +328,12 @@ static void acpi_propagate_timer_broadca
clockevents_notify(reason, &pr->id);
}
+static void lapic_timer_propagate_broadcast(struct acpi_processor *pr)
+{
+ smp_call_function_single(pr->id, __lapic_timer_propagate_broadcast,
+ (void *)pr, 1);
+}
+
/* Power(C) State timer broadcast control */
static void acpi_state_timer_broadcast(struct acpi_processor *pr,
struct acpi_processor_cx *cx,
@@ -347,7 +354,7 @@ static void acpi_state_timer_broadcast(s
static void acpi_timer_check_state(int state, struct acpi_processor *pr,
struct acpi_processor_cx *cstate) { }
-static void acpi_propagate_timer_broadcast(struct acpi_processor *pr) { }
+static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { }
static void acpi_state_timer_broadcast(struct acpi_processor *pr,
struct acpi_processor_cx *cx,
int broadcast)
@@ -1177,7 +1184,7 @@ static int acpi_processor_power_verify(s
working++;
}
- acpi_propagate_timer_broadcast(pr);
+ lapic_timer_propagate_broadcast(pr);
return (working);
}
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -124,11 +124,12 @@ int clockevents_program_event(struct clo
*/
int clockevents_register_notifier(struct notifier_block *nb)
{
+ unsigned long flags;
int ret;
- spin_lock(&clockevents_lock);
+ spin_lock_irqsave(&clockevents_lock, flags);
ret = raw_notifier_chain_register(&clockevents_chain, nb);
- spin_unlock(&clockevents_lock);
+ spin_unlock_irqrestore(&clockevents_lock, flags);
return ret;
}
@@ -165,6 +166,8 @@ static void clockevents_notify_released(
*/
void clockevents_register_device(struct clock_event_device *dev)
{
+ unsigned long flags;
+
BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
/*
* A nsec2cyc multiplicator of 0 is invalid and we'd crash
@@ -175,13 +178,13 @@ void clockevents_register_device(struct
WARN_ON(1);
}
- spin_lock(&clockevents_lock);
+ spin_lock_irqsave(&clockevents_lock, flags);
list_add(&dev->list, &clockevent_devices);
clockevents_do_notify(CLOCK_EVT_NOTIFY_ADD, dev);
clockevents_notify_released();
- spin_unlock(&clockevents_lock);
+ spin_unlock_irqrestore(&clockevents_lock, flags);
}
/*
@@ -228,8 +231,9 @@ void clockevents_exchange_device(struct
void clockevents_notify(unsigned long reason, void *arg)
{
struct list_head *node, *tmp;
+ unsigned long flags;
- spin_lock(&clockevents_lock);
+ spin_lock_irqsave(&clockevents_lock, flags);
clockevents_do_notify(reason, arg);
switch (reason) {
@@ -244,7 +248,7 @@ void clockevents_notify(unsigned long re
default:
break;
}
- spin_unlock(&clockevents_lock);
+ spin_unlock_irqrestore(&clockevents_lock, flags);
}
EXPORT_SYMBOL_GPL(clockevents_notify);
#endif
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -205,11 +205,11 @@ static void tick_handle_periodic_broadca
* Powerstate information: The system enters/leaves a state, where
* affected devices might stop
*/
-static void tick_do_broadcast_on_off(void *why)
+static void tick_do_broadcast_on_off(unsigned long *reason)
{
struct clock_event_device *bc, *dev;
struct tick_device *td;
- unsigned long flags, *reason = why;
+ unsigned long flags;
int cpu, bc_stopped;
spin_lock_irqsave(&tick_broadcast_lock, flags);
@@ -276,8 +276,7 @@ void tick_broadcast_on_off(unsigned long
printk(KERN_ERR "tick-broadcast: ignoring broadcast for "
"offline CPU #%d\n", *oncpu);
else
- smp_call_function_single(*oncpu, tick_do_broadcast_on_off,
- &reason, 1);
+ tick_do_broadcast_on_off(&reason);
}
/*
next prev parent reply other threads:[~2010-05-24 22:40 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-05-24 22:35 [00/24] 2.6.27.47-stable review Greg KH
2010-05-24 22:27 ` [01/24] ALSA: mixart: range checking proc file Greg KH
2010-05-24 22:27 ` [02/24] ext4: invalidate pages if delalloc block allocation fails Greg KH
2010-05-24 22:27 ` [03/24] percpu counter: clean up percpu_counter_sum_and_set() Greg KH
2010-05-24 22:28 ` [04/24] ext4: Make sure all the block allocation paths reserve blocks Greg KH
2010-05-25 7:21 ` Grant Coady
2010-05-25 16:45 ` Greg KH
2010-05-24 22:28 ` [05/24] ext4: Add percpu dirty block accounting Greg KH
2010-05-24 22:28 ` [06/24] ext4: Retry block reservation Greg KH
2010-05-24 22:28 ` [07/24] ext4: Retry block allocation if we have free blocks left Greg KH
2010-05-24 22:28 ` Greg KH
2010-05-24 22:28 ` [08/24] ext4: Use tag dirty lookup during mpage_da_submit_io Greg KH
2010-05-24 22:28 ` [09/24] vfs: Remove the range_cont writeback mode Greg KH
2010-05-24 22:28 ` [10/24] vfs: Add no_nrwrite_index_update writeback control flag Greg KH
2010-05-25 11:12 ` Christoph Hellwig
2010-05-25 16:52 ` Greg KH
2010-05-25 17:00 ` Jayson R. King
2010-05-25 17:12 ` Greg KH
2010-05-26 0:49 ` Jayson R. King
2010-05-25 16:53 ` Jayson R. King
2010-05-25 16:58 ` Greg KH
2010-05-24 22:28 ` [11/24] ext4: Fix file fragmentation during large file write Greg KH
2010-05-24 22:28 ` [12/24] ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages Greg KH
2010-05-24 22:28 ` [13/24] tty: release_one_tty() forgets to put pids Greg KH
2010-05-24 22:28 ` [14/24] [SCSI] megaraid_sas: fix for 32bit apps Greg KH
2010-05-24 22:28 ` [15/24] trace: Fix inappropriate substraction on tracing_pages_allocated in trace_free_page() Greg KH
2010-05-24 22:28 ` Greg KH [this message]
2010-05-24 22:28 ` [17/24] nfsd4: bug in read_buf Greg KH
2010-05-24 22:28 ` [18/24] USB: fix testing the wrong variable in fs_create_by_name() Greg KH
2010-05-24 22:28 ` [19/24] nfs d_revalidate() is too trigger-happy with d_drop() Greg KH
2010-05-24 22:28 ` [20/24] NFS: rsize and wsize settings ignored on v4 mounts Greg KH
2010-05-24 22:28 ` [21/24] i2c: Fix probing of FSC hardware monitoring chips Greg KH
2010-05-24 22:28 ` [22/24] libata: ensure NCQ error result taskfile is fully initialized before returning it via qc->result_tf Greg KH
2010-05-24 22:28 ` [23/24] libata: retry FS IOs even if it has failed with AC_ERR_INVALID Greg KH
2010-05-24 22:28 ` [24/24] svc: Clean up deferred requests on transport destruction Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100524223017.910661964@clark.site \
--to=gregkh@suse.de \
--cc=Nagananda.Chumbalkar@hp.com \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=dsterba@novell.com \
--cc=jdelvare@novell.com \
--cc=linux-kernel@vger.kernel.org \
--cc=seto.hidetoshi@jp.fujitsu.com \
--cc=stable-review@kernel.org \
--cc=stable@kernel.org \
--cc=suresh.b.siddha@intel.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=trenn@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.