* 3.17-rc3+ CPU hotplug lockdep splat during resume from RAM
@ 2014-09-03 12:08 Jiri Kosina
2014-09-03 13:04 ` [PATCH] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock Jiri Kosina
0 siblings, 1 reply; 3+ messages in thread
From: Jiri Kosina @ 2014-09-03 12:08 UTC
To: Gautham R. Shenoy, Oleg Nesterov, Srivatsa S. Bhat,
Rafael J. Wysocki, Ingo Molnar
Cc: linux-kernel, linux-pm
Hi,
I am getting the lockdep complaint below during resume from s2ram (this is
with current Linus' tree, HEAD == 7505ceaf8).
I haven't had time to look into this myself yet; if someone beats me
to a fix, that'd be great, otherwise I'll try to look into it as soon as
possible.
[ ... snip ... ]
Freezing user space processes ... (elapsed 0.002 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Entering mem sleep
Suspending console(s) (use no_console_suspend to debug)
wlan0: deauthenticating from 00:0b:6b:3c:8c:e4 by local choice (Reason: 3=DEAUTH_LEAVING)
cfg80211: Calling CRDA to update world regulatory domain
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
e1000e: EEE TX LPI TIMER: 00000000
PM: suspend of devices complete after 390.773 msecs
PM: late suspend of devices complete after 15.467 msecs
ehci-pci 0000:00:1d.7: System wakeup enabled by ACPI
uhci_hcd 0000:00:1d.0: System wakeup enabled by ACPI
ehci-pci 0000:00:1a.7: System wakeup enabled by ACPI
uhci_hcd 0000:00:1a.2: System wakeup enabled by ACPI
uhci_hcd 0000:00:1a.0: System wakeup enabled by ACPI
e1000e 0000:00:19.0: System wakeup enabled by ACPI
PM: noirq suspend of devices complete after 16.012 msecs
ACPI: Preparing to enter system sleep state S3
PM: Saving platform NVS memory
Disabling non-boot CPUs ...
kvm: disabling virtualization on CPU1
smpboot: CPU 1 is now offline
ACPI: Low-level resume complete
PM: Restoring platform NVS memory
Enabling non-boot CPUs ...
x86: Booting SMP configuration:
smpboot: Booting Node 0 Processor 1 APIC 0x1
Disabled fast string operations
kvm: enabling virtualization on CPU1
CPU1 is up
ACPI: Waking up from system sleep state S3
======================================================
[ INFO: possible circular locking dependency detected ]
3.17.0-rc3-91270-ge5ddf7b #1 Not tainted
-------------------------------------------------------
kworker/0:2/16442 is trying to acquire lock:
(cpu_hotplug.lock){++++++}, at: [<ffffffff8105095d>] get_online_cpus+0x2d/0x80
but task is already holding lock:
(cpuidle_lock){+.+.+.}, at: [<ffffffff8147af02>] cpuidle_pause_and_lock+0x12/0x40
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (cpuidle_lock){+.+.+.}:
[<ffffffff81096c81>] lock_acquire+0x91/0x110
[<ffffffff815a1ecf>] mutex_lock_nested+0x5f/0x3c0
[<ffffffff8147af02>] cpuidle_pause_and_lock+0x12/0x40
[<ffffffffc0075802>] acpi_processor_hotplug+0x44/0x88 [processor]
[<ffffffffc0073257>] acpi_cpu_soft_notify+0xaa/0xdf [processor]
[<ffffffff8106f2d3>] notifier_call_chain+0x53/0xa0
[<ffffffff8106f329>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff81050a6e>] cpu_notify+0x1e/0x40
[<ffffffff81050cb8>] _cpu_up+0x148/0x160
[<ffffffff8159093a>] enable_nonboot_cpus+0xaa/0x1a0
[<ffffffff8109c367>] suspend_devices_and_enter+0x277/0x4d0
[<ffffffff8109c6ad>] pm_suspend+0xed/0x390
[<ffffffff8109b464>] state_store+0x74/0xf0
[<ffffffff812f0bef>] kobj_attr_store+0xf/0x20
[<ffffffff8121311f>] sysfs_kf_write+0x3f/0x50
[<ffffffff81212a47>] kernfs_fop_write+0xe7/0x170
[<ffffffff8119cbd2>] vfs_write+0xb2/0x1f0
[<ffffffff8119d734>] SyS_write+0x44/0xb0
[<ffffffff815a6796>] system_call_fastpath+0x1a/0x1f
-> #1 (cpu_hotplug.lock#2){+.+.+.}:
[<ffffffff81096c81>] lock_acquire+0x91/0x110
[<ffffffff815a1ecf>] mutex_lock_nested+0x5f/0x3c0
[<ffffffff81050afa>] cpu_hotplug_begin+0x4a/0x80
[<ffffffff81050b9f>] _cpu_up+0x2f/0x160
[<ffffffff81050d51>] cpu_up+0x81/0xa0
[<ffffffff81b1137d>] smp_init+0x86/0x88
[<ffffffff81af414e>] kernel_init_freeable+0x151/0x260
[<ffffffff8158fb29>] kernel_init+0x9/0xf0
[<ffffffff815a66ec>] ret_from_fork+0x7c/0xb0
-> #0 (cpu_hotplug.lock){++++++}:
[<ffffffff810961cd>] __lock_acquire+0x171d/0x1a30
[<ffffffff81096c81>] lock_acquire+0x91/0x110
[<ffffffff81050983>] get_online_cpus+0x53/0x80
[<ffffffffc00758b3>] acpi_processor_cst_has_changed+0x6d/0x176 [processor]
[<ffffffffc00733c4>] acpi_processor_notify+0x8a/0x126 [processor]
[<ffffffff81368574>] acpi_ev_notify_dispatch+0x44/0x5c
[<ffffffff8134ee01>] acpi_os_execute_deferred+0xf/0x1b
[<ffffffff81068cea>] process_one_work+0x1da/0x4f0
[<ffffffff81069113>] worker_thread+0x113/0x480
[<ffffffff8106e1f8>] kthread+0xe8/0x100
[<ffffffff815a66ec>] ret_from_fork+0x7c/0xb0
other info that might help us debug this:
Chain exists of:
cpu_hotplug.lock --> cpu_hotplug.lock#2 --> cpuidle_lock
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(cpuidle_lock);
                               lock(cpu_hotplug.lock#2);
                               lock(cpuidle_lock);
  lock(cpu_hotplug.lock);
*** DEADLOCK ***
3 locks held by kworker/0:2/16442:
#0: ("kacpi_notify"){++++.+}, at: [<ffffffff81068c88>] process_one_work+0x178/0x4f0
#1: ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81068c88>] process_one_work+0x178/0x4f0
#2: (cpuidle_lock){+.+.+.}, at: [<ffffffff8147af02>] cpuidle_pause_and_lock+0x12/0x40
stack backtrace:
CPU: 0 PID: 16442 Comm: kworker/0:2 Not tainted 3.17.0-rc3-91270-ge5ddf7b #1
Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
Workqueue: kacpi_notify acpi_os_execute_deferred
ffffffff823e2a30 ffff880037b97b68 ffffffff8159e93f ffffffff823e2880
ffff880037b97ba8 ffffffff8159982f ffff880037b97c00 ffff880078c04c20
0000000000000002 0000000000000003 ffff880078c04c20 ffff880078c04350
Call Trace:
[<ffffffff8159e93f>] dump_stack+0x4d/0x66
[<ffffffff8159982f>] print_circular_bug+0x201/0x20f
[<ffffffff810961cd>] __lock_acquire+0x171d/0x1a30
[<ffffffff810cb5c9>] ? generic_exec_single+0xf9/0x170
[<ffffffff81096c81>] lock_acquire+0x91/0x110
[<ffffffff8105095d>] ? get_online_cpus+0x2d/0x80
[<ffffffff81050983>] get_online_cpus+0x53/0x80
[<ffffffff8105095d>] ? get_online_cpus+0x2d/0x80
[<ffffffffc00758b3>] acpi_processor_cst_has_changed+0x6d/0x176 [processor]
[<ffffffffc00733c4>] acpi_processor_notify+0x8a/0x126 [processor]
[<ffffffff81368574>] acpi_ev_notify_dispatch+0x44/0x5c
[<ffffffff8134ee01>] acpi_os_execute_deferred+0xf/0x1b
[<ffffffff81068cea>] process_one_work+0x1da/0x4f0
[<ffffffff81068c88>] ? process_one_work+0x178/0x4f0
[<ffffffff81069113>] worker_thread+0x113/0x480
[<ffffffff81069000>] ? process_one_work+0x4f0/0x4f0
[<ffffffff8106e1f8>] kthread+0xe8/0x100
[<ffffffff8106e110>] ? kthread_create_on_node+0x1f0/0x1f0
[<ffffffff815a66ec>] ret_from_fork+0x7c/0xb0
[<ffffffff8106e110>] ? kthread_create_on_node+0x1f0/0x1f0
--
Jiri Kosina
SUSE Labs
* [PATCH] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
2014-09-03 12:08 3.17-rc3+ CPU hotplug lockdep splat during resume from RAM Jiri Kosina
@ 2014-09-03 13:04 ` Jiri Kosina
2014-09-17 8:41 ` Gautham R Shenoy
0 siblings, 1 reply; 3+ messages in thread
From: Jiri Kosina @ 2014-09-03 13:04 UTC
To: Gautham R. Shenoy, Rafael J. Wysocki, Oleg Nesterov,
Srivatsa S. Bhat, Ingo Molnar
Cc: linux-kernel, linux-pm
There is the following AB-BA dependency between cpu_hotplug.lock and
cpuidle_lock:
1) cpu_hotplug.lock -> cpuidle_lock
enable_nonboot_cpus()
 _cpu_up()
  cpu_hotplug_begin()
   LOCK(cpu_hotplug.lock)
  cpu_notify()
   ...
   acpi_processor_hotplug()
    cpuidle_pause_and_lock()
     LOCK(cpuidle_lock)
2) cpuidle_lock -> cpu_hotplug.lock
acpi_os_execute_deferred() workqueue
 ...
 acpi_processor_cst_has_changed()
  cpuidle_pause_and_lock()
   LOCK(cpuidle_lock)
  get_online_cpus()
   LOCK(cpu_hotplug.lock)
Fix this by reversing the order in which acpi_processor_cst_has_changed()
does things -- let it first establish the protection against CPU hotplug by
calling get_online_cpus() and obtain the cpuidle lock only after that (and
perform the symmetric change when allowing CPU hotplug again and
dropping the cpuidle lock).
Spotted by lockdep.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
drivers/acpi/processor_idle.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 3dca36d..17f9ec5 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1071,9 +1071,9 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
- cpuidle_pause_and_lock();
/* Protect against cpu-hotplug */
get_online_cpus();
+ cpuidle_pause_and_lock();
/* Disable all cpuidle devices */
for_each_online_cpu(cpu) {
@@ -1100,8 +1100,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
cpuidle_enable_device(dev);
}
}
- put_online_cpus();
cpuidle_resume_and_unlock();
+ put_online_cpus();
}
return 0;
--
Jiri Kosina
SUSE Labs
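The lock-ordering rule behind the patch above can be illustrated outside the kernel. What follows is only a user-space sketch with POSIX threads, not kernel code: hotplug_lock and idle_lock are hypothetical stand-ins for cpu_hotplug.lock and cpuidle_lock, and the function names are invented for the illustration. It also simplifies get_online_cpus(), which in the kernel is a reference count guarded by cpu_hotplug.lock, into a plain mutex. Both paths take the locks in the same order (hotplug first, cpuidle second) and release them in reverse, so neither thread can end up holding one lock while waiting for the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the two kernel locks. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* models cpu_hotplug.lock */
static pthread_mutex_t idle_lock = PTHREAD_MUTEX_INITIALIZER;    /* models cpuidle_lock */

enum { ITERATIONS = 100000 };

/* Only modified while both locks are held. */
static long shared_counter;

/* Models the CPU-up path: hotplug lock first, then cpuidle lock. */
static void *hotplug_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&hotplug_lock);
		pthread_mutex_lock(&idle_lock);
		shared_counter++;
		pthread_mutex_unlock(&idle_lock);
		pthread_mutex_unlock(&hotplug_lock);
	}
	return NULL;
}

/*
 * Models the patched acpi_processor_cst_has_changed(): it now uses the
 * same order as hotplug_path(). Swapping the two lock calls here would
 * recreate the AB-BA pattern and could hang the program.
 */
static void *cst_changed_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&hotplug_lock);   /* was taken second before the fix */
		pthread_mutex_lock(&idle_lock);      /* was taken first before the fix */
		shared_counter++;
		pthread_mutex_unlock(&idle_lock);    /* release in reverse order */
		pthread_mutex_unlock(&hotplug_lock);
	}
	return NULL;
}

/* Runs both paths concurrently and returns the final counter value. */
long run_lock_order_demo(void)
{
	pthread_t a, b;
	int rc;

	shared_counter = 0;
	rc = pthread_create(&a, NULL, hotplug_path, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, cst_changed_path, NULL);
	assert(rc == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return shared_counter;
}
```

To try it, call run_lock_order_demo() from a small main() and build with something like "cc -pthread". With a consistent order the call always returns; the kernel gets the equivalent guarantee checked at runtime by lockdep, which is what flagged the original ordering.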
* Re: [PATCH] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
2014-09-03 13:04 ` [PATCH] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock Jiri Kosina
@ 2014-09-17 8:41 ` Gautham R Shenoy
0 siblings, 0 replies; 3+ messages in thread
From: Gautham R Shenoy @ 2014-09-17 8:41 UTC
To: Jiri Kosina
Cc: Gautham R. Shenoy, Rafael J. Wysocki, Oleg Nesterov,
Srivatsa S. Bhat, Ingo Molnar, linux-kernel, linux-pm
Hi Jiri,
On Wed, Sep 03, 2014 at 03:04:28PM +0200, Jiri Kosina wrote:
> There is the following AB-BA dependency between cpu_hotplug.lock and
> cpuidle_lock:
>
> 1) cpu_hotplug.lock -> cpuidle_lock
> enable_nonboot_cpus()
>  _cpu_up()
>   cpu_hotplug_begin()
>    LOCK(cpu_hotplug.lock)
>   cpu_notify()
>    ...
>    acpi_processor_hotplug()
>     cpuidle_pause_and_lock()
>      LOCK(cpuidle_lock)
>
> 2) cpuidle_lock -> cpu_hotplug.lock
> acpi_os_execute_deferred() workqueue
>  ...
>  acpi_processor_cst_has_changed()
>   cpuidle_pause_and_lock()
>    LOCK(cpuidle_lock)
>   get_online_cpus()
>    LOCK(cpu_hotplug.lock)
>
> Fix this by reversing the order in which acpi_processor_cst_has_changed()
> does things -- let it first establish the protection against CPU hotplug by
> calling get_online_cpus() and obtain the cpuidle lock only after that (and
> perform the symmetric change when allowing CPU hotplug again and
> dropping the cpuidle lock).
>
> Spotted by lockdep.
>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Looks good to me.
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> drivers/acpi/processor_idle.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 3dca36d..17f9ec5 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1071,9 +1071,9 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>
> if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
>
> - cpuidle_pause_and_lock();
> /* Protect against cpu-hotplug */
> get_online_cpus();
> + cpuidle_pause_and_lock();
>
> /* Disable all cpuidle devices */
> for_each_online_cpu(cpu) {
> @@ -1100,8 +1100,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
> cpuidle_enable_device(dev);
> }
> }
> - put_online_cpus();
> cpuidle_resume_and_unlock();
> + put_online_cpus();
> }
>
> return 0;
> --
> Jiri Kosina
> SUSE Labs
>