stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Patch "ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock" has been added to the 4.14-stable tree
@ 2017-11-25 13:53 gregkh
  0 siblings, 0 replies; only message in thread
From: gregkh @ 2017-11-25 13:53 UTC (permalink / raw)
  To: ville.syrjala, gregkh, rafael.j.wysocki; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock

to the 4.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     acpi-pm-fix-acpi_pm_notifier_lock-vs-flush_workqueue-deadlock.patch
and it can be found in the queue-4.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


>From ff1656790b3a4caca94505c52fd0250f981ea187 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com>
Date: Tue, 7 Nov 2017 23:08:10 +0200
Subject: ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

commit ff1656790b3a4caca94505c52fd0250f981ea187 upstream.

acpi_remove_pm_notifier() ends up calling flush_workqueue() while
holding acpi_pm_notifier_lock, and that same lock is taken by
by the work via acpi_pm_notify_handler(). This can deadlock.

To fix the problem let's split the single lock into two: one to
protect the dev->wakeup between the work vs. add/remove, and
another one to handle notifier installation vs. removal.

After commit a1d14934ea4b "workqueue/lockdep: 'Fix' flush_work()
annotation" I was able to kill the machine (Intel Braswell)
very easily with 'powertop --auto-tune', runtime suspending i915,
and trying to wake it up via the USB keyboard. The cases when
it didn't die are presumably explained by lockdep getting disabled
by something else (cpu hotplug locking issues usually).

Fortunately I still got a lockdep report over netconsole
(trickling in very slowly), even though the machine was
otherwise practically dead:

[  112.179806] ======================================================
[  114.670858] WARNING: possible circular locking dependency detected
[  117.155663] 4.13.0-rc6-bsw-bisect-00169-ga1d14934ea4b #119 Not tainted
[  119.658101] ------------------------------------------------------
[  121.310242] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
[  121.313294] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
[  121.313346] xhci_hcd 0000:00:14.0: HC died; cleaning up
[  121.313485] usb 1-6: USB disconnect, device number 3
[  121.313501] usb 1-6.2: USB disconnect, device number 4
[  134.747383] kworker/0:2/47 is trying to acquire lock:
[  137.220790]  (acpi_pm_notifier_lock){+.+.}, at: [<ffffffff813cafdf>] acpi_pm_notify_handler+0x2f/0x80
[  139.721524]
[  139.721524] but task is already holding lock:
[  144.672922]  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  147.184450]
[  147.184450] which lock already depends on the new lock.
[  147.184450]
[  154.604711]
[  154.604711] the existing dependency chain (in reverse order) is:
[  159.447888]
[  159.447888] -> #2 ((&dpc->work)){+.+.}:
[  164.183486]        __lock_acquire+0x1255/0x13f0
[  166.504313]        lock_acquire+0xb5/0x210
[  168.778973]        process_one_work+0x1b9/0x720
[  171.030316]        worker_thread+0x4c/0x440
[  173.257184]        kthread+0x154/0x190
[  175.456143]        ret_from_fork+0x27/0x40
[  177.624348]
[  177.624348] -> #1 ("kacpi_notify"){+.+.}:
[  181.850351]        __lock_acquire+0x1255/0x13f0
[  183.941695]        lock_acquire+0xb5/0x210
[  186.046115]        flush_workqueue+0xdd/0x510
[  190.408153]        acpi_os_wait_events_complete+0x31/0x40
[  192.625303]        acpi_remove_notify_handler+0x133/0x188
[  194.820829]        acpi_remove_pm_notifier+0x56/0x90
[  196.989068]        acpi_dev_pm_detach+0x5f/0xa0
[  199.145866]        dev_pm_domain_detach+0x27/0x30
[  201.285614]        i2c_device_probe+0x100/0x210
[  203.411118]        driver_probe_device+0x23e/0x310
[  205.522425]        __driver_attach+0xa3/0xb0
[  207.634268]        bus_for_each_dev+0x69/0xa0
[  209.714797]        driver_attach+0x1e/0x20
[  211.778258]        bus_add_driver+0x1bc/0x230
[  213.837162]        driver_register+0x60/0xe0
[  215.868162]        i2c_register_driver+0x42/0x70
[  217.869551]        0xffffffffa0172017
[  219.863009]        do_one_initcall+0x45/0x170
[  221.843863]        do_init_module+0x5f/0x204
[  223.817915]        load_module+0x225b/0x29b0
[  225.757234]        SyS_finit_module+0xc6/0xd0
[  227.661851]        do_syscall_64+0x5c/0x120
[  229.536819]        return_from_SYSCALL_64+0x0/0x7a
[  231.392444]
[  231.392444] -> #0 (acpi_pm_notifier_lock){+.+.}:
[  235.124914]        check_prev_add+0x44e/0x8a0
[  237.024795]        __lock_acquire+0x1255/0x13f0
[  238.937351]        lock_acquire+0xb5/0x210
[  240.840799]        __mutex_lock+0x75/0x940
[  242.709517]        mutex_lock_nested+0x1c/0x20
[  244.551478]        acpi_pm_notify_handler+0x2f/0x80
[  246.382052]        acpi_ev_notify_dispatch+0x44/0x5c
[  248.194412]        acpi_os_execute_deferred+0x14/0x30
[  250.003925]        process_one_work+0x1ec/0x720
[  251.803191]        worker_thread+0x4c/0x440
[  253.605307]        kthread+0x154/0x190
[  255.387498]        ret_from_fork+0x27/0x40
[  257.153175]
[  257.153175] other info that might help us debug this:
[  257.153175]
[  262.324392] Chain exists of:
[  262.324392]   acpi_pm_notifier_lock --> "kacpi_notify" --> (&dpc->work)
[  262.324392]
[  267.391997]  Possible unsafe locking scenario:
[  267.391997]
[  270.758262]        CPU0                    CPU1
[  272.431713]        ----                    ----
[  274.060756]   lock((&dpc->work));
[  275.646532]                                lock("kacpi_notify");
[  277.260772]                                lock((&dpc->work));
[  278.839146]   lock(acpi_pm_notifier_lock);
[  280.391902]
[  280.391902]  *** DEADLOCK ***
[  280.391902]
[  284.986385] 2 locks held by kworker/0:2/47:
[  286.524895]  #0:  ("kacpi_notify"){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  288.112927]  #1:  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  289.727725]

Fixes: c072530f391e (ACPI / PM: Revork the handling of ACPI device wakeup notifications)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/acpi/device_pm.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -387,6 +387,7 @@ EXPORT_SYMBOL(acpi_bus_power_manageable)
 
 #ifdef CONFIG_PM
 static DEFINE_MUTEX(acpi_pm_notifier_lock);
+static DEFINE_MUTEX(acpi_pm_notifier_install_lock);
 
 void acpi_pm_wakeup_event(struct device *dev)
 {
@@ -443,24 +444,25 @@ acpi_status acpi_add_pm_notifier(struct
 	if (!dev && !func)
 		return AE_BAD_PARAMETER;
 
-	mutex_lock(&acpi_pm_notifier_lock);
+	mutex_lock(&acpi_pm_notifier_install_lock);
 
 	if (adev->wakeup.flags.notifier_present)
 		goto out;
 
-	adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
-	adev->wakeup.context.dev = dev;
-	adev->wakeup.context.func = func;
-
 	status = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
 					     acpi_pm_notify_handler, NULL);
 	if (ACPI_FAILURE(status))
 		goto out;
 
+	mutex_lock(&acpi_pm_notifier_lock);
+	adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
+	adev->wakeup.context.dev = dev;
+	adev->wakeup.context.func = func;
 	adev->wakeup.flags.notifier_present = true;
+	mutex_unlock(&acpi_pm_notifier_lock);
 
  out:
-	mutex_unlock(&acpi_pm_notifier_lock);
+	mutex_unlock(&acpi_pm_notifier_install_lock);
 	return status;
 }
 
@@ -472,7 +474,7 @@ acpi_status acpi_remove_pm_notifier(stru
 {
 	acpi_status status = AE_BAD_PARAMETER;
 
-	mutex_lock(&acpi_pm_notifier_lock);
+	mutex_lock(&acpi_pm_notifier_install_lock);
 
 	if (!adev->wakeup.flags.notifier_present)
 		goto out;
@@ -483,14 +485,15 @@ acpi_status acpi_remove_pm_notifier(stru
 	if (ACPI_FAILURE(status))
 		goto out;
 
+	mutex_lock(&acpi_pm_notifier_lock);
 	adev->wakeup.context.func = NULL;
 	adev->wakeup.context.dev = NULL;
 	wakeup_source_unregister(adev->wakeup.ws);
-
 	adev->wakeup.flags.notifier_present = false;
+	mutex_unlock(&acpi_pm_notifier_lock);
 
  out:
-	mutex_unlock(&acpi_pm_notifier_lock);
+	mutex_unlock(&acpi_pm_notifier_install_lock);
 	return status;
 }
 


Patches currently in stable-queue which might be from ville.syrjala@linux.intel.com are

queue-4.14/acpi-pm-fix-acpi_pm_notifier_lock-vs-flush_workqueue-deadlock.patch

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2017-11-25 13:53 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-25 13:53 Patch "ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock" has been added to the 4.14-stable tree gregkh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).