* [PATCH v8 0/4] shut down devices asynchronously
@ 2024-08-22 20:28 Stuart Hayes
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
` (5 more replies)
0 siblings, 6 replies; 29+ messages in thread
From: Stuart Hayes @ 2024-08-22 20:28 UTC (permalink / raw)
To: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Stuart Hayes
This adds the ability for the kernel to shutdown devices asynchronously.
Only devices with drivers that enable it are shut down asynchronously.
This can dramatically reduce system shutdown/reboot time on systems that
have multiple devices that take many seconds to shut down (like certain
NVMe drives). On one system tested, the shutdown time went from 11 minutes
without this patch to 55 seconds with the patch.
Changes from V7:
Do not expose driver async_shutdown_enable in sysfs.
Wrapped a long line.
Changes from V6:
Removed a sysfs attribute that allowed the async device shutdown to be
"on" (with driver opt-out), "safe" (driver opt-in), or "off"... what was
previously "safe" is now the only behavior, so drivers now only need to
have the option to enable or disable async shutdown.
Changes from V5:
Separated into multiple patches to make review easier.
Reworked some code to make it more readable
Made devices wait for consumers to shut down, not just children
(suggested by David Jeffery)
Changes from V4:
Change code to use cookies for synchronization rather than async domains
Allow async shutdown to be disabled via sysfs, and allow driver opt-in or
opt-out of async shutdown (when not disabled), with ability to control
driver opt-in/opt-out via sysfs
Changes from V3:
Bug fix (used "parent" not "dev->parent" in device_shutdown)
Changes from V2:
Removed recursive functions to schedule children to be shutdown before
parents, since existing device_shutdown loop will already do this
Changes from V1:
Rewritten using kernel async code (suggested by Lukas Wunner)
Stuart Hayes (4):
driver core: don't always lock parent in shutdown
driver core: separate function to shutdown one device
driver core: shut down devices asynchronously
nvme-pci: Make driver prefer asynchronous shutdown
drivers/base/base.h | 4 ++
drivers/base/core.c | 108 ++++++++++++++++++++++++++--------
drivers/nvme/host/pci.c | 1 +
include/linux/device/driver.h | 2 +
4 files changed, 90 insertions(+), 25 deletions(-)
--
2.39.3
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v8 1/4] driver core: don't always lock parent in shutdown
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
@ 2024-08-22 20:28 ` Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
2024-08-22 20:28 ` [PATCH v8 2/4] driver core: separate function to shutdown one device Stuart Hayes
` (4 subsequent siblings)
5 siblings, 2 replies; 29+ messages in thread
From: Stuart Hayes @ 2024-08-22 20:28 UTC (permalink / raw)
To: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Stuart Hayes
Don't lock a parent device unless it is needed in device_shutdown. This
is in preparation for making device shutdown asynchronous, when it will
be needed to allow children of a common parent to shut down
simultaneously.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
---
drivers/base/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8c0733d3aad8..13253d105062 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4818,7 +4818,7 @@ void device_shutdown(void)
spin_unlock(&devices_kset->list_lock);
/* hold lock to avoid race with probe/release */
- if (parent)
+ if (parent && dev->bus && dev->bus->need_parent_lock)
device_lock(parent);
device_lock(dev);
@@ -4842,7 +4842,7 @@ void device_shutdown(void)
}
device_unlock(dev);
- if (parent)
+ if (parent && dev->bus && dev->bus->need_parent_lock)
device_unlock(parent);
put_device(dev);
--
2.39.3
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v8 2/4] driver core: separate function to shutdown one device
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
@ 2024-08-22 20:28 ` Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
` (3 subsequent siblings)
5 siblings, 2 replies; 29+ messages in thread
From: Stuart Hayes @ 2024-08-22 20:28 UTC (permalink / raw)
To: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Stuart Hayes
Make a separate function for the part of device_shutdown() that does the
shutown for a single device. This is in preparation for making device
shutdown asynchronous.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
---
drivers/base/core.c | 66 ++++++++++++++++++++++++---------------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 13253d105062..7e50daa65ca0 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4781,6 +4781,41 @@ int device_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
}
EXPORT_SYMBOL_GPL(device_change_owner);
+static void shutdown_one_device(struct device *dev)
+{
+ /* hold lock to avoid race with probe/release */
+ if (dev->parent && dev->bus && dev->bus->need_parent_lock)
+ device_lock(dev->parent);
+ device_lock(dev);
+
+ /* Don't allow any more runtime suspends */
+ pm_runtime_get_noresume(dev);
+ pm_runtime_barrier(dev);
+
+ if (dev->class && dev->class->shutdown_pre) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown_pre\n");
+ dev->class->shutdown_pre(dev);
+ }
+ if (dev->bus && dev->bus->shutdown) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown\n");
+ dev->bus->shutdown(dev);
+ } else if (dev->driver && dev->driver->shutdown) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown\n");
+ dev->driver->shutdown(dev);
+ }
+
+ device_unlock(dev);
+ if (dev->parent && dev->bus && dev->bus->need_parent_lock)
+ device_unlock(dev->parent);
+
+ put_device(dev);
+ if (dev->parent)
+ put_device(dev->parent);
+}
+
/**
* device_shutdown - call ->shutdown() on each device to shutdown.
*/
@@ -4817,36 +4852,7 @@ void device_shutdown(void)
list_del_init(&dev->kobj.entry);
spin_unlock(&devices_kset->list_lock);
- /* hold lock to avoid race with probe/release */
- if (parent && dev->bus && dev->bus->need_parent_lock)
- device_lock(parent);
- device_lock(dev);
-
- /* Don't allow any more runtime suspends */
- pm_runtime_get_noresume(dev);
- pm_runtime_barrier(dev);
-
- if (dev->class && dev->class->shutdown_pre) {
- if (initcall_debug)
- dev_info(dev, "shutdown_pre\n");
- dev->class->shutdown_pre(dev);
- }
- if (dev->bus && dev->bus->shutdown) {
- if (initcall_debug)
- dev_info(dev, "shutdown\n");
- dev->bus->shutdown(dev);
- } else if (dev->driver && dev->driver->shutdown) {
- if (initcall_debug)
- dev_info(dev, "shutdown\n");
- dev->driver->shutdown(dev);
- }
-
- device_unlock(dev);
- if (parent && dev->bus && dev->bus->need_parent_lock)
- device_unlock(parent);
-
- put_device(dev);
- put_device(parent);
+ shutdown_one_device(dev);
spin_lock(&devices_kset->list_lock);
}
--
2.39.3
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
2024-08-22 20:28 ` [PATCH v8 2/4] driver core: separate function to shutdown one device Stuart Hayes
@ 2024-08-22 20:28 ` Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
` (6 more replies)
2024-08-22 20:28 ` [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown Stuart Hayes
` (2 subsequent siblings)
5 siblings, 7 replies; 29+ messages in thread
From: Stuart Hayes @ 2024-08-22 20:28 UTC (permalink / raw)
To: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Stuart Hayes
Add code to allow asynchronous shutdown of devices, ensuring that each
device is shut down before its parents & suppliers.
Only devices with drivers that have async_shutdown_enable enabled will be
shut down asynchronously.
This can dramatically reduce system shutdown/reboot time on systems that
have multiple devices that take many seconds to shut down (like certain
NVMe drives). On one system tested, the shutdown time went from 11 minutes
without this patch to 55 seconds with the patch.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
---
drivers/base/base.h | 4 +++
drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
include/linux/device/driver.h | 2 ++
3 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/drivers/base/base.h b/drivers/base/base.h
index 0b53593372d7..aa5a2bd3f2b8 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -10,6 +10,7 @@
* shared outside of the drivers/base/ directory.
*
*/
+#include <linux/async.h>
#include <linux/notifier.h>
/**
@@ -97,6 +98,8 @@ struct driver_private {
* the device; typically because it depends on another driver getting
* probed first.
* @async_driver - pointer to device driver awaiting probe via async_probe
+ * @shutdown_after - used during device shutdown to ensure correct shutdown
+ * ordering.
* @device - pointer back to the struct device that this structure is
* associated with.
* @dead - This device is currently either in the process of or has been
@@ -114,6 +117,7 @@ struct device_private {
struct list_head deferred_probe;
const struct device_driver *async_driver;
char *deferred_probe_reason;
+ async_cookie_t shutdown_after;
struct device *device;
u8 dead:1;
};
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7e50daa65ca0..dd3652ea56fe 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -9,6 +9,7 @@
*/
#include <linux/acpi.h>
+#include <linux/async.h>
#include <linux/cpufreq.h>
#include <linux/device.h>
#include <linux/err.h>
@@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
klist_init(&dev->p->klist_children, klist_children_get,
klist_children_put);
INIT_LIST_HEAD(&dev->p->deferred_probe);
+ dev->p->shutdown_after = 0;
return 0;
}
@@ -4781,6 +4783,8 @@ int device_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
}
EXPORT_SYMBOL_GPL(device_change_owner);
+static ASYNC_DOMAIN(sd_domain);
+
static void shutdown_one_device(struct device *dev)
{
/* hold lock to avoid race with probe/release */
@@ -4816,12 +4820,34 @@ static void shutdown_one_device(struct device *dev)
put_device(dev->parent);
}
+/**
+ * shutdown_one_device_async
+ * @data: the pointer to the struct device to be shutdown
+ * @cookie: not used
+ *
+ * Shuts down one device, after waiting for shutdown_after to complete.
+ * shutdown_after should be set to the cookie of the last child or consumer
+ * of this device to be shutdown (if any), or to the cookie of the previous
+ * device to be shut down for devices that don't enable asynchronous shutdown.
+ */
+static void shutdown_one_device_async(void *data, async_cookie_t cookie)
+{
+ struct device *dev = data;
+
+ async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
+
+ shutdown_one_device(dev);
+}
+
/**
* device_shutdown - call ->shutdown() on each device to shutdown.
*/
void device_shutdown(void)
{
struct device *dev, *parent;
+ async_cookie_t cookie = 0;
+ struct device_link *link;
+ int idx;
wait_for_device_probe();
device_block_probing();
@@ -4852,11 +4878,37 @@ void device_shutdown(void)
list_del_init(&dev->kobj.entry);
spin_unlock(&devices_kset->list_lock);
- shutdown_one_device(dev);
+
+ /*
+ * Set cookie for devices that will be shut down synchronously
+ */
+ if (!dev->driver || !dev->driver->async_shutdown_enable)
+ dev->p->shutdown_after = cookie;
+
+ get_device(dev);
+ get_device(parent);
+
+ cookie = async_schedule_domain(shutdown_one_device_async,
+ dev, &sd_domain);
+ /*
+ * Ensure parent & suppliers wait for this device to shut down
+ */
+ if (parent) {
+ parent->p->shutdown_after = cookie;
+ put_device(parent);
+ }
+
+ idx = device_links_read_lock();
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+ device_links_read_lock_held())
+ link->supplier->p->shutdown_after = cookie;
+ device_links_read_unlock(idx);
+ put_device(dev);
spin_lock(&devices_kset->list_lock);
}
spin_unlock(&devices_kset->list_lock);
+ async_synchronize_full_domain(&sd_domain);
}
/*
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index 1fc8b68786de..2b6127faaa25 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -56,6 +56,7 @@ enum probe_type {
* @mod_name: Used for built-in modules.
* @suppress_bind_attrs: Disables bind/unbind via sysfs.
* @probe_type: Type of the probe (synchronous or asynchronous) to use.
+ * @async_shutdown_enable: Enables devices to be shutdown asynchronously.
* @of_match_table: The open firmware table.
* @acpi_match_table: The ACPI match table.
* @probe: Called to query the existence of a specific device,
@@ -102,6 +103,7 @@ struct device_driver {
bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
enum probe_type probe_type;
+ bool async_shutdown_enable;
const struct of_device_id *of_match_table;
const struct acpi_device_id *acpi_match_table;
--
2.39.3
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
` (2 preceding siblings ...)
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
@ 2024-08-22 20:28 ` Stuart Hayes
2024-08-23 6:27 ` Christoph Hellwig
2024-08-25 7:57 ` Sagi Grimberg
2024-08-23 16:54 ` [PATCH v8 0/4] shut down devices asynchronously Keith Busch
2024-09-03 11:10 ` Greg Kroah-Hartman
5 siblings, 2 replies; 29+ messages in thread
From: Stuart Hayes @ 2024-08-22 20:28 UTC (permalink / raw)
To: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Stuart Hayes
Set the driver default to enable asynchronous shutdown.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
---
drivers/nvme/host/pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6cd9395ba9ec..58d0d517fead 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3580,6 +3580,7 @@ static struct pci_driver nvme_driver = {
.shutdown = nvme_shutdown,
.driver = {
.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+ .async_shutdown_enable = true,
#ifdef CONFIG_PM_SLEEP
.pm = &nvme_dev_pm_ops,
#endif
--
2.39.3
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v8 1/4] driver core: don't always lock parent in shutdown
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
@ 2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-08-23 6:26 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 2/4] driver core: separate function to shutdown one device
2024-08-22 20:28 ` [PATCH v8 2/4] driver core: separate function to shutdown one device Stuart Hayes
@ 2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-08-23 6:26 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
@ 2024-08-23 6:26 ` Christoph Hellwig
2024-08-24 10:29 ` Lukas Wunner
` (5 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-08-23 6:26 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown
2024-08-22 20:28 ` [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown Stuart Hayes
@ 2024-08-23 6:27 ` Christoph Hellwig
2024-08-25 7:57 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-08-23 6:27 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 0/4] shut down devices asynchronously
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
` (3 preceding siblings ...)
2024-08-22 20:28 ` [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown Stuart Hayes
@ 2024-08-23 16:54 ` Keith Busch
2024-09-03 11:10 ` Greg Kroah-Hartman
5 siblings, 0 replies; 29+ messages in thread
From: Keith Busch @ 2024-08-23 16:54 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Lukas Wunner, David Jeffery, Jeremy Allison, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-nvme
On Thu, Aug 22, 2024 at 03:28:01PM -0500, Stuart Hayes wrote:
> This adds the ability for the kernel to shutdown devices asynchronously.
>
> Only devices with drivers that enable it are shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
Still looks good. For the series:
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: Keith Busch <kbusch@kernel.org>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
@ 2024-08-24 10:29 ` Lukas Wunner
2024-08-25 7:58 ` Sagi Grimberg
` (4 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Lukas Wunner @ 2024-08-24 10:29 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, David Jeffery, Jeremy Allison, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-nvme
On Thu, Aug 22, 2024 at 03:28:04PM -0500, Stuart Hayes wrote:
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
> klist_init(&dev->p->klist_children, klist_children_get,
> klist_children_put);
> INIT_LIST_HEAD(&dev->p->deferred_probe);
> + dev->p->shutdown_after = 0;
> return 0;
> }
Nit: Unnecessary assignment since dev->p is allocated with kzalloc()
immediately above this hunk.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 1/4] driver core: don't always lock parent in shutdown
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
@ 2024-08-25 7:56 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2024-08-25 7:56 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, linux-nvme
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 2/4] driver core: separate function to shutdown one device
2024-08-22 20:28 ` [PATCH v8 2/4] driver core: separate function to shutdown one device Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
@ 2024-08-25 7:56 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2024-08-25 7:56 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, linux-nvme
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown
2024-08-22 20:28 ` [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown Stuart Hayes
2024-08-23 6:27 ` Christoph Hellwig
@ 2024-08-25 7:57 ` Sagi Grimberg
1 sibling, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2024-08-25 7:57 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, linux-nvme
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-24 10:29 ` Lukas Wunner
@ 2024-08-25 7:58 ` Sagi Grimberg
2024-09-05 22:13 ` Nathan Chancellor
` (3 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2024-08-25 7:58 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, linux-nvme
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 0/4] shut down devices asynchronously
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
` (4 preceding siblings ...)
2024-08-23 16:54 ` [PATCH v8 0/4] shut down devices asynchronously Keith Busch
@ 2024-09-03 11:10 ` Greg Kroah-Hartman
5 siblings, 0 replies; 29+ messages in thread
From: Greg Kroah-Hartman @ 2024-09-03 11:10 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Rafael J . Wysocki, Martin Belanger,
Oliver O'Halloran, Daniel Wagner, Keith Busch, Lukas Wunner,
David Jeffery, Jeremy Allison, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, linux-nvme
On Thu, Aug 22, 2024 at 03:28:01PM -0500, Stuart Hayes wrote:
> This adds the ability for the kernel to shutdown devices asynchronously.
>
> Only devices with drivers that enable it are shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
Thanks for sticking with this, all now queued up in my tree
greg k-h
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
` (2 preceding siblings ...)
2024-08-25 7:58 ` Sagi Grimberg
@ 2024-09-05 22:13 ` Nathan Chancellor
2024-09-06 14:44 ` stuart hayes
2024-09-08 13:36 ` Jan Kiszka
` (2 subsequent siblings)
6 siblings, 1 reply; 29+ messages in thread
From: Nathan Chancellor @ 2024-09-05 22:13 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Hi Stuart,
On Thu, Aug 22, 2024 at 03:28:04PM -0500, Stuart Hayes wrote:
> Add code to allow asynchronous shutdown of devices, ensuring that each
> device is shut down before its parents & suppliers.
>
> Only devices with drivers that have async_shutdown_enable enabled will be
> shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> Signed-off-by: David Jeffery <djeffery@redhat.com>
I am noticing several QEMU machines hang while shutting down after this
change as commit 8064952c6504 ("driver core: shut down devices
asynchronously") in -next. An easy test case due to the size of the
configuration:
$ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux- mrproper virtconfig Image.gz
$ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/arm64-rootfs.cpio.zst | zstd -d >rootfs.cpio
$ timeout --foreground 3m \
qemu-system-aarch64 \
-display none \
-nodefaults \
-cpu max,pauth-impdef=true \
-machine virt,gic-version=max,virtualization=true \
-append 'console=ttyAMA0 earlycon' \
-kernel arch/arm64/boot/Image.gz \
-initrd rootfs.cpio \
-m 512m \
-serial mon:stdio
[ 0.000000] Linux version 6.11.0-rc4-00022-g8064952c6504 (nathan@thelio-3990X) (aarch64-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP PREEMPT Thu Sep 5 15:02:42 MST 2024
...
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
qemu-system-aarch64: terminating on signal 15 from pid 2753792 (timeout)
At the parent commit, there are the following two prints after
"Requesting system poweroff" then the machine properly shuts down:
[ 3.411387] kvm: exiting hardware virtualization
[ 3.411741] reboot: Power down
If there is any other information I can provide or patches that I can
test, I am more than happy to do so.
Cheers,
Nathan
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-05 22:13 ` Nathan Chancellor
@ 2024-09-06 14:44 ` stuart hayes
0 siblings, 0 replies; 29+ messages in thread
From: stuart hayes @ 2024-09-06 14:44 UTC (permalink / raw)
To: Nathan Chancellor
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
On 9/5/2024 5:13 PM, Nathan Chancellor wrote:
> Hi Stuart,
>
> On Thu, Aug 22, 2024 at 03:28:04PM -0500, Stuart Hayes wrote:
>> Add code to allow asynchronous shutdown of devices, ensuring that each
>> device is shut down before its parents & suppliers.
>>
>> Only devices with drivers that have async_shutdown_enable enabled will be
>> shut down asynchronously.
>>
>> This can dramatically reduce system shutdown/reboot time on systems that
>> have multiple devices that take many seconds to shut down (like certain
>> NVMe drives). On one system tested, the shutdown time went from 11 minutes
>> without this patch to 55 seconds with the patch.
>>
>> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
>> Signed-off-by: David Jeffery <djeffery@redhat.com>
>
> I am noticing several QEMU machines hang while shutting down after this
> change as commit 8064952c6504 ("driver core: shut down devices
> asynchronously") in -next. An easy test case due to the size of the
> configuration:
>
> $ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux- mrproper virtconfig Image.gz
>
> $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/arm64-rootfs.cpio.zst | zstd -d >rootfs.cpio
>
> $ timeout --foreground 3m \
> qemu-system-aarch64 \
> -display none \
> -nodefaults \
> -cpu max,pauth-impdef=true \
> -machine virt,gic-version=max,virtualization=true \
> -append 'console=ttyAMA0 earlycon' \
> -kernel arch/arm64/boot/Image.gz \
> -initrd rootfs.cpio \
> -m 512m \
> -serial mon:stdio
> [ 0.000000] Linux version 6.11.0-rc4-00022-g8064952c6504 (nathan@thelio-3990X) (aarch64-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP PREEMPT Thu Sep 5 15:02:42 MST 2024
> ...
> The system is going down NOW!
> Sent SIGTERM to all processes
> Sent SIGKILL to all processes
> Requesting system poweroff
> qemu-system-aarch64: terminating on signal 15 from pid 2753792 (timeout)
>
> At the parent commit, there are the following two prints after
> "Requesting system poweroff" then the machine properly shuts down:
>
> [ 3.411387] kvm: exiting hardware virtualization
> [ 3.411741] reboot: Power down
>
> If there is any other information I can provide or patches that I can
> test, I am more than happy to do so.
>
> Cheers,
> Nathan
Thanks, I'll take a look.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
` (3 preceding siblings ...)
2024-09-05 22:13 ` Nathan Chancellor
@ 2024-09-08 13:36 ` Jan Kiszka
2024-09-11 0:14 ` stuart hayes
2024-09-08 14:44 ` Christophe JAILLET
2024-09-23 20:50 ` Andrey Skvortsov
6 siblings, 1 reply; 29+ messages in thread
From: Jan Kiszka @ 2024-09-08 13:36 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
linux-nvme
Cc: Nathan Chancellor
On 22.08.24 22:28, Stuart Hayes wrote:
> Add code to allow asynchronous shutdown of devices, ensuring that each
> device is shut down before its parents & suppliers.
>
> Only devices with drivers that have async_shutdown_enable enabled will be
> shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
> drivers/base/base.h | 4 +++
> drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
> include/linux/device/driver.h | 2 ++
> 3 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 0b53593372d7..aa5a2bd3f2b8 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -10,6 +10,7 @@
> * shared outside of the drivers/base/ directory.
> *
> */
> +#include <linux/async.h>
> #include <linux/notifier.h>
>
> /**
> @@ -97,6 +98,8 @@ struct driver_private {
> * the device; typically because it depends on another driver getting
> * probed first.
> * @async_driver - pointer to device driver awaiting probe via async_probe
> + * @shutdown_after - used during device shutdown to ensure correct shutdown
> + * ordering.
> * @device - pointer back to the struct device that this structure is
> * associated with.
> * @dead - This device is currently either in the process of or has been
> @@ -114,6 +117,7 @@ struct device_private {
> struct list_head deferred_probe;
> const struct device_driver *async_driver;
> char *deferred_probe_reason;
> + async_cookie_t shutdown_after;
> struct device *device;
> u8 dead:1;
> };
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 7e50daa65ca0..dd3652ea56fe 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -9,6 +9,7 @@
> */
>
> #include <linux/acpi.h>
> +#include <linux/async.h>
> #include <linux/cpufreq.h>
> #include <linux/device.h>
> #include <linux/err.h>
> @@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
> klist_init(&dev->p->klist_children, klist_children_get,
> klist_children_put);
> INIT_LIST_HEAD(&dev->p->deferred_probe);
> + dev->p->shutdown_after = 0;
> return 0;
> }
>
> @@ -4781,6 +4783,8 @@ int device_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
> }
> EXPORT_SYMBOL_GPL(device_change_owner);
>
> +static ASYNC_DOMAIN(sd_domain);
> +
> static void shutdown_one_device(struct device *dev)
> {
> /* hold lock to avoid race with probe/release */
> @@ -4816,12 +4820,34 @@ static void shutdown_one_device(struct device *dev)
> put_device(dev->parent);
> }
>
> +/**
> + * shutdown_one_device_async
> + * @data: the pointer to the struct device to be shutdown
> + * @cookie: not used
> + *
> + * Shuts down one device, after waiting for shutdown_after to complete.
> + * shutdown_after should be set to the cookie of the last child or consumer
> + * of this device to be shutdown (if any), or to the cookie of the previous
> + * device to be shut down for devices that don't enable asynchronous shutdown.
> + */
> +static void shutdown_one_device_async(void *data, async_cookie_t cookie)
> +{
> + struct device *dev = data;
> +
> + async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
> +
> + shutdown_one_device(dev);
> +}
> +
> /**
> * device_shutdown - call ->shutdown() on each device to shutdown.
> */
> void device_shutdown(void)
> {
> struct device *dev, *parent;
> + async_cookie_t cookie = 0;
> + struct device_link *link;
> + int idx;
>
> wait_for_device_probe();
> device_block_probing();
> @@ -4852,11 +4878,37 @@ void device_shutdown(void)
> list_del_init(&dev->kobj.entry);
> spin_unlock(&devices_kset->list_lock);
>
> - shutdown_one_device(dev);
> +
> + /*
> + * Set cookie for devices that will be shut down synchronously
> + */
> + if (!dev->driver || !dev->driver->async_shutdown_enable)
> + dev->p->shutdown_after = cookie;
> +
> + get_device(dev);
> + get_device(parent);
> +
> + cookie = async_schedule_domain(shutdown_one_device_async,
> + dev, &sd_domain);
> + /*
> + * Ensure parent & suppliers wait for this device to shut down
> + */
> + if (parent) {
> + parent->p->shutdown_after = cookie;
> + put_device(parent);
> + }
> +
> + idx = device_links_read_lock();
> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> + device_links_read_lock_held())
> + link->supplier->p->shutdown_after = cookie;
This will not fly if a supplier registered after its consumer. As we are
walking the list backward, the supplier will now wait for something that
is coming later during shutdown if the affected devices are still doing
this synchronously (as almost all at this stage). This creates a
circular dependency.
Seems to explain the reboot hang that I'm seeing on an embedded target
with a mailbox dev waiting for a remoteproc dev while the mailbox being
after the remoteproc in the list (thus first on shutting down).
This resolves the issue for me so far, but I don't think we are done yet:
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
device_links_read_lock_held()) {
if (link->supplier->driver &&
link->supplier->driver->async_shutdown_enable)
link->supplier->p->shutdown_after = cookie;
}
I wonder if overwriting the supplier's shutdown_after unconditionally is
a good idea. A supplier may have multiple consumers - will we overwrite
in the right order then? And why do we now need this ordering when we
were so far shutting down suppliers while consumers were still up?
Same overwrite question applies to setting shutdown_after in parents.
Don't we rather need a list for shutdown_after, at least once everything
is async?
This needs to be thought through once more, I guess.
Jan
> + device_links_read_unlock(idx);
> + put_device(dev);
>
> spin_lock(&devices_kset->list_lock);
> }
> spin_unlock(&devices_kset->list_lock);
> + async_synchronize_full_domain(&sd_domain);
> }
>
> /*
> diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
> index 1fc8b68786de..2b6127faaa25 100644
> --- a/include/linux/device/driver.h
> +++ b/include/linux/device/driver.h
> @@ -56,6 +56,7 @@ enum probe_type {
> * @mod_name: Used for built-in modules.
> * @suppress_bind_attrs: Disables bind/unbind via sysfs.
> * @probe_type: Type of the probe (synchronous or asynchronous) to use.
> + * @async_shutdown_enable: Enables devices to be shutdown asynchronously.
> * @of_match_table: The open firmware table.
> * @acpi_match_table: The ACPI match table.
> * @probe: Called to query the existence of a specific device,
> @@ -102,6 +103,7 @@ struct device_driver {
>
> bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
> enum probe_type probe_type;
> + bool async_shutdown_enable;
>
> const struct of_device_id *of_match_table;
> const struct acpi_device_id *acpi_match_table;
--
Siemens AG, Technology
Linux Expert Center
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
` (4 preceding siblings ...)
2024-09-08 13:36 ` Jan Kiszka
@ 2024-09-08 14:44 ` Christophe JAILLET
2024-09-23 20:50 ` Andrey Skvortsov
6 siblings, 0 replies; 29+ messages in thread
From: Christophe JAILLET @ 2024-09-08 14:44 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
linux-nvme
Le 22/08/2024 à 22:28, Stuart Hayes a écrit :
> Add code to allow asynchronous shutdown of devices, ensuring that each
> device is shut down before its parents & suppliers.
>
> Only devices with drivers that have async_shutdown_enable enabled will be
> shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
...
> +/**
> + * shutdown_one_device_async
> + * @data: the pointer to the struct device to be shutdown
> + * @cookie: not used
> + *
> + * Shuts down one device, after waiting for shutdown_after to complete.
> + * shutdown_after should be set to the cookie of the last child or consumer
> + * of this device to be shutdown (if any), or to the cookie of the previous
> + * device to be shut down for devices that don't enable asynchronous shutdown.
> + */
> +static void shutdown_one_device_async(void *data, async_cookie_t cookie)
> +{
> + struct device *dev = data;
> +
> + async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
> +
> + shutdown_one_device(dev);
> +}
> +
> /**
> * device_shutdown - call ->shutdown() on each device to shutdown.
> */
> void device_shutdown(void)
> {
> struct device *dev, *parent;
> + async_cookie_t cookie = 0;
> + struct device_link *link;
> + int idx;
>
> wait_for_device_probe();
> device_block_probing();
> @@ -4852,11 +4878,37 @@ void device_shutdown(void)
> list_del_init(&dev->kobj.entry);
> spin_unlock(&devices_kset->list_lock);
>
> - shutdown_one_device(dev);
> +
> + /*
> + * Set cookie for devices that will be shut down synchronously
> + */
> + if (!dev->driver || !dev->driver->async_shutdown_enable)
> + dev->p->shutdown_after = cookie;
> +
> + get_device(dev);
> + get_device(parent);
> +
> + cookie = async_schedule_domain(shutdown_one_device_async,
> + dev, &sd_domain);
> + /*
> + * Ensure parent & suppliers wait for this device to shut down
> + */
> + if (parent) {
> + parent->p->shutdown_after = cookie;
> + put_device(parent);
Would it make sense to have this put_device() out of the if block?
IIUC, the behavior would be exactly the same, but it is more intuitive
to have a put_device(parent) called for each get_device(parent) call.
Another way to keep symmetry is to have:
if (parent)
get_device(parent);
above.
> + }
> +
> + idx = device_links_read_lock();
> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> + device_links_read_lock_held())
> + link->supplier->p->shutdown_after = cookie;
> + device_links_read_unlock(idx);
> + put_device(dev);
>
> spin_lock(&devices_kset->list_lock);
> }
> spin_unlock(&devices_kset->list_lock);
> + async_synchronize_full_domain(&sd_domain);
> }
>
> /*
> diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
> index 1fc8b68786de..2b6127faaa25 100644
> --- a/include/linux/device/driver.h
> +++ b/include/linux/device/driver.h
> @@ -56,6 +56,7 @@ enum probe_type {
> * @mod_name: Used for built-in modules.
> * @suppress_bind_attrs: Disables bind/unbind via sysfs.
> * @probe_type: Type of the probe (synchronous or asynchronous) to use.
> + * @async_shutdown_enable: Enables devices to be shutdown asynchronously.
> * @of_match_table: The open firmware table.
> * @acpi_match_table: The ACPI match table.
> * @probe: Called to query the existence of a specific device,
> @@ -102,6 +103,7 @@ struct device_driver {
>
> bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
> enum probe_type probe_type;
> + bool async_shutdown_enable;
Maybe keep these 2 bools together to potentially avoid hole?
Just my 2c.
CJ
>
> const struct of_device_id *of_match_table;
> const struct acpi_device_id *acpi_match_table;
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-08 13:36 ` Jan Kiszka
@ 2024-09-11 0:14 ` stuart hayes
2024-09-11 5:51 ` Jan Kiszka
2024-09-12 14:30 ` David Jeffery
0 siblings, 2 replies; 29+ messages in thread
From: stuart hayes @ 2024-09-11 0:14 UTC (permalink / raw)
To: Jan Kiszka, linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Nathan Chancellor
On 9/8/2024 8:36 AM, Jan Kiszka wrote:
> On 22.08.24 22:28, Stuart Hayes wrote:
>> Add code to allow asynchronous shutdown of devices, ensuring that each
>> device is shut down before its parents & suppliers.
>>
>> Only devices with drivers that have async_shutdown_enable enabled will be
>> shut down asynchronously.
>>
>> This can dramatically reduce system shutdown/reboot time on systems that
>> have multiple devices that take many seconds to shut down (like certain
>> NVMe drives). On one system tested, the shutdown time went from 11 minutes
>> without this patch to 55 seconds with the patch.
>>
>> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
>> Signed-off-by: David Jeffery <djeffery@redhat.com>
>> ---
>> drivers/base/base.h | 4 +++
>> drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
>> include/linux/device/driver.h | 2 ++
>> 3 files changed, 59 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/base.h b/drivers/base/base.h
>> index 0b53593372d7..aa5a2bd3f2b8 100644
>> --- a/drivers/base/base.h
>> +++ b/drivers/base/base.h
>> @@ -10,6 +10,7 @@
>> * shared outside of the drivers/base/ directory.
>> *
>> */
>> +#include <linux/async.h>
>> #include <linux/notifier.h>
>>
>> /**
>> @@ -97,6 +98,8 @@ struct driver_private {
>> * the device; typically because it depends on another driver getting
>> * probed first.
>> * @async_driver - pointer to device driver awaiting probe via async_probe
>> + * @shutdown_after - used during device shutdown to ensure correct shutdown
>> + * ordering.
>> * @device - pointer back to the struct device that this structure is
>> * associated with.
>> * @dead - This device is currently either in the process of or has been
>> @@ -114,6 +117,7 @@ struct device_private {
>> struct list_head deferred_probe;
>> const struct device_driver *async_driver;
>> char *deferred_probe_reason;
>> + async_cookie_t shutdown_after;
>> struct device *device;
>> u8 dead:1;
>> };
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 7e50daa65ca0..dd3652ea56fe 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -9,6 +9,7 @@
>> */
>>
>> #include <linux/acpi.h>
>> +#include <linux/async.h>
>> #include <linux/cpufreq.h>
>> #include <linux/device.h>
>> #include <linux/err.h>
>> @@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
>> klist_init(&dev->p->klist_children, klist_children_get,
>> klist_children_put);
>> INIT_LIST_HEAD(&dev->p->deferred_probe);
>> + dev->p->shutdown_after = 0;
>> return 0;
>> }
>>
>> @@ -4781,6 +4783,8 @@ int device_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
>> }
>> EXPORT_SYMBOL_GPL(device_change_owner);
>>
>> +static ASYNC_DOMAIN(sd_domain);
>> +
>> static void shutdown_one_device(struct device *dev)
>> {
>> /* hold lock to avoid race with probe/release */
>> @@ -4816,12 +4820,34 @@ static void shutdown_one_device(struct device *dev)
>> put_device(dev->parent);
>> }
>>
>> +/**
>> + * shutdown_one_device_async
>> + * @data: the pointer to the struct device to be shutdown
>> + * @cookie: not used
>> + *
>> + * Shuts down one device, after waiting for shutdown_after to complete.
>> + * shutdown_after should be set to the cookie of the last child or consumer
>> + * of this device to be shutdown (if any), or to the cookie of the previous
>> + * device to be shut down for devices that don't enable asynchronous shutdown.
>> + */
>> +static void shutdown_one_device_async(void *data, async_cookie_t cookie)
>> +{
>> + struct device *dev = data;
>> +
>> + async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
>> +
>> + shutdown_one_device(dev);
>> +}
>> +
>> /**
>> * device_shutdown - call ->shutdown() on each device to shutdown.
>> */
>> void device_shutdown(void)
>> {
>> struct device *dev, *parent;
>> + async_cookie_t cookie = 0;
>> + struct device_link *link;
>> + int idx;
>>
>> wait_for_device_probe();
>> device_block_probing();
>> @@ -4852,11 +4878,37 @@ void device_shutdown(void)
>> list_del_init(&dev->kobj.entry);
>> spin_unlock(&devices_kset->list_lock);
>>
>> - shutdown_one_device(dev);
>> +
>> + /*
>> + * Set cookie for devices that will be shut down synchronously
>> + */
>> + if (!dev->driver || !dev->driver->async_shutdown_enable)
>> + dev->p->shutdown_after = cookie;
>> +
>> + get_device(dev);
>> + get_device(parent);
>> +
>> + cookie = async_schedule_domain(shutdown_one_device_async,
>> + dev, &sd_domain);
>> + /*
>> + * Ensure parent & suppliers wait for this device to shut down
>> + */
>> + if (parent) {
>> + parent->p->shutdown_after = cookie;
>> + put_device(parent);
>> + }
>> +
>> + idx = device_links_read_lock();
>> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>> + device_links_read_lock_held())
>> + link->supplier->p->shutdown_after = cookie;
>
> This will not fly if a supplier registered after its consumer. As we are
> walking the list backward, the supplier will now wait for something that
> is coming later during shutdown if the affected devices are still doing
> this synchronously (as almost all at this stage). This creates a
> circular dependency.
>
> Seems to explain the reboot hang that I'm seeing on an embedded target
> with a mailbox dev waiting for a remoteproc dev while the mailbox being
> after the remoteproc in the list (thus first on shutting down).
>
> This resolves the issue for me so far, but I don't think we are done yet:
>
> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> device_links_read_lock_held()) {
> if (link->supplier->driver &&
> link->supplier->driver->async_shutdown_enable)
> link->supplier->p->shutdown_after = cookie;
> }
>
> I wonder if overwriting the supplier's shutdown_after unconditionally is
> a good idea. A supplier may have multiple consumers - will we overwrite
> in the right order then? And why do we now need this ordering when we
> were so far shutting down suppliers while consumers were still up?
>
The devices_kset list should already be in the right order for shutting
stuff down--i.e., parents and suppliers should be shutdown later as the
device_shutdown loop goes through the devices.
With async shutdown this loop still goes the devices_kset list in the same
order it did before the patch, so my expectation was that any parents/suppliers
would come later in the loop than any children/siblings, and it would update
shutdown_after as it went to ensure that each device ended up with the cookie
of the last child/consumer that it needed to wait for.
However, I missed that the devices_kset list isn't always reordered when a
devlink is added and a consumer isn't dependent on the supplier (see
device_is_dependent()). I have a patch would address that, and add a sanity
check in case any devices get in the list in the wrong order somehow:
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b69b82da8837..52d64b419c01 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4832,6 +4832,13 @@ static void shutdown_one_device_async(void *data, async_cookie_t cookie)
{
struct device *dev = data;
+ /*
+ * Sanity check to prevent shutdown hang in case a parent or supplier
+ * is in devices_kset list in the wrong order
+ */
+ if (dev->p->shutdown_after > cookie)
+ dev->p->shutdown_after = cookie - 1;
+
async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
shutdown_one_device(dev);
@@ -4898,8 +4905,11 @@ void device_shutdown(void)
idx = device_links_read_lock();
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
- device_links_read_lock_held())
+ device_links_read_lock_held()) {
+ if (device_link_flag_is_sync_state_only(link->flags))
+ continue;
link->supplier->p->shutdown_after = cookie;
+ }
device_links_read_unlock(idx);
put_device(dev);
I'll submit this shortly if nobody responds with any issues with this.
Thank you!
> Same overwrite question applies to setting shutdown_after in parents.
> Don't we rather need a list for shutdown_after, at least once everything
> is async?
>
> This needs to be thought through once more, I guess.
>
> Jan
>
>> + device_links_read_unlock(idx);
>> + put_device(dev);
>>
>> spin_lock(&devices_kset->list_lock);
>> }
>> spin_unlock(&devices_kset->list_lock);
>> + async_synchronize_full_domain(&sd_domain);
>> }
>>
>> /*
>> diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
>> index 1fc8b68786de..2b6127faaa25 100644
>> --- a/include/linux/device/driver.h
>> +++ b/include/linux/device/driver.h
>> @@ -56,6 +56,7 @@ enum probe_type {
>> * @mod_name: Used for built-in modules.
>> * @suppress_bind_attrs: Disables bind/unbind via sysfs.
>> * @probe_type: Type of the probe (synchronous or asynchronous) to use.
>> + * @async_shutdown_enable: Enables devices to be shutdown asynchronously.
>> * @of_match_table: The open firmware table.
>> * @acpi_match_table: The ACPI match table.
>> * @probe: Called to query the existence of a specific device,
>> @@ -102,6 +103,7 @@ struct device_driver {
>>
>> bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
>> enum probe_type probe_type;
>> + bool async_shutdown_enable;
>>
>> const struct of_device_id *of_match_table;
>> const struct acpi_device_id *acpi_match_table;
>
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-11 0:14 ` stuart hayes
@ 2024-09-11 5:51 ` Jan Kiszka
2024-09-11 22:06 ` stuart hayes
2024-09-12 14:30 ` David Jeffery
1 sibling, 1 reply; 29+ messages in thread
From: Jan Kiszka @ 2024-09-11 5:51 UTC (permalink / raw)
To: stuart hayes, linux-kernel, Greg Kroah-Hartman,
Rafael J . Wysocki, Martin Belanger, Oliver O'Halloran,
Daniel Wagner, Keith Busch, Lukas Wunner, David Jeffery,
Jeremy Allison, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
linux-nvme
Cc: Nathan Chancellor
On 11.09.24 02:14, stuart hayes wrote:
>
>
> On 9/8/2024 8:36 AM, Jan Kiszka wrote:
>> On 22.08.24 22:28, Stuart Hayes wrote:
>>> Add code to allow asynchronous shutdown of devices, ensuring that each
>>> device is shut down before its parents & suppliers.
>>>
>>> Only devices with drivers that have async_shutdown_enable enabled
>>> will be
>>> shut down asynchronously.
>>>
>>> This can dramatically reduce system shutdown/reboot time on systems that
>>> have multiple devices that take many seconds to shut down (like certain
>>> NVMe drives). On one system tested, the shutdown time went from 11
>>> minutes
>>> without this patch to 55 seconds with the patch.
>>>
>>> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
>>> Signed-off-by: David Jeffery <djeffery@redhat.com>
>>> ---
>>> drivers/base/base.h | 4 +++
>>> drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
>>> include/linux/device/driver.h | 2 ++
>>> 3 files changed, 59 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/base/base.h b/drivers/base/base.h
>>> index 0b53593372d7..aa5a2bd3f2b8 100644
>>> --- a/drivers/base/base.h
>>> +++ b/drivers/base/base.h
>>> @@ -10,6 +10,7 @@
>>> * shared outside of the drivers/base/ directory.
>>> *
>>> */
>>> +#include <linux/async.h>
>>> #include <linux/notifier.h>
>>> /**
>>> @@ -97,6 +98,8 @@ struct driver_private {
>>> * the device; typically because it depends on another driver
>>> getting
>>> * probed first.
>>> * @async_driver - pointer to device driver awaiting probe via
>>> async_probe
>>> + * @shutdown_after - used during device shutdown to ensure correct
>>> shutdown
>>> + * ordering.
>>> * @device - pointer back to the struct device that this structure is
>>> * associated with.
>>> * @dead - This device is currently either in the process of or has
>>> been
>>> @@ -114,6 +117,7 @@ struct device_private {
>>> struct list_head deferred_probe;
>>> const struct device_driver *async_driver;
>>> char *deferred_probe_reason;
>>> + async_cookie_t shutdown_after;
>>> struct device *device;
>>> u8 dead:1;
>>> };
>>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>>> index 7e50daa65ca0..dd3652ea56fe 100644
>>> --- a/drivers/base/core.c
>>> +++ b/drivers/base/core.c
>>> @@ -9,6 +9,7 @@
>>> */
>>> #include <linux/acpi.h>
>>> +#include <linux/async.h>
>>> #include <linux/cpufreq.h>
>>> #include <linux/device.h>
>>> #include <linux/err.h>
>>> @@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
>>> klist_init(&dev->p->klist_children, klist_children_get,
>>> klist_children_put);
>>> INIT_LIST_HEAD(&dev->p->deferred_probe);
>>> + dev->p->shutdown_after = 0;
>>> return 0;
>>> }
>>> @@ -4781,6 +4783,8 @@ int device_change_owner(struct device *dev,
>>> kuid_t kuid, kgid_t kgid)
>>> }
>>> EXPORT_SYMBOL_GPL(device_change_owner);
>>> +static ASYNC_DOMAIN(sd_domain);
>>> +
>>> static void shutdown_one_device(struct device *dev)
>>> {
>>> /* hold lock to avoid race with probe/release */
>>> @@ -4816,12 +4820,34 @@ static void shutdown_one_device(struct device
>>> *dev)
>>> put_device(dev->parent);
>>> }
>>> +/**
>>> + * shutdown_one_device_async
>>> + * @data: the pointer to the struct device to be shutdown
>>> + * @cookie: not used
>>> + *
>>> + * Shuts down one device, after waiting for shutdown_after to complete.
>>> + * shutdown_after should be set to the cookie of the last child or
>>> consumer
>>> + * of this device to be shutdown (if any), or to the cookie of the
>>> previous
>>> + * device to be shut down for devices that don't enable asynchronous
>>> shutdown.
>>> + */
>>> +static void shutdown_one_device_async(void *data, async_cookie_t
>>> cookie)
>>> +{
>>> + struct device *dev = data;
>>> +
>>> + async_synchronize_cookie_domain(dev->p->shutdown_after + 1,
>>> &sd_domain);
>>> +
>>> + shutdown_one_device(dev);
>>> +}
>>> +
>>> /**
>>> * device_shutdown - call ->shutdown() on each device to shutdown.
>>> */
>>> void device_shutdown(void)
>>> {
>>> struct device *dev, *parent;
>>> + async_cookie_t cookie = 0;
>>> + struct device_link *link;
>>> + int idx;
>>> wait_for_device_probe();
>>> device_block_probing();
>>> @@ -4852,11 +4878,37 @@ void device_shutdown(void)
>>> list_del_init(&dev->kobj.entry);
>>> spin_unlock(&devices_kset->list_lock);
>>> - shutdown_one_device(dev);
>>> +
>>> + /*
>>> + * Set cookie for devices that will be shut down synchronously
>>> + */
>>> + if (!dev->driver || !dev->driver->async_shutdown_enable)
>>> + dev->p->shutdown_after = cookie;
>>> +
>>> + get_device(dev);
>>> + get_device(parent);
>>> +
>>> + cookie = async_schedule_domain(shutdown_one_device_async,
>>> + dev, &sd_domain);
>>> + /*
>>> + * Ensure parent & suppliers wait for this device to shut down
>>> + */
>>> + if (parent) {
>>> + parent->p->shutdown_after = cookie;
>>> + put_device(parent);
>>> + }
>>> +
>>> + idx = device_links_read_lock();
>>> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>>> + device_links_read_lock_held())
>>> + link->supplier->p->shutdown_after = cookie;
>>
>> This will not fly if a supplier registered after its consumer. As we are
>> walking the list backward, the supplier will now wait for something that
>> is coming later during shutdown if the affected devices are still doing
>> this synchronously (as almost all at this stage). This creates a
>> circular dependency.
>>
>> Seems to explain the reboot hang that I'm seeing on an embedded target
>> with a mailbox dev waiting for a remoteproc dev while the mailbox being
>> after the remoteproc in the list (thus first on shutting down).
>>
>> This resolves the issue for me so far, but I don't think we are done yet:
>>
>> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>> device_links_read_lock_held()) {
>> if (link->supplier->driver &&
>> link->supplier->driver->async_shutdown_enable)
>> link->supplier->p->shutdown_after = cookie;
>> }
>>
>> I wonder if overwriting the supplier's shutdown_after unconditionally is
>> a good idea. A supplier may have multiple consumers - will we overwrite
>> in the right order then? And why do we now need this ordering when we
>> were so far shutting down suppliers while consumers were still up?
>>
>
> The devices_kset list should already be in the right order for shutting
> stuff down--i.e., parents and suppliers should be shutdown later as the
> device_shutdown loop goes through the devices.
>
> With async shutdown this loop still goes the devices_kset list in the same
> order it did before the patch, so my expectation was that any
> parents/suppliers
> would come later in the loop than any children/siblings, and it would
> update
> shutdown_after as it went to ensure that each device ended up with the
> cookie
> of the last child/consumer that it needed to wait for.
>
> However, I missed that the devices_kset list isn't always reordered when a
> devlink is added and a consumer isn't dependent on the supplier (see
> device_is_dependent()). I have a patch would address that, and add a
> sanity
> check in case any devices get in the list in the wrong order somehow:
>
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index b69b82da8837..52d64b419c01 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -4832,6 +4832,13 @@ static void shutdown_one_device_async(void *data,
> async_cookie_t cookie)
> {
> struct device *dev = data;
>
> + /*
> + * Sanity check to prevent shutdown hang in case a parent or supplier
> + * is in devices_kset list in the wrong order
> + */
> + if (dev->p->shutdown_after > cookie)
> + dev->p->shutdown_after = cookie - 1;
> +
> async_synchronize_cookie_domain(dev->p->shutdown_after + 1,
> &sd_domain);
>
> shutdown_one_device(dev);
> @@ -4898,8 +4905,11 @@ void device_shutdown(void)
>
> idx = device_links_read_lock();
> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> - device_links_read_lock_held())
> + device_links_read_lock_held()) {
> + if (device_link_flag_is_sync_state_only(link->flags))
> + continue;
> link->supplier->p->shutdown_after = cookie;
> + }
> device_links_read_unlock(idx);
> put_device(dev);
>
>
> I'll submit this shortly if nobody responds with any issues with this.
>
> Thank you!
>
This sounds widely reasonable to me, and a quick check confirmed that it
apparently resolves the issue I was seeing.
I'm still wondering, though, if overwriting the parent's shutdown_after
and only checking later on in shutdown_one_device_async is sufficient or
if it wouldn't be safer to have a check when we write. The fact that
there could be multiple children for a parent is worrying me.
Jan
>
>> Same overwrite question applies to setting shutdown_after in parents.
>> Don't we rather need a list for shutdown_after, at least once everything
>> is async?
>>
>> This needs to be thought through once more, I guess.
>>
>> Jan
>>
>>> + device_links_read_unlock(idx);
>>> + put_device(dev);
>>> spin_lock(&devices_kset->list_lock);
>>> }
>>> spin_unlock(&devices_kset->list_lock);
>>> + async_synchronize_full_domain(&sd_domain);
>>> }
>>> /*
>>> diff --git a/include/linux/device/driver.h
>>> b/include/linux/device/driver.h
>>> index 1fc8b68786de..2b6127faaa25 100644
>>> --- a/include/linux/device/driver.h
>>> +++ b/include/linux/device/driver.h
>>> @@ -56,6 +56,7 @@ enum probe_type {
>>> * @mod_name: Used for built-in modules.
>>> * @suppress_bind_attrs: Disables bind/unbind via sysfs.
>>> * @probe_type: Type of the probe (synchronous or asynchronous)
>>> to use.
>>> + * @async_shutdown_enable: Enables devices to be shutdown
>>> asynchronously.
>>> * @of_match_table: The open firmware table.
>>> * @acpi_match_table: The ACPI match table.
>>> * @probe: Called to query the existence of a specific device,
>>> @@ -102,6 +103,7 @@ struct device_driver {
>>> bool suppress_bind_attrs; /* disables bind/unbind via
>>> sysfs */
>>> enum probe_type probe_type;
>>> + bool async_shutdown_enable;
>>> const struct of_device_id *of_match_table;
>>> const struct acpi_device_id *acpi_match_table;
>>
--
Siemens AG, Technology
Linux Expert Center
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-11 5:51 ` Jan Kiszka
@ 2024-09-11 22:06 ` stuart hayes
0 siblings, 0 replies; 29+ messages in thread
From: stuart hayes @ 2024-09-11 22:06 UTC (permalink / raw)
To: Jan Kiszka, linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Nathan Chancellor
On 9/11/2024 12:51 AM, Jan Kiszka wrote:
> On 11.09.24 02:14, stuart hayes wrote:
>>
>>
>> On 9/8/2024 8:36 AM, Jan Kiszka wrote:
>>> On 22.08.24 22:28, Stuart Hayes wrote:
>>>> Add code to allow asynchronous shutdown of devices, ensuring that each
>>>> device is shut down before its parents & suppliers.
>>>>
>>>> Only devices with drivers that have async_shutdown_enable enabled
>>>> will be
>>>> shut down asynchronously.
>>>>
>>>> This can dramatically reduce system shutdown/reboot time on systems that
>>>> have multiple devices that take many seconds to shut down (like certain
>>>> NVMe drives). On one system tested, the shutdown time went from 11
>>>> minutes
>>>> without this patch to 55 seconds with the patch.
>>>>
>>>> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
>>>> Signed-off-by: David Jeffery <djeffery@redhat.com>
>>>> ---
>>>> drivers/base/base.h | 4 +++
>>>> drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
>>>> include/linux/device/driver.h | 2 ++
>>>> 3 files changed, 59 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/base/base.h b/drivers/base/base.h
>>>> index 0b53593372d7..aa5a2bd3f2b8 100644
>>>> --- a/drivers/base/base.h
>>>> +++ b/drivers/base/base.h
>>>> @@ -10,6 +10,7 @@
>>>> * shared outside of the drivers/base/ directory.
>>>> *
>>>> */
>>>> +#include <linux/async.h>
>>>> #include <linux/notifier.h>
>>>> /**
>>>> @@ -97,6 +98,8 @@ struct driver_private {
>>>> * the device; typically because it depends on another driver
>>>> getting
>>>> * probed first.
>>>> * @async_driver - pointer to device driver awaiting probe via
>>>> async_probe
>>>> + * @shutdown_after - used during device shutdown to ensure correct
>>>> shutdown
>>>> + * ordering.
>>>> * @device - pointer back to the struct device that this structure is
>>>> * associated with.
>>>> * @dead - This device is currently either in the process of or has
>>>> been
>>>> @@ -114,6 +117,7 @@ struct device_private {
>>>> struct list_head deferred_probe;
>>>> const struct device_driver *async_driver;
>>>> char *deferred_probe_reason;
>>>> + async_cookie_t shutdown_after;
>>>> struct device *device;
>>>> u8 dead:1;
>>>> };
>>>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>>>> index 7e50daa65ca0..dd3652ea56fe 100644
>>>> --- a/drivers/base/core.c
>>>> +++ b/drivers/base/core.c
>>>> @@ -9,6 +9,7 @@
>>>> */
>>>> #include <linux/acpi.h>
>>>> +#include <linux/async.h>
>>>> #include <linux/cpufreq.h>
>>>> #include <linux/device.h>
>>>> #include <linux/err.h>
>>>> @@ -3531,6 +3532,7 @@ static int device_private_init(struct device *dev)
>>>> klist_init(&dev->p->klist_children, klist_children_get,
>>>> klist_children_put);
>>>> INIT_LIST_HEAD(&dev->p->deferred_probe);
>>>> + dev->p->shutdown_after = 0;
>>>> return 0;
>>>> }
>>>> @@ -4781,6 +4783,8 @@ int device_change_owner(struct device *dev,
>>>> kuid_t kuid, kgid_t kgid)
>>>> }
>>>> EXPORT_SYMBOL_GPL(device_change_owner);
>>>> +static ASYNC_DOMAIN(sd_domain);
>>>> +
>>>> static void shutdown_one_device(struct device *dev)
>>>> {
>>>> /* hold lock to avoid race with probe/release */
>>>> @@ -4816,12 +4820,34 @@ static void shutdown_one_device(struct device
>>>> *dev)
>>>> put_device(dev->parent);
>>>> }
>>>> +/**
>>>> + * shutdown_one_device_async
>>>> + * @data: the pointer to the struct device to be shutdown
>>>> + * @cookie: not used
>>>> + *
>>>> + * Shuts down one device, after waiting for shutdown_after to complete.
>>>> + * shutdown_after should be set to the cookie of the last child or
>>>> consumer
>>>> + * of this device to be shutdown (if any), or to the cookie of the
>>>> previous
>>>> + * device to be shut down for devices that don't enable asynchronous
>>>> shutdown.
>>>> + */
>>>> +static void shutdown_one_device_async(void *data, async_cookie_t
>>>> cookie)
>>>> +{
>>>> + struct device *dev = data;
>>>> +
>>>> + async_synchronize_cookie_domain(dev->p->shutdown_after + 1,
>>>> &sd_domain);
>>>> +
>>>> + shutdown_one_device(dev);
>>>> +}
>>>> +
>>>> /**
>>>> * device_shutdown - call ->shutdown() on each device to shutdown.
>>>> */
>>>> void device_shutdown(void)
>>>> {
>>>> struct device *dev, *parent;
>>>> + async_cookie_t cookie = 0;
>>>> + struct device_link *link;
>>>> + int idx;
>>>> wait_for_device_probe();
>>>> device_block_probing();
>>>> @@ -4852,11 +4878,37 @@ void device_shutdown(void)
>>>> list_del_init(&dev->kobj.entry);
>>>> spin_unlock(&devices_kset->list_lock);
>>>> - shutdown_one_device(dev);
>>>> +
>>>> + /*
>>>> + * Set cookie for devices that will be shut down synchronously
>>>> + */
>>>> + if (!dev->driver || !dev->driver->async_shutdown_enable)
>>>> + dev->p->shutdown_after = cookie;
>>>> +
>>>> + get_device(dev);
>>>> + get_device(parent);
>>>> +
>>>> + cookie = async_schedule_domain(shutdown_one_device_async,
>>>> + dev, &sd_domain);
>>>> + /*
>>>> + * Ensure parent & suppliers wait for this device to shut down
>>>> + */
>>>> + if (parent) {
>>>> + parent->p->shutdown_after = cookie;
>>>> + put_device(parent);
>>>> + }
>>>> +
>>>> + idx = device_links_read_lock();
>>>> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>>>> + device_links_read_lock_held())
>>>> + link->supplier->p->shutdown_after = cookie;
>>>
>>> This will not fly if a supplier registered after its consumer. As we are
>>> walking the list backward, the supplier will now wait for something that
>>> is coming later during shutdown if the affected devices are still doing
>>> this synchronously (as almost all at this stage). This creates a
>>> circular dependency.
>>>
>>> Seems to explain the reboot hang that I'm seeing on an embedded target
>>> with a mailbox dev waiting for a remoteproc dev while the mailbox being
>>> after the remoteproc in the list (thus first on shutting down).
>>>
>>> This resolves the issue for me so far, but I don't think we are done yet:
>>>
>>> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>>> device_links_read_lock_held()) {
>>> if (link->supplier->driver &&
>>> link->supplier->driver->async_shutdown_enable)
>>> link->supplier->p->shutdown_after = cookie;
>>> }
>>>
>>> I wonder if overwriting the supplier's shutdown_after unconditionally is
>>> a good idea. A supplier may have multiple consumers - will we overwrite
>>> in the right order then? And why do we now need this ordering when we
>>> were so far shutting down suppliers while consumers were still up?
>>>
>>
>> The devices_kset list should already be in the right order for shutting
>> stuff down--i.e., parents and suppliers should be shutdown later as the
>> device_shutdown loop goes through the devices.
>>
>> With async shutdown this loop still goes the devices_kset list in the same
>> order it did before the patch, so my expectation was that any
>> parents/suppliers
>> would come later in the loop than any children/siblings, and it would
>> update
>> shutdown_after as it went to ensure that each device ended up with the
>> cookie
>> of the last child/consumer that it needed to wait for.
>>
>> However, I missed that the devices_kset list isn't always reordered when a
>> devlink is added and a consumer isn't dependent on the supplier (see
>> device_is_dependent()). I have a patch would address that, and add a
>> sanity
>> check in case any devices get in the list in the wrong order somehow:
>>
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index b69b82da8837..52d64b419c01 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -4832,6 +4832,13 @@ static void shutdown_one_device_async(void *data,
>> async_cookie_t cookie)
>> {
>> struct device *dev = data;
>>
>> + /*
>> + * Sanity check to prevent shutdown hang in case a parent or supplier
>> + * is in devices_kset list in the wrong order
>> + */
>> + if (dev->p->shutdown_after > cookie)
>> + dev->p->shutdown_after = cookie - 1;
>> +
>> async_synchronize_cookie_domain(dev->p->shutdown_after + 1,
>> &sd_domain);
>>
>> shutdown_one_device(dev);
>> @@ -4898,8 +4905,11 @@ void device_shutdown(void)
>>
>> idx = device_links_read_lock();
>> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
>> - device_links_read_lock_held())
>> + device_links_read_lock_held()) {
>> + if (device_link_flag_is_sync_state_only(link->flags))
>> + continue;
>> link->supplier->p->shutdown_after = cookie;
>> + }
>> device_links_read_unlock(idx);
>> put_device(dev);
>>
>>
>> I'll submit this shortly if nobody responds with any issues with this.
>>
>> Thank you!
>>
>
> This sounds widely reasonable to me, and a quick check confirmed that it
> apparently resolves the issue I was seeing.
>
> I'm still wondering, though, if overwriting the parent's shutdown_after
> and only checking later on in shutdown_one_device_async is sufficient or
> if it wouldn't be safer to have a check when we write. The fact that
> there could be multiple children for a parent is worrying me.
>
> Jan
>
Having multiple children isn't a problem. All of the children will be in
the shutdown loop before the parent, and as each of them is seen, the
parent's shutdown_after will be updated with the cookie of the latest
child to be shut down. When the parent then does a synchronize_cookie to
wait for that last child, it won't continue until the earlier children have
also shutdown, because synchronize_cookie doesn't just wait for that one
cookie--it waits until the specified cookie is the lowest cookie in
progress, which means all the earlier children are also done shutting down.
>>
>>> Same overwrite question applies to setting shutdown_after in parents.
>>> Don't we rather need a list for shutdown_after, at least once everything
>>> is async?
>>>
>>> This needs to be thought through once more, I guess.
>>>
>>> Jan
>>>
>>>> + device_links_read_unlock(idx);
>>>> + put_device(dev);
>>>> spin_lock(&devices_kset->list_lock);
>>>> }
>>>> spin_unlock(&devices_kset->list_lock);
>>>> + async_synchronize_full_domain(&sd_domain);
>>>> }
>>>> /*
>>>> diff --git a/include/linux/device/driver.h
>>>> b/include/linux/device/driver.h
>>>> index 1fc8b68786de..2b6127faaa25 100644
>>>> --- a/include/linux/device/driver.h
>>>> +++ b/include/linux/device/driver.h
>>>> @@ -56,6 +56,7 @@ enum probe_type {
>>>> * @mod_name: Used for built-in modules.
>>>> * @suppress_bind_attrs: Disables bind/unbind via sysfs.
>>>> * @probe_type: Type of the probe (synchronous or asynchronous)
>>>> to use.
>>>> + * @async_shutdown_enable: Enables devices to be shutdown
>>>> asynchronously.
>>>> * @of_match_table: The open firmware table.
>>>> * @acpi_match_table: The ACPI match table.
>>>> * @probe: Called to query the existence of a specific device,
>>>> @@ -102,6 +103,7 @@ struct device_driver {
>>>> bool suppress_bind_attrs; /* disables bind/unbind via
>>>> sysfs */
>>>> enum probe_type probe_type;
>>>> + bool async_shutdown_enable;
>>>> const struct of_device_id *of_match_table;
>>>> const struct acpi_device_id *acpi_match_table;
>>>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-11 0:14 ` stuart hayes
2024-09-11 5:51 ` Jan Kiszka
@ 2024-09-12 14:30 ` David Jeffery
2024-09-12 16:20 ` stuart hayes
1 sibling, 1 reply; 29+ messages in thread
From: David Jeffery @ 2024-09-12 14:30 UTC (permalink / raw)
To: stuart hayes
Cc: Jan Kiszka, linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, Jeremy Allison, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-nvme, Nathan Chancellor
On Tue, Sep 10, 2024 at 8:14 PM stuart hayes <stuart.w.hayes@gmail.com> wrote:
>
...
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index b69b82da8837..52d64b419c01 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -4832,6 +4832,13 @@ static void shutdown_one_device_async(void *data, async_cookie_t cookie)
> {
> struct device *dev = data;
>
> + /*
> + * Sanity check to prevent shutdown hang in case a parent or supplier
> + * is in devices_kset list in the wrong order
> + */
> + if (dev->p->shutdown_after > cookie)
> + dev->p->shutdown_after = cookie - 1;
> +
> async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
>
> shutdown_one_device(dev);
While the race window is really small, there is a potential race with
this fixup. It's possible for the shutdown operation to write a new
value to shutdown_after in the time between the if check and
shutdown_after being re-read and used in the
async_synchronize_cookie_domain call. Such a race would allow a too
high value to be used.
Instead, could do something like:
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4833,8 +4833,12 @@ static void shutdown_one_device(struct device *dev)
static void shutdown_one_device_async(void *data, async_cookie_t cookie)
{
struct device *dev = data;
+ async_cookie_t wait = dev->p->shutdown_after + 1;
- async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
+ if (wait > cookie)
+ wait = cookie;
+
+ async_synchronize_cookie_domain(wait, &sd_domain);
shutdown_one_device(dev);
}
This reads the shutdown_after value once and avoids the race window
where its value being changed on another CPU could still cause a
potential deadlock.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-12 14:30 ` David Jeffery
@ 2024-09-12 16:20 ` stuart hayes
0 siblings, 0 replies; 29+ messages in thread
From: stuart hayes @ 2024-09-12 16:20 UTC (permalink / raw)
To: David Jeffery
Cc: Jan Kiszka, linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, Jeremy Allison, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-nvme, Nathan Chancellor
On 9/12/2024 9:30 AM, David Jeffery wrote:
> On Tue, Sep 10, 2024 at 8:14 PM stuart hayes <stuart.w.hayes@gmail.com> wrote:
>>
> ...
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index b69b82da8837..52d64b419c01 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -4832,6 +4832,13 @@ static void shutdown_one_device_async(void *data, async_cookie_t cookie)
>> {
>> struct device *dev = data;
>>
>> + /*
>> + * Sanity check to prevent shutdown hang in case a parent or supplier
>> + * is in devices_kset list in the wrong order
>> + */
>> + if (dev->p->shutdown_after > cookie)
>> + dev->p->shutdown_after = cookie - 1;
>> +
>> async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
>>
>> shutdown_one_device(dev);
>
> While the race window is really small, there is a potential race with
> this fixup. It's possible for the shutdown operation to write a new
> value to shutdown_after in the time between the if check and
> shutdown_after being re-read and used in the
> async_synchronize_cookie_domain call. Such a race would allow a too
> high value to be used.
>
> Instead, could do something like:
>
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -4833,8 +4833,12 @@ static void shutdown_one_device(struct device *dev)
> static void shutdown_one_device_async(void *data, async_cookie_t cookie)
> {
> struct device *dev = data;
> + async_cookie_t wait = dev->p->shutdown_after + 1;
>
> - async_synchronize_cookie_domain(dev->p->shutdown_after + 1, &sd_domain);
> + if (wait > cookie)
> + wait = cookie;
> +
> + async_synchronize_cookie_domain(wait, &sd_domain);
>
> shutdown_one_device(dev);
> }
>
> This reads the shutdown_after value once and avoids the race window
> where its value being changed on another CPU could still cause a
> potential deadlock.
>
Good point. Really that sanity check shouldn't be needed at all. But... maybe it
would be better to just not change the shutdown_after on any device that's
already been scheduled for shutdown... this would work regardless of why the supplier
and consumer devices are in the wrong order on the devices_kset list, and would still
work if supplier/consumer devices don't get reordered for some reason other than
the devlink being sync_state only in the future. Plus, it's a bit simpler.
How does this look?
diff --git a/drivers/base/base.h b/drivers/base/base.h
index ea18aa70f151..f818a0251bb7 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -105,6 +105,8 @@ struct driver_private {
* @dead - This device is currently either in the process of or has been
* removed from the system. Any asynchronous events scheduled for this
* device should exit without taking any action.
+ * @shutdown_scheduled - asynchronous shutdown of the device has already
+ * been scheduled
*
* Nothing outside of the driver core should ever touch these fields.
*/
@@ -120,6 +122,7 @@ struct device_private {
async_cookie_t shutdown_after;
struct device *device;
u8 dead:1;
+ u8 shutdown_scheduled:1;
};
#define to_device_private_parent(obj) \
container_of(obj, struct device_private, knode_parent)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b69b82da8837..bd6bc4a3dc15 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4888,6 +4888,8 @@ void device_shutdown(void)
cookie = async_schedule_domain(shutdown_one_device_async,
dev, &sd_domain);
+ dev->p->shutdown_scheduled = 1;
+
/*
* Ensure parent & suppliers wait for this device to shut down
*/
@@ -4898,8 +4900,18 @@ void device_shutdown(void)
idx = device_links_read_lock();
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
- device_links_read_lock_held())
- link->supplier->p->shutdown_after = cookie;
+ device_links_read_lock_held()) {
+ /*
+ * Only update cookie if device shutdown hasn't
+ * already been scheduled. Some supplier/consumer
+ * devices (sync_state only) aren't reordered on
+ * devices_kset list and don't need this, and setting
+ * this could result in a circular dependency if the
+ * supplier shutdown has already been scheduled.
+ */
+ if (!link->supplier->p->shutdown_scheduled)
+ link->supplier->p->shutdown_after = cookie;
+ }
device_links_read_unlock(idx);
put_device(dev);
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
` (5 preceding siblings ...)
2024-09-08 14:44 ` Christophe JAILLET
@ 2024-09-23 20:50 ` Andrey Skvortsov
2024-09-24 9:23 ` Greg Kroah-Hartman
6 siblings, 1 reply; 29+ messages in thread
From: Andrey Skvortsov @ 2024-09-23 20:50 UTC (permalink / raw)
To: Stuart Hayes
Cc: linux-kernel, Greg Kroah-Hartman, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
Hi Stuart,
On 24-08-22 15:28, Stuart Hayes wrote:
> Add code to allow asynchronous shutdown of devices, ensuring that each
> device is shut down before its parents & suppliers.
>
> Only devices with drivers that have async_shutdown_enable enabled will be
> shut down asynchronously.
>
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
> drivers/base/base.h | 4 +++
> drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
> include/linux/device/driver.h | 2 ++
> 3 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 0b53593372d7..aa5a2bd3f2b8 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -10,6 +10,7 @@
> * shared outside of the drivers/base/ directory.
This change landed in linux-next and I have problem with shutdown on
ARM Allwinner A64 device. Device usually hangs at shutdown.
git bisect pointed to "driver core: shut down devices asynchronously"
as a first bad commit.
I've tried to debug the problem and this is what I see:
1) device 'mmc_host mmc0' processed in device_shutdown. For this device
async_schedule_domain is called (cookie 264, for example).
2) after that 'mmcblk mmc0:aaaa' is processed. For this device
async_schedule_domain is called (cookie 296, for example).
3) 'mmc_host mmc0' is parent of 'mmcblk mmc0:aaaa' and
parent->p->shutdown_after is updated from 263 to 296.
4) After sometime shutdown_one_device_async is called for 264
(mmc_host mmc0), but dev->p->shutdown_after was updated to 296 and the
code calls first async_synchronize_cookie_domain for 297.
264 can't finish, because it waits for 297. shutdown process can't continue.
The problem is always with a MMC host controller.
--
Best regards,
Andrey Skvortsov
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-23 20:50 ` Andrey Skvortsov
@ 2024-09-24 9:23 ` Greg Kroah-Hartman
2024-09-24 20:44 ` Andrey Skvortsov
0 siblings, 1 reply; 29+ messages in thread
From: Greg Kroah-Hartman @ 2024-09-24 9:23 UTC (permalink / raw)
To: Andrey Skvortsov, Stuart Hayes, linux-kernel, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
On Mon, Sep 23, 2024 at 11:50:39PM +0300, Andrey Skvortsov wrote:
> Hi Stuart,
>
> On 24-08-22 15:28, Stuart Hayes wrote:
> > Add code to allow asynchronous shutdown of devices, ensuring that each
> > device is shut down before its parents & suppliers.
> >
> > Only devices with drivers that have async_shutdown_enable enabled will be
> > shut down asynchronously.
> >
> > This can dramatically reduce system shutdown/reboot time on systems that
> > have multiple devices that take many seconds to shut down (like certain
> > NVMe drives). On one system tested, the shutdown time went from 11 minutes
> > without this patch to 55 seconds with the patch.
> >
> > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> > Signed-off-by: David Jeffery <djeffery@redhat.com>
> > ---
> > drivers/base/base.h | 4 +++
> > drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
> > include/linux/device/driver.h | 2 ++
> > 3 files changed, 59 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index 0b53593372d7..aa5a2bd3f2b8 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -10,6 +10,7 @@
> > * shared outside of the drivers/base/ directory.
>
> This change landed in linux-next and I have problem with shutdown on
> ARM Allwinner A64 device. Device usually hangs at shutdown.
> git bisect pointed to "driver core: shut down devices asynchronously"
> as a first bad commit.
>
> I've tried to debug the problem and this is what I see:
>
> 1) device 'mmc_host mmc0' processed in device_shutdown. For this device
> async_schedule_domain is called (cookie 264, for example).
>
> 2) after that 'mmcblk mmc0:aaaa' is processed. For this device
> async_schedule_domain is called (cookie 296, for example).
>
> 3) 'mmc_host mmc0' is parent of 'mmcblk mmc0:aaaa' and
> parent->p->shutdown_after is updated from 263 to 296.
>
> 4) After sometime shutdown_one_device_async is called for 264
> (mmc_host mmc0), but dev->p->shutdown_after was updated to 296 and the
> code calls first async_synchronize_cookie_domain for 297.
>
> 264 can't finish, because it waits for 297. shutdown process can't continue.
>
> The problem is always with a MMC host controller.
If you take the patch here:
https://lore.kernel.org/r/20240919043143.1194950-1-stuart.w.hayes@gmail.com
does it solve the problem?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-24 9:23 ` Greg Kroah-Hartman
@ 2024-09-24 20:44 ` Andrey Skvortsov
2024-09-25 8:55 ` Greg Kroah-Hartman
0 siblings, 1 reply; 29+ messages in thread
From: Andrey Skvortsov @ 2024-09-24 20:44 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Stuart Hayes, linux-kernel, Rafael J . Wysocki, Martin Belanger,
Oliver O'Halloran, Daniel Wagner, Keith Busch, Lukas Wunner,
David Jeffery, Jeremy Allison, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, linux-nvme
Hi,
On 24-09-24 11:23, Greg Kroah-Hartman wrote:
> On Mon, Sep 23, 2024 at 11:50:39PM +0300, Andrey Skvortsov wrote:
> > Hi Stuart,
> >
> > On 24-08-22 15:28, Stuart Hayes wrote:
> > > Add code to allow asynchronous shutdown of devices, ensuring that each
> > > device is shut down before its parents & suppliers.
> > >
> > > Only devices with drivers that have async_shutdown_enable enabled will be
> > > shut down asynchronously.
> > >
> > > This can dramatically reduce system shutdown/reboot time on systems that
> > > have multiple devices that take many seconds to shut down (like certain
> > > NVMe drives). On one system tested, the shutdown time went from 11 minutes
> > > without this patch to 55 seconds with the patch.
> > >
> > > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> > > Signed-off-by: David Jeffery <djeffery@redhat.com>
> > > ---
> > > drivers/base/base.h | 4 +++
> > > drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
> > > include/linux/device/driver.h | 2 ++
> > > 3 files changed, 59 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > > index 0b53593372d7..aa5a2bd3f2b8 100644
> > > --- a/drivers/base/base.h
> > > +++ b/drivers/base/base.h
> > > @@ -10,6 +10,7 @@
> > > * shared outside of the drivers/base/ directory.
> >
> > This change landed in linux-next and I have problem with shutdown on
> > ARM Allwinner A64 device. Device usually hangs at shutdown.
> > git bisect pointed to "driver core: shut down devices asynchronously"
> > as a first bad commit.
> >
> > I've tried to debug the problem and this is what I see:
> >
> > 1) device 'mmc_host mmc0' processed in device_shutdown. For this device
> > async_schedule_domain is called (cookie 264, for example).
> >
> > 2) after that 'mmcblk mmc0:aaaa' is processed. For this device
> > async_schedule_domain is called (cookie 296, for example).
> >
> > 3) 'mmc_host mmc0' is parent of 'mmcblk mmc0:aaaa' and
> > parent->p->shutdown_after is updated from 263 to 296.
> >
> > 4) After sometime shutdown_one_device_async is called for 264
> > (mmc_host mmc0), but dev->p->shutdown_after was updated to 296 and the
> > code calls first async_synchronize_cookie_domain for 297.
> >
> > 264 can't finish, because it waits for 297. shutdown process can't continue.
> >
> > The problem is always with a MMC host controller.
>
> If you take the patch here:
> https://lore.kernel.org/r/20240919043143.1194950-1-stuart.w.hayes@gmail.com
> does it solve the problem?
Unfortunately not. I've applied the patch to next-20240920 and tested
latest next-20240924 with patch integrated already. In both cases
shutdown hangs.
--
Best regards,
Andrey Skvortsov
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v8 3/4] driver core: shut down devices asynchronously
2024-09-24 20:44 ` Andrey Skvortsov
@ 2024-09-25 8:55 ` Greg Kroah-Hartman
0 siblings, 0 replies; 29+ messages in thread
From: Greg Kroah-Hartman @ 2024-09-25 8:55 UTC (permalink / raw)
To: Andrey Skvortsov, Stuart Hayes, linux-kernel, Rafael J . Wysocki,
Martin Belanger, Oliver O'Halloran, Daniel Wagner,
Keith Busch, Lukas Wunner, David Jeffery, Jeremy Allison,
Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme
On Tue, Sep 24, 2024 at 11:44:02PM +0300, Andrey Skvortsov wrote:
> Hi,
>
> On 24-09-24 11:23, Greg Kroah-Hartman wrote:
> > On Mon, Sep 23, 2024 at 11:50:39PM +0300, Andrey Skvortsov wrote:
> > > Hi Stuart,
> > >
> > > On 24-08-22 15:28, Stuart Hayes wrote:
> > > > Add code to allow asynchronous shutdown of devices, ensuring that each
> > > > device is shut down before its parents & suppliers.
> > > >
> > > > Only devices with drivers that have async_shutdown_enable enabled will be
> > > > shut down asynchronously.
> > > >
> > > > This can dramatically reduce system shutdown/reboot time on systems that
> > > > have multiple devices that take many seconds to shut down (like certain
> > > > NVMe drives). On one system tested, the shutdown time went from 11 minutes
> > > > without this patch to 55 seconds with the patch.
> > > >
> > > > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> > > > Signed-off-by: David Jeffery <djeffery@redhat.com>
> > > > ---
> > > > drivers/base/base.h | 4 +++
> > > > drivers/base/core.c | 54 ++++++++++++++++++++++++++++++++++-
> > > > include/linux/device/driver.h | 2 ++
> > > > 3 files changed, 59 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > > > index 0b53593372d7..aa5a2bd3f2b8 100644
> > > > --- a/drivers/base/base.h
> > > > +++ b/drivers/base/base.h
> > > > @@ -10,6 +10,7 @@
> > > > * shared outside of the drivers/base/ directory.
> > >
> > > This change landed in linux-next and I have problem with shutdown on
> > > ARM Allwinner A64 device. Device usually hangs at shutdown.
> > > git bisect pointed to "driver core: shut down devices asynchronously"
> > > as a first bad commit.
> > >
> > > I've tried to debug the problem and this is what I see:
> > >
> > > 1) device 'mmc_host mmc0' processed in device_shutdown. For this device
> > > async_schedule_domain is called (cookie 264, for example).
> > >
> > > 2) after that 'mmcblk mmc0:aaaa' is processed. For this device
> > > async_schedule_domain is called (cookie 296, for example).
> > >
> > > 3) 'mmc_host mmc0' is parent of 'mmcblk mmc0:aaaa' and
> > > parent->p->shutdown_after is updated from 263 to 296.
> > >
> > > 4) After sometime shutdown_one_device_async is called for 264
> > > (mmc_host mmc0), but dev->p->shutdown_after was updated to 296 and the
> > > code calls first async_synchronize_cookie_domain for 297.
> > >
> > > 264 can't finish, because it waits for 297. shutdown process can't continue.
> > >
> > > The problem is always with a MMC host controller.
> >
> > If you take the patch here:
> > https://lore.kernel.org/r/20240919043143.1194950-1-stuart.w.hayes@gmail.com
> > does it solve the problem?
>
> Unfortunately not. I've applied the patch to next-20240920 and tested
> latest next-20240924 with patch integrated already. In both cases
> shutdown hangs.
Ugh. Ok, this is making me very nervous so I'm going to go and revert
this series from my tree and let's take it again after -rc1 is out and
people can work through the issues there.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2024-09-25 8:55 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-22 20:28 [PATCH v8 0/4] shut down devices asynchronously Stuart Hayes
2024-08-22 20:28 ` [PATCH v8 1/4] driver core: don't always lock parent in shutdown Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
2024-08-22 20:28 ` [PATCH v8 2/4] driver core: separate function to shutdown one device Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-25 7:56 ` Sagi Grimberg
2024-08-22 20:28 ` [PATCH v8 3/4] driver core: shut down devices asynchronously Stuart Hayes
2024-08-23 6:26 ` Christoph Hellwig
2024-08-24 10:29 ` Lukas Wunner
2024-08-25 7:58 ` Sagi Grimberg
2024-09-05 22:13 ` Nathan Chancellor
2024-09-06 14:44 ` stuart hayes
2024-09-08 13:36 ` Jan Kiszka
2024-09-11 0:14 ` stuart hayes
2024-09-11 5:51 ` Jan Kiszka
2024-09-11 22:06 ` stuart hayes
2024-09-12 14:30 ` David Jeffery
2024-09-12 16:20 ` stuart hayes
2024-09-08 14:44 ` Christophe JAILLET
2024-09-23 20:50 ` Andrey Skvortsov
2024-09-24 9:23 ` Greg Kroah-Hartman
2024-09-24 20:44 ` Andrey Skvortsov
2024-09-25 8:55 ` Greg Kroah-Hartman
2024-08-22 20:28 ` [PATCH v8 4/4] nvme-pci: Make driver prefer asynchronous shutdown Stuart Hayes
2024-08-23 6:27 ` Christoph Hellwig
2024-08-25 7:57 ` Sagi Grimberg
2024-08-23 16:54 ` [PATCH v8 0/4] shut down devices asynchronously Keith Busch
2024-09-03 11:10 ` Greg Kroah-Hartman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox