Linux PCI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "David Jeffery" <djeffery@redhat.com>
Cc: linux-pci@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH 3/5] driver core: async device shutdown infrastructure
Date: Wed, 01 Jul 2026 14:11:44 +0000	[thread overview]
Message-ID: <20260701141145.52A611F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260701135015.81937-4-djeffery@redhat.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Modifying the core.async_shutdown module parameter during shutdown causes synchronous suppliers to skip waiting for asynchronous consumers.
- [High] Unregistered consumer devices will cause a NULL pointer dereference during async shutdown dependency waiting.
- [High] Adding a new device with async shutdown enabled while its synchronous supplier is being processed leads to an unresolvable deadlock.
- [High] Dereferencing `dev->p->dead` without NULL checks causes a crash if a concurrent `device_add()` fails.
--

commit cf98cdf76df805443cc885868cb3d9e13da5501c
Author: David Jeffery <djeffery@redhat.com>

driver core: async device shutdown infrastructure

This commit allows devices to opt into asynchronous shutdown, executing in
parallel with unrelated devices after their dependencies shut down. Synchronous
shutdown devices will wait for their asynchronous dependencies. The feature is
enabled by default and can be toggled via the core.async_shutdown parameter.

> diff --git a/drivers/base/core.c b/drivers/base/core.c
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
[ ... ]
> +static bool wants_async_shutdown(struct device *dev)
> +{
> +	return async_shutdown && dev_async_shutdown(dev);
> +}
> +
> +static int wait_for_device_shutdown(struct device *dev, void *data)
> +{
> +	bool async = *(bool *)data;
> +
> +	if (async || wants_async_shutdown(dev))
> +		wait_for_completion(&dev->p->complete);
> +
> +	return 0;
> +}

[Severity: High]
If a user-space process toggles the 0644 core.async_shutdown sysfs parameter
from 1 to 0 during shutdown, could this cause synchronous suppliers to skip
waiting for their asynchronous consumers?

If start_async_shutdown_devices() schedules an async consumer, and then the
parameter is toggled before the sync loop processes its supplier, the call to
wants_async_shutdown(dev) here would evaluate to false. 

This would skip the wait_for_completion(), potentially causing the supplier
(e.g., a PCI bus) to shut down while its consumer is still active.

[Severity: High]
Could this result in a NULL pointer dereference if the consumer is not yet
registered?

The device_link_add() function allows creating links to consumer devices that
haven't been registered yet, meaning their dev->p is NULL because
device_private_init() hasn't run.

If wait_for_shutdown_dependencies() is called from the async pool, where the
async argument is forced to true, the wait_for_completion() will
unconditionally evaluate &dev->p->complete, crashing on the NULL pointer.

[ ... ]
> +static void __shutdown_one_device(struct device *dev)
>  {
>  	if (dev->p->dead)
>  		return;

[Severity: High]
Could this cause a NULL pointer dereference if a concurrent device_add() fails?

If a concurrent device_add() fails after calling kobject_add(), its error path
calls kobject_del() and then frees and nullifies dev->p. Evaluating
dev->p->dead without checking if dev->p is valid could crash here.

[ ... ]
> +static void start_async_shutdown_devices(void)
> +{
> +	struct device *dev, *next, *ndev, *needs_put = NULL;
[ ... ]
> +			needs_put = ndev;
> +			spin_lock(&devices_kset->list_lock);
> +			/*
> +			 * If the next device has been marked dead while the
> +			 * spinlock was released, it may no longer be on the
> +			 * devices_kset list. Restart the list walk to be safe
> +			 */
> +			if (ndev && ndev->p->dead)
> +				goto restart;

[Severity: High]
Could ndev->p be NULL if a concurrent device_add() fails?

If start_async_shutdown_devices() obtains a reference to next via
get_device(), and concurrently device_add() fails and clears dev->p,
reacquiring the lock and evaluating ndev->p->dead here could result in a NULL
pointer dereference.

[ ... ]
>  void device_shutdown(void)
>  {
[ ... ]
> -		shutdown_one_device(dev);
> +		if (!wants_async_shutdown(dev) || !shutdown_device_async(dev)) {
> +			wait_for_shutdown_dependencies(dev, false);
> +			shutdown_one_device(dev);
> +		}
>  
>  		spin_lock(&devices_kset->list_lock);
>  	}
>  	spin_unlock(&devices_kset->list_lock);

[Severity: High]
Can this sequence lead to an unresolvable deadlock if a new device is added
concurrently?

If a new device 'C' with DEV_FLAG_ASYNC_SHUTDOWN is added via hotplug
after its synchronous supplier 'S' has already been popped from
devices_kset by this loop, 'C' is appended to devices_kset.

'S' will call wait_for_shutdown_dependencies() and find 'C', blocking on
wait_for_completion(&C->p->complete) because wants_async_shutdown(C) is true.

Since 'C' was added after start_async_shutdown_devices() ran, it is not
running in the async pool. Because the sync loop is blocked on 'S', it will
never reach 'C' to start it, deadlocking the shutdown process permanently.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260701135015.81937-1-djeffery@redhat.com?part=3

  reply	other threads:[~2026-07-01 14:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01 13:50 [PATCH v18 0/5] shut down devices asynchronously David Jeffery
2026-07-01 13:50 ` [PATCH 1/5] driver core: separate function to shutdown one device David Jeffery
2026-07-01 13:58   ` sashiko-bot
2026-07-01 13:50 ` [PATCH 2/5] driver core: do not always lock parent in shutdown David Jeffery
2026-07-01 14:02   ` sashiko-bot
2026-07-01 13:50 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-07-01 14:11   ` sashiko-bot [this message]
2026-07-01 13:50 ` [PATCH 4/5] PCI: Enable async shutdown support David Jeffery
2026-07-01 14:07   ` sashiko-bot
2026-07-01 13:50 ` [PATCH 5/5] scsi: " David Jeffery
2026-07-01 14:06   ` sashiko-bot
  -- strict thread matches above, loose matches on Subject: below --
2026-06-16 15:22 [PATCH v17 0/5] shut down devices asynchronously David Jeffery
2026-06-16 15:22 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-06-16 15:39   ` sashiko-bot
2026-06-16 16:10   ` David Jeffery
2026-06-16 17:26   ` Randy Dunlap
2026-05-18 19:31 [PATCH v16 0/5] shut down devices asynchronously David Jeffery
2026-05-18 19:32 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-04-29 17:50 [PATCH v15 0/5] shut down devices asynchronously David Jeffery
2026-04-29 17:50 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-04-20 15:26 [PATCH v14 0/5] shut down devices asynchronously David Jeffery
2026-04-20 15:26 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-04-21  7:49   ` John Garry
2026-04-21 17:31     ` David Jeffery
2026-04-07 15:35 [PATCH v13 0/5] shut down devices asynchronously David Jeffery
2026-04-07 15:35 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-03-19 14:11 [PATCH v12 0/5] shut down devices asynchronously David Jeffery
2026-03-19 14:11 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-03-23  9:43   ` Maurizio Lombardi
2026-03-23 14:07     ` David Jeffery
2026-03-11 17:12 [PATCH 1/5] driver core: do not always lock parent in shutdown David Jeffery
2026-03-11 17:12 ` [PATCH 3/5] driver core: async device shutdown infrastructure David Jeffery
2026-03-11 19:40   ` Randy Dunlap
2026-03-11 23:05   ` Bjorn Helgaas
2026-03-12 14:01     ` David Jeffery

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260701141145.52A611F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=djeffery@redhat.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox