linux-nvme: driver removal deadlock when device hot-removed


From: keith.busch@linux.intel.com (Keith Busch)
Subject: linux-nvme: driver removal deadlock when device hot-removed
Date: Fri, 21 Sep 2018 10:53:21 -0600
Message-ID: <20180921165321.GA1405@localhost.localdomain>
In-Reply-To: <MWHPR08MB2671C87514045AC3D85AAA5BC9120@MWHPR08MB2671.namprd08.prod.outlook.com>

On Fri, Sep 21, 2018 at 04:32:19PM +0000, Michael Schoberg (mschoberg) wrote:
> I'm not sure if this is the correct forum, but hopefully someone can
> help instruct me on how to proceed towards resolving this issue.
> 
> I'm working on a problem that occurs when an nvme device is hot-removed
> while IO is occurring. I see a driver deadlock during recovery or
> more precisely, the removal of the driver instance during the recovery
> attempt. What seems to be happening is when the system calls the nvme
> timeout routine, it eventually calls outside the driver and runs into
> a mutex deadlock. I'm running with a 4.18.8 kernel:
> 
> The code path appears as follows (note - the device has been removed
> so there is no possibility of it being recovered):
> 
> 	nvme_timeout --> nvme_warn_reset

We shouldn't get to nvme_timeout on a surprise removal. If you have
native pcie hotplug capable slots, the path ought to be:

  pciehp_isr
   remove_board
    pciehp_unconfigure_device
     pci_remove_bus_device
      device_release_driver
       nvme_remove
        nvme_disable

And that should immediately clear outstanding IO and prevent new IO from
entering the driver.

Given what you're observing, I think your situation must be one of the
following:

 1. You don't have PCIe hotplug capable hardware
 2. You don't have a PCIe hotplug capable kernel
 3. You are not using native PCIe hotplug with your platform
 
> 	nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
> 
> *The pci_read_config_word() returns PCIBIOS_SUCCESSFUL when the
> returned values show the device is detached. I'm not quite sure how
> the configuration read works with a detached device, but it's pretty
> clear the returned values are not valid.

Since you're seeing PCIBIOS_SUCCESSFUL, the host is actually dispatching
the config request to a non-existent device. The reply will be a PCIe
Unsupported Request since there is nothing backing the address you're
trying to read, and you will see this as an "all 1's completion" with
no other indication of failure.
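
Because the read reports success either way, the only hint that the device is
gone is the value itself. A minimal userspace sketch of that check follows. It
is not code from the driver: the helper names are hypothetical, and treating
all 1's as "device absent" is a heuristic that assumes the registers involved
never legitimately read back with every bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a 16-bit config-space read came back all 1's. */
static bool cfg_word_all_ones(uint16_t value)
{
	return value == 0xffff;
}

/* Hypothetical helper: true if a 32-bit register read came back all 1's. */
static bool reg_dword_all_ones(uint32_t value)
{
	return value == 0xffffffffu;
}

/*
 * Heuristic only: if both the controller status register and the PCI status
 * word read back as all 1's, assume nothing is backing the address any more.
 * The two arguments correspond to the CSTS and PCI_STATUS values printed in
 * the "controller is down" message quoted above.
 */
static bool device_probably_gone(uint32_t csts, uint16_t pci_status)
{
	return reg_dword_all_ones(csts) && cfg_word_all_ones(pci_status);
}
```

With the values from the quoted log line (CSTS=0xffffffff, PCI_STATUS=0xffff),
device_probably_gone() returns true, even though the config read itself
returned PCIBIOS_SUCCESSFUL.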

> Within nvme_timeout(), it begins the process to reset the controller:
> 
> 	nvme_reset_ctrl --> nvme_reset_work --> nvme_remove_dead_ctrl --> nvme_kill_queues --> nvme_set_queue_dying --> revalidate_disk
> 
> The thread hits a mutex deadlock after returning back to fs/block_dev.c::revalidate_disk(): mutex_lock(&bdev->bd_mutex)
> 
> This is the point where the driver appears completely stuck and never
> recovers. A power cycle or system reset is required to restore
> operation (reboot or shutdown hangs). I'm not sure what within
> block_dev.c is holding bd_mutex that would cause the deadlock and
> therefore it's very possible there is a cleaner solution than what I
> am using.

I'm not sure what could be holding it either. Maybe it's a task waiting
for an entered request to complete, in which case we should have the
nvme driver drain entered requests to failure in this path. The following
should be safe:

---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 800ee9b345f3..7c1330986a6c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2233,7 +2233,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev, int status)
 	dev_warn(dev->ctrl.device, "Removing after probe failure status: %d\n", status);
 
 	nvme_get_ctrl(&dev->ctrl);
-	nvme_dev_disable(dev, false);
+	nvme_dev_disable(dev, true);
 	nvme_kill_queues(&dev->ctrl);
 	if (!queue_work(nvme_wq, &dev->remove_work))
 		nvme_put_ctrl(&dev->ctrl);
--
