From: keith.busch@intel.com (Keith Busch)
Subject: Bug Report: can't unload nvme module in case of disabled device
Date: Thu, 10 Aug 2017 15:34:23 -0400 [thread overview]
Message-ID: <20170810193423.GA1180@localhost.localdomain> (raw)
In-Reply-To: <20170810191717.GA1604@localhost.localdomain>
On Thu, Aug 10, 2017@03:17:17PM -0400, Keith Busch wrote:
> On Thu, Aug 10, 2017@12:45:36PM -0400, Keith Busch wrote:
> > On Tue, Aug 01, 2017@03:58:10PM +0300, Max Gurtovoy wrote:
> > > Hi all,
> > >
> > > I would like to report a bug that reproduced by the following steps (I'm
> > > using 4.13.0-rc3+):
> > >
> > > 1. modprobe nvme
> > > 2. echo 0 > /sys/block/nvme0n1/device/device/enable
> > > 3. nvme list (stuck for more than 1-2 mins)
> > > 4. modprobe -r nvme (stuck forever)
> > >
> > > log:
> > >
> > > [ 1342.388888] nvme nvme0: controller is down; will reset: CSTS=0x3,
> > > PCI_STATUS=0x10
> > > [ 1476.021392] INFO: task kworker/u98:1:436 blocked for more than 120
> > > seconds.
> > > [ 1476.029072] Not tainted 4.13.0-rc3+ #19
> > > [ 1476.033878] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> > > this message.
> > > [ 1476.042505] kworker/u98:1 D 0 436 2 0x00000000
> > > [ 1476.048569] Workqueue: nvme-wq nvme_reset_work [nvme]
> > > [ 1476.054133] Call Trace:
> > > [ 1476.056862] __schedule+0x1dc/0x780
> > > [ 1476.060706] schedule+0x36/0x80
> > > [ 1476.064180] blk_mq_freeze_queue_wait+0x4b/0xb0
> > > [ 1476.069175] ? remove_wait_queue+0x60/0x60
> > > [ 1476.073693] nvme_wait_freeze+0x33/0x50 [nvme_core]
> > > [ 1476.079068] nvme_reset_work+0x6b9/0xc40 [nvme]
> > > [ 1476.084075] ? __switch_to+0x23e/0x4a0
> > > [ 1476.088209] process_one_work+0x149/0x360
> > > [ 1476.092625] worker_thread+0x4d/0x3c0
> > > [ 1476.096692] kthread+0x109/0x140
> > > [ 1476.100247] ? rescuer_thread+0x380/0x380
> > > [ 1476.104664] ? kthread_park+0x60/0x60
> > > [ 1476.108698] ret_from_fork+0x25/0x30
> >
> > This looks like a path does not pair the freeze start with the reset's
> > freeze wait. I'll have to see what the pci 'enable' sysfs entry does.
>
> I see how the freeze start/stops are not paired in this scenario:
> nvme_dev_disable doesn't start the freeze if the pci device isn't
> disabled. It uses this to know if it is disabling the device twice.
>
> In this test, though, you are disabling the pci device without the
> driver's knowledge, so that breaks that logic. In light of that, we'll
> need different criteria to know when the driver should start a freeze.
> I'll test some things out and send a patch.
This should fix it for your scenario, but I am not completely sure this
can't get the freeze depth higher than we need.
---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cd888a4..ca03980 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2006,12 +2006,14 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
struct pci_dev *pdev = to_pci_dev(dev->dev);
mutex_lock(&dev->shutdown_lock);
+
+ if (dev->ctrl.state == NVME_CTRL_LIVE ||
+ dev->ctrl.state == NVME_CTRL_RESETTING)
+ nvme_start_freeze(&dev->ctrl);
+
if (pci_is_enabled(pdev)) {
u32 csts = readl(dev->bar + NVME_REG_CSTS);
- if (dev->ctrl.state == NVME_CTRL_LIVE ||
- dev->ctrl.state == NVME_CTRL_RESETTING)
- nvme_start_freeze(&dev->ctrl);
dead = !!((csts & NVME_CSTS_CFS) || !(csts & NVME_CSTS_RDY) ||
pdev->error_state != pci_channel_io_normal);
}
--
prev parent reply other threads:[~2017-08-10 19:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-01 12:58 Bug Report: can't unload nvme module in case of disabled device Max Gurtovoy
2017-08-10 8:59 ` Christoph Hellwig
2017-08-10 17:04 ` Max Gurtovoy
2017-08-10 19:36 ` Keith Busch
2017-08-13 8:29 ` Max Gurtovoy
2017-08-14 20:24 ` Keith Busch
2017-08-10 16:45 ` Keith Busch
2017-08-10 19:17 ` Keith Busch
2017-08-10 19:34 ` Keith Busch [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170810193423.GA1180@localhost.localdomain \
--to=keith.busch@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox