* NVMe: Invalid controller status value, device hotplug out
@ 2014-07-22 17:25 Mundu
2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu
2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee
0 siblings, 2 replies; 5+ messages in thread
From: Mundu @ 2014-07-22 17:25 UTC (permalink / raw)
If device hotplug out, controller status become 0xFFFF_FFFF
1. Controller fatal status become true in nvme_kthread()
2. Ready bit nvere reset to 0 in nvme_wait_ready() for
controller disable.
above two scenarios controller status data is not a valid
data, since device/controller is already removed (hotplug)
^ permalink raw reply [flat|nested] 5+ messages in thread* [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) 2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu @ 2014-07-22 17:25 ` Mundu 2014-07-22 17:40 ` Keith Busch 2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee 1 sibling, 1 reply; 5+ messages in thread From: Mundu @ 2014-07-22 17:25 UTC (permalink / raw) Signed-off-by: Mundu <mundu2510 at gmail.com> --- drivers/block/nvme-core.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c index 28aec2d..6c6998e 100644 --- a/drivers/block/nvme-core.c +++ b/drivers/block/nvme-core.c @@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev, u64 cap, bool enabled) { unsigned long timeout; u32 bit = enabled ? NVME_CSTS_RDY : 0; + u32 csts; timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; - while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) { + while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) { + /* If device is hot plugout, csts become 0xFFFFFFFF */ + if (csts == -1) + return -ENODEV; msleep(100); if (fatal_signal_pending(current)) return -EINTR; @@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue *nvmeq) static int nvme_kthread(void *data) { struct nvme_dev *dev, *next; + u32 csts; while (!kthread_should_stop()) { set_current_state(TASK_INTERRUPTIBLE); spin_lock(&dev_list_lock); list_for_each_entry_safe(dev, next, &dev_list, node) { int i; - if (readl(&dev->bar->csts) & NVME_CSTS_CFS && + if ((csts = readl(&dev->bar->csts)) & NVME_CSTS_CFS && dev->initialized) { + /* If device is hot plugout, + csts become 0xFFFFFFFF */ + if (csts == -1) + continue; if (work_busy(&dev->reset_work)) continue; list_del_init(&dev->node); -- 1.9.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) 2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu @ 2014-07-22 17:40 ` Keith Busch 2014-07-22 17:57 ` mundu agarwal 0 siblings, 1 reply; 5+ messages in thread From: Keith Busch @ 2014-07-22 17:40 UTC (permalink / raw) On Tue, 22 Jul 2014, Mundu wrote: > Signed-off-by: Mundu <mundu2510 at gmail.com> > --- > drivers/block/nvme-core.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c > index 28aec2d..6c6998e 100644 > --- a/drivers/block/nvme-core.c > +++ b/drivers/block/nvme-core.c > @@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev, u64 cap, bool enabled) > { > unsigned long timeout; > u32 bit = enabled ? NVME_CSTS_RDY : 0; > + u32 csts; > > timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; > > - while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) { > + while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) { > + /* If device is hot plugout, csts become 0xFFFFFFFF */ > + if (csts == -1) > + return -ENODEV; In nvme_dev_map, we already return -ENODEV if reading CSTS returns -1. That happens before we call nvme_wait_ready. Are you just shorting the probe in the event someone hot removes a drive the moment after the driver ioremaps the BAR, but before it enables the device? > msleep(100); > if (fatal_signal_pending(current)) > return -EINTR; > @@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue *nvmeq) > static int nvme_kthread(void *data) > { > struct nvme_dev *dev, *next; > + u32 csts; > > while (!kthread_should_stop()) { > set_current_state(TASK_INTERRUPTIBLE); > spin_lock(&dev_list_lock); > list_for_each_entry_safe(dev, next, &dev_list, node) { > int i; > - if (readl(&dev->bar->csts) & NVME_CSTS_CFS && > + if ((csts = readl(&dev->bar->csts)) & NVME_CSTS_CFS && > dev->initialized) { > + /* If device is hot plugout, > + csts become 0xFFFFFFFF */ > + if (csts == -1) > + continue; You definitely want to let the reset work get scheduled here instead of continuing. Some platforms do not support "surprise removal", so this check guards against that by triggering a reset which will bail on the device if it appears gone and remove it from pci. > if (work_busy(&dev->reset_work)) > continue; > list_del_init(&dev->node); ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) 2014-07-22 17:40 ` Keith Busch @ 2014-07-22 17:57 ` mundu agarwal 0 siblings, 0 replies; 5+ messages in thread From: mundu agarwal @ 2014-07-22 17:57 UTC (permalink / raw) >> In nvme_dev_map, we already return -ENODEV if reading CSTS returns >> -1. That happens before we call nvme_wait_ready. Are you just shorting >> the probe in the event someone hot removes a drive the moment after the >> driver ioremaps the BAR, but before it enables the device? Yes On Tue, Jul 22, 2014@11:10 PM, Keith Busch <keith.busch@intel.com> wrote: > On Tue, 22 Jul 2014, Mundu wrote: >> >> Signed-off-by: Mundu <mundu2510 at gmail.com> >> --- >> drivers/block/nvme-core.c | 13 +++++++++++-- >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c >> index 28aec2d..6c6998e 100644 >> --- a/drivers/block/nvme-core.c >> +++ b/drivers/block/nvme-core.c >> @@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev, >> u64 cap, bool enabled) >> { >> unsigned long timeout; >> u32 bit = enabled ? NVME_CSTS_RDY : 0; >> + u32 csts; >> >> timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; >> >> - while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) { >> + while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) { >> + /* If device is hot plugout, csts become 0xFFFFFFFF */ >> + if (csts == -1) >> + return -ENODEV; > > > In nvme_dev_map, we already return -ENODEV if reading CSTS returns > -1. That happens before we call nvme_wait_ready. Are you just shorting > the probe in the event someone hot removes a drive the moment after the > driver ioremaps the BAR, but before it enables the device? > > >> msleep(100); >> if (fatal_signal_pending(current)) >> return -EINTR; >> @@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue >> *nvmeq) >> static int nvme_kthread(void *data) >> { >> struct nvme_dev *dev, *next; >> + u32 csts; >> >> while (!kthread_should_stop()) { >> set_current_state(TASK_INTERRUPTIBLE); >> spin_lock(&dev_list_lock); >> list_for_each_entry_safe(dev, next, &dev_list, node) { >> int i; >> - if (readl(&dev->bar->csts) & NVME_CSTS_CFS && >> + if ((csts = readl(&dev->bar->csts)) & >> NVME_CSTS_CFS && >> dev->initialized) >> { >> + /* If device is hot plugout, >> + csts become 0xFFFFFFFF */ >> + if (csts == -1) >> + continue; > > > You definitely want to let the reset work get scheduled here instead of > continuing. Some platforms do not support "surprise removal", so this > check guards against that by triggering a reset which will bail on the > device if it appears gone and remove it from pci. > > >> if (work_busy(&dev->reset_work)) >> continue; >> list_del_init(&dev->node); ^ permalink raw reply [flat|nested] 5+ messages in thread
* NVMe: Invalid controller status value, device hotplug out 2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu 2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu @ 2014-07-23 14:24 ` J Freyensee 1 sibling, 0 replies; 5+ messages in thread From: J Freyensee @ 2014-07-23 14:24 UTC (permalink / raw) On 07/22/2014 10:25 AM, Mundu wrote: > If device hotplug out, controller status become 0xFFFF_FFFF > 1. Controller fatal status become true in nvme_kthread() > 2. Ready bit nvere reset to 0 in nvme_wait_ready() for > controller disable. > above two scenarios controller status data is not a valid > data, since device/controller is already removed (hotplug) > I believe you want this comment to be in the actual patch you sent?? The patch you sent after this email has no description, something Matthew pointed out previously. > _______________________________________________ > Linux-nvme mailing list > Linux-nvme at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-07-23 14:24 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu 2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu 2014-07-22 17:40 ` Keith Busch 2014-07-22 17:57 ` mundu agarwal 2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox