* [PATCHv2 0/2] PCI: driver function reset notification @ 2014-01-13 22:26 Keith Busch 2014-01-13 22:26 ` [PATCHv2 1/2] PCI: call pci reset callback to pci_driver on FLR Keith Busch ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Keith Busch @ 2014-01-13 22:26 UTC (permalink / raw) To: linux-pci, alex.williamson, bhelgaas; +Cc: Keith Busch Here's version 2 of this patch with a driver implementing the intended use as an example. The NVMe stuff requires using the maintainer's tree to get the newly added nvme reset handling code. Willy's repo is located here: git.infradead.org/users/willy/linux-nvme.git v1->v2: As suggested, I'm reusing the slot_reset error handler instead of defining a new one for function_reset. I moved invoking the callback further up the this call stack. The test case I use resets the device via sysfs, and the pci device's command register is cleared at the previous point, so the callback couldn't actually do anything useful other than schedule something to handle it after pci_dev_restore is called. The previous location would break other driver slot_reset implementations and make my nvme implementation a little more complicated. Actually ... I'm a little concered to be using slot_reset instead of defining a new callback for FLR. From looking at other device drivers, I'm not sure they would expect to have their slot_reset invoked in this situation. Keith Busch (2): PCI: call pci reset callback to pci_driver on FLR NVMe: Implement pci reset callback drivers/block/nvme-core.c | 36 ++++++++++++++++++++++++++++++++++-- drivers/pci/pci.c | 6 ++++++ include/linux/nvme.h | 1 + 3 files changed, 41 insertions(+), 2 deletions(-) -- 1.7.10.4 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCHv2 1/2] PCI: call pci reset callback to pci_driver on FLR 2014-01-13 22:26 [PATCHv2 0/2] PCI: driver function reset notification Keith Busch @ 2014-01-13 22:26 ` Keith Busch 2014-01-13 22:26 ` [PATCHv2 2/2] NVMe: Implement pci reset callback Keith Busch 2014-01-16 18:49 ` [PATCHv2 0/2] PCI: driver function reset notification Bjorn Helgaas 2 siblings, 0 replies; 7+ messages in thread From: Keith Busch @ 2014-01-13 22:26 UTC (permalink / raw) To: linux-pci, alex.williamson, bhelgaas; +Cc: Keith Busch A user can issue a pci function level reset to a device using sysfs entry /sys/bus/pci/devices/.../reset. A kernel driver handling the pci device probably would like to know that such a reset has occured, so this patch calls the pci_driver's slot_reset pci_error_handler. Signed-off-by: Keith Busch <keith.busch@intel.com> --- drivers/pci/pci.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index b127fbda..1bfddc5 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3584,6 +3584,12 @@ int pci_reset_function(struct pci_dev *dev) pci_dev_restore(dev); + if (!rc) { + const struct pci_error_handlers *err_handler = dev->driver ? + dev->driver->err_handler : NULL; + if (err_handler && err_handler->slot_reset) + err_handler->slot_reset(dev); + } return rc; } EXPORT_SYMBOL_GPL(pci_reset_function); -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCHv2 2/2] NVMe: Implement pci reset callback 2014-01-13 22:26 [PATCHv2 0/2] PCI: driver function reset notification Keith Busch 2014-01-13 22:26 ` [PATCHv2 1/2] PCI: call pci reset callback to pci_driver on FLR Keith Busch @ 2014-01-13 22:26 ` Keith Busch 2014-01-16 18:49 ` [PATCHv2 0/2] PCI: driver function reset notification Bjorn Helgaas 2 siblings, 0 replies; 7+ messages in thread From: Keith Busch @ 2014-01-13 22:26 UTC (permalink / raw) To: linux-pci, alex.williamson, bhelgaas; +Cc: Keith Busch Initiates nvme controller reset sequence when pci layer notifies of slot reset. The driver's slot_reset handling is skipped if the nvme polling kthread happened to try accessing a device register while the reset was in progress, which would have scheduled the nvme device to be reset anyway. Signed-off-by: Keith Busch <keith.busch@intel.com> --- drivers/block/nvme-core.c | 36 ++++++++++++++++++++++++++++++++++-- include/linux/nvme.h | 1 + 2 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c index b59a93a..1f5c18a 100644 --- a/drivers/block/nvme-core.c +++ b/drivers/block/nvme-core.c @@ -1750,6 +1750,7 @@ static int nvme_kthread(void *data) dev->initialized) { if (work_busy(&dev->reset_work)) continue; + dev->reset = 1; list_del_init(&dev->node); dev_warn(&dev->pci_dev->dev, "Failed status, reset controller\n"); @@ -2088,6 +2089,7 @@ static int nvme_dev_map(struct nvme_dev *dev) } dev->db_stride = 1 << NVME_CAP_STRIDE(readq(&dev->bar->cap)); dev->dbs = ((void __iomem *)dev->bar) + 4096; + dev->reset = 0; return 0; @@ -2265,7 +2267,10 @@ static void nvme_dev_shutdown(struct nvme_dev *dev) list_del_init(&dev->node); spin_unlock(&dev_list_lock); - if (!dev->bar || (dev->bar && readl(&dev->bar->csts) == -1)) { + if (!dev->bar || (dev->bar && readl(&dev->bar->csts) == -1) || + dev->reset) { + if (dev->bar) + nvme_shutdown_ctrl(dev); for (i = dev->queue_count - 1; i >= 0; i--) { struct nvme_queue *nvmeq = dev->queues[i]; nvme_suspend_queue(nvmeq); @@ -2575,7 +2580,6 @@ static void nvme_remove(struct pci_dev *pdev) #define nvme_error_detected NULL #define nvme_dump_registers NULL #define nvme_link_reset NULL -#define nvme_slot_reset NULL #define nvme_error_resume NULL static int nvme_suspend(struct device *dev) @@ -2599,6 +2603,34 @@ static int nvme_resume(struct device *dev) return 0; } + +/** + * nvme_slot_reset - handle nvme device that has been reset + * @pdev: pci device that has been through a pci slot or function reset + * + * Called by pci error recovery. We assume the device is in a state requiring + * initialization, so set the device reset flag first to prevent wasting time + * taking the controller offline with admin commands that we expect to timeout. + */ +static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev) +{ + struct nvme_dev *dev = pci_get_drvdata(pdev); + + spin_lock(&dev_list_lock); + if (!work_busy(&dev->reset_work)) { + list_del_init(&dev->node); + dev->reset = 1; + } else { + spin_unlock(&dev_list_lock); + return PCI_ERS_RESULT_RECOVERED; + } + spin_unlock(&dev_list_lock); + + nvme_dev_reset(dev); + return dev->initialized ? PCI_ERS_RESULT_RECOVERED : + PCI_ERS_RESULT_DISCONNECT; +} + static SIMPLE_DEV_PM_OPS(nvme_dev_pm_ops, nvme_suspend, nvme_resume); static const struct pci_error_handlers nvme_err_handler = { diff --git a/include/linux/nvme.h b/include/linux/nvme.h index 69ae03f..febce6f 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -97,6 +97,7 @@ struct nvme_dev { u16 oncs; u16 abort_limit; u8 initialized; + u8 reset; }; /* -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCHv2 0/2] PCI: driver function reset notification 2014-01-13 22:26 [PATCHv2 0/2] PCI: driver function reset notification Keith Busch 2014-01-13 22:26 ` [PATCHv2 1/2] PCI: call pci reset callback to pci_driver on FLR Keith Busch 2014-01-13 22:26 ` [PATCHv2 2/2] NVMe: Implement pci reset callback Keith Busch @ 2014-01-16 18:49 ` Bjorn Helgaas 2014-01-16 19:10 ` Alex Williamson 2 siblings, 1 reply; 7+ messages in thread From: Bjorn Helgaas @ 2014-01-16 18:49 UTC (permalink / raw) To: Keith Busch; +Cc: linux-pci@vger.kernel.org, alex.williamson@redhat.com On Mon, Jan 13, 2014 at 3:26 PM, Keith Busch <keith.busch@intel.com> wrote: > Here's version 2 of this patch with a driver implementing the intended > use as an example. > > The NVMe stuff requires using the maintainer's tree to get the newly > added nvme reset handling code. Willy's repo is located here: > > git.infradead.org/users/willy/linux-nvme.git > > v1->v2: > > As suggested, I'm reusing the slot_reset error handler instead of defining > a new one for function_reset. > > I moved invoking the callback further up the this call stack. The test > case I use resets the device via sysfs, and the pci device's command > register is cleared at the previous point, so the callback couldn't > actually do anything useful other than schedule something to handle > it after pci_dev_restore is called. The previous location would break > other driver slot_reset implementations and make my nvme implementation > a little more complicated. There's now a pci_try_reset_function(), and something like this callback would have to be done in that path, too. > Actually ... I'm a little concered to be using slot_reset instead of > defining a new callback for FLR. From looking at other device drivers, > I'm not sure they would expect to have their slot_reset invoked in > this situation. I haven't looked at other slot_reset callbacks. Do you have an example we can look at and talk about? The existing model (even without your changes) allows a user to start a reset via sysfs while the driver is still active. What happens when the reset occurs while the driver is programming the device? For example, if the driver sets up a DMA transfer address, the reset occurs (destroying the address), and the driver initiates the DMA, the DMA will go to the wrong place. I'm not sure how to fix this model of resets happening asynchronous to the driver. Maybe we need to tell the driver *before* the reset. Maybe we need to ask the driver to do the reset itself, and only do it in the core if no driver is attached. Maybe we need to make the reset look like a hotplug remove/add to the driver, so we detach the driver, do the reset, and reattach a driver. In the general case, we don't know what the device *is* after a reset because it could have loaded new firmware. It could require more resources or even a different driver. I know it would cause Alex heartburn to make reset look like hotplug. What sort of NVMe problems would that cause? I assume most drivers will have to treat a device coming out of reset basically the same way as a brand new hot-added device. Bjorn ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCHv2 0/2] PCI: driver function reset notification 2014-01-16 18:49 ` [PATCHv2 0/2] PCI: driver function reset notification Bjorn Helgaas @ 2014-01-16 19:10 ` Alex Williamson 2014-01-16 20:38 ` Keith Busch 2014-01-17 17:26 ` Bjorn Helgaas 0 siblings, 2 replies; 7+ messages in thread From: Alex Williamson @ 2014-01-16 19:10 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Keith Busch, linux-pci@vger.kernel.org On Thu, 2014-01-16 at 11:49 -0700, Bjorn Helgaas wrote: > On Mon, Jan 13, 2014 at 3:26 PM, Keith Busch <keith.busch@intel.com> wrote: > > Here's version 2 of this patch with a driver implementing the intended > > use as an example. > > > > The NVMe stuff requires using the maintainer's tree to get the newly > > added nvme reset handling code. Willy's repo is located here: > > > > git.infradead.org/users/willy/linux-nvme.git > > > > v1->v2: > > > > As suggested, I'm reusing the slot_reset error handler instead of defining > > a new one for function_reset. > > > > I moved invoking the callback further up the this call stack. The test > > case I use resets the device via sysfs, and the pci device's command > > register is cleared at the previous point, so the callback couldn't > > actually do anything useful other than schedule something to handle > > it after pci_dev_restore is called. The previous location would break > > other driver slot_reset implementations and make my nvme implementation > > a little more complicated. > > There's now a pci_try_reset_function(), and something like this > callback would have to be done in that path, too. > > > Actually ... I'm a little concered to be using slot_reset instead of > > defining a new callback for FLR. From looking at other device drivers, > > I'm not sure they would expect to have their slot_reset invoked in > > this situation. > > I haven't looked at other slot_reset callbacks. Do you have an > example we can look at and talk about? > > The existing model (even without your changes) allows a user to start > a reset via sysfs while the driver is still active. What happens when > the reset occurs while the driver is programming the device? For > example, if the driver sets up a DMA transfer address, the reset > occurs (destroying the address), and the driver initiates the DMA, the > DMA will go to the wrong place. > > I'm not sure how to fix this model of resets happening asynchronous to > the driver. Maybe we need to tell the driver *before* the reset. > Maybe we need to ask the driver to do the reset itself, and only do it > in the core if no driver is attached. Maybe we need to make the reset > look like a hotplug remove/add to the driver, so we detach the driver, > do the reset, and reattach a driver. > > In the general case, we don't know what the device *is* after a reset > because it could have loaded new firmware. It could require more > resources or even a different driver. > > I know it would cause Alex heartburn to make reset look like hotplug. Yes it would. I agree that it's theoretically possible for a device to undergo a metamorphosis to something new at reset, but in practice it just doesn't happen. Perhaps a reset with no driver attached could completely remove, reset, and rescan the device, but that might still cause problems with legacy KVM device assignment if pci-stub is not used to hold the device. For a device exposed to userspace through vfio, we want to be able to reset the device, but releasing the device and trying to reacquire it would be challenging to say the least. Thanks, Alex > What sort of NVMe problems would that cause? I assume most drivers > will have to treat a device coming out of reset basically the same way > as a brand new hot-added device. > > Bjorn ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCHv2 0/2] PCI: driver function reset notification 2014-01-16 19:10 ` Alex Williamson @ 2014-01-16 20:38 ` Keith Busch 2014-01-17 17:26 ` Bjorn Helgaas 1 sibling, 0 replies; 7+ messages in thread From: Keith Busch @ 2014-01-16 20:38 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, Keith Busch, linux-pci@vger.kernel.org On Thu, 16 Jan 2014, Alex Williamson wrote: > On Thu, 2014-01-16 at 11:49 -0700, Bjorn Helgaas wrote: >> On Mon, Jan 13, 2014 at 3:26 PM, Keith Busch <keith.busch@intel.com> wrote: ... >>> Actually ... I'm a little concered to be using slot_reset instead of >>> defining a new callback for FLR. From looking at other device drivers, >>> I'm not sure they would expect to have their slot_reset invoked in >>> this situation. >> >> I haven't looked at other slot_reset callbacks. Do you have an >> example we can look at and talk about? I'm comparing with mpt[2,3]sas_scsih.c. It looks like their slot_reset requires pci_errhandlers' "err_detected" called prior to slot_reset, otherwise they'll fail the device remapping. >> The existing model (even without your changes) allows a user to start >> a reset via sysfs while the driver is still active. What happens when >> the reset occurs while the driver is programming the device? For >> example, if the driver sets up a DMA transfer address, the reset >> occurs (destroying the address), and the driver initiates the DMA, the >> DMA will go to the wrong place. >> >> I'm not sure how to fix this model of resets happening asynchronous to >> the driver. Maybe we need to tell the driver *before* the reset. >> Maybe we need to ask the driver to do the reset itself, and only do it >> in the core if no driver is attached. Maybe we need to make the reset >> look like a hotplug remove/add to the driver, so we detach the driver, >> do the reset, and reattach a driver. Ha, Willy actually recommended I add two calls before I sent the first patch! Something like pci_reset_prepare(), pci_reset_complete(). Now that I think about it for my driver, these calls would do be identical to suspend()/resume(). Can we use those?! >> In the general case, we don't know what the device *is* after a reset >> because it could have loaded new firmware. It could require more >> resources or even a different driver. Activating new firmware is the exact use I'm going for here. I've been recommending users manually remove/reset/rescan the device with the appriate sysfs writes. It sounds like your suggesting we make all that automatically happen with only need to hit reset. It turns out making this look like hotplug isn't desirable here because the device's storage may be mounted partitions (sounds a little scary to me), and a removal would take down opened disks. >> I know it would cause Alex heartburn to make reset look like hotplug. > > Yes it would. I agree that it's theoretically possible for a device to > undergo a metamorphosis to something new at reset, but in practice it > just doesn't happen. Perhaps a reset with no driver attached could > completely remove, reset, and rescan the device, but that might still > cause problems with legacy KVM device assignment if pci-stub is not used > to hold the device. For a device exposed to userspace through vfio, we > want to be able to reset the device, but releasing the device and trying > to reacquire it would be challenging to say the least. Thanks, > > Alex > >> What sort of NVMe problems would that cause? I assume most drivers >> will have to treat a device coming out of reset basically the same way >> >> Bjorn Thanks for the feedback! Keith ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCHv2 0/2] PCI: driver function reset notification 2014-01-16 19:10 ` Alex Williamson 2014-01-16 20:38 ` Keith Busch @ 2014-01-17 17:26 ` Bjorn Helgaas 1 sibling, 0 replies; 7+ messages in thread From: Bjorn Helgaas @ 2014-01-17 17:26 UTC (permalink / raw) To: Alex Williamson; +Cc: Keith Busch, linux-pci@vger.kernel.org On Thu, Jan 16, 2014 at 12:10 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Thu, 2014-01-16 at 11:49 -0700, Bjorn Helgaas wrote: >> In the general case, we don't know what the device *is* after a reset >> because it could have loaded new firmware. It could require more >> resources or even a different driver. >> >> I know it would cause Alex heartburn to make reset look like hotplug. > > Yes it would. I agree that it's theoretically possible for a device to > undergo a metamorphosis to something new at reset, but in practice it > just doesn't happen. Maybe not in your world, but this definitely happens, and it definitely causes hassles for people. Some resort to hacks like rebooting the whole system because we don't support this well. I mentioned these before, so you may have seen them already: http://forums.xilinx.com/t5/PCI-Express/PC-is-hung-if-run-the-second-program-FPGA-by-JTAG/td-p/20871 http://www.alteraforum.com/forum/archive/index.php/t-28477.html http://www.alteraforum.com/forum/showthread.php?t=35091 https://github.com/NetFPGA/NetFPGA-public/wiki/Restore-PCIE-Configuration http://lkml.indiana.edu/hypermail/linux/kernel/1104.3/02544.html My fear is that we're building a lot of stuff based on assumptions about how things typically operate. That works most of the time, but it does fail on important corner cases that are spec-compliant but atypical. From the PCI subsystem point of view, that makes me uneasy. Bjorn ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-01-17 17:27 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-01-13 22:26 [PATCHv2 0/2] PCI: driver function reset notification Keith Busch 2014-01-13 22:26 ` [PATCHv2 1/2] PCI: call pci reset callback to pci_driver on FLR Keith Busch 2014-01-13 22:26 ` [PATCHv2 2/2] NVMe: Implement pci reset callback Keith Busch 2014-01-16 18:49 ` [PATCHv2 0/2] PCI: driver function reset notification Bjorn Helgaas 2014-01-16 19:10 ` Alex Williamson 2014-01-16 20:38 ` Keith Busch 2014-01-17 17:26 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).