From: Bjorn Helgaas <helgaas@kernel.org>
To: JD Zheng <jiandong.zheng@broadcom.com>
Cc: linux-pci@vger.kernel.org, keith.busch@intel.com,
bcm-kernel-feedback-list@broadcom.com,
Alex Williamson <alex.williamson@redhat.com>,
Lukas Wunner <lukas@wunner.de>
Subject: Re: SSD surprise removal leads to long wait inside pci_dev_wait() and FLR 65s timeout
Date: Sun, 2 Jun 2019 19:44:14 -0500
Message-ID: <20190603004414.GA189360@google.com>
In-Reply-To: <8f2d88a5-9524-c4c3-a61f-7d55d97e1c18@broadcom.com>

[+cc Alex, Lukas]

On Fri, May 31, 2019 at 09:55:20AM -0700, JD Zheng wrote:
> Hello,
>
> I am running DPDK 18.11 + SPDK 19.04 on a v5.1 kernel. DPDK/SPDK uses SSDs as
> vfio devices, and after starting SPDK's nvmf_tgt, unplugging an SSD causes the
> kernel to print the following:
> [ 105.426952] vfio-pci 0000:04:00.0: not ready 2047ms after FLR; waiting
> [ 107.698953] vfio-pci 0000:04:00.0: not ready 4095ms after FLR; waiting
> [ 112.050960] vfio-pci 0000:04:00.0: not ready 8191ms after FLR; waiting
> [ 120.498953] vfio-pci 0000:04:00.0: not ready 16383ms after FLR; waiting
> [ 138.418957] vfio-pci 0000:04:00.0: not ready 32767ms after FLR; waiting
> [ 173.234953] vfio-pci 0000:04:00.0: not ready 65535ms after FLR; giving up
>
> This looks like a PCI hotplug race condition between DPDK's
> eal-intr-thread and the kernel's pciehp thread, and it causes a lockup in
> pci_dev_wait() on the kernel side.
>
> When the SSD is removed, eal-intr-thread immediately receives
> RTE_INTR_HANDLE_ALARM and its handler calls rte_pci_detach_dev(). On the
> kernel side this triggers vfio_pci_release() to release the vfio device, which
> calls pci_try_reset_function() and then _pci_reset_function_locked().
> pci_try_reset_function() acquires the device lock, but
> _pci_reset_function_locked() doesn't return, so the lock is NOT released.
>
> Inside _pci_reset_function_locked(), pcie_has_flr(), pci_pm_reset(), etc.
> call pci_dev_wait() at the end, but it doesn't return; it keeps printing the
> messages above until the 65s timeout.
>
> On the kernel pciehp side, the removal is also detected, but pciehp doesn't
> run immediately because it is configured with "pciehp.pciehp_poll_time=5". So
> a couple of seconds later, it calls pciehp_unconfigure_device -> pci_walk_bus
> -> pci_dev_set_disconnected. pci_dev_set_disconnected() can't get the device
> lock and is stuck too, because the lock is held by eal-intr-thread.
>
> The first issue is in pci_dev_wait(). It calls pci_read_config_dword() and
> returns only when id is not all ones. But when the SSD is physically
> removed, the id read back is always all ones, so it has to wait for the 65s
> FLR timeout before returning.
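
For illustration only, here is a minimal user-space sketch of the doubling wait described above. It is a model, not the kernel source: the name sim_dev_wait(), the 60000 ms limit, and the placeholder ID value are assumptions chosen so that the printed intervals line up with the log quoted earlier.

```c
#include <stdint.h>
#include <stdio.h>

/* Outcome of the simulated wait: rc is 0 for "ready", -1 for "giving up". */
struct wait_result {
	int rc;
	int waited_ms;
};

/*
 * Hypothetical user-space model of the polling loop (not kernel code).
 * id_reads_all_ones: nonzero models a removed device whose ID reads as ~0.
 * timeout_ms: assumed limit; 60000 is used here purely for illustration.
 */
static struct wait_result sim_dev_wait(int id_reads_all_ones, int timeout_ms)
{
	struct wait_result res = { 0, 0 };
	int delay = 1;
	/* 0x00100006 is an arbitrary "device answered" value. */
	uint32_t id = id_reads_all_ones ? UINT32_MAX : 0x00100006u;

	while (id == UINT32_MAX) {
		if (delay > timeout_ms) {
			printf("not ready %dms; giving up\n", delay - 1);
			res.rc = -1;
			return res;
		}
		if (delay > 1000)
			printf("not ready %dms; waiting\n", delay - 1);

		res.waited_ms += delay;	/* stands in for msleep(delay) */
		delay *= 2;
		/* A removed device keeps reading as all ones, so id never changes. */
	}
	return res;
}
```

In this model, sim_dev_wait(1, 60000) accumulates 1 + 2 + ... + 32768 = 65535 ms of sleeping before it gives up, which is where the "65s" figure comes from.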
>
> I made the following change to check the return value of
> pci_read_config_dword() to fix this:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4439,7 +4439,11 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>
>  		msleep(delay);
>  		delay *= 2;
> -		pci_read_config_dword(dev, PCI_COMMAND, &id);
> +		if (pci_read_config_dword(dev, PCI_COMMAND, &id) ==
> +		    PCIBIOS_DEVICE_NOT_FOUND) {
> +			pci_info(dev, "device disconnected\n");
> +			return -ENODEV;
> +		}
>  	}
>
>  	if (delay > 1000)
>
> The second issue is that, due to the lockup described above,
> pci_dev_set_disconnected() is stuck, so pci_read_config_dword() won't return
> PCIBIOS_DEVICE_NOT_FOUND.
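
As a rough sketch of how the two issues interact, the user-space model below checks the read's return code inside the loop, as the patch above proposes. Everything here is hypothetical: struct sim_dev, sim_read_id(), the SIM_* constants, and the idea of marking the device disconnected at a fixed time are stand-ins for illustration, and the 60000 ms limit is assumed.

```c
#include <stdint.h>

#define SIM_SUCCESSFUL		0
#define SIM_DEVICE_NOT_FOUND	0x86	/* stand-in for PCIBIOS_DEVICE_NOT_FOUND */

enum sim_wait_rc { SIM_WAIT_READY, SIM_WAIT_GONE, SIM_WAIT_TIMEOUT };

/* Hypothetical removed device; the hotplug path marks it at mark_after_ms. */
struct sim_dev {
	int mark_after_ms;	/* < 0 means "never marked" (hotplug path stuck) */
};

/* Model of a config read on a removed device: data is always all ones. */
static int sim_read_id(const struct sim_dev *dev, int now_ms, uint32_t *id)
{
	*id = UINT32_MAX;
	if (dev->mark_after_ms >= 0 && now_ms >= dev->mark_after_ms)
		return SIM_DEVICE_NOT_FOUND;	/* only once marked disconnected */
	return SIM_SUCCESSFUL;
}

/* Returns the simulated ms spent sleeping; *rc reports why the wait ended. */
static int sim_dev_wait_checked(const struct sim_dev *dev, int timeout_ms,
				enum sim_wait_rc *rc)
{
	int delay = 1, waited = 0;
	uint32_t id;

	sim_read_id(dev, waited, &id);	/* first read is unchecked, as in the patch */
	while (id == UINT32_MAX) {
		if (delay > timeout_ms) {
			*rc = SIM_WAIT_TIMEOUT;
			return waited;
		}
		waited += delay;	/* stands in for msleep(delay) */
		delay *= 2;
		if (sim_read_id(dev, waited, &id) == SIM_DEVICE_NOT_FOUND) {
			*rc = SIM_WAIT_GONE;	/* early exit added by the patch */
			return waited;
		}
	}
	*rc = SIM_WAIT_READY;
	return waited;
}
```

With mark_after_ms = 5000 the model stops at the first poll after the mark (8191 ms of sleeping). With mark_after_ms = -1, which models pci_dev_set_disconnected() never completing, the return-code check never fires and the wait still runs to 65535 ms.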
>
> I didn't find an easy way to fix it. Maybe using the device lock in
> pci_dev_set_disconnected() is too coarse and we need a finer-grained device
> err_state lock?
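
One possible reading of the "finer err_state lock" idea, sketched in user space with C11 atomics. This is not a proposed kernel change: struct sim_dev, the dev_locked flag standing in for the device lock, and all sim_* helpers are hypothetical names used only to show the state being updated without the coarse lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum sim_err_state { SIM_STATE_NORMAL, SIM_STATE_DISCONNECTED };

/* Hypothetical device model. */
struct sim_dev {
	bool dev_locked;	/* models the coarse device lock held by the reset path */
	atomic_int err_state;	/* finer-grained state, updated without dev_locked */
};

/* Coarse variant: cannot make progress while the device lock is held. */
static bool sim_set_disconnected_coarse(struct sim_dev *dev)
{
	if (dev->dev_locked)
		return false;	/* a real lock would block here instead of returning */
	atomic_store(&dev->err_state, SIM_STATE_DISCONNECTED);
	return true;
}

/* Fine-grained variant: the state change does not depend on the device lock. */
static void sim_set_disconnected_fine(struct sim_dev *dev)
{
	atomic_store(&dev->err_state, SIM_STATE_DISCONNECTED);
}

/* What a waiter such as the reset path could poll to bail out early. */
static bool sim_is_disconnected(struct sim_dev *dev)
{
	return atomic_load(&dev->err_state) == SIM_STATE_DISCONNECTED;
}
```

In this model, with dev_locked set, the coarse setter makes no progress while the fine-grained setter still flips the state, so a waiter polling sim_is_disconnected() would see the removal. Whether an atomic is sufficient for the real error-state transitions is a separate question that this sketch does not answer.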
>
> BTW, pci_dev_set_disconnected() wasn't using the device lock until commit
> a6bd101b8f.
>
> Any suggestions to fix this problem?

Would you mind opening a report at https://bugzilla.kernel.org and
attaching the complete dmesg log and "lspci -vv" output?

Out of curiosity, why do you use "pciehp.pciehp_poll_time=5"?

Bjorn

Thread overview: 6+ messages
2019-05-31 16:55 SSD surprise removal leads to long wait inside pci_dev_wait() and FLR 65s timeout JD Zheng
2019-06-03 0:44 ` Bjorn Helgaas [this message]
2019-06-03 1:40 ` Alex Williamson
2019-06-03 21:17 ` JD Zheng
2019-06-03 22:40 ` Alex Williamson
2019-06-03 23:05 ` Bjorn Helgaas