From: Keith Busch <kbusch@kernel.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-nvme@lists.infradead.org, Sagi Grimberg <sagi@grimberg.me>,
Yi Zhang <yi.zhang@redhat.com>
Subject: Re: [PATCH] nvme-pci: fix race between pci reset and nvme probe
Date: Mon, 1 Aug 2022 08:33:01 -0600
Message-ID: <YufkHXoUrw7jRLuT@kbusch-mbp>
In-Reply-To: <20220801125753.1434024-1-ming.lei@redhat.com>
On Mon, Aug 01, 2022 at 08:57:53PM +0800, Ming Lei wrote:
> After nvme_probe() returns, device lock is released, and PCI reset
> handler may come, meantime reset work is just scheduled and should
> be in-progress.
>
> When nvme_reset_prepare() is run, all NSs may not be allocated yet
> and each NS's request queue won't be frozen by nvme_dev_disable().
>
> But when nvme_reset_done() is called for resetting controller, all
> NSs may have been scanned successfully, and nvme_wait_freeze() is
> called on un-frozen request queues, then wait forever.
>
> Fix the issue by holding device lock for resetting from nvme probe.
>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs--KPTAGP=jj+7KMe=arDv=HeGeOgs1T8vbusyk=EjXow@mail.gmail.com/#r
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> drivers/nvme/host/pci.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 4232192e10dd..d49b1a082983 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3075,9 +3075,14 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
> static void nvme_async_probe(void *data, async_cookie_t cookie)
> {
> struct nvme_dev *dev = data;
> + struct pci_dev *pdev = to_pci_dev(dev->dev);
>
> + pci_dev_lock(pdev);
> + nvme_reset_ctrl(&dev->ctrl);
> flush_work(&dev->ctrl.reset_work);
> flush_work(&dev->ctrl.scan_work);
> + pci_dev_unlock(pdev);
> +
> nvme_put_ctrl(&dev->ctrl);
> }
When low on memory, async_schedule() falls back to calling the requested
function directly, so this would deadlock on taking the pci_dev_lock() the
second time within the probe context.
If it is successfully scheduled asynchronously, holding the lock blocks hot
removal, which might be the only thing that can unblock the nvme reset_work
and let it make forward progress.
If you are encountering an nvme_reset_prepare() condition during scanning, that
might indicate a failure to communicate with the end device. The scan work may
need the error handling to unblock it.
> @@ -3154,7 +3159,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
>
> - nvme_reset_ctrl(&dev->ctrl);
> async_schedule(nvme_async_probe, dev);
>
> return 0;
> --
> 2.31.1
>
Thread overview: 6+ messages
2022-08-01 12:57 [PATCH] nvme-pci: fix race between pci reset and nvme probe Ming Lei
2022-08-01 14:33 ` Keith Busch [this message]
2022-08-02 0:05 ` Ming Lei
2022-08-02 2:22 ` Keith Busch
2022-08-04 21:55 ` Keith Busch
2022-08-05 12:34 ` Ming Lei