All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org, Sagi Grimberg <sagi@grimberg.me>,
	Yi Zhang <yi.zhang@redhat.com>,
	ming.lei@redhat.com
Subject: Re: [PATCH] nvme-pci: fix race between pci reset and nvme probe
Date: Fri, 5 Aug 2022 20:34:38 +0800	[thread overview]
Message-ID: <Yu0OXgQo/HdusMEu@T590> (raw)
In-Reply-To: <YuxAQmHrwLGL4Wap@kbusch-mbp.dhcp.thefacebook.com>

On Thu, Aug 04, 2022 at 03:55:14PM -0600, Keith Busch wrote:
> On Mon, Aug 01, 2022 at 08:57:53PM +0800, Ming Lei wrote:
> > After nvme_probe() returns, device lock is released, and PCI reset
> > handler may come, meantime reset work is just scheduled and should
> > be in-progress.
> > 
> > When nvme_reset_prepare() is run, all NSs may not be allocated yet
> > and each NS's request queue won't be frozen by nvme_dev_disable().
> > 
> > But when nvme_reset_done() is called for resetting controller, all
> > NSs may have been scanned successfully, and nvme_wait_freeze() is
> > called on un-frozen request queues, then wait forever.
> > 
> > Fix the issue by holding device lock for resetting from nvme probe.
> 
> Beyond the issues with locking the pci device, I think the problem you're
> describing is generic to any rescan that alters the namespace inventory, so we
> can't resolve it only on probe.

Yeah, just it is easier to trigger during the 1st scan when no any NS
exits.

> 
> Is it enough to track the frozen status within the namespace flags? Something
> like the following:

I guess the patch could avoid the issue.

Another way might be to remove nvme_wait_freeze() from nvme_reset_work(). No
see it is necessary:

- nvme_dev_add():

if (!dev->ctrl.tagset) {
	...
} else {
	blk_mq_update_nr_hw_queues(&dev->tagset, dev->online_queues - 1);
	...
} 

If there isn't tagset, there can't be IO; and if nr_hw_queues is to be
changed really, blk_mq_update_nr_hw_queues() will freeze queue, otherwise
it isn't necessary to nvme_wait_freeze() before calling
blk_mq_update_nr_hw_queues.



Thanks,
Ming



      reply	other threads:[~2022-08-05 12:35 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-01 12:57 [PATCH] nvme-pci: fix race between pci reset and nvme probe Ming Lei
2022-08-01 14:33 ` Keith Busch
2022-08-02  0:05   ` Ming Lei
2022-08-02  2:22     ` Keith Busch
2022-08-04 21:55 ` Keith Busch
2022-08-05 12:34   ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yu0OXgQo/HdusMEu@T590 \
    --to=ming.lei@redhat.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=yi.zhang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.