From: Ming Lei <ming.lei@redhat.com>
To: "jianchao.wang" <jianchao.w.wang@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Laurence Oberman <loberman@redhat.com>,
Sagi Grimberg <sagi@grimberg.me>,
James Smart <james.smart@broadcom.com>,
linux-nvme@lists.infradead.org,
Keith Busch <keith.busch@intel.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling
Date: Wed, 16 May 2018 10:09:04 +0800 [thread overview]
Message-ID: <20180516020903.GC17412@ming.t460p> (raw)
In-Reply-To: <20180516020420.GB17412@ming.t460p>
On Wed, May 16, 2018 at 10:04:20AM +0800, Ming Lei wrote:
> On Tue, May 15, 2018 at 05:56:14PM +0800, jianchao.wang wrote:
> > Hi Ming,
> >
> > On 05/15/2018 08:33 AM, Ming Lei wrote:
> > > We still have to quiesce the admin queue before canceling requests, so the
> > > following patch looks better. Please ignore the above patch, try the
> > > following one, and see whether your hang is addressed:
> > >
> > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > > index f509d37b2fb8..c2adc76472a8 100644
> > > --- a/drivers/nvme/host/pci.c
> > > +++ b/drivers/nvme/host/pci.c
> > > @@ -1741,8 +1741,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev)
> > > dev->ctrl.admin_q = NULL;
> > > return -ENODEV;
> > > }
> > > - } else
> > > - blk_mq_unquiesce_queue(dev->ctrl.admin_q);
> > > + }
> > >
> > > return 0;
> > > }
> > > @@ -2520,6 +2519,12 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown, bool
> > > */
> > > if (shutdown)
> > > nvme_start_queues(&dev->ctrl);
> > > +
> > > + /*
> > > + * Avoid stalling the reset: a timeout may happen during reset, and
> > > + * the reset may hang forever if the admin queue is kept quiesced
> > > + */
> > > + blk_mq_unquiesce_queue(dev->ctrl.admin_q);
> > > mutex_unlock(&dev->shutdown_lock);
> > > }
> >
> > With the patch above and the patch below, neither the warning nor the IO hang issue has reproduced so far.
> >
> >
> > @@ -1450,6 +1648,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
> > {
> > struct nvme_dev *dev = nvmeq->dev;
> > int result;
> > + int cq_vector;
> >
> > if (dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) {
> > unsigned offset = (qid - 1) * roundup(SQ_SIZE(nvmeq->q_depth),
> > @@ -1462,15 +1661,16 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
> > * A queue's vector matches the queue identifier unless the controller
> > * has only one vector available.
> > */
> > - nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
> > - result = adapter_alloc_cq(dev, qid, nvmeq);
> > + cq_vector = dev->num_vecs == 1 ? 0 : qid;
> > + result = adapter_alloc_cq(dev, qid, nvmeq, cq_vector);
> > if (result < 0)
> > - goto release_vector;
> > + goto out;
>
> Thinking about this issue further, the above change will cause adapter_alloc_cq()
> to fail immediately, because nvmeq->cq_vector isn't set before this admin
> command is submitted.
>
> So could you check whether the patch ("unquiesce admin queue after shutdown
> controller") alone fixes your IO hang issue?
>
> BTW, the warning from genirq can be left alone; that is a separate issue.
Oops, there is no such issue after all, since the admin queue is ready; please
ignore the noise, sorry. :-(
Thanks,
Ming
Thread overview: 32+ messages
2018-05-11 12:29 [PATCH V5 0/9] nvme: pci: fix & improve timeout handling Ming Lei
2018-05-11 12:29 ` [PATCH V5 1/9] block: introduce blk_quiesce_timeout() and blk_unquiesce_timeout() Ming Lei
2018-05-11 12:29 ` [PATCH V5 2/9] nvme: pci: cover timeout for admin commands running in EH Ming Lei
2018-05-11 12:29 ` [PATCH V5 3/9] nvme: pci: only wait freezing if queue is frozen Ming Lei
2018-05-11 12:29 ` [PATCH V5 4/9] nvme: pci: freeze queue in nvme_dev_disable() in case of error recovery Ming Lei
2018-05-11 12:29 ` [PATCH V5 5/9] nvme: pci: prepare for supporting error recovery from resetting context Ming Lei
2018-05-11 12:29 ` [PATCH V5 6/9] nvme: pci: move error handling out of nvme_reset_dev() Ming Lei
2018-05-11 12:29 ` [PATCH V5 7/9] nvme: pci: don't unfreeze queue until controller state updating succeeds Ming Lei
2018-05-11 12:29 ` [PATCH V5 8/9] nvme: core: introduce nvme_force_change_ctrl_state() Ming Lei
2018-05-11 12:29 ` [PATCH V5 9/9] nvme: pci: support nested EH Ming Lei
2018-05-15 10:02 ` jianchao.wang
2018-05-15 12:39 ` Ming Lei
2018-05-11 20:50 ` [PATCH V5 0/9] nvme: pci: fix & improve timeout handling Keith Busch
2018-05-12 0:21 ` Ming Lei
2018-05-14 15:18 ` Keith Busch
2018-05-14 23:47 ` Ming Lei
2018-05-15 0:33 ` Keith Busch
2018-05-15 9:08 ` Ming Lei
2018-05-16 4:31 ` Ming Lei
2018-05-16 15:18 ` Keith Busch
2018-05-16 22:18 ` Ming Lei
2018-05-14 8:21 ` jianchao.wang
2018-05-14 9:38 ` Ming Lei
2018-05-14 10:05 ` jianchao.wang
2018-05-14 12:22 ` Ming Lei
2018-05-15 0:33 ` Ming Lei
2018-05-15 9:56 ` jianchao.wang
2018-05-15 12:56 ` Ming Lei
2018-05-16 3:03 ` jianchao.wang
2018-05-16 2:04 ` Ming Lei
2018-05-16 2:09 ` Ming Lei [this message]
2018-05-16 2:15 ` jianchao.wang