public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: John Garry <john.garry@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org, Yi Zhang <yi.zhang@redhat.com>,
	ming.lei@redhat.com
Subject: Re: [PATCH] blk-mq: avoid to hang in the cpuhp offline handler
Date: Thu, 22 Sep 2022 17:13:28 +0800	[thread overview]
Message-ID: <YywnOO42kvr8CotB@T590> (raw)
In-Reply-To: <19568225-56a1-f545-b8de-a219b7f843b7@huawei.com>

On Thu, Sep 22, 2022 at 09:47:09AM +0100, John Garry wrote:
> On 20/09/2022 03:17, Ming Lei wrote:
> > For avoiding to trigger io timeout when one hctx becomes inactive, we
> > drain IOs when all CPUs of one hctx are offline. However, driver's
> > timeout handler may require cpus_read_lock, such as nvme-pci,
> > pci_alloc_irq_vectors_affinity() is called in nvme-pci reset context,
> > and irq_build_affinity_masks() needs cpus_read_lock().
> > 
> > Meantime when blk-mq's cpuhp offline handler is called, cpus_write_lock
> > is held, so deadlock is caused.
> > 
> > Fixes the issue by breaking the wait loop if enough long time elapses,
> > and these in-flight not drained IO still can be handled by timeout
> > handler.
> 
> I don't think that that this is a good idea - that is because often drivers
> cannot safely handle scenario of timeout of an IO which has actually
> completed. NVMe timeout handler may poll for completion, but SCSI does not.
> 
> Indeed, if we were going to allow the timeout handler handle these in-flight
> IO then there is no point in having this hotplug handler in the first place.

That is true from the beginning, and we did know the point, I remember that
Hannes asked this question in LSF/MM, and there are many drivers which don't
implement timeout handler.

For this issue, it looks more like one nvme specific since nvme timeout handler
can't move on during nvme reset. Let's see if it can be fixed by nvme
driver.

BTW nvme error handling is really fragile, not only this one, such as, any timeout
during reset will cause device removed.


Thanks.
Ming



      reply	other threads:[~2022-09-22  9:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-20  2:17 [PATCH] blk-mq: avoid to hang in the cpuhp offline handler Ming Lei
2022-09-22  6:25 ` Christoph Hellwig
2022-09-22  7:41   ` Ming Lei
2022-09-22  8:47 ` John Garry
2022-09-22  9:13   ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YywnOO42kvr8CotB@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=john.garry@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=yi.zhang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox