From: Ming Lei <ming.lei@redhat.com>
To: Daniel Wagner <dwagner@suse.de>
Cc: wenxiong@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
james.smart@broadcom.com, wenxiong@us.ibm.com, sagi@grimberg.me
Subject: Re: [PATCH 1/1] block: System crashes when cpu hotplug + bouncing port
Date: Mon, 28 Jun 2021 17:59:34 +0800
Message-ID: <YNmdhqd+W3XbJCwd@T590>
In-Reply-To: <20210628090703.apaowrsazl53lza4@beryllium.lan>

On Mon, Jun 28, 2021 at 11:07:03AM +0200, Daniel Wagner wrote:
> Hi Wen,
>
> On Sun, Jun 27, 2021 at 10:14:32PM -0500, wenxiong@linux.vnet.ibm.com wrote:
> > @@ -468,8 +467,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
> > data.hctx = q->queue_hw_ctx[hctx_idx];
> > if (!blk_mq_hw_queue_mapped(data.hctx))
> > goto out_queue_exit;
> > - cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
> > - data.ctx = __blk_mq_get_ctx(q, cpu);
> > + data.ctx = __blk_mq_get_ctx(q, hctx_idx);
>
> hctx_idx is just an index, not a CPU id. In this scenario, the hctx_idx
> used to lookup the context happens to be valid. I am still a bit
> confused why [1] doesn't work for this scenario.
[1] is fine from the blk-mq viewpoint, but NVMe needs to improve its
failure handling; otherwise, in the worst case, no I/O queues may be
connected.
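
To illustrate the point quoted above that hctx_idx and a CPU id live in
different index spaces, here is a minimal userspace sketch. It is a model
under stated assumptions, not the kernel code: struct hctx_map,
model_first_and(), model_pick_ctx_cpu() and MODEL_NR_CPUS are hypothetical
names, and CPU masks are plain 32-bit words. model_first_and() mimics the
cpumask_first_and() convention of returning a value at or past the CPU
count when the two masks do not intersect.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8u /* hypothetical size of the modelled system */

static_assert(MODEL_NR_CPUS <= 32u, "masks in this model are 32 bits wide");

/* Hypothetical stand-in for an hctx: only the CPU mask matters here. */
struct hctx_map {
	uint32_t cpumask; /* bit n set => CPU n is mapped to this hctx */
};

/*
 * Models cpumask_first_and(): index of the first CPU set in both masks,
 * or MODEL_NR_CPUS when the intersection is empty.
 */
static unsigned int model_first_and(uint32_t a, uint32_t b)
{
	uint32_t both = a & b;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (both & (UINT32_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/*
 * CPU id to use for the per-CPU ctx lookup, or -1 when no CPU mapped to
 * the hctx is online and the lookup must not be attempted.
 */
static int model_pick_ctx_cpu(const struct hctx_map *hctx, uint32_t online)
{
	unsigned int cpu = model_first_and(hctx->cpumask, online);

	return cpu < MODEL_NR_CPUS ? (int)cpu : -1;
}
```

In this model, an hctx at index 1 that is mapped to CPUs 4-7 (mask 0xF0)
yields CPU 4 while all CPUs are online, so using the index 1 as a CPU id
would select a ctx that belongs to a different hctx. Once CPUs 4-7 are all
offline (online mask 0x0F) there is no valid CPU at all, and the caller
has to handle that case explicitly.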
>
> As Ming pointed out in [2] we need to update cpumask for CPU hotplug
I mentioned that there is still a hole with your patch; I did not mean
that we need to update the cpumask.
The root cause is that blk-mq doesn't handle tag allocation from a
specified hctx (blk_mq_alloc_request_hctx()) well: blk-mq assumes that no
request allocation can cross the point at which an hctx becomes
inactive/offline, see blk_mq_hctx_notify_offline() and blk_mq_get_tag().
Either the allocated request is completed, or new allocation is prevented,
before the current hctx becomes inactive (i.e. every CPU in hctx->cpumask
is offline).
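
The assumption described above can be sketched as a small userspace model.
This is only an illustration, not the kernel implementation: struct
hctx_state, model_cpu_offline() and model_get_tag() are hypothetical names,
CPU masks are plain 32-bit words, and draining of in-flight requests is
left out. The offline hook marks the hctx inactive only when the departing
CPU is the last online CPU in its mask, and tag allocation is refused from
then on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hctx state relevant to CPU hotplug. */
struct hctx_state {
	uint32_t cpumask;      /* bit n set => CPU n is mapped to this hctx */
	bool inactive;         /* set once no mapped CPU remains online */
	unsigned int next_tag; /* next tag number to hand out */
};

/*
 * Models the offline notifier. online_before is the online mask with the
 * departing CPU still set. Returns true when the hctx was marked inactive,
 * i.e. when cpu was the last online CPU mapped to it.
 */
static bool model_cpu_offline(struct hctx_state *hctx, unsigned int cpu,
			      uint32_t online_before)
{
	uint32_t bit;

	assert(cpu < 32);
	bit = UINT32_C(1) << cpu;

	if (!(hctx->cpumask & bit))
		return false; /* CPU is not mapped to this hctx */
	if ((hctx->cpumask & online_before) != bit)
		return false; /* another mapped CPU is still online */

	hctx->inactive = true;
	return true;
}

/* Models tag allocation: returns a tag, or -1 once the hctx is inactive. */
static int model_get_tag(struct hctx_state *hctx)
{
	if (hctx->inactive)
		return -1;
	return (int)hctx->next_tag++;
}
```

With an hctx mapped to CPUs 4 and 5 (mask 0x30), taking CPU 4 offline
leaves the hctx active in this model, while taking CPU 5 offline afterwards
marks it inactive and every later model_get_tag() call returns -1. A caller
that is pinned to one specific hctx has no other queue to fall back to at
that point.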
I tried [1] to move connecting the I/O queues into the driver and to kill
blk_mq_alloc_request_hctx() in order to address this issue, but there is a
corner case (timeout) that is not covered.
I understand that NVMe's requirement is that connecting an I/O queue
should succeed regardless of whether the hctx is inactive or not. Sagi,
correct me if I am wrong.
[1]
https://lore.kernel.org/linux-block/fda43a50-a484-dde7-84a1-94ccf9346bdd@broadcom.com/T/#m1e902f69e8503f5e6202945b8b79e5b7252e3689
Thanks,
Ming