From: James Smart <james.smart@broadcom.com>
To: Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH v3 7/9] nvme-rdma: serialize controller teardown sequences
Date: Thu, 20 Aug 2020 14:04:49 -0700 [thread overview]
Message-ID: <c9d8a30c-e41f-d94e-7682-2a33cc06d7fc@broadcom.com> (raw)
In-Reply-To: <20200820053651.197057-8-sagi@grimberg.me>
On 8/19/2020 10:36 PM, Sagi Grimberg wrote:
> In the timeout handler we may need to complete a request because the
> request that timed out may be an I/O that is a part of a serial sequence
> of controller teardown or initialization. In order to complete the
> request, we need to fence any other context that may compete with us
> and complete the request that is timing out.
>
> In this case, we could have a potential double completion in case
> a hard-irq or a different competing context triggered error recovery
> and is running inflight request cancellation concurrently with the
> timeout handler.
>
> Protect using a ctrl teardown_lock to serialize contexts that may
> complete a cancelled request due to error recovery or a reset.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
> drivers/nvme/host/rdma.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 4c5660201009..ed387f61f8f7 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -122,6 +122,7 @@ struct nvme_rdma_ctrl {
> struct sockaddr_storage src_addr;
>
> struct nvme_ctrl ctrl;
> + struct mutex teardown_lock;
> bool use_inline_data;
> u32 io_queues[HCTX_MAX_TYPES];
> };
> @@ -997,6 +998,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
> static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl,
> bool remove)
> {
> + mutex_lock(&ctrl->teardown_lock);
> blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> nvme_rdma_stop_queue(&ctrl->queues[0]);
> if (ctrl->ctrl.admin_tagset) {
> @@ -1007,11 +1009,13 @@ static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl,
> if (remove)
> blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
> nvme_rdma_destroy_admin_queue(ctrl, remove);
> + mutex_unlock(&ctrl->teardown_lock);
> }
>
> static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
> bool remove)
> {
> + mutex_lock(&ctrl->teardown_lock);
> if (ctrl->ctrl.queue_count > 1) {
> nvme_start_freeze(&ctrl->ctrl);
> nvme_stop_queues(&ctrl->ctrl);
> @@ -1025,6 +1029,7 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
> nvme_start_queues(&ctrl->ctrl);
> nvme_rdma_destroy_io_queues(ctrl, remove);
> }
> + mutex_unlock(&ctrl->teardown_lock);
> }
>
> static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
> @@ -2278,6 +2283,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
> return ERR_PTR(-ENOMEM);
> ctrl->ctrl.opts = opts;
> INIT_LIST_HEAD(&ctrl->list);
> + mutex_init(&ctrl->teardown_lock);
>
> if (!(opts->mask & NVMF_OPT_TRSVCID)) {
> opts->trsvcid =
Q: so you don't believe there's any conflict from these teardown paths
(possibly invoked by sysfs delete ctrl) vs a reconnect (thus
nvme_rdma_setup_ctrl) encountering a failure during controller init,
thus it hits the destroy_io and destroy_admin exit paths - which call
nvme_rdma_destroy_io_queues and stop/destroy_admin_queue - which can be
simultaneous to the teardowns and without the mutex ?
-- james
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next prev parent reply other threads:[~2020-08-20 21:05 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-20 5:36 [PATCH v3 0/9] fix possible controller reset hangs in nvme-tcp/nvme-rdma Sagi Grimberg
2020-08-20 5:36 ` [PATCH v3 1/9] nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance Sagi Grimberg
2020-08-20 6:02 ` Christoph Hellwig
2020-08-20 20:49 ` James Smart
2020-08-20 5:36 ` [PATCH v3 2/9] nvme-fabrics: allow to queue requests for live queues Sagi Grimberg
2020-08-20 6:09 ` Christoph Hellwig
2020-08-20 16:58 ` Sagi Grimberg
2020-08-20 20:45 ` James Smart
2020-08-20 22:13 ` Sagi Grimberg
2020-08-20 22:17 ` James Smart
2020-08-21 6:22 ` Christoph Hellwig
2020-08-21 15:22 ` James Smart
2020-08-21 19:44 ` Sagi Grimberg
2020-08-23 15:19 ` James Smart
2020-08-24 8:06 ` Sagi Grimberg
2020-08-24 8:02 ` Sagi Grimberg
2020-08-25 7:13 ` Christoph Hellwig
2020-08-25 15:00 ` Sagi Grimberg
2020-08-25 15:41 ` James Smart
2020-08-25 17:35 ` Sagi Grimberg
2020-09-04 20:26 ` Sagi Grimberg
2020-09-08 9:05 ` Christoph Hellwig
2020-09-08 16:47 ` Sagi Grimberg
2020-09-08 16:48 ` Christoph Hellwig
2020-09-08 19:56 ` Sagi Grimberg
2020-08-20 20:54 ` James Smart
2020-08-20 20:56 ` James Smart
2020-08-20 5:36 ` [PATCH v3 3/9] nvme: have nvme_wait_freeze_timeout return if it timed out Sagi Grimberg
2020-08-20 6:09 ` Christoph Hellwig
2020-08-20 5:36 ` [PATCH v3 4/9] nvme-tcp: serialize controller teardown sequences Sagi Grimberg
2020-08-20 5:36 ` [PATCH v3 5/9] nvme-tcp: fix timeout handler Sagi Grimberg
2020-08-20 5:36 ` [PATCH v3 6/9] nvme-tcp: fix reset hang if controller died in the middle of a reset Sagi Grimberg
2020-08-20 5:36 ` [PATCH v3 7/9] nvme-rdma: serialize controller teardown sequences Sagi Grimberg
2020-08-20 21:04 ` James Smart [this message]
2020-08-20 22:16 ` Sagi Grimberg
2020-08-21 21:08 ` James Smart
2020-08-20 5:36 ` [PATCH v3 8/9] nvme-rdma: fix timeout handler Sagi Grimberg
2020-08-20 6:10 ` Christoph Hellwig
2020-08-20 21:37 ` James Smart
2020-08-20 5:36 ` [PATCH v3 9/9] nvme-rdma: fix reset hang if controller died in the middle of a reset Sagi Grimberg
2020-08-20 6:10 ` Christoph Hellwig
2020-08-24 18:29 ` [PATCH v3 0/9] fix possible controller reset hangs in nvme-tcp/nvme-rdma Sagi Grimberg
2020-08-25 7:16 ` Christoph Hellwig
2020-08-25 15:35 ` James Smart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c9d8a30c-e41f-d94e-7682-2a33cc06d7fc@broadcom.com \
--to=james.smart@broadcom.com \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.