From: sagi@grimberg.me (Sagi Grimberg)
Subject: target crash / host hang with nvme-all.3 branch of nvme-fabrics
Date: Thu, 16 Jun 2016 22:11:30 +0300
Message-ID: <5762F9E2.7030101@grimberg.me>
In-Reply-To: <20160616151048.GA13218@lst.de>
> I think nvmet_rdma_delete_ctrl is getting the exclusion vs other calls
> or __nvmet_rdma_queue_disconnect wrong as we rely on a queue that
> is undergoing deletion to not be on any list.
How do we rely on that? __nvmet_rdma_queue_disconnect callers are
responsible for queue_list deletion and queue the release. I don't
see where we are getting it wrong.
> Additionally it also
> checks the cntlid instead of the pointer, which would be harmful if
> multiple subsystems have the same cntlid.
That's true, we need to compare pointers...
>
> Does the following patch help?
>
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index b1c6e5b..9ae65a7 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -1293,19 +1293,21 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
>
> static void nvmet_rdma_delete_ctrl(struct nvmet_ctrl *ctrl)
> {
> - struct nvmet_rdma_queue *queue, *next;
> - static LIST_HEAD(del_list);
> + struct nvmet_rdma_queue *queue, *found = NULL;
>
> mutex_lock(&nvmet_rdma_queue_mutex);
> - list_for_each_entry_safe(queue, next,
> - &nvmet_rdma_queue_list, queue_list) {
> - if (queue->nvme_sq.ctrl->cntlid == ctrl->cntlid)
> - list_move_tail(&queue->queue_list, &del_list);
> + list_for_each_entry(queue, &nvmet_rdma_queue_list, queue_list) {
> + if (queue->nvme_sq.ctrl == ctrl) {
> + list_del_init(&queue->queue_list);
> + found = queue;
> + break;
> + }
> }
> +
> mutex_unlock(&nvmet_rdma_queue_mutex);
>
> - list_for_each_entry_safe(queue, next, &del_list, queue_list)
> - nvmet_rdma_queue_disconnect(queue);
> + if (found)
> + __nvmet_rdma_queue_disconnect(queue);
> }
>
> static int nvmet_rdma_add_port(struct nvmet_port *port)
>
Umm, this looks wrong to me. delete_controller should delete _all_
the ctrl queues (there will usually be more than one). What about
all the other queues? What am I missing?
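
To make the two points above concrete (compare the controller pointer
rather than the cntlid, and tear down every queue of the controller),
here is a minimal userspace sketch. It is not the rdma.c code and not
a proposed patch: struct ctrl, struct queue, queue_add(), delete_ctrl()
and queue_list_len() are hypothetical stand-ins, a pthread mutex stands
in for nvmet_rdma_queue_mutex, and setting a flag stands in for the
real disconnect.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct ctrl {
	unsigned short cntlid;
};

struct queue {
	struct ctrl *ctrl;
	struct queue *next;
	int disconnected;
};

static pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct queue *queue_list;	/* global list of live queues */

/* Attach a queue to a controller and put it on the global list. */
static void queue_add(struct queue *q, struct ctrl *c)
{
	q->ctrl = c;
	q->disconnected = 0;
	pthread_mutex_lock(&queue_mutex);
	q->next = queue_list;
	queue_list = q;
	pthread_mutex_unlock(&queue_mutex);
}

/* Number of queues still on the global list. */
static int queue_list_len(void)
{
	struct queue *q;
	int n = 0;

	pthread_mutex_lock(&queue_mutex);
	for (q = queue_list; q; q = q->next)
		n++;
	pthread_mutex_unlock(&queue_mutex);
	return n;
}

/*
 * Unlink every queue owned by @c while holding the lock, then
 * disconnect them after dropping it. Returns how many were torn down.
 */
static int delete_ctrl(struct ctrl *c)
{
	struct queue *del = NULL, **pp, *q;
	int n = 0;

	pthread_mutex_lock(&queue_mutex);
	pp = &queue_list;
	while ((q = *pp) != NULL) {
		if (q->ctrl == c) {	/* pointer compare, not cntlid */
			*pp = q->next;	/* unlink from the global list */
			q->next = del;	/* push onto the private list */
			del = q;
		} else {
			pp = &q->next;
		}
	}
	pthread_mutex_unlock(&queue_mutex);

	while ((q = del) != NULL) {
		del = q->next;
		q->next = NULL;
		q->disconnected = 1;	/* stand-in for the real disconnect */
		n++;
	}
	return n;
}
```

With two controllers that happen to share a cntlid, only the queues
whose ctrl pointer matches are removed, and all of them are, not just
the first one found.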