From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Subject: [PATCH rdma-next 15/15] RDMA/cm: Make sure the cm_id is in the IB_CM_IDLE state in destroy
Date: Tue, 10 Mar 2020 11:25:45 +0200 [thread overview]
Message-ID: <20200310092545.251365-16-leon@kernel.org> (raw)
In-Reply-To: <20200310092545.251365-1-leon@kernel.org>
From: Jason Gunthorpe <jgg@mellanox.com>
The first switch statement in cm_destroy_id() tries to move the ID to
either IB_CM_IDLE or IB_CM_TIMEWAIT. Both states will block concurrent
MAD handlers from progressing.
Previous patches removed the unreliably lock/unlock sequences in this
flow, this patch removes the extra locking steps and adds the missing
parts to guarantee that destroy reaches IB_CM_IDLE. There is no point in
leaving the ID in the IB_CM_TIMEWAIT state the memory about to be kfreed.
Rework things to hold the lock across all the state transitions and
directly assert when done that it ended up in IB_CM_IDLE as expected.
This was accompanied by a careful audit of all the state transitions here,
which generally did end up in IDLE on their success and non-racy paths.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/cm.c | 41 ++++++++++++++++++------------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index f536e20e5394..bbbfa77dbce7 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1030,34 +1030,34 @@ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
struct cm_work *work;
cm_id_priv = container_of(cm_id, struct cm_id_private, id);
-retest:
spin_lock_irq(&cm_id_priv->lock);
+retest:
switch (cm_id->state) {
case IB_CM_LISTEN:
- spin_unlock_irq(&cm_id_priv->lock);
-
- spin_lock_irq(&cm.lock);
+ spin_lock(&cm.lock);
if (--cm_id_priv->listen_sharecount > 0) {
/* The id is still shared. */
WARN_ON(refcount_read(&cm_id_priv->refcount) == 1);
+ spin_unlock(&cm.lock);
+ spin_unlock_irq(&cm_id_priv->lock);
cm_deref_id(cm_id_priv);
- spin_unlock_irq(&cm.lock);
return;
}
+ cm_id->state = IB_CM_IDLE;
rb_erase(&cm_id_priv->service_node, &cm.listen_service_table);
RB_CLEAR_NODE(&cm_id_priv->service_node);
- spin_unlock_irq(&cm.lock);
+ spin_unlock(&cm.lock);
break;
case IB_CM_SIDR_REQ_SENT:
cm_id->state = IB_CM_IDLE;
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
- spin_unlock_irq(&cm_id_priv->lock);
break;
case IB_CM_SIDR_REQ_RCVD:
cm_send_sidr_rep_locked(cm_id_priv,
&(struct ib_cm_sidr_rep_param){
.status = IB_SIDR_REJECT });
- spin_unlock_irq(&cm_id_priv->lock);
+ /* cm_send_sidr_rep_locked will not move to IDLE if it fails */
+ cm_id->state = IB_CM_IDLE;
break;
case IB_CM_REQ_SENT:
case IB_CM_MRA_REQ_RCVD:
@@ -1066,18 +1066,15 @@ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
&cm_id_priv->id.device->node_guid,
sizeof(cm_id_priv->id.device->node_guid),
NULL, 0);
- spin_unlock_irq(&cm_id_priv->lock);
break;
case IB_CM_REQ_RCVD:
if (err == -ENOMEM) {
/* Do not reject to allow future retries. */
cm_reset_to_idle(cm_id_priv);
- spin_unlock_irq(&cm_id_priv->lock);
} else {
cm_send_rej_locked(cm_id_priv,
IB_CM_REJ_CONSUMER_DEFINED, NULL, 0,
NULL, 0);
- spin_unlock_irq(&cm_id_priv->lock);
}
break;
case IB_CM_REP_SENT:
@@ -1089,31 +1086,35 @@ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
case IB_CM_MRA_REP_SENT:
cm_send_rej_locked(cm_id_priv, IB_CM_REJ_CONSUMER_DEFINED, NULL,
0, NULL, 0);
- spin_unlock_irq(&cm_id_priv->lock);
break;
case IB_CM_ESTABLISHED:
if (cm_id_priv->qp_type == IB_QPT_XRC_TGT) {
- spin_unlock_irq(&cm_id_priv->lock);
+ cm_id->state = IB_CM_IDLE;
break;
}
cm_send_dreq_locked(cm_id_priv, NULL, 0);
- spin_unlock_irq(&cm_id_priv->lock);
goto retest;
case IB_CM_DREQ_SENT:
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
cm_enter_timewait(cm_id_priv);
- spin_unlock_irq(&cm_id_priv->lock);
- break;
+ goto retest;
case IB_CM_DREQ_RCVD:
cm_send_drep_locked(cm_id_priv, NULL, 0);
- spin_unlock_irq(&cm_id_priv->lock);
+ WARN_ON(cm_id->state != IB_CM_TIMEWAIT);
+ goto retest;
+ case IB_CM_TIMEWAIT:
+ /*
+ * The cm_acquire_id in cm_timewait_handler will stop working
+ * once we do cm_free_id() below, so just move to idle here for
+ * consistency.
+ */
+ cm_id->state = IB_CM_IDLE;
break;
- default:
- spin_unlock_irq(&cm_id_priv->lock);
+ case IB_CM_IDLE:
break;
}
+ WARN_ON(cm_id->state != IB_CM_IDLE);
- spin_lock_irq(&cm_id_priv->lock);
spin_lock(&cm.lock);
/* Required for cleanup paths related cm_req_handler() */
if (cm_id_priv->timewait_info) {
--
2.24.1
next prev parent reply other threads:[~2020-03-10 9:26 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-10 9:25 [PATCH rdma-next 00/15] Fix locking around cm_id.state in the ib_cm Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 01/15] RDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id() Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 02/15] RDMA/cm: Fix checking for allowed duplicate listens Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 03/15] RDMA/cm: Remove a race freeing timewait_info Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 04/15] RDMA/cm: Make the destroy_id flow more robust Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 05/15] RDMA/cm: Simplify establishing a listen cm_id Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 06/15] RDMA/cm: Read id.state under lock when doing pr_debug() Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 07/15] RDMA/cm: Make it clear that there is no concurrency in cm_sidr_req_handler() Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 08/15] RDMA/cm: Make it clearer how concurrency works in cm_req_handler() Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 09/15] RDMA/cm: Add missing locking around id.state in cm_dup_req_handler Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 10/15] RDMA/cm: Add some lockdep assertions for cm_id_priv->lock Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 11/15] RDMA/cm: Allow ib_send_cm_dreq() to be done under lock Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 12/15] RDMA/cm: Allow ib_send_cm_drep() " Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 13/15] RDMA/cm: Allow ib_send_cm_rej() " Leon Romanovsky
2020-03-10 9:25 ` [PATCH rdma-next 14/15] RDMA/cm: Allow ib_send_cm_sidr_rep() " Leon Romanovsky
2020-03-10 9:25 ` Leon Romanovsky [this message]
2020-03-17 20:15 ` [PATCH rdma-next 00/15] Fix locking around cm_id.state in the ib_cm Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200310092545.251365-16-leon@kernel.org \
--to=leon@kernel.org \
--cc=dledford@redhat.com \
--cc=jgg@mellanox.com \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).