From: Leon Romanovsky <leonro@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>,
	Guoqing Jiang <guoqing.jiang@cloud.ionos.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>,
	rds-devel@oss.oracle.com, linux-nvme@lists.infradead.org,
	Chao Leng <lengchao@huawei.com>, Keith Busch <kbusch@kernel.org>,
	Jack Wang <jinpu.wang@cloud.ionos.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH rdma v2] RDMA: Add rdma_connect_locked()
Date: Tue, 27 Oct 2020 15:19:36 +0200
Message-ID: <20201027131936.GD1763578@unreal>
In-Reply-To: <0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com>

On Tue, Oct 27, 2020 at 09:20:36AM -0300, Jason Gunthorpe wrote:
> There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
> handler triggers a completion and another thread does rdma_connect() or
> the handler directly calls rdma_connect().
>
> In all cases rdma_connect() needs to hold the handler_mutex, but when
> handlers are invoked this is already held by the core code. This causes
> ULPs using the 2nd method to deadlock.
>
> Provide a rdma_connect_locked() and have all ULPs call it from their
> handlers.
>
> Link: https://lore.kernel.org/r/0-v1-75e124dbad74+b05-rdma_connect_locking_jgg@nvidia.com
> Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/core/cma.c            | 40 +++++++++++++++++++++---
>  drivers/infiniband/ulp/iser/iser_verbs.c |  2 +-
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c   |  4 +--
>  drivers/nvme/host/rdma.c                 |  4 +--
>  include/rdma/rdma_cm.h                   | 14 ++-------
>  net/rds/ib_cm.c                          |  5 +--
>  6 files changed, 46 insertions(+), 23 deletions(-)
>
> v2:
>  - Remove extra code from nvme (Chao)
>  - Fix long lines (CH)
>
> I've applied this version to rdma-rc - expecting to get these ULPs unbroken for rc2
> release
>
> Thanks,
> Jason
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 7c2ab1f2fbea37..193c8902b9db26 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -405,10 +405,10 @@ static int cma_comp_exch(struct rdma_id_private *id_priv,
>  	/*
>  	 * The FSM uses a funny double locking where state is protected by both
>  	 * the handler_mutex and the spinlock. State is not allowed to change
> -	 * away from a handler_mutex protected value without also holding
> +	 * to/from a handler_mutex protected value without also holding
>  	 * handler_mutex.
>  	 */
> -	if (comp == RDMA_CM_CONNECT)
> +	if (comp == RDMA_CM_CONNECT || exch == RDMA_CM_CONNECT)
>  		lockdep_assert_held(&id_priv->handler_mutex);
>
>  	spin_lock_irqsave(&id_priv->lock, flags);
> @@ -4038,13 +4038,21 @@ static int cma_connect_iw(struct rdma_id_private *id_priv,
>  	return ret;
>  }
>
> -int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> +/**
> + * rdma_connect_locked - Initiate an active connection request.
> + * @id: Connection identifier to connect.
> + * @conn_param: Connection information used for connected QPs.
> + *
> + * Same as rdma_connect() but can only be called from the
> + * RDMA_CM_EVENT_ROUTE_RESOLVED handler callback.
> + */
> +int rdma_connect_locked(struct rdma_cm_id *id,
> +			struct rdma_conn_param *conn_param)
>  {
>  	struct rdma_id_private *id_priv =
>  		container_of(id, struct rdma_id_private, id);
>  	int ret;
>
> -	mutex_lock(&id_priv->handler_mutex);
>  	if (!cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_CONNECT)) {
>  		ret = -EINVAL;
>  		goto err_unlock;

Not a big deal, but this label is not correct anymore.
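
For illustration only, here is a minimal userspace sketch of the locked/unlocked split, using pthreads instead of kernel primitives. Every `demo_*` name is a hypothetical stand-in and not the kernel API, and the state check is simplified compared to cma_comp_exch(). It shows why the `_locked` variant has nothing left to unlock on its error path, which is what makes an `err_unlock` label misleading there.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the CM states; not the kernel enum. */
enum demo_state { DEMO_ROUTE_RESOLVED, DEMO_CONNECT };

/* Hypothetical stand-in for struct rdma_id_private. */
struct demo_id {
	pthread_mutex_t handler_mutex;
	enum demo_state state;
};

/* Caller must already hold handler_mutex, as an event handler would. */
static int demo_connect_locked(struct demo_id *id)
{
	if (id->state != DEMO_ROUTE_RESOLVED)
		return -EINVAL;	/* no lock taken here, so nothing to unlock */
	id->state = DEMO_CONNECT;
	return 0;
}

/* For callers outside a handler: take the mutex, then delegate. */
static int demo_connect(struct demo_id *id)
{
	int ret;

	pthread_mutex_lock(&id->handler_mutex);
	ret = demo_connect_locked(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}

/* Models the core invoking a handler with handler_mutex already held. */
static int demo_dispatch_route_resolved(struct demo_id *id)
{
	int ret;

	pthread_mutex_lock(&id->handler_mutex);
	/* Calling demo_connect() here would try to take the mutex twice. */
	ret = demo_connect_locked(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}
```

In this sketch, a second connect attempt on the same id fails with -EINVAL because the state has already moved on from DEMO_ROUTE_RESOLVED.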

Thanks


