From: sagi@grimberg.me (Sagi Grimberg)
Subject: [PATCH RFC 0/3] iwarp device removal deadlock fix
Date: Wed, 20 Jul 2016 11:47:12 +0300
Message-ID: <578F3A90.1000208@grimberg.me>
In-Reply-To: <cover.1468879135.git.swise@opengridcomputing.com>
> This RFC series attempts to address the deadlock issue discovered
> while testing nvmf/rdma handling rdma device removal events from
> the rdma_cm.
Thanks for doing this, Steve!
> For a discussion of the deadlock that can happen, see
>
> http://lists.infradead.org/pipermail/linux-nvme/2016-July/005440.html.
>
> For my description of the deadlock itself, see this post in the above thread:
>
> http://lists.infradead.org/pipermail/linux-nvme/2016-July/005465.html
>
> In a nutshell, iw_cxgb4 and the iw_cm block during qp/cm_id destruction
> until all references are removed. This combined with the iwarp CM passing
> disconnect events up to the rdma_cm during disconnect and/or qp/cm_id destruction
> leads to a deadlock.
>
> My proposed solution is to remove the need for iw_cxgb4 and iw_cm to
> block during object destruction for the refcnts to reach 0, but rather to
> let the freeing of the object memory be deferred when the last deref is
> done. This allows all the qps/cm_ids to be destroyed without blocking, and
> all the object memory freeing ends up happening when the application's
> device_remove event handler function returns to the rdma_cm.
This sounds like a very good approach moving forward.
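
To make the idea concrete, here is a minimal userspace sketch of "free on
the last deref". It is not code from the patches: struct demo_obj and the
demo_obj_* helpers are hypothetical names made up for illustration, and
C11 atomics stand in for whatever refcounting the drivers actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object standing in for a qp or cm_id. */
struct demo_obj {
	atomic_int refcnt;
	bool *freed_flag;	/* lets a caller observe when the memory is released */
};

/* Allocate an object holding one reference, owned by the creator. */
static struct demo_obj *demo_obj_alloc(bool *freed_flag)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	atomic_init(&obj->refcnt, 1);
	obj->freed_flag = freed_flag;
	*freed_flag = false;
	return obj;
}

/* Take an extra reference, e.g. for an event still in flight. */
static void demo_obj_get(struct demo_obj *obj)
{
	atomic_fetch_add(&obj->refcnt, 1);
}

/* Drop a reference; whoever drops the last one frees the memory. */
static void demo_obj_put(struct demo_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		*obj->freed_flag = true;
		free(obj);
	}
}

/*
 * Non-blocking destroy: drop the creator's reference and return at once
 * instead of waiting for the count to reach 0.
 */
static void demo_obj_destroy(struct demo_obj *obj)
{
	demo_obj_put(obj);
}
```

With this shape, a destroy call made while another context still holds a
reference returns immediately, and the memory goes away only when that
last holder calls demo_obj_put().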
> Sean, I was hoping you could have a look at the iwcm.c patch particularly,
> to tell me why it's broken. :) I spent some time trying to figure out
> why we really need the CALLBACK_DESTROY flag, but I concluded it really
> isn't needed. The one side effect I see with my change is that the
> application could possibly get a cm_id event after it has destroyed the
> cm_id. There probably is a way to discard events that have a reference
> on the cm_id but get processed after the app has destroyed the cm_id by
> having a new flag indicating "destroyed by app".
That sounds easy enough. Does this mean that iwcm relies on the driver
to do this, or is it interoperable with the existing logic? If not, this
will need to take care of all the iWARP drivers.
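
For reference, the "destroyed by app" check could look roughly like the
sketch below. Everything in it is an assumption for illustration:
struct demo_cm_id, its destroyed_by_app field, and the demo_* helpers are
hypothetical and are not taken from iwcm.c. Locking is left out; real code
would have to set and test the flag under the cm_id's lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cm_id; not the real iwcm structure. */
struct demo_cm_id {
	bool destroyed_by_app;	/* hypothetical flag set when the app destroys the id */
	int events_delivered;	/* counts events that reached the app's handler */
};

/* The application destroys the cm_id: only mark it, events may still hold refs. */
static void demo_destroy(struct demo_cm_id *id)
{
	assert(id != NULL);
	id->destroyed_by_app = true;
}

/*
 * Process one queued event. Returns true if the event was passed to the
 * application, false if it was discarded because the app already
 * destroyed the cm_id.
 */
static bool demo_deliver_event(struct demo_cm_id *id)
{
	assert(id != NULL);
	if (id->destroyed_by_app)
		return false;	/* late event: drop it instead of calling the app */
	id->events_delivered++;
	return true;
}
```

An event that was queued before the destroy but processed afterwards is
dropped here, so the application never sees a callback for a cm_id it has
already destroyed.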
Thread overview: 13+ messages
2016-07-18 21:58 [PATCH RFC 0/3] iwarp device removal deadlock fix Steve Wise
2016-07-18 20:44 ` [PATCH 1/3] iw_cm: free cm_id resources on the last deref Steve Wise
2016-07-20 8:51 ` Sagi Grimberg
2016-07-20 13:51 ` Steve Wise
2016-07-21 14:17 ` Steve Wise
[not found] ` <045f01d1e35a$93618a60$ba249f20$@opengridcomputing.com>
2016-07-21 15:45 ` Steve Wise
2016-07-18 20:44 ` [PATCH 2/3] iw_cxgb4: don't block in destroy_qp awaiting " Steve Wise
2016-07-20 8:52 ` Sagi Grimberg
2016-07-18 20:44 ` [PATCH 3/3] nvme-rdma: Fix device removal handling Sagi Grimberg
2016-07-21 8:15 ` Christoph Hellwig
2016-07-22 18:37 ` Steve Wise
2016-07-20 8:47 ` Sagi Grimberg [this message]
2016-07-20 13:49 ` [PATCH RFC 0/3] iwarp device removal deadlock fix Steve Wise