From: Sagi Grimberg <sagig@dev.mellanox.co.il>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>,
sagi grimberg <sagig@mellanox.com>
Cc: "Nicholas A. Bellinger" <nab@daterainc.com>,
target-devel <target-devel@vger.kernel.org>,
linux-rdma <linux-rdma@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>,
Or Gerlitz <ogerlitz@mellanox.com>
Subject: Re: [PATCH 0/6] iser-target: Fix active I/O shutdown related issues
Date: Tue, 11 Mar 2014 12:27:08 +0200
Message-ID: <531EE4FC.9010402@dev.mellanox.co.il>
In-Reply-To: <1394488811.16214.32.camel@haakon3.risingtidesystems.com>
On 3/11/2014 12:00 AM, Nicholas A. Bellinger wrote:
> On Thu, 2014-03-06 at 16:05 +0200, sagi grimberg wrote:
>> On 3/6/2014 12:04 AM, Nicholas A. Bellinger wrote:
>>> On Wed, 2014-03-05 at 14:12 +0200, Sagi Grimberg wrote:
>>>> On 3/5/2014 2:06 AM, Nicholas A. Bellinger wrote:
>>>>> On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote:
>>>>>> On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote:
> <SNIP>
>
>>>>> <nod>, I noticed that as well during recent debugging.
>>>>>
>>>>> However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EVENT doesn't (always) occur
>>>>> on the target side after a RDMA_CM_EVENT_DISCONNECTED, and thus far I've
>>>>> not been able to ascertain what's different about the shutdown sequence
>>>>> that would make this happen, or not happen..
>>>>>
>>>>> Any ideas..?
>>>> That's probably because the cm_id is destroyed before you get the event.
>>>> There is a specific timeout computation for this event (see the IB spec).
>>>> If you attempt to disconnect while the link is down (the initiator won't
>>>> receive it and send you a disconnect back), you should be able to see
>>>> this event. As I understand it, to comply with the spec, the QP (and the
>>>> cm_id afterwards) should be destroyed only when getting this event and
>>>> not before.
>>>>
>>> <nod>, thanks for the additional background.
>>>
>>> So currently rdma_destroy_qp() + rdma_destroy_id() is being done via
>>> isert_connect_release(), which occurs after the final isert_put_conn()
>>> happens from either the RDMA_CM_EVENT_DISCONNECTED handler, or within
>>> isert_free_conn() in one of the per connection kernel thread contexts
>>> via iscsit_close_connection().
>>>
>>> If I understand the above correctly, the isert_put_conn() should move
>>> from the RDMA_CM_EVENT_DISCONNECTED handler into the TIMEWAIT_EVENT
>>> handler, yes..?
>> Yes.
>>
>>> And it's safe to assume that DISCONNECTED will always occur before
>>> TIMEWAIT_EVENT, right..?
>> The DISCONNECTED event may not come at all (in case the initiator
>> didn't call rdma_disconnect), so there are no guarantees here.
>> But if we destroy the qp and the *cm_id* once we get the TIMEWAIT event,
>> we won't get any further CM events at all.
>> As I understand it, we don't even need to explicitly destroy the cm_id; we
>> can just return a non-zero value from cma_handler for TIMEWAIT events,
>> which will cause rdma_cm to implicitly destroy the cm_id.
>>
> Mmmm, if that's the case then I'm more confused about how reference
> counting for isert_conn should work wrt TIMEWAIT_EVENT than before.. ;)
Yes, it is indeed confusing.
The below might be even more confusing... I'll try to simplify.
> As mentioned earlier, the first isert_put_conn() occurs from the per
> connection process context after calling rdma_disconnect(), and the
> second from the disconnected event handler.
>
> Your comment above would seem to indicate that iser-target code should
> be waiting to receive TIMEWAIT_EVENT, instead of pro-actively calling
> rdma_disconnect() to trigger the disconnect. Is that correct..?
Not instead. iser-target should call rdma_disconnect() upon the DISCONNECTED
event, but destroy the QP and cm_id only when it gets TIMEWAIT.
There are two separate matters here: sending the disconnect request/response,
and when to destroy the QP.
DISCONNECTED events are generated in two cases:
static int cma_ib_handler()
{
        ...
        case IB_CM_DREQ_RECEIVED: /* remote side sent a disconnect request */
        case IB_CM_DREP_RECEIVED: /* remote side responded to our disconnect request */
                if (!cma_comp_exch(id_priv, RDMA_CM_CONNECT,
                                   RDMA_CM_DISCONNECT))
                        goto out;
                event.event = RDMA_CM_EVENT_DISCONNECTED;
                break;
        ...
}
So regardless of whether iser-target initiated rdma_disconnect(), once it
gets the DISCONNECTED event it should call rdma_disconnect() *again* to
respond to the initiator's disconnect request (rdma_disconnect() sends a
CM DREQ and, if that fails, sends a CM DREP; this covers the case where
both sides send DREQ and DREP).
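
To make the DREQ-then-DREP fallback concrete, here is a rough user-space
model. It is only a sketch: every name in it (model_cm_id, model_send_dreq,
model_send_drep, model_disconnect and the MODEL_* states) is made up for
illustration and the state handling is heavily simplified, so do not read it
as the cma.c or ib_cm code.

```c
/* Hypothetical, simplified CM states for one side of a connection. */
enum model_cm_state {
        MODEL_ESTABLISHED,      /* connected, no disconnect in flight */
        MODEL_DREQ_SENT,        /* we sent a DREQ, waiting for the DREP */
        MODEL_DREQ_RCVD,        /* peer sent a DREQ, we owe it a DREP */
        MODEL_TIMEWAIT          /* disconnect handshake done on our side */
};

struct model_cm_id {
        enum model_cm_state state;
        int dreq_sent;          /* number of DREQs this side sent */
        int drep_sent;          /* number of DREPs this side sent */
};

/* In this model a DREQ can only be sent while still established. */
static int model_send_dreq(struct model_cm_id *id)
{
        if (id->state != MODEL_ESTABLISHED)
                return -1;
        id->state = MODEL_DREQ_SENT;
        id->dreq_sent++;
        return 0;
}

/* In this model a DREP is only valid as the answer to a received DREQ. */
static int model_send_drep(struct model_cm_id *id)
{
        if (id->state != MODEL_DREQ_RCVD)
                return -1;
        id->state = MODEL_TIMEWAIT;
        id->drep_sent++;
        return 0;
}

/* Try to initiate the disconnect; if that fails, respond to the peer's. */
static int model_disconnect(struct model_cm_id *id)
{
        if (model_send_dreq(id))
                return model_send_drep(id);
        return 0;
}
```

Calling model_disconnect() on an id in MODEL_ESTABLISHED sends a DREQ, while
calling it on an id in MODEL_DREQ_RCVD (the peer disconnected first) sends
the DREP instead, which is why the same call works for both sides.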
The QP and cm_id should be destroyed only once the TIMEWAIT event arrives.
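
Put together, the event handling I have in mind looks roughly like the
user-space model below. Again this is just a sketch under assumptions: the
names (model_conn, model_cma_handler, model_deliver, MODEL_EVENT_*) are
hypothetical, the counter stands in for the rdma_disconnect() call, and the
qp_alive flag stands in for rdma_destroy_qp(). The non-zero return on
TIMEWAIT models the handler asking rdma_cm to destroy the cm_id.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two CM events discussed in this thread. */
enum model_cm_event {
        MODEL_EVENT_DISCONNECTED,
        MODEL_EVENT_TIMEWAIT
};

struct model_conn {
        bool qp_alive;          /* false once the QP has been destroyed */
        bool cm_id_alive;       /* false once the cm_id has been destroyed */
        int disconnect_calls;   /* how often we called "rdma_disconnect" */
};

/* Model of the per-connection event handler. */
static int model_cma_handler(struct model_conn *conn, enum model_cm_event ev)
{
        switch (ev) {
        case MODEL_EVENT_DISCONNECTED:
                /* Answer the peer (DREQ or DREP), but keep QP and cm_id. */
                conn->disconnect_calls++;
                return 0;
        case MODEL_EVENT_TIMEWAIT:
                /* Only now is it safe to tear down the QP. */
                conn->qp_alive = false;
                /* Non-zero return: ask the CM layer to destroy the cm_id. */
                return 1;
        }
        return 0;
}

/* Model of the CM layer delivering one event to the handler. */
static void model_deliver(struct model_conn *conn, enum model_cm_event ev)
{
        if (!conn->cm_id_alive)
                return;         /* no events once the cm_id is gone */
        if (model_cma_handler(conn, ev))
                conn->cm_id_alive = false;
}
```

In this model a DISCONNECTED event leaves both the QP and the cm_id in place,
the TIMEWAIT event removes both, and nothing is delivered afterwards.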
Hope this wasn't even more confusing...
Sagi.