From: Sagi Grimberg <sagig@dev.mellanox.co.il>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>,
	sagi grimberg <sagig@mellanox.com>
Cc: "Nicholas A. Bellinger" <nab@daterainc.com>,
	target-devel <target-devel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	Or Gerlitz <ogerlitz@mellanox.com>
Subject: Re: [PATCH 0/6] iser-target: Fix active I/O shutdown related issues
Date: Tue, 11 Mar 2014 12:27:08 +0200
Message-ID: <531EE4FC.9010402@dev.mellanox.co.il>
In-Reply-To: <1394488811.16214.32.camel@haakon3.risingtidesystems.com>

On 3/11/2014 12:00 AM, Nicholas A. Bellinger wrote:
> On Thu, 2014-03-06 at 16:05 +0200, sagi grimberg wrote:
>> On 3/6/2014 12:04 AM, Nicholas A. Bellinger wrote:
>>> On Wed, 2014-03-05 at 14:12 +0200, Sagi Grimberg wrote:
>>>> On 3/5/2014 2:06 AM, Nicholas A. Bellinger wrote:
>>>>> On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote:
>>>>>> On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote:
> <SNIP>
>
>>>>> <nod>, I noticed that as well during recent debugging.
>>>>>
>>>>> However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EVENT doesn't (always) occur
>>>>> on the target side after a RDMA_CM_EVENT_DISCONNECTED, and thus far I've
>>>>> not been able to ascertain what's different about the shutdown sequence
>>>>> that would make this happen, or not happen..
>>>>>
>>>>> Any ideas..?
>>>> That's probably because the cm_id is destroyed before you get the event.
>>>> There is a specific timeout computation to get this event (see IB spec).
>>>> If you attempt to disconnect while the link is down (the initiator won't
>>>> receive it and won't send you a disconnect back), you should be able
>>>> to see this event. As I understand it, in order to comply with the spec,
>>>> the QP (and the cm_id afterwards) should be destroyed only when getting
>>>> this event and not before.
>>>>
>>> <nod>, thanks for the additional background.
>>>
>>> So currently rdma_destroy_qp() + rdma_destroy_id() is being done via
>>> isert_connect_release(), which occurs after the final isert_put_conn()
>>> happens from either the RDMA_CM_EVENT_DISCONNECTED handler, or within
>>> isert_free_conn() in one of the per connection kernel thread contexts
>>> via iscsit_close_connection().
>>>
>>> If I understand the above correctly, the isert_put_conn() should move
>>> from the RDMA_CM_EVENT_DISCONNECTED handler into the TIMEWAIT_EVENT
>>> handler, yes..?
>> Yes.
>>
>>> And it's safe to assume that DISCONNECTED will always occur before
>>> TIMEWAIT_EVENT, right..?
>> The DISCONNECTED event may not come at all (in case the initiator
>> didn't call rdma_disconnect). No guarantees here.
>> But if we destroy the qp and the *cm_id* once we get the TIMEWAIT
>> event, we won't get any further CM events at all.
>> As I understand it, we don't even need to destroy the cm_id explicitly;
>> we can just return a non-zero value from cma_handler for TIMEWAIT
>> events, which will cause rdma_cm to destroy the cm_id implicitly.
>>
> Mmmm, if that's the case then I'm more confused about how reference
> counting for isert_conn should work wrt TIMEWAIT_EVENT than before..  ;)

Yes, it is indeed confusing.
The below might be even more confusing... I'll try to simplify.

> As mentioned earlier, the first isert_put_conn() occurs from the per
> connection process context after calling rdma_disconnect(), and the
> second from the disconnected event handler.
>
> Your comment above would seem to indicate that iser-target code should
> be waiting to receive TIMEWAIT_EVENT, instead of pro-actively calling
> rdma_disconnect() to trigger the disconnect.  Is that correct..?

Not instead. iser-target should call rdma_disconnect() upon a DISCONNECTED
event, but destroy the qp and cm_id only when it gets TIMEWAIT.

There are two separate matters here: sending the disconnect request/response,
and when to destroy the QP.

DISCONNECTED events come from two conditions:
static int cma_ib_handler()
{
         ...
         case IB_CM_DREQ_RECEIVED:    /* remote side sent a disconnect request */
         case IB_CM_DREP_RECEIVED:    /* remote side responded to our disconnect request */
                 if (!cma_comp_exch(id_priv, RDMA_CM_CONNECT,
                                    RDMA_CM_DISCONNECT))
                         goto out;
                 event.event = RDMA_CM_EVENT_DISCONNECTED;
                 break;
         ...
}

So regardless of whether iser-target initiated rdma_disconnect(), once it gets
a DISCONNECTED event it should call it *again* to respond to the initiator's
disconnect request (rdma_disconnect() sends a CM DREQ and, if that fails,
sends a CM DREP; this covers the case where both sides send DREQ and DREP).
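
To make the DREQ/DREP fallback concrete, here is a rough user-space model
of it. This is only a sketch, not the rdma_cm code: every model_* name and
the simplified state enum are made up for illustration.

```c
/* Simplified, hypothetical connection states (not the kernel's enum). */
enum model_cm_state {
    MODEL_ESTABLISHED,  /* connected, nothing sent or received yet */
    MODEL_DREQ_SENT,    /* we sent a disconnect request */
    MODEL_DREQ_RCVD,    /* the peer sent us a disconnect request */
    MODEL_TIMEWAIT      /* we replied with a DREP */
};

struct model_cm_id {
    enum model_cm_state state;
    int dreq_sent;
    int drep_sent;
};

/* Assumption: a DREQ can only go out while still established. */
static int model_send_dreq(struct model_cm_id *id)
{
    if (id->state != MODEL_ESTABLISHED)
        return -1;
    id->state = MODEL_DREQ_SENT;
    id->dreq_sent++;
    return 0;
}

/* Assumption: a DREP is only valid after the peer's DREQ arrived. */
static int model_send_drep(struct model_cm_id *id)
{
    if (id->state != MODEL_DREQ_RCVD)
        return -1;
    id->state = MODEL_TIMEWAIT;
    id->drep_sent++;
    return 0;
}

/* Try a DREQ first; if that fails, fall back to a DREP. */
static int model_disconnect(struct model_cm_id *id)
{
    if (model_send_dreq(id) == 0)
        return 0;
    return model_send_drep(id);
}
```

With this model, calling model_disconnect() on an established connection
sends a DREQ, while calling it after the peer's DREQ has arrived sends the
DREP instead, which is why the same call works in both directions.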

The QP and cm_id should be destroyed only once the TIMEWAIT event arrives.
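
Put together, the event handling suggested in this thread could look
roughly like the model below. Again a sketch under assumptions: the event
codes, struct and function names are hypothetical stand-ins, and the
non-zero return mirrors the "return non-zero from cma_handler" idea quoted
earlier rather than confirmed iser-target behaviour.

```c
#include <stdbool.h>

/* Hypothetical event codes standing in for the RDMA CM events. */
enum model_event { EV_DISCONNECTED, EV_TIMEWAIT };

struct model_conn {
    int disconnect_calls;  /* how often we called the disconnect routine */
    bool qp_destroyed;     /* set once the QP has been torn down */
};

/*
 * Returns 0 to keep the cm_id alive, non-zero to ask the CM layer
 * to destroy it on our behalf.
 */
static int model_cma_handler(struct model_conn *conn, enum model_event ev)
{
    switch (ev) {
    case EV_DISCONNECTED:
        /* Stand-in for rdma_disconnect(): answer the peer, keep the QP. */
        conn->disconnect_calls++;
        return 0;
    case EV_TIMEWAIT:
        /* Stand-in for rdma_destroy_qp(): safe only at this point. */
        conn->qp_destroyed = true;
        return 1;
    }
    return 0;
}
```

The point of the split is that DISCONNECTED only triggers the reply, while
all teardown is deferred to TIMEWAIT.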

Hope this wasn't even more confusing...

Sagi.
