* dat_ep_disconnect() with ABRUPT
@ 2010-11-18 17:40 Pradeep Satyanarayana
[not found] ` <4CE5650C.1090706-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
0 siblings, 1 reply; 2+ messages in thread
From: Pradeep Satyanarayana @ 2010-11-18 17:40 UTC (permalink / raw)
To: Davis, Arlin R; +Cc: linux-rdma
Hi Arlin,
We are seeing some issues with dat_ep_disconnect() with ABRUPT flag. In fact it appears that the ABRUPT flag seems
to behave like the GRACEFUL flag. One difference between the DAT1.2 and DAT2.0 appears to be the following:
In dapls_ib_disconnect()
/* ABRUPT close, wait for callback and DISCONNECTED state */
if (close_flags == DAT_CLOSE_ABRUPT_FLAG) {
dapl_os_lock(&ep_ptr->header.lock);
while (ep_ptr->param.ep_state != DAT_EP_STATE_DISCONNECTED) {
dapl_os_unlock(&ep_ptr->header.lock);
dapl_os_sleep_usec(10000);
dapl_os_lock(&ep_ptr->header.lock);
}
dapl_os_unlock(&ep_ptr->header.lock);
}
this loop exists in DAT2.0 and has been removed in DAT1.2. I am not sure why this is leading to different behaviors
in DAT1.2 and DAT2.0.
One thought is that both DAT1.2 and DAT2.0 have a missing check for ABRUPT flag in dapl_ep_disconnect().
if ( ep_ptr->param.ep_state == DAT_EP_STATE_ACTIVE_CONNECTION_PENDING ||
ep_ptr->param.ep_state == DAT_EP_STATE_COMPLETION_PENDING )
{
/*
* Beginning or waiting on a connection: abort and reset the
* state
*/
ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
dapl_os_unlock ( &ep_ptr->header.lock );
/* disconnect and make sure we get no callbacks */
(void) dapls_ib_disconnect (ep_ptr, DAT_CLOSE_ABRUPT_FLAG);
/* clean up connection state */
dapl_sp_remove_ep (ep_ptr);
evd_ptr = (DAPL_EVD *) ep_ptr->param.connect_evd_handle;
dapls_evd_post_connection_event (evd_ptr,
DAT_CONNECTION_EVENT_DISCONNECTED,
(DAT_HANDLE) ep_ptr,
0,
0);
dat_status = DAT_SUCCESS;
goto bail;
}
The if condition above should also have an additional check for disconnect_flags == ABRUPT. If the EP is in a
CONNECTED state and the remote end crashes and this node calls dat_ep_disconnect() with ABRUPT, the if condition
is not true and it is treated as though it was a GRACEFUL disconnect. If you agree with the assessment I can build
a patch.
Thanks
Pradeep
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 2+ messages in thread[parent not found: <4CE5650C.1090706-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* Re: dat_ep_disconnect() with ABRUPT [not found] ` <4CE5650C.1090706-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> @ 2010-11-18 19:51 ` Pradeep Satyanarayana 0 siblings, 0 replies; 2+ messages in thread From: Pradeep Satyanarayana @ 2010-11-18 19:51 UTC (permalink / raw) To: Davis, Arlin R; +Cc: linux-rdma On 11/18/2010 9:40 AM, Pradeep Satyanarayana wrote: > Hi Arlin, > > We are seeing some issues with dat_ep_disconnect() with ABRUPT flag. In fact it appears that the ABRUPT flag seems > to behave like the GRACEFUL flag. One difference between the DAT1.2 and DAT2.0 appears to be the following: > > In dapls_ib_disconnect() > > > /* ABRUPT close, wait for callback and DISCONNECTED state */ > if (close_flags == DAT_CLOSE_ABRUPT_FLAG) { > dapl_os_lock(&ep_ptr->header.lock); > while (ep_ptr->param.ep_state != DAT_EP_STATE_DISCONNECTED) { > dapl_os_unlock(&ep_ptr->header.lock); > dapl_os_sleep_usec(10000); > dapl_os_lock(&ep_ptr->header.lock); > } > dapl_os_unlock(&ep_ptr->header.lock); > } > > this loop exists in DAT2.0 and has been removed in DAT1.2. I am not sure why this is leading to different behaviors > in DAT1.2 and DAT2.0. I believe I understand this behavior. With the ABRUPT flag set it enters dapls_ib_disconnect() (incorrectly), with DAT_EP_STATE_DISCONNECT_PENDING set. Hence it gets stuck in the while loop above and behaves no differently from the GRACEFUL case. One way would be to remove the while loop above as in DAT1.2, or the other alternative (probably more correct?) is what is proposed below. What are your thoughts? Thanks Pradeep > > One thought is that both DAT1.2 and DAT2.0 have a missing check for ABRUPT flag in dapl_ep_disconnect(). > > if ( ep_ptr->param.ep_state == DAT_EP_STATE_ACTIVE_CONNECTION_PENDING || > ep_ptr->param.ep_state == DAT_EP_STATE_COMPLETION_PENDING ) > { > /* > * Beginning or waiting on a connection: abort and reset the > * state > */ > ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED; > > dapl_os_unlock ( &ep_ptr->header.lock ); > /* disconnect and make sure we get no callbacks */ > (void) dapls_ib_disconnect (ep_ptr, DAT_CLOSE_ABRUPT_FLAG); > > /* clean up connection state */ > dapl_sp_remove_ep (ep_ptr); > > evd_ptr = (DAPL_EVD *) ep_ptr->param.connect_evd_handle; > dapls_evd_post_connection_event (evd_ptr, > DAT_CONNECTION_EVENT_DISCONNECTED, > (DAT_HANDLE) ep_ptr, > 0, > 0); > dat_status = DAT_SUCCESS; > goto bail; > } > > The if condition above should also have an additional check for disconnect_flags == ABRUPT. If the EP is in a > CONNECTED state and the remote end crashes and this node calls dat_ep_disconnect() with ABRUPT, the if condition > is not true and it is treated as though it was a GRACEFUL disconnect. If you agree with the assessment I can build > a patch. > > Thanks > Pradeep > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2010-11-18 19:51 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-18 17:40 dat_ep_disconnect() with ABRUPT Pradeep Satyanarayana
[not found] ` <4CE5650C.1090706-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2010-11-18 19:51 ` Pradeep Satyanarayana
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).