* When is it safe to release connection resources?
From: Flavio Baronti @ 2011-12-29 16:42 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hello,
I'm new to RDMA development and I have a question regarding resource release.
If I understood correctly, when ibv_get_cq_event returns, it holds some sort of lock over the completion queue, which is
released when I call ibv_req_notify_cq. This lock is checked also in ibv_destroy_cq, so that:
1) When ibv_destroy_cq returns, I am certain that there is no thread running somewhere between ibv_get_cq_event and
ibv_req_notify_cq
2) When ibv_destroy_cq returns, I am certain that ibv_get_cq_event will not return the destroyed cq any more.
Is all this correct?
Thanks
Flavio
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: When is it safe to release connection resources?
From: Bart Van Assche @ 2011-12-31 10:55 UTC
To: Flavio Baronti; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Thu, Dec 29, 2011 at 4:42 PM, Flavio Baronti <f.baronti-ngIpsMLAhaq41k5uCYKmRQ@public.gmane.org> wrote:
> I'm new to RDMA development and I have a question regarding resource release.
> If I understood correctly, when ibv_get_cq_event returns, it holds some sort
> of lock over the completion queue, which is released when I call
> ibv_req_notify_cq. This lock is checked also in ibv_destroy_cq, so that:
> 1) When ibv_destroy_cq returns, I am certain that there is no thread running
> somewhere between ibv_get_cq_event and ibv_req_notify_cq
> 2) When ibv_destroy_cq returns, I am certain that ibv_get_cq_event will not
> return the destroyed cq any more.
There is a paragraph in the IBTA specification that warns that the above sequence can cause
WQE and data segment leakage in the HCA for QPs associated with an SRQ. I don't
know, though, whether this can also occur for QPs with a private receive queue -
I'm not an HCA firmware expert.
Bart.
* RE: When is it safe to release connection resources?
From: Hefty, Sean @ 2012-01-02 16:39 UTC
To: Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> If I understood correctly, when ibv_get_cq_event returns, it holds some sort
> of lock over the completion queue, which is
ibv_get_cq_event will increment a reference count on the CQ that it is returning.
> released when I call ibv_req_notify_cq. This lock is checked also in
You decrement the count with ibv_ack_cq_events. ibv_req_notify_cq is used to arm the cq, so that a completion generates an interrupt and a new event.
> ibv_destroy_cq, so that:
> 1) When ibv_destroy_cq returns, I am certain that there is no thread running
> somewhere between ibv_get_cq_event and
> ibv_req_notify_cq
ibv_destroy_cq will block until all outstanding references on the cq have been released. The intent is to protect the user from ibv_get_cq_event returning a reference to a cq that is being destroyed by another thread, which could result in a crash in the user's code.
> 2) When ibv_destroy_cq returns, I am certain that ibv_get_cq_event will not
> return the destroyed cq any more.
correct
- Sean
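
The counting described in the reply above can be pictured with a small toy model. This is only a sketch under stated assumptions: the struct and every function name below are hypothetical stand-ins and are not libibverbs APIs or its implementation. It shows that arming the CQ leaves the count untouched, and that a destroy call would only stop waiting once every delivered event has been acknowledged.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the libibverbs data structures. */
struct model_cq {
    unsigned int events_delivered; /* times a "get" call returned this CQ */
    unsigned int events_acked;     /* events acknowledged so far */
    bool armed;                    /* whether a notification was requested */
};

/* Stands in for ibv_get_cq_event returning this CQ: takes one reference. */
static void model_get_event(struct model_cq *cq)
{
    cq->events_delivered++;
}

/* Stands in for ibv_req_notify_cq: arms the CQ, releases nothing. */
static void model_req_notify(struct model_cq *cq)
{
    cq->armed = true;
}

/* Stands in for ibv_ack_cq_events: releases nevents references. */
static void model_ack_events(struct model_cq *cq, unsigned int nevents)
{
    /* Acknowledging more events than were delivered is a caller bug. */
    assert(cq->events_acked + nevents <= cq->events_delivered);
    cq->events_acked += nevents;
}

/* True while a destroy call would still have to wait for acknowledgements. */
static bool model_destroy_would_block(const struct model_cq *cq)
{
    return cq->events_acked != cq->events_delivered;
}
```

In this model, calling model_get_event and then model_req_notify leaves model_destroy_would_block true; only model_ack_events clears it.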
* Re: When is it safe to release connection resources?
From: Flavio Baronti @ 2012-01-04 9:42 UTC
To: Hefty, Sean; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
So in order to safely call ibv_destroy_cq, I should call ibv_ack_cq_events *after* ibv_poll_cq? The example on the man page for ibv_get_cq_event calls it before; is that an error?
Flavio
On 1/2/2012 17:39 PM, Hefty, Sean wrote:
>> If I understood correctly, when ibv_get_cq_event returns, it holds some sort
>> of lock over the completion queue, which is
>
> ibv_get_cq_event will increment a reference count on the CQ that it is returning.
>
>> released when I call ibv_req_notify_cq. This lock is checked also in
>
> You decrement the count with ibv_ack_cq_events. ibv_req_notify_cq is used to arm the cq, so that a completion generates an interrupt and a new event.
>
>> ibv_destroy_cq, so that:
>> 1) When ibv_destroy_cq returns, I am certain that there is no thread running
>> somewhere between ibv_get_cq_event and
>> ibv_req_notify_cq
>
> ibv_destroy_cq will block until all outstanding references on the cq have been released. The intent is to protect the user from ibv_get_cq_event returning a reference to a cq that is being destroyed by another thread, which could result in a crash in the user's code.
>
>> 2) When ibv_destroy_cq returns, I am certain that ibv_get_cq_event will not
>> return the destroyed cq any more.
>
> correct
>
> - Sean
* RE: When is it safe to release connection resources?
From: Hefty, Sean @ 2012-01-04 16:04 UTC
To: Flavio Baronti; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> So in order to safely call ibv_destroy_cq, I should call ibv_ack_cq_events
> *after* ibv_poll_cq? The example on the man
> page for ibv_get_cq_event calls it before, is it an error?
You should call ibv_ack_cq_events after ibv_get_cq_event but before ibv_destroy_cq. That can come before or after calling ibv_poll_cq.
And, although it's probably easiest to call ack right after get, note that you don't need a 1:1 call ratio between get and ack. You can keep a count of the number of times that ibv_get_cq_event returns a specific cq, and then call ibv_ack_cq_events for that amount just before calling ibv_destroy_cq.
- Sean
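
The "keep a count and ack once" approach from the reply above could be sketched roughly as follows. The tracker struct, the function names, and the ack_fn callback type are all hypothetical helpers introduced for illustration; none of them is part of libibverbs. The callback is assumed to be a thin wrapper around ibv_ack_cq_events(cq, nevents) in real code; here a fake one just adds to a counter so the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; real code would wrap ibv_ack_cq_events. */
typedef void (*ack_fn)(void *cq, unsigned int nevents);

/* Hypothetical per-CQ bookkeeping, not a libibverbs structure. */
struct cq_ack_tracker {
    void *cq;             /* in real code: struct ibv_cq * */
    unsigned int unacked; /* events returned for this cq, not yet acked */
};

/* Call once each time the event call returns this cq. */
static void tracker_note_event(struct cq_ack_tracker *t)
{
    t->unacked++;
}

/* Call just before destroying the cq: one ack covering all pending events.
 * Returns the number of events that were acknowledged. */
static unsigned int tracker_flush(struct cq_ack_tracker *t, ack_fn ack)
{
    unsigned int n = t->unacked;

    assert(ack != NULL);
    if (n > 0) {
        ack(t->cq, n);
        t->unacked = 0;
    }
    return n;
}

/* Stand-in for the real ack: treats cq as a pointer to a running total. */
static void fake_ack(void *cq, unsigned int nevents)
{
    *(unsigned int *)cq += nevents;
}
```

With three noted events, a single tracker_flush call acknowledges all three at once, and a second flush does nothing. Whether batching is worthwhile depends on the application; acking immediately after each get remains the simpler option.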
* Re: When is it safe to release connection resources?
From: Bart Van Assche @ 2012-01-04 19:53 UTC
To: Hefty, Sean; +Cc: Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Mon, Jan 2, 2012 at 4:39 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> If I understood correctly, when ibv_get_cq_event returns, it holds some sort
>> of lock over the completion queue, which is
>> ibv_destroy_cq, so that:
>> 1) When ibv_destroy_cq returns, I am certain that there is no thread running
>> somewhere between ibv_get_cq_event and
>> ibv_req_notify_cq
>
> ibv_destroy_cq will block until all outstanding references on the cq have been released.
> The intent is to protect the user from ibv_get_cq_event from returning a reference to a cq
> that is being destroyed from another thread, which could result in a crash in the user's code.
I've just had a look at the kernel code that implements all this (uverbs_cmd.c and uverbs_main.c). I haven't found any precautions against ib_uverbs_comp_handler() accessing *uobj after ib_uverbs_destroy_cq() has invoked put_uobj(uobj). Did I miss something?
Bart.
* RE: When is it safe to release connection resources?
From: Hefty, Sean @ 2012-01-04 20:05 UTC
To: Bart Van Assche; +Cc: Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> I've just had a look at the kernel code that implements all this
> (uverbs_cmd.c and uverbs_main.c). I haven't found any precautions
> against ib_uverbs_comp_handler() accessing *uobj after
> ib_uverbs_destroy_cq() has invoked put_uobj(uobj). Did I miss
> something ?
The kernel operation is different, since it relies on callbacks. When the kernel ib_destroy_cq() returns, we are guaranteed that the completion handler (ib_uverbs_comp_handler) is not executing and will not be called. After destroying the kernel cq, ib_uverbs_destroy_cq() will remove all references to the destroyed cq from the event list.
The issue is that another thread could have retrieved an event for this cq before the cleanup occurs. When ib_uverbs_destroy_cq unwinds back to user space, it returns the total number of events that were retrieved from the kernel. ibv_destroy_cq blocks until the application has processed all cq events, which are indicated by the app calling ibv_ack_cq_events().
- Sean
* Re: When is it safe to release connection resources?
From: Bart Van Assche @ 2012-01-05 11:23 UTC
To: Hefty, Sean; +Cc: Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Wed, Jan 4, 2012 at 9:05 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> I've just had a look at the kernel code that implements all this
>> (uverbs_cmd.c and uverbs_main.c). I haven't found any precautions
>> against ib_uverbs_comp_handler() accessing *uobj after
>> ib_uverbs_destroy_cq() has invoked put_uobj(uobj). Did I miss
>> something ?
>
> The kernel operation is different, since it relies on callbacks. When the kernel ib_destroy_cq()
> returns, we are guaranteed that the completion handler (ib_uverbs_comp_handler) is not executing
> and will not be called. After destroying the kernel cq, ib_uverbs_destroy_cq() will remove all
> references to the destroyed cq from the event list.
Sorry if I wasn't clear enough, but I was referring to the case where the kernel completion handler gets invoked after ib_uverbs_destroy_cq() has started but before that function has invoked ib_destroy_cq().
More generally, assuming that completion notifications have been enabled, I'm wondering whether it is possible to shut down a queue pair without triggering a race condition if that queue pair hasn't been transitioned to the reset or error state first.
Bart.
* Re: When is it safe to release connection resources?
From: Roland Dreier @ 2012-01-05 17:29 UTC
To: Bart Van Assche; +Cc: Hefty, Sean, Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, Jan 5, 2012 at 3:23 AM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
> Sorry if I wasn't clear enough, but I was referring to the case where
> the kernel completion handler gets invoked after
> ib_uverbs_destroy_cq() has started but before that function has
> invoked ib_destroy_cq().
That should be OK. ib_uverbs_destroy_cq() doesn't really do anything until ib_destroy_cq() has returned, and at that point it is guaranteed that the completion handler for the CQ is done.
> More in general - assuming that completion notifications have been
> enabled - I'm wondering whether it is possible to shut down a queue
> pair without triggering a race condition if that queue pair hasn't
> been transitioned to the reset or error state first.
Destroying a QP is pretty much equivalent to transitioning it to the reset state.
- R.
* Re: When is it safe to release connection resources?
From: Bart Van Assche @ 2012-01-07 9:59 UTC
To: Roland Dreier; +Cc: Hefty, Sean, Flavio Baronti, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, Jan 5, 2012 at 5:29 PM, Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> wrote:
> That should be OK. ib_uverbs_destroy_cq() doesn't really do anything
> until ib_destroy_cq() has returned, and at that point it is guaranteed
> that the completion handler for the CQ is done.
That sounds like a requirement for low-level IB drivers. Has it already been considered to document this requirement in Documentation/infiniband/core_locking.txt?
Thanks,
Bart.