* how to preserve QP over HA events for librdmacm applications
@ 2012-09-19 15:43 Or Gerlitz
[not found] ` <5059E82E.9020600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2012-09-19 15:43 UTC (permalink / raw)
To: Hefty, Sean
Cc: Alex Rosenbaum,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
Hi Sean,
We have a case here where an app that uses librdmacm wants to preserve its
QP over HA events such as IB link down/up. Specifically, the sequence of
operations done by the app is the following:
1. rdma_create_id using the IPoIB port space
2. rdma_bind_addr
3. rdma_create_qp using UD QP type.
We are looking for a way to reset this QP such that any pending send
buffers are flushed out and the QP then becomes functional again (in the
RTS state), ideally with the same QPN.
Using rdma_disconnect indeed moves the QP to the error state and the
buffers are flushed; however, there's no way to move the QP back to RTS
via librdmacm.
Can this flushing somehow be done with the current librdmacm/libibverbs APIs,
or do we need some enhancement?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <5059E82E.9020600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <5059E82E.9020600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-09-19 15:48 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E418-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Hefty, Sean @ 2012-09-19 15:48 UTC (permalink / raw)
To: Or Gerlitz
Cc: Alex Rosenbaum,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
> Can this flushing somehow be done with the current librdmacm/libibverbs APIs,
> or do we need some enhancement?
You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway.
- Sean
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <1828884A29C6694DAF28B7E6B8A8237346A8E418-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E418-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-09-19 15:52 ` Or Gerlitz
[not found] ` <5059EA48.1040407-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2012-09-19 15:52 UTC (permalink / raw)
To: Alex Rosenbaum
Cc: Hefty, Sean,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On 19/09/2012 18:48, Hefty, Sean wrote:
>> Can this flushing somehow be done with the current librdmacm/libibverbs APIs,
>> or do we need some enhancement?
> You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway.
Alex,
Any reason we can't deploy this hack? Is it that, for the IPoIB port space, it would require copying some low-level code from librdmacm or even from the kernel, e.g. the IPoIB qkey, etc.?
Or.
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <5059EA48.1040407-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <5059EA48.1040407-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-09-19 15:58 ` Alex Rosenbaum
[not found] ` <A4E971F4031F1840BBA6E79B417E62E82CF29C3B-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Alex Rosenbaum @ 2012-09-19 15:58 UTC (permalink / raw)
To: Or Gerlitz
Cc: Hefty, Sean,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
Since we use RDMA_PS_IPOIB, we need librdmacm to help get the correct pkey_index and qkey (in the INIT->RTR transition) so that they match the values of IPoIB's own UD QP. If not, then our user-space UD QP will not be able to send to or receive from IPoIB on remote machines (which is what we want to gain by using the IPoIB port space).
Maybe we can save the values used by rdma_create_qp and reuse them when we modify the UD QP state through libibverbs (ibv_modify_qp).
It would be nice if we had access to librdmacm's modify-QP wrapper to do this cleanly from the application level.
Alex
^ permalink raw reply [flat|nested] 15+ messages in thread
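[Editorial sketch, not part of the original thread.] The approach discussed above (call verbs directly and reuse the values librdmacm programmed into the QP) could look roughly like the following. It is a minimal sketch under several assumptions: the helper name ud_qp_recycle is hypothetical, the QP is a UD QP already in the error state, all flushed completions have been drained from its CQs, and the provider reports pkey_index, port and qkey through ibv_query_qp for a QP in that state. If the last assumption does not hold, the values would have to be saved right after rdma_create_qp, as Alex suggests. Per the ibv_modify_qp man page, for a UD QP the P_Key index, port and Q_Key are supplied on the RESET->INIT transition.

```c
#include <infiniband/verbs.h>
#include <string.h>

/*
 * Hypothetical helper: cycle a UD QP from the error state back to RTS,
 * keeping the same QP object (and therefore the same QPN).
 * Returns 0 on success, or the non-zero error value of the failing verb.
 */
static int ud_qp_recycle(struct ibv_qp *qp)
{
    struct ibv_qp_attr saved, attr;
    struct ibv_qp_init_attr init_attr;
    int ret;

    /* Save the values that must keep matching IPoIB's own UD QP. */
    memset(&saved, 0, sizeof(saved));
    memset(&init_attr, 0, sizeof(init_attr));
    ret = ibv_query_qp(qp, &saved,
                       IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_QKEY,
                       &init_attr);
    if (ret)
        return ret;

    /* ERR -> RESET: empties the work queues; drain the CQ beforehand. */
    memset(&attr, 0, sizeof(attr));
    attr.qp_state = IBV_QPS_RESET;
    ret = ibv_modify_qp(qp, &attr, IBV_QP_STATE);
    if (ret)
        return ret;

    /* RESET -> INIT: restore the saved P_Key index, port and Q_Key. */
    memset(&attr, 0, sizeof(attr));
    attr.qp_state   = IBV_QPS_INIT;
    attr.pkey_index = saved.pkey_index;
    attr.port_num   = saved.port_num;
    attr.qkey       = saved.qkey;
    ret = ibv_modify_qp(qp, &attr,
                        IBV_QP_STATE | IBV_QP_PKEY_INDEX |
                        IBV_QP_PORT | IBV_QP_QKEY);
    if (ret)
        return ret;

    /* INIT -> RTR: a UD QP needs only the state change here. */
    memset(&attr, 0, sizeof(attr));
    attr.qp_state = IBV_QPS_RTR;
    ret = ibv_modify_qp(qp, &attr, IBV_QP_STATE);
    if (ret)
        return ret;

    /* RTR -> RTS: the send queue PSN must be supplied; 0 is arbitrary. */
    memset(&attr, 0, sizeof(attr));
    attr.qp_state = IBV_QPS_RTS;
    attr.sq_psn   = 0;
    return ibv_modify_qp(qp, &attr, IBV_QP_STATE | IBV_QP_SQ_PSN);
}
```

With librdmacm the QP is reachable as id->qp, so a call would be ud_qp_recycle(id->qp). Receive buffers have to be reposted afterwards, since the reset discards them. As Sean notes, the CM state is left unchanged by this.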
[parent not found: <A4E971F4031F1840BBA6E79B417E62E82CF29C3B-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <A4E971F4031F1840BBA6E79B417E62E82CF29C3B-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
@ 2012-09-19 16:52 ` Atchley, Scott
[not found] ` <46C75A5F-AD9F-45CF-A441-B7D5F60709D8-1Heg1YXhbW8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Atchley, Scott @ 2012-09-19 16:52 UTC (permalink / raw)
To: Alex Rosenbaum
Cc: Or Gerlitz, Hefty, Sean,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On Sep 19, 2012, at 11:58 AM, Alex Rosenbaum <alexr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Since we use the RDMA_PS_IPOIB we need librdmacm to help get the correct pkey_index and qkey (in INIT->RTR transition) to match IPoIB's UD QP own values. If not, then our user space UD QP will not be able to send/recv from IPoIB on remote machines (which is what we want to gain by using the IPOIB port space).
>
> Maybe we can save the values used from the rdma_create_qp and reuse them once modify the UD QP state by libverbs (ibv_modify_qp).
> It would be nice if we had access to the rdma's modify qp wrapper to do this nicely from application level.
I too would be interested in bringing a QP from error back to a usable state. I have been debating whether to reconnect using the current RDMA calls versus trying to transition the existing RC QP.
I assumed that, to transition the existing QP, I would need to open a socket to coordinate the two sides. Is that correct?
If I were instead to use rdma_connect(), does it require a new CM id or just a new QP within the same id?
Thanks,
Scott
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <46C75A5F-AD9F-45CF-A441-B7D5F60709D8-1Heg1YXhbW8@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <46C75A5F-AD9F-45CF-A441-B7D5F60709D8-1Heg1YXhbW8@public.gmane.org>
@ 2012-09-19 17:05 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E47E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Hefty, Sean @ 2012-09-19 17:05 UTC (permalink / raw)
To: Atchley, Scott, Alex Rosenbaum
Cc: Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
> I too would be interested in bringing a QP from error back to a usable state. I
> have been debating whether to reconnect using the current RDMA calls versus
> trying to transition the existing RC QP.
>
> I assumed to transition the existing QP that I would need to open a socket to
> coordinate the two sides. Is that correct?
>
> If I were instead to use rdma_connect(), does it require a new CM id or just a
> new QP within the same id?
What do you gain by transitioning an RC QP from error to RTS, versus just establishing a new connection?
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <1828884A29C6694DAF28B7E6B8A8237346A8E47E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E47E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-09-19 18:14 ` Atchley, Scott
[not found] ` <86756672-ADCC-4EF0-A24C-19C4A0EB8188-1Heg1YXhbW8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Atchley, Scott @ 2012-09-19 18:14 UTC (permalink / raw)
To: Hefty, Sean
Cc: Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On Sep 19, 2012, at 1:05 PM, "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> I too would be interested in bringing a QP from error back to a usable state. I
>> have been debating whether to reconnect using the current RDMA calls versus
>> trying to transition the existing RC QP.
>>
>> I assumed to transition the existing QP that I would need to open a socket to
>> coordinate the two sides. Is that correct?
>>
>> If I were instead to use rdma_connect(), does it require a new CM id or just a
>> new QP within the same id?
>
> What do you gain by transitioning an RC QP from error to RTS, versus just establishing a new connection?
I have a certain amount of state regarding a peer. I look up that state based on the qp_num returned within a work completion, for example. If I reconnect, I will need to migrate the state from the old qp_num to the new qp_num.
I have no preference, which is why I asked about the two options (opening a socket to coordinate state transitions versus connecting with a new QP).
Scott
^ permalink raw reply [flat|nested] 15+ messages in thread
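[Editorial sketch, not part of the original thread.] The migration Scott describes, moving per-peer state from the old qp_num to the new one after a reconnect, can be illustrated with a small lookup table. Everything here is hypothetical: the struct peer_state layout, the bytes_received payload field, the fixed MAX_PEERS size and the linear scan are placeholders for whatever the application really keeps, and a real implementation would more likely use a hash table.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PEERS 16            /* illustrative fixed capacity */

struct peer_state {
    int      in_use;
    uint32_t qp_num;            /* key: the qp_num seen in a work completion */
    uint64_t bytes_received;    /* stand-in for application state */
};

static struct peer_state peers[MAX_PEERS];

/* Find the state for a QP number, or NULL if it is unknown. */
static struct peer_state *peer_lookup(uint32_t qp_num)
{
    for (size_t i = 0; i < MAX_PEERS; i++)
        if (peers[i].in_use && peers[i].qp_num == qp_num)
            return &peers[i];
    return NULL;
}

/* Register a new peer; returns NULL on a duplicate key or a full table. */
static struct peer_state *peer_add(uint32_t qp_num)
{
    if (peer_lookup(qp_num))
        return NULL;
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (!peers[i].in_use) {
            peers[i].in_use = 1;
            peers[i].qp_num = qp_num;
            peers[i].bytes_received = 0;
            return &peers[i];
        }
    }
    return NULL;
}

/* After reconnecting with a new QP, move the state to the new key. */
static int peer_rekey(uint32_t old_qp_num, uint32_t new_qp_num)
{
    struct peer_state *p = peer_lookup(old_qp_num);

    if (!p || peer_lookup(new_qp_num))
        return -1;              /* unknown old key, or new key already taken */
    p->qp_num = new_qp_num;
    return 0;
}
```

The state object itself never moves, so pointers held elsewhere in the application stay valid; only the key changes. Completions flushed from the old QP should be handled before the re-key, because they still carry the old qp_num.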
[parent not found: <86756672-ADCC-4EF0-A24C-19C4A0EB8188-1Heg1YXhbW8@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <86756672-ADCC-4EF0-A24C-19C4A0EB8188-1Heg1YXhbW8@public.gmane.org>
@ 2012-09-19 18:39 ` Atchley, Scott
[not found] ` <16AD9776-40CA-4106-8F3D-A974067EEE2A-1Heg1YXhbW8@public.gmane.org>
2012-09-20 17:37 ` Pradeep Satyanarayana
1 sibling, 1 reply; 15+ messages in thread
From: Atchley, Scott @ 2012-09-19 18:39 UTC (permalink / raw)
To: Hefty, Sean
Cc: Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On Sep 19, 2012, at 2:14 PM, "Atchley, Scott" <atchleyes-1Heg1YXhbW8@public.gmane.org> wrote:
> On Sep 19, 2012, at 1:05 PM, "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
>>> I too would be interested in bringing a QP from error back to a usable state. I
>>> have been debating whether to reconnect using the current RDMA calls versus
>>> trying to transition the existing RC QP.
>>>
>>> I assumed to transition the existing QP that I would need to open a socket to
>>> coordinate the two sides. Is that correct?
>>>
>>> If I were instead to use rdma_connect(), does it require a new CM id or just a
>>> new QP within the same id?
>>
>> What do you gain by transitioning an RC QP from error to RTS, versus just establishing a new connection?
>
> I have a certain amount of state regarding a peer. I look up that state based on the qp_num returned within a work completion, for example. If I reconnect, I will need to migrate the state from the old qp_num to the new qp_num.
>
> I have no preference, which is why I asked about the two options (opening a socket to coordinate state transitions versus connecting with a new QP).
I don't know if it matters to the conversation or not, but I use an SRQ. I am unclear how to remove a QP from the SRQ. Is ibv_destroy_qp() sufficient? Or do I need to use rdma_destroy_qp()?
Basically, I use the rdma_* calls for connection setup. After that, I use only ibv_* calls for communication (Send/Recv and RDMA).
Scott
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <16AD9776-40CA-4106-8F3D-A974067EEE2A-1Heg1YXhbW8@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <16AD9776-40CA-4106-8F3D-A974067EEE2A-1Heg1YXhbW8@public.gmane.org>
@ 2012-09-19 19:22 ` Hefty, Sean
0 siblings, 0 replies; 15+ messages in thread
From: Hefty, Sean @ 2012-09-19 19:22 UTC (permalink / raw)
To: Atchley, Scott
Cc: Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
> I don't know if it matters to the conversation or not, but I use an SRQ. I am
> unclear how to remove a QP from the SRQ. Is ibv_destroy_qp() sufficient? Or do
> I need to use rdma_destroy_qp()?
rdma_destroy_qp() is a wrapper around ibv_destroy_qp(), plus destroys any internally allocated resources, like CQs, if the rdma_cm allocated those for the user. If you called ibv_create_cq yourself, either is sufficient.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <86756672-ADCC-4EF0-A24C-19C4A0EB8188-1Heg1YXhbW8@public.gmane.org>
2012-09-19 18:39 ` Atchley, Scott
@ 2012-09-20 17:37 ` Pradeep Satyanarayana
[not found] ` <505B5470.9030707-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
1 sibling, 1 reply; 15+ messages in thread
From: Pradeep Satyanarayana @ 2012-09-20 17:37 UTC (permalink / raw)
To: Atchley, Scott
Cc: Hefty, Sean, Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On 09/19/2012 11:14 AM, Atchley, Scott wrote:
> On Sep 19, 2012, at 1:05 PM, "Hefty, Sean"<sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
>>> I too would be interested in bringing a QP from error back to a usable state. I
>>> have been debating whether to reconnect using the current RDMA calls versus
>>> trying to transition the existing RC QP.
>>>
>>> I assumed to transition the existing QP that I would need to open a socket to
>>> coordinate the two sides. Is that correct?
>>>
>>> If I were instead to use rdma_connect(), does it require a new CM id or just a
>>> new QP within the same id?
What if you, say, pre-created a second (failover) QP for HA purposes, all under the covers of a single socket, with both QPs connected before the failure? I am not sure whether that would work with the same CM id, though. If not, we will need to rdma_connect() the second QP after the failure.
By having a second QP bound to, say, a different port/device, one could survive not just link up/down events but device failures too. Would that be more generic?
Thanks
Pradeep
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <505B5470.9030707-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <505B5470.9030707-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2012-09-20 18:18 ` Atchley, Scott
2012-09-20 20:10 ` Hefty, Sean
1 sibling, 0 replies; 15+ messages in thread
From: Atchley, Scott @ 2012-09-20 18:18 UTC (permalink / raw)
To: Pradeep Satyanarayana
Cc: Hefty, Sean, Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On Sep 20, 2012, at 1:37 PM, Pradeep Satyanarayana <pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> On 09/19/2012 11:14 AM, Atchley, Scott wrote:
>> On Sep 19, 2012, at 1:05 PM, "Hefty, Sean"<sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>
>>>> I too would be interested in bringing a QP from error back to a usable state. I
>>>> have been debating whether to reconnect using the current RDMA calls versus
>>>> trying to transition the existing RC QP.
>>>>
>>>> I assumed to transition the existing QP that I would need to open a socket to
>>>> coordinate the two sides. Is that correct?
>>>>
>>>> If I were instead to use rdma_connect(), does it require a new CM id or just a
>>>> new QP within the same id?
>
> What if you say pre-created a second (fail over) QP for HA purposes all
> under the covers of a single socket? And both QPs were connected before
> the failure. Not sure if that would work with the same CM id though. If
> not, we will need to rdma_connect() the second QP after failure.
>
> By having a second QP and bound to say a different port/device, one
> could survive not just link up/down events, but device failures too.
> Would that be more generic?
Hi Pradeep,
What is the memory cost of a QP? I assume it will require a second CM id as well. Involving a second device and/or port is not an option for my usage.
Scott
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <505B5470.9030707-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2012-09-20 18:18 ` Atchley, Scott
@ 2012-09-20 20:10 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E77F-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
1 sibling, 1 reply; 15+ messages in thread
From: Hefty, Sean @ 2012-09-20 20:10 UTC (permalink / raw)
To: Pradeep Satyanarayana, Atchley, Scott
Cc: Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
> What if you say pre-created a second (fail over) QP for HA purposes all
> under the covers of a single socket? And both QPs were connected before
> the failure. Not sure if that would work with the same CM id though. If
> not, we will need to rdma_connect() the second QP after failure.
CM IDs are not shared across devices, and can't be reused for different QPs until the first connection has been torn down and gone through timewait. For IB, you probably want path migration capabilities.
Anything more generic should really be handled by the application. Migrating a connection between devices also requires using different CQs, PDs, MRs, etc.
- Sean
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <1828884A29C6694DAF28B7E6B8A8237346A8E77F-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: how to preserve QP over HA events for librdmacm applications
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E77F-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-09-20 20:57 ` Pradeep Satyanarayana
[not found] ` <505B8349.4050402-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Pradeep Satyanarayana @ 2012-09-20 20:57 UTC (permalink / raw)
To: Hefty, Sean
Cc: Atchley, Scott, Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
On 09/20/2012 01:10 PM, Hefty, Sean wrote:
>> What if you say pre-created a second (fail over) QP for HA purposes all
>> under the covers of a single socket? And both QPs were connected before
>> the failure. Not sure if that would work with the same CM id though. If
>> not, we will need to rdma_connect() the second QP after failure.
>
> CM IDs are not shared across devices, and can't be reused for different QPs until the first connection has been torn down and gone through timewait. For IB, you probably want path migration capabilities.
>
Fair enough, I understand one needs to use a different CM id. For the IB
case I was thinking of avoiding APM (since that is limited to a device
- isn't that so?).
> Anything more generic should really be handled by the application. Migrating a connection between devices also requires using different CQs, PDs, MRs, etc.
>
Is PD device specific? Couldn't one reuse the same CQs and MRs, even
though the QP is different? Of course only one QP would be active at any
time.
Thanks
Pradeep
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <505B8349.4050402-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <505B8349.4050402-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2012-09-20 21:52 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E89E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Hefty, Sean @ 2012-09-20 21:52 UTC (permalink / raw)
To: Pradeep Satyanarayana
Cc: Atchley, Scott, Alex Rosenbaum, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)
> Fair enough, I understand one needs to use a different CM id. For the IB
> case I was thinking of avoiding APM (since that is limited to a device
> -isn't that so?).
APM is limited to a single device, as is memory registration, CQs, PDs, SRQs, etc. Migration between devices requires entirely new memory registrations, the use of different lkeys/rkeys, and new CQs. There's no guarantee that the HW devices support the same features - registration size, QP size, CQ size, etc.
> Is PD device specific? Couldn't one reuse the same CQs and MRs, even
> though the QP is different? Of course only one QP would be active at any
> time.
You can only reuse the resources if you limit yourself to the same device. Supporting migration between devices requires a higher level abstraction which hides the internal RDMA device details.
HA itself likely requires more than simply establishing a new connection. You may need to resolve the addresses again, to determine where to migrate to, plus obtain new path records. Any app that wants full HA capability really needs to be able to handle a connection failing completely and establishing a new one.
- Sean
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <1828884A29C6694DAF28B7E6B8A8237346A8E89E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* RE: how to preserve QP over HA events for librdmacm applications
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E89E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-09-22 15:57 ` Alex Rosenbaum
0 siblings, 0 replies; 15+ messages in thread
From: Alex Rosenbaum @ 2012-09-22 15:57 UTC (permalink / raw)
To: Hefty, Sean
Cc: Atchley, Scott, Or Gerlitz,
linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org),
Pradeep Satyanarayana
Sean,
I have been thinking of improving the HA support for verbs via the RDMA APIs.
One RDMA API that is missing, in my opinion, is something that resembles 'setsockopt(s, SO_BINDTODEVICE, char* ifname)'.
The current problem, as I see it, is that if you define an IPoIB bonding interface you get two (or more) net devices which don't have an IP address. Only the bond interface gets an IP address, and it will get the HW address according to the active net device interface in use. An RDMA application will call rdma_bind_addr("bond_ip_addr") and, depending on which net device is active, it will be able to create a QP only on that ibv_device+port. Such an application will have to wait for RDMA_CM_EVENT_ADDR_CHANGE and restart the cma_id and its QP to learn of the new ibv_device+port.
I would like to create, on application startup, several QPs on all the net devices under the bond interface, which I cannot do today via RDMA CM. Once I have created them all, I can call ibv_attach_mcast() on application start and not miss any ingress packets once the failover occurs.
I will try to come up with a more detailed scheme and return to this thread. My current thoughts are something in the direction of 'rdma_bind_name(ifname)' or 'rdma_set_option(ID_BINDTODEVICE, ifname)'.
Alex
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-09-22 15:57 UTC | newest]
Thread overview: 15+ messages
2012-09-19 15:43 how to preserve QP over HA events for librdmacm applications Or Gerlitz
[not found] ` <5059E82E.9020600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-09-19 15:48 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E418-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-09-19 15:52 ` Or Gerlitz
[not found] ` <5059EA48.1040407-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-09-19 15:58 ` Alex Rosenbaum
[not found] ` <A4E971F4031F1840BBA6E79B417E62E82CF29C3B-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
2012-09-19 16:52 ` Atchley, Scott
[not found] ` <46C75A5F-AD9F-45CF-A441-B7D5F60709D8-1Heg1YXhbW8@public.gmane.org>
2012-09-19 17:05 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E47E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-09-19 18:14 ` Atchley, Scott
[not found] ` <86756672-ADCC-4EF0-A24C-19C4A0EB8188-1Heg1YXhbW8@public.gmane.org>
2012-09-19 18:39 ` Atchley, Scott
[not found] ` <16AD9776-40CA-4106-8F3D-A974067EEE2A-1Heg1YXhbW8@public.gmane.org>
2012-09-19 19:22 ` Hefty, Sean
2012-09-20 17:37 ` Pradeep Satyanarayana
[not found] ` <505B5470.9030707-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2012-09-20 18:18 ` Atchley, Scott
2012-09-20 20:10 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E77F-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-09-20 20:57 ` Pradeep Satyanarayana
[not found] ` <505B8349.4050402-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2012-09-20 21:52 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A8237346A8E89E-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-09-22 15:57 ` Alex Rosenbaum