* Is there a way to transfer a rdma connection between userspace processes?
From: Stefan (metze) Metzmacher @ 2012-10-15 8:27 UTC
To: linux-rdma
Hi,
I'm currently researching how to implement SMBDirect [MS-SMBD]
together with the multi channel feature of SMB 3.0 in Samba.
As Samba currently uses one process per tcp connection
and maintains a lot of in memory state within the process
(e.g. for the SMB_VFS modules) it would require a lot of work
to change Samba to coordinate two (or more) processes for one logical
multi channel connection.
My current plan tries to pass the socket fd of new connections
(which join an existing multi channel session) via fd-passing to
the existing process.
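For ordinary sockets, the fd-passing step described above can be sketched as follows. This is a minimal illustration, assuming a connected AF_UNIX socket between the two processes; the helper names send_fd() and recv_fd() are hypothetical and are not taken from Samba.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send fd_to_send over the connected AF_UNIX socket
 * sock as SCM_RIGHTS ancillary data. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
    char dummy = 'F'; /* at least one byte of real data must accompany the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align; /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd_to_send >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open socket, which is what lets the existing process take over the new TCP connection.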
Now I'm wondering if this would also be possible with
an RDMA connection (struct rdma_cm_id).
From reading the code of rdma_create_event_channel()/rdma_create_id()
and rdma_migrate_id(), it seems that the connection state is kept
partly in userspace (structures) and partly in kernel space
(hidden behind the channel fd).
ibv_create_comp_channel() and ibv_create_cq() seem to have a similar design.
As the ibverbs interface typically has a userspace driver I'm wondering if
it's always true that there's also some kernel state maintained via the
rdma/ibv_comp event channels?
As far as I can see there's currently no way to transfer the rdma/ibv
state to another process (for me it's enough to transfer it; using it
from both processes is not strictly needed).
Is anybody working on this already?
metze
* Re: Is there a way to transfer a rdma connection between userspace processes?
From: Yann Droneaud @ 2012-10-15 9:07 UTC
To: Stefan (metze) Metzmacher; +Cc: linux-rdma
On Monday, 15 October 2012 at 10:27 +0200, Stefan (metze) Metzmacher
wrote:
> Hi,
>
> I'm currently researching how to implement SMBDirect [MS-SMBD]
> together with the multi channel feature of SMB 3.0 in Samba.
>
> As Samba currently uses one process per tcp connection
> and maintains a lot of in memory state within the process
> (e.g. for the SMB_VFS modules) it would require a lot of work
> to change Samba to coordinate two (or more) processes for one logical
> multi channel connection.
>
> My current plan tries to pass the socket fd of new connections
> (which join an existing multi channel session) via fd-passing to
> the existing process.
>
> Now I'm wondering if this would also be possible with
> an RDMA connection (struct rdma_cm_id).
>
RDMA/verbs resources are tied to a process (especially memory
registrations), but they end up in the HCA, which is probably unaware
of processes.
Additionally, an RDMA_CM connection is not identified by an FD, so this
kind of Unix trick (FD passing through a Unix socket: SCM_RIGHTS) is not
going to work.
Forking might already be a challenge for an RDMA/verbs application, so I
don't think that sharing/moving an RDMA_CM connection across different
processes is supported.
But other people on this list (especially Roland Dreier and Sean Hefty)
may be able to suggest a solution.
Regards.
--
Yann Droneaud
OPTEYA
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Is there a way to transfer a rdma connection between userspace processes?
From: Steve Wise @ 2012-10-15 15:31 UTC
To: Yann Droneaud
Cc: Stefan (metze) Metzmacher, linux-rdma
On 10/15/2012 4:07 AM, Yann Droneaud wrote:
> RDMA/verbs resources are tied to a process (especially memory
> registrations), but they end up in the HCA, which is probably unaware
> of processes.
>
> Additionally, an RDMA_CM connection is not identified by an FD, so this
> kind of Unix trick (FD passing through a Unix socket: SCM_RIGHTS) is not
> going to work.
>
> Forking might already be a challenge for an RDMA/verbs application, so I
> don't think that sharing/moving an RDMA_CM connection across different
> processes is supported.
>
> But other people on this list (especially Roland Dreier and Sean Hefty)
> may be able to suggest a solution.
>
> Regards.
>
The kind of fork() support you need does not exist in Linux RDMA verbs.
An alternative is to fork() before you set up the RDMA connection. I.e.,
if a regular TCP socket is first used to negotiate RDMA mode, then maybe
you could fork() after the negotiation but before setting up the RDMA
connection and other resources?
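The ordering suggested above can be sketched roughly as follows: the fork() happens once negotiation on the plain TCP socket is finished, and only the child goes on to create RDMA resources. This is a sketch under stated assumptions, not working RDMA code; setup_rdma_resources() is a hypothetical placeholder stub and fork_then_setup_rdma() is an illustrative name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder: a real server would create its RDMA
 * resources here (event channel, rdma_cm_id, QP, CQ, memory regions).
 * This stub only reports success. */
int setup_rdma_resources(int tcp_fd)
{
    (void)tcp_fd;
    return 0;
}

/* Fork once the negotiation on tcp_fd is finished. No RDMA resources
 * exist yet at the time of fork(), so nothing RDMA-related is shared
 * between parent and child. Returns the child's pid, or -1 on error. */
pid_t fork_then_setup_rdma(int tcp_fd)
{
    pid_t pid;

    assert(tcp_fd >= 0);
    pid = fork();
    if (pid == 0) {
        /* Child: RDMA setup happens only here, after the fork. */
        int rc = setup_rdma_resources(tcp_fd);
        _exit(rc == 0 ? 0 : 1);
    }
    if (pid > 0)
        close(tcp_fd); /* parent: the connection now belongs to the child */
    return pid;
}
```

The point of the ordering is that the child never inherits verbs state created before fork(), which is the situation the thread describes as unsupported.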
* Re: Is there a way to transfer a rdma connection between userspace processes?
From: Stefan (metze) Metzmacher @ 2012-10-16 14:25 UTC
To: Steve Wise; +Cc: Yann Droneaud, linux-rdma
@Yann and Steve: thanks for the feedback!
> The kind of fork() support you need does not exist in Linux RDMA verbs.
> An alternative is to fork() before you set up the RDMA connection. I.e.,
> if a regular TCP socket is first used to negotiate RDMA mode, then maybe
> you could fork() after the negotiation but before setting up the RDMA
> connection and other resources?
For client connections that would work, but it can't work for a server
that uses fork() after listen().
Would it be possible to extend librdmacm and libibverbs to:
- handle fork() after getting RDMA_CM_EVENT_CONNECT_REQUEST,
  which means there would have to be functions to destroy contexts like
  struct rdma_cm_id without affecting the real connection,
  similar to close() on an FD that is shared between two processes.
- export/import an interprocess token attached to the hierarchy of objects,
  similar to gss_export_sec_context/gss_import_sec_context.
  Or a function which exports the contexts into an FD (of a pipe),
  which could also transfer open FDs to the other end.
  Maybe this could pass the information through an rdma/ibverbs event channel?
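The close() analogy in the first item can be illustrated with plain file descriptors. The sketch below covers only the existing FD semantics, not any RDMA API: after fork(), the parent closes its copy of a socket and the child's copy keeps working. The function name shared_fd_survives_close() is made up for this illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child could still use its copy of the socket after
 * the parent closed its own copy, 0 if not, -1 on setup failure. */
int shared_fd_survives_close(void)
{
    int sv[2];
    char byte = 0;
    pid_t pid;
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    assert(sv[0] >= 0 && sv[1] >= 0);

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: keep sv[1], wait for the parent's byte, then answer. */
        close(sv[0]);
        if (read(sv[1], &byte, 1) != 1)
            _exit(1);
        byte = 'C';
        _exit(write(sv[1], &byte, 1) == 1 ? 0 : 1);
    }

    /* Parent: drop our reference to sv[1]. The socket itself stays
     * alive because the child still holds a descriptor for it. */
    close(sv[1]);
    byte = 'P';
    ok = write(sv[0], &byte, 1) == 1 &&
         read(sv[0], &byte, 1) == 1 && byte == 'C';
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return ok ? 1 : 0;
}
```

The child only replies after it has seen the byte that the parent wrote following its close(), so the reply proves that closing one reference left the underlying object intact. The proposal asks for an equivalent way to destroy a struct rdma_cm_id in one process.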
Are there any other possible solutions?
metze
* Re: Is there a way to transfer a rdma connection between userspace processes?
From: Stefan (metze) Metzmacher @ 2012-10-16 16:46 UTC
To: Steve Wise; +Cc: Yann Droneaud, linux-rdma
On 16.10.2012 16:25, Stefan (metze) Metzmacher wrote:
> For client connections that would work, but it can't work for a server
> that uses fork() after listen().
>
> Would it be possible to extend librdmacm and libibverbs to:
> - handle fork() after getting RDMA_CM_EVENT_CONNECT_REQUEST,
>   which means there would have to be functions to destroy contexts like
>   struct rdma_cm_id without affecting the real connection,
>   similar to close() on an FD that is shared between two processes.
> - export/import an interprocess token attached to the hierarchy of objects,
>   similar to gss_export_sec_context/gss_import_sec_context.
>   Or a function which exports the contexts into an FD (of a pipe),
>   which could also transfer open FDs to the other end.
>   Maybe this could pass the information through an rdma/ibverbs event channel?
It would be OK to have a limitation such that the transfer to
a different process is only possible while there are:
- no posted work requests
- no queue pairs
- no completion queues
active on the connection.
metze
* RE: Is there a way to transfer a rdma connection between userspace processes?
From: Hefty, Sean @ 2012-10-17 19:01 UTC
To: Stefan (metze) Metzmacher, Steve Wise
Cc: Yann Droneaud, linux-rdma
> For client connections that would work, but it can't work for a server
> that uses fork() after listen().
>
> Would it be possible to extend librdmacm and libibverbs to:
> - handle fork() after getting RDMA_CM_EVENT_CONNECT_REQUEST,
>   which means there would have to be functions to destroy contexts like
>   struct rdma_cm_id without affecting the real connection,
>   similar to close() on an FD that is shared between two processes.
> - export/import an interprocess token attached to the hierarchy of objects,
>   similar to gss_export_sec_context/gss_import_sec_context.
>   Or a function which exports the contexts into an FD (of a pipe),
>   which could also transfer open FDs to the other end.
>   Maybe this could pass the information through an rdma/ibverbs event channel?
>
> Are there any other possible solutions?
It would be great if an entire RDMA connection could be referred to by an fd. Today that's not the case. Doing this requires an abstraction above verbs. Rsockets provides a similar abstraction, but it doesn't actually return an fd.
Rsockets does support fork() to some degree. It does this by first establishing a TCP connection; the RDMA connection is not actually set up until the first data transfer occurs. This handles the case where a server calls fork() after listen(). It works with the apps that I've tested, but still imposes some restrictions on how fork() is used.
I ran into other issues besides fork() when trying to support existing socket applications. Without an fd, other calls need workarounds. These included dup2(), sendfile(), and fstat(), but I'm sure there are many others.
It would be nice if there were a way to associate an RDMA connection with an fd and, under ideal conditions, allow data transfers to occur within the userspace context, but allow the communication to migrate to the kernel if certain calls are invoked. This would provide very good performance under most conditions, yet still support legacy apps.
- Sean