* Handling incoming RDMA CM connections when there is more than one IB HCA in a system
@ 2013-08-25 11:41 Richard Sharpe
From: Richard Sharpe @ 2013-08-25 11:41 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi folks,
I am attempting to implement SMB Direct (aka SMB over RDMA) for Samba.
For historical, protocol and performance reasons I believe that I need
to write a character driver that offloads RDMA stuff to the kernel.
Briefly, these reasons are:
1. Samba forks a new smbd when each incoming SMB connection arrives
2. SMB Over RDMA operates by first connecting to the server over TCP,
bringing up SMB, determining that the server supports RDMA and then
establishing an RDMA connection, bringing up SMB Direct and then
transporting SMB PDUs over that.
3. The current Windows client implementation pays no attention to the
port supplied to it by the server and always connects on port 5445.
I plan on writing a small driver that uses the in-kernel RDMA support
to implement SMB Direct and provide shared memory mechanisms for
avoiding copying data to and from the kernel for RDMA READs and RDMA
WRITEs.
After reading the srpt driver, much of what I need to do seems clear.
However, I figure that I will eventually need to support situations
where there are multiple IB HCAs in a system, and I wondered if there
are any abstractions that allow me to do an ib_cm_listen across
multiple devices at once?
It seems that I have to do an ib_create_cm_id against a device before
I can do a listen, but that suggests that I have to:
1. Create a CM ID for each device in the system. This seems easy
because of the callbacks that result from ib_register_client
2. Listen on each CM ID
3. When I get a callback on one listen, cancel the others.
Is there an easier way?
--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Handling incoming RDMA CM connections when there is more than one IB HCA in a system
From: Or Gerlitz @ 2013-08-25 12:26 UTC
To: Richard Sharpe; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
> Hi folks,
>
> I am attempting to implement SMB Direct (aka SMB over RDMA) for Samba.
>
> For historical, protocol and performance reasons I believe that I need
> to write a character driver that offloads RDMA stuff to the kernel.
>
> Briefly, these reasons are:
>
> 1. Samba forks a new smbd when each incoming SMB connection arrives
>
> 2. SMB Over RDMA operates by first connecting to the server over TCP,
> bringing up SMB, determining that the server supports RDMA and then
> establishing an RDMA connection, bringing up SMB Direct and then
> transporting SMB PDUs over that.
>
> 3. The current Windows client implementation pays no attention to the
> port supplied to it by the server and always connects on port 5445.
>
> I plan on writing a small driver that uses the in-kernel RDMA support
> to implement SMB Direct and provide shared memory mechanisms for
> avoiding copying data to and from the kernel for RDMA READs and RDMA
> WRITEs.
>
> After reading the srpt driver, much of what I need to do seems clear.
>
> However, I figure that I will eventually need to support situations
> where there are multiple IB HCAs in a system, and I wondered if there
> are any abstractions that allow me to do an ib_cm_listen across
> multiple devices at once?
>
> It seems that I have to do an ib_create_cm_id against a device before
> I can do a listen, but that suggests that I have to:
>
> 1. Create a CM ID for each device in the system. This seems easy
> because of the callbacks that result from ib_register_client
>
> 2. Listen on each CM ID
>
> 3. When I get a callback on one listen, cancel the others.
>
> Is there an easier way?

Hi Richard,

I would recommend using the kernel rdma-cm API (see
include/rdma/rdma_cm.h). This way your control plane can use IP
addressing and the equivalent of TCP ports: you provide sockaddr
structures containing IP and PORT at the bind stage.

Basically, your app flow would look like

    listen_id = rdma_create_id(your handler, your context, RDMA_PS_TCP, IB_QPT_RC)
    rdma_bind_addr(listen_id, use $IP:$PORT or IP_ADDR_ANY:$PORT)
    rdma_listen(listen_id)

    for each new connection request
    <-- get RDMA_CM_EVENT_CONNECT_REQUEST (with conn_id)
    rdma_create_qp(conn_id, your qp attr)
    rdma_accept(conn_id)
    <-- get RDMA_CM_EVENT_ESTABLISHED

    and on tear down
    rdma_disconnect(conn_id)
    <-- get RDMA_CM_EVENT_DISCONNECTED

You can see how it works in the upstream LIO iser driver,
drivers/infiniband/ulp/isert.

If you listen with IP_ADDR_ANY, you listen over all HCAs in the system
for which there is a running IPoIB device (for IB) or a running
Ethernet device (for RoCE).

Or.
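The flow Or describes can be modelled in a few lines of plain user-space C. This is only a sketch, not kernel code: wildcard_addr(), on_event() and the conn_event/conn_state enums are hypothetical names invented for illustration, and the state machine is deliberately simplified. It shows the wildcard address a listener would bind to, and the order in which the three events named above move a connection along.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the three RDMA CM events named in the flow. */
enum conn_event { EV_CONNECT_REQUEST, EV_ESTABLISHED, EV_DISCONNECTED };

/* Hypothetical per-connection states; not part of any real API. */
enum conn_state { ST_NEW, ST_ACCEPTING, ST_CONNECTED, ST_CLOSED, ST_ERROR };

/*
 * Build the wildcard IPv4 address for the bind stage. In a real driver a
 * pointer to this, cast to struct sockaddr *, is what rdma_bind_addr()
 * would receive.
 */
static struct sockaddr_in wildcard_addr(uint16_t port)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY); /* any local address */
    sin.sin_port = htons(port);              /* network byte order */
    return sin;
}

/* Advance one connection's state in response to one event. */
static enum conn_state on_event(enum conn_state st, enum conn_event ev)
{
    switch (ev) {
    case EV_CONNECT_REQUEST:
        /* A real handler would call rdma_create_qp() and rdma_accept(). */
        return st == ST_NEW ? ST_ACCEPTING : ST_ERROR;
    case EV_ESTABLISHED:
        return st == ST_ACCEPTING ? ST_CONNECTED : ST_ERROR;
    case EV_DISCONNECTED:
        /* Teardown may arrive before or after the connection is up. */
        return (st == ST_ACCEPTING || st == ST_CONNECTED) ? ST_CLOSED
                                                          : ST_ERROR;
    }
    return ST_ERROR;
}
```

A real event handler would receive the event through the callback passed to rdma_create_id() rather than through a function like on_event(); the model only captures the ordering.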
* Re: Handling incoming RDMA CM connections when there is more than one IB HCA in a system
From: Jason Gunthorpe @ 2013-08-26 17:48 UTC
To: Richard Sharpe; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Sun, Aug 25, 2013 at 04:41:59AM -0700, Richard Sharpe wrote:
> Hi folks,
>
> I am attempting to implement SMB Direct (aka SMB over RDMA) for Samba.
>
> For historical, protocol and performance reasons I believe that I need
> to write a character driver that offloads RDMA stuff to the kernel.
>
> Briefly, these reasons are:
>
> 1. Samba forks a new smbd when each incoming SMB connection arrives
>
> 2. SMB Over RDMA operates by first connecting to the server over TCP,
> bringing up SMB, determining that the server supports RDMA and then
> establishing an RDMA connection, bringing up SMB Direct and then
> transporting SMB PDUs over that.
>
> 3. The current Windows client implementation pays no attention to the
> port supplied to it by the server and always connects on port 5445.

So your issue is that when the transport is upgraded SMB performs a
whole new connection setup to the common port and the server is
expected to associate the new connection to the old based on a GUID in
the first messages?

How about this for a flow?
 - The master process listens on all relevant TCP and RDMA ports for
   incoming connections
 - At each incoming connection it forks and constructs a sub process
   for that connection. I think we can do this today with RDMA, but if
   not it should be doable with less effort than making your own
   kernel driver :) Sean might know for sure..
 - The new smbd is now either a from scratch new connection (normal
   case today) or an 'upgrade' to a prior connection
 - If it is an upgrade, use some scheme to transfer the samba internal
   state from the old connection smbd to the new connection smbd

That keeps your per-process model..

> However, I figure that I will eventually need to support situations
> where there are multiple IB HCAs in a system, and I wondered if
> there are any abstractions that allow me to do an ib_cm_listen
> across multiple devices at once?

Not really, you need to listen on every device. And you almost
certainly need to use the RDMA CM interfaces (as Or mentioned), that
will be mandatory to support iWarp, and it looks like MS is using the
RDMA-CM protocol on IB as well.

> 3. When I get a callback on one listen, cancel the others.

Why? Wouldn't you listen for RDMA connections permanently, like for
the TCP listen?

From what I read about the SMB protocol it looks completely valid to
bypass the TCP first stage and go directly to RDMA. Or go from TCP to
TCP, or RDMA to TCP, or whatever.

Jason
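One way to picture the "new connection or upgrade" decision in the flow above is a small table, kept by the master process, that maps a session GUID to the smbd that owns it. The sketch below is an assumption made for illustration, not how Samba works: session_register(), session_lookup(), the 16-byte GUID length and the fixed table size are all hypothetical, and the hard part (transferring state between the two smbd processes) is left out entirely.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define GUID_LEN     16  /* assumed GUID size in bytes */
#define MAX_SESSIONS 64  /* arbitrary limit for the sketch */

/* One known session: its GUID and the pid of the smbd serving it. */
struct session {
    unsigned char guid[GUID_LEN];
    pid_t owner;
    int used;
};

static struct session table[MAX_SESSIONS];

/* Record that `owner` serves the session `guid`. Returns 0, or -1 if full. */
static int session_register(const unsigned char guid[GUID_LEN], pid_t owner)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (!table[i].used) {
            memcpy(table[i].guid, guid, GUID_LEN);
            table[i].owner = owner;
            table[i].used = 1;
            return 0;
        }
    }
    return -1;
}

/*
 * Classify an incoming connection by the GUID from its first message:
 * returns the pid of the existing owner if this is an upgrade of a known
 * session, or (pid_t)-1 if it is a from-scratch connection.
 */
static pid_t session_lookup(const unsigned char guid[GUID_LEN])
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (table[i].used && memcmp(table[i].guid, guid, GUID_LEN) == 0)
            return table[i].owner;
    }
    return (pid_t)-1;
}
```

A linear scan is enough to show the idea; a real server would also need to remove entries when an smbd exits.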
* Re: Handling incoming RDMA CM connections when there is more than one IB HCA in a system
From: Richard Sharpe @ 2013-08-26 19:17 UTC
To: Jason Gunthorpe; +Cc: linux-rdma
On Mon, Aug 26, 2013 at 10:48 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Sun, Aug 25, 2013 at 04:41:59AM -0700, Richard Sharpe wrote:

[Deletia to be addressed later]

> From what I read about the SMB protocol it looks completely valid to
> bypass the TCP first stage and go directly to RDMA. Or go from TCP to
> TCP, or RDMA to TCP, or whatever.

Microsoft tells me that they never do an RDMA-only connection. It is
always TCP first then RDMA.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)
* Re: Handling incoming RDMA CM connections when there is more than one IB HCA in a system
From: Jason Gunthorpe @ 2013-08-26 19:35 UTC
To: Richard Sharpe; +Cc: linux-rdma
On Mon, Aug 26, 2013 at 12:17:11PM -0700, Richard Sharpe wrote:
> > From what I read about the SMB protocol it looks completely valid to
> > bypass the TCP first stage and go directly to RDMA. Or go from TCP to
> > TCP, or RDMA to TCP, or whatever.
>
> Microsoft tells me that they never do an RDMA-only connection. It is
> always TCP first then RDMA.

Sure, but the SMB protocol allows for more than just that one case. Be
careful not to architect yourself into a corner that can't do things
allowed by the spec, but not performed by clients of the day..

Jason