public inbox for linux-rdma@vger.kernel.org
* need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
@ 2011-06-07 12:23 Benoit Hudzia
       [not found] ` <BANLkTikC6jPFJm=MxT5oNDoJH4Ex+j41XA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Benoit Hudzia @ 2011-06-07 12:23 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

I am not sure whether this is the right place for this email; please
point me to a more appropriate mailing list if it is not.

I would like to use IBV_WR_RDMA_WRITE_WITH_IMM, but I am having
difficulty finding documentation or an example of how to use this
feature.  The idea is to do an RDMA write and wake up the peer
upon reception so that it can process the received data.

What I understand from IBV_WR_RDMA_WRITE_WITH_IMM:

On the server side:

1.      You create your ibv_send_wr
2.      You set the opcode to IBV_WR_RDMA_WRITE_WITH_IMM
3.      You fill the imm_data of the wr
      a.       Question: can I fill the imm_data with arbitrary data?
I would like to use this field on the peer side to identify the
request
4.      Call ibv_post_send



On the peer side:

As I understand it, IBV_WR_RDMA_WRITE_WITH_IMM consumes a Receive Request
on the responder side, so I need to post a receive request with
ibv_post_recv.

However, I cannot find documentation (maybe I didn’t search hard
enough) on how to construct such a request.

* Should I just create an empty / zeroed ibv_recv_wr containing an
empty ibv_sge (or no sge)?
* Can I prefill the recv queue with multiple requests?

On the work completion side within my CQ event handler:

* I suppose I should look for IBV_WC_RECV_RDMA_WITH_IMM
* Then can I use / check the imm_data in the WC to identify the operation?

Basically, at the moment I am struggling to understand how to handle /
detect the RDMA write with imm on the receiver side.

Any pointers / code examples / explanations will be greatly appreciated.



Regards

Benoit
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found] ` <BANLkTikC6jPFJm=MxT5oNDoJH4Ex+j41XA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-06-07 14:18   ` Steven Dake
       [not found]     ` <4DEE3326.9050105-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2011-06-07 14:23   ` Hefty, Sean
  2011-06-07 16:43   ` Jason Gunthorpe
  2 siblings, 1 reply; 7+ messages in thread
From: Steven Dake @ 2011-06-07 14:18 UTC (permalink / raw)
  To: Benoit Hudzia; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 06/07/2011 05:23 AM, Benoit Hudzia wrote:
> I am not sure whether this is the right place for this email; please
> point me to a more appropriate mailing list if it is not.
> 
> I would like to use IBV_WR_RDMA_WRITE_WITH_IMM, but I am having
> difficulty finding documentation or an example of how to use this
> feature.  The idea is to do an RDMA write and wake up the peer
> upon reception so that it can process the received data.
> 
> What I understand from IBV_WR_RDMA_WRITE_WITH_IMM:

I would recommend against using IMM modes.  I believe they are not
supported on non-InfiniBand hardware, which prevents your application
from being used on Ethernet RDMA systems.

> 
> On the server side:
> 
> 1.      You create your ibv_send_wr
> 2.      You set the opcode to IBV_WR_RDMA_WRITE_WITH_IMM
> 3.      You fill the imm_data of the wr
>       a.       Question: can I fill the imm_data with arbitrary data?
> I would like to use this field on the peer side to identify the
> request
> 4.      Call ibv_post_send
> 
> 
> 
> On the peer side:
> 
> As I understand it, IBV_WR_RDMA_WRITE_WITH_IMM consumes a Receive Request
> on the responder side, so I need to post a receive request with
> ibv_post_recv.
> 
> However, I cannot find documentation (maybe I didn’t search hard
> enough) on how to construct such a request.
> 
> * Should I just create an empty / zeroed ibv_recv_wr containing an
> empty ibv_sge (or no sge)?
> * Can I prefill the recv queue with multiple requests?
> 
> On the work completion side within my CQ event handler:
> 
> * I suppose I should look for IBV_WC_RECV_RDMA_WITH_IMM
> * Then can I use / check the imm_data in the WC to identify the operation?
> 
> Basically, at the moment I am struggling to understand how to handle /
> detect the RDMA write with imm on the receiver side.
> 

I would expect that on the receiver side you would use normal
ibv_post_recv calls that reference the memory regions incoming messages
should land in.  Then you can use completion queue events
(ibv_req_notify_cq() / ibv_get_cq_event()) just as you mentioned in
your email.

A better alternative to IMM would be to put the data directly into
your message as a header.

Regards
-steve
> 
> 
> Any pointers / code examples / explanations will be greatly appreciated.
> 
> 
> 
> Regards
> 
> Benoit


* RE: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found] ` <BANLkTikC6jPFJm=MxT5oNDoJH4Ex+j41XA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2011-06-07 14:18   ` Steven Dake
@ 2011-06-07 14:23   ` Hefty, Sean
  2011-06-07 16:43   ` Jason Gunthorpe
  2 siblings, 0 replies; 7+ messages in thread
From: Hefty, Sean @ 2011-06-07 14:23 UTC (permalink / raw)
  To: Benoit Hudzia, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> What I understand from IBV_WR_RDMA_WRITE_WITH_IMM:
> 
> On the server side:
> 
> 1.      You create your ibv_send_wr
> 2.      You set the opcode to IBV_WR_RDMA_WRITE_WITH_IMM
> 3.      You fill the imm_data of the wr
>       a.       Question: can I fill the imm_data with arbitrary data?

Yes - but it's limited to 32 bits.

> I would like to use this field on the peer side to identify the
> request
> 4.      Call ibv_post_send
> 
> 
> 
> On the peer side:
> 
> As I understand it, IBV_WR_RDMA_WRITE_WITH_IMM consumes a Receive Request
> on the responder side, so I need to post a receive request with
> ibv_post_recv.
> 
> However, I cannot find documentation (maybe I didn't search hard
> enough) on how to construct such a request.
> 
> * Should I just create an empty / zeroed ibv_recv_wr containing an
> empty ibv_sge (or no sge)?

I'm not sure about the underlying implementations, but I believe this should work. 

> * Can I prefill the recv queue with multiple requests?

Yes - up to the size specified for the receive queue.

> On the work completion side within my CQ event handler:
> 
> * I suppose I should look for IBV_WC_RECV_RDMA_WITH_IMM
> * Then can I use / check the imm_data in the WC to identify the operation?

yes

> Any pointers / code examples / explanations will be greatly appreciated.

I'm not aware of any example code, but it sounds like you understand the concepts.

- Sean

* Re: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found]     ` <4DEE3326.9050105-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-06-07 16:31       ` Jason Gunthorpe
       [not found]         ` <20110607163111.GA24005-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2011-06-07 16:31 UTC (permalink / raw)
  To: Steven Dake; +Cc: Benoit Hudzia, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Jun 07, 2011 at 07:18:14AM -0700, Steven Dake wrote:
> On 06/07/2011 05:23 AM, Benoit Hudzia wrote:
> > I am not sure whether this is the right place for this email; please
> > point me to a more appropriate mailing list if it is not.
> > 
> > I would like to use IBV_WR_RDMA_WRITE_WITH_IMM, but I am having
> > difficulty finding documentation or an example of how to use this
> > feature.  The idea is to do an RDMA write and wake up the peer
> > upon reception so that it can process the received data.
> > 
> > What I understand from IBV_WR_RDMA_WRITE_WITH_IMM:
> 
> I would recommend against using IMM modes.  I believe they are not
> supported on non-InfiniBand hardware, which prevents your application
> from being used on Ethernet RDMA systems.

There is some talk of fixing that in iWARP; the IMM stuff is much more
efficient than an RDMA + SEND combo.

If you are going to use it and care about iWARP, then you should
design a fallback into your protocol that uses RDMA + SEND. I have some
stuff that works like this.
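
One way to picture such a fallback, as a minimal sketch and not anyone's
actual implementation: decide once whether the peer can take RDMA WRITE
with immediate, then plan either a single operation or an RDMA WRITE
followed by a SEND that carries the same 32-bit notification value. All
names here (step_kind, struct step, plan_notify_write, peer_supports_imm)
are hypothetical, and how peer_supports_imm gets decided (for example
during an initial SEND handshake) is assumed to happen elsewhere. In verbs
terms the three step kinds would map to the IBV_WR_RDMA_WRITE,
IBV_WR_RDMA_WRITE_WITH_IMM and IBV_WR_SEND opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical step kinds; they stand in for the verbs send opcodes. */
enum step_kind { STEP_RDMA_WRITE, STEP_RDMA_WRITE_WITH_IMM, STEP_SEND };

struct step {
    enum step_kind kind;
    uint32_t notify;   /* 32-bit notification value (unused for plain WRITE) */
};

/*
 * Fill out[] (room for two entries) with the operations to post, in order.
 * Returns the number of entries written.
 */
static size_t plan_notify_write(int peer_supports_imm, uint32_t notify,
                                struct step out[2])
{
    assert(out != NULL);

    if (peer_supports_imm) {
        /* One operation: the data and the notification travel together. */
        out[0] = (struct step){ STEP_RDMA_WRITE_WITH_IMM, notify };
        return 1;
    }

    /* Fallback: plain write, then a SEND whose payload carries the value. */
    out[0] = (struct step){ STEP_RDMA_WRITE, 0 };
    out[1] = (struct step){ STEP_SEND, notify };
    return 2;
}
```

The fallback path assumes the SEND is posted after the RDMA WRITE on the
same connected QP, so that its receive completion is not reported before
the written data has been placed; that ordering assumption should be
checked against the transport in use.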

Jason

* Re: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found]         ` <20110607163111.GA24005-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-06-07 16:36           ` Steve Wise
  0 siblings, 0 replies; 7+ messages in thread
From: Steve Wise @ 2011-06-07 16:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Steven Dake, Benoit Hudzia, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 06/07/2011 11:31 AM, Jason Gunthorpe wrote:
> On Tue, Jun 07, 2011 at 07:18:14AM -0700, Steven Dake wrote:
>> On 06/07/2011 05:23 AM, Benoit Hudzia wrote:
>>> I am not sure whether this is the right place for this email; please
>>> point me to a more appropriate mailing list if it is not.
>>>
>>> I would like to use IBV_WR_RDMA_WRITE_WITH_IMM, but I am having
>>> difficulty finding documentation or an example of how to use this
>>> feature.  The idea is to do an RDMA write and wake up the peer
>>> upon reception so that it can process the received data.
>>>
>>> What I understand from IBV_WR_RDMA_WRITE_WITH_IMM:
>> I would recommend against using IMM modes.  I believe they are not
>> supported on non-InfiniBand hardware, which prevents your application
>> from being used on Ethernet RDMA systems.
> There is some talk of fixing that in iWARP; the IMM stuff is much more
> efficient than an RDMA + SEND combo.
>

iWARP 2.0: http://datatracker.ietf.org/doc/draft-ietf-storm-rdmap-ext/


> If you are going to use it and care about iWARP, then you should
> design a fallback into your protocol that uses RDMA + SEND. I have some
> stuff that works like this.
>
> Jason


* Re: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found] ` <BANLkTikC6jPFJm=MxT5oNDoJH4Ex+j41XA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2011-06-07 14:18   ` Steven Dake
  2011-06-07 14:23   ` Hefty, Sean
@ 2011-06-07 16:43   ` Jason Gunthorpe
       [not found]     ` <20110607164333.GB24005-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2011-06-07 16:43 UTC (permalink / raw)
  To: Benoit Hudzia; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Jun 07, 2011 at 01:23:01PM +0100, Benoit Hudzia wrote:

> I would like to use IBV_WR_RDMA_WRITE_WITH_IMM, but I am having
> difficulty finding documentation or an example of how to use this
> feature.  The idea is to do an RDMA write and wake up the peer
> upon reception so that it can process the received data.

You basically have it correct, but generally you also need to mix SEND
traffic on the same QP as you are doing these RDMAs, at a minimum to
manage the allocation of receive resources. Since you can't know whether
the remote will place an RDMA+IMM or a SEND into your receive WR, you
need to ensure that every receive WR you post is able to handle a
SEND. If you get an IMM, then ignore the payload, process the imm_data,
recycle the WR back into the HCA, and schedule a signal to the far
side that you now have additional receive WR space.

>       a.       Question: can I fill the imm_data with arbitrary data?
> I would like to use this field on the peer side to identify the
> request

Keep in mind the imm_data is copied byte-for-byte so you should put it
in a sensible endian-ness for your application.
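
To illustrate the byte-order point, here is a small sketch that assumes a
made-up layout (8-bit request type in the top byte, 24-bit request id
below it) and uses htonl()/ntohl() so both sides agree on the value
regardless of host endianness. The helper names and the layout are
hypothetical. The result of imm_encode() is what would be stored in the
imm_data field of the send work request, and imm_decode() would be applied
to the imm_data reported in the receive completion.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: bits 31..24 = request type, bits 23..0 = request id. */
#define IMM_TYPE_SHIFT 24
#define IMM_ID_MASK    0x00FFFFFFu

/* Build the 32-bit immediate in network byte order (big-endian). */
static uint32_t imm_encode(uint8_t type, uint32_t id)
{
    uint32_t host = ((uint32_t)type << IMM_TYPE_SHIFT) | (id & IMM_ID_MASK);
    return htonl(host);
}

/* Split an immediate received in network byte order back into its fields. */
static void imm_decode(uint32_t wire, uint8_t *type, uint32_t *id)
{
    assert(type != NULL && id != NULL);

    uint32_t host = ntohl(wire);
    *type = (uint8_t)(host >> IMM_TYPE_SHIFT);
    *id = host & IMM_ID_MASK;
}
```

Because the value is carried byte-for-byte, the first byte in memory of
the encoded word is always the type byte, on little-endian and big-endian
hosts alike.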

A simple example of a protocol using RDMA WRITE with immediate data might be:

 SEND [My RDMA Recv buffer is XX-YY, I have RR recv WRs posted]
           SEND [OK, I have RR recv WRs posted]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
 SEND [I have RR recv WRs posted]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
 SEND [I have RR recv WRs posted]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           SEND [I have RR recv WRs posted]
 SEND [I have RR recv WRs posted]
           RDMA WRITE [IMM = I wrote ZZ bytes]
           RDMA WRITE [IMM = I wrote ZZ bytes]
[etc..]

With a scheme like this, all receive work requests would include a recv
buffer to handle a SEND receive completion, and each side would stop
sending when it runs out of recv completion space on the far
side. Generally with a scheme like this some care is needed to avoid
deadlocking because you want to be a bit lazy with flow control sends.
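
The bookkeeping behind such a scheme could look roughly like the sketch
below. It is only an illustration under stated assumptions: every SEND or
RDMA WRITE with IMM consumes one credit (one receive WR on the far side),
credit updates are sent lazily once a threshold of reposted WRs has
accumulated, and one credit is always held back so a credit-update SEND
can still go out, which is one simple way to avoid the deadlock mentioned
above. The struct and function names are hypothetical and no verbs calls
are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flow_ctl {
    uint32_t send_credits;  /* receive WRs the peer advertised, not yet used */
    uint32_t unadvertised;  /* WRs we reposted but have not told the peer about */
    uint32_t update_thresh; /* send a credit update once this many are pending */
};

/* Data messages must leave one credit in reserve for a credit update. */
static bool can_send_data(const struct flow_ctl *fc)
{
    return fc->send_credits > 1;
}

/* A credit-update SEND is allowed to use the reserved credit. */
static bool can_send_update(const struct flow_ctl *fc)
{
    return fc->send_credits > 0;
}

/* Call after posting any SEND or RDMA WRITE with IMM to the peer. */
static void on_message_posted(struct flow_ctl *fc)
{
    assert(fc->send_credits > 0);
    fc->send_credits--;
}

/* Call after reposting a receive WR; returns true when an update is due. */
static bool on_recv_reposted(struct flow_ctl *fc)
{
    fc->unadvertised++;
    return fc->unadvertised >= fc->update_thresh;
}

/* Take the count to place in the outgoing credit-update SEND. */
static uint32_t take_update(struct flow_ctl *fc)
{
    uint32_t n = fc->unadvertised;
    fc->unadvertised = 0;
    return n;
}

/* The peer's credit-update SEND arrived, granting more receive WRs. */
static void on_update_received(struct flow_ctl *fc, uint32_t granted)
{
    fc->send_credits += granted;
}
```

A larger update_thresh means fewer flow-control SENDs but longer stalls
when the sender runs dry; the right value depends on the receive queue
depth and is left open here.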

Jason

* Re: need pointer to Understand how to use IBV_WR_RDMA_WRITE_WITH_IMM
       [not found]     ` <20110607164333.GB24005-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-06-08 18:35       ` Benoit Hudzia
  0 siblings, 0 replies; 7+ messages in thread
From: Benoit Hudzia @ 2011-06-08 18:35 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Thanks all for your input.

As I am currently using SoftiWARP for most of my testing / development,
I will stick with RDMA + SEND for the moment.

At the same time, I discovered that placing a malformed or unsupported
request on the queue through librdmacm / libibverbs makes the
SoftiWARP lib / module crash the kernel.
I will probably have to look into the code and submit a patch to add
"verbs command" checking to SoftiWARP.

On another note: Jason talked about designing a fallback to RDMA + SEND
into the protocol.
After browsing through the spec and the different libraries, I didn't
find any function for probing / querying the supported
capabilities of the device.
Does anyone know if it's programmatically possible?
E.g. checking whether RDMA+IMM is supported. I suppose that for the
atomic ops I just need to check whether ibv_query_device returns a
positive value for max_qp_rd_atom and the like.

Regards
Benoit









-- 
" The production of too many useful things results in too many useless people"
