public inbox for linux-rdma@vger.kernel.org
From: Dotan Barak <dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Ding Dinghua <dingdinghua85-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: P
Date: Sun, 20 Jun 2010 20:43:00 +0200
Message-ID: <4C1E6134.6070304@gmail.com>
In-Reply-To: <AANLkTinah43AD5N0ZryDsrGprkeVf9-BdLCyr125PQ3p-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 20/06/2010 07:51, Ding Dinghua wrote:
> hello,
>
> 2010/6/19 Dotan Barak<dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>    
>>      
>>> I call rdma_create_id to create an ib id, then do resolve remote addr,
>>> resolve route work, then
>>> setup qp and call rdma_connect to setup connection, before ack or
>>> error replies, the thread will
>>> wait on a wait queue. The listening ib id of remote node will catch
>>> the connect request,
>>> setup qp, allocate and map pages to construct the RDMA-WRITE space,
>>> and call rdma_accept to reply
>>> the request.
>>>
>>> Some other information which may be useful:
>>> 1. All the "RETRY EXCEEDED" problems happened when there were two
>>> connections which use RDMA-WRITE to transfer things.
>>> And the latter connection had a high possibility to get into this problem.
>>> 2. All the "RETRY EXCEEDED" problems happened when the RDMA-WRITE
>>> space is 256MB each (that is, two connections consume 512MB of memory);
>>> when the RDMA-WRITE space is 64MB, this problem never happened in our
>>> test. Remote node's total memory is 2GB.
>>>
>>> Thanks a lot.
>>>
>>>        
>> Some more questions:
>> * Is the WR that "produces" the RETRY EXCEEDED the first one, the last one,
>> or one in the middle?
>>      
> it's the first one
>
>    
>> * Which values are you using in the QP context for the retry count and the
>> retry timeout?
>> * Did you try to increase those values?
>>      
> I haven't set these values (actually, I don't know where to set
> them); I just set the max_send_wr and max_send_sge
> fields of struct ib_qp_cap when creating the qp.
>
>    
Can you query the QP after establishing the connection between the QPs
and check those values?
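
A minimal sketch of how to interpret those values once you have them,
assuming the standard InfiniBand encoding in which the QP timeout field
is a 5-bit exponent (0 means no timeout) and the local ACK timeout is
4.096 usec * 2^timeout. The helper names ack_timeout_usec and
retry_budget_usec are hypothetical, and the retry budget is only a
nominal estimate; an HCA may wait longer than this before reporting
RETRY EXCEEDED.

```c
#include <stdint.h>

/*
 * Nominal local ACK timeout, in microseconds, for a QP "timeout" field.
 * Assumption: 5-bit field, 0 = infinite (returned as 0.0 here),
 * otherwise 4.096 usec * 2^timeout.
 */
double ack_timeout_usec(uint8_t timeout)
{
    if (timeout == 0 || timeout > 31)
        return 0.0;
    return 4.096 * (double)(1ULL << timeout);
}

/*
 * Rough nominal time before the requester gives up on one WR:
 * the initial attempt plus retry_cnt retries, each waiting one
 * ACK timeout. Real hardware may take longer.
 */
double retry_budget_usec(uint8_t timeout, uint8_t retry_cnt)
{
    return ack_timeout_usec(timeout) * (double)(retry_cnt + 1);
}
```

For example, timeout = 14 with retry_cnt = 7 gives about 67 msec per
attempt and roughly 0.54 sec in total under these assumptions.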

>> * How many more QPs do you have between those nodes and which operations do
>> they use
>>    (only RDMA-WRITEs?)
>>
>>      
> 4096 QPs for each connection,  only do RDMA-WRITES.
>    
So you are sending, in parallel, a total of 4K (QPs) * 64M (bytes) = 256 GB?
(Am I missing something, or is this the amount of data that will be sent
between the two nodes?)
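
The arithmetic behind that figure, as a small sketch. It assumes binary
units (64M = 64 * 2^20 bytes) and that each QP covers the full
RDMA-WRITE space, which is exactly what is being asked here; the helper
name total_bytes is hypothetical.

```c
#include <stdint.h>

/* Total bytes in flight if every QP transfers bytes_per_qp bytes. */
uint64_t total_bytes(uint64_t num_qps, uint64_t bytes_per_qp)
{
    return num_qps * bytes_per_qp;
}
```

With num_qps = 4096 and bytes_per_qp = 64 * 2^20, the result is
256 * 2^30 bytes, i.e. 256 GB.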

Dotan
