From: Dotan Barak <dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Ding Dinghua <dingdinghua85-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: P
Date: Sun, 20 Jun 2010 20:43:00 +0200
Message-ID: <4C1E6134.6070304@gmail.com>
In-Reply-To: <AANLkTinah43AD5N0ZryDsrGprkeVf9-BdLCyr125PQ3p-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 20/06/2010 07:51, Ding Dinghua wrote:
> hello,
>
> 2010/6/19 Dotan Barak<dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>    
>>      
>>> I call rdma_create_id to create an IB ID, then resolve the remote
>>> address and the route, set up the QP, and call rdma_connect to
>>> establish the connection. Until an ack or an error reply arrives,
>>> the thread waits on a wait queue. The listening IB ID on the remote
>>> node catches the connect request, sets up a QP, allocates and maps
>>> pages to construct the RDMA-WRITE space, and calls rdma_accept to
>>> reply to the request.
>>>
>>> Some other information which may be useful:
>>> 1. All the "RETRY EXCEEDED" problems happened when there were two
>>> connections using RDMA-WRITE to transfer things, and the latter
>>> connection had a high probability of hitting this problem.
>>> 2. All the "RETRY EXCEEDED" problems happened when the RDMA-WRITE
>>> space was 256MB each (that is, the two connections consume 512MB of
>>> memory). When the RDMA-WRITE space was 64MB, this problem never
>>> happened in our tests. The remote node's total memory is 2GB.
>>>
>>> Thanks a lot.
>>>
>>>        
>> Some more questions:
>> * Is the WR that "produces" the RETRY EXCEEDED the first one, the last
>> one, or one in the middle?
>>      
> it's the first one
>
>    
>> * Which values are you using in the QP context for the retry counter and
>> the retry timeout?
>> * Did you try to increase those values?
>>      
> I haven't set these values (actually, I don't know where to set these
> values). I only set the max_send_wr and max_send_sge
> fields of struct ib_qp_cap when creating the QP.
>
>    
Can you query the QP after establishing the connection between the QPs
and check those values?
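
To help read those values once they are queried, here is a minimal sketch of how the two fields translate into time. It assumes the usual InfiniBand encoding, where the Local ACK Timeout field is a 5-bit exponent meaning 4.096 usec * 2^timeout (0 means "wait forever") and the retry counter is the number of retransmissions after the first attempt. The helper names ack_timeout_ns and retry_budget_ns are hypothetical and not part of any verbs API; real hardware may wait longer than the nominal value, so treat the result as a rough lower bound.

```c
#include <stdint.h>

/*
 * Nominal Local ACK Timeout in nanoseconds for a given 5-bit
 * "timeout" field, assuming the encoding 4.096 usec * 2^timeout.
 * Returns 0 when the field is 0 (infinite wait) or out of range.
 */
uint64_t ack_timeout_ns(unsigned timeout)
{
    if (timeout == 0 || timeout > 31)
        return 0;
    return 4096ULL << timeout; /* 4.096 usec == 4096 ns */
}

/*
 * Rough lower bound on the time before RETRY EXCEEDED is reported:
 * the first transmission plus retry_cnt retransmissions, each
 * waiting one nominal ACK timeout. Hypothetical helper for
 * illustration only.
 */
uint64_t retry_budget_ns(unsigned timeout, unsigned retry_cnt)
{
    return ack_timeout_ns(timeout) * (retry_cnt + 1ULL);
}
```

For example, timeout = 14 and retry_cnt = 7 give about 67 msec per attempt and about 0.54 sec in total. In the kernel, these are the timeout and retry_cnt fields that ib_query_qp fills in struct ib_qp_attr; with the RDMA CM the retry count is normally passed through the retry_count field of struct rdma_conn_param given to rdma_connect.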

>> * How many more QPs do you have between those nodes and which operations do
>> they use
>>    (only RDMA-WRITEs?)
>>
>>      
> 4096 QPs for each connection,  only do RDMA-WRITES.
>    
So, in total you send 4K (QPs) * 64M (bytes) = 256 GB in parallel?
(Am I missing something, or is this the amount of data that will be sent
between the two nodes?)
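
The arithmetic behind that figure, as a small sketch. It assumes every one of the 4096 QPs writes a full 64 MB region once, which is exactly the assumption being questioned here; total_bytes is a hypothetical helper used only for illustration.

```c
#include <stdint.h>

/* Total bytes in flight if each QP transfers bytes_per_qp once. */
uint64_t total_bytes(uint64_t qps, uint64_t bytes_per_qp)
{
    return qps * bytes_per_qp;
}

/* Convert a byte count to whole binary gigabytes (2^30 bytes). */
uint64_t bytes_to_gib(uint64_t bytes)
{
    return bytes >> 30;
}
```

With qps = 4096 and bytes_per_qp = 64 << 20, the result is 2^38 bytes, that is 256 GB, far more than the 2GB of memory on the remote node.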

Dotan
