public inbox for linux-rdma@vger.kernel.org
From: Tom Tucker <tom-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Mahesh Siddheshwar
	<siddheshwar.mahesh-xsfywfwIY+M@public.gmane.org>,
	ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
Subject: Re: [ewg] nfsrdma fails to write big file,
Date: Wed, 24 Feb 2010 18:02:01 -0600
Message-ID: <4B85BDF9.8020009@opengridcomputing.com>
In-Reply-To: <4B85ACD2.9040405-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

Vu,

Based on the mapping code, it looks to me like the worst-case multiplier
is RPCRDMA_MAX_SEGS * 2 + 1. However, I think in practice, due to the way
the iovs are built, the actual max is 5 (an frmr for the head plus one for
the pagelist, an invalidate for each of those, and one for the send
itself). Why did you think the max was 6?
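
To make the arithmetic above concrete, here is a minimal sketch of the
counting; it is not the xprtrdma code. The names wrs_per_rpc(),
send_queue_depth() and EXAMPLE_MAX_SEGS are made up for illustration, and
the value 8 is only an assumed stand-in for RPCRDMA_MAX_SEGS (check the
xprtrdma headers in your tree). The request count of 32 in the note below
is also just an example, although 32 * 3 / 2 = 48 does match the cq_init 48
in the quoted log.

```c
#include <assert.h>
#include <limits.h>

/* Assumed stand-in for RPCRDMA_MAX_SEGS; illustration only. */
#define EXAMPLE_MAX_SEGS 8u

/*
 * WRs consumed by one RPC when `chunks` regions each need an frmr
 * register WR and a matching invalidate WR, plus one WR for the send.
 */
unsigned int wrs_per_rpc(unsigned int chunks)
{
	return chunks * 2u + 1u;
}

/*
 * Send queue depth needed if every one of `max_requests` outstanding
 * RPCs uses `chunks` regions at the same time.
 */
unsigned int send_queue_depth(unsigned int max_requests, unsigned int chunks)
{
	unsigned int per_rpc = wrs_per_rpc(chunks);

	/* Guard against overflow in the multiplication below. */
	assert(max_requests <= UINT_MAX / per_rpc);
	return max_requests * per_rpc;
}
```

With these definitions, wrs_per_rpc(2) gives the practical multiplier of 5
(head + pagelist), wrs_per_rpc(EXAMPLE_MAX_SEGS) gives the worst case of 17
under the assumed segment count, and send_queue_depth(32, 2) gives 160
entries versus the 96 that the current "*= 3" sizing would provide.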

Thanks,
Tom

Tom Tucker wrote:
> Vu,
>
> Are you changing any of the default settings? For example rsize/wsize, 
> etc... I'd like to reproduce this problem if I can.
>
> Thanks,
>
> Tom
>
> Vu Pham wrote:
>   
>> Tom,
>>
>> Did you make any change to have bonnie++, dd of a 10G file and vdbench
>> concurrently run & finish?
>>
>> I keep hitting the WQE overflow error below.
>> I saw that most of the requests have two chunks (a 32K chunk and a
>> some-bytes chunk), and each chunk requires an frmr + invalidate WR pair.
>> However, you set ep->rep_attr.cap.max_send_wr = cdata->max_requests and
>> then for the frmr case you do
>> ep->rep_attr.cap.max_send_wr *= 3; which is not enough. Moreover, you
>> also set ep->rep_cqinit = max_send_wr/2 for the send completion signal,
>> which makes the WQE overflow happen sooner.
>>
>> After applying the following patch, I have had vdbench, dd, and a copy
>> of 10g_file running overnight.
>>
>> -vu
>>
>>
>> --- ofa_kernel-1.5.1.orig/net/sunrpc/xprtrdma/verbs.c   2010-02-24 10:41:22.000000000 -0800
>> +++ ofa_kernel-1.5.1/net/sunrpc/xprtrdma/verbs.c        2010-02-24 10:03:18.000000000 -0800
>> @@ -649,8 +654,15 @@
>>         ep->rep_attr.cap.max_send_wr = cdata->max_requests;
>>         switch (ia->ri_memreg_strategy) {
>>         case RPCRDMA_FRMR:
>> -               /* Add room for frmr register and invalidate WRs */
>> -               ep->rep_attr.cap.max_send_wr *= 3;
>> +               /* 
>> +                * Add room for frmr register and invalidate WRs
>> +                * Requests sometimes have two chunks, each chunk
>> +                * requires to have different frmr. The safest
>> +                * WRs required are max_send_wr * 6; however, we
>> +                * get send completions and poll fast enough, it
>> +                * is pretty safe to have max_send_wr * 4. 
>> +                */
>> +               ep->rep_attr.cap.max_send_wr *= 4;
>>                 if (ep->rep_attr.cap.max_send_wr > devattr.max_qp_wr)
>>                         return -EINVAL;
>>                 break;
>> @@ -682,7 +694,8 @@
>>                 ep->rep_attr.cap.max_recv_sge);
>>
>>         /* set trigger for requesting send completion */
>> -       ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/2 /*  - 1*/;
>> +       ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/4;
>> +       
>>         switch (ia->ri_memreg_strategy) {
>>         case RPCRDMA_MEMWINDOWS_ASYNC:
>>         case RPCRDMA_MEMWINDOWS:
>>
>>
>>
>>
>>
>>   
>>     
>>> -----Original Message-----
>>> From: ewg-bounces-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org [mailto:ewg-bounces-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org] On Behalf Of Vu Pham
>>> Sent: Monday, February 22, 2010 12:23 PM
>>> To: Tom Tucker
>>> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Mahesh Siddheshwar;
>>> ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
>>> Subject: Re: [ewg] nfsrdma fails to write big file,
>>>
>>> Tom,
>>>
>>> Some more info on the problem:
>>> 1. Running with memreg=4 (FMR), I cannot reproduce the problem.
>>> 2. I also see a different error on the client:
>>>
>>> Feb 22 12:16:55 mellanox-2 rpc.idmapd[5786]: nss_getpwnam: name 'nobody' does not map into domain 'localdomain'
>>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x70004b: WQE overflow
>>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
>>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
>>> Feb 22 12:16:55 mellanox-2 kernel: RPC: rpcrdma_ep_post: ib_post_send returned -12 cq_init 48 cq_count 32
>>> Feb 22 12:17:00 mellanox-2 kernel: RPC:       rpcrdma_event_process: send WC status 5, vend_err F5
>>> Feb 22 12:17:00 mellanox-2 kernel: rpcrdma: connection to 13.20.1.9:20049 closed (-103)
>>>
>>> -vu
>>>
>>>     
>>>       
>>>> -----Original Message-----
>>>> From: Tom Tucker [mailto:tom-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org]
>>>> Sent: Monday, February 22, 2010 10:49 AM
>>>> To: Vu Pham
>>>> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Mahesh Siddheshwar;
>>>> ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
>>>> Subject: Re: [ewg] nfsrdma fails to write big file,
>>>>
>>>> Vu Pham wrote:
>>>>> Setup:
>>>>> 1. linux nfsrdma client/server with OFED-1.5.1-20100217-0600,
>>>>> ConnectX2 QDR HCAs fw 2.7.8-6, RHEL 5.2.
>>>>> 2. Solaris nfsrdma server snv 130, ConnectX QDR HCA.
>>>>>
>>>>>
>>>>> Running vdbench on a 10g file or *dd if=/dev/zero of=10g_file bs=1M
>>>>> count=10000*, the operation fails, the connection gets dropped, and
>>>>> the client cannot re-establish the connection to the server.
>>>>> After rebooting only the client, I can mount again.
>>>>>
>>>>> It happens with both solaris and linux nfsrdma servers.
>>>>>
>>>>> For the linux client/server, I run memreg=5 (FRMR); I don't see the
>>>>> problem with memreg=6 (global dma key).
>>>>>
>>>> Awesome. This is the key I think.
>>>>
>>>> Thanks for the info Vu,
>>>> Tom
>>>>
>>>>> On the Solaris server snv 130, we see a problem decoding a write
>>>>> request of 32K. The client sends two read chunks (32K & 16-byte), and
>>>>> the server fails to do an rdma read on the 16-byte chunk (cqe.status
>>>>> = 10, i.e. IB_WC_REM_ACCESS_ERR); therefore, the server terminates
>>>>> the connection. We don't see this problem with nfs version 3 on
>>>>> Solaris. The Solaris server runs in normal memory registration mode.
>>>>>
>>>>> On the linux client, I see cqe.status = 12, i.e. IB_WC_RETRY_EXC_ERR.
>>>>>
>>>>> I added these notes in bug #1919 (bugs.openfabrics.org) to track the
>>>>> issue.
>>>>> thanks,
>>>>> -vu
>>>>> _______________________________________________
>>>>> ewg mailing list
>>>>> ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
>>>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
