public inbox for linux-rdma@vger.kernel.org
From: Mahesh Siddheshwar <siddheshwar.mahesh-xsfywfwIY+M@public.gmane.org>
To: Tom Tucker <tom-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Cc: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org
Subject: Re: [ewg] nfsrdma fails to write big file,
Date: Thu, 04 Mar 2010 08:43:35 -0800
Message-ID: <4B8FE337.7050001@sun.com>
In-Reply-To: <4B8EE813.2010205-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

Tom Tucker wrote:
> Mahesh Siddheshwar wrote:
>> Hi Tom, Vu,
>>
>> Tom Tucker wrote:
>>> Roland Dreier wrote:
>>>>  > +               /*
>>>>  > +                * Add room for frmr register and invalidate WRs
>>>>  > +                * Requests sometimes have two chunks, each chunk
>>>>  > +                * requires to have different frmr. The safest
>>>>  > +                * WRs required are max_send_wr * 6; however, we
>>>>  > +                * get send completions and poll fast enough, it
>>>>  > +                * is pretty safe to have max_send_wr * 4.
>>>>  > +                */
>>>>  > +               ep->rep_attr.cap.max_send_wr *= 4;
>>>>
>>>> Seems like a bad design if there is a possibility of work queue
>>>> overflow; if you're counting on events occurring in a particular order
>>>> or completions being handled "fast enough", then your design is going
>>>> to fail in some high load situations, which I don't think you want.
>>>
>>> Vu,
>>>
>>> Would you please try the following:
>>>
>>> - Set the multiplier to 5
>> While trying to test this between a Linux client and Solaris server,
>> I made the following changes in:
>> /usr/src/ofa_kernel-1.5.1/net/sunrpc/xprtrdma/verbs.c
>>
>> diff verbs.c.org verbs.c
>> 653c653
>> <               ep->rep_attr.cap.max_send_wr *= 3;
>> ---
>> >               ep->rep_attr.cap.max_send_wr *= 8;
>> 685c685
>> <       ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/2 /*  - 1*/;
>> ---
>> >       ep->rep_cqinit = ep->rep_attr.cap.max
>>
>> (I bumped it to 8)
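
For reference, a rough sketch of the arithmetic behind these multipliers.
It assumes (this is my assumption, not something stated in the thread) that
each RPC posts one SEND plus one register WR and one invalidate WR for every
FRMR-mapped chunk. The function names are made up for illustration.

```python
def wrs_per_request(chunks, wrs_per_chunk=2, sends=1):
    """Send-queue WRs one RPC may post under the assumed model.

    Assumption: each FRMR-mapped chunk costs one register WR and one
    invalidate WR, and the RPC itself costs one SEND.
    """
    return sends + chunks * wrs_per_chunk


def covers_worst_case(multiplier, chunks):
    """True if 'max_send_wr *= multiplier' leaves room for every request."""
    return multiplier >= wrs_per_request(chunks)


if __name__ == "__main__":
    # Multipliers mentioned in this thread: 3 (original), 4 (patch),
    # 5 (Tom's suggestion), 8 (my test build).
    for multiplier in (3, 4, 5, 8):
        print(multiplier, covers_worst_case(multiplier, chunks=2))
```

With two chunks per request this model gives 5 WRs, which agrees with the
multiplier Tom suggested trying. It does not reproduce the patch comment's
"safest" figure of 6, so treat it as one plausible accounting only.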
>>
>> I then ran make install.
>> After rebooting, I see the errors on NFS READs, as opposed to the WRITEs
>> seen before, when I try to read a 10G file from the server.
>>
>> The client is running RHEL 5.3 (2.6.18-128.el5PAE) with
>> OFED-1.5.1-20100223-0740 bits. The client has a Sun IB
>> HCA: SUN0070130001, MT25418, 2.7.0 firmware, hw_rev = a0.
>> The server is running Solaris based on snv_128.
>>
>> rpcdebug output from the client:
>>
>> ==
>> RPC:    85 call_bind (status 0)
>> RPC:    85 call_connect xprt ec78d800 is connected
>> RPC:    85 call_transmit (status 0)
>> RPC:    85 xprt_prepare_transmit
>> RPC:    85 xprt_cwnd_limited cong = 0 cwnd = 8192
>> RPC:    85 rpc_xdr_encode (status 0)
>> RPC:    85 marshaling UNIX cred eddb4dc0
>> RPC:    85 using AUTH_UNIX cred eddb4dc0 to wrap rpc data
>> RPC:    85 xprt_transmit(164)
>> RPC:       rpcrdma_inline_pullup: pad 0 destp 0xf1dd1410 len 164 hdrlen 164
>> RPC:       rpcrdma_register_frmr_external: Using frmr ec7da920 to map 4 segments
>> RPC:       rpcrdma_create_chunks: write chunk elem 16384@0x38536d000:0xa601 (more)
>> RPC:       rpcrdma_register_frmr_external: Using frmr ec7da960 to map 1 segments
>> RPC:       rpcrdma_create_chunks: write chunk elem 108@0x31dd153c:0xaa01 (last)
>> RPC:       rpcrdma_marshal_req: write chunk: hdrlen 68 rpclen 164 padlen 0 headerp 0xf1dd124c base 0xf1dd136c lkey 0x500
>> RPC:    85 xmit complete
>> RPC:    85 sleep_on(queue "xprt_pending" time 4683109)
>> RPC:    85 added to queue ec78d994 "xprt_pending"
>> RPC:    85 setting alarm for 60000 ms
>> RPC:       wake_up_next(ec78d944 "xprt_resend")
>> RPC:       wake_up_next(ec78d8f4 "xprt_sending")
>> RPC:       rpcrdma_qp_async_error_upcall: QP error 3 on device mlx4_0 ep ec78db40
>> RPC:    85 __rpc_wake_up_task (now 4683110)
>> RPC:    85 disabling timer
>> RPC:    85 removed from queue ec78d994 "xprt_pending"
>> RPC:       __rpc_wake_up_task done
>> RPC:    85 __rpc_execute flags=0x1
>> RPC:    85 call_status (status -107)
>> RPC:    85 call_bind (status 0)
>> RPC:    85 call_connect xprt ec78d800 is not connected
>> RPC:    85 xprt_connect xprt ec78d800 is not connected
>> RPC:    85 sleep_on(queue "xprt_pending" time 4683110)
>> RPC:    85 added to queue ec78d994 "xprt_pending"
>> RPC:    85 setting alarm for 60000 ms
>> RPC:       rpcrdma_event_process: event rep ec116800 status 5 opcode 80 length 2493606
>> RPC:       rpcrdma_event_process: recv WC status 5, connection lost
>> RPC:       rpcrdma_conn_upcall: disconnected: ec78dbccI4:20049 (ep 0xec78db40 event 0xa)
>> RPC:       rpcrdma_conn_upcall: disconnected
>> rpcrdma: connection to ec78dbccI4:20049 closed (-103)
>> RPC:       xprt_rdma_connect_worker: reconnect
>> ==
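
The numeric codes in the trace above can be decoded with a small lookup.
The tables below are transcribed from Linux errno numbers and from the
kernel's enum ib_event_type, enum ib_wc_status and enum rdma_cm_event_type
as I understand them; they are an aid only and should be checked against
the headers of the kernel actually in use. The decode() helper is a
hypothetical name used for illustration.

```python
# Values transcribed from Linux errno numbers and the kernel's InfiniBand
# and RDMA CM enums; verify against the headers of the kernel in use.
TABLES = {
    "errno": {103: "ECONNABORTED", 107: "ENOTCONN"},
    "ib_event": {3: "IB_EVENT_QP_ACCESS_ERR"},
    "ib_wc_status": {5: "IB_WC_WR_FLUSH_ERR"},
    "rdma_cm_event": {10: "RDMA_CM_EVENT_DISCONNECTED"},
}


def decode(kind, value):
    """Return the symbolic name for a numeric code, or a placeholder.

    abs() is applied because the RPC layer logs errno values negated.
    """
    return TABLES[kind].get(abs(value), "unknown(%d)" % value)


if __name__ == "__main__":
    print("QP error 3      ->", decode("ib_event", 3))
    print("WC status 5     ->", decode("ib_wc_status", 5))
    print("call_status -107 ->", decode("errno", -107))
    print("closed (-103)   ->", decode("errno", -103))
    print("event 0xa       ->", decode("rdma_cm_event", 0xA))
```

Read this way, the client QP reports an access error first, the posted
receive is then flushed, and the connection is torn down. That ordering
looks consistent with the remote access error logged on the server, though
it does not by itself say which side supplied the bad address or rkey.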
>>
>> On the server I see:
>>
>> Mar  3 17:45:16 elena-ar hermon: [ID 271130 kern.notice] NOTICE: hermon0: Device Error: CQE remote access error
>> Mar  3 17:45:16 elena-ar nfssrv: [ID 819430 kern.notice] NOTICE: NFS: bad sendreply
>> Mar  3 17:45:21 elena-ar hermon: [ID 271130 kern.notice] NOTICE: hermon0: Device Error: CQE remote access error
>> Mar  3 17:45:21 elena-ar nfssrv: [ID 819430 kern.notice] NOTICE: NFS: bad sendreply
>>
>> The remote access error is actually seen on RDMA_WRITE.
>> Doing some more debugging on the server with DTrace, I see that
>> the destination address and length match the write chunk
>> element in the Linux debug output above.
>>
>>
>>  0   9385                  rib_write:entry daddr 38536d000, len 4000, hdl a601
>>  0   9358         rib_init_sendwait:return ffffff44a715d308
>>  1   9296       rib_svc_scq_handler:return 1f7
>>  1   9356              rib_sendwait:return 14
>>  1   9386                 rib_write:return 14
>>
>> ^^^ that is RDMA_FAILED in
>>  1  63295    xdrrdma_send_read_data:return 0
>>  1   5969              xdr_READ3res:return
>>  1   5969              xdr_READ3res:return 0
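
The comparison between the client's write chunk element and the server's
rib_write arguments can be checked mechanically. This is a minimal sketch:
it assumes the client prints the chunk as length@address:handle with a
decimal length, and that the DTrace script prints all three values in hex
without a 0x prefix (0x4000 is 16384, which is what makes the two lines
agree). The function names are hypothetical.

```python
import re

# Client side (rpcdebug): "<len decimal>@0x<addr hex>:0x<handle hex>"
CHUNK_RE = re.compile(r"(\d+)@0x([0-9a-f]+):0x([0-9a-f]+)")
# Server side (DTrace): "daddr <hex>, len <hex>, hdl <hex>" (assumed hex)
DTRACE_RE = re.compile(
    r"daddr\s+([0-9a-f]+),\s*len\s+([0-9a-f]+),\s*hdl\s+([0-9a-f]+)"
)


def parse_client_chunk(text):
    """Return (addr, length, handle) from a 'write chunk elem' log line."""
    m = CHUNK_RE.search(text)
    if m is None:
        raise ValueError("no chunk element found in: %r" % text)
    length, addr, handle = m.groups()
    return int(addr, 16), int(length), int(handle, 16)


def parse_server_write(text):
    """Return (addr, length, handle) from the rib_write:entry DTrace line."""
    m = DTRACE_RE.search(text)
    if m is None:
        raise ValueError("no rib_write arguments found in: %r" % text)
    addr, length, handle = m.groups()
    return int(addr, 16), int(length, 16), int(handle, 16)


if __name__ == "__main__":
    client = parse_client_chunk("write chunk elem 16384@0x38536d000:0xa601 (more)")
    server = parse_server_write("rib_write:entry daddr 38536d000, len 4000, hdl a601")
    print(client == server)
```

Both sides therefore name the same address, length and handle, so the
server is writing where the client asked it to.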
>>
>> Is this a variation of the previously discussed issue or something new?
>>
>
> I think this is new. This seems to be some kind of base/bounds or 
> access violation or perhaps an invalid rkey.
>
Thanks for checking, Tom. I can file a new bug against this. The
test setup is a DDR HCA (client) connected to a DDR Voltaire Switch,
connected to a QDR HCA (server, but limited to PCI-gen1). I have
not seen this on a similar setup with both client/server configured with
QDR HCAs.

What type of debug info would you need to debug this further?

Thanks,
Mahesh
>> Thanks,
>> Mahesh
>>
>>> - Set the number of buffer credits small, as follows:
>>>   echo 4 > /proc/sys/sunrpc/rdma_slot_table_entries
>>> - Rerun your test and see if you can reproduce the problem.
>>>
>>> I did the above and was unable to reproduce, but I would like to see
>>> if you can, to convince ourselves that 5 is the right number.
>>>
>>> Thanks,
>>> Tom
>>>
>>>>  - R.
>>>>   
>>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


