From: Chuck Lever <chuck.lever@oracle.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
linux-rdma@vger.kernel.org,
Roland Dreier <roland@purestorage.com>,
Allen Andrews <allen.andrews@emulex.com>
Subject: Re: [PATCH V3 00/17] NFS/RDMA client-side patches
Date: Fri, 2 May 2014 16:20:41 -0400
Message-ID: <45067B04-660C-4971-B12F-AEC9F7D32785@oracle.com>
In-Reply-To: <5363f223.e39f420a.4af6.6fc9SMTPIN_ADDED_BROKEN@mx.google.com>

On May 2, 2014, at 3:27 PM, Doug Ledford <dledford@redhat.com> wrote:
> ----- Original Message -----
>> Changes since V2:
>>
>> - Rebased on v3.15-rc3
>>
>> - "enable pad optimization" dropped. Testing showed Linux NFS/RDMA
>> server does not support pad optimization yet.
>>
>> - "ALLPHYSICAL CONFIG" dropped. There is a lack of consensus on
>> this one. Christoph would like ALLPHYSICAL removed, but the HPC
>> community prefers keeping a performance-at-all-costs option. And,
>> with most other registration modes now removed, ALLPHYSICAL is
>> the mode of last resort if an adapter does not support FRMR or
>> MTHCAFMR, since ALLPHYSICAL is universally supported. We will
>> very likely revisit this later. I'm erring on the side of less
>> churn and dropping this until the community agrees on how to
>> move forward.
>>
>> - Added a patch to ensure there is always a valid ->qp if RPCs
>> might awaken while the transport is disconnected.
>>
>> - Added a patch to clean up an MTU settings hack for a very old
>> adapter model.
>>
>> Test and review the "nfs-rdma-client" branch:
>>
>> git://git.linux-nfs.org/projects/cel/cel-2.6.git
>>
>> Thanks!
>
> Hi Chuck,
>
> I've installed this in my cluster and ran a number of simple tests
> over a variety of hardware. For the most part, it's looking much
> better than NFSoRDMA looked a kernel or two back, but I can still
> trip it up. All tests were run with rhel7 + current upstream
> kernel.
>
> My server was using mlx4 hardware in both IB and RoCE modes.
>
> I tested from mlx4 client in both IB and RoCE modes -> not DOA
> I tested from mlx5 client in IB mode -> not DOA
> I tested from mthca client in IB mode -> not DOA
> I tested from qib client in IB mode -> not DOA
> I tested from ocrdma client in RoCE mode -> DOA (cpu soft lockup
> on mount on the client)
>
> I tested nfsv3 -> not DOA
> I tested nfsv4 + rdma -> still DOA, but I think this is expected
> as last I knew someone needs to write code for nfsv4 mountd
> over rdma before this will work (as nfsv3 uses a tcp connection
> to do mounting, and then switches to rdma for data transfers
> and nfsv4 doesn't support that or something like that...this
> is what I recall Jeff Layton telling me anyway)
>
> I tested nfsv3 in both IB and RoCE modes with rsize=32768 and
> wsize=32768 -> not DOA, reliable, did data verification and passed
>
> I tested nfsv3 in both IB and RoCE modes with rsize=65536 and
> wsize=65536 -> not DOA, but not reliable either, data transfers
> will stop after a certain amount has been transferred and the
> mount will have a soft hang

Can you clarify what you mean by “soft hang”? Are you seeing a
problem when mounting with the “soft” mount option, or does this
mean “CPU soft lockup”? (INFO: task hung for 120 seconds)

> My data verification was simple (but generally effective in
> lots of scenarios):
>
> I had a full linux kernel git repo, with a complete build in it
> (totaling a little over 9GB of disk space used) and I would run
> tar -cf - linus | tar -xvf - -C <tmpdir> to copy the tree
> around (I did copies both on the same mount and on a different
> mount that was also NFSoRDMA, including copying from an IB
> NFSoRDMA mount to a RoCE NFSoRDMA mount on different mlx4 ports),
> and then diff -uprN on the various tree locations to check for
> any data differences.
>
> So there's your testing report. As I said in the beginning, it's
> definitely better than it was since it used to oops the server and
> I didn't encounter any server side problems this time, only client
> side problems.

Thanks for testing!

> ToDo items that I see:
>
> Write NFSv4 rdma protocol mount support

NFSv4 does not use the MNT protocol. If NFSv4 is not working for you,
there’s something else going on. For me NFSv4 works as well as NFSv3.
Let me know if you need help troubleshooting.

> Fix client soft mount hangs when rsize/wsize > 32768

Does that problem occur with unpatched v3.15-rc3 on the client?

HCAs/RNICs that support MTHCAFMR and FRMR should be working up to the
largest rsize and wsize supported by the client and server.

When I use ALLPHYSICAL with large wsize, typically the server starts
dropping NFS WRITE requests. The client retries them forever, and that
looks like a mount point hang.

Something like https://bugzilla.linux-nfs.org/show_bug.cgi?id=248

> Fix DOA of ocrdma driver

Does that problem occur with unpatched v3.15-rc3 on the client?

Emulex has reported some problems when reconnecting, but
I haven’t heard of issues that occur right at mount time.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com