From: "Steve Wise" <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Raju Rangoju <rajur-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>,
linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Interrupted IO causing async errors
Date: Thu, 23 Jun 2016 10:42:43 -0500
Message-ID: <00e101d1cd65$e19bf360$a4d3da20$@opengridcomputing.com>
Hey Chuck, we observe with 4.7-rc4 (and older kernels too) that interrupting a
dbench test on an nfsrdma/cxgb4 mount while it is doing heavy I/O can result in
cxgb4 logging an "invalid stag" error on an ingress RDMA WRITE message. Is
this expected? I'm wondering if this is a normal side effect of interrupting
the I/O on the mount, or maybe due to the mount options or NFS version. This
error could happen if the NFSRDMA client invalidated MRs that were advertised to
the server for I/O while that I/O was still in flight. Should we
dive in further? Thoughts? Thanks...
Here are the details of the test.
Steps:
-> Load iw_cxgb4,rdma_ucm on both nodes.
-> Assign IP addresses to the Chelsio interfaces on both nodes.
Server Side [gayabari]:
-> mknod /dev/ram0 b 1 0
-> modprobe brd rd_nr=1 rd_size=1048576
-> mkdir /nfsrdma
-> mkfs.ext3 /dev/ram0
-> mount /dev/ram0 /nfsrdma
-> vim /etc/exports
/nfsrdma *(sync,insecure,rw,no_root_squash,no_subtree_check)
-> modprobe xprtrdma
-> modprobe svcrdma
-> service nfsserver restart
-> echo rdma 20049 > /proc/fs/nfsd/portlist
-> exportfs -rav
Client Side [sonada]:
-> modprobe xprtrdma
-> modprobe svcrdma
-> mount 102.1.1.186:/nfsrdma/ -o rdma,port=20049,vers=3,wsize=65536,rsize=65536 /mnt/
-> Then run below command on client [sonada] :
sonada:~ # dbench -t100 -D /root/share1/ 10
-> The issue is seen only when the dbench test is killed partway through; otherwise it runs fine.
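The kill step can be scripted roughly as follows. This is only a sketch, not the exact procedure used in the test: the helper name run_and_interrupt, the 30-second delay, and the choice of SIGTERM are all assumptions, since the steps above only say that dbench was killed partway through the run.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test): start a command,
# let it run for DELAY seconds, then terminate it.
# Assumption: SIGTERM is used. A non-interactive shell starts background
# jobs with SIGINT ignored, so "kill -INT" may have no effect here.
run_and_interrupt() {
    delay=$1
    shift
    "$@" &                          # start the workload in the background
    pid=$!
    sleep "$delay"                  # give the I/O time to ramp up
    kill -TERM "$pid" 2>/dev/null   # interrupt the workload mid-run
    wait "$pid"                     # reap it and return its exit status
}

# Example, using the dbench invocation from the steps above
# (the 30-second delay is an arbitrary choice):
# run_and_interrupt 30 dbench -t100 -D /root/share1/ 10
```

After the helper returns, the client's kernel log can be checked for the async error shown below.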
Error seen on the nfsrdma client:
[ 1593.398351] cxgb4 0000:01:00.4: AE qpid 1028 opcode 0 status 0x1 type 0 len 0x18e6009c wrid.hi 0x2cce2dc wrid.lo 0x2
[ 1593.398374] RPC: rpcrdma_qp_async_error_upcall: QP request error on device cxgb4_0 ep ffff88022f3567e8
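To check repeated runs for the same event, the AE line can be reduced to a few fields. The sketch below assumes the message format matches the single cxgb4 line shown above, printed on one line as dmesg normally does; the function name ae_summary is hypothetical, and the script does not interpret the status code.

```shell
#!/bin/sh
# Hypothetical filter: read kernel log lines on stdin and print the
# qpid, opcode, and status fields of each cxgb4 "AE" line.
# The field layout is assumed from the one log line in this report.
ae_summary() {
    sed -n 's/.*AE qpid \([0-9][0-9]*\) opcode \([0-9][0-9]*\) status \(0x[0-9a-fA-F][0-9a-fA-F]*\).*/qpid=\1 opcode=\2 status=\3/p'
}

# Example: dmesg | ae_summary
```

Lines that do not match the assumed format are dropped silently, so an empty result means either no AE was logged or the format differs on that kernel.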