public inbox for linux-rdma@vger.kernel.org
* Interrupted IO causing async errors
@ 2016-06-23 15:42 Steve Wise
  2016-06-24  0:15 ` Chuck Lever
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Wise @ 2016-06-23 15:42 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Raju Rangoju, linux-nfs, linux-rdma

Hey Chuck, with 4.7-rc4 (and older kernels too) we observe that interrupting a
dbench test on an nfsrdma/cxgb4 mount during heavy I/O can result in
cxgb4 logging an "invalid stag" error on an ingress RDMA WRITE message. Is
this expected? I'm wondering if this is a normal side effect of interrupting
the I/O on the mount, or maybe due to the mount options or NFS version. This
error could happen if the NFSRDMA client invalidated MRs that were advertised to
the server for I/O while that I/O was still in flight. Is this expected, or
should we dive in further? Thoughts? Thanks...
 
Here are the details of the test.

Steps:

-> Load iw_cxgb4 and rdma_ucm on both nodes.
-> Assign IP addresses to the Chelsio interfaces on both nodes.

Server Side [gayabari]:

-> mknod /dev/ram0 b 1 0
-> modprobe brd rd_nr=1 rd_size=1048576
-> mkdir /nfsrdma 
-> mkfs.ext3 /dev/ram0
-> mount /dev/ram0 /nfsrdma
-> vim /etc/exports
   /nfsrdma  *(sync,insecure,rw,no_root_squash,no_subtree_check)

-> modprobe xprtrdma 
-> modprobe svcrdma
-> service nfsserver restart 
-> echo rdma 20049 > /proc/fs/nfsd/portlist
-> exportfs -rav
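
A minimal sketch for confirming that the server picked up the RDMA listener after the portlist write above. The function name has_rdma_listener is hypothetical, and the sketch assumes that reading /proc/fs/nfsd/portlist returns one "transport port" pair per line:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the portlist file has an "rdma <port>" entry.
# Usage: has_rdma_listener PORT [FILE]
# FILE defaults to /proc/fs/nfsd/portlist; the one-pair-per-line layout
# ("rdma 20049") is an assumption of this sketch.
has_rdma_listener() {
    hrl_port=$1
    hrl_file=${2:-/proc/fs/nfsd/portlist}
    grep -q "^rdma ${hrl_port}\$" "$hrl_file"
}
```

On the server this could be run as `has_rdma_listener 20049 && echo "rdma listener present"` before mounting from the client.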

Client Side [sonada]:

-> modprobe xprtrdma 
-> modprobe svcrdma

-> mount 102.1.1.186:/nfsrdma/ -o rdma,port=20049,vers=3,wsize=65536,rsize=65536 /mnt/

-> Then run the command below on the client [sonada]:
sonada:~ # dbench -t100 -D /root/share1/  10


-> The issue is seen only when the dbench test is killed partway through; otherwise it runs fine.
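
The interrupt step can be scripted so the reproduction is repeatable. This is only a sketch: the helper name interrupt_after is hypothetical, the delay is arbitrary, and it sends SIGTERM to the parent process only, which may not match exactly how the test was killed by hand:

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background, send it SIGTERM
# after DELAY seconds, then return the command's exit status.
# Usage: interrupt_after DELAY COMMAND [ARGS...]
interrupt_after() {
    ia_delay=$1
    shift
    "$@" &                      # start the workload in the background
    ia_pid=$!
    sleep "$ia_delay"           # let it build up I/O before interrupting
    # SIGTERM rather than SIGINT: non-interactive shells start background
    # jobs with SIGINT ignored, so SIGINT would have no effect here.
    kill -TERM "$ia_pid" 2>/dev/null
    wait "$ia_pid"              # a signal death shows up as 128 + signal number
}

# Example for this test (30 seconds is an arbitrary choice):
#   interrupt_after 30 dbench -t100 -D /root/share1/ 10
#   dmesg | grep -E 'cxgb4.*AE qpid|rpcrdma_qp_async_error_upcall'
```

A status of 143 (128 + 15) from the helper means the workload was terminated by the signal rather than finishing on its own.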

Error seen on the nfsrdma client:

[ 1593.398351] cxgb4 0000:01:00.4: AE qpid 1028 opcode 0 status 0x1 type 0 len 0x18e6009c wrid.hi 0x2cce2dc wrid.lo 0x2
[ 1593.398374] RPC:       rpcrdma_qp_async_error_upcall: QP request error on device cxgb4_0 ep ffff88022f3567e8
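
When comparing several of these events, it can help to split the cxgb4 AE line into its fields. A small sketch, assuming the wrapped log line has been rejoined into a single line and that everything after "AE" is a sequence of name/value pairs, as it is in the line above. The name parse_ae is hypothetical, and the helper only splits the fields without interpreting them:

```shell
#!/bin/sh
# Hypothetical helper: print the name/value pairs that follow "AE" in a
# cxgb4 log line as name=value, one pair per output line.
# Usage: parse_ae 'LOG LINE'
parse_ae() {
    printf '%s\n' "$1" |
        sed -n 's/.*: AE //p' |   # keep only the text after ": AE "
        tr -s ' ' '\n' |          # one token per line
        paste -d= - -             # rejoin tokens two at a time as name=value
}
```

Given the AE line above, the first pair printed is qpid=1028 and the status field comes out as status=0x1.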



Thread overview: 5+ messages
2016-06-23 15:42 Interrupted IO causing async errors Steve Wise
2016-06-24  0:15 ` Chuck Lever
     [not found]   ` <BDD2D64C-7A05-42EB-83C2-F95825C7579D-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-06-24 13:36     ` Steve Wise
2016-06-24 19:26     ` Jason Gunthorpe
     [not found]       ` <20160624192637.GE14506-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-24 22:16         ` Chuck Lever
