From: "Steve Wise" <swise@opengridcomputing.com>
To: "'Chuck Lever'" <chuck.lever@oracle.com>
Cc: "'Raju Rangoju'" <rajur@chelsio.com>, <linux-nfs@vger.kernel.org>,
<linux-rdma@vger.kernel.org>
Subject: RE: Interrupted IO causing async errors
Date: Fri, 24 Jun 2016 08:36:55 -0500
Message-ID: <001601d1ce1d$7990c800$6cb25800$@opengridcomputing.com>
In-Reply-To: <BDD2D64C-7A05-42EB-83C2-F95825C7579D@oracle.com>
> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Thursday, June 23, 2016 7:15 PM
> To: Steve Wise
> Cc: Raju Rangoju; linux-nfs@vger.kernel.org; linux-rdma@vger.kernel.org
> Subject: Re: Interrupted IO causing async errors
>
> Hi Steve-
>
> > On Jun 23, 2016, at 11:42 AM, Steve Wise <swise@opengridcomputing.com> wrote:
> >
> > Hey chuck, we observe with 4.7-rc4 (and older kernels too) that interrupting a
> > dbench test on a nfsrdma/cxgb4 mount while it is doing heavy I/O can result in
> > cxgb4 logging an "invalid stag" error on an ingress RDMA WRITE message. Is
> > this expected? I'm wondering if this is a normal side effect of interrupting
> > the IO on the mount. Maybe due to the mount options or NFS version? This
> > error could happen if the NFSRDMA client invalidated MRs that were advertised to
> > the server for IO, while IO was still in flight. Is this expected or should we
> > dive in further? Thoughts? thanks...
>
> When an application is signaled, outstanding RPCs are terminated.
> When an RPC completes, whether because a reply was received,
> or because the local application has died, any memory that was
> registered on behalf of that RPC is invalidated before it can be
> used for something else. The data in that memory remains at rest
> until invalidation and DMA unmapping is complete.
>
> It appears that your server is attempting to read an argument or
> write a result for an RPC that is no longer pending. I think both
> sides should report a transport error, and the connection should
> terminate. No other problems, though: other operations should
> continue normally after the client re-establishes a fresh connection.
>
> If this doesn't match your observations, let me know.
>
This is exactly what we see. Thanks!
Steve.
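
For readers following the thread, the sequence Chuck describes can be sketched as a toy model. This is only an illustration of the ordering of events (register, advertise, signal, invalidate, late RDMA WRITE, reconnect); it is not kernel code, and every name in it (Client, InvalidStag, simulate_interrupt, the integer STags and XIDs) is a hypothetical stand-in, not an xprtrdma or cxgb4 symbol.

```python
# Toy model only. Client, InvalidStag and simulate_interrupt are hypothetical
# names for illustration; they do not correspond to kernel or cxgb4 symbols.


class InvalidStag(Exception):
    """A remote access named an STag that is no longer registered."""


class Client:
    def __init__(self):
        self.valid_stags = set()  # STags currently valid on the adapter
        self.pending = {}         # xid -> STag advertised for that RPC
        self.connected = True
        self._next_stag = 1

    def send_rpc(self, xid):
        """Register memory for the RPC and advertise its STag to the server."""
        stag = self._next_stag
        self._next_stag += 1
        self.valid_stags.add(stag)
        self.pending[xid] = stag
        return stag

    def complete_rpc(self, xid):
        """RPC ends (reply received or application signaled): invalidate."""
        stag = self.pending.pop(xid)
        self.valid_stags.discard(stag)

    def ingress_rdma_write(self, stag):
        """Server writes a result into client memory named by the STag."""
        if stag not in self.valid_stags:
            # Modeled as a transport error that terminates the connection.
            self.connected = False
            raise InvalidStag(stag)

    def reconnect(self):
        self.connected = True


def simulate_interrupt():
    """Replay the scenario from the thread and return the observed events."""
    events = []
    client = Client()

    stag = client.send_rpc(xid=1)
    client.complete_rpc(xid=1)  # application signaled while I/O is in flight

    try:
        client.ingress_rdma_write(stag)  # server still working on xid 1
    except InvalidStag:
        events.append("invalid stag")
    events.append("connected" if client.connected else "disconnected")

    # After a fresh connection, new RPCs proceed normally.
    client.reconnect()
    stag2 = client.send_rpc(xid=2)
    client.ingress_rdma_write(stag2)
    client.complete_rpc(xid=2)
    events.append("connected" if client.connected else "disconnected")
    return events
```

Calling simulate_interrupt() walks through the interrupted RPC, the rejected late write, and the recovery after reconnect. The model deliberately ignores DMA unmapping and everything on the server side.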