From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>,
Eryu Guan <guaneryu@gmail.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: upstream server crash
Date: Mon, 24 Oct 2016 14:08:58 -0400
Message-ID: <20161024180858.GA27359@fieldses.org>
In-Reply-To: <1477322680.14828.6.camel@redhat.com>

On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
> On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> > On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> > > >
> > > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > > >
> > > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > > >
> > > > >
> > > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > > pointers for RPC Call and Reply messages".
> > > > > >
> > > > > > > I haven't tried to understand that commit yet, or why it would be a
> > > > > > > problem. I don't see an obvious connection, but I can take a closer look Monday.
> > > > > > >
> > > > > > > It could even be that I just landed on this commit by chance; the problem is a
> > > > > > > little hard to reproduce, so I don't completely trust my testing.
> > > > >
> > > > > > I've hit the same crash on a 4.9-rc1 kernel, and it reproduces reliably
> > > > > > for me when running xfstests case generic/013 on a loopback-mounted
> > > > > > NFSv4.1 (or NFSv4.2) export, with XFS as the underlying exported fs.
> > > > > > For more details, please see
> > > > >
> > > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > > >
> > > >
> > > > Looks like you landed at the same commit as Bruce, so that's probably
> > > > legit. That commit is very small though. The only real change that
> > > > doesn't affect the new field is this:
> > > >
> > > >
> > > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > >                       req->rq_buffer,
> > > >                       req->rq_callsize);
> > > >          xdr_buf_init(&req->rq_rcv_buf,
> > > > -                     (char *)req->rq_buffer + req->rq_callsize,
> > > > +                     req->rq_rbuffer,
> > > >                       req->rq_rcvsize);
> > > >
> > > >
> > > > So I'm guessing this is breaking the callback channel somehow?
> > >
> > > Could be the TCP backchannel code is using rq_buffer in a different
> > > way than RDMA backchannel or the forward channel code.
> > >
> >
> > Well, it basically allocates a page per rpc_rqst and then maps that.
> >
> > One thing I notice is that this patch ensures that rq_rbuffer gets set
> > up in rpc_malloc and xprt_rdma_allocate, but it looks like
> > xprt_alloc_bc_req didn't get the same treatment.
> >
> > I suspect that that may be the problem...
> >
> In fact, maybe we just need this here? (untested and probably
> whitespace damaged):

No change in results for me.

--b.

>
> diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> index ac701c28f44f..c561aa8ce05b 100644
> --- a/net/sunrpc/backchannel_rqst.c
> +++ b/net/sunrpc/backchannel_rqst.c
> @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
>                  goto out_free;
>          }
>          req->rq_rcv_buf.len = PAGE_SIZE;
> +        req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
>
>          /* Preallocate one XDR send buffer */
>          if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {
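
For readers following the quoted hunks, here is a minimal user-space sketch in C
of what the rpc_xdr_encode() change appears to imply: the reply area used to be
derived from rq_buffer + rq_callsize, and is now read from a separate
rq_rbuffer pointer, so each allocator has to set that pointer itself. This is a
toy model, not kernel code. struct toy_rqst, toy_alloc_updated(),
toy_alloc_stale(), reply_base_old() and reply_base_new() are hypothetical names
invented for illustration, and the single contiguous allocation is an
assumption. It does not show the cause of the crash; as noted above, setting
rq_rbuffer in xprt_alloc_bc_req made no difference in testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the fields discussed above; NOT the kernel's struct rpc_rqst. */
struct toy_rqst {
    void   *rq_buffer;    /* start of the Call (send) area */
    void   *rq_rbuffer;   /* start of the Reply (receive) area */
    size_t  rq_callsize;  /* bytes reserved for the Call */
    size_t  rq_rcvsize;   /* bytes reserved for the Reply */
};

/* Old scheme: the reply area is computed from rq_buffer and rq_callsize. */
void *reply_base_old(const struct toy_rqst *req)
{
    assert(req->rq_buffer != NULL);
    return (char *)req->rq_buffer + req->rq_callsize;
}

/* New scheme: the reply area is whatever the allocator stored in rq_rbuffer. */
void *reply_base_new(const struct toy_rqst *req)
{
    return req->rq_rbuffer;
}

/* Hypothetical allocator that was updated to set both pointers. */
int toy_alloc_updated(struct toy_rqst *req, size_t callsize, size_t rcvsize)
{
    char *buf = calloc(1, callsize + rcvsize);

    if (buf == NULL)
        return -1;
    req->rq_buffer = buf;
    req->rq_rbuffer = buf + callsize;
    req->rq_callsize = callsize;
    req->rq_rcvsize = rcvsize;
    return 0;
}

/* Hypothetical allocator that was NOT updated: rq_rbuffer stays NULL. */
int toy_alloc_stale(struct toy_rqst *req, size_t callsize, size_t rcvsize)
{
    char *buf = calloc(1, callsize + rcvsize);

    if (buf == NULL)
        return -1;
    req->rq_buffer = buf;
    req->rq_rbuffer = NULL;
    req->rq_callsize = callsize;
    req->rq_rcvsize = rcvsize;
    return 0;
}
```

With a request set up by toy_alloc_updated(), both helpers return the same
address. With one set up by toy_alloc_stale(), reply_base_old() still returns a
usable address while reply_base_new() returns NULL, which is the kind of gap
being looked for in allocators the original commit did not touch.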