From: Steve Wise <swise@opengridcomputing.com>
To: "'J. Bruce Fields'" <bfields@fieldses.org>,
	Tom Tucker <tom@opengridcomputing.com>
Cc: "'Yan Burman'" <yanb@mellanox.com>,
	linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	"'Or Gerlitz'" <ogerlitz@mellanox.com>
Subject: Re: NFS over RDMA crashing
Date: Sat, 08 Mar 2014 13:20:48 -0600
Message-ID: <531B6D90.2090208@opengridcomputing.com>
In-Reply-To: <531B47B3.1070503@opengridcomputing.com>


> I removed your change and started debugging the original crash that
> happens on top-of-tree.  It looks like rq_next_page is wrong: it
> should always be >= rq_respages, yes?  I added a BUG_ON() in
> rdma_read_xdr() to assert this, and we hit it.  Look:
>
> crash> svc_rqst.rq_next_page 0xffff8800b84e6000
>   rq_next_page = 0xffff8800b84e6228
> crash> svc_rqst.rq_respages 0xffff8800b84e6000
>   rq_respages = 0xffff8800b84e62a8
>
> Any ideas Bruce/Tom?
>

Guys, the patch below seems to fix the problem.  Dunno if it is correct 
though.  What do you think?

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 0ce7552..6d62411 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
                 sge_no++;
         }
         rqstp->rq_respages = &rqstp->rq_pages[sge_no];
+       rqstp->rq_next_page = rqstp->rq_respages;

         /* We should never run out of SGE because the limit is defined to
          * support the max allowed RPC data length
@@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,

         /* rq_respages points one past arg pages */
         rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
+       rqstp->rq_next_page = rqstp->rq_respages;

         /* Create the reply and chunk maps */
         offset = 0;


