From: Gyorgy Jeney <nog.lkml@gmail.com>
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: NFS over RDMA problem: svcrdma: Error fast registering memory for xprt ffff8803307d7400
Date: Wed, 14 Jul 2010 23:17:23 +0200
Message-ID: <AANLkTik-BR1txJIbbzaB8YKJQM5lswbmWpbhKSdeUseg@mail.gmail.com>
In-Reply-To: <AANLkTinjZfpICMjP1EcU4CPAe1TwxKp1zsk-lwX6fmzA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
> I am attempting to use NFS over RDMA (over infiniband), but there is some
> problem.  The NFS filesystem can be mounted on the client, and things
> will work for some time (can read, modify, etc. the files over the mount),
> but then (at a seemingly random time) the NFS server will dump these
> lines to the logs:
>
> [ 4380.623922] svcrdma: Error fast registering memory for xprt ffff8803307d7400
> [ 4413.343161] svcrdma: error fast registering xdr for xprt ffff8803319edc00
Digging into it further, it seems like the Mellanox InfiniBand driver
could somehow be involved. After adding some traces to the code, it's
obvious that something like this is happening:
At some point sq_cq_reap() is called, which ends up like this:
sq_cq_reap()
ib_poll_cq()
mlx4_ib_poll_cq()
mlx4_ib_poll_one()
mlx4_ib_handle_error_cqe()
- Which then sets wc->status to IB_WC_WR_FLUSH_ERR rather
  often, but the killer blow seems to be when
  IB_WC_REM_ACCESS_ERR is set.
- Because of the earlier error, sq_cq_reap() sets the XPT_CLOSE flag
Then, sometime later:
fast_reg_read_chunks()
svc_rdma_fastreg()
svc_rdma_send()
svc_rdma_send()
- XPT_CLOSE is set and hence -ENOTCONN is returned
- Since svc_rdma_fastreg() had an error, fast_reg_read_chunks() bails
  and the client then seems to hang.
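To make the sequence above concrete, here is a minimal userspace sketch of
it. It is not the kernel code: every model_* name is hypothetical, and the
only behaviour it assumes is what the traces suggest, namely that a
non-success completion status marks the transport for close and that the
send path then returns -ENOTCONN.

```c
/* Userspace model of the traced sequence; NOT the svcrdma implementation. */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the completion statuses seen in the trace. */
enum model_wc_status {
    MODEL_WC_SUCCESS,
    MODEL_WC_WR_FLUSH_ERR,
    MODEL_WC_REM_ACCESS_ERR,
};

/* Hypothetical transport; the single field models the XPT_CLOSE flag. */
struct model_xprt {
    bool close_pending;
};

/* Models sq_cq_reap(): assumed to flag the transport on any error status. */
static void model_sq_cq_reap(struct model_xprt *xprt,
                             enum model_wc_status status)
{
    if (status != MODEL_WC_SUCCESS)
        xprt->close_pending = true;
}

/* Models svc_rdma_send(): refuses to post once the close flag is set. */
static int model_send(const struct model_xprt *xprt)
{
    if (xprt->close_pending)
        return -ENOTCONN;
    return 0;
}

/* Models svc_rdma_fastreg(): simply propagates the send result. */
static int model_fastreg(const struct model_xprt *xprt)
{
    return model_send(xprt);
}
```

In this model a single error completion is enough to make every later
model_fastreg() call fail, which would match the "Error fast registering
memory" lines in the server log.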
I'd ask the InfiniBand guys: what do IB_WC_WR_FLUSH_ERR and
IB_WC_REM_ACCESS_ERR mean? Are they drastic enough that they should
result in hangs?
nog.
> Both client and server are running the latest vanilla 2.6.34.1 kernel
> with Mellanox Connect-X infiniband cards.  If more information is
> required, please do ask.
>
> BTW: I can reproduce the problem quite reliably by running the bonnie++
> "benchmark" on the NFS mounted filesystem.
>
> nog.
>
> ps: I'm not subscribed to the list, please CC me on all replies.