linux-nfs.vger.kernel.org archive mirror
From: Gyorgy Jeney <nog.lkml@gmail.com>
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: NFS over RDMA problem: svcrdma: Error fast registering memory for xprt ffff8803307d7400
Date: Wed, 14 Jul 2010 23:17:23 +0200
Message-ID: <AANLkTik-BR1txJIbbzaB8YKJQM5lswbmWpbhKSdeUseg@mail.gmail.com>
In-Reply-To: <AANLkTinjZfpICMjP1EcU4CPAe1TwxKp1zsk-lwX6fmzA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

> I am attempting to use NFS over RDMA (over infiniband), but there is some
> problem.  The NFS filesystem can be mounted on the client, and things
> will work for some time (can read, modify, etc. the files over the mount),
> but then (at a seemingly random time) the NFS server will dump these
> lines to the logs:
>
> [ 4380.623922] svcrdma: Error fast registering memory for xprt ffff8803307d7400
> [ 4413.343161] svcrdma: error fast registering xdr for xprt ffff8803319edc00

Digging into it further, it seems like the Mellanox InfiniBand driver
could somehow be involved.  After adding some traces to the code, it's
obvious that something like this is happening:

At some point sq_cq_reap() is called, which leads to this call chain:

  sq_cq_reap()
    ib_poll_cq()
      mlx4_ib_poll_cq()
        mlx4_ib_poll_one()
          mlx4_ib_handle_error_cqe()
            - Which then sets wc->status to IB_WC_WR_FLUSH_ERR rather
              often, but the killer blow seems to be when
              IB_WC_REM_ACCESS_ERR is set.
    - Because of the earlier error, sq_cq_reap() sets the XPT_CLOSE
      flag
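
To make the sequence above concrete, here is a minimal user-space sketch of
the behaviour I think I am seeing.  It is not the kernel code: the model_*
names, the reduced status enum and the struct are all made up for
illustration, and the only thing it tries to show is that any completion
whose status is not "success" marks the transport for closing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the completion status codes; the real
 * kernel enum has more members and its own numeric values. */
enum model_wc_status {
    MODEL_WC_SUCCESS,
    MODEL_WC_WR_FLUSH_ERR,
    MODEL_WC_REM_ACCESS_ERR,
};

/* Hypothetical, heavily reduced transport state. */
struct model_xprt {
    bool close_requested;   /* stands in for the XPT_CLOSE flag */
    unsigned int reaped;    /* number of completions processed */
};

/* Model of the send-CQ reaping step: walk the polled completions and
 * request a close as soon as one of them carries an error status. */
void model_sq_cq_reap(struct model_xprt *xprt,
                      const enum model_wc_status *wcs, size_t n)
{
    assert(xprt != NULL);

    for (size_t i = 0; i < n; i++) {
        if (wcs[i] != MODEL_WC_SUCCESS)
            xprt->close_requested = true;
        xprt->reaped++;
    }
}
```

Note that in this model the flag is sticky: once one bad completion has been
seen, later successful completions do not clear it.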

Then, sometime later:

  fast_reg_read_chunks()
    svc_rdma_fastreg()
      svc_rdma_send()
        svc_rdma_send()
          - XPT_CLOSE is set and hence -ENOTCONN is returned
    - Since svc_rdma_fastreg() returned an error, fast_reg_read_chunks()
      bails out and the client then seems to hang.
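
The second half of the sequence can be sketched the same way.  Again this is
a user-space model with made-up model_* names, not the svcrdma code; it only
assumes what the traces suggest, namely that the send path refuses to post
once the close flag is set and that the callers pass the error straight up.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced transport state. */
struct model_xprt {
    bool close_requested;   /* stands in for the XPT_CLOSE flag */
    unsigned int posted;    /* work requests accepted so far */
};

/* Model of the send step: refuse to post on a transport that has
 * already been marked for closing. */
int model_send(struct model_xprt *xprt)
{
    assert(xprt != NULL);

    if (xprt->close_requested)
        return -ENOTCONN;
    xprt->posted++;
    return 0;
}

/* Model of the fast-register step: it posts one work request through
 * the send path and returns whatever the send path returned. */
int model_fastreg(struct model_xprt *xprt)
{
    return model_send(xprt);
}

/* Model of the read-chunk setup: stop at the first failed
 * registration and propagate the error to the caller. */
int model_read_chunks(struct model_xprt *xprt, unsigned int nchunks)
{
    for (unsigned int i = 0; i < nchunks; i++) {
        int ret = model_fastreg(xprt);
        if (ret < 0)
            return ret;
    }
    return 0;
}
```

With close_requested set, model_read_chunks() returns -ENOTCONN without
posting anything, which matches the point where the server gives up on the
request in the traces.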

I'd like to ask the InfiniBand guys: what do IB_WC_WR_FLUSH_ERR and
IB_WC_REM_ACCESS_ERR mean?  Are they something drastic that should result
in hangs?

nog.

> Both client and server are running the latest vanilla 2.6.34.1 kernel
> with Mellanox Connect-X infiniband cards.  If more information is
> required, please do ask.
>
> BTW: I can reproduce the problem quite reliably by running the bonnie++
> "benchmark" on the NFS mounted filesystem.
>
> nog.
>
> ps: I'm not subscribed to the list, please CC me on all replies.
