public inbox for linux-rdma@vger.kernel.org
From: "Francisco Manuel Cardoso" <francisco.cardoso-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: 'Sagi Grimberg'
	<sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
	'Chuck Lever'
	<chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: RE: NFS over RDMA in SLinux
Date: Sat, 7 Mar 2015 09:03:07 -0000	[thread overview]
Message-ID: <004001d058b5$87830560$96891020$@gmail.com> (raw)
In-Reply-To: <54FA604A.4050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>

Hello Sagi,

This is about NFSoRDMA; NFS over IPoIB has no issues.

The main issue is that the simulation on the HPC cluster starts running
"fine" and, after a while, I get loads of errors that the NFS server is not
responding.
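For context, the clients mount the export over RDMA in the usual way, along these lines (a sketch; the server name and export path are placeholders, and 20049 is the conventional NFS/RDMA port):

```shell
# Hypothetical NFSoRDMA client mount; "nfsserver:/export" is a placeholder.
# The "rdma" option selects the RDMA transport instead of TCP;
# NFS/RDMA servers conventionally listen on port 20049.
mount -t nfs -o rdma,port=20049,vers=3 nfsserver:/export /mnt/scratch
```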

On the server side I'm getting messages such as:

svcrdma: Error -107 posting RDMA_READ
------------[ cut here ]------------
WARNING: at net/sunrpc/xprtrdma/svc_rdma_transport.c:1158
__svc_rdma_free+0x20a/0x230 [svcrdma]() (Tainted: P        W
---------------   )
Hardware name: ProLiant SL4540 Gen8 
Modules linked in: xprtrdma svcrdma nfsd lockd nfs_acl auth_rpcgss sunrpc
autofs4 8021q garp stp llc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm xfs exportfs iTCO_wdt iTCO_vendor_support ipmi_devintf
power_meter acpi_ipmi ipmi_si ipmi_msghandler hpwdt hpilo igb i2c_algo_bit
i2c_core ptp pps_core serio_raw sg lpc_ich mfd_core ioatdma dca shpchp ext4
jbd2 mbcache sd_mod crc_t10dif hpvsa(P)(U) hpsa mlx4_ib ib_sa ib_mad ib_core
ib_addr ipv6 mlx4_core dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]
Pid: 51, comm: events/0 Tainted: P        W  ---------------
2.6.32-504.8.1.el6.x86_64 #1
Call Trace:
 [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff81074e4a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffffa073d25a>] ? __svc_rdma_free+0x20a/0x230 [svcrdma]
 [<ffffffffa073d050>] ? __svc_rdma_free+0x0/0x230 [svcrdma]
 [<ffffffff81097fe0>] ? worker_thread+0x170/0x2a0
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81097e70>] ? worker_thread+0x0/0x2a0
 [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
 [<ffffffff8100c20a>] ? child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
---[ end trace 3ee821ba0f96711f ]---

And:

svcrdma: Error fast registering memory for xprt ffff880c6ae13800
svcrdma: Error fast registering memory for xprt ffff8802e87a3000
svcrdma: Error fast registering memory for xprt ffff880bfa496c00
svcrdma: Error fast registering memory for xprt ffff8808ec717000
svcrdma: Error fast registering memory for xprt ffff880b82577c00
svcrdma: Error fast registering memory for xprt ffff880bfa496c00
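As a side note on the first excerpt: the -107 in "Error -107 posting RDMA_READ" is a negated errno, and on Linux errno 107 is ENOTCONN, i.e. the client connection had already dropped by the time the server tried to post the read. A quick way to decode it:

```shell
# Kernel functions report failures as negative errno values;
# decode 107 with Python's errno/os modules (Linux errno numbering).
python3 -c 'import errno, os; print(errno.errorcode[107], "=", os.strerror(107))'
# prints: ENOTCONN = Transport endpoint is not connected
```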

I've searched high and low for solutions and went through the Red Hat KB,
where I found the articles about high workloads and their workarounds, for
example for the "svcrdma: Error fast registering memory for xprt
ffff8802e87a3000" messages, which were supposed to be fixed by a RH kernel
erratum on RHEL 6.1, and the "sunrpc.rdma_memreg_strategy = 6" value change.
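For reference, that value can be checked and applied via sysctl; a sketch, assuming the older kernels' sunrpc sysctl interface, where 6 selects fast registration (FRMR):

```shell
# Inspect the current client-side memory registration strategy
# (older xprtrdma kernels expose it in the sunrpc sysctl tree).
sysctl sunrpc.rdma_memreg_strategy

# Set it to 6 (FRMR / fast registration) for the running system,
# then persist the change across reboots.
sysctl -w sunrpc.rdma_memreg_strategy=6
echo "sunrpc.rdma_memreg_strategy = 6" >> /etc/sysctl.conf
```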

If anyone can provide some help or insight, that would be really great.

From looking around, I've seen that RDMA under high CPU load is usually
"troublesome".

Regards,

Francisco

-----Original Message-----
From: Sagi Grimberg [mailto:sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org] 
Sent: 07 March 2015 02:20
To: francisco.cardoso-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; Chuck Lever
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: NFS over RDMA in SLinux

On 3/5/2015 9:54 PM, Francisco Manuel Cardoso wrote:
> Hello,
>
>
>
> Sorry, I'm a newcomer to the group; a brief question that I hope
> someone can at least point me on.
>
> Are there any considerations regarding NFS over RDMA on Linux SL6?
>
> The question: I've been setting up/using an HPC cluster, and NFS over
> IPoIB works fine, but as soon as I start dishing things out over RDMA,
> things go crazy.
>
> The typical setup is that each machine can handle at most 40 processes,
> all of them used for MPI. I seem to be having some performance issues;
> if I scale down to 39 I get much better performance, but it still
> crashes.
>
> Anyone got any pointers?

I'm not sure if you're asking about NFS over IPoIB or NFSoRDMA.

CC'ing Chuck, who is probably the best help you can get...

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thread overview: 4+ messages
2015-03-05 19:54 NFS over RDMA in SLinux Francisco Manuel Cardoso
2015-03-07  2:19 ` Sagi Grimberg
     [not found]   ` <54FA604A.4050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-07  9:03     ` Francisco Manuel Cardoso [this message]
2015-03-07 16:12       ` Chuck Lever
