* mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Stephen Cousins @ 2010-11-15 23:29 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
I'm having issues with nfs-rdma. The server has a Mellanox QDR card in it:
# ibv_devinfo
hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.7.626
    node_guid: 0002:c903:0007:f352
    sys_image_guid: 0002:c903:0007:f355
    vendor_id: 0x02c9
    vendor_part_id: 26428
    hw_ver: 0xB0
    board_id: MT_0D90110009
    phys_port_cnt: 1
    port: 1
        state: PORT_ACTIVE (4)
        max_mtu: 2048 (4)
        active_mtu: 2048 (4)
        sm_lid: 1
        port_lid: 1
        port_lmc: 0x00
The client is a Supermicro motherboard with the IB QDR built in:
# ibv_devinfo
hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.7.700
    node_guid: 0025:90ff:ff16:0240
    sys_image_guid: 0025:90ff:ff16:0243
    vendor_id: 0x02c9
    vendor_part_id: 26428
    hw_ver: 0xB0
    board_id: SM_2121000001000
    phys_port_cnt: 1
    port: 1
        state: PORT_ACTIVE (4)
        max_mtu: 2048 (4)
        active_mtu: 2048 (4)
        sm_lid: 1
        port_lid: 6
        port_lmc: 0x00
The 2.7.700 firmware is brand new from Supermicro. Prior to this it
was 2.7.0 and it showed the same thing.
NFS is working for the most part, but the client always logs messages
like the ones below after a moderately large transfer (about 2 GB) with dd:
# dd if=/dev/zero of=node4.dat bs=1M count=2K oflag=direct
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 11.7277 seconds, 183 MB/s
messages.log:
Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
and then 5 minutes later:
Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 closed (-103)
The server has 16 GB of RAM and the client has 32 GB. They are running
CentOS 5.5 with a 2.6.35.4 kernel.
I have tried changing memreg to 6, and the client logs the same messages with that value.
Thanks for any help.
Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-17 17:43 UTC
To: Stephen Cousins; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 11/15/2010 05:29 PM, Stephen Cousins wrote:
> The 2.7.700 firmware is brand new from Supermicro. Prior to this it
> was 2.7.0 and it showed the same thing.
>
> NFS is working for the most part but the client always logs messages
> like below after a modestly sizable transfer ( like 2GB) with dd:
>
> # dd if=/dev/zero of=node4.dat bs=1M count=2K oflag=direct
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 11.7277 seconds, 183 MB/s
>
> messages.log:
>
> Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
> 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
>
> and then 5 minutes later:
>
> Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
> 192.168.0.100:20049 closed (-103)
>
>
I think NFSRDMA server will close the connection after 5 minutes of
inactivity...
Steve.
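
The -103 in the close message quoted above looks like a negated errno value, which is how kernel code conventionally reports errors. On common Linux architectures such as x86, errno 103 is ECONNABORTED, which would fit a connection torn down by the peer after an idle period. The following is a minimal userspace sketch for decoding such a code; the helper names are hypothetical and are not taken from the kernel source, and the numeric value of ECONNABORTED can differ on other platforms.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the kernel): format a kernel-style
 * status code such as -103 as "errno 103: <description>".
 * Returns the snprintf() result.
 */
static int describe_kernel_status(int status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    /* Kernel code reports errors as negative errno values. */
    int err = status < 0 ? -status : status;
    return snprintf(buf, len, "errno %d: %s", err, strerror(err));
}

/* Returns 1 if the status is the negated "connection aborted" code. */
static int is_connection_aborted(int status)
{
    return status == -ECONNABORTED;
}
```

Calling describe_kernel_status(-103, buf, sizeof buf) on a typical Linux x86 system should produce a string starting with "errno 103:" followed by the C library's description of that error.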
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Stephen Cousins @ 2010-11-18 1:03 UTC
To: Steve Wise, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Wed, Nov 17, 2010 at 12:43 PM, Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
>
> I think NFSRDMA server will close the connection after 5 minutes of
> inactivity...
Thanks Steve. I mistook the messages for a problem that might have
been related to NFS speed issues. I now think the cause is something else.
Steve
>
> Steve.
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Roland Dreier @ 2010-11-18 1:12 UTC
To: Steve Wise; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA
> > Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
> > 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
> >
> > and then 5 minutes later:
> >
> > Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
> > 192.168.0.100:20049 closed (-103)
> >
> >
>
> I think NFSRDMA server will close the connection after 5 minutes of
> inactivity...
Should the code be spamming the logs for normal events? (Or is this
with an elevated log level)
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-18 17:36 UTC
To: Roland Dreier; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 11/17/2010 07:12 PM, Roland Dreier wrote:
> > > Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
> > > 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
> > >
> > > and then 5 minutes later:
> > >
> > > Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
> > > 192.168.0.100:20049 closed (-103)
> > >
> > >
> >
> > I think NFSRDMA server will close the connection after 5 minutes of
> > inactivity...
>
> Should the code be spamming the logs for normal events? (Or is this
> with an elevated log level)
>
IMO it's not needed. From net/sunrpc/xprtrdma/verbs.c (nfsrdma client):
[root@r10 xprtrdma]# grep "connection to" *.c
verbs.c: printk(KERN_INFO "rpcrdma: connection to %pI4:%u "
verbs.c: printk(KERN_INFO "rpcrdma: connection to %pI4:%u closed (%d)\n",
[root@r10 xprtrdma]#
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-18 17:41 UTC
To: Roland Dreier; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 11/18/2010 11:36 AM, Steve Wise wrote:
> On 11/17/2010 07:12 PM, Roland Dreier wrote:
>> > > Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
>> > > 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
>> > >
>> > > and then 5 minutes later:
>> > >
>> > > Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
>> > > 192.168.0.100:20049 closed (-103)
>> > >
>> > >
>> >
>> > I think NFSRDMA server will close the connection after 5 minutes of
>> > inactivity...
>>
>> Should the code be spamming the logs for normal events? (Or is this
>> with an elevated log level)
>
> IMO it's not needed. From net/sunrpc/xprtrdma/verbs.c (nfsrdma client):
>
> [root@r10 xprtrdma]# grep "connection to" *.c
> verbs.c: printk(KERN_INFO "rpcrdma: connection to %pI4:%u "
> verbs.c: printk(KERN_INFO "rpcrdma: connection to %pI4:%u
> closed (%d)\n",
> [root@r10 xprtrdma]#
>
>
>
Looks like it's surrounded by #ifdef RPC_DEBUG though. And RPC_DEBUG
seems to be always turned on:
From include/linux/sunrpc/debug.h:
/*
 * Enable RPC debugging/profiling.
 */
#ifdef CONFIG_SYSCTL
#define RPC_DEBUG
#endif
However, maybe these two printk's should just be dprintks...
* Re: mxl4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Roland Dreier @ 2010-11-18 19:22 UTC
To: Steve Wise; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA
> However, maybe these two printk's should just be dprintks...
Seems reasonable... a KERN_INFO print for something so common seems a
bit annoying.
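
The difference between the two logging styles discussed in this thread can be illustrated with a small userspace sketch. It assumes that a dprintk-style call emits output only when a debug flag has been enabled at run time, whereas an unconditional informational print always reaches the log. All names here (demo_debug_flags, DEMO_DBG_TRANS, demo_info, demo_dprintk) are hypothetical stand-ins, and this is not the kernel's actual implementation.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical runtime debug mask, standing in for an admin-set flag word. */
static unsigned int demo_debug_flags = 0;

/* Hypothetical facility bit for transport-level messages. */
#define DEMO_DBG_TRANS 0x0001u

/* Unconditional informational print: always written, so routine events
 * such as idle disconnects show up in the log every time. */
static int demo_info(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}

/* Debug print: written only when the facility bit is set in the mask.
 * Returns 0 without printing when debugging is off. */
static int demo_dprintk(unsigned int facility, const char *fmt, ...)
{
    if (!(demo_debug_flags & facility))
        return 0;
    va_list ap;
    va_start(ap, fmt);
    int n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

With demo_debug_flags left at 0, a call such as demo_dprintk(DEMO_DBG_TRANS, "connection closed (%d)\n", -103) prints nothing, while the same message passed to demo_info is always printed. That is the behavior change the proposal in the thread is after: routine connect and disconnect events stay out of the log unless debugging is turned on.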