linux-rdma.vger.kernel.org archive mirror
* mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Stephen Cousins @ 2010-11-15 23:29 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

I'm having issues with nfs-rdma. The server has a Mellanox QDR card in it:

# ibv_devinfo
hca_id:	mlx4_0
	transport:			InfiniBand (0)
	fw_ver:				2.7.626
	node_guid:			0002:c903:0007:f352
	sys_image_guid:			0002:c903:0007:f355
	vendor_id:			0x02c9
	vendor_part_id:			26428
	hw_ver:				0xB0
	board_id:			MT_0D90110009
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		2048 (4)
			active_mtu:		2048 (4)
			sm_lid:			1
			port_lid:		1
			port_lmc:		0x00

The client is a Supermicro motherboard with built-in QDR InfiniBand:

# ibv_devinfo
hca_id:	mlx4_0
	transport:			InfiniBand (0)
	fw_ver:				2.7.700
	node_guid:			0025:90ff:ff16:0240
	sys_image_guid:			0025:90ff:ff16:0243
	vendor_id:			0x02c9
	vendor_part_id:			26428
	hw_ver:				0xB0
	board_id:			SM_2121000001000
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		2048 (4)
			active_mtu:		2048 (4)
			sm_lid:			1
			port_lid:		6
			port_lmc:		0x00

The 2.7.700 firmware is brand new from Supermicro. Prior to this it
was 2.7.0 and it showed the same thing.

NFS works for the most part, but the client always logs messages like
the ones below after a moderately large transfer (about 2 GB) with dd:

# dd if=/dev/zero of=node4.dat bs=1M count=2K oflag=direct
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 11.7277 seconds, 183 MB/s

messages.log:

Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16

and then 5 minutes later:

Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 closed (-103)
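
(A note on the status value: the "(-103)" in the close message appears
to be a negated Linux errno, and errno 103 on Linux is ECONNABORTED.
The following is only a minimal sketch for looking such codes up,
assuming Python's standard errno module on a Linux host; the helper
name decode_kernel_errno is made up for illustration.)

```python
import errno
import os


def decode_kernel_errno(status):
    """Return (symbolic name, description) for a negated errno such as -103.

    Hypothetical helper: kernel code conventionally reports errors as
    negative errno values, so the sign is dropped before the lookup.
    The number-to-name mapping is platform specific; 103 is
    ECONNABORTED on Linux.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # Status taken from the "closed (-103)" log line above.
    print(decode_kernel_errno(-103))
```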

The server has 16 GB of RAM and the client has 32 GB. They are running
CentOS 5.5 with a 2.6.35.4 kernel.

I have tried changing memreg to 6, and the corresponding messages still appear.

Thanks for any help.

Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-17 17:43 UTC
  To: Stephen Cousins; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/15/2010 05:29 PM, Stephen Cousins wrote:
> The 2.7.700 firmware is brand new from Supermicro. Prior to this it
> was 2.7.0 and it showed the same thing.
>
> NFS is working for the most part but the client always logs messages
> like below after a modestly sizable transfer ( like 2GB) with dd:
>
> # dd if=/dev/zero of=node4.dat bs=1M count=2K oflag=direct
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 11.7277 seconds, 183 MB/s
>
> messages.log:
>
> Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
> 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
>
> and then 5 minutes later:
>
> Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
> 192.168.0.100:20049 closed (-103)
>
>    

I think the NFS/RDMA server closes the connection after 5 minutes of
inactivity...
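
(The timestamps in the quoted log are consistent with an idle timeout:
the connect and close messages are exactly five minutes apart. A quick
sketch of that check follows. It assumes the year 2010, taken from the
dates in this thread, because syslog lines carry no year; the helper
names are hypothetical.)

```python
from datetime import datetime

# The two client log lines from the original report, unwrapped.
LOG_LINES = [
    "Nov 15 09:39:00 node4 kernel: rpcrdma: connection to "
    "192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16",
    "Nov 15 09:44:00 node4 kernel: rpcrdma: connection to "
    "192.168.0.100:20049 closed (-103)",
]


def syslog_time(line, year=2010):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is an assumption supplied by the caller, since the
    classic syslog format does not record it.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_between(first, second):
    """Elapsed seconds from the first log line to the second."""
    return (syslog_time(second) - syslog_time(first)).total_seconds()


if __name__ == "__main__":
    print(seconds_between(LOG_LINES[0], LOG_LINES[1]))  # 300.0
```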


Steve.

* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Stephen Cousins @ 2010-11-18  1:03 UTC
  To: Steve Wise, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Nov 17, 2010 at 12:43 PM, Steve Wise
<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
>
> I think NFSRDMA server will close the connection after 5 minutes of
> inactivity...

Thanks, Steve. I mistook the messages for a problem that might be
related to NFS speed issues. I now think the cause is something else.

Steve


* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Roland Dreier @ 2010-11-18  1:12 UTC
  To: Steve Wise; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA

 > > Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
 > > 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
 > >
 > > and then 5 minutes later:
 > >
 > > Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
 > > 192.168.0.100:20049 closed (-103)
 > >
 > >    
 > 
 > I think NFSRDMA server will close the connection after 5 minutes of
 > inactivity...

Should the code be spamming the logs for normal events?  (Or is this
with an elevated log level?)

* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-18 17:36 UTC
  To: Roland Dreier; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/17/2010 07:12 PM, Roland Dreier wrote:
>   >  >  Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
>   >  >  192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
>   >  >
>   >  >  and then 5 minutes later:
>   >  >
>   >  >  Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
>   >  >  192.168.0.100:20049 closed (-103)
>   >  >
>   >  >
>   >
>   >  I think NFSRDMA server will close the connection after 5 minutes of
>   >  inactivity...
>
> Should the code be spamming the logs for normal events?  (Or is this
> with an elevated log level)
>    

IMO it's not needed.  From net/sunrpc/xprtrdma/verbs.c (NFS/RDMA client):

[root@r10 xprtrdma]# grep "connection to" *.c
verbs.c:        printk(KERN_INFO "rpcrdma: connection to %pI4:%u "
verbs.c:        printk(KERN_INFO "rpcrdma: connection to %pI4:%u closed (%d)\n",
[root@r10 xprtrdma]#




* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Steve Wise @ 2010-11-18 17:41 UTC
  To: Roland Dreier; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/18/2010 11:36 AM, Steve Wise wrote:
> On 11/17/2010 07:12 PM, Roland Dreier wrote:
>> > >  Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
>> > >  192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
>> > >
>> > >  and then 5 minutes later:
>> > >
>> > >  Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
>> > >  192.168.0.100:20049 closed (-103)
>> > >
>> > >
>> >
>> >  I think NFSRDMA server will close the connection after 5 minutes of
>> >  inactivity...
>>
>> Should the code be spamming the logs for normal events?  (Or is this
>> with an elevated log level)
>
> IMO it's not needed.  From net/sunrpc/xprtrdma/verbs.c (NFS/RDMA client):
>
> [root@r10 xprtrdma]# grep "connection to" *.c
> verbs.c:        printk(KERN_INFO "rpcrdma: connection to %pI4:%u "
> verbs.c:        printk(KERN_INFO "rpcrdma: connection to %pI4:%u closed (%d)\n",
> [root@r10 xprtrdma]#
>
>
>
Looks like it's surrounded by #ifdef RPC_DEBUG though, and RPC_DEBUG
seems to be always turned on (at least whenever CONFIG_SYSCTL is set).

From include/linux/sunrpc/debug.h:

/*
 * Enable RPC debugging/profiling.
 */
#ifdef CONFIG_SYSCTL
#define  RPC_DEBUG
#endif


However, maybe these two printk's should just be dprintks...






* Re: mlx4 and rpcrdma: connection to 192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16
From: Roland Dreier @ 2010-11-18 19:22 UTC
  To: Steve Wise; +Cc: Stephen Cousins, linux-rdma-u79uwXL29TY76Z2rM5mHXA

 > However, maybe these two printk's should just be dprintks...

Seems reasonable... a KERN_INFO print for something so common seems a
bit annoying.
