public inbox for linux-rdma@vger.kernel.org
* NFS/RDMA RoCE with mlx4_en
@ 2017-06-26 17:24 Chuck Lever
       [not found] ` <5E2BFC42-DDA1-4666-BA45-2E33A47C0ED5-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2017-06-26 17:24 UTC (permalink / raw)
  To: linux-rdma

Running various I/O stress workloads with iozone on an
NFSv3 mount using RDMA on RoCEv1 (FRWR).

Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)
Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: device is going to be reset
Jun 26 12:50:22 morisot kernel: mlx4_core 0000:01:00.0: device was reset successfully
Jun 26 12:50:22 morisot kernel: mlx4_en 0000:01:00.0: Internal error detected, restarting device
Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error was started
Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error ended
Jun 26 12:50:22 morisot kernel: ib_srpt received unrecognized IB event 8
Jun 26 12:50:22 morisot kernel: mlx4_core 0000:01:00.0: command 0x1e failed: fw status = 0x1

Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 17 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: mlx4_en: 0000:01:00.0: Port 1: Using 32 TX rings
Jun 26 12:50:32 morisot kernel: mlx4_en: 0000:01:00.0: Port 1: Using 4 RX rings
Jun 26 12:50:32 morisot kernel: mlx4_en: 0000:01:00.0: Port 1: Initializing port
Jun 26 12:50:32 morisot kernel: mlx4_en 0000:01:00.0: registered PHC clock
Jun 26 12:50:32 morisot kernel: <mlx4_ib> mlx4_ib_add: counter index 1 for port 1 allocated 1
Jun 26 12:50:32 morisot NetworkManager[810]: <info>  [1498495832.7797] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/3)
Jun 26 12:50:32 morisot kernel: mlx4_core 0000:01:00.0 enp1s0: renamed from eth0
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot NetworkManager[810]: <info>  [1498495832.7962] device (eth0): interface index 4 renamed iface from 'eth0' to 'enp1s0'
Jun 26 12:50:32 morisot NetworkManager[810]: <info>  [1498495832.7971] device (enp1s0): state change: unmanaged -> unavailable (reason 'managed') [10 20 2]
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 20 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 24 pages, ret: -12
Jun 26 12:50:32 morisot kernel: IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12

Within a few more moments, the system became unreachable and
had to be restarted via remote power-on reset.

The cma_alloc failures start at system boot time and continue
whenever the mlx4_en device is used. Those don't seem like
unreasonably sized requests.
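
To put numbers on the request sizes, here is a minimal Python sketch that tallies the cma_alloc failures in a log excerpt by request size. It assumes 4 KiB pages (typical for x86-64, but not stated anywhere in this thread), and the helper name tally_cma_failures is hypothetical. The "ret: -12" in each line is -ENOMEM.

```python
import errno
import re
from collections import Counter

PAGE_SIZE = 4096  # assumption: 4 KiB pages; not stated in the thread

# Matches the failure lines exactly as they appear in the log above.
CMA_FAIL = re.compile(r"cma_alloc: alloc failed, req-size: (\d+) pages, ret: (-?\d+)")

def tally_cma_failures(lines):
    """Count cma_alloc failures per request size (in pages)."""
    counts = Counter()
    for line in lines:
        m = CMA_FAIL.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts

# A few lines copied from the logs in this message.
sample = [
    "Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 17 pages, ret: -12",
    "Jun 26 12:50:32 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12",
    "Jun 26 12:50:32 morisot kernel: mlx4_en: 0000:01:00.0: Port 1: Using 32 TX rings",
    "Jun 26 13:16:52 morisot kernel: cma: cma_alloc: alloc failed, req-size: 129 pages, ret: -12",
]

counts = tally_cma_failures(sample)
for pages, n in sorted(counts.items()):
    print(f"{pages:4d} pages ({pages * PAGE_SIZE // 1024} KiB): {n} failure(s)")
print("errno 12 is", errno.errorcode[12])
```

Under the 4 KiB assumption, the largest request seen in these logs (129 pages) comes to 516 KiB.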

I tried reproducing this immediately. Similar result:

Jun 26 13:16:47 morisot kernel: nfs: server klimt-roce not responding, still trying
Jun 26 13:16:47 morisot kernel: nfs: server klimt-roce not responding, still trying
Jun 26 13:16:47 morisot kernel: nfs: server klimt-roce not responding, still trying
Jun 26 13:16:47 morisot kernel: nfs: server klimt-roce not responding, still trying
Jun 26 13:16:51 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)
Jun 26 13:16:51 morisot kernel: mlx4_core 0000:01:00.0: device is going to be reset
Jun 26 13:16:52 morisot kernel: mlx4_core 0000:01:00.0: device was reset successfully
Jun 26 13:16:52 morisot kernel: mlx4_en 0000:01:00.0: Internal error detected, restarting device
Jun 26 13:16:52 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error was started
Jun 26 13:16:52 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error ended
Jun 26 13:16:52 morisot kernel: ib_srpt received unrecognized IB event 8
Jun 26 13:16:52 morisot kernel: mlx4_core 0000:01:00.0: command 0x1e failed: fw status = 0x1
Jun 26 13:16:52 morisot kernel: cma: cma_alloc: alloc failed, req-size: 129 pages, ret: -12
Jun 26 13:16:55 morisot kernel: mlx4_core 0000:01:00.0: Internal error mark was detected on device

Jun 26 13:16:55 morisot nm-dispatcher: req:1 'down' [enp1s0]: new request (4 scripts)
Jun 26 13:16:55 morisot nm-dispatcher: req:1 'down' [enp1s0]: start running ordered scripts...
Jun 26 13:16:55 morisot kernel: mlx4_en 0000:01:00.0: removed PHC
Jun 26 13:16:55 morisot kernel: rpcrdma: removing device mlx4_0 for 192.168.3.5:20049
Jun 26 13:16:55 morisot kernel: mlx4_core 0000:01:00.0: Fail to set mac in port 1 during unregister
Jun 26 13:16:58 morisot kernel: alloc_contig_range: [41e417, 41e418) PFNs busy
Jun 26 13:16:58 morisot kernel: alloc_contig_range: [41e418, 41e419) PFNs busy
Jun 26 13:17:00 morisot kernel: rpcrdma_ep_recreate_xprt: r_xprt = ffff8804090f8000
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e417, 41e418) PFNs busy
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e418, 41e419) PFNs busy
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e4c0, 41e500) PFNs busy
Jun 26 13:17:01 morisot kernel: mlx4_core 0000:01:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
Jun 26 13:17:01 morisot kernel: mlx4_core 0000:01:00.0: PCIe link width is x8, device supports x8
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e417, 41e418) PFNs busy
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e418, 41e419) PFNs busy
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e417, 41e418) PFNs busy
Jun 26 13:17:01 morisot kernel: alloc_contig_range: [41e418, 41e419) PFNs busy

Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 13:17:02 morisot kernel: mlx4_en 0000:01:00.0: Activating port:1
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 8 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 8 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 8 pages, ret: -12
Jun 26 13:17:02 morisot kernel: cma: cma_alloc: alloc failed, req-size: 8 pages, ret: -12


--
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: NFS/RDMA RoCE with mlx4_en
       [not found] ` <5E2BFC42-DDA1-4666-BA45-2E33A47C0ED5-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2017-06-27  6:22   ` Leon Romanovsky
  2017-06-27  9:57   ` Sagi Grimberg
  2017-06-27 10:33   ` jackm
  2 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2017-06-27  6:22 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Jack Morgenstein, Majd Dibbiny

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

On Mon, Jun 26, 2017 at 01:24:11PM -0400, Chuck Lever wrote:
> Running various I/O stress workloads with iozone on an
> NFSv3 mount using RDMA on RoCEv1 (FRWR).
>
> Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)

It means that the device had an internal error earlier and/or the PCI channel is offline, and the device is now restarting.

> Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: device is going to be reset
> Jun 26 12:50:22 morisot kernel: mlx4_core 0000:01:00.0: device was reset successfully
> Jun 26 12:50:22 morisot kernel: mlx4_en 0000:01:00.0: Internal error detected, restarting device
> Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error was started
> Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error ended
> Jun 26 12:50:22 morisot kernel: ib_srpt received unrecognized IB event 8

Thanks


* Re: NFS/RDMA RoCE with mlx4_en
       [not found] ` <5E2BFC42-DDA1-4666-BA45-2E33A47C0ED5-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  2017-06-27  6:22   ` Leon Romanovsky
@ 2017-06-27  9:57   ` Sagi Grimberg
  2017-06-27 10:33   ` jackm
  2 siblings, 0 replies; 6+ messages in thread
From: Sagi Grimberg @ 2017-06-27  9:57 UTC (permalink / raw)
  To: Chuck Lever, linux-rdma


> Running various I/O stress workloads with iozone on an
> NFSv3 mount using RDMA on RoCEv1 (FRWR).
> 
> Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)

Looks like a FW issue to me...

* Re: NFS/RDMA RoCE with mlx4_en
       [not found] ` <5E2BFC42-DDA1-4666-BA45-2E33A47C0ED5-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  2017-06-27  6:22   ` Leon Romanovsky
  2017-06-27  9:57   ` Sagi Grimberg
@ 2017-06-27 10:33   ` jackm
       [not found]     ` <20170627133306.00003fda-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2 siblings, 1 reply; 6+ messages in thread
From: jackm @ 2017-06-27 10:33 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Leon Romanovsky

On Mon, 26 Jun 2017 13:24:11 -0400
Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:

> Running various I/O stress workloads with iozone on an
> NFSv3 mount using RDMA on RoCEv1 (FRWR).
> 
Hi Chuck, I have some questions to help us understand what is happening:

1. What kernel are you running here?
2. What is the underlying Linux distribution?
3. What FW is installed on the ConnectX-3 HCA?
4. Is SRIOV enabled? (i.e., is there a line like this in a modprobe conf file:
    options mlx4_core num_vfs=<integer greater than zero>)
5. Could you dump the card's .ini file and send it to us?
   (flint -d <pci bus-dev-fn> dc connectx3.ini)
6. Is this a dual-port HCA? Are both ports connected?

7. Could you try disabling the automatic start of the mlx4 driver at
boot time?

8. After disabling automatic start at boot time, could you reboot the
host to see if it has problems without the mlx4 driver stack?

9. The mlx4 device was reset because a timeout was detected for the
   DUMP_ETH_STATS command (0x49). The timeout for this command is 60
   seconds.  Did the message log show anything at around 1 minute before
   the timeout occurred?

10. Do you know which app is calling cma_alloc?  If you are willing to
modify your kernel code temporarily for this, you might put a
dump_stack() in file mm/cma.c at line 454 (where the cma_alloc failure
line is output).
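
For question 9, the timed-out command would have been issued roughly 60 seconds before the timeout message appeared. The following is a rough Python sketch of how one might pull out the log lines around that moment. It assumes standard syslog timestamps, takes the year from the date of this thread, and uses hypothetical helper names; the 10-second slack is an arbitrary choice.

```python
from datetime import datetime, timedelta

YEAR = 2017  # syslog stamps omit the year; taken from the thread date
CMD_TIMEOUT = timedelta(seconds=60)  # DUMP_ETH_STATS timeout quoted in question 9

def stamp(line):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp (15 characters)."""
    return datetime.strptime(f"{YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")

def lines_near_issue_time(lines, slack=timedelta(seconds=10)):
    """Return (estimated issue time, lines logged within `slack` of it)."""
    timeout_line = next(l for l in lines if "timed out (go bit not cleared)" in l)
    issued = stamp(timeout_line) - CMD_TIMEOUT
    nearby = [l for l in lines if abs(stamp(l) - issued) <= slack]
    return issued, nearby

# Two lines copied from the second log excerpt in the original report.
log = [
    "Jun 26 13:16:47 morisot kernel: nfs: server klimt-roce not responding, still trying",
    "Jun 26 13:16:51 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)",
]

issued, nearby = lines_near_issue_time(log)
print("command issued around", issued.time())
print("lines near that time:", nearby)
```

On the excerpt posted so far, nothing falls inside that window, so a longer stretch of the message log would be needed to answer the question.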

Thanks, Chuck -- any help you can give us here will be greatly
appreciated.

-Jack

P.S., some more comments:
Jun 26 12:50:22 morisot kernel: ib_srpt received unrecognized IB event 8
- The above IB event is IB_EVENT_DEVICE_FATAL
  -ib_srpt might consider handling this event somehow.



* Re: NFS/RDMA RoCE with mlx4_en
       [not found]     ` <20170627133306.00003fda-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2017-06-27 16:28       ` Chuck Lever
       [not found]         ` <7FB59BD3-CB33-4710-B049-B53C6C042736-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2017-06-27 16:28 UTC (permalink / raw)
  To: jackm; +Cc: linux-rdma, Leon Romanovsky

[-- Attachment #1: Type: text/plain, Size: 2491 bytes --]

Hi Jack-

Thanks for your help!


> On Jun 27, 2017, at 6:33 AM, jackm <jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> 
> On Mon, 26 Jun 2017 13:24:11 -0400
> Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
> 
>> Running various I/O stress workloads with iozone on an
>> NFSv3 mount using RDMA on RoCEv1 (FRWR).
>> 
> Hi Chuck, I have some questions to help us understand what is happening:
> 
> 1. What kernel are you running here?

v4.12-rc2


> 2. What is the underlying Linux distribution?

Oracle Linux 7.3


> 3. What FW is installed on the ConnectX-3 HCA?

2.40.7000


> 4. Is SRIOV enabled? (i.e., is there a line like this in a modprobe conf file:
>    options mlx4_core num_vfs=<integer greater than zero>)

It was enabled in the BIOS, but all lines with "num_vfs=" in these
files are commented out. I disabled the BIOS setting, but no change
in behavior.


> 5. Could you dump the card's .ini file and send it to us?
>   (flint -d <pci bus-dev-fn> dc connectx3.ini)

Attached. Let me know if it doesn't make it.


> 6. Is this a dual-port HCA? Are both ports connected?

Single port.


> 7. Could you try disabling the automatic start of the mlx4 driver at
> boot time?
> 
> 8. After disabling automatic start at boot time, could you reboot the
> host to see if it has problems without the mlx4 driver stack?

I unset CONFIG_CMA. The cma_alloc errors go away, but the mlx4
timeout / reset is unchanged.


> 9. The mlx4 device was reset because a timeout was detected for the
>   DUMP_ETH_STATS command (0x49). The timeout for this command is 60
>   seconds.  Did the message log show anything at around 1 minute before
>   the timeout occurred?

Nothing probative. Lots of "NFS server: not responding".


> 10. Do you know which app is calling cma_alloc?  If you are willing to
> modify your kernel code temporarily for this, you might put a
> dump_stack() in file mm/cma.c at line 454 (where the cma_alloc failure
> line is output).

In the process of collecting data for you, I noticed that
the CX3's maximum Ethernet link speed is 40Gbps, and I
had set the switch port speed to 56Gbps. I've set the
port speed back to 40Gbps, and now neither the device
reset nor the cma_alloc failures are reproducing.

If you'd like to pursue this further, I can switch back to
the higher speed and try to reproduce to collect this
information.
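
The mismatch can also be read off the Description line of the attached connectx3.ini. A minimal Python sketch of that check follows; the function names are hypothetical, and the "<N>GigE" pattern is only an assumption about how such description strings are written.

```python
import re

# Description line copied from the attached connectx3.ini.
DESCRIPTION = ("ConnectX-3 Pro VPI adapter card; single-port QSFP; "
               "FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s")

def max_ethernet_gbps(description):
    """Extract the Ethernet rate from a token such as '40GigE', if present."""
    m = re.search(r"(\d+)GigE", description)
    return int(m.group(1)) if m else None

def speed_mismatch(switch_port_gbps, description):
    """True when the switch port is set faster than the adapter's Ethernet rate."""
    limit = max_ethernet_gbps(description)
    return limit is not None and switch_port_gbps > limit

print(max_ethernet_gbps(DESCRIPTION))   # Ethernet rate named in the description
print(speed_mismatch(56, DESCRIPTION))  # switch port at 56Gbps
print(speed_mismatch(40, DESCRIPTION))  # switch port at 40Gbps
```

The 56Gb/s figure in the description refers to FDR InfiniBand, not Ethernet, which is why only the GigE token is used here.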


> Thanks, Chuck -- any help you can give us here will be greatly
> appreciated.
> 
> -Jack

--
Chuck Lever



[-- Attachment #2: connectx3.ini --]
[-- Type: application/octet-stream, Size: 8916 bytes --]

;; Generated automatically by iniprep tool on Wed Mar 22 16:01:51 IST 2017 from ./cx3pro_MCX353A_fdr_09v.prs
;;
;; PRS  FILE FOR KESTREL BENTAL
;; $Id$ 



[PS_INFO]
Name = MCX353A-FCC_Ax
Description = ConnectX-3 Pro VPI adapter card; single-port QSFP; FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s
PRS_name    = cx3pro_MCX353A_fdr_09v.prs

[ADAPTER]
PSID = MT_1100111019
pcie_gen2_speed_supported = true
pcie_gen3_speed_supported = true
adapter_dev_id = 0x1007
silicon_rev = 0x00

gpio_mode1 = 0x08000001
gpio_mode0 = 0x04c04032
gpio_default_val = 0x0f306023
gpio_pull_up = 0xff2baf2f
gpio_pull_enable = 0xfbbbbfff

receiver_detect_en = true
vdd_change_to_1_offset = 5

nv_cfg_en = true

nv_config_sectors = 2

[HCA]
hca_header_device_id = 0x1007
hca_header_subsystem_id = 0x0008
hca_header_class_code = 0x28000
eth_xfi_en = true
mdio_en_port1 = 0
pcie_tx_polarity = 0x0f
dpdp_en = true

[IB]
mlpn_en_port0 = true
phy_type_port1 = XFI
ext_phy_board_port1 = FALCON
gen_guids_from_mac = false
do_sense = true
ref_clk_to_use = 0
module_power_level_supported_port0 = 5

num_of_ports = One_Port
new_gpio_scheme_en = true
read_cable_params_port1_en = true

cx3_spec1_3_ib_support_port0 = true
spec1_3_fdr14_ib_support_port0 = true
cx3_spec1_2_ib_support_port0 = true
spec1_3_fdr10_ib_support_port0 = true
mellanox_ddr_ib_support = true
mellanox_qdr_ib_support = true

port1_802_3ap_cr4_enable = true
port1_802_3ap_cr4_ability = true
port1_802_3ap_56kr4_ability = true

center_mix90phase = true

;;Logic lane to Serdes mapping
tx_logic_0_serdes = 0
tx_logic_1_serdes = 1
tx_logic_2_serdes = 2
tx_logic_3_serdes = 3
rx_logic_0_serdes = 3
rx_logic_1_serdes = 2
rx_logic_2_serdes = 1
rx_logic_3_serdes = 0

eth_tx_lane_polarity_port1 = 0xf
eth_rx_lane_polarity_port1 = 0x0
tx_lane_polarity_port1 = 0xf
rx_lane_polarity_port1 = 0x0
; start of '#include "include_QSFP_serdes_prams_bental.h"'

;;Serdes parameters
port0_nego_fdr_mask_en = 0xfffc
port1_nego_fdr_mask_en = 0xfffc
port0_nego_fdr10_mask_en = 0xfffc
port1_nego_fdr10_mask_en = 0xfffc

nego_rx4_slicer_ind_en = 255
nego_rx4_slicer1_enable = 8
nego_rx4_slicer2_enable = 8
nego_rx4_ffe_tap0 = 94
nego_rx4_ffe_tap1 = 134
nego_rx4_ffe_tap2 = 245
nego_rx4_ffe_tap3 = 135
nego_rx4_ffe_tap4 = 171

nego_rx9_ffe_tap0=84
nego_rx9_ffe_tap1=164
nego_rx9_ffe_tap2=251
nego_rx9_ffe_tap3=132
nego_rx9_ffe_tap4=140

nego_rx15_ffe_tap3 = 140
nego_rx15_ffe_tap1 = 140

nego_rx10_ffe_tap3 = 140
nego_rx10_ffe_tap1 = 140

nego_rx8_ffe_tap3 = 140
nego_rx8_ffe_tap1 = 140

force_rx0_slicer_ind_en = 0x0
force_rx0_slicer1_enable = 0x0
force_rx0_slicer2_enable = 0x0
force_rx0_ffe_tap0 = 0xff
force_rx0_ffe_tap1 = 0x80
force_rx0_ffe_tap2 = 0x80
force_rx0_ffe_tap3 = 0x80
force_rx0_ffe_tap4 = 0x80

force_tx0_ob_preemp_pre = 0x40
force_tx0_ob_preemp_post = 0x0
force_tx0_ob_preemp_main = 0x7f
force_tx0_preemp = 0x0
force_tx0_pre_polarity = 0x1
force_tx0_post_polarity = 0x1
force_tx0_main_polarity = 0x0

force_rx2_slicer_ind_en = 0xeb
force_rx2_slicer1_enable = 0x0
force_rx2_slicer2_enable = 0x0
force_rx2_ffe_tap0 = 0x64
force_rx2_ffe_tap1 = 0x80
force_rx2_ffe_tap2 = 0xde
force_rx2_ffe_tap3 = 0x80
force_rx2_ffe_tap4 = 0x46

force_tx2_ob_preemp_pre = 0x30
force_tx2_ob_preemp_post = 0x0
force_tx2_ob_preemp_main = 0x7f
force_tx2_preemp = 0x0
force_tx2_pre_polarity = 0x1
force_tx2_post_polarity = 0x1
force_tx2_main_polarity = 0x0

force_rx3_slicer_ind_en = 0xff
force_rx3_slicer1_enable = 0x8
force_rx3_slicer2_enable = 0x8
force_rx3_ffe_tap0 = 0x6c
force_rx3_ffe_tap1 = 0x80
force_rx3_ffe_tap2 = 0xff
force_rx3_ffe_tap3 = 0x80
force_rx3_ffe_tap4 = 0x80

force_tx3_ob_preemp_pre = 0xc
force_tx3_ob_preemp_post = 0x7f
force_tx3_ob_preemp_main = 0x45
force_tx3_preemp = 0x0
force_tx3_pre_polarity = 0x1
force_tx3_post_polarity = 0x0
force_tx3_main_polarity = 0x1
force_tx3_ob_bias = 0xa

auto_ddr_tx_options = 2
auto_ddr_rx_options = 1

auto_qdr_tx_options = 6
auto_qdr_rx_options = 7

preset_tx_fdr_set12_ob_preemp_pre = 17
preset_tx_fdr_set12_ob_preemp_post = 0
preset_tx_fdr_set12_ob_preemp_main=25
preset_tx_fdr_set12_preemp = 0
preset_tx_fdr_set12_pre_polarity = 1
preset_tx_fdr_set12_post_polarity = 1
preset_tx_fdr_set12_main_polarity = 0
preset_tx_fdr_set12_ob_bias = 5

preset_tx_fdr_set13_ob_preemp_main =40
preset_tx_fdr_set13_ob_preemp_pre = 28
preset_tx_fdr_set13_ob_preemp_post = 0
preset_tx_fdr_set13_preemp = 0
preset_tx_fdr_set13_pre_polarity = 1
preset_tx_fdr_set13_post_polarity = 1
preset_tx_fdr_set13_main_polarity = 0
preset_tx_fdr_set13_ob_bias = 5

preset_tx_fdr_set14_ob_preemp_main = 35
preset_tx_fdr_set14_ob_preemp_pre = 25
preset_tx_fdr_set14_ob_preemp_post = 0
preset_tx_fdr_set14_preemp = 0
preset_tx_fdr_set14_pre_polarity = 1
preset_tx_fdr_set14_post_polarity = 1
preset_tx_fdr_set14_main_polarity = 0
preset_tx_fdr_set14_ob_bias = 5

preset_tx_fdr_set15_ob_preemp_main = 30
preset_tx_fdr_set15_ob_preemp_pre = 20
preset_tx_fdr_set15_ob_preemp_post = 0
preset_tx_fdr_set15_preemp = 0
preset_tx_fdr_set15_pre_polarity = 1
preset_tx_fdr_set15_post_polarity = 1
preset_tx_fdr_set15_main_polarity = 0
preset_tx_fdr_set15_ob_bias = 5

preset_tx_mask = 0xfffe

aba_mask0_start = 0
aba_mask0_end   = 3
aba_mask0 = 0x1000
aba_mask1_start = 4
aba_mask1_end   = 5
aba_mask1 = 0x8000
aba_mask2_start = 6
aba_mask2_end   = 10
aba_mask2 = 0x4000
aba_mask3_start = 11
aba_mask3_end   = 16
aba_mask3 = 0x2000

; ABA 40GE
aba_tx2_ob_preemp_pre = 20
aba_tx2_ob_preemp_main = 42
aba_tx2_ob_preemp_post = 8
aba_tx2_ob_bias = 8
aba_tx2_pre_polarity = 1
aba_tx2_post_polarity = 1
aba_tx2_main_polarity = 0

;;3m
aba_tx3_ob_preemp_pre = 22
aba_tx3_ob_preemp_main = 42
aba_tx3_ob_preemp_post = 5
aba_tx3_ob_bias = 8
aba_tx3_pre_polarity = 1
aba_tx3_post_polarity = 1
aba_tx3_main_polarity = 0

aba_tx4_ob_preemp_pre = 26
aba_tx4_ob_preemp_main = 42
aba_tx4_ob_preemp_post = 3
aba_tx4_ob_bias = 8
aba_tx4_pre_polarity = 1
aba_tx4_post_polarity = 1
aba_tx4_main_polarity = 0

aba_tx5_ob_preemp_pre = 60
aba_tx5_ob_preemp_main = 90
aba_tx5_ob_preemp_post = 8
aba_tx5_ob_bias = 8
aba_tx5_pre_polarity = 1
aba_tx5_post_polarity = 1
aba_tx5_main_polarity = 0

aba_tx6_ob_preemp_pre = 80
aba_tx6_ob_preemp_main = 110
aba_tx6_ob_preemp_post = 10
aba_tx6_ob_bias = 8
aba_tx6_pre_polarity = 1
aba_tx6_post_polarity = 1
aba_tx6_main_polarity = 0

aba_tx7_ob_preemp_pre = 75
aba_tx7_ob_preemp_main = 110
aba_tx7_ob_preemp_post = 15
aba_tx7_ob_bias = 8
aba_tx7_pre_polarity = 1
aba_tx7_post_polarity = 1
aba_tx7_main_polarity = 0

aba_fdr_tx16_ob_preemp_pre = 17
aba_fdr_tx16_ob_preemp_post = 0
aba_fdr_tx16_ob_preemp_main=25
aba_fdr_tx16_preemp = 0
aba_fdr_tx16_pre_polarity = 1
aba_fdr_tx16_post_polarity = 1
aba_fdr_tx16_main_polarity = 0
aba_fdr_tx16_ob_bias = 5

aba_fdr_tx17_ob_preemp_main =46
aba_fdr_tx17_ob_preemp_pre = 32
aba_fdr_tx17_ob_preemp_post = 0
aba_fdr_tx17_preemp = 0
aba_fdr_tx17_pre_polarity = 1
aba_fdr_tx17_post_polarity = 1
aba_fdr_tx17_main_polarity = 0
aba_fdr_tx17_ob_bias = 3

aba_fdr_tx18_ob_preemp_main = 50
aba_fdr_tx18_ob_preemp_pre = 32
aba_fdr_tx18_ob_preemp_post = 0
aba_fdr_tx18_preemp = 0
aba_fdr_tx18_pre_polarity = 1
aba_fdr_tx18_post_polarity = 1
aba_fdr_tx18_main_polarity = 0
aba_fdr_tx18_ob_bias = 3

aba_fdr_tx19_ob_preemp_main = 60
aba_fdr_tx19_ob_preemp_pre = 30
aba_fdr_tx19_ob_preemp_post = 0
aba_fdr_tx19_preemp = 0
aba_fdr_tx19_pre_polarity = 1
aba_fdr_tx19_post_polarity = 1
aba_fdr_tx19_main_polarity = 0
aba_fdr_tx19_ob_bias = 3

aba_index0_start = 0
aba_index0_end   = 3
aba_index0 = 0
aba_index1_start = 4
aba_index1_end   = 5
aba_index1 = 3
aba_index2_start = 6
aba_index2_end   = 9
aba_index2 = 2
aba_index3_start = 10
aba_index3_end   = 16
aba_index3 = 1

aba_rx2_slicer_ind_en = 0xeb
aba_rx2_slicer1_enable = 0x0
aba_rx2_slicer2_enable = 0x0
aba_rx2_ffe_tap0 = 0x80
aba_rx2_ffe_tap1 = 0x68
aba_rx2_ffe_tap2 = 0xd7
aba_rx2_ffe_tap3 = 0x80
aba_rx2_ffe_tap4 = 0x5a

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;; SFP+ section. all QSFP can be converted to SFP+ using QSA adapter.;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; ETH connected to third party device
aba_non_mlpn_tx8_ob_preemp_pre = 5
aba_non_mlpn_tx8_ob_preemp_post = 0
aba_non_mlpn_tx8_ob_preemp_main = 65
aba_non_mlpn_tx8_ob_bias = 8
aba_non_mlpn_tx8_pre_polarity = 1
aba_non_mlpn_tx8_post_polarity = 1
aba_non_mlpn_tx8_main_polarity = 0
aba_non_mlpn_tx8_preemp = 0


nego_eth_rx12_slicer_ind_en = 0xff
nego_eth_rx12_slicer1_enable= 0x8
nego_eth_rx12_slicer2_enable= 0x8
nego_eth_rx12_ffe_tap0=241
nego_eth_rx12_ffe_tap1=128
nego_eth_rx12_ffe_tap2=61
nego_eth_rx12_ffe_tap3=99
nego_eth_rx12_ffe_tap4=128

; end of '#include "include_QSFP_serdes_prams_bental.h"'

[PLL]
lbist_en  = 0
lbist_shift_freq  = 3
flash_div = 0x3
lbist_array_bypass = 1
lbist_pat_cnt_lsb = 0x2
core_f = 60
core_r = 14
core_od = 2
en_427_mhz = true

[FW]
flash_has_suspend_resume = 0
log_flashdev_size = 21
log_flash_sector_size = 2


* Re: NFS/RDMA RoCE with mlx4_en
       [not found]         ` <7FB59BD3-CB33-4710-B049-B53C6C042736-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2017-07-26  5:54           ` jackm
  0 siblings, 0 replies; 6+ messages in thread
From: jackm @ 2017-07-26  5:54 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Leon Romanovsky

On Tue, 27 Jun 2017 12:28:43 -0400
Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:

> In the process of collecting data for you, I noticed that
> the CX3's maximum Ethernet link speed is 40Gbps, and I
> had set the switch port speed to 56Gbps. I've set the
> port speed back to 40Gbps, and now neither the device
> reset nor the cma_alloc failures are reproducing.
> 
> If you'd like to pursue this further, I can switch back to
> the higher speed and try to reproduce to collect this
> information.

Hi Chuck,

Thank you for giving us a hand in understanding the root cause.
I apologize for the long delay in replying to your kind offer.

Fortunately, using the information you provided, we succeeded in
reproducing the issue in house, so there is no need for you to do any
extra work on this.

Thanks again!

-Jack