From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [Bug 219] DPDK 18.11 builds with MLX4/MLX5 support but testpmd won't recognize the device
Date: Sun, 03 Mar 2019 00:44:05 +0000
Message-ID: <bug-219-3@http.bugs.dpdk.org/>

https://bugs.dpdk.org/show_bug.cgi?id=219

            Bug ID: 219
           Summary: DPDK 18.11 builds with MLX4/MLX5 support but testpmd
                    won't recognize the device
           Product: DPDK
           Version: 18.11
          Hardware: x86
                OS: Linux
            Status: CONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: debugnetiq1@yahoo.ca
  Target Milestone: ---

For testing, I built two versions of DPDK 18.11:
- one with static libs (CONFIG_RTE_BUILD_SHARED_LIB=n)
- one with shared libs (CONFIG_RTE_BUILD_SHARED_LIB=y)
In a nutshell, after building and verifying everything described below, testpmd
fails as follows:
- with the static-libs build, DPDK complains that it cannot find shared libs
  (even though DPDK is built with static libs by default)
- with the shared-libs build, it complains that it cannot access the device
- incidentally, pktgen-dpdk, built against the same DPDK static build, reports
  the same issue

With the static-libs build, DPDK complains that it cannot find
librte_pmd_mlx4_glue.so.18.02.0:
# /opt/dpdk_install/dpdk-18.11/install/bin/testpmd \
>   -l 1-3 \
>   -n 4 \
>   -w aec9:00:02.0 \
>   --vdev="net_vdev_netvsc0,iface=eth1" \
>   -- --port-topology=chained \
>   --nb-cores 1 \
>   --forward-mode=txonly \
>   --eth-peer=0,00:0d:3a:53:13:b7 \
>   --stats-period 1
PMD: mlx4.c:947: mlx4_glue_init(): cannot load glue library:
librte_pmd_mlx4_glue.so.18.02.0: cannot open shared object file: No such file
or directory
PMD: mlx4.c:965: mlx4_glue_init(): cannot initialize PMD due to missing
run-time dependency on rdma-core libraries (libibverbs, libmlx4)
net_mlx5: mlx5.c:1712: mlx5_glue_init(): cannot load glue library:
librte_pmd_mlx5_glue.so.18.11.0: cannot open shared object file: No such file
or directory
net_mlx5: mlx5.c:1730: mlx5_glue_init(): cannot initialize PMD due to missing
run-time dependency on rdma-core libraries (libibverbs, libmlx5)
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
rte_pmd_tap_probe(): Initializing pmd_tap for net_tap_vsc0 as dtap0
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and
--ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=163456, size=2176,
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 00:0D:3A:18:A1:73
Checking link statuses...
Done
No commandline core given, start packet forwarding
txonly packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support
enabled, MP allocation mode: native
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=00:0D:3A:53:13:B7

  txonly packet forwarding packets/burst=32
  packet len=64 - nb packet segments=1
  nb forwarding cores=1 - nb forwarding ports=1
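
The two "cannot load glue library" messages name the files the PMDs try to
open at run time. A minimal sketch for checking whether such files exist
anywhere under the install tree follows; find_glue_libs is a hypothetical
helper name (not part of DPDK), and the default path is the install directory
used in this report.

```shell
# List any mlx4/mlx5 glue libraries found under a directory tree.
# find_glue_libs is a hypothetical helper, not a DPDK tool.
find_glue_libs() {
    dir=${1:-/opt/dpdk_install/dpdk-18.11/install}
    find "$dir" -name 'librte_pmd_mlx[45]_glue.so*' 2>/dev/null | sort
}

# Prints nothing when no glue library is present (or the tree does not exist).
find_glue_libs "${DPDK_BUILD:-/opt/dpdk_install/dpdk-18.11/install}"
```

If the files exist but sit outside the default search path, the mlx4/mlx5
driver guides describe MLX4_GLUE_PATH and MLX5_GLUE_PATH environment variables
for pointing the PMD at them; the exact semantics for 18.11 should be checked
against that release's guide.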


With the shared-libs build, DPDK complains "mlx4_pci_probe(): cannot access
device":
# /opt/dpdk_install/dpdk-18.11/install/bin/testpmd \
>   -l 1-3 \
>   -d /opt/dpdk_install/dpdk-18.11/install/lib \
>   -n 4 \
>   -w aec9:00:02.0 \
>   --vdev="net_vdev_netvsc0,iface=eth1" \
>   -- --port-topology=chained \
>   --nb-cores 1 \
>   --forward-mode=txonly \
>   --eth-peer=0,00:0d:3a:53:13:b7 \
>   --stats-period 1
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !
EAL: PCI device aec9:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
PMD: mlx4.c:564: mlx4_pci_probe(): cannot access device, is mlx4_ib loaded?
EAL: Requested device aec9:00:02.0 cannot be used
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
rte_pmd_tap_probe(): Initializing pmd_tap for net_tap_vsc0 as dtap0
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and
--ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=163456, size=2176,
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 00:0D:3A:18:A1:73
Checking link statuses...
Done
No commandline core given, start packet forwarding
txonly packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support
enabled, MP allocation mode: native
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=00:0D:3A:53:13:B7
...

Nonetheless, despite the errors, both versions appear to start sending packets,
although I am not convinced this is really happening:

Port statistics ====================================
  ######################## NIC statistics for port 0  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 231072     TX-errors: 0          TX-bytes:  14788608

  Throughput (since last show)
  Rx-pps:            0
  Tx-pps:       201230
  ############################################################################
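
As a sanity check on these counters, the average TX frame size can be derived
from TX-bytes / TX-packets and compared with the configured "packet len=64".
The sketch below does that arithmetic; tx_avg_bytes is a hypothetical helper
name, and the input line is copied from the statistics above.

```shell
# Average TX frame size from a testpmd "NIC statistics" block on stdin.
# tx_avg_bytes is a hypothetical helper name.
tx_avg_bytes() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "TX-packets:") pkts = $(i + 1)
            if ($i == "TX-bytes:")   bytes = $(i + 1)
        }
    }
    END { if (pkts > 0) printf "%d\n", bytes / pkts }'
}

# Sample line copied from the statistics above; prints 64.
echo "  TX-packets: 231072     TX-errors: 0          TX-bytes:  14788608" | tx_avg_bytes

# Tx-pps * bytes per frame * 8 bits, ignoring framing overhead;
# prints "103029760 bit/s".
echo "$((201230 * 64 * 8)) bit/s"
```

The counters are internally consistent with 64-byte frames at roughly
103 Mbit/s. Since the mlx4 probe failed, port 0 is presumably the
vdev_netvsc/tap port rather than the ConnectX-3 VF, which would explain why
packets are counted at all; that reading is an inference from the log, not
something the statistics prove.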

Incidentally, the "mlx4_pci_probe(): cannot access device" error is the same one
reported by pktgen-dpdk (which, however, crashes), i.e. there is a common bug in
DPDK with respect to MLX4 that affects both testpmd and pktgen.


# ./app/x86_64-native-linuxapp-gcc/pktgen -w aec9:00:02.0 -l 1-3  -n 4 -m 4096
--  -m [2-3].0 -l /var/tmp/pktgen.log -T

Copyright (c) <2010-2019>, Intel Corporation. All rights reserved. Powered by
DPDK
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !
EAL: PCI device aec9:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
PMD: mlx4.c:564: mlx4_pci_probe(): cannot access device, is mlx4_ib loaded?
EAL: Requested device aec9:00:02.0 cannot be used
Lua 5.3.5  Copyright (C) 1994-2018 Lua.org, PUC-Rio

*** Copyright (c) <2010-2019>, Intel Corporation. All rights reserved.
*** Pktgen created by: Keith Wiles -- >>> Powered by DPDK <<<

!PANIC!: *** Did not find any ports to use ***
PANIC in pktgen_config_ports():
*** Did not find any ports to use ***
6: [./app/x86_64-native-linuxapp-gcc/pktgen_() [0x48726f]]
5: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fe75bf693d5]]
4: [./app/x86_64-native-linuxapp-gcc/pktgen_(main+0x630) [0x47efe0]]
3: [./app/x86_64-native-linuxapp-gcc/pktgen_(pktgen_config_ports+0x1611)
[0x4afcb1]]
2: [./app/x86_64-native-linuxapp-gcc/pktgen_(__rte_panic+0xb8) [0x469ec2]]
1: [./app/x86_64-native-linuxapp-gcc/pktgen_(rte_dump_stack+0x1a) [0x582baa]]
./app/x86_64-native-linuxapp-gcc/pktgen: line 7:  6970 Aborted                
$(dirname "$0")/pktgen_ "$@"
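
Both the testpmd and pktgen logs contain the same "Requested device ... cannot
be used" line. A small filter like the one below pulls the rejected PCI
addresses out of a saved log so the runs can be compared; rejected_devices is
a hypothetical helper name, and the sample input is copied from the pktgen
output above.

```shell
# Print PCI addresses that EAL reported as unusable, from a log on stdin.
# rejected_devices is a hypothetical helper name.
rejected_devices() {
    awk '/EAL: Requested device .* cannot be used/ { print $4 }' | sort -u
}

# Sample input copied from the log above; prints aec9:00:02.0.
rejected_devices <<'EOF'
EAL: PCI device aec9:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
PMD: mlx4.c:564: mlx4_pci_probe(): cannot access device, is mlx4_ib loaded?
EAL: Requested device aec9:00:02.0 cannot be used
EOF
```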




Here is what I did:
- installed Mellanox OFED 4.5-1.0.1.0 from "sources"
  wget http://www.mellanox.com/downloads/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64.tgz \
    && tar -zxf MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64.tgz
  cd MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64
  ./mlnxofedinstall --dpdk --upstream-libs --add-kernel-support --enable-mlnx_tune
This builds and installs all userland OFED components. I then installed the kmod
drivers:
  cd /tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-3.10.0-957.5.1.el7.x86_64/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-ext/RPMS \
    && yum install mlnx-ofa_kernel-modules-4.5-OFED.4.5.1.0.1.1.gb4fdfac.kver.3.10.0_957.5.1.el7.x86_64.x86_64.rpm

Now building dpdk-18.11:
- download, untar, then enable CONFIG_RTE_LIBRTE_MLX4_PMD=y and
  CONFIG_RTE_LIBRTE_MLX5_PMD=y in ./config/common_base
export DPDK_DIR=/opt/dpdk_install/dpdk-18.11
cd $DPDK_DIR
export DPDK_BUILD=$DPDK_DIR/install
export RTE_SDK=$DPDK_DIR
export DPDK_TARGET=x86_64-native-linuxapp-gcc
export RTE_TARGET=x86_64-native-linuxapp-gcc

The following is set in config/common_base:
CONFIG_RTE_BUILD_SHARED_LIB=n
CONFIG_RTE_LIBRTE_MLX4_PMD=y
CONFIG_RTE_LIBRTE_MLX4_DEBUG=y
CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS=n
CONFIG_RTE_LIBRTE_MLX5_PMD=y
CONFIG_RTE_LIBRTE_MLX5_DEBUG=y
CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=n

CONFIG_RTE_LOG_DP_LEVEL=RTE_LOG_DEBUG

make config T=$DPDK_TARGET
make install T=$DPDK_TARGET DESTDIR=install
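
Because the static build still tries to load a glue library even though
*_DLOPEN_DEPS=n was requested, it may be worth reading the options back from
the configuration the build actually generated. The sketch below assumes the
generated file lives at $RTE_SDK/$RTE_TARGET/.config, which is an assumption
about the make-based build layout; config_value is a hypothetical helper name.

```shell
# Print the last value assigned to a CONFIG_ key in a DPDK config file
# (later assignments override earlier ones).
# config_value is a hypothetical helper; the .config path is an assumption.
config_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

cfg=${RTE_SDK:-.}/${RTE_TARGET:-x86_64-native-linuxapp-gcc}/.config
if [ -f "$cfg" ]; then
    for key in CONFIG_RTE_BUILD_SHARED_LIB \
               CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS \
               CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS; do
        echo "$key=$(config_value "$cfg" "$key")"
    done
fi
```

A value that differs from what was edited into config/common_base would
suggest the edit was overridden or that a stale build directory was reused.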

It builds fine, by default with static libraries generated under install/lib,
and I can see the generated libs:
# ls -l install/lib/*mlx*
-rw-r--r--. 1 root root 2126350 Mar  2 22:12 install/lib/librte_pmd_mlx4.a
-rw-r--r--. 1 root root 6613402 Mar  2 22:12 install/lib/librte_pmd_mlx5.a


# lspci -v -n
...
aec9:00:02.0 0200: 15b3:1004
        Subsystem: 15b3:61b0
        Flags: fast devsel, NUMA node 0
        Memory at fe0800000 (64-bit, prefetchable) [size=8M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [9c] MSI-X: Enable- Count=24 Masked-
        Capabilities: [40] Power Management version 0
        Kernel driver in use: vfio-pci
        Kernel modules: mlx4_core

# lsmod | grep mlx
mlx5_fpga_tools        14392  0
mlx5_ib               339996  0
ib_uverbs             125872  3 mlx5_ib,ib_ucm,rdma_ucm
mlx5_core             919535  2 mlx5_ib,mlx5_fpga_tools
mlxfw                  18227  1 mlx5_core
mlx4_ib               211832  0
ib_core               294554  10
rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_en               146509  0
mlx4_core             360644  2 mlx4_en,mlx4_ib
mlx_compat             28730  15
rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
devlink                48345  4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core
ptp                    19231  3 hv_utils,mlx4_en,mlx5_core
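
The probe error asks "is mlx4_ib loaded?", so a scripted version of this check
can be handy when repeating the test. The sketch below reports which of a
given set of modules are absent from lsmod output; missing_modules is a
hypothetical helper name, and the module list is taken from the lsmod output
above rather than from any authoritative requirement list.

```shell
# Report which of the named kernel modules are absent from lsmod output on stdin.
# missing_modules is a hypothetical helper name.
missing_modules() {
    loaded=$(awk '{ print $1 }')
    for mod in "$@"; do
        # Match whole module names in the first column only.
        echo "$loaded" | grep -qxF "$mod" || echo "$mod"
    done
}

# Prints nothing when all four modules are loaded.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | missing_modules mlx4_core mlx4_en mlx4_ib ib_uverbs
fi
```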


Now testing with testpmd, but first some sanity checks:

# find /lib/modules/3.10.0-957.5.1.el7.x86_64/ -type f -name "*mlx*" | xargs ls
-l
-rwxr--r--. 1 root root   47688 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/compat/mlx_compat.ko
-rwxr--r--. 1 root root  353296 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko
-rwxr--r--. 1 root root  554568 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko
-rwxr--r--. 1 root root  573648 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko
-rwxr--r--. 1 root root  255656 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko
-rwxr--r--. 1 root root 1433680 Mar  2 17:34
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
-rwxr--r--. 1 root root   25728 Mar  2 17:35
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/fpga/mlx5_fpga_tools.ko
-rwxr--r--. 1 root root   24728 Mar  2 17:35
/lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlxfw/mlxfw.ko


# cat /etc/modprobe.d/ofed_mlx4.conf
(all options commented out)

With mlx4_ib loaded we have two pairs of interfaces: eth0 + eth2 and eth1 + eth3.
Each pair shares the same MAC address.
# ip link show
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:4d:49:98 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:18:a1:73 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:4d:49:98 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:18:a1:73 brd ff:ff:ff:ff:ff:ff
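
The pairing by MAC address can also be listed mechanically. The sketch below
groups interface names by MAC from the one-line-per-interface form of the
output (ip -o link show); ifaces_by_mac is a hypothetical helper name.

```shell
# Group interface names by MAC address from `ip -o link show` output on stdin.
# ifaces_by_mac is a hypothetical helper name.
ifaces_by_mac() {
    awk '{
        name = $2; sub(/:$/, "", name)
        for (i = 3; i < NF; i++)
            if ($i == "link/ether") names[$(i + 1)] = names[$(i + 1)] " " name
    }
    END { for (mac in names) print mac names[mac] }' | sort
}

# One output line per MAC, followed by every interface that carries it.
if command -v ip >/dev/null 2>&1; then
    ip -o link show | ifaces_by_mac
fi
```

On this system the expected result would be one line for eth0/eth2 and one for
eth1/eth3, matching the synthetic (NetVSC) and VF interface of each pair.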

# lshw | less
     *-network:1
          description: Ethernet interface
          product: MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual
Function]
          vendor: Mellanox Technologies
          physical id: 2
          bus info: pci@aec9:00:02.0
          logical name: eth3
          version: 00
          serial: 00:0d:3a:18:a1:73
          width: 64 bits
          clock: 33MHz
          capabilities: pciexpress msix pm bus_master cap_list ethernet
physical fibre autonegotiation
          configuration: autonegotiation=on broadcast=yes driver=mlx4_en
driverversion=4.5-1.0.1 duplex=full firmware=2.41.7004 latency=0 link=yes
multicast=yes slave=yes
          resources: iomemory:f0-ef irq:0 memory:fe0800000-fe0ffffff


Using the second pair with DPDK (the first pair has the mgmt IP):

# dpdk-devbind.py --force --bind mlx4_core aec9:00:02.0
# dpdk-devbind.py -s
Network devices using DPDK-compatible driver
============================================
aec9:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual
Function] 1004' drv=vfio-pci unused=mlx4_core

Other Network devices
=====================
a3f2:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual
Function] 1004' unused=mlx4_core,vfio-pci
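
Note that the bind command above requests mlx4_core, while the status output
(and the earlier lspci output) shows drv=vfio-pci. An independent way to see
which kernel driver currently owns the device is to read the sysfs "driver"
symlink. The sketch below assumes the standard /sys/bus/pci layout;
bound_driver is a hypothetical helper name.

```shell
# Print the kernel driver a PCI device is bound to, or "none".
# bound_driver is a hypothetical helper; the optional second argument
# (sysfs root) exists only to make the function easy to test.
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

bound_driver aec9:00:02.0
```

The mlx4 PMD is documented as working alongside the kernel mlx4 driver rather
than through vfio-pci, so a vfio-pci binding here may be related to the
"cannot access device" error; this is a hypothesis to verify, not a confirmed
diagnosis.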


We have 4 lcores (2 cores with 2 threads each) and hugepages configured:
# cpu_layout.py
======================================================================
Core and Socket Information (as reported by '/sys/devices/system/cpu')
======================================================================
cores =  [0, 1]
sockets =  [0]
       Socket 0
       --------
Core 0 [0, 1]
Core 1 [2, 3]

# grep -i huge /proc/meminfo
AnonHugePages:     20480 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

# mount | grep -i huge
cgroup on /sys/fs/cgroup/hugetlb type cgroup
(rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
hugetlbfs on /mnt/huge type hugetlbfs (rw,relatime,seclabel)
none on /mnt/huge_2mb type hugetlbfs (rw,relatime,seclabel,pagesize=2MB)
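
The total hugepage memory follows from HugePages_Total * Hugepagesize. The
sketch below computes it in MB from /proc/meminfo-style input;
hugepage_total_mb is a hypothetical helper name.

```shell
# Total hugepage memory in MB from /proc/meminfo-style text on stdin.
# hugepage_total_mb is a hypothetical helper name.
hugepage_total_mb() {
    awk '$1 == "HugePages_Total:" { n = $2 }
         $1 == "Hugepagesize:"    { kb = $2 }
         END { printf "%d\n", n * kb / 1024 }'
}

if [ -r /proc/meminfo ]; then
    hugepage_total_mb < /proc/meminfo
fi
```

With the values above (2048 pages of 2048 kB) this gives 4096 MB, which is
exactly the amount the pktgen command requests with -m 4096.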

-- 
You are receiving this mail because:
You are the assignee for the bug.
