From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [DPDK/other Bug 1867] mlx5: enabling internal VF-to-VF communication causes performance to drop significantly
Date: Tue, 13 Jan 2026 12:31:28 +0000
Message-ID: <bug-1867-3@http.bugs.dpdk.org/>

http://bugs.dpdk.org/show_bug.cgi?id=1867

            Bug ID: 1867
           Summary: mlx5: enabling internal VF-to-VF communication causes
                    performance to drop significantly
           Product: DPDK
           Version: 25.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: other
          Assignee: dev@dpdk.org
          Reporter: robin@jarry.cc
                CC: rasland@nvidia.com
  Target Milestone: ---

SUMMARY
=======

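Enabling internal VF-to-VF communication on mlx5 causes a significant drop in
the packet rate delivered to the VFs: at 10M pkt/s per side offered, only
~6.8M pkt/s reach VF0; at 37.5M pkt/s, only ~1.5M pkt/s.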

Here is a visual representation of the topology:

+---------------------------------------------------------------------+
|                                 dut                                 |
|                                                                     |
| +---------------+         +-------------+         +---------------+ |
| |    testpmd    |         |   testpmd   |         |    testpmd    | |
| |   "outer 0"   |         |   "inner"   |         |   "outer 1"   | |
| +--+---------+--+         +--+-------+--+         +--+---------+--+ |
|    |         |               |       |               |         |    |
|    |         |       ,-------'       `-------.       |         |    |
|    |         |       |                       |       |         |    |
| +--+--+   +--+--+ +--+--+                 +--+--+ +--+--+   +--+--+ |
| | vf0 |   | vf1 | | vf2 |                 | vf2 | | vf1 |   | vf0 | |
| +--+--+   +--+--+ +--+--+                 +--+--+ +--+--+   +--+--+ |
|    |         |       |                       |       |         |    |
|    |         |       |                       |       |         |    |
| +--+---------|-------|--+                 +--|-------|---------+--+ |
| |            `-------'  |                 |  `-------'            | |
| |          pf0          |                 |          pf1          | |
| +-----------+-----------+                 +-----------+-----------+ |
|             |                                         |             |
+-------------|-----------------------------------------|-------------+
              |                                         |
+-------------|-----------------------------------------|-------------+
|             |                 switch                  |             |
+-------------|-----------------------------------------|-------------+
              |                                         |
+-------------|-----------------------------------------|-------------+
|             |                                         |             |
| +-----------+------------+                +-----------+-----------+ |
| |          pf0           |                |          pf1          | |
| +-----------+------------+                +-----------+-----------+ |
|             |                                         |             |
| +-----------+-----------------------------------------+-----------+ |
| |                                                                 | |
| |                               trex                              | |
| |                                                                 | |
| +-----------------------------------------------------------------+ |
|                                                                     |
|                                 tgen                                |
+---------------------------------------------------------------------+

Important notes:

* Promiscuous mode is *disabled* on all ports.
* Traffic is sent from trex with the VF0 MAC addresses as Ethernet destination.
* The switch does *not* flood anything.

Observations:

* When the tgen emits at 1M pkt/s per side, every packet is properly forwarded
  to both "outer" testpmds.
* When emitting at 10M pkt/s per side, only ~6.8M pkt/s are received by the
  "outer" testpmds on VF0.
* When emitting at 37.5M pkt/s (25G line rate), only ~1.5M pkt/s are received
  by the "outer" testpmds on VF0.
* ethtool stats on PF interfaces reflect the actual transmission rate of the
  traffic generator (rx_packets_phy) but only a portion of these are relayed to
  VF0.
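
To make the scale of the loss explicit, the delivery ratios implied by the
figures above can be computed as follows. This is only a sketch: the rates are
the approximate values quoted in the observations, the function name is
illustrative, and the third bullet does not say "per side", so its offered
rate is taken as written.

```python
# (offered, received on VF0) in Mpkt/s, copied from the observations above.
# The received values are the approximate ("~") figures from the report.
OBSERVATIONS = [(1.0, 1.0), (10.0, 6.8), (37.5, 1.5)]


def delivery_ratio(offered_mpps, received_mpps):
    """Fraction of the offered packets that reached VF0."""
    return received_mpps / offered_mpps


for offered, received in OBSERVATIONS:
    ratio = delivery_ratio(offered, received)
    print(f"{offered:5.1f} Mpkt/s offered -> {received:4.1f} Mpkt/s on VF0 "
          f"({ratio:.0%} delivered)")
```

The delivered fraction falls from 100% to about 68% and then to about 4% as
the offered rate increases, so the loss is not a fixed ceiling.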

With this simpler setup:

+--------------------------+
|               dut        |
|                          |
| +--------------+         |
| |   testpmd    |         |
| +--+-------+---+         |
|    |       |             |
|    |       |             |
|    |       |             |
| +--+--+ +--+--+          |
| | vf0 | | vf1 |          |
| +--+--+ +--+--+          |
|    |       |             |
|    |       |             |
| +--|-------|-----+       |
| |  \      /      |       |
| |   \    /   pf0 |       |
| +----\  /--------+       |
|       ||                 |
+-------||-----------------+
        ||
+-------||---------------------------------+
|       ||                                 |
|       |\--------------------------\      |
|       |          switch           |      |
|       |                           |      |
+-------|---------------------------|------+
        |                           |
+-------|---------------------------|------+
|       |                           |      |
| +-----+-----+                +----+----+ |
| |    pf0    |                |   pf1   | |
| +-----+-----+                +----+----+ |
|       |                           |      |
| +-----+---------------------------+----+ |
| |                                      | |
| |                  trex                | |
| |                                      | |
| +--------------------------------------+ |
|                                          |
|                    tgen                  |
+------------------------------------------+

The maximum line rate of the port can be achieved (37.5M pkt/s total, 18.75M
pkt/s per side).
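
As a sanity check on the 37.5M pkt/s figure, the theoretical packet rate of a
25 Gb/s link can be derived from the frame size. The sketch below assumes
minimum-size 64-byte Ethernet frames, which the report does not state, plus
the standard 20 bytes of preamble, start-of-frame delimiter and inter-frame
gap per packet. The function name is illustrative.

```python
# Per-packet overhead on the wire: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Theoretical packets per second for a given link speed and frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


# Assumption: 64-byte frames (including FCS) on a 25 Gb/s port.
pps = line_rate_pps(25e9, 64)
print(f"{pps / 1e6:.1f} Mpkt/s")
```

Under that assumption the result is about 37.2M pkt/s, which is consistent
with the ~37.5M pkt/s total quoted above.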

SOFTWARE
========

~/dpdk# git describe 
v25.11-4-gcd60dcd503b9

~# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.16.7-100.fc41.x86_64 ... \
        intel_iommu=on iommu=pt default_hugepagesz=1GB hugepagesz=1G hugepages=32 \
        skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on \
        isolcpus=10-19,30-39 nohz_full=10-19,30-39 rcu_nocbs=10-19,30-39 \
        tuned.non_isolcpus=3ff003ff intel_pstate=passive nosoftlockup

libibverbs.so.1.14.51.0
libmlx5.so.1.24.51.0

HARDWARE
========

CPU Model name:  Intel(R) Xeon(R) Silver 4316 CPU @ 2.30GHz

SLOT          DRIVER     IFNAME        MAC                LINK/STATE  SPEED   DEVICE
0000:18:00.0  mlx5_core  enp24s0f0np0  b8:3f:d2:fa:53:86  1/up        25Gb/s  MT2894 Family [ConnectX-6 Lx]
0000:18:00.1  mlx5_core  enp24s0f1np1  b8:3f:d2:fa:53:87  1/up        25Gb/s  MT2894 Family [ConnectX-6 Lx]
0000:18:00.2  mlx5_core  enp24s0f0v0   02:aa:aa:aa:aa:00  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:00.3  mlx5_core  enp24s0f0v1   02:aa:aa:aa:aa:01  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:00.4  mlx5_core  enp24s0f0v2   02:aa:aa:aa:aa:02  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.2  mlx5_core  enp24s0f1v0   02:cc:cc:cc:cc:00  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.3  mlx5_core  enp24s0f1v1   02:cc:cc:cc:cc:01  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.4  mlx5_core  enp24s0f1v2   02:cc:cc:cc:cc:02  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function

~# ethtool -i enp24s0f0np0
driver: mlx5_core
version: 6.16.7-100.fc41.x86_64
firmware-version: 26.41.1000 (MT_0000000532)
expansion-rom-version:
bus-info: 0000:18:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

~# ethtool -i enp24s0f1np1
driver: mlx5_core
version: 6.16.7-100.fc41.x86_64
firmware-version: 26.41.1000 (MT_0000000532)
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

VF CONFIGURATION
================

+ pf0=enp24s0f0np0
+ pf1=enp24s0f1np1

+++ readlink -ve /sys/class/net/enp24s0f0np0/device
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0
+ pci=0000:18:00.0
+ devlink dev eswitch set pci/0000:18:00.0 mode legacy
+ echo 1
+ tee /sys/class/net/enp24s0f0np0/device/sriov_drivers_autoprobe
1
+ echo 3
+ tee /sys/class/net/enp24s0f0np0/device/sriov_numvfs
3

+ ip link set enp24s0f0np0 vf 0 mac 02:aa:aa:aa:aa:00
+ ip link set enp24s0f0np0 vf 1 mac 02:aa:aa:aa:aa:01
+ ip link set enp24s0f0np0 vf 2 mac 02:aa:aa:aa:aa:02
+ ip link set enp24s0f0v0 address 02:aa:aa:aa:aa:00
+ ip link set enp24s0f0v0 up
+ ip link set enp24s0f0v1 address 02:aa:aa:aa:aa:01
+ ip link set enp24s0f0v1 up
+ ip link set enp24s0f0v2 address 02:aa:aa:aa:aa:02
+ ip link set enp24s0f0v2 up

+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn0
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.2
+ vf00_pci=0000:18:00.2
+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn1
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.3
+ vf01_pci=0000:18:00.3
+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn2
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.4
+ vf02_pci=0000:18:00.4

+++ readlink -ve /sys/class/net/enp24s0f1np1/device
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.1
+ pci=0000:18:00.1
+ devlink dev eswitch set pci/0000:18:00.1 mode legacy
+ echo 1
+ tee /sys/class/net/enp24s0f1np1/device/sriov_drivers_autoprobe
1
+ echo 3
+ tee /sys/class/net/enp24s0f1np1/device/sriov_numvfs
3

+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn0
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.2
+ vf10_pci=0000:18:08.2
+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn1
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.3
+ vf11_pci=0000:18:08.3
+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn2
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.4
+ vf12_pci=0000:18:08.4

+ ip link set enp24s0f1np1 vf 0 mac 02:cc:cc:cc:cc:00
+ ip link set enp24s0f1np1 vf 1 mac 02:cc:cc:cc:cc:01
+ ip link set enp24s0f1np1 vf 2 mac 02:cc:cc:cc:cc:02
+ ip link set enp24s0f1v0 address 02:cc:cc:cc:cc:00
+ ip link set enp24s0f1v0 up
+ ip link set enp24s0f1v1 address 02:cc:cc:cc:cc:01
+ ip link set enp24s0f1v1 up
+ ip link set enp24s0f1v2 address 02:cc:cc:cc:cc:02
+ ip link set enp24s0f1v2 up
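
For anyone trying to reproduce the setup, the trace above can be condensed into
a small generator of the per-PF commands. This is a sketch reconstructed from
the xtrace output, not the original script: the function and parameter names
are hypothetical, the `echo ... | tee` pipelines are inferred from the
separate `+ echo` and `+ tee` trace lines, and the VF netdev prefix is passed
explicitly because interface naming depends on the system.

```python
def vf_setup_commands(pf, pci, vf_prefix, mac_prefix, num_vfs=3):
    """Return the shell commands from the trace above for a single PF.

    The commands are only generated as strings, nothing is executed here.
    """
    sysfs = f"/sys/class/net/{pf}/device"
    cmds = [
        f"devlink dev eswitch set pci/{pci} mode legacy",
        f"echo 1 | tee {sysfs}/sriov_drivers_autoprobe",
        f"echo {num_vfs} | tee {sysfs}/sriov_numvfs",
    ]
    macs = [f"{mac_prefix}:{i:02x}" for i in range(num_vfs)]
    # MAC addresses as configured on the PF side for each VF.
    for i, mac in enumerate(macs):
        cmds.append(f"ip link set {pf} vf {i} mac {mac}")
    # The same MAC addresses set on the VF netdevs, which are then brought up.
    for i, mac in enumerate(macs):
        cmds.append(f"ip link set {vf_prefix}{i} address {mac}")
        cmds.append(f"ip link set {vf_prefix}{i} up")
    return cmds


# Values for pf0, taken from the trace above.
for cmd in vf_setup_commands("enp24s0f0np0", "0000:18:00.0",
                             "enp24s0f0v", "02:aa:aa:aa:aa"):
    print(cmd)
```

Calling it again with `enp24s0f1np1`, `0000:18:00.1`, `enp24s0f1v` and
`02:cc:cc:cc:cc` yields the pf1 half of the configuration.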

TESTPMD COMMANDS
================

~# cat ./testpmd
set promisc all off

RUNTIME_DIRECTORY=/tmp/outer0 dpdk-testpmd -n 4 -l 0,12,32,13,33 \
        -a 0000:18:00.2 -a 0000:18:00.3 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,30:3e:a7:0b:f2:54 --eth-peer=1,02:aa:aa:aa:aa:02 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats

RUNTIME_DIRECTORY=/tmp/inner dpdk-testpmd -n 4 -l 0,14,34,15,35 \
        -a 0000:18:00.4 -a 0000:18:08.4 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,02:aa:aa:aa:aa:01 --eth-peer=1,02:cc:cc:cc:cc:01 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats

RUNTIME_DIRECTORY=/tmp/outer1 dpdk-testpmd -n 4 -l 0,10,30,11,31 \
        -a 0000:18:08.2 -a 0000:18:08.3 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,30:3e:a7:0b:f2:55 --eth-peer=1,02:cc:cc:cc:cc:02 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats
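
The path a packet takes through the three testpmd instances follows from the
--eth-peer options. The sketch below traces it under two assumptions that are
not stated explicitly in the report: testpmd port indices follow the order of
the -a options, and in mac forwarding mode with two ports each instance
transmits on the other port with the destination MAC set to that port's
--eth-peer address. The table and function names are illustrative.

```python
# VF MAC address -> (testpmd instance, port index), from the device table
# and the -a options above (assumed to map to ports 0 and 1 in order).
PORTS = {
    "02:aa:aa:aa:aa:00": ("outer0", 0),
    "02:aa:aa:aa:aa:01": ("outer0", 1),
    "02:aa:aa:aa:aa:02": ("inner", 0),
    "02:cc:cc:cc:cc:02": ("inner", 1),
    "02:cc:cc:cc:cc:00": ("outer1", 0),
    "02:cc:cc:cc:cc:01": ("outer1", 1),
}

# --eth-peer values per instance, indexed by transmitting port.
ETH_PEER = {
    "outer0": ["30:3e:a7:0b:f2:54", "02:aa:aa:aa:aa:02"],
    "inner": ["02:aa:aa:aa:aa:01", "02:cc:cc:cc:cc:01"],
    "outer1": ["30:3e:a7:0b:f2:55", "02:cc:cc:cc:cc:02"],
}


def trace(dst_mac):
    """Follow a packet until its destination MAC is no longer a local VF."""
    hops = []
    while dst_mac in PORTS:
        instance, rx_port = PORTS[dst_mac]
        tx_port = 1 - rx_port  # two ports per instance, paired with each other
        dst_mac = ETH_PEER[instance][tx_port]
        hops.append((instance, rx_port, tx_port, dst_mac))
    return hops


# Traffic from trex addressed to VF0 of pf0.
for instance, rx_port, tx_port, new_dst in trace("02:aa:aa:aa:aa:00"):
    print(f"{instance}: rx port {rx_port} -> tx port {tx_port}, dst {new_dst}")
```

Under these assumptions each direction crosses two internal VF-to-VF hops
(outer to inner, then inner to the other outer) before leaving with a
30:3e:a7:0b:f2:5x destination, which is presumably the traffic generator side.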
