From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [Bug 1086] Significant TX packet drops with Mellanox NIC (mlx5 PMD)
Date: Wed, 28 Sep 2022 13:41:31 +0000
Message-ID: <bug-1086-3@http.bugs.dpdk.org/>
https://bugs.dpdk.org/show_bug.cgi?id=1086
Bug ID: 1086
Summary: Significant TX packet drops with Mellanox NIC (mlx5 PMD)
Product: DPDK
Version: 21.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: critical
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: anton@vaa.su
Target Milestone: ---
Created attachment 222
--> https://bugs.dpdk.org/attachment.cgi?id=222&action=edit
testpmd-fec28ca0e3.log.txt
Given 2 servers with 25G Mellanox 2-port NICs:
# dpdk-devbind.py -s
Network devices using kernel driver
===================================
0000:3b:00.0 'MT27710 Family [ConnectX-4 Lx] 1015' if=ens1f0np0 drv=mlx5_core unused=vfio-pci
0000:3b:00.1 'MT27710 Family [ConnectX-4 Lx] 1015' if=ens1f1np1 drv=mlx5_core unused=vfio-pci
Servers are connected directly.
The first server is used as a packet generator, running TRex v2.99 in stateless
mode:
./t-rex-64 -c 16 -i
./trex-console
trex>start -f stl/udp_1pkt_range_clients.py -m 17mpps
The second one runs dpdk-testpmd:
OS: Debian GNU/Linux 10 (buster)
uname -r: 4.19.0-21-amd64
ofed_info: MLNX_OFED_LINUX-5.7-1.0.2.0
gcc version 8.3.0 (Debian 8.3.0-6)
When DPDK v21.08 is compiled and testpmd is run this way:
dpdk-testpmd -l 1-17 -n 4 --log-level=debug -- --nb-ports=2 --nb-cores=16 --portmask=0x3 --rxq=8 --txq=8
it handles roughly 17 Mpps per port:
trex>start -f stl/udp_1pkt_range_clients.py -m 17mpps
TRex Port Statistics
port | 0 | 1 | total
-----------+-------------------+-------------------+------------------
owner | root | root |
link | UP | UP |
state | TRANSMITTING | TRANSMITTING |
speed | 25 Gb/s | 25 Gb/s |
CPU util. | 27.76% | 27.76% |
-- | | |
Tx bps L2 | 8.7 Gbps | 8.73 Gbps | 17.43 Gbps
Tx bps L1 | 11.42 Gbps | 11.46 Gbps | 22.88 Gbps
Tx pps | 17 Mpps | 17.05 Mpps | 34.05 Mpps
Line Util. | 45.7 % | 45.83 % |
--- | | |
Rx bps | 8.7 Gbps | 8.73 Gbps | 17.43 Gbps
Rx pps | 17 Mpps | 17.05 Mpps | 34.05 Mpps
---- | | |
opackets | 290928398 | 291050836 | 581979234
ipackets | 290885740 | 291093159 | 581978899
obytes | 18619417472 | 18627254464 | 37246671936
ibytes | 18616688080 | 18629962836 | 37246650916
tx-pkts | 290.93 Mpkts | 291.05 Mpkts | 581.98 Mpkts
rx-pkts | 290.89 Mpkts | 291.09 Mpkts | 581.98 Mpkts
tx-bytes | 18.62 GB | 18.63 GB | 37.25 GB
rx-bytes | 18.62 GB | 18.63 GB | 37.25 GB
----- | | |
oerrors | 0 | 0 | 0
ierrors | 0 | 0 | 0
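As a sanity check on the table above, the following minimal sketch reproduces the reported bit rates and line utilization from the packet rate. It assumes 64-byte frames (which is what obytes / opackets gives for port 0) and 20 bytes of per-frame L1 overhead (preamble, start-of-frame delimiter and inter-frame gap); the constant and function names are illustrative only.

```python
# Assumptions: 64-byte frames, 20 bytes of L1 overhead per frame, 25 Gb/s link.
FRAME_BYTES = 64
L1_OVERHEAD_BYTES = 20
LINK_BPS = 25e9


def rates(pps):
    """Return (L2 bits/s, L1 bits/s, line utilization) for a packet rate."""
    l2_bps = pps * FRAME_BYTES * 8
    l1_bps = pps * (FRAME_BYTES + L1_OVERHEAD_BYTES) * 8
    return l2_bps, l1_bps, l1_bps / LINK_BPS


l2, l1, util = rates(17e6)
print(f"L2 {l2 / 1e9:.2f} Gbps, L1 {l1 / 1e9:.2f} Gbps, util {util:.1%}")

# Frame size implied by the port 0 counters (obytes / opackets).
implied_frame_bytes = 18619417472 / 290928398
print(f"implied frame size: {implied_frame_bytes:.1f} bytes")
```

With these assumptions, 17 Mpps works out to about 8.70 Gbps at L2 and 11.42 Gbps at L1, i.e. roughly 45.7 % of a 25 Gb/s link, which agrees with the TRex figures for port 0.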
But after switching to DPDK v21.11, throughput becomes much worse:
TRex Port Statistics
port | 0 | 1 | total
-----------+-------------------+-------------------+------------------
owner | root | root |
link | UP | UP |
state | TRANSMITTING | TRANSMITTING |
speed | 25 Gb/s | 25 Gb/s |
CPU util. | 26.06% | 26.06% |
-- | | |
Tx bps L2 | 8.7 Gbps | 8.72 Gbps | 17.42 Gbps
Tx bps L1 | 11.42 Gbps | 11.45 Gbps | 22.86 Gbps
Tx pps | 16.99 Mpps | 17.04 Mpps | 34.02 Mpps
Line Util. | 45.66 % | 45.79 % |
--- | | |
Rx bps | 3.75 Gbps | 3.76 Gbps | 7.5 Gbps
Rx pps | 7.32 Mpps | 7.34 Mpps | 14.66 Mpps
---- | | |
opackets | 190538147 | 190707494 | 381245641
ipackets | 82174700 | 82260152 | 164434852
obytes | 12194441408 | 12205280936 | 24399722344
ibytes | 5259181520 | 5264649728 | 10523831248
tx-pkts | 190.54 Mpkts | 190.71 Mpkts | 381.25 Mpkts
rx-pkts | 82.17 Mpkts | 82.26 Mpkts | 164.43 Mpkts
tx-bytes | 12.19 GB | 12.21 GB | 24.4 GB
rx-bytes | 5.26 GB | 5.26 GB | 10.52 GB
----- | | |
oerrors | 0 | 0 | 0
ierrors | 0 | 0 | 0
It handles only ~7 Mpps per port instead of ~17 Mpps, and testpmd reports a
large number of TX drops:
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 1101378001 RX-dropped: 0 RX-total: 1101378001
TX-packets: 1016776861 TX-dropped: 84576754 TX-total: 1101353615
----------------------------------------------------------------------------
---------------------- Forward statistics for port 1 ----------------------
RX-packets: 1101353615 RX-dropped: 0 RX-total: 1101353615
TX-packets: 1016804108 TX-dropped: 84573893 TX-total: 1101378001
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 2202731616 RX-dropped: 0 RX-total: 2202731616
TX-packets: 2033580969 TX-dropped: 169150647 TX-total: 2202731616
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
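To put the v21.11 counters in proportion, here is a small sketch that computes the fraction of packets TRex received back and the fraction testpmd counted as TX-dropped. The values are copied from the outputs above; the dictionary layout and the helper name are illustrative. Note that the two tools appear to have been sampled over different intervals (testpmd shows about 1.1 billion packets per port, TRex about 190 million), so the two ratios are not directly comparable.

```python
# Counters copied from the v21.11 run above.
# TRex: port -> (opackets, ipackets)
trex = {0: (190538147, 82174700), 1: (190707494, 82260152)}
# testpmd: port -> (TX-packets, TX-dropped, TX-total)
testpmd = {
    0: (1016776861, 84576754, 1101353615),
    1: (1016804108, 84573893, 1101378001),
}


def fraction(part, whole):
    """Return part / whole as a float."""
    return part / whole


for port, (sent, received) in trex.items():
    print(f"TRex port {port}: {fraction(received, sent):.1%} of sent packets came back")

for port, (tx_ok, tx_dropped, tx_total) in testpmd.items():
    # TX-total is the sum of transmitted and dropped packets.
    assert tx_ok + tx_dropped == tx_total
    print(f"testpmd port {port}: {fraction(tx_dropped, tx_total):.2%} TX-dropped")
```

On these numbers TRex gets back about 43 % of what it sends on each port, while testpmd's own TX-dropped counter accounts for about 7.7 % of TX-total.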
Using git bisect, I found the commit (between 21.08 and 21.11) that caused
this regression:
https://github.com/DPDK/dpdk/commit/fec28ca0e3a93143829f3b41a28a8da933f28499
I have also profiled it with Intel VTune 2021.3.0 (-collect hotspots and
-collect memory-access), comparing two revisions:
1. 690b2a88c2 (GOOD)
2. fec28ca0e3 (BAD)
I can try to share the corresponding profiling results some other way if that
helps. Unfortunately, I cannot attach them here (the VTune data is too big).
--
You are receiving this mail because:
You are the assignee for the bug.