* [DPDK/ethdev Bug 1776] Segmentation fault encountered in MPRQ vectorized mode
@ 2025-08-21 2:31 bugzilla
From: bugzilla @ 2025-08-21 2:31 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=1776
Bug ID: 1776
Summary: Segmentation fault encountered in MPRQ vectorized mode
Product: DPDK
Version: 22.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: critical
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: canary.overflow@gmail.com
Target Milestone: ---
I have been encountering a segmentation fault when running DPDK in MPRQ
vectorized mode. To reproduce the issue with testpmd, run it with the following
parameters:
    dpdk-testpmd -l 1-5 -n 4 \
      -a 0000:1f:00.0,rxq_comp_en=1,rxq_pkt_pad_en=1,rxqs_min_mprq=1,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=9,mprq_max_memcpy_len=64,rx_vec_en=1 \
      -- -i --rxd=8192 --max-pkt-len=9000 --rxq=1 --total-num-mbufs=16384 \
      --mbuf-size=3000 --enable-drop-en --enable-scatter
The segmentation fault goes away when I disable vectorization (rx_vec_en=0).
It also does not occur with --forward-mode=rxonly. The fault appears to occur
more often when the port reports rx_nombuf (mbuf allocation failures).
The backtrace of the segmentation fault was:
#0 0x0000000001c34912 in __rte_pktmbuf_free_extbuf ()
#1 0x0000000001c36a10 in rte_pktmbuf_detach ()
#2 0x0000000001c4a9ec in rxq_copy_mprq_mbuf_v ()
#3 0x0000000001c4d63b in rxq_burst_mprq_v ()
#4 0x0000000001c4d7a7 in mlx5_rx_burst_mprq_vec ()
#5 0x000000000050be66 in rte_eth_rx_burst ()
#6 0x000000000050c53d in pkt_burst_io_forward ()
#7 0x00000000005427b4 in run_pkt_fwd_on_lcore ()
#8 0x000000000054289b in start_pkt_forward_on_core ()
#9 0x0000000000a473c9 in eal_thread_loop ()
#10 0x00007ffff60061ca in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff5c72e73 in clone () from /lib64/libc.so.6
Note: the addresses may not be exact, because I had previously added some log
statements and attempted fixes (they were commented out when I obtained this
backtrace).
On investigation, I noticed that in rxq_copy_mprq_mbuf_v()
(drivers/net/mlx5/mlx5_rxtx_vec.c) rxq->consumed_strd can exceed the number of
strides per WQE (strd_n, 64 in this case), which should never happen. I suspect
that the CQEs become misaligned once rxnombuf is encountered.
    rxq_copy_mprq_mbuf_v(...) {
        ...
        if (rxq->consumed_strd == strd_n) {
            /* replenish WQE */
        }
        ...
        strd_cnt = (elts[i]->pkt_len / strd_sz) +
                   ((elts[i]->pkt_len % strd_sz) ? 1 : 0);
        rxq_code = mprq_buf_to_pkt(rxq, elts[i], elts[i]->pkt_len, buf,
                                   rxq->consumed_strd, strd_cnt);
        /* encountering cases where rxq->consumed_strd > strd_n */
        rxq->consumed_strd += strd_cnt;
        ...
    }
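The stride arithmetic in the excerpt above can be checked in isolation. The following is a minimal standalone sketch, not the driver code: strides_for_len() and consume_checked() are hypothetical helper names, and the values 64 and 512 are assumed from mprq_log_stride_num=6 and mprq_log_stride_size=9 in the reproduction command. It shows the same ceiling division used for strd_cnt, plus a guard for the invariant that consumed_strd must not exceed strd_n.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of strides a packet of pkt_len bytes spans.
 * Same ceiling division as the strd_cnt expression in the excerpt. */
static inline uint32_t strides_for_len(uint32_t pkt_len, uint32_t strd_sz)
{
    assert(strd_sz != 0);
    return pkt_len / strd_sz + ((pkt_len % strd_sz) ? 1 : 0);
}

/* Hypothetical guard: advance *consumed by strd_cnt only if the result
 * stays within strd_n. Returns 0 on success, or -1 (leaving *consumed
 * unchanged) if the invariant consumed <= strd_n would be violated. */
static inline int consume_checked(uint32_t *consumed, uint32_t strd_cnt,
                                  uint32_t strd_n)
{
    if (*consumed > strd_n || strd_cnt > strd_n - *consumed)
        return -1;
    *consumed += strd_cnt;
    return 0;
}
```

Under these assumed values, a 9000-byte packet spans 18 strides of 512 bytes, so in this model a WQE with 60 of its 64 strides already consumed cannot account for it; the guard reports the violation instead of letting the counter run past 64. A check of this kind could serve as a debug assertion while narrowing down where the counter first goes wrong.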
In addition, there were cases in mprq_buf_to_pkt() where the address of the
newly allocated seg was exactly the same as the address of the pkt (elts[i])
passed in, which should not happen.
    mprq_buf_to_pkt(...) {
        ...
        if (hdrm_overlap > 0) {
            MLX5_ASSERT(rxq->strd_scatter_en);
            struct rte_mbuf *seg = rte_pktmbuf_alloc(rxq->mp);
            if (unlikely(seg == NULL))
                return MLX5_RXQ_CODE_NOMBUF;
            SET_DATA_OFF(seg, 0);
            /* added debug statement: saw instances where pkt == seg */
            DRV_LOG(DEBUG, "pkt %p seg %p", (void *)pkt, (void *)seg);
            rte_memcpy(rte_pktmbuf_mtod(seg, void *),
                       RTE_PTR_ADD(addr, len - hdrm_overlap),
                       hdrm_overlap);
            ...
        }
    }
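One possible reading of pkt == seg, offered as an assumption rather than something established in this report, is that the mbuf in elts[i] had already been returned to the mempool while the Rx ring still referenced it, so the next allocation hands the same object back. The toy LIFO pool below illustrates how such aliasing can arise; every name in it is hypothetical and it is unrelated to the real rte_mempool implementation.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Toy LIFO object pool standing in for a mempool (illustration only). */
struct toy_pool {
    int objs[POOL_SIZE];       /* backing objects */
    int *free_list[POOL_SIZE]; /* stack of pointers to free objects */
    size_t n_free;             /* number of entries on the stack */
};

/* Put every backing object on the free stack. */
static inline void toy_pool_init(struct toy_pool *p)
{
    p->n_free = 0;
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_list[p->n_free++] = &p->objs[i];
}

/* Pop the most recently freed object, or NULL if the pool is empty. */
static inline int *toy_alloc(struct toy_pool *p)
{
    return p->n_free ? p->free_list[--p->n_free] : NULL;
}

/* Push an object back. The pool cannot tell whether the caller still
 * holds other references to it. */
static inline void toy_free(struct toy_pool *p, int *obj)
{
    assert(p->n_free < POOL_SIZE);
    p->free_list[p->n_free++] = obj;
}
```

In this model, if an object is freed while another holder still treats it as live, the very next toy_alloc() returns the same pointer, so "pkt" and "seg" alias one object. Whether the driver actually reaches such a state is not confirmed here.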
I have also tried upgrading DPDK to 24.11, but the segmentation fault
persists.