* [DPDK/examples Bug 1391] examples/l3fwd: in event-mode hash.txadapter.txq is not always updated
@ 2024-03-01 13:22 bugzilla
From: bugzilla @ 2024-03-01 13:22 UTC (permalink / raw)
  To: dev


https://bugs.dpdk.org/show_bug.cgi?id=1391

            Bug ID: 1391
           Summary: examples/l3fwd: in event-mode hash.txadapter.txq is
                    not always updated
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: examples
          Assignee: dev@dpdk.org
          Reporter: konstantin.v.ananyev@yandex.ru
                CC: pbhagavatula@marvell.com
  Target Milestone: ---

Reproducible with latest main branch.

l3fwd in event mode with the SW eventdev on mlx5 PMDs can crash:

./dpdk-l3fwd --lcores=49,51,53,55,57 -n 6 \
  -a ca:00.0 -a ca:00.1 -a cb:00.0 -a cb:00.1 -s 0x8000000000000 \
  --vdev event_sw0 -- -L -P -p f --rx-queue-size 1024 --tx-queue-size 1024 \
  --mode eventdev --eventq-sched=ordered \
  --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
  --no-numa

Thread 4 "dpdk-worker51" received signal SIGSEGV, Segmentation fault.
0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
6637            buffer->pkts[buffer->length++] = tx_pkt;
(gdb) bt
#0  0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
    queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
#1  txa_service_tx (txa=0x11f89959c0, ev=0x7ffff2f23e10, n=16)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:631
#2  0x000000000135d3ef in txa_service_func (args=0x11f89959c0)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:666
#3  0x00000000015d30e1 in service_runner_do_callback (s=0x11ffffe100,
    cs=0x11fffe8500, service_idx=2) at ../lib/eal/common/rte_service.c:405
#4  0x00000000015d3429 in service_run (i=2, cs=0x11fffe8500, service_mask=7,
    s=0x11ffffe100, serialize_mt_unsafe=1)
    at ../lib/eal/common/rte_service.c:441
#5  0x00000000015d363f in service_runner_func (arg=0x0)
    at ../lib/eal/common/rte_service.c:513
#6  0x00000000015c12c1 in eal_thread_loop (arg=0x33)
    at ../lib/eal/common/eal_common_thread.c:212
#7  0x00000000015e1b98 in eal_worker_thread_loop (arg=0x33)
    at ../lib/eal/linux/eal.c:916
#8  0x00007ffff5ff76ea in start_thread () from /lib64/libpthread.so.0
#9  0x00007ffff5d0fa8f in clone () from /lib64/libc.so.6

Obviously 'queue_id=43' is wrong here: the crash happened while trying to
access an unconfigured TX queue (note the bogus 'buffer=0x10' pointer in
frame #0).

What is happening here is a coincidence of two different problems:
1. The EVENT framework silently and unconditionally re-uses mbuf::hash.fdir
for its own purposes:
                        struct {
                                uint32_t reserved1;
                                uint16_t reserved2;
                                uint16_t txq;
                                /**< The event eth Tx adapter uses this field
                                 * to store Tx queue id.
                                 * @see rte_event_eth_tx_adapter_txq_set()
                                 */
                        } txadapter; /**< Eventdev ethdev Tx adapter */
In particular, txa_service_tx() expects hash.txadapter.txq to contain a valid
TX queue index.
However, l3fwd does not always set it properly.
Usually that is OK for this particular app, as only queue 0 is in use and it
doesn't configure the PMDs to overwrite the mbuf::hash.fdir.hi value
(RTE_MBUF_F_RX_FDIR).
But if, for whatever reason, a PMD overwrites mbuf::hash.fdir.hi with some
non-zero value, then we are in trouble.
2. That's exactly what is happening here: the mlx5 driver sometimes
superfluously updates mbuf::hash.fdir.hi.
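
To make the aliasing described in (1) and (2) concrete, here is a minimal
standalone sketch. It does not use DPDK headers: 'union mock_mbuf_hash' is a
hypothetical, simplified stand-in for the 'hash' union of struct rte_mbuf that
keeps only the members discussed above, and both helper functions are invented
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the 'hash' union in struct rte_mbuf.
 * Only the members relevant to this report are reproduced. */
union mock_mbuf_hash {
	uint32_t rss;
	struct {
		uint32_t lo;
		uint32_t hi;
	} fdir;
	struct {
		uint32_t reserved1;
		uint16_t reserved2;
		uint16_t txq; /* shares its bytes with the upper half of fdir.hi */
	} txadapter;
};

/* Models a PMD writing fdir.hi on RX while the application never sets txq.
 * Returns the queue index a reader of txadapter.txq would then see. */
static uint16_t
txq_after_fdir_hi_write(uint32_t fdir_hi)
{
	union mock_mbuf_hash h;

	memset(&h, 0, sizeof(h));
	h.fdir.hi = fdir_hi;
	return h.txadapter.txq;
}

/* Models the proposed fix: the application always stores the queue index
 * after RX, so whatever the PMD left in fdir.hi no longer matters. */
static uint16_t
txq_after_fix(uint32_t fdir_hi, uint16_t queue)
{
	union mock_mbuf_hash h;

	memset(&h, 0, sizeof(h));
	h.fdir.hi = fdir_hi;
	h.txadapter.txq = queue;
	return h.txadapter.txq;
}
```

In this layout txq occupies bytes 6-7 of the union and fdir.hi occupies bytes
4-7, so any write to fdir.hi also rewrites txq. As an illustration only (not
the value mlx5 actually wrote), on a little-endian machine
txq_after_fdir_hi_write(0x002b0000) would return 43.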

The fix I applied locally is obvious: *always* set hash.txadapter.txq to a
proper value before calling rte_event_enqueue_burst().
See below for details.
Note that it is not a 'complete' fix, as the same needs to be done for the
other codepaths (em, fib, acl, etc.).
As a more general point, I don't understand why the EVENT framework keeps
using hash.fdir for its own purposes, especially in a completely silent and
unconditional way.
I think it would be much cleaner to switch to an mbuf dynfield/dynflag based
approach.

diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index a484a33089..ef9838aef3 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -285,6 +285,8 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                        continue;
                }

+               rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
+
                if (flags & L3FWD_EVENT_TX_ENQ) {
                        ev.queue_id = tx_q_id;
                        ev.op = RTE_EVENT_OP_FORWARD;
@@ -295,7 +297,6 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                }

                if (flags & L3FWD_EVENT_TX_DIRECT) {
-                       rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
                        do {
                                enq = rte_event_eth_tx_adapter_enqueue(
                                        event_d_id, event_p_id, &ev, 1, 0);
@@ -344,11 +345,8 @@ lpm_event_loop_burst(struct l3fwd_event_resources *evt_rsrc,
                                events[i].op = RTE_EVENT_OP_FORWARD;
                        }

-                       if (flags & L3FWD_EVENT_TX_DIRECT)
-                               rte_event_eth_tx_adapter_txq_set(events[i].mbuf,
-                                                                0);
-
                        lpm_process_event_pkt(lconf, events[i].mbuf);
+                       rte_event_eth_tx_adapter_txq_set(events[i].mbuf, 0);
                }

                if (flags & L3FWD_EVENT_TX_ENQ) {

