* [PATCH 3/6] net/dpaa2: drop stray extract count bump in RSS key build
From: Maxime Leroy @ 2026-06-16 10:47 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616104717.723087-1-maxime@leroys.fr>
The IPv4/IPv6 L3 case bumped kg_cfg->num_extracts once in the middle of
the loop, while every other case relies on the final
'kg_cfg->num_extracts = i' that overwrites it. The increment was dead and
misleading; remove it.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/net/dpaa2/base/dpaa2_hw_dpni.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
index 4df66d8f33..f1d670f213 100644
--- a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
+++ b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
@@ -397,7 +397,6 @@ dpaa2_distset_to_dpkg_profile_cfg(
DPKG_EXTRACT_FROM_HDR;
kg_cfg->extracts[i].extract.from_hdr.type =
DPKG_FULL_FIELD;
- kg_cfg->num_extracts++;
i++;
break;
--
2.43.0
^ permalink raw reply related
* [PATCH 2/6] net/dpaa2: use L4 port extraction for SCTP RSS
From: Maxime Leroy @ 2026-06-16 10:47 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, stable, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616104717.723087-1-maxime@leroys.fr>
DPAA2 hardware exposes L4 source and destination port fields at the parser
L4 offset. These fields are valid when TCP, UDP, SCTP or DCCP is present.
The driver already uses the TCP port fields for the TCP/UDP RSS case.
Handle SCTP in the same L4 RSS case, so SCTP packets use the same L4
source and destination port extraction.
Fixes: 89c2ea8f5408 ("net/dpaa2: add RSS flow distribution")
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/net/dpaa2/base/dpaa2_hw_dpni.c | 32 +++-----------------------
1 file changed, 3 insertions(+), 29 deletions(-)
diff --git a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
index e7578b7576..4df66d8f33 100644
--- a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
+++ b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
@@ -228,7 +228,7 @@ dpaa2_distset_to_dpkg_profile_cfg(
uint32_t loop = 0, i = 0;
uint64_t dist_field = 0;
int l2_configured = 0, l3_configured = 0;
- int l4_configured = 0, sctp_configured = 0;
+ int l4_configured = 0;
int mpls_configured = 0;
int vlan_configured = 0;
int esp_configured = 0;
@@ -407,6 +407,8 @@ dpaa2_distset_to_dpkg_profile_cfg(
case RTE_ETH_RSS_NONFRAG_IPV6_UDP:
case RTE_ETH_RSS_IPV6_TCP_EX:
case RTE_ETH_RSS_IPV6_UDP_EX:
+ case RTE_ETH_RSS_NONFRAG_IPV4_SCTP:
+ case RTE_ETH_RSS_NONFRAG_IPV6_SCTP:
if (l4_configured)
break;
@@ -433,34 +435,6 @@ dpaa2_distset_to_dpkg_profile_cfg(
i++;
break;
- case RTE_ETH_RSS_NONFRAG_IPV4_SCTP:
- case RTE_ETH_RSS_NONFRAG_IPV6_SCTP:
-
- if (sctp_configured)
- break;
- sctp_configured = 1;
-
- kg_cfg->extracts[i].extract.from_hdr.prot =
- NET_PROT_SCTP;
- kg_cfg->extracts[i].extract.from_hdr.field =
- NH_FLD_SCTP_PORT_SRC;
- kg_cfg->extracts[i].type =
- DPKG_EXTRACT_FROM_HDR;
- kg_cfg->extracts[i].extract.from_hdr.type =
- DPKG_FULL_FIELD;
- i++;
-
- kg_cfg->extracts[i].extract.from_hdr.prot =
- NET_PROT_SCTP;
- kg_cfg->extracts[i].extract.from_hdr.field =
- NH_FLD_SCTP_PORT_DST;
- kg_cfg->extracts[i].type =
- DPKG_EXTRACT_FROM_HDR;
- kg_cfg->extracts[i].extract.from_hdr.type =
- DPKG_FULL_FIELD;
- i++;
- break;
-
default:
DPAA2_PMD_WARN(
"unsupported flow dist option 0x%" PRIx64,
--
2.43.0
^ permalink raw reply related
* [PATCH 1/6] net/dpaa2: add L4 destination port to the RSS hash key
From: Maxime Leroy @ 2026-06-16 10:47 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, stable, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616104717.723087-1-maxime@leroys.fr>
The TCP/UDP case of the RSS key builder added two extracts but both used
NH_FLD_TCP_PORT_SRC, so the L4 destination port was never part of the
hash. Use NH_FLD_TCP_PORT_DST for the second extract so both source and
destination ports contribute.
NET_PROT_TCP is kept: it maps to the hardware's generic L4 port
extraction (parser L4 offset, valid for TCP/UDP/SCTP), so this also
covers UDP traffic.
Fixes: 89c2ea8f5408 ("net/dpaa2: add RSS flow distribution")
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/net/dpaa2/base/dpaa2_hw_dpni.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
index 13825046d8..e7578b7576 100644
--- a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
+++ b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
@@ -425,7 +425,7 @@ dpaa2_distset_to_dpkg_profile_cfg(
kg_cfg->extracts[i].extract.from_hdr.prot =
NET_PROT_TCP;
kg_cfg->extracts[i].extract.from_hdr.field =
- NH_FLD_TCP_PORT_SRC;
+ NH_FLD_TCP_PORT_DST;
kg_cfg->extracts[i].type =
DPKG_EXTRACT_FROM_HDR;
kg_cfg->extracts[i].extract.from_hdr.type =
--
2.43.0
^ permalink raw reply related
* [PATCH 0/6] net/dpaa2: RSS fixes and improvements
From: Maxime Leroy @ 2026-06-16 10:47 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy
A set of RSS fixes and improvements for the dpaa2 PMD.
Patches 1 and 2 fix the RSS hash key: the L4 destination port was never
added (both extracts used the source port), and SCTP now uses the same L4
port extraction as TCP/UDP. Both are tagged for stable.
Patches 3 and 4 are small cleanups in the key builder (a dead num_extracts
increment, and the unset PPPoE guard flag).
Patch 5 honours RTE_ETH_RSS_LEVEL_INNERMOST so tunnelled traffic hashes on
the inner IP header. Patch 6 implements reta_query / reta_update as an
emulation over the HW distribution-size mechanism, since dpaa2 has no
software-visible indirection table.
Tested on LX2160A (lx2160acex7).
Maxime Leroy (6):
net/dpaa2: add L4 destination port to the RSS hash key
net/dpaa2: use L4 port extraction for SCTP RSS
net/dpaa2: drop stray extract count bump in RSS key build
net/dpaa2: set PPPoE configured flag in RSS key build
net/dpaa2: support inner RSS level for tunnelled traffic
net/dpaa2: implement RSS RETA query and update
doc/guides/nics/features/dpaa2.ini | 1 +
doc/guides/rel_notes/release_26_07.rst | 5 +
drivers/net/dpaa2/base/dpaa2_hw_dpni.c | 94 +++++++-----
drivers/net/dpaa2/dpaa2_ethdev.c | 204 ++++++++++++++++++++++++-
drivers/net/dpaa2/dpaa2_ethdev.h | 9 ++
5 files changed, 272 insertions(+), 41 deletions(-)
--
2.43.0
^ permalink raw reply
* [PATCH v1 1/2] app/testpmd: mask VLAN TCI when specified
From: Anatoly Burakov @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev, Ori Kam, Aman Singh
Currently, when testpmd command `...vlan tci is 0x1234`, the VLAN TCI field
is being specified in spec, but is not masked in mask. Add full mask to
VLAN TCI when specified.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
app/test-pmd/cmdline_flow.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 67f200f2e3..0b7d268535 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -4610,7 +4610,8 @@ static const struct token token_list[] = {
.help = "tag control information",
.next = NEXT(item_vlan, NEXT_ENTRY(COMMON_UNSIGNED),
item_param),
- .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, hdr.vlan_tci)),
+ .args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
+ hdr.vlan_tci, "\xff\xff")),
},
[ITEM_VLAN_PCP] = {
.name = "pcp",
--
2.47.3
^ permalink raw reply related
* [PATCH v1 2/2] app/testpmd: mask VLAN inner type when specified
From: Anatoly Burakov @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev, Ori Kam, Aman Singh
In-Reply-To: <00f78873fdb3d48542a7226e68a94e74cab4d8c0.1781605650.git.anatoly.burakov@intel.com>
Currently, when testpmd command `...vlan inner_type is 0x1234`, the VLAN
inner type field is being specified in spec, but is not masked in mask.
Add full mask to VLAN inner type when specified.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
app/test-pmd/cmdline_flow.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0b7d268535..e41ab0ef9b 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -4642,8 +4642,8 @@ static const struct token token_list[] = {
.help = "inner EtherType",
.next = NEXT(item_vlan, NEXT_ENTRY(COMMON_UNSIGNED),
item_param),
- .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan,
- hdr.eth_proto)),
+ .args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
+ hdr.eth_proto, "\xff\xff")),
},
[ITEM_VLAN_HAS_MORE_VLAN] = {
.name = "has_more_vlan",
--
2.47.3
^ permalink raw reply related
* [PATCH v2 6/6] net/dpaa2: drop the fake software VLAN strip offload
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
RTE_ETH_RX_OFFLOAD_VLAN_STRIP is advertised, but no hardware VLAN strip
backs it: when enabled, the Rx burst calls rte_vlan_strip() on every
frame, a software op masquerading as a hardware offload.
It saves a forwarding application nothing: the datapath reads the L2
header anyway to classify or strip. The offload does not remove that
read, it relocates it into the driver Rx burst, where it is far more
expensive.
The cost is a matter of timing. rte_vlan_strip() reaches the L2 header
through rte_pktmbuf_mtod(), which dereferences mbuf->buf_addr. On a
freshly recycled buffer that mbuf cacheline is cold. eth_fd_to_mbuf()
has just written other fields of it (data_off, ol_flags), but buf_addr
is a persistent field it does not rewrite. A write does not stall: it
posts to the store buffer while the line fills in the background, and
the rewritten fields are forwarded straight from there. buf_addr has
nothing to forward, so it must be read from the line, whose fill is
still in flight, and the read stalls. The ethertype read that follows,
on the cold payload line, stalls again. Read later by the application,
when the fill has completed, the same read hits. The offload just
performs it at the worst possible moment.
Measured on a single-core port-to-port forwarding test over two 10G
ports (one core at 2 GHz, 64-byte untagged frames):
- throughput 4.22 -> 5.00 Mpps (+18 percent)
- IPC 0.93 -> 1.25: the cost was memory stall, not compute
- L3/DRAM-bound L2 refills 319M -> 200M over 10s (-37 percent)
perf confirms it: with the offload, the buf_addr load (the cold mbuf
field) and the payload load account for about 84 percent of the Rx
burst's L2 refills; removing it, those vanish and only the inherent DQRR
dequeue misses remain.
Stop advertising VLAN_STRIP and remove the rte_vlan_strip() calls from
every Rx path. This is a behavioural change: the tag is left in the
frame, so an application must strip it itself, on the L2 header it
already reads.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
doc/guides/rel_notes/release_26_07.rst | 3 +++
drivers/net/dpaa2/dpaa2_ethdev.c | 1 -
drivers/net/dpaa2/dpaa2_rxtx.c | 23 +++--------------------
3 files changed, 6 insertions(+), 21 deletions(-)
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 3bc49c3910..1da1d7b729 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -143,6 +143,9 @@ New Features
* **Updated NXP dpaa2 driver.**
* Added Rx queue interrupt support.
+ * Removed the software VLAN strip offload: ``RTE_ETH_RX_OFFLOAD_VLAN_STRIP``
+ is no longer advertised, as no hardware strip backs it. An application
+ that needs the tag removed must now strip it itself.
* **Updated PCAP ethernet driver.**
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index ac7303c116..e0451d4ac6 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -48,7 +48,6 @@ static uint64_t dev_rx_offloads_sup =
RTE_ETH_RX_OFFLOAD_SCTP_CKSUM |
RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM |
RTE_ETH_RX_OFFLOAD_OUTER_UDP_CKSUM |
- RTE_ETH_RX_OFFLOAD_VLAN_STRIP |
RTE_ETH_RX_OFFLOAD_VLAN_FILTER |
RTE_ETH_RX_OFFLOAD_TIMESTAMP;
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 189accc1de..d16e4f8f35 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -890,10 +890,6 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
}
#endif
- if (eth_data->dev_conf.rxmode.offloads &
- RTE_ETH_RX_OFFLOAD_VLAN_STRIP)
- rte_vlan_strip(bufs[num_rx]);
-
dq_storage++;
num_rx++;
} while (pending);
@@ -922,22 +918,14 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
return num_rx;
}
-/* Convert a DQRR'd FD (single or scatter-gather) to an mbuf and apply software
- * VLAN strip, like the poll path.
- */
+/* Convert a DQRR'd FD (single or scatter-gather) to an mbuf. */
static inline struct rte_mbuf *
dpaa2_dqrr_fd_to_mbuf(const struct qbman_fd *fd,
struct rte_eth_dev_data *eth_data)
{
- struct rte_mbuf *m;
-
if (unlikely(DPAA2_FD_GET_FORMAT(fd) == qbman_fd_sg))
- m = eth_sg_fd_to_mbuf(fd, eth_data->port_id);
- else
- m = eth_fd_to_mbuf(fd, eth_data->port_id);
- if (eth_data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN_STRIP)
- rte_vlan_strip(m);
- return m;
+ return eth_sg_fd_to_mbuf(fd, eth_data->port_id);
+ return eth_fd_to_mbuf(fd, eth_data->port_id);
}
/* prefetch a DQRR'd FD's HW annotation (parse area) ahead of conversion */
@@ -1222,11 +1210,6 @@ dpaa2_dev_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
}
#endif
- if (eth_data->dev_conf.rxmode.offloads &
- RTE_ETH_RX_OFFLOAD_VLAN_STRIP) {
- rte_vlan_strip(bufs[num_rx]);
- }
-
dq_storage++;
num_rx++;
num_pulled++;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 5/6] net/dpaa2: fix Rx queue count for primary process
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev
Cc: Maxime Leroy, stable, Hemant Agrawal, Sachin Saxena,
Andrew Rybchenko, Ferruh Yigit, David Marchand
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
The rx_queue_count callback was only assigned on the secondary process
path of dpaa2_dev_init(), leaving eth_dev->rx_queue_count NULL for the
primary process. The fast-path rte_eth_rx_queue_count() performs an
unguarded indirect call in non-debug builds, so invoking it on a
primary-process dpaa2 port dereferences a NULL function pointer and
crashes.
Assign the callback once before the process-type split so both the
primary and secondary paths set it.
Fixes: cbfc6111b557 ("ethdev: move inline device operations")
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/net/dpaa2/dpaa2_ethdev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 76e2df6167..ac7303c116 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -3410,6 +3410,7 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
}
eth_dev->dev_ops = &dpaa2_ethdev_ops;
+ eth_dev->rx_queue_count = dpaa2_dev_rx_queue_count;
if (dpaa2_get_devargs(dev->devargs, DRIVER_LOOPBACK_MODE)) {
eth_dev->rx_pkt_burst = dpaa2_dev_loopback_rx;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 4/6] bus/fslmc/dpio: tune DQRI interrupt coalescing holdoff
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
The portal DQRI interrupt used a fixed threshold of 3 and a raw 0xFF
timeout. Parameterize dpaa2_dpio_intr_init() with (threshold, timeout) so
each mode supplies its own: the event driver keeps the legacy 3 / 0xFF
and its DPAA2_PORTAL_INTR_THRESHOLD / DPAA2_PORTAL_INTR_TIMEOUT env-var
overrides, while rx-queue interrupts default the threshold to the HW DQRR
ring depth (ring-1, =7 on QBMan >= 4.1) and use a coalescing holdoff in
microseconds, converted to ITP units from the MC-reported QBMan clock
(itp = holdoff_us * clk_MHz / 256, capped at the 12-bit field). The setup
is portal-wide and idempotent, so the first mode to arm a given portal
wins; a portal is normally driven by a single mode.
The net/dpaa2 PMD exposes both rx-queue-interrupt knobs as per-port
devargs: drv_rx_intr_holdoff_us (default 100us) and drv_rx_intr_threshold
(default 0 = ring-1, clamped to [1, ring-1]). Also expose
dpaa2_dpio_intr_deinit() (no longer event-only), and on the intr_init
error paths close the epoll fd and disable the interrupt.
Add qbman_swp_dqrr_size() to expose the ring depth.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
doc/guides/nics/dpaa2.rst | 10 +++
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 72 +++++++++++++------
drivers/bus/fslmc/portal/dpaa2_hw_dpio.h | 12 +++-
.../fslmc/qbman/include/fsl_qbman_portal.h | 10 +++
drivers/bus/fslmc/qbman/qbman_portal.c | 6 ++
drivers/net/dpaa2/dpaa2_ethdev.c | 60 +++++++++++++++-
drivers/net/dpaa2/dpaa2_ethdev.h | 7 ++
7 files changed, 152 insertions(+), 25 deletions(-)
diff --git a/doc/guides/nics/dpaa2.rst b/doc/guides/nics/dpaa2.rst
index 2d70bd0ab9..47a52c9287 100644
--- a/doc/guides/nics/dpaa2.rst
+++ b/doc/guides/nics/dpaa2.rst
@@ -492,6 +492,16 @@ for details.
packets, so that user can check what is wrong with those packets.
e.g. ``fslmc:dpni.1,drv_error_queue=1``
+* Use dev arg option ``drv_rx_intr_holdoff_us=<uint32>`` to set the Rx queue
+ interrupt coalescing holdoff in microseconds (default 100). Only applies in
+ Rx queue interrupt mode.
+ e.g. ``fslmc:dpni.1,drv_rx_intr_holdoff_us=50``
+
+* Use dev arg option ``drv_rx_intr_threshold=<uint32>`` to set the Rx queue
+ interrupt coalescing frame threshold; 0 (default) means the DQRR ring depth
+ minus one.
+ e.g. ``fslmc:dpni.1,drv_rx_intr_threshold=4``
+
Enabling logs
-------------
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index e6b4e74b3b..c5525a94fa 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -206,12 +206,35 @@ dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
}
#endif /* RTE_EVENT_DPAA2 */
+/* holdoff (us) -> QBMan ITP units (256 cycles each), capped at the 12-bit field */
+RTE_EXPORT_INTERNAL_SYMBOL(dpaa2_dpio_holdoff_to_itp)
+int dpaa2_dpio_holdoff_to_itp(struct dpaa2_dpio_dev *dpio_dev, uint32_t holdoff_us)
+{
+ uint32_t qman_mhz = 0;
+ struct dpio_attr attr;
+ uint64_t itp;
+
+ if (dpio_get_attributes(dpio_dev->dpio, CMD_PRI_LOW, dpio_dev->token, &attr) == 0)
+ qman_mhz = attr.clk / 1000000;
+ itp = qman_mhz ? ((uint64_t)holdoff_us * qman_mhz) / 256 : 0xFF;
+ if (itp > 0xfff) /* 12-bit ITP field */
+ itp = 0xfff;
+
+ return (int)itp;
+}
+
+/* threshold: DQRR fill raising DQRI (< ring depth); timeout: holdoff in ITP units.
+ * Per-mode values from the caller (eventdev vs rx-queue intr); no env override.
+ * The DQRI config is portal-wide and this is idempotent: the first caller to
+ * arm a portal wins, a later caller's values are ignored (a portal normally
+ * serves a single mode).
+ */
RTE_EXPORT_INTERNAL_SYMBOL(dpaa2_dpio_intr_init)
-int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
+int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, int threshold,
+ int timeout, bool build_epoll)
{
- struct epoll_event epoll_ev;
int eventfd, dpio_epoll_fd, ret;
- int threshold = 0x3, timeout = 0xFF;
+ struct epoll_event epoll_ev;
if (dpio_dev->intr_enabled)
return 0;
@@ -222,12 +245,6 @@ int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
return -1;
}
- if (getenv("DPAA2_PORTAL_INTR_THRESHOLD"))
- threshold = atoi(getenv("DPAA2_PORTAL_INTR_THRESHOLD"));
-
- if (getenv("DPAA2_PORTAL_INTR_TIMEOUT"))
- sscanf(getenv("DPAA2_PORTAL_INTR_TIMEOUT"), "%x", &timeout);
-
qbman_swp_interrupt_set_trigger(dpio_dev->sw_portal,
QBMAN_SWP_INTERRUPT_DQRI);
qbman_swp_interrupt_clear_status(dpio_dev->sw_portal, 0xffffffff);
@@ -238,9 +255,9 @@ int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
dpio_dev->epoll_fd = -1;
/* The event PMD dequeues by sleeping on a private epoll instance owned
- * by the portal, so build it here. A caller that waits on another
- * epoll (the net rx-queue-interrupt path uses the application's) skips
- * this.
+ * by the portal, so build it here. The net rx-queue-interrupt path
+ * exposes the raw eventfd through the generic ethdev API and waits on
+ * the application's own epoll instead, so it skips this.
*/
if (build_epoll) {
dpio_epoll_fd = epoll_create(1);
@@ -269,11 +286,14 @@ int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
return 0;
}
-#ifdef RTE_EVENT_DPAA2
-static void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
+RTE_EXPORT_INTERNAL_SYMBOL(dpaa2_dpio_intr_deinit)
+void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
{
int ret;
+ if (!dpio_dev->intr_enabled)
+ return;
+
ret = rte_dpaa2_intr_disable(dpio_dev->intr_handle, 0);
if (ret)
DPAA2_BUS_ERR("DPIO interrupt disable failed");
@@ -284,7 +304,6 @@ static void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
}
dpio_dev->intr_enabled = 0;
}
-#endif
static int
dpaa2_configure_stashing(struct dpaa2_dpio_dev *dpio_dev, int cpu_id)
@@ -306,9 +325,18 @@ dpaa2_configure_stashing(struct dpaa2_dpio_dev *dpio_dev, int cpu_id)
}
#ifdef RTE_EVENT_DPAA2
- if (dpaa2_dpio_intr_init(dpio_dev, true)) {
- DPAA2_BUS_ERR("Interrupt registration failed for dpio");
- return -1;
+ {
+ int threshold = 3, timeout = 0xFF;
+
+ if (getenv("DPAA2_PORTAL_INTR_THRESHOLD"))
+ threshold = atoi(getenv("DPAA2_PORTAL_INTR_THRESHOLD"));
+ if (getenv("DPAA2_PORTAL_INTR_TIMEOUT"))
+ sscanf(getenv("DPAA2_PORTAL_INTR_TIMEOUT"), "%x", &timeout);
+
+ if (dpaa2_dpio_intr_init(dpio_dev, threshold, timeout, true)) {
+ DPAA2_BUS_ERR("Interrupt registration failed for dpio");
+ return -1;
+ }
}
dpaa2_affine_dpio_intr_to_respective_core(dpio_dev->hw_id, cpu_id);
#endif
@@ -319,9 +347,11 @@ dpaa2_configure_stashing(struct dpaa2_dpio_dev *dpio_dev, int cpu_id)
static void dpaa2_put_qbman_swp(struct dpaa2_dpio_dev *dpio_dev)
{
if (dpio_dev) {
-#ifdef RTE_EVENT_DPAA2
+ /* rx-queue interrupts (net PMD) can arm a portal without the
+ * event driver; tear it down unconditionally. Safe when never
+ * armed: intr_deinit returns early if intr is not enabled.
+ */
dpaa2_dpio_intr_deinit(dpio_dev);
-#endif
rte_atomic16_clear(&dpio_dev->ref_count);
}
}
@@ -512,6 +542,8 @@ dpaa2_create_dpio_device(int vdev_fd,
goto err;
}
+ DPAA2_BUS_DEBUG("QBMAN clk = %u Hz (%u MHz)", attr.clk, attr.clk / 1000000);
+
/* find the SoC type for the first time */
if (!dpaa2_svr_family) {
struct mc_soc_version mc_plat_info = {0};
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
index 10dd968e5f..090fa14410 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
@@ -50,9 +50,17 @@ int dpaa2_affine_qbman_swp(void);
__rte_internal
int dpaa2_affine_qbman_ethrx_swp(void);
-/* set up a DPIO portal's DQRI interrupt (rx-queue interrupt mode) */
+/* set up / tear down a DPIO portal's DQRI interrupt (rx-queue interrupt mode) */
__rte_internal
-int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll);
+int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, int threshold,
+ int timeout, bool build_epoll);
+
+__rte_internal
+void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev);
+
+/* convert a coalescing holdoff (microseconds) to QBMan ITP units */
+__rte_internal
+int dpaa2_dpio_holdoff_to_itp(struct dpaa2_dpio_dev *dpio_dev, uint32_t holdoff_us);
/* allocate memory for FQ - dq storage */
__rte_internal
diff --git a/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h b/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
index bb8bd86103..e9eda31927 100644
--- a/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
+++ b/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
@@ -157,6 +157,16 @@ uint32_t qbman_swp_intr_timeout_read_status(struct qbman_swp *p);
*/
void qbman_swp_intr_timeout_write(struct qbman_swp *p, uint32_t mask);
+/**
+ * qbman_swp_dqrr_size() - Get the HW DQRR ring depth of a software portal.
+ * @p: the given software portal object.
+ *
+ * Returns the number of DQRR entries (4 on QBMan < 4.1, 8 on >= 4.1). Useful
+ * as the upper bound for the DQRR interrupt coalescing threshold.
+ */
+__rte_internal
+uint8_t qbman_swp_dqrr_size(struct qbman_swp *p);
+
/**
* qbman_swp_interrupt_get_trigger() - Get the data in software portal
* interrupt enable register.
diff --git a/drivers/bus/fslmc/qbman/qbman_portal.c b/drivers/bus/fslmc/qbman/qbman_portal.c
index 947415363a..81c2d87e0a 100644
--- a/drivers/bus/fslmc/qbman/qbman_portal.c
+++ b/drivers/bus/fslmc/qbman/qbman_portal.c
@@ -433,6 +433,12 @@ void qbman_swp_intr_timeout_write(struct qbman_swp *p, uint32_t mask)
qbman_cinh_write(&p->sys, QBMAN_CINH_SWP_ITPR, mask);
}
+RTE_EXPORT_INTERNAL_SYMBOL(qbman_swp_dqrr_size)
+uint8_t qbman_swp_dqrr_size(struct qbman_swp *p)
+{
+ return p->dqrr.dqrr_size;
+}
+
uint32_t qbman_swp_interrupt_get_trigger(struct qbman_swp *p)
{
return qbman_cinh_read(&p->sys, QBMAN_CINH_SWP_IER);
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 61e7c820de..76e2df6167 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -36,6 +36,9 @@
#define DRIVER_ERROR_QUEUE "drv_err_queue"
#define DRIVER_NO_TAILDROP "drv_no_taildrop"
#define DRIVER_NO_DATA_STASHING "drv_no_data_stashing"
+#define DRIVER_RX_INTR_HOLDOFF_US "drv_rx_intr_holdoff_us"
+#define DPAA2_RX_INTR_HOLDOFF_US_DEF 100
+#define DRIVER_RX_INTR_THRESHOLD "drv_rx_intr_threshold"
#define CHECK_INTERVAL 100 /* 100ms */
#define MAX_REPEAT_TIME 90 /* 9s (90 * 100ms) in total */
@@ -2873,7 +2876,7 @@ dpaa2_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
struct dpaa2_dev_priv *priv = dev->data->dev_private;
struct dpaa2_queue *dpaa2_q = priv->rx_vq[queue_id];
struct dpaa2_dpio_dev *dpio, *old;
- int ret;
+ int ret, threshold, timeout, dqrr_max;
if (!dpaa2_q->napi_dpcon)
return -ENOTSUP; /* no channel -> caller keeps polling */
@@ -2882,10 +2885,22 @@ dpaa2_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
return -EIO;
dpio = DPAA2_PER_LCORE_ETHRX_DPIO;
+ /* threshold from drv_rx_intr_threshold (0 = ring-1), holdoff from
+ * drv_rx_intr_holdoff_us. idempotent: no-op if the dpio is already
+ * armed (e.g. event driver)
+ */
+ dqrr_max = qbman_swp_dqrr_size(dpio->sw_portal) - 1;
+ threshold = priv->rx_intr_threshold ? (int)priv->rx_intr_threshold : dqrr_max;
+ if (threshold < 1 || threshold > dqrr_max) {
+ DPAA2_PMD_WARN("drv_rx_intr_threshold %d out of [1, %d], clamping",
+ threshold, dqrr_max);
+ threshold = threshold < 1 ? 1 : dqrr_max;
+ }
+ timeout = dpaa2_dpio_holdoff_to_itp(dpio, priv->rx_intr_holdoff_us);
/* build_epoll=false: the generic ethdev rx-intr API waits on the
* application epoll, not the portal's private one (event PMD only).
*/
- ret = dpaa2_dpio_intr_init(dpio, false); /* VFIO eventfd, no MC */
+ ret = dpaa2_dpio_intr_init(dpio, threshold, timeout, false);
if (ret)
return ret;
@@ -3139,6 +3154,35 @@ dpaa2_get_devargs(struct rte_devargs *devargs, const char *key)
return 1;
}
+static int
+u32_devarg_handler(__rte_unused const char *key, const char *value, void *opaque)
+{
+ char *end;
+ unsigned long v = strtoul(value, &end, 0);
+
+ if (*value == '\0' || *end != '\0' || v > UINT32_MAX)
+ return -1;
+ *(uint32_t *)opaque = (uint32_t)v;
+
+ return 0;
+}
+
+/* Read a u32-valued devarg into *out, leaving *out untouched if absent. */
+static void
+dpaa2_get_devargs_u32(struct rte_devargs *devargs, const char *key, uint32_t *out)
+{
+ struct rte_kvargs *kvlist;
+
+ if (!devargs)
+ return;
+ kvlist = rte_kvargs_parse(devargs->args, NULL);
+ if (!kvlist)
+ return;
+ if (rte_kvargs_count(kvlist, key))
+ rte_kvargs_process(kvlist, key, u32_devarg_handler, out);
+ rte_kvargs_free(kvlist);
+}
+
static int
dpaa2_dev_init(struct rte_eth_dev *eth_dev)
{
@@ -3166,6 +3210,14 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
DPAA2_PMD_INFO("No RX prefetch mode");
}
+ priv->rx_intr_holdoff_us = DPAA2_RX_INTR_HOLDOFF_US_DEF;
+ dpaa2_get_devargs_u32(dev->devargs, DRIVER_RX_INTR_HOLDOFF_US,
+ &priv->rx_intr_holdoff_us);
+
+ priv->rx_intr_threshold = 0;
+ dpaa2_get_devargs_u32(dev->devargs, DRIVER_RX_INTR_THRESHOLD,
+ &priv->rx_intr_threshold);
+
if (dpaa2_get_devargs(dev->devargs, DRIVER_LOOPBACK_MODE)) {
priv->flags |= DPAA2_RX_LOOPBACK_MODE;
DPAA2_PMD_INFO("Rx loopback mode");
@@ -3681,5 +3733,7 @@ RTE_PMD_REGISTER_PARAM_STRING(NET_DPAA2_PMD_DRIVER_NAME,
DRIVER_RX_PARSE_ERR_DROP "=<int>"
DRIVER_ERROR_QUEUE "=<int>"
DRIVER_NO_TAILDROP "=<int>"
- DRIVER_NO_DATA_STASHING "=<int>");
+ DRIVER_NO_DATA_STASHING "=<int> "
+ DRIVER_RX_INTR_HOLDOFF_US "=<uint32> "
+ DRIVER_RX_INTR_THRESHOLD "=<uint32>");
RTE_LOG_REGISTER_DEFAULT(dpaa2_logtype_pmd, NOTICE);
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 3765f79e84..84785c0561 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -412,6 +412,13 @@ struct dpaa2_dev_priv {
uint8_t max_cgs;
uint8_t cgid_in_use[MAX_RX_QUEUES];
+ /* DQRI holdoff (us) for rx-queue interrupts (drv_rx_intr_holdoff_us) */
+ uint32_t rx_intr_holdoff_us;
+ /* DQRI threshold for rx-queue interrupts (drv_rx_intr_threshold);
+ * 0 = auto (DQRR ring depth - 1)
+ */
+ uint32_t rx_intr_threshold;
+
uint16_t dpni_ver_major;
uint16_t dpni_ver_minor;
uint32_t speed_capa;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 3/6] net/dpaa2: support Rx queue interrupts
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
Implement .rx_queue_intr_enable / .rx_queue_intr_disable so a worker
can sleep on a queue's data-availability notification instead of
busy-polling, through the generic rte_eth_dev_rx_intr_* API.
A worker wakes on its software portal's DQRI, which fires when the
portal's DQRR holds frames, so the Rx FQ must be scheduled to a channel
that portal dequeues. The natural dpni_set_queue with a notification
destination holds the global MC lock long enough to wedge the firmware
and must target a disabled dpni. But the polling portal is only known
once a worker affines, after dev_start, so the destination cannot be
the worker's portal.
Bind each Rx FQ to its own DPCON channel instead. The default Rx burst
pulls frames from the FQ with a volatile dequeue and cannot be
interrupt-driven; to wake on the DQRI the FQ must be pushed to the
portal's DQRR. dev_start issues the DEST_DPCON set_queue statically on
the still-disabled dpni with no knowledge of the polling lcore; a worker
later subscribes its own ethrx portal to the channel and arms the DQRI
in rx_queue_intr_enable (a one-shot per-portal MC op plus QBMan, never
the wedging set_queue).
This pushed/DQRR consumption is how the event PMD works, but the DPCON
use differs. The event PMD uses one DPCON per worker, concentrates N
FQs onto it, and lets the QBMan scheduler load-balance events across
cores. Here affinity is static and there is no scheduling, so each FQ
gets its own DPCON (one per FQ, more channels, drawn from the shared
pool that the DPCON move to the fslmc bus now feeds), bound once at
dev_start before the lcore is known. Frames are delivered by
rte_eth_rx_burst (dpaa2_dev_rx_dqrr), not as events via
rte_event_dequeue.
rte_eth_dev_rx_intr_enable(q) subscribes the lcore portal to q's DPCON
and arms the DQRI. rte_eth_dev_rx_intr_ctl_q(q) adds q's eventfd (the
portal DQRI fd) to the thread epoll.
wire
|
[ DPMAC ]
|
[ DPNI ] (1)
|
TC0: FQ0 FQ1 FQ2 FQ3 (2)
| | | | (3)
[DPCON][DPCON][DPCON][DPCON]
\ | | / (4)
[ DPIO A ] [ DPIO B ] (5)
| |
DQRR DQRR (6)
| |
DQRI DQRI (7)
| |
eventfd eventfd (8)
| |
rte_epoll_wait rte_epoll_wait (9)
| |
dpaa2_dev_rx_dqrr (10)
(1) WRIOP picks a TC (QoS), then RSS-hashes within the TC to an FQ
(2) FQ0..FQ3 are the rte_eth Rx queues
(3) dpni_set_queue(DEST_DPCON): one DPCON per FQ
(4) the lcore portal subscribes to its DPCONs (push_set)
(5) one QBMan software portal per lcore
(6) QMan pushes the FDs into the portal DQRR
(7) DQRI is raised when the DQRR is non-empty
(8) a portal's queues share one fd (its DQRI eventfd)
(9) worker sleeps here when all its queues are idle
(10) dpaa2_dev_rx_dqrr drains the DQRR, demuxes FDs to FQs by fqd_ctx
The DQRI and eventfd are portal-wide: a queue's eventfd is its portal's
DQRI fd, and the inhibit bit is refcounted by armed queues so disabling
one queue never masks a sibling. The static per-queue bind also lets a
queue be re-homed to another lcore at runtime, the new worker
reclaiming the channel, with no set_queue and no port stop.
On single-core 64-byte forwarding this interrupt path runs at ~5.0 Mpps
versus ~5.86 Mpps polling: per-frame DQRR demux and consume cost about
15 percent over the polling batch dequeue.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
doc/guides/nics/features/dpaa2.ini | 1 +
doc/guides/rel_notes/release_26_07.rst | 4 +
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 11 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpio.h | 4 +
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 27 +-
.../fslmc/qbman/include/fsl_qbman_portal.h | 1 +
drivers/bus/fslmc/qbman/qbman_portal.c | 1 +
drivers/net/dpaa2/dpaa2_ethdev.c | 291 +++++++++++++++++-
drivers/net/dpaa2/dpaa2_ethdev.h | 3 +
drivers/net/dpaa2/dpaa2_rxtx.c | 122 ++++++++
10 files changed, 459 insertions(+), 6 deletions(-)
diff --git a/doc/guides/nics/features/dpaa2.ini b/doc/guides/nics/features/dpaa2.ini
index 5f9c587847..fff313603f 100644
--- a/doc/guides/nics/features/dpaa2.ini
+++ b/doc/guides/nics/features/dpaa2.ini
@@ -7,6 +7,7 @@
Speed capabilities = Y
Link status = Y
Link status event = Y
+Rx interrupt = Y
Burst mode info = Y
Queue start/stop = Y
Scattered Rx = Y
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 5d7aa8d1bf..3bc49c3910 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -140,6 +140,10 @@ New Features
* Added support for selective Rx in scalar SPRQ Rx path.
+* **Updated NXP dpaa2 driver.**
+
+ * Added Rx queue interrupt support.
+
* **Updated PCAP ethernet driver.**
* Added support for VLAN insertion and stripping.
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 3a5abb2e6d..e6b4e74b3b 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -204,13 +204,18 @@ dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
fclose(file);
}
+#endif /* RTE_EVENT_DPAA2 */
-static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
+RTE_EXPORT_INTERNAL_SYMBOL(dpaa2_dpio_intr_init)
+int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
{
struct epoll_event epoll_ev;
int eventfd, dpio_epoll_fd, ret;
int threshold = 0x3, timeout = 0xFF;
+ if (dpio_dev->intr_enabled)
+ return 0;
+
ret = rte_dpaa2_intr_enable(dpio_dev->intr_handle, 0);
if (ret) {
DPAA2_BUS_ERR("Interrupt registration failed");
@@ -259,9 +264,12 @@ static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epol
dpio_dev->epoll_fd = dpio_epoll_fd;
}
+ dpio_dev->intr_enabled = 1;
+
return 0;
}
+#ifdef RTE_EVENT_DPAA2
static void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
{
int ret;
@@ -274,6 +282,7 @@ static void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
close(dpio_dev->epoll_fd);
dpio_dev->epoll_fd = -1;
}
+ dpio_dev->intr_enabled = 0;
}
#endif
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
index 328e1e788a..10dd968e5f 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
@@ -50,6 +50,10 @@ int dpaa2_affine_qbman_swp(void);
__rte_internal
int dpaa2_affine_qbman_ethrx_swp(void);
+/* set up a DPIO portal's DQRI interrupt (rx-queue interrupt mode) */
+__rte_internal
+int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll);
+
/* allocate memory for FQ - dq storage */
__rte_internal
int
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 79a2ec41e3..af75e96b27 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -133,6 +133,8 @@ struct dpaa2_dpio_dev {
struct rte_intr_handle *intr_handle; /* Interrupt related info */
int32_t epoll_fd; /**< File descriptor created for interrupt polling */
int32_t hw_id; /**< An unique ID of this DPIO device instance */
+ uint8_t intr_enabled; /**< DQRI portal interrupt already set up */
+ uint16_t ethrx_intr_refcnt; /**< rx queues currently armed on this portal */
struct dpaa2_portal_dqrr dpaa2_held_bufs;
};
@@ -164,6 +166,20 @@ typedef void (dpaa2_queue_cb_dqrr_t)(struct qbman_swp *swp,
typedef void (dpaa2_queue_cb_eqresp_free_t)(uint16_t eqresp_ci,
struct dpaa2_queue *dpaa2_q);
+#define DPAA2_NAPI_FD_STASH_SIZE 64 /*!< power of 2; >= 2x rx burst so the
+ * peer port's frames fit before HW
+ * backpressure (2 ports/worker)
+ */
+
+/* Lcore-local FIFO of raw FDs demuxed to this queue by another queue's burst
+ * on the same portal (see dpaa2_queue::napi_stash).
+ */
+struct dpaa2_napi_stash {
+ uint16_t head; /*!< pop index (drain) */
+ uint16_t tail; /*!< push index (park) */
+ struct qbman_fd fd[DPAA2_NAPI_FD_STASH_SIZE];
+};
+
struct __rte_cache_aligned dpaa2_queue {
struct rte_mempool *mb_pool; /**< mbuf pool to populate RX ring. */
union {
@@ -176,7 +192,7 @@ struct __rte_cache_aligned dpaa2_queue {
uint8_t cgid; /*! < Congestion Group id for this queue */
uint64_t rx_pkts;
uint64_t tx_pkts;
- uint64_t err_pkts;
+ uint64_t err_pkts; /*!< also counts NAPI stash-full drops (imissed) */
union {
/**Ingress*/
struct queue_storage_info_t *q_storage[RTE_MAX_LCORE];
@@ -195,6 +211,15 @@ struct __rte_cache_aligned dpaa2_queue {
uint64_t offloads;
uint64_t lpbk_cntx;
uint8_t data_stashing_off;
+ /* NAPI rx-interrupt: per-queue DPCON bound to this FQ at dev_start
+ * (DEST_DPCON, static); the polling worker subscribes its ethrx portal
+ * to the channel and arms the DQRI, rx_dqrr drains+demuxes by fqd_ctx.
+ */
+ struct dpaa2_dpcon_dev *napi_dpcon; /*!< notif channel, NULL = napi off */
+ RTE_ATOMIC(struct dpaa2_dpio_dev *) napi_sub_dpio; /*!< subscribed portal or NULL */
+ uint8_t napi_channel_index; /*!< portal-local static-dequeue idx */
+ uint8_t napi_armed; /*!< this queue requests DQRI wakeups */
+ struct dpaa2_napi_stash napi_stash; /*!< NAPI/DQRR demux FDs (~2 KB) */
};
struct swp_active_dqs {
diff --git a/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h b/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
index 5375ea386d..bb8bd86103 100644
--- a/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
+++ b/drivers/bus/fslmc/qbman/include/fsl_qbman_portal.h
@@ -189,6 +189,7 @@ int qbman_swp_interrupt_get_inhibit(struct qbman_swp *p);
* @p: the given software portal object.
* @mask: The value to set in SWP_IIR register.
*/
+__rte_internal
void qbman_swp_interrupt_set_inhibit(struct qbman_swp *p, int inhibit);
/************/
diff --git a/drivers/bus/fslmc/qbman/qbman_portal.c b/drivers/bus/fslmc/qbman/qbman_portal.c
index 84853924e7..947415363a 100644
--- a/drivers/bus/fslmc/qbman/qbman_portal.c
+++ b/drivers/bus/fslmc/qbman/qbman_portal.c
@@ -448,6 +448,7 @@ int qbman_swp_interrupt_get_inhibit(struct qbman_swp *p)
return qbman_cinh_read(&p->sys, QBMAN_CINH_SWP_IIR);
}
+RTE_EXPORT_INTERNAL_SYMBOL(qbman_swp_interrupt_set_inhibit)
void qbman_swp_interrupt_set_inhibit(struct qbman_swp *p, int inhibit)
{
qbman_cinh_write(&p->sys, QBMAN_CINH_SWP_IIR,
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 803a8321e0..61e7c820de 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -623,6 +623,8 @@ dpaa2_clear_queue_active_dps(struct dpaa2_queue *q, int num_lcores)
}
}
+static void dpaa2_dev_rx_queue_intr_unbind(struct dpaa2_queue *dpaa2_q);
+
static void
dpaa2_free_rx_tx_queues(struct rte_eth_dev *dev)
{
@@ -640,6 +642,12 @@ dpaa2_free_rx_tx_queues(struct rte_eth_dev *dev)
/* cleaning up queue storage */
for (i = 0; i < priv->nb_rx_queues; i++) {
dpaa2_q = priv->rx_vq[i];
+ if (dpaa2_q->napi_dpcon) { /* release the rx-intr channel */
+ dpaa2_dev_rx_queue_intr_unbind(dpaa2_q);
+ rte_dpaa2_free_dpcon_dev(dpaa2_q->napi_dpcon);
+ dpaa2_q->napi_dpcon = NULL;
+ dpaa2_q->napi_sub_dpio = NULL;
+ }
dpaa2_clear_queue_active_dps(dpaa2_q,
RTE_MAX_LCORE);
dpaa2_queue_storage_free(dpaa2_q,
@@ -845,6 +853,19 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
}
}
+ if (dev->data->dev_conf.intr_conf.rxq) {
+ if (!dev->intr_handle)
+ dev->intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+ if (!dev->intr_handle ||
+ rte_intr_vec_list_alloc(dev->intr_handle, "rxq_intr",
+ dev->data->nb_rx_queues) ||
+ rte_intr_nb_efd_set(dev->intr_handle, dev->data->nb_rx_queues) ||
+ rte_intr_type_set(dev->intr_handle, RTE_INTR_HANDLE_EXT)) {
+ DPAA2_PMD_ERR("Failed to set up rx-queue interrupts");
+ return -rte_errno;
+ }
+ }
+
dpaa2_tm_init(dev);
return 0;
@@ -863,6 +884,7 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
{
struct dpaa2_dev_priv *priv = dev->data->dev_private;
struct fsl_mc_io *dpni = dev->process_private;
+ bool dpcon_allocated = false;
struct dpaa2_queue *dpaa2_q;
struct dpni_queue cfg;
uint8_t options = 0;
@@ -903,6 +925,21 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
dpaa2_q->bp_array = rte_dpaa2_bpid_info;
dpaa2_q->offloads = rx_conf->offloads;
+ /* NAPI: grab a DPCON channel so dev_start can bind this FQ statically.
+ * The DQRR burst replaces the poll path for every queue at once, so a
+ * missing channel is fatal rather than a silent per-queue fallback.
+ */
+ dpaa2_q->napi_sub_dpio = NULL;
+ if (dev->data->dev_conf.intr_conf.rxq && !dpaa2_q->napi_dpcon) {
+ dpaa2_q->napi_dpcon = rte_dpaa2_alloc_dpcon_dev();
+ if (!dpaa2_q->napi_dpcon) {
+ DPAA2_PMD_ERR("rxq %d: no DPCON for rx-queue interrupts",
+ rx_queue_id);
+ return -ENODEV;
+ }
+ dpcon_allocated = true;
+ }
+
/*Get the flow id from given VQ id*/
flow_id = dpaa2_q->flow_id;
memset(&cfg, 0, sizeof(struct dpni_queue));
@@ -910,6 +947,10 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
options = options | DPNI_QUEUE_OPT_USER_CTX;
cfg.user_context = (size_t)(dpaa2_q);
+ /* clear any stale DPIO dest left scheduled by a prior rx-intr run */
+ options |= DPNI_QUEUE_OPT_DEST;
+ cfg.destination.type = DPNI_DEST_NONE;
+
/* check if a private cgr available. */
for (i = 0; i < priv->max_cgs; i++) {
if (!priv->cgid_in_use[i]) {
@@ -950,7 +991,7 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
dpaa2_q->tc_index, flow_id, options, &cfg);
if (ret) {
DPAA2_PMD_ERR("Error in setting the rx flow: = %d", ret);
- return ret;
+ goto err_free_dpcon;
}
dpaa2_q->nb_desc = nb_rx_desc;
@@ -991,7 +1032,7 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
if (ret) {
DPAA2_PMD_ERR("Error in setting taildrop. err=(%d)",
ret);
- return ret;
+ goto err_free_dpcon;
}
} else { /* Disable tail Drop */
struct dpni_taildrop taildrop = {0};
@@ -1011,12 +1052,22 @@ dpaa2_dev_rx_queue_setup(struct rte_eth_dev *dev,
if (ret) {
DPAA2_PMD_ERR("Error in setting taildrop. err=(%d)",
ret);
- return ret;
+ goto err_free_dpcon;
}
}
dev->data->rx_queues[rx_queue_id] = dpaa2_q;
return 0;
+
+err_free_dpcon:
+ /* free only the DPCON this call allocated; a pre-existing one belongs to
+ * an earlier setup and is released at dev_close
+ */
+ if (dpcon_allocated) {
+ rte_dpaa2_free_dpcon_dev(dpaa2_q->napi_dpcon);
+ dpaa2_q->napi_dpcon = NULL;
+ }
+ return ret;
}
static int
@@ -1175,6 +1226,62 @@ dpaa2_dev_tx_queue_setup(struct rte_eth_dev *dev,
return 0;
}
+/* Fully release a queue's rx-interrupt state: detach the FQ from its DPCON,
+ * unbind the static dequeue channel from the portal and free any stashed FDs.
+ * Teardown only: the port is stopped and the portal quiesced; not a runtime
+ * rx_queue_intr_disable() replacement. Call before freeing the DPCON.
+ */
+static void
+dpaa2_dev_rx_queue_intr_unbind(struct dpaa2_queue *dpaa2_q)
+{
+ struct dpaa2_dev_priv *priv;
+ struct dpaa2_dpio_dev *dpio;
+ struct fsl_mc_io *dpni;
+ struct dpni_queue cfg;
+ int ret;
+
+ if (!dpaa2_q || !dpaa2_q->napi_dpcon)
+ return;
+
+ /* detach the FQ from its DPCON so it no longer points at a channel
+ * about to be returned to the pool (dpni is disabled at teardown)
+ */
+ priv = dpaa2_q->eth_data->dev_private;
+ dpni = priv->eth_dev->process_private;
+ memset(&cfg, 0, sizeof(cfg));
+ cfg.destination.type = DPNI_DEST_NONE;
+ ret = dpni_set_queue(dpni, CMD_PRI_LOW, priv->token, DPNI_QUEUE_RX,
+ dpaa2_q->tc_index, dpaa2_q->flow_id,
+ DPNI_QUEUE_OPT_DEST, &cfg);
+ if (ret)
+ DPAA2_PMD_ERR("napi: DEST_NONE rxq flow %u: %d",
+ dpaa2_q->flow_id, ret);
+
+ /* unbind the static dequeue channel from the portal it was armed on */
+ dpio = rte_atomic_load_explicit(&dpaa2_q->napi_sub_dpio,
+ rte_memory_order_acquire);
+ if (dpio) {
+ qbman_swp_push_set(dpio->sw_portal,
+ dpaa2_q->napi_channel_index, 0);
+ if (dpaa2_q->napi_armed) {
+ dpaa2_q->napi_armed = 0;
+ if (dpio->ethrx_intr_refcnt > 0 &&
+ --dpio->ethrx_intr_refcnt == 0)
+ qbman_swp_interrupt_set_inhibit(dpio->sw_portal, 1);
+ }
+ ret = dpio_remove_static_dequeue_channel(dpio->dpio, CMD_PRI_LOW,
+ dpio->token, dpaa2_q->napi_dpcon->dpcon_id);
+ if (ret)
+ DPAA2_PMD_ERR("napi: remove DPCON %d static dequeue channel: %d",
+ dpaa2_q->napi_dpcon->dpcon_id, ret);
+ rte_atomic_store_explicit(&dpaa2_q->napi_sub_dpio, NULL,
+ rte_memory_order_release);
+ }
+
+ /* free FDs parked for this queue but never drained by a burst */
+ dpaa2_dev_rx_queue_napi_stash_drain(dpaa2_q);
+}
+
static void
dpaa2_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t rx_queue_id)
{
@@ -1204,6 +1311,12 @@ dpaa2_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t rx_queue_id)
priv->cgid_in_use[dpaa2_q->cgid] = 0;
dpaa2_q->cgid = DPAA2_INVALID_CGID;
}
+
+ if (dpaa2_q->napi_dpcon) {
+ dpaa2_dev_rx_queue_intr_unbind(dpaa2_q);
+ rte_dpaa2_free_dpcon_dev(dpaa2_q->napi_dpcon);
+ dpaa2_q->napi_dpcon = NULL;
+ }
}
static int
@@ -1354,6 +1467,36 @@ dpaa2_dev_start(struct rte_eth_dev *dev)
intr_handle = dpaa2_dev->intr_handle;
PMD_INIT_FUNC_TRACE();
+
+ /* NAPI: bind each rx FQ to its own DPCON channel while the dpni is still
+ * disabled (a DEST set_queue on an enabled dpni wedges the shared MC).
+ * Static, affinity-free; the polling worker subscribes its portal later.
+ */
+ if (dev->data->dev_conf.intr_conf.rxq) {
+ for (i = 0; i < data->nb_rx_queues; i++) {
+ dpaa2_q = data->rx_queues[i];
+ if (!dpaa2_q->napi_dpcon)
+ continue;
+ memset(&cfg, 0, sizeof(cfg));
+ cfg.destination.type = DPNI_DEST_DPCON;
+ cfg.destination.id = dpaa2_q->napi_dpcon->dpcon_id;
+ cfg.user_context = (size_t)dpaa2_q;
+ ret = dpni_set_queue(dpni, CMD_PRI_LOW, priv->token,
+ DPNI_QUEUE_RX, dpaa2_q->tc_index,
+ dpaa2_q->flow_id,
+ DPNI_QUEUE_OPT_DEST | DPNI_QUEUE_OPT_USER_CTX,
+ &cfg);
+ if (ret) {
+ DPAA2_PMD_ERR("napi: DPCON bind rxq %d: %d", i, ret);
+ return ret;
+ }
+ }
+ /* DQRR burst for all queues; a queue only yields frames once
+ * rx_queue_intr_enable() has subscribed its portal
+ */
+ dev->rx_pkt_burst = dpaa2_dev_rx_dqrr;
+ }
+
ret = dpni_enable(dpni, CMD_PRI_LOW, priv->token);
if (ret) {
DPAA2_PMD_ERR("Failure in enabling dpni %d device: err=%d",
@@ -1824,6 +1967,13 @@ dpaa2_dev_stats_get(struct rte_eth_dev *dev,
stats->oerrors = value.page_2.egress_discarded_frames;
stats->imissed = value.page_2.ingress_nobuffer_discards;
+ /* software Rx drops (full napi stash) are not in the HW counters */
+ for (i = 0; i < priv->nb_rx_queues; i++) {
+ dpaa2_rxq = priv->rx_vq[i];
+ if (dpaa2_rxq != NULL)
+ stats->imissed += dpaa2_rxq->err_pkts;
+ }
+
/* Fill in per queue stats */
if (qstats != NULL) {
for (i = 0; (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) &&
@@ -2137,8 +2287,10 @@ dpaa2_dev_stats_reset(struct rte_eth_dev *dev)
/* Reset the per queue stats in dpaa2_queue structure */
for (i = 0; i < priv->nb_rx_queues; i++) {
dpaa2_q = priv->rx_vq[i];
- if (dpaa2_q)
+ if (dpaa2_q) {
dpaa2_q->rx_pkts = 0;
+ dpaa2_q->err_pkts = 0;
+ }
}
for (i = 0; i < priv->nb_tx_queues; i++) {
@@ -2698,6 +2850,135 @@ rte_pmd_dpaa2_thread_init(void)
}
}
+/* Arm rx-queue interrupts on the worker lcore: subscribe its ethrx portal to
+ * the queue's DPCON channel (one-shot per-portal MC) and unmask the portal DQRI
+ * (pure QBMan).
+ *
+ * Affinity is static queue-to-lcore; a lcore may own several rx queues. The
+ * DQRI and the eventfd are portal-wide, so frames are demuxed by fqd_ctx in the
+ * burst and the portal's inhibit bit is reference-counted by the number of its
+ * queues currently armed (ethrx_intr_refcnt) -- disabling one queue must not
+ * mask wakeups still wanted by its siblings. napi_armed and ethrx_intr_refcnt
+ * are plain (not atomic): these ops run on the queue's owner lcore against its
+ * own portal (one portal per lcore), so per-portal isolation keeps them from
+ * racing, not control-plane serialization.
+ *
+ * A re-home reclaims the channel by poking the old portal, so the caller must
+ * have quiesced the previous owner and disabled the queue there; napi_armed is
+ * then 0 and only the new portal is counted.
+ */
+static int
+dpaa2_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ struct dpaa2_dev_priv *priv = dev->data->dev_private;
+ struct dpaa2_queue *dpaa2_q = priv->rx_vq[queue_id];
+ struct dpaa2_dpio_dev *dpio, *old;
+ int ret;
+
+ if (!dpaa2_q->napi_dpcon)
+ return -ENOTSUP; /* no channel -> caller keeps polling */
+
+ if (dpaa2_affine_qbman_ethrx_swp())
+ return -EIO;
+ dpio = DPAA2_PER_LCORE_ETHRX_DPIO;
+
+ /* build_epoll=false: the generic ethdev rx-intr API waits on the
+ * application epoll, not the portal's private one (event PMD only).
+ */
+ ret = dpaa2_dpio_intr_init(dpio, false); /* VFIO eventfd, no MC */
+ if (ret)
+ return ret;
+
+ old = rte_atomic_load_explicit(&dpaa2_q->napi_sub_dpio, rte_memory_order_acquire);
+ if (old && old != dpio && dpaa2_q->napi_armed) {
+ DPAA2_PMD_ERR("rxq %d still armed on another portal; disable it first",
+ queue_id);
+ return -EBUSY;
+ }
+ if (old != dpio) {
+ if (old) { /* reclaim from old portal (quiesced; QBMan MMIO unsynced) */
+ qbman_swp_push_set(old->sw_portal,
+ dpaa2_q->napi_channel_index, 0);
+ ret = dpio_remove_static_dequeue_channel(old->dpio,
+ CMD_PRI_LOW, old->token,
+ dpaa2_q->napi_dpcon->dpcon_id);
+ /* push_set(0) above already stops the old portal from
+ * dequeuing; a failed unbind only leaks a static-channel
+ * slot on the old DPIO, so warn and proceed
+ */
+ if (ret)
+ DPAA2_PMD_WARN("napi: reclaim rxq %d: %d",
+ queue_id, ret);
+ /* on no portal until the add below succeeds */
+ rte_atomic_store_explicit(&dpaa2_q->napi_sub_dpio, NULL,
+ rte_memory_order_release);
+ }
+ ret = dpio_add_static_dequeue_channel(dpio->dpio, CMD_PRI_LOW,
+ dpio->token, dpaa2_q->napi_dpcon->dpcon_id,
+ &dpaa2_q->napi_channel_index);
+ if (ret) {
+ DPAA2_PMD_ERR("napi: subscribe rxq %d: %d", queue_id, ret);
+ return ret;
+ }
+ qbman_swp_push_set(dpio->sw_portal,
+ dpaa2_q->napi_channel_index, 1);
+ /* point this queue's eventfd at the portal's DQRI fd so the
+ * generic rte_eth_dev_rx_intr_ctl_q epoll wakes on it
+ */
+ if (rte_intr_vec_list_index_set(dev->intr_handle, queue_id, queue_id) ||
+ rte_intr_efds_index_set(dev->intr_handle, queue_id,
+ rte_intr_fd_get(dpio->intr_handle))) {
+ DPAA2_PMD_ERR("napi: efd wiring rxq %d", queue_id);
+ /* unwind the half-done subscription so HW and driver
+ * state stay consistent
+ */
+ qbman_swp_push_set(dpio->sw_portal,
+ dpaa2_q->napi_channel_index, 0);
+ dpio_remove_static_dequeue_channel(dpio->dpio,
+ CMD_PRI_LOW, dpio->token,
+ dpaa2_q->napi_dpcon->dpcon_id);
+ return -EIO;
+ }
+ rte_atomic_store_explicit(&dpaa2_q->napi_sub_dpio, dpio, rte_memory_order_release);
+ }
+
+ /* arm this queue; the portal DQRI is unmasked only on the 0 -> 1 edge
+ * of its armed-queue count
+ */
+ if (!dpaa2_q->napi_armed) {
+ dpaa2_q->napi_armed = 1;
+ if (dpio->ethrx_intr_refcnt++ == 0) {
+ qbman_swp_interrupt_clear_status(dpio->sw_portal,
+ 0xffffffff);
+ qbman_swp_interrupt_set_inhibit(dpio->sw_portal, 0);
+ }
+ }
+
+ return 0;
+}
+
+/* Disarm rx-queue interrupts for this queue. The portal DQRI is masked only
+ * once the last of its queues disarms; act on the portal the queue is actually
+ * subscribed to, not the caller's current portal.
+ */
+static int
+dpaa2_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ struct dpaa2_dev_priv *priv = dev->data->dev_private;
+ struct dpaa2_queue *dpaa2_q = priv->rx_vq[queue_id];
+ struct dpaa2_dpio_dev *dpio;
+
+ dpio = rte_atomic_load_explicit(&dpaa2_q->napi_sub_dpio, rte_memory_order_acquire);
+ if (dpio && dpaa2_q->napi_armed) {
+ dpaa2_q->napi_armed = 0;
+ if (dpio->ethrx_intr_refcnt > 0 &&
+ --dpio->ethrx_intr_refcnt == 0)
+ qbman_swp_interrupt_set_inhibit(dpio->sw_portal, 1);
+ }
+
+ return 0;
+}
+
static struct eth_dev_ops dpaa2_ethdev_ops = {
.dev_configure = dpaa2_eth_dev_configure,
.dev_start = dpaa2_dev_start,
@@ -2726,6 +3007,8 @@ static struct eth_dev_ops dpaa2_ethdev_ops = {
.vlan_tpid_set = dpaa2_vlan_tpid_set,
.rx_queue_setup = dpaa2_dev_rx_queue_setup,
.rx_queue_release = dpaa2_dev_rx_queue_release,
+ .rx_queue_intr_enable = dpaa2_dev_rx_queue_intr_enable,
+ .rx_queue_intr_disable = dpaa2_dev_rx_queue_intr_disable,
.tx_queue_setup = dpaa2_dev_tx_queue_setup,
.rx_burst_mode_get = dpaa2_dev_rx_burst_mode_get,
.tx_burst_mode_get = dpaa2_dev_tx_burst_mode_get,
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 4da47a543a..3765f79e84 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -491,6 +491,9 @@ uint16_t dpaa2_dev_loopback_rx(void *queue, struct rte_mbuf **bufs,
uint16_t dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs,
uint16_t nb_pkts);
+uint16_t dpaa2_dev_rx_dqrr(void *queue, struct rte_mbuf **bufs,
+ uint16_t nb_pkts);
+void dpaa2_dev_rx_queue_napi_stash_drain(struct dpaa2_queue *dpaa2_q);
void dpaa2_dev_process_parallel_event(struct qbman_swp *swp,
const struct qbman_fd *fd,
const struct qbman_result *dq,
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index b316e23e87..189accc1de 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -922,6 +922,128 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
return num_rx;
}
+/* Convert a DQRR'd FD (single or scatter-gather) to an mbuf and apply software
+ * VLAN strip, like the poll path.
+ */
+static inline struct rte_mbuf *
+dpaa2_dqrr_fd_to_mbuf(const struct qbman_fd *fd,
+ struct rte_eth_dev_data *eth_data)
+{
+ struct rte_mbuf *m;
+
+ if (unlikely(DPAA2_FD_GET_FORMAT(fd) == qbman_fd_sg))
+ m = eth_sg_fd_to_mbuf(fd, eth_data->port_id);
+ else
+ m = eth_fd_to_mbuf(fd, eth_data->port_id);
+ if (eth_data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_VLAN_STRIP)
+ rte_vlan_strip(m);
+ return m;
+}
+
+/* prefetch a DQRR'd FD's HW annotation (parse area) ahead of conversion */
+static inline void
+dpaa2_dqrr_prefetch_annot(const struct qbman_fd *fd)
+{
+ rte_prefetch0((void *)((size_t)DPAA2_IOVA_TO_VADDR(DPAA2_GET_FD_ADDR(fd))
+ + DPAA2_FD_PTA_SIZE));
+}
+
+/* Free FDs a sibling burst parked in this queue's stash but that were never
+ * drained (queue released/freed while the lcore still held its frames).
+ */
+void
+dpaa2_dev_rx_queue_napi_stash_drain(struct dpaa2_queue *dpaa2_q)
+{
+ struct dpaa2_napi_stash *stash = &dpaa2_q->napi_stash;
+ const struct qbman_fd *fd;
+
+ while (stash->head != stash->tail) {
+ fd = &stash->fd[stash->head & (DPAA2_NAPI_FD_STASH_SIZE - 1)];
+ rte_pktmbuf_free(dpaa2_dqrr_fd_to_mbuf(fd, dpaa2_q->eth_data));
+ stash->head++;
+ }
+ stash->head = 0;
+ stash->tail = 0;
+}
+
+/* rx interrupt/DQRR path: the FQ is scheduled to a channel the lcore's ethrx
+ * portal statically dequeues -- a VDQ on a scheduled FQ never completes, so DQRR
+ * is the only model compatible with interrupt sleep. One portal serves every
+ * queue the lcore owns, so the burst demuxes by fqd_ctx: own frames are
+ * returned, foreign ones have their raw FD parked in the target queue's stash.
+ *
+ * The application must therefore poll all queues assigned to the lcore after a
+ * wakeup -- the same scheduling contract as plain DPDK polling. When a foreign
+ * queue's stash is full the FD is dropped (freed) rather than left on the shared
+ * DQRR ring, which would head-of-line block every other queue on the portal.
+ */
+uint16_t __rte_hot
+dpaa2_dev_rx_dqrr(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+ struct dpaa2_queue *dpaa2_q = queue;
+ struct rte_eth_dev_data *eth_data = dpaa2_q->eth_data;
+ struct dpaa2_napi_stash *stash = &dpaa2_q->napi_stash;
+ const struct qbman_result *dq;
+ const struct qbman_fd *fd;
+ struct dpaa2_queue *rxq;
+ struct qbman_swp *swp;
+ uint16_t num_rx = 0;
+
+ if (unlikely(!DPAA2_PER_LCORE_ETHRX_DPIO)) {
+ if (dpaa2_affine_qbman_ethrx_swp()) {
+ DPAA2_PMD_ERR("Failure in affining portal");
+ return 0;
+ }
+ }
+ swp = DPAA2_PER_LCORE_ETHRX_PORTAL;
+
+ /* our frames parked by another queue's burst -- convert now (hot) */
+ while (num_rx < nb_pkts && stash->head != stash->tail) {
+ fd = &stash->fd[stash->head & (DPAA2_NAPI_FD_STASH_SIZE - 1)];
+ if (dpaa2_svr_family != SVR_LX2160A &&
+ (uint16_t)(stash->head + 1) != stash->tail)
+ dpaa2_dqrr_prefetch_annot(&stash->fd[(stash->head + 1) &
+ (DPAA2_NAPI_FD_STASH_SIZE - 1)]);
+ bufs[num_rx++] = dpaa2_dqrr_fd_to_mbuf(fd, eth_data);
+ stash->head++;
+ }
+
+ while (num_rx < nb_pkts) {
+ dq = qbman_swp_dqrr_next(swp);
+ if (!dq)
+ break; /* ring momentarily empty */
+ qbman_swp_prefetch_dqrr_next(swp);
+ fd = qbman_result_DQ_fd(dq);
+ /* parse summary is in the FRC on LX2160A; annotation is HW-stashed */
+ if (dpaa2_svr_family != SVR_LX2160A)
+ dpaa2_dqrr_prefetch_annot(fd);
+ rxq = (struct dpaa2_queue *)(size_t)qbman_result_DQ_fqd_ctx(dq);
+ if (unlikely(!rxq))
+ rxq = dpaa2_q;
+ if (rxq == dpaa2_q) {
+ bufs[num_rx++] = dpaa2_dqrr_fd_to_mbuf(fd, eth_data);
+ } else {
+ struct dpaa2_napi_stash *fs = &rxq->napi_stash;
+
+ if (unlikely((uint16_t)(fs->tail - fs->head) >=
+ DPAA2_NAPI_FD_STASH_SIZE)) {
+ /* stash full: drop rather than leave it on the ring
+ * and head-of-line block the shared portal
+ */
+ rte_pktmbuf_free(dpaa2_dqrr_fd_to_mbuf(fd, rxq->eth_data));
+ rxq->err_pkts++;
+ } else {
+ fs->fd[fs->tail & (DPAA2_NAPI_FD_STASH_SIZE - 1)] = *fd;
+ fs->tail++;
+ }
+ }
+ qbman_swp_dqrr_consume(swp, dq);
+ }
+
+ dpaa2_q->rx_pkts += num_rx;
+ return num_rx;
+}
+
void __rte_hot
dpaa2_dev_process_parallel_event(struct qbman_swp *swp,
const struct qbman_fd *fd,
--
2.43.0
^ permalink raw reply related
* [PATCH v2 2/6] bus/fslmc/dpio: make the portal DQRI epoll optional
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
dpaa2_dpio_intr_init() builds a private epoll instance the event PMD
sleeps on. The upcoming net rx-queue-interrupt path waits on the
application's own epoll instead, so that instance would be built but
never used.
Add a build_epoll parameter: pass true to build it (event PMD), false
to skip the epoll_create/epoll_ctl. epoll_fd is set to -1 when none is
built and closed in intr_deinit only when valid. The sole caller passes
true: no functional change.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 44 +++++++++++++++++-------
1 file changed, 32 insertions(+), 12 deletions(-)
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 2a9e519668..3a5abb2e6d 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -205,13 +205,12 @@ dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
fclose(file);
}
-static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev)
+static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev, bool build_epoll)
{
struct epoll_event epoll_ev;
int eventfd, dpio_epoll_fd, ret;
int threshold = 0x3, timeout = 0xFF;
- dpio_epoll_fd = epoll_create(1);
ret = rte_dpaa2_intr_enable(dpio_dev->intr_handle, 0);
if (ret) {
DPAA2_BUS_ERR("Interrupt registration failed");
@@ -231,16 +230,34 @@ static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev)
qbman_swp_dqrr_thrshld_write(dpio_dev->sw_portal, threshold);
qbman_swp_intr_timeout_write(dpio_dev->sw_portal, timeout);
- eventfd = rte_intr_fd_get(dpio_dev->intr_handle);
- epoll_ev.events = EPOLLIN | EPOLLPRI | EPOLLET;
- epoll_ev.data.fd = eventfd;
+ dpio_dev->epoll_fd = -1;
- ret = epoll_ctl(dpio_epoll_fd, EPOLL_CTL_ADD, eventfd, &epoll_ev);
- if (ret < 0) {
- DPAA2_BUS_ERR("epoll_ctl failed");
- return -1;
+ /* The event PMD dequeues by sleeping on a private epoll instance owned
+ * by the portal, so build it here. A caller that waits on another
+ * epoll (the net rx-queue-interrupt path uses the application's) skips
+ * this.
+ */
+ if (build_epoll) {
+ dpio_epoll_fd = epoll_create(1);
+ if (dpio_epoll_fd < 0) {
+ DPAA2_BUS_ERR("epoll_create failed");
+ rte_dpaa2_intr_disable(dpio_dev->intr_handle, 0);
+ return -1;
+ }
+
+ eventfd = rte_intr_fd_get(dpio_dev->intr_handle);
+ epoll_ev.events = EPOLLIN | EPOLLPRI | EPOLLET;
+ epoll_ev.data.fd = eventfd;
+
+ ret = epoll_ctl(dpio_epoll_fd, EPOLL_CTL_ADD, eventfd, &epoll_ev);
+ if (ret < 0) {
+ DPAA2_BUS_ERR("epoll_ctl failed");
+ rte_dpaa2_intr_disable(dpio_dev->intr_handle, 0);
+ close(dpio_epoll_fd);
+ return -1;
+ }
+ dpio_dev->epoll_fd = dpio_epoll_fd;
}
- dpio_dev->epoll_fd = dpio_epoll_fd;
return 0;
}
@@ -253,7 +270,10 @@ static void dpaa2_dpio_intr_deinit(struct dpaa2_dpio_dev *dpio_dev)
if (ret)
DPAA2_BUS_ERR("DPIO interrupt disable failed");
- close(dpio_dev->epoll_fd);
+ if (dpio_dev->epoll_fd >= 0) {
+ close(dpio_dev->epoll_fd);
+ dpio_dev->epoll_fd = -1;
+ }
}
#endif
@@ -277,7 +297,7 @@ dpaa2_configure_stashing(struct dpaa2_dpio_dev *dpio_dev, int cpu_id)
}
#ifdef RTE_EVENT_DPAA2
- if (dpaa2_dpio_intr_init(dpio_dev)) {
+ if (dpaa2_dpio_intr_init(dpio_dev, true)) {
DPAA2_BUS_ERR("Interrupt registration failed for dpio");
return -1;
}
--
2.43.0
^ permalink raw reply related
* [PATCH v2 1/6] bus/fslmc: move DPCON management from event driver to bus
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy, Hemant Agrawal, Sachin Saxena
In-Reply-To: <20260616102727.708948-1-maxime@leroys.fr>
The DPCON allocation helpers (rte_dpaa2_alloc_dpcon_dev /
rte_dpaa2_free_dpcon_dev) lived in the event driver, but a notification
channel is a generic QBMan resource. Move dpaa2_hw_dpcon.c to the fslmc
bus and export the helpers as internal symbols so both the event PMD and
the net driver's rx-queue interrupt path can draw channels from the same
pool. No functional change.
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
drivers/bus/fslmc/meson.build | 1 +
.../dpaa2 => bus/fslmc/portal}/dpaa2_hw_dpcon.c | 16 +++++++---------
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 ++++++++
drivers/event/dpaa2/dpaa2_eventdev.h | 5 +++--
drivers/event/dpaa2/meson.build | 1 -
5 files changed, 19 insertions(+), 12 deletions(-)
rename drivers/{event/dpaa2 => bus/fslmc/portal}/dpaa2_hw_dpcon.c (90%)
diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build
index ceae1c6c11..50d9e91a37 100644
--- a/drivers/bus/fslmc/meson.build
+++ b/drivers/bus/fslmc/meson.build
@@ -22,6 +22,7 @@ sources = files(
'mc/mc_sys.c',
'portal/dpaa2_hw_dpbp.c',
'portal/dpaa2_hw_dpci.c',
+ 'portal/dpaa2_hw_dpcon.c',
'portal/dpaa2_hw_dpio.c',
'portal/dpaa2_hw_dprc.c',
'qbman/qbman_portal.c',
diff --git a/drivers/event/dpaa2/dpaa2_hw_dpcon.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpcon.c
similarity index 90%
rename from drivers/event/dpaa2/dpaa2_hw_dpcon.c
rename to drivers/bus/fslmc/portal/dpaa2_hw_dpcon.c
index ea5b0d4b85..6fd96ec0b9 100644
--- a/drivers/event/dpaa2/dpaa2_hw_dpcon.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpcon.c
@@ -18,13 +18,12 @@
#include <rte_cycles.h>
#include <rte_kvargs.h>
#include <dev_driver.h>
-#include <ethdev_driver.h>
+#include <eal_export.h>
#include <bus_fslmc_driver.h>
#include <mc/fsl_dpcon.h>
#include <portal/dpaa2_hw_pvt.h>
-#include "dpaa2_eventdev.h"
-#include "dpaa2_eventdev_logs.h"
+#include <fslmc_logs.h>
TAILQ_HEAD(dpcon_dev_list, dpaa2_dpcon_dev);
static struct dpcon_dev_list dpcon_dev_list
@@ -55,8 +54,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
/* Allocate DPAA2 dpcon handle */
dpcon_node = rte_malloc(NULL, sizeof(struct dpaa2_dpcon_dev), 0);
if (!dpcon_node) {
- DPAA2_EVENTDEV_ERR(
- "Memory allocation failed for dpcon device");
+ DPAA2_BUS_ERR("Memory allocation failed for dpcon device");
return -1;
}
@@ -65,8 +63,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
ret = dpcon_open(&dpcon_node->dpcon,
CMD_PRI_LOW, dpcon_id, &dpcon_node->token);
if (ret) {
- DPAA2_EVENTDEV_ERR("Unable to open dpcon device: err(%d)",
- ret);
+ DPAA2_BUS_ERR("Unable to open dpcon device: err(%d)", ret);
rte_free(dpcon_node);
return -1;
}
@@ -75,8 +72,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
ret = dpcon_get_attributes(&dpcon_node->dpcon,
CMD_PRI_LOW, dpcon_node->token, &attr);
if (ret != 0) {
- DPAA2_EVENTDEV_ERR("dpcon attribute fetch failed: err(%d)",
- ret);
+ DPAA2_BUS_ERR("dpcon attribute fetch failed: err(%d)", ret);
rte_free(dpcon_node);
return -1;
}
@@ -92,6 +88,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
return 0;
}
+RTE_EXPORT_INTERNAL_SYMBOL(rte_dpaa2_alloc_dpcon_dev)
struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void)
{
struct dpaa2_dpcon_dev *dpcon_dev = NULL;
@@ -105,6 +102,7 @@ struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void)
return dpcon_dev;
}
+RTE_EXPORT_INTERNAL_SYMBOL(rte_dpaa2_free_dpcon_dev)
void rte_dpaa2_free_dpcon_dev(struct dpaa2_dpcon_dev *dpcon)
{
struct dpaa2_dpcon_dev *dpcon_dev = NULL;
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index e625a5c035..79a2ec41e3 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -274,6 +274,14 @@ struct dpaa2_dpcon_dev {
uint8_t channel_index;
};
+/* DPCON channel allocation -- managed by the fslmc bus so both the net
+ * NAPI/DQRR rx path and the event PMD can grab channels.
+ */
+__rte_internal
+struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void);
+__rte_internal
+void rte_dpaa2_free_dpcon_dev(struct dpaa2_dpcon_dev *dpcon);
+
/* Refer to Table 7-3 in SEC BG */
#define QBMAN_FLE_WORD4_FMT_SBF 0x0 /* Single buffer frame */
#define QBMAN_FLE_WORD4_FMT_SGE 0x2 /* Scatter gather frame */
diff --git a/drivers/event/dpaa2/dpaa2_eventdev.h b/drivers/event/dpaa2/dpaa2_eventdev.h
index bb87bdbab2..f53efce61c 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev.h
+++ b/drivers/event/dpaa2/dpaa2_eventdev.h
@@ -85,8 +85,9 @@ struct dpaa2_eventdev {
uint32_t event_dev_cfg;
};
-struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void);
-void rte_dpaa2_free_dpcon_dev(struct dpaa2_dpcon_dev *dpcon);
+/* rte_dpaa2_alloc_dpcon_dev()/rte_dpaa2_free_dpcon_dev() now live in the fslmc
+ * bus (portal/dpaa2_hw_pvt.h), which this header's includers already pull in.
+ */
int test_eventdev_dpaa2(void);
diff --git a/drivers/event/dpaa2/meson.build b/drivers/event/dpaa2/meson.build
index dd5063af43..62b8507652 100644
--- a/drivers/event/dpaa2/meson.build
+++ b/drivers/event/dpaa2/meson.build
@@ -7,7 +7,6 @@ if not is_linux
endif
deps += ['bus_vdev', 'net_dpaa2', 'crypto_dpaa2_sec']
sources = files(
- 'dpaa2_hw_dpcon.c',
'dpaa2_eventdev.c',
'dpaa2_eventdev_selftest.c',
)
--
2.43.0
^ permalink raw reply related
* [PATCH v2 0/6] net/dpaa2: NAPI-style Rx queue interrupts
From: Maxime Leroy @ 2026-06-16 10:27 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy
In-Reply-To: <20260611154926.392670-1-maxime@leroys.fr>
This series lets a dpaa2 worker sleep on a queue's data-availability
notification instead of busy-polling, exposed through the generic
rte_eth_dev_rx_intr_* API (NAPI-style: poll while frames keep coming,
arm the interrupt and sleep when the queue runs dry).
Why it is not a trivial .rx_queue_intr_enable
----------------------------------------------
A worker wakes on its software portal's DQRI, which fires when the
portal's DQRR holds frames. The default dpaa2 Rx burst pulls frames
from the FQ with a volatile dequeue and cannot be interrupt-driven; to
wake on the DQRI the FQ must instead be pushed to the portal's DQRR.
The natural dpni_set_queue with a notification destination would have to
target the worker's portal, but that portal is only known once a worker
affines, after dev_start, and that MC command holds the global MC lock
long enough to wedge the firmware while traffic runs. So the bind cannot
be done late, against the polling lcore.
Design
------
Each Rx FQ is bound to its own DPCON channel, statically, at dev_start
while the dpni is still disabled (no knowledge of the polling lcore). A
worker later subscribes its own ethrx portal to the channel and arms the
DQRI in rx_queue_intr_enable, a one-shot per-portal op, never the wedging
set_queue. One portal serves every queue a worker owns, so the DQRR
burst demuxes frames to their FQ by fqd_ctx; foreign frames are parked in
the target queue's stash, so the application polls all its queues after a
wakeup, the same scheduling contract as plain DPDK polling. A queue can
be re-homed to another lcore at runtime with no set_queue and no port
stop.
This reuses the event PMD's pushed/DQRR model but with one DPCON per FQ
and static affinity (no QBMan scheduling), so the DPCON allocator is
moved from the event driver to the fslmc bus and shared.
Patches 1 and 2 move the DPCON allocator to the fslmc bus and make the
portal DQRI epoll optional; patch 3 adds the interrupt support proper and
patch 4 tunes the DQRI coalescing holdoff. Patch 5 (rx_queue_count NULL on
the primary process) is a real fix the path depends on and uncovered,
tagged for stable and backportable on its own. Patch 6 (drop the software
VLAN strip) is an independent net/dpaa2 change the interrupt path does not
require.
The path also depends on two fixes sent separately: an eal change so the
shared portal eventfd does not fail with -EEXIST (already applied to main)
and the ethdev fix for fast-path ops left NULL after port stop (see
Depends-on below).
Tested on LX2160A (lx2160acex7).
Depends-on: series-38450 ("ethdev: fix fast-path ops on a stopped port")
v2:
- Dropped the RSS RETA patch, an independent net/dpaa2 change the
interrupt path does not require; it will be sent as its own series.
- Dropped the ethdev fast-path ops fix; it is now a standalone series
(Depends-on above).
- Dropped the eal/interrupts -EEXIST fix, applied to main by David Marchand.
- Declared qbman_swp_interrupt_set_inhibit and qbman_swp_dqrr_size
__rte_internal (David Marchand).
- Minor formatting cleanup in the Rx interrupt setup.
Maxime Leroy (6):
bus/fslmc: move DPCON management from event driver to bus
bus/fslmc/dpio: make the portal DQRI epoll optional
net/dpaa2: support Rx queue interrupts
bus/fslmc/dpio: tune DQRI interrupt coalescing holdoff
net/dpaa2: fix Rx queue count for primary process
net/dpaa2: drop the fake software VLAN strip offload
doc/guides/nics/dpaa2.rst | 10 +
doc/guides/nics/features/dpaa2.ini | 1 +
doc/guides/rel_notes/release_26_07.rst | 7 +
drivers/bus/fslmc/meson.build | 1 +
.../fslmc/portal}/dpaa2_hw_dpcon.c | 16 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 113 ++++--
drivers/bus/fslmc/portal/dpaa2_hw_dpio.h | 12 +
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 35 +-
.../fslmc/qbman/include/fsl_qbman_portal.h | 11 +
drivers/bus/fslmc/qbman/qbman_portal.c | 7 +
drivers/event/dpaa2/dpaa2_eventdev.h | 5 +-
drivers/event/dpaa2/meson.build | 1 -
drivers/net/dpaa2/dpaa2_ethdev.c | 349 +++++++++++++++++-
drivers/net/dpaa2/dpaa2_ethdev.h | 10 +
drivers/net/dpaa2/dpaa2_rxtx.c | 123 +++++-
15 files changed, 647 insertions(+), 54 deletions(-)
rename drivers/{event/dpaa2 => bus/fslmc/portal}/dpaa2_hw_dpcon.c (90%)
--
2.43.0
^ permalink raw reply
* RE: [PATCH 2/2] ethdev: return 0 from dummy queue count
From: Morten Brørup @ 2026-06-16 9:54 UTC (permalink / raw)
To: Maxime Leroy, dev
Cc: stable, Stephen Hemminger, Thomas Monjalon, Andrew Rybchenko,
Sunil Kumar Kori
In-Reply-To: <20260616094259.686555-3-maxime@leroys.fr>
> From: Maxime Leroy [mailto:maxime.leroys@gmail.com] On Behalf Of Maxime
> Leroy
> Sent: Tuesday, 16 June 2026 11.43
>
> The dummy rx_queue_count/tx_queue_count callback returned -ENOTSUP. On
> a
> port that is not started (freshly allocated, or stopped once the fast-
> path
> ops are reset to dummies) there are no packets queued, so the truthful
> answer is 0, not an error: querying the count is not an unsupported
> operation. This also matches the dummy Rx/Tx burst, which reports 0
> packets.
>
> A poll-mode worker checking rte_eth_rx_queue_count() across a
> concurrent
> port stop then sees an empty queue instead of a negative error.
>
> Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
> Cc: stable@dpdk.org
>
> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> Signed-off-by: Maxime Leroy <maxime@leroys.fr>
> ---
Acked-by: Morten Brørup <mb@smartsharesystems.com>
^ permalink raw reply
* Re: [PATCH] dts: avoid Scapy MAC resolution in Rx split test
From: Thomas Monjalon @ 2026-06-16 9:53 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, Luca Vizzarro, Patrick Robb
In-Reply-To: <20260611115421.12c4e6ee@phoenix.local>
11/06/2026 20:54, Stephen Hemminger:
> On Wed, 10 Jun 2026 20:32:18 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
>
> > The test gets the Ethernet header length from Scapy with len(Ether()).
> >
> > When building DTS API documentation, Sphinx imports the test module
> > and shows this warning:
> > WARNING: MAC address to reach destination not found. Using broadcast.
> >
> > Use a dummy MAC address so Scapy no longer performs
> > destination resolution during import.
> >
> > Fixes: 01c70544cffd ("dts: add selective Rx tests")
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>
> Thanks, I previously reported this as:
>
> https://bugs.dpdk.org/show_bug.cgi?id=1951
OK, added the tag in the commit log.
> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Applied
^ permalink raw reply
* [PATCH 2/2] ethdev: return 0 from dummy queue count
From: Maxime Leroy @ 2026-06-16 9:42 UTC (permalink / raw)
To: dev
Cc: Maxime Leroy, stable, Stephen Hemminger, Thomas Monjalon,
Andrew Rybchenko, Sunil Kumar Kori, Morten Brørup
In-Reply-To: <20260616094259.686555-1-maxime@leroys.fr>
The dummy rx_queue_count/tx_queue_count callback returned -ENOTSUP. On a
port that is not started (freshly allocated, or stopped once the fast-path
ops are reset to dummies) there are no packets queued, so the truthful
answer is 0, not an error: querying the count is not an unsupported
operation. This also matches the dummy Rx/Tx burst, which reports 0
packets.
A poll-mode worker checking rte_eth_rx_queue_count() across a concurrent
port stop then sees an empty queue instead of a negative error.
Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
Cc: stable@dpdk.org
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
lib/ethdev/ethdev_driver.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index 70ddce5bfc..eab5c15d12 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -875,7 +875,7 @@ RTE_EXPORT_INTERNAL_SYMBOL(rte_eth_queue_count_dummy)
int
rte_eth_queue_count_dummy(void *queue __rte_unused)
{
- return -ENOTSUP;
+ return 0;
}
RTE_EXPORT_INTERNAL_SYMBOL(rte_eth_descriptor_status_dummy)
--
2.43.0
^ permalink raw reply related
* [PATCH 1/2] ethdev: keep fast-path ops valid after port stop
From: Maxime Leroy @ 2026-06-16 9:42 UTC (permalink / raw)
To: dev
Cc: Maxime Leroy, stable, Morten Brørup, Thomas Monjalon,
Andrew Rybchenko, Sunil Kumar Kori
In-Reply-To: <20260616094259.686555-1-maxime@leroys.fr>
eth_dev_fp_ops_reset() restores a port's fast-path ops on stop/release
via a compound literal, so every field it omits is zeroed to NULL. It
sets only rx_pkt_burst/tx_pkt_burst (and the rxq/txq data), leaving
rx_queue_count, tx_queue_count, rx/tx_descriptor_status, tx_pkt_prepare
and the recycle callbacks NULL.
In non-debug builds these ops are reached through an unguarded indirect
call (the NULL check exists only under RTE_ETHDEV_DEBUG_RX/TX). So a
thread calling e.g. rte_eth_rx_queue_count() on a port being stopped
dereferences NULL and crashes, while the same race on rte_eth_rx_burst()
is harmless because the burst ops are reset to dummies. A poll-mode
worker re-checking rx_queue_count before arming the Rx interrupt and
sleeping hits exactly this.
Reset these non-burst ops to the same dummies eth_dev_set_dummy_fops()
installs, so a stopped port behaves like a freshly allocated one: every
fast-path op is a safe no-op, none is NULL.
Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
lib/ethdev/ethdev_private.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
index 72a0723846..75ea3eedff 100644
--- a/lib/ethdev/ethdev_private.c
+++ b/lib/ethdev/ethdev_private.c
@@ -263,6 +263,13 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
*fpo = (struct rte_eth_fp_ops) {
.rx_pkt_burst = dummy_eth_rx_burst,
.tx_pkt_burst = dummy_eth_tx_burst,
+ .tx_pkt_prepare = rte_eth_tx_pkt_prepare_dummy,
+ .rx_queue_count = rte_eth_queue_count_dummy,
+ .tx_queue_count = rte_eth_queue_count_dummy,
+ .rx_descriptor_status = rte_eth_descriptor_status_dummy,
+ .tx_descriptor_status = rte_eth_descriptor_status_dummy,
+ .recycle_tx_mbufs_reuse = rte_eth_recycle_tx_mbufs_reuse_dummy,
+ .recycle_rx_descriptors_refill = rte_eth_recycle_rx_descriptors_refill_dummy,
.rxq = {
.data = (void **)&dummy_queues_array[port_id],
.clbk = dummy_data,
--
2.43.0
^ permalink raw reply related
* [PATCH 0/2] ethdev: fix fast-path ops on a stopped port
From: Maxime Leroy @ 2026-06-16 9:42 UTC (permalink / raw)
To: dev; +Cc: Maxime Leroy
Two small fixes for fast-path ops on a stopped port:
patch 1 stops rte_eth_rx_queue_count() from dereferencing NULL after a port
stop, patch 2 makes the dummy queue count return 0 (empty queue) instead of
-ENOTSUP.
Maxime Leroy (2):
ethdev: keep fast-path ops valid after port stop
ethdev: return 0 from dummy queue count
lib/ethdev/ethdev_driver.c | 2 +-
lib/ethdev/ethdev_private.c | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
--
2.43.0
^ permalink raw reply
* [PATCH] app/testpmd: include IP fields in UDP RSS option
From: Maxime Leroy @ 2026-06-16 9:39 UTC (permalink / raw)
To: dev
Cc: Maxime Leroy, stable, Aman Singh, Heqing Zhu, Jijiang Liu,
Helin Zhang, Cunming Liang, Jing Chen
The --rss-udp option is documented as enabling IPv4/IPv6 + UDP RSS, but it
currently sets the RSS hash functions to RTE_ETH_RSS_UDP only.
On PMDs that translate this directly to L4 port extracts, this can build a
L4-only RSS key. Add RTE_ETH_RSS_IP when --rss-udp is selected so the
configured key matches the documented IPv4/IPv6 + UDP behavior.
Make --rss-ip additive as well, so combining --rss-ip and --rss-udp is
order-independent.
Fixes: 8a387fa85f02 ("ethdev: more RSS flags")
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
app/test-pmd/parameters.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 337d8fc8ac..0032ea4e25 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -1286,10 +1286,10 @@ launch_args_parse(int argc, char** argv)
set_pkt_forwarding_mode(optarg);
break;
case TESTPMD_OPT_RSS_IP_NUM:
- rss_hf = RTE_ETH_RSS_IP;
+ rss_hf |= RTE_ETH_RSS_IP;
break;
case TESTPMD_OPT_RSS_UDP_NUM:
- rss_hf = RTE_ETH_RSS_UDP;
+ rss_hf |= RTE_ETH_RSS_IP | RTE_ETH_RSS_UDP;
break;
case TESTPMD_OPT_RSS_LEVEL_INNER_NUM:
rss_hf |= RTE_ETH_RSS_LEVEL_INNERMOST;
--
2.43.0
^ permalink raw reply related
* Re: [PATCH 8/9] ethdev: keep fast-path ops valid after port stop
From: David Marchand @ 2026-06-16 9:33 UTC (permalink / raw)
To: Maxime Leroy
Cc: hemant.agrawal, sachin.saxena, dev, stable, Thomas Monjalon,
Andrew Rybchenko, Morten Brørup, Sunil Kumar Kori
In-Reply-To: <CAHHRULXCs5yrQn_rQt_iPY3gBgoEx7K+vvdVOT_4dGk6E9_2NQ@mail.gmail.com>
On Tue, 16 Jun 2026 at 11:23, Maxime Leroy <maxime@leroys.fr> wrote:
> On Mon, Jun 15, 2026 at 11:26 AM David Marchand
> <david.marchand@redhat.com> wrote:
> >
> > On Thu, 11 Jun 2026 at 17:51, Maxime Leroy <maxime@leroys.fr> wrote:
> > >
> > > eth_dev_fp_ops_reset() restores a port's fast-path ops on stop/release
> > > via a compound literal, so every field it omits is zeroed to NULL. It
> > > sets only rx_pkt_burst/tx_pkt_burst (and the rxq/txq data), leaving
> > > rx_queue_count, tx_queue_count, rx/tx_descriptor_status, tx_pkt_prepare
> > > and the recycle callbacks NULL.
> > >
> > > In non-debug builds these ops are reached through an unguarded indirect
> > > call (the NULL check exists only under RTE_ETHDEV_DEBUG_RX/TX). So a
> > > thread calling e.g. rte_eth_rx_queue_count() on a port being stopped
> > > dereferences NULL and crashes, while the same race on rte_eth_rx_burst()
> > > is harmless because the burst ops are reset to dummies. A poll-mode
> > > worker re-checking rx_queue_count before arming the Rx interrupt and
> > > sleeping hits exactly this.
> > >
> > > Reset these ops to the same dummies eth_dev_set_dummy_fops() installs,
> > > so a stopped port behaves like a freshly allocated one: every fast-path
> > > op is a safe no-op, none is NULL.
> > >
> > > Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
> > > Cc: stable@dpdk.org
> > > Signed-off-by: Maxime Leroy <maxime@leroys.fr>
> > > ---
> > > lib/ethdev/ethdev_private.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > > index 72a0723846..75ea3eedff 100644
> > > --- a/lib/ethdev/ethdev_private.c
> > > +++ b/lib/ethdev/ethdev_private.c
> > > @@ -263,6 +263,13 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
> > > *fpo = (struct rte_eth_fp_ops) {
> > > .rx_pkt_burst = dummy_eth_rx_burst,
> > > .tx_pkt_burst = dummy_eth_tx_burst,
> > > + .tx_pkt_prepare = rte_eth_tx_pkt_prepare_dummy,
> > > + .rx_queue_count = rte_eth_queue_count_dummy,
> > > + .tx_queue_count = rte_eth_queue_count_dummy,
> > > + .rx_descriptor_status = rte_eth_descriptor_status_dummy,
> > > + .tx_descriptor_status = rte_eth_descriptor_status_dummy,
> > > + .recycle_tx_mbufs_reuse = rte_eth_recycle_tx_mbufs_reuse_dummy,
> > > + .recycle_rx_descriptors_refill = rte_eth_recycle_rx_descriptors_refill_dummy,
> > > .rxq = {
> > > .data = (void **)&dummy_queues_array[port_id],
> > > .clbk = dummy_data,
> >
> > Could we replace eth_dev_set_dummy_fops() with a call to
> > eth_dev_fp_ops_reset() in rte_eth_dev_allocate?
> > I don't like keeping two separate helpers.
>
> Thanks for the review.
>
> Avoiding the duplication is a good idea, but I could not find a clean
> way to do it: eth_dev_set_dummy_fops() and eth_dev_fp_ops_reset()
> cannot be unified without making things worse.
>
> - They write two different structures. eth_dev_set_dummy_fops() sets
> struct rte_eth_dev (the source ops); rte_eth_dev_allocate() fills
> eth_dev->*, and eth_dev_fp_ops_setup() then copies eth_dev->* into the
> per-port struct rte_eth_fp_ops. eth_dev_fp_ops_reset() writes that
> fast-path table entry directly. So calling fp_ops_reset() from
> rte_eth_dev_allocate() would populate the wrong structure.
>
> - The burst dummies are intentionally different. set_dummy_fops() uses
> the silent rte_eth_pkt_burst_dummy (a freshly allocated,
> not-yet-started port is benign), while fp_ops_reset() uses
> dummy_eth_rx_burst/dummy_eth_tx_burst, which log an error and dump the
> stack because hitting the data path on a stopped port is a misuse
> worth flagging.
>
> - fp_ops_reset() also wires fpo->rxq/txq to the
> dummy_queues_array/dummy_data used by the fast-path table, with no
> equivalent on rte_eth_dev.
>
> The only shared part is the non-burst dummy assignments, but factoring
> those across the two different struct types would require a token
> macro, and I don't have a clean solution for it. So for now I have
> kept the two helpers as they are. Suggestions welcome.
Ok, I had missed the separate structs.
Let's keep it simple, and avoid adding macros.
Can you add a comment that those helpers should be kept in sync?
--
David Marchand
^ permalink raw reply
* Re: [PATCH 2/9] eal/interrupts: keep real errno on epoll error
From: David Marchand @ 2026-06-16 9:29 UTC (permalink / raw)
To: Maxime Leroy
Cc: hemant.agrawal, sachin.saxena, dev, stable, Harman Kalra,
Cunming Liang, Stephen Hemminger, Thomas Monjalon
In-Reply-To: <CAJFAV8xH9GOGF2aR7tp2uMsjhpJs-1gXW=xzG4Gak3T_n1KpNQ@mail.gmail.com>
On Tue, 16 Jun 2026 at 10:02, David Marchand <david.marchand@redhat.com> wrote:
>
> On Thu, 11 Jun 2026 at 17:50, Maxime Leroy <maxime@leroys.fr> wrote:
> >
> > Some interrupt users have several vectors backed by the same eventfd
> > (e.g. several Rx queues behind one DPAA2 portal eventfd). Adding the
> > second vector to the same epoll instance then fails with EEXIST.
> >
> > Upper layers such as ethdev and bbdev already treat -EEXIST as a
> > non-fatal duplicate registration (if (ret && ret != -EEXIST)), but
> > rte_intr_rx_ctl() lost that information: rte_epoll_ctl() returned -1 and
> > rte_intr_rx_ctl() flattened every failure to -EPERM.
> >
> > Return the negative errno from rte_epoll_ctl() (its documented contract
> > is already "a negative value") and stop rte_intr_rx_ctl() from
> > flattening errors to -EPERM, so EEXIST reaches the upper layers that
> > already handle it; other failures carry their real errno.
> >
> > Fixes: 9efe9c6cdcac ("eal/linux: add epoll wrappers")
> > Fixes: c9f3ec1a0f3f ("eal/linux: add Rx interrupt control function")
> > Cc: stable@dpdk.org
> > Signed-off-by: Maxime Leroy <maxime@leroys.fr>
>
> Reviewed-by: David Marchand <david.marchand@redhat.com>
>
> Nit: the eal/ prefix is only for OS specific / arch specific changes.
> The title prefix should be interrupt:
I took this fix directly in main.
Applied, thanks.
--
David marchand
^ permalink raw reply
* Re: [PATCH 8/9] ethdev: keep fast-path ops valid after port stop
From: Maxime Leroy @ 2026-06-16 9:23 UTC (permalink / raw)
To: David Marchand
Cc: hemant.agrawal, sachin.saxena, dev, stable, Thomas Monjalon,
Andrew Rybchenko, Morten Brørup, Sunil Kumar Kori
In-Reply-To: <CAJFAV8xTgkgdSzX980qW3zT+gqcDkrQyDV48SMzGh5qSCkz=7Q@mail.gmail.com>
On Mon, Jun 15, 2026 at 11:26 AM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Thu, 11 Jun 2026 at 17:51, Maxime Leroy <maxime@leroys.fr> wrote:
> >
> > eth_dev_fp_ops_reset() restores a port's fast-path ops on stop/release
> > via a compound literal, so every field it omits is zeroed to NULL. It
> > sets only rx_pkt_burst/tx_pkt_burst (and the rxq/txq data), leaving
> > rx_queue_count, tx_queue_count, rx/tx_descriptor_status, tx_pkt_prepare
> > and the recycle callbacks NULL.
> >
> > In non-debug builds these ops are reached through an unguarded indirect
> > call (the NULL check exists only under RTE_ETHDEV_DEBUG_RX/TX). So a
> > thread calling e.g. rte_eth_rx_queue_count() on a port being stopped
> > dereferences NULL and crashes, while the same race on rte_eth_rx_burst()
> > is harmless because the burst ops are reset to dummies. A poll-mode
> > worker re-checking rx_queue_count before arming the Rx interrupt and
> > sleeping hits exactly this.
> >
> > Reset these ops to the same dummies eth_dev_set_dummy_fops() installs,
> > so a stopped port behaves like a freshly allocated one: every fast-path
> > op is a safe no-op, none is NULL.
> >
> > Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
> > Cc: stable@dpdk.org
> > Signed-off-by: Maxime Leroy <maxime@leroys.fr>
> > ---
> > lib/ethdev/ethdev_private.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > index 72a0723846..75ea3eedff 100644
> > --- a/lib/ethdev/ethdev_private.c
> > +++ b/lib/ethdev/ethdev_private.c
> > @@ -263,6 +263,13 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
> > *fpo = (struct rte_eth_fp_ops) {
> > .rx_pkt_burst = dummy_eth_rx_burst,
> > .tx_pkt_burst = dummy_eth_tx_burst,
> > + .tx_pkt_prepare = rte_eth_tx_pkt_prepare_dummy,
> > + .rx_queue_count = rte_eth_queue_count_dummy,
> > + .tx_queue_count = rte_eth_queue_count_dummy,
> > + .rx_descriptor_status = rte_eth_descriptor_status_dummy,
> > + .tx_descriptor_status = rte_eth_descriptor_status_dummy,
> > + .recycle_tx_mbufs_reuse = rte_eth_recycle_tx_mbufs_reuse_dummy,
> > + .recycle_rx_descriptors_refill = rte_eth_recycle_rx_descriptors_refill_dummy,
> > .rxq = {
> > .data = (void **)&dummy_queues_array[port_id],
> > .clbk = dummy_data,
>
> Could we replace eth_dev_set_dummy_fops() with a call to
> eth_dev_fp_ops_reset() in rte_eth_dev_allocate?
> I don't like keeping two separate helpers.
>
>
> --
> David Marchand
>
Hi David,
Thanks for the review.
Avoiding the duplication is a good idea, but I could not find a clean
way to do it: eth_dev_set_dummy_fops() and eth_dev_fp_ops_reset()
cannot be unified without making things worse.
- They write two different structures. eth_dev_set_dummy_fops() sets
struct rte_eth_dev (the source ops); rte_eth_dev_allocate() fills
eth_dev->*, and eth_dev_fp_ops_setup() then copies eth_dev->* into the
per-port struct rte_eth_fp_ops. eth_dev_fp_ops_reset() writes that
fast-path table entry directly. So calling fp_ops_reset() from
rte_eth_dev_allocate() would populate the wrong structure.
- The burst dummies are intentionally different. set_dummy_fops() uses
the silent rte_eth_pkt_burst_dummy (a freshly allocated,
not-yet-started port is benign), while fp_ops_reset() uses
dummy_eth_rx_burst/dummy_eth_tx_burst, which log an error and dump the
stack because hitting the data path on a stopped port is a misuse
worth flagging.
- fp_ops_reset() also wires fpo->rxq/txq to the
dummy_queues_array/dummy_data used by the fast-path table, with no
equivalent on rte_eth_dev.
The only shared part is the non-burst dummy assignments, but factoring
those across the two different struct types would require a token
macro, and I don't have a clean solution for it. So for now I have
kept the two helpers as they are. Suggestions welcome.
Thanks,
Maxime
^ permalink raw reply
* Re: [PATCH v3 1/2] eal: fix off by one in in tailq name init
From: David Marchand @ 2026-06-16 9:17 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, stable, Bruce Richardson, fengchengwen
In-Reply-To: <cd21b6e3-1efe-4ad6-9110-13b8c88f3248@huawei.com>
On Wed, 10 Jun 2026 at 03:36, fengchengwen <fengchengwen@huawei.com> wrote:
>
> On 6/9/2026 11:53 PM, Stephen Hemminger wrote:
> > The tailq name is defined as 32 bytes, but name would be
> > silently truncated at 31 bytes. The function strlcpy() size
> > already accounts for the NUL character at the end.
> >
Bugzilla ID: 1954
> > Fixes: f9acaf84e923 ("replace snprintf with strlcpy without adding extra include")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Applied, thanks.
--
David Marchand
^ permalink raw reply
* [PATCH] net/crc: add 4x folding loop for aarch64 NEON implementation
From: Shreesh Adiga @ 2026-06-16 9:11 UTC (permalink / raw)
To: Wathsala Vithanage; +Cc: dev
Add a 64-byte loop that maintains 4 fold registers and processes
64 bytes at a time. The 4x fold registers is then reduced to 16 byte
single fold, similar to x86 SSE implementation. This technique is
described in the paper by Intel:
"Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
This results in roughly 2x performance improvement due to better ILP
for large input sizes like 1024 observed on Cortex-X925.
Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
---
lib/net/net_crc_neon.c | 51 +++++++++++++++++++++++++++++++++++-------
1 file changed, 43 insertions(+), 8 deletions(-)
diff --git a/lib/net/net_crc_neon.c b/lib/net/net_crc_neon.c
index cee75ddd31..fc817e54f5 100644
--- a/lib/net/net_crc_neon.c
+++ b/lib/net/net_crc_neon.c
@@ -16,6 +16,7 @@
/** PMULL CRC computation context structure */
struct crc_pmull_ctx {
uint64x2_t rk1_rk2;
+ uint64x2_t rk3_rk4;
uint64x2_t rk5_rk6;
uint64x2_t rk7_rk8;
};
@@ -136,9 +137,36 @@ crc32_eth_calc_pmull(
temp = vreinterpretq_u64_u32(vsetq_lane_u32(crc, vmovq_n_u32(0), 0));
/**
- * Folding all data into single 16 byte data block
- * Assumes: fold holds first 16 bytes of data
+ * Folding all data into 4 parallel 16 byte data block
+ * Later folds 4 parallel blocks into single fold block
*/
+ if (likely(data_len >= 64)) {
+ uint64x2_t fold1, fold2, fold3, fold4;
+ uint64x2_t temp1, temp2, temp3, temp4;
+ fold1 = vld1q_u64((const uint64_t *)(data + 0));
+ fold2 = vld1q_u64((const uint64_t *)(data + 16));
+ fold3 = vld1q_u64((const uint64_t *)(data + 32));
+ fold4 = vld1q_u64((const uint64_t *)(data + 48));
+ fold1 = veorq_u64(fold1, temp);
+ k = params->rk1_rk2;
+
+ for (n = 64; (n + 64) <= data_len; n += 64) {
+ temp1 = vld1q_u64((const uint64_t *)&data[n + 0]);
+ temp2 = vld1q_u64((const uint64_t *)&data[n + 16]);
+ temp3 = vld1q_u64((const uint64_t *)&data[n + 32]);
+ temp4 = vld1q_u64((const uint64_t *)&data[n + 48]);
+ fold1 = crcr32_folding_round(temp1, k, fold1);
+ fold2 = crcr32_folding_round(temp2, k, fold2);
+ fold3 = crcr32_folding_round(temp3, k, fold3);
+ fold4 = crcr32_folding_round(temp4, k, fold4);
+ }
+ k = params->rk3_rk4;
+ fold1 = crcr32_folding_round(fold2, k, fold1);
+ fold1 = crcr32_folding_round(fold3, k, fold1);
+ fold = crcr32_folding_round(fold4, k, fold1);
+ goto single_fold_loop;
+ }
+
if (unlikely(data_len < 32)) {
if (unlikely(data_len == 16)) {
/* 16 bytes */
@@ -176,9 +204,12 @@ crc32_eth_calc_pmull(
fold = vld1q_u64((const uint64_t *)data);
fold = veorq_u64(fold, temp);
- /** Main folding loop - the last 16 bytes is processed separately */
- k = params->rk1_rk2;
- for (n = 16; (n + 16) <= data_len; n += 16) {
+ /** Single folding loop - the last 16 bytes is processed separately */
+ k = params->rk3_rk4;
+ n = 16;
+
+single_fold_loop:
+ for (; (n + 16) <= data_len; n += 16) {
temp = vld1q_u64((const uint64_t *)&data[n]);
fold = crcr32_folding_round(temp, k, fold);
}
@@ -194,7 +225,7 @@ crc32_eth_calc_pmull(
mask = vshift_bytes_left(vdupq_n_u64(-1), 16 - rem);
b = vorrq_u64(b, vandq_u64(mask, last16));
- /* k = rk1 & rk2 */
+ /* k = rk3 & rk4 */
temp = vreinterpretq_u64_p128(vmull_p64(
vgetq_lane_p64(vreinterpretq_p64_u64(a), 1),
vgetq_lane_p64(vreinterpretq_p64_u64(k), 0)));
@@ -221,22 +252,26 @@ void
rte_net_crc_neon_init(void)
{
/* Initialize CRC16 data */
- uint64_t ccitt_k1_k2[2] = {0x189aeLLU, 0x8e10LLU};
+ uint64_t ccitt_k1_k2[2] = {0x14ff2LLU, 0x19a3cLLU};
+ uint64_t ccitt_k3_k4[2] = {0x189aeLLU, 0x8e10LLU};
uint64_t ccitt_k5_k6[2] = {0x189aeLLU, 0x114aaLLU};
uint64_t ccitt_k7_k8[2] = {0x11c581910LLU, 0x10811LLU};
/* Initialize CRC32 data */
- uint64_t eth_k1_k2[2] = {0xccaa009eLLU, 0x1751997d0LLU};
+ uint64_t eth_k1_k2[2] = {0x1c6e41596LLU, 0x154442bd4LLU};
+ uint64_t eth_k3_k4[2] = {0xccaa009eLLU, 0x1751997d0LLU};
uint64_t eth_k5_k6[2] = {0xccaa009eLLU, 0x163cd6124LLU};
uint64_t eth_k7_k8[2] = {0x1f7011640LLU, 0x1db710641LLU};
/** Save the params in context structure */
crc16_ccitt_pmull.rk1_rk2 = vld1q_u64(ccitt_k1_k2);
+ crc16_ccitt_pmull.rk3_rk4 = vld1q_u64(ccitt_k3_k4);
crc16_ccitt_pmull.rk5_rk6 = vld1q_u64(ccitt_k5_k6);
crc16_ccitt_pmull.rk7_rk8 = vld1q_u64(ccitt_k7_k8);
/** Save the params in context structure */
crc32_eth_pmull.rk1_rk2 = vld1q_u64(eth_k1_k2);
+ crc32_eth_pmull.rk3_rk4 = vld1q_u64(eth_k3_k4);
crc32_eth_pmull.rk5_rk6 = vld1q_u64(eth_k5_k6);
crc32_eth_pmull.rk7_rk8 = vld1q_u64(eth_k7_k8);
}
--
2.53.0
^ permalink raw reply related
* Re: [PATCH v1 1/1] net/i40e: allow discontiguous queue lists in hash
From: Bruce Richardson @ 2026-06-16 8:42 UTC (permalink / raw)
To: Anatoly Burakov; +Cc: dev
In-Reply-To: <9999fab5d9491d15ff98ac5aafa248e11df558de.1781521311.git.anatoly.burakov@intel.com>
On Mon, Jun 15, 2026 at 12:01:58PM +0100, Anatoly Burakov wrote:
> Due to recent refactors and code unification, there are now the following
> properties of RSS queue list that can be checked by common infrastructure:
>
> - Monotony (i.e. queue indices always increase, never decrease)
> - No duplication (i.e. can't have the same index specified twice)
> - Contiguousness (i.e. can't have holes in the queue list)
>
> The latter is an optional feature that can be enabled with a flag. However,
> previous hash code only enforced contiguousness for queue *regions* but not
> queue *lists*, whereas after the refactor, all queue lists were required to
> be contiguous. This is an unnecessary restriction, and it breaks backwards
> compatibility.
>
> Fix it by only specifying contiguousness requirement for the VLAN branch
> where we are actually looking for a queue *region* not queue *list*.
>
> Fixes: 0185303c2e24 ("net/i40e: refactor RSS flow parameter checks")
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Applied to dpdk-next-net-intel (with corrected fixline commit id).
Thanks,
/Bruce
> drivers/net/intel/i40e/i40e_hash.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/intel/i40e/i40e_hash.c b/drivers/net/intel/i40e/i40e_hash.c
> index 3c1302469c..8b80d0a91c 100644
> --- a/drivers/net/intel/i40e/i40e_hash.c
> +++ b/drivers/net/intel/i40e/i40e_hash.c
> @@ -1238,7 +1238,6 @@ i40e_hash_parse(struct rte_eth_dev *dev,
> },
> .max_actions = 1,
> .driver_ctx = dev->data->dev_private,
> - .rss_queues_contig = true,
> /* each pattern type will add specific check function */
> };
> const struct rte_flow_action_rss *rss_act;
> @@ -1265,6 +1264,8 @@ i40e_hash_parse(struct rte_eth_dev *dev,
> /* VLAN path */
> if (is_vlan) {
> ac_param.check = i40e_hash_validate_queue_region;
> + /* queue regions must be contiguous */
> + ac_param.rss_queues_contig = true;
> ret = ci_flow_check_actions(actions, &ac_param, &parsed_actions, error);
> if (ret)
> return ret;
> --
> 2.47.3
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox