From: Stephen Hemminger <stephen@networkplumber.org>
To: Vincent Jardin <vjardin@free.fr>
Cc: dev@dpdk.org, rasland@nvidia.com, thomas@monjalon.net,
andrew.rybchenko@oktetlabs.ru, dsosnowski@nvidia.com,
viacheslavo@nvidia.com, bingz@nvidia.com, orika@nvidia.com,
suanmingm@nvidia.com, matan@nvidia.com,
aman.deep.singh@intel.com
Subject: Re: [PATCH v4 00/10] net/mlx5: per-queue Tx rate limiting via packet pacing
Date: Mon, 23 Mar 2026 16:09:55 -0700 [thread overview]
Message-ID: <20260323160955.5379adfe@phoenix.local> (raw)
In-Reply-To: <20260322134629.94633-1-vjardin@free.fr>
On Sun, 22 Mar 2026 14:46:19 +0100
Vincent Jardin <vjardin@free.fr> wrote:
> This series adds per-queue Tx data-rate limiting to the mlx5 PMD using
> hardware packet pacing (PP), and a symmetric rte_eth_get_queue_rate_limit()
> ethdev API to read back the configured rate.
>
> Each Tx queue can be assigned an individual rate (in Mbps) at runtime via
> rte_eth_set_queue_rate_limit(). The mlx5 implementation allocates a PP
> context per queue from the HW rate table, programs the PP index into the
> SQ via modify_sq, and relies on the kernel to share identical rates
> across PP contexts to conserve table entries. A PMD-specific API exposes
> per-queue PP diagnostics and rate table capacity.
>
> Patch breakdown:
>
> 01/10 doc/nics/mlx5: fix stale packet pacing documentation
> 02/10 common/mlx5: query packet pacing rate table capabilities
> 03/10 common/mlx5: extend SQ modify to support rate limit update
> 04/10 net/mlx5: add per-queue packet pacing infrastructure
> 05/10 net/mlx5: support per-queue rate limiting
> 06/10 net/mlx5: add burst pacing devargs
> 07/10 net/mlx5: add testpmd command to query per-queue rate limit
> 08/10 ethdev: add getter for per-queue Tx rate limit
> 09/10 net/mlx5: implement per-queue Tx rate limit getter
> 10/10 net/mlx5: add rate table capacity query API
>
> Release notes for the new ethdev API and mlx5 per-queue rate
> limiting can be added to a release_26_07.rst once the file is
> created at the start of the 26.07 development cycle.
>
> Changes since v3:
>
> Addressed review feedback from Stephen and Slava (nvidia/Mellanox).
>
> Patch 02/10 (query caps):
> - Added Acked-by: Viacheslav Ovsiienko
>
> Patch 03/10 (SQ modify):
> - Define MLX5_MODIFY_SQ_IN_MODIFY_BITMASK_PACKET_PACING_RATE_LIMIT_INDEX
> enum in mlx5_prm.h, following the MLX5_MODIFY_RQ_IN_MODIFY_xxx pattern
> - Use read-modify-write for modify_bitmask (MLX5_GET64 | OR | MLX5_SET64)
> instead of direct overwrite, for forward compatibility
>
> Patch 04/10 (PP infrastructure):
> - Rename struct member and parameters from "rl" to "rate_limit"
> for consistency with codebase naming style
> - Replace MLX5_ASSERT(rate_mbps > 0) with runtime check returning
> -EINVAL in non-debug builds
> - Move mlx5_txq_free_pp_rate_limit() to after txq_obj_release() in
> mlx5_txq_release() — destroy the SQ before freeing the PP index
> it references
> - Clarify commit message: distinct PP handle per queue (for cleanup)
> but kernel shares the same pp_id for identical rate parameters
>
> Patch 05/10 (set rate):
> - Fix obj->sq vs obj->sq_obj.sq: use obj->sq_obj.sq from the start
> for non-hairpin queues (was introduced in patch 07 in v3, breaking
> git bisect)
> - Move all variable declarations to block top (sq_devx,
> new_rate_limit)
> - Add queue state check: reject set_queue_rate_limit if queue is not
> STARTED (SQ not in RDY state)
> - Update mlx5 feature matrix: Rate limitation = Y
> - Add Per-Queue Tx Rate Limiting documentation section in mlx5.rst
> covering DevX requirement, hardware support, rate table sharing,
> and testpmd usage
>
> Patch 06/10 (burst devargs):
> - Remove burst_upper_bound/typical_packet_size from Clock Queue
> path (mlx5_txpp_alloc_pp_index) — Clock Queue uses WQE rate
> pacing and does not need these parameters
> - Update commit message and documentation accordingly
>
> Patch 07/10 (testpmd + PMD query):
> - sq_obj.sq accessor change moved to patch 05 (see above)
> - sq_devx declaration moved to block top
>
> Patch 08/10 (ethdev getter) — split from v3 patch 08:
> - Split into ethdev API (this patch) and mlx5 driver (patch 09)
> - Add rte_eth_trace_get_queue_rate_limit() trace point matching
> the existing setter pattern
>
> Patch 09/10 — NEW (was part of v3 patch 08):
> - mlx5 driver implementation of get_queue_rate_limit callback,
> split out per Slava's request
>
> Patch 10/10 (rate table query):
> - Rename struct field "used" to "port_used" to clarify per-port
> scope
> - Strengthen Doxygen: rate table is a global shared HW resource
> (firmware, kernel, other DPDK instances may consume entries);
> port_used is a lower bound
> - Document PP sharing behavior with flags=0
> - Note that applications should aggregate across ports for
> device-wide visibility
>
> Changes since v2:
>
> Addressed review feedback from Stephen Hemminger:
>
> Patch 04: cleaned redundant cast parentheses on (struct mlx5dv_pp *)
> Patch 04: consolidated dv_alloc_pp call onto one line
> Patch 05+08: removed redundant queue_idx bounds checks from driver
> callbacks — ethdev layer is the single validation point
> Patch 07: added generic testpmd command: show port <id> queue <id> rate
> Patch 08+10: removed release notes from release_26_03.rst (targets 26.07)
> Patch 10: use MLX5_MEM_SYS | MLX5_MEM_ZERO for heap allocation
> Patch 10: consolidated packet_pacing_rate_table_size onto one line
>
> Changes since v1:
>
> Patch 01: Acked-by Viacheslav Ovsiienko
> Patch 04: rate bounds validation, uint64_t overflow fix, remove
> early PP free
> Patch 05: PP leak fix (temp struct pattern), rte_errno in error paths
> Patch 07: inverted rte_eth_tx_queue_is_valid() check
> Patch 10: stack array replaced with heap, per-port scope documented
>
> Testing:
>
> - Build: GCC, no warnings
> - Hardware: ConnectX-6 Dx
> - DevX path (default): set/get/disable rate limiting verified
> - Verbs path (dv_flow_en=0): returns -EINVAL cleanly (SQ DevX
> object not available), no crash
>
> Vincent Jardin (10):
> doc/nics/mlx5: fix stale packet pacing documentation
> common/mlx5: query packet pacing rate table capabilities
> common/mlx5: extend SQ modify to support rate limit update
> net/mlx5: add per-queue packet pacing infrastructure
> net/mlx5: support per-queue rate limiting
> net/mlx5: add burst pacing devargs
> net/mlx5: add testpmd command to query per-queue rate limit
> ethdev: add getter for per-queue Tx rate limit
> net/mlx5: implement per-queue Tx rate limit getter
> net/mlx5: add rate table capacity query API
>
> Vincent Jardin (10):
> doc/nics/mlx5: fix stale packet pacing documentation
> common/mlx5: query packet pacing rate table capabilities
> common/mlx5: extend SQ modify to support rate limit update
> net/mlx5: add per-queue packet pacing infrastructure
> net/mlx5: support per-queue rate limiting
> net/mlx5: add burst pacing devargs
> net/mlx5: add testpmd command to query per-queue rate limit
> ethdev: add getter for per-queue Tx rate limit
> net/mlx5: implement per-queue Tx rate limit getter
> net/mlx5: add rate table capacity query API
>
> app/test-pmd/cmdline.c | 69 ++++++++++
> doc/guides/nics/features/mlx5.ini | 1 +
> doc/guides/nics/mlx5.rst | 180 ++++++++++++++++++++++-----
> drivers/common/mlx5/mlx5_devx_cmds.c | 23 ++++
> drivers/common/mlx5/mlx5_devx_cmds.h | 14 ++-
> drivers/common/mlx5/mlx5_prm.h | 7 ++
> drivers/net/mlx5/mlx5.c | 46 +++++++
> drivers/net/mlx5/mlx5.h | 13 ++
> drivers/net/mlx5/mlx5_testpmd.c | 93 ++++++++++++++
> drivers/net/mlx5/mlx5_tx.c | 104 +++++++++++++++-
> drivers/net/mlx5/mlx5_tx.h | 5 +
> drivers/net/mlx5/mlx5_txpp.c | 84 +++++++++++++
> drivers/net/mlx5/mlx5_txq.c | 149 ++++++++++++++++++++++
> drivers/net/mlx5/rte_pmd_mlx5.h | 74 +++++++++++
> lib/ethdev/ethdev_driver.h | 7 ++
> lib/ethdev/ethdev_trace.h | 9 ++
> lib/ethdev/ethdev_trace_points.c | 3 +
> lib/ethdev/rte_ethdev.c | 35 ++++++
> lib/ethdev/rte_ethdev.h | 24 ++++
> 19 files changed, 906 insertions(+), 33 deletions(-)
>
I didn't see anything wrong, but AI nit picking did.
---
## Review: [PATCH v4 01-10/10] mlx5 per-queue rate limiting
### Patch 05/10 — `net/mlx5: support per-queue rate limiting`
**Error (Correctness): Data truncation in `rate_kbps` → PRM `rate_limit` field**
In `mlx5_txq_alloc_pp_rate_limit()`, `rate_kbps` is computed as `(uint64_t)rate_mbps * 1000` and then stored with `MLX5_SET(..., rate_limit, (uint32_t)rate_kbps)`. When `qos->packet_pacing_max_rate` is zero (HW doesn't report a maximum), the range check is skipped, and any `rate_mbps` value above ~4,294,967 will silently overflow the `uint32_t` cast, programming a completely wrong rate into the hardware.
For example, `rate_mbps = 5000000` (5 Tbps — unlikely but allowed by the `uint32_t` parameter type) yields `rate_kbps = 5,000,000,000` which truncates to `705,032,704` kbps.
Suggested fix: add an explicit check that `rate_kbps` fits in `uint32_t` before the cast, or cap it:
```c
if (rate_kbps > UINT32_MAX) {
DRV_LOG(ERR, "Rate %u Mbps overflows PRM rate_limit field.", rate_mbps);
rte_errno = ERANGE;
return -ERANGE;
}
```
**Warning (Correctness): Missing NULL check on `txq_data` in `rte_pmd_mlx5_txq_rate_limit_query()` (patch 07)**
After `rte_eth_tx_queue_is_valid()` confirms the index is in range, the code does:
```c
txq_data = (*priv->txqs)[queue_id];
txq_ctrl = container_of(txq_data, struct mlx5_txq_ctrl, txq);
```
If the queue index is valid but the queue has been released (entry is NULL), `container_of` on a NULL pointer produces an invalid pointer, and subsequent dereferences are undefined behavior. The setter in patch 05 checks `(*priv->txqs)[queue_idx] == NULL` but the query function does not.
Suggested fix: add a NULL check on `txq_data` before the `container_of`, similar to what the setter does.
The same issue exists in `rte_pmd_mlx5_pp_rate_table_query()` (patch 10) but there the loop body does check `(*priv->txqs)[i] == NULL` and skips, so that one is fine.
### Patch 09/10 — `net/mlx5: implement per-queue Tx rate limit getter`
**Warning (Correctness): Missing NULL check on queue data in `mlx5_get_queue_rate_limit()`**
The getter checks `priv->txqs == NULL || (*priv->txqs)[queue_idx] == NULL` which is good. However, the ethdev layer (`rte_eth_get_queue_rate_limit()` in patch 08) only checks `queue_idx >= dev->data->nb_tx_queues`, not whether `priv->txqs` is allocated. The driver-side check handles this, so this is actually fine — noting for completeness.
### Patch 01/10 — `doc/nics/mlx5: fix stale packet pacing documentation`
**Warning: Missing `Cc: stable@dpdk.org`**
The patch has a `Fixes:` tag referencing a previous commit but is missing `Cc: stable@dpdk.org` for backport consideration, even though it's a documentation-only fix.
### Patches 02, 03, 04, 06, 08, 10
No correctness issues found. The code is well-structured with proper error handling, resource cleanup on failure paths, and appropriate validation.
### General observations
The series is well organized with clean separation of concerns across patches. Error paths in `mlx5_set_queue_rate_limit()` properly clean up the temporary `new_rate_limit` PP allocation on modify-SQ failure, and `mlx5_txq_release()` correctly frees the rate limit context on queue teardown. The `rte_errno = -ret` pattern after `mlx5_devx_cmd_modify_sq()` is consistent with how other mlx5 callers handle that function's negative errno return convention.
next prev parent reply other threads:[~2026-03-23 23:10 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-10 9:20 [PATCH v1 00/10] net/mlx5: per-queue Tx rate limiting via packet pacing Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 01/10] doc/nics/mlx5: fix stale packet pacing documentation Vincent Jardin
2026-03-11 12:26 ` Slava Ovsiienko
2026-03-10 9:20 ` [PATCH v1 02/10] common/mlx5: query packet pacing rate table capabilities Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 03/10] common/mlx5: extend SQ modify to support rate limit update Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 04/10] net/mlx5: add per-queue packet pacing infrastructure Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 05/10] net/mlx5: support per-queue rate limiting Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 06/10] net/mlx5: add burst pacing devargs Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 07/10] net/mlx5: add testpmd command to query per-queue rate limit Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 08/10] ethdev: add getter for per-queue Tx " Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 9/10] net/mlx5: share pacing rate table entries across queues Vincent Jardin
2026-03-10 9:20 ` [PATCH v1 10/10] net/mlx5: add rate table capacity query API Vincent Jardin
2026-03-10 14:20 ` [PATCH v1 00/10] net/mlx5: per-queue Tx rate limiting via packet pacing Stephen Hemminger
2026-03-10 23:26 ` [PATCH v2 " Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 01/10] doc/nics/mlx5: fix stale packet pacing documentation Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 02/10] common/mlx5: query packet pacing rate table capabilities Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 03/10] common/mlx5: extend SQ modify to support rate limit update Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 04/10] net/mlx5: add per-queue packet pacing infrastructure Vincent Jardin
2026-03-11 16:29 ` Stephen Hemminger
2026-03-10 23:26 ` [PATCH v2 05/10] net/mlx5: support per-queue rate limiting Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 06/10] net/mlx5: add burst pacing devargs Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 07/10] net/mlx5: add testpmd command to query per-queue rate limit Vincent Jardin
2026-03-11 16:31 ` Stephen Hemminger
2026-03-10 23:26 ` [PATCH v2 08/10] ethdev: add getter for per-queue Tx " Vincent Jardin
2026-03-11 16:17 ` Stephen Hemminger
2026-03-11 16:26 ` Stephen Hemminger
2026-03-12 15:54 ` Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 09/10] net/mlx5: share pacing rate table entries across queues Vincent Jardin
2026-03-10 23:26 ` [PATCH v2 10/10] net/mlx5: add rate table capacity query API Vincent Jardin
2026-03-11 16:31 ` Stephen Hemminger
2026-03-11 16:35 ` Stephen Hemminger
2026-03-12 15:05 ` Vincent Jardin
2026-03-12 16:01 ` Stephen Hemminger
2026-03-12 22:01 ` [PATCH v3 00/9] net/mlx5: per-queue Tx rate limiting via packet pacing Vincent Jardin
2026-03-12 22:01 ` [PATCH v3 01/9] doc/nics/mlx5: fix stale packet pacing documentation Vincent Jardin
2026-03-12 22:01 ` [PATCH v3 02/9] common/mlx5: query packet pacing rate table capabilities Vincent Jardin
2026-03-20 12:02 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 03/9] common/mlx5: extend SQ modify to support rate limit update Vincent Jardin
2026-03-20 12:01 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 04/9] net/mlx5: add per-queue packet pacing infrastructure Vincent Jardin
2026-03-20 12:51 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 05/9] net/mlx5: support per-queue rate limiting Vincent Jardin
2026-03-20 15:11 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 06/9] net/mlx5: add burst pacing devargs Vincent Jardin
2026-03-20 15:19 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 07/9] net/mlx5: add testpmd command to query per-queue rate limit Vincent Jardin
2026-03-20 15:38 ` Slava Ovsiienko
2026-03-22 14:02 ` Vincent Jardin
2026-03-12 22:01 ` [PATCH v3 08/9] ethdev: add getter for per-queue Tx " Vincent Jardin
2026-03-20 15:44 ` Slava Ovsiienko
2026-03-12 22:01 ` [PATCH v3 09/9] net/mlx5: add rate table capacity query API Vincent Jardin
2026-03-20 15:49 ` Slava Ovsiienko
2026-03-16 16:04 ` [PATCH v3 00/9] net/mlx5: per-queue Tx rate limiting via packet pacing Stephen Hemminger
2026-03-22 14:16 ` Vincent Jardin
2026-03-22 13:46 ` [PATCH v4 00/10] " Vincent Jardin
2026-03-22 13:46 ` [PATCH v4 01/10] doc/nics/mlx5: fix stale packet pacing documentation Vincent Jardin
2026-03-22 13:46 ` [PATCH v4 02/10] common/mlx5: query packet pacing rate table capabilities Vincent Jardin
2026-03-22 13:46 ` [PATCH v4 03/10] common/mlx5: extend SQ modify to support rate limit update Vincent Jardin
2026-03-23 12:59 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 04/10] net/mlx5: add per-queue packet pacing infrastructure Vincent Jardin
2026-03-23 13:00 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 05/10] net/mlx5: support per-queue rate limiting Vincent Jardin
2026-03-23 13:17 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 06/10] net/mlx5: add burst pacing devargs Vincent Jardin
2026-03-23 13:18 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 07/10] net/mlx5: add testpmd command to query per-queue rate limit Vincent Jardin
2026-03-23 13:19 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 08/10] ethdev: add getter for per-queue Tx " Vincent Jardin
2026-03-23 13:19 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 09/10] net/mlx5: implement per-queue Tx rate limit getter Vincent Jardin
2026-03-23 13:20 ` Slava Ovsiienko
2026-03-22 13:46 ` [PATCH v4 10/10] net/mlx5: add rate table capacity query API Vincent Jardin
2026-03-23 13:20 ` Slava Ovsiienko
2026-03-23 23:09 ` Stephen Hemminger [this message]
2026-03-24 16:50 ` [PATCH v5 00/10] net/mlx5: per-queue Tx rate limiting via packet pacing Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 01/10] doc/nics/mlx5: fix stale packet pacing documentation Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 02/10] common/mlx5: query packet pacing rate table capabilities Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 03/10] common/mlx5: extend SQ modify to support rate limit update Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 04/10] net/mlx5: add per-queue packet pacing infrastructure Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 05/10] net/mlx5: support per-queue rate limiting Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 06/10] net/mlx5: add burst pacing devargs Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 07/10] net/mlx5: add testpmd command to query per-queue rate limit Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 08/10] ethdev: add getter for per-queue Tx " Vincent Jardin
2026-03-25 2:24 ` Stephen Hemminger
2026-03-24 16:50 ` [PATCH v5 09/10] net/mlx5: implement per-queue Tx rate limit getter Vincent Jardin
2026-03-24 16:50 ` [PATCH v5 10/10] net/mlx5: add rate table capacity query API Vincent Jardin
2026-03-25 2:25 ` [PATCH v5 00/10] net/mlx5: per-queue Tx rate limiting via packet pacing Stephen Hemminger
2026-04-13 13:18 ` Raslan Darawsheh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260323160955.5379adfe@phoenix.local \
--to=stephen@networkplumber.org \
--cc=aman.deep.singh@intel.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=bingz@nvidia.com \
--cc=dev@dpdk.org \
--cc=dsosnowski@nvidia.com \
--cc=matan@nvidia.com \
--cc=orika@nvidia.com \
--cc=rasland@nvidia.com \
--cc=suanmingm@nvidia.com \
--cc=thomas@monjalon.net \
--cc=viacheslavo@nvidia.com \
--cc=vjardin@free.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.