* Re: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-16 18:00 [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics Stephen Hemminger
@ 2026-02-17 6:41 ` Andrew Rybchenko
2026-02-17 14:54 ` Stephen Hemminger
2026-02-18 8:48 ` Morten Brørup
2026-02-19 0:44 ` [PATCH v2] " Stephen Hemminger
2 siblings, 1 reply; 11+ messages in thread
From: Andrew Rybchenko @ 2026-02-17 6:41 UTC (permalink / raw)
To: Stephen Hemminger, dev; +Cc: Thomas Monjalon
On 2/16/26 9:00 PM, Stephen Hemminger wrote:
> The documentation for rte_eth_tx_burst() uses the word "sent" to
> describe the return value, which is misleading. Packets returned as
> consumed may not have been transmitted yet; they have been accepted
> by the driver and are no longer the caller's responsibility.
>
> This matters because the common usage pattern is:
>
> n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> for (i = n; i < nb_pkts; i++)
> rte_pktmbuf_free(mbufs[i]);
>
> For this to work correctly, the contract must be:
> - tx_pkts[0..n-1]: ownership transferred to the driver.
> - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
>
> Several drivers (and AI-assisted reviews) misinterpret the current
> wording and treat packets with errors as unconsumed, returning a
> short count. This causes callers to retry those packets indefinitely.
> The correct behavior is that the driver must consume (and free)
> erroneous packets, counting them via tx_errors.
>
> Replace "sent" with "consumed" in the return value description,
> spell out the mbuf ownership contract, clarify the error handling
> expectation, and update the @return block to match.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Thanks for the clarification. I really like it.
* Re: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-17 6:41 ` Andrew Rybchenko
@ 2026-02-17 14:54 ` Stephen Hemminger
0 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2026-02-17 14:54 UTC (permalink / raw)
To: Andrew Rybchenko; +Cc: dev, Thomas Monjalon
On Tue, 17 Feb 2026 09:41:07 +0300
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> On 2/16/26 9:00 PM, Stephen Hemminger wrote:
> > The documentation for rte_eth_tx_burst() uses the word "sent" to
> > describe the return value, which is misleading. Packets returned as
> > consumed may not have been transmitted yet; they have been accepted
> > by the driver and are no longer the caller's responsibility.
> >
> > This matters because the common usage pattern is:
> >
> > n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> > for (i = n; i < nb_pkts; i++)
> > rte_pktmbuf_free(mbufs[i]);
> >
> > For this to work correctly, the contract must be:
> > - tx_pkts[0..n-1]: ownership transferred to the driver.
> > - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
> >
> > Several drivers (and AI-assisted reviews) misinterpret the current
> > wording and treat packets with errors as unconsumed, returning a
> > short count. This causes callers to retry those packets indefinitely.
> > The correct behavior is that the driver must consume (and free)
> > erroneous packets, counting them via tx_errors.
> >
> > Replace "sent" with "consumed" in the return value description,
> > spell out the mbuf ownership contract, clarify the error handling
> > expectation, and update the @return block to match.
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>
> Thanks for the clarification. I really like it.
>
I haven't reviewed all drivers but have found bugs related to this
in tap, af_packet and likely other software drivers. The hardware
drivers seem to be modeled after ixgbe and get it right.
* RE: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-16 18:00 [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics Stephen Hemminger
2026-02-17 6:41 ` Andrew Rybchenko
@ 2026-02-18 8:48 ` Morten Brørup
2026-02-18 8:52 ` Bruce Richardson
2026-02-18 17:13 ` Stephen Hemminger
2026-02-19 0:44 ` [PATCH v2] " Stephen Hemminger
2 siblings, 2 replies; 11+ messages in thread
From: Morten Brørup @ 2026-02-18 8:48 UTC (permalink / raw)
To: Stephen Hemminger, dev; +Cc: Thomas Monjalon, Andrew Rybchenko
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Monday, 16 February 2026 19.00
>
> The documentation for rte_eth_tx_burst() uses the word "sent" to
> describe the return value, which is misleading. Packets returned as
> consumed may not have been transmitted yet; they have been accepted
> by the driver and are no longer the caller's responsibility.
>
> This matters because the common usage pattern is:
>
> n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> for (i = n; i < nb_pkts; i++)
> rte_pktmbuf_free(mbufs[i]);
>
> For this to work correctly, the contract must be:
> - tx_pkts[0..n-1]: ownership transferred to the driver.
> - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
>
> Several drivers (and AI-assisted reviews) misinterpret the current
> wording and treat packets with errors as unconsumed, returning a
> short count. This causes callers to retry those packets indefinitely.
> The correct behavior is that the driver must consume (and free)
> erroneous packets, counting them via tx_errors.
>
> Replace "sent" with "consumed" in the return value description,
> spell out the mbuf ownership contract, clarify the error handling
> expectation, and update the @return block to match.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/ethdev/rte_ethdev.h | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 0d8e2d0236..9e49c4a945 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6639,13 +6639,24 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t port_id, uint16_t queue_id,
> * of the ring.
> *
> * The rte_eth_tx_burst() function returns the number of packets it
> - * actually sent. A return value equal to *nb_pkts* means that all packets
> - * have been sent, and this is likely to signify that other output packets
> + * has consumed from the *tx_pkts* array. The driver takes ownership of
> + * the mbufs for all consumed packets (tx_pkts[0] to tx_pkts[n-1]);
> + * the caller must not access them afterward. The remaining packets
> + * (tx_pkts[n] to tx_pkts[nb_pkts-1]) are not modified and remain the
> + * caller's responsibility.
> + *
> + * A return value equal to *nb_pkts* means that all packets have been
> + * consumed, and this is likely to signify that other output packets
> * could be immediately transmitted again. Applications that implement a
> * "send as many packets to transmit as possible" policy can check this
> * specific case and keep invoking the rte_eth_tx_burst() function until
> * a value less than *nb_pkts* is returned.
> *
> + * If a packet cannot be transmitted due to an error (for example, an
> + * invalid offload flag), the driver must still consume it and free the
> + * mbuf, rather than stopping at that point. Such packets should be
> + * counted in the *tx_errors* port statistic.
The above paragraph is driver centric, it should be application centric.
Suggest rephrasing as:
If a packet cannot be transmitted due to an error (for example, an invalid offload flag), the rte_eth_tx_burst() function will still consume it, rather than stopping at that point.
Such packets are counted in the *oerrors* port statistic.
NB: In struct rte_eth_stats [1], the error counter is named "oerrors", not "tx_errors".
[1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
While discussing details...
Let's say a packet has 4 segments, and the driver only has 2 descriptors remaining available.
In that case, I think the driver should not consume the packet, but leave it for the application to either drop it or retry transmitting it later.
Do we want to mention this case too, or is it a semi-obvious case of the descriptor ring having no more room?
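To make the multi-segment case concrete, here is a rough sketch against a toy model rather than the real DPDK headers. `struct toy_pkt` and `toy_enqueue()` are hypothetical names invented for illustration; only the `nb_segs` field name is borrowed from rte_mbuf. It assumes one descriptor per segment and stops at the first packet that does not fit completely, so that packet and everything after it stay with the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Toy packet: only the segment count matters here (hypothetical type). */
struct toy_pkt {
	uint16_t nb_segs;
};

/*
 * Hypothetical enqueue helper, assuming one descriptor per segment.
 * Returns the number of packets consumed. It stops at the first packet
 * whose segments do not all fit, and does not skip ahead to later,
 * smaller packets, so packet order is preserved.
 */
uint16_t toy_enqueue(uint16_t *free_descs, struct toy_pkt *const *pkts,
		     uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		assert(pkts[i]->nb_segs > 0);
		if (pkts[i]->nb_segs > *free_descs)
			break; /* no room: pkts[i..] remain with the caller */
		*free_descs -= pkts[i]->nb_segs;
	}
	return i;
}
```

In this model a short return count means "no room right now", which the application can answer by retrying later or by freeing the remaining packets.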
> + *
> * It is the responsibility of the rte_eth_tx_burst() function to
> * transparently free the memory buffers of packets previously sent.
> * This feature is driven by the *tx_free_thresh* value supplied to the
> @@ -6679,9 +6690,9 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t port_id, uint16_t queue_id,
> * @param nb_pkts
> * The maximum number of packets to transmit.
> * @return
> - * The number of output packets actually stored in transmit descriptors of
> - * the transmit ring. The return value can be less than the value of the
> - * *tx_pkts* parameter when the transmit ring is full or has been filled up.
> + * The number of packets consumed from the *tx_pkts* array.
> + * The return value can be less than the value of the
> + * *nb_pkts* parameter when the transmit ring is full or has been filled up.
> */
> static inline uint16_t
> rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
> --
> 2.51.0
* Re: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-18 8:48 ` Morten Brørup
@ 2026-02-18 8:52 ` Bruce Richardson
2026-02-18 17:13 ` Stephen Hemminger
1 sibling, 0 replies; 11+ messages in thread
From: Bruce Richardson @ 2026-02-18 8:52 UTC (permalink / raw)
To: Morten Brørup
Cc: Stephen Hemminger, dev, Thomas Monjalon, Andrew Rybchenko
On Wed, Feb 18, 2026 at 09:48:04AM +0100, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Monday, 16 February 2026 19.00
> >
> > The documentation for rte_eth_tx_burst() uses the word "sent" to
> > describe the return value, which is misleading. Packets returned as
> > consumed may not have been transmitted yet; they have been accepted
> > by the driver and are no longer the caller's responsibility.
> >
> > This matters because the common usage pattern is:
> >
> > n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> > for (i = n; i < nb_pkts; i++)
> > rte_pktmbuf_free(mbufs[i]);
> >
> > For this to work correctly, the contract must be:
> > - tx_pkts[0..n-1]: ownership transferred to the driver.
> > - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
> >
> > Several drivers (and AI-assisted reviews) misinterpret the current
> > wording and treat packets with errors as unconsumed, returning a
> > short count. This causes callers to retry those packets indefinitely.
> > The correct behavior is that the driver must consume (and free)
> > erroneous packets, counting them via tx_errors.
> >
> > Replace "sent" with "consumed" in the return value description,
> > spell out the mbuf ownership contract, clarify the error handling
> > expectation, and update the @return block to match.
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > lib/ethdev/rte_ethdev.h | 21 ++++++++++++++++-----
> > 1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 0d8e2d0236..9e49c4a945 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6639,13 +6639,24 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t
> > port_id, uint16_t queue_id,
> > * of the ring.
> > *
> > * The rte_eth_tx_burst() function returns the number of packets it
> > - * actually sent. A return value equal to *nb_pkts* means that all
> > packets
> > - * have been sent, and this is likely to signify that other output
> > packets
> > + * has consumed from the *tx_pkts* array. The driver takes ownership
> > of
> > + * the mbufs for all consumed packets (tx_pkts[0] to tx_pkts[n-1]);
> > + * the caller must not access them afterward. The remaining packets
> > + * (tx_pkts[n] to tx_pkts[nb_pkts-1]) are not modified and remain the
> > + * caller's responsibility.
> > + *
> > + * A return value equal to *nb_pkts* means that all packets have been
> > + * consumed, and this is likely to signify that other output packets
> > * could be immediately transmitted again. Applications that implement
> > a
> > * "send as many packets to transmit as possible" policy can check
> > this
> > * specific case and keep invoking the rte_eth_tx_burst() function
> > until
> > * a value less than *nb_pkts* is returned.
> > *
> > + * If a packet cannot be transmitted due to an error (for example, an
> > + * invalid offload flag), the driver must still consume it and free
> > the
> > + * mbuf, rather than stopping at that point. Such packets should be
> > + * counted in the *tx_errors* port statistic.
>
> The above paragraph is driver centric, it should be application centric.
> Suggest rephrasing as:
>
> If a packet cannot be transmitted due to an error (for example, an invalid offload flag), the rte_eth_tx_burst() function will still consume it, rather than stopping at that point.
> Such packets are counted in the *oerrors* port statistic.
>
> NB: In struct rte_eth_stats [1], the error counter is named "oerrors", not "tx_errors".
>
> [1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
>
> While discussing details...
> Let's say a packet has 4 segments, and the driver only has 2 descriptors remaining available.
> In that case, I think the driver should not consume the packet, but leave it for the application to either drop it or retry transmitting it later.
> Do we want to mention this case too, or is it a semi-obvious case of the descriptor ring having no more room?
>
I would tend towards it being covered by the descriptor ring not having
room. If we try to cover all edge cases here the documentation will get too
long and therefore less likely to be read.
/Bruce
* Re: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-18 8:48 ` Morten Brørup
2026-02-18 8:52 ` Bruce Richardson
@ 2026-02-18 17:13 ` Stephen Hemminger
2026-02-18 17:32 ` Morten Brørup
1 sibling, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2026-02-18 17:13 UTC (permalink / raw)
To: Morten Brørup; +Cc: dev, Thomas Monjalon, Andrew Rybchenko
On Wed, 18 Feb 2026 09:48:04 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:
> > + *
> > + * A return value equal to *nb_pkts* means that all packets have been
> > + * consumed, and this is likely to signify that other output packets
> > * could be immediately transmitted again. Applications that implement
> > a
> > * "send as many packets to transmit as possible" policy can check
> > this
> > * specific case and keep invoking the rte_eth_tx_burst() function
> > until
> > * a value less than *nb_pkts* is returned.
> > *
> > + * If a packet cannot be transmitted due to an error (for example, an
> > + * invalid offload flag), the driver must still consume it and free
> > the
> > + * mbuf, rather than stopping at that point. Such packets should be
> > + * counted in the *tx_errors* port statistic.
>
> The above paragraph is driver centric, it should be application centric.
Most applications are doing it right already, since everybody
starts from l2fwd or l3fwd. The problem I see is buggy drivers.
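For reference, a rough sketch of the driver-side behaviour being asked for, written as a toy model and not taken from any real driver. `struct toy_pkt`, `struct toy_txq` and `toy_xmit()` are hypothetical names; the `bad` flag stands in for something like an invalid offload flag, and `oerrors` mirrors the counter name in struct rte_eth_stats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet and queue state (hypothetical types for illustration). */
struct toy_pkt {
	bool bad;    /* e.g. an invalid offload request */
	bool freed;  /* set when the driver drops the packet */
	bool queued; /* set when the driver accepts it for transmit */
};

struct toy_txq {
	uint16_t free_slots;
	uint64_t oerrors;
};

/* Hypothetical burst function: returns the number of packets consumed. */
uint16_t toy_xmit(struct toy_txq *q, struct toy_pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		struct toy_pkt *p = pkts[i];

		assert(p != NULL);
		if (p->bad) {
			/* Error: consume, free and count it, then keep going. */
			p->freed = true;
			q->oerrors++;
			continue;
		}
		if (q->free_slots == 0)
			break; /* ring full: pkts[i..] stay with the caller */
		q->free_slots--;
		p->queued = true;
	}
	return i; /* consumed = queued + dropped in error */
}
```

The buggy variant would `break` in the error branch instead, which hands the bad packet back to a caller that may retry it forever.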
> Suggest rephrasing as:
>
> If a packet cannot be transmitted due to an error (for example, an invalid offload flag), the rte_eth_tx_burst() function will still consume it, rather than stopping at that point.
> Such packets are counted in the *oerrors* port statistic.
>
> NB: In struct rte_eth_stats [1], the error counter is named "oerrors", not "tx_errors".
>
> [1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
Good point, I was thinking of the per-queue stats and xstats.
> While discussing details...
> Let's say a packet has 4 segments, and the driver only has 2 descriptors remaining available.
> In that case, I think the driver should not consume the packet, but leave it for the application to either drop it or retry transmitting it later.
> Do we want to mention this case too, or is it a semi-obvious case of the descriptor ring having no more room?
There are also other cases of backpressure, like when a driver talks to the kernel and gets EAGAIN or EBUSY.
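One way a software driver might separate backpressure from real errors, as a sketch only. `classify_send_result()` and `toy_burst()` are hypothetical names, the kernel call is simulated by arrays of return codes and errno values, and treating every errno other than EAGAIN and EBUSY as a hard error is an assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum tx_action {
	TX_SENT,             /* accepted by the kernel */
	TX_STOP_RETRY_LATER, /* backpressure: stop, do not consume */
	TX_DROP_COUNT_ERROR  /* hard error: consume and count */
};

/* Hypothetical classifier for the result of a kernel send call. */
enum tx_action classify_send_result(int rc, int err)
{
	if (rc >= 0)
		return TX_SENT;
	switch (err) {
	case EAGAIN:
	case EBUSY:
		return TX_STOP_RETRY_LATER;
	default:
		return TX_DROP_COUNT_ERROR;
	}
}

/*
 * Hypothetical burst loop over simulated send results.
 * rc[i] and err[i] play the role of the return value and errno
 * for packet i. Returns the number of packets consumed.
 */
uint16_t toy_burst(const int *rc, const int *err, uint16_t nb_pkts,
		   uint64_t *oerrors)
{
	uint16_t i;

	assert(oerrors != NULL);
	for (i = 0; i < nb_pkts; i++) {
		enum tx_action a = classify_send_result(rc[i], err[i]);

		if (a == TX_STOP_RETRY_LATER)
			break; /* packet i and the rest stay with the caller */
		if (a == TX_DROP_COUNT_ERROR)
			(*oerrors)++; /* consumed, freed by the driver, counted */
	}
	return i;
}
```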
* RE: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-18 17:13 ` Stephen Hemminger
@ 2026-02-18 17:32 ` Morten Brørup
0 siblings, 0 replies; 11+ messages in thread
From: Morten Brørup @ 2026-02-18 17:32 UTC (permalink / raw)
To: Stephen Hemminger
Cc: dev, Thomas Monjalon, Andrew Rybchenko, bruce.richardson
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 18 February 2026 18.13
>
> On Wed, 18 Feb 2026 09:48:04 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > > + *
> > > + * A return value equal to *nb_pkts* means that all packets have
> been
> > > + * consumed, and this is likely to signify that other output
> packets
> > > * could be immediately transmitted again. Applications that
> implement
> > > a
> > > * "send as many packets to transmit as possible" policy can check
> > > this
> > > * specific case and keep invoking the rte_eth_tx_burst() function
> > > until
> > > * a value less than *nb_pkts* is returned.
> > > *
> > > + * If a packet cannot be transmitted due to an error (for example,
> an
> > > + * invalid offload flag), the driver must still consume it and
> free
> > > the
> > > + * mbuf, rather than stopping at that point. Such packets should
> be
> > > + * counted in the *tx_errors* port statistic.
> >
> > The above paragraph is driver centric, it should be application
> centric.
>
> Most of the applications are doing it right already since everybody
> starts with l2fwd, or l3fwd. The problem I see is buggy drivers.
I agree.
But this API is for applications, so its documentation should be written for application developers.
Drivers are doing it wrong because driver APIs are largely undocumented, e.g. [2] and [3].
[2]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev_core.h#L33
[3]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/mempool/rte_mempool.h#L478
It would be an improvement if driver API documentation at least referred to the application APIs that wrap them.
>
> > Suggest rephrasing as:
> >
> > If a packet cannot be transmitted due to an error (for example, an
> invalid offload flag), the rte_eth_tx_burst() function will still
> consume it, rather than stopping at that point.
> > Such packets are counted in the *oerrors* port statistic.
> >
> > NB: In struct rte_eth_stats [1], the error counter is named
> "oerrors", not "tx_errors".
> >
> > [1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
>
> Good point, I was thinking of the per-queue stats and xstats.
>
> > While discussing details...
> > Let's say a packet has 4 segments, and the driver only has 2
> descriptors remaining available.
> > In that case, I think the driver should not consume the packet, but
> leave it for the application to either drop it or retry transmitting it
> later.
> > Do we want to mention this case too, or is it a semi-obvious case of
> the descriptor ring having no more room?
>
> There are also other cases of backpressure like when driver talks to
> kernel and gets EAGAIN or EBUSY
* [PATCH v2] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-16 18:00 [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics Stephen Hemminger
2026-02-17 6:41 ` Andrew Rybchenko
2026-02-18 8:48 ` Morten Brørup
@ 2026-02-19 0:44 ` Stephen Hemminger
2026-02-19 7:20 ` Morten Brørup
2026-02-19 19:00 ` Stephen Hemminger
2 siblings, 2 replies; 11+ messages in thread
From: Stephen Hemminger @ 2026-02-19 0:44 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Andrew Rybchenko
The documentation for rte_eth_tx_burst() uses the word "sent" to
describe the return value, which is misleading. Packets returned as
consumed may not have been transmitted yet; they have been accepted
by the driver and are no longer the caller's responsibility.
This matters because the common usage pattern is:
n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
for (i = n; i < nb_pkts; i++)
rte_pktmbuf_free(mbufs[i]);
For this to work correctly, the contract must be:
- tx_pkts[0..n-1]: ownership transferred to the driver.
- tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
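The contract above can be sketched from the caller's side as follows. This is a toy model and not the real DPDK API, so that it stands alone: `struct toy_mbuf`, `toy_tx_burst()`, `toy_free()` and `send_or_drop()` are hypothetical stand-ins for rte_mbuf, rte_eth_tx_burst(), rte_pktmbuf_free() and application code, and the `room` parameter is an assumed ring capacity.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct rte_mbuf: tracks only who owns the buffer. */
enum toy_owner { OWNER_APP, OWNER_DRIVER, OWNER_FREED };

struct toy_mbuf {
	enum toy_owner owner;
};

/* Hypothetical stand-in for rte_pktmbuf_free(). */
void toy_free(struct toy_mbuf *m)
{
	assert(m->owner == OWNER_APP); /* only the current owner may free */
	m->owner = OWNER_FREED;
}

/*
 * Hypothetical stand-in for rte_eth_tx_burst(): consumes at most 'room'
 * packets from the front of the array and leaves the rest untouched.
 */
uint16_t toy_tx_burst(uint16_t room, struct toy_mbuf **pkts, uint16_t nb_pkts)
{
	uint16_t n = nb_pkts < room ? nb_pkts : room;
	uint16_t i;

	for (i = 0; i < n; i++)
		pkts[i]->owner = OWNER_DRIVER; /* pkts[0..n-1] change owner */
	return n;
}

/* Caller: transmit, then drop whatever was not consumed. */
uint16_t send_or_drop(uint16_t room, struct toy_mbuf **pkts, uint16_t nb_pkts)
{
	uint16_t n = toy_tx_burst(room, pkts, nb_pkts);
	uint16_t i;

	for (i = n; i < nb_pkts; i++)
		toy_free(pkts[i]); /* pkts[n..nb_pkts-1] are still ours */
	return (uint16_t)(nb_pkts - n); /* number of packets dropped */
}
```

If the burst function touched or freed any of pkts[n..nb_pkts-1], the assert in toy_free() would fire, which is the double free the contract is meant to rule out.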
Several drivers (and AI-assisted reviews) misinterpret the current
wording and treat packets with errors as unconsumed, returning a
short count. This causes callers to retry those packets indefinitely.
The correct behavior is that the driver must consume (and free)
erroneous packets, counting them via oerrors.
Replace "sent" with "consumed" in the return value description,
spell out the mbuf ownership contract, clarify the error handling
expectation, and update the @return block to match.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
lib/ethdev/rte_ethdev.h | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 0d8e2d0236..9e49c4a945 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -6639,13 +6639,24 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t port_id, uint16_t queue_id,
* of the ring.
*
* The rte_eth_tx_burst() function returns the number of packets it
- * actually sent. A return value equal to *nb_pkts* means that all packets
- * have been sent, and this is likely to signify that other output packets
+ * has consumed from the *tx_pkts* array. The driver takes ownership of
+ * the mbufs for all consumed packets (tx_pkts[0] to tx_pkts[n-1]);
+ * the caller must not access them afterward. The remaining packets
+ * (tx_pkts[n] to tx_pkts[nb_pkts-1]) are not modified and remain the
+ * caller's responsibility.
+ *
+ * A return value equal to *nb_pkts* means that all packets have been
+ * consumed, and this is likely to signify that other output packets
* could be immediately transmitted again. Applications that implement a
* "send as many packets to transmit as possible" policy can check this
* specific case and keep invoking the rte_eth_tx_burst() function until
* a value less than *nb_pkts* is returned.
*
+ * If a packet cannot be transmitted due to an error (for example, an
+ * invalid offload flag), the driver must still consume it and free the
+ * mbuf, rather than stopping at that point. Such packets should be
+ * counted in the *tx_errors* port statistic.
+ *
* It is the responsibility of the rte_eth_tx_burst() function to
* transparently free the memory buffers of packets previously sent.
* This feature is driven by the *tx_free_thresh* value supplied to the
@@ -6679,9 +6690,9 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t port_id, uint16_t queue_id,
* @param nb_pkts
* The maximum number of packets to transmit.
* @return
- * The number of output packets actually stored in transmit descriptors of
- * the transmit ring. The return value can be less than the value of the
- * *tx_pkts* parameter when the transmit ring is full or has been filled up.
+ * The number of packets consumed from the *tx_pkts* array.
+ * The return value can be less than the value of the
+ * *nb_pkts* parameter when the transmit ring is full or has been filled up.
*/
static inline uint16_t
rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
--
2.51.0
* RE: [PATCH v2] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-19 0:44 ` [PATCH v2] " Stephen Hemminger
@ 2026-02-19 7:20 ` Morten Brørup
2026-02-19 19:00 ` Stephen Hemminger
2026-02-19 19:00 ` Stephen Hemminger
1 sibling, 1 reply; 11+ messages in thread
From: Morten Brørup @ 2026-02-19 7:20 UTC (permalink / raw)
To: Stephen Hemminger, dev; +Cc: Andrew Rybchenko, bruce.richardson
> + * If a packet cannot be transmitted due to an error (for example, an
> + * invalid offload flag), the driver must still consume it and free the
> + * mbuf, rather than stopping at that point. Such packets should be
> + * counted in the *tx_errors* port statistic.
> + *
Looks like v1.
Please update with my feedback.
-Morten
* Re: [PATCH v2] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-19 7:20 ` Morten Brørup
@ 2026-02-19 19:00 ` Stephen Hemminger
0 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2026-02-19 19:00 UTC (permalink / raw)
To: Morten Brørup; +Cc: dev, Andrew Rybchenko, bruce.richardson
On Thu, 19 Feb 2026 08:20:44 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:
> > + * If a packet cannot be transmitted due to an error (for example, an
> > + * invalid offload flag), the driver must still consume it and free the
> > + * mbuf, rather than stopping at that point. Such packets should be
> > + * counted in the *tx_errors* port statistic.
> > + *
>
> Looks like v1.
> Please update with my feedback.
>
> -Morten
Thanks, I missed that in the re-edit.
* Re: [PATCH v2] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
2026-02-19 0:44 ` [PATCH v2] " Stephen Hemminger
2026-02-19 7:20 ` Morten Brørup
@ 2026-02-19 19:00 ` Stephen Hemminger
1 sibling, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2026-02-19 19:00 UTC (permalink / raw)
To: dev; +Cc: Andrew Rybchenko
On Wed, 18 Feb 2026 16:44:49 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:
> The documentation for rte_eth_tx_burst() uses the word "sent" to
> describe the return value, which is misleading. Packets returned as
> consumed may not have been transmitted yet; they have been accepted
> by the driver and are no longer the caller's responsibility.
>
> This matters because the common usage pattern is:
>
> n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> for (i = n; i < nb_pkts; i++)
> rte_pktmbuf_free(mbufs[i]);
>
> For this to work correctly, the contract must be:
> - tx_pkts[0..n-1]: ownership transferred to the driver.
> - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
>
> Several drivers (and AI-assisted reviews) misinterpret the current
> wording and treat packets with errors as unconsumed, returning a
> short count. This causes callers to retry those packets indefinitely.
> The correct behavior is that the driver must consume (and free)
> erroneous packets, counting them via oerrors.
>
> Replace "sent" with "consumed" in the return value description,
> spell out the mbuf ownership contract, clarify the error handling
> expectation, and update the @return block to match.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> ---
FYI - many, many drivers got this wrong. Only a few seem to get
it right.