From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Simon Horman <horms@kernel.org>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org, netdev@vger.kernel.org
Subject: Re: [PATCH net] net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
Date: Sat, 11 Oct 2025 15:34:41 +0200 [thread overview]
Message-ID: <aOpc8d0dPGOnwfJE@lore-desk> (raw)
In-Reply-To: <aOo0woPiMxjABFv2@horms.kernel.org>
[-- Attachment #1: Type: text/plain, Size: 3998 bytes --]
> On Fri, Oct 10, 2025 at 07:21:43PM +0200, Lorenzo Bianconi wrote:
> > Completion napi can free out-of-order tx descriptors if hw QoS is
> > enabled and packets with different priority are queued to same DMA ring.
> > Take into account possible out-of-order reports checking if the tx queue
> > is full using circular buffer head/tail pointer instead of the number of
> > queued packets.
> >
> > Fixes: 23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC")
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> > drivers/net/ethernet/airoha/airoha_eth.c | 15 ++++++++++++++-
> > 1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 833dd911980b3f698bd7e5f9fd9e2ce131dd5222..5e2ff52dba03a7323141fe9860fba52806279bd0 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> > @@ -1873,6 +1873,19 @@ static u32 airoha_get_dsa_tag(struct sk_buff *skb, struct net_device *dev)
> > #endif
> > }
> >
> > +static bool airoha_dev_is_tx_busy(struct airoha_queue *q, u32 nr_frags)
> > +{
> > + u16 index = (q->head + nr_frags) % q->ndesc;
> > +
> > + /* completion napi can free out-of-order tx descriptors if hw QoS is
> > + * enabled and packets with different priorities are queued to the same
> > + * DMA ring. Take into account possible out-of-order reports checking
> > + * if the tx queue is full using circular buffer head/tail pointers
> > + * instead of the number of queued packets.
> > + */
> > + return index >= q->tail && (q->head < q->tail || q->head > index);
>
> Hi Lorenzo,
Hi Simon,
thx for the review.
>
> I think there is a corner case here.
> Perhaps they can't occur, but here goes.
>
> Let us suppose that head is 1.
> And the ring is completely full, so tail is 2.
>
> Now, suppose nr_frags is ndesc - 1.
> In this case the function above will return false. But the ring is full.
>
> Ok, ndesc is actually 1024 and nfrags should never be close to that.
> But the problem is general. And a perhaps more realistic example is:
>
> ndesc is 1024
> head is 1008
> The ring is full so tail is 1009
> (Or head is any other value that leaves less than 16 slots free)
> nr_frags is 16
>
> airoha_dev_is_tx_busy() returns false, even though the ring is full.
yes, you are right, this corner case is not properly managed by the proposed
algorithm, thx for pointing this out.
>
> Probably this has it's own problems. But if my reasoning above is correct
> (is it?) then the following seems to address it by flattening and extending
> the ring. Because what we are about is the relative value of head, index
> and tail. Not the slots they occupy in the ring.
>
> N.B: I tetsed the algorirthm with a quick implementation in user-space.
> The following is, however, completely untested.
>
> static bool airoha_dev_is_tx_busy(struct airoha_queue *q, u32 nr_frags)
> {
> unsigned int tail = q->tail < q->head ? q->tail + q->ndesc : q->tail;
> unsigned int index = q->head + nr_frags;
>
> return index >= tail;
> }
I agree, the algorithm you proposed properly manages the 99% of the cases. The
only case where it fails is when the queue is empty (so tail = head = x,
e.g. x = 0). In this case we would have:
- q->ndesc = 1024
- q->tail = q->head = 0
- tail = 0
- index = n (e.g. n = 1)
- index >= tail ==> 1 >= 0 ==> busy (but the queue is actually empty).
I guess we should add a minor change in the tail definition:
u32 tail = q->tail <= q->head ? q->tail + q->ndesc : q->tail;
so:
- q->ndesc = 1024
- q->tail = q->head = 0
- tail = 1024
- index = n (e.g. n = 1)
- index >= tail => 1 < 1024 => OK
Can you spot any downside with this approach?
I tested the proposed approach and it seems to be working fine.
Regards,
Lorenzo
>
> ...
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
next prev parent reply other threads:[~2025-10-11 13:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-10 17:21 [PATCH net] net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit() Lorenzo Bianconi
2025-10-11 10:43 ` Simon Horman
2025-10-11 13:34 ` Lorenzo Bianconi [this message]
2025-10-11 15:03 ` Simon Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aOpc8d0dPGOnwfJE@lore-desk \
--to=lorenzo@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.