From: Jesper Dangaard Brouer <hawk@kernel.org>
To: Toshiaki Makita <toshiaki.makita1@gmail.com>
Cc: "Eric Dumazet" <eric.dumazet@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
ihor.solodrai@linux.dev, "Michael S. Tsirkin" <mst@redhat.com>,
makita.toshiaki@lab.ntt.co.jp, bpf@vger.kernel.org,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, kernel-team@cloudflare.com,
netdev@vger.kernel.org, "Toke Høiland-Jørgensen" <toke@toke.dk>
Subject: Re: [PATCH net V2 2/2] veth: more robust handing of race to avoid txq getting stuck
Date: Wed, 29 Oct 2025 11:33:23 +0100
Message-ID: <27e74aeb-89f5-4547-8ecc-232570e2644c@kernel.org>
In-Reply-To: <aacc9c56-bea9-44eb-90fd-726d41b418dd@gmail.com>
On 28/10/2025 15.56, Toshiaki Makita wrote:
> On 2025/10/28 5:05, Jesper Dangaard Brouer wrote:
>
>> (1) In veth_xmit(), the racy conditional wake-up logic and its memory
>> barrier
>> are removed. Instead, after stopping the queue, we unconditionally call
>> __veth_xdp_flush(rq). This guarantees that the NAPI consumer is
>> scheduled,
>> making it solely responsible for re-waking the TXQ.
>
> Maybe another option is to use !ptr_ring_full() instead of
> ptr_ring_empty()?
Nope, that will not work.
I think MST will agree.
> I'm not sure which is better. Anyway I'm ok with your approach.
>
> ...
>
>> (3) Finally, the NAPI completion check in veth_poll() is updated. If NAPI is
>> about to complete (napi_complete_done), it now also checks if the peer TXQ
>> is stopped. If the ring is empty but the peer TXQ is stopped, NAPI will
>> reschedule itself. This prevents a new race where the producer stops the
>> queue just as the consumer is finishing its poll, ensuring the wakeup
>> is not missed.
> ...
>
>> @@ -986,7 +979,8 @@ static int veth_poll(struct napi_struct *napi, int
>> budget)
>> if (done < budget && napi_complete_done(napi, done)) {
>> /* Write rx_notify_masked before reading ptr_ring */
>> smp_store_mb(rq->rx_notify_masked, false);
>> - if (unlikely(!__ptr_ring_empty(&rq->xdp_ring))) {
>> + if (unlikely(!__ptr_ring_empty(&rq->xdp_ring) ||
>> + (peer_txq && netif_tx_queue_stopped(peer_txq)))) {
>
> Not sure if this is necessary.
How sure are you that this isn't necessary?
> From commitlog, your intention seems to be making sure to wake up the
> queue,
> but you wake up the queue immediately after this hunk in the same function,
> so isn't it guaranteed without scheduling another napi?
>
The above code catches the case where the ptr_ring is empty and the
tx_queue is stopped. It feels wrong not to react in this case, but you
*might* be right that it isn't strictly necessary, because the code below
will also call netif_tx_wake_queue(), which *should* have an SKB stored
that will *indirectly* trigger a restart of the NAPI.
I will stare some more at the code to see if I can convince myself that
we don't have to catch this case.
Please also answer the question: how sure are you that this isn't necessary?
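
To make the case under discussion concrete, here is a minimal single-threaded
userspace sketch of the interleaving: the producer sees a full ring, the
consumer drains it before the queue is stopped, and only then does the
producer stop the queue and schedule NAPI unconditionally, as in item (1).
Everything named toy_* is hypothetical and only stands in for the veth
structures; the ring size is arbitrary, and the model ignores memory ordering
and real concurrency, so it illustrates the argument rather than proving
anything about the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SIZE 4 /* hypothetical, not the real veth ring size */

/* Toy stand-in for rq->xdp_ring plus the peer TXQ state. */
struct toy_rq {
	int slots[TOY_RING_SIZE];
	size_t head, tail, count;
	bool txq_stopped;    /* models netif_tx_queue_stopped(peer_txq) */
	bool napi_scheduled; /* models a pending NAPI poll */
};

/* Producer, first half: try to enqueue; false means the ring was full. */
static bool toy_produce(struct toy_rq *rq, int pkt)
{
	if (rq->count == TOY_RING_SIZE)
		return false;
	rq->slots[rq->head] = pkt;
	rq->head = (rq->head + 1) % TOY_RING_SIZE;
	rq->count++;
	rq->napi_scheduled = true;
	return true;
}

/* Producer, second half: stop the queue, then schedule NAPI unconditionally. */
static void toy_stop_and_flush(struct toy_rq *rq)
{
	rq->txq_stopped = true;
	rq->napi_scheduled = true;
}

/* Consumer: drain up to budget entries, then release backpressure. */
static int toy_poll(struct toy_rq *rq, int budget)
{
	int done = 0;

	assert(budget > 0);
	rq->napi_scheduled = false;
	while (done < budget && rq->count > 0) {
		rq->tail = (rq->tail + 1) % TOY_RING_SIZE;
		rq->count--;
		done++;
	}
	if (done == budget)
		rq->napi_scheduled = true; /* budget used up: poll again */

	/* Wake the stopped queue once per poll, as in the second hunk. */
	if (rq->txq_stopped)
		rq->txq_stopped = false;
	return done;
}

/* Scripted interleaving; returns true if the queue ends up awake. */
static bool toy_race_scenario(void)
{
	struct toy_rq rq = {0};
	int i;

	for (i = 0; i < TOY_RING_SIZE; i++)
		toy_produce(&rq, i);
	if (toy_produce(&rq, 99))  /* producer observes a full ring */
		return false;
	toy_poll(&rq, 64);         /* consumer drains; queue not stopped yet */
	toy_stop_and_flush(&rq);   /* producer stops the queue late */
	if (rq.count != 0 || !rq.txq_stopped)
		return false;      /* expected state: ring empty, TXQ stopped */
	if (rq.napi_scheduled)
		toy_poll(&rq, 64); /* poll scheduled by the unconditional flush */
	return !rq.txq_stopped;
}
```

In this toy model the queue is woken by the poll that the unconditional flush
scheduled, without any extra reschedule in the completion check. Whether the
same holds in the real driver depends on the barriers and on the flush really
being unconditional, which is exactly the open question here.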
>> if (napi_schedule_prep(&rq->xdp_napi)) {
>> WRITE_ONCE(rq->rx_notify_masked, true);
>> __napi_schedule(&rq->xdp_napi);
>> @@ -998,6 +992,13 @@ static int veth_poll(struct napi_struct *napi,
>> int budget)
>> veth_xdp_flush(rq, &bq);
>> xdp_clear_return_frame_no_direct();
>> + /* Release backpressure per NAPI poll */
>> + smp_rmb(); /* Paired with netif_tx_stop_queue set_bit */
>> + if (peer_txq && netif_tx_queue_stopped(peer_txq)) {
>> + txq_trans_cond_update(peer_txq);
>> + netif_tx_wake_queue(peer_txq);
>> + }
>> +
>> return done;
>> }
>
> --
> Toshiaki Makita