From: Jesper Dangaard Brouer <hawk@kernel.org>
To: netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>
Cc: "Jesper Dangaard Brouer" <hawk@kernel.org>,
	bpf@vger.kernel.org, tom@herbertland.com,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>,
	dsahern@kernel.org, makita.toshiaki@lab.ntt.co.jp,
	kernel-team@cloudflare.com, phil@nwl.cc
Subject: [PATCH net-next V7 0/2] veth: qdisc backpressure and qdisc check refactor
Date: Fri, 25 Apr 2025 16:55:25 +0200
Message-ID: <174559288731.827981.8748257839971869213.stgit@firesoul>

This patch series addresses TX drops seen on veth devices under load,
particularly when using threaded NAPI, which is our setup in production.

The root cause is that the NAPI consumer often runs on a different CPU
than the producer. Combined with scheduling delays or simply slower
consumption, this increases the chance that the ptr_ring fills up before
packets are drained, resulting in drops from veth_xmit() (ndo_start_xmit()).

To make this easier to reproduce, we’ve created a script that sets up a
test scenario using network namespaces. The script inserts 1000 iptables
rules in the consumer namespace to slow down packet processing and
amplify the issue. Reproducer script:

https://github.com/xdp-project/xdp-project/blob/main/areas/core/veth_setup01_NAPI_TX_drops.sh

This series first introduces a helper to detect no-queue qdiscs and then
uses it in the veth driver to apply qdisc-level backpressure only when a
real qdisc is attached. Because nothing changes unless a qdisc is
attached, the behavior is off by default and opt-in, which keeps the
impact minimal and makes it easy to activate.
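
As a rough illustration of the difference, here is a small userspace
model. It is not the kernel code from this series: simulate(),
struct sim_stats and RING_SIZE are hypothetical names, the ring is tiny
on purpose, and the qdisc backlog is unbounded, which a real qdisc is
not. The model only shows that a full ring drops packets when no qdisc
is attached, whereas stopping the queue keeps them in the qdisc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical; chosen small so the ring fills quickly */

struct sim_stats {
	size_t delivered; /* packets drained by the consumer */
	size_t dropped;   /* packets dropped because the ring was full */
	size_t backlog;   /* packets still held back (models the qdisc) */
};

/*
 * Each step the producer offers 'produce' packets and the consumer
 * drains up to 'consume' packets from the ring.
 * has_qdisc == false models a no-queue qdisc: a full ring means a drop.
 * has_qdisc == true models backpressure: a full ring stops the queue and
 * the packet stays in the backlog until the consumer makes room.
 */
struct sim_stats simulate(bool has_qdisc, size_t steps,
			  size_t produce, size_t consume)
{
	struct sim_stats st = { 0, 0, 0 };
	size_t ring_len = 0;
	bool stopped = false;

	for (size_t t = 0; t < steps; t++) {
		/* Producer side: try to move the backlog into the ring. */
		st.backlog += produce;
		while (st.backlog > 0 && !stopped) {
			if (ring_len < RING_SIZE) {
				ring_len++;
				st.backlog--;
			} else if (has_qdisc) {
				stopped = true; /* keep the packet, stop the queue */
			} else {
				st.backlog--;
				st.dropped++;   /* nowhere to hold it: drop */
			}
		}

		/* Consumer side: drain the ring, then wake a stopped queue. */
		for (size_t i = 0; i < consume && ring_len > 0; i++) {
			ring_len--;
			st.delivered++;
		}
		if (stopped && ring_len < RING_SIZE)
			stopped = false;
	}

	/* Every offered packet is delivered, dropped, held back or in the ring. */
	assert(st.delivered + st.dropped + st.backlog + ring_len ==
	       steps * produce);
	return st;
}
```

With 10 steps, 3 packets produced and 1 consumed per step,
simulate(false, 10, 3, 1) reports 17 drops, while
simulate(true, 10, 3, 1) reports 0 drops and a backlog of 17. In a real
setup the attached qdisc bounds that backlog and applies its own policy.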

---
V7:
 - Adjust race handling with smp_mb__after_atomic() for architectures other than x86
 - Link to V6: https://lore.kernel.org/all/174549933665.608169.392044991754158047.stgit@firesoul/
V6:
 - Remove __veth_xdp_flush() and handle race via __ptr_ring_empty instead
 - Link to V5: https://lore.kernel.org/all/174489803410.355490.13216831426556849084.stgit@firesoul/
V5:
 - Use rcu_dereference_check to signal that NAPI is an RCU section
 - Whitespace fixes reported by checkpatch.pl
 - Handle unlikely race
 - Link to V4: https://lore.kernel.org/all/174472463778.274639.12670590457453196991.stgit@firesoul/
V4:
 - Check against no-queue instead of no-op qdisc
 - Link to V3: https://lore.kernel.org/all/174464549885.20396.6987653753122223942.stgit@firesoul/
V3:
 - Reorder patches, generalize check for no-op qdisc as first patch
   - RFC: testing shows this is incorrect
 - Use rcu_dereference(priv->peer) in veth_xdp_rcv; as this runs in NAPI
   context, rcu_read_lock() is implicit.
 - Link to V2: https://lore.kernel.org/all/174412623473.3702169.4235683143719614624.stgit@firesoul/
V2:
 - Generalize check for no-op qdisc
 - Link to RFC-V1: https://lore.kernel.org/all/174377814192.3376479.16481605648460889310.stgit@firesoul/
---

Jesper Dangaard Brouer (2):
      net: sched: generalize check for no-queue qdisc on TX queue
      veth: apply qdisc backpressure on full ptr_ring to reduce TX drops


 drivers/net/veth.c        | 57 ++++++++++++++++++++++++++++++++-------
 drivers/net/vrf.c         |  4 +--
 include/net/sch_generic.h |  8 ++++++
 3 files changed, 56 insertions(+), 13 deletions(-)
