public inbox for netdev@vger.kernel.org
* [PATCH net-next RFC V3 0/2] veth: qdisc backpressure and qdisc check refactor
@ 2025-04-14 15:45 Jesper Dangaard Brouer
  2025-04-14 15:45 ` [PATCH net-next RFC V3 1/2] net: sched: generalize check for no-op qdisc on TX queue Jesper Dangaard Brouer
  2025-04-14 15:45 ` [PATCH net-next RFC V3 2/2] veth: apply qdisc backpressure on full ptr_ring to reduce TX drops Jesper Dangaard Brouer
  0 siblings, 2 replies; 3+ messages in thread
From: Jesper Dangaard Brouer @ 2025-04-14 15:45 UTC (permalink / raw)
  To: netdev, Jakub Kicinski
  Cc: Jesper Dangaard Brouer, bpf, tom, Eric Dumazet, David S. Miller,
	Paolo Abeni, Toke Høiland-Jørgensen, dsahern,
	makita.toshiaki, kernel-team

RFC: The refactor is incorrect as-is; we need help/input from qdisc experts.

This patch series addresses TX drops seen on veth devices under load,
particularly when using threaded NAPI, which is our setup in production.

The root cause is that the NAPI consumer often runs on a different CPU
than the producer. Combined with scheduling delays or simply slower
consumption, this increases the chance that the ptr_ring fills up before
packets are drained, resulting in drops from veth_xmit() (ndo_start_xmit()).

To make this easier to reproduce, we’ve created a script that sets up a
test scenario using network namespaces. The script inserts 1000 iptables
rules in the consumer namespace to slow down packet processing and
amplify the issue. Reproducer script:

https://github.com/xdp-project/xdp-project/blob/main/areas/core/veth_setup01_NAPI_TX_drops.sh

This series first introduces a helper to detect no-op qdiscs and then
uses it in the veth driver to conditionally apply qdisc-level
backpressure when a real qdisc is attached. The behavior is off by
default and opt-in: it only engages once a qdisc is attached, keeping
the impact on existing setups minimal.

---
V3:
 - Reorder patches, generalize check for no-op qdisc as first patch
   - RFC: as testing shows, this check is incorrect
 - Use rcu_dereference(priv->peer) in veth_xdp_rcv(), as this runs in
   NAPI context where rcu_read_lock() is implicit.
 - Link to V2: https://lore.kernel.org/all/174412623473.3702169.4235683143719614624.stgit@firesoul/
V2:
 - Generalize check for no-op qdisc
 - Link to RFC-V1: https://lore.kernel.org/all/174377814192.3376479.16481605648460889310.stgit@firesoul/

Jesper Dangaard Brouer (2):
      net: sched: generalize check for no-op qdisc on TX queue
      veth: apply qdisc backpressure on full ptr_ring to reduce TX drops


 drivers/net/veth.c        | 49 ++++++++++++++++++++++++++++++++-------
 drivers/net/vrf.c         |  4 +---
 include/net/sch_generic.h |  7 +++++-
 3 files changed, 48 insertions(+), 12 deletions(-)

--



