public inbox for bpf@vger.kernel.org
From: sashiko-bot@kernel.org
To: "Jason Xing" <kerneljasonxing@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH RFC net-next v4 05/14] xsk: add direct xmit in batch function
Date: Wed, 15 Apr 2026 09:11:58 +0000	[thread overview]
Message-ID: <20260415091159.2BC4FC19424@smtp.kernel.org> (raw)
In-Reply-To: <20260415082654.21026-6-kerneljasonxing@gmail.com>

Sashiko AI review found 4 potential issues:
- [High] Packet validation executed inside hardware transmit lock, causing severe lock contention
- [High] Missing transmit recursion protection introduces deadlock vectors
- [High] Missing device state validation allows transmission to stopped or suspended devices
- [Medium] Hardware doorbell batching is defeated by hardcoded 'more' parameter
--

commit b98984346f64d24e3ecd521b219464ccbe9fba91
Author: Jason Xing <kernelxing@tencent.com>

xsk: add direct xmit in batch function

This commit adds batch transmission logic for AF_XDP. It introduces a
new direct transmit function that disables bottom halves, grabs the
hardware transmit lock once, and sends all aggregated packets in a loop.

> diff --git a/net/core/dev.c b/net/core/dev.c
> --- a/net/core/dev.c
> +++ b/net/core/dev.c

[ ... ]

> +int xsk_direct_xmit_batch(struct xdp_sock *xs, struct net_device *dev)
> +{
> +	u16 queue_id = xs->queue_id;
> +	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_id);
> +	struct sk_buff_head *send_queue = &xs->batch.send_queue;
> +	int ret = NETDEV_TX_BUSY;
> +	struct sk_buff *skb;
> +

Should there be a check for the device state before attempting to transmit?
Direct transmit paths that bypass the Qdisc layer, like __dev_direct_xmit(),
typically verify that the device is running and the carrier is up:

	if (unlikely(!netif_running(dev) ||
		     !netif_carrier_ok(dev)))
		goto drop;

Could invoking the driver's ndo_start_xmit() on a stopped or suspended device
touch hardware state the driver has already torn down?
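
For reference, the drop path that the goto above targets in
__dev_direct_xmit() counts the drop and frees the whole list:

	drop:
		dev_core_stats_tx_dropped_inc(dev);
		kfree_skb_list(skb);
		return NET_XMIT_DROP;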

> +	local_bh_disable();

Are we missing transmit recursion protection here? Standard paths wrap the
hardware transmit lock with recursion tracking to protect against stack
overflows and deadlocks when routing through virtual devices:

	dev_xmit_recursion_inc();
	HARD_TX_LOCK(dev, txq, smp_processor_id());

> +	HARD_TX_LOCK(dev, txq, smp_processor_id());
> +	while ((skb = __skb_dequeue(send_queue)) != NULL) {
> +		struct sk_buff *orig_skb = skb;
> +		bool again = false;
> +
> +		skb = validate_xmit_skb_list(skb, dev, &again);

Will executing validate_xmit_skb_list() inside the hardware transmit lock
cause lock contention? This function can perform computationally expensive
operations such as GSO segmentation, software checksum fallbacks, and
subsystem hooks. Other transmit paths, including __dev_direct_xmit(), run
this validation strictly before acquiring the hardware lock.
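
One possible restructuring (a rough, untested sketch, with error handling
glossed over) would be to validate the whole batch into a local list first,
in the spirit of __dev_direct_xmit(), and hold the lock only for the actual
transmits:

	struct sk_buff_head validated;

	__skb_queue_head_init(&validated);
	local_bh_disable();

	/* Expensive work (GSO, checksum fallback) happens here,
	 * outside the hardware transmit lock.
	 */
	while ((skb = __skb_dequeue(send_queue)) != NULL) {
		struct sk_buff *orig_skb = skb;
		bool again = false;

		skb = validate_xmit_skb_list(skb, dev, &again);
		if (skb != orig_skb) {
			dev_core_stats_tx_dropped_inc(dev);
			kfree_skb_list(skb);
			continue;
		}
		__skb_queue_tail(&validated, skb);
	}

	HARD_TX_LOCK(dev, txq, smp_processor_id());
	while ((skb = __skb_dequeue(&validated)) != NULL) {
		/* ... netdev_start_xmit() loop as before ... */
	}
	HARD_TX_UNLOCK(dev, txq);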

> +		if (skb != orig_skb) {
> +			dev_core_stats_tx_dropped_inc(dev);
> +			kfree_skb_list(skb);
> +			ret = NET_XMIT_DROP;
> +			break;
> +		}
> +
> +		if (netif_xmit_frozen_or_drv_stopped(txq)) {
> +			__skb_queue_head(send_queue, skb);
> +			break;
> +		}
> +		skb_set_queue_mapping(skb, queue_id);
> +		ret = netdev_start_xmit(skb, dev, txq, false);

Does passing false for the more parameter defeat the purpose of hardware
doorbell batching? It instructs the driver to ring the hardware doorbell for
every single packet instead of delaying it until the end of the batch.
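
Assuming the underlying driver honors xmit_more, a minimal sketch (untested)
would derive the flag from whether more packets remain queued in this batch:

	ret = netdev_start_xmit(skb, dev, txq,
				!skb_queue_empty(send_queue));

Note that if a later packet is requeued on NETDEV_TX_BUSY, a previously
deferred doorbell still needs to be flushed, so the complete fix is more
involved than this one-liner.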

It looks like this might be fixed later in the patch series by commit
6d7c8f2ad056e ("xsk: support dynamic xmit.more control for batch xmit").

> +		if (ret != NETDEV_TX_OK) {
> +			if (ret == NETDEV_TX_BUSY)
> +				__skb_queue_head(send_queue, skb);
> +			break;
> +		}
> +	}
> +	HARD_TX_UNLOCK(dev, txq);

Also, we would need to pair any added recursion increment with a decrement
here:

	dev_xmit_recursion_dec();

> +	local_bh_enable();
> +
> +	return ret;
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260415082654.21026-1-kerneljasonxing@gmail.com?part=5

Thread overview: 40+ messages
2026-04-15  8:26 [PATCH RFC net-next v4 00/14] xsk: batch xmit in copy mode Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 01/14] xsk: introduce XDP_GENERIC_XMIT_BATCH setsockopt Jason Xing
2026-04-15  8:51   ` sashiko-bot
2026-04-15 12:46     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 02/14] xsk: extend xsk_build_skb() to support passing an already allocated skb Jason Xing
2026-04-15  8:52   ` sashiko-bot
2026-04-15 13:19     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 03/14] xsk: add xsk_alloc_batch_skb() to build skbs in batch Jason Xing
2026-04-15  9:17   ` sashiko-bot
2026-04-16  1:18     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 04/14] xsk: cache data buffers to avoid frequently calling kmalloc_reserve Jason Xing
2026-04-15  9:38   ` sashiko-bot
2026-04-16  2:45     ` Jason Xing
2026-04-16 12:18       ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 05/14] xsk: add direct xmit in batch function Jason Xing
2026-04-15  9:11   ` sashiko-bot [this message]
2026-04-16  3:04     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 06/14] xsk: support dynamic xmit.more control for batch xmit Jason Xing
2026-04-15  9:35   ` sashiko-bot
2026-04-16  3:43     ` Jason Xing
2026-04-16  4:50       ` Dmitry Torokhov
2026-04-16  4:51         ` Dmitry Torokhov
2026-04-15  8:26 ` [PATCH RFC net-next v4 07/14] xsk: try to skip validating skb list in xmit path Jason Xing
2026-04-15  9:33   ` sashiko-bot
2026-04-16  5:55     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 08/14] xsk: rename nb_pkts to nb_descs in xsk_tx_peek_release_desc_batch Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 09/14] xsk: extend xskq_cons_read_desc_batch to count nb_pkts Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 10/14] xsk: extend xsk_cq_reserve_locked() to reserve n slots Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 11/14] xsk: support batch xmit main logic Jason Xing
2026-04-15  9:38   ` sashiko-bot
2026-04-16  9:58     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 12/14] xsk: separate read-mostly and write-heavy fields in xsk_buff_pool Jason Xing
2026-04-15  9:20   ` sashiko-bot
2026-04-16 10:09     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 13/14] xsk: retire old xmit path in copy mode Jason Xing
2026-04-15  9:18   ` sashiko-bot
2026-04-16 10:33     ` Jason Xing
2026-04-15  8:26 ` [PATCH RFC net-next v4 14/14] xsk: optimize xsk_build_skb for batch copy-mode fast path Jason Xing
2026-04-15  9:47   ` sashiko-bot
2026-04-16 13:12     ` Jason Xing
