From: "Théo Lebrun" <theo.lebrun@bootlin.com>
To: "Maxime Chevallier" <maxime.chevallier@bootlin.com>,
"Théo Lebrun" <theo.lebrun@bootlin.com>,
"Nicolas Ferre" <nicolas.ferre@microchip.com>,
"Claudiu Beznea" <claudiu.beznea@tuxon.dev>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Richard Cochran" <richardcochran@gmail.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org,
"Vladimir Kondratiev" <vladimir.kondratiev@mobileye.com>,
"Gregory CLEMENT" <gregory.clement@bootlin.com>,
"Benoît Monin" <benoit.monin@bootlin.com>,
"Tawfik Bayouk" <tawfik.bayouk@mobileye.com>,
"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>
Subject: Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
Date: Fri, 06 Mar 2026 18:18:03 +0100 [thread overview]
Message-ID: <DGVVCKPUZVSQ.3U7SG8TTTA1CM@bootlin.com> (raw)
In-Reply-To: <74089907-d7d3-4242-92a5-909812779932@bootlin.com>
Hello!
On Fri Mar 6, 2026 at 1:48 PM CET, Maxime Chevallier wrote:
> On 04/03/2026 19:24, Théo Lebrun wrote:
>> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
>> macb_tx_complete(), we go and read the XSK buffers using
>> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
>> ring.
>>
>> Additionally, in macb_tx_complete(), we signal to the XSK subsystem
>> number of bytes completed and conditionally mark the need_wakeup
>> flag.
>>
>> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
>> IMR register, to ensure NAPI scheduling will take place.
>>
>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>> ---
>
> [...]
>
>> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
>> +{
>> + struct macb_queue *queue = &bp->queues[queue_index];
>> + struct xsk_buff_pool *xsk = queue->xsk_pool;
>> + dma_addr_t mapping;
>> + u32 slot_available;
>> + size_t bytes = 0;
>> + u32 batch;
>> +
>> + guard(spinlock_irqsave)(&queue->tx_ptr_lock);
>> +
>> + /* This is a hard error, log it. */
>> + slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
>> + if (slot_available < 1) {
>> + netif_stop_subqueue(bp->dev, queue_index);
>> + netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
>> + queue->tx_head, queue->tx_tail);
>> + return;
>> + }
>> +
>> + batch = min_t(u32, slot_available, budget);
>> + batch = xsk_tx_peek_release_desc_batch(xsk, batch);
>> + if (!batch)
>> + return;
>> +
>> + for (u32 i = 0; i < batch; i++) {
>> + struct xdp_desc *desc = &xsk->tx_descs[i];
>> +
>> + mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
>> + xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
>> +
>> + macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
>> + .ptr = NULL,
>> + .mapping = mapping,
>> + .size = desc->len,
>> + .mapped_as_page = false,
>> + .type = MACB_TYPE_XSK,
>> + });
>> +
>> + bytes += desc->len;
>> + }
>> +
>> + /* Make newly initialized descriptor visible to hardware */
>> + wmb();
>> + spin_lock(&bp->lock);
>> + macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>> + spin_unlock(&bp->lock);
>
> this lock is also taken in interrupt context, this should probably use a
> irqsave/restore variant. Now, there are a few other parts of this driver
> that use a plain spin_lock() call and except for the paths that actually
> run in interrupt context, they don't seem correct to me :(
I almost sent a reply agreeing with you, but actually here is the
exhaustive `spin_lock(&bp->lock)` list:
# Function Context
------------------------------------------
1 gem_wol_interrupt() irq
2 macb_interrupt() irq
3 macb_wol_interrupt() irq
4 macb_tx_error_task() workqueue/user
5 macb_tx_restart() napi/softirq
6 macb_xdp_xmit_zc() napi/softirq
7 macb_start_xmit() user
8 macb_xdp_submit_frame() user
And all contexts are safe because it always is this sequence in non-IRQ
contexts (#4-8):
spin_lock_irqsave(&queue->tx_ptr_lock, flags);
spin_lock(&bp->lock);
spin_unlock(&bp->lock);
spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
So bp->tx_ptr_lock always wraps bp->lock and does the local CPU IRQ
disabling.
(I also checked we don't risk ABBA deadlock, and we don't: all code
acquires bp->tx_ptr_lock THEN bp->lock.)
However, there is still a bug in the code you quoted: setting
BIT(TSTART) is done twice by macb_xdp_xmit_zc():
- once in the helper function macb_xdp_submit_buff() and,
- once in its own body (code you quoted)
This is fixed for V2!
Thanks Maxime,
Have a nice week-end,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
next prev parent reply other threads:[~2026-03-06 17:18 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 1/8] net: macb: make rx error messages rate-limited Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 3/8] net: macb: account for stats in Tx " Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill() Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 8/8] net: macb: add Tx " Théo Lebrun
2026-03-06 12:48 ` Maxime Chevallier
2026-03-06 17:18 ` Théo Lebrun [this message]
2026-03-06 17:53 ` Maxime Chevallier
2026-03-09 10:56 ` Théo Lebrun
2026-03-06 3:11 ` [PATCH net-next 0/8] net: macb: add XSK support Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DGVVCKPUZVSQ.3U7SG8TTTA1CM@bootlin.com \
--to=theo.lebrun@bootlin.com \
--cc=andrew+netdev@lunn.ch \
--cc=ast@kernel.org \
--cc=benoit.monin@bootlin.com \
--cc=bpf@vger.kernel.org \
--cc=claudiu.beznea@tuxon.dev \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=gregory.clement@bootlin.com \
--cc=hawk@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maxime.chevallier@bootlin.com \
--cc=netdev@vger.kernel.org \
--cc=nicolas.ferre@microchip.com \
--cc=pabeni@redhat.com \
--cc=richardcochran@gmail.com \
--cc=sdf@fomichev.me \
--cc=tawfik.bayouk@mobileye.com \
--cc=thomas.petazzoni@bootlin.com \
--cc=vladimir.kondratiev@mobileye.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.