From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtpout-03.galae.net (smtpout-03.galae.net [185.246.85.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7493392C4C for ; Mon, 9 Mar 2026 10:56:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.246.85.4 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773053789; cv=none; b=iGYfP2yuQkDPhrvbZahuwe86Sib04vWSqqtBNtuL1u/Q7BaSjzntIV7bcDvj9Xkj9lLCG62aPZ7rCjDkDzLRDnp+zmpz8f3h91i+9SmiwYl7VMx0cdywNUATmoubINQhB9Lnfb9gdqJcENHWrliIB452oQR09sZqMoENAfOWmlk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773053789; c=relaxed/simple; bh=+dc6Lnwo5YZ7/3JzHek9nuMZRXBo/9cGU84S0tcDqyQ=; h=Mime-Version:Content-Type:Date:Message-Id:Cc:To:From:Subject: References:In-Reply-To; b=PxIUqr3ThT6vGa+7JdI2uAKR4N+nFO1pQXL+hJ6AAe3gLKQQpxCRDhgGJR8GhU3drv6YnzzZCa5PLgHhIgNYIJy2YVNWVIGf5uoP6svTanX9fx1dU4WBIzTzVFW8svtD45RRzTiHHkELKgfyC7pomZnctx5cKtGColmggitHf0k= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=bootlin.com; spf=pass smtp.mailfrom=bootlin.com; dkim=pass (2048-bit key) header.d=bootlin.com header.i=@bootlin.com header.b=DtdmIMfW; arc=none smtp.client-ip=185.246.85.4 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=bootlin.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bootlin.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bootlin.com header.i=@bootlin.com header.b="DtdmIMfW" Received: from smtpout-01.galae.net (smtpout-01.galae.net [212.83.139.233]) by smtpout-03.galae.net (Postfix) with ESMTPS id 27EA04E4251A; Mon, 9 Mar 2026 10:56:24 +0000 (UTC) Received: from mail.galae.net (mail.galae.net [212.83.136.155]) by smtpout-01.galae.net (Postfix) with ESMTPS id E800E5FFB8; Mon, 9 Mar 2026 10:56:23 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by localhost (Mailerdaemon) with ESMTPSA id B6F6610369B53; Mon, 9 Mar 2026 11:56:17 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bootlin.com; s=dkim; t=1773053782; h=from:subject:date:message-id:to:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:references; bh=Gzr14tOSv8CsjphFxN1CH5k+Wd2mUuktsUCoITd4ydI=; b=DtdmIMfWFsDTwbpFaa3qJVfmdBivebdRRUV51CDLf68QcJQzLcX+IavHOIfxbdSVrrMKt7 ZwDbzNremPKThMvqMoRwZR91Fo6ApLa8JiihVvdiR0BZVSvIUFxpsCTBTzarb/2yeaydfH MNxvGdWPARLwMwnoagBOsgx9JmWKm82GX+Wcltws1g4JD05nDSMZAROegGaBFnWTUhtb1j 8LokKLNO7jqwaIHUJOzMk1vw+9IWXHfFm2kJNpllgQXNvUfUi1SpcgkUOocFx/0ZAmluZj kkWuAp5IVHqr7+l4B655xMReNyVuN7AyFESyLpY/2Xr04B9guY2YagSrSsjytw== Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 Date: Mon, 09 Mar 2026 11:56:17 +0100 Message-Id: Cc: , , , "Vladimir Kondratiev" , "Gregory CLEMENT" , =?utf-8?q?Beno=C3=AEt_Monin?= , "Tawfik Bayouk" , "Thomas Petazzoni" To: "Maxime Chevallier" , =?utf-8?q?Th=C3=A9o_Lebrun?= , "Nicolas Ferre" , "Claudiu Beznea" , "Andrew Lunn" , "David S. Miller" , "Eric Dumazet" , "Jakub Kicinski" , "Paolo Abeni" , "Alexei Starovoitov" , "Daniel Borkmann" , "Jesper Dangaard Brouer" , "John Fastabend" , "Stanislav Fomichev" , "Richard Cochran" From: =?utf-8?q?Th=C3=A9o_Lebrun?= Subject: Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support X-Mailer: aerc 0.21.0-0-g5549850facc2 References: <20260304-macb-xsk-v1-0-ba2ebe2bdaa3@bootlin.com> <20260304-macb-xsk-v1-8-ba2ebe2bdaa3@bootlin.com> <74089907-d7d3-4242-92a5-909812779932@bootlin.com> In-Reply-To: X-Last-TLS-Session-Version: TLSv1.3 Hello Maxime, On Fri Mar 6, 2026 at 6:53 PM CET, Maxime Chevallier wrote: > On 06/03/2026 18:18, Th=C3=A9o Lebrun wrote: >> Hello! >>=20 >> On Fri Mar 6, 2026 at 1:48 PM CET, Maxime Chevallier wrote: >>> On 04/03/2026 19:24, Th=C3=A9o Lebrun wrote: >>>> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of >>>> macb_tx_complete(), we go and read the XSK buffers using >>>> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx >>>> ring. >>>> >>>> Additionally, in macb_tx_complete(), we signal to the XSK subsystem >>>> number of bytes completed and conditionally mark the need_wakeup >>>> flag. >>>> >>>> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue >>>> IMR register, to ensure NAPI scheduling will take place. >>>> >>>> Signed-off-by: Th=C3=A9o Lebrun >>>> --- >>> >>> [...] >>> >>>> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_inde= x, int budget) >>>> +{ >>>> + struct macb_queue *queue =3D &bp->queues[queue_index]; >>>> + struct xsk_buff_pool *xsk =3D queue->xsk_pool; >>>> + dma_addr_t mapping; >>>> + u32 slot_available; >>>> + size_t bytes =3D 0; >>>> + u32 batch; >>>> + >>>> + guard(spinlock_irqsave)(&queue->tx_ptr_lock); >>>> + >>>> + /* This is a hard error, log it. */ >>>> + slot_available =3D CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx= _ring_size); >>>> + if (slot_available < 1) { >>>> + netif_stop_subqueue(bp->dev, queue_index); >>>> + netdev_dbg(bp->dev, "tx_head =3D %u, tx_tail =3D %u\n", >>>> + queue->tx_head, queue->tx_tail); >>>> + return; >>>> + } >>>> + >>>> + batch =3D min_t(u32, slot_available, budget); >>>> + batch =3D xsk_tx_peek_release_desc_batch(xsk, batch); >>>> + if (!batch) >>>> + return; >>>> + >>>> + for (u32 i =3D 0; i < batch; i++) { >>>> + struct xdp_desc *desc =3D &xsk->tx_descs[i]; >>>> + >>>> + mapping =3D xsk_buff_raw_get_dma(xsk, desc->addr); >>>> + xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len); >>>> + >>>> + macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){ >>>> + .ptr =3D NULL, >>>> + .mapping =3D mapping, >>>> + .size =3D desc->len, >>>> + .mapped_as_page =3D false, >>>> + .type =3D MACB_TYPE_XSK, >>>> + }); >>>> + >>>> + bytes +=3D desc->len; >>>> + } >>>> + >>>> + /* Make newly initialized descriptor visible to hardware */ >>>> + wmb(); >>>> + spin_lock(&bp->lock); >>>> + macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART)); >>>> + spin_unlock(&bp->lock); >>> >>> this lock is also taken in interrupt context, this should probably use = a >>> irqsave/restore variant. Now, there are a few other parts of this drive= r >>> that use a plain spin_lock() call and except for the paths that actuall= y >>> run in interrupt context, they don't seem correct to me :( >>=20 >> I almost sent a reply agreeing with you, but actually here is the >> exhaustive `spin_lock(&bp->lock)` list: >>=20 >> # Function Context >> ------------------------------------------ >> 1 gem_wol_interrupt() irq >> 2 macb_interrupt() irq >> 3 macb_wol_interrupt() irq >> 4 macb_tx_error_task() workqueue/user >> 5 macb_tx_restart() napi/softirq >> 6 macb_xdp_xmit_zc() napi/softirq >> 7 macb_start_xmit() user >> 8 macb_xdp_submit_frame() user >>=20 >> And all contexts are safe because it always is this sequence in non-IRQ >> contexts (#4-8): >>=20 >> spin_lock_irqsave(&queue->tx_ptr_lock, flags); >> spin_lock(&bp->lock); >> spin_unlock(&bp->lock); >> spin_unlock_irqrestore(&queue->tx_ptr_lock, flags); > > Is it because of the guard statement ? > > guard(spinlock_irqsave)(&queue->tx_ptr_lock); > > It really doesn't make it obvious that this is how it plays out :( Yes! A guard does an operation when called and one at scope end (in our case at the end of macb_xdp_xmit_zc()). That way we don't forget the cleanup, and we can do early returns without a list of labels and gotos (and mess up along the way). It uses the __attribute__((cleanup(cleanup_function))) compiler feature, that is aliased to `__cleanup()` in the kernel. https://gcc.gnu.org/onlinedocs/gcc/Common-Attributes.html#index-cleanup https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/compiler_attr= ibutes.h#L76 Guard definition for `spinlock_irqsave`: https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/spinlock.h#L5= 85-L588 (delving into those macros is not recommended) Code documentation is good: https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/cleanup.h#L10 Thanks, -- Th=C3=A9o Lebrun, Bootlin Embedded Linux and Kernel engineering https://bootlin.com