From: "Théo Lebrun" <theo.lebrun@bootlin.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
	 "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	 Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,  Rob Herring <robh@kernel.org>,
	Krzysztof Kozlowski <krzk+dt@kernel.org>,
	 Conor Dooley <conor+dt@kernel.org>,
	 Nicolas Ferre <nicolas.ferre@microchip.com>,
	 Claudiu Beznea <claudiu.beznea@tuxon.dev>,
	 Paul Walmsley <paul.walmsley@sifive.com>,
	 Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	 Alexandre Ghiti <alex@ghiti.fr>,
	Samuel Holland <samuel.holland@sifive.com>,
	 Richard Cochran <richardcochran@gmail.com>,
	 Russell King <linux@armlinux.org.uk>,
	 Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	 Vladimir Kondratiev <vladimir.kondratiev@mobileye.com>,
	 Gregory CLEMENT <gregory.clement@bootlin.com>,
	 Cyrille Pitchen <cyrille.pitchen@atmel.com>,
	 Harini Katakam <harini.katakam@xilinx.com>,
	 Rafal Ozieblo <rafalo@cadence.com>,
	 Haavard Skinnemoen <hskinnemoen@atmel.com>,
	Jeff Garzik <jeff@garzik.org>
Cc: netdev@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-mips@vger.kernel.org,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Tawfik Bayouk" <tawfik.bayouk@mobileye.com>,
	"Théo Lebrun" <theo.lebrun@bootlin.com>
Subject: [PATCH net-next v2 11/18] net: macb: single dma_alloc_coherent() for DMA descriptors
Date: Fri, 27 Jun 2025 11:08:57 +0200
Message-ID: <20250627-macb-v2-11-ff8207d0bb77@bootlin.com>
In-Reply-To: <20250627-macb-v2-0-ff8207d0bb77@bootlin.com>

Move from two dma_alloc_coherent() calls (Tx/Rx) for DMA descriptor
rings *per queue* to two dma_alloc_coherent() calls overall.

The issue is that all queues share the same register for configuring
the upper 32 bits of the Tx/Rx descriptor ring addresses. For example,
with Tx, notice how TBQPH does *not* depend on the queue index:

	#define GEM_TBQP(hw_q)		(0x0440 + ((hw_q) << 2))
	#define GEM_TBQPH(hw_q)		(0x04C8)

	queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
	#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
	if (bp->hw_dma_cap & HW_DMA_CAP_64B)
		queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
	#endif

To maximize our chances of getting valid DMA addresses, we do a single
dma_alloc_coherent() across queues. This improves the odds because
alloc_pages() guarantees natural alignment. It still cannot guarantee
valid DMA addresses: an IOMMU, or allocation paths that do not go
through alloc_pages(), may return a buffer that straddles a 4GiB
boundary.
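
A minimal sketch of the approach (condensed from the Tx side of the
diff below; the Rx side is symmetrical):

	tx_size_per_queue = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
	size = bp->num_queues * tx_size_per_queue;
	tx = dma_alloc_coherent(dev, size, &tx_dma, GFP_KERNEL);

	/* Each queue's ring is a slice of the single buffer. */
	queue->tx_ring = tx + tx_size_per_queue * q;
	queue->tx_ring_dma = tx_dma + tx_size_per_queue * q;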

We now error out if the rings do not all share the same upper 32 bits,
which is better than the current (theoretical, not reproduced) silent
corruption caused by the hardware accessing invalid addresses.
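
The sanity check compares the upper 32 bits of the first and last byte
of each buffer, e.g. for Tx:

	if (!tx || upper_32_bits(tx_dma) != upper_32_bits(tx_dma + size - 1))
		goto out_err;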

Two considerations:
 - dma_alloc_coherent() gives us page alignment. We remove this
   constraint, meaning each queue's ring is no longer guaranteed to be
   page-aligned.
 - This can save some memory. Fewer allocations mean less overhead
   (a constant cost per allocation) and fewer bytes wasted to alignment
   constraints (e.g. if each allocation gets rounded up to a full page,
   N queues previously consumed at least N pages per direction, while a
   single allocation per direction is rounded up only once).

Fixes: 02c958dd3446 ("net/macb: add TX multiqueue support for gem")
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 83 ++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index d3b3635998cad095246edf8a75faebbcf7115355..48b75d95861317b9925b366446c7572c7e186628 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -2445,33 +2445,32 @@ static void macb_free_rx_buffers(struct macb *bp)
 
 static void macb_free_consistent(struct macb *bp)
 {
-	struct macb_queue *queue;
+	size_t size, tx_size_per_queue, rx_size_per_queue;
+	struct macb_queue *queue, *queue0 = bp->queues;
+	struct device *dev = &bp->pdev->dev;
 	unsigned int q;
-	int size;
 
 	if (bp->rx_ring_tieoff) {
-		dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
+		dma_free_coherent(dev, macb_dma_desc_get_size(bp),
 				  bp->rx_ring_tieoff, bp->rx_ring_tieoff_dma);
 		bp->rx_ring_tieoff = NULL;
 	}
 
 	bp->macbgem_ops.mog_free_rx_buffers(bp);
 
+	tx_size_per_queue = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
+	size = bp->num_queues * tx_size_per_queue;
+	dma_free_coherent(dev, size, queue0->tx_ring, queue0->tx_ring_dma);
+
+	rx_size_per_queue = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
+	size = bp->num_queues * rx_size_per_queue;
+	dma_free_coherent(dev, size, queue0->rx_ring, queue0->rx_ring_dma);
+
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
 		kfree(queue->tx_skb);
 		queue->tx_skb = NULL;
-		if (queue->tx_ring) {
-			size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
-			dma_free_coherent(&bp->pdev->dev, size,
-					  queue->tx_ring, queue->tx_ring_dma);
-			queue->tx_ring = NULL;
-		}
-		if (queue->rx_ring) {
-			size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
-			dma_free_coherent(&bp->pdev->dev, size,
-					  queue->rx_ring, queue->rx_ring_dma);
-			queue->rx_ring = NULL;
-		}
+		queue->tx_ring = NULL; /* Single buffer owned by queue0 */
+		queue->rx_ring = NULL; /* Single buffer owned by queue0 */
 	}
 }
 
@@ -2513,37 +2512,47 @@ static int macb_alloc_rx_buffers(struct macb *bp)
 
 static int macb_alloc_consistent(struct macb *bp)
 {
+	size_t size, tx_size_per_queue, rx_size_per_queue;
+	dma_addr_t tx_dma, rx_dma;
+	struct device *dev = &bp->pdev->dev;
 	struct macb_queue *queue;
 	unsigned int q;
-	int size;
+	void *tx, *rx;
+
+	/*
+	 * The upper 32 bits of the Tx/Rx DMA descriptor addresses must
+	 * match for each queue! We cannot enforce this; the best we can
+	 * do is a single allocation and hope it lands in alloc_pages(),
+	 * which guarantees natural alignment of physical addresses.
+	 */
+
+	tx_size_per_queue = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
+	size = bp->num_queues * tx_size_per_queue;
+	tx = dma_alloc_coherent(dev, size, &tx_dma, GFP_KERNEL);
+	if (!tx || upper_32_bits(tx_dma) != upper_32_bits(tx_dma + size - 1))
+		goto out_err;
+	netdev_dbg(bp->dev, "Allocated %zu bytes for %u TX rings at %08lx (mapped %p)\n",
+		   size, bp->num_queues, (unsigned long)tx_dma, tx);
+
+	rx_size_per_queue = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
+	size = bp->num_queues * rx_size_per_queue;
+	rx = dma_alloc_coherent(dev, size, &rx_dma, GFP_KERNEL);
+	if (!rx || upper_32_bits(rx_dma) != upper_32_bits(rx_dma + size - 1))
+		goto out_err;
+	netdev_dbg(bp->dev, "Allocated %zu bytes for %u RX rings at %08lx (mapped %p)\n",
+		   size, bp->num_queues, (unsigned long)rx_dma, rx);
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-		size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
-		queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
-						    &queue->tx_ring_dma,
-						    GFP_KERNEL);
-		if (!queue->tx_ring ||
-		    upper_32_bits(queue->tx_ring_dma) != upper_32_bits(bp->queues->tx_ring_dma))
-			goto out_err;
-		netdev_dbg(bp->dev,
-			   "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
-			   q, size, (unsigned long)queue->tx_ring_dma,
-			   queue->tx_ring);
+		queue->tx_ring = tx + tx_size_per_queue * q;
+		queue->tx_ring_dma = tx_dma + tx_size_per_queue * q;
+
+		queue->rx_ring = rx + rx_size_per_queue * q;
+		queue->rx_ring_dma = rx_dma + rx_size_per_queue * q;
 
 		size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
 		queue->tx_skb = kmalloc(size, GFP_KERNEL);
 		if (!queue->tx_skb)
 			goto out_err;
-
-		size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
-		queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
-						 &queue->rx_ring_dma, GFP_KERNEL);
-		if (!queue->rx_ring ||
-		    upper_32_bits(queue->rx_ring_dma) != upper_32_bits(bp->queues->rx_ring_dma))
-			goto out_err;
-		netdev_dbg(bp->dev,
-			   "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
-			   size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
 	}
 	if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
 		goto out_err;

-- 
2.50.0

