* [PATCH net v4 1/5] dt-bindings: net: cdns,macb: allow tsu_clk without tx_clk
From: Théo Lebrun @ 2025-08-20 14:55 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Théo Lebrun, Krzysztof Kozlowski
Allow providing tsu_clk without a tx_clk, as both are optional.
This relaxes an unneeded constraint: it so happened that, in the past,
hardware that needed a tsu_clk always also needed a tx_clk.
Fixes: 4e5b6de1f46d ("dt-bindings: net: cdns,macb: Convert to json-schema")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
Documentation/devicetree/bindings/net/cdns,macb.yaml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml
index 559d0f733e7e7ac2909b87ab759be51d59be51c2..6e20d67e7628cd9dcef6e430b2a49eeedd0991a7 100644
--- a/Documentation/devicetree/bindings/net/cdns,macb.yaml
+++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml
@@ -85,7 +85,7 @@ properties:
items:
- enum: [ ether_clk, hclk, pclk ]
- enum: [ hclk, pclk ]
- - const: tx_clk
+ - enum: [ tx_clk, tsu_clk ]
- enum: [ rx_clk, tsu_clk ]
- const: tsu_clk
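For illustration, a clocks layout that this relaxation makes valid (a
hypothetical sketch: node name, unit address and clock phandles are made
up, not taken from this series):

	ethernet@10090000 {
		compatible = "cdns,macb";
		reg = <0x10090000 0x1000>;
		interrupts = <12>;
		/* tsu_clk in the third slot with no tx_clk: rejected by
		 * the old schema (third item was "const: tx_clk"),
		 * accepted now.
		 */
		clocks = <&clks 1>, <&clks 2>, <&clks 3>;
		clock-names = "hclk", "pclk", "tsu_clk";
	};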
--
2.50.1
* Re: [PATCH net v4 1/5] dt-bindings: net: cdns,macb: allow tsu_clk without tx_clk
From: Nicolas Ferre @ 2025-08-26 9:25 UTC
To: Théo Lebrun, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Krzysztof Kozlowski
On 20/08/2025 at 16:55, Théo Lebrun wrote:
> Allow providing tsu_clk without a tx_clk, as both are optional.
>
> This relaxes an unneeded constraint: it so happened that, in the past,
> hardware that needed a tsu_clk always also needed a tx_clk.
>
> Fixes: 4e5b6de1f46d ("dt-bindings: net: cdns,macb: Convert to json-schema")
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
> ---
> Documentation/devicetree/bindings/net/cdns,macb.yaml | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml
> index 559d0f733e7e7ac2909b87ab759be51d59be51c2..6e20d67e7628cd9dcef6e430b2a49eeedd0991a7 100644
> --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml
> +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml
> @@ -85,7 +85,7 @@ properties:
> items:
> - enum: [ ether_clk, hclk, pclk ]
> - enum: [ hclk, pclk ]
> - - const: tx_clk
> + - enum: [ tx_clk, tsu_clk ]
> - enum: [ rx_clk, tsu_clk ]
> - const: tsu_clk
>
>
> --
> 2.50.1
>
* [PATCH net v4 2/5] net: macb: remove illusion about TBQPH/RBQPH being per-queue
From: Théo Lebrun @ 2025-08-20 14:55 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Théo Lebrun, Sean Anderson
The MACB driver acts as if TBQPH/RBQPH were configurable on a per-queue
basis; this is a lie. A single register configures the upper 32 bits of
the DMA descriptor ring addresses for all queues.
Concrete actions:
- Drop GEM_TBQPH/GEM_RBQPH macros which have a queue index argument.
Only use MACB_TBQPH/MACB_RBQPH constants.
- Drop struct macb_queue->TBQPH/RBQPH fields.
- In macb_init_buffers(): do a single write to TBQPH and RBQPH for all
queues instead of a write per queue.
- In macb_tx_error_task(): drop the write to TBQPH.
- In macb_alloc_consistent(): if allocations give different upper
  32 bits, fail. Previously, this would have led to silent memory
  corruption, as queues would have used the upper 32 bits of the
  allocation from queue 0 combined with their own low 32 bits.
- In macb_suspend(): if we use the tie-off descriptor for suspend, do
  the write once for all queues instead of once per queue.
Fixes: fff8019a08b6 ("net: macb: Add 64 bit addressing support for GEM")
Fixes: ae1f2a56d273 ("net: macb: Added support for many RX queues")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
drivers/net/ethernet/cadence/macb.h | 4 ---
drivers/net/ethernet/cadence/macb_main.c | 57 ++++++++++++++------------------
2 files changed, 24 insertions(+), 37 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index c9a5c8beb2fa8166195d1d83f187d2d0c62668a8..a7e845fee4b3a2e3d14abb49abdbaf3e8e6ea02b 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -213,10 +213,8 @@
#define GEM_ISR(hw_q) (0x0400 + ((hw_q) << 2))
#define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
-#define GEM_TBQPH(hw_q) (0x04C8)
#define GEM_RBQP(hw_q) (0x0480 + ((hw_q) << 2))
#define GEM_RBQS(hw_q) (0x04A0 + ((hw_q) << 2))
-#define GEM_RBQPH(hw_q) (0x04D4)
#define GEM_IER(hw_q) (0x0600 + ((hw_q) << 2))
#define GEM_IDR(hw_q) (0x0620 + ((hw_q) << 2))
#define GEM_IMR(hw_q) (0x0640 + ((hw_q) << 2))
@@ -1214,10 +1212,8 @@ struct macb_queue {
unsigned int IDR;
unsigned int IMR;
unsigned int TBQP;
- unsigned int TBQPH;
unsigned int RBQS;
unsigned int RBQP;
- unsigned int RBQPH;
/* Lock to protect tx_head and tx_tail */
spinlock_t tx_ptr_lock;
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ce95fad8cedd7331d4818ba9f73fb6970249e85c..69325665c766927797ca2e1eb1384105bcde3cb5 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -495,19 +495,19 @@ static void macb_init_buffers(struct macb *bp)
struct macb_queue *queue;
unsigned int q;
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+ /* Single register for all queues' high 32 bits. */
+ if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
+ macb_writel(bp, RBQPH,
+ upper_32_bits(bp->queues[0].rx_ring_dma));
+ macb_writel(bp, TBQPH,
+ upper_32_bits(bp->queues[0].tx_ring_dma));
+ }
+#endif
+
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
queue_writel(queue, RBQP, lower_32_bits(queue->rx_ring_dma));
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
- queue_writel(queue, RBQPH,
- upper_32_bits(queue->rx_ring_dma));
-#endif
queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
- queue_writel(queue, TBQPH,
- upper_32_bits(queue->tx_ring_dma));
-#endif
}
}
@@ -1166,10 +1166,6 @@ static void macb_tx_error_task(struct work_struct *work)
/* Reinitialize the TX desc queue */
queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
- queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
-#endif
/* Make TX ring reflect state of hardware */
queue->tx_head = 0;
queue->tx_tail = 0;
@@ -2542,6 +2538,7 @@ static int macb_alloc_consistent(struct macb *bp)
{
struct macb_queue *queue;
unsigned int q;
+ u32 upper;
int size;
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
@@ -2549,7 +2546,9 @@ static int macb_alloc_consistent(struct macb *bp)
queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
&queue->tx_ring_dma,
GFP_KERNEL);
- if (!queue->tx_ring)
+ upper = upper_32_bits(queue->tx_ring_dma);
+ if (!queue->tx_ring ||
+ upper != upper_32_bits(bp->queues[0].tx_ring_dma))
goto out_err;
netdev_dbg(bp->dev,
"Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
@@ -2563,8 +2562,11 @@ static int macb_alloc_consistent(struct macb *bp)
size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
- &queue->rx_ring_dma, GFP_KERNEL);
- if (!queue->rx_ring)
+ &queue->rx_ring_dma,
+ GFP_KERNEL);
+ upper = upper_32_bits(queue->rx_ring_dma);
+ if (!queue->rx_ring ||
+ upper != upper_32_bits(bp->queues[0].rx_ring_dma))
goto out_err;
netdev_dbg(bp->dev,
"Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
@@ -4305,12 +4307,6 @@ static int macb_init(struct platform_device *pdev)
queue->TBQP = GEM_TBQP(hw_q - 1);
queue->RBQP = GEM_RBQP(hw_q - 1);
queue->RBQS = GEM_RBQS(hw_q - 1);
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
- queue->TBQPH = GEM_TBQPH(hw_q - 1);
- queue->RBQPH = GEM_RBQPH(hw_q - 1);
- }
-#endif
} else {
/* queue0 uses legacy registers */
queue->ISR = MACB_ISR;
@@ -4319,12 +4315,6 @@ static int macb_init(struct platform_device *pdev)
queue->IMR = MACB_IMR;
queue->TBQP = MACB_TBQP;
queue->RBQP = MACB_RBQP;
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
- queue->TBQPH = MACB_TBQPH;
- queue->RBQPH = MACB_RBQPH;
- }
-#endif
}
/* get irq: here we use the linux queue index, not the hardware
@@ -5450,6 +5440,11 @@ static int __maybe_unused macb_suspend(struct device *dev)
*/
tmp = macb_readl(bp, NCR);
macb_writel(bp, NCR, tmp & ~(MACB_BIT(TE) | MACB_BIT(RE)));
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+ if (!(bp->caps & MACB_CAPS_QUEUE_DISABLE))
+ macb_writel(bp, RBQPH,
+ upper_32_bits(bp->rx_ring_tieoff_dma));
+#endif
for (q = 0, queue = bp->queues; q < bp->num_queues;
++q, ++queue) {
/* Disable RX queues */
@@ -5459,10 +5454,6 @@ static int __maybe_unused macb_suspend(struct device *dev)
/* Tie off RX queues */
queue_writel(queue, RBQP,
lower_32_bits(bp->rx_ring_tieoff_dma));
-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
- queue_writel(queue, RBQPH,
- upper_32_bits(bp->rx_ring_tieoff_dma));
-#endif
}
/* Disable all interrupts */
queue_writel(queue, IDR, -1);
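To make the corruption scenario concrete, here is a user-space sketch
(illustrative only: lower_32_bits()/upper_32_bits() are re-modelled after
the kernel helpers, and the ring addresses are hypothetical) of what the
pre-fix code implied when two queues' rings straddled a 4 GiB boundary:

#include <stdint.h>
#include <stdio.h>

/* User-space stand-ins for the kernel helpers of the same name. */
static uint32_t lower_32_bits(uint64_t v) { return (uint32_t)v; }
static uint32_t upper_32_bits(uint64_t v) { return (uint32_t)(v >> 32); }

int main(void)
{
	/* Hypothetical ring bases on either side of a 4 GiB boundary. */
	uint64_t q0_ring = 0x0fffff000ULL; /* upper 32 bits: 0x0 */
	uint64_t q1_ring = 0x100000000ULL; /* upper 32 bits: 0x1 */

	/* TBQPH is a single shared register; only one value can win.
	 * The hardware combines it with each queue's own TBQP.
	 */
	uint32_t tbqph = upper_32_bits(q0_ring);
	uint64_t q1_fetch = ((uint64_t)tbqph << 32) | lower_32_bits(q1_ring);

	/* Prints 0x100000000 vs 0: queue 1 descriptors would be fetched
	 * from the wrong place, hence the silent corruption.
	 */
	printf("queue 1 ring at %#llx, fetched from %#llx\n",
	       (unsigned long long)q1_ring, (unsigned long long)q1_fetch);
	return 0;
}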
--
2.50.1
* Re: [PATCH net v4 2/5] net: macb: remove illusion about TBQPH/RBQPH being per-queue
From: Nicolas Ferre @ 2025-08-26 9:27 UTC
To: Théo Lebrun, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Sean Anderson
On 20/08/2025 at 16:55, Théo Lebrun wrote:
> The MACB driver acts as if TBQPH/RBQPH were configurable on a per-queue
> basis; this is a lie. A single register configures the upper 32 bits of
> the DMA descriptor ring addresses for all queues.
>
> Concrete actions:
>
> - Drop GEM_TBQPH/GEM_RBQPH macros which have a queue index argument.
> Only use MACB_TBQPH/MACB_RBQPH constants.
>
> - Drop struct macb_queue->TBQPH/RBQPH fields.
>
> - In macb_init_buffers(): do a single write to TBQPH and RBQPH for all
> queues instead of a write per queue.
>
> - In macb_tx_error_task(): drop the write to TBQPH.
>
> - In macb_alloc_consistent(): if allocations give different upper
>   32 bits, fail. Previously, this would have led to silent memory
>   corruption, as queues would have used the upper 32 bits of the
>   allocation from queue 0 combined with their own low 32 bits.
>
> - In macb_suspend(): if we use the tie-off descriptor for suspend, do
>   the write once for all queues instead of once per queue.
Indeed, agreed.
> Fixes: fff8019a08b6 ("net: macb: Add 64 bit addressing support for GEM")
> Fixes: ae1f2a56d273 ("net: macb: Added support for many RX queues")
> Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Thanks Théo, best regards,
Nicolas
> ---
> drivers/net/ethernet/cadence/macb.h | 4 ---
> drivers/net/ethernet/cadence/macb_main.c | 57 ++++++++++++++------------------
> 2 files changed, 24 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index c9a5c8beb2fa8166195d1d83f187d2d0c62668a8..a7e845fee4b3a2e3d14abb49abdbaf3e8e6ea02b 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -213,10 +213,8 @@
>
> #define GEM_ISR(hw_q) (0x0400 + ((hw_q) << 2))
> #define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
> -#define GEM_TBQPH(hw_q) (0x04C8)
> #define GEM_RBQP(hw_q) (0x0480 + ((hw_q) << 2))
> #define GEM_RBQS(hw_q) (0x04A0 + ((hw_q) << 2))
> -#define GEM_RBQPH(hw_q) (0x04D4)
> #define GEM_IER(hw_q) (0x0600 + ((hw_q) << 2))
> #define GEM_IDR(hw_q) (0x0620 + ((hw_q) << 2))
> #define GEM_IMR(hw_q) (0x0640 + ((hw_q) << 2))
> @@ -1214,10 +1212,8 @@ struct macb_queue {
> unsigned int IDR;
> unsigned int IMR;
> unsigned int TBQP;
> - unsigned int TBQPH;
> unsigned int RBQS;
> unsigned int RBQP;
> - unsigned int RBQPH;
>
> /* Lock to protect tx_head and tx_tail */
> spinlock_t tx_ptr_lock;
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index ce95fad8cedd7331d4818ba9f73fb6970249e85c..69325665c766927797ca2e1eb1384105bcde3cb5 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -495,19 +495,19 @@ static void macb_init_buffers(struct macb *bp)
> struct macb_queue *queue;
> unsigned int q;
>
> +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> + /* Single register for all queues' high 32 bits. */
> + if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
> + macb_writel(bp, RBQPH,
> + upper_32_bits(bp->queues[0].rx_ring_dma));
> + macb_writel(bp, TBQPH,
> + upper_32_bits(bp->queues[0].tx_ring_dma));
> + }
> +#endif
> +
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> queue_writel(queue, RBQP, lower_32_bits(queue->rx_ring_dma));
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - if (bp->hw_dma_cap & HW_DMA_CAP_64B)
> - queue_writel(queue, RBQPH,
> - upper_32_bits(queue->rx_ring_dma));
> -#endif
> queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - if (bp->hw_dma_cap & HW_DMA_CAP_64B)
> - queue_writel(queue, TBQPH,
> - upper_32_bits(queue->tx_ring_dma));
> -#endif
> }
> }
>
> @@ -1166,10 +1166,6 @@ static void macb_tx_error_task(struct work_struct *work)
>
> /* Reinitialize the TX desc queue */
> queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - if (bp->hw_dma_cap & HW_DMA_CAP_64B)
> - queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
> -#endif
> /* Make TX ring reflect state of hardware */
> queue->tx_head = 0;
> queue->tx_tail = 0;
> @@ -2542,6 +2538,7 @@ static int macb_alloc_consistent(struct macb *bp)
> {
> struct macb_queue *queue;
> unsigned int q;
> + u32 upper;
> int size;
>
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> @@ -2549,7 +2546,9 @@ static int macb_alloc_consistent(struct macb *bp)
> queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> &queue->tx_ring_dma,
> GFP_KERNEL);
> - if (!queue->tx_ring)
> + upper = upper_32_bits(queue->tx_ring_dma);
> + if (!queue->tx_ring ||
> + upper != upper_32_bits(bp->queues[0].tx_ring_dma))
> goto out_err;
> netdev_dbg(bp->dev,
> "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
> @@ -2563,8 +2562,11 @@ static int macb_alloc_consistent(struct macb *bp)
>
> size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
> queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> - &queue->rx_ring_dma, GFP_KERNEL);
> - if (!queue->rx_ring)
> + &queue->rx_ring_dma,
> + GFP_KERNEL);
> + upper = upper_32_bits(queue->rx_ring_dma);
> + if (!queue->rx_ring ||
> + upper != upper_32_bits(bp->queues[0].rx_ring_dma))
> goto out_err;
> netdev_dbg(bp->dev,
> "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
> @@ -4305,12 +4307,6 @@ static int macb_init(struct platform_device *pdev)
> queue->TBQP = GEM_TBQP(hw_q - 1);
> queue->RBQP = GEM_RBQP(hw_q - 1);
> queue->RBQS = GEM_RBQS(hw_q - 1);
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
> - queue->TBQPH = GEM_TBQPH(hw_q - 1);
> - queue->RBQPH = GEM_RBQPH(hw_q - 1);
> - }
> -#endif
> } else {
> /* queue0 uses legacy registers */
> queue->ISR = MACB_ISR;
> @@ -4319,12 +4315,6 @@ static int macb_init(struct platform_device *pdev)
> queue->IMR = MACB_IMR;
> queue->TBQP = MACB_TBQP;
> queue->RBQP = MACB_RBQP;
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
> - queue->TBQPH = MACB_TBQPH;
> - queue->RBQPH = MACB_RBQPH;
> - }
> -#endif
> }
>
> /* get irq: here we use the linux queue index, not the hardware
> @@ -5450,6 +5440,11 @@ static int __maybe_unused macb_suspend(struct device *dev)
> */
> tmp = macb_readl(bp, NCR);
> macb_writel(bp, NCR, tmp & ~(MACB_BIT(TE) | MACB_BIT(RE)));
> +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> + if (!(bp->caps & MACB_CAPS_QUEUE_DISABLE))
> + macb_writel(bp, RBQPH,
> + upper_32_bits(bp->rx_ring_tieoff_dma));
> +#endif
> for (q = 0, queue = bp->queues; q < bp->num_queues;
> ++q, ++queue) {
> /* Disable RX queues */
> @@ -5459,10 +5454,6 @@ static int __maybe_unused macb_suspend(struct device *dev)
> /* Tie off RX queues */
> queue_writel(queue, RBQP,
> lower_32_bits(bp->rx_ring_tieoff_dma));
> -#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> - queue_writel(queue, RBQPH,
> - upper_32_bits(bp->rx_ring_tieoff_dma));
> -#endif
> }
> /* Disable all interrupts */
> queue_writel(queue, IDR, -1);
>
> --
> 2.50.1
>
* [PATCH net v4 3/5] net: macb: move ring size computation to functions
From: Théo Lebrun @ 2025-08-20 14:55 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Théo Lebrun
The Tx/Rx ring size calculation is somewhat complex and partially hidden
behind macros. Move it out of the {RX,TX}_RING_BYTES() macros and the
macb_{alloc,free}_consistent() functions into dedicated helper functions.
In macb_free_consistent(), we drop the size variable and directly call
the size helpers in the argument list. In macb_alloc_consistent(), we
keep the size variable, as it is used by netdev_dbg() calls.
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
drivers/net/ethernet/cadence/macb_main.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 69325665c766927797ca2e1eb1384105bcde3cb5..d413e8bd4977187fd73f7cc48268baf933aab051 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -51,14 +51,10 @@ struct sifive_fu540_macb_mgmt {
#define DEFAULT_RX_RING_SIZE 512 /* must be power of 2 */
#define MIN_RX_RING_SIZE 64
#define MAX_RX_RING_SIZE 8192
-#define RX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
- * (bp)->rx_ring_size)
#define DEFAULT_TX_RING_SIZE 512 /* must be power of 2 */
#define MIN_TX_RING_SIZE 64
#define MAX_TX_RING_SIZE 4096
-#define TX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
- * (bp)->tx_ring_size)
/* level of occupied TX descriptors under which we wake up TX process */
#define MACB_TX_WAKEUP_THRESH(bp) (3 * (bp)->tx_ring_size / 4)
@@ -2466,11 +2462,20 @@ static void macb_free_rx_buffers(struct macb *bp)
}
}
+static unsigned int macb_tx_ring_size_per_queue(struct macb *bp)
+{
+ return macb_dma_desc_get_size(bp) * bp->tx_ring_size + bp->tx_bd_rd_prefetch;
+}
+
+static unsigned int macb_rx_ring_size_per_queue(struct macb *bp)
+{
+ return macb_dma_desc_get_size(bp) * bp->rx_ring_size + bp->rx_bd_rd_prefetch;
+}
+
static void macb_free_consistent(struct macb *bp)
{
struct macb_queue *queue;
unsigned int q;
- int size;
if (bp->rx_ring_tieoff) {
dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
@@ -2484,14 +2489,14 @@ static void macb_free_consistent(struct macb *bp)
kfree(queue->tx_skb);
queue->tx_skb = NULL;
if (queue->tx_ring) {
- size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
- dma_free_coherent(&bp->pdev->dev, size,
+ dma_free_coherent(&bp->pdev->dev,
+ macb_tx_ring_size_per_queue(bp),
queue->tx_ring, queue->tx_ring_dma);
queue->tx_ring = NULL;
}
if (queue->rx_ring) {
- size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
- dma_free_coherent(&bp->pdev->dev, size,
+ dma_free_coherent(&bp->pdev->dev,
+ macb_rx_ring_size_per_queue(bp),
queue->rx_ring, queue->rx_ring_dma);
queue->rx_ring = NULL;
}
@@ -2542,7 +2547,7 @@ static int macb_alloc_consistent(struct macb *bp)
int size;
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
- size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
+ size = macb_tx_ring_size_per_queue(bp);
queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
&queue->tx_ring_dma,
GFP_KERNEL);
@@ -2560,7 +2565,7 @@ static int macb_alloc_consistent(struct macb *bp)
if (!queue->tx_skb)
goto out_err;
- size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
+ size = macb_rx_ring_size_per_queue(bp);
queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
&queue->rx_ring_dma,
GFP_KERNEL);
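As a worked example of the new helpers (with hypothetical values:
16-byte descriptors from macb_dma_desc_get_size(), the default 512-entry
Tx ring, and a 64-byte tx_bd_rd_prefetch), macb_tx_ring_size_per_queue()
returns 16 * 512 + 64 = 8256 bytes per queue.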
--
2.50.1
* Re: [PATCH net v4 3/5] net: macb: move ring size computation to functions
From: Nicolas Ferre @ 2025-08-26 9:30 UTC
To: Théo Lebrun, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk
On 20/08/2025 at 16:55, Théo Lebrun wrote:
> The Tx/Rx ring size calculation is somewhat complex and partially hidden
> behind macros. Move it out of the {RX,TX}_RING_BYTES() macros and the
> macb_{alloc,free}_consistent() functions into dedicated helper functions.
I agree.
> In macb_free_consistent(), we drop the size variable and directly call
> the size helpers in the argument list. In macb_alloc_consistent(), we
> keep the size variable, as it is used by netdev_dbg() calls.
>
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
> ---
> drivers/net/ethernet/cadence/macb_main.c | 27 ++++++++++++++++-----------
> 1 file changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 69325665c766927797ca2e1eb1384105bcde3cb5..d413e8bd4977187fd73f7cc48268baf933aab051 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -51,14 +51,10 @@ struct sifive_fu540_macb_mgmt {
> #define DEFAULT_RX_RING_SIZE 512 /* must be power of 2 */
> #define MIN_RX_RING_SIZE 64
> #define MAX_RX_RING_SIZE 8192
> -#define RX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
> - * (bp)->rx_ring_size)
>
> #define DEFAULT_TX_RING_SIZE 512 /* must be power of 2 */
> #define MIN_TX_RING_SIZE 64
> #define MAX_TX_RING_SIZE 4096
> -#define TX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
> - * (bp)->tx_ring_size)
>
> /* level of occupied TX descriptors under which we wake up TX process */
> #define MACB_TX_WAKEUP_THRESH(bp) (3 * (bp)->tx_ring_size / 4)
> @@ -2466,11 +2462,20 @@ static void macb_free_rx_buffers(struct macb *bp)
> }
> }
>
> +static unsigned int macb_tx_ring_size_per_queue(struct macb *bp)
> +{
> + return macb_dma_desc_get_size(bp) * bp->tx_ring_size + bp->tx_bd_rd_prefetch;
> +}
> +
> +static unsigned int macb_rx_ring_size_per_queue(struct macb *bp)
> +{
> + return macb_dma_desc_get_size(bp) * bp->rx_ring_size + bp->rx_bd_rd_prefetch;
> +}
> +
> static void macb_free_consistent(struct macb *bp)
> {
> struct macb_queue *queue;
> unsigned int q;
> - int size;
>
> if (bp->rx_ring_tieoff) {
> dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
> @@ -2484,14 +2489,14 @@ static void macb_free_consistent(struct macb *bp)
> kfree(queue->tx_skb);
> queue->tx_skb = NULL;
> if (queue->tx_ring) {
> - size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
> - dma_free_coherent(&bp->pdev->dev, size,
> + dma_free_coherent(&bp->pdev->dev,
> + macb_tx_ring_size_per_queue(bp),
> queue->tx_ring, queue->tx_ring_dma);
> queue->tx_ring = NULL;
> }
> if (queue->rx_ring) {
> - size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
> - dma_free_coherent(&bp->pdev->dev, size,
> + dma_free_coherent(&bp->pdev->dev,
> + macb_rx_ring_size_per_queue(bp),
> queue->rx_ring, queue->rx_ring_dma);
> queue->rx_ring = NULL;
> }
> @@ -2542,7 +2547,7 @@ static int macb_alloc_consistent(struct macb *bp)
> int size;
>
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> - size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
> + size = macb_tx_ring_size_per_queue(bp);
> queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> &queue->tx_ring_dma,
> GFP_KERNEL);
> @@ -2560,7 +2565,7 @@ static int macb_alloc_consistent(struct macb *bp)
> if (!queue->tx_skb)
> goto out_err;
>
> - size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
> + size = macb_rx_ring_size_per_queue(bp);
> queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> &queue->rx_ring_dma,
> GFP_KERNEL);
>
> --
> 2.50.1
>
* [PATCH net v4 4/5] net: macb: single dma_alloc_coherent() for DMA descriptors
From: Théo Lebrun @ 2025-08-20 14:55 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Théo Lebrun, Sean Anderson
Move from 2*NUM_QUEUES dma_alloc_coherent() calls for the DMA descriptor
rings to 2 calls overall.
The issue is that all queues share the same register for configuring the
upper 32 bits of the Tx/Rx descriptor rings. Taking Tx as an example,
notice how TBQPH does *not* depend on the queue index:
#define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
#define GEM_TBQPH(hw_q) (0x04C8)
queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
if (bp->hw_dma_cap & HW_DMA_CAP_64B)
queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
#endif
To maximise our chances of getting valid DMA addresses, we do a single
dma_alloc_coherent() across queues. This improves the odds because
alloc_pages() guarantees natural alignment. Other codepaths (IOMMU or
dev/arch dma_map_ops) don't give high enough guarantees
(even page-aligned isn't enough).
Two considerations:
- dma_alloc_coherent() gives us page alignment. Here we remove this
  constraint, meaning each queue's ring won't be page-aligned anymore.
- This can save small amounts of memory. Fewer allocations mean
  (1) less overhead (constant cost per alloc) and (2) fewer wasted bytes
  due to alignment constraints.
Example for (2): 4 queues, default ring size (512), 64-bit DMA
descriptors, 16K pages:
- Before: 8 allocs of 8K, each rounded to 16K => 64K wasted.
- After: 2 allocs of 32K => 0K wasted.
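(A note on why natural alignment is enough: a naturally aligned
power-of-two block of size at most 4 GiB cannot straddle a 4 GiB
boundary, so every ring carved out of one such allocation shares the
same upper 32 bits; this is the property the upper_32_bits() check in
the patch below verifies.)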
Fixes: 02c958dd3446 ("net/macb: add TX multiqueue support for gem")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
drivers/net/ethernet/cadence/macb_main.c | 80 ++++++++++++++++----------------
1 file changed, 41 insertions(+), 39 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index d413e8bd4977187fd73f7cc48268baf933aab051..7f31f264a6d342ea01e2f61944b12c9b9a3fe66e 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -2474,32 +2474,30 @@ static unsigned int macb_rx_ring_size_per_queue(struct macb *bp)
static void macb_free_consistent(struct macb *bp)
{
+ struct device *dev = &bp->pdev->dev;
struct macb_queue *queue;
unsigned int q;
+ size_t size;
if (bp->rx_ring_tieoff) {
- dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
+ dma_free_coherent(dev, macb_dma_desc_get_size(bp),
bp->rx_ring_tieoff, bp->rx_ring_tieoff_dma);
bp->rx_ring_tieoff = NULL;
}
bp->macbgem_ops.mog_free_rx_buffers(bp);
+ size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
+ dma_free_coherent(dev, size, bp->queues[0].tx_ring, bp->queues[0].tx_ring_dma);
+
+ size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
+ dma_free_coherent(dev, size, bp->queues[0].rx_ring, bp->queues[0].rx_ring_dma);
+
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
kfree(queue->tx_skb);
queue->tx_skb = NULL;
- if (queue->tx_ring) {
- dma_free_coherent(&bp->pdev->dev,
- macb_tx_ring_size_per_queue(bp),
- queue->tx_ring, queue->tx_ring_dma);
- queue->tx_ring = NULL;
- }
- if (queue->rx_ring) {
- dma_free_coherent(&bp->pdev->dev,
- macb_rx_ring_size_per_queue(bp),
- queue->rx_ring, queue->rx_ring_dma);
- queue->rx_ring = NULL;
- }
+ queue->tx_ring = NULL;
+ queue->rx_ring = NULL;
}
}
@@ -2541,41 +2539,45 @@ static int macb_alloc_rx_buffers(struct macb *bp)
static int macb_alloc_consistent(struct macb *bp)
{
+ struct device *dev = &bp->pdev->dev;
+ dma_addr_t tx_dma, rx_dma;
struct macb_queue *queue;
unsigned int q;
- u32 upper;
- int size;
+ void *tx, *rx;
+ size_t size;
+
+	/*
+	 * The upper 32 bits of the Tx/Rx DMA descriptor rings must match
+	 * across queues! We cannot enforce this guarantee; the best we can
+	 * do is a single allocation and hope it lands in alloc_pages(),
+	 * which guarantees natural alignment of physical addresses.
+	 */
+
+ size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
+ tx = dma_alloc_coherent(dev, size, &tx_dma, GFP_KERNEL);
+ if (!tx || upper_32_bits(tx_dma) != upper_32_bits(tx_dma + size - 1))
+ goto out_err;
+ netdev_dbg(bp->dev, "Allocated %zu bytes for %u TX rings at %08lx (mapped %p)\n",
+ size, bp->num_queues, (unsigned long)tx_dma, tx);
+
+ size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
+ rx = dma_alloc_coherent(dev, size, &rx_dma, GFP_KERNEL);
+ if (!rx || upper_32_bits(rx_dma) != upper_32_bits(rx_dma + size - 1))
+ goto out_err;
+ netdev_dbg(bp->dev, "Allocated %zu bytes for %u RX rings at %08lx (mapped %p)\n",
+ size, bp->num_queues, (unsigned long)rx_dma, rx);
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
- size = macb_tx_ring_size_per_queue(bp);
- queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
- &queue->tx_ring_dma,
- GFP_KERNEL);
- upper = upper_32_bits(queue->tx_ring_dma);
- if (!queue->tx_ring ||
- upper != upper_32_bits(bp->queues[0].tx_ring_dma))
- goto out_err;
- netdev_dbg(bp->dev,
- "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
- q, size, (unsigned long)queue->tx_ring_dma,
- queue->tx_ring);
+ queue->tx_ring = tx + macb_tx_ring_size_per_queue(bp) * q;
+ queue->tx_ring_dma = tx_dma + macb_tx_ring_size_per_queue(bp) * q;
+
+ queue->rx_ring = rx + macb_rx_ring_size_per_queue(bp) * q;
+ queue->rx_ring_dma = rx_dma + macb_rx_ring_size_per_queue(bp) * q;
size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
queue->tx_skb = kmalloc(size, GFP_KERNEL);
if (!queue->tx_skb)
goto out_err;
-
- size = macb_rx_ring_size_per_queue(bp);
- queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
- &queue->rx_ring_dma,
- GFP_KERNEL);
- upper = upper_32_bits(queue->rx_ring_dma);
- if (!queue->rx_ring ||
- upper != upper_32_bits(bp->queues[0].rx_ring_dma))
- goto out_err;
- netdev_dbg(bp->dev,
- "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
- size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
}
if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
goto out_err;
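A user-space sketch of the carving scheme (illustrative only: malloc()
stands in for dma_alloc_coherent(), and the per-queue ring size is a
hypothetical constant standing in for macb_tx_ring_size_per_queue()):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_QUEUES		4
#define TX_RING_BYTES_PER_QUEUE	(512 * 16) /* 512 descs of 16 bytes */

int main(void)
{
	size_t size = NUM_QUEUES * (size_t)TX_RING_BYTES_PER_QUEUE;
	uint8_t *tx = malloc(size); /* one allocation for all queues */

	if (!tx)
		return 1;

	/* Each queue's ring is a fixed-size slice of the single block,
	 * mirroring queue->tx_ring = tx + ring_size * q in the patch.
	 */
	for (unsigned int q = 0; q < NUM_QUEUES; q++)
		printf("queue %u ring at offset %#zx\n",
		       q, (size_t)TX_RING_BYTES_PER_QUEUE * q);

	free(tx);
	return 0;
}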
--
2.50.1
* Re: [PATCH net v4 4/5] net: macb: single dma_alloc_coherent() for DMA descriptors
From: Nicolas Ferre @ 2025-08-26 15:23 UTC
To: Théo Lebrun, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Sean Anderson
On 20/08/2025 at 16:55, Théo Lebrun wrote:
> Move from 2*NUM_QUEUES dma_alloc_coherent() calls for the DMA descriptor
> rings to 2 calls overall.
>
> The issue is that all queues share the same register for configuring the
> upper 32 bits of the Tx/Rx descriptor rings. Taking Tx as an example,
> notice how TBQPH does *not* depend on the queue index:
>
> #define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
> #define GEM_TBQPH(hw_q) (0x04C8)
>
> queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
> #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> if (bp->hw_dma_cap & HW_DMA_CAP_64B)
> queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
> #endif
>
> To maximise our chances of getting valid DMA addresses, we do a single
> dma_alloc_coherent() across queues. This improves the odds because
> alloc_pages() guarantees natural alignment. Other codepaths (IOMMU or
> dev/arch dma_map_ops) don't give high enough guarantees
> (even page-aligned isn't enough).
>
> Two considerations:
>
> - dma_alloc_coherent() gives us page alignment. Here we remove this
>   constraint, meaning each queue's ring won't be page-aligned anymore.
However... we must guarantee alignment depending on the controller's
bus width (32 or 64 bits)... but being page-aligned and having
descriptors that are a multiple of 64 bits anyway, we're good for each
descriptor (might be worth explicitly adding).
>
> - This can save small amounts of memory. Fewer allocations mean
>   (1) less overhead (constant cost per alloc) and (2) fewer wasted bytes
>   due to alignment constraints.
>
> Example for (2): 4 queues, default ring size (512), 64-bit DMA
> descriptors, 16K pages:
> - Before: 8 allocs of 8K, each rounded to 16K => 64K wasted.
> - After: 2 allocs of 32K => 0K wasted.
>
> Fixes: 02c958dd3446 ("net/macb: add TX multiqueue support for gem")
> Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com> # on sam9x75
> ---
> drivers/net/ethernet/cadence/macb_main.c | 80 ++++++++++++++++----------------
> 1 file changed, 41 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index d413e8bd4977187fd73f7cc48268baf933aab051..7f31f264a6d342ea01e2f61944b12c9b9a3fe66e 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -2474,32 +2474,30 @@ static unsigned int macb_rx_ring_size_per_queue(struct macb *bp)
>
> static void macb_free_consistent(struct macb *bp)
> {
> + struct device *dev = &bp->pdev->dev;
> struct macb_queue *queue;
> unsigned int q;
> + size_t size;
>
> if (bp->rx_ring_tieoff) {
> - dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
> + dma_free_coherent(dev, macb_dma_desc_get_size(bp),
> bp->rx_ring_tieoff, bp->rx_ring_tieoff_dma);
> bp->rx_ring_tieoff = NULL;
> }
>
> bp->macbgem_ops.mog_free_rx_buffers(bp);
>
> + size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
> + dma_free_coherent(dev, size, bp->queues[0].tx_ring, bp->queues[0].tx_ring_dma);
> +
> + size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
> + dma_free_coherent(dev, size, bp->queues[0].rx_ring, bp->queues[0].rx_ring_dma);
> +
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> kfree(queue->tx_skb);
> queue->tx_skb = NULL;
> - if (queue->tx_ring) {
> - dma_free_coherent(&bp->pdev->dev,
> - macb_tx_ring_size_per_queue(bp),
> - queue->tx_ring, queue->tx_ring_dma);
> - queue->tx_ring = NULL;
> - }
> - if (queue->rx_ring) {
> - dma_free_coherent(&bp->pdev->dev,
> - macb_rx_ring_size_per_queue(bp),
> - queue->rx_ring, queue->rx_ring_dma);
> - queue->rx_ring = NULL;
> - }
> + queue->tx_ring = NULL;
> + queue->rx_ring = NULL;
> }
> }
>
> @@ -2541,41 +2539,45 @@ static int macb_alloc_rx_buffers(struct macb *bp)
>
> static int macb_alloc_consistent(struct macb *bp)
> {
> + struct device *dev = &bp->pdev->dev;
> + dma_addr_t tx_dma, rx_dma;
> struct macb_queue *queue;
> unsigned int q;
> - u32 upper;
> - int size;
> + void *tx, *rx;
> + size_t size;
> +
> +	/*
> +	 * The upper 32 bits of the Tx/Rx DMA descriptor rings must match
> +	 * across queues! We cannot enforce this guarantee; the best we can
> +	 * do is a single allocation and hope it lands in alloc_pages(),
> +	 * which guarantees natural alignment of physical addresses.
> +	 */
> +
> + size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
> + tx = dma_alloc_coherent(dev, size, &tx_dma, GFP_KERNEL);
> + if (!tx || upper_32_bits(tx_dma) != upper_32_bits(tx_dma + size - 1))
> + goto out_err;
> + netdev_dbg(bp->dev, "Allocated %zu bytes for %u TX rings at %08lx (mapped %p)\n",
> + size, bp->num_queues, (unsigned long)tx_dma, tx);
> +
> + size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
> + rx = dma_alloc_coherent(dev, size, &rx_dma, GFP_KERNEL);
> + if (!rx || upper_32_bits(rx_dma) != upper_32_bits(rx_dma + size - 1))
> + goto out_err;
> + netdev_dbg(bp->dev, "Allocated %zu bytes for %u RX rings at %08lx (mapped %p)\n",
> + size, bp->num_queues, (unsigned long)rx_dma, rx);
>
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> - size = macb_tx_ring_size_per_queue(bp);
> - queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> - &queue->tx_ring_dma,
> - GFP_KERNEL);
> - upper = upper_32_bits(queue->tx_ring_dma);
> - if (!queue->tx_ring ||
> - upper != upper_32_bits(bp->queues[0].tx_ring_dma))
> - goto out_err;
> - netdev_dbg(bp->dev,
> - "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
> - q, size, (unsigned long)queue->tx_ring_dma,
> - queue->tx_ring);
> + queue->tx_ring = tx + macb_tx_ring_size_per_queue(bp) * q;
> + queue->tx_ring_dma = tx_dma + macb_tx_ring_size_per_queue(bp) * q;
> +
> + queue->rx_ring = rx + macb_rx_ring_size_per_queue(bp) * q;
> + queue->rx_ring_dma = rx_dma + macb_rx_ring_size_per_queue(bp) * q;
>
> size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
> queue->tx_skb = kmalloc(size, GFP_KERNEL);
> if (!queue->tx_skb)
> goto out_err;
> -
> - size = macb_rx_ring_size_per_queue(bp);
> - queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
> - &queue->rx_ring_dma,
> - GFP_KERNEL);
> - upper = upper_32_bits(queue->rx_ring_dma);
> - if (!queue->rx_ring ||
> - upper != upper_32_bits(bp->queues[0].rx_ring_dma))
> - goto out_err;
> - netdev_dbg(bp->dev,
> - "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
> - size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
> }
> if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
> goto out_err;
>
> --
> 2.50.1
>
* [PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()
From: Théo Lebrun @ 2025-08-20 14:55 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Théo Lebrun, Sean Anderson
writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.
On little-endian system:
- cpu_to_le32() is a no-op (LE->LE),
- writel() is a no-op (LE->LE),
- dev_addr will therefore not be swapped and written as-is.
On big-endian system:
- cpu_to_le32() is a swap (BE->LE),
- writel() is a swap (BE->LE),
- dev_addr will therefore be swapped twice and written as a BE value.
This was found using sparse:
⟩ make C=2 drivers/net/ethernet/cadence/macb_main.o
warning: incorrect type in assignment (different base types)
expected unsigned int [usertype] bottom
got restricted __le32 [usertype]
warning: incorrect type in assignment (different base types)
expected unsigned short [usertype] top
got restricted __le16 [usertype]
...
Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
drivers/net/ethernet/cadence/macb_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 7f31f264a6d342ea01e2f61944b12c9b9a3fe66e..fe319d77f2a8f6b1f3b698e0d11781936345ea8f 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -274,9 +274,9 @@ static void macb_set_hwaddr(struct macb *bp)
u32 bottom;
u16 top;
- bottom = cpu_to_le32(*((u32 *)bp->dev->dev_addr));
+ bottom = *((u32 *)bp->dev->dev_addr);
macb_or_gem_writel(bp, SA1B, bottom);
- top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
+ top = *((u16 *)(bp->dev->dev_addr + 4));
macb_or_gem_writel(bp, SA1T, top);
if (gem_has_ptp(bp)) {
--
2.50.1
* Re: [PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()
From: Russell King (Oracle) @ 2025-08-20 15:25 UTC
To: Théo Lebrun
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, netdev, devicetree, linux-kernel,
Thomas Petazzoni, Tawfik Bayouk, Sean Anderson
On Wed, Aug 20, 2025 at 04:55:09PM +0200, Théo Lebrun wrote:
> writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.
>
> On little-endian system:
> - cpu_to_le32() is a no-op (LE->LE),
> - writel() is a no-op (LE->LE),
> - dev_addr will therefore not be swapped and written as-is.
>
> On big-endian system:
> - cpu_to_le32() is a swap (BE->LE),
> - writel() is a swap (BE->LE),
> - dev_addr will therefore be swapped twice and written as a BE value.
I'm not convinced by this; I think you're missing something.
writel() on a BE or LE system will give you bits 7:0 of the CPU value
written to LE bits 7:0 of the register. It has to be this way, otherwise
we would need to do endian conversions everywhere we write simple
numbers to device registers.
Why?
Remember that on a LE system with a 32-bit bus, a hex value of
0x76543210 at the CPU when written without conversion will appear
as:
0 on bus bits 0:3
1 on bus bits 4:7
...
6 on bus bits 24:27
7 on bus bits 28:31
whereas on a BE system, this is reversed:
6 on bus bits 0:3
7 on bus bits 4:7
...
0 on bus bits 24:27
1 on bus bits 28:31
The specification is that writel() will write in LE format even on
BE systems, so there is a need to do an endian conversion for BE
systems.
So, if a device expects bits 0:7 on the bus to be the first byte of
the MAC address (high byte of the OUI) then this must be in CPU
bits 0:7 as well.
Now, assuming that a MAC address of AA:BB:CC:DD:EE:FF gets read as
0xDDCCBBAA by the first read on a LE machine, it will get read as
0xAABBCCDD on a BE machine.
We can now see that combining these two, getting rid of the
cpu_to_le32() is likely wrong.
Therefore, I am not convinced this patch is actually correct.
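A small user-space experiment (not from the thread; memcpy() models the
*(u32 *)dev_addr read) demonstrating the claim about the first read: the
same code prints 0xddccbbaa on an LE host and 0xaabbccdd on a BE host:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	const uint8_t dev_addr[6] = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF };
	uint32_t bottom;

	/* Models the *(u32 *)bp->dev->dev_addr read in macb_set_hwaddr();
	 * the result depends on the endianness of the host running this.
	 */
	memcpy(&bottom, dev_addr, sizeof(bottom));
	printf("*(u32 *)dev_addr = %#x\n", (unsigned int)bottom);
	return 0;
}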
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()
From: Théo Lebrun @ 2025-09-05 9:02 UTC
To: Russell King (Oracle)
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven, Harini Katakam,
Richard Cochran, netdev, devicetree, linux-kernel,
Thomas Petazzoni, Tawfik Bayouk, Sean Anderson
Hello Russell,
On Wed Aug 20, 2025 at 5:25 PM CEST, Russell King (Oracle) wrote:
> On Wed, Aug 20, 2025 at 04:55:09PM +0200, Théo Lebrun wrote:
>> writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.
>>
>> On little-endian system:
>> - cpu_to_le32() is a no-op (LE->LE),
>> - writel() is a no-op (LE->LE),
>> - dev_addr will therefore not be swapped and written as-is.
>>
>> On big-endian system:
>> - cpu_to_le32() is a swap (BE->LE),
>> - writel() is a swap (BE->LE),
>> - dev_addr will therefore be swapped twice and written as a BE value.
>
> I'm not convinced by this; I think you're missing something.
>
> writel() on a BE or LE system will give you bits 7:0 of the CPU value
> written to LE bits 7:0 of the register. It has to be this way, otherwise
> we would need to do endian conversions everywhere we write simple
> numbers to device registers.
>
> Why?
>
> Remember that on a LE system with a 32-bit bus, a hex value of
> 0x76543210 at the CPU when written without conversion will appear
> as:
> 0 on bus bits 0:3
> 1 on bus bits 4:7
> ...
> 6 on bus bits 24:27
> 7 on bus bits 28:31
>
> whereas on a BE system, this is reversed:
> 6 on bus bits 0:3
> 7 on bus bits 4:7
> ...
> 0 on bus bits 24:27
> 1 on bus bits 28:31
>
> The specification is that writel() will write in LE format even on
> BE systems, so there is a need to do an endian conversion for BE
> systems.
>
> So, if a device expects bits 0:7 on the bus to be the first byte of
> the MAC address (high byte of the OUI) then this must be in CPU
> bits 0:7 as well.
>
>
> Now, assuming that a MAC address of AA:BB:CC:DD:EE:FF gets read as
> 0xDDCCBBAA by the first read on a LE machine, it will get read as
> 0xAABBCCDD on a BE machine.
>
> We can now see that combining these two, getting rid of the
> cpu_to_le32() is likely wrong.
>
> Therefore, I am not convinced this patch is actually correct.
Thanks for the above in-detail explanation. I agree with it all.
I have always had a hard time wrapping my head around endianness
conversions.
Indeed the patch is wrong: the swap is required on BE platforms.
My gripe is more with the semantics of the current code:
bottom = cpu_to_le32(*((u32 *)bp->dev->dev_addr));
macb_or_gem_writel(bp, SA1B, bottom);
top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
macb_or_gem_writel(bp, SA1T, top);
Notice how:
- The type of the argument to cpu_to_le32() is off; the pointer claims a
  CPU-endian value (u32) but in reality points to an LE32.
- We apply cpu_to_le32() to get a swap, but the semantics are wrong; the
  input value is an LE32 that we want to turn into CPU-endian.
- The above two points apply to `u16 top` as well.
- The writel() calls are unrelated to the issue; they do the right thing
  by writing a CPU value into an LE device register.
Sparse is complaining about the second bulletpoint; it won't complain
about the first one because it trusts that you know what you are doing
with explicit casts.
warning: incorrect type in assignment (different base types)
expected unsigned int [usertype] bottom
got restricted __le32 [usertype]
If we want to keep the same structure, this does exactly the same thing
but its semantics are more aligned with reality (to my eyes):
bottom = le32_to_cpu(*((__le32 *)bp->dev->dev_addr));
macb_or_gem_writel(bp, SA1B, bottom);
top = le16_to_cpu(*((__le16 *)(bp->dev->dev_addr + 4)));
macb_or_gem_writel(bp, SA1T, top);
Notice how:
- Casts are fixed to signal proper types.
- Use le32_to_cpu().
Sparse is happy and the code has been tested on a BE platform.
The generated assembly is strictly identical.
However, I think we can do better. Second option:
const unsigned char *addr = bp->dev->dev_addr;
bottom = addr[0] << 0 | addr[1] << 8 | addr[2] << 16 | addr[3] << 24;
top = addr[4] << 0 | addr[5] << 8;
This is a bit of a mouthful; what about this one?
bottom = get_unaligned_le32(addr);
top = get_unaligned_le16(addr + 4);
It is my preferred option. I found those helpers while reading other
code that accesses the `unsigned char *dev_addr` field. Explicit and
straightforward.
Can you confirm that last option fits well?
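For reference, a user-space model of that last option (a sketch: these
re-implement the behaviour of the kernel's get_unaligned_le32() and
get_unaligned_le16(); they are not the kernel code). It yields the same
register values LE hosts have always written, on any host endianness:

#include <stdint.h>
#include <stdio.h>

/* Byte-by-byte assembly: identical result on LE and BE hosts. */
static uint32_t get_unaligned_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint16_t get_unaligned_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | p[1] << 8);
}

int main(void)
{
	const uint8_t dev_addr[6] = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF };

	/* 0xddccbbaa and 0xffee, matching the open-coded shifts above. */
	printf("bottom = %#x\n", (unsigned int)get_unaligned_le32(dev_addr));
	printf("top    = %#x\n", (unsigned int)get_unaligned_le16(dev_addr + 4));
	return 0;
}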
Thanks Russell,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: [PATCH net v4 0/5] net: macb: various fixes
From: Théo Lebrun @ 2025-08-20 15:03 UTC
To: Théo Lebrun, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Nicolas Ferre, Claudiu Beznea, Geert Uytterhoeven,
Harini Katakam, Richard Cochran, Russell King
Cc: netdev, devicetree, linux-kernel, Thomas Petazzoni, Tawfik Bayouk,
Krzysztof Kozlowski, Sean Anderson
On Wed Aug 20, 2025 at 4:55 PM CEST, Théo Lebrun wrote:
> Changes in v4:
> - Drop 11 patches that are only cleanups. That includes the
> RBOF/skb_reserve() patch that, after discussion with Sean [1], has
> had its Fixes trailer dropped. "move ring size computation to
> functions" is the only non-fix patch that is kept, as it is depended
> upon by further patches. Dropped patches:
> dt-bindings: net: cdns,macb: sort compatibles
> net: macb: match skb_reserve(skb, NET_IP_ALIGN) with HW alignment
> net: macb: use BIT() macro for capability definitions
> net: macb: remove gap in MACB_CAPS_* flags
> net: macb: Remove local variables clk_init and init in macb_probe()
> net: macb: drop macb_config NULL checking
> net: macb: simplify macb_dma_desc_get_size()
> net: macb: simplify macb_adj_dma_desc_idx()
> net: macb: move bp->hw_dma_cap flags to bp->caps
> net: macb: introduce DMA descriptor helpers (is 64bit? is PTP?)
> net: macb: sort #includes
> [1]: https://lore.kernel.org/lkml/d4bead1c-697a-46d8-ba9c-64292fccb19f@linux.dev/
> - Link to v3: https://lore.kernel.org/r/20250808-macb-fixes-v3-0-08f1fcb5179f@bootlin.com
And I forgot to mention that Jakub's comment [0] about wrapping to 80
characters was taken into account. Sorry about that omission!
[0]: https://lore.kernel.org/lkml/20250808160615.695beafe@kernel.org/
Thanks,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com