public inbox for linux-mediatek@lists.infradead.org
* [PATCH net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
@ 2026-04-10 22:05 Lorenzo Bianconi
  2026-04-14  8:29 ` [net-next] " Paolo Abeni
  0 siblings, 1 reply; 3+ messages in thread
From: Lorenzo Bianconi @ 2026-04-10 22:05 UTC
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: linux-arm-kernel, linux-mediatek, netdev, Lorenzo Bianconi

Wait for TX to complete in airoha_dev_stop routine before stopping the
TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
sure the TX NAPIs have completed before stopping the TX DMA engine in
airoha_dev_stop routine.
Please note this patch is based on commit b1c803d5c816 ("net: airoha:
Rework the code flow in airoha_remove() and in airoha_probe() error
path"), which is available only in the net-next tree at the moment.
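
The "first user enables, last user disables" idea described above can be
sketched in plain user-space C. This is only an illustration under stated
assumptions: struct fake_qdma, fake_open() and fake_stop() are hypothetical
names that do not exist in the driver, C11 atomics stand in for the kernel's
atomic_t helpers, and two booleans stand in for the DMA enable bits and the
NAPI state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared QDMA block; not the driver's struct. */
struct fake_qdma {
	atomic_int users;   /* number of opened net devices sharing the engine */
	bool dma_enabled;   /* models the TX/RX DMA enable bits */
	bool napi_enabled;  /* models the NAPI instances being enabled */
};

/* Models the ndo_open() side: only the first user enables DMA and NAPI. */
static void fake_open(struct fake_qdma *q)
{
	assert(q);
	/* atomic_fetch_add() returns the value held before the increment. */
	if (atomic_fetch_add(&q->users, 1) == 0) {
		q->dma_enabled = true;
		q->napi_enabled = true;
	}
}

/* Models the ndo_stop() side: only the last user tears things down. */
static void fake_stop(struct fake_qdma *q)
{
	assert(q);
	/* A previous value of 1 means the counter just dropped to zero. */
	if (atomic_fetch_sub(&q->users, 1) == 1) {
		q->napi_enabled = false;
		q->dma_enabled = false;
	}
}
```

With two fake_open() calls followed by one fake_stop(), both flags stay set;
they are cleared only by the second fake_stop(), which mirrors the intent of
keying the NAPI start/stop on qdma->users.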

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/airoha/airoha_eth.c | 44 +++++++++++++++++---------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 8e4b043af4bc..9e40c8f375c1 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -1662,10 +1662,12 @@ static int airoha_dev_open(struct net_device *dev)
 		      FIELD_PREP(GDM_SHORT_LEN_MASK, 60) |
 		      FIELD_PREP(GDM_LONG_LEN_MASK, len));
 
-	airoha_qdma_set(qdma, REG_QDMA_GLOBAL_CFG,
-			GLOBAL_CFG_TX_DMA_EN_MASK |
-			GLOBAL_CFG_RX_DMA_EN_MASK);
-	atomic_inc(&qdma->users);
+	if (!atomic_fetch_inc(&qdma->users)) {
+		airoha_qdma_set(qdma, REG_QDMA_GLOBAL_CFG,
+				GLOBAL_CFG_TX_DMA_EN_MASK |
+				GLOBAL_CFG_RX_DMA_EN_MASK);
+		airoha_qdma_start_napi(qdma);
+	}
 
 	if (port->id == AIROHA_GDM2_IDX &&
 	    airoha_ppe_is_enabled(qdma->eth, 1)) {
@@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
 	struct airoha_qdma *qdma = port->qdma;
 	int i, err;
 
-	netif_tx_disable(dev);
 	err = airoha_set_vip_for_gdm_port(port, false);
 	if (err)
 		return err;
 
-	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
-		netdev_tx_reset_subqueue(dev, i);
-
 	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
 				    FE_PSE_PORT_DROP);
 
+	netif_tx_disable(dev);
 	if (atomic_dec_and_test(&qdma->users)) {
+		u32 val;
+
+		/* Wait for TX to complete */
+		err = read_poll_timeout(airoha_qdma_rr, val,
+					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
+					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
+					false, qdma, REG_QDMA_GLOBAL_CFG);
+		if (err)
+			return err;
+
+		airoha_qdma_stop_napi(qdma);
 		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
 				  GLOBAL_CFG_TX_DMA_EN_MASK |
 				  GLOBAL_CFG_RX_DMA_EN_MASK);
@@ -1708,6 +1718,9 @@ static int airoha_dev_stop(struct net_device *dev)
 		}
 	}
 
+	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
+		netdev_tx_reset_subqueue(dev, i);
+
 	return 0;
 }
 
@@ -3048,9 +3061,6 @@ static int airoha_probe(struct platform_device *pdev)
 	if (err)
 		goto error_netdev_free;
 
-	for (i = 0; i < ARRAY_SIZE(eth->qdma); i++)
-		airoha_qdma_start_napi(&eth->qdma[i]);
-
 	for_each_child_of_node(pdev->dev.of_node, np) {
 		if (!of_device_is_compatible(np, "airoha,eth-mac"))
 			continue;
@@ -3061,20 +3071,17 @@ static int airoha_probe(struct platform_device *pdev)
 		err = airoha_alloc_gdm_port(eth, np);
 		if (err) {
 			of_node_put(np);
-			goto error_napi_stop;
+			goto error_netdev_unregister;
 		}
 	}
 
 	err = airoha_register_gdm_devices(eth);
 	if (err)
-		goto error_napi_stop;
+		goto error_netdev_unregister;
 
 	return 0;
 
-error_napi_stop:
-	for (i = 0; i < ARRAY_SIZE(eth->qdma); i++)
-		airoha_qdma_stop_napi(&eth->qdma[i]);
-
+error_netdev_unregister:
 	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
 		struct airoha_gdm_port *port = eth->ports[i];
 
@@ -3098,9 +3105,6 @@ static void airoha_remove(struct platform_device *pdev)
 	struct airoha_eth *eth = platform_get_drvdata(pdev);
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(eth->qdma); i++)
-		airoha_qdma_stop_napi(&eth->qdma[i]);
-
 	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
 		struct airoha_gdm_port *port = eth->ports[i];
 

---
base-commit: 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7
change-id: 20260410-airoha-fix-ndo_stop-ebbf3c724ae0

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>




* Re: [net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
  2026-04-10 22:05 [PATCH net-next] net: airoha: Wait for TX to complete in airoha_dev_stop() Lorenzo Bianconi
@ 2026-04-14  8:29 ` Paolo Abeni
  2026-04-14  8:57   ` Lorenzo Bianconi
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2026-04-14  8:29 UTC
  To: lorenzo
  Cc: edumazet, kuba, davem, netdev, andrew+netdev, pabeni,
	linux-arm-kernel, linux-mediatek

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: airoha: Wait for TX to complete in airoha_dev_stop()

Wait for TX to complete in airoha_dev_stop routine before stopping the
TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
sure the TX NAPIs have completed before stopping the TX DMA engine in
airoha_dev_stop routine.

> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 8e4b043af4bc..9e40c8f375c1 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c

[ ... ]

> @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
>  	struct airoha_qdma *qdma = port->qdma;
>  	int i, err;
>
> -	netif_tx_disable(dev);
>  	err = airoha_set_vip_for_gdm_port(port, false);
>  	if (err)
>  		return err;
>
> -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> -		netdev_tx_reset_subqueue(dev, i);
> -
>  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
>  				    FE_PSE_PORT_DROP);
>
> +	netif_tx_disable(dev);
>  	if (atomic_dec_and_test(&qdma->users)) {
> +		u32 val;
> +
> +		/* Wait for TX to complete */
> +		err = read_poll_timeout(airoha_qdma_rr, val,
> +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> +					false, qdma, REG_QDMA_GLOBAL_CFG);
> +		if (err)
> +			return err;
>                   ^^^^^^^^^^

Can this leave the device in an inconsistent state? When read_poll_timeout()
fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
immediately. However, atomic_dec_and_test(&qdma->users) has already
decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
so all NAPIs remain enabled.

The networking core (__dev_close_many in net/core/dev.c) ignores the
ndo_stop return value and marks the device as down regardless. When the
device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
because the NAPI was never disabled.

The early return also skips the netdev_tx_reset_subqueue() calls at the end
of the function.

Should the code either continue with cleanup on timeout (don't return), or
ensure NAPI and DMA state are properly cleaned up before returning?
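
As an illustration of the first option (continue with cleanup on timeout),
here is a minimal user-space model. It is not the driver code and not the
v2 fix: struct stop_model and the model_*() helpers are hypothetical, a
bounded loop stands in for read_poll_timeout(), and an -EBUSY return from
model_open() stands in for the BUG_ON described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the stop path. */
struct stop_model {
	int users;          /* models qdma->users */
	bool napi_enabled;  /* models the NAPI instances being enabled */
	bool dma_enabled;   /* models the TX/RX DMA enable bits */
	int busy_polls;     /* polls during which TX DMA still reports busy */
};

/* Poll up to max_polls times; return 0 once idle, -ETIMEDOUT otherwise. */
static int model_wait_tx_idle(struct stop_model *m, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (m->busy_polls == 0)
			return 0;
		m->busy_polls--;
	}
	return m->busy_polls == 0 ? 0 : -ETIMEDOUT;
}

/* Stop variant that keeps tearing down even if the wait times out. */
static int model_stop(struct stop_model *m, int max_polls)
{
	int err = 0;

	assert(m && m->users > 0);
	if (--m->users == 0) {
		/* Remember the error, but do not return before cleanup. */
		err = model_wait_tx_idle(m, max_polls);
		m->napi_enabled = false;
		m->dma_enabled = false;
	}
	return err;
}

/* Open side: enabling NAPI twice is the condition the review warns about. */
static int model_open(struct stop_model *m)
{
	assert(m);
	if (m->users++ == 0) {
		if (m->napi_enabled)
			return -EBUSY; /* stands in for the BUG_ON */
		m->napi_enabled = true;
		m->dma_enabled = true;
	}
	return 0;
}
```

In this model a timed-out model_stop() still clears both flags, so a later
model_open() succeeds. Whether the real function should also propagate the
timeout error or return 0 is a separate choice this sketch does not settle.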

> +
> +		airoha_qdma_stop_napi(qdma);
>  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
>  				  GLOBAL_CFG_TX_DMA_EN_MASK |
>  				  GLOBAL_CFG_RX_DMA_EN_MASK);

[ ... ]




* Re: [net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
  2026-04-14  8:29 ` [net-next] " Paolo Abeni
@ 2026-04-14  8:57   ` Lorenzo Bianconi
  0 siblings, 0 replies; 3+ messages in thread
From: Lorenzo Bianconi @ 2026-04-14  8:57 UTC
  To: Paolo Abeni
  Cc: edumazet, kuba, davem, netdev, andrew+netdev, linux-arm-kernel,
	linux-mediatek


> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> 
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: airoha: Wait for TX to complete in airoha_dev_stop()
> 
> Wait for TX to complete in airoha_dev_stop routine before stopping the
> TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
> start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
> sure the TX NAPIs have completed before stopping the TX DMA engine in
> airoha_dev_stop routine.
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 8e4b043af4bc..9e40c8f375c1 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> 
> [ ... ]
> 
> > @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
> >  	struct airoha_qdma *qdma = port->qdma;
> >  	int i, err;
> >
> > -	netif_tx_disable(dev);
> >  	err = airoha_set_vip_for_gdm_port(port, false);
> >  	if (err)
> >  		return err;
> >
> > -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> > -		netdev_tx_reset_subqueue(dev, i);
> > -
> >  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> >  				    FE_PSE_PORT_DROP);
> >
> > +	netif_tx_disable(dev);
> >  	if (atomic_dec_and_test(&qdma->users)) {
> > +		u32 val;
> > +
> > +		/* Wait for TX to complete */
> > +		err = read_poll_timeout(airoha_qdma_rr, val,
> > +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> > +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> > +					false, qdma, REG_QDMA_GLOBAL_CFG);
> > +		if (err)
> > +			return err;
> >                   ^^^^^^^^^^
> 
> Can this leave the device in an inconsistent state? When read_poll_timeout()
> fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
> immediately. However, atomic_dec_and_test(&qdma->users) has already
> decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
> so all NAPIs remain enabled.
> 
> The networking core (__dev_close_many in net/core/dev.c) ignores the
> ndo_stop return value and marks the device as down regardless. When the
> device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
> return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
> which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
> because the NAPI was never disabled.
> 
> The early return also skips the netdev_tx_reset_subqueue() calls at the end
> of the function.
> 
> Should the code either continue with cleanup on timeout (don't return), or
> ensure NAPI and DMA state are properly cleaned up before returning?

ack, I will fix it in v2.

Regards,
Lorenzo

> 
> > +
> > +		airoha_qdma_stop_napi(qdma);
> >  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
> >  				  GLOBAL_CFG_TX_DMA_EN_MASK |
> >  				  GLOBAL_CFG_RX_DMA_EN_MASK);
> 
> [ ... ]
> 

