> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: airoha: Wait for TX to complete in airoha_dev_stop()
>
> Wait for TX to complete in airoha_dev_stop routine before stopping the
> TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
> start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
> sure the TX NAPIs have completed before stopping the TX DMA engine in
> airoha_dev_stop routine.
>
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 8e4b043af4bc..9e40c8f375c1 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
>
> [ ... ]
>
> > @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
> >  	struct airoha_qdma *qdma = port->qdma;
> >  	int i, err;
> >
> > -	netif_tx_disable(dev);
> >  	err = airoha_set_vip_for_gdm_port(port, false);
> >  	if (err)
> >  		return err;
> >
> > -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> > -		netdev_tx_reset_subqueue(dev, i);
> > -
> >  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> >  				    FE_PSE_PORT_DROP);
> >
> > +	netif_tx_disable(dev);
> >  	if (atomic_dec_and_test(&qdma->users)) {
> > +		u32 val;
> > +
> > +		/* Wait for TX to complete */
> > +		err = read_poll_timeout(airoha_qdma_rr, val,
> > +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> > +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> > +					false, qdma, REG_QDMA_GLOBAL_CFG);
> > +		if (err)
> > +			return err;
>
>                         ^^^^^^^^^^
>
> Can this leave the device in an inconsistent state? When read_poll_timeout()
> fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
> immediately. However, atomic_dec_and_test(&qdma->users) has already
> decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
> so all NAPIs remain enabled.
>
> The networking core (__dev_close_many in net/core/dev.c) ignores the
> ndo_stop return value and marks the device as down regardless. When the
> device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
> return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
> which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
> because the NAPI was never disabled.
>
> The early return also skips the netdev_tx_reset_subqueue() calls at the end
> of the function.
>
> Should the code either continue with cleanup on timeout (don't return), or
> ensure NAPI and DMA state are properly cleaned up before returning?

ack, I will fix it in v2.

Regards,
Lorenzo

>
> > +
> > +		airoha_qdma_stop_napi(qdma);
> >  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
> >  				  GLOBAL_CFG_TX_DMA_EN_MASK |
> >  				  GLOBAL_CFG_RX_DMA_EN_MASK);
>
> [ ... ]
>
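
For reference, the stop/open sequence described in the review can be
modelled in plain userspace C. This is only a rough sketch, not driver code
and not the planned v2 change: every model_* name below is hypothetical, the
struct fields merely stand in for qdma->users, the NAPI state and the DMA
enable bits, and -EBUSY stands in for the BUG_ON() that the kernel would hit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct model_qdma {
	int users;		/* stands in for qdma->users */
	bool napi_enabled;	/* stands in for the NAPI enabled state */
	bool dma_enabled;	/* stands in for the TX/RX DMA enable bits */
	bool tx_busy;		/* stands in for GLOBAL_CFG_TX_DMA_BUSY_MASK */
};

/* Models the read_poll_timeout() call: fails if TX DMA never goes idle. */
static int model_wait_tx_idle(const struct model_qdma *q)
{
	return q->tx_busy ? -ETIMEDOUT : 0;
}

/*
 * Models the open path: the first user enables NAPI and DMA.
 * Returns -EBUSY where the kernel would hit the BUG_ON() in napi_enable().
 */
static int model_open(struct model_qdma *q)
{
	if (q->users++ == 0) {
		if (q->napi_enabled)
			return -EBUSY;
		q->napi_enabled = true;
		q->dma_enabled = true;
	}
	return 0;
}

/* Early return on timeout, as in the quoted hunk. */
static int model_stop_early_return(struct model_qdma *q)
{
	if (--q->users == 0) {
		int err = model_wait_tx_idle(q);

		if (err)
			return err;	/* NAPI and DMA are left enabled */

		q->napi_enabled = false;
		q->dma_enabled = false;
	}
	return 0;
}

/* One possible alternative: note the timeout but still finish teardown. */
static int model_stop_continue(struct model_qdma *q)
{
	int err = 0;

	if (--q->users == 0) {
		err = model_wait_tx_idle(q);
		q->napi_enabled = false;
		q->dma_enabled = false;
	}
	return err;
}
```

In this model, model_stop_early_return() followed by model_open() reports
-EBUSY while TX stays busy, whereas model_stop_continue() leaves the state
consistent so the next model_open() succeeds. Whether the real driver should
propagate or just log the timeout is a separate question.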