On May 08, Jakub Kicinski wrote: > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > --- > net: airoha: Move qos_sq_bmap in airoha_qdma struct > > The change moves the HTB-offload channel-allocation bitmap from the > per-netdev airoha_gdm_port to the per-QDMA airoha_qdma, so sibling > netdevs attached to different QDMA blocks no longer clobber each other's > bitmap. > > > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c > > index 080705e2f58d..69a4c2e0d58b 100644 > > --- a/drivers/net/ethernet/airoha/airoha_eth.c > > +++ b/drivers/net/ethernet/airoha/airoha_eth.c > > @@ -2600,7 +2600,7 @@ static int airoha_tc_htb_alloc_leaf_queue(struct net_device *netdev, > > u32 rate = div_u64(opt->rate, 1000) << 3; /* kbps */ > > int err, num_tx_queues = netdev->real_num_tx_queues; > > struct airoha_gdm_dev *dev = netdev_priv(netdev); > > - struct airoha_gdm_port *port = dev->port; > > + struct airoha_qdma *qdma = dev->qdma; > > > > if (opt->parent_classid != TC_HTB_CLASSID_ROOT) { > > NL_SET_ERR_MSG_MOD(opt->extack, "invalid parent classid"); > > @@ -2627,7 +2627,7 @@ static int airoha_tc_htb_alloc_leaf_queue(struct net_device *netdev, > > return err; > > } > > > > - set_bit(channel, port->qos_sq_bmap); > > + set_bit(channel, qdma->qos_sq_bmap); > > opt->qid = AIROHA_NUM_TX_RING + channel; > > > > return 0; > > [High] > Now that qos_sq_bmap is per-QDMA, can two sibling netdevs that share the > same QDMA block collide on allocation? On EN7581, GDM2/GDM3/GDM4 all > select the same qdma pointer via: > > drivers/net/ethernet/airoha/airoha_eth.c: > /* QDMA0 is used for lan ports while QDMA1 is used for WAN ports */ > dev->qdma = ð->qdma[!airoha_is_lan_gdm_dev(dev)]; > > and airoha_is_lan_gdm_dev() returns true only for port->id == 1, so > every non-LAN GDM netdev resolves to eth->qdma[1] and shares > qdma->qos_sq_bmap. > > The allocation path does not test the bit before setting it: > > channel = TC_H_MIN(opt->classid) % AIROHA_NUM_QOS_CHANNELS; > ... > set_bit(channel, qdma->qos_sq_bmap); > opt->qid = AIROHA_NUM_TX_RING + channel; > > If two sibling netdevs on the same QDMA pick classids whose TC_H_MIN > maps to the same channel, both airoha_qdma_set_tx_rate_limit() calls > program the same HW TRTCM egress slot, the later caller silently > overwrites the earlier configuration, and both netdevs receive the same > opt->qid. Should this path use test_and_set_bit() (or similar owner > tracking) to reject a channel already owned by a sibling netdev? ack, I think we should move qos_sq_bmap in airoha_gdm_dev struct and track if a shared qdma qos channel is already in use. I will fix it in v5. > > > @@ -2811,11 +2811,11 @@ static int airoha_dev_setup_tc_block(struct net_device *dev, > > static void airoha_tc_remove_htb_queue(struct net_device *netdev, int queue) > > { > > struct airoha_gdm_dev *dev = netdev_priv(netdev); > > - struct airoha_gdm_port *port = dev->port; > > + struct airoha_qdma *qdma = dev->qdma; > > > > netif_set_real_num_tx_queues(netdev, netdev->real_num_tx_queues - 1); > > airoha_qdma_set_tx_rate_limit(netdev, queue + 1, 0, 0); > > - clear_bit(queue, port->qos_sq_bmap); > > + clear_bit(queue, qdma->qos_sq_bmap); > > } > > [High] > airoha_tc_remove_htb_queue() operates unconditionally on the caller's > netdev but clears a bit that may be owned by a sibling netdev on the > same QDMA. The effects that follow look like cross-netdev interference: > > - netif_set_real_num_tx_queues(netdev, real_num_tx_queues - 1) > decrements the caller's tx-queue count for a queue it never > allocated, and its return value is discarded (eventually the txq<1 > EINVAL guard in netif_set_real_num_tx_queues() will trip silently). > > - airoha_qdma_set_tx_rate_limit(netdev, queue + 1, 0, 0) zeros the > egress TRTCM rate-limit registers for a channel that the sibling > netdev still considers active. > > - clear_bit(queue, qdma->qos_sq_bmap) then hides the channel from the > legitimate owner's later test_bit() checks. This issue will be fixed by the change described above. I will fix it in v5. > > > @@ -2838,10 +2838,10 @@ static int airoha_tc_htb_delete_leaf_queue(struct net_device *netdev, > > static int airoha_tc_htb_destroy(struct net_device *netdev) > > { > > struct airoha_gdm_dev *dev = netdev_priv(netdev); > > - struct airoha_gdm_port *port = dev->port; > > + struct airoha_qdma *qdma = dev->qdma; > > int q; > > > > - for_each_set_bit(q, port->qos_sq_bmap, AIROHA_NUM_QOS_CHANNELS) > > + for_each_set_bit(q, qdma->qos_sq_bmap, AIROHA_NUM_QOS_CHANNELS) > > airoha_tc_remove_htb_queue(netdev, q); > > > > return 0; > > } > > [High] > Does airoha_tc_htb_destroy() now walk bits owned by sibling netdevs? > The iteration no longer filters by owning netdev: > > for_each_set_bit(q, qdma->qos_sq_bmap, AIROHA_NUM_QOS_CHANNELS) > airoha_tc_remove_htb_queue(netdev, q); > > Previously each netdev had a private bitmap, so destroy could only > touch its own channels. After this change, tearing down HTB on one > GDM netdev appears to remove channels and clear rate-limit HW state > belonging to every sibling GDM netdev on the same QDMA. A subsequent > tc-htb delete or get on the legitimate owner then hits: > > if (!test_bit(channel, qdma->qos_sq_bmap)) { > NL_SET_ERR_MSG_MOD(opt->extack, "invalid queue id"); > return -EINVAL; > } > > Should the bitmap track the owning netdev alongside the bit (for > example a per-channel owner pointer, or a per-netdev sub-bitmap inside > airoha_qdma), so that destroy/delete can only touch channels owned by > the caller? This issue will be fixed by the change described above. I will fix it in v5. Regards, Lorenzo > > > @@ -2852,9 +2852,9 @@ static int airoha_tc_get_htb_get_leaf_queue(struct net_device *netdev, > > { > > u32 channel = TC_H_MIN(opt->classid) % AIROHA_NUM_QOS_CHANNELS; > > struct airoha_gdm_dev *dev = netdev_priv(netdev); > > - struct airoha_gdm_port *port = dev->port; > > + struct airoha_qdma *qdma = dev->qdma; > > > > - if (!test_bit(channel, port->qos_sq_bmap)) { > > + if (!test_bit(channel, qdma->qos_sq_bmap)) { > > NL_SET_ERR_MSG_MOD(opt->extack, "invalid queue id"); > > return -EINVAL; > > }