* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics [not found] <20260430-dsa_lan9645x_switch_driver_base-v4-9-f1b6005fa8b7@microchip.com> @ 2026-05-06 1:46 ` Jakub Kicinski 2026-05-06 1:48 ` Jakub Kicinski 2026-05-12 8:47 ` Jens Emil Schulz Ostergaard 0 siblings, 2 replies; 4+ messages in thread From: Jakub Kicinski @ 2026-05-06 1:46 UTC (permalink / raw) To: jensemil.schulzostergaard Cc: Jakub Kicinski, UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms, robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund, daniel.machon, linux-kernel, netdev, devicetree This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. --- net: dsa: lan9645x: add port statistics This patch wires up ethtool -S, standard ethtool group stats, and ndo_get_stats64 for the lan9645x DSA driver. A delayed work polls the 32-bit hardware counters every 3 seconds and accumulates them into a 64-bit software cache to handle hardware wraparound. > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c > index 15a493c1f215..cea46bfe729e 100644 > --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c [ ... ] > @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds) > if (!lan9645x->owq) > return -ENOMEM; > > + err = lan9645x_stats_init(lan9645x); > + if (err) { > + dev_err(dev, "Failed to init stats.\n"); > + goto owq_destroy; > + } > + > ds->mtu_enforcement_ingress = true; > ds->assisted_learning_on_cpu_port = true; > ds->fdb_isolation = true; > @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds) > lan9645x->num_phys_ports - lan9645x->num_port_dis); > > return 0; > + > +owq_destroy: > + destroy_workqueue(lan9645x->owq); > + return err; > } Is this owq_destroy error path reachable today? Looking at lan9645x_stats_init(): int lan9645x_stats_init(struct lan9645x *lan9645x) { struct lan9645x_stats *stats = lan9645x->stats; queue_delayed_work(stats->queue, &stats->work, LAN9645X_STATS_CHECK_DELAY); return 0; } it unconditionally returns 0, so the goto owq_destroy branch is dead. If a future change either makes lan9645x_stats_init() return an error, or adds any fallible step after it in lan9645x_setup(), the cleanup would fall through owq_destroy without calling lan9645x_stats_deinit(), leaving the self-requeuing delayed work armed on stats->queue. Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown when ops->setup() itself fails, so lan9645x_teardown (which is where lan9645x_stats_deinit is called today) will not run in that scenario. Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(), the queue would be drained with a chained self-requeue still live. Would adding a dedicated label that calls lan9645x_stats_deinit() before falling through to owq_destroy be safer for future edits? [ ... ] > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c > new file mode 100644 > index 000000000000..46ac62b4016f > --- /dev/null > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c [ ... ] > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = { > + { "rx_cat_drop", SCNT_RX_CAT_DROP }, > + { "rx_red_prio_0", SCNT_RX_RED_PRIO_0 }, Per Documentation/networking/statistics.rst and the netdev stats hierarchy, counters that have a standard uAPI must not be duplicated in ethtool -S. The rx_cat_drop counter is summed into stats->rx_dropped in lan9645x_stats_get_stats64(): stats->rx_dropped = c[SCNT_RX_LONG] + c[SCNT_DR_LOCAL] + c[SCNT_DR_TAIL] + c[SCNT_RX_CAT_DROP] + ... Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show", does exposing rx_cat_drop here duplicate a value already covered by the standard uAPI? [ ... ] > + { "tx_drop", SCNT_TX_DROP }, Similarly, tx_drop feeds stats->tx_dropped: stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED]; Should tx_drop be dropped from the ethtool -S list for the same reason? [ ... ] > + { "tx_aged", SCNT_TX_AGED }, > + { "tx_bufdrop", SCNT_TX_BUFDROP }, tx_aged is also summed into stats->tx_dropped in lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI concern as tx_drop? A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit frames dropped due to buffer exhaustion, why is it not included in stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED? stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED]; If tx_bufdrop is in fact a transmit-drop cause, would users of ip -s link show see an under-reported tx_dropped relative to the private ethtool counter? > + { "dr_local", SCNT_DR_LOCAL }, > + { "dr_tail", SCNT_DR_TAIL }, dr_local and dr_tail are also fed into stats->rx_dropped: stats->rx_dropped = c[SCNT_RX_LONG] + c[SCNT_DR_LOCAL] + c[SCNT_DR_TAIL] + ... Would it be appropriate to drop these from the ethtool -S list as well, given they are already aggregated into the standard rx_dropped counter? [ ... ] > +}; [ ... ] > +static void lan9645x_check_stats_work(struct work_struct *work) > +{ > + struct delayed_work *del_work = to_delayed_work(work); > + struct lan9645x_stats *stats; > + > + stats = container_of(del_work, struct lan9645x_stats, work); > + > + lan9645x_stats_update(stats->lan9645x); > + > + queue_delayed_work(stats->queue, &stats->work, > + LAN9645X_STATS_CHECK_DELAY); > +} The commit message says this worker "update[s] it frequently to handle overflows in hardware." Looking at lan9645x_stats_add_cnt(): static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val) { if (val < (*cnt & U32_MAX)) *cnt += (u64)1 << 32; /* value has wrapped */ *cnt = (*cnt & ~(u64)U32_MAX) + val; } this compensates for at most one 32-bit wrap per polling interval. If two or more wraps happen between reads, the low-half comparison no longer detects the missed wrap and the 64-bit counter silently loses 2^32 per missed wrap. At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h), a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ, but there is no upper bound on the actual elapsed time between reads: delayed work can be stretched by system suspend/resume, heavy CPU load, or regmap/SPI contention. Would it be worthwhile to either bound the worst-case polling gap (e.g. a suspend/resume hook that forces a read, or a timestamp-based sanity check that detects a stretched interval and logs a warning) so that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot silently undercount by multiples of 2^32? [ ... ] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics 2026-05-06 1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski @ 2026-05-06 1:48 ` Jakub Kicinski 2026-05-12 8:47 ` Jens Emil Schulz Ostergaard 1 sibling, 0 replies; 4+ messages in thread From: Jakub Kicinski @ 2026-05-06 1:48 UTC (permalink / raw) To: jensemil.schulzostergaard Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms, robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund, daniel.machon, linux-kernel, netdev, devicetree On Tue, 5 May 2026 18:46:18 -0700 Jakub Kicinski wrote: > > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = { > > + { "rx_cat_drop", SCNT_RX_CAT_DROP }, > > + { "rx_red_prio_0", SCNT_RX_RED_PRIO_0 }, > > Per Documentation/networking/statistics.rst and the netdev stats hierarchy, > counters that have a standard uAPI must not be duplicated in ethtool -S. > > The rx_cat_drop counter is summed into stats->rx_dropped in > lan9645x_stats_get_stats64(): > > stats->rx_dropped = c[SCNT_RX_LONG] + > c[SCNT_DR_LOCAL] + > c[SCNT_DR_TAIL] + > c[SCNT_RX_CAT_DROP] + > ... > > Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show", > does exposing rx_cat_drop here duplicate a value already covered by the > standard uAPI? > > [ ... ] > > > + { "tx_drop", SCNT_TX_DROP }, > > Similarly, tx_drop feeds stats->tx_dropped: > > stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED]; > > Should tx_drop be dropped from the ethtool -S list for the same reason? Ignore this part, I need to update the AI prompts.. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics 2026-05-06 1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski 2026-05-06 1:48 ` Jakub Kicinski @ 2026-05-12 8:47 ` Jens Emil Schulz Ostergaard 2026-05-12 23:39 ` Jakub Kicinski 1 sibling, 1 reply; 4+ messages in thread From: Jens Emil Schulz Ostergaard @ 2026-05-12 8:47 UTC (permalink / raw) To: Jakub Kicinski Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms, robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund, daniel.machon, linux-kernel, netdev, devicetree On Tue, 2026-05-05 at 18:46 -0700, Jakub Kicinski wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > --- > net: dsa: lan9645x: add port statistics > > This patch wires up ethtool -S, standard ethtool group stats, > and ndo_get_stats64 for the lan9645x DSA driver. A delayed work > polls the 32-bit hardware counters every 3 seconds and accumulates > them into a 64-bit software cache to handle hardware wraparound. > > > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c > > index 15a493c1f215..cea46bfe729e 100644 > > --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c > > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c > > [ ... ] > > > @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds) > > if (!lan9645x->owq) > > return -ENOMEM; > > > > + err = lan9645x_stats_init(lan9645x); > > + if (err) { > > + dev_err(dev, "Failed to init stats.\n"); > > + goto owq_destroy; > > + } > > + > > ds->mtu_enforcement_ingress = true; > > ds->assisted_learning_on_cpu_port = true; > > ds->fdb_isolation = true; > > @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds) > > lan9645x->num_phys_ports - lan9645x->num_port_dis); > > > > return 0; > > + > > +owq_destroy: > > + destroy_workqueue(lan9645x->owq); > > + return err; > > } > > Is this owq_destroy error path reachable today? > > Looking at lan9645x_stats_init(): > > int lan9645x_stats_init(struct lan9645x *lan9645x) > { > struct lan9645x_stats *stats = lan9645x->stats; > > queue_delayed_work(stats->queue, &stats->work, > LAN9645X_STATS_CHECK_DELAY); > > return 0; > } > > it unconditionally returns 0, so the goto owq_destroy branch is dead. > > If a future change either makes lan9645x_stats_init() return an error, or > adds any fallible step after it in lan9645x_setup(), the cleanup would > fall through owq_destroy without calling lan9645x_stats_deinit(), leaving > the self-requeuing delayed work armed on stats->queue. > > Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown > when ops->setup() itself fails, so lan9645x_teardown (which is where > lan9645x_stats_deinit is called today) will not run in that scenario. > > Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(), > the queue would be drained with a chained self-requeue still live. Would > adding a dedicated label that calls lan9645x_stats_deinit() before falling > through to owq_destroy be safer for future edits? > I will make lan9645x_stats_init return void. There is nothing after lan9645x_stats_init which can fail, so at this point a label for lan9645x_stats_deinit would be dead also? > [ ... ] > > > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c > > new file mode 100644 > > index 000000000000..46ac62b4016f > > --- /dev/null > > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c > > [ ... ] > > > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = { > > + { "rx_cat_drop", SCNT_RX_CAT_DROP }, > > + { "rx_red_prio_0", SCNT_RX_RED_PRIO_0 }, > > Per Documentation/networking/statistics.rst and the netdev stats hierarchy, > counters that have a standard uAPI must not be duplicated in ethtool -S. > > The rx_cat_drop counter is summed into stats->rx_dropped in > lan9645x_stats_get_stats64(): > > stats->rx_dropped = c[SCNT_RX_LONG] + > c[SCNT_DR_LOCAL] + > c[SCNT_DR_TAIL] + > c[SCNT_RX_CAT_DROP] + > ... > > Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show", > does exposing rx_cat_drop here duplicate a value already covered by the > standard uAPI? > > [ ... ] > > > + { "tx_drop", SCNT_TX_DROP }, > > Similarly, tx_drop feeds stats->tx_dropped: > > stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED]; > > Should tx_drop be dropped from the ethtool -S list for the same reason? > > [ ... ] > > > + { "tx_aged", SCNT_TX_AGED }, > > + { "tx_bufdrop", SCNT_TX_BUFDROP }, > > tx_aged is also summed into stats->tx_dropped in > lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI > concern as tx_drop? > > A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit > frames dropped due to buffer exhaustion, why is it not included in > stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED? > > stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED]; > > If tx_bufdrop is in fact a transmit-drop cause, would users of > ip -s link show see an under-reported tx_dropped relative to the private > ethtool counter? Yes, I think this is a bug, and bufdrop should be added here. I will make sure there no overlap, and otherwise add it. > > > + { "dr_local", SCNT_DR_LOCAL }, > > + { "dr_tail", SCNT_DR_TAIL }, > > dr_local and dr_tail are also fed into stats->rx_dropped: > > stats->rx_dropped = c[SCNT_RX_LONG] + > c[SCNT_DR_LOCAL] + > c[SCNT_DR_TAIL] + > ... > > Would it be appropriate to drop these from the ethtool -S list as well, > given they are already aggregated into the standard rx_dropped counter? > > [ ... ] > > > +}; > > [ ... ] > > > +static void lan9645x_check_stats_work(struct work_struct *work) > > +{ > > + struct delayed_work *del_work = to_delayed_work(work); > > + struct lan9645x_stats *stats; > > + > > + stats = container_of(del_work, struct lan9645x_stats, work); > > + > > + lan9645x_stats_update(stats->lan9645x); > > + > > + queue_delayed_work(stats->queue, &stats->work, > > + LAN9645X_STATS_CHECK_DELAY); > > +} > > The commit message says this worker "update[s] it frequently to handle > overflows in hardware." > > Looking at lan9645x_stats_add_cnt(): > > static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val) > { > if (val < (*cnt & U32_MAX)) > *cnt += (u64)1 << 32; /* value has wrapped */ > > *cnt = (*cnt & ~(u64)U32_MAX) + val; > } > > this compensates for at most one 32-bit wrap per polling interval. If two > or more wraps happen between reads, the low-half comparison no longer > detects the missed wrap and the 64-bit counter silently loses 2^32 per > missed wrap. > > At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h), > a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in > about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ, > but there is no upper bound on the actual elapsed time between reads: > delayed work can be stretched by system suspend/resume, heavy CPU load, or > regmap/SPI contention. > > Would it be worthwhile to either bound the worst-case polling gap > (e.g. a suspend/resume hook that forces a read, or a timestamp-based > sanity check that detects a stretched interval and logs a warning) so > that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot > silently undercount by multiples of 2^32? > It is true, and the timing is chosen so at most 1 wrap is supposed to occur. I believe this is the exact pattern already used by ocelot, lan966x, sparx5 and lan969x. Maybe I misunderstand the suspend/resume comment. We do not implement support to suspend/resume, but if we did, then I assume you can suspend for an arbitrary amount of time, so doing an additional read on suspend/resume will not solve this problem? The hw counters are 32bit. To stay synced they must never wrap more than once. I think the only way avoid this problem is if we can make absolute guarantees about how often the polling code runs? > [ ... ] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics 2026-05-12 8:47 ` Jens Emil Schulz Ostergaard @ 2026-05-12 23:39 ` Jakub Kicinski 0 siblings, 0 replies; 4+ messages in thread From: Jakub Kicinski @ 2026-05-12 23:39 UTC (permalink / raw) To: Jens Emil Schulz Ostergaard Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms, robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund, daniel.machon, linux-kernel, netdev, devicetree On Tue, 12 May 2026 10:47:22 +0200 Jens Emil Schulz Ostergaard wrote: > > The commit message says this worker "update[s] it frequently to handle > > overflows in hardware." > > > > Looking at lan9645x_stats_add_cnt(): > > > > static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val) > > { > > if (val < (*cnt & U32_MAX)) > > *cnt += (u64)1 << 32; /* value has wrapped */ > > > > *cnt = (*cnt & ~(u64)U32_MAX) + val; > > } > > > > this compensates for at most one 32-bit wrap per polling interval. If two > > or more wraps happen between reads, the low-half comparison no longer > > detects the missed wrap and the 64-bit counter silently loses 2^32 per > > missed wrap. > > > > At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h), > > a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in > > about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ, > > but there is no upper bound on the actual elapsed time between reads: > > delayed work can be stretched by system suspend/resume, heavy CPU load, or > > regmap/SPI contention. > > > > Would it be worthwhile to either bound the worst-case polling gap > > (e.g. a suspend/resume hook that forces a read, or a timestamp-based > > sanity check that detects a stretched interval and logs a warning) so > > that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot > > silently undercount by multiples of 2^32? > > > > It is true, and the timing is chosen so at most 1 wrap is supposed to occur. > I believe this is the exact pattern already used by ocelot, lan966x, sparx5 > and lan969x. > > Maybe I misunderstand the suspend/resume comment. We do not implement > support to suspend/resume, but if we did, then I assume you can suspend > for an arbitrary amount of time, so doing an additional read on > suspend/resume will not solve this problem? > > The hw counters are 32bit. To stay synced they must never wrap more than > once. I think the only way avoid this problem is if we can make absolute > guarantees about how often the polling code runs? I think AI is probably asking for too much here. You could stash jiffies on each work run, and detect potential overflow, but all you can do is print a warning. During suspend there should be no traffic, so that's bogus. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-12 23:39 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260430-dsa_lan9645x_switch_driver_base-v4-9-f1b6005fa8b7@microchip.com>
2026-05-06 1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski
2026-05-06 1:48 ` Jakub Kicinski
2026-05-12 8:47 ` Jens Emil Schulz Ostergaard
2026-05-12 23:39 ` Jakub Kicinski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox