Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics

The Linux Kernel Mailing List
 help / color / mirror / Atom feed

* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
       [not found] <20260430-dsa_lan9645x_switch_driver_base-v4-9-f1b6005fa8b7@microchip.com>
@ 2026-05-06  1:46 ` Jakub Kicinski
  2026-05-06  1:48   ` Jakub Kicinski
  2026-05-12  8:47   ` Jens Emil Schulz Ostergaard
  0 siblings, 2 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-06  1:46 UTC (permalink / raw)
  To: jensemil.schulzostergaard
  Cc: Jakub Kicinski, UNGLinuxDriver, andrew, olteanv, davem, edumazet,
	pabeni, horms, robh, krzk+dt, conor+dt, woojung.huh, linux,
	Steen.Hegelund, daniel.machon, linux-kernel, netdev, devicetree

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: lan9645x: add port statistics

This patch wires up ethtool -S, standard ethtool group stats,
and ndo_get_stats64 for the lan9645x DSA driver. A delayed work
polls the 32-bit hardware counters every 3 seconds and accumulates
them into a 64-bit software cache to handle hardware wraparound.

> diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> index 15a493c1f215..cea46bfe729e 100644
> --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c

[ ... ]

> @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds)
>  	if (!lan9645x->owq)
>  		return -ENOMEM;
>  
> +	err = lan9645x_stats_init(lan9645x);
> +	if (err) {
> +		dev_err(dev, "Failed to init stats.\n");
> +		goto owq_destroy;
> +	}
> +
>  	ds->mtu_enforcement_ingress = true;
>  	ds->assisted_learning_on_cpu_port = true;
>  	ds->fdb_isolation = true;
> @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds)
>  		 lan9645x->num_phys_ports - lan9645x->num_port_dis);
>  
>  	return 0;
> +
> +owq_destroy:
> +	destroy_workqueue(lan9645x->owq);
> +	return err;
>  }

Is this owq_destroy error path reachable today?

Looking at lan9645x_stats_init():

    int lan9645x_stats_init(struct lan9645x *lan9645x)
    {
        struct lan9645x_stats *stats = lan9645x->stats;

        queue_delayed_work(stats->queue, &stats->work,
                           LAN9645X_STATS_CHECK_DELAY);

        return 0;
    }

it unconditionally returns 0, so the goto owq_destroy branch is dead.

If a future change either makes lan9645x_stats_init() return an error, or
adds any fallible step after it in lan9645x_setup(), the cleanup would
fall through owq_destroy without calling lan9645x_stats_deinit(), leaving
the self-requeuing delayed work armed on stats->queue.

Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown
when ops->setup() itself fails, so lan9645x_teardown (which is where
lan9645x_stats_deinit is called today) will not run in that scenario.

Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(),
the queue would be drained with a chained self-requeue still live. Would
adding a dedicated label that calls lan9645x_stats_deinit() before falling
through to owq_destroy be safer for future edits?

[ ... ]

> diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> new file mode 100644
> index 000000000000..46ac62b4016f
> --- /dev/null
> +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c

[ ... ]

> +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> +	{ "rx_cat_drop",        SCNT_RX_CAT_DROP },
> +	{ "rx_red_prio_0",      SCNT_RX_RED_PRIO_0 },

Per Documentation/networking/statistics.rst and the netdev stats hierarchy,
counters that have a standard uAPI must not be duplicated in ethtool -S.

The rx_cat_drop counter is summed into stats->rx_dropped in
lan9645x_stats_get_stats64():

    stats->rx_dropped = c[SCNT_RX_LONG] +
                        c[SCNT_DR_LOCAL] +
                        c[SCNT_DR_TAIL] +
                        c[SCNT_RX_CAT_DROP] +
                        ...

Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show",
does exposing rx_cat_drop here duplicate a value already covered by the
standard uAPI?

[ ... ]

> +	{ "tx_drop",            SCNT_TX_DROP },

Similarly, tx_drop feeds stats->tx_dropped:

    stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];

Should tx_drop be dropped from the ethtool -S list for the same reason?

[ ... ]

> +	{ "tx_aged",            SCNT_TX_AGED },
> +	{ "tx_bufdrop",         SCNT_TX_BUFDROP },

tx_aged is also summed into stats->tx_dropped in
lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI
concern as tx_drop?

A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit
frames dropped due to buffer exhaustion, why is it not included in
stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED?

    stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];

If tx_bufdrop is in fact a transmit-drop cause, would users of
ip -s link show see an under-reported tx_dropped relative to the private
ethtool counter?

> +	{ "dr_local",           SCNT_DR_LOCAL },
> +	{ "dr_tail",            SCNT_DR_TAIL },

dr_local and dr_tail are also fed into stats->rx_dropped:

    stats->rx_dropped = c[SCNT_RX_LONG] +
                        c[SCNT_DR_LOCAL] +
                        c[SCNT_DR_TAIL] +
                        ...

Would it be appropriate to drop these from the ethtool -S list as well,
given they are already aggregated into the standard rx_dropped counter?

[ ... ]

> +};

[ ... ]

> +static void lan9645x_check_stats_work(struct work_struct *work)
> +{
> +	struct delayed_work *del_work = to_delayed_work(work);
> +	struct lan9645x_stats *stats;
> +
> +	stats = container_of(del_work, struct lan9645x_stats, work);
> +
> +	lan9645x_stats_update(stats->lan9645x);
> +
> +	queue_delayed_work(stats->queue, &stats->work,
> +			   LAN9645X_STATS_CHECK_DELAY);
> +}

The commit message says this worker "update[s] it frequently to handle
overflows in hardware."

Looking at lan9645x_stats_add_cnt():

    static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
    {
        if (val < (*cnt & U32_MAX))
            *cnt += (u64)1 << 32; /* value has wrapped */

        *cnt = (*cnt & ~(u64)U32_MAX) + val;
    }

this compensates for at most one 32-bit wrap per polling interval. If two
or more wraps happen between reads, the low-half comparison no longer
detects the missed wrap and the 64-bit counter silently loses 2^32 per
missed wrap.

At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
but there is no upper bound on the actual elapsed time between reads:
delayed work can be stretched by system suspend/resume, heavy CPU load, or
regmap/SPI contention.

Would it be worthwhile to either bound the worst-case polling gap
(e.g. a suspend/resume hook that forces a read, or a timestamp-based
sanity check that detects a stretched interval and logs a warning) so
that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
silently undercount by multiples of 2^32?

[ ... ]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
  2026-05-06  1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski
@ 2026-05-06  1:48   ` Jakub Kicinski
  2026-05-12  8:47   ` Jens Emil Schulz Ostergaard
  1 sibling, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-06  1:48 UTC (permalink / raw)
  To: jensemil.schulzostergaard
  Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms,
	robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund,
	daniel.machon, linux-kernel, netdev, devicetree

On Tue,  5 May 2026 18:46:18 -0700 Jakub Kicinski wrote:
> > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> > +	{ "rx_cat_drop",        SCNT_RX_CAT_DROP },
> > +	{ "rx_red_prio_0",      SCNT_RX_RED_PRIO_0 },  
> 
> Per Documentation/networking/statistics.rst and the netdev stats hierarchy,
> counters that have a standard uAPI must not be duplicated in ethtool -S.
> 
> The rx_cat_drop counter is summed into stats->rx_dropped in
> lan9645x_stats_get_stats64():
> 
>     stats->rx_dropped = c[SCNT_RX_LONG] +
>                         c[SCNT_DR_LOCAL] +
>                         c[SCNT_DR_TAIL] +
>                         c[SCNT_RX_CAT_DROP] +
>                         ...
> 
> Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show",
> does exposing rx_cat_drop here duplicate a value already covered by the
> standard uAPI?
> 
> [ ... ]
> 
> > +	{ "tx_drop",            SCNT_TX_DROP },  
> 
> Similarly, tx_drop feeds stats->tx_dropped:
> 
>     stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];
> 
> Should tx_drop be dropped from the ethtool -S list for the same reason?

Ignore this part, I need to update the AI prompts..

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
  2026-05-06  1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski
  2026-05-06  1:48   ` Jakub Kicinski
@ 2026-05-12  8:47   ` Jens Emil Schulz Ostergaard
  2026-05-12 23:39     ` Jakub Kicinski
  1 sibling, 1 reply; 4+ messages in thread
From: Jens Emil Schulz Ostergaard @ 2026-05-12  8:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms,
	robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund,
	daniel.machon, linux-kernel, netdev, devicetree

On Tue, 2026-05-05 at 18:46 -0700, Jakub Kicinski wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: dsa: lan9645x: add port statistics
> 
> This patch wires up ethtool -S, standard ethtool group stats,
> and ndo_get_stats64 for the lan9645x DSA driver. A delayed work
> polls the 32-bit hardware counters every 3 seconds and accumulates
> them into a 64-bit software cache to handle hardware wraparound.
> 
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > index 15a493c1f215..cea46bfe729e 100644
> > --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> 
> [ ... ]
> 
> > @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds)
> >       if (!lan9645x->owq)
> >               return -ENOMEM;
> > 
> > +     err = lan9645x_stats_init(lan9645x);
> > +     if (err) {
> > +             dev_err(dev, "Failed to init stats.\n");
> > +             goto owq_destroy;
> > +     }
> > +
> >       ds->mtu_enforcement_ingress = true;
> >       ds->assisted_learning_on_cpu_port = true;
> >       ds->fdb_isolation = true;
> > @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds)
> >                lan9645x->num_phys_ports - lan9645x->num_port_dis);
> > 
> >       return 0;
> > +
> > +owq_destroy:
> > +     destroy_workqueue(lan9645x->owq);
> > +     return err;
> >  }
> 
> Is this owq_destroy error path reachable today?
> 
> Looking at lan9645x_stats_init():
> 
>     int lan9645x_stats_init(struct lan9645x *lan9645x)
>     {
>         struct lan9645x_stats *stats = lan9645x->stats;
> 
>         queue_delayed_work(stats->queue, &stats->work,
>                            LAN9645X_STATS_CHECK_DELAY);
> 
>         return 0;
>     }
> 
> it unconditionally returns 0, so the goto owq_destroy branch is dead.
> 
> If a future change either makes lan9645x_stats_init() return an error, or
> adds any fallible step after it in lan9645x_setup(), the cleanup would
> fall through owq_destroy without calling lan9645x_stats_deinit(), leaving
> the self-requeuing delayed work armed on stats->queue.
> 
> Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown
> when ops->setup() itself fails, so lan9645x_teardown (which is where
> lan9645x_stats_deinit is called today) will not run in that scenario.
> 
> Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(),
> the queue would be drained with a chained self-requeue still live. Would
> adding a dedicated label that calls lan9645x_stats_deinit() before falling
> through to owq_destroy be safer for future edits?
> 


I will make lan9645x_stats_init return void. There is nothing after
lan9645x_stats_init which can fail, so at this point a label for 
lan9645x_stats_deinit would be dead also?

> [ ... ]
> 
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> > new file mode 100644
> > index 000000000000..46ac62b4016f
> > --- /dev/null
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> 
> [ ... ]
> 
> > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> > +     { "rx_cat_drop",        SCNT_RX_CAT_DROP },
> > +     { "rx_red_prio_0",      SCNT_RX_RED_PRIO_0 },
> 
> Per Documentation/networking/statistics.rst and the netdev stats hierarchy,
> counters that have a standard uAPI must not be duplicated in ethtool -S.
> 
> The rx_cat_drop counter is summed into stats->rx_dropped in
> lan9645x_stats_get_stats64():
> 
>     stats->rx_dropped = c[SCNT_RX_LONG] +
>                         c[SCNT_DR_LOCAL] +
>                         c[SCNT_DR_TAIL] +
>                         c[SCNT_RX_CAT_DROP] +
>                         ...
> 
> Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show",
> does exposing rx_cat_drop here duplicate a value already covered by the
> standard uAPI?
> 
> [ ... ]
> 
> > +     { "tx_drop",            SCNT_TX_DROP },
> 
> Similarly, tx_drop feeds stats->tx_dropped:
> 
>     stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];
> 
> Should tx_drop be dropped from the ethtool -S list for the same reason?
> 
> [ ... ]
> 
> > +     { "tx_aged",            SCNT_TX_AGED },
> > +     { "tx_bufdrop",         SCNT_TX_BUFDROP },
> 
> tx_aged is also summed into stats->tx_dropped in
> lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI
> concern as tx_drop?
> 
> A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit
> frames dropped due to buffer exhaustion, why is it not included in
> stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED?
> 
>     stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];
> 
> If tx_bufdrop is in fact a transmit-drop cause, would users of
> ip -s link show see an under-reported tx_dropped relative to the private
> ethtool counter?


Yes, I think this is a bug, and bufdrop should be added here. I will make sure
there no overlap, and otherwise add it.

> 
> > +     { "dr_local",           SCNT_DR_LOCAL },
> > +     { "dr_tail",            SCNT_DR_TAIL },
> 
> dr_local and dr_tail are also fed into stats->rx_dropped:
> 
>     stats->rx_dropped = c[SCNT_RX_LONG] +
>                         c[SCNT_DR_LOCAL] +
>                         c[SCNT_DR_TAIL] +
>                         ...
> 
> Would it be appropriate to drop these from the ethtool -S list as well,
> given they are already aggregated into the standard rx_dropped counter?
> 
> [ ... ]
> 
> > +};
> 
> [ ... ]
> 
> > +static void lan9645x_check_stats_work(struct work_struct *work)
> > +{
> > +     struct delayed_work *del_work = to_delayed_work(work);
> > +     struct lan9645x_stats *stats;
> > +
> > +     stats = container_of(del_work, struct lan9645x_stats, work);
> > +
> > +     lan9645x_stats_update(stats->lan9645x);
> > +
> > +     queue_delayed_work(stats->queue, &stats->work,
> > +                        LAN9645X_STATS_CHECK_DELAY);
> > +}
> 
> The commit message says this worker "update[s] it frequently to handle
> overflows in hardware."
> 
> Looking at lan9645x_stats_add_cnt():
> 
>     static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
>     {
>         if (val < (*cnt & U32_MAX))
>             *cnt += (u64)1 << 32; /* value has wrapped */
> 
>         *cnt = (*cnt & ~(u64)U32_MAX) + val;
>     }
> 
> this compensates for at most one 32-bit wrap per polling interval. If two
> or more wraps happen between reads, the low-half comparison no longer
> detects the missed wrap and the 64-bit counter silently loses 2^32 per
> missed wrap.
> 
> At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
> a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
> about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
> but there is no upper bound on the actual elapsed time between reads:
> delayed work can be stretched by system suspend/resume, heavy CPU load, or
> regmap/SPI contention.
> 
> Would it be worthwhile to either bound the worst-case polling gap
> (e.g. a suspend/resume hook that forces a read, or a timestamp-based
> sanity check that detects a stretched interval and logs a warning) so
> that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
> silently undercount by multiples of 2^32?
> 

It is true, and the timing is chosen so at most 1 wrap is supposed to occur.
I believe this is the exact pattern already used by ocelot, lan966x, sparx5
and lan969x.

Maybe I misunderstand the suspend/resume comment. We do not implement
support to suspend/resume, but if we did, then I assume you can suspend
for an arbitrary amount of time, so doing an additional read on 
suspend/resume will not solve this problem?

The hw counters are 32bit. To stay synced they must never wrap more than
once. I think the only way avoid this problem is if we can make absolute
guarantees about how often the polling code runs?

> [ ... ]


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
  2026-05-12  8:47   ` Jens Emil Schulz Ostergaard
@ 2026-05-12 23:39     ` Jakub Kicinski
  0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-12 23:39 UTC (permalink / raw)
  To: Jens Emil Schulz Ostergaard
  Cc: UNGLinuxDriver, andrew, olteanv, davem, edumazet, pabeni, horms,
	robh, krzk+dt, conor+dt, woojung.huh, linux, Steen.Hegelund,
	daniel.machon, linux-kernel, netdev, devicetree

On Tue, 12 May 2026 10:47:22 +0200 Jens Emil Schulz Ostergaard wrote:
> > The commit message says this worker "update[s] it frequently to handle
> > overflows in hardware."
> > 
> > Looking at lan9645x_stats_add_cnt():
> > 
> >     static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
> >     {
> >         if (val < (*cnt & U32_MAX))
> >             *cnt += (u64)1 << 32; /* value has wrapped */
> > 
> >         *cnt = (*cnt & ~(u64)U32_MAX) + val;
> >     }
> > 
> > this compensates for at most one 32-bit wrap per polling interval. If two
> > or more wraps happen between reads, the low-half comparison no longer
> > detects the missed wrap and the 64-bit counter silently loses 2^32 per
> > missed wrap.
> > 
> > At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
> > a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
> > about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
> > but there is no upper bound on the actual elapsed time between reads:
> > delayed work can be stretched by system suspend/resume, heavy CPU load, or
> > regmap/SPI contention.
> > 
> > Would it be worthwhile to either bound the worst-case polling gap
> > (e.g. a suspend/resume hook that forces a read, or a timestamp-based
> > sanity check that detects a stretched interval and logs a warning) so
> > that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
> > silently undercount by multiples of 2^32?
> >   
> 
> It is true, and the timing is chosen so at most 1 wrap is supposed to occur.
> I believe this is the exact pattern already used by ocelot, lan966x, sparx5
> and lan969x.
> 
> Maybe I misunderstand the suspend/resume comment. We do not implement
> support to suspend/resume, but if we did, then I assume you can suspend
> for an arbitrary amount of time, so doing an additional read on 
> suspend/resume will not solve this problem?
> 
> The hw counters are 32bit. To stay synced they must never wrap more than
> once. I think the only way avoid this problem is if we can make absolute
> guarantees about how often the polling code runs?

I think AI is probably asking for too much here. You could stash jiffies
on each work run, and detect potential overflow, but all you can do is
print a warning. During suspend there should be no traffic, so that's
bogus.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-12 23:39 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20260430-dsa_lan9645x_switch_driver_base-v4-9-f1b6005fa8b7@microchip.com>
2026-05-06  1:46 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jakub Kicinski
2026-05-06  1:48   ` Jakub Kicinski
2026-05-12  8:47   ` Jens Emil Schulz Ostergaard
2026-05-12 23:39     ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox