From: Jens Emil Schulz Ostergaard <jensemil.schulzostergaard@microchip.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: <UNGLinuxDriver@microchip.com>, <andrew@lunn.ch>,
<olteanv@gmail.com>, <davem@davemloft.net>, <edumazet@google.com>,
<pabeni@redhat.com>, <horms@kernel.org>, <robh@kernel.org>,
<krzk+dt@kernel.org>, <conor+dt@kernel.org>,
<woojung.huh@microchip.com>, <linux@armlinux.org.uk>,
<Steen.Hegelund@microchip.com>, <daniel.machon@microchip.com>,
<linux-kernel@vger.kernel.org>, <netdev@vger.kernel.org>,
<devicetree@vger.kernel.org>
Subject: Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
Date: Tue, 12 May 2026 10:47:22 +0200 [thread overview]
Message-ID: <c48f64d22429c1d4e10015033da4fff16c2a6d7d.camel@microchip.com> (raw)
In-Reply-To: <20260506014618.1616861-1-kuba@kernel.org>
On Tue, 2026-05-05 at 18:46 -0700, Jakub Kicinski wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: dsa: lan9645x: add port statistics
>
> This patch wires up ethtool -S, standard ethtool group stats,
> and ndo_get_stats64 for the lan9645x DSA driver. A delayed work
> polls the 32-bit hardware counters every 3 seconds and accumulates
> them into a 64-bit software cache to handle hardware wraparound.
>
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > index 15a493c1f215..cea46bfe729e 100644
> > --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
>
> [ ... ]
>
> > @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds)
> > if (!lan9645x->owq)
> > return -ENOMEM;
> >
> > + err = lan9645x_stats_init(lan9645x);
> > + if (err) {
> > + dev_err(dev, "Failed to init stats.\n");
> > + goto owq_destroy;
> > + }
> > +
> > ds->mtu_enforcement_ingress = true;
> > ds->assisted_learning_on_cpu_port = true;
> > ds->fdb_isolation = true;
> > @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds)
> > lan9645x->num_phys_ports - lan9645x->num_port_dis);
> >
> > return 0;
> > +
> > +owq_destroy:
> > + destroy_workqueue(lan9645x->owq);
> > + return err;
> > }
>
> Is this owq_destroy error path reachable today?
>
> Looking at lan9645x_stats_init():
>
> int lan9645x_stats_init(struct lan9645x *lan9645x)
> {
> struct lan9645x_stats *stats = lan9645x->stats;
>
> queue_delayed_work(stats->queue, &stats->work,
> LAN9645X_STATS_CHECK_DELAY);
>
> return 0;
> }
>
> it unconditionally returns 0, so the goto owq_destroy branch is dead.
>
> If a future change either makes lan9645x_stats_init() return an error, or
> adds any fallible step after it in lan9645x_setup(), the cleanup would
> fall through owq_destroy without calling lan9645x_stats_deinit(), leaving
> the self-requeuing delayed work armed on stats->queue.
>
> Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown
> when ops->setup() itself fails, so lan9645x_teardown (which is where
> lan9645x_stats_deinit is called today) will not run in that scenario.
>
> Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(),
> the queue would be drained with a chained self-requeue still live. Would
> adding a dedicated label that calls lan9645x_stats_deinit() before falling
> through to owq_destroy be safer for future edits?
>
I will make lan9645x_stats_init return void. There is nothing after
lan9645x_stats_init which can fail, so at this point a label for
lan9645x_stats_deinit would be dead also?
> [ ... ]
>
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> > new file mode 100644
> > index 000000000000..46ac62b4016f
> > --- /dev/null
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
>
> [ ... ]
>
> > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> > + { "rx_cat_drop", SCNT_RX_CAT_DROP },
> > + { "rx_red_prio_0", SCNT_RX_RED_PRIO_0 },
>
> Per Documentation/networking/statistics.rst and the netdev stats hierarchy,
> counters that have a standard uAPI must not be duplicated in ethtool -S.
>
> The rx_cat_drop counter is summed into stats->rx_dropped in
> lan9645x_stats_get_stats64():
>
> stats->rx_dropped = c[SCNT_RX_LONG] +
> c[SCNT_DR_LOCAL] +
> c[SCNT_DR_TAIL] +
> c[SCNT_RX_CAT_DROP] +
> ...
>
> Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show",
> does exposing rx_cat_drop here duplicate a value already covered by the
> standard uAPI?
>
> [ ... ]
>
> > + { "tx_drop", SCNT_TX_DROP },
>
> Similarly, tx_drop feeds stats->tx_dropped:
>
> stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];
>
> Should tx_drop be dropped from the ethtool -S list for the same reason?
>
> [ ... ]
>
> > + { "tx_aged", SCNT_TX_AGED },
> > + { "tx_bufdrop", SCNT_TX_BUFDROP },
>
> tx_aged is also summed into stats->tx_dropped in
> lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI
> concern as tx_drop?
>
> A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit
> frames dropped due to buffer exhaustion, why is it not included in
> stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED?
>
> stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];
>
> If tx_bufdrop is in fact a transmit-drop cause, would users of
> ip -s link show see an under-reported tx_dropped relative to the private
> ethtool counter?
Yes, I think this is a bug, and bufdrop should be added here. I will make sure
there no overlap, and otherwise add it.
>
> > + { "dr_local", SCNT_DR_LOCAL },
> > + { "dr_tail", SCNT_DR_TAIL },
>
> dr_local and dr_tail are also fed into stats->rx_dropped:
>
> stats->rx_dropped = c[SCNT_RX_LONG] +
> c[SCNT_DR_LOCAL] +
> c[SCNT_DR_TAIL] +
> ...
>
> Would it be appropriate to drop these from the ethtool -S list as well,
> given they are already aggregated into the standard rx_dropped counter?
>
> [ ... ]
>
> > +};
>
> [ ... ]
>
> > +static void lan9645x_check_stats_work(struct work_struct *work)
> > +{
> > + struct delayed_work *del_work = to_delayed_work(work);
> > + struct lan9645x_stats *stats;
> > +
> > + stats = container_of(del_work, struct lan9645x_stats, work);
> > +
> > + lan9645x_stats_update(stats->lan9645x);
> > +
> > + queue_delayed_work(stats->queue, &stats->work,
> > + LAN9645X_STATS_CHECK_DELAY);
> > +}
>
> The commit message says this worker "update[s] it frequently to handle
> overflows in hardware."
>
> Looking at lan9645x_stats_add_cnt():
>
> static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
> {
> if (val < (*cnt & U32_MAX))
> *cnt += (u64)1 << 32; /* value has wrapped */
>
> *cnt = (*cnt & ~(u64)U32_MAX) + val;
> }
>
> this compensates for at most one 32-bit wrap per polling interval. If two
> or more wraps happen between reads, the low-half comparison no longer
> detects the missed wrap and the 64-bit counter silently loses 2^32 per
> missed wrap.
>
> At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
> a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
> about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
> but there is no upper bound on the actual elapsed time between reads:
> delayed work can be stretched by system suspend/resume, heavy CPU load, or
> regmap/SPI contention.
>
> Would it be worthwhile to either bound the worst-case polling gap
> (e.g. a suspend/resume hook that forces a read, or a timestamp-based
> sanity check that detects a stretched interval and logs a warning) so
> that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
> silently undercount by multiples of 2^32?
>
It is true, and the timing is chosen so at most 1 wrap is supposed to occur.
I believe this is the exact pattern already used by ocelot, lan966x, sparx5
and lan969x.
Maybe I misunderstand the suspend/resume comment. We do not implement
support to suspend/resume, but if we did, then I assume you can suspend
for an arbitrary amount of time, so doing an additional read on
suspend/resume will not solve this problem?
The hw counters are 32bit. To stay synced they must never wrap more than
once. I think the only way avoid this problem is if we can make absolute
guarantees about how often the polling code runs?
> [ ... ]
prev parent reply other threads:[~2026-05-12 8:47 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-30 9:34 [PATCH net-next v4 0/9] net: dsa: add DSA support for the LAN9645x switch chip family Jens Emil Schulz Østergaard
2026-04-30 9:34 ` [PATCH net-next v4 1/9] net: dsa: add tag driver for LAN9645X Jens Emil Schulz Østergaard
2026-05-06 1:45 ` Jakub Kicinski
2026-05-12 6:28 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 2/9] dt-bindings: net: lan9645x: add LAN9645X switch bindings Jens Emil Schulz Østergaard
2026-04-30 9:34 ` [PATCH net-next v4 3/9] net: dsa: lan9645x: add autogenerated register macros Jens Emil Schulz Østergaard
2026-04-30 9:34 ` [PATCH net-next v4 4/9] net: dsa: lan9645x: add basic dsa driver for LAN9645X Jens Emil Schulz Østergaard
2026-05-06 1:45 ` Jakub Kicinski
2026-05-12 7:15 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 5/9] net: dsa: lan9645x: add bridge support Jens Emil Schulz Østergaard
2026-05-06 1:46 ` Jakub Kicinski
2026-05-12 7:24 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 6/9] net: dsa: lan9645x: add vlan support Jens Emil Schulz Østergaard
2026-05-06 1:46 ` Jakub Kicinski
2026-05-12 7:29 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 7/9] net: dsa: lan9645x: add mac table integration Jens Emil Schulz Østergaard
2026-05-06 1:46 ` Jakub Kicinski
2026-05-12 7:42 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 8/9] net: dsa: lan9645x: add mdb management Jens Emil Schulz Østergaard
2026-05-06 1:46 ` Jakub Kicinski
2026-05-12 7:45 ` Jens Emil Schulz Ostergaard
2026-04-30 9:34 ` [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics Jens Emil Schulz Østergaard
2026-05-06 1:46 ` Jakub Kicinski
2026-05-06 1:48 ` Jakub Kicinski
2026-05-12 8:47 ` Jens Emil Schulz Ostergaard [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c48f64d22429c1d4e10015033da4fff16c2a6d7d.camel@microchip.com \
--to=jensemil.schulzostergaard@microchip.com \
--cc=Steen.Hegelund@microchip.com \
--cc=UNGLinuxDriver@microchip.com \
--cc=andrew@lunn.ch \
--cc=conor+dt@kernel.org \
--cc=daniel.machon@microchip.com \
--cc=davem@davemloft.net \
--cc=devicetree@vger.kernel.org \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
--cc=robh@kernel.org \
--cc=woojung.huh@microchip.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox