Netdev List
 help / color / mirror / Atom feed
From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Aniket Negi <aniket.negi03@gmail.com>
Cc: netdev@vger.kernel.org, Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v2] net: airoha: fix MIB stats collection to be lossless
Date: Thu, 2 Jul 2026 00:43:29 +0200	[thread overview]
Message-ID: <akWYEf5wusTq9_ol@lore-desk> (raw)
In-Reply-To: <20260701173941.314795-1-aniket.negi03@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 14296 bytes --]

> The current driver resets hardware MIB counters after every read via
> REG_FE_GDM_MIB_CLEAR. This creates a race window: packets arriving
> between the read and the clear are silently lost from statistics.
> 
> Fix this by removing the MIB clear and switching to a delta-based
> software tracking approach:
> 
> - 64-bit H+L registers (tx/rx ok pkts, ok bytes, E64..L1023):
>   read the absolute hardware total directly each poll.
> 
> - 32-bit registers (drops, bc, mc, errors, runt, long, ...):
>   store the previous raw register value in mib_prev and accumulate
>   (u32)(curr - prev) into a 64-bit software counter. Unsigned
>   subtraction handles wrap-around transparently.
> 
> - tx_len[0]/rx_len[0] ({0,64} RMON bucket) combines RUNT_CNT
>   (32-bit, delta-tracked via mib_prev.tx_runt_cnt) and E64_CNT
>   (64-bit, absolute). A u64 accumulator tx_runt_accum64 holds the
>   running RUNT delta sum so that each poll sets:
>     tx_len[0] = tx_runt_accum64 + E64_abs
>   without double-counting the E64 value.
> 
> Merge airoha_dev_get_hw_stats() into airoha_update_hw_stats(),
> moving the port spin_lock inside so callers do not need a separate
> wrapper.
> 
> Signed-off-by: Aniket Negi <aniket.negi03@gmail.com>

Hi Aniket,

just few nits inline. Fixing them:

Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>

> ---
> 
> Changes in v2:
>   - Store _CNT_L register reads in val before adding to stats, improving
>     readability (suggested by Lorenzo Bianconi)
>   - Fix double-counting bug in the RUNT+E64 combined bucket: previously
>     "+=" for E64 re-added the full absolute counter each poll; now a
>     dedicated tx_runt_accum64/rx_runt_accum64 accumulator holds the
>     running RUNT delta, and tx_len[0] is assigned (not accumulated) each
>     poll as runt_accum64 + E64_abs
>   - Replace 7-element tx_len[]/rx_len[] shadow arrays in mib_prev with
>     focused tx_runt_cnt/tx_long_cnt and rx_runt_cnt/rx_long_cnt fields;
>     only RUNT and LONG are 32-bit and need wrap-around tracking
>   - Rename inner struct hw_prev_stats to mib_prev; rename accumulator
>     fields to tx_runt_accum64/rx_runt_accum64 for clarity
>   - Fix comment alignment in mib_prev struct block
>   - Rename airoha_dev_get_hw_stats() to airoha_update_hw_stats() and
>     move the port spin_lock inside, removing the separate wrapper
> 
>  drivers/net/ethernet/airoha/airoha_eth.c | 115 +++++++++++++----------
>  drivers/net/ethernet/airoha/airoha_eth.h |  27 ++++++
>  2 files changed, 92 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 59001fd4b6f7..4b7c547de165 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -1686,12 +1686,14 @@ static void airoha_qdma_stop_napi(struct airoha_qdma *qdma)
>  	}
>  }
>  
> -static void airoha_dev_get_hw_stats(struct airoha_gdm_dev *dev)
> +static void airoha_update_hw_stats(struct airoha_gdm_dev *dev)
>  {
>  	struct airoha_gdm_port *port = dev->port;
>  	struct airoha_eth *eth = dev->eth;
>  	u32 val, i = 0;
>  
> +	spin_lock(&port->stats_lock);
> +
>  	/* Read relevant MIB for GDM with multiple port attached */
>  	if (port->id == AIROHA_GDM3_IDX || port->id == AIROHA_GDM4_IDX)
>  		airoha_fe_rmw(eth, REG_FE_GDM_MIB_CFG(port->id),
> @@ -1701,152 +1703,165 @@ static void airoha_dev_get_hw_stats(struct airoha_gdm_dev *dev)
>  
>  	u64_stats_update_begin(&dev->stats.syncp);
>  
> -	/* TX */
> +	/* TX - 64-bit H+L registers: hw accumulates the total, read directly. */
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_OK_PKT_CNT_H(port->id));
> -	dev->stats.tx_ok_pkts += ((u64)val << 32);
> +	dev->stats.tx_ok_pkts = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_OK_PKT_CNT_L(port->id));
>  	dev->stats.tx_ok_pkts += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_OK_BYTE_CNT_H(port->id));
> -	dev->stats.tx_ok_bytes += ((u64)val << 32);
> +	dev->stats.tx_ok_bytes = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_OK_BYTE_CNT_L(port->id));
>  	dev->stats.tx_ok_bytes += val;
>  
> +	/* TX - 32-bit registers: accumulate delta to handle wrap-around. */
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_DROP_CNT(port->id));
> -	dev->stats.tx_drops += val;
> +	dev->stats.tx_drops += (u32)(val - dev->stats.mib_prev.tx_drops);
> +	dev->stats.mib_prev.tx_drops = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_BC_CNT(port->id));
> -	dev->stats.tx_broadcast += val;
> +	dev->stats.tx_broadcast += (u32)(val - dev->stats.mib_prev.tx_broadcast);
> +	dev->stats.mib_prev.tx_broadcast = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_MC_CNT(port->id));
> -	dev->stats.tx_multicast += val;
> +	dev->stats.tx_multicast += (u32)(val - dev->stats.mib_prev.tx_multicast);
> +	dev->stats.mib_prev.tx_multicast = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_RUNT_CNT(port->id));
> -	dev->stats.tx_len[i] += val;
> +	dev->stats.mib_prev.tx_runt_accum64 +=

I guess dev->stats.mib_prev.tx_runt64

> +		(u32)(val - dev->stats.mib_prev.tx_runt_cnt);

	dev->stats.mib_prev.tx_runt64 += (u32)(val -
					       dev->stats.mib_prev.tx_runt);

> +	dev->stats.mib_prev.tx_runt_cnt = val;
> +
> +	/* tx_len[0]: RUNT (32-bit, delta) + E64 (64-bit, absolute) → {0, 64} bucket.
> +	 * Accumulate RUNT delta in tx_runt_accum64, then assign tx_len[0] as
> +	 * accum + E64_abs so each call gives the correct combined total.
> +	 */

no new-line here.

> +
> +	dev->stats.tx_len[i] = dev->stats.mib_prev.tx_runt_accum64;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_E64_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] += (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_E64_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L64_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L64_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L127_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L127_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L255_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L255_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L511_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L511_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L1023_CNT_H(port->id));
> -	dev->stats.tx_len[i] += ((u64)val << 32);
> +	dev->stats.tx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_L1023_CNT_L(port->id));
>  	dev->stats.tx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_TX_ETH_LONG_CNT(port->id));
> -	dev->stats.tx_len[i++] += val;
> +	dev->stats.tx_len[i++] += (u32)(val - dev->stats.mib_prev.tx_long_cnt);
> +	dev->stats.mib_prev.tx_long_cnt = val;
>  
>  	/* RX */
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_OK_PKT_CNT_H(port->id));
> -	dev->stats.rx_ok_pkts += ((u64)val << 32);
> +	dev->stats.rx_ok_pkts = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_OK_PKT_CNT_L(port->id));
>  	dev->stats.rx_ok_pkts += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_OK_BYTE_CNT_H(port->id));
> -	dev->stats.rx_ok_bytes += ((u64)val << 32);
> +	dev->stats.rx_ok_bytes = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_OK_BYTE_CNT_L(port->id));
>  	dev->stats.rx_ok_bytes += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_DROP_CNT(port->id));
> -	dev->stats.rx_drops += val;
> +	dev->stats.rx_drops += (u32)(val - dev->stats.mib_prev.rx_drops);
> +	dev->stats.mib_prev.rx_drops = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_BC_CNT(port->id));
> -	dev->stats.rx_broadcast += val;
> +	dev->stats.rx_broadcast += (u32)(val - dev->stats.mib_prev.rx_broadcast);
> +	dev->stats.mib_prev.rx_broadcast = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_MC_CNT(port->id));
> -	dev->stats.rx_multicast += val;
> +	dev->stats.rx_multicast += (u32)(val - dev->stats.mib_prev.rx_multicast);
> +	dev->stats.mib_prev.rx_multicast = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ERROR_DROP_CNT(port->id));
> -	dev->stats.rx_errors += val;
> +	dev->stats.rx_errors += (u32)(val - dev->stats.mib_prev.rx_errors);
> +	dev->stats.mib_prev.rx_errors = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_CRC_ERR_CNT(port->id));
> -	dev->stats.rx_crc_error += val;
> +	dev->stats.rx_crc_error += (u32)(val - dev->stats.mib_prev.rx_crc_error);
> +	dev->stats.mib_prev.rx_crc_error = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_OVERFLOW_DROP_CNT(port->id));
> -	dev->stats.rx_over_errors += val;
> +	dev->stats.rx_over_errors += (u32)(val - dev->stats.mib_prev.rx_over_errors);
> +	dev->stats.mib_prev.rx_over_errors = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_FRAG_CNT(port->id));
> -	dev->stats.rx_fragment += val;
> +	dev->stats.rx_fragment += (u32)(val - dev->stats.mib_prev.rx_fragment);
> +	dev->stats.mib_prev.rx_fragment = val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_JABBER_CNT(port->id));
> -	dev->stats.rx_jabber += val;
> +	dev->stats.rx_jabber += (u32)(val - dev->stats.mib_prev.rx_jabber);
> +	dev->stats.mib_prev.rx_jabber = val;
>  
>  	i = 0;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_RUNT_CNT(port->id));
> -	dev->stats.rx_len[i] += val;
> +	dev->stats.mib_prev.rx_runt_accum64 +=
> +		(u32)(val - dev->stats.mib_prev.rx_runt_cnt);

ditto.

> +	dev->stats.mib_prev.rx_runt_cnt = val;
> +
> +	/* rx_len[0]: RUNT (32-bit, delta) + E64 (64-bit, absolute) → {0, 64} bucket.
> +	 * then assign rx_len[0] = rx_runt_accum64 + E64_abs.
> +	 */
>  
ditto.

> +	dev->stats.rx_len[i] = dev->stats.mib_prev.rx_runt_accum64;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_E64_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] += (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_E64_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L64_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L64_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L127_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L127_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L255_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L255_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L511_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L511_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L1023_CNT_H(port->id));
> -	dev->stats.rx_len[i] += ((u64)val << 32);
> +	dev->stats.rx_len[i] = (u64)val << 32;
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_L1023_CNT_L(port->id));
>  	dev->stats.rx_len[i++] += val;
>  
>  	val = airoha_fe_rr(eth, REG_FE_GDM_RX_ETH_LONG_CNT(port->id));
> -	dev->stats.rx_len[i++] += val;
> +	dev->stats.rx_len[i] += (u32)(val - dev->stats.mib_prev.rx_long_cnt);
> +	dev->stats.mib_prev.rx_long_cnt = val;
>  
>  	u64_stats_update_end(&dev->stats.syncp);
> -}
> -
> -static void airoha_update_hw_stats(struct airoha_gdm_dev *dev)
> -{
> -	struct airoha_gdm_port *port = dev->port;
> -	int i;
> -
> -	spin_lock(&port->stats_lock);
> -
> -	for (i = 0; i < ARRAY_SIZE(port->devs); i++) {
> -		if (port->devs[i])
> -			airoha_dev_get_hw_stats(port->devs[i]);
> -	}
> -
> -	/* Reset MIB counters */
> -	airoha_fe_set(dev->eth, REG_FE_GDM_MIB_CLEAR(port->id),
> -		      FE_GDM_MIB_RX_CLEAR_MASK | FE_GDM_MIB_TX_CLEAR_MASK);
>  
>  	spin_unlock(&port->stats_lock);
>  }
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
> index f6d01a8e8da1..3af1c49dd62d 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.h
> +++ b/drivers/net/ethernet/airoha/airoha_eth.h
> @@ -245,6 +245,33 @@ struct airoha_hw_stats {
>  	u64 rx_fragment;
>  	u64 rx_jabber;
>  	u64 rx_len[7];
> +
> +	struct {
> +		/* Previous HW register values for 32-bit counter delta
> +		 * tracking. Storing the last seen value and accumulating
> +		 * (u32)(curr - prev) into the 64-bit software counter
> +		 * handles wrap-around transparently via unsigned arithmetic.
> +		 * tx_runt_accum64/rx_runt_accum64 hold the running sum of
> +		 * runt deltas. These fields are never reported to userspace.
> +		 */
> +		u32 tx_drops;
> +		u32 tx_broadcast;
> +		u32 tx_multicast;
> +		u32 tx_runt_cnt;

		u32 tx_runt;

> +		u32 tx_long_cnt;

		u32 tx_long;

> +		u64 tx_runt_accum64;

		64 tx_runt64;

> +		u32 rx_drops;
> +		u32 rx_broadcast;
> +		u32 rx_multicast;
> +		u32 rx_errors;
> +		u32 rx_crc_error;
> +		u32 rx_over_errors;
> +		u32 rx_fragment;
> +		u32 rx_jabber;
> +		u32 rx_runt_cnt;

		u32 rx_runt;

> +		u32 rx_long_cnt;

		u32 rx_long;

> +		u64 rx_runt_accum64;

		u64 rx_runt64;

> +	} mib_prev;
>  };
>  
>  enum {
> 
> base-commit: a225f8c20712713406ae47024b8df42deacddd4a
> -- 
> 2.43.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

      reply	other threads:[~2026-07-01 22:43 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-30 11:18 [PATCH] net: airoha: fix MIB stats collection to be lossless Aniket Negi
2026-06-30 14:21 ` Lorenzo Bianconi
2026-07-01  6:38   ` Aniket Negi
2026-07-01  6:51     ` Lorenzo Bianconi
2026-07-01 10:29       ` Aniket Negi
2026-07-01 17:39 ` [PATCH net v2] " Aniket Negi
2026-07-01 22:43   ` Lorenzo Bianconi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akWYEf5wusTq9_ol@lore-desk \
    --to=lorenzo@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=aniket.negi03@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox