* [PATCH 0/2] net: mana: Refactor GF stats handling and add rx_missed_errors counter
@ 2025-09-15 3:58 Erni Sri Satya Vennela
2025-09-15 3:58 ` [PATCH 1/2] net: mana: Refactor GF stats to use global mana_context Erni Sri Satya Vennela
2025-09-15 3:58 ` [PATCH 2/2] net: mana: Add standard counter rx_missed_errors Erni Sri Satya Vennela
0 siblings, 2 replies; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-09-15 3:58 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, ernis, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
Restructure mana_query_gf_stats() to operate on the per-VF mana_context,
instead of per-port statistics. Introduce mana_ethtool_hc_stats to
isolate hardware counter statistics and update the
"ethtool -S <interface>" output to expose all relevant counters while
preserving backward compatibility.
Add support for the standard rx_missed_errors counter by mapping it to
the hardware’s hc_rx_discards_no_wqe metric. Introduce a
dedicated workqueue that refreshes statistics every 2 seconds, ensuring
timely and consistent updates of hardware counters.
Erni Sri Satya Vennela (2):
net: mana: Refactor GF stats to use global mana_context
net: mana: Add standard counter rx_missed_errors
drivers/net/ethernet/microsoft/mana/mana_en.c | 103 ++++++++++++------
.../ethernet/microsoft/mana/mana_ethtool.c | 85 ++++++++-------
include/net/mana/gdma.h | 6 +-
include/net/mana/mana.h | 17 ++-
4 files changed, 131 insertions(+), 80 deletions(-)
--
2.34.1
* [PATCH 1/2] net: mana: Refactor GF stats to use global mana_context
2025-09-15 3:58 [PATCH 0/2] net: mana: Refactor GF stats handling and add rx_missed_errors counter Erni Sri Satya Vennela
@ 2025-09-15 3:58 ` Erni Sri Satya Vennela
2025-09-15 3:58 ` [PATCH 2/2] net: mana: Add standard counter rx_missed_errors Erni Sri Satya Vennela
1 sibling, 0 replies; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-09-15 3:58 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, ernis, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
Refactor mana_query_gf_stats() to take mana_context instead of the per-port
mana_port_context, enabling a single query for all VFs. Isolate hardware
counter stats by introducing mana_ethtool_hc_stats in mana_context, and
update the code so that all stats are still reported via
"ethtool -S <interface>", consistent with previous behavior.
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 67 ++++++++-------
.../ethernet/microsoft/mana/mana_ethtool.c | 85 ++++++++++---------
include/net/mana/mana.h | 14 +--
3 files changed, 90 insertions(+), 76 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f4fc86f20213..787644229897 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2779,11 +2779,12 @@ int mana_config_rss(struct mana_port_context *apc, enum TRI_STATE rx,
return 0;
}
-void mana_query_gf_stats(struct mana_port_context *apc)
+void mana_query_gf_stats(struct mana_context *ac)
{
struct mana_query_gf_stat_resp resp = {};
struct mana_query_gf_stat_req req = {};
- struct net_device *ndev = apc->ndev;
+ struct gdma_context *gc = ac->gdma_dev->gdma_context;
+ struct device *dev = gc->dev;
int err;
mana_gd_init_req_hdr(&req.hdr, MANA_QUERY_GF_STAT,
@@ -2817,52 +2818,52 @@ void mana_query_gf_stats(struct mana_port_context *apc)
STATISTICS_FLAGS_HC_TX_BCAST_BYTES |
STATISTICS_FLAGS_TX_ERRORS_GDMA_ERROR;
- err = mana_send_request(apc->ac, &req, sizeof(req), &resp,
+ err = mana_send_request(ac, &req, sizeof(req), &resp,
sizeof(resp));
if (err) {
- netdev_err(ndev, "Failed to query GF stats: %d\n", err);
+ dev_err(dev, "Failed to query GF stats: %d\n", err);
return;
}
err = mana_verify_resp_hdr(&resp.hdr, MANA_QUERY_GF_STAT,
sizeof(resp));
if (err || resp.hdr.status) {
- netdev_err(ndev, "Failed to query GF stats: %d, 0x%x\n", err,
- resp.hdr.status);
+ dev_err(dev, "Failed to query GF stats: %d, 0x%x\n", err,
+ resp.hdr.status);
return;
}
- apc->eth_stats.hc_rx_discards_no_wqe = resp.rx_discards_nowqe;
- apc->eth_stats.hc_rx_err_vport_disabled = resp.rx_err_vport_disabled;
- apc->eth_stats.hc_rx_bytes = resp.hc_rx_bytes;
- apc->eth_stats.hc_rx_ucast_pkts = resp.hc_rx_ucast_pkts;
- apc->eth_stats.hc_rx_ucast_bytes = resp.hc_rx_ucast_bytes;
- apc->eth_stats.hc_rx_bcast_pkts = resp.hc_rx_bcast_pkts;
- apc->eth_stats.hc_rx_bcast_bytes = resp.hc_rx_bcast_bytes;
- apc->eth_stats.hc_rx_mcast_pkts = resp.hc_rx_mcast_pkts;
- apc->eth_stats.hc_rx_mcast_bytes = resp.hc_rx_mcast_bytes;
- apc->eth_stats.hc_tx_err_gf_disabled = resp.tx_err_gf_disabled;
- apc->eth_stats.hc_tx_err_vport_disabled = resp.tx_err_vport_disabled;
- apc->eth_stats.hc_tx_err_inval_vportoffset_pkt =
+ ac->hc_stats.hc_rx_discards_no_wqe = resp.rx_discards_nowqe;
+ ac->hc_stats.hc_rx_err_vport_disabled = resp.rx_err_vport_disabled;
+ ac->hc_stats.hc_rx_bytes = resp.hc_rx_bytes;
+ ac->hc_stats.hc_rx_ucast_pkts = resp.hc_rx_ucast_pkts;
+ ac->hc_stats.hc_rx_ucast_bytes = resp.hc_rx_ucast_bytes;
+ ac->hc_stats.hc_rx_bcast_pkts = resp.hc_rx_bcast_pkts;
+ ac->hc_stats.hc_rx_bcast_bytes = resp.hc_rx_bcast_bytes;
+ ac->hc_stats.hc_rx_mcast_pkts = resp.hc_rx_mcast_pkts;
+ ac->hc_stats.hc_rx_mcast_bytes = resp.hc_rx_mcast_bytes;
+ ac->hc_stats.hc_tx_err_gf_disabled = resp.tx_err_gf_disabled;
+ ac->hc_stats.hc_tx_err_vport_disabled = resp.tx_err_vport_disabled;
+ ac->hc_stats.hc_tx_err_inval_vportoffset_pkt =
resp.tx_err_inval_vport_offset_pkt;
- apc->eth_stats.hc_tx_err_vlan_enforcement =
+ ac->hc_stats.hc_tx_err_vlan_enforcement =
resp.tx_err_vlan_enforcement;
- apc->eth_stats.hc_tx_err_eth_type_enforcement =
+ ac->hc_stats.hc_tx_err_eth_type_enforcement =
resp.tx_err_ethtype_enforcement;
- apc->eth_stats.hc_tx_err_sa_enforcement = resp.tx_err_SA_enforcement;
- apc->eth_stats.hc_tx_err_sqpdid_enforcement =
+ ac->hc_stats.hc_tx_err_sa_enforcement = resp.tx_err_SA_enforcement;
+ ac->hc_stats.hc_tx_err_sqpdid_enforcement =
resp.tx_err_SQPDID_enforcement;
- apc->eth_stats.hc_tx_err_cqpdid_enforcement =
+ ac->hc_stats.hc_tx_err_cqpdid_enforcement =
resp.tx_err_CQPDID_enforcement;
- apc->eth_stats.hc_tx_err_mtu_violation = resp.tx_err_mtu_violation;
- apc->eth_stats.hc_tx_err_inval_oob = resp.tx_err_inval_oob;
- apc->eth_stats.hc_tx_bytes = resp.hc_tx_bytes;
- apc->eth_stats.hc_tx_ucast_pkts = resp.hc_tx_ucast_pkts;
- apc->eth_stats.hc_tx_ucast_bytes = resp.hc_tx_ucast_bytes;
- apc->eth_stats.hc_tx_bcast_pkts = resp.hc_tx_bcast_pkts;
- apc->eth_stats.hc_tx_bcast_bytes = resp.hc_tx_bcast_bytes;
- apc->eth_stats.hc_tx_mcast_pkts = resp.hc_tx_mcast_pkts;
- apc->eth_stats.hc_tx_mcast_bytes = resp.hc_tx_mcast_bytes;
- apc->eth_stats.hc_tx_err_gdma = resp.tx_err_gdma;
+ ac->hc_stats.hc_tx_err_mtu_violation = resp.tx_err_mtu_violation;
+ ac->hc_stats.hc_tx_err_inval_oob = resp.tx_err_inval_oob;
+ ac->hc_stats.hc_tx_bytes = resp.hc_tx_bytes;
+ ac->hc_stats.hc_tx_ucast_pkts = resp.hc_tx_ucast_pkts;
+ ac->hc_stats.hc_tx_ucast_bytes = resp.hc_tx_ucast_bytes;
+ ac->hc_stats.hc_tx_bcast_pkts = resp.hc_tx_bcast_pkts;
+ ac->hc_stats.hc_tx_bcast_bytes = resp.hc_tx_bcast_bytes;
+ ac->hc_stats.hc_tx_mcast_pkts = resp.hc_tx_mcast_pkts;
+ ac->hc_stats.hc_tx_mcast_bytes = resp.hc_tx_mcast_bytes;
+ ac->hc_stats.hc_tx_err_gdma = resp.tx_err_gdma;
}
void mana_query_phy_stats(struct mana_port_context *apc)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
index a1afa75a9463..3dfd96146424 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
@@ -15,66 +15,69 @@ struct mana_stats_desc {
static const struct mana_stats_desc mana_eth_stats[] = {
{"stop_queue", offsetof(struct mana_ethtool_stats, stop_queue)},
{"wake_queue", offsetof(struct mana_ethtool_stats, wake_queue)},
- {"hc_rx_discards_no_wqe", offsetof(struct mana_ethtool_stats,
+ {"tx_cq_err", offsetof(struct mana_ethtool_stats, tx_cqe_err)},
+ {"tx_cqe_unknown_type", offsetof(struct mana_ethtool_stats,
+ tx_cqe_unknown_type)},
+ {"rx_coalesced_err", offsetof(struct mana_ethtool_stats,
+ rx_coalesced_err)},
+ {"rx_cqe_unknown_type", offsetof(struct mana_ethtool_stats,
+ rx_cqe_unknown_type)},
+};
+
+static const struct mana_stats_desc mana_hc_stats[] = {
+ {"hc_rx_discards_no_wqe", offsetof(struct mana_ethtool_hc_stats,
hc_rx_discards_no_wqe)},
- {"hc_rx_err_vport_disabled", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_err_vport_disabled", offsetof(struct mana_ethtool_hc_stats,
hc_rx_err_vport_disabled)},
- {"hc_rx_bytes", offsetof(struct mana_ethtool_stats, hc_rx_bytes)},
- {"hc_rx_ucast_pkts", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_bytes", offsetof(struct mana_ethtool_hc_stats, hc_rx_bytes)},
+ {"hc_rx_ucast_pkts", offsetof(struct mana_ethtool_hc_stats,
hc_rx_ucast_pkts)},
- {"hc_rx_ucast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_ucast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_rx_ucast_bytes)},
- {"hc_rx_bcast_pkts", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_bcast_pkts", offsetof(struct mana_ethtool_hc_stats,
hc_rx_bcast_pkts)},
- {"hc_rx_bcast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_bcast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_rx_bcast_bytes)},
- {"hc_rx_mcast_pkts", offsetof(struct mana_ethtool_stats,
- hc_rx_mcast_pkts)},
- {"hc_rx_mcast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_rx_mcast_pkts", offsetof(struct mana_ethtool_hc_stats,
+ hc_rx_mcast_pkts)},
+ {"hc_rx_mcast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_rx_mcast_bytes)},
- {"hc_tx_err_gf_disabled", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_err_gf_disabled", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_gf_disabled)},
- {"hc_tx_err_vport_disabled", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_err_vport_disabled", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_vport_disabled)},
{"hc_tx_err_inval_vportoffset_pkt",
- offsetof(struct mana_ethtool_stats,
+ offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_inval_vportoffset_pkt)},
- {"hc_tx_err_vlan_enforcement", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_err_vlan_enforcement", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_vlan_enforcement)},
{"hc_tx_err_eth_type_enforcement",
- offsetof(struct mana_ethtool_stats, hc_tx_err_eth_type_enforcement)},
- {"hc_tx_err_sa_enforcement", offsetof(struct mana_ethtool_stats,
+ offsetof(struct mana_ethtool_hc_stats, hc_tx_err_eth_type_enforcement)},
+ {"hc_tx_err_sa_enforcement", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_sa_enforcement)},
{"hc_tx_err_sqpdid_enforcement",
- offsetof(struct mana_ethtool_stats, hc_tx_err_sqpdid_enforcement)},
+ offsetof(struct mana_ethtool_hc_stats, hc_tx_err_sqpdid_enforcement)},
{"hc_tx_err_cqpdid_enforcement",
- offsetof(struct mana_ethtool_stats, hc_tx_err_cqpdid_enforcement)},
- {"hc_tx_err_mtu_violation", offsetof(struct mana_ethtool_stats,
+ offsetof(struct mana_ethtool_hc_stats, hc_tx_err_cqpdid_enforcement)},
+ {"hc_tx_err_mtu_violation", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_mtu_violation)},
- {"hc_tx_err_inval_oob", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_err_inval_oob", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_inval_oob)},
- {"hc_tx_err_gdma", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_err_gdma", offsetof(struct mana_ethtool_hc_stats,
hc_tx_err_gdma)},
- {"hc_tx_bytes", offsetof(struct mana_ethtool_stats, hc_tx_bytes)},
- {"hc_tx_ucast_pkts", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_bytes", offsetof(struct mana_ethtool_hc_stats, hc_tx_bytes)},
+ {"hc_tx_ucast_pkts", offsetof(struct mana_ethtool_hc_stats,
hc_tx_ucast_pkts)},
- {"hc_tx_ucast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_ucast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_tx_ucast_bytes)},
- {"hc_tx_bcast_pkts", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_bcast_pkts", offsetof(struct mana_ethtool_hc_stats,
hc_tx_bcast_pkts)},
- {"hc_tx_bcast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_bcast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_tx_bcast_bytes)},
- {"hc_tx_mcast_pkts", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_mcast_pkts", offsetof(struct mana_ethtool_hc_stats,
hc_tx_mcast_pkts)},
- {"hc_tx_mcast_bytes", offsetof(struct mana_ethtool_stats,
+ {"hc_tx_mcast_bytes", offsetof(struct mana_ethtool_hc_stats,
hc_tx_mcast_bytes)},
- {"tx_cq_err", offsetof(struct mana_ethtool_stats, tx_cqe_err)},
- {"tx_cqe_unknown_type", offsetof(struct mana_ethtool_stats,
- tx_cqe_unknown_type)},
- {"rx_coalesced_err", offsetof(struct mana_ethtool_stats,
- rx_coalesced_err)},
- {"rx_cqe_unknown_type", offsetof(struct mana_ethtool_stats,
- rx_cqe_unknown_type)},
};
static const struct mana_stats_desc mana_phy_stats[] = {
@@ -138,7 +141,7 @@ static int mana_get_sset_count(struct net_device *ndev, int stringset)
if (stringset != ETH_SS_STATS)
return -EINVAL;
- return ARRAY_SIZE(mana_eth_stats) + ARRAY_SIZE(mana_phy_stats) +
+ return ARRAY_SIZE(mana_eth_stats) + ARRAY_SIZE(mana_phy_stats) + ARRAY_SIZE(mana_hc_stats) +
num_queues * (MANA_STATS_RX_COUNT + MANA_STATS_TX_COUNT);
}
@@ -150,10 +153,12 @@ static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data)
if (stringset != ETH_SS_STATS)
return;
-
for (i = 0; i < ARRAY_SIZE(mana_eth_stats); i++)
ethtool_puts(&data, mana_eth_stats[i].name);
+ for (i = 0; i < ARRAY_SIZE(mana_hc_stats); i++)
+ ethtool_puts(&data, mana_hc_stats[i].name);
+
for (i = 0; i < ARRAY_SIZE(mana_phy_stats); i++)
ethtool_puts(&data, mana_phy_stats[i].name);
@@ -186,6 +191,7 @@ static void mana_get_ethtool_stats(struct net_device *ndev,
struct mana_port_context *apc = netdev_priv(ndev);
unsigned int num_queues = apc->num_queues;
void *eth_stats = &apc->eth_stats;
+ void *hc_stats = &apc->ac->hc_stats;
void *phy_stats = &apc->phy_stats;
struct mana_stats_rx *rx_stats;
struct mana_stats_tx *tx_stats;
@@ -208,7 +214,7 @@ static void mana_get_ethtool_stats(struct net_device *ndev,
if (!apc->port_is_up)
return;
/* we call mana function to update stats from GDMA */
- mana_query_gf_stats(apc);
+ mana_query_gf_stats(apc->ac);
/* We call this mana function to get the phy stats from GDMA and includes
* aggregate tx/rx drop counters, Per-TC(Traffic Channel) tx/rx and pause
@@ -219,6 +225,9 @@ static void mana_get_ethtool_stats(struct net_device *ndev,
for (q = 0; q < ARRAY_SIZE(mana_eth_stats); q++)
data[i++] = *(u64 *)(eth_stats + mana_eth_stats[q].offset);
+ for (q = 0; q < ARRAY_SIZE(mana_hc_stats); q++)
+ data[i++] = *(u64 *)(hc_stats + mana_hc_stats[q].offset);
+
for (q = 0; q < ARRAY_SIZE(mana_phy_stats); q++)
data[i++] = *(u64 *)(phy_stats + mana_phy_stats[q].offset);
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 0921485565c0..519c4384c51f 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -375,6 +375,13 @@ struct mana_tx_qp {
struct mana_ethtool_stats {
u64 stop_queue;
u64 wake_queue;
+ u64 tx_cqe_err;
+ u64 tx_cqe_unknown_type;
+ u64 rx_coalesced_err;
+ u64 rx_cqe_unknown_type;
+};
+
+struct mana_ethtool_hc_stats {
u64 hc_rx_discards_no_wqe;
u64 hc_rx_err_vport_disabled;
u64 hc_rx_bytes;
@@ -402,10 +409,6 @@ struct mana_ethtool_stats {
u64 hc_tx_mcast_pkts;
u64 hc_tx_mcast_bytes;
u64 hc_tx_err_gdma;
- u64 tx_cqe_err;
- u64 tx_cqe_unknown_type;
- u64 rx_coalesced_err;
- u64 rx_cqe_unknown_type;
};
struct mana_ethtool_phy_stats {
@@ -473,6 +476,7 @@ struct mana_context {
u16 num_ports;
u8 bm_hostmode;
+ struct mana_ethtool_hc_stats hc_stats;
struct mana_eq *eqs;
struct dentry *mana_eqs_debugfs;
@@ -573,7 +577,7 @@ u32 mana_run_xdp(struct net_device *ndev, struct mana_rxq *rxq,
struct bpf_prog *mana_xdp_get(struct mana_port_context *apc);
void mana_chn_setxdp(struct mana_port_context *apc, struct bpf_prog *prog);
int mana_bpf(struct net_device *ndev, struct netdev_bpf *bpf);
-void mana_query_gf_stats(struct mana_port_context *apc);
+void mana_query_gf_stats(struct mana_context *ac);
int mana_query_link_cfg(struct mana_port_context *apc);
int mana_set_bw_clamp(struct mana_port_context *apc, u32 speed,
int enable_clamping);
--
2.34.1
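The table-driven reporting used in this patch (a name/offset descriptor array per stats struct, walked in the same order by the string and value callbacks) can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: every `demo_*` name is hypothetical, only a few counters are shown, and `uint64_t` stands in for the kernel's `u64`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the per-port and per-context stats structs. */
struct demo_eth_stats {
	uint64_t stop_queue;
	uint64_t wake_queue;
};

struct demo_hc_stats {
	uint64_t hc_rx_discards_no_wqe;
	uint64_t hc_rx_bytes;
	uint64_t hc_tx_bytes;
};

/* One entry per exported counter: its name and its byte offset. */
struct demo_stats_desc {
	const char *name;
	size_t offset;
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct demo_stats_desc demo_eth_desc[] = {
	{"stop_queue", offsetof(struct demo_eth_stats, stop_queue)},
	{"wake_queue", offsetof(struct demo_eth_stats, wake_queue)},
};

static const struct demo_stats_desc demo_hc_desc[] = {
	{"hc_rx_discards_no_wqe",
	 offsetof(struct demo_hc_stats, hc_rx_discards_no_wqe)},
	{"hc_rx_bytes", offsetof(struct demo_hc_stats, hc_rx_bytes)},
	{"hc_tx_bytes", offsetof(struct demo_hc_stats, hc_tx_bytes)},
};

/* Number of values written by demo_fill_stats(). */
static size_t demo_sset_count(void)
{
	return DEMO_ARRAY_SIZE(demo_eth_desc) + DEMO_ARRAY_SIZE(demo_hc_desc);
}

/* Copy one table's counters into data[], returning the next free index. */
static size_t demo_copy_table(uint64_t *data, size_t i, const void *base,
			      const struct demo_stats_desc *desc, size_t n)
{
	size_t q;

	for (q = 0; q < n; q++)
		memcpy(&data[i++], (const char *)base + desc[q].offset,
		       sizeof(uint64_t));
	return i;
}

/* Fill data[] in table order: per-port counters first, then hc counters. */
static size_t demo_fill_stats(uint64_t *data, size_t cap,
			      const struct demo_eth_stats *eth,
			      const struct demo_hc_stats *hc)
{
	size_t i = 0;

	assert(cap >= demo_sset_count());
	i = demo_copy_table(data, i, eth, demo_eth_desc,
			    DEMO_ARRAY_SIZE(demo_eth_desc));
	i = demo_copy_table(data, i, hc, demo_hc_desc,
			    DEMO_ARRAY_SIZE(demo_hc_desc));
	return i;
}
```

The point of the sketch is the invariant the patch has to preserve: the count, the name list, and the value list must all walk the same tables in the same order, which is why the patch touches mana_get_sset_count(), mana_get_strings(), and mana_get_ethtool_stats() together.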
* [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-15 3:58 [PATCH 0/2] net: mana: Refactor GF stats handling and add rx_missed_errors counter Erni Sri Satya Vennela
2025-09-15 3:58 ` [PATCH 1/2] net: mana: Refactor GF stats to use global mana_context Erni Sri Satya Vennela
@ 2025-09-15 3:58 ` Erni Sri Satya Vennela
2025-09-16 13:22 ` Paolo Abeni
1 sibling, 1 reply; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-09-15 3:58 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, ernis, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
Report standard counter stats->rx_missed_errors
using hc_rx_discards_no_wqe from the hardware.
Add a dedicated workqueue to periodically run
mana_query_gf_stats every 2 seconds to get the latest
info in hc_stats and define a driver capability flag
to notify hardware of the periodic queries.
To avoid repeated failures and log flooding, the work
is not rescheduled if mana_query_gf_stats fails.
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 38 +++++++++++++++++--
.../ethernet/microsoft/mana/mana_ethtool.c | 2 -
include/net/mana/gdma.h | 6 ++-
include/net/mana/mana.h | 5 ++-
4 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 787644229897..9213ae6ba038 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -494,6 +494,8 @@ static void mana_get_stats64(struct net_device *ndev,
netdev_stats_to_stats64(st, &ndev->stats);
+ st->rx_missed_errors = apc->ac->hc_stats.hc_rx_discards_no_wqe;
+
for (q = 0; q < num_queues; q++) {
rx_stats = &apc->rxqs[q]->stats;
@@ -2779,7 +2781,7 @@ int mana_config_rss(struct mana_port_context *apc, enum TRI_STATE rx,
return 0;
}
-void mana_query_gf_stats(struct mana_context *ac)
+int mana_query_gf_stats(struct mana_context *ac)
{
struct mana_query_gf_stat_resp resp = {};
struct mana_query_gf_stat_req req = {};
@@ -2822,14 +2824,14 @@ void mana_query_gf_stats(struct mana_context *ac)
sizeof(resp));
if (err) {
dev_err(dev, "Failed to query GF stats: %d\n", err);
- return;
+ return err;
}
err = mana_verify_resp_hdr(&resp.hdr, MANA_QUERY_GF_STAT,
sizeof(resp));
if (err || resp.hdr.status) {
dev_err(dev, "Failed to query GF stats: %d, 0x%x\n", err,
resp.hdr.status);
- return;
+ return err ? err : -EPROTO;
}
ac->hc_stats.hc_rx_discards_no_wqe = resp.rx_discards_nowqe;
@@ -2864,6 +2866,8 @@ void mana_query_gf_stats(struct mana_context *ac)
ac->hc_stats.hc_tx_mcast_pkts = resp.hc_tx_mcast_pkts;
ac->hc_stats.hc_tx_mcast_bytes = resp.hc_tx_mcast_bytes;
ac->hc_stats.hc_tx_err_gdma = resp.tx_err_gdma;
+
+ return 0;
}
void mana_query_phy_stats(struct mana_port_context *apc)
@@ -3400,6 +3404,19 @@ int mana_rdma_service_event(struct gdma_context *gc, enum gdma_service_type even
return 0;
}
+#define MANA_GF_STATS_PERIOD (2 * HZ)
+
+static void mana_gf_stats_work_handler(struct work_struct *work)
+{
+ struct mana_context *ac =
+ container_of(to_delayed_work(work), struct mana_context, gf_stats_work);
+
+ if (mana_query_gf_stats(ac))
+ return;
+
+ queue_delayed_work(ac->gf_stats_wq, &ac->gf_stats_work, MANA_GF_STATS_PERIOD);
+}
+
int mana_probe(struct gdma_dev *gd, bool resuming)
{
struct gdma_context *gc = gd->gdma_context;
@@ -3488,6 +3505,15 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
}
err = add_adev(gd, "eth");
+ ac->gf_stats_wq = create_singlethread_workqueue("mana_gf_stats");
+ if (!ac->gf_stats_wq) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+ queue_delayed_work(ac->gf_stats_wq, &ac->gf_stats_work, MANA_GF_STATS_PERIOD);
+
out:
if (err) {
mana_remove(gd, false);
@@ -3511,6 +3537,12 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
int err;
int i;
+ if (ac->gf_stats_wq) {
+ cancel_delayed_work_sync(&ac->gf_stats_work);
+ destroy_workqueue(ac->gf_stats_wq);
+ ac->gf_stats_wq = NULL;
+ }
+
/* adev currently doesn't support suspending, always remove it */
if (gd->adev)
remove_adev(gd);
diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
index 3dfd96146424..99e811208683 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
@@ -213,8 +213,6 @@ static void mana_get_ethtool_stats(struct net_device *ndev,
if (!apc->port_is_up)
return;
- /* we call mana function to update stats from GDMA */
- mana_query_gf_stats(apc->ac);
/* We call this mana function to get the phy stats from GDMA and includes
* aggregate tx/rx drop counters, Per-TC(Traffic Channel) tx/rx and pause
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 57df78cfbf82..88a81fb164a0 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -591,6 +591,9 @@ enum {
/* Driver can self reset on FPGA Reconfig EQE notification */
#define GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE BIT(17)
+/* Driver can send HWC periodically to query stats */
+#define GDMA_DRV_CAP_FLAG_1_PERIODIC_STATS_QUERY BIT(21)
+
#define GDMA_DRV_CAP_FLAGS1 \
(GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT | \
GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX | \
@@ -599,7 +602,8 @@ enum {
GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP | \
GDMA_DRV_CAP_FLAG_1_DYNAMIC_IRQ_ALLOC_SUPPORT | \
GDMA_DRV_CAP_FLAG_1_SELF_RESET_ON_EQE | \
- GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE)
+ GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE | \
+ GDMA_DRV_CAP_FLAG_1_PERIODIC_STATS_QUERY)
#define GDMA_DRV_CAP_FLAGS2 0
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 519c4384c51f..74f112bcdc02 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -480,6 +480,9 @@ struct mana_context {
struct mana_eq *eqs;
struct dentry *mana_eqs_debugfs;
+ struct workqueue_struct *gf_stats_wq;
+ struct delayed_work gf_stats_work;
+
struct net_device *ports[MAX_PORTS_IN_MANA_DEV];
};
@@ -577,7 +580,7 @@ u32 mana_run_xdp(struct net_device *ndev, struct mana_rxq *rxq,
struct bpf_prog *mana_xdp_get(struct mana_port_context *apc);
void mana_chn_setxdp(struct mana_port_context *apc, struct bpf_prog *prog);
int mana_bpf(struct net_device *ndev, struct netdev_bpf *bpf);
-void mana_query_gf_stats(struct mana_context *ac);
+int mana_query_gf_stats(struct mana_context *ac);
int mana_query_link_cfg(struct mana_port_context *apc);
int mana_set_bw_clamp(struct mana_port_context *apc, u32 speed,
int enable_clamping);
--
2.34.1
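The rescheduling policy in mana_gf_stats_work_handler() (run the query, re-queue only on success) can be modelled in plain userspace C. This is a rough sketch under stated assumptions, not kernel code: the `demo_*` names are hypothetical, a call to `demo_work_tick()` stands in for one expiry of the 2-second delayed work, and the `armed` flag stands in for the work being queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical query callback: 0 on success, nonzero on failure. */
typedef int (*demo_query_fn)(void *arg);

struct demo_periodic_work {
	demo_query_fn query;
	void *arg;
	bool armed;        /* true while the work is queued for another period */
	unsigned int runs; /* number of times the handler has executed */
};

/* Mirrors INIT_DELAYED_WORK() plus the first queue_delayed_work() in probe. */
static void demo_work_init(struct demo_periodic_work *w, demo_query_fn query,
			   void *arg)
{
	assert(query != NULL);
	w->query = query;
	w->arg = arg;
	w->armed = true;
	w->runs = 0;
}

/* One timer expiry: run the query and re-arm only if it succeeded. */
static void demo_work_tick(struct demo_periodic_work *w)
{
	if (!w->armed)
		return;

	w->armed = false;
	w->runs++;

	if (w->query(w->arg))
		return; /* failure: the work stays disarmed from now on */

	w->armed = true;
}

/* Example query that succeeds until a countdown reaches zero. */
static int demo_query_fail_after(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return -1;
	(*remaining)--;
	return 0;
}
```

With this policy a single failed query permanently stops the refresh until the work is re-initialised, which in the patch only happens in mana_probe(). Whether that is acceptable depends on whether a query failure can be transient.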
* Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-15 3:58 ` [PATCH 2/2] net: mana: Add standard counter rx_missed_errors Erni Sri Satya Vennela
@ 2025-09-16 13:22 ` Paolo Abeni
2025-09-17 5:53 ` Erni Sri Satya Vennela
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-09-16 13:22 UTC (permalink / raw)
To: Erni Sri Satya Vennela, kys, haiyangz, wei.liu, decui,
andrew+netdev, davem, edumazet, kuba, longli, kotaranov, horms,
shradhagupta, dipayanroy, shirazsaleem, rosenp, linux-hyperv,
netdev, linux-kernel, linux-rdma
On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> Report standard counter stats->rx_missed_errors
> using hc_rx_discards_no_wqe from the hardware.
>
> Add a dedicated workqueue to periodically run
> mana_query_gf_stats every 2 seconds to get the latest
> info in eth_stats and define a driver capability flag
> to notify hardware of the periodic queries.
>
> To avoid repeated failures and log flooding, the workqueue
> is not rescheduled if mana_query_gf_stats fails.
Can the failure root cause be a "transient" one? If so, this looks like
a dangerous strategy; in such a scenario, AFAICS, stats will be broken
until the device is removed and re-probed.
/P
* Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-16 13:22 ` Paolo Abeni
@ 2025-09-17 5:53 ` Erni Sri Satya Vennela
2025-09-25 4:24 ` Erni Sri Satya Vennela
2025-10-23 5:31 ` Erni Sri Satya Vennela
2 siblings, 0 replies; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-09-17 5:53 UTC (permalink / raw)
To: Paolo Abeni
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
On Tue, Sep 16, 2025 at 03:22:54PM +0200, Paolo Abeni wrote:
> On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> > Report standard counter stats->rx_missed_errors
> > using hc_rx_discards_no_wqe from the hardware.
> >
> > Add a dedicated workqueue to periodically run
> > mana_query_gf_stats every 2 seconds to get the latest
> > info in eth_stats and define a driver capability flag
> > to notify hardware of the periodic queries.
> >
> > To avoid repeated failures and log flooding, the workqueue
> > is not rescheduled if mana_query_gf_stats fails.
>
> Can the failure root cause be a "transient" one? If so, this looks like
> a dangerous strategy; in such a scenario, AFAICS, stats will be broken
> until the device is removed and re-probed.
>
We are working on using the stats query as a health check for the
hardware and its channel. Even if it fails once, the VF needs to
be reset, similar to a probe. The hardware team also confirmed that even
a one-time or temporary failure needs a VF reset.
- Vennela
> /P
* Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-16 13:22 ` Paolo Abeni
2025-09-17 5:53 ` Erni Sri Satya Vennela
@ 2025-09-25 4:24 ` Erni Sri Satya Vennela
2025-10-14 9:36 ` Erni Sri Satya Vennela
2025-10-23 5:31 ` Erni Sri Satya Vennela
2 siblings, 1 reply; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-09-25 4:24 UTC (permalink / raw)
To: Paolo Abeni
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
On Tue, Sep 16, 2025 at 03:22:54PM +0200, Paolo Abeni wrote:
> On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> > Report standard counter stats->rx_missed_errors
> > using hc_rx_discards_no_wqe from the hardware.
> >
> > Add a dedicated workqueue to periodically run
> > mana_query_gf_stats every 2 seconds to get the latest
> > info in eth_stats and define a driver capability flag
> > to notify hardware of the periodic queries.
> >
> > To avoid repeated failures and log flooding, the workqueue
> > is not rescheduled if mana_query_gf_stats fails.
>
> Can the failure root cause be a "transient" one? If so, this looks like
> a dangerous strategy; in such a scenario, AFAICS, stats will be broken
> until the device is removed and re-probed.
>
> /P
Hi Paolo,
Does this patch require further discussion?
- Vennela
* Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-25 4:24 ` Erni Sri Satya Vennela
@ 2025-10-14 9:36 ` Erni Sri Satya Vennela
0 siblings, 0 replies; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-10-14 9:36 UTC (permalink / raw)
To: Paolo Abeni
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
On Wed, Sep 24, 2025 at 09:24:05PM -0700, Erni Sri Satya Vennela wrote:
> On Tue, Sep 16, 2025 at 03:22:54PM +0200, Paolo Abeni wrote:
> > On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> > > Report standard counter stats->rx_missed_errors
> > > using hc_rx_discards_no_wqe from the hardware.
> > >
> > > Add a dedicated workqueue to periodically run
> > > mana_query_gf_stats every 2 seconds to get the latest
> > > info in eth_stats and define a driver capability flag
> > > to notify hardware of the periodic queries.
> > >
> > > To avoid repeated failures and log flooding, the workqueue
> > > is not rescheduled if mana_query_gf_stats fails.
> >
> > Can the failure root cause be a "transient" one? If so, this looks like
> > a dangerous strategy; in such a scenario, AFAICS, stats will be broken
> > until the device is removed and re-probed.
> >
> > /P
Hi Paolo,
I wanted to follow up on the clarification I shared regarding my
patch. Please let me know if there's anything further needed from my
side.
- Vennela
* Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors
2025-09-16 13:22 ` Paolo Abeni
2025-09-17 5:53 ` Erni Sri Satya Vennela
2025-09-25 4:24 ` Erni Sri Satya Vennela
@ 2025-10-23 5:31 ` Erni Sri Satya Vennela
2 siblings, 0 replies; 8+ messages in thread
From: Erni Sri Satya Vennela @ 2025-10-23 5:31 UTC (permalink / raw)
To: Paolo Abeni
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, longli, kotaranov, horms, shradhagupta, dipayanroy,
shirazsaleem, rosenp, linux-hyperv, netdev, linux-kernel,
linux-rdma
On Tue, Sep 16, 2025 at 03:22:54PM +0200, Paolo Abeni wrote:
> On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> > Report standard counter stats->rx_missed_errors
> > using hc_rx_discards_no_wqe from the hardware.
> >
> > Add a dedicated workqueue to periodically run
> > mana_query_gf_stats every 2 seconds to get the latest
> > info in eth_stats and define a driver capability flag
> > to notify hardware of the periodic queries.
> >
> > To avoid repeated failures and log flooding, the workqueue
> > is not rescheduled if mana_query_gf_stats fails.
>
> Can the failure root cause be a "transient" one? If so, this looks like
> a dangerous strategy; in such a scenario, AFAICS, stats will be broken
> until the device is removed and re-probed.
>
> /P
After internal discussion, we plan to address this issue as follows:
Stop rescheduling the work only when an HWC timeout is detected.
In that case:
1. Reset all stats to zero to avoid reporting stale values.
2. Introduce a driver flag that records the first occurrence of the HWC timeout.
3. Log a warn_once during subsequent calls to mana_get_stats64 to signal
the issue.
Thanks,
Vennela
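The approach outlined in this message could look roughly like the userspace sketch below. It is one reading of the three steps, not the eventual patch. All `demo_*` names are hypothetical. Representing an HWC timeout as `-ETIMEDOUT` is an assumption, and so is keeping the last good values and retrying on any other error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the hardware counters. */
struct demo_hc_stats {
	uint64_t rx_discards_no_wqe;
	uint64_t rx_bytes;
};

struct demo_ctx {
	struct demo_hc_stats hc_stats;
	bool hwc_timed_out;    /* step 2: set on the first HWC timeout */
	bool warned;           /* ensures the warning is printed only once */
	unsigned int warnings; /* counts emitted warnings, for illustration */
};

/*
 * Process the result of one stats query.
 * Returns true if the periodic work should be rescheduled.
 */
static bool demo_handle_query_result(struct demo_ctx *ctx, int err,
				     const struct demo_hc_stats *fresh)
{
	if (err == -ETIMEDOUT) {
		/* step 1: zero the counters so stale values are not reported */
		memset(&ctx->hc_stats, 0, sizeof(ctx->hc_stats));
		ctx->hwc_timed_out = true;
		return false; /* the only case that stops the refresh */
	}

	if (!err) {
		assert(fresh != NULL);
		ctx->hc_stats = *fresh;
	}

	/* Assumption: other errors keep the last values and retry later. */
	return true;
}

/* Stand-in for the rx_missed_errors part of a get_stats64 callback. */
static uint64_t demo_get_rx_missed_errors(struct demo_ctx *ctx)
{
	if (ctx->hwc_timed_out && !ctx->warned) {
		/* step 3: warn once that the counters are no longer updated */
		ctx->warned = true;
		ctx->warnings++;
		fprintf(stderr, "demo: HWC timeout, stats are not refreshed\n");
	}
	return ctx->hc_stats.rx_discards_no_wqe;
}
```

Compared with the posted patch, a non-timeout failure no longer stops the refresh, and a timeout leaves the reader with zeroed counters plus a single warning instead of silently stale data.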