* [PATCH net 0/2] gve: Stats reporting fixes
@ 2026-02-02 19:39 Harshitha Ramamurthy
2026-02-02 19:39 ` [PATCH net 1/2] gve: Fix stats report corruption on queue count change Harshitha Ramamurthy
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Harshitha Ramamurthy @ 2026-02-02 19:39 UTC
To: netdev
Cc: joshwash, hramamurthy, andrew+netdev, davem, edumazet, kuba,
pabeni, willemb, ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun,
awogbemila, maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, Max Yuan
From: Max Yuan <maxyuan@google.com>
This series addresses two issues related to statistics in the gve driver.
The first patch fixes a memory corruption issue that occurs when resizing
the stats region during queue count changes. By allocating the maximum
possible size upfront and aligning offset calculations with the NIC,
we ensure stability and accuracy across reconfigurations.
The second patch fixes the 'rx_dropped' counter by removing allocation
failures and incorporating XDP transmit and redirect errors to provide
a more accurate representation of dropped packets.
Debarghya Kundu (1):
gve: Fix stats report corruption on queue count change
Max Yuan (1):
gve: Correct ethtool rx_dropped calculation
drivers/net/ethernet/google/gve/gve_ethtool.c | 77 ++++++++++++++++++++++++++--------------
drivers/net/ethernet/google/gve/gve_main.c | 4 +--
2 files changed, 53 insertions(+), 28 deletions(-)
--
2.53.0.rc1.225.gd81095ad13-goog
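
Editorial sketch (not part of the series): the cover letter's "allocating the maximum possible size upfront" can be illustrated with a small sizing helper. The struct layout, the per-queue counts in the enum, and both function names are hypothetical placeholders chosen for illustration. The driver itself uses its own GVE_*_STATS_REPORT_NUM and NIC_*_STATS_REPORT_NUM constants together with struct_size(), as shown in patch 1.

```c
#include <stddef.h>
#include <stdint.h>

/* One report entry; this field layout is an assumption for illustration. */
struct stats_entry {
	uint32_t stat_name;
	uint32_t queue_id;
	uint64_t value;
};

/* Placeholder per-queue entry counts, NOT the driver's real constants. */
enum {
	DRV_TX_PER_QUEUE = 5,
	DRV_RX_PER_QUEUE = 2,
	NIC_TX_PER_QUEUE = 1,
	NIC_RX_PER_QUEUE = 4,
};

/* Number of report entries needed for a given queue configuration. */
size_t stats_entries_for(size_t tx_queues, size_t rx_queues)
{
	return (DRV_TX_PER_QUEUE + NIC_TX_PER_QUEUE) * tx_queues +
	       (DRV_RX_PER_QUEUE + NIC_RX_PER_QUEUE) * rx_queues;
}

/* Bytes for the entry array when sized once for the maximum queue counts
 * (report header excluded). Sizing for the maximum means a later change of
 * the active queue count never needs a larger region.
 */
size_t stats_region_bytes(size_t max_tx_queues, size_t max_rx_queues)
{
	return stats_entries_for(max_tx_queues, max_rx_queues) *
	       sizeof(struct stats_entry);
}
```

With these placeholder counts, a region sized for 16 TX and 16 RX queues covers any smaller active configuration, so the region length stays constant across queue count changes.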
* [PATCH net 1/2] gve: Fix stats report corruption on queue count change
2026-02-02 19:39 [PATCH net 0/2] gve: Stats reporting fixes Harshitha Ramamurthy
@ 2026-02-02 19:39 ` Harshitha Ramamurthy
2026-02-04 1:41 ` Jacob Keller
2026-02-02 19:39 ` [PATCH net 2/2] gve: Correct ethtool rx_dropped calculation Harshitha Ramamurthy
2026-02-04 3:40 ` [PATCH net 0/2] gve: Stats reporting fixes patchwork-bot+netdevbpf
2 siblings, 1 reply; 6+ messages in thread
From: Harshitha Ramamurthy @ 2026-02-02 19:39 UTC
To: netdev
Cc: joshwash, hramamurthy, andrew+netdev, davem, edumazet, kuba,
pabeni, willemb, ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun,
awogbemila, maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, Debarghya Kundu, stable
From: Debarghya Kundu <debarghyak@google.com>
The driver and the NIC share a region in memory for stats reporting.
The NIC calculates its offset into this region based on the total size
of the stats region and the size of the NIC's stats.
When the number of queues is changed, the driver's stats region is
resized. If the queue count is increased, the NIC can write past
the end of the allocated stats region, causing memory corruption.
If the queue count is decreased, there is a gap between the driver
and NIC stats, leading to incorrect stats reporting.
This change fixes the issue by allocating the stats region at its maximum
size and changing the driver's offset calculation for NIC stats to match
the NIC's own calculation.
Cc: stable@vger.kernel.org
Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.")
Signed-off-by: Debarghya Kundu <debarghyak@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
---
drivers/net/ethernet/google/gve/gve_ethtool.c | 54 +++++++++++++++++++++++++---------------
drivers/net/ethernet/google/gve/gve_main.c | 4 +--
2 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
index 52500ae8..f7864ae7 100644
--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
+++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
@@ -156,7 +156,8 @@ gve_get_ethtool_stats(struct net_device *netdev,
u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_unsplit_pkt,
rx_pkts, rx_hsplit_pkt, rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes,
tx_dropped;
- int stats_idx, base_stats_idx, max_stats_idx;
+ int rx_base_stats_idx, max_rx_stats_idx, max_tx_stats_idx;
+ int stats_idx, stats_region_len, nic_stats_len;
struct stats *report_stats;
int *rx_qid_to_stats_idx;
int *tx_qid_to_stats_idx;
@@ -265,20 +266,38 @@ gve_get_ethtool_stats(struct net_device *netdev,
data[i++] = priv->stats_report_trigger_cnt;
i = GVE_MAIN_STATS_LEN;
- /* For rx cross-reporting stats, start from nic rx stats in report */
- base_stats_idx = GVE_TX_STATS_REPORT_NUM * num_tx_queues +
- GVE_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues;
- /* The boundary between driver stats and NIC stats shifts if there are
- * stopped queues.
- */
- base_stats_idx += NIC_RX_STATS_REPORT_NUM * num_stopped_rxqs +
- NIC_TX_STATS_REPORT_NUM * num_stopped_txqs;
- max_stats_idx = NIC_RX_STATS_REPORT_NUM *
- (priv->rx_cfg.num_queues - num_stopped_rxqs) +
- base_stats_idx;
+ rx_base_stats_idx = 0;
+ max_rx_stats_idx = 0;
+ max_tx_stats_idx = 0;
+ stats_region_len = priv->stats_report_len -
+ sizeof(struct gve_stats_report);
+ nic_stats_len = (NIC_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues +
+ NIC_TX_STATS_REPORT_NUM * num_tx_queues) * sizeof(struct stats);
+ if (unlikely((stats_region_len -
+ nic_stats_len) % sizeof(struct stats))) {
+ net_err_ratelimited("Starting index of NIC stats should be multiple of stats size");
+ } else {
+ /* For rx cross-reporting stats,
+ * start from nic rx stats in report
+ */
+ rx_base_stats_idx = (stats_region_len - nic_stats_len) /
+ sizeof(struct stats);
+ /* The boundary between driver stats and NIC stats
+ * shifts if there are stopped queues
+ */
+ rx_base_stats_idx += NIC_RX_STATS_REPORT_NUM *
+ num_stopped_rxqs + NIC_TX_STATS_REPORT_NUM *
+ num_stopped_txqs;
+ max_rx_stats_idx = NIC_RX_STATS_REPORT_NUM *
+ (priv->rx_cfg.num_queues - num_stopped_rxqs) +
+ rx_base_stats_idx;
+ max_tx_stats_idx = NIC_TX_STATS_REPORT_NUM *
+ (num_tx_queues - num_stopped_txqs) +
+ max_rx_stats_idx;
+ }
/* Preprocess the stats report for rx, map queue id to start index */
skip_nic_stats = false;
- for (stats_idx = base_stats_idx; stats_idx < max_stats_idx;
+ for (stats_idx = rx_base_stats_idx; stats_idx < max_rx_stats_idx;
stats_idx += NIC_RX_STATS_REPORT_NUM) {
u32 stat_name = be32_to_cpu(report_stats[stats_idx].stat_name);
u32 queue_id = be32_to_cpu(report_stats[stats_idx].queue_id);
@@ -354,14 +373,9 @@ gve_get_ethtool_stats(struct net_device *netdev,
i += priv->rx_cfg.num_queues * NUM_GVE_RX_CNTS;
}
- /* For tx cross-reporting stats, start from nic tx stats in report */
- base_stats_idx = max_stats_idx;
- max_stats_idx = NIC_TX_STATS_REPORT_NUM *
- (num_tx_queues - num_stopped_txqs) +
- max_stats_idx;
- /* Preprocess the stats report for tx, map queue id to start index */
skip_nic_stats = false;
- for (stats_idx = base_stats_idx; stats_idx < max_stats_idx;
+ /* NIC TX stats start right after NIC RX stats */
+ for (stats_idx = max_rx_stats_idx; stats_idx < max_tx_stats_idx;
stats_idx += NIC_TX_STATS_REPORT_NUM) {
u32 stat_name = be32_to_cpu(report_stats[stats_idx].stat_name);
u32 queue_id = be32_to_cpu(report_stats[stats_idx].queue_id);
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index a7a088a7..5a747603 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -283,9 +283,9 @@ static int gve_alloc_stats_report(struct gve_priv *priv)
int tx_stats_num, rx_stats_num;
tx_stats_num = (GVE_TX_STATS_REPORT_NUM + NIC_TX_STATS_REPORT_NUM) *
- gve_num_tx_queues(priv);
+ priv->tx_cfg.max_queues;
rx_stats_num = (GVE_RX_STATS_REPORT_NUM + NIC_RX_STATS_REPORT_NUM) *
- priv->rx_cfg.num_queues;
+ priv->rx_cfg.max_queues;
priv->stats_report_len = struct_size(priv->stats_report, stats,
size_add(tx_stats_num, rx_stats_num));
priv->stats_report =
--
2.53.0.rc1.225.gd81095ad13-goog
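
Editorial sketch (not part of the patch): the index arithmetic added to gve_get_ethtool_stats() above can be read in isolation as the helper below. The struct name, the function name, and the parameter list are hypothetical. The explicit check that the region is at least as large as the NIC stats is an extra guard added here as an assumption and does not appear in the patch.

```c
#include <stddef.h>

/* Hypothetical result type: half-open index ranges into the entry array.
 * NIC RX stats occupy [rx_base, rx_end), NIC TX stats [rx_end, tx_end).
 */
struct nic_stats_window {
	int ok;       /* 0 if the region length is inconsistent */
	int rx_base;
	int rx_end;
	int tx_end;
};

struct nic_stats_window
nic_stats_window(size_t region_len, size_t entry_size,
		 int nic_rx_per_q, int nic_tx_per_q,
		 int rx_queues, int tx_queues,
		 int stopped_rxqs, int stopped_txqs)
{
	struct nic_stats_window w = { 0, 0, 0, 0 };
	size_t nic_stats_len = (size_t)(nic_rx_per_q * rx_queues +
					nic_tx_per_q * tx_queues) * entry_size;

	/* Extra guard (assumption, not in the patch): avoid underflow. */
	if (entry_size == 0 || region_len < nic_stats_len)
		return w;
	/* Mirrors the patch: NIC stats must start on an entry boundary. */
	if ((region_len - nic_stats_len) % entry_size)
		return w;

	/* NIC stats sit at the tail of the region. */
	w.rx_base = (int)((region_len - nic_stats_len) / entry_size);
	/* Stopped queues shift the start of the NIC stats. */
	w.rx_base += nic_rx_per_q * stopped_rxqs + nic_tx_per_q * stopped_txqs;
	w.rx_end = nic_rx_per_q * (rx_queues - stopped_rxqs) + w.rx_base;
	w.tx_end = nic_tx_per_q * (tx_queues - stopped_txqs) + w.rx_end;
	w.ok = 1;
	return w;
}
```

For example, with a 100-entry region of 16-byte entries, 4 NIC RX and 1 NIC TX entries per queue, and 2 RX plus 2 TX queues, the NIC RX stats begin at index 90 and the NIC TX stats end at index 100, which is the end of the region.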
* [PATCH net 2/2] gve: Correct ethtool rx_dropped calculation
2026-02-02 19:39 [PATCH net 0/2] gve: Stats reporting fixes Harshitha Ramamurthy
2026-02-02 19:39 ` [PATCH net 1/2] gve: Fix stats report corruption on queue count change Harshitha Ramamurthy
@ 2026-02-02 19:39 ` Harshitha Ramamurthy
2026-02-04 1:42 ` Jacob Keller
2026-02-04 3:40 ` [PATCH net 0/2] gve: Stats reporting fixes patchwork-bot+netdevbpf
2 siblings, 1 reply; 6+ messages in thread
From: Harshitha Ramamurthy @ 2026-02-02 19:39 UTC
To: netdev
Cc: joshwash, hramamurthy, andrew+netdev, davem, edumazet, kuba,
pabeni, willemb, ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun,
awogbemila, maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, Max Yuan, stable
From: Max Yuan <maxyuan@google.com>
The gve driver's "rx_dropped" statistic, exposed via `ethtool -S`,
incorrectly includes `rx_buf_alloc_fail` counts. These failures
represent an inability to allocate receive buffers, not true packet
drops where a received packet is discarded. This misrepresentation can
lead to inaccurate diagnostics.
This patch rectifies the ethtool "rx_dropped" calculation. It removes
`rx_buf_alloc_fail` from the total and adds `xdp_tx_errors` and
`xdp_redirect_errors`, which represent legitimate packet drops within
the XDP path.
Cc: stable@vger.kernel.org
Fixes: 433e274b8f7b ("gve: Add stats for gve.")
Signed-off-by: Max Yuan <maxyuan@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Matt Olson <maolson@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
---
drivers/net/ethernet/google/gve/gve_ethtool.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
index f7864ae7..9fd954d1 100644
--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
+++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
@@ -152,10 +152,11 @@ gve_get_ethtool_stats(struct net_device *netdev,
u64 tmp_rx_pkts, tmp_rx_hsplit_pkt, tmp_rx_bytes, tmp_rx_hsplit_bytes,
tmp_rx_skb_alloc_fail, tmp_rx_buf_alloc_fail,
tmp_rx_desc_err_dropped_pkt, tmp_rx_hsplit_unsplit_pkt,
- tmp_tx_pkts, tmp_tx_bytes;
+ tmp_tx_pkts, tmp_tx_bytes,
+ tmp_xdp_tx_errors, tmp_xdp_redirect_errors;
u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_unsplit_pkt,
rx_pkts, rx_hsplit_pkt, rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes,
- tx_dropped;
+ tx_dropped, xdp_tx_errors, xdp_redirect_errors;
int rx_base_stats_idx, max_rx_stats_idx, max_tx_stats_idx;
int stats_idx, stats_region_len, nic_stats_len;
struct stats *report_stats;
@@ -199,6 +200,7 @@ gve_get_ethtool_stats(struct net_device *netdev,
for (rx_pkts = 0, rx_bytes = 0, rx_hsplit_pkt = 0,
rx_skb_alloc_fail = 0, rx_buf_alloc_fail = 0,
rx_desc_err_dropped_pkt = 0, rx_hsplit_unsplit_pkt = 0,
+ xdp_tx_errors = 0, xdp_redirect_errors = 0,
ring = 0;
ring < priv->rx_cfg.num_queues; ring++) {
if (priv->rx) {
@@ -216,6 +218,9 @@ gve_get_ethtool_stats(struct net_device *netdev,
rx->rx_desc_err_dropped_pkt;
tmp_rx_hsplit_unsplit_pkt =
rx->rx_hsplit_unsplit_pkt;
+ tmp_xdp_tx_errors = rx->xdp_tx_errors;
+ tmp_xdp_redirect_errors =
+ rx->xdp_redirect_errors;
} while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
rx_pkts += tmp_rx_pkts;
@@ -225,6 +230,8 @@ gve_get_ethtool_stats(struct net_device *netdev,
rx_buf_alloc_fail += tmp_rx_buf_alloc_fail;
rx_desc_err_dropped_pkt += tmp_rx_desc_err_dropped_pkt;
rx_hsplit_unsplit_pkt += tmp_rx_hsplit_unsplit_pkt;
+ xdp_tx_errors += tmp_xdp_tx_errors;
+ xdp_redirect_errors += tmp_xdp_redirect_errors;
}
}
for (tx_pkts = 0, tx_bytes = 0, tx_dropped = 0, ring = 0;
@@ -250,8 +257,8 @@ gve_get_ethtool_stats(struct net_device *netdev,
data[i++] = rx_bytes;
data[i++] = tx_bytes;
/* total rx dropped packets */
- data[i++] = rx_skb_alloc_fail + rx_buf_alloc_fail +
- rx_desc_err_dropped_pkt;
+ data[i++] = rx_skb_alloc_fail + rx_desc_err_dropped_pkt +
+ xdp_tx_errors + xdp_redirect_errors;
data[i++] = tx_dropped;
data[i++] = priv->tx_timeo_cnt;
data[i++] = rx_skb_alloc_fail;
@@ -330,6 +337,9 @@ gve_get_ethtool_stats(struct net_device *netdev,
tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail;
tmp_rx_desc_err_dropped_pkt =
rx->rx_desc_err_dropped_pkt;
+ tmp_xdp_tx_errors = rx->xdp_tx_errors;
+ tmp_xdp_redirect_errors =
+ rx->xdp_redirect_errors;
} while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
data[i++] = tmp_rx_bytes;
@@ -340,8 +350,9 @@ gve_get_ethtool_stats(struct net_device *netdev,
data[i++] = rx->rx_frag_alloc_cnt;
/* rx dropped packets */
data[i++] = tmp_rx_skb_alloc_fail +
- tmp_rx_buf_alloc_fail +
- tmp_rx_desc_err_dropped_pkt;
+ tmp_rx_desc_err_dropped_pkt +
+ tmp_xdp_tx_errors +
+ tmp_xdp_redirect_errors;
data[i++] = rx->rx_copybreak_pkt;
data[i++] = rx->rx_copied_pkt;
/* stats from NIC */
--
2.53.0.rc1.225.gd81095ad13-goog
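
Editorial sketch (not part of the patch): the change to the "rx_dropped" sum can be compared side by side with the helpers below. The struct and function names are hypothetical, and only the field names are taken from the diff. The sketch also omits the u64_stats_fetch_begin()/u64_stats_fetch_retry() loop the driver uses to read the counters consistently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the per-ring counters involved in the sum. */
struct rx_ring_counters {
	uint64_t rx_skb_alloc_fail;
	uint64_t rx_buf_alloc_fail;
	uint64_t rx_desc_err_dropped_pkt;
	uint64_t xdp_tx_errors;
	uint64_t xdp_redirect_errors;
};

/* Sum before the patch: buffer allocation failures counted as drops. */
uint64_t rx_dropped_old(const struct rx_ring_counters *c)
{
	return c->rx_skb_alloc_fail + c->rx_buf_alloc_fail +
	       c->rx_desc_err_dropped_pkt;
}

/* Sum after the patch: buffer allocation failures removed, XDP transmit
 * and redirect errors added.
 */
uint64_t rx_dropped_new(const struct rx_ring_counters *c)
{
	return c->rx_skb_alloc_fail + c->rx_desc_err_dropped_pkt +
	       c->xdp_tx_errors + c->xdp_redirect_errors;
}

/* Device-wide total across n rings, using the post-patch sum. */
uint64_t rx_dropped_total(const struct rx_ring_counters *rings, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += rx_dropped_new(&rings[i]);
	return total;
}
```

A ring with 1 skb allocation failure, 10 buffer allocation failures, 2 descriptor-error drops, 3 XDP TX errors, and 4 XDP redirect errors reports 13 under the old sum and 10 under the new one.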
* Re: [PATCH net 1/2] gve: Fix stats report corruption on queue count change
2026-02-02 19:39 ` [PATCH net 1/2] gve: Fix stats report corruption on queue count change Harshitha Ramamurthy
@ 2026-02-04 1:41 ` Jacob Keller
0 siblings, 0 replies; 6+ messages in thread
From: Jacob Keller @ 2026-02-04 1:41 UTC
To: Harshitha Ramamurthy, netdev
Cc: joshwash, andrew+netdev, davem, edumazet, kuba, pabeni, willemb,
ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun, awogbemila,
maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, Debarghya Kundu, stable
On 2/2/2026 11:39 AM, Harshitha Ramamurthy wrote:
> From: Debarghya Kundu <debarghyak@google.com>
>
> The driver and the NIC share a region in memory for stats reporting.
> The NIC calculates its offset into this region based on the total size
> of the stats region and the size of the NIC's stats.
>
> When the number of queues is changed, the driver's stats region is
> resized. If the queue count is increased, the NIC can write past
> the end of the allocated stats region, causing memory corruption.
> If the queue count is decreased, there is a gap between the driver
> and NIC stats, leading to incorrect stats reporting.
>
> This change fixes the issue by allocating the stats region at its maximum
> size and changing the driver's offset calculation for NIC stats to match
> the NIC's own calculation.
>
> Cc: stable@vger.kernel.org
> Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.")
> Signed-off-by: Debarghya Kundu <debarghyak@google.com>
> Reviewed-by: Joshua Washington <joshwash@google.com>
> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [PATCH net 2/2] gve: Correct ethtool rx_dropped calculation
2026-02-02 19:39 ` [PATCH net 2/2] gve: Correct ethtool rx_dropped calculation Harshitha Ramamurthy
@ 2026-02-04 1:42 ` Jacob Keller
0 siblings, 0 replies; 6+ messages in thread
From: Jacob Keller @ 2026-02-04 1:42 UTC
To: Harshitha Ramamurthy, netdev
Cc: joshwash, andrew+netdev, davem, edumazet, kuba, pabeni, willemb,
ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun, awogbemila,
maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, Max Yuan, stable
On 2/2/2026 11:39 AM, Harshitha Ramamurthy wrote:
> From: Max Yuan <maxyuan@google.com>
>
> The gve driver's "rx_dropped" statistic, exposed via `ethtool -S`,
> incorrectly includes `rx_buf_alloc_fail` counts. These failures
> represent an inability to allocate receive buffers, not true packet
> drops where a received packet is discarded. This misrepresentation can
> lead to inaccurate diagnostics.
>
> This patch rectifies the ethtool "rx_dropped" calculation. It removes
> `rx_buf_alloc_fail` from the total and adds `xdp_tx_errors` and
> `xdp_redirect_errors`, which represent legitimate packet drops within
> the XDP path.
>
> Cc: stable@vger.kernel.org
> Fixes: 433e274b8f7b ("gve: Add stats for gve.")
> Signed-off-by: Max Yuan <maxyuan@google.com>
> Reviewed-by: Jordan Rhee <jordanrhee@google.com>
> Reviewed-by: Joshua Washington <joshwash@google.com>
> Reviewed-by: Matt Olson <maolson@google.com>
> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [PATCH net 0/2] gve: Stats reporting fixes
2026-02-02 19:39 [PATCH net 0/2] gve: Stats reporting fixes Harshitha Ramamurthy
2026-02-02 19:39 ` [PATCH net 1/2] gve: Fix stats report corruption on queue count change Harshitha Ramamurthy
2026-02-02 19:39 ` [PATCH net 2/2] gve: Correct ethtool rx_dropped calculation Harshitha Ramamurthy
@ 2026-02-04 3:40 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-02-04 3:40 UTC
To: Harshitha Ramamurthy
Cc: netdev, joshwash, andrew+netdev, davem, edumazet, kuba, pabeni,
willemb, ziweixiao, jordanrhee, nktgrg, kuozhao, yangchun,
awogbemila, maolson, ast, daniel, hawk, john.fastabend, sdf, bpf,
linux-kernel, stable, maxyuan
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 2 Feb 2026 19:39:23 +0000 you wrote:
> From: Max Yuan <maxyuan@google.com>
>
> This series addresses two issues related to statistics in the gve driver.
>
> The first patch fixes a memory corruption issue that occurs when resizing
> the stats region during queue count changes. By allocating the maximum
> possible size upfront and aligning offset calculations with the NIC,
> we ensure stability and accuracy across reconfigurations.
>
> [...]
Here is the summary with links:
- [net,1/2] gve: Fix stats report corruption on queue count change
https://git.kernel.org/netdev/net/c/7b9ebcce0296
- [net,2/2] gve: Correct ethtool rx_dropped calculation
https://git.kernel.org/netdev/net/c/c7db85d579a1
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html