* [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting
2025-07-09 18:40 [PATCH v3 net-next 0/2] ibmvnic: Fix/Improve queue stats Mingming Cao
@ 2025-07-09 18:40 ` Mingming Cao
0 siblings, 0 replies; 7+ messages in thread
From: Mingming Cao @ 2025-07-09 18:40 UTC (permalink / raw)
To: netdev; +Cc: kuba, horms, bjking1, haren, ricklind, davemarq, mmc
VNIC testing on multi-core Power systems showed SAR stats drift
and packet rate inconsistencies under load.
Implement ndo_get_stats64 to safely aggregate the queue-level
atomic64 counters into rtnl_link_stats64, as consumed by tools
such as 'ip -s', 'ifconfig', and 'sar'. This switches SAR reporting
to the standard kernel interface for retrieving netdev stats.
This removes redundant per-adapter stat updates, reduces overhead,
eliminates cacheline bouncing from hot path updates, and improves
the accuracy of reported packet rates.
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Brian King <bjking1@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 79fdba4293a4..6dd7809b1bc7 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2312,8 +2312,6 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
tx_pool->num_buffers - 1 :
tx_pool->consumer_index - 1;
tx_buff = &tx_pool->tx_buff[index];
- adapter->netdev->stats.tx_packets--;
- adapter->netdev->stats.tx_bytes -= tx_buff->skb->len;
atomic64_dec(&adapter->tx_stats_buffers[queue_num].batched_packets);
if (atomic64_sub_return(tx_buff->skb->len,
&adapter->tx_stats_buffers[queue_num].bytes) < 0) {
@@ -2325,9 +2323,9 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
atomic64_read(&adapter->tx_stats_buffers[queue_num].bytes),
tx_buff->skb->len);
}
+ atomic64_inc(&adapter->tx_stats_buffers[queue_num].dropped_packets);
dev_kfree_skb_any(tx_buff->skb);
tx_buff->skb = NULL;
- adapter->netdev->stats.tx_dropped++;
}
ind_bufp->index = 0;
@@ -2655,9 +2653,6 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
}
out:
rcu_read_unlock();
- netdev->stats.tx_dropped += tx_dropped;
- netdev->stats.tx_bytes += tx_bytes;
- netdev->stats.tx_packets += tx_bpackets + tx_dpackets;
adapter->tx_send_failed += tx_send_failed;
adapter->tx_map_failed += tx_map_failed;
atomic64_add(tx_bpackets, &adapter->tx_stats_buffers[queue_num].batched_packets);
@@ -3460,6 +3455,25 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
return -ret;
}
+static void ibmvnic_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct ibmvnic_adapter *adapter = netdev_priv(netdev);
+ int i;
+
+ for (i = 0; i < adapter->req_rx_queues; i++) {
+ stats->rx_packets += atomic64_read(&adapter->rx_stats_buffers[i].packets);
+ stats->rx_bytes += atomic64_read(&adapter->rx_stats_buffers[i].bytes);
+ }
+
+ for (i = 0; i < adapter->req_tx_queues; i++) {
+ stats->tx_packets += atomic64_read(&adapter->tx_stats_buffers[i].batched_packets);
+ stats->tx_packets += atomic64_read(&adapter->tx_stats_buffers[i].direct_packets);
+ stats->tx_bytes += atomic64_read(&adapter->tx_stats_buffers[i].bytes);
+ stats->tx_dropped += atomic64_read(&adapter->tx_stats_buffers[i].dropped_packets);
+ }
+}
+
static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue)
{
struct ibmvnic_adapter *adapter = netdev_priv(dev);
@@ -3575,8 +3589,6 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
length = skb->len;
napi_gro_receive(napi, skb); /* send it up */
- netdev->stats.rx_packets++;
- netdev->stats.rx_bytes += length;
atomic64_inc(&adapter->rx_stats_buffers[scrq_num].packets);
atomic64_add(length, &adapter->rx_stats_buffers[scrq_num].bytes);
frames_processed++;
@@ -3686,6 +3698,7 @@ static const struct net_device_ops ibmvnic_netdev_ops = {
.ndo_set_rx_mode = ibmvnic_set_multi,
.ndo_set_mac_address = ibmvnic_set_mac,
.ndo_validate_addr = eth_validate_addr,
+ .ndo_get_stats64 = ibmvnic_get_stats64,
.ndo_tx_timeout = ibmvnic_tx_timeout,
.ndo_change_mtu = ibmvnic_change_mtu,
.ndo_features_check = ibmvnic_features_check,
--
2.39.3 (Apple Git-146)
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 net-next resubmit 0/2] ibmvnic: Fix/Improve queue stats
@ 2025-07-14 17:35 Mingming Cao
2025-07-14 17:35 ` [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for " Mingming Cao
2025-07-14 17:35 ` [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting Mingming Cao
0 siblings, 2 replies; 7+ messages in thread
From: Mingming Cao @ 2025-07-14 17:35 UTC (permalink / raw)
To: netdev; +Cc: kuba, horms, bjking1, haren, ricklind, davemarq, mmc
This series is resubmitted now that the bug fix for the hardcoded
`NUM_RX_STATS` and `NUM_TX_STATS` has been included in the `net-next` tree.
This patch series introduces two updates to the ibmvnic driver,
focusing on improving the accuracy and safety of queue-level statistics.
Patch 1: Convert per-queue RX/TX stats to atomic64_t to ensure thread-safe
updates in concurrent environments. This establishes a safe and consistent
foundation for subsequent statistics-related fixes.
Patch 2: Fix inaccurate sar statistics by implementing ndo_get_stats64() and
removing the outdated manual updates to netdev->stats. This patch also improves
packet rates and bandwidth under large workloads.
--------------------------------------
Changes since v2:
link to v2: https://www.spinics.net/lists/netdev/msg1104665.html
- Dropped Patch 2 from v2, which fixed the hardcoded `NUM_RX_STATS` and `NUM_TX_STATS`;
as suggested by Simon (https://www.spinics.net/lists/netdev/msg1106669.html), that fix
is already included in net-next.
- Updated Patch 1 in v2 to rebase on top of the above fix in `net`.
- Patch 3 in v2 (now Patch 2) is unchanged.
- Dropped Patch 4 from v2, which raised the default number of indirect sub-CRQ entries
and introduced a module parameter for backward compatibility. Based on review feedback,
plan to explore alternative ways to handle older systems without adding a module parameter.
---
Changes since v1:
Link to v1: https://www.spinics.net/lists/netdev/msg1103893.html
Patch 1 (was Patch 2 in v1): Introduces atomic64_t stats early to establish
a consistent foundation. Replaces atomic64_sub() with atomic64_sub_return()
and adds a warning on underflow. This change now comes first so that the
stat-size derivation depends on it in the correct order.
Patch 2 (was Patch 1 in v1): Commit message rewritten to clearly highlight the
mismatch bug caused by static stat counts and rebased on top of the atomic64
conversion to avoid broken logic during intermediate states.
Patch 3: No functional changes
Patch 4: Commit message clarified to explain the use of 128 indirect entries
on POWER9+ systems and the intent of the module parameter as a transitional
fallback for legacy or constrained systems.
Mingming Cao (2):
ibmvnic: Use atomic64_t for queue stats
ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting
drivers/net/ethernet/ibm/ibmvnic.c | 71 +++++++++++++++++++-----------
drivers/net/ethernet/ibm/ibmvnic.h | 18 ++++----
2 files changed, 55 insertions(+), 34 deletions(-)
--
2.39.3 (Apple Git-146)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for queue stats
2025-07-14 17:35 [PATCH v3 net-next resubmit 0/2] ibmvnic: Fix/Improve queue stats Mingming Cao
@ 2025-07-14 17:35 ` Mingming Cao
2025-07-15 13:21 ` Simon Horman
2025-07-16 0:14 ` Jakub Kicinski
2025-07-14 17:35 ` [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting Mingming Cao
1 sibling, 2 replies; 7+ messages in thread
From: Mingming Cao @ 2025-07-14 17:35 UTC (permalink / raw)
To: netdev; +Cc: kuba, horms, bjking1, haren, ricklind, davemarq, mmc
This patch improves the ibmvnic driver by changing the per-queue
packet and byte counters to atomic64_t types. This makes updates
thread-safe and easier to manage across multiple cores.
It also updates the ethtool statistics to safely read these new counters.
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Brian King <bjking1@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
Reviewed-by: Haren Myneni <haren@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 42 ++++++++++++++++++------------
drivers/net/ethernet/ibm/ibmvnic.h | 18 ++++++-------
2 files changed, 34 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 92647e137cf8..79fdba4293a4 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2314,9 +2314,17 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
tx_buff = &tx_pool->tx_buff[index];
adapter->netdev->stats.tx_packets--;
adapter->netdev->stats.tx_bytes -= tx_buff->skb->len;
- adapter->tx_stats_buffers[queue_num].batched_packets--;
- adapter->tx_stats_buffers[queue_num].bytes -=
- tx_buff->skb->len;
+ atomic64_dec(&adapter->tx_stats_buffers[queue_num].batched_packets);
+ if (atomic64_sub_return(tx_buff->skb->len,
+ &adapter->tx_stats_buffers[queue_num].bytes) < 0) {
+ atomic64_set(&adapter->tx_stats_buffers[queue_num].bytes, 0);
+ netdev_warn(adapter->netdev,
+ "TX stats underflow on queue %u: bytes (%lld) < skb->len (%u),\n"
+ "clamping to 0\n",
+ queue_num,
+ atomic64_read(&adapter->tx_stats_buffers[queue_num].bytes),
+ tx_buff->skb->len);
+ }
dev_kfree_skb_any(tx_buff->skb);
tx_buff->skb = NULL;
adapter->netdev->stats.tx_dropped++;
@@ -2652,10 +2660,10 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
netdev->stats.tx_packets += tx_bpackets + tx_dpackets;
adapter->tx_send_failed += tx_send_failed;
adapter->tx_map_failed += tx_map_failed;
- adapter->tx_stats_buffers[queue_num].batched_packets += tx_bpackets;
- adapter->tx_stats_buffers[queue_num].direct_packets += tx_dpackets;
- adapter->tx_stats_buffers[queue_num].bytes += tx_bytes;
- adapter->tx_stats_buffers[queue_num].dropped_packets += tx_dropped;
+ atomic64_add(tx_bpackets, &adapter->tx_stats_buffers[queue_num].batched_packets);
+ atomic64_add(tx_dpackets, &adapter->tx_stats_buffers[queue_num].direct_packets);
+ atomic64_add(tx_bytes, &adapter->tx_stats_buffers[queue_num].bytes);
+ atomic64_add(tx_dropped, &adapter->tx_stats_buffers[queue_num].dropped_packets);
return ret;
}
@@ -3569,8 +3577,8 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
napi_gro_receive(napi, skb); /* send it up */
netdev->stats.rx_packets++;
netdev->stats.rx_bytes += length;
- adapter->rx_stats_buffers[scrq_num].packets++;
- adapter->rx_stats_buffers[scrq_num].bytes += length;
+ atomic64_inc(&adapter->rx_stats_buffers[scrq_num].packets);
+ atomic64_add(length, &adapter->rx_stats_buffers[scrq_num].bytes);
frames_processed++;
}
@@ -3874,22 +3882,22 @@ static void ibmvnic_get_ethtool_stats(struct net_device *dev,
(adapter, ibmvnic_stats[i].offset));
for (j = 0; j < adapter->req_tx_queues; j++) {
- data[i] = adapter->tx_stats_buffers[j].batched_packets;
+ data[i] = atomic64_read(&adapter->tx_stats_buffers[j].batched_packets);
i++;
- data[i] = adapter->tx_stats_buffers[j].direct_packets;
+ data[i] = atomic64_read(&adapter->tx_stats_buffers[j].direct_packets);
i++;
- data[i] = adapter->tx_stats_buffers[j].bytes;
+ data[i] = atomic64_read(&adapter->tx_stats_buffers[j].bytes);
i++;
- data[i] = adapter->tx_stats_buffers[j].dropped_packets;
+ data[i] = atomic64_read(&adapter->tx_stats_buffers[j].dropped_packets);
i++;
}
for (j = 0; j < adapter->req_rx_queues; j++) {
- data[i] = adapter->rx_stats_buffers[j].packets;
+ data[i] = atomic64_read(&adapter->rx_stats_buffers[j].packets);
i++;
- data[i] = adapter->rx_stats_buffers[j].bytes;
+ data[i] = atomic64_read(&adapter->rx_stats_buffers[j].bytes);
i++;
- data[i] = adapter->rx_stats_buffers[j].interrupts;
+ data[i] = atomic64_read(&adapter->rx_stats_buffers[j].interrupts);
i++;
}
}
@@ -4307,7 +4315,7 @@ static irqreturn_t ibmvnic_interrupt_rx(int irq, void *instance)
if (unlikely(adapter->state != VNIC_OPEN))
return IRQ_NONE;
- adapter->rx_stats_buffers[scrq->scrq_num].interrupts++;
+ atomic64_inc(&adapter->rx_stats_buffers[scrq->scrq_num].interrupts);
if (napi_schedule_prep(&adapter->napi[scrq->scrq_num])) {
disable_scrq_irq(adapter, scrq);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 246ddce753f9..e574eed97cc0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -212,23 +212,23 @@ struct ibmvnic_statistics {
} __packed __aligned(8);
struct ibmvnic_tx_queue_stats {
- u64 batched_packets;
- u64 direct_packets;
- u64 bytes;
- u64 dropped_packets;
+ atomic64_t batched_packets;
+ atomic64_t direct_packets;
+ atomic64_t bytes;
+ atomic64_t dropped_packets;
};
#define NUM_TX_STATS \
- (sizeof(struct ibmvnic_tx_queue_stats) / sizeof(u64))
+ (sizeof(struct ibmvnic_tx_queue_stats) / sizeof(atomic64_t))
struct ibmvnic_rx_queue_stats {
- u64 packets;
- u64 bytes;
- u64 interrupts;
+ atomic64_t packets;
+ atomic64_t bytes;
+ atomic64_t interrupts;
};
#define NUM_RX_STATS \
- (sizeof(struct ibmvnic_rx_queue_stats) / sizeof(u64))
+ (sizeof(struct ibmvnic_rx_queue_stats) / sizeof(atomic64_t))
struct ibmvnic_acl_buffer {
__be32 len;
--
2.39.3 (Apple Git-146)
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting
2025-07-14 17:35 [PATCH v3 net-next resubmit 0/2] ibmvnic: Fix/Improve queue stats Mingming Cao
2025-07-14 17:35 ` [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for " Mingming Cao
@ 2025-07-14 17:35 ` Mingming Cao
2025-07-15 13:22 ` Simon Horman
1 sibling, 1 reply; 7+ messages in thread
From: Mingming Cao @ 2025-07-14 17:35 UTC (permalink / raw)
To: netdev; +Cc: kuba, horms, bjking1, haren, ricklind, davemarq, mmc
VNIC testing on multi-core Power systems showed SAR stats drift
and packet rate inconsistencies under load.
Implement ndo_get_stats64 to safely aggregate the queue-level
atomic64 counters into rtnl_link_stats64, as consumed by tools
such as 'ip -s', 'ifconfig', and 'sar'. This switches SAR reporting
to the standard kernel interface for retrieving netdev stats.
This removes redundant per-adapter stat updates, reduces overhead,
eliminates cacheline bouncing from hot path updates, and improves
the accuracy of reported packet rates.
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Brian King <bjking1@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 79fdba4293a4..6dd7809b1bc7 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2312,8 +2312,6 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
tx_pool->num_buffers - 1 :
tx_pool->consumer_index - 1;
tx_buff = &tx_pool->tx_buff[index];
- adapter->netdev->stats.tx_packets--;
- adapter->netdev->stats.tx_bytes -= tx_buff->skb->len;
atomic64_dec(&adapter->tx_stats_buffers[queue_num].batched_packets);
if (atomic64_sub_return(tx_buff->skb->len,
&adapter->tx_stats_buffers[queue_num].bytes) < 0) {
@@ -2325,9 +2323,9 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
atomic64_read(&adapter->tx_stats_buffers[queue_num].bytes),
tx_buff->skb->len);
}
+ atomic64_inc(&adapter->tx_stats_buffers[queue_num].dropped_packets);
dev_kfree_skb_any(tx_buff->skb);
tx_buff->skb = NULL;
- adapter->netdev->stats.tx_dropped++;
}
ind_bufp->index = 0;
@@ -2655,9 +2653,6 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
}
out:
rcu_read_unlock();
- netdev->stats.tx_dropped += tx_dropped;
- netdev->stats.tx_bytes += tx_bytes;
- netdev->stats.tx_packets += tx_bpackets + tx_dpackets;
adapter->tx_send_failed += tx_send_failed;
adapter->tx_map_failed += tx_map_failed;
atomic64_add(tx_bpackets, &adapter->tx_stats_buffers[queue_num].batched_packets);
@@ -3460,6 +3455,25 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
return -ret;
}
+static void ibmvnic_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct ibmvnic_adapter *adapter = netdev_priv(netdev);
+ int i;
+
+ for (i = 0; i < adapter->req_rx_queues; i++) {
+ stats->rx_packets += atomic64_read(&adapter->rx_stats_buffers[i].packets);
+ stats->rx_bytes += atomic64_read(&adapter->rx_stats_buffers[i].bytes);
+ }
+
+ for (i = 0; i < adapter->req_tx_queues; i++) {
+ stats->tx_packets += atomic64_read(&adapter->tx_stats_buffers[i].batched_packets);
+ stats->tx_packets += atomic64_read(&adapter->tx_stats_buffers[i].direct_packets);
+ stats->tx_bytes += atomic64_read(&adapter->tx_stats_buffers[i].bytes);
+ stats->tx_dropped += atomic64_read(&adapter->tx_stats_buffers[i].dropped_packets);
+ }
+}
+
static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue)
{
struct ibmvnic_adapter *adapter = netdev_priv(dev);
@@ -3575,8 +3589,6 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
length = skb->len;
napi_gro_receive(napi, skb); /* send it up */
- netdev->stats.rx_packets++;
- netdev->stats.rx_bytes += length;
atomic64_inc(&adapter->rx_stats_buffers[scrq_num].packets);
atomic64_add(length, &adapter->rx_stats_buffers[scrq_num].bytes);
frames_processed++;
@@ -3686,6 +3698,7 @@ static const struct net_device_ops ibmvnic_netdev_ops = {
.ndo_set_rx_mode = ibmvnic_set_multi,
.ndo_set_mac_address = ibmvnic_set_mac,
.ndo_validate_addr = eth_validate_addr,
+ .ndo_get_stats64 = ibmvnic_get_stats64,
.ndo_tx_timeout = ibmvnic_tx_timeout,
.ndo_change_mtu = ibmvnic_change_mtu,
.ndo_features_check = ibmvnic_features_check,
--
2.39.3 (Apple Git-146)
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for queue stats
2025-07-14 17:35 ` [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for " Mingming Cao
@ 2025-07-15 13:21 ` Simon Horman
2025-07-16 0:14 ` Jakub Kicinski
1 sibling, 0 replies; 7+ messages in thread
From: Simon Horman @ 2025-07-15 13:21 UTC (permalink / raw)
To: Mingming Cao; +Cc: netdev, kuba, bjking1, haren, ricklind, davemarq
On Mon, Jul 14, 2025 at 01:35:06PM -0400, Mingming Cao wrote:
> This patch improves the ibmvnic driver by changing the per-queue
> packet and byte counters to atomic64_t types. This makes updates
> thread-safe and easier to manage across multiple cores.
>
> It also updates the ethtool statistics to safely read these new counters.
>
> Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
> Reviewed-by: Brian King <bjking1@linux.ibm.com>
> Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
> Reviewed by: Rick Lindsley <ricklind@linux.ibm.com>
> Reviewed by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting
2025-07-14 17:35 ` [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting Mingming Cao
@ 2025-07-15 13:22 ` Simon Horman
0 siblings, 0 replies; 7+ messages in thread
From: Simon Horman @ 2025-07-15 13:22 UTC (permalink / raw)
To: Mingming Cao; +Cc: netdev, kuba, bjking1, haren, ricklind, davemarq
On Mon, Jul 14, 2025 at 01:35:07PM -0400, Mingming Cao wrote:
> VNIC testing on multi-core Power systems showed SAR stats drift
> and packet rate inconsistencies under load.
>
> Implements ndo_get_stats64 to provide safe aggregation of queue-level
> atomic64 counters into rtnl_link_stats64 for use by tools like 'ip -s',
> 'ifconfig', and 'sar'. Switch to ndo_get_stats64 to align SAR reporting
> with the standard kernel interface for retrieving netdev stats.
>
> This removes redundant per-adapter stat updates, reduces overhead,
> eliminates cacheline bouncing from hot path updates, and improves
> the accuracy of reported packet rates.
>
> Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
> Reviewed by: Brian King <bjking1@linux.ibm.com>
> Reviewed by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for queue stats
2025-07-14 17:35 ` [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for " Mingming Cao
2025-07-15 13:21 ` Simon Horman
@ 2025-07-16 0:14 ` Jakub Kicinski
1 sibling, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2025-07-16 0:14 UTC (permalink / raw)
To: Mingming Cao; +Cc: netdev, horms, bjking1, haren, ricklind, davemarq
On Mon, 14 Jul 2025 13:35:06 -0400 Mingming Cao wrote:
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 92647e137cf8..79fdba4293a4 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2314,9 +2314,17 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
> tx_buff = &tx_pool->tx_buff[index];
> adapter->netdev->stats.tx_packets--;
> adapter->netdev->stats.tx_bytes -= tx_buff->skb->len;
> - adapter->tx_stats_buffers[queue_num].batched_packets--;
> - adapter->tx_stats_buffers[queue_num].bytes -=
> - tx_buff->skb->len;
> + atomic64_dec(&adapter->tx_stats_buffers[queue_num].batched_packets);
> + if (atomic64_sub_return(tx_buff->skb->len,
> + &adapter->tx_stats_buffers[queue_num].bytes) < 0) {
> + atomic64_set(&adapter->tx_stats_buffers[queue_num].bytes, 0);
> + netdev_warn(adapter->netdev,
Any datapath print needs to be rate limited, otherwise it may flood
the logs.
> + "TX stats underflow on queue %u: bytes (%lld) < skb->len (%u),\n"
> + "clamping to 0\n",
> + queue_num,
> + atomic64_read(&adapter->tx_stats_buffers[queue_num].bytes),
> + tx_buff->skb->len);
> + }
> dev_kfree_skb_any(tx_buff->skb);
> tx_buff->skb = NULL;
> adapter->netdev->stats.tx_dropped++;
> @@ -2652,10 +2660,10 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
> netdev->stats.tx_packets += tx_bpackets + tx_dpackets;
> adapter->tx_send_failed += tx_send_failed;
> adapter->tx_map_failed += tx_map_failed;
> - adapter->tx_stats_buffers[queue_num].batched_packets += tx_bpackets;
> - adapter->tx_stats_buffers[queue_num].direct_packets += tx_dpackets;
> - adapter->tx_stats_buffers[queue_num].bytes += tx_bytes;
> - adapter->tx_stats_buffers[queue_num].dropped_packets += tx_dropped;
> + atomic64_add(tx_bpackets, &adapter->tx_stats_buffers[queue_num].batched_packets);
> + atomic64_add(tx_dpackets, &adapter->tx_stats_buffers[queue_num].direct_packets);
> + atomic64_add(tx_bytes, &adapter->tx_stats_buffers[queue_num].bytes);
> + atomic64_add(tx_dropped, &adapter->tx_stats_buffers[queue_num].dropped_packets);
Are atomics really cheap on your platform? Normally having these many
atomic ops per packet would bring performance concerns.
I assume queue accesses are already protected by some sort of a lock
so isn't it enough to make the stats per queue without making them
atomic?
--
pw-bot: cr
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-07-16 0:14 UTC | newest]
Thread overview: 7+ messages
2025-07-14 17:35 [PATCH v3 net-next resubmit 0/2] ibmvnic: Fix/Improve queue stats Mingming Cao
2025-07-14 17:35 ` [PATCH v3 net-next 1/2] ibmvnic: Use atomic64_t for " Mingming Cao
2025-07-15 13:21 ` Simon Horman
2025-07-16 0:14 ` Jakub Kicinski
2025-07-14 17:35 ` [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting Mingming Cao
2025-07-15 13:22 ` Simon Horman
-- strict thread matches above, loose matches on Subject: below --
2025-07-09 18:40 [PATCH v3 net-next 0/2] ibmvnic: Fix/Improve queue stats Mingming Cao
2025-07-09 18:40 ` [PATCH v3 net-next 2/2] ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting Mingming Cao