netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
@ 2010-01-12 13:41 Jesper Dangaard Brouer
  2010-01-14  1:30 ` David Miller
  0 siblings, 1 reply; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-01-12 13:41 UTC (permalink / raw)
  To: Jeff Kirsher, Alexander Duyck; +Cc: Jesper Dangaard Brouer, netdev

Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
updating the adapter stats when reading /proc/net/dev.  Currently the
stats are updated by the watchdog timer every 2 sec, or when getting
stats via ethtool -S.

A number of userspace apps read these /proc/net/dev stats every second,
e.g. ifstat, which then gives a perceived very bursty traffic pattern,
which is actually false.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
---

 drivers/net/igb/igb_main.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index d967949..7207dd3 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -3862,11 +3862,17 @@ static void igb_reset_task(struct work_struct *work)
  * @netdev: network interface device structure
  *
  * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
+ * The statistics are also updated from the timer callback
+ * igb_watchdog_task().
  **/
 static struct net_device_stats *igb_get_stats(struct net_device *netdev)
 {
-	/* only return the current stats */
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	/* update stats */
+	igb_update_stats(adapter);
+
+	/* return the current stats */
 	return &netdev->stats;
 }
 
@@ -3940,7 +3946,7 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
 
 void igb_update_stats(struct igb_adapter *adapter)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
+	struct net_device_stats *net_stats = &adapter->netdev->stats;
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 rnbc;


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-01-12 13:41 Jesper Dangaard Brouer
@ 2010-01-14  1:30 ` David Miller
  2010-01-14  3:01   ` Jeff Kirsher
  0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2010-01-14  1:30 UTC (permalink / raw)
  To: hawk; +Cc: jeffrey.t.kirsher, alexander.h.duyck, netdev

From: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Tue, 12 Jan 2010 14:41:53 +0100

> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> updating the adapter stats when reading /proc/net/dev.  Currently the
> stats are updated by the watchdog timer every 2 sec, or when getting
> stats via ethtool -S.
> 
> A number of userspace apps read these /proc/net/dev stats every second,
> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> which is actually false.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>

Intel folks, I assume you guys are going to integrate this.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-01-14  1:30 ` David Miller
@ 2010-01-14  3:01   ` Jeff Kirsher
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2010-01-14  3:01 UTC (permalink / raw)
  To: David Miller; +Cc: hawk, alexander.h.duyck, netdev

On Wed, Jan 13, 2010 at 17:30, David Miller <davem@davemloft.net> wrote:
> From: Jesper Dangaard Brouer <hawk@comx.dk>
> Date: Tue, 12 Jan 2010 14:41:53 +0100
>
>> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
>> updating the adapter stats when reading /proc/net/dev.  Currently the
>> stats are updated by the watchdog timer every 2 sec, or when getting
>> stats via ethtool -S.
>>
>> A number of userspace apps read these /proc/net/dev stats every second,
>> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
>> which is actually false.
>>
>> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
>
> Intel folks, I assume you guys are going to integrate this.
>
> Thanks.
> --

Correct, I have sucked this into my queue of patches.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
@ 2010-10-05 14:18 Jesper Dangaard Brouer
  2010-10-05 14:41 ` Eric Dumazet
  2010-10-07  6:36 ` David Miller
  0 siblings, 2 replies; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-05 14:18 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jeff Kirsher

Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
updating the adapter stats when reading /proc/net/dev.  Currently the
stats are updated by the watchdog timer every 2 sec, or when getting
stats via ethtool -S.

A number of userspace apps read these /proc/net/dev stats every second,
e.g. ifstat, which then gives a perceived very bursty traffic pattern,
which is actually false.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
---
 drivers/net/igb/igb_main.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 55edcb7..6cec297 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -4218,11 +4218,17 @@ static void igb_reset_task(struct work_struct *work)
  * @netdev: network interface device structure
  *
  * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
+ * The statistics are also updated from the timer callback
+ * igb_watchdog_task().
  **/
 static struct net_device_stats *igb_get_stats(struct net_device *netdev)
 {
-	/* only return the current stats */
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	/* update stats */
+	igb_update_stats(adapter);
+
+	/* return the current stats */
 	return &netdev->stats;
 }
 
@@ -4307,7 +4313,7 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
 
 void igb_update_stats(struct igb_adapter *adapter)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
+	struct net_device_stats *net_stats = &adapter->netdev->stats;
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 14:18 [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jesper Dangaard Brouer
@ 2010-10-05 14:41 ` Eric Dumazet
  2010-10-05 14:53   ` Jesper Dangaard Brouer
  2010-10-07  6:36 ` David Miller
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-05 14:41 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: David S. Miller, netdev, Jeff Kirsher

Le mardi 05 octobre 2010 à 16:18 +0200, Jesper Dangaard Brouer a écrit :
> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> updating the adapter stats when reading /proc/net/dev.  Currently the
> stats are updated by the watchdog timer every 2 sec, or when getting
> stats via ethtool -S.
> 
> A number of userspace apps read these /proc/net/dev stats every second,
> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> which is actually false.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>

Unfortunately this is racy with igb_watchdog_task()

igb_update_stats() must be called under protection of a lock.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 14:41 ` Eric Dumazet
@ 2010-10-05 14:53   ` Jesper Dangaard Brouer
  2010-10-05 15:19     ` Eric Dumazet
  0 siblings, 1 reply; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-05 14:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, netdev, Jeff Kirsher

On Tue, 2010-10-05 at 16:41 +0200, Eric Dumazet wrote:
> Le mardi 05 octobre 2010 à 16:18 +0200, Jesper Dangaard Brouer a écrit :
> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > stats are updated by the watchdog timer every 2 sec, or when getting
> > stats via ethtool -S.
> > 
> > A number of userspace apps read these /proc/net/dev stats every second,
> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > which is actually false.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> 
> Unfortunately this is racy with igb_watchdog_task()
> 
> igb_update_stats() must be called under protection of a lock.

Its already racy, because "ethtool -S" reads out the stats immediately,
and thus races with the timer.

See: igb_ethtool.c
 igb_get_ethtool_stats() invoke igb_update_stats(adapter);

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 14:53   ` Jesper Dangaard Brouer
@ 2010-10-05 15:19     ` Eric Dumazet
  2010-10-05 21:01       ` Eric Dumazet
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-05 15:19 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: David S. Miller, netdev, Jeff Kirsher

Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :

> Its already racy, because "ethtool -S" reads out the stats immediately,
> and thus races with the timer.
> 
> See: igb_ethtool.c
>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> 

You would be surprised how many bugs are waiting to be found and
fixed ;)




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 15:19     ` Eric Dumazet
@ 2010-10-05 21:01       ` Eric Dumazet
  2010-10-05 21:16         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-05 21:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: David S. Miller, netdev, Jeff Kirsher

Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
> 
> > Its already racy, because "ethtool -S" reads out the stats immediately,
> > and thus races with the timer.
> > 
> > See: igb_ethtool.c
> >  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> > 
> 
> You would be surprised how many bugs are waiting to be found and
> fixed ;)
> 
> 

I took a look at igb stats, and it appears they also are racy on 32bit
arches. igb uses u64 counters, with no synchronization between
producers(writers) and consumers(readers).

Some fixes are needed ;)




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 21:01       ` Eric Dumazet
@ 2010-10-05 21:16         ` Jesper Dangaard Brouer
  2010-10-05 22:34           ` Jeff Kirsher
  0 siblings, 1 reply; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-05 21:16 UTC (permalink / raw)
  To: Eric Dumazet, Alexander Duyck
  Cc: Jesper Dangaard Brouer, David S. Miller, netdev, Jeff Kirsher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1133 bytes --]

On Tue, 5 Oct 2010, Eric Dumazet wrote:

> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>>
>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>>> and thus races with the timer.
>>>
>>> See: igb_ethtool.c
>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>>>
>>
>> You would be surprised how many bugs are waiting to be found and
>> fixed ;)
>
> I took a look at igb stats, and it appears they also are racy on 32bit
> arches. igb uses u64 counters, with no synchronization between
> producers(writers) and consumers(readers).

Are you saying, that we need more than a simple mutex in 
igb_update_stats()?


> Some fixes are needed ;)

The question is then if Intel wants to fix it, or let it be up to you or 
me?

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 21:16         ` Jesper Dangaard Brouer
@ 2010-10-05 22:34           ` Jeff Kirsher
  2010-10-06  3:28             ` Eric Dumazet
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Kirsher @ 2010-10-05 22:34 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Eric Dumazet, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny

On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
> On Tue, 5 Oct 2010, Eric Dumazet wrote:
>
>> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>>>
>>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>>>
>>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>>>> and thus races with the timer.
>>>>
>>>> See: igb_ethtool.c
>>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>>>>
>>>
>>> You would be surprised how many bugs are waiting to be found and
>>> fixed ;)
>>
>> I took a look at igb stats, and it appears they also are racy on 32bit
>> arches. igb uses u64 counters, with no synchronization between
>> producers(writers) and consumers(readers).
>
> Are you saying, that we need more than a simple mutex in igb_update_stats()?
>
>
>> Some fixes are needed ;)
>
> The question is then if Intel wants to fix it, or let it be up to you or me?
>

I will make sure that Carolyn and Alex know about the issue.  But,
feel free to submit a patch if you guys have the time.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 22:34           ` Jeff Kirsher
@ 2010-10-06  3:28             ` Eric Dumazet
  2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
  2010-10-06  5:47               ` [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jeff Kirsher
  0 siblings, 2 replies; 24+ messages in thread
From: Eric Dumazet @ 2010-10-06  3:28 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny

Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
> >
> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
> >>>
> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
> >>>
> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
> >>>> and thus races with the timer.
> >>>>
> >>>> See: igb_ethtool.c
> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> >>>>
> >>>
> >>> You would be surprised how many bugs are waiting to be found and
> >>> fixed ;)
> >>
> >> I took a look at igb stats, and it appears they also are racy on 32bit
> >> arches. igb uses u64 counters, with no synchronization between
> >> producers(writers) and consumers(readers).
> >
> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
> >
> >
> >> Some fixes are needed ;)
> >
> > The question is then if Intel wants to fix it, or let it be up to you or me?
> >
> 
> I will make sure that Carolyn and Alex know about the issue.  But,
> feel free to submit a patch if you guys have the time.
> 

I woke up early this morning, I'll provide patches to fix issues for
net-next-2.6

I'll let Intel guys doing the backporting work, but for old kernels,
you'll probably need to use "unsigned long" instead of "u64"

My plan is :

- Provide 64bit counters even on 32bit arch
- with proper synchro (include/linux/u64_stats_sync.h)
- Add a spinlock so we can apply Jesper patch.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net-next] igb: fix stats handling
  2010-10-06  3:28             ` Eric Dumazet
@ 2010-10-06  4:36               ` Eric Dumazet
  2010-10-06  5:49                 ` Jeff Kirsher
                                   ` (2 more replies)
  2010-10-06  5:47               ` [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jeff Kirsher
  1 sibling, 3 replies; 24+ messages in thread
From: Eric Dumazet @ 2010-10-06  4:36 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny

Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :

> I'll let Intel guys doing the backporting work, but for old kernels,
> you'll probably need to use "unsigned long" instead of "u64"
> 
> My plan is :
> 
> - Provide 64bit counters even on 32bit arch
> - with proper synchro (include/linux/u64_stats_sync.h)
> - Add a spinlock so we can apply Jesper patch.

Here is the net-next-2.6 patch, I am currently enable to test it, the
dev machine with IGB NIC cannot be restarted until tomorrow, my son
Nicolas is currently using it ;)

Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?

Thanks !

[PATCH net-next] igb: fix stats handling

There are currently some problems with igb.

- On 32bit arches, maintaining 64bit counters without proper
synchronization between writers and readers.

- Stats updated every two seconds, as reported by Jesper.
   (Jesper provided a patch for this)

- Potential problem between worker thread and ethtool -S

This patch uses u64_stats_sync, and convert everything to be 64bit safe,
SMP safe, even on 32bit arches.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/igb/igb.h         |    7 +-
 drivers/net/igb/igb_ethtool.c |   10 +-
 drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
 3 files changed, 94 insertions(+), 34 deletions(-)

diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 44e0ff1..a1b9584 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -159,6 +159,7 @@ struct igb_tx_queue_stats {
 	u64 packets;
 	u64 bytes;
 	u64 restart_queue;
+	struct u64_stats_sync syncp;
 };
 
 struct igb_rx_queue_stats {
@@ -167,6 +168,7 @@ struct igb_rx_queue_stats {
 	u64 drops;
 	u64 csum_err;
 	u64 alloc_failed;
+	struct u64_stats_sync syncp;
 };
 
 struct igb_q_vector {
@@ -288,6 +290,9 @@ struct igb_adapter {
 	struct timecompare compare;
 	struct hwtstamp_config hwtstamp_config;
 
+	spinlock_t stats64_lock;
+	struct rtnl_link_stats64 stats64;
+
 	/* structs defined in e1000_hw.h */
 	struct e1000_hw hw;
 	struct e1000_hw_stats stats;
@@ -357,7 +362,7 @@ extern netdev_tx_t igb_xmit_frame_ring_adv(struct sk_buff *, struct igb_ring *);
 extern void igb_unmap_and_free_tx_resource(struct igb_ring *,
 					   struct igb_buffer *);
 extern void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
-extern void igb_update_stats(struct igb_adapter *);
+extern void igb_update_stats(struct igb_adapter *, struct rtnl_link_stats64 *);
 extern bool igb_has_link(struct igb_adapter *adapter);
 extern void igb_set_ethtool_ops(struct net_device *);
 extern void igb_power_up_link(struct igb_adapter *);
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 26bf6a1..e51c233 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -90,8 +90,8 @@ static const struct igb_stats igb_gstrings_stats[] = {
 
 #define IGB_NETDEV_STAT(_net_stat) { \
 	.stat_string = __stringify(_net_stat), \
-	.sizeof_stat = FIELD_SIZEOF(struct net_device_stats, _net_stat), \
-	.stat_offset = offsetof(struct net_device_stats, _net_stat) \
+	.sizeof_stat = FIELD_SIZEOF(struct rtnl_link_stats64, _net_stat), \
+	.stat_offset = offsetof(struct rtnl_link_stats64, _net_stat) \
 }
 static const struct igb_stats igb_gstrings_net_stats[] = {
 	IGB_NETDEV_STAT(rx_errors),
@@ -2070,12 +2070,13 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 				  struct ethtool_stats *stats, u64 *data)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
-	struct net_device_stats *net_stats = &netdev->stats;
+	struct rtnl_link_stats64 *net_stats = &adapter->stats64;
 	u64 *queue_stat;
 	int i, j, k;
 	char *p;
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, net_stats);
 
 	for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
 		p = (char *)adapter + igb_gstrings_stats[i].stat_offset;
@@ -2097,6 +2098,7 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 		for (k = 0; k < IGB_RX_QUEUE_STATS_LEN; k++, i++)
 			data[i] = queue_stat[k];
 	}
+	spin_unlock(&adapter->stats64_lock);
 }
 
 static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 55edcb7..8a009ff 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -96,7 +96,6 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
 static void igb_free_all_tx_resources(struct igb_adapter *);
 static void igb_free_all_rx_resources(struct igb_adapter *);
 static void igb_setup_mrqc(struct igb_adapter *);
-void igb_update_stats(struct igb_adapter *);
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
@@ -113,7 +112,8 @@ static void igb_update_phy_info(unsigned long);
 static void igb_watchdog(unsigned long);
 static void igb_watchdog_task(struct work_struct *);
 static netdev_tx_t igb_xmit_frame_adv(struct sk_buff *skb, struct net_device *);
-static struct net_device_stats *igb_get_stats(struct net_device *);
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *dev,
+						 struct rtnl_link_stats64 *stats);
 static int igb_change_mtu(struct net_device *, int);
 static int igb_set_mac(struct net_device *, void *);
 static void igb_set_uta(struct igb_adapter *adapter);
@@ -1536,7 +1536,9 @@ void igb_down(struct igb_adapter *adapter)
 	netif_carrier_off(netdev);
 
 	/* record the stats before reset*/
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	adapter->link_speed = 0;
 	adapter->link_duplex = 0;
@@ -1689,7 +1691,7 @@ static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
 	.ndo_start_xmit		= igb_xmit_frame_adv,
-	.ndo_get_stats		= igb_get_stats,
+	.ndo_get_stats64	= igb_get_stats64,
 	.ndo_set_rx_mode	= igb_set_rx_mode,
 	.ndo_set_multicast_list	= igb_set_rx_mode,
 	.ndo_set_mac_address	= igb_set_mac,
@@ -2276,6 +2278,7 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
+	spin_lock_init(&adapter->stats64_lock);
 #ifdef CONFIG_PCI_IOV
 	if (hw->mac.type == e1000_82576)
 		adapter->vfs_allocated_count = (max_vfs > 7) ? 7 : max_vfs;
@@ -3483,7 +3486,9 @@ static void igb_watchdog_task(struct work_struct *work)
 		}
 	}
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *tx_ring = adapter->tx_ring[i];
@@ -3550,6 +3555,8 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 	int new_val = q_vector->itr_val;
 	int avg_wire_size = 0;
 	struct igb_adapter *adapter = q_vector->adapter;
+	struct igb_ring *ring;
+	unsigned int packets;
 
 	/* For non-gigabit speeds, just fix the interrupt rate at 4000
 	 * ints/sec - ITR timer value of 120 ticks.
@@ -3559,16 +3566,21 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 		goto set_itr_val;
 	}
 
-	if (q_vector->rx_ring && q_vector->rx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->rx_ring;
-		avg_wire_size = ring->total_bytes / ring->total_packets;
+	ring = q_vector->rx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets) 
+			avg_wire_size = ring->total_bytes / packets;
 	}
 
-	if (q_vector->tx_ring && q_vector->tx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->tx_ring;
-		avg_wire_size = max_t(u32, avg_wire_size,
-		                      (ring->total_bytes /
-		                       ring->total_packets));
+	ring = q_vector->tx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets)
+			avg_wire_size = max_t(u32, avg_wire_size,
+			                      ring->total_bytes / packets);
 	}
 
 	/* if avg_wire_size isn't set no work was done */
@@ -4077,7 +4089,11 @@ static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
 
 	/* A reprieve! */
 	netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+	u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 	tx_ring->tx_stats.restart_queue++;
+	u64_stats_update_end(&tx_ring->tx_stats.syncp);
+
 	return 0;
 }
 
@@ -4214,16 +4230,22 @@ static void igb_reset_task(struct work_struct *work)
 }
 
 /**
- * igb_get_stats - Get System Network Statistics
+ * igb_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
+ * @stats: rtnl_link_stats64 pointer
  *
- * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
  **/
-static struct net_device_stats *igb_get_stats(struct net_device *netdev)
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *netdev,
+						 struct rtnl_link_stats64 *stats)
 {
-	/* only return the current stats */
-	return &netdev->stats;
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	memcpy(stats, &adapter->stats64, sizeof(*stats));
+	spin_unlock(&adapter->stats64_lock);
+
+	return stats;
 }
 
 /**
@@ -4305,15 +4327,17 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
  * @adapter: board private structure
  **/
 
-void igb_update_stats(struct igb_adapter *adapter)
+void igb_update_stats(struct igb_adapter *adapter,
+		      struct rtnl_link_stats64 *net_stats)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;
 	u16 phy_tmp;
 	int i;
 	u64 bytes, packets;
+	unsigned int start;
+	u64 _bytes, _packets;
 
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
 
@@ -4331,10 +4355,17 @@ void igb_update_stats(struct igb_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		u32 rqdpc_tmp = rd32(E1000_RQDPC(i)) & 0x0FFF;
 		struct igb_ring *ring = adapter->rx_ring[i];
+
 		ring->rx_stats.drops += rqdpc_tmp;
 		net_stats->rx_fifo_errors += rqdpc_tmp;
-		bytes += ring->rx_stats.bytes;
-		packets += ring->rx_stats.packets;
+		
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_stats.syncp);
+			_bytes = ring->rx_stats.bytes;
+			_packets = ring->rx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_stats.syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 
 	net_stats->rx_bytes = bytes;
@@ -4344,8 +4375,13 @@ void igb_update_stats(struct igb_adapter *adapter)
 	packets = 0;
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *ring = adapter->tx_ring[i];
-		bytes += ring->tx_stats.bytes;
-		packets += ring->tx_stats.packets;
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_stats.syncp);
+			_bytes = ring->tx_stats.bytes;
+			_packets = ring->tx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_stats.syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 	net_stats->tx_bytes = bytes;
 	net_stats->tx_packets = packets;
@@ -5397,7 +5433,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 		if (__netif_subqueue_stopped(netdev, tx_ring->queue_index) &&
 		    !(test_bit(__IGB_DOWN, &adapter->state))) {
 			netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+			u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 			tx_ring->tx_stats.restart_queue++;
+			u64_stats_update_end(&tx_ring->tx_stats.syncp);
 		}
 	}
 
@@ -5437,8 +5476,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 	}
 	tx_ring->total_bytes += total_bytes;
 	tx_ring->total_packets += total_packets;
+	u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 	tx_ring->tx_stats.bytes += total_bytes;
 	tx_ring->tx_stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->tx_stats.syncp);
 	return count < tx_ring->count;
 }
 
@@ -5480,9 +5521,11 @@ static inline void igb_rx_checksum_adv(struct igb_ring *ring,
 		 * packets, (aka let the stack check the crc32c)
 		 */
 		if ((skb->len == 60) &&
-		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM))
+		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM)) {
+			u64_stats_update_begin(&ring->rx_stats.syncp);
 			ring->rx_stats.csum_err++;
-
+			u64_stats_update_end(&ring->rx_stats.syncp);
+		}
 		/* let the stack verify checksum errors */
 		return;
 	}
@@ -5669,8 +5712,10 @@ next_desc:
 
 	rx_ring->total_packets += total_packets;
 	rx_ring->total_bytes += total_bytes;
+	u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 	rx_ring->rx_stats.packets += total_packets;
 	rx_ring->rx_stats.bytes += total_bytes;
+	u64_stats_update_end(&rx_ring->rx_stats.syncp);
 	return cleaned;
 }
 
@@ -5698,8 +5743,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		if ((bufsz < IGB_RXBUFFER_1024) && !buffer_info->page_dma) {
 			if (!buffer_info->page) {
 				buffer_info->page = netdev_alloc_page(netdev);
-				if (!buffer_info->page) {
+				if (unlikely(!buffer_info->page)) {
+					u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 					rx_ring->rx_stats.alloc_failed++;
+					u64_stats_update_end(&rx_ring->rx_stats.syncp);
 					goto no_buffers;
 				}
 				buffer_info->page_offset = 0;
@@ -5714,7 +5761,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->page_dma)) {
 				buffer_info->page_dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 		}
@@ -5722,8 +5771,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		skb = buffer_info->skb;
 		if (!skb) {
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-			if (!skb) {
+			if (unlikely(!skb)) {
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 
@@ -5737,7 +5788,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->dma)) {
 				buffer_info->dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 		}



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-06  3:28             ` Eric Dumazet
  2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
@ 2010-10-06  5:47               ` Jeff Kirsher
  1 sibling, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2010-10-06  5:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny

On Tue, Oct 5, 2010 at 20:28, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
>> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
>> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
>> >
>> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>> >>>
>> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>> >>>
>> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>> >>>> and thus races with the timer.
>> >>>>
>> >>>> See: igb_ethtool.c
>> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>> >>>>
>> >>>
>> >>> You would be surprised how many bugs are waiting to be found and
>> >>> fixed ;)
>> >>
>> >> I took a look at igb stats, and it appears they also are racy on 32bit
>> >> arches. igb uses u64 counters, with no synchronization between
>> >> producers(writers) and consumers(readers).
>> >
>> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
>> >
>> >
>> >> Some fixes are needed ;)
>> >
>> > The question is then if Intel wants to fix it, or let it be up to you or me?
>> >
>>
>> I will make sure that Carolyn and Alex know about the issue.  But,
>> feel free to submit a patch if you guys have the time.
>>
>
> I woke up early this morning, I'll provide patches to fix issues for
> net-next-2.6
>
> I'll let Intel guys doing the backporting work, but for old kernels,
> you'll probably need to use "unsigned long" instead of "u64"
>
> My plan is :
>
> - Provide 64bit counters even on 32bit arch
> - with proper synchro (include/linux/u64_stats_sync.h)
> - Add a spinlock so we can apply Jesper patch.
>
>

Thanks Eric, I really appreciate it.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next] igb: fix stats handling
  2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
@ 2010-10-06  5:49                 ` Jeff Kirsher
  2010-10-06 11:36                 ` Jesper Dangaard Brouer
  2010-10-09 23:57                 ` Tantilov, Emil S
  2 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2010-10-06  5:49 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny

On Tue, Oct 5, 2010 at 21:36, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
>
>> I'll let Intel guys doing the backporting work, but for old kernels,
>> you'll probably need to use "unsigned long" instead of "u64"
>>
>> My plan is :
>>
>> - Provide 64bit counters even on 32bit arch
>> - with proper synchro (include/linux/u64_stats_sync.h)
>> - Add a spinlock so we can apply Jesper patch.
>
> Here is the net-next-2.6 patch, I am currently enable to test it, the
> dev machine with IGB NIC cannot be restarted until tomorrow, my son
> Nicolas is currently using it ;)
>
> Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
>
> Thanks !
>
> [PATCH net-next] igb: fix stats handling
>
> There are currently some problems with igb.
>
> - On 32bit arches, maintaining 64bit counters without proper
> synchronization between writers and readers.
>
> - Stats updated every two seconds, as reported by Jesper.
>   (Jesper provided a patch for this)
>
> - Potential problem between worker thread and ethtool -S
>
> This patch uses u64_stats_sync, and convert everything to be 64bit safe,
> SMP safe, even on 32bit arches.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  drivers/net/igb/igb.h         |    7 +-
>  drivers/net/igb/igb_ethtool.c |   10 +-
>  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
>  3 files changed, 94 insertions(+), 34 deletions(-)
>

Thanks again Eric!  I will add your patch to my queue.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next] igb: fix stats handling
  2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
  2010-10-06  5:49                 ` Jeff Kirsher
@ 2010-10-06 11:36                 ` Jesper Dangaard Brouer
  2010-10-09 23:57                 ` Tantilov, Emil S
  2 siblings, 0 replies; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-06 11:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jeff Kirsher, Jesper Dangaard Brouer, Alexander Duyck,
	David S. Miller, netdev, Carolyn Wyborny

On Wed, 2010-10-06 at 06:36 +0200, Eric Dumazet wrote:
> Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
> 
> > I'll let Intel guys doing the backporting work, but for old kernels,
> > you'll probably need to use "unsigned long" instead of "u64"
> > 
> > My plan is :
> > 
> > - Provide 64bit counters even on 32bit arch
> > - with proper synchro (include/linux/u64_stats_sync.h)
> > - Add a spinlock so we can apply Jesper patch.

You have already integrated my patch, so you get my:

Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>

> Here is the net-next-2.6 patch, I am currently enable to test it, the
> dev machine with IGB NIC cannot be restarted until tomorrow, my son
> Nicolas is currently using it ;)
> 
> Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?

I could not resist, I have tested the patch, on a 64 bit kernel, it
works :-).  Did some simple test with two while loops running ethtool -S
and ifconfig, didn't see any wrong packet counts.

Tested-by: Jesper Dangaard Brouer <hawk@comx.dk>

Now back to moving office ;-)

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-05 14:18 [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jesper Dangaard Brouer
  2010-10-05 14:41 ` Eric Dumazet
@ 2010-10-07  6:36 ` David Miller
  2010-10-07  6:44   ` Eric Dumazet
  1 sibling, 1 reply; 24+ messages in thread
From: David Miller @ 2010-10-07  6:36 UTC (permalink / raw)
  To: hawk; +Cc: netdev, jeffrey.t.kirsher

From: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Tue, 05 Oct 2010 16:18:33 +0200

> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> updating the adapter stats when reading /proc/net/dev.  Currently the
> stats are updated by the watchdog timer every 2 sec, or when getting
> stats via ethtool -S.
> 
> A number of userspace apps read these /proc/net/dev stats every second,
> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> which is actually false.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>

I assume that the Intel folks will take care of integrating this
now that the locking is fixed.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-07  6:36 ` David Miller
@ 2010-10-07  6:44   ` Eric Dumazet
  2010-10-07  6:46     ` David Miller
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-07  6:44 UTC (permalink / raw)
  To: David Miller; +Cc: hawk, netdev, jeffrey.t.kirsher

Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> From: Jesper Dangaard Brouer <hawk@comx.dk>
> Date: Tue, 05 Oct 2010 16:18:33 +0200
> 
> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > stats are updated by the watchdog timer every 2 sec, or when getting
> > stats via ethtool -S.
> > 
> > A number of userspace apps read these /proc/net/dev stats every second,
> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > which is actually false.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> 
> I assume that the Intel folks will take care of integrating this
> now that the locking is fixed.
> 

I integrated Jesper patch into my cumulative patch.

BTW, ixgbe has similar locking problem.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-07  6:44   ` Eric Dumazet
@ 2010-10-07  6:46     ` David Miller
  2010-10-07 10:06       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2010-10-07  6:46 UTC (permalink / raw)
  To: eric.dumazet; +Cc: hawk, netdev, jeffrey.t.kirsher

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 07 Oct 2010 08:44:08 +0200

> Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
>> From: Jesper Dangaard Brouer <hawk@comx.dk>
>> Date: Tue, 05 Oct 2010 16:18:33 +0200
>> 
>> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
>> > updating the adapter stats when reading /proc/net/dev.  Currently the
>> > stats are updated by the watchdog timer every 2 sec, or when getting
>> > stats via ethtool -S.
>> > 
>> > A number of userspace apps read these /proc/net/dev stats every second,
>> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
>> > which is actually false.
>> > 
>> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
>> 
>> I assume that the Intel folks will take care of integrating this
>> now that the locking is fixed.
>> 
> 
> I integrated Jesper patch into my cumulative patch.

Ok.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-07  6:46     ` David Miller
@ 2010-10-07 10:06       ` Jesper Dangaard Brouer
  2010-10-07 10:24         ` Eric Dumazet
  0 siblings, 1 reply; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-07 10:06 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, jeffrey.t.kirsher

On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 07 Oct 2010 08:44:08 +0200
> 
> > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> >> 
> >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> >> > stats are updated by the watchdog timer every 2 sec, or when getting
> >> > stats via ethtool -S.
> >> > 
> >> > A number of userspace apps read these /proc/net/dev stats every second,
> >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> >> > which is actually false.
> >> > 
> >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> >> 
> >> I assume that the Intel folks will take care of integrating this
> >> now that the locking is fixed.
> > 
> > I integrated Jesper patch into my cumulative patch.
> 
> Ok.

I'm fine with this, as long as the commit message describe this change
of accuracy of stats in /proc/net/dev.

Something like:

 This patch also increase the accuracy of stats in /proc/net/dev, by
 updating the adapter stats when reading /proc/net/dev.  Previously
 the stats were only updated by the watchdog timer every 2 sec, which
 resulted in false observations from userspace.


-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-07 10:06       ` Jesper Dangaard Brouer
@ 2010-10-07 10:24         ` Eric Dumazet
  2010-10-07 11:36           ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-07 10:24 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: David Miller, netdev, jeffrey.t.kirsher

Le jeudi 07 octobre 2010 à 12:06 +0200, Jesper Dangaard Brouer a écrit :
> On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Thu, 07 Oct 2010 08:44:08 +0200
> > 
> > > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> > >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> > >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> > >> 
> > >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > >> > stats are updated by the watchdog timer every 2 sec, or when getting
> > >> > stats via ethtool -S.
> > >> > 
> > >> > A number of userspace apps read these /proc/net/dev stats every second,
> > >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > >> > which is actually false.
> > >> > 
> > >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> > >> 
> > >> I assume that the Intel folks will take care of integrating this
> > >> now that the locking is fixed.
> > > 
> > > I integrated Jesper patch into my cumulative patch.
> > 
> > Ok.
> 
> I'm fine with this, as long as the commit message describe this change
> of accuracy of stats in /proc/net/dev.
> 
> Something like:
> 
>  This patch also increase the accuracy of stats in /proc/net/dev, by
>  updating the adapter stats when reading /proc/net/dev.  Previously
>  the stats were only updated by the watchdog timer every 2 sec, which
>  resulted in false observations from userspace.
> 
> 

Well, its not exactly true :)

Previously, igb stats were updated :
- By watchdog timer, every 2 secs
- Every time an "ethtool -S ethX" was done

There is no general guarantee netdev stats are immediately available to
user.

ndo_get_stats() is not allowed to sleep, (because of bonding...), so
drivers can not always provide accurate stats, if they need to make a
long transaction with hardware.

Other drivers do the same (provide hardware statistics), with about one
second resolution.

So the "resulted in false observations from userspace." is something
that might upset admins, but is not a hard requirement of netdev stats.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
  2010-10-07 10:24         ` Eric Dumazet
@ 2010-10-07 11:36           ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 24+ messages in thread
From: Jesper Dangaard Brouer @ 2010-10-07 11:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, jeffrey.t.kirsher

On Thu, 2010-10-07 at 12:24 +0200, Eric Dumazet wrote:
> Le jeudi 07 octobre 2010 à 12:06 +0200, Jesper Dangaard Brouer a écrit :
> > On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> > > From: Eric Dumazet <eric.dumazet@gmail.com>
> > > Date: Thu, 07 Oct 2010 08:44:08 +0200
> > > 
> > > > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> > > >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> > > >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> > > >> 
> > > >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > > >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > > >> > stats are updated by the watchdog timer every 2 sec, or when getting
> > > >> > stats via ethtool -S.
> > > >> > 
> > > >> > A number of userspace apps read these /proc/net/dev stats every second,
> > > >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > > >> > which is actually false.
> > > >> > 
> > > >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> > > >> 
> > > >> I assume that the Intel folks will take care of integrating this
> > > >> now that the locking is fixed.
> > > > 
> > > > I integrated Jesper patch into my cumulative patch.
> > > 
> > > Ok.
> > 
> > I'm fine with this, as long as the commit message describe this change
> > of accuracy of stats in /proc/net/dev.
> > 
> > Something like:
> > 
> >  This patch also increase the accuracy of stats in /proc/net/dev, by
> >  updating the adapter stats when reading /proc/net/dev.  Previously
> >  the stats were only updated by the watchdog timer every 2 sec, which
> >  resulted in false observations from userspace.
> > 
> > 
> 
> Well, its not exactly true :)
> 
> Previously, igb stats were updated :
> - By watchdog timer, every 2 secs
> - Every time an "ethtool -S ethX" was done
> 
> There is no general guarantee netdev stats are immediately available to
> user.

I'm not talking general, just for this driver.
And I'm not giving any guarantees, that is why I choose the wording
"increase the accuracy" and not "provide accurate stats".

> ndo_get_stats() is not allowed to sleep, (because of bonding...), so
> drivers can not always provide accurate stats, if they need to make a
> long transaction with hardware.
> 
> Other drivers do the same (provide hardware statistics), with about one
> second resolution.

Yes, a lot of drivers don't provide accurate states. And one second
resolution, as most drivers use, will usually be good enough for most
admins.

Our usage pattern is a bit different, as we are very interested in
measuring bursty traffic.  I'm explain you why during the Netfilter
Workshop, if I get my talk about IPTV accepted ;-)

I also use the increased accuracy, when running my pktgen testing
scripts.


> So the "resulted in false observations from userspace." is something
> that might upset admins, but is not a hard requirement of netdev stats.

Its definitly not a hard requirement, but lets fix it where we can.

Then change the text:
 "resulted in false observations from userspace"
to:
 "resulted in inaccurate observations from userspace"

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH net-next] igb: fix stats handling
  2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
  2010-10-06  5:49                 ` Jeff Kirsher
  2010-10-06 11:36                 ` Jesper Dangaard Brouer
@ 2010-10-09 23:57                 ` Tantilov, Emil S
  2010-10-10  8:14                   ` [PATCH net-next V2] " Eric Dumazet
  2 siblings, 1 reply; 24+ messages in thread
From: Tantilov, Emil S @ 2010-10-09 23:57 UTC (permalink / raw)
  To: Eric Dumazet, Kirsher, Jeffrey T
  Cc: Jesper Dangaard Brouer, Duyck, Alexander H,
	Jesper Dangaard Brouer, David S. Miller, netdev, Wyborny, Carolyn

Eric Dumazet wrote:
> Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
> 
>> I'll let Intel guys doing the backporting work, but for old kernels,
>> you'll probably need to use "unsigned long" instead of "u64"
>> 
>> My plan is :
>> 
>> - Provide 64bit counters even on 32bit arch
>> - with proper synchro (include/linux/u64_stats_sync.h)
>> - Add a spinlock so we can apply Jesper patch.
> 
> Here is the net-next-2.6 patch, I am currently enable to test it, the
> dev machine with IGB NIC cannot be restarted until tomorrow, my son
> Nicolas is currently using it ;)
> 
> Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
> 
> Thanks !
> 
> [PATCH net-next] igb: fix stats handling
> 
> There are currently some problems with igb.
> 
> - On 32bit arches, maintaining 64bit counters without proper
> synchronization between writers and readers.
> 
> - Stats updated every two seconds, as reported by Jesper.
>    (Jesper provided a patch for this)
> 
> - Potential problem between worker thread and ethtool -S
> 
> This patch uses u64_stats_sync, and convert everything to be 64bit
> safe, 
> SMP safe, even on 32bit arches.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  drivers/net/igb/igb.h         |    7 +-
>  drivers/net/igb/igb_ethtool.c |   10 +-
>  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
>  3 files changed, 94 insertions(+), 34 deletions(-)

This patch is causing a hang when testing with 2 sessions in a while loop reading /proc/net/dev/ and ethtool -S. I think even just reading /proc/net/dev/ is sufficient, but have not confirmed it yet. I have seen the hang somewhere between 15 min to an hour. Without the patch same test ran 24+ hours without issues.

There was no trace on the screen, I got this with magic sysrq:

[15388.393579] SysRq : Show Regs 
[15388.397341] Modules linked in: igb [last unloaded: scsi_wait_scan] [15388.404846]  
[15388.406889] Pid: 16218, comm: kworker/4:1 Not tainted 2.6.36-rc3-net-next-igb-100810+ #2 S5520HC/S5520HC 
[15388.418393] EIP: 0060:[<c13fead2>] EFLAGS: 00000297 CPU: 4 
[15388.424908] EIP is at _raw_spin_lock+0x13/0x19 
[15388.430257] EAX: f6eab55c EBX: f6eab380 ECX: 00000001 EDX: 00004e4a [15388.437629] ESI: f6eab000 EDI: f6eab41c EBP: f3d9bf4c ESP: f3d9bf4c [15388.445011]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 
[15388.451422] Process kworker/4:1 (pid: 16218, ti=f3d9a000 task=f5ce0ed0 task.ti=f3d9a000) 
[15388.461173] Stack: 
[15388.463796]  f3d9bf64 f8586c57 f3d9bf5c f6eab41c f6901e00 c8703cc0 f3d9bf90 c1041379 
[15388.473116] <0> 00000004 00000000 f8586b16 00000000 c8707b05 c8707b00 f6901e00 c8703cc4 
[15388.483379] <0> c8703cc0 f3d9bfb8 c1042879 c16bb640 c8703cc4 00000c54 c16bb640 f6901e10 
[15388.494219] Call Trace: 
[15388.497336]  [<f8586c57>] ? igb_watchdog_task+0x141/0x21a [igb] 
[15388.504336]  [<c1041379>] ? process_one_work+0x18e/0x265 
[15388.510643]  [<f8586b16>] ? igb_watchdog_task+0x0/0x21a [igb] 
[15388.517455]  [<c1042879>] ? worker_thread+0xf3/0x1ef 
[15388.523384]  [<c1042786>] ? worker_thread+0x0/0x1ef 
[15388.529222]  [<c104506b>] ? kthread+0x62/0x67 
[15388.534475]  [<c1045009>] ? kthread+0x0/0x67 
[15388.539623]  [<c1002d36>] ? kernel_thread_helper+0x6/0x10 
[15388.546034] Code: 00 75 05 f0 66 0f b1 0a 0f 94 c1 0f b6 c1 85 c0 0f 95 c0 0f b6 c0 5d c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 9c 59 fa ba 00 01 00 00 f0 66 0f c1

Thanks,
Emil


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net-next V2] igb: fix stats handling
  2010-10-09 23:57                 ` Tantilov, Emil S
@ 2010-10-10  8:14                   ` Eric Dumazet
  2010-10-12  0:12                     ` Jeff Kirsher
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Dumazet @ 2010-10-10  8:14 UTC (permalink / raw)
  To: Tantilov, Emil S
  Cc: Kirsher, Jeffrey T, Jesper Dangaard Brouer, Duyck, Alexander H,
	Jesper Dangaard Brouer, David S. Miller, netdev, Wyborny, Carolyn

Le samedi 09 octobre 2010 à 17:57 -0600, Tantilov, Emil S a écrit :
> Eric Dumazet wrote:
> > Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
> > 
> >> I'll let Intel guys doing the backporting work, but for old kernels,
> >> you'll probably need to use "unsigned long" instead of "u64"
> >> 
> >> My plan is :
> >> 
> >> - Provide 64bit counters even on 32bit arch
> >> - with proper synchro (include/linux/u64_stats_sync.h)
> >> - Add a spinlock so we can apply Jesper patch.
> > 
> > Here is the net-next-2.6 patch, I am currently enable to test it, the
> > dev machine with IGB NIC cannot be restarted until tomorrow, my son
> > Nicolas is currently using it ;)
> > 
> > Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
> > 
> > Thanks !
> > 
> > [PATCH net-next] igb: fix stats handling
> > 
> > There are currently some problems with igb.
> > 
> > - On 32bit arches, maintaining 64bit counters without proper
> > synchronization between writers and readers.
> > 
> > - Stats updated every two seconds, as reported by Jesper.
> >    (Jesper provided a patch for this)
> > 
> > - Potential problem between worker thread and ethtool -S
> > 
> > This patch uses u64_stats_sync, and convert everything to be 64bit
> > safe, 
> > SMP safe, even on 32bit arches.
> > 
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > ---
> >  drivers/net/igb/igb.h         |    7 +-
> >  drivers/net/igb/igb_ethtool.c |   10 +-
> >  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
> >  3 files changed, 94 insertions(+), 34 deletions(-)
> 
> This patch is causing a hang when testing with 2 sessions in a while loop reading /proc/net/dev/ and ethtool -S. I think even just reading /proc/net/dev/ is sufficient, but have not confirmed it yet. I have seen the hang somewhere between 15 min to an hour. Without the patch same test ran 24+ hours without issues.
> 
> There was no trace on the screen, I got this with magic sysrq:
> 
> [15388.393579] SysRq : Show Regs 
> [15388.397341] Modules linked in: igb [last unloaded: scsi_wait_scan] [15388.404846]  
> [15388.406889] Pid: 16218, comm: kworker/4:1 Not tainted 2.6.36-rc3-net-next-igb-100810+ #2 S5520HC/S5520HC 
> [15388.418393] EIP: 0060:[<c13fead2>] EFLAGS: 00000297 CPU: 4 
> [15388.424908] EIP is at _raw_spin_lock+0x13/0x19 
> [15388.430257] EAX: f6eab55c EBX: f6eab380 ECX: 00000001 EDX: 00004e4a [15388.437629] ESI: f6eab000 EDI: f6eab41c EBP: f3d9bf4c ESP: f3d9bf4c [15388.445011]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 
> [15388.451422] Process kworker/4:1 (pid: 16218, ti=f3d9a000 task=f5ce0ed0 task.ti=f3d9a000) 
> [15388.461173] Stack: 
> [15388.463796]  f3d9bf64 f8586c57 f3d9bf5c f6eab41c f6901e00 c8703cc0 f3d9bf90 c1041379 
> [15388.473116] <0> 00000004 00000000 f8586b16 00000000 c8707b05 c8707b00 f6901e00 c8703cc4 
> [15388.483379] <0> c8703cc0 f3d9bfb8 c1042879 c16bb640 c8703cc4 00000c54 c16bb640 f6901e10 
> [15388.494219] Call Trace: 
> [15388.497336]  [<f8586c57>] ? igb_watchdog_task+0x141/0x21a [igb] 
> [15388.504336]  [<c1041379>] ? process_one_work+0x18e/0x265 
> [15388.510643]  [<f8586b16>] ? igb_watchdog_task+0x0/0x21a [igb] 
> [15388.517455]  [<c1042879>] ? worker_thread+0xf3/0x1ef 
> [15388.523384]  [<c1042786>] ? worker_thread+0x0/0x1ef 
> [15388.529222]  [<c104506b>] ? kthread+0x62/0x67 
> [15388.534475]  [<c1045009>] ? kthread+0x0/0x67 
> [15388.539623]  [<c1002d36>] ? kernel_thread_helper+0x6/0x10 
> [15388.546034] Code: 00 75 05 f0 66 0f b1 0a 0f 94 c1 0f b6 c1 85 c0 0f 95 c0 0f b6 c0 5d c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 9c 59 fa ba 00 01 00 00 f0 66 0f c1
> 
> Thanks,
> Emil
> 

Hi Emil, thanks for testing.

It seems one one the u64_stats_sync invariant is not met.

I believe problem comes from "restart_queue"

This one can be updated in // by two cpus.

So we can lose one of the u64_stats_update_begin() /
u64_stats_update_end() increment and freeze a reader.

So igb had a previous bug here, un-noticed :)

restart_queue can be updated by start_xmit() (so only one cpu can be
there, because of txqueue lock serialization), but it also can be
updated by the igb_clean_tx_irq() function (one cpu there too).

One solution to this problem is to use two separate counters, with two
separate syncp.

[PATCH net-next V2] igb: fix stats handling

There are currently some problems with igb.

- On 32bit arches, maintaining 64bit counters without proper
synchronization between writers and readers.

- Stats updated every two seconds, as reported by Jesper.
   (Jesper provided a patch for this)

- Potential problem between worker thread and ethtool -S

This patch uses u64_stats_sync, and convert everything to be 64bit safe,
SMP safe, even on 32bit arches. It integrates Jesper idea of providing
accurate stats at the time user reads them.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
V2: Add a second restart_queue field, with separate syncp, because
original restart_queue was potentially udpated by two cpus.
   Corrected igb_get_ethtool_stats() to also use appropriate syncp,
   and do the sum of two restart_queue fields.

 drivers/net/igb/igb.h         |    9 ++
 drivers/net/igb/igb_ethtool.c |   52 ++++++++++----
 drivers/net/igb/igb_main.c    |  113 +++++++++++++++++++++++---------
 3 files changed, 129 insertions(+), 45 deletions(-)

diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 44e0ff1..edab9c4 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -159,6 +159,7 @@ struct igb_tx_queue_stats {
 	u64 packets;
 	u64 bytes;
 	u64 restart_queue;
+	u64 restart_queue2;
 };
 
 struct igb_rx_queue_stats {
@@ -210,11 +211,14 @@ struct igb_ring {
 		/* TX */
 		struct {
 			struct igb_tx_queue_stats tx_stats;
+			struct u64_stats_sync tx_syncp;
+			struct u64_stats_sync tx_syncp2;
 			bool detect_tx_hung;
 		};
 		/* RX */
 		struct {
 			struct igb_rx_queue_stats rx_stats;
+			struct u64_stats_sync rx_syncp;
 			u32 rx_buffer_len;
 		};
 	};
@@ -288,6 +292,9 @@ struct igb_adapter {
 	struct timecompare compare;
 	struct hwtstamp_config hwtstamp_config;
 
+	spinlock_t stats64_lock;
+	struct rtnl_link_stats64 stats64;
+
 	/* structs defined in e1000_hw.h */
 	struct e1000_hw hw;
 	struct e1000_hw_stats stats;
@@ -357,7 +364,7 @@ extern netdev_tx_t igb_xmit_frame_ring_adv(struct sk_buff *, struct igb_ring *);
 extern void igb_unmap_and_free_tx_resource(struct igb_ring *,
 					   struct igb_buffer *);
 extern void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
-extern void igb_update_stats(struct igb_adapter *);
+extern void igb_update_stats(struct igb_adapter *, struct rtnl_link_stats64 *);
 extern bool igb_has_link(struct igb_adapter *adapter);
 extern void igb_set_ethtool_ops(struct net_device *);
 extern void igb_power_up_link(struct igb_adapter *);
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 26bf6a1..a70e16b 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -90,8 +90,8 @@ static const struct igb_stats igb_gstrings_stats[] = {
 
 #define IGB_NETDEV_STAT(_net_stat) { \
 	.stat_string = __stringify(_net_stat), \
-	.sizeof_stat = FIELD_SIZEOF(struct net_device_stats, _net_stat), \
-	.stat_offset = offsetof(struct net_device_stats, _net_stat) \
+	.sizeof_stat = FIELD_SIZEOF(struct rtnl_link_stats64, _net_stat), \
+	.stat_offset = offsetof(struct rtnl_link_stats64, _net_stat) \
 }
 static const struct igb_stats igb_gstrings_net_stats[] = {
 	IGB_NETDEV_STAT(rx_errors),
@@ -111,8 +111,9 @@ static const struct igb_stats igb_gstrings_net_stats[] = {
 	(sizeof(igb_gstrings_net_stats) / sizeof(struct igb_stats))
 #define IGB_RX_QUEUE_STATS_LEN \
 	(sizeof(struct igb_rx_queue_stats) / sizeof(u64))
-#define IGB_TX_QUEUE_STATS_LEN \
-	(sizeof(struct igb_tx_queue_stats) / sizeof(u64))
+
+#define IGB_TX_QUEUE_STATS_LEN 3 /* packets, bytes, restart_queue */
+
 #define IGB_QUEUE_STATS_LEN \
 	((((struct igb_adapter *)netdev_priv(netdev))->num_rx_queues * \
 	  IGB_RX_QUEUE_STATS_LEN) + \
@@ -2070,12 +2071,14 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 				  struct ethtool_stats *stats, u64 *data)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
-	struct net_device_stats *net_stats = &netdev->stats;
-	u64 *queue_stat;
-	int i, j, k;
+	struct rtnl_link_stats64 *net_stats = &adapter->stats64;
+	unsigned int start;
+	struct igb_ring *ring;
+	int i, j;
 	char *p;
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, net_stats);
 
 	for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
 		p = (char *)adapter + igb_gstrings_stats[i].stat_offset;
@@ -2088,15 +2091,36 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 			sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
 	}
 	for (j = 0; j < adapter->num_tx_queues; j++) {
-		queue_stat = (u64 *)&adapter->tx_ring[j]->tx_stats;
-		for (k = 0; k < IGB_TX_QUEUE_STATS_LEN; k++, i++)
-			data[i] = queue_stat[k];
+		u64	restart2;
+
+		ring = adapter->tx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			data[i]   = ring->tx_stats.packets;
+			data[i+1] = ring->tx_stats.bytes;
+			data[i+2] = ring->tx_stats.restart_queue;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp2);
+			restart2  = ring->tx_stats.restart_queue2;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp2, start));
+		data[i+2] += restart2;
+
+		i += IGB_TX_QUEUE_STATS_LEN;
 	}
 	for (j = 0; j < adapter->num_rx_queues; j++) {
-		queue_stat = (u64 *)&adapter->rx_ring[j]->rx_stats;
-		for (k = 0; k < IGB_RX_QUEUE_STATS_LEN; k++, i++)
-			data[i] = queue_stat[k];
+		ring = adapter->rx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			data[i]   = ring->rx_stats.packets;
+			data[i+1] = ring->rx_stats.bytes;
+			data[i+2] = ring->rx_stats.drops;
+			data[i+3] = ring->rx_stats.csum_err;
+			data[i+4] = ring->rx_stats.alloc_failed;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		i += IGB_RX_QUEUE_STATS_LEN;
 	}
+	spin_unlock(&adapter->stats64_lock);
 }
 
 static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 55edcb7..c8f1249 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -96,7 +96,6 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
 static void igb_free_all_tx_resources(struct igb_adapter *);
 static void igb_free_all_rx_resources(struct igb_adapter *);
 static void igb_setup_mrqc(struct igb_adapter *);
-void igb_update_stats(struct igb_adapter *);
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
@@ -113,7 +112,8 @@ static void igb_update_phy_info(unsigned long);
 static void igb_watchdog(unsigned long);
 static void igb_watchdog_task(struct work_struct *);
 static netdev_tx_t igb_xmit_frame_adv(struct sk_buff *skb, struct net_device *);
-static struct net_device_stats *igb_get_stats(struct net_device *);
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *dev,
+						 struct rtnl_link_stats64 *stats);
 static int igb_change_mtu(struct net_device *, int);
 static int igb_set_mac(struct net_device *, void *);
 static void igb_set_uta(struct igb_adapter *adapter);
@@ -1536,7 +1536,9 @@ void igb_down(struct igb_adapter *adapter)
 	netif_carrier_off(netdev);
 
 	/* record the stats before reset*/
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	adapter->link_speed = 0;
 	adapter->link_duplex = 0;
@@ -1689,7 +1691,7 @@ static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
 	.ndo_start_xmit		= igb_xmit_frame_adv,
-	.ndo_get_stats		= igb_get_stats,
+	.ndo_get_stats64	= igb_get_stats64,
 	.ndo_set_rx_mode	= igb_set_rx_mode,
 	.ndo_set_multicast_list	= igb_set_rx_mode,
 	.ndo_set_mac_address	= igb_set_mac,
@@ -2276,6 +2278,7 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
+	spin_lock_init(&adapter->stats64_lock);
 #ifdef CONFIG_PCI_IOV
 	if (hw->mac.type == e1000_82576)
 		adapter->vfs_allocated_count = (max_vfs > 7) ? 7 : max_vfs;
@@ -3483,7 +3486,9 @@ static void igb_watchdog_task(struct work_struct *work)
 		}
 	}
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *tx_ring = adapter->tx_ring[i];
@@ -3550,6 +3555,8 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 	int new_val = q_vector->itr_val;
 	int avg_wire_size = 0;
 	struct igb_adapter *adapter = q_vector->adapter;
+	struct igb_ring *ring;
+	unsigned int packets;
 
 	/* For non-gigabit speeds, just fix the interrupt rate at 4000
 	 * ints/sec - ITR timer value of 120 ticks.
@@ -3559,16 +3566,21 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 		goto set_itr_val;
 	}
 
-	if (q_vector->rx_ring && q_vector->rx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->rx_ring;
-		avg_wire_size = ring->total_bytes / ring->total_packets;
+	ring = q_vector->rx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets) 
+			avg_wire_size = ring->total_bytes / packets;
 	}
 
-	if (q_vector->tx_ring && q_vector->tx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->tx_ring;
-		avg_wire_size = max_t(u32, avg_wire_size,
-		                      (ring->total_bytes /
-		                       ring->total_packets));
+	ring = q_vector->tx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets)
+			avg_wire_size = max_t(u32, avg_wire_size,
+			                      ring->total_bytes / packets);
 	}
 
 	/* if avg_wire_size isn't set no work was done */
@@ -4077,7 +4089,11 @@ static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
 
 	/* A reprieve! */
 	netif_wake_subqueue(netdev, tx_ring->queue_index);
-	tx_ring->tx_stats.restart_queue++;
+
+	u64_stats_update_begin(&tx_ring->tx_syncp2);
+	tx_ring->tx_stats.restart_queue2++;
+	u64_stats_update_end(&tx_ring->tx_syncp2);
+
 	return 0;
 }
 
@@ -4214,16 +4230,22 @@ static void igb_reset_task(struct work_struct *work)
 }
 
 /**
- * igb_get_stats - Get System Network Statistics
+ * igb_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
+ * @stats: rtnl_link_stats64 pointer
  *
- * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
  **/
-static struct net_device_stats *igb_get_stats(struct net_device *netdev)
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *netdev,
+						 struct rtnl_link_stats64 *stats)
 {
-	/* only return the current stats */
-	return &netdev->stats;
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	memcpy(stats, &adapter->stats64, sizeof(*stats));
+	spin_unlock(&adapter->stats64_lock);
+
+	return stats;
 }
 
 /**
@@ -4305,15 +4327,17 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
  * @adapter: board private structure
  **/
 
-void igb_update_stats(struct igb_adapter *adapter)
+void igb_update_stats(struct igb_adapter *adapter,
+		      struct rtnl_link_stats64 *net_stats)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;
 	u16 phy_tmp;
 	int i;
 	u64 bytes, packets;
+	unsigned int start;
+	u64 _bytes, _packets;
 
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
 
@@ -4331,10 +4355,17 @@ void igb_update_stats(struct igb_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		u32 rqdpc_tmp = rd32(E1000_RQDPC(i)) & 0x0FFF;
 		struct igb_ring *ring = adapter->rx_ring[i];
+
 		ring->rx_stats.drops += rqdpc_tmp;
 		net_stats->rx_fifo_errors += rqdpc_tmp;
-		bytes += ring->rx_stats.bytes;
-		packets += ring->rx_stats.packets;
+		
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			_bytes = ring->rx_stats.bytes;
+			_packets = ring->rx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 
 	net_stats->rx_bytes = bytes;
@@ -4344,8 +4375,13 @@ void igb_update_stats(struct igb_adapter *adapter)
 	packets = 0;
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *ring = adapter->tx_ring[i];
-		bytes += ring->tx_stats.bytes;
-		packets += ring->tx_stats.packets;
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			_bytes = ring->tx_stats.bytes;
+			_packets = ring->tx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 	net_stats->tx_bytes = bytes;
 	net_stats->tx_packets = packets;
@@ -5397,7 +5433,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 		if (__netif_subqueue_stopped(netdev, tx_ring->queue_index) &&
 		    !(test_bit(__IGB_DOWN, &adapter->state))) {
 			netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+			u64_stats_update_begin(&tx_ring->tx_syncp);
 			tx_ring->tx_stats.restart_queue++;
+			u64_stats_update_end(&tx_ring->tx_syncp);
 		}
 	}
 
@@ -5437,8 +5476,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 	}
 	tx_ring->total_bytes += total_bytes;
 	tx_ring->total_packets += total_packets;
+	u64_stats_update_begin(&tx_ring->tx_syncp);
 	tx_ring->tx_stats.bytes += total_bytes;
 	tx_ring->tx_stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->tx_syncp);
 	return count < tx_ring->count;
 }
 
@@ -5480,9 +5521,11 @@ static inline void igb_rx_checksum_adv(struct igb_ring *ring,
 		 * packets, (aka let the stack check the crc32c)
 		 */
 		if ((skb->len == 60) &&
-		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM))
+		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM)) {
+			u64_stats_update_begin(&ring->rx_syncp);
 			ring->rx_stats.csum_err++;
-
+			u64_stats_update_end(&ring->rx_syncp);
+		}
 		/* let the stack verify checksum errors */
 		return;
 	}
@@ -5669,8 +5712,10 @@ next_desc:
 
 	rx_ring->total_packets += total_packets;
 	rx_ring->total_bytes += total_bytes;
+	u64_stats_update_begin(&rx_ring->rx_syncp);
 	rx_ring->rx_stats.packets += total_packets;
 	rx_ring->rx_stats.bytes += total_bytes;
+	u64_stats_update_end(&rx_ring->rx_syncp);
 	return cleaned;
 }
 
@@ -5698,8 +5743,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		if ((bufsz < IGB_RXBUFFER_1024) && !buffer_info->page_dma) {
 			if (!buffer_info->page) {
 				buffer_info->page = netdev_alloc_page(netdev);
-				if (!buffer_info->page) {
+				if (unlikely(!buffer_info->page)) {
+					u64_stats_update_begin(&rx_ring->rx_syncp);
 					rx_ring->rx_stats.alloc_failed++;
+					u64_stats_update_end(&rx_ring->rx_syncp);
 					goto no_buffers;
 				}
 				buffer_info->page_offset = 0;
@@ -5714,7 +5761,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->page_dma)) {
 				buffer_info->page_dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 		}
@@ -5722,8 +5771,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		skb = buffer_info->skb;
 		if (!skb) {
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-			if (!skb) {
+			if (unlikely(!skb)) {
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 
@@ -5737,7 +5788,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->dma)) {
 				buffer_info->dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 		}



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next V2] igb: fix stats handling
  2010-10-10  8:14                   ` [PATCH net-next V2] " Eric Dumazet
@ 2010-10-12  0:12                     ` Jeff Kirsher
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2010-10-12  0:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tantilov, Emil S, Jesper Dangaard Brouer, Duyck, Alexander H,
	Jesper Dangaard Brouer, David S. Miller, netdev, Wyborny, Carolyn

On Sun, Oct 10, 2010 at 01:14, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le samedi 09 octobre 2010 à 17:57 -0600, Tantilov, Emil S a écrit :
>> Eric Dumazet wrote:
>> > Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
>> >
>> >> I'll let Intel guys doing the backporting work, but for old kernels,
>> >> you'll probably need to use "unsigned long" instead of "u64"
>> >>
>> >> My plan is :
>> >>
>> >> - Provide 64bit counters even on 32bit arch
>> >> - with proper synchro (include/linux/u64_stats_sync.h)
>> >> - Add a spinlock so we can apply Jesper patch.
>> >
>> > Here is the net-next-2.6 patch, I am currently enable to test it, the
>> > dev machine with IGB NIC cannot be restarted until tomorrow, my son
>> > Nicolas is currently using it ;)
>> >
>> > Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
>> >
>> > Thanks !
>> >
>> > [PATCH net-next] igb: fix stats handling
>> >
>> > There are currently some problems with igb.
>> >
>> > - On 32bit arches, maintaining 64bit counters without proper
>> > synchronization between writers and readers.
>> >
>> > - Stats updated every two seconds, as reported by Jesper.
>> >    (Jesper provided a patch for this)
>> >
>> > - Potential problem between worker thread and ethtool -S
>> >
>> > This patch uses u64_stats_sync, and convert everything to be 64bit
>> > safe,
>> > SMP safe, even on 32bit arches.
>> >
>> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> > ---
>> >  drivers/net/igb/igb.h         |    7 +-
>> >  drivers/net/igb/igb_ethtool.c |   10 +-
>> >  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
>> >  3 files changed, 94 insertions(+), 34 deletions(-)
>>
>> This patch is causing a hang when testing with 2 sessions in a while loop reading /proc/net/dev/ and ethtool -S. I think even just reading /proc/net/dev/ is sufficient, but have not confirmed it yet. I have seen the hang somewhere between 15 min to an hour. Without the patch same test ran 24+ hours without issues.
>>
>> There was no trace on the screen, I got this with magic sysrq:
>>
>> [15388.393579] SysRq : Show Regs
>> [15388.397341] Modules linked in: igb [last unloaded: scsi_wait_scan] [15388.404846]
>> [15388.406889] Pid: 16218, comm: kworker/4:1 Not tainted 2.6.36-rc3-net-next-igb-100810+ #2 S5520HC/S5520HC
>> [15388.418393] EIP: 0060:[<c13fead2>] EFLAGS: 00000297 CPU: 4
>> [15388.424908] EIP is at _raw_spin_lock+0x13/0x19
>> [15388.430257] EAX: f6eab55c EBX: f6eab380 ECX: 00000001 EDX: 00004e4a [15388.437629] ESI: f6eab000 EDI: f6eab41c EBP: f3d9bf4c ESP: f3d9bf4c [15388.445011]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [15388.451422] Process kworker/4:1 (pid: 16218, ti=f3d9a000 task=f5ce0ed0 task.ti=f3d9a000)
>> [15388.461173] Stack:
>> [15388.463796]  f3d9bf64 f8586c57 f3d9bf5c f6eab41c f6901e00 c8703cc0 f3d9bf90 c1041379
>> [15388.473116] <0> 00000004 00000000 f8586b16 00000000 c8707b05 c8707b00 f6901e00 c8703cc4
>> [15388.483379] <0> c8703cc0 f3d9bfb8 c1042879 c16bb640 c8703cc4 00000c54 c16bb640 f6901e10
>> [15388.494219] Call Trace:
>> [15388.497336]  [<f8586c57>] ? igb_watchdog_task+0x141/0x21a [igb]
>> [15388.504336]  [<c1041379>] ? process_one_work+0x18e/0x265
>> [15388.510643]  [<f8586b16>] ? igb_watchdog_task+0x0/0x21a [igb]
>> [15388.517455]  [<c1042879>] ? worker_thread+0xf3/0x1ef
>> [15388.523384]  [<c1042786>] ? worker_thread+0x0/0x1ef
>> [15388.529222]  [<c104506b>] ? kthread+0x62/0x67
>> [15388.534475]  [<c1045009>] ? kthread+0x0/0x67
>> [15388.539623]  [<c1002d36>] ? kernel_thread_helper+0x6/0x10
>> [15388.546034] Code: 00 75 05 f0 66 0f b1 0a 0f 94 c1 0f b6 c1 85 c0 0f 95 c0 0f b6 c0 5d c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 9c 59 fa ba 00 01 00 00 f0 66 0f c1
>>
>> Thanks,
>> Emil
>>
>
> Hi Emil, thanks for testing.
>
> It seems one one the u64_stats_sync invariant is not met.
>
> I believe problem comes from "restart_queue"
>
> This one can be updated in // by two cpus.
>
> So we can lose one of the u64_stats_update_begin() /
> u64_stats_update_end() increment and freeze a reader.
>
> So igb had a previous bug here, un-noticed :)
>
> restart_queue can be updated by start_xmit() (so only one cpu can be
> there, because of txqueue lock serialization), but it also can be
> updated by the igb_clean_tx_irq() function (one cpu there too).
>
> One solution to this problem is to use two separate counters, with two
> separate syncp.
>
> [PATCH net-next V2] igb: fix stats handling
>
> There are currently some problems with igb.
>
> - On 32bit arches, maintaining 64bit counters without proper
> synchronization between writers and readers.
>
> - Stats updated every two seconds, as reported by Jesper.
>   (Jesper provided a patch for this)
>
> - Potential problem between worker thread and ethtool -S
>
> This patch uses u64_stats_sync, and convert everything to be 64bit safe,
> SMP safe, even on 32bit arches. It integrates Jesper idea of providing
> accurate stats at the time user reads them.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> V2: Add a second restart_queue field, with separate syncp, because
> original restart_queue was potentially udpated by two cpus.
>   Corrected igb_get_ethtool_stats() to also use appropriate syncp,
>   and do the sum of two restart_queue fields.
>
>  drivers/net/igb/igb.h         |    9 ++
>  drivers/net/igb/igb_ethtool.c |   52 ++++++++++----
>  drivers/net/igb/igb_main.c    |  113 +++++++++++++++++++++++---------
>  3 files changed, 129 insertions(+), 45 deletions(-)
>

Thanks Eric, I have added the updated patch to my queue.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-10-12  0:18 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-05 14:18 [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jesper Dangaard Brouer
2010-10-05 14:41 ` Eric Dumazet
2010-10-05 14:53   ` Jesper Dangaard Brouer
2010-10-05 15:19     ` Eric Dumazet
2010-10-05 21:01       ` Eric Dumazet
2010-10-05 21:16         ` Jesper Dangaard Brouer
2010-10-05 22:34           ` Jeff Kirsher
2010-10-06  3:28             ` Eric Dumazet
2010-10-06  4:36               ` [PATCH net-next] igb: fix stats handling Eric Dumazet
2010-10-06  5:49                 ` Jeff Kirsher
2010-10-06 11:36                 ` Jesper Dangaard Brouer
2010-10-09 23:57                 ` Tantilov, Emil S
2010-10-10  8:14                   ` [PATCH net-next V2] " Eric Dumazet
2010-10-12  0:12                     ` Jeff Kirsher
2010-10-06  5:47               ` [net-next PATCH] igb: update adapter stats when reading /proc/net/dev Jeff Kirsher
2010-10-07  6:36 ` David Miller
2010-10-07  6:44   ` Eric Dumazet
2010-10-07  6:46     ` David Miller
2010-10-07 10:06       ` Jesper Dangaard Brouer
2010-10-07 10:24         ` Eric Dumazet
2010-10-07 11:36           ` Jesper Dangaard Brouer
  -- strict thread matches above, loose matches on Subject: below --
2010-01-12 13:41 Jesper Dangaard Brouer
2010-01-14  1:30 ` David Miller
2010-01-14  3:01   ` Jeff Kirsher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).