public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 64 bit netdev stats counter
@ 2002-07-12 19:12 Matt Stegman
  0 siblings, 0 replies; 10+ messages in thread
From: Matt Stegman @ 2002-07-12 19:12 UTC (permalink / raw)
  To: linux-kernel

Hello,

I have a linux fileserver that wraps around the 'RX bytes' and 'TX bytes'
counters once every couple of days.  Since these variables are "unsigned
long" in the kernel (32 bits on x86) it wraps at four gigabytes.  I
patched a kernel (2.4.19-pre10-ac2) to make these two variables, along
with RX and TX packet counters, "unsigned long long"  which is 64 bits on
x86.

This seems to work OK.  The "ifconfig" command (from RedHat net-tools-1.60
RPM) already defines these values as "unsigned long long", so I don't have
any problems with ifconfig reporting weird numbers.

I did notice that different classes of network devices seem to use
different structs, so I guess this would only work for ethernet devices.

This is the first hacking of any kind I've done on the kernel.  Any
comments would be appreciated.



The patch I used:
diff -urN linux-orig/include/linux/netdevice.h linux/include/linux/netdevice.h
--- linux-orig/include/linux/netdevice.h	Tue Jul  9 13:28:43 2002
+++ linux/include/linux/netdevice.h	Tue Jul  9 13:48:05 2002
@@ -96,10 +96,10 @@

 struct net_device_stats
 {
-	unsigned long	rx_packets;		/* total packets received	*/
-	unsigned long	tx_packets;		/* total packets transmitted	*/
-	unsigned long	rx_bytes;		/* total bytes received 	*/
-	unsigned long	tx_bytes;		/* total bytes transmitted	*/
+	unsigned long long rx_packets;		/* total packets received	*/
+	unsigned long long tx_packets;		/* total packets transmitted	*/
+	unsigned long long rx_bytes;		/* total bytes received 	*/
+	unsigned long long tx_bytes;		/* total bytes transmitted	*/
 	unsigned long	rx_errors;		/* bad packets received		*/
 	unsigned long	tx_errors;		/* packet transmit problems	*/
 	unsigned long	rx_dropped;		/* no space in linux buffers	*/
diff -urN linux-orig/net/core/dev.c linux/net/core/dev.c
--- linux-orig/net/core/dev.c	Tue Jul  9 13:28:44 2002
+++ linux/net/core/dev.c	Tue Jul  9 13:45:03 2002
@@ -1689,7 +1689,7 @@
 	int size;

 	if (stats)
-		size = sprintf(buffer, "%6s:%8lu %7lu %4lu %4lu %4lu %5lu %10lu %9lu %8lu %7lu %4lu %4lu %4lu %5lu %7lu %10lu\n",
+		size = sprintf(buffer, "%6s:%8llu %7llu %4lu %4lu %4lu %5lu %10lu %9lu %8llu %7llu %4lu %4lu %4lu %5lu %7lu %10lu\n",
  		   dev->name,
 		   stats->rx_bytes,
 		   stats->rx_packets, stats->rx_errors,

$ cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   25660      71    0    0    0     0          0         0    25660      71    0    0    0     0       0          0
  eth0:7208362840 12115561    0    0    0     0          0         0 3171323568 5493455    0    0    0     0       0          0
$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:a.b.c.d  Bcast:a.b.e.f  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:12115561 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5493455 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:7208362840 (6874.4 Mb)  TX bytes:3171323568 (3024.4 Mb)
          Interrupt:5 Base address:0x2000


-- 
      -Matt Stegman



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 22:02 ` 64 bit netdev stats counter Matt Stegman
@ 2002-07-12 21:58   ` David S. Miller
  2002-07-12 23:20     ` Alan Cox
  2002-07-13  0:50     ` kuznet
  0 siblings, 2 replies; 10+ messages in thread
From: David S. Miller @ 2002-07-12 21:58 UTC (permalink / raw)
  To: matts; +Cc: shemminger, linux-kernel

   From: Matt Stegman <matts@ksu.edu>
   Date: Fri, 12 Jul 2002 17:02:07 -0500 (CDT)

   On 12 Jul 2002, Stephen Hemminger wrote:
   
   > 64 bit values are not atomic so on x86 there will be glitches if this
   > ever wraps over on an SMP machine.  One other engineer is already
   > adressing this for inode values like size; but this would introduce the
   > same problem for network stuff.
   
   One other solution I thought of would be to have an rx_bytes_wrap counter
   in the struct.

32-bit values aren't atomic either, what is the issue?
We don't use atomic_t ops on these counters so they aren't
guarenteed in any way right now even.  GCC is going to
output "incl MEM" or similar for net_stats->counter++, since
it lacks the 'lock;' prefix it is not atomic.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
       [not found] <1026503694.26819.4.camel@dell_ss3.pdx.osdl.net>
@ 2002-07-12 22:02 ` Matt Stegman
  2002-07-12 21:58   ` David S. Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Matt Stegman @ 2002-07-12 22:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: linux-kernel

On 12 Jul 2002, Stephen Hemminger wrote:

> 64 bit values are not atomic so on x86 there will be glitches if this
> ever wraps over on an SMP machine.  One other engineer is already
> adressing this for inode values like size; but this would introduce the
> same problem for network stuff.

One other solution I thought of would be to have an rx_bytes_wrap counter
in the struct.

struct net_device_stats {
...
	unsigned long rx_bytes;
	unsigned long rx_bytes_wrap;
	unsigned long tx_bytes;
	unsigned long tx_bytes_wrap;
...
}

... and then, everywhere we add to rx_bytes (or tx_bytes):

stats->rx_bytes += pkt_length;
if (stats->rx_bytes <= pkt_length) stats->rx_bytes_wrap++;

I suppose this might also be more backwards compatible - if other code I
don't know about expects the rx_bytes to be a long, this would keep it so.
Also, if gcc has real problems with doing 64-bit math on a 32-bit
processor, this keeps everything in 32-bits.  I could then make a local
"unsigned long long" variable in the code that prints /proc/net/dev, and
in ifconfig.

unsigned long long rx_total_bytes = stats->rx_bytes * stats->rx_bytes_wrap;
sprintf(buf, "RX bytes:%llu", stats->rx_total_bytes);

-- 
      -Matt Stegman





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 23:20     ` Alan Cox
@ 2002-07-12 22:06       ` David S. Miller
  2002-07-12 22:27         ` Andrew Morton
  2002-07-12 22:31       ` Chris Friesen
  1 sibling, 1 reply; 10+ messages in thread
From: David S. Miller @ 2002-07-12 22:06 UTC (permalink / raw)
  To: alan; +Cc: matts, shemminger, linux-kernel

   From: Alan Cox <alan@lxorguk.ukuu.org.uk>
   Date: 13 Jul 2002 00:20:53 +0100

   gcc will output
   
   		increment low 32bit
   		if zero
   			increment high
   
   Which means we can rapidly get 2^32 out of sync
   
True and this equals the "fix" suggested C code involving
two 32-bit counters that the original author posted :-)

So this makes both cases equally wrong.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 22:06       ` David S. Miller
@ 2002-07-12 22:27         ` Andrew Morton
  2002-07-12 22:28           ` David S. Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2002-07-12 22:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: alan, matts, shemminger, linux-kernel

"David S. Miller" wrote:
> 
>    From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>    Date: 13 Jul 2002 00:20:53 +0100
> 
>    gcc will output
> 
>                 increment low 32bit
>                 if zero
>                         increment high
> 
>    Which means we can rapidly get 2^32 out of sync
> 
> True and this equals the "fix" suggested C code involving
> two 32-bit counters that the original author posted :-)
> 

Could you make the counters per-cpu and only add them all up
in the rare case where someone wants to read the stats?

And then change all the drivers ;)

-

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 22:27         ` Andrew Morton
@ 2002-07-12 22:28           ` David S. Miller
  0 siblings, 0 replies; 10+ messages in thread
From: David S. Miller @ 2002-07-12 22:28 UTC (permalink / raw)
  To: akpm; +Cc: alan, matts, shemminger, linux-kernel

   From: Andrew Morton <akpm@zip.com.au>
   Date: Fri, 12 Jul 2002 15:27:09 -0700
   
   Could you make the counters per-cpu and only add them all up
   in the rare case where someone wants to read the stats?
   
   And then change all the drivers ;)

Since spinlocks are held %99 of the time when these counters are
bumped, I'd like to suggest another more sane and space optimized
solution :-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 23:20     ` Alan Cox
  2002-07-12 22:06       ` David S. Miller
@ 2002-07-12 22:31       ` Chris Friesen
  2002-07-13  2:18         ` Alan Cox
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Friesen @ 2002-07-12 22:31 UTC (permalink / raw)
  To: Alan Cox; +Cc: David S. Miller, linux-kernel

Alan Cox wrote:
> 
> On Fri, 2002-07-12 at 22:58, David S. Miller wrote:
> > 32-bit values aren't atomic either, what is the issue?
> > We don't use atomic_t ops on these counters so they aren't
> > guarenteed in any way right now even.  GCC is going to
> > output "incl MEM" or similar for net_stats->counter++, since
> > it lacks the 'lock;' prefix it is not atomic.
> 
> The behaviour is quite different though. On a 32bit counter the worst we
> do is lose a few counts. On a 64bit one on 32bit cpus its quite likely
> gcc will output
> 
>                 increment low 32bit
>                 if zero
>                         increment high
> 
> Which means we can rapidly get 2^32 out of sync

Isn't this the same as 32-bit counters on a machine that doesn't do atomic
32-bit ops?  Although in that case you could only be 2^16 off...

Chris

-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 21:58   ` David S. Miller
@ 2002-07-12 23:20     ` Alan Cox
  2002-07-12 22:06       ` David S. Miller
  2002-07-12 22:31       ` Chris Friesen
  2002-07-13  0:50     ` kuznet
  1 sibling, 2 replies; 10+ messages in thread
From: Alan Cox @ 2002-07-12 23:20 UTC (permalink / raw)
  To: David S. Miller; +Cc: matts, shemminger, linux-kernel

On Fri, 2002-07-12 at 22:58, David S. Miller wrote:
> 32-bit values aren't atomic either, what is the issue?
> We don't use atomic_t ops on these counters so they aren't
> guarenteed in any way right now even.  GCC is going to
> output "incl MEM" or similar for net_stats->counter++, since
> it lacks the 'lock;' prefix it is not atomic.

The behaviour is quite different though. On a 32bit counter the worst we
do is lose a few counts. On a 64bit one on 32bit cpus its quite likely
gcc will output

		increment low 32bit
		if zero
			increment high

Which means we can rapidly get 2^32 out of sync


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 21:58   ` David S. Miller
  2002-07-12 23:20     ` Alan Cox
@ 2002-07-13  0:50     ` kuznet
  1 sibling, 0 replies; 10+ messages in thread
From: kuznet @ 2002-07-13  0:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

Hello!

> 32-bit values aren't atomic either, what is the issue?
...
> output "incl MEM" or similar for net_stats->counter++, since
> it lacks the 'lock;' prefix it is not atomic.

The issue is that this does not matter, all the updates to counters
are serialized by driver logic in any case.

The counters were atomic wrt _read_ i.e. read used to produce valid result
rather than a garbage. We have discussed this some time ago, someone
proposed to use pair of 32bit numbers and prohibiting direct read,
fetching  the result with a macro sort of

	  do {
	     result_lo = lo;
	     result_hi = hi;
	  } while (lo != result_lo);

or something similar. Well, plus some barriers when/if needed.

Honestly, I do not feel any enthusiasm about doing this in kernel.
Some small bit of useless work each packet, some minor waste of memory,
some minor crap in code... All this is not essential, but does not cause any
enthisiasm yet. :-)

Alexey


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 64 bit netdev stats counter
  2002-07-12 22:31       ` Chris Friesen
@ 2002-07-13  2:18         ` Alan Cox
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Cox @ 2002-07-13  2:18 UTC (permalink / raw)
  To: Chris Friesen; +Cc: David S. Miller, linux-kernel

On Fri, 2002-07-12 at 23:31, Chris Friesen wrote:
> 
> Isn't this the same as 32-bit counters on a machine that doesn't do atomic
> 32-bit ops?  Although in that case you could only be 2^16 off...

Yes but we don't happen to have any of those I care about 8)


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2002-07-13  1:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1026503694.26819.4.camel@dell_ss3.pdx.osdl.net>
2002-07-12 22:02 ` 64 bit netdev stats counter Matt Stegman
2002-07-12 21:58   ` David S. Miller
2002-07-12 23:20     ` Alan Cox
2002-07-12 22:06       ` David S. Miller
2002-07-12 22:27         ` Andrew Morton
2002-07-12 22:28           ` David S. Miller
2002-07-12 22:31       ` Chris Friesen
2002-07-13  2:18         ` Alan Cox
2002-07-13  0:50     ` kuznet
2002-07-12 19:12 Matt Stegman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox