occasionally corrupted network stats in /proc/net/dev

From: Mark Seger <Mark.Seger@hp.com>
To: netdev@vger.kernel.org
Subject: occasionally corrupted network stats in /proc/net/dev
Date: Mon, 14 Jan 2008 12:20:38 -0500
Message-ID: <478B99E6.2050800@hp.com>

I posted the following on linux-net and haven't seen any responses, 
possibly because nobody had an answer or because that list is obsolete.  
I have been told this is the current list for everything networking on 
linux so I thought I'd try again...

I suspect the answer will be that it is what it is, but here's the 
deal.  I have a tool I use for monitoring network traffic among other 
things - see http://collectl.sourceforge.net/ - and one of its benefits  
is that you can run it continuously as a daemon (similar to sar) and 
generate data in a format suitable for plotting.  This means that you 
can automate your entire network monitoring infrastructure at fairly 
fine granularity, down to one second if you like.  Actually 1-second level 
monitoring will provide incorrect data on earlier kernels because the 
stats aren't updated on 1 second boundaries and you need to monitor at 
an interval of 0.9765 seconds, but that's a different story which is 
explained at http://collectl.sourceforge.net/NetworkStats.html

But more importantly, I've found that occasionally (not that often) 
there is bogus data reported from /proc/net/dev.  While I don't have a 
lot of details on this it seems to only show up in 64 bit kernels.  Look 
at the following samples taken at 1 second intervals:

 eth0:135115809 1024897    0    0    0     0          0         9 135458926  910340    0    0    0     0       0          0
 eth0:135118023 1024923    0    0    0     0          0         9 135460952  910363    0    0    0     0       0          0
 eth0:        0  884620    0    0    0     0          0    909397 9687563 1049736    0    0    0     0       0          0
 eth0:135121189 1024957    0    0    0     0          0         9 135464222  910400    0    0    0     0       0          0
 eth0:135129565 1024995    0    0    0     0          0         9 135473687  910435    0    0    0     0       0          0

See the middle sample?  When I look at the change between samples it 
generates a really big number since the difference is assumed to be 
caused by a counter wrapping.  The problem is that bad data is not 
always straightforward to spot.  For example, if the original and 
bogus values are close enough it's not even clear there is a problem.
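
To make the delta calculation concrete, here is a minimal sketch in 
Python.  It is not collectl's actual code: the 32-bit wrap modulus, the 
helper names (parse_line, delta, is_plausible) and the 1 Gbit/s ceiling 
are all assumptions made for illustration.  It uses the received-bytes 
field from the five samples above.

```python
# Sketch only: the wrap modulus and the plausibility ceiling are assumptions.
WRAP_MODULUS = 2**32          # assumed counter width for wrap handling
MAX_BYTES_PER_SEC = 125_000_000  # assumed ceiling: a 1 Gbit/s link

def parse_line(line):
    """Split one /proc/net/dev interface line into (name, list of ints)."""
    name, _, rest = line.partition(":")
    return name.strip(), [int(field) for field in rest.split()]

def delta(prev, cur, modulus=WRAP_MODULUS):
    """Change between two samples, treating any decrease as a counter wrap."""
    return cur - prev if cur >= prev else cur + modulus - prev

def is_plausible(change, interval_sec=1.0, ceiling=MAX_BYTES_PER_SEC):
    """Hypothetical sanity check: reject changes above the assumed link rate."""
    return change <= ceiling * interval_sec

samples = [
    "eth0:135115809 1024897 0 0 0 0 0 9 135458926 910340 0 0 0 0 0 0",
    "eth0:135118023 1024923 0 0 0 0 0 9 135460952 910363 0 0 0 0 0 0",
    "eth0:        0  884620 0 0 0 0 0 909397 9687563 1049736 0 0 0 0 0 0",
    "eth0:135121189 1024957 0 0 0 0 0 9 135464222 910400 0 0 0 0 0 0",
    "eth0:135129565 1024995 0 0 0 0 0 9 135473687 910435 0 0 0 0 0 0",
]

# Field 0 after the interface name is the received-bytes counter.
rx_bytes = [parse_line(s)[1][0] for s in samples]
rx_deltas = [delta(a, b) for a, b in zip(rx_bytes, rx_bytes[1:])]
suspect = [not is_plausible(d) for d in rx_deltas]

for change, bad in zip(rx_deltas, suspect):
    print(change, "SUSPECT" if bad else "ok")
```

With these assumptions the bogus sample produces two suspect intervals, 
one going into it and one coming out of it.  A ceiling like this only 
catches large jumps, so a bogus value close to the real one would still 
pass unnoticed.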

So the obvious question is, is there any way to prevent the bogus data 
from getting reported?   If not, is there any way to set the values to 
something to indicate that the correct values can't be determined?  
Clearly this problem would be visible to any tool that looks at /proc, 
but since many tools are not automated or don't take it to the level I 
do, probably nobody notices.  As for the counter update frequency, even 
though they now appear to be updated closer to a 1 second boundary it 
also means tools that can monitor at sub-second intervals will report 
incorrect data since the counters only change once a second.

-mark

Thread overview: 9+ messages
2008-01-14 17:20 Mark Seger [this message]
2008-01-14 17:38 ` occasionally corrupted network stats in /proc/net/dev Eric Dumazet
2008-01-14 18:08 ` Ben Greear
2008-01-14 18:24   ` Mark Seger
2008-01-14 18:51     ` Mark Seger
2008-01-14 19:01       ` Ben Greear
2008-01-14 19:12       ` Eric Dumazet
2008-01-14 20:41         ` Michael Chan
2008-01-14 20:05           ` Mark Seger
