netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
       [not found] <482DA5B6.1020606@sngx.net>
@ 2008-05-16 20:00 ` David Miller
  2008-05-16 20:09   ` Rick Jones
       [not found] ` <482DB46A.8020103@cosmosbay.com>
  1 sibling, 1 reply; 38+ messages in thread
From: David Miller @ 2008-05-16 20:00 UTC (permalink / raw)
  To: jimi; +Cc: netdev, linux-kernel

From: James Cammarata <jimi@sngx.net>
Date: Fri, 16 May 2008 10:18:14 -0500

We saw your initial patch posting, you don't need to send this
again.

netdev added to CC:

> From: James Cammarata <jimi@sngx.net>
> 
> Added the ability to write to /proc/net/dev in order to clear the interface counters for a given interface.  The ability to zero out counters (especially the error counters) is extremely useful when troubleshooting interface issues.  Now diffed against 2.6.25.4, and added some sanity checking.
> 
> Syntax:
>   echo 'net clear-stats ifdev' > /proc/net/dev
> Where "ifdev" is the device name you wish to clear.
> 
> This code is based mainly on the code found in drivers/scsi/scsi_proc.c
> 
> Signed-off-by: James Cammarata <jimi@sngx.net>
> ---
> 
> --- linux-2.6.25.4/net/core/dev.c       2008-05-15 10:00:12.000000000 -0500
> +++ linux-2.6.25.4-jcammara/net/core/dev.c      2008-05-16 07:02:48.000000000 -0500
> @@ -2455,2 +2455,78 @@
> 
> +/**
> + * proc_net_dev_write - handle writes to /proc/net/dev
> + * @file: not used
> + * @buf: buffer to write
> + * @length: length of buf, at most PAGE_SIZE
> + * @ppos: not used
> + *
> + * Description: this provides a mechanism to clear statistics on a
> + * per-interface basis
> + * "echo 'net clear-stats ifdev' >/proc/net/dev"
> + * with  "ifdev" replaced by the device name you wish to clear.
> + *
> + */
> +static ssize_t proc_net_dev_write(struct file *file, const char __user *buf,
> +                                  size_t length, loff_t *ppos)
> +{
> +        char *buffer, *p;
> +        char devname[IFNAMSIZ];
> +        struct net *net;
> +        struct net_device *dev;
> +        int err;
> +
> +        if (!buf || length > PAGE_SIZE)
> +                return -EINVAL;
> +
> +        buffer = (char *)__get_free_page(GFP_KERNEL);
> +        if (!buffer)
> +                return -ENOMEM;
> +
> +        err = -EFAULT;
> +        if (copy_from_user(buffer, buf, length))
> +                goto out;
> +
> +        err = -EINVAL;
> +        if (length < PAGE_SIZE)
> +                buffer[length] = '\0';
> +        else if (buffer[PAGE_SIZE-1])
> +                goto out;
> +
> +        err = -ENXIO;
> +        net = get_proc_net(file->f_dentry->d_inode);
> +        if (!net)
> +                goto out;
> +
> +        /*
> +         * Usage: echo "net clear-stats ifdev" >/proc/net/dev
> +         * with  "ifdev" replaced by the device name you wish to clear.
> +         */
> +        if (!strncmp("net clear-stats",buffer,15)) {
> +                p = buffer + 16;
> +                if(sscanf(p,"%16s",devname)>0) {
> +                        dev = dev_get_by_name(net,devname);
> +                        if (dev) {
> +                                if (dev->get_stats) {
> +                                        struct net_device_stats *stats =
> +                                               dev->get_stats(dev);
> +                                        memset(stats,0,
> +                                               sizeof(struct net_device_stats));
> +                                }
> +                                dev_put(dev);
> +                        }
> +                }
> +        }
> +
> +        /*
> +         * convert success returns so that we return the
> +         * number of bytes consumed.
> +         */
> +        if (!err)
> +                err = length;
> +
> + out:
> +        free_page((unsigned long)buffer);
> +        return err;
> +}
> +
>  static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
> @@ -2498,2 +2574,3 @@
>         .read    = seq_read,
> +        .write   = proc_net_dev_write,
>         .llseek  = seq_lseek,
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
       [not found] ` <482DB46A.8020103@cosmosbay.com>
@ 2008-05-16 20:03   ` David Miller
       [not found]   ` <482EF192.4070707@sngx.net>
  1 sibling, 0 replies; 38+ messages in thread
From: David Miller @ 2008-05-16 20:03 UTC (permalink / raw)
  To: dada1; +Cc: jimi, netdev, linux-kernel

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 16 May 2008 18:20:58 +0200

netdev CC:'d, people please...  Eric, you in particular know
better :-)))

> James Cammarata a écrit :
> > From: James Cammarata <jimi@sngx.net>
> >
> > Added the ability to write to /proc/net/dev in order to clear the 
> > interface counters for a given interface.  The ability to zero out 
> > counters (especially the error counters) is extremely useful when 
> > troubleshooting interface issues.  Now diffed against 2.6.25.4, and 
> > added some sanity checking.
> >
> Hello James
> 
> 1) Every call to get_proc_net() should be paired with a call to 
> put_net(), or else you  leak  struct net.
> 
> 2) I am not sure this is the right way to do this... Did you consider to 
> extend ethtool instead ?
> 
> > Syntax:
> >  echo 'net clear-stats ifdev' > /proc/net/dev
> > Where "ifdev" is the device name you wish to clear.
> >
> > This code is based mainly on the code found in drivers/scsi/scsi_proc.c
> >
> > Signed-off-by: James Cammarata <jimi@sngx.net>
> > ---
> >
> > --- linux-2.6.25.4/net/core/dev.c       2008-05-15 10:00:12.000000000 
> > -0500
> > +++ linux-2.6.25.4-jcammara/net/core/dev.c      2008-05-16 
> > 07:02:48.000000000 -0500
> > @@ -2455,2 +2455,78 @@
> >
> > +/**
> > + * proc_net_dev_write - handle writes to /proc/net/dev
> > + * @file: not used
> > + * @buf: buffer to write
> > + * @length: length of buf, at most PAGE_SIZE
> > + * @ppos: not used
> > + *
> > + * Description: this provides a mechanism to clear statistics on a
> > + * per-interface basis
> > + * "echo 'net clear-stats ifdev' >/proc/net/dev"
> > + * with  "ifdev" replaced by the device name you wish to clear.
> > + *
> > + */
> > +static ssize_t proc_net_dev_write(struct file *file, const char 
> > __user *buf,
> > +                                  size_t length, loff_t *ppos)
> > +{
> > +        char *buffer, *p;
> > +        char devname[IFNAMSIZ];
> > +        struct net *net;
> > +        struct net_device *dev;
> > +        int err;
> > +
> > +        if (!buf || length > PAGE_SIZE)
> > +                return -EINVAL;
> > +
> > +        buffer = (char *)__get_free_page(GFP_KERNEL);
> > +        if (!buffer)
> > +                return -ENOMEM;
> > +
> > +        err = -EFAULT;
> > +        if (copy_from_user(buffer, buf, length))
> > +                goto out;
> > +
> > +        err = -EINVAL;
> > +        if (length < PAGE_SIZE)
> > +                buffer[length] = '\0';
> > +        else if (buffer[PAGE_SIZE-1])
> > +                goto out;
> > +
> > +        err = -ENXIO;
> > +        net = get_proc_net(file->f_dentry->d_inode);
> > +        if (!net)
> > +                goto out;
> > +
> > +        /*
> > +         * Usage: echo "net clear-stats ifdev" >/proc/net/dev
> > +         * with  "ifdev" replaced by the device name you wish to clear.
> > +         */
> > +        if (!strncmp("net clear-stats",buffer,15)) {
> > +                p = buffer + 16;
> > +                if(sscanf(p,"%16s",devname)>0) {
> > +                        dev = dev_get_by_name(net,devname);
> > +                        if (dev) {
> > +                                if (dev->get_stats) {
> > +                                        struct net_device_stats *stats =
> > +                                               dev->get_stats(dev);
> > +                                        memset(stats,0,
> > +                                               sizeof(struct 
> > net_device_stats));
> > +                                }
> > +                                dev_put(dev);
> > +                        }
> > +                }
> > +        }
> > +
> > +        /*
> > +         * convert success returns so that we return the
> > +         * number of bytes consumed.
> > +         */
> > +        if (!err)
> > +                err = length;
> > +
> Here probably, you should add
> 
>     put_net(net);
> 
> > + out:
> > +        free_page((unsigned long)buffer);
> > +        return err;
> > +}
> > +
> > static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
> > @@ -2498,2 +2574,3 @@
> >        .read    = seq_read,
> > +        .write   = proc_net_dev_write,
> >        .llseek  = seq_lseek,
> >
> >
> 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-16 20:00 ` [PATCH updated] net: add ability to clear per-interface network statistics via procfs David Miller
@ 2008-05-16 20:09   ` Rick Jones
  2008-05-17 15:06     ` James Cammarata
  0 siblings, 1 reply; 38+ messages in thread
From: Rick Jones @ 2008-05-16 20:09 UTC (permalink / raw)
  To: David Miller; +Cc: jimi, netdev, linux-kernel

David Miller wrote:
> From: James Cammarata <jimi@sngx.net>
> Date: Fri, 16 May 2008 10:18:14 -0500
> 
> We saw your initial patch posting, you don't need to send this
> again.
> 
> netdev added to CC:
> 
> 
>>From: James Cammarata <jimi@sngx.net>
>>
>> Added the ability to write to /proc/net/dev in order to clear the
>> interface counters for a given interface.  The ability to zero out
>> counters (especially the error counters) is extremely useful when
>> troubleshooting interface issues.  Now diffed against 2.6.25.4, and
>> added some sanity checking.
>>
>>Syntax:
>>  echo 'net clear-stats ifdev' > /proc/net/dev
>>Where "ifdev" is the device name you wish to clear.

I don't wish to suggest I don't like the change - I've used similar 
features on "other oses" in the past, but when something like this has 
come-up in the past, haven't there been concerns about MIBs and SNMP and 
whatnot?

rick jones
ftp://ftp.cup.hp.com/dist/networking/tools - beforeafter - subtract one 
set of stats from another...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-16 20:09   ` Rick Jones
@ 2008-05-17 15:06     ` James Cammarata
  0 siblings, 0 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-17 15:06 UTC (permalink / raw)
  To: Rick Jones; +Cc: David Miller, netdev, linux-kernel

>> We saw your initial patch posting, you don't need to send this
>> again.
>>
>> netdev added to CC:

I had regenerated the diff against .25.4 and made a few changes so I resubmitted.
Sorry if I made it seem like I was spamming the patch.

> I don't wish to suggest I don't like the change - I've used similar 
> features on "other oses" in the past, but when something like this has 
> come-up in the past, haven't there been concerns about MIBs and SNMP and 
> whatnot?
> 
> rick jones
> ftp://ftp.cup.hp.com/dist/networking/tools - beforeafter - subtract one 
> set of stats from another...

Yes, this would definitely be a caveat, as it could cause problems if the software
takes a baseline once, and calculates differences from that ever after.  For others,
it would cause a hiccup where stats show as being negative for one period.

However, I think the benefit of being able to troubleshoot issues on an interface 
in a non-invasive manner outweigh the downside here.  Like you say, this ability 
is already found elsewhere, most namely in network devices like Ciscos, and the 
concern for clearing stats there does not exist.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
       [not found]   ` <482EF192.4070707@sngx.net>
@ 2008-05-17 21:41     ` Eric Dumazet
  2008-05-17 22:49       ` James Cammarata
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2008-05-17 21:41 UTC (permalink / raw)
  To: James Cammarata; +Cc: linux-kernel, Linux Netdev List

James Cammarata a écrit :
>> 1) Every call to get_proc_net() should be paired with a call to 
>> put_net(), or else you  leak  struct net.
>
> My mistake, I will add this and resubmit.
>
This version has a bug, please check again.
>> 2) I am not sure this is the right way to do this... Did you consider 
>> to extend ethtool instead ?
>
> Yes, I did consider that.  My reasoning for adding it to procfs is 
> that each
> driver defines their own ethtool interface, so it was much simpler to 
> add it here, with a lot less code changes and broader support.
>
Maybe much simpler but looks like a hack...
>
> ---
>
> --- linux-2.6.25.4/net/core/dev.c       2008-05-15 10:00:12.000000000 
> -0500
> +++ linux-2.6.25.4-jcammara/net/core/dev.c      2008-05-17 
> 09:50:22.000000000 -0500
> @@ -2455,2 +2455,80 @@
>
> +/**
> + * proc_net_dev_write - handle writes to /proc/net/dev
> + * @file: not used
> + * @buf: buffer to write
> + * @length: length of buf, at most PAGE_SIZE
> + * @ppos: not used
> + *
> + * Description: this provides a mechanism to clear statistics on a
> + * per-interface basis
> + * "echo 'net clear-stats ifdev' >/proc/net/dev"
> + * with  "ifdev" replaced by the device name you wish to clear.
> + *
> + */
> +static ssize_t proc_net_dev_write(struct file *file, const char 
> __user *buf,
> +                                  size_t length, loff_t *ppos)
> +{
> +        char *buffer, *p;
> +        char devname[IFNAMSIZ];
> +        struct net *net;
> +        struct net_device *dev;
> +        int err;
> +
> +        if (!buf || length > PAGE_SIZE)
> +                return -EINVAL;
> +
> +        buffer = (char *)__get_free_page(GFP_KERNEL);
> +        if (!buffer)
> +                return -ENOMEM;
> +
> +        err = -EFAULT;
> +        if (copy_from_user(buffer, buf, length))
net is not initialized so you will call put_net(garbage) and crash

My suggestion was better, since you dont have to test if (net) ,
just call put_net(net) at the right place (before out: label)
> +                goto out;
> +
> +        err = -EINVAL;
> +        if (length < PAGE_SIZE)
> +                buffer[length] = '\0';
> +        else if (buffer[PAGE_SIZE-1])
> +                goto out;
> +
> +        err = -ENXIO;
> +        net = get_proc_net(file->f_dentry->d_inode);
> +        if (!net)
> +                goto out;
> +
> +        /*
> +         * Usage: echo "net clear-stats ifdev" >/proc/net/dev
> +         * with  "ifdev" replaced by the device name you wish to clear.
> +         */
> +        if (!strncmp("net clear-stats",buffer,15)) {
> +                p = buffer + 16;
> +                if(sscanf(p,"%16s",devname)>0) {
> +                        dev = dev_get_by_name(net,devname);
> +                        if (dev) {
> +                                if (dev->get_stats) {
> +                                        struct net_device_stats *stats =
> +                                               dev->get_stats(dev);
> +                                        memset(stats,0,
> +                                               sizeof(struct 
> net_device_stats));
> +                                }
> +                                dev_put(dev);
> +                        }
> +                }
> +        }
> +
> +        /*
> +         * convert success returns so that we return the
> +         * number of bytes consumed.
> +         */
> +        if (!err)
> +                err = length;
> +
> + out:
> +        if(net)
> +                put_net(net);
> +        free_page((unsigned long)buffer);
> +        return err;
> +}
> +
> static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
> @@ -2498,2 +2576,3 @@
>        .read    = seq_read,
> +        .write   = proc_net_dev_write,
>        .llseek  = seq_lseek,
>
>





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-17 21:41     ` Eric Dumazet
@ 2008-05-17 22:49       ` James Cammarata
  2008-05-18  0:31         ` Ben Hutchings
  0 siblings, 1 reply; 38+ messages in thread
From: James Cammarata @ 2008-05-17 22:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Linux Netdev List

> This version has a bug, please check again.

Gah, you're right, fixed below.

---

--- linux-2.6.25.4/net/core/dev.c       2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/net/core/dev.c      2008-05-17 17:45:57.000000000 -0500
@@ -2455,2 +2455,80 @@

+/**
+ * proc_net_dev_write - handle writes to /proc/net/dev
+ * @file: not used
+ * @buf: buffer to write
+ * @length: length of buf, at most PAGE_SIZE
+ * @ppos: not used
+ *
+ * Description: this provides a mechanism to clear statistics on a
+ * per-interface basis
+ * "echo 'net clear-stats ifdev' >/proc/net/dev"
+ * with  "ifdev" replaced by the device name you wish to clear.
+ *
+ */
+static ssize_t proc_net_dev_write(struct file *file, const char __user *buf,
+                                  size_t length, loff_t *ppos)
+{
+        char *buffer, *p;
+        char devname[IFNAMSIZ];
+        struct net *net;
+        struct net_device *dev;
+        int err;
+
+        if (!buf || length > PAGE_SIZE)
+                return -EINVAL;
+
+        buffer = (char *)__get_free_page(GFP_KERNEL);
+        if (!buffer)
+                return -ENOMEM;
+
+        err = -EFAULT;
+        if (copy_from_user(buffer, buf, length))
+                goto out;
+
+        err = -EINVAL;
+        if (length < PAGE_SIZE)
+                buffer[length] = '\0';
+        else if (buffer[PAGE_SIZE-1])
+                goto out;
+
+        err = -ENXIO;
+        net = get_proc_net(file->f_dentry->d_inode);
+        if (!net)
+                goto out;
+
+        /*
+         * Usage: echo "net clear-stats ifdev" >/proc/net/dev
+         * with  "ifdev" replaced by the device name you wish to clear.
+         */
+        if (!strncmp("net clear-stats",buffer,15)) {
+                p = buffer + 16;
+                if(sscanf(p,"%16s",devname)>0) {
+                        dev = dev_get_by_name(net,devname);
+                        if (dev) {
+                                if (dev->get_stats) {
+                                        struct net_device_stats *stats =
+                                               dev->get_stats(dev);
+                                        memset(stats,0,
+                                               sizeof(struct net_device_stats));
+                                }
+                                dev_put(dev);
+                        }
+                }
+        }
+
+        /*
+         * convert success returns so that we return the
+         * number of bytes consumed.
+         */
+        if (!err)
+                err = length;
+
+        put_net(net);
+
+ out:
+        free_page((unsigned long)buffer);
+        return err;
+}
+
 static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
@@ -2498,2 +2576,3 @@
        .read    = seq_read,
+        .write   = proc_net_dev_write,
        .llseek  = seq_lseek,

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-17 22:49       ` James Cammarata
@ 2008-05-18  0:31         ` Ben Hutchings
  2008-05-18  1:43           ` Patrick McHardy
  2008-05-18  5:09           ` James Cammarata
  0 siblings, 2 replies; 38+ messages in thread
From: Ben Hutchings @ 2008-05-18  0:31 UTC (permalink / raw)
  To: James Cammarata; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List

James Cammarata wrote:
> >This version has a bug, please check again.
> 
> Gah, you're right, fixed below.
> 
> ---
> 
> --- linux-2.6.25.4/net/core/dev.c       2008-05-15 10:00:12.000000000 -0500
> +++ linux-2.6.25.4-jcammara/net/core/dev.c      2008-05-17 
> 17:45:57.000000000 -0500
> @@ -2455,2 +2455,80 @@
> 
> +/**
> + * proc_net_dev_write - handle writes to /proc/net/dev
> + * @file: not used
> + * @buf: buffer to write
> + * @length: length of buf, at most PAGE_SIZE
> + * @ppos: not used
> + *
> + * Description: this provides a mechanism to clear statistics on a
> + * per-interface basis
> + * "echo 'net clear-stats ifdev' >/proc/net/dev"
> + * with  "ifdev" replaced by the device name you wish to clear.
> + *
> + */
> +static ssize_t proc_net_dev_write(struct file *file, const char __user 
> *buf,

You need to stop your mail client word-wrapping patches.  It looks like
something has converted tabs to spaces, too.

[...]
> +        /*
> +         * Usage: echo "net clear-stats ifdev" >/proc/net/dev
> +         * with  "ifdev" replaced by the device name you wish to clear.
> +         */

This is redundant with the kernel-doc comment.

> +        if (!strncmp("net clear-stats",buffer,15)) {

This doesn't check for a space after, which you rely on later on.  (What if
length == 15?)  Also, explicitly writing the length of a literal string is
error-prone.  Seems like it would be better to do something like:

	static const char command[] = "net clear-stats ";
        ...
        if (!strncmp(command, buffer, sizeof(command) - 1)) {

> +                p = buffer + 16;
> +                if(sscanf(p,"%16s",devname)>0) {
> +                        dev = dev_get_by_name(net,devname);
> +                        if (dev) {
> +                                if (dev->get_stats) {
> +                                        struct net_device_stats *stats =
> +                                               dev->get_stats(dev);
> +                                        memset(stats,0,
> +                                               sizeof(struct 
> net_device_stats));
> +                                }
> +                                dev_put(dev);
> +                        }

Shouldn't this return an error if the device doesn't exist?

> +                }
> +        }
> +
> +        /*
> +         * convert success returns so that we return the
> +         * number of bytes consumed.
> +         */
> +        if (!err)
> +                err = length;

This won't work; the function updates err unconditionally further up!

> +
> +        put_net(net);
> +
> + out:
> +        free_page((unsigned long)buffer);
> +        return err;
> +}
> +
> static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
> @@ -2498,2 +2576,3 @@
>        .read    = seq_read,
> +        .write   = proc_net_dev_write,
>        .llseek  = seq_lseek,

The context for this chunk seems to be too short.

There are a few formatting oddities; checkpatch.pl will point them out.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-18  0:31         ` Ben Hutchings
@ 2008-05-18  1:43           ` Patrick McHardy
  2008-05-18  5:09           ` James Cammarata
  1 sibling, 0 replies; 38+ messages in thread
From: Patrick McHardy @ 2008-05-18  1:43 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: James Cammarata, Eric Dumazet, linux-kernel, Linux Netdev List

Ben Hutchings wrote:
>> +                        dev = dev_get_by_name(net,devname);
>> +                        if (dev) {
>> +                                if (dev->get_stats) {
>> +                                        struct net_device_stats *stats =
>> +                                               dev->get_stats(dev);
>> +                                        memset(stats,0,
>> +                                               sizeof(struct 
>> net_device_stats));

This doesn't work for devices that generate the stats (for
example by reading them from the hardware) in ->get_stats().


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-18  0:31         ` Ben Hutchings
  2008-05-18  1:43           ` Patrick McHardy
@ 2008-05-18  5:09           ` James Cammarata
  2008-05-18 11:27             ` Ben Hutchings
  2008-05-29  1:45             ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
  1 sibling, 2 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-18  5:09 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List

> You need to stop your mail client word-wrapping patches.  It looks like
> something has converted tabs to spaces, too.

Hmm, I don't know what might be causing that.  I'm using Thunderbird and disabled 
wrapping, and things look ok on lkml when I view the archives there. The tab/spaces 
issue was a result of me copying it out of a terminal window, and the patch now 
clears checkpatch.pl.

As for the rest, I've done as you suggested, my only comment about it is that 
like I said, my code is based heavily on that found in drivers/scsi/scsi_proc.c, so
it may be worth it to do the same changes there.

---

--- linux-2.6.25.4/net/core/dev.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/net/core/dev.c	2008-05-17 20:53:13.000000000 -0500
@@ -2453,6 +2453,75 @@ static struct netif_rx_stats *softnet_ge
 	return rc;
 }
 
+/**
+ * proc_net_dev_write - handle writes to /proc/net/dev
+ * @file: not used
+ * @buf: buffer to write
+ * @length: length of buf, at most PAGE_SIZE
+ * @ppos: not used
+ *
+ * Description: this provides a mechanism to clear statistics on a
+ * per-interface basis
+ * "echo 'net clear-stats ifdev' >/proc/net/dev"
+ * with  "ifdev" replaced by the device name you wish to clear.
+ *
+ */
+static ssize_t proc_net_dev_write(struct file *file, const char __user *buf,
+				  size_t length, loff_t *ppos)
+{
+	char *buffer, *p;
+	char devname[IFNAMSIZ];
+	static const char command[] = "net clear-stats ";
+	struct net *net;
+	struct net_device *dev;
+	int err;
+
+	if (!buf || length > PAGE_SIZE)
+		return -EINVAL;
+
+	buffer = (char *)__get_free_page(GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	err = -EFAULT;
+	if (copy_from_user(buffer, buf, length))
+		goto out;
+
+	err = -EINVAL;
+	if (length < PAGE_SIZE)
+		buffer[length] = '\0';
+	else if (buffer[PAGE_SIZE-1])
+		goto out;
+
+	err = -ENXIO;
+	net = get_proc_net(file->f_dentry->d_inode);
+	if (!net)
+		goto out;
+
+	if (!strncmp(command, buffer, sizeof(command) - 1)) {
+		p = buffer + strlen(command);
+		if (sscanf(p, "%16s", devname) > 0) {
+			dev = dev_get_by_name(net, devname);
+			if (dev) {
+				if (dev->get_stats) {
+					struct net_device_stats *stats =
+					       dev->get_stats(dev);
+					memset(stats, 0,
+					       sizeof(struct net_device_stats));
+				}
+				dev_put(dev);
+				err = length;
+			}
+		}
+	}
+
+	put_net(net);
+
+ out:
+	free_page((unsigned long)buffer);
+	return err;
+}
+
 static void *softnet_seq_start(struct seq_file *seq, loff_t *pos)
 {
 	return softnet_get_online(pos);
@@ -2496,6 +2565,7 @@ static const struct file_operations dev_
 	.owner	 = THIS_MODULE,
 	.open    = dev_seq_open,
 	.read    = seq_read,
+	.write   = proc_net_dev_write,
 	.llseek  = seq_lseek,
 	.release = seq_release_net,
 };

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs
  2008-05-18  5:09           ` James Cammarata
@ 2008-05-18 11:27             ` Ben Hutchings
  2008-05-29  1:45             ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
  1 sibling, 0 replies; 38+ messages in thread
From: Ben Hutchings @ 2008-05-18 11:27 UTC (permalink / raw)
  To: James Cammarata; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List

James Cammarata wrote:
> >You need to stop your mail client word-wrapping patches.  It looks like
> >something has converted tabs to spaces, too.
> 
> Hmm, I don't know what might be causing that.  I'm using Thunderbird and 
> disabled wrapping, and things look ok on lkml when I view the archives 
> there.

Sorry, you're right - your messages are marked as format=flowed, which
means they should be word-wrapped when displayed.  This is odd but
doesn't break the patches.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-18  5:09           ` James Cammarata
  2008-05-18 11:27             ` Ben Hutchings
@ 2008-05-29  1:45             ` James Cammarata
  2008-05-29  2:08               ` James Cammarata
  2008-05-29  5:11               ` Andrew Morton
  1 sibling, 2 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-29  1:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linux Netdev List

I had originally submitted this ability as a patch to procfs, and
general consensus was that it seemed "hackish" to do it that way.  It
was suggested that it be implemented as a feature of ethtool, so I 
thought I'd take on the challenge and add it that way.

I've laid the groundwork here, and added the ability to two of the 
drivers for which I have hardware to test on (e1000 and pcnet32).  I 
also added the code required to call this functionality to the ethtool 
code base, and have been using that to test my changes (using -z as the 
flag to ethtool for clearing stats for now).

If this is an acceptable start, I will gladly start working on adding 
this to as many drivers as possible (we do have some bnx2 hardware, 
though I'm not sure if it's available for testing).

Some other sys-admins have mentioned to me that ethtool doesn't work 
sometimes and they fall back to using mii-tool to configure interfaces,
so I'd still like to have the procfs change implemented to accommodate 
that hardware, but I agree, this is a better way forward.


Signed-off-by: James Cammarata <jimi@sngx.net>
---

--- linux-2.6.25.4/include/linux/ethtool.h	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/include/linux/ethtool.h	2008-05-28 17:45:56.000000000 -0500
@@ -325,6 +325,7 @@ int ethtool_op_set_flags(struct net_devi
  * get_strings: Return a set of strings that describe the requested objects 
  * phys_id: Identify the device
  * get_stats: Return statistics about the device
+ * clear_ethtool_stats: Clear device statistics
  * get_flags: get 32-bit flags bitmap
  * set_flags: set 32-bit flags bitmap
  * 
@@ -383,6 +384,7 @@ struct ethtool_ops {
 	void	(*get_strings)(struct net_device *, u32 stringset, u8 *);
 	int	(*phys_id)(struct net_device *, u32);
 	void	(*get_ethtool_stats)(struct net_device *, struct ethtool_stats *, u64 *);
+	int	(*clear_ethtool_stats)(struct net_device *);
 	int	(*begin)(struct net_device *);
 	void	(*complete)(struct net_device *);
 	u32     (*get_ufo)(struct net_device *);
@@ -441,6 +443,7 @@ struct ethtool_ops {
 #define ETHTOOL_SFLAGS		0x00000026 /* Set flags bitmap(ethtool_value) */
 #define ETHTOOL_GPFLAGS		0x00000027 /* Get driver-private flags bitmap */
 #define ETHTOOL_SPFLAGS		0x00000028 /* Set driver-private flags bitmap */
+#define ETHTOOL_CSTATS		0x00000029 /* Clear NIC statistics */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
--- linux-2.6.25.4/net/core/ethtool.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/net/core/ethtool.c	2008-05-28 17:48:51.000000000 -0500
@@ -714,6 +714,16 @@ static int ethtool_get_stats(struct net_
 	return ret;
 }
 
+static int ethtool_clear_stats(struct net_device *dev, void __user *useraddr)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+
+	if (!ops->clear_ethtool_stats)
+		return -EOPNOTSUPP;
+
+	return ops->clear_ethtool_stats(dev);
+}
+
 static int ethtool_get_perm_addr(struct net_device *dev, void __user *useraddr)
 {
 	struct ethtool_perm_addr epaddr;
@@ -964,6 +974,9 @@ int dev_ethtool(struct net *net, struct 
 		rc = ethtool_set_value(dev, useraddr,
 				       dev->ethtool_ops->set_priv_flags);
 		break;
+	case ETHTOOL_CSTATS:
+		rc = ethtool_clear_stats(dev, useraddr);
+		break;
 	default:
 		rc = -EOPNOTSUPP;
 	}
--- linux-2.6.25.4/drivers/net/e1000/e1000_main.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_main.c	2008-05-28 17:51:07.000000000 -0500
@@ -141,6 +141,7 @@ static void e1000_free_tx_resources(stru
 static void e1000_free_rx_resources(struct e1000_adapter *adapter,
                              struct e1000_rx_ring *rx_ring);
 void e1000_update_stats(struct e1000_adapter *adapter);
+void e1000_clear_stats(struct e1000_adapter *adapter);
 
 static int e1000_init_module(void);
 static void e1000_exit_module(void);
@@ -3819,6 +3820,31 @@ e1000_update_stats(struct e1000_adapter 
 	spin_unlock_irqrestore(&adapter->stats_lock, flags);
 }
 
+void
+e1000_clear_stats(struct e1000_adapter *adapter)
+{
+	struct pci_dev *pdev = adapter->pdev;
+	unsigned long flags;
+
+	#define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
+
+	/*
+	 * Prevent stats update while adapter is being reset, or if the pci
+	 * connection is down.
+	 */
+	if (adapter->link_speed == 0)
+		return;
+	if (pci_channel_offline(pdev))
+		return;
+
+	spin_lock_irqsave(&adapter->stats_lock, flags);
+
+	memset(&adapter->stats, 0, sizeof(struct e1000_hw_stats));
+	memset(&adapter->net_stats, 0, sizeof(struct net_device_stats));
+
+	spin_unlock_irqrestore(&adapter->stats_lock, flags);
+}
+
 /**
  * e1000_intr_msi - Interrupt Handler
  * @irq: interrupt number
--- linux-2.6.25.4/drivers/net/e1000/e1000_ethtool.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_ethtool.c	2008-05-28 17:53:28.000000000 -0500
@@ -42,7 +42,7 @@ extern int e1000_setup_all_tx_resources(
 extern void e1000_free_all_rx_resources(struct e1000_adapter *adapter);
 extern void e1000_free_all_tx_resources(struct e1000_adapter *adapter);
 extern void e1000_update_stats(struct e1000_adapter *adapter);
-
+extern void e1000_clear_stats(struct e1000_adapter *adapter);
 
 struct e1000_stats {
 	char stat_string[ETH_GSTRING_LEN];
@@ -1940,6 +1940,14 @@ e1000_get_ethtool_stats(struct net_devic
 /*	BUG_ON(i != E1000_STATS_LEN); */
 }
 
+static int
+e1000_clear_ethtool_stats(struct net_device *netdev)
+{
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	e1000_clear_stats(adapter);
+	return 0;
+}
+
 static void
 e1000_get_strings(struct net_device *netdev, uint32_t stringset, uint8_t *data)
 {
@@ -1991,6 +1999,7 @@ static const struct ethtool_ops e1000_et
 	.get_strings            = e1000_get_strings,
 	.phys_id                = e1000_phys_id,
 	.get_ethtool_stats      = e1000_get_ethtool_stats,
+	.clear_ethtool_stats	= e1000_clear_ethtool_stats,
 	.get_sset_count		= e1000_get_sset_count,
 };
 
--- linux-2.6.25.4/drivers/net/pcnet32.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/drivers/net/pcnet32.c	2008-05-28 17:59:42.000000000 -0500
@@ -312,6 +312,7 @@ static void pcnet32_tx_timeout(struct ne
 static irqreturn_t pcnet32_interrupt(int, void *);
 static int pcnet32_close(struct net_device *);
 static struct net_device_stats *pcnet32_get_stats(struct net_device *);
+static int pcnet32_clear_stats(struct net_device *);
 static void pcnet32_load_multicast(struct net_device *dev);
 static void pcnet32_set_multicast_list(struct net_device *);
 static int pcnet32_ioctl(struct net_device *, struct ifreq *, int);
@@ -1532,6 +1533,7 @@ static const struct ethtool_ops pcnet32_
 	.get_regs_len		= pcnet32_get_regs_len,
 	.get_regs		= pcnet32_get_regs,
 	.get_sset_count		= pcnet32_get_sset_count,
+	.clear_ethtool_stats	= pcnet32_clear_stats,
 };
 
 /* only probes for non-PCI devices, the rest are handled by
@@ -2719,6 +2721,17 @@ static struct net_device_stats *pcnet32_
 	return &dev->stats;
 }
 
+static int pcnet32_clear_stats(struct net_device *dev)
+{
+	struct pcnet32_private *lp = netdev_priv(dev);
+	unsigned long flags;
+
+	spin_lock_irqsave(&lp->lock, flags);
+	memset(&dev->stats, 0, sizeof(struct net_device_stats));
+	spin_unlock_irqrestore(&lp->lock, flags);
+	return 0;
+}
+
 /* taken from the sunlance driver, which it took from the depca driver */
 static void pcnet32_load_multicast(struct net_device *dev)
 {

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29  1:45             ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
@ 2008-05-29  2:08               ` James Cammarata
  2008-05-29  5:11               ` Andrew Morton
  1 sibling, 0 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-29  2:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linux Netdev List

Missed an errant define in some copied code (from the get_stats function in e1000_main.c).

Signed-off-by: James Cammarata <jimi@sngx.net>
---

--- linux-2.6.25.4/drivers/net/e1000/e1000_main.c	2008-05-15 10:00:12.000000000 -0500
+++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_main.c	2008-05-28 21:06:19.000000000 -0500
@@ -141,6 +141,7 @@ static void e1000_free_tx_resources(stru
 static void e1000_free_rx_resources(struct e1000_adapter *adapter,
                              struct e1000_rx_ring *rx_ring);
 void e1000_update_stats(struct e1000_adapter *adapter);
+void e1000_clear_stats(struct e1000_adapter *adapter);
 
 static int e1000_init_module(void);
 static void e1000_exit_module(void);
@@ -3819,6 +3820,29 @@ e1000_update_stats(struct e1000_adapter 
 	spin_unlock_irqrestore(&adapter->stats_lock, flags);
 }
 
+void
+e1000_clear_stats(struct e1000_adapter *adapter)
+{
+	struct pci_dev *pdev = adapter->pdev;
+	unsigned long flags;
+
+	/*
+	 * Prevent stats update while adapter is being reset, or if the pci
+	 * connection is down.
+	 */
+	if (adapter->link_speed == 0)
+		return;
+	if (pci_channel_offline(pdev))
+		return;
+
+	spin_lock_irqsave(&adapter->stats_lock, flags);
+
+	memset(&adapter->stats, 0, sizeof(struct e1000_hw_stats));
+	memset(&adapter->net_stats, 0, sizeof(struct net_device_stats));
+
+	spin_unlock_irqrestore(&adapter->stats_lock, flags);
+}
+
 /**
  * e1000_intr_msi - Interrupt Handler
  * @irq: interrupt number

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29  1:45             ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
  2008-05-29  2:08               ` James Cammarata
@ 2008-05-29  5:11               ` Andrew Morton
  2008-05-29 12:34                 ` James Cammarata
  1 sibling, 1 reply; 38+ messages in thread
From: Andrew Morton @ 2008-05-29  5:11 UTC (permalink / raw)
  To: James Cammarata; +Cc: linux-kernel, Linux Netdev List

On Wed, 28 May 2008 20:45:18 -0500 James Cammarata <jimi@sngx.net> wrote:

> I had originally submitted this ability as a patch to procfs, and
> general consensus was that it seemed "hackish" to do it that way.  It
> was suggested that it be implemented as a feature of ethtool, so I 
> thought I'd take on the challenge and add it that way.
> 
> I've laid the groundwork here, and added the ability to two of the 
> drivers for which I have hardware to test on (e1000 and pcnet32).  I 
> also added the code required to call this functionality to the ethtool 
> code base, and have been using that to test my changes (using -z as the 
> flag to ethtool for clearing stats for now).
> 
> If this is an acceptable start, I will gladly start working on adding 
> this to as many drivers as possible (we do have some bnx2 hardware, 
> though I'm not sure if it's available for testing).
> 
> Some other sys-admins have mentioned to me that ethtool doesn't work 
> sometimes and they fall back to using mii-tool to configure interfaces,
> so I'd still like to have the procfs change implemented to accommodate 
> that hardware, but I agree, this is a better way forward.

You didn't provide a reason from adding this feature to the kernel.

Many of the kernel's accounting accumulators cannot be reset.  We
handle that in userspace tools by using subtraction.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29  5:11               ` Andrew Morton
@ 2008-05-29 12:34                 ` James Cammarata
  2008-05-29 14:45                   ` Alan Cox
  2008-05-29 14:48                   ` Chris Friesen
  0 siblings, 2 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-29 12:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Linux Netdev List

> You didn't provide a reason from adding this feature to the kernel.

Sorry, my reasoning was in my original patch only.  The ability to reset network 
counters is, in my experience, one of the first things you do when trying to 
troubleshoot networking issues - especially when you have incrementing errors.


> Many of the kernel's accounting accumulators cannot be reset.  We
> handle that in userspace tools by using subtraction.

I don't think that should preclude the ability to reset these, unless it is
shown that it would break something very badly.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 12:34                 ` James Cammarata
@ 2008-05-29 14:45                   ` Alan Cox
  2008-05-29 17:15                     ` James Cammarata
  2008-05-30 19:12                     ` Bill Fink
  2008-05-29 14:48                   ` Chris Friesen
  1 sibling, 2 replies; 38+ messages in thread
From: Alan Cox @ 2008-05-29 14:45 UTC (permalink / raw)
  To: James Cammarata; +Cc: Andrew Morton, linux-kernel, Linux Netdev List

> > Many of the kernel's accounting accumulators cannot be reset.  We
> > handle that in userspace tools by using subtraction.
> 
> I don't think that should preclude the ability to reset these, unless it is
> shown that it would break something very badly.

For one quite a few of the network cards keep the stats in hardware and
don't neccessarily have a way to reset them. In other cases there is a
mix of OS accumulated stats bulk updated by overflow events on the device
itself.

Its a lot of complexity, and changes all over the places for the sake of
a trivial userspace change. I would suggest you instead write a quick bit
of perl or python that fetches the stats and then updates every second
with the changes.

Alan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 12:34                 ` James Cammarata
  2008-05-29 14:45                   ` Alan Cox
@ 2008-05-29 14:48                   ` Chris Friesen
  1 sibling, 0 replies; 38+ messages in thread
From: Chris Friesen @ 2008-05-29 14:48 UTC (permalink / raw)
  To: James Cammarata; +Cc: Andrew Morton, linux-kernel, Linux Netdev List

James Cammarata wrote:
>> You didn't provide a reason from adding this feature to the kernel.

> Sorry, my reasoning was in my original patch only.  The ability to reset 
> network counters is, in my experience, one of the first things you do 
> when trying to troubleshoot networking issues - especially when you have 
> incrementing errors.

The ability to reset counters does make it easier to detect changes 
manually, as any non-zero value is something of interest.  However, if 
you're gathering data via a script there is no functional gain.

>> Many of the kernel's accounting accumulators cannot be reset.  We
>> handle that in userspace tools by using subtraction.

> I don't think that should preclude the ability to reset these, unless it is
> shown that it would break something very badly.

You're trying to add new functionality to the kernel, increasing its 
size and adding more work for maintainers.  Also, you're implicitly 
asking that all of the network drivers be updated to add support for 
this feature.  The burden of proof is on you to show why the feature is 
useful enough that the others involved should take on this additional work.

Chris

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 14:45                   ` Alan Cox
@ 2008-05-29 17:15                     ` James Cammarata
  2008-05-29 20:50                       ` David Miller
  2008-05-30 19:12                     ` Bill Fink
  1 sibling, 1 reply; 38+ messages in thread
From: James Cammarata @ 2008-05-29 17:15 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andrew Morton, linux-kernel, Linux Netdev List

> For one quite a few of the network cards keep the stats in hardware and
> don't neccessarily have a way to reset them. In other cases there is a
> mix of OS accumulated stats bulk updated by overflow events on the device
> itself.

Yes, and there's no good way to get around that, so the best thing for that 
hardware would probably to not add this functionality, in which case ethtool
would simply report that it's not available.  
 
> Its a lot of complexity, and changes all over the places for the sake of
> a trivial userspace change. I would suggest you instead write a quick bit
> of perl or python that fetches the stats and then updates every second
> with the changes.

I would disagree with you on the complexity issue.  In general, it is one
new function (trivial in length) and an addition to the ethtool ops.  SNMP
and sar cover the reporting needs, but there have been multiple times when
troubleshooting under the gun that this would have been extremely useful
to help diagnose the issue (in one instance, we were getting floods of 
jumbo frames and other errors, and we did not know how quickly they were 
accumulating, because the switch was not seeing the same problem).  My day
job is in the financial industry, so we are often under very short SLA's 
to diagnose issues, and having this built into the kernel/existing tools
would be a boon.  I'd be more than happy to change this to a kernel config
option, experimental and default off for those who would not want it.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 17:15                     ` James Cammarata
@ 2008-05-29 20:50                       ` David Miller
  2008-05-29 21:18                         ` James Cammarata
  0 siblings, 1 reply; 38+ messages in thread
From: David Miller @ 2008-05-29 20:50 UTC (permalink / raw)
  To: jimi; +Cc: alan, akpm, linux-kernel, netdev

From: James Cammarata <jimi@sngx.net>
Date: Thu, 29 May 2008 12:15:16 -0500

>  I'd be more than happy to change this to a kernel config
> option, experimental and default off for those who would not want it.

We never add pieces of code which are marked experimental and
intended to be marked that way forever, that's nonsense.

What you want is your little special kernel hack in the upstream
tree, and that's now how we handle such things, sorry.

I also agree with Alan that the kernel is the wrong place to
handle this problem.  Do it in userspace.

We have consistent precedence for this elsewhere in the kernel tree,
and there is good reason for that.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 20:50                       ` David Miller
@ 2008-05-29 21:18                         ` James Cammarata
  0 siblings, 0 replies; 38+ messages in thread
From: James Cammarata @ 2008-05-29 21:18 UTC (permalink / raw)
  To: David Miller; +Cc: alan, akpm, linux-kernel, netdev

>>  I'd be more than happy to change this to a kernel config
>> option, experimental and default off for those who would not want it.
> 
> We never add pieces of code which are marked experimental and
> intended to be marked that way forever, that's nonsense.

Completely not my intention, I did not ever intend to have something
marked experimental forever, and I'm not sure where you're drawing
that inference from.
 
> What you want is your little special kernel hack in the upstream
> tree, and that's now how we handle such things, sorry.

No, I saw what I thought was a lack of a feature that is commonplace 
on many other platforms, and thought I'd give it a shot to see if 
I could add it myself.  The NAKs are apparently overwhelming on this
issue, so I'll stop beating a dead horse and move on.  Thanks for the 
input.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-29 14:45                   ` Alan Cox
  2008-05-29 17:15                     ` James Cammarata
@ 2008-05-30 19:12                     ` Bill Fink
  2008-05-30 22:14                       ` Rick Jones
  2008-05-31 12:11                       ` Alan Cox
  1 sibling, 2 replies; 38+ messages in thread
From: Bill Fink @ 2008-05-30 19:12 UTC (permalink / raw)
  To: Alan Cox; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List

On Thu, 29 May 2008, Alan Cox wrote:

> > > Many of the kernel's accounting accumulators cannot be reset.  We
> > > handle that in userspace tools by using subtraction.
> > 
> > I don't think that should preclude the ability to reset these, unless it is
> > shown that it would break something very badly.
> 
> For one quite a few of the network cards keep the stats in hardware and
> don't neccessarily have a way to reset them. In other cases there is a
> mix of OS accumulated stats bulk updated by overflow events on the device
> itself.
> 
> Its a lot of complexity, and changes all over the places for the sake of
> a trivial userspace change. I would suggest you instead write a quick bit
> of perl or python that fetches the stats and then updates every second
> with the changes.

When diagnosing network problems, the ability to zero counters is
a major aid in diagnosis.  Writing scripts is not a general solution
since often several systems are involved and it's not simple to do
this via a script.  Saving stats output and running beforeafter on
a number of systems is a royal pain when troubleshooting.

I like the idea someone else had. where the clear stats actually did
a checkpoint of the stats, and then future get stats would return the
diff between the checkpointed and current values (perhaps there could
also be a mechanism to still get the absolute values if desired).

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-30 19:12                     ` Bill Fink
@ 2008-05-30 22:14                       ` Rick Jones
  2008-05-31  1:09                         ` Bill Fink
  2008-05-31 12:11                       ` Alan Cox
  1 sibling, 1 reply; 38+ messages in thread
From: Rick Jones @ 2008-05-30 22:14 UTC (permalink / raw)
  To: Bill Fink
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List


> Saving stats output and running beforeafter on a number of systems is
> a royal pain when troubleshooting.

Well, with that ringing endorsement :) I would like to state that 
Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now 
deal with /proc/interrupts output better than before.

ftp://ftp.cup.hp.com/dist/networking/tools/

rick jones

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-30 22:14                       ` Rick Jones
@ 2008-05-31  1:09                         ` Bill Fink
  2008-05-31  2:41                           ` Stephen Hemminger
  0 siblings, 1 reply; 38+ messages in thread
From: Bill Fink @ 2008-05-31  1:09 UTC (permalink / raw)
  To: Rick Jones
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

On Fri, 30 May 2008, Rick Jones wrote:

> > Saving stats output and running beforeafter on a number of systems is
> > a royal pain when troubleshooting.
> 
> Well, with that ringing endorsement :) I would like to state that 
> Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now 
> deal with /proc/interrupts output better than before.
> 
> ftp://ftp.cup.hp.com/dist/networking/tools/

Nothing against beforefter.  It's a very useful tool, especially
given Linux's lack of a method to clear interface counters.  But
it still is very cumbersome to use in the scenario I mentioned,
having to save the stats before and after a test on each of several
systems, and then having to run beforeafter on each of those systems.
Then you need to do this all again for every test run.  It's just
not very efficient.

I don't know why the major objection to adding a small amount of
code to the kernel that would make life a lot easier for hundreds
(or more) of Linux network adminstrators.  And the snapshot/diff
idea would not be complex to maintain and would also address the
issue of some hardware not having a clear hw stats capability.

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-31  1:09                         ` Bill Fink
@ 2008-05-31  2:41                           ` Stephen Hemminger
  2008-05-31  4:47                             ` Bill Fink
  0 siblings, 1 reply; 38+ messages in thread
From: Stephen Hemminger @ 2008-05-31  2:41 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

On Fri, 30 May 2008 21:09:31 -0400
Bill Fink <billfink@mindspring.com> wrote:

> On Fri, 30 May 2008, Rick Jones wrote:
> 
> > > Saving stats output and running beforeafter on a number of systems is
> > > a royal pain when troubleshooting.
> > 
> > Well, with that ringing endorsement :) I would like to state that 
> > Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now 
> > deal with /proc/interrupts output better than before.
> > 
> > ftp://ftp.cup.hp.com/dist/networking/tools/
> 
> Nothing against beforefter.  It's a very useful tool, especially
> given Linux's lack of a method to clear interface counters.  But
> it still is very cumbersome to use in the scenario I mentioned,
> having to save the stats before and after a test on each of several
> systems, and then having to run beforeafter on each of those systems.
> Then you need to do this all again for every test run.  It's just
> not very efficient.
> 
> I don't know why the major objection to adding a small amount of
> code to the kernel that would make life a lot easier for hundreds
> (or more) of Linux network adminstrators.  And the snapshot/diff
> idea would not be complex to maintain and would also address the
> issue of some hardware not having a clear hw stats capability.
> 
> 						-Bill

ifstat from iproute2 tools also keeps state to compute number
of packets since last update as well. Come on, it isn't that
hard.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-31  2:41                           ` Stephen Hemminger
@ 2008-05-31  4:47                             ` Bill Fink
  0 siblings, 0 replies; 38+ messages in thread
From: Bill Fink @ 2008-05-31  4:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, linux-kernel

On Fri, 30 May 2008, Stephen Hemminger wrote:

> On Fri, 30 May 2008 21:09:31 -0400
> Bill Fink <billfink@mindspring.com> wrote:
> 
> > On Fri, 30 May 2008, Rick Jones wrote:
> > 
> > > > Saving stats output and running beforeafter on a number of systems is
> > > > a royal pain when troubleshooting.
> > > 
> > > Well, with that ringing endorsement :) I would like to state that 
> > > Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now 
> > > deal with /proc/interrupts output better than before.
> > > 
> > > ftp://ftp.cup.hp.com/dist/networking/tools/
> > 
> > Nothing against beforefter.  It's a very useful tool, especially
> > given Linux's lack of a method to clear interface counters.  But
> > it still is very cumbersome to use in the scenario I mentioned,
> > having to save the stats before and after a test on each of several
> > systems, and then having to run beforeafter on each of those systems.
> > Then you need to do this all again for every test run.  It's just
> > not very efficient.
> > 
> > I don't know why the major objection to adding a small amount of
> > code to the kernel that would make life a lot easier for hundreds
> > (or more) of Linux network adminstrators.  And the snapshot/diff
> > idea would not be complex to maintain and would also address the
> > issue of some hardware not having a clear hw stats capability.
> > 
> > 						-Bill
> 
> ifstat from iproute2 tools also keeps state to compute number
> of packets since last update as well. Come on, it isn't that
> hard.

Thanks for the pointer.  I wasn't aware of ifstat (it's not installed
by default on any of my systems).  I will check out ifstat and see what
it can do.

						-Thanks

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-30 19:12                     ` Bill Fink
  2008-05-30 22:14                       ` Rick Jones
@ 2008-05-31 12:11                       ` Alan Cox
  2008-05-31 23:57                         ` Bill Fink
  1 sibling, 1 reply; 38+ messages in thread
From: Alan Cox @ 2008-05-31 12:11 UTC (permalink / raw)
  To: Bill Fink; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List

> When diagnosing network problems, the ability to zero counters is
> a major aid in diagnosis.  Writing scripts is not a general solution
> since often several systems are involved and it's not simple to do
> this via a script.  Saving stats output and running beforeafter on
> a number of systems is a royal pain when troubleshooting.

Zeroing them is not a general solution either - you break all the other
tools monitoring the same stats in parallel - like network usage monitors.

This is a user space tools problem, scripts or otherwise.

Alan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-31 12:11                       ` Alan Cox
@ 2008-05-31 23:57                         ` Bill Fink
  2008-06-01  1:46                           ` Ben Hutchings
  2008-06-02  4:50                           ` Glen Turner
  0 siblings, 2 replies; 38+ messages in thread
From: Bill Fink @ 2008-05-31 23:57 UTC (permalink / raw)
  To: Alan Cox; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List

On Sat, 31 May 2008, Alan Cox wrote:

> > When diagnosing network problems, the ability to zero counters is
> > a major aid in diagnosis.  Writing scripts is not a general solution
> > since often several systems are involved and it's not simple to do
> > this via a script.  Saving stats output and running beforeafter on
> > a number of systems is a royal pain when troubleshooting.
> 
> Zeroing them is not a general solution either - you break all the other
> tools monitoring the same stats in parallel - like network usage monitors.
> 
> This is a user space tools problem, scripts or otherwise.

Two points.  One, if I as a Linux network administrator, make a
deliberate decision to zero counters, I will be making that decision
taking into account any other effects it might have.

Second, I already expressed support for the preferred option of
the zeroing of counters actually just doing a snapshot of the stats,
and subsequent requests for the stats doing a diff from the snapshot.
There could also be a mechanism for getting the absolute values of
the stats that for example SNMP monitoring would utilize.

Perhaps a user space tool such as ifstat can be part of the solution
(I haven't had a chance to check it out yet).  But it would be
difficult I believe to make a general user space tool for dealing
with the device specific stats reported by "ethtool -S" since they
are quite specific to an individual device driver.

Yes, every individual Linux network administrator can re-create the
wheel by devising their own scripts, but it makes much more sense
to me to implement a simple general kernel mechanism once that could
be used generically, than to have hundreds (or thousands) of Linux
network administrators each having to do it themselves (perhaps
multiple times if they have a variety of types of systems and types
of NICs).

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-31 23:57                         ` Bill Fink
@ 2008-06-01  1:46                           ` Ben Hutchings
  2008-06-01 20:46                             ` Bill Fink
  2008-06-02  4:50                           ` Glen Turner
  1 sibling, 1 reply; 38+ messages in thread
From: Ben Hutchings @ 2008-06-01  1:46 UTC (permalink / raw)
  To: Bill Fink
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

Bill Fink wrote:
<snip>
> Yes, every individual Linux network administrator can re-create the
> wheel by devising their own scripts, but it makes much more sense
> to me to implement a simple general kernel mechanism once that could
> be used generically, than to have hundreds (or thousands) of Linux
> network administrators each having to do it themselves (perhaps
> multiple times if they have a variety of types of systems and types
> of NICs).

The ethtool interface is pretty generic, even if the names aren't.
Here's some Python code I just knocked up which demonstrates how
to get a set of named stats.  It shouldn't be terribly hard to
extend this to saving and subtracting stat sets.

Ben.

import array, fcntl, os, socket, struct

SIOCETHTOOL = 0x8946

ETH_GSTRING_LEN = 32
ETH_SS_STATS = 1

ETHTOOL_GDRVINFO = 0x00000003
ETHTOOL_GSTRINGS = 0x0000001b
ETHTOOL_GSTATS = 0x0000001d

def ethtool(sock, name, buf):
    ifreq = struct.pack('16sP', name, buf.buffer_info()[0])
    fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)

def ethtool_gdrvinfo(sock, name):
    cmd = struct.pack('=I', ETHTOOL_GDRVINFO)
    buf = array.array('c', cmd)
    format = '=32s32s32s32s32x12xIIIII'
    buf.extend('\0' * struct.calcsize(format))
    ethtool(sock, name, buf)
    return dict(zip(['driver', 'version', 'fw_version', 'bus_info',
                     'n_priv_flags', 'n_stats', 'testinfo_len',
                     'eedump_len', 'regdump_len'],
                    struct.unpack(format, buf[len(cmd):])))

def ethtool_gstrings(sock, name, string_set, count):
    cmd = struct.pack('=II', ETHTOOL_GSTRINGS, string_set)
    buf = array.array('c', cmd)
    buf.extend('\0' * (4 + count * ETH_GSTRING_LEN))
    ethtool(sock, name, buf)
    def nth_string(buf, n):
        data = buf[12 + n * ETH_GSTRING_LEN : 12 + (n + 1) * ETH_GSTRING_LEN]
        return data.tostring().replace('\0', '')
    return [nth_string(buf, n) for n in range(count)]

def ethtool_gstats(sock, name, count):
    cmd = struct.pack('=I', ETHTOOL_GSTATS)
    buf = array.array('c', cmd)
    buf.extend('\0' * (4 + count * 8))
    ethtool(sock, name, buf)
    def nth_stat(buf, n):
        data = buf[8 + n * 8 : 8 + (n + 1) * 8]
        return struct.unpack('=Q', data)[0]
    return [nth_stat(buf, n) for n in range(count)]

if __name__ == '__main__':
    import sys
    name = sys.argv[1]
    sock = socket.socket()
    drvinfo = ethtool_gdrvinfo(sock, name)
    count = drvinfo['n_stats']
    print dict(zip(ethtool_gstrings(sock, name, ETH_SS_STATS, count),
                   ethtool_gstats(sock, name, count)))

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-01  1:46                           ` Ben Hutchings
@ 2008-06-01 20:46                             ` Bill Fink
  2008-06-01 22:29                               ` Ben Hutchings
  2008-06-02  5:39                               ` David Miller
  0 siblings, 2 replies; 38+ messages in thread
From: Bill Fink @ 2008-06-01 20:46 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

On Sun, 1 Jun 2008, Ben Hutchings wrote:

> Bill Fink wrote:
> <snip>
> > Yes, every individual Linux network administrator can re-create the
> > wheel by devising their own scripts, but it makes much more sense
> > to me to implement a simple general kernel mechanism once that could
> > be used generically, than to have hundreds (or thousands) of Linux
> > network administrators each having to do it themselves (perhaps
> > multiple times if they have a variety of types of systems and types
> > of NICs).
> 
> The ethtool interface is pretty generic, even if the names aren't.
> Here's some Python code I just knocked up which demonstrates how
> to get a set of named stats.  It shouldn't be terribly hard to
> extend this to saving and subtracting stat sets.

I'm not sure what that proves.  Your python code just basically gives
the same info as running the "ethtool -S" command.  But the question
is how does one devise a generic script or tool that doesn't require
any special knowledge of the specific NIC being used.  For example,
here's the "ethtool -S" info for my myri10ge NIC:

[root@chance8 ~]# ethtool -S eth2
NIC statistics:
     rx_packets: 53243864310
     tx_packets: 112826823797
     rx_bytes: 301727733072710
     tx_bytes: 716648208451198
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 0
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_boundary: 4096
     WC: 1
     irq: 8413
     MSI: 1
     read_dma_bw_MBs: 1398
     write_dma_bw_MBs: 1613
     read_write_dma_bw_MBs: 2711
     serial_number: 287046
     tx_pkt_start: 1157674101
     tx_pkt_done: 1157674101
     tx_req: 188226127
     tx_done: 188226127
     rx_small_cnt: 3009560676
     rx_big_cnt: 1726230729
     wake_queue: 57969440
     stop_queue: 57969440
     watchdog_resets: 0
     tx_linearized: 0
     link_changes: 8
     link_up: 1
     dropped_link_overflow: 0
     dropped_link_error_or_filtered: 26584
     dropped_multicast_filtered: 2190912
     dropped_runt: 0
     dropped_overrun: 0
     dropped_no_small_buffer: 0
     dropped_no_big_buffer: 0

How does one know which of these reported values are counter stats
that one wishes to zero/snapshot, and which are not?

Another issue that occurred to me is if multiple people are working
on troubleshooting a network problem, how do we insure that they all
get a consistent view of the stats?  If this is done via a kernel
mechanism then there isn't an issue.  But if it's done via user space,
then you have to make sure that everyone zeros/snapshots the stats
at the same time.

Ideally, one should be able to do something like "ethtool -z ethX"
to zero/snapshot the driver stats, and then "ethtool -S ethX" to get
the stats since the last snapshot.  You should be able to use the
same tool ("ethtool") to do all of this, and not some other special
tool or specially devised homegrown script.  Why make users lives
any more difficult than need be?

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-01 20:46                             ` Bill Fink
@ 2008-06-01 22:29                               ` Ben Hutchings
  2008-06-02  3:55                                 ` Bill Fink
  2008-06-02  5:39                               ` David Miller
  1 sibling, 1 reply; 38+ messages in thread
From: Ben Hutchings @ 2008-06-01 22:29 UTC (permalink / raw)
  To: Bill Fink
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

Bill Fink wrote:
> On Sun, 1 Jun 2008, Ben Hutchings wrote:
> 
> > Bill Fink wrote:
> > <snip>
> > > Yes, every individual Linux network administrator can re-create the
> > > wheel by devising their own scripts, but it makes much more sense
> > > to me to implement a simple general kernel mechanism once that could
> > > be used generically, than to have hundreds (or thousands) of Linux
> > > network administrators each having to do it themselves (perhaps
> > > multiple times if they have a variety of types of systems and types
> > > of NICs).
> > 
> > The ethtool interface is pretty generic, even if the names aren't.
> > Here's some Python code I just knocked up which demonstrates how
> > to get a set of named stats.  It shouldn't be terribly hard to
> > extend this to saving and subtracting stat sets.
> 
> I'm not sure what that proves.  Your python code just basically gives
> the same info as running the "ethtool -S" command.

Yes, but in a form you can more easily manipulate.

> But the question
> is how does one devise a generic script or tool that doesn't require
> any special knowledge of the specific NIC being used.  For example,
> here's the "ethtool -S" info for my myri10ge NIC:
> 
> [root@chance8 ~]# ethtool -S eth2
> NIC statistics:
[...]
>      WC: 1
>      irq: 8413
>      MSI: 1
>      read_dma_bw_MBs: 1398
>      write_dma_bw_MBs: 1613
>      read_write_dma_bw_MBs: 2711
>      serial_number: 287046
[...]
>      tx_linearized: 0
>      link_changes: 8
>      link_up: 1
[...]
> 
> How does one know which of these reported values are counter stats
> that one wishes to zero/snapshot, and which are not?

Ah, I see, I didn't realise some drivers were abusing ethtool stats in
this way.  But for the most part the differences will show up as zeroes.

If there's really a need for ethtool stats that aren't counters,
maybe the ethtool API should include flags to indicate which they are.

> Another issue that occurred to me is if multiple people are working
> on troubleshooting a network problem, how do we insure that they all
> get a consistent view of the stats?  If this is done via a kernel
> mechanism then there isn't an issue.  But if it's done via user space,
> then you have to make sure that everyone zeros/snapshots the stats
> at the same time.
> 
> Ideally, one should be able to do something like "ethtool -z ethX"
> to zero/snapshot the driver stats, and then "ethtool -S ethX" to get
> the stats since the last snapshot.  You should be able to use the
> same tool ("ethtool") to do all of this, and not some other special
> tool or specially devised homegrown script.  Why make users lives
> any more difficult than need be?

No-one's stopping you from adding these options to ethtool.  You could
have it save statistic sets in, say, /var/run/ethtool.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-01 22:29                               ` Ben Hutchings
@ 2008-06-02  3:55                                 ` Bill Fink
  0 siblings, 0 replies; 38+ messages in thread
From: Bill Fink @ 2008-06-02  3:55 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

On Sun, 1 Jun 2008, Ben Hutchings wrote:

> Bill Fink wrote:
> 
> > But the question
> > is how does one devise a generic script or tool that doesn't require
> > any special knowledge of the specific NIC being used.  For example,
> > here's the "ethtool -S" info for my myri10ge NIC:
> > 
> > [root@chance8 ~]# ethtool -S eth2
> > NIC statistics:
> [...]
> >      WC: 1
> >      irq: 8413
> >      MSI: 1
> >      read_dma_bw_MBs: 1398
> >      write_dma_bw_MBs: 1613
> >      read_write_dma_bw_MBs: 2711
> >      serial_number: 287046
> [...]
> >      tx_linearized: 0
> >      link_changes: 8
> >      link_up: 1
> [...]
> > 
> > How does one know which of these reported values are counter stats
> > that one wishes to zero/snapshot, and which are not?
> 
> Ah, I see, I didn't realise some drivers were abusing ethtool stats in
> this way.  But for the most part the differences will show up as zeroes.

I'm not sure I would characterize that as abuse of the ethtool stats,
but rather just a different category of stats.  And I wouldn't want
them to show up as zeros after doing a clear/snapshot of the counter
stats.  The "ethtool -S" output should be completely normal afterward,
with just the counter stats being zeroed.

> If there's really a need for ethtool stats that aren't counters,
> maybe the ethtool API should include flags to indicate which they are.

That would be useful.  Maybe some way of determining the type of an
ethtool stat such as COUNTER (perhaps with subtypes for signed versus
unsigned, 32-bit versus 64-bit), RUNTIME_INFO, etc.

> > Another issue that occurred to me is if multiple people are working
> > on troubleshooting a network problem, how do we insure that they all
> > get a consistent view of the stats?  If this is done via a kernel
> > mechanism then there isn't an issue.  But if it's done via user space,
> > then you have to make sure that everyone zeros/snapshots the stats
> > at the same time.
> > 
> > Ideally, one should be able to do something like "ethtool -z ethX"
> > to zero/snapshot the driver stats, and then "ethtool -S ethX" to get
> > the stats since the last snapshot.  You should be able to use the
> > same tool ("ethtool") to do all of this, and not some other special
> > tool or specially devised homegrown script.  Why make users lives
> > any more difficult than need be?
> 
> No-one's stopping you from adding these options to ethtool.  You could
> have it save statistic sets in, say, /var/run/ethtool.

Yes, that could be done if the ability to determine which ethtool stats
were counter stats was added to the ethtool API.  And I guess they
should be saved in something like /var/run/ethtool/ethX.  But then
what happens when you start using multiple network namespaces, and
for example have eth0 in several different network namespaces.  How
could that be handled, to keep the different network namespaces from
clobbering the stats of another namespace?  If done in the kernel,
I believe it would all work as expected, but it's not clear to me
how to handle this in user space.

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-05-31 23:57                         ` Bill Fink
  2008-06-01  1:46                           ` Ben Hutchings
@ 2008-06-02  4:50                           ` Glen Turner
  2008-06-02 16:10                             ` Bill Fink
  1 sibling, 1 reply; 38+ messages in thread
From: Glen Turner @ 2008-06-02  4:50 UTC (permalink / raw)
  To: Bill Fink
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List


> Yes, every individual Linux network administrator can re-create the
> wheel by devising their own scripts, but it makes much more sense
> to me to implement a simple general kernel mechanism once that could
> be used generically, than to have hundreds (or thousands) of Linux
> network administrators each having to do it themselves (perhaps
> multiple times if they have a variety of types of systems and types
> of NICs).

Hi Bill,

If you pull the stats using a SNMP polling tool (torrus, cacti, mrtg)
then those package's graphs give nice "did this get better or worse"
output for debugging network issues.

I'd suggest you use one of those tools rather than writing your
own scripts.  Even if 99% of the time the graphs record zero errors,
knowing when those errors started is very valuable and well worth
the additional effort of configuring the tools over a command-line
or a kernel hack.

The more sophisticated tools can do alerting to Nagios should
a variable suddenly change its behaviour.

The Cisco/Juniper/everyone-else feature to run console stats
separately from SNMP stats is nice, but it's rather tuned to
the needs of router-heads and tends to fall apart when multiple
staff are debugging a fault.

If we do proceed with better command line stats then the number
of errored seconds and the worst errored second and its value
would be useful. These useful numbers can't be calculated by
the SNMP polling tools and it's hard to see how they could be
done in user-space.

Cheers, Glen

-- 
  Glen Turner

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-01 20:46                             ` Bill Fink
  2008-06-01 22:29                               ` Ben Hutchings
@ 2008-06-02  5:39                               ` David Miller
  2008-06-02 15:41                                 ` Bill Fink
  1 sibling, 1 reply; 38+ messages in thread
From: David Miller @ 2008-06-02  5:39 UTC (permalink / raw)
  To: billfink; +Cc: bhutchings, alan, jimi, akpm, linux-kernel, netdev

From: Bill Fink <billfink@mindspring.com>
Date: Sun, 1 Jun 2008 16:46:14 -0400

> Another issue that occurred to me is if multiple people are working
> on troubleshooting a network problem, how do we insure that they all
> get a consistent view of the stats?

Some of them definitely won't get a consistent view if one of them
triggers this undesirable statistic clearing knob.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-02  5:39                               ` David Miller
@ 2008-06-02 15:41                                 ` Bill Fink
  0 siblings, 0 replies; 38+ messages in thread
From: Bill Fink @ 2008-06-02 15:41 UTC (permalink / raw)
  To: David Miller; +Cc: bhutchings, alan, jimi, akpm, linux-kernel, netdev

On Sun, 01 Jun 2008, David Miller wrote:

> From: Bill Fink <billfink@mindspring.com>
> Date: Sun, 1 Jun 2008 16:46:14 -0400
> 
> > Another issue that occurred to me is if multiple people are working
> > on troubleshooting a network problem, how do we insure that they all
> > get a consistent view of the stats?
> 
> Some of them definitely won't get a consistent view if one of them
> triggers this undesirable statistic clearing knob.

The typical case I was referring to is for example when I am assisting
someone else with troubleshooting.  This is coordinated with the system
admin, who would usually be the one to clear/snapshot the stats.  This
cannot be done by a normal user as it requires root privileges.  BTW
I'm also not clear why running "ethtool -S" requires root privileges,
as it would be more convenient if a normal user could do this (currently
I have to be given sudo access).

This is a useful mechanism for many, many people.  Bottom line I don't
really care if it's done in the kernel or in user space, but it should
be possible using the standard "ethtool -S" command, and I have raised
at least a couple of issues with modifying the ethtool command to
support this (but which wouldn't be an issue if implemented in the
kernel).

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-02  4:50                           ` Glen Turner
@ 2008-06-02 16:10                             ` Bill Fink
  2008-06-03 12:28                               ` James Cammarata
  0 siblings, 1 reply; 38+ messages in thread
From: Bill Fink @ 2008-06-02 16:10 UTC (permalink / raw)
  To: Glen Turner
  Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel,
	Linux Netdev List

On Mon, 02 Jun 2008, Glen Turner wrote:

> 
> > Yes, every individual Linux network administrator can re-create the
> > wheel by devising their own scripts, but it makes much more sense
> > to me to implement a simple general kernel mechanism once that could
> > be used generically, than to have hundreds (or thousands) of Linux
> > network administrators each having to do it themselves (perhaps
> > multiple times if they have a variety of types of systems and types
> > of NICs).
> 
> Hi Bill,
> 
> If you pull the stats using a SNMP polling tool (torrus, cacti, mrtg)
> then those package's graphs give nice "did this get better or worse"
> output for debugging network issues.

I do use mrtg for network monitoring to determine when things go
bad, but when they do go bad, then I typically need to get much
more detailed info when troubleshooting the problem.

> I'd suggest you use one of those tools rather than writing your
> own scripts.  Even if 99% of the time the graphs record zero errors,
> knowing when those errors started is very valuable and well worth
> the additional effort of configuring the tools over a command-line
> or a kernel hack.

First of all, when assisting a user, they typically aren't even
running an snmp daemon (and there might be firewall issues to
access it if they are).  And I don't think the "ethtool -S" driver
stats are even accessible via SNMP (although they may contribute
to more generic interface stats which are), and it is the specific
driver stats which are often key to help diagnosing the problem.

> The more sophisticated tools can do alerting to Nagios should
> a variable suddenly change its behaviour.

Definitely useful for certain arenas.

> The Cisco/Juniper/everyone-else feature to run console stats
> separately from SNMP stats is nice, but it's rather tuned to
> the needs of router-heads and tends to fall apart when multiple
> staff are debugging a fault.

I use it all the time in coordination with network peers and
joint troubleshooting.  They clear the interface stats, and they
and I can then view the interface stats as a test is run (they
give me RO access to view the stats), or vice versa depending
on whose network is being examined.

> If we do proceed with better command line stats then the number
> of errored seconds and the worst errored second and its value
> would be useful. These useful numbers can't be calculated by
> the SNMP polling tools and it's hard to see how they could be
> done in user-space.

I'm all for any improved debugging/diagnostic capabilities, including
the extremely useful ability to clear/snapshot driver stats (there
could also be an option to un-snapshot if you wanted to get back to
seeing the absolute counter values).

						-Bill

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-02 16:10                             ` Bill Fink
@ 2008-06-03 12:28                               ` James Cammarata
  2008-06-03 12:35                                 ` Ben Hutchings
  2008-06-03 14:46                                 ` Alan Cox
  0 siblings, 2 replies; 38+ messages in thread
From: James Cammarata @ 2008-06-03 12:28 UTC (permalink / raw)
  To: Bill Fink; +Cc: linux-kernel, Linux Netdev List

> First of all, when assisting a user, they typically aren't even
> running an snmp daemon (and there might be firewall issues to
> access it if they are).  And I don't think the "ethtool -S" driver
> stats are even accessible via SNMP (although they may contribute
> to more generic interface stats which are), and it is the specific
> driver stats which are often key to help diagnosing the problem.

Ok, just to jump back in and add my $0.02, I plan to send a patch to 
the ethtool project to do what was suggested (snapshot to /var, and 
diff stats against that for future reference).  I would just like to 
point out that this will introduce inconsistencies between _every_ 
other source of interface stats (ifconfig/procfs/etc).  I can deal 
with that, it just means one more thing to keep track of when you're 
troubleshooting.

Also, I'd like to point out that the "stepping on other peoples toes" 
argument is a bogus one, since most major network vendors include this 
ability, and it's not exactly a concern now.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-03 12:28                               ` James Cammarata
@ 2008-06-03 12:35                                 ` Ben Hutchings
  2008-06-03 14:46                                 ` Alan Cox
  1 sibling, 0 replies; 38+ messages in thread
From: Ben Hutchings @ 2008-06-03 12:35 UTC (permalink / raw)
  To: James Cammarata; +Cc: Bill Fink, linux-kernel, Linux Netdev List

James Cammarata wrote:
> >First of all, when assisting a user, they typically aren't even
> >running an snmp daemon (and there might be firewall issues to
> >access it if they are).  And I don't think the "ethtool -S" driver
> >stats are even accessible via SNMP (although they may contribute
> >to more generic interface stats which are), and it is the specific
> >driver stats which are often key to help diagnosing the problem.
> 
> Ok, just to jump back in and add my $0.02, I plan to send a patch to 
> the ethtool project to do what was suggested (snapshot to /var, and 
> diff stats against that for future reference).  I would just like to 
> point out that this will introduce inconsistencies between _every_ 
> other source of interface stats (ifconfig/procfs/etc).  I can deal 
> with that, it just means one more thing to keep track of when you're 
> troubleshooting.

You could perhaps add a third column to the statistics when a snapshot
exists, with the second column still showing absolute values and the
third showing the difference from the snapshot.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-03 12:28                               ` James Cammarata
  2008-06-03 12:35                                 ` Ben Hutchings
@ 2008-06-03 14:46                                 ` Alan Cox
  2008-06-04  3:05                                   ` James Cammarata
  1 sibling, 1 reply; 38+ messages in thread
From: Alan Cox @ 2008-06-03 14:46 UTC (permalink / raw)
  To: James Cammarata; +Cc: Bill Fink, linux-kernel, Linux Netdev List

> Also, I'd like to point out that the "stepping on other peoples toes" 
> argument is a bogus one, since most major network vendors include this 
> ability, and it's not exactly a concern now.

I used to work in a large ISP - it was a huge concern then and was
enforced and managed by the less effective 'do you like your kneecaps'
approach to permissions.

Its basically impossible to write a correct non-racy application which
zeros kernel statistics and then measures the change, because you cannot
know another application did the same while you were running.

This is the most basic and blindingly obvious stuff. You should not be
able to zero the kernel stats just because you can't work perl.

Alan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32
  2008-06-03 14:46                                 ` Alan Cox
@ 2008-06-04  3:05                                   ` James Cammarata
  0 siblings, 0 replies; 38+ messages in thread
From: James Cammarata @ 2008-06-04  3:05 UTC (permalink / raw)
  To: Alan Cox; +Cc: Bill Fink, linux-kernel, Linux Netdev List

> I used to work in a large ISP - it was a huge concern then and was
> enforced and managed by the less effective 'do you like your kneecaps'
> approach to permissions.

I work at a large ISP now, and you're absolutely right.  You don't just go 
around resetting interface counters on backbone routers for the hell of it, 
and we never do it without customer permission while troubleshooting an 
issue with a connection, that is why I said I thought it was a non-argument.
There seems to be an irrational fear of counter-based anarchy here.

> Its basically impossible to write a correct non-racy application which
> zeros kernel statistics and then measures the change, because you cannot
> know another application did the same while you were running.
> 
> This is the most basic and blindingly obvious stuff. You should not be
> able to zero the kernel stats just because you can't work perl.

I've already said I'd drop the issue 4+ days ago, and that I'd be more
than happy to do it in userland as you suggested, my point was simply that
adding it to only one userland tool will lead to inconsistencies.  It is not
an issue of being able to "work perl" or not.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2008-06-04  3:03 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <482DA5B6.1020606@sngx.net>
2008-05-16 20:00 ` [PATCH updated] net: add ability to clear per-interface network statistics via procfs David Miller
2008-05-16 20:09   ` Rick Jones
2008-05-17 15:06     ` James Cammarata
     [not found] ` <482DB46A.8020103@cosmosbay.com>
2008-05-16 20:03   ` David Miller
     [not found]   ` <482EF192.4070707@sngx.net>
2008-05-17 21:41     ` Eric Dumazet
2008-05-17 22:49       ` James Cammarata
2008-05-18  0:31         ` Ben Hutchings
2008-05-18  1:43           ` Patrick McHardy
2008-05-18  5:09           ` James Cammarata
2008-05-18 11:27             ` Ben Hutchings
2008-05-29  1:45             ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
2008-05-29  2:08               ` James Cammarata
2008-05-29  5:11               ` Andrew Morton
2008-05-29 12:34                 ` James Cammarata
2008-05-29 14:45                   ` Alan Cox
2008-05-29 17:15                     ` James Cammarata
2008-05-29 20:50                       ` David Miller
2008-05-29 21:18                         ` James Cammarata
2008-05-30 19:12                     ` Bill Fink
2008-05-30 22:14                       ` Rick Jones
2008-05-31  1:09                         ` Bill Fink
2008-05-31  2:41                           ` Stephen Hemminger
2008-05-31  4:47                             ` Bill Fink
2008-05-31 12:11                       ` Alan Cox
2008-05-31 23:57                         ` Bill Fink
2008-06-01  1:46                           ` Ben Hutchings
2008-06-01 20:46                             ` Bill Fink
2008-06-01 22:29                               ` Ben Hutchings
2008-06-02  3:55                                 ` Bill Fink
2008-06-02  5:39                               ` David Miller
2008-06-02 15:41                                 ` Bill Fink
2008-06-02  4:50                           ` Glen Turner
2008-06-02 16:10                             ` Bill Fink
2008-06-03 12:28                               ` James Cammarata
2008-06-03 12:35                                 ` Ben Hutchings
2008-06-03 14:46                                 ` Alan Cox
2008-06-04  3:05                                   ` James Cammarata
2008-05-29 14:48                   ` Chris Friesen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).