* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs [not found] <482DA5B6.1020606@sngx.net> @ 2008-05-16 20:00 ` David Miller 2008-05-16 20:09 ` Rick Jones [not found] ` <482DB46A.8020103@cosmosbay.com> 1 sibling, 1 reply; 38+ messages in thread From: David Miller @ 2008-05-16 20:00 UTC (permalink / raw) To: jimi; +Cc: netdev, linux-kernel From: James Cammarata <jimi@sngx.net> Date: Fri, 16 May 2008 10:18:14 -0500 We saw your initial patch posting, you don't need to send this again. netdev added to CC: > From: James Cammarata <jimi@sngx.net> > > Added the ability to write to /proc/net/dev in order to clear the interface counters for a given interface. The ability to zero out counters (especially the error counters) is extremely useful when troubleshooting interface issues. Now diffed against 2.6.25.4, and added some sanity checking. > > Syntax: > echo 'net clear-stats ifdev' > /proc/net/dev > Where "ifdev" is the device name you wish to clear. > > This code is based mainly on the code found in drivers/scsi/scsi_proc.c > > Signed-off-by: James Cammarata <jimi@sngx.net> > --- > > --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 -0500 > +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-16 07:02:48.000000000 -0500 > @@ -2455,2 +2455,78 @@ > > +/** > + * proc_net_dev_write - handle writes to /proc/net/dev > + * @file: not used > + * @buf: buffer to write > + * @length: length of buf, at most PAGE_SIZE > + * @ppos: not used > + * > + * Description: this provides a mechanism to clear statistics on a > + * per-interface basis > + * "echo 'net clear-stats ifdev' >/proc/net/dev" > + * with "ifdev" replaced by the device name you wish to clear. > + * > + */ > +static ssize_t proc_net_dev_write(struct file *file, const char __user *buf, > + size_t length, loff_t *ppos) > +{ > + char *buffer, *p; > + char devname[IFNAMSIZ]; > + struct net *net; > + struct net_device *dev; > + int err; > + > + if (!buf || length > PAGE_SIZE) > + return -EINVAL; > + > + buffer = (char *)__get_free_page(GFP_KERNEL); > + if (!buffer) > + return -ENOMEM; > + > + err = -EFAULT; > + if (copy_from_user(buffer, buf, length)) > + goto out; > + > + err = -EINVAL; > + if (length < PAGE_SIZE) > + buffer[length] = '\0'; > + else if (buffer[PAGE_SIZE-1]) > + goto out; > + > + err = -ENXIO; > + net = get_proc_net(file->f_dentry->d_inode); > + if (!net) > + goto out; > + > + /* > + * Usage: echo "net clear-stats ifdev" >/proc/net/dev > + * with "ifdev" replaced by the device name you wish to clear. > + */ > + if (!strncmp("net clear-stats",buffer,15)) { > + p = buffer + 16; > + if(sscanf(p,"%16s",devname)>0) { > + dev = dev_get_by_name(net,devname); > + if (dev) { > + if (dev->get_stats) { > + struct net_device_stats *stats = > + dev->get_stats(dev); > + memset(stats,0, > + sizeof(struct net_device_stats)); > + } > + dev_put(dev); > + } > + } > + } > + > + /* > + * convert success returns so that we return the > + * number of bytes consumed. > + */ > + if (!err) > + err = length; > + > + out: > + free_page((unsigned long)buffer); > + return err; > +} > + > static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) > @@ -2498,2 +2574,3 @@ > .read = seq_read, > + .write = proc_net_dev_write, > .llseek = seq_lseek, > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-16 20:00 ` [PATCH updated] net: add ability to clear per-interface network statistics via procfs David Miller @ 2008-05-16 20:09 ` Rick Jones 2008-05-17 15:06 ` James Cammarata 0 siblings, 1 reply; 38+ messages in thread From: Rick Jones @ 2008-05-16 20:09 UTC (permalink / raw) To: David Miller; +Cc: jimi, netdev, linux-kernel David Miller wrote: > From: James Cammarata <jimi@sngx.net> > Date: Fri, 16 May 2008 10:18:14 -0500 > > We saw your initial patch posting, you don't need to send this > again. > > netdev added to CC: > > >>From: James Cammarata <jimi@sngx.net> >> >> Added the ability to write to /proc/net/dev in order to clear the >> interface counters for a given interface. The ability to zero out >> counters (especially the error counters) is extremely useful when >> troubleshooting interface issues. Now diffed against 2.6.25.4, and >> added some sanity checking. >> >>Syntax: >> echo 'net clear-stats ifdev' > /proc/net/dev >>Where "ifdev" is the device name you wish to clear. I don't wish to suggest I don't like the change - I've used similar features on "other oses" in the past, but when something like this has come-up in the past, haven't there been concerns about MIBs and SNMP and whatnot? rick jones ftp://ftp.cup.hp.com/dist/networking/tools - beforeafter - subtract one set of stats from another... ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-16 20:09 ` Rick Jones @ 2008-05-17 15:06 ` James Cammarata 0 siblings, 0 replies; 38+ messages in thread From: James Cammarata @ 2008-05-17 15:06 UTC (permalink / raw) To: Rick Jones; +Cc: David Miller, netdev, linux-kernel >> We saw your initial patch posting, you don't need to send this >> again. >> >> netdev added to CC: I had regenerated the diff against .25.4 and made a few changes so I resubmitted. Sorry if I made it seem like I was spamming the patch. > I don't wish to suggest I don't like the change - I've used similar > features on "other oses" in the past, but when something like this has > come-up in the past, haven't there been concerns about MIBs and SNMP and > whatnot? > > rick jones > ftp://ftp.cup.hp.com/dist/networking/tools - beforeafter - subtract one > set of stats from another... Yes, this would definitely be a caveat, as it could cause problems if the software takes a baseline once, and calculates differences from that ever after. For others, it would cause a hiccup where stats show as being negative for one period. However, I think the benefit of being able to troubleshoot issues on an interface in a non-invasive manner outweigh the downside here. Like you say, this ability is already found elsewhere, most namely in network devices like Ciscos, and the concern for clearing stats there does not exist. ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <482DB46A.8020103@cosmosbay.com>]
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs [not found] ` <482DB46A.8020103@cosmosbay.com> @ 2008-05-16 20:03 ` David Miller [not found] ` <482EF192.4070707@sngx.net> 1 sibling, 0 replies; 38+ messages in thread From: David Miller @ 2008-05-16 20:03 UTC (permalink / raw) To: dada1; +Cc: jimi, netdev, linux-kernel From: Eric Dumazet <dada1@cosmosbay.com> Date: Fri, 16 May 2008 18:20:58 +0200 netdev CC:'d, people please... Eric, you in particular know better :-))) > James Cammarata a écrit : > > From: James Cammarata <jimi@sngx.net> > > > > Added the ability to write to /proc/net/dev in order to clear the > > interface counters for a given interface. The ability to zero out > > counters (especially the error counters) is extremely useful when > > troubleshooting interface issues. Now diffed against 2.6.25.4, and > > added some sanity checking. > > > Hello James > > 1) Every call to get_proc_net() should be paired with a call to > put_net(), or else you leak struct net. > > 2) I am not sure this is the right way to do this... Did you consider to > extend ethtool instead ? > > > Syntax: > > echo 'net clear-stats ifdev' > /proc/net/dev > > Where "ifdev" is the device name you wish to clear. > > > > This code is based mainly on the code found in drivers/scsi/scsi_proc.c > > > > Signed-off-by: James Cammarata <jimi@sngx.net> > > --- > > > > --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 > > -0500 > > +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-16 > > 07:02:48.000000000 -0500 > > @@ -2455,2 +2455,78 @@ > > > > +/** > > + * proc_net_dev_write - handle writes to /proc/net/dev > > + * @file: not used > > + * @buf: buffer to write > > + * @length: length of buf, at most PAGE_SIZE > > + * @ppos: not used > > + * > > + * Description: this provides a mechanism to clear statistics on a > > + * per-interface basis > > + * "echo 'net clear-stats ifdev' >/proc/net/dev" > > + * with "ifdev" replaced by the device name you wish to clear. > > + * > > + */ > > +static ssize_t proc_net_dev_write(struct file *file, const char > > __user *buf, > > + size_t length, loff_t *ppos) > > +{ > > + char *buffer, *p; > > + char devname[IFNAMSIZ]; > > + struct net *net; > > + struct net_device *dev; > > + int err; > > + > > + if (!buf || length > PAGE_SIZE) > > + return -EINVAL; > > + > > + buffer = (char *)__get_free_page(GFP_KERNEL); > > + if (!buffer) > > + return -ENOMEM; > > + > > + err = -EFAULT; > > + if (copy_from_user(buffer, buf, length)) > > + goto out; > > + > > + err = -EINVAL; > > + if (length < PAGE_SIZE) > > + buffer[length] = '\0'; > > + else if (buffer[PAGE_SIZE-1]) > > + goto out; > > + > > + err = -ENXIO; > > + net = get_proc_net(file->f_dentry->d_inode); > > + if (!net) > > + goto out; > > + > > + /* > > + * Usage: echo "net clear-stats ifdev" >/proc/net/dev > > + * with "ifdev" replaced by the device name you wish to clear. > > + */ > > + if (!strncmp("net clear-stats",buffer,15)) { > > + p = buffer + 16; > > + if(sscanf(p,"%16s",devname)>0) { > > + dev = dev_get_by_name(net,devname); > > + if (dev) { > > + if (dev->get_stats) { > > + struct net_device_stats *stats = > > + dev->get_stats(dev); > > + memset(stats,0, > > + sizeof(struct > > net_device_stats)); > > + } > > + dev_put(dev); > > + } > > + } > > + } > > + > > + /* > > + * convert success returns so that we return the > > + * number of bytes consumed. > > + */ > > + if (!err) > > + err = length; > > + > Here probably, you should add > > put_net(net); > > > + out: > > + free_page((unsigned long)buffer); > > + return err; > > +} > > + > > static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) > > @@ -2498,2 +2574,3 @@ > > .read = seq_read, > > + .write = proc_net_dev_write, > > .llseek = seq_lseek, > > > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <482EF192.4070707@sngx.net>]
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs [not found] ` <482EF192.4070707@sngx.net> @ 2008-05-17 21:41 ` Eric Dumazet 2008-05-17 22:49 ` James Cammarata 0 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2008-05-17 21:41 UTC (permalink / raw) To: James Cammarata; +Cc: linux-kernel, Linux Netdev List James Cammarata a écrit : >> 1) Every call to get_proc_net() should be paired with a call to >> put_net(), or else you leak struct net. > > My mistake, I will add this and resubmit. > This version has a bug, please check again. >> 2) I am not sure this is the right way to do this... Did you consider >> to extend ethtool instead ? > > Yes, I did consider that. My reasoning for adding it to procfs is > that each > driver defines their own ethtool interface, so it was much simpler to > add it here, with a lot less code changes and broader support. > Maybe much simpler but looks like a hack... > > --- > > --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 > -0500 > +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-17 > 09:50:22.000000000 -0500 > @@ -2455,2 +2455,80 @@ > > +/** > + * proc_net_dev_write - handle writes to /proc/net/dev > + * @file: not used > + * @buf: buffer to write > + * @length: length of buf, at most PAGE_SIZE > + * @ppos: not used > + * > + * Description: this provides a mechanism to clear statistics on a > + * per-interface basis > + * "echo 'net clear-stats ifdev' >/proc/net/dev" > + * with "ifdev" replaced by the device name you wish to clear. > + * > + */ > +static ssize_t proc_net_dev_write(struct file *file, const char > __user *buf, > + size_t length, loff_t *ppos) > +{ > + char *buffer, *p; > + char devname[IFNAMSIZ]; > + struct net *net; > + struct net_device *dev; > + int err; > + > + if (!buf || length > PAGE_SIZE) > + return -EINVAL; > + > + buffer = (char *)__get_free_page(GFP_KERNEL); > + if (!buffer) > + return -ENOMEM; > + > + err = -EFAULT; > + if (copy_from_user(buffer, buf, length)) net is not initialized so you will call put_net(garbage) and crash My suggestion was better, since you dont have to test if (net) , just call put_net(net) at the right place (before out: label) > + goto out; > + > + err = -EINVAL; > + if (length < PAGE_SIZE) > + buffer[length] = '\0'; > + else if (buffer[PAGE_SIZE-1]) > + goto out; > + > + err = -ENXIO; > + net = get_proc_net(file->f_dentry->d_inode); > + if (!net) > + goto out; > + > + /* > + * Usage: echo "net clear-stats ifdev" >/proc/net/dev > + * with "ifdev" replaced by the device name you wish to clear. > + */ > + if (!strncmp("net clear-stats",buffer,15)) { > + p = buffer + 16; > + if(sscanf(p,"%16s",devname)>0) { > + dev = dev_get_by_name(net,devname); > + if (dev) { > + if (dev->get_stats) { > + struct net_device_stats *stats = > + dev->get_stats(dev); > + memset(stats,0, > + sizeof(struct > net_device_stats)); > + } > + dev_put(dev); > + } > + } > + } > + > + /* > + * convert success returns so that we return the > + * number of bytes consumed. > + */ > + if (!err) > + err = length; > + > + out: > + if(net) > + put_net(net); > + free_page((unsigned long)buffer); > + return err; > +} > + > static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) > @@ -2498,2 +2576,3 @@ > .read = seq_read, > + .write = proc_net_dev_write, > .llseek = seq_lseek, > > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-17 21:41 ` Eric Dumazet @ 2008-05-17 22:49 ` James Cammarata 2008-05-18 0:31 ` Ben Hutchings 0 siblings, 1 reply; 38+ messages in thread From: James Cammarata @ 2008-05-17 22:49 UTC (permalink / raw) To: Eric Dumazet; +Cc: linux-kernel, Linux Netdev List > This version has a bug, please check again. Gah, you're right, fixed below. --- --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-17 17:45:57.000000000 -0500 @@ -2455,2 +2455,80 @@ +/** + * proc_net_dev_write - handle writes to /proc/net/dev + * @file: not used + * @buf: buffer to write + * @length: length of buf, at most PAGE_SIZE + * @ppos: not used + * + * Description: this provides a mechanism to clear statistics on a + * per-interface basis + * "echo 'net clear-stats ifdev' >/proc/net/dev" + * with "ifdev" replaced by the device name you wish to clear. + * + */ +static ssize_t proc_net_dev_write(struct file *file, const char __user *buf, + size_t length, loff_t *ppos) +{ + char *buffer, *p; + char devname[IFNAMSIZ]; + struct net *net; + struct net_device *dev; + int err; + + if (!buf || length > PAGE_SIZE) + return -EINVAL; + + buffer = (char *)__get_free_page(GFP_KERNEL); + if (!buffer) + return -ENOMEM; + + err = -EFAULT; + if (copy_from_user(buffer, buf, length)) + goto out; + + err = -EINVAL; + if (length < PAGE_SIZE) + buffer[length] = '\0'; + else if (buffer[PAGE_SIZE-1]) + goto out; + + err = -ENXIO; + net = get_proc_net(file->f_dentry->d_inode); + if (!net) + goto out; + + /* + * Usage: echo "net clear-stats ifdev" >/proc/net/dev + * with "ifdev" replaced by the device name you wish to clear. + */ + if (!strncmp("net clear-stats",buffer,15)) { + p = buffer + 16; + if(sscanf(p,"%16s",devname)>0) { + dev = dev_get_by_name(net,devname); + if (dev) { + if (dev->get_stats) { + struct net_device_stats *stats = + dev->get_stats(dev); + memset(stats,0, + sizeof(struct net_device_stats)); + } + dev_put(dev); + } + } + } + + /* + * convert success returns so that we return the + * number of bytes consumed. + */ + if (!err) + err = length; + + put_net(net); + + out: + free_page((unsigned long)buffer); + return err; +} + static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) @@ -2498,2 +2576,3 @@ .read = seq_read, + .write = proc_net_dev_write, .llseek = seq_lseek, ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-17 22:49 ` James Cammarata @ 2008-05-18 0:31 ` Ben Hutchings 2008-05-18 1:43 ` Patrick McHardy 2008-05-18 5:09 ` James Cammarata 0 siblings, 2 replies; 38+ messages in thread From: Ben Hutchings @ 2008-05-18 0:31 UTC (permalink / raw) To: James Cammarata; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List James Cammarata wrote: > >This version has a bug, please check again. > > Gah, you're right, fixed below. > > --- > > --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 -0500 > +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-17 > 17:45:57.000000000 -0500 > @@ -2455,2 +2455,80 @@ > > +/** > + * proc_net_dev_write - handle writes to /proc/net/dev > + * @file: not used > + * @buf: buffer to write > + * @length: length of buf, at most PAGE_SIZE > + * @ppos: not used > + * > + * Description: this provides a mechanism to clear statistics on a > + * per-interface basis > + * "echo 'net clear-stats ifdev' >/proc/net/dev" > + * with "ifdev" replaced by the device name you wish to clear. > + * > + */ > +static ssize_t proc_net_dev_write(struct file *file, const char __user > *buf, You need to stop your mail client word-wrapping patches. It looks like something has converted tabs to spaces, too. [...] > + /* > + * Usage: echo "net clear-stats ifdev" >/proc/net/dev > + * with "ifdev" replaced by the device name you wish to clear. > + */ This is redundant with the kernel-doc comment. > + if (!strncmp("net clear-stats",buffer,15)) { This doesn't check for a space after, which you rely on later on. (What if length == 15?) Also, explicitly writing the length of a literal string is error-prone. Seems like it would be better to do something like: static const char command[] = "net clear-stats "; ... if (!strncmp(command, buffer, sizeof(command) - 1)) { > + p = buffer + 16; > + if(sscanf(p,"%16s",devname)>0) { > + dev = dev_get_by_name(net,devname); > + if (dev) { > + if (dev->get_stats) { > + struct net_device_stats *stats = > + dev->get_stats(dev); > + memset(stats,0, > + sizeof(struct > net_device_stats)); > + } > + dev_put(dev); > + } Shouldn't this return an error if the device doesn't exist? > + } > + } > + > + /* > + * convert success returns so that we return the > + * number of bytes consumed. > + */ > + if (!err) > + err = length; This won't work; the function updates err unconditionally further up! > + > + put_net(net); > + > + out: > + free_page((unsigned long)buffer); > + return err; > +} > + > static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) > @@ -2498,2 +2576,3 @@ > .read = seq_read, > + .write = proc_net_dev_write, > .llseek = seq_lseek, The context for this chunk seems to be too short. There are a few formatting oddities; checkpatch.pl will point them out. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-18 0:31 ` Ben Hutchings @ 2008-05-18 1:43 ` Patrick McHardy 2008-05-18 5:09 ` James Cammarata 1 sibling, 0 replies; 38+ messages in thread From: Patrick McHardy @ 2008-05-18 1:43 UTC (permalink / raw) To: Ben Hutchings Cc: James Cammarata, Eric Dumazet, linux-kernel, Linux Netdev List Ben Hutchings wrote: >> + dev = dev_get_by_name(net,devname); >> + if (dev) { >> + if (dev->get_stats) { >> + struct net_device_stats *stats = >> + dev->get_stats(dev); >> + memset(stats,0, >> + sizeof(struct >> net_device_stats)); This doesn't work for devices that generate the stats (for example by reading them from the hardware) in ->get_stats(). ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-18 0:31 ` Ben Hutchings 2008-05-18 1:43 ` Patrick McHardy @ 2008-05-18 5:09 ` James Cammarata 2008-05-18 11:27 ` Ben Hutchings 2008-05-29 1:45 ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata 1 sibling, 2 replies; 38+ messages in thread From: James Cammarata @ 2008-05-18 5:09 UTC (permalink / raw) To: Ben Hutchings; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List > You need to stop your mail client word-wrapping patches. It looks like > something has converted tabs to spaces, too. Hmm, I don't know what might be causing that. I'm using Thunderbird and disabled wrapping, and things look ok on lkml when I view the archives there. The tab/spaces issue was a result of me copying it out of a terminal window, and the patch now clears checkpatch.pl. As for the rest, I've done as you suggested, my only comment about it is that like I said, my code is based heavily on that found in drivers/scsi/scsi_proc.c, so it may be worth it to do the same changes there. --- --- linux-2.6.25.4/net/core/dev.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/net/core/dev.c 2008-05-17 20:53:13.000000000 -0500 @@ -2453,6 +2453,75 @@ static struct netif_rx_stats *softnet_ge return rc; } +/** + * proc_net_dev_write - handle writes to /proc/net/dev + * @file: not used + * @buf: buffer to write + * @length: length of buf, at most PAGE_SIZE + * @ppos: not used + * + * Description: this provides a mechanism to clear statistics on a + * per-interface basis + * "echo 'net clear-stats ifdev' >/proc/net/dev" + * with "ifdev" replaced by the device name you wish to clear. + * + */ +static ssize_t proc_net_dev_write(struct file *file, const char __user *buf, + size_t length, loff_t *ppos) +{ + char *buffer, *p; + char devname[IFNAMSIZ]; + static const char command[] = "net clear-stats "; + struct net *net; + struct net_device *dev; + int err; + + if (!buf || length > PAGE_SIZE) + return -EINVAL; + + buffer = (char *)__get_free_page(GFP_KERNEL); + if (!buffer) + return -ENOMEM; + + err = -EFAULT; + if (copy_from_user(buffer, buf, length)) + goto out; + + err = -EINVAL; + if (length < PAGE_SIZE) + buffer[length] = '\0'; + else if (buffer[PAGE_SIZE-1]) + goto out; + + err = -ENXIO; + net = get_proc_net(file->f_dentry->d_inode); + if (!net) + goto out; + + if (!strncmp(command, buffer, sizeof(command) - 1)) { + p = buffer + strlen(command); + if (sscanf(p, "%16s", devname) > 0) { + dev = dev_get_by_name(net, devname); + if (dev) { + if (dev->get_stats) { + struct net_device_stats *stats = + dev->get_stats(dev); + memset(stats, 0, + sizeof(struct net_device_stats)); + } + dev_put(dev); + err = length; + } + } + } + + put_net(net); + + out: + free_page((unsigned long)buffer); + return err; +} + static void *softnet_seq_start(struct seq_file *seq, loff_t *pos) { return softnet_get_online(pos); @@ -2496,6 +2565,7 @@ static const struct file_operations dev_ .owner = THIS_MODULE, .open = dev_seq_open, .read = seq_read, + .write = proc_net_dev_write, .llseek = seq_lseek, .release = seq_release_net, }; ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH updated] net: add ability to clear per-interface network statistics via procfs 2008-05-18 5:09 ` James Cammarata @ 2008-05-18 11:27 ` Ben Hutchings 2008-05-29 1:45 ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata 1 sibling, 0 replies; 38+ messages in thread From: Ben Hutchings @ 2008-05-18 11:27 UTC (permalink / raw) To: James Cammarata; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List James Cammarata wrote: > >You need to stop your mail client word-wrapping patches. It looks like > >something has converted tabs to spaces, too. > > Hmm, I don't know what might be causing that. I'm using Thunderbird and > disabled wrapping, and things look ok on lkml when I view the archives > there. Sorry, you're right - your messages are marked as format=flowed, which means they should be word-wrapped when displayed. This is odd but doesn't break the patches. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-18 5:09 ` James Cammarata 2008-05-18 11:27 ` Ben Hutchings @ 2008-05-29 1:45 ` James Cammarata 2008-05-29 2:08 ` James Cammarata 2008-05-29 5:11 ` Andrew Morton 1 sibling, 2 replies; 38+ messages in thread From: James Cammarata @ 2008-05-29 1:45 UTC (permalink / raw) To: linux-kernel; +Cc: Linux Netdev List I had originally submitted this ability as a patch to procfs, and general consensus was that it seemed "hackish" to do it that way. It was suggested that it be implemented as a feature of ethtool, so I thought I'd take on the challenge and add it that way. I've laid the groundwork here, and added the ability to two of the drivers for which I have hardware to test on (e1000 and pcnet32). I also added the code required to call this functionality to the ethtool code base, and have been using that to test my changes (using -z as the flag to ethtool for clearing stats for now). If this is an acceptable start, I will gladly start working on adding this to as many drivers as possible (we do have some bnx2 hardware, though I'm not sure if it's available for testing). Some other sys-admins have mentioned to me that ethtool doesn't work sometimes and they fall back to using mii-tool to configure interfaces, so I'd still like to have the procfs change implemented to accommodate that hardware, but I agree, this is a better way forward. Signed-off-by: James Cammarata <jimi@sngx.net> --- --- linux-2.6.25.4/include/linux/ethtool.h 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/include/linux/ethtool.h 2008-05-28 17:45:56.000000000 -0500 @@ -325,6 +325,7 @@ int ethtool_op_set_flags(struct net_devi * get_strings: Return a set of strings that describe the requested objects * phys_id: Identify the device * get_stats: Return statistics about the device + * clear_ethtool_stats: Clear device statistics * get_flags: get 32-bit flags bitmap * set_flags: set 32-bit flags bitmap * @@ -383,6 +384,7 @@ struct ethtool_ops { void (*get_strings)(struct net_device *, u32 stringset, u8 *); int (*phys_id)(struct net_device *, u32); void (*get_ethtool_stats)(struct net_device *, struct ethtool_stats *, u64 *); + int (*clear_ethtool_stats)(struct net_device *); int (*begin)(struct net_device *); void (*complete)(struct net_device *); u32 (*get_ufo)(struct net_device *); @@ -441,6 +443,7 @@ struct ethtool_ops { #define ETHTOOL_SFLAGS 0x00000026 /* Set flags bitmap(ethtool_value) */ #define ETHTOOL_GPFLAGS 0x00000027 /* Get driver-private flags bitmap */ #define ETHTOOL_SPFLAGS 0x00000028 /* Set driver-private flags bitmap */ +#define ETHTOOL_CSTATS 0x00000029 /* Clear NIC statistics */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET --- linux-2.6.25.4/net/core/ethtool.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/net/core/ethtool.c 2008-05-28 17:48:51.000000000 -0500 @@ -714,6 +714,16 @@ static int ethtool_get_stats(struct net_ return ret; } +static int ethtool_clear_stats(struct net_device *dev, void __user *useraddr) +{ + const struct ethtool_ops *ops = dev->ethtool_ops; + + if (!ops->clear_ethtool_stats) + return -EOPNOTSUPP; + + return ops->clear_ethtool_stats(dev); +} + static int ethtool_get_perm_addr(struct net_device *dev, void __user *useraddr) { struct ethtool_perm_addr epaddr; @@ -964,6 +974,9 @@ int dev_ethtool(struct net *net, struct rc = ethtool_set_value(dev, useraddr, dev->ethtool_ops->set_priv_flags); break; + case ETHTOOL_CSTATS: + rc = ethtool_clear_stats(dev, useraddr); + break; default: rc = -EOPNOTSUPP; } --- linux-2.6.25.4/drivers/net/e1000/e1000_main.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_main.c 2008-05-28 17:51:07.000000000 -0500 @@ -141,6 +141,7 @@ static void e1000_free_tx_resources(stru static void e1000_free_rx_resources(struct e1000_adapter *adapter, struct e1000_rx_ring *rx_ring); void e1000_update_stats(struct e1000_adapter *adapter); +void e1000_clear_stats(struct e1000_adapter *adapter); static int e1000_init_module(void); static void e1000_exit_module(void); @@ -3819,6 +3820,31 @@ e1000_update_stats(struct e1000_adapter spin_unlock_irqrestore(&adapter->stats_lock, flags); } +void +e1000_clear_stats(struct e1000_adapter *adapter) +{ + struct pci_dev *pdev = adapter->pdev; + unsigned long flags; + + #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF + + /* + * Prevent stats update while adapter is being reset, or if the pci + * connection is down. + */ + if (adapter->link_speed == 0) + return; + if (pci_channel_offline(pdev)) + return; + + spin_lock_irqsave(&adapter->stats_lock, flags); + + memset(&adapter->stats, 0, sizeof(struct e1000_hw_stats)); + memset(&adapter->net_stats, 0, sizeof(struct net_device_stats)); + + spin_unlock_irqrestore(&adapter->stats_lock, flags); +} + /** * e1000_intr_msi - Interrupt Handler * @irq: interrupt number --- linux-2.6.25.4/drivers/net/e1000/e1000_ethtool.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_ethtool.c 2008-05-28 17:53:28.000000000 -0500 @@ -42,7 +42,7 @@ extern int e1000_setup_all_tx_resources( extern void e1000_free_all_rx_resources(struct e1000_adapter *adapter); extern void e1000_free_all_tx_resources(struct e1000_adapter *adapter); extern void e1000_update_stats(struct e1000_adapter *adapter); - +extern void e1000_clear_stats(struct e1000_adapter *adapter); struct e1000_stats { char stat_string[ETH_GSTRING_LEN]; @@ -1940,6 +1940,14 @@ e1000_get_ethtool_stats(struct net_devic /* BUG_ON(i != E1000_STATS_LEN); */ } +static int +e1000_clear_ethtool_stats(struct net_device *netdev) +{ + struct e1000_adapter *adapter = netdev_priv(netdev); + e1000_clear_stats(adapter); + return 0; +} + static void e1000_get_strings(struct net_device *netdev, uint32_t stringset, uint8_t *data) { @@ -1991,6 +1999,7 @@ static const struct ethtool_ops e1000_et .get_strings = e1000_get_strings, .phys_id = e1000_phys_id, .get_ethtool_stats = e1000_get_ethtool_stats, + .clear_ethtool_stats = e1000_clear_ethtool_stats, .get_sset_count = e1000_get_sset_count, }; --- linux-2.6.25.4/drivers/net/pcnet32.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/drivers/net/pcnet32.c 2008-05-28 17:59:42.000000000 -0500 @@ -312,6 +312,7 @@ static void pcnet32_tx_timeout(struct ne static irqreturn_t pcnet32_interrupt(int, void *); static int pcnet32_close(struct net_device *); static struct net_device_stats *pcnet32_get_stats(struct net_device *); +static int pcnet32_clear_stats(struct net_device *); static void pcnet32_load_multicast(struct net_device *dev); static void pcnet32_set_multicast_list(struct net_device *); static int pcnet32_ioctl(struct net_device *, struct ifreq *, int); @@ -1532,6 +1533,7 @@ static const struct ethtool_ops pcnet32_ .get_regs_len = pcnet32_get_regs_len, .get_regs = pcnet32_get_regs, .get_sset_count = pcnet32_get_sset_count, + .clear_ethtool_stats = pcnet32_clear_stats, }; /* only probes for non-PCI devices, the rest are handled by @@ -2719,6 +2721,17 @@ static struct net_device_stats *pcnet32_ return &dev->stats; } +static int pcnet32_clear_stats(struct net_device *dev) +{ + struct pcnet32_private *lp = netdev_priv(dev); + unsigned long flags; + + spin_lock_irqsave(&lp->lock, flags); + memset(&dev->stats, 0, sizeof(struct net_device_stats)); + spin_unlock_irqrestore(&lp->lock, flags); + return 0; +} + /* taken from the sunlance driver, which it took from the depca driver */ static void pcnet32_load_multicast(struct net_device *dev) { ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 1:45 ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata @ 2008-05-29 2:08 ` James Cammarata 2008-05-29 5:11 ` Andrew Morton 1 sibling, 0 replies; 38+ messages in thread From: James Cammarata @ 2008-05-29 2:08 UTC (permalink / raw) To: linux-kernel; +Cc: Linux Netdev List Missed an errant define in some copied code (from the get_stats function in e1000_main.c). Signed-off-by: James Cammarata <jimi@sngx.net> --- --- linux-2.6.25.4/drivers/net/e1000/e1000_main.c 2008-05-15 10:00:12.000000000 -0500 +++ linux-2.6.25.4-jcammara/drivers/net/e1000/e1000_main.c 2008-05-28 21:06:19.000000000 -0500 @@ -141,6 +141,7 @@ static void e1000_free_tx_resources(stru static void e1000_free_rx_resources(struct e1000_adapter *adapter, struct e1000_rx_ring *rx_ring); void e1000_update_stats(struct e1000_adapter *adapter); +void e1000_clear_stats(struct e1000_adapter *adapter); static int e1000_init_module(void); static void e1000_exit_module(void); @@ -3819,6 +3820,29 @@ e1000_update_stats(struct e1000_adapter spin_unlock_irqrestore(&adapter->stats_lock, flags); } +void +e1000_clear_stats(struct e1000_adapter *adapter) +{ + struct pci_dev *pdev = adapter->pdev; + unsigned long flags; + + /* + * Prevent stats update while adapter is being reset, or if the pci + * connection is down. + */ + if (adapter->link_speed == 0) + return; + if (pci_channel_offline(pdev)) + return; + + spin_lock_irqsave(&adapter->stats_lock, flags); + + memset(&adapter->stats, 0, sizeof(struct e1000_hw_stats)); + memset(&adapter->net_stats, 0, sizeof(struct net_device_stats)); + + spin_unlock_irqrestore(&adapter->stats_lock, flags); +} + /** * e1000_intr_msi - Interrupt Handler * @irq: interrupt number ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 1:45 ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata 2008-05-29 2:08 ` James Cammarata @ 2008-05-29 5:11 ` Andrew Morton 2008-05-29 12:34 ` James Cammarata 1 sibling, 1 reply; 38+ messages in thread From: Andrew Morton @ 2008-05-29 5:11 UTC (permalink / raw) To: James Cammarata; +Cc: linux-kernel, Linux Netdev List On Wed, 28 May 2008 20:45:18 -0500 James Cammarata <jimi@sngx.net> wrote: > I had originally submitted this ability as a patch to procfs, and > general consensus was that it seemed "hackish" to do it that way. It > was suggested that it be implemented as a feature of ethtool, so I > thought I'd take on the challenge and add it that way. > > I've laid the groundwork here, and added the ability to two of the > drivers for which I have hardware to test on (e1000 and pcnet32). I > also added the code required to call this functionality to the ethtool > code base, and have been using that to test my changes (using -z as the > flag to ethtool for clearing stats for now). > > If this is an acceptable start, I will gladly start working on adding > this to as many drivers as possible (we do have some bnx2 hardware, > though I'm not sure if it's available for testing). > > Some other sys-admins have mentioned to me that ethtool doesn't work > sometimes and they fall back to using mii-tool to configure interfaces, > so I'd still like to have the procfs change implemented to accommodate > that hardware, but I agree, this is a better way forward. You didn't provide a reason from adding this feature to the kernel. Many of the kernel's accounting accumulators cannot be reset. We handle that in userspace tools by using subtraction. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 5:11 ` Andrew Morton @ 2008-05-29 12:34 ` James Cammarata 2008-05-29 14:45 ` Alan Cox 2008-05-29 14:48 ` Chris Friesen 0 siblings, 2 replies; 38+ messages in thread From: James Cammarata @ 2008-05-29 12:34 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, Linux Netdev List > You didn't provide a reason from adding this feature to the kernel. Sorry, my reasoning was in my original patch only. The ability to reset network counters is, in my experience, one of the first things you do when trying to troubleshoot networking issues - especially when you have incrementing errors. > Many of the kernel's accounting accumulators cannot be reset. We > handle that in userspace tools by using subtraction. I don't think that should preclude the ability to reset these, unless it is shown that it would break something very badly. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 12:34 ` James Cammarata @ 2008-05-29 14:45 ` Alan Cox 2008-05-29 17:15 ` James Cammarata 2008-05-30 19:12 ` Bill Fink 2008-05-29 14:48 ` Chris Friesen 1 sibling, 2 replies; 38+ messages in thread From: Alan Cox @ 2008-05-29 14:45 UTC (permalink / raw) To: James Cammarata; +Cc: Andrew Morton, linux-kernel, Linux Netdev List > > Many of the kernel's accounting accumulators cannot be reset. We > > handle that in userspace tools by using subtraction. > > I don't think that should preclude the ability to reset these, unless it is > shown that it would break something very badly. For one quite a few of the network cards keep the stats in hardware and don't neccessarily have a way to reset them. In other cases there is a mix of OS accumulated stats bulk updated by overflow events on the device itself. Its a lot of complexity, and changes all over the places for the sake of a trivial userspace change. I would suggest you instead write a quick bit of perl or python that fetches the stats and then updates every second with the changes. Alan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 14:45 ` Alan Cox @ 2008-05-29 17:15 ` James Cammarata 2008-05-29 20:50 ` David Miller 2008-05-30 19:12 ` Bill Fink 1 sibling, 1 reply; 38+ messages in thread From: James Cammarata @ 2008-05-29 17:15 UTC (permalink / raw) To: Alan Cox; +Cc: Andrew Morton, linux-kernel, Linux Netdev List > For one quite a few of the network cards keep the stats in hardware and > don't neccessarily have a way to reset them. In other cases there is a > mix of OS accumulated stats bulk updated by overflow events on the device > itself. Yes, and there's no good way to get around that, so the best thing for that hardware would probably to not add this functionality, in which case ethtool would simply report that it's not available. > Its a lot of complexity, and changes all over the places for the sake of > a trivial userspace change. I would suggest you instead write a quick bit > of perl or python that fetches the stats and then updates every second > with the changes. I would disagree with you on the complexity issue. In general, it is one new function (trivial in length) and an addition to the ethtool ops. SNMP and sar cover the reporting needs, but there have been multiple times when troubleshooting under the gun that this would have been extremely useful to help diagnose the issue (in one instance, we were getting floods of jumbo frames and other errors, and we did not know how quickly they were accumulating, because the switch was not seeing the same problem). My day job is in the financial industry, so we are often under very short SLA's to diagnose issues, and having this built into the kernel/existing tools would be a boon. I'd be more than happy to change this to a kernel config option, experimental and default off for those who would not want it. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 17:15 ` James Cammarata @ 2008-05-29 20:50 ` David Miller 2008-05-29 21:18 ` James Cammarata 0 siblings, 1 reply; 38+ messages in thread From: David Miller @ 2008-05-29 20:50 UTC (permalink / raw) To: jimi; +Cc: alan, akpm, linux-kernel, netdev From: James Cammarata <jimi@sngx.net> Date: Thu, 29 May 2008 12:15:16 -0500 > I'd be more than happy to change this to a kernel config > option, experimental and default off for those who would not want it. We never add pieces of code which are marked experimental and intended to be marked that way forever, that's nonsense. What you want is your little special kernel hack in the upstream tree, and that's now how we handle such things, sorry. I also agree with Alan that the kernel is the wrong place to handle this problem. Do it in userspace. We have consistent precedence for this elsewhere in the kernel tree, and there is good reason for that. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 20:50 ` David Miller @ 2008-05-29 21:18 ` James Cammarata 0 siblings, 0 replies; 38+ messages in thread From: James Cammarata @ 2008-05-29 21:18 UTC (permalink / raw) To: David Miller; +Cc: alan, akpm, linux-kernel, netdev >> I'd be more than happy to change this to a kernel config >> option, experimental and default off for those who would not want it. > > We never add pieces of code which are marked experimental and > intended to be marked that way forever, that's nonsense. Completely not my intention, I did not ever intend to have something marked experimental forever, and I'm not sure where you're drawing that inference from. > What you want is your little special kernel hack in the upstream > tree, and that's now how we handle such things, sorry. No, I saw what I thought was a lack of a feature that is commonplace on many other platforms, and thought I'd give it a shot to see if I could add it myself. The NAKs are apparently overwhelming on this issue, so I'll stop beating a dead horse and move on. Thanks for the input. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 14:45 ` Alan Cox 2008-05-29 17:15 ` James Cammarata @ 2008-05-30 19:12 ` Bill Fink 2008-05-30 22:14 ` Rick Jones 2008-05-31 12:11 ` Alan Cox 1 sibling, 2 replies; 38+ messages in thread From: Bill Fink @ 2008-05-30 19:12 UTC (permalink / raw) To: Alan Cox; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Thu, 29 May 2008, Alan Cox wrote: > > > Many of the kernel's accounting accumulators cannot be reset. We > > > handle that in userspace tools by using subtraction. > > > > I don't think that should preclude the ability to reset these, unless it is > > shown that it would break something very badly. > > For one quite a few of the network cards keep the stats in hardware and > don't neccessarily have a way to reset them. In other cases there is a > mix of OS accumulated stats bulk updated by overflow events on the device > itself. > > Its a lot of complexity, and changes all over the places for the sake of > a trivial userspace change. I would suggest you instead write a quick bit > of perl or python that fetches the stats and then updates every second > with the changes. When diagnosing network problems, the ability to zero counters is a major aid in diagnosis. Writing scripts is not a general solution since often several systems are involved and it's not simple to do this via a script. Saving stats output and running beforeafter on a number of systems is a royal pain when troubleshooting. I like the idea someone else had. where the clear stats actually did a checkpoint of the stats, and then future get stats would return the diff between the checkpointed and current values (perhaps there could also be a mechanism to still get the absolute values if desired). -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-30 19:12 ` Bill Fink @ 2008-05-30 22:14 ` Rick Jones 2008-05-31 1:09 ` Bill Fink 2008-05-31 12:11 ` Alan Cox 1 sibling, 1 reply; 38+ messages in thread From: Rick Jones @ 2008-05-30 22:14 UTC (permalink / raw) To: Bill Fink Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List > Saving stats output and running beforeafter on a number of systems is > a royal pain when troubleshooting. Well, with that ringing endorsement :) I would like to state that Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now deal with /proc/interrupts output better than before. ftp://ftp.cup.hp.com/dist/networking/tools/ rick jones ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-30 22:14 ` Rick Jones @ 2008-05-31 1:09 ` Bill Fink 2008-05-31 2:41 ` Stephen Hemminger 0 siblings, 1 reply; 38+ messages in thread From: Bill Fink @ 2008-05-31 1:09 UTC (permalink / raw) To: Rick Jones Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Fri, 30 May 2008, Rick Jones wrote: > > Saving stats output and running beforeafter on a number of systems is > > a royal pain when troubleshooting. > > Well, with that ringing endorsement :) I would like to state that > Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now > deal with /proc/interrupts output better than before. > > ftp://ftp.cup.hp.com/dist/networking/tools/ Nothing against beforefter. It's a very useful tool, especially given Linux's lack of a method to clear interface counters. But it still is very cumbersome to use in the scenario I mentioned, having to save the stats before and after a test on each of several systems, and then having to run beforeafter on each of those systems. Then you need to do this all again for every test run. It's just not very efficient. I don't know why the major objection to adding a small amount of code to the kernel that would make life a lot easier for hundreds (or more) of Linux network adminstrators. And the snapshot/diff idea would not be complex to maintain and would also address the issue of some hardware not having a clear hw stats capability. -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-31 1:09 ` Bill Fink @ 2008-05-31 2:41 ` Stephen Hemminger 2008-05-31 4:47 ` Bill Fink 0 siblings, 1 reply; 38+ messages in thread From: Stephen Hemminger @ 2008-05-31 2:41 UTC (permalink / raw) To: netdev; +Cc: linux-kernel On Fri, 30 May 2008 21:09:31 -0400 Bill Fink <billfink@mindspring.com> wrote: > On Fri, 30 May 2008, Rick Jones wrote: > > > > Saving stats output and running beforeafter on a number of systems is > > > a royal pain when troubleshooting. > > > > Well, with that ringing endorsement :) I would like to state that > > Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now > > deal with /proc/interrupts output better than before. > > > > ftp://ftp.cup.hp.com/dist/networking/tools/ > > Nothing against beforefter. It's a very useful tool, especially > given Linux's lack of a method to clear interface counters. But > it still is very cumbersome to use in the scenario I mentioned, > having to save the stats before and after a test on each of several > systems, and then having to run beforeafter on each of those systems. > Then you need to do this all again for every test run. It's just > not very efficient. > > I don't know why the major objection to adding a small amount of > code to the kernel that would make life a lot easier for hundreds > (or more) of Linux network adminstrators. And the snapshot/diff > idea would not be complex to maintain and would also address the > issue of some hardware not having a clear hw stats capability. > > -Bill ifstat from iproute2 tools also keeps state to compute number of packets since last update as well. Come on, it isn't that hard. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-31 2:41 ` Stephen Hemminger @ 2008-05-31 4:47 ` Bill Fink 0 siblings, 0 replies; 38+ messages in thread From: Bill Fink @ 2008-05-31 4:47 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev, linux-kernel On Fri, 30 May 2008, Stephen Hemminger wrote: > On Fri, 30 May 2008 21:09:31 -0400 > Bill Fink <billfink@mindspring.com> wrote: > > > On Fri, 30 May 2008, Rick Jones wrote: > > > > > > Saving stats output and running beforeafter on a number of systems is > > > > a royal pain when troubleshooting. > > > > > > Well, with that ringing endorsement :) I would like to state that > > > Yoshihiro Ishijima has fixed some bugs in beforeafter and it will now > > > deal with /proc/interrupts output better than before. > > > > > > ftp://ftp.cup.hp.com/dist/networking/tools/ > > > > Nothing against beforefter. It's a very useful tool, especially > > given Linux's lack of a method to clear interface counters. But > > it still is very cumbersome to use in the scenario I mentioned, > > having to save the stats before and after a test on each of several > > systems, and then having to run beforeafter on each of those systems. > > Then you need to do this all again for every test run. It's just > > not very efficient. > > > > I don't know why the major objection to adding a small amount of > > code to the kernel that would make life a lot easier for hundreds > > (or more) of Linux network adminstrators. And the snapshot/diff > > idea would not be complex to maintain and would also address the > > issue of some hardware not having a clear hw stats capability. > > > > -Bill > > ifstat from iproute2 tools also keeps state to compute number > of packets since last update as well. Come on, it isn't that > hard. Thanks for the pointer. I wasn't aware of ifstat (it's not installed by default on any of my systems). I will check out ifstat and see what it can do. -Thanks -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-30 19:12 ` Bill Fink 2008-05-30 22:14 ` Rick Jones @ 2008-05-31 12:11 ` Alan Cox 2008-05-31 23:57 ` Bill Fink 1 sibling, 1 reply; 38+ messages in thread From: Alan Cox @ 2008-05-31 12:11 UTC (permalink / raw) To: Bill Fink; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List > When diagnosing network problems, the ability to zero counters is > a major aid in diagnosis. Writing scripts is not a general solution > since often several systems are involved and it's not simple to do > this via a script. Saving stats output and running beforeafter on > a number of systems is a royal pain when troubleshooting. Zeroing them is not a general solution either - you break all the other tools monitoring the same stats in parallel - like network usage monitors. This is a user space tools problem, scripts or otherwise. Alan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-31 12:11 ` Alan Cox @ 2008-05-31 23:57 ` Bill Fink 2008-06-01 1:46 ` Ben Hutchings 2008-06-02 4:50 ` Glen Turner 0 siblings, 2 replies; 38+ messages in thread From: Bill Fink @ 2008-05-31 23:57 UTC (permalink / raw) To: Alan Cox; +Cc: James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Sat, 31 May 2008, Alan Cox wrote: > > When diagnosing network problems, the ability to zero counters is > > a major aid in diagnosis. Writing scripts is not a general solution > > since often several systems are involved and it's not simple to do > > this via a script. Saving stats output and running beforeafter on > > a number of systems is a royal pain when troubleshooting. > > Zeroing them is not a general solution either - you break all the other > tools monitoring the same stats in parallel - like network usage monitors. > > This is a user space tools problem, scripts or otherwise. Two points. One, if I as a Linux network administrator, make a deliberate decision to zero counters, I will be making that decision taking into account any other effects it might have. Second, I already expressed support for the preferred option of the zeroing of counters actually just doing a snapshot of the stats, and subsequent requests for the stats doing a diff from the snapshot. There could also be a mechanism for getting the absolute values of the stats that for example SNMP monitoring would utilize. Perhaps a user space tool such as ifstat can be part of the solution (I haven't had a chance to check it out yet). But it would be difficult I believe to make a general user space tool for dealing with the device specific stats reported by "ethtool -S" since they are quite specific to an individual device driver. Yes, every individual Linux network administrator can re-create the wheel by devising their own scripts, but it makes much more sense to me to implement a simple general kernel mechanism once that could be used generically, than to have hundreds (or thousands) of Linux network administrators each having to do it themselves (perhaps multiple times if they have a variety of types of systems and types of NICs). -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-31 23:57 ` Bill Fink @ 2008-06-01 1:46 ` Ben Hutchings 2008-06-01 20:46 ` Bill Fink 2008-06-02 4:50 ` Glen Turner 1 sibling, 1 reply; 38+ messages in thread From: Ben Hutchings @ 2008-06-01 1:46 UTC (permalink / raw) To: Bill Fink Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List Bill Fink wrote: <snip> > Yes, every individual Linux network administrator can re-create the > wheel by devising their own scripts, but it makes much more sense > to me to implement a simple general kernel mechanism once that could > be used generically, than to have hundreds (or thousands) of Linux > network administrators each having to do it themselves (perhaps > multiple times if they have a variety of types of systems and types > of NICs). The ethtool interface is pretty generic, even if the names aren't. Here's some Python code I just knocked up which demonstrates how to get a set of named stats. It shouldn't be terribly hard to extend this to saving and subtracting stat sets. Ben. import array, fcntl, os, socket, struct SIOCETHTOOL = 0x8946 ETH_GSTRING_LEN = 32 ETH_SS_STATS = 1 ETHTOOL_GDRVINFO = 0x00000003 ETHTOOL_GSTRINGS = 0x0000001b ETHTOOL_GSTATS = 0x0000001d def ethtool(sock, name, buf): ifreq = struct.pack('16sP', name, buf.buffer_info()[0]) fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq) def ethtool_gdrvinfo(sock, name): cmd = struct.pack('=I', ETHTOOL_GDRVINFO) buf = array.array('c', cmd) format = '=32s32s32s32s32x12xIIIII' buf.extend('\0' * struct.calcsize(format)) ethtool(sock, name, buf) return dict(zip(['driver', 'version', 'fw_version', 'bus_info', 'n_priv_flags', 'n_stats', 'testinfo_len', 'eedump_len', 'regdump_len'], struct.unpack(format, buf[len(cmd):]))) def ethtool_gstrings(sock, name, string_set, count): cmd = struct.pack('=II', ETHTOOL_GSTRINGS, string_set) buf = array.array('c', cmd) buf.extend('\0' * (4 + count * ETH_GSTRING_LEN)) ethtool(sock, name, buf) def nth_string(buf, n): data = buf[12 + n * ETH_GSTRING_LEN : 12 + (n + 1) * ETH_GSTRING_LEN] return data.tostring().replace('\0', '') return [nth_string(buf, n) for n in range(count)] def ethtool_gstats(sock, name, count): cmd = struct.pack('=I', ETHTOOL_GSTATS) buf = array.array('c', cmd) buf.extend('\0' * (4 + count * 8)) ethtool(sock, name, buf) def nth_stat(buf, n): data = buf[8 + n * 8 : 8 + (n + 1) * 8] return struct.unpack('=Q', data)[0] return [nth_stat(buf, n) for n in range(count)] if __name__ == '__main__': import sys name = sys.argv[1] sock = socket.socket() drvinfo = ethtool_gdrvinfo(sock, name) count = drvinfo['n_stats'] print dict(zip(ethtool_gstrings(sock, name, ETH_SS_STATS, count), ethtool_gstats(sock, name, count))) -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-01 1:46 ` Ben Hutchings @ 2008-06-01 20:46 ` Bill Fink 2008-06-01 22:29 ` Ben Hutchings 2008-06-02 5:39 ` David Miller 0 siblings, 2 replies; 38+ messages in thread From: Bill Fink @ 2008-06-01 20:46 UTC (permalink / raw) To: Ben Hutchings Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Sun, 1 Jun 2008, Ben Hutchings wrote: > Bill Fink wrote: > <snip> > > Yes, every individual Linux network administrator can re-create the > > wheel by devising their own scripts, but it makes much more sense > > to me to implement a simple general kernel mechanism once that could > > be used generically, than to have hundreds (or thousands) of Linux > > network administrators each having to do it themselves (perhaps > > multiple times if they have a variety of types of systems and types > > of NICs). > > The ethtool interface is pretty generic, even if the names aren't. > Here's some Python code I just knocked up which demonstrates how > to get a set of named stats. It shouldn't be terribly hard to > extend this to saving and subtracting stat sets. I'm not sure what that proves. Your python code just basically gives the same info as running the "ethtool -S" command. But the question is how does one devise a generic script or tool that doesn't require any special knowledge of the specific NIC being used. For example, here's the "ethtool -S" info for my myri10ge NIC: [root@chance8 ~]# ethtool -S eth2 NIC statistics: rx_packets: 53243864310 tx_packets: 112826823797 rx_bytes: 301727733072710 tx_bytes: 716648208451198 rx_errors: 0 tx_errors: 0 rx_dropped: 0 tx_dropped: 0 multicast: 0 collisions: 0 rx_length_errors: 0 rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_fifo_errors: 0 rx_missed_errors: 0 tx_aborted_errors: 0 tx_carrier_errors: 0 tx_fifo_errors: 0 tx_heartbeat_errors: 0 tx_window_errors: 0 tx_boundary: 4096 WC: 1 irq: 8413 MSI: 1 read_dma_bw_MBs: 1398 write_dma_bw_MBs: 1613 read_write_dma_bw_MBs: 2711 serial_number: 287046 tx_pkt_start: 1157674101 tx_pkt_done: 1157674101 tx_req: 188226127 tx_done: 188226127 rx_small_cnt: 3009560676 rx_big_cnt: 1726230729 wake_queue: 57969440 stop_queue: 57969440 watchdog_resets: 0 tx_linearized: 0 link_changes: 8 link_up: 1 dropped_link_overflow: 0 dropped_link_error_or_filtered: 26584 dropped_multicast_filtered: 2190912 dropped_runt: 0 dropped_overrun: 0 dropped_no_small_buffer: 0 dropped_no_big_buffer: 0 How does one know which of these reported values are counter stats that one wishes to zero/snapshot, and which are not? Another issue that occurred to me is if multiple people are working on troubleshooting a network problem, how do we insure that they all get a consistent view of the stats? If this is done via a kernel mechanism then there isn't an issue. But if it's done via user space, then you have to make sure that everyone zeros/snapshots the stats at the same time. Ideally, one should be able to do something like "ethtool -z ethX" to zero/snapshot the driver stats, and then "ethtool -S ethX" to get the stats since the last snapshot. You should be able to use the same tool ("ethtool") to do all of this, and not some other special tool or specially devised homegrown script. Why make users lives any more difficult than need be? -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-01 20:46 ` Bill Fink @ 2008-06-01 22:29 ` Ben Hutchings 2008-06-02 3:55 ` Bill Fink 2008-06-02 5:39 ` David Miller 1 sibling, 1 reply; 38+ messages in thread From: Ben Hutchings @ 2008-06-01 22:29 UTC (permalink / raw) To: Bill Fink Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List Bill Fink wrote: > On Sun, 1 Jun 2008, Ben Hutchings wrote: > > > Bill Fink wrote: > > <snip> > > > Yes, every individual Linux network administrator can re-create the > > > wheel by devising their own scripts, but it makes much more sense > > > to me to implement a simple general kernel mechanism once that could > > > be used generically, than to have hundreds (or thousands) of Linux > > > network administrators each having to do it themselves (perhaps > > > multiple times if they have a variety of types of systems and types > > > of NICs). > > > > The ethtool interface is pretty generic, even if the names aren't. > > Here's some Python code I just knocked up which demonstrates how > > to get a set of named stats. It shouldn't be terribly hard to > > extend this to saving and subtracting stat sets. > > I'm not sure what that proves. Your python code just basically gives > the same info as running the "ethtool -S" command. Yes, but in a form you can more easily manipulate. > But the question > is how does one devise a generic script or tool that doesn't require > any special knowledge of the specific NIC being used. For example, > here's the "ethtool -S" info for my myri10ge NIC: > > [root@chance8 ~]# ethtool -S eth2 > NIC statistics: [...] > WC: 1 > irq: 8413 > MSI: 1 > read_dma_bw_MBs: 1398 > write_dma_bw_MBs: 1613 > read_write_dma_bw_MBs: 2711 > serial_number: 287046 [...] > tx_linearized: 0 > link_changes: 8 > link_up: 1 [...] > > How does one know which of these reported values are counter stats > that one wishes to zero/snapshot, and which are not? Ah, I see, I didn't realise some drivers were abusing ethtool stats in this way. But for the most part the differences will show up as zeroes. If there's really a need for ethtool stats that aren't counters, maybe the ethtool API should include flags to indicate which they are. > Another issue that occurred to me is if multiple people are working > on troubleshooting a network problem, how do we insure that they all > get a consistent view of the stats? If this is done via a kernel > mechanism then there isn't an issue. But if it's done via user space, > then you have to make sure that everyone zeros/snapshots the stats > at the same time. > > Ideally, one should be able to do something like "ethtool -z ethX" > to zero/snapshot the driver stats, and then "ethtool -S ethX" to get > the stats since the last snapshot. You should be able to use the > same tool ("ethtool") to do all of this, and not some other special > tool or specially devised homegrown script. Why make users lives > any more difficult than need be? No-one's stopping you from adding these options to ethtool. You could have it save statistic sets in, say, /var/run/ethtool. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-01 22:29 ` Ben Hutchings @ 2008-06-02 3:55 ` Bill Fink 0 siblings, 0 replies; 38+ messages in thread From: Bill Fink @ 2008-06-02 3:55 UTC (permalink / raw) To: Ben Hutchings Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Sun, 1 Jun 2008, Ben Hutchings wrote: > Bill Fink wrote: > > > But the question > > is how does one devise a generic script or tool that doesn't require > > any special knowledge of the specific NIC being used. For example, > > here's the "ethtool -S" info for my myri10ge NIC: > > > > [root@chance8 ~]# ethtool -S eth2 > > NIC statistics: > [...] > > WC: 1 > > irq: 8413 > > MSI: 1 > > read_dma_bw_MBs: 1398 > > write_dma_bw_MBs: 1613 > > read_write_dma_bw_MBs: 2711 > > serial_number: 287046 > [...] > > tx_linearized: 0 > > link_changes: 8 > > link_up: 1 > [...] > > > > How does one know which of these reported values are counter stats > > that one wishes to zero/snapshot, and which are not? > > Ah, I see, I didn't realise some drivers were abusing ethtool stats in > this way. But for the most part the differences will show up as zeroes. I'm not sure I would characterize that as abuse of the ethtool stats, but rather just a different category of stats. And I wouldn't want them to show up as zeros after doing a clear/snapshot of the counter stats. The "ethtool -S" output should be completely normal afterward, with just the counter stats being zeroed. > If there's really a need for ethtool stats that aren't counters, > maybe the ethtool API should include flags to indicate which they are. That would be useful. Maybe some way of determining the type of an ethtool stat such as COUNTER (perhaps with subtypes for signed versus unsigned, 32-bit versus 64-bit), RUNTIME_INFO, etc. > > Another issue that occurred to me is if multiple people are working > > on troubleshooting a network problem, how do we insure that they all > > get a consistent view of the stats? If this is done via a kernel > > mechanism then there isn't an issue. But if it's done via user space, > > then you have to make sure that everyone zeros/snapshots the stats > > at the same time. > > > > Ideally, one should be able to do something like "ethtool -z ethX" > > to zero/snapshot the driver stats, and then "ethtool -S ethX" to get > > the stats since the last snapshot. You should be able to use the > > same tool ("ethtool") to do all of this, and not some other special > > tool or specially devised homegrown script. Why make users lives > > any more difficult than need be? > > No-one's stopping you from adding these options to ethtool. You could > have it save statistic sets in, say, /var/run/ethtool. Yes, that could be done if the ability to determine which ethtool stats were counter stats was added to the ethtool API. And I guess they should be saved in something like /var/run/ethtool/ethX. But then what happens when you start using multiple network namespaces, and for example have eth0 in several different network namespaces. How could that be handled, to keep the different network namespaces from clobbering the stats of another namespace? If done in the kernel, I believe it would all work as expected, but it's not clear to me how to handle this in user space. -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-01 20:46 ` Bill Fink 2008-06-01 22:29 ` Ben Hutchings @ 2008-06-02 5:39 ` David Miller 2008-06-02 15:41 ` Bill Fink 1 sibling, 1 reply; 38+ messages in thread From: David Miller @ 2008-06-02 5:39 UTC (permalink / raw) To: billfink; +Cc: bhutchings, alan, jimi, akpm, linux-kernel, netdev From: Bill Fink <billfink@mindspring.com> Date: Sun, 1 Jun 2008 16:46:14 -0400 > Another issue that occurred to me is if multiple people are working > on troubleshooting a network problem, how do we insure that they all > get a consistent view of the stats? Some of them definitely won't get a consistent view if one of them triggers this undesirable statistic clearing knob. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-02 5:39 ` David Miller @ 2008-06-02 15:41 ` Bill Fink 0 siblings, 0 replies; 38+ messages in thread From: Bill Fink @ 2008-06-02 15:41 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, alan, jimi, akpm, linux-kernel, netdev On Sun, 01 Jun 2008, David Miller wrote: > From: Bill Fink <billfink@mindspring.com> > Date: Sun, 1 Jun 2008 16:46:14 -0400 > > > Another issue that occurred to me is if multiple people are working > > on troubleshooting a network problem, how do we insure that they all > > get a consistent view of the stats? > > Some of them definitely won't get a consistent view if one of them > triggers this undesirable statistic clearing knob. The typical case I was referring to is for example when I am assisting someone else with troubleshooting. This is coordinated with the system admin, who would usually be the one to clear/snapshot the stats. This cannot be done by a normal user as it requires root privileges. BTW I'm also not clear why running "ethtool -S" requires root privileges, as it would be more convenient if a normal user could do this (currently I have to be given sudo access). This is a useful mechanism for many, many people. Bottom line I don't really care if it's done in the kernel or in user space, but it should be possible using the standard "ethtool -S" command, and I have raised at least a couple of issues with modifying the ethtool command to support this (but which wouldn't be an issue if implemented in the kernel). -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-31 23:57 ` Bill Fink 2008-06-01 1:46 ` Ben Hutchings @ 2008-06-02 4:50 ` Glen Turner 2008-06-02 16:10 ` Bill Fink 1 sibling, 1 reply; 38+ messages in thread From: Glen Turner @ 2008-06-02 4:50 UTC (permalink / raw) To: Bill Fink Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List > Yes, every individual Linux network administrator can re-create the > wheel by devising their own scripts, but it makes much more sense > to me to implement a simple general kernel mechanism once that could > be used generically, than to have hundreds (or thousands) of Linux > network administrators each having to do it themselves (perhaps > multiple times if they have a variety of types of systems and types > of NICs). Hi Bill, If you pull the stats using a SNMP polling tool (torrus, cacti, mrtg) then those package's graphs give nice "did this get better or worse" output for debugging network issues. I'd suggest you use one of those tools rather than writing your own scripts. Even if 99% of the time the graphs record zero errors, knowing when those errors started is very valuable and well worth the additional effort of configuring the tools over a command-line or a kernel hack. The more sophisticated tools can do alerting to Nagios should a variable suddenly change its behaviour. The Cisco/Juniper/everyone-else feature to run console stats separately from SNMP stats is nice, but it's rather tuned to the needs of router-heads and tends to fall apart when multiple staff are debugging a fault. If we do proceed with better command line stats then the number of errored seconds and the worst errored second and its value would be useful. These useful numbers can't be calculated by the SNMP polling tools and it's hard to see how they could be done in user-space. Cheers, Glen -- Glen Turner ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-02 4:50 ` Glen Turner @ 2008-06-02 16:10 ` Bill Fink 2008-06-03 12:28 ` James Cammarata 0 siblings, 1 reply; 38+ messages in thread From: Bill Fink @ 2008-06-02 16:10 UTC (permalink / raw) To: Glen Turner Cc: Alan Cox, James Cammarata, Andrew Morton, linux-kernel, Linux Netdev List On Mon, 02 Jun 2008, Glen Turner wrote: > > > Yes, every individual Linux network administrator can re-create the > > wheel by devising their own scripts, but it makes much more sense > > to me to implement a simple general kernel mechanism once that could > > be used generically, than to have hundreds (or thousands) of Linux > > network administrators each having to do it themselves (perhaps > > multiple times if they have a variety of types of systems and types > > of NICs). > > Hi Bill, > > If you pull the stats using a SNMP polling tool (torrus, cacti, mrtg) > then those package's graphs give nice "did this get better or worse" > output for debugging network issues. I do use mrtg for network monitoring to determine when things go bad, but when they do go bad, then I typically need to get much more detailed info when troubleshooting the problem. > I'd suggest you use one of those tools rather than writing your > own scripts. Even if 99% of the time the graphs record zero errors, > knowing when those errors started is very valuable and well worth > the additional effort of configuring the tools over a command-line > or a kernel hack. First of all, when assisting a user, they typically aren't even running an snmp daemon (and there might be firewall issues to access it if they are). And I don't think the "ethtool -S" driver stats are even accessible via SNMP (although they may contribute to more generic interface stats which are), and it is the specific driver stats which are often key to help diagnosing the problem. > The more sophisticated tools can do alerting to Nagios should > a variable suddenly change its behaviour. Definitely useful for certain arenas. > The Cisco/Juniper/everyone-else feature to run console stats > separately from SNMP stats is nice, but it's rather tuned to > the needs of router-heads and tends to fall apart when multiple > staff are debugging a fault. I use it all the time in coordination with network peers and joint troubleshooting. They clear the interface stats, and they and I can then view the interface stats as a test is run (they give me RO access to view the stats), or vice versa depending on whose network is being examined. > If we do proceed with better command line stats then the number > of errored seconds and the worst errored second and its value > would be useful. These useful numbers can't be calculated by > the SNMP polling tools and it's hard to see how they could be > done in user-space. I'm all for any improved debugging/diagnostic capabilities, including the extremely useful ability to clear/snapshot driver stats (there could also be an option to un-snapshot if you wanted to get back to seeing the absolute counter values). -Bill ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-02 16:10 ` Bill Fink @ 2008-06-03 12:28 ` James Cammarata 2008-06-03 12:35 ` Ben Hutchings 2008-06-03 14:46 ` Alan Cox 0 siblings, 2 replies; 38+ messages in thread From: James Cammarata @ 2008-06-03 12:28 UTC (permalink / raw) To: Bill Fink; +Cc: linux-kernel, Linux Netdev List > First of all, when assisting a user, they typically aren't even > running an snmp daemon (and there might be firewall issues to > access it if they are). And I don't think the "ethtool -S" driver > stats are even accessible via SNMP (although they may contribute > to more generic interface stats which are), and it is the specific > driver stats which are often key to help diagnosing the problem. Ok, just to jump back in and add my $0.02, I plan to send a patch to the ethtool project to do what was suggested (snapshot to /var, and diff stats against that for future reference). I would just like to point out that this will introduce inconsistencies between _every_ other source of interface stats (ifconfig/procfs/etc). I can deal with that, it just means one more thing to keep track of when you're troubleshooting. Also, I'd like to point out that the "stepping on other peoples toes" argument is a bogus one, since most major network vendors include this ability, and it's not exactly a concern now. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-03 12:28 ` James Cammarata @ 2008-06-03 12:35 ` Ben Hutchings 2008-06-03 14:46 ` Alan Cox 1 sibling, 0 replies; 38+ messages in thread From: Ben Hutchings @ 2008-06-03 12:35 UTC (permalink / raw) To: James Cammarata; +Cc: Bill Fink, linux-kernel, Linux Netdev List James Cammarata wrote: > >First of all, when assisting a user, they typically aren't even > >running an snmp daemon (and there might be firewall issues to > >access it if they are). And I don't think the "ethtool -S" driver > >stats are even accessible via SNMP (although they may contribute > >to more generic interface stats which are), and it is the specific > >driver stats which are often key to help diagnosing the problem. > > Ok, just to jump back in and add my $0.02, I plan to send a patch to > the ethtool project to do what was suggested (snapshot to /var, and > diff stats against that for future reference). I would just like to > point out that this will introduce inconsistencies between _every_ > other source of interface stats (ifconfig/procfs/etc). I can deal > with that, it just means one more thing to keep track of when you're > troubleshooting. You could perhaps add a third column to the statistics when a snapshot exists, with the second column still showing absolute values and the third showing the difference from the snapshot. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-03 12:28 ` James Cammarata 2008-06-03 12:35 ` Ben Hutchings @ 2008-06-03 14:46 ` Alan Cox 2008-06-04 3:05 ` James Cammarata 1 sibling, 1 reply; 38+ messages in thread From: Alan Cox @ 2008-06-03 14:46 UTC (permalink / raw) To: James Cammarata; +Cc: Bill Fink, linux-kernel, Linux Netdev List > Also, I'd like to point out that the "stepping on other peoples toes" > argument is a bogus one, since most major network vendors include this > ability, and it's not exactly a concern now. I used to work in a large ISP - it was a huge concern then and was enforced and managed by the less effective 'do you like your kneecaps' approach to permissions. Its basically impossible to write a correct non-racy application which zeros kernel statistics and then measures the change, because you cannot know another application did the same while you were running. This is the most basic and blindingly obvious stuff. You should not be able to zero the kernel stats just because you can't work perl. Alan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-06-03 14:46 ` Alan Cox @ 2008-06-04 3:05 ` James Cammarata 0 siblings, 0 replies; 38+ messages in thread From: James Cammarata @ 2008-06-04 3:05 UTC (permalink / raw) To: Alan Cox; +Cc: Bill Fink, linux-kernel, Linux Netdev List > I used to work in a large ISP - it was a huge concern then and was > enforced and managed by the less effective 'do you like your kneecaps' > approach to permissions. I work at a large ISP now, and you're absolutely right. You don't just go around resetting interface counters on backbone routers for the hell of it, and we never do it without customer permission while troubleshooting an issue with a connection, that is why I said I thought it was a non-argument. There seems to be an irrational fear of counter-based anarchy here. > Its basically impossible to write a correct non-racy application which > zeros kernel statistics and then measures the change, because you cannot > know another application did the same while you were running. > > This is the most basic and blindingly obvious stuff. You should not be > able to zero the kernel stats just because you can't work perl. I've already said I'd drop the issue 4+ days ago, and that I'd be more than happy to do it in userland as you suggested, my point was simply that adding it to only one userland tool will lead to inconsistencies. It is not an issue of being able to "work perl" or not. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 2008-05-29 12:34 ` James Cammarata 2008-05-29 14:45 ` Alan Cox @ 2008-05-29 14:48 ` Chris Friesen 1 sibling, 0 replies; 38+ messages in thread From: Chris Friesen @ 2008-05-29 14:48 UTC (permalink / raw) To: James Cammarata; +Cc: Andrew Morton, linux-kernel, Linux Netdev List James Cammarata wrote: >> You didn't provide a reason from adding this feature to the kernel. > Sorry, my reasoning was in my original patch only. The ability to reset > network counters is, in my experience, one of the first things you do > when trying to troubleshoot networking issues - especially when you have > incrementing errors. The ability to reset counters does make it easier to detect changes manually, as any non-zero value is something of interest. However, if you're gathering data via a script there is no functional gain. >> Many of the kernel's accounting accumulators cannot be reset. We >> handle that in userspace tools by using subtraction. > I don't think that should preclude the ability to reset these, unless it is > shown that it would break something very badly. You're trying to add new functionality to the kernel, increasing its size and adding more work for maintainers. Also, you're implicitly asking that all of the network drivers be updated to add support for this feature. The burden of proof is on you to show why the feature is useful enough that the others involved should take on this additional work. Chris ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2008-06-04 3:03 UTC | newest]
Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <482DA5B6.1020606@sngx.net>
2008-05-16 20:00 ` [PATCH updated] net: add ability to clear per-interface network statistics via procfs David Miller
2008-05-16 20:09 ` Rick Jones
2008-05-17 15:06 ` James Cammarata
[not found] ` <482DB46A.8020103@cosmosbay.com>
2008-05-16 20:03 ` David Miller
[not found] ` <482EF192.4070707@sngx.net>
2008-05-17 21:41 ` Eric Dumazet
2008-05-17 22:49 ` James Cammarata
2008-05-18 0:31 ` Ben Hutchings
2008-05-18 1:43 ` Patrick McHardy
2008-05-18 5:09 ` James Cammarata
2008-05-18 11:27 ` Ben Hutchings
2008-05-29 1:45 ` [PATCH] net: add ability to clear stats via ethtool - e1000/pcnet32 James Cammarata
2008-05-29 2:08 ` James Cammarata
2008-05-29 5:11 ` Andrew Morton
2008-05-29 12:34 ` James Cammarata
2008-05-29 14:45 ` Alan Cox
2008-05-29 17:15 ` James Cammarata
2008-05-29 20:50 ` David Miller
2008-05-29 21:18 ` James Cammarata
2008-05-30 19:12 ` Bill Fink
2008-05-30 22:14 ` Rick Jones
2008-05-31 1:09 ` Bill Fink
2008-05-31 2:41 ` Stephen Hemminger
2008-05-31 4:47 ` Bill Fink
2008-05-31 12:11 ` Alan Cox
2008-05-31 23:57 ` Bill Fink
2008-06-01 1:46 ` Ben Hutchings
2008-06-01 20:46 ` Bill Fink
2008-06-01 22:29 ` Ben Hutchings
2008-06-02 3:55 ` Bill Fink
2008-06-02 5:39 ` David Miller
2008-06-02 15:41 ` Bill Fink
2008-06-02 4:50 ` Glen Turner
2008-06-02 16:10 ` Bill Fink
2008-06-03 12:28 ` James Cammarata
2008-06-03 12:35 ` Ben Hutchings
2008-06-03 14:46 ` Alan Cox
2008-06-04 3:05 ` James Cammarata
2008-05-29 14:48 ` Chris Friesen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).