Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next-2.6] tg3: 64bits stats
From: Eric Dumazet @ 2010-07-05 18:35 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Matt Carlson, Michael Chan, netdev, David Miller
In-Reply-To: <1278354656.2087.108.camel@achroite.uk.solarflarecom.com>

Le lundi 05 juillet 2010 à 19:30 +0100, Ben Hutchings a écrit :

> I think you should use a similar approach here as you did in the
> loopback driver, i.e. update private variables in the RX and TX path and
> then copy/aggregate them in the implementation ndo_get_stats64 (only
> without the need for percpu stats).
> 
> If you want to include a seqlock in the driver stats interface, you can
> do that but it's not going to be pretty and we're still going to need
> additional seqlocks for per-queue (or percpy) stats in some drivers.

Yes, I provided one patch but am working on a different one, requiring a
new rtnl_link_stats64 param to ndo_get_stats64() methods and
dev_get_stats() as well.

dev->stats64 should not be overwritten without some synchronization, so
just disallow it for the moment...



^ permalink raw reply

* Re: [PATCHv3 2/2] sfc: Implement 64-bit net device statistics on all architectures
From: Eric Dumazet @ 2010-07-05 18:31 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, Stephen Hemminger, Arnd Bergmann, netdev,
	linux-net-drivers
In-Reply-To: <1278353780.2877.620.camel@edumazet-laptop>

Le lundi 05 juillet 2010 à 20:16 +0200, Eric Dumazet a écrit :
> Le mardi 08 juin 2010 à 18:21 +0100, Ben Hutchings a écrit :
> > Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> > ---
> > This is unchanged from v1.
> > 
> > Ben.
> > 
> >  drivers/net/sfc/efx.c |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
> > index 26b0cc2..8ad476a 100644
> > --- a/drivers/net/sfc/efx.c
> > +++ b/drivers/net/sfc/efx.c
> > @@ -1492,11 +1492,11 @@ static int efx_net_stop(struct net_device *net_dev)
> >  }
> >  
> >  /* Context: process, dev_base_lock or RTNL held, non-blocking. */
> > -static struct net_device_stats *efx_net_stats(struct net_device *net_dev)
> > +static struct rtnl_link_stats64 *efx_net_stats(struct net_device *net_dev)
> >  {
> >  	struct efx_nic *efx = netdev_priv(net_dev);
> >  	struct efx_mac_stats *mac_stats = &efx->mac_stats;
> > -	struct net_device_stats *stats = &net_dev->stats;
> > +	struct rtnl_link_stats64 *stats = &net_dev->stats64;
> >  
> >  	spin_lock_bh(&efx->stats_lock);
> >  	efx->type->update_stats(efx);
> > @@ -1630,7 +1630,7 @@ static void efx_set_multicast_list(struct net_device *net_dev)
> >  static const struct net_device_ops efx_netdev_ops = {
> >  	.ndo_open		= efx_net_open,
> >  	.ndo_stop		= efx_net_stop,
> > -	.ndo_get_stats		= efx_net_stats,
> > +	.ndo_get_stats64	= efx_net_stats,
> >  	.ndo_tx_timeout		= efx_watchdog,
> >  	.ndo_start_xmit		= efx_hard_start_xmit,
> >  	.ndo_validate_addr	= eth_validate_addr,
> > -- 
> > 1.6.2.5
> > 
> 
> Ben, David
> 
> I believe following patch is needed after our recent commits.
> Not sure a seqlock is really needed, maybe a spinlock would be enough.
> 

One other way would be to add a rtnl_link_stats64 param to
ndo_get_stats64() method, and ask drivers to copy their stats in this
zone, instead of returning &dev->stats64 or something...

And also change dev_get_stats() with this new parameter.

Each caller would use a private copy, with no risk of concurrent
updates.




^ permalink raw reply

* Re: [PATCH net-next-2.6] tg3: 64bits stats
From: Ben Hutchings @ 2010-07-05 18:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Matt Carlson, Michael Chan, netdev, David Miller
In-Reply-To: <1278353230.2877.601.camel@edumazet-laptop>

On Mon, 2010-07-05 at 20:07 +0200, Eric Dumazet wrote:
> Le lundi 05 juillet 2010 à 18:31 +0100, Ben Hutchings a écrit :
> > On Mon, 2010-07-05 at 18:03 +0200, Eric Dumazet wrote:
> > > Le lundi 05 juillet 2010 à 11:14 +0200, Eric Dumazet a écrit :
> > > > After commit be1f3c2c027c (net: Enable 64-bit net device statistics on
> > > > 32-bit architectures), we can now provide 64bit stats, even on 32bit
> > > > arches.
> > > > 
> > > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > > > ---
> > > 
> > > Please disregard this patch.
> > > 
> > > There is small possibility a reader might read a 64bit value while
> > > another writer makes a change to it, changing high 32bit value.
> > > 
> > > A change in core network would be needed to make this 100% safe,
> > > possibly using a seqlock to protect dev->stats64
> > 
> > I really didn't want to add this overhead and complication to readers
> > when only some drivers need it.
> > 
> 
> Overhead would be minimal, only in get_stats() method and only for
> drivers that want to provide 64 bit stats...
> 
> Clearly, rx/tx path must not have any overhead.
> 
> Right now, even RTNL 64bit stats are not safe.
> 
> Should we revert prior patches or try to make progress ?

I think you should use a similar approach here as you did in the
loopback driver, i.e. update private variables in the RX and TX path and
then copy/aggregate them in the implementation ndo_get_stats64 (only
without the need for percpu stats).

If you want to include a seqlock in the driver stats interface, you can
do that but it's not going to be pretty and we're still going to need
additional seqlocks for per-queue (or percpy) stats in some drivers.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCHv3 2/2] sfc: Implement 64-bit net device statistics on all architectures
From: Eric Dumazet @ 2010-07-05 18:16 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, Stephen Hemminger, Arnd Bergmann, netdev,
	linux-net-drivers
In-Reply-To: <1276017672.2185.12.camel@achroite.uk.solarflarecom.com>

Le mardi 08 juin 2010 à 18:21 +0100, Ben Hutchings a écrit :
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
> This is unchanged from v1.
> 
> Ben.
> 
>  drivers/net/sfc/efx.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
> index 26b0cc2..8ad476a 100644
> --- a/drivers/net/sfc/efx.c
> +++ b/drivers/net/sfc/efx.c
> @@ -1492,11 +1492,11 @@ static int efx_net_stop(struct net_device *net_dev)
>  }
>  
>  /* Context: process, dev_base_lock or RTNL held, non-blocking. */
> -static struct net_device_stats *efx_net_stats(struct net_device *net_dev)
> +static struct rtnl_link_stats64 *efx_net_stats(struct net_device *net_dev)
>  {
>  	struct efx_nic *efx = netdev_priv(net_dev);
>  	struct efx_mac_stats *mac_stats = &efx->mac_stats;
> -	struct net_device_stats *stats = &net_dev->stats;
> +	struct rtnl_link_stats64 *stats = &net_dev->stats64;
>  
>  	spin_lock_bh(&efx->stats_lock);
>  	efx->type->update_stats(efx);
> @@ -1630,7 +1630,7 @@ static void efx_set_multicast_list(struct net_device *net_dev)
>  static const struct net_device_ops efx_netdev_ops = {
>  	.ndo_open		= efx_net_open,
>  	.ndo_stop		= efx_net_stop,
> -	.ndo_get_stats		= efx_net_stats,
> +	.ndo_get_stats64	= efx_net_stats,
>  	.ndo_tx_timeout		= efx_watchdog,
>  	.ndo_start_xmit		= efx_hard_start_xmit,
>  	.ndo_validate_addr	= eth_validate_addr,
> -- 
> 1.6.2.5
> 

Ben, David

I believe following patch is needed after our recent commits.
Not sure a seqlock is really needed, maybe a spinlock would be enough.

Thanks

[PATCH net-next-2.6] net: fix 64 bit counters on 32 bit arches

There is a small possibility that a reader gets incorrect values on 32
bit arches. SNMP applications could catch incorrect counters when a
32bit high part is changed.

I believe we need to add some synchronisation to avoid this.

A driver that provides 64bit stats on a 32bit arches should make sure it
gets the seqlock before updating them. Other drivers can stay unchanged,
since upper 32bits are not updated.

This new seqlock should be acquired in slow path only in process context
(no BH protection for instance).

Drivers should not use it in their rx/tx path. (They better stay with
32bit counters on 32bit arches)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/loopback.c    |    4 ++
 drivers/net/macvlan.c     |    2 +
 drivers/net/sfc/efx.c     |    2 +
 include/linux/netdevice.h |   43 +++++++++++++++++++++++
 net/8021q/vlan_dev.c      |    2 +
 net/core/dev.c            |   38 ++++++++++++--------
 net/core/rtnetlink.c      |   66 +++++++++++++++++++-----------------
 7 files changed, 112 insertions(+), 45 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 4dd0510..5e82e4a 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -123,12 +123,16 @@ static struct rtnl_link_stats64 *loopback_get_stats64(struct net_device *dev)
 		bytes   += tbytes;
 		packets += tpackets;
 	}
+
+	netdev_stats_update_begin(dev);
 	stats->rx_packets = packets;
 	stats->tx_packets = packets;
 	stats->rx_dropped = drops;
 	stats->rx_errors  = drops;
 	stats->rx_bytes   = bytes;
 	stats->tx_bytes   = bytes;
+	netdev_stats_update_end(dev);
+
 	return stats;
 }
 
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index e6d626e..c39a90a 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -458,11 +458,13 @@ static struct rtnl_link_stats64 *macvlan_dev_get_stats64(struct net_device *dev)
 			/* rx_errors is an ulong, updated without syncp protection */
 			accum.rx_errors		+= p->rx_errors;
 		}
+		netdev_stats_update_begin(dev);
 		stats->rx_packets = accum.rx_packets;
 		stats->rx_bytes   = accum.rx_bytes;
 		stats->rx_errors  = accum.rx_errors;
 		stats->rx_dropped = accum.rx_errors;
 		stats->multicast  = accum.rx_multicast;
+		netdev_stats_update_end(dev);
 	}
 	return stats;
 }
diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 35b3f29..06afff6 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1543,6 +1543,7 @@ static struct rtnl_link_stats64 *efx_net_stats(struct net_device *net_dev)
 	efx->type->update_stats(efx);
 	spin_unlock_bh(&efx->stats_lock);
 
+	netdev_stats_update_begin(net_dev);
 	stats->rx_packets = mac_stats->rx_packets;
 	stats->tx_packets = mac_stats->tx_packets;
 	stats->rx_bytes = mac_stats->rx_bytes;
@@ -1565,6 +1566,7 @@ static struct rtnl_link_stats64 *efx_net_stats(struct net_device *net_dev)
 	stats->tx_errors = (stats->tx_window_errors +
 			    mac_stats->tx_bad);
 
+	netdev_stats_update_end(net_dev);
 	return stats;
 }
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4d27368..d6726cf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -886,6 +886,9 @@ struct net_device {
 	int			ifindex;
 	int			iflink;
 
+#if BITS_PER_LONG==32
+	seqlock_t		stats_lock;
+#endif
 	union {
 		struct rtnl_link_stats64 stats64;
 		struct net_device_stats stats;
@@ -1080,6 +1083,46 @@ struct net_device {
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
+
+static inline void netdev_stats_initlock(struct net_device *dev)
+{
+#if BITS_PER_LONG==32
+	seqlock_init(&dev->stats_lock);
+#endif
+}
+
+static inline void netdev_stats_update_begin(struct net_device *dev)
+{
+#if BITS_PER_LONG==32
+	write_seqlock(&dev->stats_lock);
+#endif
+}
+
+static inline void netdev_stats_update_end(struct net_device *dev)
+{
+#if BITS_PER_LONG==32
+	write_sequnlock(&dev->stats_lock);
+#endif
+}
+
+static inline unsigned int netdev_stats_fetch_begin(const struct net_device *dev)
+{
+#if BITS_PER_LONG==32
+	return read_seqbegin(&dev->stats_lock);
+#else
+	return 0;
+#endif
+}
+
+static bool inline netdev_stats_fetch_retry(const struct net_device *dev, unsigned int start)
+{
+#if BITS_PER_LONG==32
+	return read_seqretry(&dev->stats_lock, start);
+#else
+	return false;
+#endif
+}
+
 #define	NETDEV_ALIGN		32
 
 static inline
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index c6456cb..3d4e2f0 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -830,10 +830,12 @@ static struct rtnl_link_stats64 *vlan_dev_get_stats64(struct net_device *dev)
 			/* rx_errors is an ulong, not protected by syncp */
 			accum.rx_errors  += p->rx_errors;
 		}
+		netdev_stats_update_begin(dev);
 		stats->rx_packets = accum.rx_packets;
 		stats->rx_bytes   = accum.rx_bytes;
 		stats->rx_errors  = accum.rx_errors;
 		stats->multicast  = accum.rx_multicast;
+		netdev_stats_update_end(dev);
 	}
 	return stats;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 93b8929..b41c7ac 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3704,24 +3704,31 @@ void dev_seq_stop(struct seq_file *seq, void *v)
 static void dev_seq_printf_stats(struct seq_file *seq, struct net_device *dev)
 {
 	const struct rtnl_link_stats64 *stats = dev_get_stats(dev);
+	struct rtnl_link_stats64 temp;
+	unsigned int start;
+	
+	do {
+		start = netdev_stats_fetch_begin(dev);
+		memcpy(&temp, stats, sizeof(temp));
+	} while (netdev_stats_fetch_retry(dev, start));
 
 	seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
 		   "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
-		   dev->name, stats->rx_bytes, stats->rx_packets,
-		   stats->rx_errors,
-		   stats->rx_dropped + stats->rx_missed_errors,
-		   stats->rx_fifo_errors,
-		   stats->rx_length_errors + stats->rx_over_errors +
-		    stats->rx_crc_errors + stats->rx_frame_errors,
-		   stats->rx_compressed, stats->multicast,
-		   stats->tx_bytes, stats->tx_packets,
-		   stats->tx_errors, stats->tx_dropped,
-		   stats->tx_fifo_errors, stats->collisions,
-		   stats->tx_carrier_errors +
-		    stats->tx_aborted_errors +
-		    stats->tx_window_errors +
-		    stats->tx_heartbeat_errors,
-		   stats->tx_compressed);
+		   dev->name, temp.rx_bytes, temp.rx_packets,
+		   temp.rx_errors,
+		   temp.rx_dropped + temp.rx_missed_errors,
+		   temp.rx_fifo_errors,
+		   temp.rx_length_errors + temp.rx_over_errors +
+		    temp.rx_crc_errors + temp.rx_frame_errors,
+		   temp.rx_compressed, temp.multicast,
+		   temp.tx_bytes, temp.tx_packets,
+		   temp.tx_errors, temp.tx_dropped,
+		   temp.tx_fifo_errors, temp.collisions,
+		   temp.tx_carrier_errors +
+		    temp.tx_aborted_errors +
+		    temp.tx_window_errors +
+		    temp.tx_heartbeat_errors,
+		   temp.tx_compressed);
 }
 
 /*
@@ -5386,6 +5393,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	if (dev_addr_init(dev))
 		goto free_rx;
 
+	netdev_stats_initlock(dev);
 	dev_mc_init(dev);
 	dev_uc_init(dev);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index e645778..b3a3812 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -610,37 +610,43 @@ static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
 	a->tx_compressed = b->tx_compressed;
 }
 
-static void copy_rtnl_link_stats64(void *v, const struct rtnl_link_stats64 *b)
+static void copy_rtnl_link_stats64(void *v, const struct rtnl_link_stats64 *b,
+				   struct net_device *dev)
 {
 	struct rtnl_link_stats64 a;
-
-	a.rx_packets = b->rx_packets;
-	a.tx_packets = b->tx_packets;
-	a.rx_bytes = b->rx_bytes;
-	a.tx_bytes = b->tx_bytes;
-	a.rx_errors = b->rx_errors;
-	a.tx_errors = b->tx_errors;
-	a.rx_dropped = b->rx_dropped;
-	a.tx_dropped = b->tx_dropped;
-
-	a.multicast = b->multicast;
-	a.collisions = b->collisions;
-
-	a.rx_length_errors = b->rx_length_errors;
-	a.rx_over_errors = b->rx_over_errors;
-	a.rx_crc_errors = b->rx_crc_errors;
-	a.rx_frame_errors = b->rx_frame_errors;
-	a.rx_fifo_errors = b->rx_fifo_errors;
-	a.rx_missed_errors = b->rx_missed_errors;
-
-	a.tx_aborted_errors = b->tx_aborted_errors;
-	a.tx_carrier_errors = b->tx_carrier_errors;
-	a.tx_fifo_errors = b->tx_fifo_errors;
-	a.tx_heartbeat_errors = b->tx_heartbeat_errors;
-	a.tx_window_errors = b->tx_window_errors;
-
-	a.rx_compressed = b->rx_compressed;
-	a.tx_compressed = b->tx_compressed;
+	unsigned int start;
+
+	do {
+		start = netdev_stats_fetch_begin(dev);
+
+		a.rx_packets = b->rx_packets;
+		a.tx_packets = b->tx_packets;
+		a.rx_bytes = b->rx_bytes;
+		a.tx_bytes = b->tx_bytes;
+		a.rx_errors = b->rx_errors;
+		a.tx_errors = b->tx_errors;
+		a.rx_dropped = b->rx_dropped;
+		a.tx_dropped = b->tx_dropped;
+	
+		a.multicast = b->multicast;
+		a.collisions = b->collisions;
+
+		a.rx_length_errors = b->rx_length_errors;
+		a.rx_over_errors = b->rx_over_errors;
+		a.rx_crc_errors = b->rx_crc_errors;
+		a.rx_frame_errors = b->rx_frame_errors;
+		a.rx_fifo_errors = b->rx_fifo_errors;
+		a.rx_missed_errors = b->rx_missed_errors;
+
+		a.tx_aborted_errors = b->tx_aborted_errors;
+		a.tx_carrier_errors = b->tx_carrier_errors;
+		a.tx_fifo_errors = b->tx_fifo_errors;
+		a.tx_heartbeat_errors = b->tx_heartbeat_errors;
+		a.tx_window_errors = b->tx_window_errors;
+
+		a.rx_compressed = b->rx_compressed;
+		a.tx_compressed = b->tx_compressed;
+	} while (netdev_stats_fetch_retry(dev, start));
 	memcpy(v, &a, sizeof(a));
 }
 
@@ -854,7 +860,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 			sizeof(struct rtnl_link_stats64));
 	if (attr == NULL)
 		goto nla_put_failure;
-	copy_rtnl_link_stats64(nla_data(attr), stats);
+	copy_rtnl_link_stats64(nla_data(attr), stats, dev);
 
 	if (dev->dev.parent)
 		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));



^ permalink raw reply related

* Re: [PATCH net-next-2.6] tg3: 64bits stats
From: Eric Dumazet @ 2010-07-05 18:07 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Matt Carlson, Michael Chan, netdev, David Miller
In-Reply-To: <1278351070.2087.70.camel@achroite.uk.solarflarecom.com>

Le lundi 05 juillet 2010 à 18:31 +0100, Ben Hutchings a écrit :
> On Mon, 2010-07-05 at 18:03 +0200, Eric Dumazet wrote:
> > Le lundi 05 juillet 2010 à 11:14 +0200, Eric Dumazet a écrit :
> > > After commit be1f3c2c027c (net: Enable 64-bit net device statistics on
> > > 32-bit architectures), we can now provide 64bit stats, even on 32bit
> > > arches.
> > > 
> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > > ---
> > 
> > Please disregard this patch.
> > 
> > There is small possibility a reader might read a 64bit value while
> > another writer makes a change to it, changing high 32bit value.
> > 
> > A change in core network would be needed to make this 100% safe,
> > possibly using a seqlock to protect dev->stats64
> 
> I really didn't want to add this overhead and complication to readers
> when only some drivers need it.
> 

Overhead would be minimal, only in get_stats() method and only for
drivers that want to provide 64 bit stats...

Clearly, rx/tx path must not have any overhead.

Right now, even RTNL 64bit stats are not safe.

Should we revert prior patches or try to make progress ?




^ permalink raw reply

* Re: setsockopt(IP_TOS) being privileged or distinct capability?
From: Philip Prindeville @ 2010-07-05 18:08 UTC (permalink / raw)
  To: Alexander Clouter; +Cc: netdev
In-Reply-To: <20100703234813.GJ24655@chipmunk>

Say, on a slightly different subject... but still having to do with 
setsockopt(IP_TOS)... is it intentional that system call effectively 
does nothing if the socket is in listen(), connect(), or bind() states?

In other words, you have to issue the setsockopt() immediately after the 
socket() call, or it doesn't do anything.

As I remember, that's slightly different semantics from BSD, which 
allows you to change the markings on a bound or listening socket.

I've not walked through the kernel sources to see why this is.

-Philip

^ permalink raw reply

* Re: setsockopt(IP_TOS) being privileged or distinct capability?
From: Philip Prindeville @ 2010-07-05 18:04 UTC (permalink / raw)
  To: Alexander Clouter; +Cc: netdev
In-Reply-To: <20100703234813.GJ24655@chipmunk>

On 7/3/10 5:48 PM, Alexander Clouter wrote:
> Hi,
>
> * Philip Prindeville<philipp_subx@redfish-solutions.com>  [2010-07-03 17:07:52-0600]:
>    
>> On 7/3/10 12:55 PM, Alexander Clouter wrote:
>>      
>>>
>>>        
>>>> Does anyone else think that setsockopt(IP_TOS) should be a privileged
>>>> operation, perhaps using CAP_NET_ADMIN, or maybe even adding separate
>>>> granularity as CAP_NET_TOS?
>>>>
>>>>
>>>>          
>>> I really would prefer not having to run telnet and ssh *clients* as
>>> root. :)
>>>        
>> Don't ping and traceroute -I currently run as root?
>>
>>      
> Indeed, but I have no idea what that has to do with ToS/DSCP flags?
>    

The logic being that if having a RAW socket requires privilege, but it's 
necessary to have reasonable security for user-invokable programs... and 
we manage to do this without too much trouble for those to programs, 
then it shouldn't be an undue hardship to do the same for ssh.

> ping and (old skool) traceroute use ICMP where you need to open a
> privileged socket; to send and receive ICMP packets.  Opening a UDP/TCP
> is an unprivileged operation and so is setsockopt(IP_TOS).
>    

Right.  And I'm saying because of the potentially disruptive nature of 
setsockopt(IP_TOS), perhaps it should require privilege.

> I'm guessing, if you excuse me Google-stalking you), this is all linked
> to:
>
> https://bugzilla.mindrot.org/show_bug.cgi?id=1733
>    

That would be a very good guess.

And google-stalking is fine.  I draw the line at leaving dead cats at my 
front door.

> You have to bear in mind ToS is a marking that userland can utilise to
> request that the network provides it with a particular QoS, this does
> not mean for an instant the network has to honour that (I know my ISP
> does not and neither does my work network I sysadmin for)...otherwise
> nothing would stop me using:
>    

I understand that.  That's part of the reason that I've submitted 
patches for APR, Apache, Thunderbird, Firefox, Proftpd, Curl, wget, 
etc.  There is pressure within certain technical groups to get ISP's to 
voluntarily implement RFC-4594... that's the carrot.  The stick being 
FCC heavy-handed regulation of the ISP's if they don't.

Once QoS markings actually *are* implemented in carrier networks, the 
potential for abuse is non-insignificant.  Hence the suggestion to make 
it privileged.

> iptables -t mangle -I POSTROUTING -j DSCP --set-dscp-class EF
>    

Except that "iptables" is also a privileged operation.

> QoS is meaningless unless you place boundaries on the policies; the
> ToS/DSCP marking should only be used as a *hint* for classification of
> traffic flows.
>    

Like I said, there's an effort to push the ISP's into implementing 
RFC-4594 widespread.  Their previous arguments for not doing so were (a) 
most software doesn't implement QoS (hence the scrub I did above), and 
(b) there were no standard markings.  RFC-4594 attempts to impose a 
standard.

> For example, 'interactive' and 'low latency' (in the case of SSH or
> telnet) should not exceed 10kB/s...unless you like to play 0verkill :)
> Anything marking it's traffic as interactive but shutting traffic at
> 500kB/s is obviously telling lies.  If you build your policing rules to
> blindly accept whatever is in the ToS/DSCP field, you are configuring a
> DoS vector on your network.
>
> Cheers
>
>    

When you say "interactive" and "low latency" are you referring to the 
RFC-791 mappings for those, or to the RFC-4594 mappings of those classes?

Here we use Arno's Iptables Firewall with the traffic-shaper plugin I 
wrote.  This does shaping and policing within traffic classes.

And yes, doing an out-of-the-box shaper for Fedora is on my TODO list.

-Philip

^ permalink raw reply

* Re: Possible bug in net/ipv4/route.c?
From: Stephen Hemminger @ 2010-07-05 18:03 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Sol Kavy, linux-kernel, gren, gjin, msezgin, silgen,
	David S. Miller, netdev
In-Reply-To: <20100705120413.GA6219@gondor.apana.org.au>

On Mon, 5 Jul 2010 20:04:13 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Sol Kavy <skavy@ubicom.com> wrote:
> > Found Linux: 2.6.28
> > Arch: Ubicom32 <not yet pushed>
> > Project: uCLinux based Router
> > Test: Bit torrent Stress Test
> > 
> > Note: The top of Linus git net/ipv4/route.c appears to have the same issue.
> > 
> > The following is a patch for clearing out IP options area in an input skb during link failure processing.  Without this patch, the icmp_send() can result in a call to ip_options_echo() where the common buffer area of the skb is incorrectly interpreted.  Depending on the previous use of the skb->cb[], the interpreted option length values can cause stack corruption by copying more than 40 bytes to the output options.
> > 
> > In our case, a driver is using the skb->cb[] area to hold driver specific data.   The driver is not zeroing out the area after use.  I can see three basic solutions:
> > 
> > 1) Drivers are not allowed to use the skb->cb[] area at all.  Ubicom should modify the driver to use a different approach.
> > 
> > 2) The layer using skb->cb[] should clear this area after use and before handing the skb to another layer.  Ubicom should modify the driver to clear the skb->cb[] area before sending it up the line.
> > 
> > 3) Any layer that "uses" the skb->cb[] area must clear the area before use.  In which case, the proposed patch would fix the problem for the ipv4_link_failure().  I believe that this is the correct fix because I see ip_rcv() clears the skb->cb[] before using it.
> > 
> > Can someone confirm that this is the appropriate fix?  If this is documented somewhere, please direct me to the documentation.
> 
> Thanks for the report!
>  
> > Patch:  
> > 
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 125ee64..d13805f 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -1606,6 +1606,14 @@ static void ipv4_link_failure(struct sk_buff *skb)
> > {
> >         struct rtable *rt;
> > 
> > +       /*
> > +         * Since link failure can be called with skbs from many layers (see arp)
> > +         * the cb area of the skb must be cleared before use. Because the cb area 
> > +         * can be formatted according to the caller layer's cb area format and it may cause
> > +         * corruptions when it is handled in a different network layer.
> > +         */
> > +       memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
> >         icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
> >         rt = skb->rtable;
> > 
> > The packet is enqueud by:
> > do_IRQ()->do_softirq()->__do_softirq()->net_rx_action()->ubi32_eth_napi_poll()->ubi32_eth_receive()->__vlan_hwaccel_rx()->netif_receive_skb()->br_handle_frame()->nf_hook_slow()->br_nf_pre_routing_finish()->br_nfr_pre_routing_finish_bridge()->neight_resolve_output()->__neigh_event_send().
> > 
> > The packet is then dequeued by: 
> > do_IRQ() -> irq_exit() -> do_softirq() -> run_timer_softirq() -> neigh_timer_handler() -> arp_error_report() -> ipv4_link_failure() -> icmp_send() -> ip_options_echo().
> > 
> > Because the Ubicom Ethernet driver overwrites the common buffer area, the enqueued packet contains garbage when casted as an IP options data structure.  This results in ip_options_echo() miss reading the option length information and overwriting memory.  By clearing the skb->cb[] before processing the icmp_send() against the packet, we ensure that ip_options_echo() does not corrupt memory.   
> 
> Generally this area should be cleared on entry to each stack
> intending on using it.  So in this case, I'd point the finger
> of blame at the bridge stack for letting this packet into the
> IP stack through the back entrance without taking the proper
> precautions.

The CB is used in two places in the bridge code.
1) The recent multicast changes (IGMP)
2) Netfilter / ebtables to store header.

Ebtables is okay because the only part of the cb[] it uses
is the incoming device (brdev) which is always initialized coming into
the bridge: br_dev_xmit and br_handle_frame_finish

The IGMP code looks buggy.

/* net device transmit always called with no BH (preempt_disabled) */
netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
{
... [A]
	BR_INPUT_SKB_CB(skb)->brdev = dev;

	skb_reset_mac_header(skb);
	skb_pull(skb, ETH_HLEN);

	if (is_multicast_ether_addr(dest)) {
		if (br_multicast_rcv(br, NULL, skb))
			goto out;

		mdst = br_mdb_get(br, skb);
		if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb))
			... [B]

The problem is that br_dev_xmit is looking at flags in the CB[] that
are uninitialized.

if br_dev_xmit cleared the CB at [A] the mrouters_only would always be zero
at [B].

Where should the mrouters and igmp_only fields in skb be initialized?



^ permalink raw reply

* Re: [PATCH net-next-2.6] tg3: 64bits stats
From: Ben Hutchings @ 2010-07-05 17:31 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Matt Carlson, Michael Chan, netdev, David Miller
In-Reply-To: <1278345780.2877.376.camel@edumazet-laptop>

On Mon, 2010-07-05 at 18:03 +0200, Eric Dumazet wrote:
> Le lundi 05 juillet 2010 à 11:14 +0200, Eric Dumazet a écrit :
> > After commit be1f3c2c027c (net: Enable 64-bit net device statistics on
> > 32-bit architectures), we can now provide 64bit stats, even on 32bit
> > arches.
> > 
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > ---
> 
> Please disregard this patch.
> 
> There is small possibility a reader might read a 64bit value while
> another writer makes a change to it, changing high 32bit value.
> 
> A change in core network would be needed to make this 100% safe,
> possibly using a seqlock to protect dev->stats64

I really didn't want to add this overhead and complication to readers
when only some drivers need it.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: Distributed Switch Architecture(DSA)
From: Lennert Buytenhek @ 2010-07-05 17:24 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev
In-Reply-To: <OFE5DDA53C.8EDE0B1A-ONC1257748.00467368-C1257748.0050B4A2@transmode.se>

On Sun, Jun 20, 2010 at 04:41:31PM +0200, Joakim Tjernlund wrote:

> > > If not, what is the point of DSA then if it doesn't use the native
> > > forwarding capabilities of the HW switch?
> >
> > The point is and always was to provide a framework for proper integration
> > of hardware switch chips into the Linux kernel.  This framework doesn't
> > become useless just because it doesn't already support every single
> > hardware feature at this point.
> 
> Right, sorry if I sounded a bit harsh.
> 
> So DSA currently does a very minimal config of the HW switch to get
> things going.

Correct.


> If you want to do something more fancy one has to
> add a control plane to DSA which would possibly talk
> to a user space app. Is that correct?

Yes and no -- yes in the sense that if you want to use more functionality
of the switch chip, you'll have to add some code that extracts that info
from the Linux network interface config and turns it into commands for the
switch chip, and no in the sense that I'm not sure yet what the best way
to implement this would be.  (Doing it all in userspace is one option.)

^ permalink raw reply

* Re: [PATCH net-next-2.6] tg3: 64bits stats
From: Eric Dumazet @ 2010-07-05 16:03 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Michael Chan, netdev, David Miller
In-Reply-To: <1278321242.2877.18.camel@edumazet-laptop>

Le lundi 05 juillet 2010 à 11:14 +0200, Eric Dumazet a écrit :
> After commit be1f3c2c027c (net: Enable 64-bit net device statistics on
> 32-bit architectures), we can now provide 64bit stats, even on 32bit
> arches.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---

Please disregard this patch.

There is small possibility a reader might read a 64bit value while
another writer makes a change to it, changing high 32bit value.

A change in core network would be needed to make this 100% safe,
possibly using a seqlock to protect dev->stats64




^ permalink raw reply

* RE: Splice status
From: Eric Dumazet @ 2010-07-05 15:34 UTC (permalink / raw)
  To: Ofer Heifetz; +Cc: Changli Gao, netdev@vger.kernel.org
In-Reply-To: <EE71107DF0D1F24FA2D95041E64AB9E8ED2541B6E4@IL-MB01.marvell.com>

Le lundi 05 juillet 2010 à 16:47 +0300, Ofer Heifetz a écrit :
> Hi,
> 
> Well, Samba still disables splice support (hard coded), I applied your
> patch (adding the SPLICE_F_NONBLOCK to the splice(sock, pipe)) and I
> managed to write 4G file to Samba share.
> 
> I did notice that the splice is done on buffers in two sizes: 1380 and
> 2760 (when writing to share file), I guess that if I can get samba to
> use bigger buffers it will reduce the splice calls and achieve better
> performance.
> 

Note that if your load increases or network is faster, splice will
naturally use more data per call. Dont worry.

Also, you can change MIN(count, 16384) to MIN(count, 65536) now the real
samba bug is known and can be fixed (by the SPLICE_F_NONBLOCK patch I
sent)

(I guess using 16384 instead of 65536 was a try to reduce hang
probability)


> I also saw that when re-writing a file splice does use the maximum
> buffer size (~16K) occasionally.

max is 16 * PAGE_SIZE, 65536 bytes on x86

> 
> Need to perform some more testing with samba splice ...
> 
> -Ofer
> 
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> Sent: Monday, July 05, 2010 3:51 PM
> To: Ofer Heifetz
> Cc: Changli Gao; netdev@vger.kernel.org
> Subject: RE: Splice status
> 
> Le lundi 05 juillet 2010 à 13:52 +0300, Ofer Heifetz a écrit :
> > I am using Samba, so from my understanding of the source code, it
> loops and performs splice(sock, pipe) and splice(pipe, fd). There is no
> flush of any sort in between.
> > 
> > When you say drain you mean to flush all data to pipe?
> > 
> 
> Draining pipe before splice() call would only trigger the bug less
> often.
> 
> splice(sock, pipe) can block if caller dont use appropriate "non
> blocking pipe' splice() mode, even if pipe is empty before a splice()
> call.
> 
> Last time I checked, splice() code was disabled in samba.
> 
> Is it a patched version ?
> 
> Samba should add SPLICE_F_NONBLOCK to first splice() call (from sock to
> pipe)
> 
> (You also need a recent kernel, check for details :
> http://patchwork.ozlabs.org/patch/34511/ )
> 
> diff --git a/source3/lib/recvfile.c b/source3/lib/recvfile.c
> index ea01596..65e6f34 100644
> --- a/source3/lib/recvfile.c
> +++ b/source3/lib/recvfile.c
> @@ -182,7 +182,7 @@ ssize_t sys_recvfile(int fromfd,
>                 int nread, to_write;
>  
>                 nread = splice(fromfd, NULL, pipefd[1], NULL,
> -                              MIN(count, 16384), SPLICE_F_MOVE);
> +                              MIN(count, 16384), SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
>                 if (nread == -1) {
>                         if (errno == EINTR) {
>                                 continue;
> 
> 



^ permalink raw reply

* [PATCH] drivers/net: correct valid flag
From: Julia Lawall @ 2010-07-05 15:07 UTC (permalink / raw)
  To: netdev, linux-kernel, kernel-janitors

From: Julia Lawall <julia@diku.dk>

Elsewhere in the "optimized" functions, the "2" constants are used.
NV_TX_VALID and NV_TX2_VALID have the same value.

Signed-off-by: Julia Lawall <julia@diku.dk>

---
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 268ea4d..870c18b 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2468,7 +2468,8 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 	struct ring_desc_ex* orig_get_tx = np->get_tx.ex;
 
 	while ((np->get_tx.ex != np->put_tx.ex) &&
-	       !((flags = le32_to_cpu(np->get_tx.ex->flaglen)) & NV_TX_VALID) &&
+	       !((flags = le32_to_cpu(np->get_tx.ex->flaglen)) & NV_TX2_VALID)
+	       &&
 	       (tx_work < limit)) {
 
 		dprintk(KERN_DEBUG "%s: nv_tx_done_optimized: flags 0x%x.\n",

^ permalink raw reply related

* Re: Fwd: Possible bug in net/ipv4/route.c?
From: Herbert Xu @ 2010-07-05 14:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
In-Reply-To: <1278336898.2877.212.camel@edumazet-laptop>

On Mon, Jul 05, 2010 at 03:34:58PM +0200, Eric Dumazet wrote:
>
> I see. We could :
> 
> Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.

Yeah I think this might work.  Although skb's are used in all
sorts of esoteric places (such as netlink which may not even be
related to networking) so I can't guarantee that every alloc_skb
caller does the right thing.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 3/3] pm_qos: get rid of the allocation in pm_qos_add_request()
From: James Bottomley @ 2010-07-05 14:02 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Linux PM, markgross, netdev
In-Reply-To: <s5hvd8ubfcz.wl%tiwai@suse.de>

On Mon, 2010-07-05 at 08:41 +0200, Takashi Iwai wrote:
> sorry for the late reply, as I've been on vacation in the last week
> (and shut off mails intentionally :)

Envy forbids me from saying that's OK.

> At Mon, 28 Jun 2010 12:44:48 -0500,
> James Bottomley wrote:
> > 
> > Since every caller has to squirrel away the returned pointer anyway,
> > they might as well supply the memory area.  This fixes a bug in a few of
> > the call sites where the returned pointer was dereferenced without
> > checking it for NULL (which gets returned if the kzalloc failed).
> > 
> > I'd like to hear how sound and netdev feels about this: it will add
> > about two more pointers worth of data to struct netdev and struct
> > snd_pcm_substream .. but I think it's worth it.  If you're OK, I'll add
> > your acks and send through the pm tree.
> > 
> > This also looks to me like an android independent clean up (even though
> > it renders the request_add atomically callable).  I also added include
> > guards to include/linux/pm_qos_params.h
> 
> I like the patch very well, too.
> But, just wondering...
> 
> > @@ -262,6 +260,11 @@ void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
> >  	if (!pm_qos_req) /*guard against callers passing in null */
> >  		return;
> >  
> > +	if (pm_qos_request_active(pm_qos_req)) {
> > +		WARN(1, KERN_ERR "pm_qos_update_request() called for unknown object\n");
> > +		return;
> > +	}
> > +
> 
> Is this correct...?  Shouldn't it be a negative check?

Yes, it should be a negative check ... I'll update the patch.  I guess
this still means that no-one has managed to test it on a functional
system ...

James



^ permalink raw reply

* RE: Splice status
From: Ofer Heifetz @ 2010-07-05 13:47 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Changli Gao, netdev@vger.kernel.org
In-Reply-To: <1278334254.2877.158.camel@edumazet-laptop>

Hi,

Well, Samba still disables splice support (hard coded), I applied your patch (adding the SPLICE_F_NONBLOCK to the splice(sock, pipe)) and I managed to write 4G file to Samba share.

I did notice that the splice is done on buffers in two sizes: 1380 and 2760 (when writing to share file), I guess that if I can get samba to use bigger buffers it will reduce the splice calls and achieve better performance.

I also saw that when re-writing a file splice does use the maximum buffer size (~16K) occasionally.

Need to perform some more testing with samba splice ...

-Ofer

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
Sent: Monday, July 05, 2010 3:51 PM
To: Ofer Heifetz
Cc: Changli Gao; netdev@vger.kernel.org
Subject: RE: Splice status

Le lundi 05 juillet 2010 à 13:52 +0300, Ofer Heifetz a écrit :
> I am using Samba, so from my understanding of the source code, it
loops and performs splice(sock, pipe) and splice(pipe, fd). There is no
flush of any sort in between.
> 
> When you say drain you mean to flush all data to pipe?
> 

Draining pipe before splice() call would only trigger the bug less
often.

splice(sock, pipe) can block if caller dont use appropriate "non
blocking pipe' splice() mode, even if pipe is empty before a splice()
call.

Last time I checked, splice() code was disabled in samba.

Is it a patched version ?

Samba should add SPLICE_F_NONBLOCK to first splice() call (from sock to
pipe)

(You also need a recent kernel, check for details :
http://patchwork.ozlabs.org/patch/34511/ )

diff --git a/source3/lib/recvfile.c b/source3/lib/recvfile.c
index ea01596..65e6f34 100644
--- a/source3/lib/recvfile.c
+++ b/source3/lib/recvfile.c
@@ -182,7 +182,7 @@ ssize_t sys_recvfile(int fromfd,
                int nread, to_write;

                nread = splice(fromfd, NULL, pipefd[1], NULL,
-                              MIN(count, 16384), SPLICE_F_MOVE);
+                              MIN(count, 16384), SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
                if (nread == -1) {
                        if (errno == EINTR) {
                                continue;

^ permalink raw reply related

* Re: Fwd: Possible bug in net/ipv4/route.c?
From: Eric Dumazet @ 2010-07-05 13:34 UTC (permalink / raw)
  To: Herbert Xu; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
In-Reply-To: <20100705132245.GA6876@gondor.apana.org.au>

Le lundi 05 juillet 2010 à 21:22 +0800, Herbert Xu a écrit :
> On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
> >
> > Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> > stack should rely it being zero ?
> 
> Unless a protocol is allocating the skb itself, then the fact
> that skb_alloc clears skb->cb is no guarantee that the skb->cb
> will be zero.

I see. We could :

Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.

or in debug mode, poison it to trigger errors more often.

Thanks

^ permalink raw reply

* Re: Fwd: Possible bug in net/ipv4/route.c?
From: Herbert Xu @ 2010-07-05 13:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
In-Reply-To: <1278334754.2877.173.camel@edumazet-laptop>

On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
>
> Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> stack should rely it being zero ?

Unless a protocol is allocating the skb itself, then the fact
that skb_alloc clears skb->cb is no guarantee that the skb->cb
will be zero.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [NET] ioc3-eth: Use the instance of net_device_stats from net_device.
From: Ralf Baechle @ 2010-07-05 12:59 UTC (permalink / raw)
  To: David S. Miller
  Cc: Kulikov Vasiliy, Kernel Janitors, Jiri Pirko, Eric Dumazet,
	Patrick McHardy, Alexey Dobriyan, linux-mips, netdev,
	linux-kernel

Since net_device has an instance of net_device_stats, we can remove the
instance of this from the adapter structure.

Based on original patch by Kulikov Vasiliy.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>

 drivers/net/ioc3-eth.c |   49 ++++++++++++++++++++++++-----------------------
 1 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ioc3-eth.c b/drivers/net/ioc3-eth.c
index e3b5e94..0b3f6df 100644
--- a/drivers/net/ioc3-eth.c
+++ b/drivers/net/ioc3-eth.c
@@ -82,7 +82,6 @@ struct ioc3_private {
 	struct ioc3_etxd *txr;
 	struct sk_buff *rx_skbs[512];
 	struct sk_buff *tx_skbs[128];
-	struct net_device_stats stats;
 	int rx_ci;			/* RX consumer index */
 	int rx_pi;			/* RX producer index */
 	int tx_ci;			/* TX consumer index */
@@ -504,8 +503,8 @@ static struct net_device_stats *ioc3_get_stats(struct net_device *dev)
 	struct ioc3_private *ip = netdev_priv(dev);
 	struct ioc3 *ioc3 = ip->regs;
 
-	ip->stats.collisions += (ioc3_r_etcdc() & ETCDC_COLLCNT_MASK);
-	return &ip->stats;
+	dev->stats.collisions += (ioc3_r_etcdc() & ETCDC_COLLCNT_MASK);
+	return &dev->stats;
 }
 
 static void ioc3_tcpudp_checksum(struct sk_buff *skb, uint32_t hwsum, int len)
@@ -576,8 +575,9 @@ static void ioc3_tcpudp_checksum(struct sk_buff *skb, uint32_t hwsum, int len)
 		skb->ip_summed = CHECKSUM_UNNECESSARY;
 }
 
-static inline void ioc3_rx(struct ioc3_private *ip)
+static inline void ioc3_rx(struct net_device *dev)
 {
+	struct ioc3_private *ip = netdev_priv(dev);
 	struct sk_buff *skb, *new_skb;
 	struct ioc3 *ioc3 = ip->regs;
 	int rx_entry, n_entry, len;
@@ -598,13 +598,13 @@ static inline void ioc3_rx(struct ioc3_private *ip)
 		if (err & ERXBUF_GOODPKT) {
 			len = ((w0 >> ERXBUF_BYTECNT_SHIFT) & 0x7ff) - 4;
 			skb_trim(skb, len);
-			skb->protocol = eth_type_trans(skb, priv_netdev(ip));
+			skb->protocol = eth_type_trans(skb, dev);
 
 			new_skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
 			if (!new_skb) {
 				/* Ouch, drop packet and just recycle packet
 				   to keep the ring filled.  */
-				ip->stats.rx_dropped++;
+				dev->stats.rx_dropped++;
 				new_skb = skb;
 				goto next;
 			}
@@ -622,19 +622,19 @@ static inline void ioc3_rx(struct ioc3_private *ip)
 			rxb = (struct ioc3_erxbuf *) new_skb->data;
 			skb_reserve(new_skb, RX_OFFSET);
 
-			ip->stats.rx_packets++;		/* Statistics */
-			ip->stats.rx_bytes += len;
+			dev->stats.rx_packets++;		/* Statistics */
+			dev->stats.rx_bytes += len;
 		} else {
- 			/* The frame is invalid and the skb never
-                           reached the network layer so we can just
-                           recycle it.  */
- 			new_skb = skb;
- 			ip->stats.rx_errors++;
+			/* The frame is invalid and the skb never
+			   reached the network layer so we can just
+			   recycle it.  */
+			new_skb = skb;
+			dev->stats.rx_errors++;
 		}
 		if (err & ERXBUF_CRCERR)	/* Statistics */
-			ip->stats.rx_crc_errors++;
+			dev->stats.rx_crc_errors++;
 		if (err & ERXBUF_FRAMERR)
-			ip->stats.rx_frame_errors++;
+			dev->stats.rx_frame_errors++;
 next:
 		ip->rx_skbs[n_entry] = new_skb;
 		rxr[n_entry] = cpu_to_be64(ioc3_map(rxb, 1));
@@ -652,8 +652,9 @@ next:
 	ip->rx_ci = rx_entry;
 }
 
-static inline void ioc3_tx(struct ioc3_private *ip)
+static inline void ioc3_tx(struct net_device *dev)
 {
+	struct ioc3_private *ip = netdev_priv(dev);
 	unsigned long packets, bytes;
 	struct ioc3 *ioc3 = ip->regs;
 	int tx_entry, o_entry;
@@ -681,12 +682,12 @@ static inline void ioc3_tx(struct ioc3_private *ip)
 		tx_entry = (etcir >> 7) & 127;
 	}
 
-	ip->stats.tx_packets += packets;
-	ip->stats.tx_bytes += bytes;
+	dev->stats.tx_packets += packets;
+	dev->stats.tx_bytes += bytes;
 	ip->txqlen -= packets;
 
 	if (ip->txqlen < 128)
-		netif_wake_queue(priv_netdev(ip));
+		netif_wake_queue(dev);
 
 	ip->tx_ci = o_entry;
 	spin_unlock(&ip->ioc3_lock);
@@ -699,9 +700,9 @@ static inline void ioc3_tx(struct ioc3_private *ip)
  * with such error interrupts if something really goes wrong, so we might
  * also consider to take the interface down.
  */
-static void ioc3_error(struct ioc3_private *ip, u32 eisr)
+static void ioc3_error(struct net_device *dev, u32 eisr)
 {
-	struct net_device *dev = priv_netdev(ip);
+	struct ioc3_private *ip = netdev_priv(dev);
 	unsigned char *iface = dev->name;
 
 	spin_lock(&ip->ioc3_lock);
@@ -747,11 +748,11 @@ static irqreturn_t ioc3_interrupt(int irq, void *_dev)
 
 	if (eisr & (EISR_RXOFLO | EISR_RXBUFOFLO | EISR_RXMEMERR |
 	            EISR_RXPARERR | EISR_TXBUFUFLO | EISR_TXMEMERR))
-		ioc3_error(ip, eisr);
+		ioc3_error(dev, eisr);
 	if (eisr & EISR_RXTIMERINT)
-		ioc3_rx(ip);
+		ioc3_rx(dev);
 	if (eisr & EISR_TXEXPLICIT)
-		ioc3_tx(ip);
+		ioc3_tx(dev);
 
 	return IRQ_HANDLED;
 }

^ permalink raw reply related

* Re: Fwd: Possible bug in net/ipv4/route.c?
From: Eric Dumazet @ 2010-07-05 12:59 UTC (permalink / raw)
  To: Herbert Xu; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
In-Reply-To: <20100705120617.GA6267@gondor.apana.org.au>

Le lundi 05 juillet 2010 à 20:06 +0800, Herbert Xu a écrit :
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >> 2) The layer using skb->cb[] should clear this area after use and
> >> before handing the skb to another layer.  Ubicom should modify the
> >> driver to clear the skb->cb[] area before sending it up the line.
> >> 
> > 
> > This is the right option. If you use one word in cb[], only your driver
> > knows how to clear it efficiently.
> 
> Absolutely not! No protocol stack should rely on an external skb
> having a zero cb.
> 

Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
stack should rely it being zero ?

^ permalink raw reply

* Re: [PATCH] ioc3-eth: Use the instance of net_device_stats from net_device.
From: Ralf Baechle @ 2010-07-05 12:55 UTC (permalink / raw)
  To: Kulikov Vasiliy
  Cc: Kernel Janitors, David S. Miller, Jiri Pirko, Eric Dumazet,
	Patrick McHardy, Alexey Dobriyan, linux-mips, netdev,
	linux-kernel
In-Reply-To: <1278332034-17122-1-git-send-email-segooon@gmail.com>

On Mon, Jul 05, 2010 at 04:13:51PM +0400, Kulikov Vasiliy wrote:

> Since net_device has an instance of net_device_stats,
> we can remove the instance of this from the adapter structure.
> 
> Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>

NACK, your patch doesn't compile.  I'll post a fixed patch in a separate
mail.

  Ralf

^ permalink raw reply

* RE: Splice status
From: Eric Dumazet @ 2010-07-05 12:50 UTC (permalink / raw)
  To: Ofer Heifetz; +Cc: Changli Gao, netdev@vger.kernel.org
In-Reply-To: <EE71107DF0D1F24FA2D95041E64AB9E8ED2541B66E@IL-MB01.marvell.com>

Le lundi 05 juillet 2010 à 13:52 +0300, Ofer Heifetz a écrit :
> I am using Samba, so from my understanding of the source code, it
loops and performs splice(sock, pipe) and splice(pipe, fd). There is no
flush of any sort in between.
> 
> When you say drain you mean to flush all data to pipe?
> 

Draining pipe before splice() call would only trigger the bug less
often.

splice(sock, pipe) can block if caller dont use appropriate "non
blocking pipe' splice() mode, even if pipe is empty before a splice()
call.

Last time I checked, splice() code was disabled in samba.

Is it a patched version ?

Samba should add SPLICE_F_NONBLOCK to first splice() call (from sock to
pipe)

(You also need a recent kernel, check for details :
http://patchwork.ozlabs.org/patch/34511/ )

diff --git a/source3/lib/recvfile.c b/source3/lib/recvfile.c
index ea01596..65e6f34 100644
--- a/source3/lib/recvfile.c
+++ b/source3/lib/recvfile.c
@@ -182,7 +182,7 @@ ssize_t sys_recvfile(int fromfd,
                int nread, to_write;
 
                nread = splice(fromfd, NULL, pipefd[1], NULL,
-                              MIN(count, 16384), SPLICE_F_MOVE);
+                              MIN(count, 16384), SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
                if (nread == -1) {
                        if (errno == EINTR) {
                                continue;



^ permalink raw reply related

* Re: [PATCH net-next-2.6] ipv6: adding ip_nonlocal_bind option from ipv4
From: Michal Humpula @ 2010-07-05 12:26 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki; +Cc: netdev
In-Reply-To: <1278324822.19358.24.camel@sylph.linux-ipv6.org>

On Monday 05 of July 2010 12:13:42 YOSHIFUJI Hideaki wrote:
> Hello.
> 
> Michal Humpula wrote:
> > Adds ability to bind non-local IPv6 address the same way as for IPv4
> > 
> > Signed-off-by: Michal Humpula <michal.humpula@web4u.cz>
> > 
> > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > index e830cd4..55b3552 100644
> > --- a/net/ipv6/af_inet6.c
> > +++ b/net/ipv6/af_inet6.c
> > 
> > @@ -252,6 +252,8 @@ out_rcu_unlock:
> >  	goto out;
> >  
> >  }
> > 
> > +int sysctl_ipv6_nonlocal_bind __read_mostly;
> > +EXPORT_SYMBOL(sysctl_ipv6_nonlocal_bind);
> > 
> >  /* bind for INET6 API */
> >  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int
> > 
> > addr_len)
> > @@ -345,8 +347,10 @@ int inet6_bind(struct socket *sock, struct
> > sockaddr *uaddr, int addr_len)
> > 
> >  			if (!(addr_type &I do think i IPV6_ADDR_MULTICAST))	{
> >  			
> >  				if (!ipv6_chk_addr(net, &addr->sin6_addr,
> >  				
> >  						   dev, 0)) {
> > 
> > -					err = -EADDRNOTAVAIL;
> > -					goto out_unlock;
> > +					if (!sysctl_ipv6_nonlocal_bind) {
> > +						err = -EADDRNOTAVAIL;
> > +						goto out_unlock;
> > +					}
> > 
> >  				}
> >  			
> >  			}
> >  			rcu_read_unlock();
> > 
> > diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
> > index fa1d8f4..56bfe76 100644
> > --- a/net/ipv6/sysctl_net_ipv6.c
> > +++ b/net/ipv6/sysctl_net_ipv6.c
> > @@ -35,6 +35,13 @@ static ctl_table ipv6_table_template[] = {
> > 
> >  		.mode		= 0644,When you try to send packets / connect to
> 
> remote address,
> 
> >  		.proc_handler	= proc_dointvec
> >  	
> >  	}, so far.
> > 
> > +	{
> > +		.procname = "ipv6_nonlocal_bind",
> > +		.data   = &sysctl_ipv6_nonlocal_bind,
> > +		.maxlen   = sizeof(int),
> > +		.mode   = 0644,
> > +		.proc_handler = proc_dointvec
> > +	},
> > 
> >  	{ }
> >  
> >  };
> 
> This is not sufficient.
> 
> In IPv4, even if you do non-local bind, you cannot connect/send
> packets from that address until the admin really assigns that
> address on the node.  Local address is checked when you try to
> connect (or to send), and this is important thing to do.
> 
> But in IPv6, it is not checked, and it is very bad to open
> this "hole".
> 
> --yoshfuji

Thanks again for review. Could you please point me to part, where the check is done?
Is there a reason why is the check not done in IPv6 too?

^ permalink raw reply

* [PATCH] starfire: Use the instance of net_device_stats from net_device.
From: Kulikov Vasiliy @ 2010-07-05 12:14 UTC (permalink / raw)
  To: Kernel Janitors
  Cc: Ion Badulescu, David S. Miller, Joe Perches, Jiri Pirko,
	Stephen Hemminger, Andrew Morton, netdev, linux-kernel

Since net_device has an instance of net_device_stats,
we can remove the instance of this from the adapter structure.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
---
 drivers/net/starfire.c |   47 +++++++++++++++++++++++------------------------
 1 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/drivers/net/starfire.c b/drivers/net/starfire.c
index 74b7ae7..a42b687 100644
--- a/drivers/net/starfire.c
+++ b/drivers/net/starfire.c
@@ -562,7 +562,6 @@ struct netdev_private {
 	unsigned int tx_done;
 	struct napi_struct napi;
 	struct net_device *dev;
-	struct net_device_stats stats;
 	struct pci_dev *pci_dev;
 #ifdef VLAN_SUPPORT
 	struct vlan_group *vlgrp;
@@ -1174,7 +1173,7 @@ static void tx_timeout(struct net_device *dev)
 	/* Trigger an immediate transmit demand. */
 
 	dev->trans_start = jiffies; /* prevent tx timeout */
-	np->stats.tx_errors++;
+	dev->stats.tx_errors++;
 	netif_wake_queue(dev);
 }
 
@@ -1265,7 +1264,7 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 			}
 			if (skb->ip_summed == CHECKSUM_PARTIAL) {
 				status |= TxCalTCP;
-				np->stats.tx_compressed++;
+				dev->stats.tx_compressed++;
 			}
 			status |= skb_first_frag_len(skb) | (skb_num_frags(skb) << 16);
 
@@ -1374,7 +1373,7 @@ static irqreturn_t intr_handler(int irq, void *dev_instance)
 				printk(KERN_DEBUG "%s: Tx completion #%d entry %d is %#8.8x.\n",
 				       dev->name, np->dirty_tx, np->tx_done, tx_status);
 			if ((tx_status & 0xe0000000) == 0xa0000000) {
-				np->stats.tx_packets++;
+				dev->stats.tx_packets++;
 			} else if ((tx_status & 0xe0000000) == 0x80000000) {
 				u16 entry = (tx_status & 0x7fff) / sizeof(starfire_tx_desc);
 				struct sk_buff *skb = np->tx_info[entry].skb;
@@ -1462,9 +1461,9 @@ static int __netdev_rx(struct net_device *dev, int *quota)
 			/* There was an error. */
 			if (debug > 2)
 				printk(KERN_DEBUG "  netdev_rx() Rx error was %#8.8x.\n", desc_status);
-			np->stats.rx_errors++;
+			dev->stats.rx_errors++;
 			if (desc_status & RxFIFOErr)
-				np->stats.rx_fifo_errors++;
+				dev->stats.rx_fifo_errors++;
 			goto next_rx;
 		}
 
@@ -1515,7 +1514,7 @@ static int __netdev_rx(struct net_device *dev, int *quota)
 #endif
 		if (le16_to_cpu(desc->status2) & 0x0100) {
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			np->stats.rx_compressed++;
+			dev->stats.rx_compressed++;
 		}
 		/*
 		 * This feature doesn't seem to be working, at least
@@ -1547,7 +1546,7 @@ static int __netdev_rx(struct net_device *dev, int *quota)
 		} else
 #endif /* VLAN_SUPPORT */
 			netif_receive_skb(skb);
-		np->stats.rx_packets++;
+		dev->stats.rx_packets++;
 
 	next_rx:
 		np->cur_rx++;
@@ -1717,12 +1716,12 @@ static void netdev_error(struct net_device *dev, int intr_status)
 			printk(KERN_WARNING "%s: PCI Tx underflow -- adapter is probably malfunctioning\n", dev->name);
 	}
 	if (intr_status & IntrRxGFPDead) {
-		np->stats.rx_fifo_errors++;
-		np->stats.rx_errors++;
+		dev->stats.rx_fifo_errors++;
+		dev->stats.rx_errors++;
 	}
 	if (intr_status & (IntrNoTxCsum | IntrDMAErr)) {
-		np->stats.tx_fifo_errors++;
-		np->stats.tx_errors++;
+		dev->stats.tx_fifo_errors++;
+		dev->stats.tx_errors++;
 	}
 	if ((intr_status & ~(IntrNormalMask | IntrAbnormalSummary | IntrLinkChange | IntrStatsMax | IntrTxDataLow | IntrRxGFPDead | IntrNoTxCsum | IntrPCIPad)) && debug)
 		printk(KERN_ERR "%s: Something Wicked happened! %#8.8x.\n",
@@ -1736,24 +1735,24 @@ static struct net_device_stats *get_stats(struct net_device *dev)
 	void __iomem *ioaddr = np->base;
 
 	/* This adapter architecture needs no SMP locks. */
-	np->stats.tx_bytes = readl(ioaddr + 0x57010);
-	np->stats.rx_bytes = readl(ioaddr + 0x57044);
-	np->stats.tx_packets = readl(ioaddr + 0x57000);
-	np->stats.tx_aborted_errors =
+	dev->stats.tx_bytes = readl(ioaddr + 0x57010);
+	dev->stats.rx_bytes = readl(ioaddr + 0x57044);
+	dev->stats.tx_packets = readl(ioaddr + 0x57000);
+	dev->stats.tx_aborted_errors =
 		readl(ioaddr + 0x57024) + readl(ioaddr + 0x57028);
-	np->stats.tx_window_errors = readl(ioaddr + 0x57018);
-	np->stats.collisions =
+	dev->stats.tx_window_errors = readl(ioaddr + 0x57018);
+	dev->stats.collisions =
 		readl(ioaddr + 0x57004) + readl(ioaddr + 0x57008);
 
 	/* The chip only need report frame silently dropped. */
-	np->stats.rx_dropped += readw(ioaddr + RxDMAStatus);
+	dev->stats.rx_dropped += readw(ioaddr + RxDMAStatus);
 	writew(0, ioaddr + RxDMAStatus);
-	np->stats.rx_crc_errors = readl(ioaddr + 0x5703C);
-	np->stats.rx_frame_errors = readl(ioaddr + 0x57040);
-	np->stats.rx_length_errors = readl(ioaddr + 0x57058);
-	np->stats.rx_missed_errors = readl(ioaddr + 0x5707C);
+	dev->stats.rx_crc_errors = readl(ioaddr + 0x5703C);
+	dev->stats.rx_frame_errors = readl(ioaddr + 0x57040);
+	dev->stats.rx_length_errors = readl(ioaddr + 0x57058);
+	dev->stats.rx_missed_errors = readl(ioaddr + 0x5707C);
 
-	return &np->stats;
+	return &dev->stats;
 }
 
 
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH] ns83820: Use the instance of net_device_stats from net_device.
From: Kulikov Vasiliy @ 2010-07-05 12:14 UTC (permalink / raw)
  To: Kernel Janitors
  Cc: David S. Miller, Alexey Dobriyan, Stephen Hemminger, Tejun Heo,
	Jiri Pirko, netdev, linux-kernel

Since net_device has an instance of net_device_stats,
we can remove the instance of this from the adapter structure.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
---
 drivers/net/ns83820.c |   44 ++++++++++++++++++++++----------------------
 1 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ns83820.c b/drivers/net/ns83820.c
index e88e97c..5a3488f 100644
--- a/drivers/net/ns83820.c
+++ b/drivers/net/ns83820.c
@@ -424,7 +424,6 @@ struct rx_info {
 
 
 struct ns83820 {
-	struct net_device_stats	stats;
 	u8			__iomem *base;
 
 	struct pci_dev		*pci_dev;
@@ -918,9 +917,9 @@ static void rx_irq(struct net_device *ndev)
 			if (unlikely(!skb))
 				goto netdev_mangle_me_harder_failed;
 			if (cmdsts & CMDSTS_DEST_MULTI)
-				dev->stats.multicast ++;
-			dev->stats.rx_packets ++;
-			dev->stats.rx_bytes += len;
+				ndev->stats.multicast++;
+			ndev->stats.rx_packets++;
+			ndev->stats.rx_bytes += len;
 			if ((extsts & 0x002a0000) && !(extsts & 0x00540000)) {
 				skb->ip_summed = CHECKSUM_UNNECESSARY;
 			} else {
@@ -940,7 +939,7 @@ static void rx_irq(struct net_device *ndev)
 #endif
 			if (NET_RX_DROP == rx_rc) {
 netdev_mangle_me_harder_failed:
-				dev->stats.rx_dropped ++;
+				ndev->stats.rx_dropped++;
 			}
 		} else {
 			kfree_skb(skb);
@@ -1008,11 +1007,11 @@ static void do_tx_done(struct net_device *ndev)
 		dma_addr_t addr;
 
 		if (cmdsts & CMDSTS_ERR)
-			dev->stats.tx_errors ++;
+			ndev->stats.tx_errors++;
 		if (cmdsts & CMDSTS_OK)
-			dev->stats.tx_packets ++;
+			ndev->stats.tx_packets++;
 		if (cmdsts & CMDSTS_OK)
-			dev->stats.tx_bytes += cmdsts & 0xffff;
+			ndev->stats.tx_bytes += cmdsts & 0xffff;
 
 		dprintk("tx_done_idx=%d free_idx=%d cmdsts=%08x\n",
 			tx_done_idx, dev->tx_free_idx, cmdsts);
@@ -1212,20 +1211,21 @@ again:
 
 static void ns83820_update_stats(struct ns83820 *dev)
 {
+	struct net_device *ndev = dev->ndev;
 	u8 __iomem *base = dev->base;
 
 	/* the DP83820 will freeze counters, so we need to read all of them */
-	dev->stats.rx_errors		+= readl(base + 0x60) & 0xffff;
-	dev->stats.rx_crc_errors	+= readl(base + 0x64) & 0xffff;
-	dev->stats.rx_missed_errors	+= readl(base + 0x68) & 0xffff;
-	dev->stats.rx_frame_errors	+= readl(base + 0x6c) & 0xffff;
-	/*dev->stats.rx_symbol_errors +=*/ readl(base + 0x70);
-	dev->stats.rx_length_errors	+= readl(base + 0x74) & 0xffff;
-	dev->stats.rx_length_errors	+= readl(base + 0x78) & 0xffff;
-	/*dev->stats.rx_badopcode_errors += */ readl(base + 0x7c);
-	/*dev->stats.rx_pause_count += */  readl(base + 0x80);
-	/*dev->stats.tx_pause_count += */  readl(base + 0x84);
-	dev->stats.tx_carrier_errors	+= readl(base + 0x88) & 0xff;
+	ndev->stats.rx_errors		+= readl(base + 0x60) & 0xffff;
+	ndev->stats.rx_crc_errors	+= readl(base + 0x64) & 0xffff;
+	ndev->stats.rx_missed_errors	+= readl(base + 0x68) & 0xffff;
+	ndev->stats.rx_frame_errors	+= readl(base + 0x6c) & 0xffff;
+	/*ndev->stats.rx_symbol_errors +=*/ readl(base + 0x70);
+	ndev->stats.rx_length_errors	+= readl(base + 0x74) & 0xffff;
+	ndev->stats.rx_length_errors	+= readl(base + 0x78) & 0xffff;
+	/*ndev->stats.rx_badopcode_errors += */ readl(base + 0x7c);
+	/*ndev->stats.rx_pause_count += */  readl(base + 0x80);
+	/*ndev->stats.tx_pause_count += */  readl(base + 0x84);
+	ndev->stats.tx_carrier_errors	+= readl(base + 0x88) & 0xff;
 }
 
 static struct net_device_stats *ns83820_get_stats(struct net_device *ndev)
@@ -1237,7 +1237,7 @@ static struct net_device_stats *ns83820_get_stats(struct net_device *ndev)
 	ns83820_update_stats(dev);
 	spin_unlock_irq(&dev->misc_lock);
 
-	return &dev->stats;
+	return &ndev->stats;
 }
 
 /* Let ethtool retrieve info */
@@ -1464,12 +1464,12 @@ static void ns83820_do_isr(struct net_device *ndev, u32 isr)
 
 	if (unlikely(ISR_RXSOVR & isr)) {
 		//printk("overrun: rxsovr\n");
-		dev->stats.rx_fifo_errors ++;
+		ndev->stats.rx_fifo_errors++;
 	}
 
 	if (unlikely(ISR_RXORN & isr)) {
 		//printk("overrun: rxorn\n");
-		dev->stats.rx_fifo_errors ++;
+		ndev->stats.rx_fifo_errors++;
 	}
 
 	if ((ISR_RXRCMP & isr) && dev->rx_info.up)
-- 
1.7.0.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox