Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next-2.6] net: Fix rxq ref counting
From: Eric Dumazet @ 2010-10-07 20:59 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.1.00.1010071304240.4792@pokey.mtv.corp.google.com>

Le jeudi 07 octobre 2010 à 13:09 -0700, Tom Herbert a écrit :
> The rx->count reference is used to track reference counts to the
> number of rx-queue kobjects created for the device.  This patch
> eliminates initialization of the counter in netif_alloc_rx_queues
> and instead increments the counter each time a kobject is created.
> This is now symmetric with the decrement that is done when an object is
> released.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7d14955..58b31d1 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5026,7 +5026,6 @@ static int netif_alloc_rx_queues(struct net_device *dev)
>  			return -ENOMEM;
>  		}
>  		dev->_rx = rx;
> -		atomic_set(&rx->count, count);
>  
>  		/*
>  		 * Set a pointer to first element in the array which holds the
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index fa81fd0..b143173 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -726,6 +726,7 @@ static struct kobj_type rx_queue_ktype = {
>  static int rx_queue_add_kobject(struct net_device *net, int index)
>  {
>  	struct netdev_rx_queue *queue = net->_rx + index;
> +	struct netdev_rx_queue *first = queue->first;
>  	struct kobject *kobj = &queue->kobj;
>  	int error = 0;
>  
> @@ -738,6 +739,7 @@ static int rx_queue_add_kobject(struct net_device *net, int index)
>  	}
>  
>  	kobject_uevent(kobj, KOBJ_ADD);
> +	atomic_inc(&first->count);
>  
>  	return error;
>  }

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>




^ permalink raw reply

* [PATCH] net: clear heap allocations for privileged ethtool actions
From: Kees Cook @ 2010-10-07 21:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: David S. Miller, Ben Hutchings, Jeff Garzik, Jeff Kirsher,
	Peter P Waskiewicz Jr, netdev

Several other ethtool functions leave heap uncleared (potentially) by
drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
are well controlled. In some situations (e.g. unchecked error conditions),
the heap will remain unchanged in areas before copying back to userspace.
Note that these are less of an issue since these all require CAP_NET_ADMIN.

Cc: stable@kernel.org
Signed-off-by: Kees Cook <kees.cook@canonical.com>
---
 net/core/ethtool.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 7a85367..fb9cf30 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -397,7 +397,7 @@ static noinline_for_stack int ethtool_get_rxfh_indir(struct net_device *dev,
 	    (KMALLOC_MAX_SIZE - sizeof(*indir)) / sizeof(*indir->ring_index))
 		return -ENOMEM;
 	full_size = sizeof(*indir) + sizeof(*indir->ring_index) * table_size;
-	indir = kmalloc(full_size, GFP_USER);
+	indir = kzalloc(full_size, GFP_USER);
 	if (!indir)
 		return -ENOMEM;
 
@@ -538,7 +538,7 @@ static int ethtool_get_rx_ntuple(struct net_device *dev, void __user *useraddr)
 
 	gstrings.len = ret;
 
-	data = kmalloc(gstrings.len * ETH_GSTRING_LEN, GFP_USER);
+	data = kzalloc(gstrings.len * ETH_GSTRING_LEN, GFP_USER);
 	if (!data)
 		return -ENOMEM;
 
@@ -775,7 +775,7 @@ static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
 	if (regs.len > reglen)
 		regs.len = reglen;
 
-	regbuf = kmalloc(reglen, GFP_USER);
+	regbuf = kzalloc(reglen, GFP_USER);
 	if (!regbuf)
 		return -ENOMEM;
 
-- 
1.7.1

-- 
Kees Cook
Ubuntu Security Team

^ permalink raw reply related

* Re: [PATCH] net: clear heap allocations for privileged ethtool actions
From: Eric Dumazet @ 2010-10-07 21:31 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, David S. Miller, Ben Hutchings, Jeff Garzik,
	Jeff Kirsher, Peter P Waskiewicz Jr, netdev
In-Reply-To: <20101007211004.GA20267@outflux.net>

Le jeudi 07 octobre 2010 à 14:10 -0700, Kees Cook a écrit :
> Several other ethtool functions leave heap uncleared (potentially) by
> drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
> are well controlled. In some situations (e.g. unchecked error conditions),
> the heap will remain unchanged in areas before copying back to userspace.
> Note that these are less of an issue since these all require CAP_NET_ADMIN.

> @@ -775,7 +775,7 @@ static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
>  	if (regs.len > reglen)
>  		regs.len = reglen;
>  
> -	regbuf = kmalloc(reglen, GFP_USER);
> +	regbuf = kzalloc(reglen, GFP_USER);
>  	if (!regbuf)
>  		return -ENOMEM;
>  
> -- 
> 1.7.1
> 

Are you sure this is not hiding a more problematic problem ?

Code does :

reglen = ops->get_regs_len(dev);
if (regs.len > reglen)
	regs.len = reglen;
regbuf = kmalloc(reglen, GFP_USER);

So we can not copy back kernel memory.

However, what happens if user provides regs.len = 1 byte, and driver
get_regs() doesnt properly checks regs.len and write past end of regbuf
-> We probably write on other parts of kernel memory

An audit is needed, but first driver I checked is buggy
(drivers/net/qlcnic/qlcnic_ethtool.c)

->
 	memset(p, 0, qlcnic_get_regs_len(dev));




^ permalink raw reply

* Re: [PATCH] net: clear heap allocations for privileged ethtool actions
From: Ben Hutchings @ 2010-10-07 21:34 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, David S. Miller, Jeff Garzik, Jeff Kirsher,
	Peter P Waskiewicz Jr, netdev
In-Reply-To: <20101007211004.GA20267@outflux.net>

On Thu, 2010-10-07 at 14:10 -0700, Kees Cook wrote:
> Several other ethtool functions leave heap uncleared (potentially) by
> drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
> are well controlled. In some situations (e.g. unchecked error conditions),
> the heap will remain unchanged in areas before copying back to userspace.
> Note that these are less of an issue since these all require CAP_NET_ADMIN.
> 
> Cc: stable@kernel.org
> Signed-off-by: Kees Cook <kees.cook@canonical.com>
> ---
>  net/core/ethtool.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 7a85367..fb9cf30 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -397,7 +397,7 @@ static noinline_for_stack int ethtool_get_rxfh_indir(struct net_device *dev,
>  	    (KMALLOC_MAX_SIZE - sizeof(*indir)) / sizeof(*indir->ring_index))
>  		return -ENOMEM;
>  	full_size = sizeof(*indir) + sizeof(*indir->ring_index) * table_size;
> -	indir = kmalloc(full_size, GFP_USER);
> +	indir = kzalloc(full_size, GFP_USER);
>  	if (!indir)
>  		return -ENOMEM;
>  
[...]

Acked-by: Ben Hutchings <bhutchings@solarflare.com>

You could alternately recalculate full_size before copying back to the
user buffer:

	full_size = sizeof(*indir) + sizeof(*indir->ring_index) * indir->size;

but kzalloc() is more obviously safe.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] net: clear heap allocations for privileged ethtool actions
From: Ben Hutchings @ 2010-10-07 21:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kees Cook, linux-kernel, David S. Miller, Jeff Garzik,
	Jeff Kirsher, Peter P Waskiewicz Jr, netdev
In-Reply-To: <1286487085.3745.99.camel@edumazet-laptop>

On Thu, 2010-10-07 at 23:31 +0200, Eric Dumazet wrote:
> Le jeudi 07 octobre 2010 à 14:10 -0700, Kees Cook a écrit :
> > Several other ethtool functions leave heap uncleared (potentially) by
> > drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
> > are well controlled. In some situations (e.g. unchecked error conditions),
> > the heap will remain unchanged in areas before copying back to userspace.
> > Note that these are less of an issue since these all require CAP_NET_ADMIN.
> 
> > @@ -775,7 +775,7 @@ static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
> >  	if (regs.len > reglen)
> >  		regs.len = reglen;
> >  
> > -	regbuf = kmalloc(reglen, GFP_USER);
> > +	regbuf = kzalloc(reglen, GFP_USER);

Actually, I recently changed this to vmalloc() so your patch won't
apply.

> >  	if (!regbuf)
> >  		return -ENOMEM;
> >  
> > -- 
> > 1.7.1
> > 
> 
> Are you sure this is not hiding a more problematic problem ?
> 
> Code does :
> 
> reglen = ops->get_regs_len(dev);
> if (regs.len > reglen)
> 	regs.len = reglen;
> regbuf = kmalloc(reglen, GFP_USER);
> 
> So we can not copy back kernel memory.
> 
> However, what happens if user provides regs.len = 1 byte, and driver
> get_regs() doesnt properly checks regs.len and write past end of regbuf
> -> We probably write on other parts of kernel memory
[...]

Why should the driver's get_regs() check regs.len?  The buffer is
allocated based on reglen which is provided by the driver, not the user.

reglen (length of the kernel buffer) is not reduced; regs.len (length of
the user buffer) is.  That lets the user know how much of the user
buffer was actually used.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] net: clear heap allocations for privileged ethtool actions
From: Kees Cook @ 2010-10-07 21:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-kernel, David S. Miller, Ben Hutchings, Jeff Garzik,
	Jeff Kirsher, Peter P Waskiewicz Jr, netdev
In-Reply-To: <1286487085.3745.99.camel@edumazet-laptop>

Hi Eric,

On Thu, Oct 07, 2010 at 11:31:25PM +0200, Eric Dumazet wrote:
> Le jeudi 07 octobre 2010 à 14:10 -0700, Kees Cook a écrit :
> > Several other ethtool functions leave heap uncleared (potentially) by
> > drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
> > are well controlled. In some situations (e.g. unchecked error conditions),
> > the heap will remain unchanged in areas before copying back to userspace.
> > Note that these are less of an issue since these all require CAP_NET_ADMIN.
> 
> > @@ -775,7 +775,7 @@ static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
> >  	if (regs.len > reglen)
> >  		regs.len = reglen;
> >  
> > -	regbuf = kmalloc(reglen, GFP_USER);
> > +	regbuf = kzalloc(reglen, GFP_USER);
> >  	if (!regbuf)
> >  		return -ENOMEM;
> >  
> > -- 
> > 1.7.1
> > 
> 
> Are you sure this is not hiding a more problematic problem ?
> 
> Code does :
> 
> reglen = ops->get_regs_len(dev);
> if (regs.len > reglen)
> 	regs.len = reglen;
> regbuf = kmalloc(reglen, GFP_USER);
> 
> So we can not copy back kernel memory.
> 
> However, what happens if user provides regs.len = 1 byte, and driver
> get_regs() doesnt properly checks regs.len and write past end of regbuf
> -> We probably write on other parts of kernel memory

This code is basically a max() call from what I see.

regbuf = kmalloc(max(regs.len, ops->get_regs_len(dev)), GFP_USER);

If the user passes regs.len = 1, it will be ignored in favor of reglen,
so we'll not write past the end of regbuf, unless I'm misunderstanding.

-Kees

-- 
Kees Cook
Ubuntu Security Team

^ permalink raw reply

* IPv4: sysctl table check failed [was: mmotm 2010-10-07-14-08 uploaded]
From: Jiri Slaby @ 2010-10-07 22:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, mm-commits, ML netdev, David S. Miller
In-Reply-To: <201010072140.o97Le69i025659@imap1.linux-foundation.org>

On 10/07/2010 11:08 PM, akpm@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2010-10-07-14-08 has been uploaded to

Hi, I got bunch of "sysctl table check failed" below. All seem to be
related to ipv4:

sysctl table check failed: /net/ipv4/tcp_mem  No min
Pid: 1, comm: swapper Not tainted 2.6.36-rc7-mm1_64+ #1285
Call Trace:
 [<ffffffff8108d444>] set_fail+0xa4/0xf0
 [<ffffffff8108d736>] sysctl_check_table+0x2a6/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff81072204>] __register_sysctl_paths+0xf4/0x320
 [<ffffffff815c1232>] ? printk+0x3c/0x42
 [<ffffffff8115d3bc>] ? sysfs_add_file+0xc/0x10
 [<ffffffff818b18bd>] ? sysctl_ipv4_init+0x0/0x87
 [<ffffffff81072456>] register_sysctl_paths+0x26/0x30
 [<ffffffff818b18fd>] sysctl_ipv4_init+0x40/0x87
 [<ffffffff810002df>] do_one_initcall+0x3f/0x170
 [<ffffffff81884d44>] kernel_init+0x158/0x1e2
 [<ffffffff8102fb54>] kernel_thread_helper+0x4/0x10
 [<ffffffff81884bec>] ? kernel_init+0x0/0x1e2
 [<ffffffff8102fb50>] ? kernel_thread_helper+0x0/0x10
sysctl table check failed: /net/ipv4/tcp_mem  No max
Pid: 1, comm: swapper Not tainted 2.6.36-rc7-mm1_64+ #1285
Call Trace:
 [<ffffffff8108d444>] set_fail+0xa4/0xf0
 [<ffffffff8108d4da>] sysctl_check_table+0x4a/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff81072204>] __register_sysctl_paths+0xf4/0x320
 [<ffffffff815c1232>] ? printk+0x3c/0x42
 [<ffffffff8115d3bc>] ? sysfs_add_file+0xc/0x10
 [<ffffffff818b18bd>] ? sysctl_ipv4_init+0x0/0x87
 [<ffffffff81072456>] register_sysctl_paths+0x26/0x30
 [<ffffffff818b18fd>] sysctl_ipv4_init+0x40/0x87
 [<ffffffff810002df>] do_one_initcall+0x3f/0x170
 [<ffffffff81884d44>] kernel_init+0x158/0x1e2
 [<ffffffff8102fb54>] kernel_thread_helper+0x4/0x10
 [<ffffffff81884bec>] ? kernel_init+0x0/0x1e2
 [<ffffffff8102fb50>] ? kernel_thread_helper+0x0/0x10
sysctl table check failed: /net/ipv4/udp_mem  No min
Pid: 1, comm: swapper Not tainted 2.6.36-rc7-mm1_64+ #1285
Call Trace:
 [<ffffffff8108d444>] set_fail+0xa4/0xf0
 [<ffffffff8108d736>] sysctl_check_table+0x2a6/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff81072204>] __register_sysctl_paths+0xf4/0x320
 [<ffffffff815c1232>] ? printk+0x3c/0x42
 [<ffffffff8115d3bc>] ? sysfs_add_file+0xc/0x10
 [<ffffffff818b18bd>] ? sysctl_ipv4_init+0x0/0x87
 [<ffffffff81072456>] register_sysctl_paths+0x26/0x30
 [<ffffffff818b18fd>] sysctl_ipv4_init+0x40/0x87
 [<ffffffff810002df>] do_one_initcall+0x3f/0x170
 [<ffffffff81884d44>] kernel_init+0x158/0x1e2
 [<ffffffff8102fb54>] kernel_thread_helper+0x4/0x10
 [<ffffffff81884bec>] ? kernel_init+0x0/0x1e2
 [<ffffffff8102fb50>] ? kernel_thread_helper+0x0/0x10
sysctl table check failed: /net/ipv4/udp_mem  No max
Pid: 1, comm: swapper Not tainted 2.6.36-rc7-mm1_64+ #1285
Call Trace:
 [<ffffffff8108d444>] set_fail+0xa4/0xf0
 [<ffffffff8108d4da>] sysctl_check_table+0x4a/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
 [<ffffffff81072204>] __register_sysctl_paths+0xf4/0x320
 [<ffffffff815c1232>] ? printk+0x3c/0x42
 [<ffffffff8115d3bc>] ? sysfs_add_file+0xc/0x10
 [<ffffffff818b18bd>] ? sysctl_ipv4_init+0x0/0x87
 [<ffffffff81072456>] register_sysctl_paths+0x26/0x30
 [<ffffffff818b18fd>] sysctl_ipv4_init+0x40/0x87
 [<ffffffff810002df>] do_one_initcall+0x3f/0x170
 [<ffffffff81884d44>] kernel_init+0x158/0x1e2
 [<ffffffff8102fb54>] kernel_thread_helper+0x4/0x10
 [<ffffffff81884bec>] ? kernel_init+0x0/0x1e2
 [<ffffffff8102fb50>] ? kernel_thread_helper+0x0/0x10

regards,
-- 
js

^ permalink raw reply

* Re: IPv4: sysctl table check failed [was: mmotm 2010-10-07-14-08 uploaded]
From: Eric Dumazet @ 2010-10-07 22:22 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: linux-kernel, akpm, mm-commits, ML netdev, David S. Miller
In-Reply-To: <4CAE4479.6010606@gmail.com>

Le vendredi 08 octobre 2010 à 00:06 +0200, Jiri Slaby a écrit :
> On 10/07/2010 11:08 PM, akpm@linux-foundation.org wrote:
> > The mm-of-the-moment snapshot 2010-10-07-14-08 has been uploaded to
> 
> Hi, I got bunch of "sysctl table check failed" below. All seem to be
> related to ipv4:

I would say, sysctl check is buggy :(

min/max are optional

[PATCH] sysctl: min/max bounds are optional

sysctl check complains when proc_doulongvec_minmax or
proc_doulongvec_ms_jiffies_minmax are used by a vector of longs (with
more than one element), with no min or max value specified.

This is unexpected, given we had a bug on this min/max handling :)

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 kernel/sysctl_check.c |    9 ---------
 1 file changed, 9 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 04cdcf7..10b90d8 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -143,15 +143,6 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
 				if (!table->maxlen)
 					set_fail(&fail, table, "No maxlen");
 			}
-			if ((table->proc_handler == proc_doulongvec_minmax) ||
-			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
-				if (table->maxlen > sizeof (unsigned long)) {
-					if (!table->extra1)
-						set_fail(&fail, table, "No min");
-					if (!table->extra2)
-						set_fail(&fail, table, "No max");
-				}
-			}
 #ifdef CONFIG_PROC_SYSCTL
 			if (table->procname && !table->proc_handler)
 				set_fail(&fail, table, "No proc_handler");



^ permalink raw reply related

* Re: IPv4: sysctl table check failed [was: mmotm 2010-10-07-14-08 uploaded]
From: Andrew Morton @ 2010-10-07 22:22 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: linux-kernel, mm-commits, ML netdev, David S. Miller,
	Eric Dumazet, Eric W. Biederman
In-Reply-To: <4CAE4479.6010606@gmail.com>

On Fri, 08 Oct 2010 00:06:49 +0200
Jiri Slaby <jirislaby@gmail.com> wrote:

> On 10/07/2010 11:08 PM, akpm@linux-foundation.org wrote:
> > The mm-of-the-moment snapshot 2010-10-07-14-08 has been uploaded to
> 
> Hi, I got bunch of "sysctl table check failed" below. All seem to be
> related to ipv4:
> 
> sysctl table check failed: /net/ipv4/tcp_mem  No min
> Pid: 1, comm: swapper Not tainted 2.6.36-rc7-mm1_64+ #1285
> Call Trace:
>  [<ffffffff8108d444>] set_fail+0xa4/0xf0
>  [<ffffffff8108d736>] sysctl_check_table+0x2a6/0x310
>  [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
>  [<ffffffff8108d4eb>] sysctl_check_table+0x5b/0x310
>  [<ffffffff81072204>] __register_sysctl_paths+0xf4/0x320
>  [<ffffffff815c1232>] ? printk+0x3c/0x42
>  [<ffffffff8115d3bc>] ? sysfs_add_file+0xc/0x10
>  [<ffffffff818b18bd>] ? sysctl_ipv4_init+0x0/0x87
>  [<ffffffff81072456>] register_sysctl_paths+0x26/0x30
>  [<ffffffff818b18fd>] sysctl_ipv4_init+0x40/0x87
>  [<ffffffff810002df>] do_one_initcall+0x3f/0x170
>  [<ffffffff81884d44>] kernel_init+0x158/0x1e2
>  [<ffffffff8102fb54>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81884bec>] ? kernel_init+0x0/0x1e2
>  [<ffffffff8102fb50>] ? kernel_thread_helper+0x0/0x10

OK, thanks.

Eric D's net-avoid-limits-overflow.patch switched tcp_mem and udp_mem
from proc_dointvec() to proc_doulongvec_minmax().  And
sysctl_check_table() checks `min' and `max' for
proc_doulongvec_minmax() but not for proc_dointvec().

I'm not sure which Eric to blame ;) .min and .max are optional, so
perhaps the check is wrong?


^ permalink raw reply

* Re: IPv4: sysctl table check failed [was: mmotm 2010-10-07-14-08 uploaded]
From: Andrew Morton @ 2010-10-07 22:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jiri Slaby, linux-kernel, mm-commits, ML netdev, David S. Miller,
	Eric W. Biederman
In-Reply-To: <1286490135.6536.75.camel@edumazet-laptop>

On Fri, 08 Oct 2010 00:22:15 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le vendredi 08 octobre 2010 __ 00:06 +0200, Jiri Slaby a __crit :
> > On 10/07/2010 11:08 PM, akpm@linux-foundation.org wrote:
> > > The mm-of-the-moment snapshot 2010-10-07-14-08 has been uploaded to
> > 
> > Hi, I got bunch of "sysctl table check failed" below. All seem to be
> > related to ipv4:
> 
> I would say, sysctl check is buggy :(
> 
> min/max are optional
> 
> [PATCH] sysctl: min/max bounds are optional
> 
> sysctl check complains when proc_doulongvec_minmax or
> proc_doulongvec_ms_jiffies_minmax are used by a vector of longs (with
> more than one element), with no min or max value specified.
> 
> This is unexpected, given we had a bug on this min/max handling :)
> 
> Reported-by: Jiri Slaby <jirislaby@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  kernel/sysctl_check.c |    9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
> index 04cdcf7..10b90d8 100644
> --- a/kernel/sysctl_check.c
> +++ b/kernel/sysctl_check.c
> @@ -143,15 +143,6 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
>  				if (!table->maxlen)
>  					set_fail(&fail, table, "No maxlen");
>  			}
> -			if ((table->proc_handler == proc_doulongvec_minmax) ||
> -			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
> -				if (table->maxlen > sizeof (unsigned long)) {
> -					if (!table->extra1)
> -						set_fail(&fail, table, "No min");
> -					if (!table->extra2)
> -						set_fail(&fail, table, "No max");
> -				}
> -			}
>  #ifdef CONFIG_PROC_SYSCTL
>  			if (table->procname && !table->proc_handler)
>  				set_fail(&fail, table, "No proc_handler");

That will probably fix it ;)

net-avoid-limits-overflow.patch is dependent on this patch.  Unless
Eric B squeaks I'll plan on sending this patch in for 2.6.37.


^ permalink raw reply

* [PATCH] ehea: Fix a checksum issue on the receive path
From: leitao @ 2010-10-07 23:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, Breno Leitao, Jay Vosburgh

Currently we set all skbs with CHECKSUM_UNNECESSARY, even
those whose protocol we don't know. This patch just
add the CHECKSUM_COMPLETE tag for non TCP/UDP packets.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
---
 drivers/net/ehea/ehea_main.c |    9 ++++++++-
 drivers/net/ehea/ehea_qmr.h  |    1 +
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 0471cae..45fd045 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -533,8 +533,15 @@ static inline void ehea_fill_skb(struct net_device *dev,
 	int length = cqe->num_bytes_transfered - 4;	/*remove CRC */
 
 	skb_put(skb, length);
-	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	skb->protocol = eth_type_trans(skb, dev);
+
+	/* The packet was not an IPV4 packet so a complemented checksum was
+	   calculated. The value is found in the Internet Checksum field. */
+	if (cqe->status & EHEA_CQE_BLIND_CKSUM) {
+		skb->ip_summed = CHECKSUM_COMPLETE;
+		skb->csum = csum_unfold(~cqe->inet_checksum_value);
+	} else
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
 }
 
 static inline struct sk_buff *get_skb_by_index(struct sk_buff **skb_array,
diff --git a/drivers/net/ehea/ehea_qmr.h b/drivers/net/ehea/ehea_qmr.h
index f608a6c..3810473 100644
--- a/drivers/net/ehea/ehea_qmr.h
+++ b/drivers/net/ehea/ehea_qmr.h
@@ -150,6 +150,7 @@ struct ehea_rwqe {
 #define EHEA_CQE_TYPE_RQ           0x60
 #define EHEA_CQE_STAT_ERR_MASK     0x700F
 #define EHEA_CQE_STAT_FAT_ERR_MASK 0xF
+#define EHEA_CQE_BLIND_CKSUM       0x8000
 #define EHEA_CQE_STAT_ERR_TCP      0x4000
 #define EHEA_CQE_STAT_ERR_IP       0x2000
 #define EHEA_CQE_STAT_ERR_CRC      0x1000
-- 
1.6.5


^ permalink raw reply related

* [PATCH] ehea: simplify conditional
From: Nicolas Kaiser @ 2010-10-07 23:14 UTC (permalink / raw)
  To: Breno Leitao; +Cc: netdev, linux-kernel

Simplify: ((a && b) || (!a && !b)) => (a == b)

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
---
 drivers/net/ehea/ehea_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 190fb69..ef305f3 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -1914,7 +1914,7 @@ static void ehea_promiscuous(struct net_device *dev, int enable)
 	struct hcp_ehea_port_cb7 *cb7;
 	u64 hret;
 
-	if ((enable && port->promisc) || (!enable && !port->promisc))
+	if (enable == port->promisc)
 		return;
 
 	cb7 = (void *)get_zeroed_page(GFP_ATOMIC);
-- 
1.7.2.2

^ permalink raw reply related

* Re: [PATCH] ehea: simplify conditional
From: Breno Leitao @ 2010-10-07 23:49 UTC (permalink / raw)
  To: Nicolas Kaiser; +Cc: netdev, linux-kernel
In-Reply-To: <20101008011450.0126ffc0@absol.kitzblitz>

On 10/07/2010 08:14 PM, Nicolas Kaiser wrote:
> Simplify: ((a&&  b) || (!a&&  !b)) =>  (a == b)
>
> Signed-off-by: Nicolas Kaiser<nikai@nikai.net>
Acked-by: Breno Leitao <leitao@linux.vnet.ibm.com>

^ permalink raw reply

* [PATCH 2/2 net-next] bnx2: Enable AER on PCIE devices only
From: Michael Chan @ 2010-10-07 23:42 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1286494973-5115-1-git-send-email-mchan@broadcom.com>

To prevent unnecessary error message.  pci_save_state() is also moved to
the end of ->probe() so that all PCI config, including AER state, will be
saved.

Update version to 2.0.18.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <mchan@broadcom.com>
---
 drivers/net/bnx2.c |   36 ++++++++++++++++++++++--------------
 1 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 56f3dfe..ae894bc 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -59,8 +59,8 @@
 #include "bnx2_fw.h"
 
 #define DRV_MODULE_NAME		"bnx2"
-#define DRV_MODULE_VERSION	"2.0.17"
-#define DRV_MODULE_RELDATE	"July 18, 2010"
+#define DRV_MODULE_VERSION	"2.0.18"
+#define DRV_MODULE_RELDATE	"Oct 7, 2010"
 #define FW_MIPS_FILE_06		"bnx2/bnx2-mips-06-6.0.15.fw"
 #define FW_RV2P_FILE_06		"bnx2/bnx2-rv2p-06-6.0.15.fw"
 #define FW_MIPS_FILE_09		"bnx2/bnx2-mips-09-6.0.17.fw"
@@ -7915,16 +7915,7 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 		goto err_out_disable;
 	}
 
-	/* AER (Advanced Error Reporting) hooks */
-	err = pci_enable_pcie_error_reporting(pdev);
-	if (err) {
-		dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
-				    "0x%x\n", err);
-		/* non-fatal, continue */
-	}
-
 	pci_set_master(pdev);
-	pci_save_state(pdev);
 
 	bp->pm_cap = pci_find_capability(pdev, PCI_CAP_ID_PM);
 	if (bp->pm_cap == 0) {
@@ -7979,6 +7970,15 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 		bp->flags |= BNX2_FLAG_PCIE;
 		if (CHIP_REV(bp) == CHIP_REV_Ax)
 			bp->flags |= BNX2_FLAG_JUMBO_BROKEN;
+
+		/* AER (Advanced Error Reporting) hooks */
+		err = pci_enable_pcie_error_reporting(pdev);
+		if (err) {
+			dev_err(&pdev->dev, "pci_enable_pcie_error_reporting "
+					    "failed 0x%x\n", err);
+			/* non-fatal, continue */
+		}
+
 	} else {
 		bp->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
 		if (bp->pcix_cap == 0) {
@@ -8235,16 +8235,20 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 	bp->timer.data = (unsigned long) bp;
 	bp->timer.function = bnx2_timer;
 
+	pci_save_state(pdev);
+
 	return 0;
 
 err_out_unmap:
+	if (bp->flags & BNX2_FLAG_PCIE)
+		pci_disable_pcie_error_reporting(pdev);
+
 	if (bp->regview) {
 		iounmap(bp->regview);
 		bp->regview = NULL;
 	}
 
 err_out_release:
-	pci_disable_pcie_error_reporting(pdev);
 	pci_release_regions(pdev);
 
 err_out_disable:
@@ -8434,9 +8438,10 @@ bnx2_remove_one(struct pci_dev *pdev)
 
 	kfree(bp->temp_stats_blk);
 
-	free_netdev(dev);
+	if (bp->flags & BNX2_FLAG_PCIE)
+		pci_disable_pcie_error_reporting(pdev);
 
-	pci_disable_pcie_error_reporting(pdev);
+	free_netdev(dev);
 
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
@@ -8550,6 +8555,9 @@ static pci_ers_result_t bnx2_io_slot_reset(struct pci_dev *pdev)
 	}
 	rtnl_unlock();
 
+	if (!(bp->flags & BNX2_FLAG_PCIE))
+		return result;
+
 	err = pci_cleanup_aer_uncorrect_error_status(pdev);
 	if (err) {
 		dev_err(&pdev->dev,
-- 
1.6.4.GIT



^ permalink raw reply related

* Re: [PATCHv4 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address
From: Herbert Xu @ 2010-10-08  0:42 UTC (permalink / raw)
  To: Arnaud Ebalard; +Cc: David S. Miller, Eric Dumazet, Hideaki YOSHIFUJI, netdev
In-Reply-To: <87fwwhoj73.fsf@small.ssi.corp>

On Thu, Oct 07, 2010 at 10:13:04PM +0200, Arnaud Ebalard wrote:
> 
> I think I will try the following alternative approach based on your
> comments and proposals:
> 
>  - drop my patch to change spi hash computation
>  - handle destination address remapping during input upon failure of
>    xfrm_state_lookup()
>  - handle source address remapping as it is currently done in the patch,
>    i.e. by comparing received one against x->props.saddr once the
>    state found and do 

I'm fine with this from IPsec's point-of-view.

On the other hand, if it is at all possible to setup these remapping
entries in the SADB beforehand, that would definitely be my
preferred solution.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH] remove leftover rcu_read_unlock calls from __mkroute_output
From: Dimitris Michailidis @ 2010-10-08  0:48 UTC (permalink / raw)
  To: eric.dumazet, netdev

Commit "fib: RCU conversion of fib_lookup()" removed rcu_read_lock() from
__mkroute_output but left a couple of calls to rcu_read_unlock() in there.
This causes lockdep to complain that the rcu_read_unlock() call in
__ip_route_output_key causes a lock inbalance and quickly crashes the
kernel. The below fixes this for me.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
---
 net/ipv4/route.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 7864d0c..3888f6b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2396,12 +2396,10 @@ static int __mkroute_output(struct rtable **result,
 
 
 	rth = dst_alloc(&ipv4_dst_ops);
-	if (!rth) {
-		rcu_read_unlock();
+	if (!rth)
 		return -ENOBUFS;
-	}
+
 	in_dev_hold(in_dev);
-	rcu_read_unlock();
 	rth->idev = in_dev;
 
 	atomic_set(&rth->dst.__refcnt, 1);
-- 
1.5.4


^ permalink raw reply related

* Re: IPv4: sysctl table check failed [was: mmotm 2010-10-07-14-08 uploaded]
From: Eric W. Biederman @ 2010-10-08  0:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric Dumazet, Jiri Slaby, linux-kernel, mm-commits, ML netdev,
	David S. Miller
In-Reply-To: <20101007152806.119d1522.akpm@linux-foundation.org>

Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 08 Oct 2010 00:22:15 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> Le vendredi 08 octobre 2010 __ 00:06 +0200, Jiri Slaby a __crit :
>> > On 10/07/2010 11:08 PM, akpm@linux-foundation.org wrote:
>> > > The mm-of-the-moment snapshot 2010-10-07-14-08 has been uploaded to
>> > 
>> > Hi, I got bunch of "sysctl table check failed" below. All seem to be
>> > related to ipv4:
>> 
>> I would say, sysctl check is buggy :(
>> 
>> min/max are optional
>> 
>> [PATCH] sysctl: min/max bounds are optional
>> 
>> sysctl check complains when proc_doulongvec_minmax or
>> proc_doulongvec_ms_jiffies_minmax are used by a vector of longs (with
>> more than one element), with no min or max value specified.
>> 
>> This is unexpected, given we had a bug on this min/max handling :)
>> 
>> Reported-by: Jiri Slaby <jirislaby@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> ---
>>  kernel/sysctl_check.c |    9 ---------
>>  1 file changed, 9 deletions(-)
>> 
>> diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
>> index 04cdcf7..10b90d8 100644
>> --- a/kernel/sysctl_check.c
>> +++ b/kernel/sysctl_check.c
>> @@ -143,15 +143,6 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
>>  				if (!table->maxlen)
>>  					set_fail(&fail, table, "No maxlen");
>>  			}
>> -			if ((table->proc_handler == proc_doulongvec_minmax) ||
>> -			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
>> -				if (table->maxlen > sizeof (unsigned long)) {
>> -					if (!table->extra1)
>> -						set_fail(&fail, table, "No min");
>> -					if (!table->extra2)
>> -						set_fail(&fail, table, "No max");
>> -				}
>> -			}
>>  #ifdef CONFIG_PROC_SYSCTL
>>  			if (table->procname && !table->proc_handler)
>>  				set_fail(&fail, table, "No proc_handler");
>
> That will probably fix it ;)
>
> net-avoid-limits-overflow.patch is dependent on this patch.  Unless
> Eric B squeaks I'll plan on sending this patch in for 2.6.37.

Oh.  I see. I actually had a sanity check for the case that was failing.
I probably spotted the buggy code and wanted to see if there was
anything that cared.

So sysctl_check was perfectly correct until the bug was removed from
proc_doulongvec_minmax.  Which also means we have been auditing the
kernel for quite a while to make certain that it is safe not to
increment min and max.

Eric

^ permalink raw reply

* Re: [PATCH] ehea: Fix a checksum issue on the receive path
From: Eric Dumazet @ 2010-10-08  4:45 UTC (permalink / raw)
  To: leitao; +Cc: davem, netdev, Jay Vosburgh
In-Reply-To: <1286493453-21784-1-git-send-email-leitao@linux.vnet.ibm.com>

Le jeudi 07 octobre 2010 à 19:17 -0400, leitao@linux.vnet.ibm.com a
écrit :
> Currently we set all skbs with CHECKSUM_UNNECESSARY, even
> those whose protocol we don't know. This patch just
> add the CHECKSUM_COMPLETE tag for non TCP/UDP packets.
> 
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> ---
>  drivers/net/ehea/ehea_main.c |    9 ++++++++-
>  drivers/net/ehea/ehea_qmr.h  |    1 +
>  2 files changed, 9 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
> index 0471cae..45fd045 100644
> --- a/drivers/net/ehea/ehea_main.c
> +++ b/drivers/net/ehea/ehea_main.c
> @@ -533,8 +533,15 @@ static inline void ehea_fill_skb(struct net_device *dev,
>  	int length = cqe->num_bytes_transfered - 4;	/*remove CRC */
>  
>  	skb_put(skb, length);
> -	skb->ip_summed = CHECKSUM_UNNECESSARY;
>  	skb->protocol = eth_type_trans(skb, dev);
> +
> +	/* The packet was not an IPV4 packet so a complemented checksum was
> +	   calculated. The value is found in the Internet Checksum field. */
> +	if (cqe->status & EHEA_CQE_BLIND_CKSUM) {
> +		skb->ip_summed = CHECKSUM_COMPLETE;
> +		skb->csum = csum_unfold(~cqe->inet_checksum_value);
> +	} else
> +		skb->ip_summed = CHECKSUM_UNNECESSARY;
>  }
>  

Hi Breno

Just to be clear : packets with wrong checksums are not given to upper
stack, so a tcpdump can not display them ? I am not sure many drivers do
that.

(EHEA_CQE_STAT_ERR_TCP, EHEA_CQE_STAT_ERR_IP)

Thanks !



^ permalink raw reply

* Re: [PATCH] remove leftover rcu_read_unlock calls from __mkroute_output
From: Eric Dumazet @ 2010-10-08  4:47 UTC (permalink / raw)
  To: Dimitris Michailidis; +Cc: netdev
In-Reply-To: <1286498918-30636-1-git-send-email-dm@chelsio.com>

Le jeudi 07 octobre 2010 à 17:48 -0700, Dimitris Michailidis a écrit :
> Commit "fib: RCU conversion of fib_lookup()" removed rcu_read_lock() from
> __mkroute_output but left a couple of calls to rcu_read_unlock() in there.
> This causes lockdep to complain that the rcu_read_unlock() call in
> __ip_route_output_key causes a lock inbalance and quickly crashes the
> kernel. The below fixes this for me.
> 
> Signed-off-by: Dimitris Michailidis <dm@chelsio.com>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks Dimitris !



^ permalink raw reply

* Re: [RFC] net: allow FEC driver to not have attached PHY
From: Greg Ungerer @ 2010-10-08  5:57 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev
In-Reply-To: <201010071524.43789.florian@openwrt.org>


Hi Florian,

On 07/10/10 23:24, Florian Fainelli wrote:
> On Thursday 07 October 2010 05:50:16 Greg Ungerer wrote:
>> Hi All,
>>
>> I have a board with a ColdFire SoC on it with the built-in FEC
>> ethernet module. On this hardware the FEC eth output is directly
>> attached to a RTL8305 4-port 10/100 switch. There is no conventional
>> PHY, the FEC output is direct into the uplink port of the switch
>> chip.
>>
>> This setup doesn't work after the FEC code was switch to using
>> phylib. The driver used to have code to bypass phy detection/setup
>> for this particular board. The phylib probe finds nothing, and of
>> course sets a no-link condition.
>>
>> So, what is the cleanest way to support this?
>
> If phy detection fails and you cannot attach to a known PHY driver, you could
> register a fixed-PHY driver wich will report the link to be up. I had to do
> something like this for the cpmac driver[1] where various switches and
> external PHY configurations are supported.
>
> [1]:
> https://dev.openwrt.org/browser/trunk/target/linux/ar7/patches-2.6.35/972-
> cpmac_multi_probe.patch

Ah yes, that looks like exactly what I need.
Thanks for the pointer.

Regards
Greg



>> The attached patch adds a config option to do this sort of generically
>> for the FEC driver. But I am wondering if there isn't a better way?
>>
>> Regards
>> Greg
>>
>>
>> ---
>>
>> [RFC] net: allow FEC driver to not have attached PHY
>>
>> At least one board using the FEC driver does not have a conventional
>> PHY attached to it, it is directly connected to a somewhat simple
>> ethernet switch (the board is the SnapGear/LITE, and the attached
>> 4-port ethernet switch is a RealTek RTL8305). This switch does not
>> present the usual register interface of a PHY, it presents nothing.
>> So a PHY scan will find nothing.
>>
>> After the FEC driver was changed to use phylib for supporting phys
>> it no longer works on this particular board/switch setup.
>>
>> Add a config option to allow configuring the FEC driver to not expect
>> a PHY to be present.
>>
>> Signed-off-by: Greg Ungerer<gerg@uclinux.org>
>> ---
>>   drivers/net/Kconfig |    9 +++++++++
>>   drivers/net/fec.c   |    7 +++++++
>>   2 files changed, 16 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
>> index 93494e2..ee44728 100644
>> --- a/drivers/net/Kconfig
>> +++ b/drivers/net/Kconfig
>> @@ -1934,6 +1934,15 @@ config FEC2
>>   	  Say Y here if you want to use the second built-in 10/100 Fast
>>   	  ethernet controller on some Motorola ColdFire processors.
>>
>> +config FEC_NOPHY
>> +	bool "FEC has no attached PHY"
>> +	depends on FEC
>> +	help
>> +	  Some boards using the FEC driver may not have a PHY directly
>> +	  attached to it. Typically in this scenario the FEC output is
>> +	  directly connected to the input of an ethernet switch or hub.
>> +	  Say Y here if your hardware is like this.
>> +
>>   config FEC_MPC52xx
>>   	tristate "MPC52xx FEC driver"
>>   	depends on PPC_MPC52xx&&  PPC_BESTCOMM
>> diff --git a/drivers/net/fec.c b/drivers/net/fec.c
>> index 768b840..3637f89 100644
>> --- a/drivers/net/fec.c
>> +++ b/drivers/net/fec.c
>> @@ -910,6 +910,11 @@ fec_enet_open(struct net_device *dev)
>>   	if (ret)
>>   		return ret;
>>
>> +#ifdef CONFIG_FEC_NOPHY
>> +	/* No PHY connected, assume link is always up */
>> +	fep->link = 1;
>> +	fec_restart(dev, 0);
>> +#else
>>   	/* Probe and connect to PHY when open the interface */
>>   	ret = fec_enet_mii_probe(dev);
>>   	if (ret) {
>> @@ -917,6 +922,8 @@ fec_enet_open(struct net_device *dev)
>>   		return ret;
>>   	}
>>   	phy_start(fep->phy_dev);
>> +#endif
>> +
>>   	netif_start_queue(dev);
>>   	fep->opened = 1;
>>   	return 0;
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


-- 
------------------------------------------------------------------------
Greg Ungerer  --  Principal Engineer        EMAIL:     gerg@snapgear.com
SnapGear Group, McAfee                      PHONE:       +61 7 3435 2888
8 Gardner Close                             FAX:         +61 7 3217 5323
Milton, QLD, 4064, Australia                WEB: http://www.SnapGear.com

^ permalink raw reply

* Re: [RFC] net: allow FEC driver to not have attached PHY
From: Greg Ungerer @ 2010-10-08  5:58 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev
In-Reply-To: <1286457071.2271.4.camel@achroite.uk.solarflarecom.com>

Hi Ben,

On 07/10/10 23:11, Ben Hutchings wrote:
> On Thu, 2010-10-07 at 13:50 +1000, Greg Ungerer wrote:
>> Hi All,
>>
>> I have a board with a ColdFire SoC on it with the built-in FEC
>> ethernet module. On this hardware the FEC eth output is directly
>> attached to a RTL8305 4-port 10/100 switch. There is no conventional
>> PHY, the FEC output is direct into the uplink port of the switch
>> chip.
>>
>> This setup doesn't work after the FEC code was switch to using
>> phylib. The driver used to have code to bypass phy detection/setup
>> for this particular board. The phylib probe finds nothing, and of
>> course sets a no-link condition.
>>
>> So, what is the cleanest way to support this?
>>
>> The attached patch adds a config option to do this sort of generically
>> for the FEC driver. But I am wondering if there isn't a better way?
> [...]
>
> Perhaps there could be a null PHY driver which does no MDIO and always
> reports link-up at the expected speed and duplex.  We used something
> like that in the sfc driver for some PHY-less boards (though we use our
> own PHY abstraction, not phylib).

Looks like that is what the FIXED_PHY does.,

Thanks
Greg


------------------------------------------------------------------------
Greg Ungerer  --  Principal Engineer        EMAIL:     gerg@snapgear.com
SnapGear Group, McAfee                      PHONE:       +61 7 3435 2888
8 Gardner Close                             FAX:         +61 7 3217 5323
Milton, QLD, 4064, Australia                WEB: http://www.SnapGear.com

^ permalink raw reply

* [PATCH] net_sched: use __TCA_HTB_MAX and TCA_HTB_MAX
From: Changli Gao @ 2010-09-30 16:17 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: David S. Miller, netdev, Changli Gao

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
 net/sched/sch_htb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 4be8d04..87d9449 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1302,14 +1302,14 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 	struct htb_class *cl = (struct htb_class *)*arg, *parent;
 	struct nlattr *opt = tca[TCA_OPTIONS];
 	struct qdisc_rate_table *rtab = NULL, *ctab = NULL;
-	struct nlattr *tb[TCA_HTB_RTAB + 1];
+	struct nlattr *tb[__TCA_HTB_MAX];
 	struct tc_htb_opt *hopt;
 
 	/* extract all subattrs from opt attr */
 	if (!opt)
 		goto failure;
 
-	err = nla_parse_nested(tb, TCA_HTB_RTAB, opt, htb_policy);
+	err = nla_parse_nested(tb, TCA_HTB_MAX, opt, htb_policy);
 	if (err < 0)
 		goto failure;
 

^ permalink raw reply related

* Re: powerpc, fs_enet: scanning PHY after Linux is up
From: Holger brunck @ 2010-10-08  8:50 UTC (permalink / raw)
  To: Grant Likely; +Cc: hs, linuxppc-dev, netdev, devicetree-discuss, Detlev Zundel
In-Reply-To: <AANLkTi=GkkD_-Vu-NswNedhgVuPaYePOHWa_2ytQgMf_@mail.gmail.com>

Hi Grant,

On 10/06/2010 06:52 PM, Grant Likely wrote:
> On Wed, Oct 6, 2010 at 3:53 AM, Heiko Schocher <hs@denx.de> wrote:
>>>> So, the question is, is there a possibility to solve this problem?
>>>>
>>>> If there is no standard option, what would be with adding a
>>>> "scan_phy" file in
>>>>
>>>> /proc/device-tree/soc\@f0000000/cpm\@119c0/mdio\@10d40
>>>> (or better destination?)
>>>>
>>>> which with we could rescan a PHY with
>>>> "echo addr > /proc/device-tree/soc\@f0000000/cpm\@119c0/mdio\@10d40/scan_phy"
>>>> (so there is no need for using of_find_node_by_path(), as we should
>>>>  have the associated device node here, and can step through the child
>>>>  nodes with "for_each_child_of_node(np, child)" and check if reg == addr)
>>>>
>>>> or shouldn;t be at least, if the phy couldn;t be found when opening
>>>> the port, retrigger a scanning, if the phy now is accessible?
>>>
>>> One option would be to still register a phy_device for each phy
>>> described in the device tree, but defer binding a driver to each phy
>>> that doesn't respond.  Then at of_phy_find_device() time, if it
>>
>> Maybe I din;t get the trick, but the problem is, that
>> you can;t register a phy_device in drivers/of/of_mdio.c
>> of_mdiobus_register(), if the phy didn;t respond with the
>> phy_id ... and of_phy_find_device() is not (yet) used in fs_enet
> 
> I'm suggesting modifying the phy layer so that it is possible to
> register a phy_device that doesn't (yet) respond.
> 

yes this sounds reasonable.

>>> matches with a phy_device that isn't bound to a driver yet, then
>>> re-trigger the binding operation.  At which point the phy id can be
>>> probed and the correct driver can be chosen.  If binding succeeds,
>>> then return the phy_device handle.  If not, then fail as it currently
>>> does.
>>
>> Wouldn;t it be good, just if we need a PHY (on calling fs_enet_open)
>> to look if there is one?
>>
>> Something like that (not tested):
>>
>> in drivers/net/fs_enet/fs_enet-main.c in fs_init_phy()
>> called from fs_enet_open():
>>
>> Do first:
>> phydev =  of_phy_find_device(fep->fpi->phy_node);
>>
>> Look if there is a driver (phy_dev->drv == NULL ?)
>>
>> If not, call new function
>> of_mdiobus_register_phy(mii_bus, fep->fpi->phy_node)
>> see below patch for it.
>>
>> If this succeeds, all is OK, and we can use this phy,
>> else ethernet not work.
> 
> I don't like this approach because it muddies the concept of which
> device is actually responsible for managing the phys on the bus.  Is
> it managed by the mdio bus device or the Ethernet device?  It also has
> a potential race condition.  Whereas triggering a late driver bind
> will be safe.
> 
> Alternately, I'd also be okay with a common method to trigger a
> reprobe of a particular phy from userspace, but I fear that would be a
> significantly more complex solution.
> 
>>
>> !!just no idea, how to get mii_bus pointer ...
> 
> You'd have to get the parent of the phy node, and then loop over all
> the registered mdio busses looking for a bus that uses that node.
> 

you say that you don't like the approach to probe the phy again in fs_enet_open,
but currently I don't understand what would be the alternate trigger point to
rescan the mdio bus?

I made a first patch to enhance the phy_device structure and rescan the mdio bus
at time of fs_enet_open (because I didn't see a better trigger point). The
advantage is that we got the mii_bus pointer and the phy addr stored in the
already created phy device structure and is therefore easy to use. See the patch
below for this modifications. Whats currently missing in the patch is to set the
phy_id if the phy was scanned later after phy_device creation. For the mgcoge
board it seems to solve our problem, but maybe I miss something important.

Best regards
Holger Brunck

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index ec2f503..6bc117f 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -775,7 +774,8 @@ static int fs_enet_open(struct net_device *dev)
 {
        struct fs_enet_private *fep = netdev_priv(dev);
        int r;
-       int err;
+       int err = 0;
+       u32 phy_id = 0;

        /* to initialize the fep->cur_rx,... */
        /* not doing this, will cause a crash in fs_enet_rx_napi */
@@ -795,13 +795,23 @@ static int fs_enet_open(struct net_device *dev)
                return -EINVAL;
        }

-       err = fs_init_phy(dev);
-       if (err) {
+       if (fep->phydev == NULL)
+               err = fs_init_phy(dev);
+
+       if (!err && (fep->phydev->available == false))
+               r = get_phy_id(fep->phydev->bus, fep->phydev->addr, &phy_id);
+
+       if (err || (phy_id == 0xffffffff)) {
                free_irq(fep->interrupt, dev);
                if (fep->fpi->use_napi)
                        napi_disable(&fep->napi);
-               return err;
+               if (err)
+                       return err;
+               else
+                       return -EINVAL;
        }
+       else
+               fep->phydev->available = true;
        phy_start(fep->phydev);

        netif_start_queue(dev);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index adbc0fd..1f443cb 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -173,6 +173,10 @@ struct phy_device* phy_device_create(struct mii_bus *bus,
int addr, int phy_id)
        dev->dev.bus = &mdio_bus_type;
        dev->irq = bus->irq != NULL ? bus->irq[addr] : PHY_POLL;
        dev_set_name(&dev->dev, PHY_ID_FMT, bus->id, addr);
+       if (phy_id == 0xffffffff)
+               dev->available = false;
+       else
+               dev->available = true;

        dev->state = PHY_DOWN;

@@ -232,13 +236,11 @@ struct phy_device * get_phy_device(struct mii_bus *bus,
int addr)
        int r;

        r = get_phy_id(bus, addr, &phy_id);
-       if (r)
-               return ERR_PTR(r);

        /* If the phy_id is mostly Fs, there is no device there */
-       if ((phy_id & 0x1fffffff) == 0x1fffffff)
-               return NULL;
-
+       if (((phy_id & 0x1fffffff) == 0x1fffffff) || r)
+               phy_id = 0xffffffff;
+       /* create phy even if the phy is currently not available */
        dev = phy_device_create(bus, addr, phy_id);

        return dev;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 6a7eb40..12dc3e4 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -303,6 +303,9 @@ struct phy_device {

        int link_timeout;

+       /* Flag to support delayed availability */
+       bool available;
+
        /*
         * Interrupt number for this PHY
         * -1 means no interrupt


^ permalink raw reply related

* Query on usage of Netlink
From: Kumar SANGHVI @ 2010-10-08  8:51 UTC (permalink / raw)
  To: netdev
  Cc: srinidhi.kasagar, linus.walleij, kumar.sanghvi, sudeep.divakaran,
	gulshan.karmani

Hi All,

We have a requirement where-in we want to communicate the status of
modem (whether modem is online or offline) from linux driver to
user-space components. So that user-space components stop sending data
to linux driver for communicating with modem, in case modem goes
offline.

We have decided to use the netlink mechanism to achieve this. We intend
to send an integer value, defined in enum, to user-space indicating the
current modem status.

For this, we want to know what Netlink message type should we use?
Should we use some existing Netlink type? Will there be any risk of
getting other messages (the ones not sent by our driver) if using
existing Netlink message type? For e.g. if we use the existing type
NETLINK_ROUTE then, user-space may un-necessary receive netlink notification
for route changes.

Or should we define a custom Netlink type in include/linux/netlink.h?

Thank you for suggestions.

Best regards,
Kumar.

^ permalink raw reply

* Re: [Bugme-new] [Bug 19692] New: linux-2.6.36-rc5 crash with gianfar ethernet at full line rate traffic
From: Jarek Poplawski @ 2010-10-08  9:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: netdev, bugzilla-daemon, bugme-daemon, eminak71, Anton Vorontsov
In-Reply-To: <20101004135310.6a5f8e93.akpm@linux-foundation.org>

Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Mon, 4 Oct 2010 06:25:14 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=19692
>>
>>            Summary: linux-2.6.36-rc5 crash with gianfar ethernet at full
>>                     line rate traffic
...

Emin, until there is something better I hope you could try this patch.
(not tested nor compiled)

Thanks,
Jarek P.
---
diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 4f7c3f3..db47b55 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2515,7 +2515,7 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 				skb_recycle_check(skb, priv->rx_buffer_size +
 					RXBUF_ALIGNMENT)) {
 			gfar_align_skb(skb);
-			__skb_queue_head(&priv->rx_recycle, skb);
+			skb_queue_head(&priv->rx_recycle, skb);
 		} else
 			dev_kfree_skb_any(skb);
 
@@ -2598,7 +2598,7 @@ struct sk_buff * gfar_new_skb(struct net_device *dev)
 	struct gfar_private *priv = netdev_priv(dev);
 	struct sk_buff *skb = NULL;
 
-	skb = __skb_dequeue(&priv->rx_recycle);
+	skb = skb_dequeue(&priv->rx_recycle);
 	if (!skb)
 		skb = gfar_alloc_skb(dev);
 
@@ -2754,7 +2754,7 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 			if (unlikely(!newskb))
 				newskb = skb;
 			else if (skb)
-				__skb_queue_head(&priv->rx_recycle, skb);
+				skb_queue_head(&priv->rx_recycle, skb);
 		} else {
 			/* Increment the number of packets */
 			rx_queue->stats.rx_packets++;

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox