Netdev List


* RE: [PATCH UCC TDM 3/3 ] Modified Documentation to explain dtsentries for TDM driver
From: Aggrwal Poonam @ 2008-01-25  3:58 UTC (permalink / raw)
  To: Wood Scott
  Cc: Barkowski Michael, netdev, Gala Kumar, linux-kernel, rubini,
	linuxppc-dev, Kalra Ashish, Cutler Richard, akpm, Tabi Timur
In-Reply-To: <20080124201226.GA3926@loki.buserror.net>

Hi Scott

The device tree already has a brg-frequency property in the qe node, which
holds the value of BRGCLK. The function get_brg_clk uses this property to
find the value of BRGCLK.
If this value is 0 (some older u-boots populate the bus-frequency
property of qe and not brg-frequency), get_brg_clk uses
bus-frequency/2 as BRGCLK.
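
A rough sketch of that fallback, written as plain user-space C rather than the
kernel code. The name pick_brg_clk and its two parameters are hypothetical;
they stand in for the values that would be read from the qe node, with 0
meaning the property was not populated.

```c
/* Hypothetical helper, not the kernel's get_brg_clk: both arguments stand
 * in for values read from the qe node; 0 means "property not populated". */
static unsigned int pick_brg_clk(unsigned int brg_frequency,
                                 unsigned int bus_frequency)
{
    if (brg_frequency)
        return brg_frequency;   /* brg-frequency is BRGCLK directly */
    return bus_frequency / 2;   /* older u-boot: derive it from bus-frequency */
}
```

With brg_frequency set, the bus frequency is ignored; with brg_frequency at 0,
the result is half of whatever bus_frequency holds.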


With Regards
Poonam 
 
 

-----Original Message-----
From: Wood Scott 
Sent: Friday, January 25, 2008 1:42 AM
To: Aggrwal Poonam
Cc: Gala Kumar; akpm@linux-foundation.org; linux-kernel@vger.kernel.org;
netdev@vger.kernel.org; rubini@vision.unipv.it; linuxppc-dev@ozlabs.org;
Barkowski Michael; Cutler Richard; Tabi Timur; Kalra Ashish
Subject: Re: [PATCH UCC TDM 3/3 ] Modified Documentation to explain
dtsentries for TDM driver

On Thu, Jan 24, 2008 at 10:24:13AM +0530, Poonam_Aggrwal-b10812 wrote:
> +  ix) Baud Rate Generator (BRG)
> +
> +  Required properties:
> +  - compatible : should be "fsl,cpm-brg"
> +  - fsl,brg-sources : define the input clock for all 16 BRGs. The input
> +    clock source could be 1 to 24 for CLK1 to CLK24. Zero means that the
> +    particular BRG will be driven by QE clock (BRGCLK).

Should also have a clock-frequency property to specify what BRGCLK is.

-Scott

^ permalink raw reply

* Re: pull request: wireless-2.6 'upstream' 2008-01-24
From: John W. Linville @ 2008-01-25  3:57 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-wireless
In-Reply-To: <20080124195343.GC3200@tuxdriver.com>

On Thu, Jan 24, 2008 at 02:53:43PM -0500, John W. Linville wrote:

> The cfg80211 API change breaks ath5k, so I have listed it as "depends
> on BROKEN".  I am assured that the ath5k team has agreed to fix
> this ASAP.  Meanwhile we wanted to have it in place so that we can
> start shaking-out problems with other drivers.

Looks like this might have broken more than I expected, and there
may be some refactoring of the wireless rndis bits too.  So, let's
hold off on this request.  I'll post a new one in the next day or so.

Thanks,

John
-- 
John W. Linville
linville@tuxdriver.com

^ permalink raw reply

* RE: [PATCH UCC TDM 1/3 Updated] Platform changes for UCC TDM driver for MPC8323eRDB. Also includes related QE changes and dts entries.
From: Aggrwal Poonam @ 2008-01-25  4:09 UTC (permalink / raw)
  To: avorontsov, Tabi Timur
  Cc: Gala Kumar, akpm, linux-kernel, netdev, rubini, linuxppc-dev,
	Barkowski Michael, Cutler Richard, Kalra Ashish
In-Reply-To: <20080124172327.GA6786@localhost.localdomain>

Hello Anton/Tabi

I am not sure which is the best place to configure the pins, because
some drivers do it one way and some the other.
I actually tried to make the driver similar to ucc_geth because it is a
QE driver. Like ucc_geth, the driver has no platform code in the platform
files. It is probed along with all the QE devices through
of_platform_bus_probe.
And the pins are configured for all the QE devices using par_io_init.
I thought this to be the most consistent way at that time.

How should we close this point?
Can we go ahead with the pio-map?

In fact, the discussion in this thread was very good and I got to know a
lot of the rationale behind this.

Please suggest.

Thanks and Regards
Poonam

From: Anton Vorontsov [mailto:avorontsov@ru.mvista.com] 
Sent: Thursday, January 24, 2008 10:53 PM
To: Tabi Timur
Cc: Aggrwal Poonam; Gala Kumar; akpm@linux-foundation.org;
linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
rubini@vision.unipv.it; linuxppc-dev@ozlabs.org; Barkowski Michael;
Cutler Richard; Kalra Ashish
Subject: Re: [PATCH UCC TDM 1/3 Updated] Platform changes for UCC TDM
driver for MPC8323eRDB. Also includes related QE changes and dts
entries.

On Thu, Jan 24, 2008 at 10:33:47AM -0600, Timur Tabi wrote:
> Anton Vorontsov wrote:
> 
> >Are you saying that TDM is sharing same pins with the other QE 
> >device, and we can choose to use/not use some device depending on 
> >which driver is loaded?
> 
> No.  I'd have to closely examine the DTS, but I don't think that UCC 
> devices share pins at all.  But that isn't my point.
> 
> >In that particular case UCC configuration is static, for every UCC.
> >So, we can set up all pins in the firmware/board file.
> 
> Yes, but deciding what the UCC does might not be static.  At what 
> point do we declare, "UCC5 is for eth0 and eth0 only"?
> 
> The advantage of putting the pin configurations in the device tree is 
> that they now become configurable.  I can envision a scenario where 
> UCC5 could be either an Ethernet or a UART, depending on the setting 
> of some jumpers on the board. That's what the QE was designed for: any
> UCC can do any task, and you can even have a UCC change its purpose
> while the system is running.
> So I don't want the pin configurations hard-coded into the kernel.  
> Having them in the device tree gives me some flexibility.

If hardware configuration is selected at the bootup time, by jumpers or
switches, it's even easier to do it right. Without pio-map.

> For instance, I have a plan (that I keep postponing) to introduce a 
> new feature in U-Boot where U-Boot can determine the settings of some 
> board jumpers and modify the device tree accordingly. The instructions
> on how to modify the device tree would be embedded in the tree itself.

Why do you need to modify the device tree for that? Let U-Boot simply
set up the pins for the kernel. Regarding the kernel overwriting the pin
configuration...

> I can't
> support this feature if the kernel calls par_io_config_pin() 
> regardless of what's in the device tree.

What I've understood from the previous debates is that ideally the kernel
should not touch the pin configuration. Today we're using pio-map solely
to fix up some old firmware misconfiguration. And we can still do this in
the board file. To determine whether we need to fix up the firmware or not,
we can use some device tree property instead (firmware version?).

p.s.
I'm neither for pio-map nor against it. I just want some consensus
regarding this. The last thread ended with the consensus that pio-map is a
bad thing to use...

--
Anton Vorontsov
email: cbou@mail.ru
backup email: ya-cbou@yandex.ru
irc://irc.freenode.net/bd2

^ permalink raw reply

* Re: pull request: wireless-2.6 'upstream' 2008-01-24
From: David Miller @ 2008-01-25  5:26 UTC (permalink / raw)
  To: linville; +Cc: netdev, linux-wireless
In-Reply-To: <20080125035751.GC3411@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 24 Jan 2008 22:57:51 -0500

> On Thu, Jan 24, 2008 at 02:53:43PM -0500, John W. Linville wrote:
> 
> > The cfg80211 API change breaks ath5k, so I have listed it as "depends
> > on BROKEN".  I am assured that the ath5k team has agreed to fix
> > this ASAP.  Meanwhile we wanted to have it in place so that we can
> > start shaking-out problems with other drivers.
> 
> Looks like this might have broken more than I expected, and there
> may be some refactoring of the wireless rndis bits too.  So, let's
> hold off on this request.  I'll post a new one in the next day or so.

Ok.

^ permalink raw reply

* Re: bluetooth : lockdep warning on rfcomm
From: Dave Young @ 2008-01-25  5:37 UTC (permalink / raw)
  To: LKML; +Cc: Netdev, Marcel Holtmann, David Miller, bluez-devel
In-Reply-To: <a8e1da0801240125x6c65a4bfva24ff50cef890eba@mail.gmail.com>

On Jan 24, 2008 5:25 PM, Dave Young <hidave.darkstar@gmail.com> wrote:
>
> On Jan 24, 2008 11:02 AM, Dave Young <hidave.darkstar@gmail.com> wrote:
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.24-rc8-mm1 #8
> > ---------------------------------------------
> > bluepush/3213 is trying to acquire lock:
> >  (sk_lock-AF_BLUETOOTH){--..}, at: [<f8978c80>]
> > l2cap_sock_bind+0x40/0x100 [l2cap]
> >
> > but task is already holding lock:
> >  (sk_lock-AF_BLUETOOTH){--..}, at: [<f894a31e>]
> > rfcomm_sock_connect+0x3e/0xe0 [rfcomm]
> >
> > other info that might help us debug this:
> > 2 locks held by bluepush/3213:
> >  #0:  (sk_lock-AF_BLUETOOTH){--..}, at: [<f894a31e>]
> > rfcomm_sock_connect+0x3e/0xe0 [rfcomm]
> >  #1:  (rfcomm_mutex){--..}, at: [<f8947556>] rfcomm_dlc_open+0x26/0x60 [rfcomm]
> >
> > stack backtrace:
> > Pid: 3213, comm: bluepush Not tainted 2.6.24-rc8-mm1 #8
> >  [<c0132128>] ? printk+0x18/0x20
> >  [<c0154437>] print_deadlock_bug+0xc7/0xe0
> >  [<c01544bc>] check_deadlock+0x6c/0x80
> >  [<c01548fc>] validate_chain+0x14c/0x320
> >  [<c0156221>] __lock_acquire+0x1c1/0x730
> >  [<c0156d89>] lock_acquire+0x79/0xb0
> >  [<f8978c80>] ? l2cap_sock_bind+0x40/0x100 [l2cap]
> >  [<c03c05f5>] lock_sock_nested+0x55/0x70
> >  [<f8978c80>] ? l2cap_sock_bind+0x40/0x100 [l2cap]
> >  [<f8978c80>] l2cap_sock_bind+0x40/0x100 [l2cap]
> >  [<c03bdb4a>] kernel_bind+0xa/0x10
> >  [<f8947afc>] rfcomm_session_create+0x4c/0x110 [rfcomm]
> >  [<f8947509>] __rfcomm_dlc_open+0x129/0x150 [rfcomm]
> >  [<f8947568>] rfcomm_dlc_open+0x38/0x60 [rfcomm]
> >  [<f894a396>] rfcomm_sock_connect+0xb6/0xe0 [rfcomm]
> >  [<c03bcd39>] sys_connect+0x99/0xd0
> >  [<c010f509>] ? cache_add_dev+0x39/0x1a0
> >  [<c015323d>] ? put_lock_stats+0xd/0x30
> >  [<c01532c0>] ? lock_release_holdtime+0x60/0x80
> >  [<c018e86c>] ? fget+0x7c/0x100
> >  [<c0156b97>] ? __lock_release+0x47/0x70
> >  [<c018e86c>] ? fget+0x7c/0x100
> >  [<c02611b7>] ? copy_from_user+0x37/0x70
> >  [<c03bd855>] sys_socketcall+0xa5/0x230
> >  [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> >  [<c010501b>] ? restore_nocheck+0x12/0x15
> >
>
> While fixing this issue, another locking dependency confused me. Are
> rfcomm_dev_lock and &d->lock in same lock class?
>
> The warnings as following:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.24-rc8-mm1 #8
> -------------------------------------------------------
> krfcommd/2912 is trying to acquire lock:
>  (rfcomm_dev_lock){-.--}, at: [<f89d5bf2>]
> rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>
> but task is already holding lock:
>  (&d->lock){--..}, at: [<f89d15d3>] __rfcomm_dlc_close+0x43/0xd0 [rfcomm]
>
> which lock already depends on the new lock.

Answering myself: the reason is that lockdep treats this as a lock ordering problem.

In rfcomm_dev_add the lock order is:
        rfcomm_dev_lock --> dlc lock
but in __rfcomm_dlc_close it is:
        dlc lock --> rfcomm_dev_lock (-> state_change->rfcomm_dev_get)
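
To make the two orders above concrete, here is a minimal user-space sketch of
the kind of check involved. It is not lockdep itself: it only remembers direct
"A was held when B was taken" pairs and flags the reverse pair, whereas lockdep
tracks whole dependency chains. The lock ids and the function order_inverts are
hypothetical names made up for this illustration.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* Hypothetical ids for the two lock classes discussed above. */
enum { RFCOMM_DEV_LOCK, DLC_LOCK };

/* seen[a][b] is true once lock b has been taken while lock a was held. */
static bool seen[MAX_LOCKS][MAX_LOCKS];

/* Record "held -> next" and report whether the reverse order was already
 * recorded, i.e. whether this acquisition inverts a known order.
 * Only direct pairs are checked, not longer chains. */
static bool order_inverts(int held, int next)
{
    bool inverted = seen[next][held];

    seen[held][next] = true;
    return inverted;
}
```

Feeding it the first order listed above (rfcomm_dev_lock held, dlc lock taken)
records the pair and reports no problem; feeding it the second order afterwards
reports an inversion, which is the situation the warning describes.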

>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&d->lock){--..}:
>        [<c0153b13>] add_lock_to_list+0x33/0x70
>        [<c01545a3>] check_prev_add+0xd3/0x200
>        [<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
>        [<c0154765>] check_prevs_add+0x95/0xe0
>        [<c01549ef>] validate_chain+0x23f/0x320
>        [<c0156221>] __lock_acquire+0x1c1/0x730
>        [<c0155539>] mark_held_locks+0x39/0x80
>        [<c0156d89>] lock_acquire+0x79/0xb0
>        [<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
>        [<c0436749>] _spin_lock+0x39/0x80
>        [<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
>        [<f89d5b10>] rfcomm_dev_data_ready+0x0/0x50 [rfcomm]
>        [<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
>        [<f89d565e>] rfcomm_create_dev+0x6e/0x100 [rfcomm]
>        [<c03c05fa>] lock_sock_nested+0x5a/0x70
>        [<f89d5ae3>] rfcomm_dev_ioctl+0x33/0x60 [rfcomm]
>        [<f89d4c8c>] rfcomm_sock_ioctl+0x2c/0x50 [rfcomm]
>        [<c03bc001>] sock_ioctl+0xc1/0x220
>        [<c03bbf40>] sock_ioctl+0x0/0x220
>        [<c019a366>] vfs_ioctl+0x76/0x90
>        [<c019a616>] do_vfs_ioctl+0x56/0x140
>        [<c019a762>] sys_ioctl+0x62/0x70
>        [<c0104fba>] syscall_call+0x7/0xb
>        [<ffffffff>] 0xffffffff
>
> -> #0 (rfcomm_dev_lock){-.--}:
>        [<c0154504>] check_prev_add+0x34/0x200
>        [<c012a9ab>] default_wake_function+0xb/0x10
>        [<c0154765>] check_prevs_add+0x95/0xe0
>        [<c01549ef>] validate_chain+0x23f/0x320
>        [<c0156221>] __lock_acquire+0x1c1/0x730
>        [<c0156d89>] lock_acquire+0x79/0xb0
>        [<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>        [<c04361e9>] _read_lock+0x39/0x80
>        [<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>        [<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>        [<f89d15e8>] __rfcomm_dlc_close+0x58/0xd0 [rfcomm]
>        [<f89d2523>] rfcomm_recv_ua+0x73/0x140 [rfcomm]
>        [<f89d3151>] rfcomm_recv_frame+0x171/0x1e0 [rfcomm]
>        [<c0155659>] trace_hardirqs_on+0xb9/0x130
>        [<c0436a39>] _spin_unlock_irqrestore+0x39/0x70
>        [<f89d3447>] rfcomm_run+0xe7/0x590 [rfcomm]
>        [<c0120000>] hpet_legacy_next_event+0x20/0x50
>        [<f89d3360>] rfcomm_run+0x0/0x590 [rfcomm]
>        [<c0146e0c>] kthread+0x5c/0xa0
>        [<c0146db0>] kthread+0x0/0xa0
>        [<c0105c17>] kernel_thread_helper+0x7/0x10
>        [<ffffffff>] 0xffffffff
>
> other info that might help us debug this:
>
> 2 locks held by krfcommd/2912:
>  #0:  (rfcomm_mutex){--..}, at: [<f89d33db>] rfcomm_run+0x7b/0x590 [rfcomm]
>  #1:  (&d->lock){--..}, at: [<f89d15d3>] __rfcomm_dlc_close+0x43/0xd0 [rfcomm]
>
> stack backtrace:
> Pid: 2912, comm: krfcommd Not tainted 2.6.24-rc8-mm1 #8
>  [<c0132128>] ? printk+0x18/0x20
>  [<c0153cff>] print_circular_bug_tail+0x6f/0x80
>  [<c0154504>] check_prev_add+0x34/0x200
>  [<c012a9ab>] ? default_wake_function+0xb/0x10
>  [<c0154765>] check_prevs_add+0x95/0xe0
>  [<c01549ef>] validate_chain+0x23f/0x320
>  [<c0156221>] __lock_acquire+0x1c1/0x730
>  [<c0156d89>] lock_acquire+0x79/0xb0
>  [<f89d5bf2>] ? rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>  [<c04361e9>] _read_lock+0x39/0x80
>  [<f89d5bf2>] ? rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>  [<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
>  [<f89d15e8>] __rfcomm_dlc_close+0x58/0xd0 [rfcomm]
>  [<f89d2523>] rfcomm_recv_ua+0x73/0x140 [rfcomm]
>  [<f89d3151>] rfcomm_recv_frame+0x171/0x1e0 [rfcomm]
>  [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
>  [<c0436a39>] ? _spin_unlock_irqrestore+0x39/0x70
>  [<f89d3447>] rfcomm_run+0xe7/0x590 [rfcomm]
>  [<c0120000>] ? hpet_legacy_next_event+0x20/0x50
>  [<f89d3360>] ? rfcomm_run+0x0/0x590 [rfcomm]
>  [<c0146e0c>] kthread+0x5c/0xa0
>  [<c0146db0>] ? kthread+0x0/0xa0
>  [<c0105c17>] kernel_thread_helper+0x7/0x10
>  =======================
>


^ permalink raw reply

* [PATCH] SMC91x: poll_controller(): check for interrupts before calling handler
From: Kevin Hilman @ 2008-01-25  5:38 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Pitre, Kevin Hilman

When using polling, smc_poll_controller() can call smc_interrupt()
when there are likely to be no real interrupts.  This will trigger the
"spurious interrupt" printk whenever the driver is being polled.

Instead, check for actual interrupts before calling smc_interrupt().

Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
 drivers/net/smc91x.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/net/smc91x.c b/drivers/net/smc91x.c
index 7da7589..64ef6c1 100644
--- a/drivers/net/smc91x.c
+++ b/drivers/net/smc91x.c
@@ -1350,8 +1350,12 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id)
  */
 static void smc_poll_controller(struct net_device *dev)
 {
+	struct smc_local *lp = netdev_priv(dev);
+	void __iomem *ioaddr = lp->base;
+
 	disable_irq(dev->irq);
-	smc_interrupt(dev->irq, dev);
+	if (SMC_GET_INT() & SMC_GET_INT_MASK())
+		smc_interrupt(dev->irq, dev);
 	enable_irq(dev->irq);
 }
 #endif
-- 
1.5.3.7


^ permalink raw reply related

* Re: [PATCH] fib_trie: rescan if key is lost during dump
From: Jarek Poplawski @ 2008-01-25  8:23 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, kaber, netdev
In-Reply-To: <20080124135112.32b5c1c7@deepthought>

On 24-01-2008 22:51, Stephen Hemminger wrote:
> Normally during a dump the key of the last dumped entry is used for
> continuation, but since lock is dropped it might be lost. In that case
> fallback to the old counter based N^2 behaviour.  This means the dump will end up
> skipping some routes which matches what FIB_HASH does.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
...
> @@ -1918,35 +1931,37 @@ static int fn_trie_dump(struct fib_table
>  	struct leaf *l;
>  	struct trie *t = (struct trie *) tb->tb_data;
>  	t_key key = cb->args[2];
> +	int count = cb->args[3];
>  
>  	rcu_read_lock();

Sorry, but I lost the point: is rtnl held or not held here at the moment?
If it is held, how can this rcu_read_lock help? Maybe some additional
comment in the code?

Thanks,
Jarek P.

^ permalink raw reply

* Re: [Bugme-new] [Bug 9810] New: Bridge doesn't work with e1000e driver
From: Andrew Morton @ 2008-01-25  9:33 UTC (permalink / raw)
  To: netdev; +Cc: bugme-daemon, Auke Kok, Jesse Brandeburg, cijoml,
	Stephen Hemminger
In-Reply-To: <bug-9810-10286@http.bugzilla.kernel.org/>

> On Fri, 25 Jan 2008 01:13:19 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9810
> 
>            Summary: Bridge doesn't work with e1000e driver
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.24-rc8
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: cijoml@volny.cz
> 
> 
> Latest working kernel version: unknown
> Earliest failing kernel version: unknown
> Distribution: Debian stable
> Hardware Environment: Dell Optiplex 755
> Software Environment: Debian stable, vanilla 2.6.24-rc8
> Problem Description:
> 
> Bridge doesn't work with e1000e driver
> 
> Steps to reproduce:
> 
> optiplex:/home/cijoml# brctl addbr br0
> optiplex:/home/cijoml# brctl addif br0 eth0
> optiplex:/home/cijoml# brctl show
> bridge name     bridge id               STP enabled     interfaces
> br0             8000.001e4f93dd74       no              eth0
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> optiplex:/home/cijoml# killall dhclient
> dhclient: no process killed
> optiplex:/home/cijoml# dhclient br0
> Internet Systems Consortium DHCP Client V3.0.4
> Copyright 2004-2006 Internet Systems Consortium.
> All rights reserved.
> For info, please visit http://www.isc.org/sw/dhcp/
> 
> Listening on LPF/br0/00:1e:4f:93:dd:74
> Sending on   LPF/br0/00:1e:4f:93:dd:74
> Sending on   Socket/fallback
> DHCPREQUEST on br0 to 255.255.255.255 port 67
> DHCPREQUEST on br0 to 255.255.255.255 port 67
> DHCPREQUEST on br0 to 255.255.255.255 port 67
> DHCPDISCOVER on br0 to 255.255.255.255 port 67 interval 6
> DHCPDISCOVER on br0 to 255.255.255.255 port 67 interval 8
> DHCPDISCOVER on br0 to 255.255.255.255 port 67 interval 12
> DHCPDISCOVER on br0 to 255.255.255.255 port 67 interval 18
> DHCPDISCOVER on br0 to 255.255.255.255 port 67 interval 17
> No DHCPOFFERS received.
> Trying recorded lease 10.136.212.12
> PING 10.136.212.1 (10.136.212.1) 56(84) bytes of data.
> 
> --- 10.136.212.1 ping statistics ---
> 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
> 
> No working leases in persistent database - sleeping.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> optiplex:/home/cijoml# killall dhclient
> optiplex:/home/cijoml# ifconfig br0 down
> optiplex:/home/cijoml# brctl delbr br0
> optiplex:/home/cijoml# dhclient eth0
> Internet Systems Consortium DHCP Client V3.0.4
> Copyright 2004-2006 Internet Systems Consortium.
> All rights reserved.
> For info, please visit http://www.isc.org/sw/dhcp/
> 
> Listening on LPF/eth0/00:1e:4f:93:dd:74
> Sending on   LPF/eth0/00:1e:4f:93:dd:74
> Sending on   Socket/fallback
> DHCPREQUEST on eth0 to 255.255.255.255 port 67
> DHCPREQUEST on eth0 to 255.255.255.255 port 67
> DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
> ip length 339 disagrees with bytes received 343.
> accepting packet with data after udp payload.
> DHCPOFFER from 10.136.212.2
> DHCPREQUEST on eth0 to 255.255.255.255 port 67
> ip length 339 disagrees with bytes received 343.
> accepting packet with data after udp payload.
> ip length 339 disagrees with bytes received 343.
> accepting packet with data after udp payload.
> DHCPACK from 10.136.212.2
> bound to 10.136.212.15 -- renewal in 1149646 seconds.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> 

^ permalink raw reply

* Re: [Bugme-new] [Bug 9811] New: Loopback address to eth0 interface changes scope permanently
From: Andrew Morton @ 2008-01-25 10:50 UTC (permalink / raw)
  To: netdev; +Cc: bugme-daemon, bjorn
In-Reply-To: <bug-9811-10286@http.bugzilla.kernel.org/>

> On Fri, 25 Jan 2008 02:04:04 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9811
> 
>            Summary: Loopback address to eth0 interface changes scope
>                     permanently
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.24-rc8
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: bjorn@mork.no
> 
> 
> Latest working kernel version: none
> Earliest failing kernel version: 2.6.18 (verified, but most likely "any")
> Distribution: Debian
> Hardware Environment: 
> Software Environment:
> Problem Description: 
> 
> From gary.manchon@gmail.com:
> 
> After a bad network interface configuration (ifconfig eth0 127.0.0.1),
> I cannot recover the network without rebooting the kernel.
> 
> This is my working configuration :
> 
> # ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:90:3E:1F:1C:17
>           inet addr:192.168.240.195  Bcast:192.168.247.255  Mask:255.255.248.0
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:26 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:2633 (2.5 KiB)  TX bytes:0 (0.0 B)
> 
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> # route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 192.168.240.0   *               255.255.248.0   U     0      0        0 eth0
> default         192.168.240.1   0.0.0.0         UG    0      0        0 eth0
> 
> # ping -c 1 66.102.11.99
> PING 66.102.11.99 (66.102.11.99): 56 data bytes
> 64 bytes from 66.102.11.99: icmp_seq=0 ttl=63 time=4.5 ms
> 
> --- 66.102.11.99 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 4.5/4.5/4.5 ms
> 
> Steps to reproduce:
> 
> # ifconfig eth0 127.0.0.1
> # ifconfig eth0 192.168.240.195 netmask 255.255.248.0
> # route add default gw 192.168.240.1
> 
> 
> From Bjørn Mork <bjorn@mork.no>:
> 
> I suspect that the problem might be this code in net/ipv4/devinet.c ,
> which sets ifa_scope to RT_SCOPE_HOST if you configure a loopback
> address (127/8) on any interface.  I guess it's there to protect us from
> sending packets with a loopback source address, which wouldn't look too
> good:
> 
> static int inet_set_ifa(struct net_device *dev, struct in_ifaddr *ifa)
> {
>         struct in_device *in_dev = __in_dev_get_rtnl(dev);
> 
>         ASSERT_RTNL();
> 
>         if (!in_dev) {
>                 inet_free_ifa(ifa);
>                 return -ENOBUFS;
>         }
>         ipv4_devconf_setall(in_dev);
>         if (ifa->ifa_dev != in_dev) {
>                 BUG_TRAP(!ifa->ifa_dev);
>                 in_dev_hold(in_dev);
>                 ifa->ifa_dev = in_dev;
>         }
>         if (LOOPBACK(ifa->ifa_local))
>                 ifa->ifa_scope = RT_SCOPE_HOST;
>         return inet_insert_ifa(ifa);
> }
> 
> 
> 
> The real problem is that there's never anything resetting this scope if
> you change the address later.  The attached patch fixes this.
> 
> 
> dhcp232:~# ip addr show dev eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:aa:00:ff:00:ff brd ff:ff:ff:ff:ff:ff
>     inet 192.168.3.232/24 brd 192.168.3.255 scope global eth0
>     inet6 2001:16d8:ffb4:0:2aa:ff:feff:ff/64 scope global dynamic 
>        valid_lft 2591971sec preferred_lft 604771sec
>     inet6 fe80::2aa:ff:feff:ff/64 scope link 
>        valid_lft forever preferred_lft forever
> dhcp232:~# ifconfig eth0 127.0.0.1
> dhcp232:~# ip addr show dev eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:aa:00:ff:00:ff brd ff:ff:ff:ff:ff:ff
>     inet 127.0.0.1/8 brd 127.255.255.255 scope host eth0
>     inet6 2001:16d8:ffb4:0:2aa:ff:feff:ff/64 scope global dynamic 
>        valid_lft 2591951sec preferred_lft 604751sec
>     inet6 fe80::2aa:ff:feff:ff/64 scope link 
>        valid_lft forever preferred_lft forever
> dhcp232:~# ifconfig eth0 192.168.3.232
> dhcp232:~# ip addr show dev eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:aa:00:ff:00:ff brd ff:ff:ff:ff:ff:ff
>     inet 192.168.3.232/24 brd 192.168.3.255 scope host eth0
>     inet6 2001:16d8:ffb4:0:2aa:ff:feff:ff/64 scope global dynamic 
>        valid_lft 2591933sec preferred_lft 604733sec
>     inet6 fe80::2aa:ff:feff:ff/64 scope link 
>        valid_lft forever preferred_lft forever
> 
> 
> 
> Notice how the scope changes from "global" to "host" when configuring
> 127.0.0.1, and just never changes back.  It will stay that way forever,
> or until something changes the scope or deletes the address.
> 
> Deleting the address and re-adding it will work around the problem:
> 
> dhcp232:~# ip addr del 192.168.3.232/24 dev eth0
> dhcp232:~# ifconfig eth0 192.168.3.232
> dhcp232:~# ip addr show dev eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:aa:00:ff:00:ff brd ff:ff:ff:ff:ff:ff
>     inet 192.168.3.232/24 brd 192.168.3.255 scope global eth0
>     inet6 2001:16d8:ffb4:0:2aa:ff:feff:ff/64 scope global dynamic 
>        valid_lft 2591963sec preferred_lft 604763sec
>     inet6 fe80::2aa:ff:feff:ff/64 scope link 
>        valid_lft forever preferred_lft forever
> 
> 
> Bjørn
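
The sticky-scope behaviour described in the report can be modelled in a few
lines of user-space C. This is only a sketch: struct ifaddr_model, the scope
names and both setter functions are hypothetical stand-ins, and the "reset"
variant is one possible fix shown for contrast, not the attached patch.

```c
#include <stdint.h>

/* Stand-ins for the kernel's scope values; the names are hypothetical. */
enum scope { SCOPE_GLOBAL, SCOPE_HOST };

struct ifaddr_model {
    uint32_t   local;   /* IPv4 address, host byte order */
    enum scope scope;
};

/* True for any address in 127.0.0.0/8. */
static int is_loopback(uint32_t addr)
{
    return (addr >> 24) == 127;
}

/* Mirrors the reported behaviour: the scope is raised to host for a
 * loopback address but never lowered again when the address changes. */
static void set_addr_sticky(struct ifaddr_model *ifa, uint32_t addr)
{
    ifa->local = addr;
    if (is_loopback(addr))
        ifa->scope = SCOPE_HOST;
}

/* One possible fix (an assumption, not the attached patch): derive the
 * scope from the new address on every change. */
static void set_addr_reset(struct ifaddr_model *ifa, uint32_t addr)
{
    ifa->local = addr;
    ifa->scope = is_loopback(addr) ? SCOPE_HOST : SCOPE_GLOBAL;
}
```

Setting 127.0.0.1 and then 192.168.3.232 through set_addr_sticky leaves the
scope at host, matching the "ip addr show" output above; the same sequence
through set_addr_reset ends with global scope again.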

^ permalink raw reply

* Re: [Bugme-new] [Bug 9812] New: Kernel does not deliver packets going through the INPUT chain even if the app is listening on IN_ADDR_ANY
From: Andrew Morton @ 2008-01-25 13:14 UTC (permalink / raw)
  To: netdev; +Cc: bugme-daemon, pawel
In-Reply-To: <bug-9812-10286@http.bugzilla.kernel.org/>

> On Fri, 25 Jan 2008 03:59:27 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9812
> 
>            Summary: Kernel does not deliver packets going through the INPUT
>                     chain even if the app is listening on IN_ADDR_ANY
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.22-14
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Netfilter/Iptables
>         AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
>         ReportedBy: pawel@rogocz.com
> 
> 
> Latest working kernel version: never worked
> Earliest failing kernel version:
> Distribution: ubuntu 7.10
> 
> Problem Description:
> 
> I have a farm of web servers behind a load balancer in DSR mode.
> I would like to be able to dump packets with any IP address on the web servers
> through the load balancer and have web servers reply to these requests.
> Currently for every IP web servers are supposed to reply to, I have to
> configure the IP on all servers on their loopback interfaces. The web server
> listens on IN_ADDR_ANY and I am forcing packets into the INPUT chain via
> iptables/netfilter but the kernel never replies to any connection requests. I
> only see the counter for OutNoRoutes increasing.
> 
> Steps to reproduce:
> 
> web server:
> 
> #iptables -A PREROUTING -i eth0 -t mangle -p tcp --dport 80 -j MARK --set-mark 1
> #ip route add to local default dev lo protocol kernel table 1
> #ip rule add fwmark 1 table 1 priority 1
> 
> client:
> 
> #ip route add 1.1.1.1/32 via <web server IP> dev eth0
> 
> while making requests to 1.1.1.1:80 from the client you can see increasing
> OutNoRoutes counter but no connection ever takes place.
> 
> 


^ permalink raw reply

* Re: [PATCH 10/14] [rndis_host] Add rndis_early_init function pointer to 'struct rndis_data'.
From: Jussi Kivilinna @ 2008-01-25 13:14 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA, bjd-a1rhEgazXTw,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <200801241710.35528.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>

On Thu, 2008-01-24 at 17:10 -0800, David Brownell wrote:
> Could this -- and #11/14 -- instead be generalized a bit,
> so they're not RNDIS-specific?  At least in name; the
> only user for now would be the rndis_host code.
> 
> The generalization would presumably be "early_init" and
> "link_change", paired with doc comments reflecting that
> they're usable by any driver stack built over the usbnet
> framework core.
> 
> There's no point IMO to having generalizable hooks be
> restricted this way.

Sure.

^ permalink raw reply

* Re: [Bugme-new] [Bug 9812] New: Kernel does not deliver packets going through the INPUT chain even if the app is listening on IN_ADDR_ANY
From: Patrick McHardy @ 2008-01-25 13:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, bugme-daemon, pawel
In-Reply-To: <20080125051404.3bbefab3.akpm@linux-foundation.org>

Andrew Morton wrote:
>> On Fri, 25 Jan 2008 03:59:27 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=9812
>>

>> Problem Description:
>>
>> I have a farm of web servers behind a load balancer in DSR mode.
>> I would like to be able to dump packets with any IP address on the web servers
>> through the load balancer and have web servers reply to these requests.
>> Currently for every IP web servers are supposed to reply to, I have to
>> configure the IP on all servers on their loopback interfaces. The web server
>> listens on IN_ADDR_ANY and I am forcing packets into the INPUT chain via
>> iptables/netfilter but the kernel never replies to any connection requests. I
>> only see the counter for OutNoRoutes increasing.
>>
>> Steps to reproduce:
>>
>> web server:
>>
>> #iptables -A PREROUTING -i eth0 -t mangle -p tcp --dport 80 -j MARK --set-mark 1
>> #ip route add to local default dev lo protocol kernel table 1
>> #ip rule add fwmark 1 table 1 priority 1
>>
>> client:
>>
>> #ip route add 1.1.1.1/32 via <web server IP> dev eth0
>>
>> while making requests to 1.1.1.1:80 from the client you can see increasing
>> OutNoRoutes counter but no connection ever takes place.


I already commented in bugzilla, this is not a bug but an incorrect
setup. DNAT or REDIRECT must be used to make this work properly.


^ permalink raw reply

* Re: [PATCH 00/14] RFC: Driver for Wireless RNDIS USB devices.
From: Jussi Kivilinna @ 2008-01-25 13:20 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-wireless, bjd, netdev
In-Reply-To: <200801241719.14941.david-b@pacbell.net>

On Thu, 2008-01-24 at 17:19 -0800, David Brownell wrote:
> > 13. [rndis_host] blacklist known wireless RNDIS devices
> 
> That will be a headache over time though ... can't you just
> let the probe succeed enough to recognize it's wireless (using
> the media flag) and then bail, so the next driver can try?

Sure, that works too (but causes a little bit more message flood).

^ permalink raw reply

* Re: Slow OOM in netif_RX function
From: Andi Kleen @ 2008-01-25 13:21 UTC (permalink / raw)
  To: Ivan H. Dichev; +Cc: netdev
In-Reply-To: <20080124211810.3E24A46E9A@smtp.obs.bg>

"Ivan H. Dichev" <idichev@obs.bg> writes:
>
> What could happen if I put different Lan card in every slot?
> In ex. to-private -> 3com
>       to-inet    -> VIA
>       to-dmz     -> rtl8139
> And then to look which RX function is consuming the memory.
> (boomerang_rx, rtl8139_rx, ... etc) 

The problem is unlikely to be in the driver (these are both
well tested ones) but more likely your complicated iptables setup somehow
triggers a skb leak.

There are unfortunately no shrink wrapped debug mechanisms in the kernel
for leaks like this (ok you could enable CONFIG_NETFILTER_DEBUG 
and see if it prints something interesting, but that's a long shot).

If you wanted to write a custom debugging patch I would do something like this:

- Add two new integer fields to struct sk_buff: a time stamp and an integer field
- Fill the time stamp with jiffies in alloc_skb and clear the integer field
- In __kfree_skb clear the time stamp
- For all the ipt target modules in net/ipv4/netfilter/*.c that you use, change
their ->target functions to put a unique value into the integer field you added.
- Do the same for the pkt_to_tuple functions for all conntrack modules

Then when you observe the leak take a crash dump using kdump on the router
and then use crash to dump all the slab objects for the skbuff_head_cache.
Then look for any that have an old time stamp and check what value they
have in the integer field. The netfilter function that set that unique value
likely triggered the leak somehow.
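
The tagging scheme described above can be modelled in user space. The
following is only a minimal sketch under stated assumptions, not kernel
code: struct dbg_buf, the fixed-size pool, fake_jiffies and every dbg_*
helper are hypothetical stand-ins for struct sk_buff, the slab cache,
jiffies, and the alloc_skb / __kfree_skb / ->target hooks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff plus the two debug fields. */
struct dbg_buf {
	unsigned long dbg_stamp;	/* allocation time; 0 once freed */
	int dbg_tag;			/* id of the last hook that touched it */
	int in_use;
};

#define POOL_SIZE 8
static struct dbg_buf pool[POOL_SIZE];	/* stands in for the slab cache */
static unsigned long fake_jiffies = 1;	/* stands in for jiffies */

/* Like alloc_skb: set the time stamp and clear the tag. */
struct dbg_buf *dbg_alloc(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			pool[i].dbg_stamp = fake_jiffies;
			pool[i].dbg_tag = 0;
			return &pool[i];
		}
	}
	return NULL;
}

/* Like __kfree_skb: clear the time stamp so the object no longer looks live. */
void dbg_free(struct dbg_buf *b)
{
	assert(b != NULL);
	b->dbg_stamp = 0;
	b->in_use = 0;
}

/* Like an instrumented ->target or pkt_to_tuple function: leave a unique id. */
void dbg_mark(struct dbg_buf *b, int tag)
{
	assert(b != NULL);
	b->dbg_tag = tag;
}

/* Advance the fake clock by n ticks. */
void dbg_tick(unsigned long n)
{
	fake_jiffies += n;
}

/*
 * Like walking the slab objects in a crash dump: return the tag of the
 * first live object older than max_age ticks, or -1 if there is none.
 */
int dbg_find_leak_tag(unsigned long max_age)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (pool[i].dbg_stamp != 0 &&
		    fake_jiffies - pool[i].dbg_stamp > max_age)
			return pool[i].dbg_tag;
	}
	return -1;
}
```

An object that is still stamped long after it was allocated is a leak
candidate, and its tag names the last instrumented function it passed
through.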

-Andi

^ permalink raw reply

* [patch net-2.6.25][IPV4][FIB] fix fib_proc compilation error
From: Daniel Lezcano @ 2008-01-25 13:30 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 1 bytes --]



[-- Attachment #2: fix-fib-frontend-compilation-error.patch --]
[-- Type: text/x-patch, Size: 848 bytes --]

Subject: fix fib_proc compilation error
From: Daniel Lezcano <dlezcano@fr.ibm.com>

Fix fib_proc_[init|exit] definition when CONFIG_PROC_FS=no

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
---
 include/net/ip_fib.h |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: net-2.6.25-fix/include/net/ip_fib.h
===================================================================
--- net-2.6.25-fix.orig/include/net/ip_fib.h
+++ net-2.6.25-fix/include/net/ip_fib.h
@@ -264,6 +264,16 @@ static inline void fib_res_put(struct fi
 #ifdef CONFIG_PROC_FS
 extern int __net_init  fib_proc_init(struct net *net);
 extern void __net_exit fib_proc_exit(struct net *net);
+#else
+static inline int fib_proc_init(struct net *net)
+{
+	return 0;
+}
+
+static inline void fib_proc_exit(struct net *net)
+{
+	return ;
+}
 #endif
 
 #endif  /* _NET_FIB_H */

^ permalink raw reply

* [patch net-2.6.25][IPV6][SYSCTL] fix sysctl compilation error
From: Daniel Lezcano @ 2008-01-25 13:32 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 1 bytes --]



[-- Attachment #2: fix-sysctl-compilation-error.patch --]
[-- Type: text/x-patch, Size: 1493 bytes --]

Subject: fix sysctl compilation error
From: Daniel Lezcano <dlezcano@fr.ibm.com>

Move ipv6_icmp_sysctl_init and ipv6_route_sysctl_init into
the right ifdef section otherwise that does not compile when
CONFIG_SYSCTL=yes and CONFIG_PROC_FS=no

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
---
 include/net/ipv6.h |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Index: net-2.6.25-fix/include/net/ipv6.h
===================================================================
--- net-2.6.25-fix.orig/include/net/ipv6.h
+++ net-2.6.25-fix/include/net/ipv6.h
@@ -110,7 +110,6 @@ struct frag_hdr {
 
 /* sysctls */
 extern int sysctl_mld_max_msf;
-
 extern struct ctl_path net_ipv6_ctl_path[];
 
 #define _DEVINC(statname, modifier, idev, field)			\
@@ -586,9 +585,6 @@ extern int ip6_mc_msfget(struct sock *sk
 			 int __user *optlen);
 
 #ifdef CONFIG_PROC_FS
-extern struct ctl_table *ipv6_icmp_sysctl_init(struct net *net);
-extern struct ctl_table *ipv6_route_sysctl_init(struct net *net);
-
 extern int  ac6_proc_init(void);
 extern void ac6_proc_exit(void);
 extern int  raw6_proc_init(void);
@@ -621,6 +617,8 @@ static inline int snmp6_unregister_dev(s
 extern ctl_table ipv6_route_table_template[];
 extern ctl_table ipv6_icmp_table_template[];
 
+extern struct ctl_table *ipv6_icmp_sysctl_init(struct net *net);
+extern struct ctl_table *ipv6_route_sysctl_init(struct net *net);
 extern int ipv6_sysctl_register(void);
 extern void ipv6_sysctl_unregister(void);
 #endif

^ permalink raw reply

* Re: [patch net-2.6.25][IPV6][SYSCTL] fix sysctl compilation error
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2008-01-25 13:40 UTC (permalink / raw)
  To: dlezcano, davem; +Cc: yoshfuji, netdev
In-Reply-To: <4799E4E7.5020000@fr.ibm.com>

In article <4799E4E7.5020000@fr.ibm.com> (at Fri, 25 Jan 2008 14:32:23 +0100), Daniel Lezcano <dlezcano@fr.ibm.com> says:

> Move ipv6_icmp_sysctl_init and ipv6_route_sysctl_init into
> the right ifdef section otherwise that does not compile when
> CONFIG_SYSCTL=yes and CONFIG_PROC_FS=no
> 
> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>

My bad....

Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

--yoshfuji

^ permalink raw reply

* [PATCH 1/7 net-2.6.25] [IPV4]: Fix memory leak on error path during FIB initialization.
From: Denis V. Lunev @ 2008-01-25 13:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev

net->ipv4.fib_table_hash is not freed when fib4_rules_init fails. The problem
was introduced by the following commit:
commit c8050bf6d84785a7edd2e81591e8f833231477e8
Author: Denis V. Lunev <den@openvz.org>
Date:   Thu Jan 10 03:28:24 2008 -0800

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 net/ipv4/fib_frontend.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index d282618..d0507f4 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -975,6 +975,7 @@ static struct notifier_block fib_netdev_notifier = {
 
 static int __net_init ip_fib_net_init(struct net *net)
 {
+	int err;
 	unsigned int i;
 
 	net->ipv4.fib_table_hash = kzalloc(
@@ -985,7 +986,14 @@ static int __net_init ip_fib_net_init(struct net *net)
 	for (i = 0; i < FIB_TABLE_HASHSZ; i++)
 		INIT_HLIST_HEAD(&net->ipv4.fib_table_hash[i]);
 
-	return fib4_rules_init(net);
+	err = fib4_rules_init(net);
+	if (err < 0)
+		goto fail;
+	return 0;
+
+fail:
+	kfree(net->ipv4.fib_table_hash);
+	return err;
 }
 
 static void __net_exit ip_fib_net_exit(struct net *net)
-- 
1.5.3.rc5


^ permalink raw reply related

* [PATCH 7/7 net-2.6.25] [NETNS]: Lookup in FIB semantic hashes taking into account the namespace.
From: Denis V. Lunev @ 2008-01-25 13:52 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

The namespace is not available in fib_sync_down_addr; add it
as a parameter.

Looking up a device by the pointer to it is OK. Looking up using a result
from fib_trie/fib_hash table lookup is also safe. No need to fix that at all.
So, just fix lookup by address and insertion to the hash table path.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 include/net/ip_fib.h     |    2 +-
 net/ipv4/fib_frontend.c  |    2 +-
 net/ipv4/fib_semantics.c |    6 +++++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index cb0df37..90d1175 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -220,7 +220,7 @@ extern void fib_select_default(struct net *net, const struct flowi *flp,
 /* Exported by fib_semantics.c */
 extern int ip_fib_check_default(__be32 gw, struct net_device *dev);
 extern int fib_sync_down_dev(struct net_device *dev, int force);
-extern int fib_sync_down_addr(__be32 local);
+extern int fib_sync_down_addr(struct net *net, __be32 local);
 extern int fib_sync_up(struct net_device *dev);
 extern __be32  __fib_res_prefsrc(struct fib_result *res);
 extern void fib_select_multipath(const struct flowi *flp, struct fib_result *res);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index d69ffa2..86ff271 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -808,7 +808,7 @@ static void fib_del_ifaddr(struct in_ifaddr *ifa)
 			   First of all, we scan fib_info list searching
 			   for stray nexthop entries, then ignite fib_flush.
 			*/
-			if (fib_sync_down_addr(ifa->ifa_local))
+			if (fib_sync_down_addr(dev->nd_net, ifa->ifa_local))
 				fib_flush(dev->nd_net);
 		}
 	}
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 97cc494..a13c847 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -229,6 +229,8 @@ static struct fib_info *fib_find_info(const struct fib_info *nfi)
 	head = &fib_info_hash[hash];
 
 	hlist_for_each_entry(fi, node, head, fib_hash) {
+		if (fi->fib_net != nfi->fib_net)
+			continue;
 		if (fi->fib_nhs != nfi->fib_nhs)
 			continue;
 		if (nfi->fib_protocol == fi->fib_protocol &&
@@ -1031,7 +1033,7 @@ nla_put_failure:
      referring to it.
    - device went down -> we must shutdown all nexthops going via it.
  */
-int fib_sync_down_addr(__be32 local)
+int fib_sync_down_addr(struct net *net, __be32 local)
 {
 	int ret = 0;
 	unsigned int hash = fib_laddr_hashfn(local);
@@ -1043,6 +1045,8 @@ int fib_sync_down_addr(__be32 local)
 		return 0;
 
 	hlist_for_each_entry(fi, node, head, fib_lhash) {
+		if (fi->fib_net != net)
+			continue;
 		if (fi->fib_prefsrc == local) {
 			fi->fib_flags |= RTNH_F_DEAD;
 			ret++;
-- 
1.5.3.rc5
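
The namespace check this patch adds to the shared hash-bucket walk can be
illustrated outside the kernel. This is a hedged user-space sketch, not
the kernel implementation: struct ns, struct entry and sync_down_addr()
are hypothetical stand-ins for struct net, struct fib_info and
fib_sync_down_addr(), and a plain singly linked list replaces the hlist
bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct net. */
struct ns {
	int id;
};

/* Hypothetical stand-in for struct fib_info in one hash bucket. */
struct entry {
	struct entry *next;
	const struct ns *owner;	/* plays the role of fi->fib_net */
	uint32_t prefsrc;	/* plays the role of fi->fib_prefsrc */
	int dead;		/* plays the role of RTNH_F_DEAD */
};

/*
 * Mark as dead every entry in the bucket that belongs to namespace ns and
 * uses local as its preferred source. Entries owned by other namespaces
 * share the bucket but are skipped. Returns the number of entries marked.
 */
int sync_down_addr(struct entry *bucket, const struct ns *ns, uint32_t local)
{
	int ret = 0;

	assert(ns != NULL);
	if (local == 0)
		return 0;

	for (struct entry *e = bucket; e != NULL; e = e->next) {
		if (e->owner != ns)
			continue;
		if (e->prefsrc == local) {
			e->dead = 1;
			ret++;
		}
	}
	return ret;
}
```

Without the owner comparison, removing an address in one namespace would
also mark entries of every other namespace that happens to use the same
address.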


^ permalink raw reply related

* [PATCH 6/7 net-2.6.25] [NETNS]: Add a namespace mark to fib_info.
From: Denis V. Lunev @ 2008-01-25 13:52 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

This is required to make fib_info lookups namespace aware. Otherwise,
initial namespace devices are marked as dead in the local routing table
when another namespace is stopped.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 include/net/ip_fib.h     |    1 +
 net/ipv4/fib_semantics.c |    8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 1b2f008..cb0df37 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -69,6 +69,7 @@ struct fib_nh {
 struct fib_info {
 	struct hlist_node	fib_hash;
 	struct hlist_node	fib_lhash;
+	struct net		*fib_net;
 	int			fib_treeref;
 	atomic_t		fib_clntref;
 	int			fib_dead;
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 5beff2e..97cc494 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -687,6 +687,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 	struct fib_info *fi = NULL;
 	struct fib_info *ofi;
 	int nhs = 1;
+	struct net *net = cfg->fc_nlinfo.nl_net;
 
 	/* Fast check to catch the most weird cases */
 	if (fib_props[cfg->fc_type].scope > cfg->fc_scope)
@@ -727,6 +728,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 		goto failure;
 	fib_info_cnt++;
 
+	fi->fib_net = net;
 	fi->fib_protocol = cfg->fc_protocol;
 	fi->fib_flags = cfg->fc_flags;
 	fi->fib_priority = cfg->fc_priority;
@@ -798,8 +800,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 		if (nhs != 1 || nh->nh_gw)
 			goto err_inval;
 		nh->nh_scope = RT_SCOPE_NOWHERE;
-		nh->nh_dev = dev_get_by_index(cfg->fc_nlinfo.nl_net,
-					      fi->fib_nh->nh_oif);
+		nh->nh_dev = dev_get_by_index(net, fi->fib_nh->nh_oif);
 		err = -ENODEV;
 		if (nh->nh_dev == NULL)
 			goto failure;
@@ -813,8 +814,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 	if (fi->fib_prefsrc) {
 		if (cfg->fc_type != RTN_LOCAL || !cfg->fc_dst ||
 		    fi->fib_prefsrc != cfg->fc_dst)
-			if (inet_addr_type(cfg->fc_nlinfo.nl_net,
-					   fi->fib_prefsrc) != RTN_LOCAL)
+			if (inet_addr_type(net, fi->fib_prefsrc) != RTN_LOCAL)
 				goto err_inval;
 	}
 
-- 
1.5.3.rc5


^ permalink raw reply related

* [PATCH 2/7 net-2.6.25] [IPV4]: Small style cleanup of the error path in rtm_to_ifaddr.
From: Denis V. Lunev @ 2008-01-25 13:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

Remove error code assignment inside brackets on failure. The code looks better
if the error is assigned before condition check. Also, the compiler treats this
better.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 net/ipv4/devinet.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 21f71bf..9da4c68 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -492,39 +492,34 @@ static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh)
 	struct ifaddrmsg *ifm;
 	struct net_device *dev;
 	struct in_device *in_dev;
-	int err = -EINVAL;
+	int err;
 
 	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv4_policy);
 	if (err < 0)
 		goto errout;
 
 	ifm = nlmsg_data(nlh);
-	if (ifm->ifa_prefixlen > 32 || tb[IFA_LOCAL] == NULL) {
-		err = -EINVAL;
+	err = -EINVAL;
+	if (ifm->ifa_prefixlen > 32 || tb[IFA_LOCAL] == NULL)
 		goto errout;
-	}
 
 	dev = __dev_get_by_index(&init_net, ifm->ifa_index);
-	if (dev == NULL) {
-		err = -ENODEV;
+	err = -ENODEV;
+	if (dev == NULL)
 		goto errout;
-	}
 
 	in_dev = __in_dev_get_rtnl(dev);
-	if (in_dev == NULL) {
-		err = -ENOBUFS;
+	err = -ENOBUFS;
+	if (in_dev == NULL)
 		goto errout;
-	}
 
 	ifa = inet_alloc_ifa();
-	if (ifa == NULL) {
+	if (ifa == NULL)
 		/*
 		 * A potential indev allocation can be left alive, it stays
 		 * assigned to its device and is destroy with it.
 		 */
-		err = -ENOBUFS;
 		goto errout;
-	}
 
 	ipv4_devconf_setall(in_dev);
 	in_dev_hold(in_dev);
-- 
1.5.3.rc5


^ permalink raw reply related

* [PATCH 4/7 net-2.6.25] [NETNS]: Process interface address manipulation routines in the namespace.
From: Denis V. Lunev @ 2008-01-25 13:52 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

The namespace is available where required, except in rtm_to_ifaddr. Add a
namespace argument to it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 net/ipv4/devinet.c |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e55c85e..6a6e92e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -485,7 +485,7 @@ errout:
 	return err;
 }
 
-static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh)
+static struct in_ifaddr *rtm_to_ifaddr(struct net *net, struct nlmsghdr *nlh)
 {
 	struct nlattr *tb[IFA_MAX+1];
 	struct in_ifaddr *ifa;
@@ -503,7 +503,7 @@ static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh)
 	if (ifm->ifa_prefixlen > 32 || tb[IFA_LOCAL] == NULL)
 		goto errout;
 
-	dev = __dev_get_by_index(&init_net, ifm->ifa_index);
+	dev = __dev_get_by_index(net, ifm->ifa_index);
 	err = -ENODEV;
 	if (dev == NULL)
 		goto errout;
@@ -571,7 +571,7 @@ static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg
 	if (net != &init_net)
 		return -EINVAL;
 
-	ifa = rtm_to_ifaddr(nlh);
+	ifa = rtm_to_ifaddr(net, nlh);
 	if (IS_ERR(ifa))
 		return PTR_ERR(ifa);
 
@@ -1189,7 +1189,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 
 	s_ip_idx = ip_idx = cb->args[1];
 	idx = 0;
-	for_each_netdev(&init_net, dev) {
+	for_each_netdev(net, dev) {
 		if (idx < s_idx)
 			goto cont;
 		if (idx > s_idx)
@@ -1223,7 +1223,9 @@ static void rtmsg_ifa(int event, struct in_ifaddr* ifa, struct nlmsghdr *nlh,
 	struct sk_buff *skb;
 	u32 seq = nlh ? nlh->nlmsg_seq : 0;
 	int err = -ENOBUFS;
+	struct net *net;
 
+	net = ifa->ifa_dev->dev->nd_net;
 	skb = nlmsg_new(inet_nlmsg_size(), GFP_KERNEL);
 	if (skb == NULL)
 		goto errout;
@@ -1235,10 +1237,10 @@ static void rtmsg_ifa(int event, struct in_ifaddr* ifa, struct nlmsghdr *nlh,
 		kfree_skb(skb);
 		goto errout;
 	}
-	err = rtnl_notify(skb, &init_net, pid, RTNLGRP_IPV4_IFADDR, nlh, GFP_KERNEL);
+	err = rtnl_notify(skb, net, pid, RTNLGRP_IPV4_IFADDR, nlh, GFP_KERNEL);
 errout:
 	if (err < 0)
-		rtnl_set_sk_err(&init_net, RTNLGRP_IPV4_IFADDR, err);
+		rtnl_set_sk_err(net, RTNLGRP_IPV4_IFADDR, err);
 }
 
 #ifdef CONFIG_SYSCTL
-- 
1.5.3.rc5


^ permalink raw reply related

* [PATCH 5/7 net-2.6.25] [IPV4]: fib_sync_down rework.
From: Denis V. Lunev @ 2008-01-25 13:52 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

fib_sync_down can be called with an address and with a device. In reality
it is called either with an address OR with a device. The codepath inside is
completely different, so let's separate it into two calls for these two
cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 include/net/ip_fib.h     |    3 +-
 net/ipv4/fib_frontend.c  |    4 +-
 net/ipv4/fib_semantics.c |  104 +++++++++++++++++++++++----------------------
 3 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 9daa60b..1b2f008 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -218,7 +218,8 @@ extern void fib_select_default(struct net *net, const struct flowi *flp,
 
 /* Exported by fib_semantics.c */
 extern int ip_fib_check_default(__be32 gw, struct net_device *dev);
-extern int fib_sync_down(__be32 local, struct net_device *dev, int force);
+extern int fib_sync_down_dev(struct net_device *dev, int force);
+extern int fib_sync_down_addr(__be32 local);
 extern int fib_sync_up(struct net_device *dev);
 extern __be32  __fib_res_prefsrc(struct fib_result *res);
 extern void fib_select_multipath(const struct flowi *flp, struct fib_result *res);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index d0507f4..d69ffa2 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -808,7 +808,7 @@ static void fib_del_ifaddr(struct in_ifaddr *ifa)
 			   First of all, we scan fib_info list searching
 			   for stray nexthop entries, then ignite fib_flush.
 			*/
-			if (fib_sync_down(ifa->ifa_local, NULL, 0))
+			if (fib_sync_down_addr(ifa->ifa_local))
 				fib_flush(dev->nd_net);
 		}
 	}
@@ -898,7 +898,7 @@ static void nl_fib_lookup_exit(struct net *net)
 
 static void fib_disable_ip(struct net_device *dev, int force)
 {
-	if (fib_sync_down(0, dev, force))
+	if (fib_sync_down_dev(dev, force))
 		fib_flush(dev->nd_net);
 	rt_cache_flush(0);
 	arp_ifdown(dev);
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index c791286..5beff2e 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1031,70 +1031,72 @@ nla_put_failure:
      referring to it.
    - device went down -> we must shutdown all nexthops going via it.
  */
-
-int fib_sync_down(__be32 local, struct net_device *dev, int force)
+int fib_sync_down_addr(__be32 local)
 {
 	int ret = 0;
-	int scope = RT_SCOPE_NOWHERE;
-
-	if (force)
-		scope = -1;
+	unsigned int hash = fib_laddr_hashfn(local);
+	struct hlist_head *head = &fib_info_laddrhash[hash];
+	struct hlist_node *node;
+	struct fib_info *fi;
 
-	if (local && fib_info_laddrhash) {
-		unsigned int hash = fib_laddr_hashfn(local);
-		struct hlist_head *head = &fib_info_laddrhash[hash];
-		struct hlist_node *node;
-		struct fib_info *fi;
+	if (fib_info_laddrhash == NULL || local == 0)
+		return 0;
 
-		hlist_for_each_entry(fi, node, head, fib_lhash) {
-			if (fi->fib_prefsrc == local) {
-				fi->fib_flags |= RTNH_F_DEAD;
-				ret++;
-			}
+	hlist_for_each_entry(fi, node, head, fib_lhash) {
+		if (fi->fib_prefsrc == local) {
+			fi->fib_flags |= RTNH_F_DEAD;
+			ret++;
 		}
 	}
+	return ret;
+}
 
-	if (dev) {
-		struct fib_info *prev_fi = NULL;
-		unsigned int hash = fib_devindex_hashfn(dev->ifindex);
-		struct hlist_head *head = &fib_info_devhash[hash];
-		struct hlist_node *node;
-		struct fib_nh *nh;
+int fib_sync_down_dev(struct net_device *dev, int force)
+{
+	int ret = 0;
+	int scope = RT_SCOPE_NOWHERE;
+	struct fib_info *prev_fi = NULL;
+	unsigned int hash = fib_devindex_hashfn(dev->ifindex);
+	struct hlist_head *head = &fib_info_devhash[hash];
+	struct hlist_node *node;
+	struct fib_nh *nh;
 
-		hlist_for_each_entry(nh, node, head, nh_hash) {
-			struct fib_info *fi = nh->nh_parent;
-			int dead;
+	if (force)
+		scope = -1;
 
-			BUG_ON(!fi->fib_nhs);
-			if (nh->nh_dev != dev || fi == prev_fi)
-				continue;
-			prev_fi = fi;
-			dead = 0;
-			change_nexthops(fi) {
-				if (nh->nh_flags&RTNH_F_DEAD)
-					dead++;
-				else if (nh->nh_dev == dev &&
-					 nh->nh_scope != scope) {
-					nh->nh_flags |= RTNH_F_DEAD;
+	hlist_for_each_entry(nh, node, head, nh_hash) {
+		struct fib_info *fi = nh->nh_parent;
+		int dead;
+
+		BUG_ON(!fi->fib_nhs);
+		if (nh->nh_dev != dev || fi == prev_fi)
+			continue;
+		prev_fi = fi;
+		dead = 0;
+		change_nexthops(fi) {
+			if (nh->nh_flags&RTNH_F_DEAD)
+				dead++;
+			else if (nh->nh_dev == dev &&
+					nh->nh_scope != scope) {
+				nh->nh_flags |= RTNH_F_DEAD;
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
-					spin_lock_bh(&fib_multipath_lock);
-					fi->fib_power -= nh->nh_power;
-					nh->nh_power = 0;
-					spin_unlock_bh(&fib_multipath_lock);
+				spin_lock_bh(&fib_multipath_lock);
+				fi->fib_power -= nh->nh_power;
+				nh->nh_power = 0;
+				spin_unlock_bh(&fib_multipath_lock);
 #endif
-					dead++;
-				}
+				dead++;
+			}
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
-				if (force > 1 && nh->nh_dev == dev) {
-					dead = fi->fib_nhs;
-					break;
-				}
-#endif
-			} endfor_nexthops(fi)
-			if (dead == fi->fib_nhs) {
-				fi->fib_flags |= RTNH_F_DEAD;
-				ret++;
+			if (force > 1 && nh->nh_dev == dev) {
+				dead = fi->fib_nhs;
+				break;
 			}
+#endif
+		} endfor_nexthops(fi)
+		if (dead == fi->fib_nhs) {
+			fi->fib_flags |= RTNH_F_DEAD;
+			ret++;
 		}
 	}
 
-- 
1.5.3.rc5


^ permalink raw reply related

* [PATCH 3/7 net-2.6.25] [IPV4]: Prohibit assignment of 0.0.0.0 as interface address.
From: Denis V. Lunev @ 2008-01-25 13:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, containers, Denis V. Lunev
In-Reply-To: <1201269123-20378-1-git-send-email-den@openvz.org>

I could hardly imagine why somebody needs to assign 0.0.0.0 as an interface
address or interface destination address. The kernel will behave in a strange
way in several places if this is possible, as ifa_local != 0 is considered
as initialized/non-initialized state of the ifa.

Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 net/ipv4/devinet.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 9da4c68..e55c85e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -534,7 +534,13 @@ static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh)
 	ifa->ifa_dev = in_dev;
 
 	ifa->ifa_local = nla_get_be32(tb[IFA_LOCAL]);
+	err = -EINVAL;
+	if (ifa->ifa_local == htonl(INADDR_ANY))
+		goto fail_free;
+
 	ifa->ifa_address = nla_get_be32(tb[IFA_ADDRESS]);
+	if (ifa->ifa_address == htonl(INADDR_ANY))
+		goto fail_free;
 
 	if (tb[IFA_BROADCAST])
 		ifa->ifa_broadcast = nla_get_be32(tb[IFA_BROADCAST]);
@@ -549,6 +555,8 @@ static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh)
 
 	return ifa;
 
+fail_free:
+	inet_free_ifa(ifa);
 errout:
 	return ERR_PTR(err);
 }
@@ -736,6 +744,8 @@ int devinet_ioctl(unsigned int cmd, void __user *arg)
 		ret = -EINVAL;
 		if (inet_abc_len(sin->sin_addr.s_addr) < 0)
 			break;
+		if (sin->sin_addr.s_addr == INADDR_ANY)
+			break;
 
 		if (!ifa) {
 			ret = -ENOBUFS;
@@ -786,6 +796,8 @@ int devinet_ioctl(unsigned int cmd, void __user *arg)
 		ret = -EINVAL;
 		if (inet_abc_len(sin->sin_addr.s_addr) < 0)
 			break;
+		if (sin->sin_addr.s_addr == INADDR_ANY)
+			break;
 		ret = 0;
 		inet_del_ifa(in_dev, ifap, 0);
 		ifa->ifa_address = sin->sin_addr.s_addr;
-- 
1.5.3.rc5


^ permalink raw reply related

* Re: [PATCH 3/7 net-2.6.25] [IPV4]: Prohibit assignment of 0.0.0.0 as interface address.
From: Daniel Lezcano @ 2008-01-25 14:01 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: davem, netdev, devel, containers
In-Reply-To: <1201269123-20378-3-git-send-email-den@openvz.org>

Denis V. Lunev wrote:
> I could hardly imagine why somebody needs to assign 0.0.0.0 as an interface
> address or interface destination address. The kernel will behave in a strange
> way in several places if this is possible, as ifa_local != 0 is considered
> as initialized/non-initialized state of the ifa.

AFAICS, we should be able to set an interface address to 0.0.0.0 in
order to remove the IP address from an interface while keeping it up.
I see two trivial cases:
  * remove the IPv4 address from an interface but continue to use it through IPv6
  * move the IPv4 address from the interface to an attached bridge

^ permalink raw reply
