* [PATCH] bonding: fix error handling if slave is busy
@ 2011-12-30 22:40 Stephen Hemminger
2011-12-31 16:11 ` Nicolas de Pesloüan
0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2011-12-30 22:40 UTC (permalink / raw)
To: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
The bonding device can cause a kernel panic in the enslave error handling.
If the slave device already has a receive handler registered, then the
error unwind does not clear the new entry out of the slave list.
This leaves a reference to freed memory in the bond device's slave
linked list.
The following is a simple example:
# modprobe dummy
# ip li add dummy0-1 link dummy0 type macvlan
# modprobe bonding
# echo +dummy0 >/sys/class/net/bond0/bonding/slaves
# ip -s li show dev bond0
This returns -EBUSY, but the bonding device has a bogus entry in
the slave list, and will panic on the next operation that gets statistics
from bond0.
The fix is to detach the slave (which removes it from the list)
in the unwind path.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
---
Patch is against net-next but should be applied to net (3.2), and
stable (3.1 and 3.0).
--- a/drivers/net/bonding/bond_main.c 2011-12-30 14:20:03.171823181 -0800
+++ b/drivers/net/bonding/bond_main.c 2011-12-30 14:20:20.232020474 -0800
@@ -1853,6 +1853,9 @@ err_dest_symlinks:
 	bond_destroy_slave_symlinks(bond_dev, slave_dev);
 
 err_close:
+	write_lock_bh(&bond->lock);
+	bond_detach_slave(bond, new_slave);
+	write_unlock_bh(&bond->lock);
 	dev_close(slave_dev);
 
 err_unset_master:
* Re: [PATCH] bonding: fix error handling if slave is busy
2011-12-30 22:40 [PATCH] bonding: fix error handling if slave is busy Stephen Hemminger
@ 2011-12-31 16:11 ` Nicolas de Pesloüan
2011-12-31 23:26 ` [PATCH] bonding: fix error handling if slave is busy (v2) Stephen Hemminger
0 siblings, 1 reply; 7+ messages in thread
From: Nicolas de Pesloüan @ 2011-12-31 16:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
On 30/12/2011 23:40, Stephen Hemminger wrote:
> The bonding device can cause kernel panic in the enslave error handling.
>
> If slave device already has a receive handler registered, then the
> error unwind does not clear the new entry out of the slave list.
> This ends up leaving a reference to freed memory in the bond
> device slave linked list.
>
> The following is a simple example:
> # modprobe dummy
> # ip li add dummy0-1 link dummy0 type macvlan
> # modprobe bonding
> # echo +dummy0 >/sys/class/net/bond0/bonding/slaves
> # ip -s li show dev bond0
>
> This returns with -EBUSY, but the bonding device has bogus entry in
> the slave list, and will panic on next operation that gets statistics
> from bond0.
>
> The fix is to detach the slave (which removes it from the list)
> in the unwind path.
>
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>
> ---
> Patch is against net-next but should be applied to net (3.2), and
> stable (3.1 and 3.0).
>
> --- a/drivers/net/bonding/bond_main.c 2011-12-30 14:20:03.171823181 -0800
> +++ b/drivers/net/bonding/bond_main.c 2011-12-30 14:20:20.232020474 -0800
> @@ -1853,6 +1853,9 @@ err_dest_symlinks:
>  	bond_destroy_slave_symlinks(bond_dev, slave_dev);
>
>  err_close:
> +	write_lock_bh(&bond->lock);
> +	bond_detach_slave(bond, new_slave);
> +	write_unlock_bh(&bond->lock);
>  	dev_close(slave_dev);
>
>  err_unset_master:
NAK.
There are three 'goto err_close' statements before the call to bond_attach_slave. For those three
gotos, your patch will call bond_detach_slave without a previous call to bond_attach_slave.
This would at least decrement bond->slave_cnt without it having been incremented first.
Am I missing something?
Nicolas.
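The imbalance described in this review can be illustrated with a small userspace sketch. It is not the bonding driver: struct toy_bond, toy_attach(), toy_detach() and toy_enslave_v1() are hypothetical stand-ins, and only the slave counter is modeled, on the assumption that detaching always decrements it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the bond; only the slave counter is modeled. */
struct toy_bond {
	int slave_cnt;
};

static void toy_attach(struct toy_bond *b) { b->slave_cnt++; }
static void toy_detach(struct toy_bond *b) { b->slave_cnt--; }

/*
 * Hypothetical v1-style enslave: every failure jumps to err_close,
 * and err_close always detaches. fail_before_attach models one of the
 * early "goto err_close" sites; fail_after_attach models the busy
 * receive handler case.
 */
static int toy_enslave_v1(struct toy_bond *b, bool fail_before_attach,
			  bool fail_after_attach)
{
	int res;

	if (fail_before_attach) {
		res = -EBUSY;
		goto err_close;
	}

	toy_attach(b);

	if (fail_after_attach) {
		res = -EBUSY;
		goto err_close;
	}

	return 0;

err_close:
	toy_detach(b);	/* runs even when toy_attach() never did */
	return res;
}
```

In this model, a failure before the attach step leaves slave_cnt at -1, while a failure after it leaves the counter balanced at 0.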
* [PATCH] bonding: fix error handling if slave is busy (v2)
2011-12-31 16:11 ` Nicolas de Pesloüan
@ 2011-12-31 23:26 ` Stephen Hemminger
2012-01-01 0:09 ` Nicolas de Pesloüan
0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2011-12-31 23:26 UTC (permalink / raw)
To: Nicolas de Pesloüan
Cc: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
If the slave device already has a receive handler registered, then the
error unwind of the bonding device's enslave function is broken.
The following will leave a pointer to freed memory in the slave
device list, causing a later kernel panic.
# modprobe dummy
# ip li add dummy0-1 link dummy0 type macvlan
# modprobe bonding
# echo +dummy0 >/sys/class/net/bond0/bonding/slaves
The fix is to detach the slave (which removes it from the list)
in the unwind path.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
---
v2 - keep the original err_close label for the other unwind paths
--- a/drivers/net/bonding/bond_main.c 2011-12-30 14:20:03.171823181 -0800
+++ b/drivers/net/bonding/bond_main.c 2011-12-31 15:20:16.493379415 -0800
@@ -1822,7 +1822,7 @@ int bond_enslave(struct net_device *bond
 				 "but new slave device does not support netpoll.\n",
 				 bond_dev->name);
 			res = -EBUSY;
-			goto err_close;
+			goto err_detach;
 		}
 	}
 #endif
@@ -1831,7 +1831,7 @@ int bond_enslave(struct net_device *bond
 
 	res = bond_create_slave_symlinks(bond_dev, slave_dev);
 	if (res)
-		goto err_close;
+		goto err_detach;
 
 	res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
 					 new_slave);
@@ -1852,6 +1852,11 @@ int bond_enslave(struct net_device *bond
 err_dest_symlinks:
 	bond_destroy_slave_symlinks(bond_dev, slave_dev);
 
+err_detach:
+	write_lock_bh(&bond->lock);
+	bond_detach_slave(bond, new_slave);
+	write_unlock_bh(&bond->lock);
+
 err_close:
 	dev_close(slave_dev);
 
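The label layout in this version can be sketched in userspace as follows. This is a toy model under stated assumptions, not the driver code: struct toy_slave, struct toy_bond, toy_attach(), toy_detach() and toy_enslave_v2() are hypothetical names, the slave list is a plain singly linked list, and freeing the slave is folded into err_close for brevity (the real function has further unwind labels after it).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_slave {
	struct toy_slave *next;
};

struct toy_bond {
	struct toy_slave *first;
	int slave_cnt;
};

/* Link the slave at the head of the list and count it. */
static void toy_attach(struct toy_bond *b, struct toy_slave *s)
{
	s->next = b->first;
	b->first = s;
	b->slave_cnt++;
}

/* Unlink the slave; it must currently be on the list. */
static void toy_detach(struct toy_bond *b, struct toy_slave *s)
{
	struct toy_slave **pp = &b->first;

	while (*pp && *pp != s)
		pp = &(*pp)->next;
	assert(*pp == s);
	*pp = s->next;
	b->slave_cnt--;
}

enum toy_fail { TOY_FAIL_NONE, TOY_FAIL_EARLY, TOY_FAIL_LATE };

/*
 * Failures before the attach step jump to err_close; failures after it
 * jump to err_detach, which falls through into err_close.
 */
static int toy_enslave_v2(struct toy_bond *b, enum toy_fail fail)
{
	struct toy_slave *s = malloc(sizeof(*s));
	int res;

	if (!s)
		return -ENOMEM;

	if (fail == TOY_FAIL_EARLY) {
		res = -EBUSY;
		goto err_close;
	}

	toy_attach(b, s);

	if (fail == TOY_FAIL_LATE) {
		res = -EBUSY;
		goto err_detach;
	}

	return 0;

err_detach:
	toy_detach(b, s);	/* only reached once the slave is on the list */
err_close:
	free(s);		/* stands in for the remaining unwind steps */
	return res;
}
```

With the labels ordered this way, each failure point undoes only the steps that actually ran, so in this model the list never keeps a pointer to the freed slave and the counter stays balanced.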
* Re: [PATCH] bonding: fix error handling if slave is busy (v2)
2011-12-31 23:26 ` [PATCH] bonding: fix error handling if slave is busy (v2) Stephen Hemminger
@ 2012-01-01 0:09 ` Nicolas de Pesloüan
2012-01-01 0:13 ` Stephen Hemminger
2012-01-03 17:49 ` David Miller
0 siblings, 2 replies; 7+ messages in thread
From: Nicolas de Pesloüan @ 2012-01-01 0:09 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
On 01/01/2012 00:26, Stephen Hemminger wrote:
> If slave device already has a receive handler registered, then the
> error unwind of bonding device enslave function is broken.
>
> The following will leave a pointer to freed memory in the slave
> device list, causing a later kernel panic.
> # modprobe dummy
> # ip li add dummy0-1 link dummy0 type macvlan
> # modprobe bonding
> # echo +dummy0 >/sys/class/net/bond0/bonding/slaves
>
> The fix is to detach the slave (which removes it from the list)
> in the unwind path.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Thanks Stephen.
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
> ---
> v2 - need to keep original err_close for other unwind
>
> --- a/drivers/net/bonding/bond_main.c 2011-12-30 14:20:03.171823181 -0800
> +++ b/drivers/net/bonding/bond_main.c 2011-12-31 15:20:16.493379415 -0800
> @@ -1822,7 +1822,7 @@ int bond_enslave(struct net_device *bond
>  				 "but new slave device does not support netpoll.\n",
>  				 bond_dev->name);
>  			res = -EBUSY;
> -			goto err_close;
> +			goto err_detach;
>  		}
>  	}
>  #endif
> @@ -1831,7 +1831,7 @@ int bond_enslave(struct net_device *bond
>
>  	res = bond_create_slave_symlinks(bond_dev, slave_dev);
>  	if (res)
> -		goto err_close;
> +		goto err_detach;
>
>  	res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
>  					 new_slave);
> @@ -1852,6 +1852,11 @@ int bond_enslave(struct net_device *bond
>  err_dest_symlinks:
>  	bond_destroy_slave_symlinks(bond_dev, slave_dev);
>
> +err_detach:
> +	write_lock_bh(&bond->lock);
> +	bond_detach_slave(bond, new_slave);
> +	write_unlock_bh(&bond->lock);
> +
>  err_close:
>  	dev_close(slave_dev);
>
* Re: [PATCH] bonding: fix error handling if slave is busy (v2)
2012-01-01 0:09 ` Nicolas de Pesloüan
@ 2012-01-01 0:13 ` Stephen Hemminger
2012-01-01 0:28 ` Nicolas de Pesloüan
2012-01-03 17:49 ` David Miller
1 sibling, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2012-01-01 0:13 UTC (permalink / raw)
To: Nicolas de Pesloüan
Cc: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
On Sun, 01 Jan 2012 01:09:50 +0100
Nicolas de Pesloüan <nicolas.2p.debian@gmail.com> wrote:
> On 01/01/2012 00:26, Stephen Hemminger wrote:
> > If slave device already has a receive handler registered, then the
> > error unwind of bonding device enslave function is broken.
> >
> > The following will leave a pointer to freed memory in the slave
> > device list, causing a later kernel panic.
> > # modprobe dummy
> > # ip li add dummy0-1 link dummy0 type macvlan
> > # modprobe bonding
> > # echo +dummy0 >/sys/class/net/bond0/bonding/slaves
> >
> > The fix is to detach the slave (which removes it from the list)
> > in the unwind path.
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>
> Thanks Stephen.
>
> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
The locking in the bonding driver is a tangled web.
It would be cleaner to get rid of bond->lock altogether.
Slave add/delete should be protected by RTNL, and the lookup should
be converted to RCU. The problem is that the bonding driver implements
its own form of circular list to handle round-robin etc.
* Re: [PATCH] bonding: fix error handling if slave is busy (v2)
2012-01-01 0:13 ` Stephen Hemminger
@ 2012-01-01 0:28 ` Nicolas de Pesloüan
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas de Pesloüan @ 2012-01-01 0:28 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Jay Vosburgh, Andy Gospodarek, netdev
On 01/01/2012 01:13, Stephen Hemminger wrote:
> On Sun, 01 Jan 2012 01:09:50 +0100
> Nicolas de Pesloüan <nicolas.2p.debian@gmail.com> wrote:
>
>> On 01/01/2012 00:26, Stephen Hemminger wrote:
>>> If slave device already has a receive handler registered, then the
>>> error unwind of bonding device enslave function is broken.
>>>
>>> The following will leave a pointer to freed memory in the slave
>>> device list, causing a later kernel panic.
>>> # modprobe dummy
>>> # ip li add dummy0-1 link dummy0 type macvlan
>>> # modprobe bonding
>>> # echo +dummy0 >/sys/class/net/bond0/bonding/slaves
>>>
>>> The fix is to detach the slave (which removes it from the list)
>>> in the unwind path.
>>>
>>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>
>> Thanks Stephen.
>>
>> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
>
> The locking in bond driver is a tangled web.
>
> Would be cleaner to get rid of bond->lock altogether.
> Slave add/delete should be protected by RTNL, and the lookup should
> be converted to RCU. The problem is that bonding driver implements
> own form of circular list to handle round-robin etc.
Bonding has become an incredibly complex thing, due to the large number of corner cases it needs to
handle. And the locking system is probably part of the problem.
Unfortunately, I'm far from a Linux locking specialist, so I cannot comment on this... I just
noticed that searching for RTNL in Documentation yields no results... :-(
Nicolas.
* Re: [PATCH] bonding: fix error handling if slave is busy (v2)
2012-01-01 0:09 ` Nicolas de Pesloüan
2012-01-01 0:13 ` Stephen Hemminger
@ 2012-01-03 17:49 ` David Miller
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2012-01-03 17:49 UTC (permalink / raw)
To: nicolas.2p.debian; +Cc: shemminger, fubar, andy, netdev
From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Date: Sun, 01 Jan 2012 01:09:50 +0100
> On 01/01/2012 00:26, Stephen Hemminger wrote:
>> If slave device already has a receive handler registered, then the
>> error unwind of bonding device enslave function is broken.
>>
>> The following will leave a pointer to freed memory in the slave
>> device list, causing a later kernel panic.
>> # modprobe dummy
>> # ip li add dummy0-1 link dummy0 type macvlan
>> # modprobe bonding
>> # echo +dummy0 >/sys/class/net/bond0/bonding/slaves
>>
>> The fix is to detach the slave (which removes it from the list)
>> in the unwind path.
>>
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>
> Thanks Stephen.
>
> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Applied, thanks.