* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:03 ` Ben Hutchings
@ 2012-08-09 19:23 ` Jay Vosburgh
2012-08-09 19:39 ` Flavio Leitner
2012-08-09 19:54 ` Jiri Pirko
2 siblings, 0 replies; 12+ messages in thread
From: Jay Vosburgh @ 2012-08-09 19:23 UTC (permalink / raw)
To: Ben Hutchings
Cc: Flavio Leitner, netdev, Andy Gospodarek, Leonardo Chiquitto,
Jiri Pirko
Ben Hutchings <bhutchings@solarflare.com> wrote:
>On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
>> It doesn't make any sense to allow the master to become
>> its slave. That creates a loop of events causing a crash.
>
>What if there are other intermediate devices, e.g. the slave is a VLAN
>sub-device of the bond? And doesn't team also have this problem?
>
>I think a more general check for such loops might be required.
I thought we had disallowed any nesting of bonds at all, but I
checked the netdev archives, and it appears we discussed it (and agreed
it didn't work), but it kind of petered out.
http://patchwork.ozlabs.org/patch/79705/
In any event, I think a patch like the following would get all
cases (double enslavement or enslavement of any bonding master) in one
place:
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6fae5f3..d14651c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1505,18 +1505,17 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	int link_reporting;
 	int res = 0;

+	if (slave_dev->priv_flags & IFF_BONDING) {
+		pr_debug("Error, Device was already enslaved\n");
+		return -EBUSY;
+	}
+
 	if (!bond->params.use_carrier && slave_dev->ethtool_ops == NULL &&
 		slave_ops->ndo_do_ioctl == NULL) {
 		pr_warning("%s: Warning: no link monitoring support for %s\n",
 			   bond_dev->name, slave_dev->name);
 	}

-	/* already enslaved */
-	if (slave_dev->flags & IFF_SLAVE) {
-		pr_debug("Error, Device was already enslaved\n");
-		return -EBUSY;
-	}
-
 	/* vlan challenged mutual exclusion */
 	/* no need to lock since we're protected by rtnl_lock */
 	if (slave_dev->features & NETIF_F_VLAN_CHALLENGED) {
This is basically the same logic that Jiri Bohac originally
proposed in the discussion I mention above, although this patch moves
the test further up and combines the master and slave tests into one.
Comments? I haven't tested this at all, but I think the logic
is correct. I don't think having two separate tests to get special
"master" and "slave" error cases is worthwhile.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:03 ` Ben Hutchings
2012-08-09 19:23 ` Jay Vosburgh
@ 2012-08-09 19:39 ` Flavio Leitner
2012-08-09 19:55 ` Jiri Pirko
2012-08-09 19:54 ` Jiri Pirko
2 siblings, 1 reply; 12+ messages in thread
From: Flavio Leitner @ 2012-08-09 19:39 UTC (permalink / raw)
To: Ben Hutchings
Cc: netdev, Jay Vosburgh, Andy Gospodarek, Leonardo Chiquitto,
Jiri Pirko
On Thu, 9 Aug 2012 20:03:23 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
> > It doesn't make any sense to allow the master to become
> > its slave. That creates a loop of events causing a crash.
>
> What if there are other intermediate devices, e.g. the slave is a VLAN
> sub-device of the bond? And doesn't team also have this problem?
>
> I think a more general check for such loops might be required.
Maybe patching netdev_set_master() to fail in the loop case is
the way to go. That would work for bonding, team and bridge.
What do you think?
fbl
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:39 ` Flavio Leitner
@ 2012-08-09 19:55 ` Jiri Pirko
2012-08-09 20:52 ` Flavio Leitner
2012-08-09 21:09 ` Ben Hutchings
0 siblings, 2 replies; 12+ messages in thread
From: Jiri Pirko @ 2012-08-09 19:55 UTC (permalink / raw)
To: Flavio Leitner
Cc: Ben Hutchings, netdev, Jay Vosburgh, Andy Gospodarek,
Leonardo Chiquitto
Thu, Aug 09, 2012 at 09:39:06PM CEST, fbl@redhat.com wrote:
>On Thu, 9 Aug 2012 20:03:23 +0100
>Ben Hutchings <bhutchings@solarflare.com> wrote:
>
>> On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
>> > It doesn't make any sense to allow the master to become
>> > its slave. That creates a loop of events causing a crash.
>>
>> What if there are other intermediate devices, e.g. the slave is a VLAN
>> sub-device of the bond? And doesn't team also have this problem?
>>
>> I think a more general check for such loops might be required.
>
>Maybe patching netdev_set_master() to fail in the loop case is
>the way to go. That would work for bonding, team and bridge.
>
>What do you think?
How about other devices that do not use "->master", like vlan and macvlan?
>
>fbl
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:55 ` Jiri Pirko
@ 2012-08-09 20:52 ` Flavio Leitner
2012-08-09 21:09 ` Ben Hutchings
1 sibling, 0 replies; 12+ messages in thread
From: Flavio Leitner @ 2012-08-09 20:52 UTC (permalink / raw)
To: Jiri Pirko
Cc: Ben Hutchings, netdev, Jay Vosburgh, Andy Gospodarek,
Leonardo Chiquitto
On Thu, 9 Aug 2012 21:55:39 +0200
Jiri Pirko <jpirko@redhat.com> wrote:
> Thu, Aug 09, 2012 at 09:39:06PM CEST, fbl@redhat.com wrote:
> >On Thu, 9 Aug 2012 20:03:23 +0100
> >Ben Hutchings <bhutchings@solarflare.com> wrote:
> >
> >> On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
> >> > It doesn't make any sense to allow the master to become
> >> > its slave. That creates a loop of events causing a crash.
> >>
> >> What if there are other intermediate devices, e.g. the slave is a VLAN
> >> sub-device of the bond? And doesn't team also have this problem?
> >>
> >> I think a more general check for such loops might be required.
> >
> >Maybe patching netdev_set_master() to fail in the loop case is
> >the way to go. That would work for bonding, team and bridge.
> >
> >What do you think?
>
> How about other devices that do not use "->master", like vlan and macvlan?
Didn't get you. This is what I had in mind, just to show the idea.
diff --git a/net/core/dev.c b/net/core/dev.c
index f91abf8..a404afb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4424,6 +4424,24 @@ static int __init dev_proc_init(void)
 #define dev_proc_init() 0
 #endif	/* CONFIG_PROC_FS */

+bool netdev_check_loop(struct net_device *master, struct net_device *dev)
+{
+	if (master == dev)
+		return true;
+
+	if (is_vlan_dev(dev))
+		return netdev_check_loop(master, vlan_dev_real_dev(dev));
+
+	/* is a bridge ?*/
+	if (dev->priv_flags & IFF_EBRIDGE) {
+		list_for_each_entry(p, &br->port_list, list) {
+			if (netdev_check_loop(master, p->dev))
+				return true;
+		}
+	}
+
+	return false;
+}

 /**
  *	netdev_set_master	-	set up master pointer
@@ -4447,6 +4465,9 @@ int netdev_set_master(struct net_device *slave, struct net_device *master)
 		dev_hold(master);
 	}

+	if (netdev_check_loop(master, slave))
+		return -EINVAL;
+
 	slave->master = master;

 	if (old)
fbl
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:55 ` Jiri Pirko
2012-08-09 20:52 ` Flavio Leitner
@ 2012-08-09 21:09 ` Ben Hutchings
2012-08-09 21:27 ` Jay Vosburgh
1 sibling, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2012-08-09 21:09 UTC (permalink / raw)
To: Jiri Pirko
Cc: Flavio Leitner, netdev, Jay Vosburgh, Andy Gospodarek,
Leonardo Chiquitto
On Thu, 2012-08-09 at 21:55 +0200, Jiri Pirko wrote:
> Thu, Aug 09, 2012 at 09:39:06PM CEST, fbl@redhat.com wrote:
> >On Thu, 9 Aug 2012 20:03:23 +0100
> >Ben Hutchings <bhutchings@solarflare.com> wrote:
> >
> >> On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
> >> > It doesn't make any sense to allow the master to become
> >> > its slave. That creates a loop of events causing a crash.
> >>
> >> What if there are other intermediate devices, e.g. the slave is a VLAN
> >> sub-device of the bond? And doesn't team also have this problem?
> >>
> >> I think a more general check for such loops might be required.
> >
> >Maybe patching netdev_set_master() to fail in the loop case is
> >the way to go. That would work for bonding, team and bridge.
> >
> >What do you think?
>
>
> How about other devices that do not use "->master", like vlan and macvlan?
And they shouldn't use master, because they allow multiple upper devices
to be stacked on a single lower device. Instead they use iflink, but
that's an ifindex and not a net_device pointer.
So I think we can catch simple loops with:
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4445,8 +4445,22 @@ int netdev_set_master(struct net_device *slave, struct net_device *master)
 	ASSERT_RTNL();

 	if (master) {
+		struct net_device *bottom, *top;
+
 		if (old)
 			return -EBUSY;
+
+		/* Prevent loops */
+		bottom = slave;
+		while (bottom->iflink != bottom->ifindex)
+			bottom = __dev_get_by_index(dev_net(bottom),
+						    bottom->iflink);
+		top = master;
+		while (top->master)
+			top = top->master;
+		if (top == bottom)
+			return -EBUSY;
+
 		dev_hold(master);
 	}

--- END ---
But then there can be quite silly device relationships like:
               +-------+
               | bond0 |
               ++-----++
               /       \
+-------+ +---+---+ +---+---+ +-------+
| vlan0 | | vlan1 | | vlan2 | | vlan3 |
+---+---+ +---+---+ +---+---+ +---+---+
     \       /           \       /
     ++-----++           ++--+--++
     | bond1 |           | bond2 |
     +-------+           +-------+
      :     :             :     :
Suppose the user tries to make bond0 a slave of bond1; we need to go to
somewhat more effort to detect the loop.
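For illustration only, here is a userspace toy model of the diagram above; it is not kernel code and not part of the patch. The type `toy_dev` and both helper functions are hypothetical names. The `lower` pointer stands in for the iflink lookup, and the existing topology is assumed to be loop-free. Under those assumptions the simple top/bottom walk misses the bond0-under-bond1 case, while a search that also descends through slaves finds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct toy_dev {
	const char *name;
	struct toy_dev *master;	/* bonding-style master, or NULL */
	struct toy_dev *lower;	/* vlan-style real device (models iflink), or NULL */
};

/* Topology from the diagram: vlan1 and vlan2 are slaves of bond0. */
static struct toy_dev bond0 = { "bond0", NULL, NULL };
static struct toy_dev bond1 = { "bond1", NULL, NULL };
static struct toy_dev bond2 = { "bond2", NULL, NULL };
static struct toy_dev vlan0 = { "vlan0", NULL, &bond1 };
static struct toy_dev vlan1 = { "vlan1", &bond0, &bond1 };
static struct toy_dev vlan2 = { "vlan2", &bond0, &bond2 };
static struct toy_dev vlan3 = { "vlan3", NULL, &bond2 };

static struct toy_dev *const all_devs[] = {
	&bond0, &bond1, &bond2, &vlan0, &vlan1, &vlan2, &vlan3,
};
#define N_DEVS (sizeof(all_devs) / sizeof(all_devs[0]))

/* Simple check: walk the slave down its lower links and the master up
 * its master links, then compare the two end points.
 */
static bool simple_check_finds_loop(struct toy_dev *slave,
				    struct toy_dev *master)
{
	struct toy_dev *bottom = slave, *top = master;

	assert(slave && master);
	while (bottom->lower)
		bottom = bottom->lower;
	while (top->master)
		top = top->master;
	return top == bottom;
}

/* True if target is dev itself or sits anywhere below dev, following
 * both lower links and slaves. Terminates only on a loop-free graph.
 */
static bool reaches_below(const struct toy_dev *dev,
			  const struct toy_dev *target)
{
	size_t i;

	if (dev == target)
		return true;
	if (dev->lower && reaches_below(dev->lower, target))
		return true;
	for (i = 0; i < N_DEVS; i++)
		if (all_devs[i]->master == dev &&
		    reaches_below(all_devs[i], target))
			return true;
	return false;
}

/* Enslaving slave to master closes a loop if master is already below slave. */
static bool full_check_finds_loop(struct toy_dev *slave,
				  struct toy_dev *master)
{
	assert(slave && master);
	return reaches_below(slave, master);
}
```

Calling `simple_check_finds_loop(&bond0, &bond1)` walks nowhere in either direction and compares bond1 with bond0, so it reports no loop; `full_check_finds_loop(&bond0, &bond1)` descends bond0 -> vlan1 -> bond1 and reports one.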
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 21:09 ` Ben Hutchings
@ 2012-08-09 21:27 ` Jay Vosburgh
2012-08-09 23:43 ` David Miller
0 siblings, 1 reply; 12+ messages in thread
From: Jay Vosburgh @ 2012-08-09 21:27 UTC (permalink / raw)
To: Ben Hutchings
Cc: Jiri Pirko, Flavio Leitner, netdev, Andy Gospodarek,
Leonardo Chiquitto
Ben Hutchings <bhutchings@solarflare.com> wrote:
>On Thu, 2012-08-09 at 21:55 +0200, Jiri Pirko wrote:
[...]
>> How about other devices that do not use "->master", like vlan and macvlan?
>
>And they shouldn't use master, because they allow multiple upper devices
>to be stacked on a single lower device. Instead they use iflink, but
>that's an ifindex and not a net_device pointer.
>
>So I think we can catch simple loops with:
>
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -4445,8 +4445,22 @@ int netdev_set_master(struct net_device *slave, struct net_device *master)
> 	ASSERT_RTNL();
>
> 	if (master) {
>+		struct net_device *bottom, *top;
>+
> 		if (old)
> 			return -EBUSY;
>+
>+		/* Prevent loops */
>+		bottom = slave;
>+		while (bottom->iflink != bottom->ifindex)
>+			bottom = __dev_get_by_index(dev_net(bottom),
>+						    bottom->iflink);
>+		top = master;
>+		while (top->master)
>+			top = top->master;
>+		if (top == bottom)
>+			return -EBUSY;
>+
> 		dev_hold(master);
> 	}
>
>--- END ---
>
>But then there can be quite silly device relationships like:
>
>               +-------+
>               | bond0 |
>               ++-----++
>               /       \
>+-------+ +---+---+ +---+---+ +-------+
>| vlan0 | | vlan1 | | vlan2 | | vlan3 |
>+---+---+ +---+---+ +---+---+ +---+---+
>     \       /           \       /
>     ++-----++           ++--+--++
>     | bond1 |           | bond2 |
>     +-------+           +-------+
>      :     :             :     :
>
>Suppose the user tries to make bond0 a slave of bond1; we need to go to
>somewhat more effort to detect the loop.
If that's hard to do (and it might be; I'm not aware of a
standard way to run up and down those stacks of interfaces, which might
not always be vlans in the middle), there's still the priv_flags &
IFF_BONDING test that bonding could (and probably should) do itself as
well. The team driver could presumably have a similar test, although I
seem to recall that team was allowed to nest.
FWIW, I've seen both the top and bottom halves of that picture
in use (i.e., bonds consisting of vlans as slaves or bonds with vlans
configured above them), but not combined as in your diagram.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 21:27 ` Jay Vosburgh
@ 2012-08-09 23:43 ` David Miller
2012-08-10 13:04 ` Jiri Pirko
0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2012-08-09 23:43 UTC (permalink / raw)
To: fubar; +Cc: bhutchings, jpirko, fbl, netdev, andy, lchiquitto
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Thu, 09 Aug 2012 14:27:08 -0700
> If that's hard to do (and it might be; I'm not aware of a
> standard way to run up and down those stacks of interfaces, which might
> not always be vlans in the middle), there's still the priv_flags &
> IFF_BONDING test that bonding could (and probably should) do itself as
> well. The team driver could presumably have a similar test, although I
> seem to recall that team was allowed to nest.
>
> FWIW, I've seen both the top and bottom halves of that picture
> in use (i.e., bonds consisting of vlans as slaves or bonds with vlans
> configured above them), but not combined as in your diagram.
We're basically looking for cycles in a complex graph.
Some combination of Jay and Ben's most recent patches, with some minor
modifications, ought to do it.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 23:43 ` David Miller
@ 2012-08-10 13:04 ` Jiri Pirko
0 siblings, 0 replies; 12+ messages in thread
From: Jiri Pirko @ 2012-08-10 13:04 UTC (permalink / raw)
To: David Miller; +Cc: fubar, bhutchings, jpirko, fbl, netdev, andy, lchiquitto
Fri, Aug 10, 2012 at 01:43:31AM CEST, davem@davemloft.net wrote:
>From: Jay Vosburgh <fubar@us.ibm.com>
>Date: Thu, 09 Aug 2012 14:27:08 -0700
>
>> If that's hard to do (and it might be; I'm not aware of a
>> standard way to run up and down those stacks of interfaces, which might
>> not always be vlans in the middle), there's still the priv_flags &
>> IFF_BONDING test that bonding could (and probably should) do itself as
>> well. The team driver could presumably have a similar test, although I
>> seem to recall that team was allowed to nest.
>>
>> FWIW, I've seen both the top and bottom halves of that picture
>> in use (i.e., bonds consisting of vlans as slaves or bonds with vlans
>> configured above them), but not combined as in your diagram.
>
>We're basically looking for cycles in a complex graph.
>
>Some combination of Jay and Ben's most recent patches, with some minor
>modifications, ought to do it.
Hmm. It would probably be good to have a list/table of related devices,
possibly with information about the relation.
After that, every relation add would check for loops.
I will dive into the code over the weekend to see if this is doable in
some nice way.
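A minimal userspace sketch of that idea, assuming a flat table of upper/lower pairs keyed by an ifindex-like integer. Every name here (`rel_table`, `rel_add`, `rel_kind` and so on) is hypothetical and nothing like it exists in net/core today. Because each add refuses to close a loop, the table stays loop-free and the recursive walk always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RELATIONS 64

/* Hypothetical relation types; the check itself ignores them. */
enum rel_kind { REL_MASTER, REL_VLAN };

struct relation {
	int upper;		/* id of the device stacked on top */
	int lower;		/* id of the device underneath */
	enum rel_kind kind;
};

struct rel_table {
	struct relation rel[MAX_RELATIONS];
	size_t count;
};

/* True if "to" can be reached from "from" by following upper->lower edges. */
static bool rel_reachable(const struct rel_table *t, int from, int to)
{
	size_t i;

	if (from == to)
		return true;
	for (i = 0; i < t->count; i++)
		if (t->rel[i].upper == from &&
		    rel_reachable(t, t->rel[i].lower, to))
			return true;
	return false;
}

/* Record "upper sits on lower". Returns 0 on success, -1 if the table
 * is full or if the new relation would create a loop.
 */
static int rel_add(struct rel_table *t, int upper, int lower,
		   enum rel_kind kind)
{
	assert(t);
	if (t->count == MAX_RELATIONS)
		return -1;
	/* A loop appears exactly when upper is already below lower. */
	if (rel_reachable(t, lower, upper))
		return -1;
	t->rel[t->count].upper = upper;
	t->rel[t->count].lower = lower;
	t->rel[t->count].kind = kind;
	t->count++;
	return 0;
}
```

With ids 0, 1 and 2 standing for bond0, vlan1 and bond1, adding (1 on 2) and (0 on 1) succeeds, and a later attempt to add (2 on 0) is refused because 2 is already reachable below 0. A device stacked on itself is refused by the same test.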
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [net-next] bonding: don't allow the master to become its slave
2012-08-09 19:03 ` Ben Hutchings
2012-08-09 19:23 ` Jay Vosburgh
2012-08-09 19:39 ` Flavio Leitner
@ 2012-08-09 19:54 ` Jiri Pirko
2 siblings, 0 replies; 12+ messages in thread
From: Jiri Pirko @ 2012-08-09 19:54 UTC (permalink / raw)
To: Ben Hutchings
Cc: Flavio Leitner, netdev, Jay Vosburgh, Andy Gospodarek,
Leonardo Chiquitto
Thu, Aug 09, 2012 at 09:03:23PM CEST, bhutchings@solarflare.com wrote:
>On Thu, 2012-08-09 at 15:30 -0300, Flavio Leitner wrote:
>> It doesn't make any sense to allow the master to become
>> its slave. That creates a loop of events causing a crash.
>
>What if there are other intermediate devices, e.g. the slave is a VLAN
>sub-device of the bond? And doesn't team also have this problem?
Yes, it does.
>
>I think a more general check for such loops might be required.
I agree.
>
>Ben.
>
>> Reported-by: Leonardo Chiquitto <lchiquitto@suse.com>
>> Signed-off-by: Flavio Leitner <fbl@redhat.com>
>> ---
>> drivers/net/bonding/bond_main.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 6fae5f3..5407b44 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -1505,6 +1505,11 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>>  	int link_reporting;
>>  	int res = 0;
>>
>> +	if (bond_dev == slave_dev) {
>> +		pr_err("%s: Error: cannot enslave itself.\n", bond_dev->name);
>> +		return -EINVAL;
>> +	}
>> +
>>  	if (!bond->params.use_carrier && slave_dev->ethtool_ops == NULL &&
>>  		slave_ops->ndo_do_ioctl == NULL) {
>>  		pr_warning("%s: Warning: no link monitoring support for %s\n",
>
>--
>Ben Hutchings, Staff Engineer, Solarflare
>Not speaking for my employer; that's the marketing department's job.
>They asked us to note that Solarflare product names are trademarked.
>
^ permalink raw reply [flat|nested] 12+ messages in thread