* [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 @ 2014-03-28 9:22 Zheng Li 2014-03-31 6:58 ` zheng.li 2014-04-01 0:35 ` Jay Vosburgh 0 siblings, 2 replies; 12+ messages in thread From: Zheng Li @ 2014-03-28 9:22 UTC (permalink / raw) To: netdev, fubar, andy; +Cc: linux-kernel, davem, joe.jin, zheng.x.li In bond mode tlb and alb, inactive slaves should keep inactive flag to 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets (for example ARP requests) which will arrive inactive slaves on same host from switch, but inactive slave's inactive flag is zero that cause bridge receive the broadcast packets to produce a wrong entry in forward table. Typical situation is domu send some ARP request which go out from dom0 bond's active slave, then the ARP broadcast request packets go back to inactive slave from switch, because the inactive slave's inactive flag is zero, kernel will receive the packets and pass them to bridge that cause dom0's bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. Signed-off-by: Zheng Li <zheng.x.li@oracle.com> --- drivers/net/bonding/bond_main.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index e5628fc..f97d72e 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3058,7 +3058,7 @@ static int bond_open(struct net_device *bond_dev) if (bond_has_slaves(bond)) { read_lock(&bond->curr_slave_lock); bond_for_each_slave(bond, slave, iter) { - if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) + if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP || bond_is_lb(bond)) && (slave != bond->curr_active_slave)) { bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW); -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 2014-03-28 9:22 [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 Zheng Li @ 2014-03-31 6:58 ` zheng.li 2014-04-01 0:35 ` Jay Vosburgh 1 sibling, 0 replies; 12+ messages in thread From: zheng.li @ 2014-03-31 6:58 UTC (permalink / raw) To: fubar; +Cc: Zheng Li, netdev, andy, linux-kernel, davem, joe.jin Hi Jay, What's you think about the new patch. Thanks, Zheng Li 于 2014年03月28日 17:22, Zheng Li 写道: > In bond mode tlb and alb, inactive slaves should keep inactive flag to 1 to > refuse to receive broadcast packets. Now, active slave send broadcast packets > (for example ARP requests) which will arrive inactive slaves on same host from > switch, but inactive slave's inactive flag is zero that cause bridge receive the > broadcast packets to produce a wrong entry in forward table. Typical situation > is domu send some ARP request which go out from dom0 bond's active slave, then > the ARP broadcast request packets go back to inactive slave from switch, because > the inactive slave's inactive flag is zero, kernel will receive the packets and > pass them to bridge that cause dom0's bridge map domu's MAC address to port of > bond, bridge should map domu's MAC to port of vif. > > Signed-off-by: Zheng Li <zheng.x.li@oracle.com> > --- > drivers/net/bonding/bond_main.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index e5628fc..f97d72e 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -3058,7 +3058,7 @@ static int bond_open(struct net_device *bond_dev) > if (bond_has_slaves(bond)) { > read_lock(&bond->curr_slave_lock); > bond_for_each_slave(bond, slave, iter) { > - if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) > + if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP || bond_is_lb(bond)) > && (slave != bond->curr_active_slave)) { > bond_set_slave_inactive_flags(slave, > BOND_SLAVE_NOTIFY_NOW); > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 2014-03-28 9:22 [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 Zheng Li 2014-03-31 6:58 ` zheng.li @ 2014-04-01 0:35 ` Jay Vosburgh 1 sibling, 0 replies; 12+ messages in thread From: Jay Vosburgh @ 2014-04-01 0:35 UTC (permalink / raw) To: Zheng Li; +Cc: netdev, andy, linux-kernel, davem, joe.jin Zheng Li <zheng.x.li@oracle.com> wrote: >In bond mode tlb and alb, inactive slaves should keep inactive flag to 1 to >refuse to receive broadcast packets. Now, active slave send broadcast packets >(for example ARP requests) which will arrive inactive slaves on same host from >switch, but inactive slave's inactive flag is zero that cause bridge receive the >broadcast packets to produce a wrong entry in forward table. Typical situation >is domu send some ARP request which go out from dom0 bond's active slave, then >the ARP broadcast request packets go back to inactive slave from switch, because >the inactive slave's inactive flag is zero, kernel will receive the packets and >pass them to bridge that cause dom0's bridge map domu's MAC address to port of >bond, bridge should map domu's MAC to port of vif. I think the patch is ok (I don't have a machine to test it on at the moment), but the description above is leaving out some details about how the problem is induced. The actual problem being fixed here is that bond_open is not setting the inactive flag correctly for some modes (alb and tlb), resulting in the behavior described above if the bond has been administratively set down and then back up. This effect should not occur when slaves are added while the bond is up; it's something that only happens after a down/up bounce of the bond. That said, the patch itself looks fine to me. Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com> -J >Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >--- > drivers/net/bonding/bond_main.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index e5628fc..f97d72e 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -3058,7 +3058,7 @@ static int bond_open(struct net_device *bond_dev) > if (bond_has_slaves(bond)) { > read_lock(&bond->curr_slave_lock); > bond_for_each_slave(bond, slave, iter) { >- if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) >+ if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP || bond_is_lb(bond)) > && (slave != bond->curr_active_slave)) { > bond_set_slave_inactive_flags(slave, > BOND_SLAVE_NOTIFY_NOW); >-- >1.7.6.5 --- -Jay Vosburgh, j.vosburgh@gmail.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. @ 2014-03-20 8:51 Zheng Li 2014-03-20 9:36 ` Ding Tianhong ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Zheng Li @ 2014-03-20 8:51 UTC (permalink / raw) To: netdev, fubar, andy; +Cc: linux-kernel, davem, joe.jin, zheng.x.li Except bond mode 1, in other bond modes, inactive slaves should keep inactive flag to 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets (for example ARP requests) which will arrive inactive slaves on same host from switch, but inactive slave's inactive flag is zero that cause bridge receive the broadcast packets to produce a wrong entry in forward table. Typical situation is domu send some ARP request which go out from dom0 bond's active slave, then the ARP broadcast request packets go back to inactive slave from switch, because the inactive slave's inactive flag is zero, kernel will receive the packets and pass them to bridge, that cause dom0's bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. Signed-off-by: Zheng Li <zheng.x.li@oracle.com> --- drivers/net/bonding/bond_main.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index e5628fc..2f73f18 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW); } else { - bond_set_slave_active_flags(slave, + bond_set_slave_state(slave, BOND_STATE_ACTIVE, BOND_SLAVE_NOTIFY_NOW); } } -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-20 8:51 Zheng Li @ 2014-03-20 9:36 ` Ding Tianhong 2014-03-20 17:02 ` Jay Vosburgh 2014-03-21 11:34 ` Sergei Shtylyov 2 siblings, 0 replies; 12+ messages in thread From: Ding Tianhong @ 2014-03-20 9:36 UTC (permalink / raw) To: Zheng Li, netdev, fubar, andy; +Cc: linux-kernel, davem, joe.jin On 2014/3/20 16:51, Zheng Li wrote: > Except bond mode 1, in other bond modes, inactive slaves should keep inactive flag to > 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets > (for example ARP requests) which will arrive inactive slaves on same host from switch, > but inactive slave's inactive flag is zero that cause bridge receive the broadcast > packets to produce a wrong entry in forward table. Typical situation is domu send some > ARP request which go out from dom0 bond's active slave, then the ARP broadcast request > packets go back to inactive slave from switch, because the inactive slave's inactive > flag is zero, kernel will receive the packets and pass them to bridge, that cause dom0's > bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. > > Signed-off-by: Zheng Li <zheng.x.li@oracle.com> > --- > drivers/net/bonding/bond_main.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index e5628fc..2f73f18 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) > bond_set_slave_inactive_flags(slave, > BOND_SLAVE_NOTIFY_NOW); > } else { > - bond_set_slave_active_flags(slave, > + bond_set_slave_state(slave, BOND_STATE_ACTIVE, > BOND_SLAVE_NOTIFY_NOW); This patch could be applied? Ding > } > } > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-20 8:51 Zheng Li 2014-03-20 9:36 ` Ding Tianhong @ 2014-03-20 17:02 ` Jay Vosburgh 2014-03-21 2:39 ` zheng.li 2014-03-21 11:34 ` Sergei Shtylyov 2 siblings, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2014-03-20 17:02 UTC (permalink / raw) To: Zheng Li; +Cc: netdev, andy, linux-kernel, davem, joe.jin Zheng Li <zheng.x.li@oracle.com> wrote: >Except bond mode 1, in other bond modes, inactive slaves should keep >inactive flag to 1 to refuse to receive broadcast packets. Now, active >slave send broadcast packets (for example ARP requests) which will >arrive inactive slaves on same host from switch, but inactive slave's >inactive flag is zero that cause bridge receive the broadcast packets >to produce a wrong entry in forward table. Typical situation is domu >send some ARP request which go out from dom0 bond's active slave, then >the ARP broadcast request packets go back to inactive slave from >switch, because the inactive slave's inactive flag is zero, kernel will >receive the packets and pass them to bridge, that cause dom0's bridge >map domu's MAC address to port of bond, bridge should map domu's MAC to >port of vif. I suspect this will break LACP (802.3ad) and Etherchannel (balance-xor, balance-rr) modes, as those modes can receive broadcast or multicast on any slave. In those cases, the switch knows about the aggregation, and will only send the broadcast / multicast to one of the ports, but the port selected is not always the same one. In which mode are you having trouble? -J > >Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >--- > drivers/net/bonding/bond_main.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index e5628fc..2f73f18 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) > bond_set_slave_inactive_flags(slave, > BOND_SLAVE_NOTIFY_NOW); > } else { >- bond_set_slave_active_flags(slave, >+ bond_set_slave_state(slave, BOND_STATE_ACTIVE, > BOND_SLAVE_NOTIFY_NOW); > } > } >-- >1.7.6.5 > --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-20 17:02 ` Jay Vosburgh @ 2014-03-21 2:39 ` zheng.li 2014-03-21 8:41 ` Ding Tianhong 2014-03-21 17:43 ` Jay Vosburgh 0 siblings, 2 replies; 12+ messages in thread From: zheng.li @ 2014-03-21 2:39 UTC (permalink / raw) To: Jay Vosburgh; +Cc: netdev, andy, linux-kernel, davem, joe.jin 于 2014年03月21日 01:02, Jay Vosburgh 写道: > Zheng Li <zheng.x.li@oracle.com> wrote: > >> Except bond mode 1, in other bond modes, inactive slaves should keep >> inactive flag to 1 to refuse to receive broadcast packets. Now, active >> slave send broadcast packets (for example ARP requests) which will >> arrive inactive slaves on same host from switch, but inactive slave's >> inactive flag is zero that cause bridge receive the broadcast packets >> to produce a wrong entry in forward table. Typical situation is domu >> send some ARP request which go out from dom0 bond's active slave, then >> the ARP broadcast request packets go back to inactive slave from >> switch, because the inactive slave's inactive flag is zero, kernel will >> receive the packets and pass them to bridge, that cause dom0's bridge >> map domu's MAC address to port of bond, bridge should map domu's MAC to >> port of vif. > > I suspect this will break LACP (802.3ad) and Etherchannel > (balance-xor, balance-rr) modes, as those modes can receive broadcast or > multicast on any slave. In those cases, the switch knows about the > aggregation, and will only send the broadcast / multicast to one of the > ports, but the port selected is not always the same one. > > In which mode are you having trouble? > > -J Except bond mode 1, in other modes (major test in mode 6, and test all other mode, except mode 1, all other modes has the bug), the bridge make a wrong entry which map guest MAC to the port of bond, it should map guest MAC to the port of vif. Env description: dom0's bridge contains bond1 and vif ports, bond1 as port 1 , vif as port 2, bond1 has two slaves which connect a switch. when from guest ping others ,the arp broadcast request will go out from bond1's active slave, and then go back to itself inactive slave from switch , in function of bond_should_deliver_exact_match will return false by inactive is zero, return false will cause bridge receive the arp request packets whose original is from guest through vif that let bridge consider the SRC MAC of guest is from bond1 by analyzing the arp broadcast packets, then make a wrong forward entry "MAC of guest, from port 1 (bond1)" , the correct entry should be "MAC of guest , from port 2 (vif)". bond_should_deliver_exact_match(struct sk_buff *skb, struct slave *slave, struct bonding *bond) { if (bond_is_slave_inactive(slave)) { if (bond->params.mode == BOND_MODE_ALB && skb->pkt_type != PACKET_BROADCAST && skb->pkt_type != PACKET_MULTICAST) return false; return true; } return false; } Thanks, Zheng Li > >> >> Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >> --- >> drivers/net/bonding/bond_main.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index e5628fc..2f73f18 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) >> bond_set_slave_inactive_flags(slave, >> BOND_SLAVE_NOTIFY_NOW); >> } else { >> - bond_set_slave_active_flags(slave, >> + bond_set_slave_state(slave, BOND_STATE_ACTIVE, >> BOND_SLAVE_NOTIFY_NOW); >> } >> } >> -- >> 1.7.6.5 >> > > --- > -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-21 2:39 ` zheng.li @ 2014-03-21 8:41 ` Ding Tianhong 2014-03-21 17:43 ` Jay Vosburgh 1 sibling, 0 replies; 12+ messages in thread From: Ding Tianhong @ 2014-03-21 8:41 UTC (permalink / raw) To: zheng.li, Jay Vosburgh; +Cc: netdev, andy, linux-kernel, davem, joe.jin On 2014/3/21 10:39, zheng.li wrote: > 于 2014年03月21日 01:02, Jay Vosburgh 写道: >> Zheng Li <zheng.x.li@oracle.com> wrote: >> >>> Except bond mode 1, in other bond modes, inactive slaves should keep >>> inactive flag to 1 to refuse to receive broadcast packets. Now, active >>> slave send broadcast packets (for example ARP requests) which will >>> arrive inactive slaves on same host from switch, but inactive slave's >>> inactive flag is zero that cause bridge receive the broadcast packets >>> to produce a wrong entry in forward table. Typical situation is domu >>> send some ARP request which go out from dom0 bond's active slave, then >>> the ARP broadcast request packets go back to inactive slave from >>> switch, because the inactive slave's inactive flag is zero, kernel will >>> receive the packets and pass them to bridge, that cause dom0's bridge >>> map domu's MAC address to port of bond, bridge should map domu's MAC to >>> port of vif. >> >> I suspect this will break LACP (802.3ad) and Etherchannel >> (balance-xor, balance-rr) modes, as those modes can receive broadcast or >> multicast on any slave. In those cases, the switch knows about the >> aggregation, and will only send the broadcast / multicast to one of the >> ports, but the port selected is not always the same one. >> >> In which mode are you having trouble? >> >> -J > > Except bond mode 1, in other modes (major test in mode 6, and test all > other mode, except mode 1, all other modes has the bug), the bridge > make a wrong entry which map guest MAC to the port of bond, it should > map guest MAC to the port of vif. > > Env description: dom0's bridge contains bond1 and vif ports, bond1 as > port 1 , vif as port 2, bond1 has two slaves which connect a switch. > when from guest ping others ,the arp broadcast request will go out from > bond1's active slave, and then go back to itself inactive slave from > switch , in function of bond_should_deliver_exact_match will return > false by inactive is zero, return false will cause bridge receive the > arp request packets whose original is from guest through vif that let > bridge consider the SRC MAC of guest is from bond1 by analyzing the arp > broadcast packets, then make a wrong forward entry "MAC of guest, from > port 1 (bond1)" , the correct entry should be "MAC of guest , from port > 2 (vif)". > > > bond_should_deliver_exact_match(struct sk_buff *skb, > struct slave *slave, > struct bonding *bond) > { > if (bond_is_slave_inactive(slave)) { > if (bond->params.mode == BOND_MODE_ALB && > skb->pkt_type != PACKET_BROADCAST && > skb->pkt_type != PACKET_MULTICAST) > return false; > return true; > } > return false; > } > > Thanks, > Zheng Li > I know the problems of yours, but you can't fix the problem by the follow way, it will break the original mode such as 802.3ad, if you want to fix the problem, I think you need a better solution. Ding > >> >>> >>> Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >>> --- >>> drivers/net/bonding/bond_main.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >>> index e5628fc..2f73f18 100644 >>> --- a/drivers/net/bonding/bond_main.c >>> +++ b/drivers/net/bonding/bond_main.c >>> @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) >>> bond_set_slave_inactive_flags(slave, >>> BOND_SLAVE_NOTIFY_NOW); >>> } else { >>> - bond_set_slave_active_flags(slave, >>> + bond_set_slave_state(slave, BOND_STATE_ACTIVE, >>> BOND_SLAVE_NOTIFY_NOW); >>> } >>> } >>> -- >>> 1.7.6.5 >>> >> >> --- >> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > . > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-21 2:39 ` zheng.li 2014-03-21 8:41 ` Ding Tianhong @ 2014-03-21 17:43 ` Jay Vosburgh 2014-03-24 9:01 ` zheng.li 1 sibling, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2014-03-21 17:43 UTC (permalink / raw) To: zheng.li; +Cc: netdev, andy, linux-kernel, davem, joe.jin zheng.li <zheng.x.li@oracle.com> wrote: >于 2014年03月21日 01:02, Jay Vosburgh 写道: >> Zheng Li <zheng.x.li@oracle.com> wrote: >> >>> Except bond mode 1, in other bond modes, inactive slaves should keep >>> inactive flag to 1 to refuse to receive broadcast packets. Now, active >>> slave send broadcast packets (for example ARP requests) which will >>> arrive inactive slaves on same host from switch, but inactive slave's >>> inactive flag is zero that cause bridge receive the broadcast packets >>> to produce a wrong entry in forward table. Typical situation is domu >>> send some ARP request which go out from dom0 bond's active slave, then >>> the ARP broadcast request packets go back to inactive slave from >>> switch, because the inactive slave's inactive flag is zero, kernel will >>> receive the packets and pass them to bridge, that cause dom0's bridge >>> map domu's MAC address to port of bond, bridge should map domu's MAC to >>> port of vif. >> >> I suspect this will break LACP (802.3ad) and Etherchannel >> (balance-xor, balance-rr) modes, as those modes can receive broadcast or >> multicast on any slave. In those cases, the switch knows about the >> aggregation, and will only send the broadcast / multicast to one of the >> ports, but the port selected is not always the same one. >> >> In which mode are you having trouble? >> >> -J > >Except bond mode 1, in other modes (major test in mode 6, and test all >other mode, except mode 1, all other modes has the bug), the bridge >make a wrong entry which map guest MAC to the port of bond, it should >map guest MAC to the port of vif. > >Env description: dom0's bridge contains bond1 and vif ports, bond1 as >port 1 , vif as port 2, bond1 has two slaves which connect a switch. >when from guest ping others ,the arp broadcast request will go out from >bond1's active slave, and then go back to itself inactive slave from >switch , in function of bond_should_deliver_exact_match will return >false by inactive is zero, return false will cause bridge receive the >arp request packets whose original is from guest through vif that let >bridge consider the SRC MAC of guest is from bond1 by analyzing the arp >broadcast packets, then make a wrong forward entry "MAC of guest, from >port 1 (bond1)" , the correct entry should be "MAC of guest , from port >2 (vif)". I believe I understand; the crux of the problem is that the broadcast is looped by the switch to the other bond port, which then processes it as a received packet. For the tlb and alb modes, you're logically correct; the slaves other than the active slave should be set to inactive when the bond is opened. This is how they are set when configured normally to limit inbound broadcast and multicast traffic to just one slave (because these modes do not configure the switch ports into channel groups or aggregators; the switch has no knowledge of the bond). Your patch is still incorrect, though; the slaves should be set to actually be inactive via bond_set_slave_inactive_flags, not left "semi-active," i.e., BOND_STATE_ACTIVE but slave->inactive set. More on the slave->inactive flag handling later, though. For the balance-rr/-xor and 802.3ad modes, I suspect you have a configuration problem of some sort. For those modes, the switch should not loop back the broadcast as you describe when correctly configured. The balance-rr/-xor modes should be connected to switch ports that are configured into a single Etherchannel group (static link aggregation). If they are not, the looping back behavior you describe is expected, as the switch will flood the broadcast to all ports. Similarly, the 802.3ad mode should be connected to a switch with the ports configured for LACP, in which case the ports will automatically configure into one or more aggregators, and again, the switch will limit broadcast and multicast traffic to just one member of an aggregator. The -rr/-xor/802.3ad modes must have all slaves set to active in order for them to correctly process incoming broadcast and multicast traffic in their proper configuration. Setting them to inactive will cause packet loss in those configurations. The 802.3ad mode is probably the easiest to examine; if you configure the switch ports for LACP mode, the /proc/net/bonding/bond0 file should indicate that all the slaves are in the same aggregator (have the same Aggregator ID), and that that aggregator is the active one. The switch should have similar indications, although the format is vendor-specific. If your switch is configured for LACP and you still have issues, please post the /proc/net/bonding/bond0 contents. Finally, getting back to the slave->inactive flag. The only difference between bond_set_slave_active_flags or _inactive_flags and bond_set_slave_state is that the slave->inactive flag (whose only purpose is duplicate suppression for incoming packets) is cleared by the first (_flags variant), but not by the second. Right now, the only caller of _set_slave_state are the _active/_inactive_flags functions, so it all works. If _set_slave_state is called independently, then the slave->inactive flag and the actual state may become unsynchronized. In summary, it looks to me like: (a) bond_set_slave_state should fix slave->inactive to match the slave state if there are going to be callers other than bond_set_slave_active_flags or _inactive_flags, or remove the slave->inactive field entirely and have bond_is_slave_inactive use slave->back directly. and, the immediate problem here, (b) bond_open should call bond_set_slave_active_flags or _inactive_flags based on the mode and whether or not the slave is the curr_active_slave. For active-backup, alb and tlb modes, the active slave is active, the others are inactive; for -rr, -xor and 802.3ad modes, all slaves are active. I dunno what to do with broadcast mode; make 'em all active, I guess. -J >bond_should_deliver_exact_match(struct sk_buff *skb, > struct slave *slave, > struct bonding *bond) >{ > if (bond_is_slave_inactive(slave)) { > if (bond->params.mode == BOND_MODE_ALB && > skb->pkt_type != PACKET_BROADCAST && > skb->pkt_type != PACKET_MULTICAST) > return false; > return true; > } > return false; >} > >Thanks, >Zheng Li > > >> >>> >>> Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >>> --- >>> drivers/net/bonding/bond_main.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >>> index e5628fc..2f73f18 100644 >>> --- a/drivers/net/bonding/bond_main.c >>> +++ b/drivers/net/bonding/bond_main.c >>> @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) >>> bond_set_slave_inactive_flags(slave, >>> BOND_SLAVE_NOTIFY_NOW); >>> } else { >>> - bond_set_slave_active_flags(slave, >>> + bond_set_slave_state(slave, BOND_STATE_ACTIVE, >>> BOND_SLAVE_NOTIFY_NOW); >>> } >>> } >>> -- >>> 1.7.6.5 --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-21 17:43 ` Jay Vosburgh @ 2014-03-24 9:01 ` zheng.li 2014-03-24 19:25 ` David Miller 0 siblings, 1 reply; 12+ messages in thread From: zheng.li @ 2014-03-24 9:01 UTC (permalink / raw) To: Jay Vosburgh; +Cc: netdev, andy, linux-kernel, davem, joe.jin Recreate patch again as below, please check. From 9f504b1ee94e87dcfbb330ebd2f5bc6ca91f4b5b Mon Sep 17 00:00:00 2001 From: Zheng Li <zheng.x.li@oracle.com> Date: Mon, 24 Mar 2014 16:53:25 +0800 Subject: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 in tlb and alb mode. In bond mode tlb and alb, inactive slaves should keep inactive flag to 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets (for example ARP requests) which will arrive inactive slaves on same host from switch, but inactive slave's inactive flag is zero that cause bridge receive the broadcast packets to produce a wrong entry in forward table. Typical situation is domu send some ARP request which go out from dom0 bond's active slave, then the ARP broadcast request packets go back to inactive slave from switch, because the inactive slave's inactive flag is zero, kernel will receive the packets and pass them to bridge, that cause dom0's bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. Signed-off-by: Zheng Li <zheng.x.li@oracle.com> --- drivers/net/bonding/bond_main.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index e5628fc..8761df6 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3062,7 +3062,7 @@ static int bond_open(struct net_device *bond_dev) && (slave != bond->curr_active_slave)) { bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW); - } else { + } else if (!bond_is_lb(bond)) { bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_NOW); } -- 1.7.6.5 Thanks, Zheng Li 于 2014年03月22日 01:43, Jay Vosburgh 写道: > zheng.li <zheng.x.li@oracle.com> wrote: > >> 于 2014年03月21日 01:02, Jay Vosburgh 写道: >>> Zheng Li <zheng.x.li@oracle.com> wrote: >>> >>>> Except bond mode 1, in other bond modes, inactive slaves should keep >>>> inactive flag to 1 to refuse to receive broadcast packets. Now, active >>>> slave send broadcast packets (for example ARP requests) which will >>>> arrive inactive slaves on same host from switch, but inactive slave's >>>> inactive flag is zero that cause bridge receive the broadcast packets >>>> to produce a wrong entry in forward table. Typical situation is domu >>>> send some ARP request which go out from dom0 bond's active slave, then >>>> the ARP broadcast request packets go back to inactive slave from >>>> switch, because the inactive slave's inactive flag is zero, kernel will >>>> receive the packets and pass them to bridge, that cause dom0's bridge >>>> map domu's MAC address to port of bond, bridge should map domu's MAC to >>>> port of vif. >>> >>> I suspect this will break LACP (802.3ad) and Etherchannel >>> (balance-xor, balance-rr) modes, as those modes can receive broadcast or >>> multicast on any slave. In those cases, the switch knows about the >>> aggregation, and will only send the broadcast / multicast to one of the >>> ports, but the port selected is not always the same one. >>> >>> In which mode are you having trouble? >>> >>> -J >> >> Except bond mode 1, in other modes (major test in mode 6, and test all >> other mode, except mode 1, all other modes has the bug), the bridge >> make a wrong entry which map guest MAC to the port of bond, it should >> map guest MAC to the port of vif. >> >> Env description: dom0's bridge contains bond1 and vif ports, bond1 as >> port 1 , vif as port 2, bond1 has two slaves which connect a switch. >> when from guest ping others ,the arp broadcast request will go out from >> bond1's active slave, and then go back to itself inactive slave from >> switch , in function of bond_should_deliver_exact_match will return >> false by inactive is zero, return false will cause bridge receive the >> arp request packets whose original is from guest through vif that let >> bridge consider the SRC MAC of guest is from bond1 by analyzing the arp >> broadcast packets, then make a wrong forward entry "MAC of guest, from >> port 1 (bond1)" , the correct entry should be "MAC of guest , from port >> 2 (vif)". > > I believe I understand; the crux of the problem is that the > broadcast is looped by the switch to the other bond port, which then > processes it as a received packet. > > For the tlb and alb modes, you're logically correct; the slaves > other than the active slave should be set to inactive when the bond is > opened. This is how they are set when configured normally to limit > inbound broadcast and multicast traffic to just one slave (because these > modes do not configure the switch ports into channel groups or > aggregators; the switch has no knowledge of the bond). > > Your patch is still incorrect, though; the slaves should be set > to actually be inactive via bond_set_slave_inactive_flags, not left > "semi-active," i.e., BOND_STATE_ACTIVE but slave->inactive set. More on > the slave->inactive flag handling later, though. > > For the balance-rr/-xor and 802.3ad modes, I suspect you have a > configuration problem of some sort. For those modes, the switch should > not loop back the broadcast as you describe when correctly configured. > > The balance-rr/-xor modes should be connected to switch ports > that are configured into a single Etherchannel group (static link > aggregation). If they are not, the looping back behavior you describe > is expected, as the switch will flood the broadcast to all ports. > > Similarly, the 802.3ad mode should be connected to a switch with > the ports configured for LACP, in which case the ports will > automatically configure into one or more aggregators, and again, the > switch will limit broadcast and multicast traffic to just one member of > an aggregator. > > The -rr/-xor/802.3ad modes must have all slaves set to active in > order for them to correctly process incoming broadcast and multicast > traffic in their proper configuration. Setting them to inactive will > cause packet loss in those configurations. > > The 802.3ad mode is probably the easiest to examine; if you > configure the switch ports for LACP mode, the /proc/net/bonding/bond0 > file should indicate that all the slaves are in the same aggregator > (have the same Aggregator ID), and that that aggregator is the active > one. The switch should have similar indications, although the format is > vendor-specific. If your switch is configured for LACP and you still > have issues, please post the /proc/net/bonding/bond0 contents. > > Finally, getting back to the slave->inactive flag. The only > difference between bond_set_slave_active_flags or _inactive_flags and > bond_set_slave_state is that the slave->inactive flag (whose only > purpose is duplicate suppression for incoming packets) is cleared by the > first (_flags variant), but not by the second. Right now, the only > caller of _set_slave_state are the _active/_inactive_flags functions, so > it all works. If _set_slave_state is called independently, then the > slave->inactive flag and the actual state may become unsynchronized. > > In summary, it looks to me like: > > (a) bond_set_slave_state should fix slave->inactive to match the > slave state if there are going to be callers other than > bond_set_slave_active_flags or _inactive_flags, or remove the > slave->inactive field entirely and have bond_is_slave_inactive use > slave->back directly. > > and, the immediate problem here, > > (b) bond_open should call bond_set_slave_active_flags or > _inactive_flags based on the mode and whether or not the slave is the > curr_active_slave. For active-backup, alb and tlb modes, the active > slave is active, the others are inactive; for -rr, -xor and 802.3ad > modes, all slaves are active. I dunno what to do with broadcast mode; > make 'em all active, I guess. > > -J > >> bond_should_deliver_exact_match(struct sk_buff *skb, >> struct slave *slave, >> struct bonding *bond) >> { >> if (bond_is_slave_inactive(slave)) { >> if (bond->params.mode == BOND_MODE_ALB && >> skb->pkt_type != PACKET_BROADCAST && >> skb->pkt_type != PACKET_MULTICAST) >> return false; >> return true; >> } >> return false; >> } >> >> Thanks, >> Zheng Li >> >> >>> >>>> >>>> Signed-off-by: Zheng Li <zheng.x.li@oracle.com> >>>> --- >>>> drivers/net/bonding/bond_main.c | 2 +- >>>> 1 files changed, 1 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >>>> index e5628fc..2f73f18 100644 >>>> --- a/drivers/net/bonding/bond_main.c >>>> +++ b/drivers/net/bonding/bond_main.c >>>> @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) >>>> bond_set_slave_inactive_flags(slave, >>>> BOND_SLAVE_NOTIFY_NOW); >>>> } else { >>>> - bond_set_slave_active_flags(slave, >>>> + bond_set_slave_state(slave, BOND_STATE_ACTIVE, >>>> BOND_SLAVE_NOTIFY_NOW); >>>> } >>>> } >>>> -- >>>> 1.7.6.5 > > --- > -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-24 9:01 ` zheng.li @ 2014-03-24 19:25 ` David Miller 0 siblings, 0 replies; 12+ messages in thread From: David Miller @ 2014-03-24 19:25 UTC (permalink / raw) To: zheng.x.li; +Cc: fubar, netdev, andy, linux-kernel, joe.jin From: "zheng.li" <zheng.x.li@oracle.com> Date: Mon, 24 Mar 2014 17:01:42 +0800 > Recreate patch again as below, please check. Your patch has been severely corrupted by your email client, and is thus unusable by anyone. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1. 2014-03-20 8:51 Zheng Li 2014-03-20 9:36 ` Ding Tianhong 2014-03-20 17:02 ` Jay Vosburgh @ 2014-03-21 11:34 ` Sergei Shtylyov 2 siblings, 0 replies; 12+ messages in thread From: Sergei Shtylyov @ 2014-03-21 11:34 UTC (permalink / raw) To: Zheng Li, netdev, fubar, andy; +Cc: linux-kernel, davem, joe.jin Hello. On 03/20/2014 11:51 AM, Zheng Li wrote: > Except bond mode 1, in other bond modes, inactive slaves should keep inactive flag to > 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets > (for example ARP requests) which will arrive inactive slaves on same host from switch, > but inactive slave's inactive flag is zero that cause bridge receive the broadcast > packets to produce a wrong entry in forward table. Typical situation is domu send some > ARP request which go out from dom0 bond's active slave, then the ARP broadcast request > packets go back to inactive slave from switch, because the inactive slave's inactive > flag is zero, kernel will receive the packets and pass them to bridge, that cause dom0's > bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. > Signed-off-by: Zheng Li <zheng.x.li@oracle.com> > --- > drivers/net/bonding/bond_main.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index e5628fc..2f73f18 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -3063,7 +3063,7 @@ static int bond_open(struct net_device *bond_dev) > bond_set_slave_inactive_flags(slave, > BOND_SLAVE_NOTIFY_NOW); > } else { > - bond_set_slave_active_flags(slave, > + bond_set_slave_state(slave, BOND_STATE_ACTIVE, > BOND_SLAVE_NOTIFY_NOW); Now you have to re-indent this line to start right under 'state'. WBR, Sergei ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-04-01 0:35 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-03-28 9:22 [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 Zheng Li 2014-03-31 6:58 ` zheng.li 2014-04-01 0:35 ` Jay Vosburgh -- strict thread matches above, loose matches on Subject: below -- 2014-03-20 8:51 Zheng Li 2014-03-20 9:36 ` Ding Tianhong 2014-03-20 17:02 ` Jay Vosburgh 2014-03-21 2:39 ` zheng.li 2014-03-21 8:41 ` Ding Tianhong 2014-03-21 17:43 ` Jay Vosburgh 2014-03-24 9:01 ` zheng.li 2014-03-24 19:25 ` David Miller 2014-03-21 11:34 ` Sergei Shtylyov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).