netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jay Vosburgh <fubar@us.ibm.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Yinglin Sun <Yinglin.Sun@emc.com>,
	"David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	James Morris <jmorris@namei.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Patrick McHardy <kaber@trash.net>,
	netdev@vger.kernel.org
Subject: Re: [PATCH] IPv6: DAD from bonding iface is treated as dup address from others
Date: Thu, 06 Oct 2011 12:05:33 -0700	[thread overview]
Message-ID: <27199.1317927933@death> (raw)
In-Reply-To: <20111006110047.GA22462@hmsreliant.think-freely.org>

Neil Horman <nhorman@tuxdriver.com> wrote:

>On Wed, Oct 05, 2011 at 08:59:10PM -0700, Yinglin Sun wrote:
>> Steps to reproduce this issue:
>> 1. create bond0 over eth0 and eth1, set the mode to balance-xor
>> 2. add an IPv6 address to bond0
>> 3. DAD packet is sent out from one slave and then is looped back from
>> the other slave. Therefore, it is treated as a duplicate address and
>> stays tentative afterwards:
>>    kern.info:
>>        Oct  5 11:50:18 testvm1 kernel: [  129.224353] bond0: IPv6 duplicate address 1234::1 detected!
>> 
>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>> ---
>>  net/ipv6/ndisc.c |   15 +++++++++++++--
>>  1 files changed, 13 insertions(+), 2 deletions(-)
>> 
>> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
>> index 9da6e02..c82f4c7 100644
>> --- a/net/ipv6/ndisc.c
>> +++ b/net/ipv6/ndisc.c
>> @@ -809,9 +809,10 @@ static void ndisc_recv_ns(struct sk_buff *skb)
>>  
>>  		if (ifp->flags & (IFA_F_TENTATIVE|IFA_F_OPTIMISTIC)) {
>>  			if (dad) {
>> +				const unsigned char *sadr;
>> +				sadr = skb_mac_header(skb);
>> +
>>  				if (dev->type == ARPHRD_IEEE802_TR) {
>> -					const unsigned char *sadr;
>> -					sadr = skb_mac_header(skb);
>>  					if (((sadr[8] ^ dev->dev_addr[0]) & 0x7f) == 0 &&
>>  					    sadr[9] == dev->dev_addr[1] &&
>>  					    sadr[10] == dev->dev_addr[2] &&
>> @@ -821,6 +822,16 @@ static void ndisc_recv_ns(struct sk_buff *skb)
>>  						/* looped-back to us */
>>  						goto out;
>>  					}
>> +				} else if (dev->type == ARPHRD_ETHER) {
>> +					if (sadr[6] == dev->dev_addr[0] &&
>> +					    sadr[7] == dev->dev_addr[1] &&
>> +					    sadr[8] == dev->dev_addr[2] &&
>> +					    sadr[9] == dev->dev_addr[3] &&
>> +					    sadr[10] == dev->dev_addr[4] &&
>> +					    sadr[11] == dev->dev_addr[5]) {
>> +						/* looped-back to us */
>> +						goto out;
>> +					}
>>  				}
>>  
>>  				/*
>> -- 
>> 1.7.4.1
>> 
>Nack, This seems like it will just completely break DAD.  What if theres another
>system out there with the same mac address.  A response from that system would
>get dropped by this filter, instead of causing The local system to stop using
>the address.  What you really want to do is modify
>bond_should_deliver_exact_match to detect this frame on the inactive slave or
>some such, and drop the frame there.

	Also NACK; and adding a bit of information.  The balance-xor
mode is nominally expecting to interact with a switch whose ports are
set for etherchannel ("static link aggregation"), in which case the
switch will not loop the packet back around.

	If your switch can do etherchannel, then enable it and the
problem should go away.  If your switch cannot do this, then you may
have other issues, because all of the multicast or broadcast packets
going out any bonding slave will loop around to another slave.  You
could also use 802.3ad / LACP if you switch supports that.

	For balance-xor (or balance-rr, for that matter) mode to a
non-etherchannel switch, it's going to be difficult, if not impossible,
to modify bond_should_deliver_exact_match, because there are no inactive
slaves.  In this mode, bonding is expecting the switch to balance
incoming traffic across the ports, and not deliver looped back packets
or duplicates.  There are no restrictions on what type of traffic
(mcast, bcast, ucast) may arrive on any given port.

	I can't think of a way to make the non-etherchannel case work
for balance-xor (or balance-rr) without breaking the DAD functionality
in the case of an actual duplicate.  I'm not aware of a way to
distinguish a looped back DAD probe from an actual duplicate address
probe elsewhere on the network.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

  reply	other threads:[~2011-10-06 19:06 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-06  3:59 [PATCH] IPv6: DAD from bonding iface is treated as dup address from others Yinglin Sun
2011-10-06 11:00 ` Neil Horman
2011-10-06 19:05   ` Jay Vosburgh [this message]
2011-10-06 22:17     ` Yinglin Sun
2011-10-07  0:03       ` Yinglin Sun
2011-10-07  0:59         ` Jay Vosburgh
2011-10-07  1:24           ` Yinglin Sun
2011-10-07  6:13             ` Chuck Anderson
2011-10-07 16:59               ` Yinglin Sun
2011-10-07 17:29                 ` Neil Horman
2011-10-07 11:10             ` Neil Horman
2011-10-07 18:08               ` Yinglin Sun
2011-10-07 19:09                 ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=27199.1317927933@death \
    --to=fubar@us.ibm.com \
    --cc=Yinglin.Sun@emc.com \
    --cc=davem@davemloft.net \
    --cc=jmorris@namei.org \
    --cc=kaber@trash.net \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).