Re: [PATCH v2 net] bonding: Fix stacked device detection in arp monitoring

From: Vlad Yasevich <vyasevic@redhat.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: netdev@vger.kernel.org, Veaceslav Falico <vfalico@redhat.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	Ding Tianhong <dingtianhong@huawei.com>,
	Patric McHardy <kaber@trash.net>
Subject: Re: [PATCH v2 net] bonding: Fix stacked device detection in arp monitoring
Date: Wed, 07 May 2014 13:08:09 -0400
Message-ID: <536A6879.8070303@redhat.com>
In-Reply-To: <29645.1399481039@localhost.localdomain>

On 05/07/2014 12:43 PM, Jay Vosburgh wrote:
> Vlad Yasevich <vyasevic@redhat.com> wrote:
> 
>> Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
>> 	bonding: support QinQ for bond arp interval
>>
>> the arp monitoring code allowed for proper detection of devices
>> stacked on top of vlans.  Since the above commit, the
>> code can still detect a device stacked on top of single
>> vlan, but not a device stacked on top of Q-in-Q configuration.
>> The search will only set the inner vlan tag if the route
>> device is the vlan device.  However, this is not always the
>> case, as it is possible to extend the stacked configuration.
>>
>> With this patch it is possible to provision devices on
>> top of a Q-in-Q vlan configuration that should be used as
>> a source of ARP monitoring information.
>>
>> For example:
>> ip link add link bond0 vlan10 type vlan proto 802.1q id 10
>> ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
>> ip link add link vlan100 type macvlan
>>
>> Note:  This patch limits the number of stacked VLANs to 2,
>> just like before.  The original, however, had another issue
>> in that if we had more than 2 levels of VLANs, we would end
>> up generating incorrectly tagged traffic.  This is no longer
>> possible.
>>
>> Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
>> CC: Jay Vosburgh <j.vosburgh@gmail.com>
>> CC: Veaceslav Falico <vfalico@redhat.com>
>> CC: Andy Gospodarek <andy@greyhouse.net>
>> CC: Ding Tianhong <dingtianhong@huawei.com>
>> CC: Patric McHardy <kaber@trash.net>
>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>> ---
>> v1->v2:
>> * Changed the function name to better describe what the function is doing.
>>  We are not just finding the stack of vlan devices, we are also verifying
>>  the path between the bonding device and the route output device.
>> * Added some more comments about what the function is doing.
>> * Fixed an issue with multiple peer vlans.
>> * Removed all occurrences of 'inner' and 'outer' and replaced them with a tag
>>  array.
> 
> 	I think you may have misunderstood my prior comment; I meant
> that I liked the "inner" and "outer" names better than "tag[0]" and
> "tag[1]".

Oh, sorry.  I misunderstood.  I can certainly keep the inner/outer
naming, but it seems a bit silly.  We have a stack of vlan devices,
so we'll apply the tags as a stack as well.

I actually debated making it generic and not limiting us to a
max depth of 2, as there is nothing in the vlan implementation that
prevents the user from configuring a stack more than 2 deep.

Even the 802.1ad spec doesn't explicitly limit the number of vlan headers
to 2, and once you allow more than 2, the concept of inner/outer goes away.
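
As a rough illustration of the "apply them as a stack" idea, here is a
userspace sketch.  It is not code from the patch: struct vlan_tag,
push_tag(), apply_tag_stack() and MAX_DEPTH are hypothetical names, and
the headers are modelled as a plain array rather than an skb.  The only
thing it borrows from the patch is the convention that index 0 is the
tag nearest the bond (the outer one) and that a zero vlan id marks an
unused slot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 2	/* assumption: mirrors the 2-level limit in the patch */

/* Hypothetical stand-in for struct bond_vlan_tag. */
struct vlan_tag {
	uint16_t proto;	/* e.g. 0x8100 */
	uint16_t vid;	/* 0 means "slot unused" */
};

/* Prepend one header in front of the n headers already in wire[],
 * which is what pushing a vlan tag onto a frame amounts to.
 */
static void push_tag(struct vlan_tag *wire, size_t n, struct vlan_tag t)
{
	memmove(&wire[1], &wire[0], n * sizeof(*wire));
	wire[0] = t;
}

/* tags[0] is the tag nearest the bond; higher indices sit further up
 * the device stack.  Returns the number of headers written to wire[],
 * outermost first.  wire[] must have room for MAX_DEPTH entries.
 */
static size_t apply_tag_stack(const struct vlan_tag *tags,
			      struct vlan_tag *wire)
{
	size_t depth = 0, n = 0;

	/* Count the leading slots that are in use. */
	while (depth < MAX_DEPTH && tags[depth].vid)
		depth++;

	/* Push the deepest tag first so that tags[0] ends up outermost. */
	while (depth-- > 0)
		push_tag(wire, n++, tags[depth]);

	return n;
}
```

With tags = { {0x8100, 10}, {0x8100, 100} } this yields vid 10 followed
by vid 100 on the wire, and nothing in the loop depends on the depth
being exactly 2, which is why the inner/outer names stop being a natural
fit once the limit is raised.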

> 
> 	I did notice that the inner and outer parameters could be
> removed from bond_arp_send as well, but, again, I found the "inner" and
> "outer" names more descriptive than tag[0] or tag[1]; perhaps a #define
> for the magic numbers (0 = "outer", 1 = "inner" and 2 = "max nesting"),
> or at least a comment that says straight up "tag[0] is the outer tag,
> tag[1] is the inner tag (if there are two tags)" is in order.
> 
>> drivers/net/bonding/bond_main.c | 114 ++++++++++++++++++----------------------
>> 1 file changed, 52 insertions(+), 62 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 9d08e00..f592f96 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2126,8 +2126,7 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
>>  */
>> static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>> 			  __be32 dest_ip, __be32 src_ip,
>> -			  struct bond_vlan_tag *inner,
>> -			  struct bond_vlan_tag *outer)
>> +			  struct bond_vlan_tag *tags)
>> {
>> 	struct sk_buff *skb;
>>
>> @@ -2141,12 +2140,12 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>> 		net_err_ratelimited("ARP packet allocation failed\n");
>> 		return;
>> 	}
>> -	if (outer->vlan_id) {
>> -		if (inner->vlan_id) {
>> +	if (tags[0].vlan_id) {
>> +		if (tags[1].vlan_id) {
>> 			pr_debug("inner tag: proto %X vid %X\n",
>> -				 ntohs(inner->vlan_proto), inner->vlan_id);
>> -			skb = __vlan_put_tag(skb, inner->vlan_proto,
>> -					     inner->vlan_id);
>> +				 ntohs(tags[1].vlan_proto), tags[1].vlan_id);
>> +			skb = __vlan_put_tag(skb, tags[1].vlan_proto,
>> +					     tags[1].vlan_id);
>> 			if (!skb) {
>> 				net_err_ratelimited("failed to insert inner VLAN tag\n");
>> 				return;
>> @@ -2154,8 +2153,8 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>> 		}
>>
>> 		pr_debug("outer reg: proto %X vid %X\n",
>> -			 ntohs(outer->vlan_proto), outer->vlan_id);
>> -		skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id);
>> +			 ntohs(tags[0].vlan_proto), tags[0].vlan_id);
>> +		skb = vlan_put_tag(skb, tags[0].vlan_proto, tags[0].vlan_id);
>> 		if (!skb) {
>> 			net_err_ratelimited("failed to insert outer VLAN tag\n");
>> 			return;
>> @@ -2164,22 +2163,52 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>> 	arp_xmit(skb);
>> }
>>
>> +/* Check to make sure that @end device is stacked on top of the @start
>> + * device.  Invofrmation about any intervening vlans are saved into
>> + * the @tag array.  @idx parametet specifies how many vlans deep we are
>> + * are currently looking. We currently only support 2 levels of vlan stacking.
>> + * Return true if we have a valid stacking configuration.  Otherwise false.
>> + */
> 
> 	Spelling nits: "Information" and "parameter".
> 
>> +static bool bond_check_path(struct net_device *start, struct net_device *end,
>> +			    struct bond_vlan_tag *tag, int idx)
>> +{
>> +	struct net_device *upper;
>> +	struct list_head  *iter;
>> +
>> +	/* We do not support more then 2 levels of VLAN nesting */
>> +	if (idx >= 2)
>> +		return false;
>> +
>> +	netdev_for_each_all_upper_dev_rcu(start, upper, iter) {
>> +		if (is_vlan_dev(upper)) {
>> +			tag[idx].vlan_proto = vlan_dev_vlan_proto(upper);
>> +			tag[idx].vlan_id = vlan_dev_vlan_id(upper);
>> +		}
>> +		if (upper == end)
>> +			return true;
>> +
>> +		/* Look at the devices list  of 'upper' only if it is a
>> +		 * vlan device.
>> +		 */
>> +		if (is_vlan_dev(upper) &&
>> +		    bond_check_path(upper, end, tag, idx+1))
>> +			return true;
> 
> 	This may or may not be a realistic configuration, but will this
> function traverse correctly if there is some other device type between
> the two vlans?  E.g., eth0 -> bond0 -> vlan100 -> bridge -> vlan200,
> where "vlan200" is the "end" device holding the IP address from the
> route lookup.  It need not be a bridge in there, but I think this would
> be a legal configuration.

Yes.  I verified that it works.  The reason is that we are traversing
the all_adj_list.upper list, which contains all of the upper devices
(not just the immediate ones) at each level.  So, at the vlan100 level,
we will see vlan200 even with the bridge in between, and all will be well.
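
To make that concrete, here is a small userspace model of the walk for
the topology in your example.  It is only a sketch: struct dev,
check_path() and the static device table are hypothetical stand-ins for
net_device, bond_check_path() and the kernel's adjacency lists, and the
model assumes each all-upper list is ordered nearest device first.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_UPPER 4	/* room for 3 uppers plus a NULL terminator */
#define MAX_TAGS  2	/* same nesting limit as the patch */

/* Hypothetical userspace stand-in for struct net_device. */
struct dev {
	const char *name;
	bool is_vlan;
	unsigned short vlan_id;
	/* All devices stacked above this one, direct or indirect,
	 * NULL-terminated (models all_adj_list.upper).
	 */
	struct dev *all_upper[MAX_UPPER];
};

struct tag {
	unsigned short vlan_id;
};

/* eth0 -> bond0 -> vlan100 -> br0 -> vlan200 */
static struct dev vlan200 = { "vlan200", true, 200, { NULL } };
static struct dev br0     = { "br0", false, 0, { &vlan200, NULL } };
static struct dev vlan100 = { "vlan100", true, 100,
			      { &br0, &vlan200, NULL } };
static struct dev bond0   = { "bond0", false, 0,
			      { &vlan100, &br0, &vlan200, NULL } };

/* Same control flow as bond_check_path() in the patch, minus RCU. */
static bool check_path(struct dev *start, struct dev *end,
		       struct tag *tag, int idx)
{
	size_t i;

	if (idx >= MAX_TAGS)
		return false;

	for (i = 0; i < MAX_UPPER && start->all_upper[i]; i++) {
		struct dev *upper = start->all_upper[i];

		if (upper->is_vlan)
			tag[idx].vlan_id = upper->vlan_id;
		if (upper == end)
			return true;

		/* Only descend through vlan devices. */
		if (upper->is_vlan && check_path(upper, end, tag, idx + 1))
			return true;
	}
	return false;
}
```

Calling check_path(&bond0, &vlan200, tags, 0) on this table records 100
in tags[0], recurses into vlan100, skips br0 because it is neither a
vlan nor the end device, and then finds vlan200 in vlan100's list and
records 200 in tags[1].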

-vlad

> 
> 	-J
> 
>> +	}
>> +	return false;
>> +}
>> +
>>
>> static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> {
>> -	struct net_device *upper, *vlan_upper;
>> -	struct list_head *iter, *vlan_iter;
>> 	struct rtable *rt;
>> -	struct bond_vlan_tag inner, outer;
>> +	struct bond_vlan_tag tags[2];
>> 	__be32 *targets = bond->params.arp_targets, addr;
>> 	int i;
>> +	bool ret;
>>
>> 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
>> 		pr_debug("basa: target %pI4\n", &targets[i]);
>> -		inner.vlan_proto = 0;
>> -		inner.vlan_id = 0;
>> -		outer.vlan_proto = 0;
>> -		outer.vlan_id = 0;
>> +		memset(tags, 0, sizeof(tags));
>>
>> 		/* Find out through which dev should the packet go */
>> 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
>> @@ -2192,7 +2221,8 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 				net_warn_ratelimited("%s: no route to arp_ip_target %pI4 and arp_validate is set\n",
>> 						     bond->dev->name,
>> 						     &targets[i]);
>> -			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer);
>> +			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>> +				      0, tags);
>> 			continue;
>> 		}
>>
>> @@ -2201,52 +2231,12 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 			goto found;
>>
>> 		rcu_read_lock();
>> -		/* first we search only for vlan devices. for every vlan
>> -		 * found we verify its upper dev list, searching for the
>> -		 * rt->dst.dev. If found we save the tag of the vlan and
>> -		 * proceed to send the packet.
>> -		 */
>> -		netdev_for_each_all_upper_dev_rcu(bond->dev, vlan_upper,
>> -						  vlan_iter) {
>> -			if (!is_vlan_dev(vlan_upper))
>> -				continue;
>> -
>> -			if (vlan_upper == rt->dst.dev) {
>> -				outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>> -				outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>> -				rcu_read_unlock();
>> -				goto found;
>> -			}
>> -			netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
>> -							  iter) {
>> -				if (upper == rt->dst.dev) {
>> -					/* If the upper dev is a vlan dev too,
>> -					 *  set the vlan tag to inner tag.
>> -					 */
>> -					if (is_vlan_dev(upper)) {
>> -						inner.vlan_proto = vlan_dev_vlan_proto(upper);
>> -						inner.vlan_id = vlan_dev_vlan_id(upper);
>> -					}
>> -					outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>> -					outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>> -					rcu_read_unlock();
>> -					goto found;
>> -				}
>> -			}
>> -		}
>> -
>> -		/* if the device we're looking for is not on top of any of
>> -		 * our upper vlans, then just search for any dev that
>> -		 * matches, and in case it's a vlan - save the id
>> -		 */
>> -		netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
>> -			if (upper == rt->dst.dev) {
>> -				rcu_read_unlock();
>> -				goto found;
>> -			}
>> -		}
>> +		ret = bond_check_path(bond->dev, rt->dst.dev, tags, 0);
>> 		rcu_read_unlock();
>>
>> +		if (ret)
>> +			goto found;
>> +
>> 		/* Not our device - skip */
>> 		pr_debug("%s: no path to arp_ip_target %pI4 via rt.dev %s\n",
>> 			 bond->dev->name, &targets[i],
>> @@ -2259,7 +2249,7 @@ found:
>> 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
>> 		ip_rt_put(rt);
>> 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>> -			      addr, &inner, &outer);
>> +			      addr, tags);
>> 	}
>> }
>>
>> -- 
>> 1.9.0
>>
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
>
