From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vlad Yasevich Subject: Re: [PATCH v2 net] bonding: Fix stacked device detection in arp monitoring Date: Wed, 07 May 2014 13:08:09 -0400 Message-ID: <536A6879.8070303@redhat.com> References: <1399470461-22213-1-git-send-email-vyasevic@redhat.com> <29645.1399481039@localhost.localdomain> Reply-To: vyasevic@redhat.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org, Veaceslav Falico , Andy Gospodarek , Ding Tianhong , Patric McHardy To: Jay Vosburgh Return-path: Received: from mx1.redhat.com ([209.132.183.28]:56909 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751049AbaEGRIT (ORCPT ); Wed, 7 May 2014 13:08:19 -0400 In-Reply-To: <29645.1399481039@localhost.localdomain> Sender: netdev-owner@vger.kernel.org List-ID: On 05/07/2014 12:43 PM, Jay Vosburgh wrote: > Vlad Yasevich wrote: > >> Prior to commit fbd929f2dce460456807a51e18d623db3db9f077 >> bonding: support QinQ for bond arp interval >> >> the arp monitoring code allowed for proper detection of devices >> stacked on top of vlans. Since the above commit, the >> code can still detect a device stacked on top of single >> vlan, but not a device stacked on top of Q-in-Q configuration. >> The search will only set the inner vlan tag if the route >> device is the vlan device. However, this is not always the >> case, as it is possible to extend the stacked configuration. >> >> With this patch it is possible to provision devices on >> top Q-in-Q vlan configuration that should be used as >> a source of ARP monitoring information. >> >> For example: >> ip link add link bond0 vlan10 type vlan proto 802.1q id 10 >> ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 >> ip link add link vlan100 type macvlan >> >> Note: This patch limites the number of stacked VLANs to 2, >> just like before. The original, however had another issue >> in that if we had more then 2 levels of VLANs, we would end >> up generating incorrectly tagged traffic. This is no longer >> possible. >> >> Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval) >> CC: Jay Vosburgh >> CC: Veaceslav Falico >> CC: Andy Gospodarek >> CC: Ding Tianhong >> CC: Patric McHardy >> Signed-off-by: Vlad Yasevich >> --- >> v2->v1: >> * Changed the function name to better describe what the function is doing. >> We are not just finding the stack of vlan devices, we are also verifything >> the path between the bonding device and the route output device. >> * Added some more commenets about what the function is doing. >> * Fixed an issue with multiple peer vlans. >> * Removed all occurances of 'inner' and 'outer' and replaced it with tag >> array. > > I think you may have misunderstood my prior comment; I meant > that I liked the "inner" and "outer" names better than "tag[0]" and > "tag[1]". Oh, sorry. I misunderstood. I can certainly maintain inner/outer connotation, but it seem a bit silly. We have a stack of vlan devices so we'll apply them as as stack as well. I actually debated about making it generic and not limiting us to a max depth of 2 as there is nothing in the vlan implementation that limits the user from configuring a stack of more then 2 deep. Even 802.1ad spec doesn't explicitly limit the number of vlan headers to 2, and once you do that, the concept of inner/outer goes away. > > I did notice that the inner and outer parameters could be > removed from bond_arp_send as well, but, again, I found the "inner" and > "outer" names more descriptive than tag[0] or tag[1]; perhaps a #define > for the magic numbers (0 = "outer", 1 = "inner" and 2 = "max nesting"), > or at least a comment that says straight up "tag[0] is the outer tag, > tag[1] is the inner tag (if there are two tags)" is in order. > >> drivers/net/bonding/bond_main.c | 114 ++++++++++++++++++---------------------- >> 1 file changed, 52 insertions(+), 62 deletions(-) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index 9d08e00..f592f96 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -2126,8 +2126,7 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip) >> */ >> static void bond_arp_send(struct net_device *slave_dev, int arp_op, >> __be32 dest_ip, __be32 src_ip, >> - struct bond_vlan_tag *inner, >> - struct bond_vlan_tag *outer) >> + struct bond_vlan_tag *tags) >> { >> struct sk_buff *skb; >> >> @@ -2141,12 +2140,12 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, >> net_err_ratelimited("ARP packet allocation failed\n"); >> return; >> } >> - if (outer->vlan_id) { >> - if (inner->vlan_id) { >> + if (tags[0].vlan_id) { >> + if (tags[1].vlan_id) { >> pr_debug("inner tag: proto %X vid %X\n", >> - ntohs(inner->vlan_proto), inner->vlan_id); >> - skb = __vlan_put_tag(skb, inner->vlan_proto, >> - inner->vlan_id); >> + ntohs(tags[1].vlan_proto), tags[1].vlan_id); >> + skb = __vlan_put_tag(skb, tags[1].vlan_proto, >> + tags[1].vlan_id); >> if (!skb) { >> net_err_ratelimited("failed to insert inner VLAN tag\n"); >> return; >> @@ -2154,8 +2153,8 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, >> } >> >> pr_debug("outer reg: proto %X vid %X\n", >> - ntohs(outer->vlan_proto), outer->vlan_id); >> - skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id); >> + ntohs(tags[0].vlan_proto), tags[0].vlan_id); >> + skb = vlan_put_tag(skb, tags[0].vlan_proto, tags[0].vlan_id); >> if (!skb) { >> net_err_ratelimited("failed to insert outer VLAN tag\n"); >> return; >> @@ -2164,22 +2163,52 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, >> arp_xmit(skb); >> } >> >> +/* Check to make sure that @end device is stacked on top of the @start >> + * device. Invofrmation about any intervening vlans are saved into >> + * the @tag array. @idx parametet specifies how many vlans deep we are >> + * are currently looking. We currently only support 2 levels of vlan stacking. >> + * Return true if we have a valid stacking configuration. Otherwise false. >> + */ > > Spelling nits: "Information" and "parameter". > >> +static bool bond_check_path(struct net_device *start, struct net_device *end, >> + struct bond_vlan_tag *tag, int idx) >> +{ >> + struct net_device *upper; >> + struct list_head *iter; >> + >> + /* We do not support more then 2 levels of VLAN nesting */ >> + if (idx >= 2) >> + return false; >> + >> + netdev_for_each_all_upper_dev_rcu(start, upper, iter) { >> + if (is_vlan_dev(upper)) { >> + tag[idx].vlan_proto = vlan_dev_vlan_proto(upper); >> + tag[idx].vlan_id = vlan_dev_vlan_id(upper); >> + } >> + if (upper == end) >> + return true; >> + >> + /* Look at the devices list of 'upper' only if it is a >> + * vlan device. >> + */ >> + if (is_vlan_dev(upper) && >> + bond_check_path(upper, end, tag, idx+1)) >> + return true; > > This may or may not be a realistic configuration, but will this > function traverse correctly if there is some other device type between > the two vlans? E.g., eth0 -> bond0 -> vlan100 -> bridge -> vlan200, > where "vlan200" is the "end" device holding the IP address from the > route lookup. It need not be a bridge in there, but I think this would > be a legal configuration. Yes. I verified that it works. The reason is that we are traversing the all_adj_list.upper list which contains all of the upper devices at each level. So, at vlan100 level, we will see vlan200 and all will be well. -vlad > > -J > >> + } >> + return false; >> +} >> + >> >> static void bond_arp_send_all(struct bonding *bond, struct slave *slave) >> { >> - struct net_device *upper, *vlan_upper; >> - struct list_head *iter, *vlan_iter; >> struct rtable *rt; >> - struct bond_vlan_tag inner, outer; >> + struct bond_vlan_tag tags[2]; >> __be32 *targets = bond->params.arp_targets, addr; >> int i; >> + bool ret; >> >> for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) { >> pr_debug("basa: target %pI4\n", &targets[i]); >> - inner.vlan_proto = 0; >> - inner.vlan_id = 0; >> - outer.vlan_proto = 0; >> - outer.vlan_id = 0; >> + memset(tags, 0, sizeof(tags)); >> >> /* Find out through which dev should the packet go */ >> rt = ip_route_output(dev_net(bond->dev), targets[i], 0, >> @@ -2192,7 +2221,8 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave) >> net_warn_ratelimited("%s: no route to arp_ip_target %pI4 and arp_validate is set\n", >> bond->dev->name, >> &targets[i]); >> - bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer); >> + bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], >> + 0, tags); >> continue; >> } >> >> @@ -2201,52 +2231,12 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave) >> goto found; >> >> rcu_read_lock(); >> - /* first we search only for vlan devices. for every vlan >> - * found we verify its upper dev list, searching for the >> - * rt->dst.dev. If found we save the tag of the vlan and >> - * proceed to send the packet. >> - */ >> - netdev_for_each_all_upper_dev_rcu(bond->dev, vlan_upper, >> - vlan_iter) { >> - if (!is_vlan_dev(vlan_upper)) >> - continue; >> - >> - if (vlan_upper == rt->dst.dev) { >> - outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper); >> - outer.vlan_id = vlan_dev_vlan_id(vlan_upper); >> - rcu_read_unlock(); >> - goto found; >> - } >> - netdev_for_each_all_upper_dev_rcu(vlan_upper, upper, >> - iter) { >> - if (upper == rt->dst.dev) { >> - /* If the upper dev is a vlan dev too, >> - * set the vlan tag to inner tag. >> - */ >> - if (is_vlan_dev(upper)) { >> - inner.vlan_proto = vlan_dev_vlan_proto(upper); >> - inner.vlan_id = vlan_dev_vlan_id(upper); >> - } >> - outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper); >> - outer.vlan_id = vlan_dev_vlan_id(vlan_upper); >> - rcu_read_unlock(); >> - goto found; >> - } >> - } >> - } >> - >> - /* if the device we're looking for is not on top of any of >> - * our upper vlans, then just search for any dev that >> - * matches, and in case it's a vlan - save the id >> - */ >> - netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) { >> - if (upper == rt->dst.dev) { >> - rcu_read_unlock(); >> - goto found; >> - } >> - } >> + ret = bond_check_path(bond->dev, rt->dst.dev, tags, 0); >> rcu_read_unlock(); >> >> + if (ret) >> + goto found; >> + >> /* Not our device - skip */ >> pr_debug("%s: no path to arp_ip_target %pI4 via rt.dev %s\n", >> bond->dev->name, &targets[i], >> @@ -2259,7 +2249,7 @@ found: >> addr = bond_confirm_addr(rt->dst.dev, targets[i], 0); >> ip_rt_put(rt); >> bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], >> - addr, &inner, &outer); >> + addr, tags); >> } >> } >> >> -- >> 1.9.0 >> > > --- > -Jay Vosburgh, jay.vosburgh@canonical.com >