From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jay Vosburgh
Subject: Re: [PATCH net] bonding: fix arp requests sends with isolated routes
Date: Mon, 17 Feb 2014 17:07:51 -0800
Message-ID: <4562.1392685671@death.nxdomain>
References: <52FE3D5B.6060103@alphalink.fr> <20140217.145635.1123180851794758928.davem@davemloft.net>
In-reply-to: <20140217.145635.1123180851794758928.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: f.cachereul@alphalink.fr, vfalico@redhat.com, andy@greyhouse.net, netdev@vger.kernel.org
To: David Miller
Sender: netdev-owner@vger.kernel.org

David Miller wrote:

>From: François Cachereul
>Date: Fri, 14 Feb 2014 16:59:23 +0100
>
>> Make arp_send_all() try to send arp packets through slave devices event
>> if no route to arp_ip_target is found. This is useful when the route
>> is in an isolated routing table with routing rule parameters like oif or
>> iif in which case ip_route_output() return an error.
>> Thus, the arp packet is send without vlan and with the bond ip address
>> as sender.
>>
>> Signed-off-by: François CACHEREUL
>> ---
>> This previously worked, the problem was added in 2.6.35 with vlan 0
>> added by default when the module 8021q is loaded. Before that no route
>> lookup was done if the bond device did not have any vlan. The problem
>> now exists event if the module 8021q is not loaded.
>
>I don't like this at all, you're trying to paper over the fact that we
>can't set the flow key correctly at this point.
>
>Just assuming the route might be there and trying anyways is not really
>acceptable in my opinion.  There's a reason we do a route lookup at all.

The reason for the route lookup is to get a VLAN ID for the outgoing
ARP (if VLANs are configured above the bond), so it can be correctly
tagged.

As Francois says, older versions of the bond_arp_send_all function
would skip the route lookup entirely if there were no VLANs configured
above the bond.  E.g., the original logic from a 2.6.32-era kernel looks
like:

	for (i = 0; (i < BOND_MAX_ARP_TARGETS); i++) {
[...]
		if (!bond->vlgrp) {
			pr_debug("basa: empty vlan: arp_send\n");
			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
				      bond->master_ip, 0);
			continue;
		}

		/*
		 * If VLANs are configured, we do a route lookup to
		 * determine which VLAN interface would be used, so we
		 * can tag the ARP with the proper VLAN tag.
		 */
		memset(&fl, 0, sizeof(fl));
		fl.fl4_dst = targets[i];
		fl.fl4_tos = RTO_ONLINK;

		rv = ip_route_output_key(&init_net, &rt, &fl);
[...]

So, in the past, this particular case (oif / iif in route selection)
would "work," in the sense that an ARP would go out with no VLAN ID, but
only when there were known to be no VLANs configured above the bond.  If
any VLANs were configured above the bond, this case would fail as we're
seeing here.
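
To put the three behaviors side by side, here is a rough user-space
sketch of the decision only.  It is not kernel code: every type and
function name in it is made up for illustration, and it assumes a route
lookup simply yields "found or not" plus a VLAN ID, with 0 meaning the
route leaves via the bond itself.  It models the 2.6.32-era logic quoted
above, the unconditional lookup in current kernels, and the fallback
proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the outcomes for one arp_ip_target. */
enum arp_action {
	ARP_SKIP,		/* no ARP is sent for this target */
	ARP_SEND_UNTAGGED,	/* ARP goes out with no VLAN tag */
	ARP_SEND_TAGGED		/* ARP goes out tagged with the route's VLAN */
};

/* Hypothetical stand-in for the result of the route lookup. */
struct route_result {
	bool found;		/* lookup in the regular routing table succeeded */
	unsigned short vid;	/* 0: route leaves via the bond itself */
};

/* Shared step: turn a route lookup result into an action. */
static enum arp_action from_route(const struct route_result *rt)
{
	assert(rt != NULL);
	if (!rt->found)
		return ARP_SKIP;
	return rt->vid ? ARP_SEND_TAGGED : ARP_SEND_UNTAGGED;
}

/* 2.6.32-era: no VLANs above the bond means no route lookup at all. */
static enum arp_action old_decision(bool bond_has_vlans,
				    const struct route_result *rt)
{
	if (!bond_has_vlans)
		return ARP_SEND_UNTAGGED;
	return from_route(rt);
}

/* Current: the lookup is unconditional, so a failed lookup sends nothing. */
static enum arp_action current_decision(const struct route_result *rt)
{
	return from_route(rt);
}

/* Proposed patch: a failed lookup falls back to an untagged ARP. */
static enum arp_action patched_decision(const struct route_result *rt)
{
	enum arp_action action = from_route(rt);

	return action == ARP_SKIP ? ARP_SEND_UNTAGGED : action;
}
```

For a target whose route is only reachable through an oif/iif rule (so
the lookup fails), old_decision() sends untagged only when the bond is
known to have no VLANs, current_decision() sends nothing, and
patched_decision() always sends untagged.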

Nowadays, there is no easy way to tell if there are VLANs above the
bond, and there's generally a VID 0 configured anyway, so the route
lookup is unconditional.

In the case at issue here (the route lookup for the arp_ip_target IP
address fails), it's not possible for bonding to determine what
interface would be used, and therefore what VLAN tag to use.

Francois's patch would make bonding essentially take a best guess of
"no VLAN" and send an untagged ARP for any destination not found in the
regular (no iif, oif, etc, rule) routing table, which is what used to
happen for the "known no VLAN" case.

With the patch, these ARPs may have an all-zero source IP address
(since the bond_confirm_addr call may not find a suitable source address
for something it can't find a route to).  That is a legal ARP (used for
duplicate address detection according to RFC 2131), but when I last
tried it a couple of years ago, the replies would not pass arp_validate
(as the target IP of 0.0.0.0 in the reply doesn't match any of the
bond's IP addresses), and I suspect that hasn't changed.

In the days-of-yore code above, bonding kept track of what it thought
the bond's IP address was (bond->master_ip), and used that as the source
IP in the ARPs.  That wasn't always correct if the bond had multiple IP
addresses.

So, ultimately, Francois is correct that this is a regression of a
behavior that used to work.  On the other hand, this patch isn't really
a complete restoration of the prior behavior.  It's no longer possible
to know that there aren't any VLANs above the bond, and so the "no VLAN"
guess is much less reliable than it used to be, plus the ARPs that will
be generated probably won't work with arp_validate.

As much as I loathe adding more options to bonding, a manually
selected "force VLAN ID" for the arp_ip_target(s) would resolve this for
the minority of cases where the automatic VLAN ID selection does not
function.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com