From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tokarev
Subject: Re: e100 + VLANs?
Date: Mon, 10 Oct 2011 20:51:04 +0400
Message-ID: <4E932278.8010802@msgid.tls.msk.ru>
References: <4E90212D.8030009@msgid.tls.msk.ru> <1318091046.5276.22.camel@edumazet-laptop> <4E9097C0.2030307@gmail.com> <20111010101954.GB2840382@jupiter.n2.diac24.net> <4E9307CB.4050704@msgid.tls.msk.ru> <1318259152.3227.0.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC> <20111010151343.GB3260852@jupiter.n2.diac24.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Eric Dumazet, jeffrey.t.kirsher@intel.com, netdev
To: David Lamparter
Return-path:
Received: from isrv.corpit.ru ([86.62.121.231]:54966 "EHLO isrv.corpit.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751140Ab1JJQvG (ORCPT ); Mon, 10 Oct 2011 12:51:06 -0400
In-Reply-To: <20111010151343.GB3260852@jupiter.n2.diac24.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

10.10.2011 19:13, David Lamparter wrote:
> On Mon, Oct 10, 2011 at 05:05:52PM +0200, Eric Dumazet wrote:
>>> When pinging this NIC from another machine over VLAN5, I see
>>> ARP packets coming to it, gets recognized and replies going
>>> back, all on vlan 5. But on the other side, replies comes
>>> WITHOUT a VLAN tag!
>>>
>>> From this NIC's point of view, capturing on whole ethX:
>>>
>>> 00:1f:c6:ef:e5:1b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 5, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.48.11.2 tell 10.48.11.1, length 42
>>> 00:90:27:30:6d:1c > 00:1f:c6:ef:e5:1b, ethertype 802.1Q (0x8100), length 46: vlan 5, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.48.11.2 is-at 00:90:27:30:6d:1c, length 28
>>>
>>> From the partner point of view, also on whole ethX:
>>>
>>> 00:1f:c6:ef:e5:1b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 5, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.48.11.2 tell 10.48.11.1, length 28
>>> 00:90:27:30:6d:1c > 00:1f:c6:ef:e5:1b, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.48.11.2 is-at 00:90:27:30:6d:1c, length 46
>>>
>>> So, the tag gets eaten somewhere along the way... ;)
>
> Hmm. Looks like broken VLAN TX offload, but the driver doesn't even
> implement VLAN offload. Maybe it's broken in its non-implementation...
>
> Your "partner" is a known-good setup and can be assumed to be working
> correctly? This is over a crossover cable, no evil switches involved?

There are just two machines involved, both connected to the same
_switch_ - no, it is not over cross-over cable. It's a good idea
to test one, I'll try it tomorrow (will insert a second "known good"
nic into another machine).

The second machine, the "partner", has this NIC:

02:00.0 Ethernet controller: Atheros Communications L1 Gigabit Ethernet (rev b0)

and it is a known-good implementation - it worked with and without
vlan tags (we had a weird mixed tagged/untagged setup) for over 2
years without any issues, and which works now as well - it's our
main server which is in two VLANs, connected to an interface marked
as tagged in the switch.
It communicates with the other machine when that other machine uses
already mentioned VIA RhineIII NIC - which I used to replace this
non-working E100.

So it's 2 machines, one with 2 nics - VIA Rhine (working) and e100
(non-working), both connected to two "tagged" ports in the switch.
And another, with atl1 NIC, also connected to a "tagged" port in
the switch.

>>> And I can't really recreate the situation which I had - I know
>>> some packets were flowing, so at least ARP worked. Now it
>>> does not work anymore.
>>
>> What the 'partner' setup looks like ?
>>
>> ip link
>> ip addr
>> ip ro
> 'local' setup too please :)

The setup is quite complex - there are numerous tunnels and virtual
interfaces. Here are the relevant parts. (Note that `ip addr' includes
information present in `ip link'):

The "Partner" machine, with just one NIC, atl1, ip addr:

2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
3: tls-vlan@eth0: mtu 1500 qdisc noqueue master tls-br state UP
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff

Our main vlan, LAN, #1.

4: tls-br: mtu 1500 qdisc noqueue state UP
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
    inet 192.168.177.15/26 brd 192.168.177.63 scope global tls-br

A bridge that connects this VLAN#1 and other stuff (virtual machines etc)

6: dmz-vlan@eth0: mtu 1500 qdisc noqueue master dmz-br state UP
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff

That's DMZ segment, VLAN#2

...

21: test@eth0: mtu 1500 qdisc noqueue state UP
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
    inet 10.48.11.1/24 scope global test

This is vlan#5, my test vlan.

The machine with two (working, via-rhine, and non-working, e100):

2: ethx: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff

This is via-rhine, with the MAC address of E100 -- the one which works.
13: eth-tls@ethx: mtu 1500 qdisc noqueue state UP
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.177.5/26 brd 192.168.177.63 scope global eth-tls

Our main VLAN#1 (here it's w/o bridge)

14: eth-dmz@ethx: mtu 1500 qdisc noqueue state UP
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.177.225/29 brd 192.168.177.231 scope global eth-dmz

DMZ VLAN#2

4: eth2: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff

The non-working e100. Here it has the same MAC address as ethx above,
because I explicitly changed ethx to have this MAC, since the $ISP has
it hardcoded for our port on their side. The tests were done with
the two addresses being original as set up by the hardware, and later
on I also tried to set this MAC to be 00:90:27:30:6d:1d (note the
last digit) - all the same result, packets sent over the iface above
shows on the receiving side as having no vlan tag.

24: test@eth2: mtu 1500 qdisc noqueue state UP
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff
    inet 10.48.11.2/24 scope global test

And finally this is the test vlan#5. tcpdump was run on eth2 here
and on eth0 on the first machine. On both machines tcpdump is of
version 4.1.1.

Here's offload information for e100 nic:

# ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off

It supports (or appears to) some offloading, in particular I can
enable GSO offload, and it even works somehow.

Now, I enabled another pair of VLAN interfaces on these two NICs,
with VLAN#6, and configured both ports in the switch to be parts
of VLAN6 too (tagged). And voila, everything now works in there.
Two ifaces added, "partner", atl1:

22: test6@eth0: mtu 1500 qdisc noop state DOWN
    link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
    inet 10.48.6.1/24 scope global test6

this e100:

25: test6@eth2: mtu 1500 qdisc noop state DOWN
    link/ether 00:90:27:30:6d:1c brd ff:ff:ff:ff:ff:ff
    inet 10.48.6.2/24 scope global test6

Yesterday, the vlan ID where it didn't work was #4, and in #1 it
all - apparently - worked.

I created 2 more pairs of VLAN interfaces and added them to the
switch -- it all works just fine. Here:

# x=8; ip link add link eth2 name test$x type vlan id $x; ip addr add 10.48.$x.2/24 dev test$x; ip link set test$x up

(That's on the e100 side, similar was on atl1 side).

x=6, x=7 and x=8 work just fine. x=5 does not, ARP replies arrive
without VLAN tag at the atl1 side.

Ok. So now I can reproduce the initial problem.

So, `ping -s 1469' from atl1 side, so that the resulting packet size
is 1497 bytes (1468 is the largest size that works) -- the packets
do not arrive at e100 side at all - it's 100% quiet in tcpdump there.

When pinging from e100 side and tcpdump'ing on atl1 side (replies do
not come back to e100):

20:49:33.322646 00:90:27:30:6d:1c > 00:1f:c6:ef:e5:1b, ethertype 802.1Q (0x8100), length 1515: vlan 8, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1497)
    10.48.8.2 > 10.48.8.1: ICMP echo request, id 5785, seq 72, length 1477
20:49:33.322691 00:1f:c6:ef:e5:1b > 00:90:27:30:6d:1c, ethertype 802.1Q (0x8100), length 1515: vlan 8, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 23781, offset 0, flags [none], proto ICMP (1), length 1497)
    10.48.8.1 > 10.48.8.2: ICMP echo reply, id 5785, seq 72, length 1477

So it appears that on e100 side, the _receive_ buffer is too small
somehow.

I'll do some more experiments with VLAN#5 tomorrow, in a clean
environment (maybe using direct cable connection - not cross-over,
since GigE should autodetect this stuff (hopefully)).

Thanks!

/mjt
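
For reference, the sizes quoted in the ping test can be checked with a
little arithmetic. What follows is only a sketch in Python, assuming a
20-byte IPv4 header without options, an 8-byte ICMP header, a 14-byte
Ethernet header and a single 4-byte 802.1Q tag; the function name
frame_sizes is made up for illustration, and the frame length excludes
the FCS, as tcpdump reports it.

```python
# Header sizes in bytes. These are the usual values and are assumptions
# here: no IP options, exactly one 802.1Q tag, FCS not counted.
ICMP_HDR = 8
IPV4_HDR = 20
ETH_HDR = 14
VLAN_TAG = 4


def frame_sizes(ping_payload):
    """Return (IP total length, tagged frame length) for `ping -s N`."""
    ip_len = ping_payload + ICMP_HDR + IPV4_HDR
    frame_len = ip_len + ETH_HDR + VLAN_TAG
    return ip_len, frame_len


if __name__ == "__main__":
    # 1468 is the largest payload reported to work, 1469 the first to fail.
    for payload in (1468, 1469):
        ip_len, frame_len = frame_sizes(payload)
        print(f"-s {payload}: IP length {ip_len}, tagged frame {frame_len}")
```

With -s 1469 this gives an IP length of 1497 and a tagged frame of 1515
bytes, matching the capture. With -s 1468 the tagged frame is 1514
bytes, which is also the largest untagged frame at MTU 1500. The
numbers are therefore consistent with the receive path accepting at
most 1514 bytes and leaving no room for the tag, but that is an
inference from the captures, not something confirmed in this thread.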