From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Patrick McHardy <kaber@trash.net>,
Nithin Nayak Sujir <nsujir@broadcom.com>,
Michael Chan <mchan@broadcom.com>, Jiri Pirko <jiri@resnulli.us>,
Vladislav Yasevich <vyasevic@redhat.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.16 03/55] net: Always untag vlan-tagged traffic on input.
Date: Mon, 13 Oct 2014 04:24:17 +0200
Message-ID: <20141013022443.870604371@linuxfoundation.org>
In-Reply-To: <20141013022443.729870634@linuxfoundation.org>
3.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Yasevich <vyasevic@redhat.com>
[ Upstream commit 0d5501c1c828fb97d02af50aa9d2b1a5498b94e4 ]
Currently the functionality to untag traffic on input resides
as part of the vlan module and is built only when VLAN support
is enabled in the kernel. When VLAN is disabled, the function
vlan_untag() turns into a stub and doesn't really untag the
packets. This seems to create an interesting interaction
between VMs supporting checksum offloading and some network drivers.
There are some drivers that do not allow the user to change the
tx-vlan-offload feature of the driver. These drivers also seem
to assume that any VLAN-tagged traffic they transmit will
have the vlan information in the vlan_tci and not in the vlan
header already in the skb. When transmitting skbs that already
have tagged data with partial checksum set, the checksum doesn't
appear to be updated correctly by the card thus resulting in a
failure to establish TCP connections.
The following is a packet trace taken on the receiver where a
sender is a VM with a VLAN configured. The host the VM is running on
does not have VLAN support and the outgoing interface on the
host is tg3:
10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294837885 ecr 0,nop,wscale 7], length 0
10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294838888 ecr 0,nop,wscale 7], length 0
This connection finally times out.
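For readers unfamiliar with the "vlan 100, p 0" fields in the trace above, the following is a minimal userspace sketch of how a 16-bit 802.1Q TCI splits into priority, DEI and VLAN ID. The struct and function names are hypothetical and are not part of this patch or of the kernel API.

```c
#include <stdint.h>

/* Hypothetical container for the three fields packed into a TCI. */
struct vlan_tci_fields {
	uint8_t  pcp;	/* priority code point, top 3 bits ("p" in tcpdump) */
	uint8_t  dei;	/* drop eligible indicator, 1 bit */
	uint16_t vid;	/* VLAN ID, low 12 bits */
};

/* Split a host-byte-order TCI into its fields. */
static struct vlan_tci_fields vlan_tci_decode(uint16_t tci)
{
	struct vlan_tci_fields f;

	f.pcp = (uint8_t)((tci >> 13) & 0x7);
	f.dei = (uint8_t)((tci >> 12) & 0x1);
	f.vid = (uint16_t)(tci & 0x0fff);
	return f;
}
```

With this layout, a TCI of 0x0064 corresponds to VLAN 100 with priority 0, which matches what the trace reports.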
I've only access to the TG3 hardware in this configuration thus have
only tested this with TG3 driver. There are a lot of other drivers
that do not permit user changes to vlan acceleration features, and
I don't know if they all suffer from a similar issue.
This patch attempts to fix this another way. It moves the vlan header
stripping code out of the vlan module and always builds it into the
kernel network core. This way, even if vlan is not supported on
a virtualization host, the virtual machines running on top of such
a host will still work with VLANs enabled.
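As a rough illustration of what "untagging" means at the byte level, here is a userspace sketch that strips one 802.1Q tag from a raw Ethernet frame held in a flat buffer. It is not the kernel code: frame_vlan_untag() is a hypothetical helper, the constants are redefined locally, and it closes the gap by moving the tail of the frame down, whereas skb_vlan_untag() in this patch pulls the tag off the skb, moves the MAC addresses up and records the TCI in skb metadata.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the usual values, defined here so the sketch is standalone. */
#define ETH_ALEN	6	/* bytes in a MAC address */
#define ETH_HLEN	14	/* untagged Ethernet header length */
#define VLAN_HLEN	4	/* TPID (2 bytes) + TCI (2 bytes) */
#define ETH_P_8021Q	0x8100

/*
 * Hypothetical helper: remove one 802.1Q tag from frame[0..len) in place.
 * On success stores the TCI (host byte order) in *tci and returns the new
 * frame length. Returns 0 if the frame is too short or is not tagged.
 */
static size_t frame_vlan_untag(uint8_t *frame, size_t len, uint16_t *tci)
{
	uint16_t tpid;

	if (len < ETH_HLEN + VLAN_HLEN)
		return 0;

	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != ETH_P_8021Q)
		return 0;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide the encapsulated ethertype and payload over the 4-byte tag. */
	memmove(frame + 2 * ETH_ALEN,
		frame + 2 * ETH_ALEN + VLAN_HLEN,
		len - 2 * ETH_ALEN - VLAN_HLEN);

	return len - VLAN_HLEN;
}
```

After the call, the two bytes following the MAC addresses hold the encapsulated ethertype (for example 0x0800 for IPv4), and the VLAN information lives only in the returned TCI. This corresponds to the state the drivers described above seem to expect.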
CC: Patrick McHardy <kaber@trash.net>
CC: Nithin Nayak Sujir <nsujir@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/if_vlan.h | 6 -----
include/linux/skbuff.h | 1
net/8021q/vlan_core.c | 53 ------------------------------------------------
net/bridge/br_vlan.c | 2 -
net/core/dev.c | 2 -
net/core/skbuff.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 56 insertions(+), 61 deletions(-)
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -187,7 +187,6 @@ vlan_dev_get_egress_qos_mask(struct net_
}
extern bool vlan_do_receive(struct sk_buff **skb);
-extern struct sk_buff *vlan_untag(struct sk_buff *skb);
extern int vlan_vid_add(struct net_device *dev, __be16 proto, u16 vid);
extern void vlan_vid_del(struct net_device *dev, __be16 proto, u16 vid);
@@ -241,11 +240,6 @@ static inline bool vlan_do_receive(struc
return false;
}
-static inline struct sk_buff *vlan_untag(struct sk_buff *skb)
-{
- return skb;
-}
-
static inline int vlan_vid_add(struct net_device *dev, __be16 proto, u16 vid)
{
return 0;
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2549,6 +2549,7 @@ int skb_shift(struct sk_buff *tgt, struc
void skb_scrub_packet(struct sk_buff *skb, bool xnet);
unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
+struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
struct skb_checksum_ops {
__wsum (*update)(const void *mem, int len, __wsum wsum);
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -112,59 +112,6 @@ __be16 vlan_dev_vlan_proto(const struct
}
EXPORT_SYMBOL(vlan_dev_vlan_proto);
-static struct sk_buff *vlan_reorder_header(struct sk_buff *skb)
-{
- if (skb_cow(skb, skb_headroom(skb)) < 0) {
- kfree_skb(skb);
- return NULL;
- }
-
- memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
- skb->mac_header += VLAN_HLEN;
- return skb;
-}
-
-struct sk_buff *vlan_untag(struct sk_buff *skb)
-{
- struct vlan_hdr *vhdr;
- u16 vlan_tci;
-
- if (unlikely(vlan_tx_tag_present(skb))) {
- /* vlan_tci is already set-up so leave this for another time */
- return skb;
- }
-
- skb = skb_share_check(skb, GFP_ATOMIC);
- if (unlikely(!skb))
- goto err_free;
-
- if (unlikely(!pskb_may_pull(skb, VLAN_HLEN)))
- goto err_free;
-
- vhdr = (struct vlan_hdr *) skb->data;
- vlan_tci = ntohs(vhdr->h_vlan_TCI);
- __vlan_hwaccel_put_tag(skb, skb->protocol, vlan_tci);
-
- skb_pull_rcsum(skb, VLAN_HLEN);
- vlan_set_encap_proto(skb, vhdr);
-
- skb = vlan_reorder_header(skb);
- if (unlikely(!skb))
- goto err_free;
-
- skb_reset_network_header(skb);
- skb_reset_transport_header(skb);
- skb_reset_mac_len(skb);
-
- return skb;
-
-err_free:
- kfree_skb(skb);
- return NULL;
-}
-EXPORT_SYMBOL(vlan_untag);
-
-
/*
* vlan info and vid list
*/
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -183,7 +183,7 @@ bool br_allowed_ingress(struct net_bridg
*/
if (unlikely(!vlan_tx_tag_present(skb) &&
skb->protocol == proto)) {
- skb = vlan_untag(skb);
+ skb = skb_vlan_untag(skb);
if (unlikely(!skb))
return false;
}
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3588,7 +3588,7 @@ another_round:
if (skb->protocol == cpu_to_be16(ETH_P_8021Q) ||
skb->protocol == cpu_to_be16(ETH_P_8021AD)) {
- skb = vlan_untag(skb);
+ skb = skb_vlan_untag(skb);
if (unlikely(!skb))
goto unlock;
}
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -62,6 +62,7 @@
#include <linux/scatterlist.h>
#include <linux/errqueue.h>
#include <linux/prefetch.h>
+#include <linux/if_vlan.h>
#include <net/protocol.h>
#include <net/dst.h>
@@ -3959,3 +3960,55 @@ unsigned int skb_gso_transport_seglen(co
return shinfo->gso_size;
}
EXPORT_SYMBOL_GPL(skb_gso_transport_seglen);
+
+static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb)
+{
+ if (skb_cow(skb, skb_headroom(skb)) < 0) {
+ kfree_skb(skb);
+ return NULL;
+ }
+
+ memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
+ skb->mac_header += VLAN_HLEN;
+ return skb;
+}
+
+struct sk_buff *skb_vlan_untag(struct sk_buff *skb)
+{
+ struct vlan_hdr *vhdr;
+ u16 vlan_tci;
+
+ if (unlikely(vlan_tx_tag_present(skb))) {
+ /* vlan_tci is already set-up so leave this for another time */
+ return skb;
+ }
+
+ skb = skb_share_check(skb, GFP_ATOMIC);
+ if (unlikely(!skb))
+ goto err_free;
+
+ if (unlikely(!pskb_may_pull(skb, VLAN_HLEN)))
+ goto err_free;
+
+ vhdr = (struct vlan_hdr *)skb->data;
+ vlan_tci = ntohs(vhdr->h_vlan_TCI);
+ __vlan_hwaccel_put_tag(skb, skb->protocol, vlan_tci);
+
+ skb_pull_rcsum(skb, VLAN_HLEN);
+ vlan_set_encap_proto(skb, vhdr);
+
+ skb = skb_reorder_vlan_header(skb);
+ if (unlikely(!skb))
+ goto err_free;
+
+ skb_reset_network_header(skb);
+ skb_reset_transport_header(skb);
+ skb_reset_mac_len(skb);
+
+ return skb;
+
+err_free:
+ kfree_skb(skb);
+ return NULL;
+}
+EXPORT_SYMBOL(skb_vlan_untag);
Thread overview: 58+ messages
2014-10-13 2:24 [PATCH 3.16 00/55] 3.16.6-stable review Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 01/55] netlink: reset network header before passing to taps Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 02/55] rtnetlink: fix VF info size Greg Kroah-Hartman
2014-10-13 2:24 ` Greg Kroah-Hartman [this message]
2014-10-13 2:24 ` [PATCH 3.16 04/55] myri10ge: check for DMA mapping errors Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 05/55] Revert "macvlan: simplify the structure port" Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 06/55] tcp: dont use timestamp from repaired skb-s to calculate RTT (v2) Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 07/55] sit: Fix ipip6_tunnel_lookup device matching criteria Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 08/55] tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 09/55] tcp: fix ssthresh and undo for consecutive short FRTO episodes Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 10/55] net: phy: smsc: move smsc_phy_config_init reset part in a soft_reset function Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 11/55] tipc: fix message importance range check Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 12/55] packet: handle too big packets for PACKET_V3 Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 13/55] bnx2x: Revert UNDI flushing mechanism Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 14/55] net: ipv6: fib: dont sleep inside atomic lock Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 15/55] openvswitch: fix panic with multiple vlan headers Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 16/55] vxlan: fix incorrect initializer in union vxlan_addr Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 17/55] net: fix checksum features handling in netif_skb_features() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 18/55] ipv6: fix rtnl locking in setsockopt for anycast and multicast Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 19/55] l2tp: fix race while getting PMTU on PPP pseudo-wire Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 20/55] ipv6: restore the behavior of ipv6_sock_ac_drop() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 21/55] bonding: fix div by zero while enslaving and transmitting Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 22/55] net: filter: fix possible use after free Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 23/55] bridge: Check if vlan filtering is enabled only once Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 24/55] bridge: Fix br_should_learn to check vlan_enabled Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 25/55] net: allow macvlans to move to net namespace Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 26/55] macvlan: allow to enqueue broadcast pkt on virtual device Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 27/55] tg3: Work around HW/FW limitations with vlan encapsulated frames Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 28/55] tg3: Allow for recieve of full-size 8021AD frames Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 29/55] xfrm: Generate blackhole routes only from route lookup functions Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 30/55] xfrm: Generate queueing " Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 31/55] ip_tunnel: Dont allow to add the same tunnel multiple times Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 32/55] macvtap: Fix race between device delete and open Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 33/55] Revert "net/macb: add pinctrl consumer support" Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 34/55] net/mlx4_core: Allow not to specify probe_vf in SRIOV IB mode Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 35/55] net/mlx4: Correctly configure single ported VFs from the host Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 36/55] gro: fix aggregation for skb using frag_list Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 37/55] ipv6: remove rt6i_genid Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 38/55] hyperv: Fix a bug in netvsc_start_xmit() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 39/55] ip6_gre: fix flowi6_proto value in xmit path Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 40/55] net: systemport: fix bcm_sysport_insert_tsb() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 41/55] team: avoid race condition in scheduling delayed work Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 42/55] hyperv: Fix a bug in netvsc_send() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 43/55] sctp: handle association restarts when the socket is closed Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 44/55] net_sched: copy exts->type in tcf_exts_change() Greg Kroah-Hartman
2014-10-13 2:24 ` [PATCH 3.16 45/55] uas: Add a quirk for rejecting ATA_12 and ATA_16 commands Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 46/55] uas: Add no-report-opcodes quirk Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 47/55] uas: Add US_FL_NO_ATA_1X quirk for Seagate (0bc2:ab20) drives Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 48/55] uas: Add another ASM1051 usb-id to the uas blacklist Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 49/55] USB: Add device quirk for ASUS T100 Base Station keyboard Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 50/55] USB: serial: cp210x: added Ketra N1 wireless interface support Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 51/55] USB: cp210x: add support for Seluxit USB dongle Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 52/55] usb: musb: dsps: kill OTG timer on suspend Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 53/55] crypto: caam - fix addressing of struct member Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 54/55] driver/base/node: remove unnecessary kfree of node struct from unregister_one_node Greg Kroah-Hartman
2014-10-13 2:25 ` [PATCH 3.16 55/55] serial: 8250: Add Quark X1000 to 8250_pci.c Greg Kroah-Hartman
2014-10-13 15:19 ` [PATCH 3.16 00/55] 3.16.6-stable review Guenter Roeck
2014-10-13 20:32 ` Shuah Khan