* [PATCH net-next v3 1/6] openvswitch: make skb modifiable in ovs_flow_key_extract*
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 2/6] openvswitch: normalize vlan rx path Jiri Benc
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
Allow ovs_flow_key_extract and ovs_flow_key_extract_userspace to modify the
skb. This will be used by the following patch to move vlan tag to the
vlan_tci field.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: new in v3
---
net/openvswitch/datapath.c | 9 ++++++---
net/openvswitch/flow.c | 33 +++++++++++++++++++++++++--------
net/openvswitch/flow.h | 14 ++++++++------
net/openvswitch/vport.c | 8 +++-----
4 files changed, 42 insertions(+), 22 deletions(-)
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 4d67ea856067..3469a8df3d5e 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -607,10 +607,13 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
if (IS_ERR(flow))
goto err_kfree_skb;
- err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
- packet, &flow->key, log);
- if (err)
+ packet = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
+ packet, &flow->key, log);
+ if (IS_ERR(packet)) {
+ err = PTR_ERR(packet);
+ packet = NULL;
goto err_flow_free;
+ }
err = ovs_nla_copy_actions(net, a[OVS_PACKET_ATTR_ACTIONS],
&flow->key, &acts, log);
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index c8c82e109c68..f358608dd33d 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -721,9 +721,12 @@ int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key)
return key_extract(skb, key);
}
-int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
- struct sk_buff *skb, struct sw_flow_key *key)
+struct sk_buff *ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
+ struct sk_buff *skb,
+ struct sw_flow_key *key)
{
+ int err;
+
/* Extract metadata from packet. */
if (tun_info) {
key->tun_proto = ip_tunnel_info_af(tun_info);
@@ -753,19 +756,33 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
key->ovs_flow_hash = 0;
key->recirc_id = 0;
- return key_extract(skb, key);
+ err = key_extract(skb, key);
+ if (err) {
+ kfree_skb(skb);
+ return ERR_PTR(err);
+ }
+ return skb;
}
-int ovs_flow_key_extract_userspace(struct net *net, const struct nlattr *attr,
- struct sk_buff *skb,
- struct sw_flow_key *key, bool log)
+struct sk_buff *ovs_flow_key_extract_userspace(struct net *net,
+ const struct nlattr *attr,
+ struct sk_buff *skb,
+ struct sw_flow_key *key,
+ bool log)
{
int err;
/* Extract metadata from netlink attributes. */
err = ovs_nla_get_flow_metadata(net, attr, key, log);
if (err)
- return err;
+ goto err_free;
- return key_extract(skb, key);
+ err = key_extract(skb, key);
+ if (err)
+ goto err_free;
+ return skb;
+
+err_free:
+ kfree_skb(skb);
+ return ERR_PTR(err);
}
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index ae783f5c6695..7eb251f1471f 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -224,12 +224,14 @@ void ovs_flow_stats_clear(struct sw_flow *);
u64 ovs_flow_used_time(unsigned long flow_jiffies);
int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key);
-int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
- struct sk_buff *skb,
- struct sw_flow_key *key);
+struct sk_buff *ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
+ struct sk_buff *skb,
+ struct sw_flow_key *key);
/* Extract key from packet coming from userspace. */
-int ovs_flow_key_extract_userspace(struct net *net, const struct nlattr *attr,
- struct sk_buff *skb,
- struct sw_flow_key *key, bool log);
+struct sk_buff *ovs_flow_key_extract_userspace(struct net *net,
+ const struct nlattr *attr,
+ struct sk_buff *skb,
+ struct sw_flow_key *key,
+ bool log);
#endif /* flow.h */
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 8f198437c724..8aefcb20cc58 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -455,11 +455,9 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
}
/* Extract flow from 'skb' into 'key'. */
- error = ovs_flow_key_extract(tun_info, skb, &key);
- if (unlikely(error)) {
- kfree_skb(skb);
- return error;
- }
+ skb = ovs_flow_key_extract(tun_info, skb, &key);
+ if (IS_ERR(skb))
+ return PTR_ERR(skb);
ovs_dp_process_packet(skb, &key);
return 0;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net-next v3 2/6] openvswitch: normalize vlan rx path
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 1/6] openvswitch: make skb modifiable in ovs_flow_key_extract* Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 3/6] openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev Jiri Benc
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
Similarly to how the core networking stack behaves, let the first vlan tag
be always stored in skb->vlan_tci. This is already ensured in
__netif_receive_skb_core for packets that were received from the kernel and
honored by skb_vlan_push and skb_vlan_pop. There is a couple of paths where
a packet with vlan header inside the packet data can be received:
(1) Packets received from the user space.
(2) Packets received via internal device, either injected through AF_PACKET
or with vlan acceleration turned off.
In addition, there will be a third path when support for ETH_P_TEB packets
is added.
To catch all of these cases, untag the vlan frame in ovs_flow_key_extract
and ovs_flow_key_extract_userspace.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: moved the untagging to ovs_flow_key_extract* to catch all cases
---
net/openvswitch/flow.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index f358608dd33d..14159ac19850 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -716,6 +716,20 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
return 0;
}
+static struct sk_buff *vlan_untag(struct sk_buff *skb)
+{
+ if (eth_type_vlan(skb->protocol)) {
+ __skb_pull(skb, ETH_HLEN);
+ skb_reset_network_header(skb);
+ skb_reset_mac_len(skb);
+ skb = skb_vlan_untag(skb);
+ if (unlikely(!skb))
+ return NULL;
+ skb_push(skb, ETH_HLEN);
+ }
+ return skb;
+}
+
int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key)
{
return key_extract(skb, key);
@@ -727,6 +741,10 @@ struct sk_buff *ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
{
int err;
+ skb = vlan_untag(skb);
+ if (!skb)
+ return ERR_PTR(-ENOMEM);
+
/* Extract metadata from packet. */
if (tun_info) {
key->tun_proto = ip_tunnel_info_af(tun_info);
@@ -772,6 +790,10 @@ struct sk_buff *ovs_flow_key_extract_userspace(struct net *net,
{
int err;
+ skb = vlan_untag(skb);
+ if (!skb)
+ return ERR_PTR(-ENOMEM);
+
/* Extract metadata from netlink attributes. */
err = ovs_nla_get_flow_metadata(net, attr, key, log);
if (err)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net-next v3 3/6] openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 1/6] openvswitch: make skb modifiable in ovs_flow_key_extract* Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 2/6] openvswitch: normalize vlan rx path Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 4/6] openvswitch: keep vlan tag accelerated on internal device Jiri Benc
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
The internal device does support 802.1AD offloading.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: new in v3
---
net/openvswitch/vport-internal_dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index 95c36147a6e1..e7da29021b38 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -176,7 +176,7 @@ static void do_setup(struct net_device *netdev)
netdev->vlan_features = netdev->features;
netdev->hw_enc_features = netdev->features;
- netdev->features |= NETIF_F_HW_VLAN_CTAG_TX;
+ netdev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
netdev->hw_features = netdev->features & ~NETIF_F_LLTX;
eth_hw_addr_random(netdev);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net-next v3 4/6] openvswitch: keep vlan tag accelerated on internal device
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
` (2 preceding siblings ...)
2016-10-07 16:07 ` [PATCH net-next v3 3/6] openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 5/6] openvswitch: remove unreachable code in vlan parsing Jiri Benc
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
Disallow turning off of vlan acceleration on internal ports. We need the
vlan tag to be in skb->vlan_tci; otherwise, we would pull it back in
ovs_flow_key_extract, defeating the purpose of setting the vlan acceleration
off in the first place.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: new in v3
---
net/openvswitch/vport-internal_dev.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index e7da29021b38..0531d48eb960 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -143,6 +143,12 @@ static void internal_set_rx_headroom(struct net_device *dev, int new_hr)
dev->needed_headroom = new_hr < 0 ? 0 : new_hr;
}
+static netdev_features_t internal_fix_features(struct net_device *dev,
+ netdev_features_t features)
+{
+ return features | NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
+}
+
static const struct net_device_ops internal_dev_netdev_ops = {
.ndo_open = internal_dev_open,
.ndo_stop = internal_dev_stop,
@@ -151,6 +157,7 @@ static const struct net_device_ops internal_dev_netdev_ops = {
.ndo_change_mtu = internal_dev_change_mtu,
.ndo_get_stats64 = internal_get_stats,
.ndo_set_rx_headroom = internal_set_rx_headroom,
+ .ndo_fix_features = internal_fix_features,
};
static struct rtnl_link_ops internal_dev_link_ops __read_mostly = {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net-next v3 5/6] openvswitch: remove unreachable code in vlan parsing
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
` (3 preceding siblings ...)
2016-10-07 16:07 ` [PATCH net-next v3 4/6] openvswitch: keep vlan tag accelerated on internal device Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 16:07 ` [PATCH net-next v3 6/6] openvswitch: fix vlan subtraction from packet length Jiri Benc
2016-10-07 19:59 ` [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Pravin Shelar
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
Now when the first vlan tag is always in skb->vlan_tci, drop code that
assumed it might not be the case.
This patch also removes the wrong likely() statement around
skb_vlan_tag_present introduced by 018c1dda5ff1 ("openvswitch: 802.1AD Flow
handling, actions, vlan parsing, netlink attributes"). This code is called
whenever flow key is being extracted from the packet, the packet may be as
likely vlan tagged as not.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: unchanged
---
net/openvswitch/flow.c | 28 ++++++++--------------------
1 file changed, 8 insertions(+), 20 deletions(-)
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 14159ac19850..45e384167053 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -308,9 +308,7 @@ static bool icmp6hdr_ok(struct sk_buff *skb)
/**
* Parse vlan tag from vlan header.
- * Returns ERROR on memory error.
- * Returns 0 if it encounters a non-vlan or incomplete packet.
- * Returns 1 after successfully parsing vlan tag.
+ * Returns ERROR on memory error, 0 otherwise.
*/
static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *key_vh)
{
@@ -331,34 +329,24 @@ static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *key_vh)
key_vh->tpid = vh->tpid;
__skb_pull(skb, sizeof(struct vlan_head));
- return 1;
+ return 0;
}
static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key)
{
- int res;
-
key->eth.vlan.tci = 0;
key->eth.vlan.tpid = 0;
key->eth.cvlan.tci = 0;
key->eth.cvlan.tpid = 0;
- if (likely(skb_vlan_tag_present(skb))) {
- key->eth.vlan.tci = htons(skb->vlan_tci);
- key->eth.vlan.tpid = skb->vlan_proto;
- } else {
- /* Parse outer vlan tag in the non-accelerated case. */
- res = parse_vlan_tag(skb, &key->eth.vlan);
- if (res <= 0)
- return res;
- }
+ if (!skb_vlan_tag_present(skb))
+ return 0;
- /* Parse inner vlan tag. */
- res = parse_vlan_tag(skb, &key->eth.cvlan);
- if (res <= 0)
- return res;
+ key->eth.vlan.tci = htons(skb->vlan_tci);
+ key->eth.vlan.tpid = skb->vlan_proto;
- return 0;
+ /* Parse inner vlan tag. */
+ return parse_vlan_tag(skb, &key->eth.cvlan);
}
static __be16 parse_ethertype(struct sk_buff *skb)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net-next v3 6/6] openvswitch: fix vlan subtraction from packet length
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
` (4 preceding siblings ...)
2016-10-07 16:07 ` [PATCH net-next v3 5/6] openvswitch: remove unreachable code in vlan parsing Jiri Benc
@ 2016-10-07 16:07 ` Jiri Benc
2016-10-07 19:59 ` [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Pravin Shelar
6 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-07 16:07 UTC (permalink / raw)
To: netdev; +Cc: pravin shelar, Eric Garver
When the packet has its vlan tag in skb->vlan_tci, the length of the VLAN
header is not counted in skb->len. It doesn't make sense to subtract it.
In addition, to honor the comment below the code, the VLAN header length
should not be subtracted if there's a vlan tag in skb->vlan_tci. This leads
to the code simply subtracting the Ethernet header length.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v3: unchanged
---
net/openvswitch/vport.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 8aefcb20cc58..45370dd6a685 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -481,17 +481,13 @@ EXPORT_SYMBOL_GPL(ovs_vport_deferred_free);
static unsigned int packet_length(const struct sk_buff *skb)
{
- unsigned int length = skb->len - ETH_HLEN;
-
- if (skb_vlan_tagged(skb))
- length -= VLAN_HLEN;
-
/* Don't subtract for multiple VLAN tags. Most (all?) drivers allow
* (ETH_LEN + VLAN_HLEN) in addition to the mtu value, but almost none
* account for 802.1ad. e.g. is_skb_forwardable().
+ * Note that the first VLAN tag is always in skb->vlan_tci, thus not
+ * accounted for in skb->len.
*/
-
- return length;
+ return skb->len - ETH_HLEN;
}
void ovs_vport_send(struct vport *vport, struct sk_buff *skb)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent
2016-10-07 16:07 [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Jiri Benc
` (5 preceding siblings ...)
2016-10-07 16:07 ` [PATCH net-next v3 6/6] openvswitch: fix vlan subtraction from packet length Jiri Benc
@ 2016-10-07 19:59 ` Pravin Shelar
2016-10-10 12:46 ` Jiri Benc
6 siblings, 1 reply; 9+ messages in thread
From: Pravin Shelar @ 2016-10-07 19:59 UTC (permalink / raw)
To: Jiri Benc; +Cc: Linux Kernel Network Developers, Eric Garver
On Fri, Oct 7, 2016 at 9:07 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Always keep the first vlan tag "accelerated", i.e. in skb->vlan_tci.
>
> Unfortunately, with all the changes since v2, this patchset no longer has
> the nice deletions > insertions diffstat. I still think it's worth it, as it
> makes things more consistent overall.
>
After looking at the changes, I am not sure about the value. These
patches are making code bit complicated by processing vlan header
twice rather than once in current code.
> Patch 3 is valid on its own. Patch 6 is needed in one form or other; with
> the changes in this set, it's a simple deletion. Otherwise we'd need more
> elaborate checks.
>
As far as patch 6 is concerned I think we could do MTU checks similar
to the rest of networking stack (for example is_skb_forwardable()).
That would simplify things here.
> Jiri Benc (6):
> openvswitch: make skb modifiable in ovs_flow_key_extract*
> openvswitch: normalize vlan rx path
> openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev
> openvswitch: keep vlan tag accelerated on internal device
> openvswitch: remove unreachable code in vlan parsing
> openvswitch: fix vlan subtraction from packet length
>
> net/openvswitch/datapath.c | 9 ++--
> net/openvswitch/flow.c | 83 ++++++++++++++++++++++++------------
> net/openvswitch/flow.h | 14 +++---
> net/openvswitch/vport-internal_dev.c | 9 +++-
> net/openvswitch/vport.c | 18 +++-----
> 5 files changed, 83 insertions(+), 50 deletions(-)
>
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent
2016-10-07 19:59 ` [PATCH net-next v3 0/6] openvswitch: make vlan handling consistent Pravin Shelar
@ 2016-10-10 12:46 ` Jiri Benc
0 siblings, 0 replies; 9+ messages in thread
From: Jiri Benc @ 2016-10-10 12:46 UTC (permalink / raw)
To: Pravin Shelar; +Cc: Linux Kernel Network Developers, Eric Garver
On Fri, 7 Oct 2016 12:59:08 -0700, Pravin Shelar wrote:
> On Fri, Oct 7, 2016 at 9:07 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > Always keep the first vlan tag "accelerated", i.e. in skb->vlan_tci.
> >
> > Unfortunately, with all the changes since v2, this patchset no longer has
> > the nice deletions > insertions diffstat. I still think it's worth it, as it
> > makes things more consistent overall.
> >
> After looking at the changes, I am not sure about the value. These
> patches are making code bit complicated by processing vlan header
> twice rather than once in current code.
Yes, this is a trade-off. A bit more complexity on packet ingress, less
complexity on packet processing.
My main motivation was L3 packets where the code in packet_length
became more complicated than I'd like to to cover all possible cases.
Normalizing the vlan tags looked as a pretty obvious improvement. But
I'm not that thrilled with what it evolved into. I think it's slightly
better than what we have now but I can understand how you may think
opposite.
I'll rip the fixes (patches 3 and 6) off this patchset and send them
separately.
> As far as patch 6 is concerned I think we could do MTU checks similar
> to the rest of networking stack (for example is_skb_forwardable()).
> That would simplify things here.
Fixing the current code is not that hard. The real problem is the added
complexity with L3 packets. I'll look more into the possible solutions
for the L3 patchset.
Jiri
^ permalink raw reply [flat|nested] 9+ messages in thread