* [RFC PATCH 00/11] GSO partial and TSO FIXEDID support
From: Alexander Duyck @ 2016-04-07 22:31 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch series represents my respun version of the GSO partial work.
The major change from the first version is that we are no longer making
the decision to mangle IP IDs transparently at the device.  Instead it is
now pushed up to the tunnel layer itself, so the tunnel is now responsible
for deciding whether the IP IDs will be fixed or incrementing for a given
TSO.
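
Concretely, the opt-in in patches 6 and 7 boils down to the tunnel OR-ing
the new GSO type into what it hands to iptunnel_handle_offloads().  A
condensed sketch, using the VXLAN flag from patch 6 (GENEVE is the same
with its own flag):

	if ((vxflags & VXLAN_F_TCP_FIXEDID) && skb_is_gso(skb) &&
	    (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
		type |= SKB_GSO_TCP_FIXEDID;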

I'm a bit jammed up at the moment as I try to determine the best spot to
make the same change for GRE and IPIP tunnels that I have already made
for VXLAN and GENEVE.  I'm assuming the correct spot is somewhere near
iptunnel_handle_offloads, as it was for the other two tunnel types, but
the flow-based GRE tunnels are making it a bit more complicated, as I am
not sure a tunnel dev actually exists for those tunnels.

This patch series is meant to apply to the dev-queue branch of Jeff
Kirsher's next-queue tree at:
https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git

I chose his tree as the Intel driver patches would not apply otherwise.

Patch 1 of the series is a copy of the patch recently accepted into the
net tree.  It is needed here to avoid merge conflicts later on, as the
series makes other changes in the GRO code.

---

Alexander Duyck (11):
      GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
      ethtool: Add support for toggling any of the GSO offloads
      GSO: Add GSO type for fixed IPv4 ID
      GRO: Add support for TCP with fixed IPv4 ID field, limit tunnel IP ID values
      GSO: Support partial segmentation offload
      VXLAN: Add option to mangle IP IDs on inner headers when using TSO
      GENEVE: Add option to mangle IP IDs on inner headers when using TSO
      Documentation: Add documentation for TSO and GSO features
      i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM
      ixgbe/ixgbevf: Add support for GSO partial
      igb/igbvf: Add support for GSO partial


 Documentation/networking/segmentation-offloads.txt |  127 ++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_main.c        |    6 +
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        |    7 +
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c      |    7 +
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |    6 +
 drivers/net/ethernet/intel/igb/igb_main.c          |  119 ++++++++++---
 drivers/net/ethernet/intel/igbvf/netdev.c          |  180 ++++++++++++--------
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      |  115 +++++++++----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c  |  122 +++++++++++---
 drivers/net/geneve.c                               |   24 ++-
 drivers/net/vxlan.c                                |   16 ++
 include/linux/netdev_features.h                    |    8 +
 include/linux/netdevice.h                          |   11 +
 include/linux/skbuff.h                             |   27 ++-
 include/net/udp_tunnel.h                           |    8 -
 include/net/vxlan.h                                |    1 
 include/uapi/linux/if_link.h                       |    2 
 net/core/dev.c                                     |   33 ++++
 net/core/ethtool.c                                 |    4 
 net/core/skbuff.c                                  |   29 +++
 net/ipv4/af_inet.c                                 |   70 ++++++--
 net/ipv4/fou.c                                     |    6 +
 net/ipv4/gre_offload.c                             |   35 +++-
 net/ipv4/ip_gre.c                                  |   13 +
 net/ipv4/tcp_offload.c                             |   30 +++
 net/ipv4/udp_offload.c                             |   27 ++-
 net/ipv6/ip6_offload.c                             |   21 ++
 27 files changed, 837 insertions(+), 217 deletions(-)
 create mode 100644 Documentation/networking/segmentation-offloads.txt


* [RFC PATCH 01/11] GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
From: Alexander Duyck @ 2016-04-07 22:31 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch fixes an issue I found in which we were dropping frames if we
had enabled checksums on GRE headers that were encapsulated by either FOU
or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput;
with it applied I am now getting around 6 Gb/s.

The issue is due to the fact that with FOU or GUE applied we do not provide
a transport offset pointing to the GRE header, nor do we handle it in
software, as the GRE header is completely skipped by GSO and treated like a
VXLAN or GENEVE type header.  As such we need to prevent the stack from
generating such frames and also prevent GRE from advertising the offload on
any interface we create.
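
To illustrate the problem: with GUE the wire layout is
IP|UDP|GUE|GRE|IP|TCP, but the stack only tracks one outer and one inner
set of header offsets.  The outer UDP header claims the transport offset,
so nothing points at the GRE header and its checksum cannot be populated
at segmentation time.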

Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 include/linux/netdevice.h |    5 ++++-
 net/core/dev.c            |    1 +
 net/ipv4/fou.c            |    6 ++++++
 net/ipv4/gre_offload.c    |    8 ++++++++
 net/ipv4/ip_gre.c         |   13 ++++++++++---
 5 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cb0d5d09c2e4..8395308a2445 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2120,7 +2120,10 @@ struct napi_gro_cb {
 	/* Used in foo-over-udp, set in udp[46]_gro_receive */
 	u8	is_ipv6:1;
 
-	/* 7 bit hole */
+	/* Used in GRE, set in fou/gue_gro_receive */
+	u8	is_fou:1;
+
+	/* 6 bit hole */
 
 	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
 	__wsum	csum;
diff --git a/net/core/dev.c b/net/core/dev.c
index 273f10d1e306..d51343a821ed 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4439,6 +4439,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 		NAPI_GRO_CB(skb)->flush = 0;
 		NAPI_GRO_CB(skb)->free = 0;
 		NAPI_GRO_CB(skb)->encap_mark = 0;
+		NAPI_GRO_CB(skb)->is_fou = 0;
 		NAPI_GRO_CB(skb)->gro_remcsum_start = 0;
 
 		/* Setup for GRO checksum validation */
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 5a94aea280d3..a39068b4a4d9 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -203,6 +203,9 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
 	 */
 	NAPI_GRO_CB(skb)->encap_mark = 0;
 
+	/* Flag this frame as already having an outer encap header */
+	NAPI_GRO_CB(skb)->is_fou = 1;
+
 	rcu_read_lock();
 	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
 	ops = rcu_dereference(offloads[proto]);
@@ -368,6 +371,9 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
 	 */
 	NAPI_GRO_CB(skb)->encap_mark = 0;
 
+	/* Flag this frame as already having an outer encap header */
+	NAPI_GRO_CB(skb)->is_fou = 1;
+
 	rcu_read_lock();
 	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
 	ops = rcu_dereference(offloads[guehdr->proto_ctype]);
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index c47539d04b88..6a5bd4317866 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -150,6 +150,14 @@ static struct sk_buff **gre_gro_receive(struct sk_buff **head,
 	if ((greh->flags & ~(GRE_KEY|GRE_CSUM)) != 0)
 		goto out;
 
+	/* We can only support GRE_CSUM if we can track the location of
+	 * the GRE header.  In the case of FOU/GUE we cannot because the
+	 * outer UDP header displaces the GRE header leaving us in a state
+	 * of limbo.
+	 */
+	if ((greh->flags & GRE_CSUM) && NAPI_GRO_CB(skb)->is_fou)
+		goto out;
+
 	type = greh->protocol;
 
 	rcu_read_lock();
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 31936d387cfd..af5d1f38217f 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -862,9 +862,16 @@ static void __gre_tunnel_init(struct net_device *dev)
 	dev->hw_features	|= GRE_FEATURES;
 
 	if (!(tunnel->parms.o_flags & TUNNEL_SEQ)) {
-		/* TCP offload with GRE SEQ is not supported. */
-		dev->features    |= NETIF_F_GSO_SOFTWARE;
-		dev->hw_features |= NETIF_F_GSO_SOFTWARE;
+		/* TCP offload with GRE SEQ is not supported, nor
+		 * can we support 2 levels of outer headers requiring
+		 * an update.
+		 */
+		if (!(tunnel->parms.o_flags & TUNNEL_CSUM) ||
+		    (tunnel->encap.type == TUNNEL_ENCAP_NONE)) {
+			dev->features    |= NETIF_F_GSO_SOFTWARE;
+			dev->hw_features |= NETIF_F_GSO_SOFTWARE;
+		}
+
 		/* Can use a lockless transmit, unless we generate
 		 * output sequences
 		 */


* [RFC PATCH 02/11] ethtool: Add support for toggling any of the GSO offloads
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

The strings for several of the available GSO offloads were missing.  This
patch provides them so that any of these offloads can be toggled or
queried via the ethtool command.
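
With the strings present the new offloads can be flipped like any other
feature; for example (device name is illustrative only):

  # ethtool -K eth0 tx-gre-csum-segmentation off
  # ethtool -k eth0 | grep segmentation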

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 net/core/ethtool.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index f426c5ad6149..6a7f99661c2f 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -82,9 +82,11 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
 	[NETIF_F_TSO6_BIT] =             "tx-tcp6-segmentation",
 	[NETIF_F_FSO_BIT] =              "tx-fcoe-segmentation",
 	[NETIF_F_GSO_GRE_BIT] =		 "tx-gre-segmentation",
+	[NETIF_F_GSO_GRE_CSUM_BIT] =	 "tx-gre-csum-segmentation",
 	[NETIF_F_GSO_IPIP_BIT] =	 "tx-ipip-segmentation",
 	[NETIF_F_GSO_SIT_BIT] =		 "tx-sit-segmentation",
 	[NETIF_F_GSO_UDP_TUNNEL_BIT] =	 "tx-udp_tnl-segmentation",
+	[NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT] = "tx-udp_tnl-csum-segmentation",
 
 	[NETIF_F_FCOE_CRC_BIT] =         "tx-checksum-fcoe-crc",
 	[NETIF_F_SCTP_CRC_BIT] =        "tx-checksum-sctp",


* [RFC PATCH 03/11] GSO: Add GSO type for fixed IPv4 ID
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for TSO using IPv4 headers with a fixed IP ID
field.  This is meant to allow us to do a lossless GRO in the case of TCP
flows that use a fixed IP ID, such as those that convert IPv6 headers to
IPv4 headers.
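
The net effect on the inet_gso_segment() loop can be condensed from the
hunk below as:

	if (udpfrag) {
		/* fragments of one datagram keep the original IP ID */
	} else if (!fixedid) {
		iph->id = htons(id++);	/* default: increment per segment */
	}
	/* fixedid set: iph->id is left untouched, so every segment
	 * repeats the same IP ID
	 */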

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 include/linux/netdev_features.h |    3 +++
 include/linux/netdevice.h       |    1 +
 include/linux/skbuff.h          |   20 +++++++++++---------
 net/core/ethtool.c              |    1 +
 net/ipv4/af_inet.c              |   19 +++++++++++--------
 net/ipv4/gre_offload.c          |    1 +
 net/ipv4/tcp_offload.c          |    4 +++-
 net/ipv6/ip6_offload.c          |    3 ++-
 8 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index a734bf43d190..5d7da1ac6df5 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -39,6 +39,7 @@ enum {
 	NETIF_F_UFO_BIT,		/* ... UDPv4 fragmentation */
 	NETIF_F_GSO_ROBUST_BIT,		/* ... ->SKB_GSO_DODGY */
 	NETIF_F_TSO_ECN_BIT,		/* ... TCP ECN support */
+	NETIF_F_TSO_FIXEDID_BIT,	/* ... IPV4 header has fixed IP ID */
 	NETIF_F_TSO6_BIT,		/* ... TCPv6 segmentation */
 	NETIF_F_FSO_BIT,		/* ... FCoE segmentation */
 	NETIF_F_GSO_GRE_BIT,		/* ... GRE with TSO */
@@ -120,6 +121,7 @@ enum {
 #define NETIF_F_GSO_SIT		__NETIF_F(GSO_SIT)
 #define NETIF_F_GSO_UDP_TUNNEL	__NETIF_F(GSO_UDP_TUNNEL)
 #define NETIF_F_GSO_UDP_TUNNEL_CSUM __NETIF_F(GSO_UDP_TUNNEL_CSUM)
+#define NETIF_F_TSO_FIXEDID	__NETIF_F(TSO_FIXEDID)
 #define NETIF_F_GSO_TUNNEL_REMCSUM __NETIF_F(GSO_TUNNEL_REMCSUM)
 #define NETIF_F_HW_VLAN_STAG_FILTER __NETIF_F(HW_VLAN_STAG_FILTER)
 #define NETIF_F_HW_VLAN_STAG_RX	__NETIF_F(HW_VLAN_STAG_RX)
@@ -147,6 +149,7 @@ enum {
 
 /* List of features with software fallbacks. */
 #define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | \
+				 NETIF_F_TSO_FIXEDID | \
 				 NETIF_F_TSO6 | NETIF_F_UFO)
 
 /* List of IP checksum features. Note that NETIF_F_ HW_CSUM should not be
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8395308a2445..38ccc01eb97d 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4011,6 +4011,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 	BUILD_BUG_ON(SKB_GSO_UDP     != (NETIF_F_UFO >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_DODGY   != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
+	BUILD_BUG_ON(SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_FIXEDID >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCPV6   != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_FCOE    != (NETIF_F_FSO >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_GRE     != (NETIF_F_GSO_GRE >> NETIF_F_GSO_SHIFT));
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 007381270ff8..5fba16658f9d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -465,23 +465,25 @@ enum {
 	/* This indicates the tcp segment has CWR set. */
 	SKB_GSO_TCP_ECN = 1 << 3,
 
-	SKB_GSO_TCPV6 = 1 << 4,
+	SKB_GSO_TCP_FIXEDID = 1 << 4,
 
-	SKB_GSO_FCOE = 1 << 5,
+	SKB_GSO_TCPV6 = 1 << 5,
 
-	SKB_GSO_GRE = 1 << 6,
+	SKB_GSO_FCOE = 1 << 6,
 
-	SKB_GSO_GRE_CSUM = 1 << 7,
+	SKB_GSO_GRE = 1 << 7,
 
-	SKB_GSO_IPIP = 1 << 8,
+	SKB_GSO_GRE_CSUM = 1 << 8,
 
-	SKB_GSO_SIT = 1 << 9,
+	SKB_GSO_IPIP = 1 << 9,
 
-	SKB_GSO_UDP_TUNNEL = 1 << 10,
+	SKB_GSO_SIT = 1 << 10,
 
-	SKB_GSO_UDP_TUNNEL_CSUM = 1 << 11,
+	SKB_GSO_UDP_TUNNEL = 1 << 11,
 
-	SKB_GSO_TUNNEL_REMCSUM = 1 << 12,
+	SKB_GSO_UDP_TUNNEL_CSUM = 1 << 12,
+
+	SKB_GSO_TUNNEL_REMCSUM = 1 << 13,
 };
 
 #if BITS_PER_LONG > 32
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6a7f99661c2f..5340c9dbc318 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -79,6 +79,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
 	[NETIF_F_UFO_BIT] =              "tx-udp-fragmentation",
 	[NETIF_F_GSO_ROBUST_BIT] =       "tx-gso-robust",
 	[NETIF_F_TSO_ECN_BIT] =          "tx-tcp-ecn-segmentation",
+	[NETIF_F_TSO_FIXEDID_BIT] =	 "tx-tcp-fixedid-segmentation",
 	[NETIF_F_TSO6_BIT] =             "tx-tcp6-segmentation",
 	[NETIF_F_FSO_BIT] =              "tx-fcoe-segmentation",
 	[NETIF_F_GSO_GRE_BIT] =		 "tx-gre-segmentation",
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index a38b9910af60..19e9a2c45d71 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1195,10 +1195,10 @@ EXPORT_SYMBOL(inet_sk_rebuild_header);
 static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 					netdev_features_t features)
 {
+	bool udpfrag = false, fixedid = false, encap;
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	const struct net_offload *ops;
 	unsigned int offset = 0;
-	bool udpfrag, encap;
 	struct iphdr *iph;
 	int proto;
 	int nhoff;
@@ -1217,6 +1217,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_TCPV6 |
 		       SKB_GSO_UDP_TUNNEL |
 		       SKB_GSO_UDP_TUNNEL_CSUM |
+		       SKB_GSO_TCP_FIXEDID |
 		       SKB_GSO_TUNNEL_REMCSUM |
 		       0)))
 		goto out;
@@ -1248,11 +1249,14 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 
 	segs = ERR_PTR(-EPROTONOSUPPORT);
 
-	if (skb->encapsulation &&
-	    skb_shinfo(skb)->gso_type & (SKB_GSO_SIT|SKB_GSO_IPIP))
-		udpfrag = proto == IPPROTO_UDP && encap;
-	else
-		udpfrag = proto == IPPROTO_UDP && !skb->encapsulation;
+	if (!skb->encapsulation || encap) {
+		udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
+		fixedid = !!(skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID);
+
+		/* fixed ID is invalid if DF bit is not set */
+		if (fixedid && !(iph->frag_off & htons(IP_DF)))
+			goto out;
+	}
 
 	ops = rcu_dereference(inet_offloads[proto]);
 	if (likely(ops && ops->callbacks.gso_segment))
@@ -1265,12 +1269,11 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 	do {
 		iph = (struct iphdr *)(skb_mac_header(skb) + nhoff);
 		if (udpfrag) {
-			iph->id = htons(id);
 			iph->frag_off = htons(offset >> 3);
 			if (skb->next)
 				iph->frag_off |= htons(IP_MF);
 			offset += skb->len - nhoff - ihl;
-		} else {
+		} else if (!fixedid) {
 			iph->id = htons(id++);
 		}
 		iph->tot_len = htons(skb->len - nhoff);
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index 6a5bd4317866..6376b0cdf693 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -32,6 +32,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 				  SKB_GSO_UDP |
 				  SKB_GSO_DODGY |
 				  SKB_GSO_TCP_ECN |
+				  SKB_GSO_TCP_FIXEDID |
 				  SKB_GSO_GRE |
 				  SKB_GSO_GRE_CSUM |
 				  SKB_GSO_IPIP |
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 773083b7f1e9..08dd25d835af 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -89,6 +89,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 			     ~(SKB_GSO_TCPV4 |
 			       SKB_GSO_DODGY |
 			       SKB_GSO_TCP_ECN |
+			       SKB_GSO_TCP_FIXEDID |
 			       SKB_GSO_TCPV6 |
 			       SKB_GSO_GRE |
 			       SKB_GSO_GRE_CSUM |
@@ -98,7 +99,8 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 			       SKB_GSO_UDP_TUNNEL_CSUM |
 			       SKB_GSO_TUNNEL_REMCSUM |
 			       0) ||
-			     !(type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))))
+			     !(type & (SKB_GSO_TCPV4 |
+				       SKB_GSO_TCPV6))))
 			goto out;
 
 		skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(skb->len, mss);
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 82e9f3076028..d7530b9a1d63 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -73,6 +73,8 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_UDP |
 		       SKB_GSO_DODGY |
 		       SKB_GSO_TCP_ECN |
+		       SKB_GSO_TCP_FIXEDID |
+		       SKB_GSO_TCPV6 |
 		       SKB_GSO_GRE |
 		       SKB_GSO_GRE_CSUM |
 		       SKB_GSO_IPIP |
@@ -80,7 +82,6 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_UDP_TUNNEL |
 		       SKB_GSO_UDP_TUNNEL_CSUM |
 		       SKB_GSO_TUNNEL_REMCSUM |
-		       SKB_GSO_TCPV6 |
 		       0)))
 		goto out;
 


* [RFC PATCH 04/11] GRO: Add support for TCP with fixed IPv4 ID field, limit tunnel IP ID values
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch does two things.

First it allows GRO to aggregate TCP frames that use a fixed IPv4 ID
field.  As a result we should now be able to aggregate flows that were
converted from IPv6 to IPv4.  In addition this gives us more flexibility
for future implementations of segmentation, as we may be able to use a
fixed IP ID when segmenting the flow.

The second thing this patch does is place limitations on the outer IPv4
ID field in the case of tunneled frames.  Specifically it forces the IP
ID to increment by 1 unless the DF bit is set in the outer IPv4 header.
This way we can avoid creating overlapping series of IP IDs that could
possibly be fragmented if the frame goes through GRO and is then
resegmented via GSO.
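
As an illustration: three frames of one flow arriving with DF set and IP
IDs 0x42, 0x42, 0x42 are coalesced as an "atomic" flow and the result is
marked SKB_GSO_TCP_FIXEDID; IDs 0x42, 0x43, 0x44 are coalesced as a
normally incrementing flow; and IDs 0x42, 0x44 (a gap) cause a flush of
the GRO state instead.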

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 include/linux/netdevice.h |    5 ++++-
 net/core/dev.c            |    1 +
 net/ipv4/af_inet.c        |   35 ++++++++++++++++++++++++++++-------
 net/ipv4/tcp_offload.c    |   16 +++++++++++++++-
 net/ipv6/ip6_offload.c    |    8 ++++++--
 5 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 38ccc01eb97d..abf8cc2d9bfb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2123,7 +2123,10 @@ struct napi_gro_cb {
 	/* Used in GRE, set in fou/gue_gro_receive */
 	u8	is_fou:1;
 
-	/* 6 bit hole */
+	/* Used to determine if flush_id can be ignored */
+	u8	is_atomic:1;
+
+	/* 5 bit hole */
 
 	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
 	__wsum	csum;
diff --git a/net/core/dev.c b/net/core/dev.c
index d51343a821ed..4ed2852b3706 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4440,6 +4440,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 		NAPI_GRO_CB(skb)->free = 0;
 		NAPI_GRO_CB(skb)->encap_mark = 0;
 		NAPI_GRO_CB(skb)->is_fou = 0;
+		NAPI_GRO_CB(skb)->is_atomic = 1;
 		NAPI_GRO_CB(skb)->gro_remcsum_start = 0;
 
 		/* Setup for GRO checksum validation */
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 19e9a2c45d71..98fe04b99e01 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1328,6 +1328,7 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
 
 	for (p = *head; p; p = p->next) {
 		struct iphdr *iph2;
+		u16 flush_id;
 
 		if (!NAPI_GRO_CB(p)->same_flow)
 			continue;
@@ -1351,16 +1352,36 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
 			(iph->tos ^ iph2->tos) |
 			((iph->frag_off ^ iph2->frag_off) & htons(IP_DF));
 
-		/* Save the IP ID check to be included later when we get to
-		 * the transport layer so only the inner most IP ID is checked.
-		 * This is because some GSO/TSO implementations do not
-		 * correctly increment the IP ID for the outer hdrs.
-		 */
-		NAPI_GRO_CB(p)->flush_id =
-			    ((u16)(ntohs(iph2->id) + NAPI_GRO_CB(p)->count) ^ id);
 		NAPI_GRO_CB(p)->flush |= flush;
+
+		/* We need to store the IP ID check to be included later
+		 * when we can verify that this packet does in fact belong
+		 * to a given flow.
+		 */
+		flush_id = (u16)(id - ntohs(iph2->id));
+
+		/* This bit of code makes it much easier for us to identify
+		 * the cases where we are doing atomic vs non-atomic IP ID
+		 * checks.  Specifically an atomic check can return IP ID
+		 * values 0 - 0xFFFF, while a non-atomic check can only
+		 * return 0 or 0xFFFF.
+		 */
+		if (!NAPI_GRO_CB(p)->is_atomic ||
+		    !(iph->frag_off & htons(IP_DF))) {
+			flush_id ^= NAPI_GRO_CB(p)->count;
+			flush_id = flush_id ? 0xFFFF : 0;
+		}
+
+		/* If the previous IP ID value was based on an atomic
+		 * datagram we can overwrite the value and ignore it.
+		 */
+		if (NAPI_GRO_CB(skb)->is_atomic)
+			NAPI_GRO_CB(p)->flush_id = flush_id;
+		else
+			NAPI_GRO_CB(p)->flush_id |= flush_id;
 	}
 
+	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
 	NAPI_GRO_CB(skb)->flush |= flush;
 	skb_set_network_header(skb, off);
 	/* The above will be needed by the transport layer if there is one
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 08dd25d835af..d1ffd55289bd 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -239,7 +239,7 @@ struct sk_buff **tcp_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 
 found:
 	/* Include the IP ID check below from the inner most IP hdr */
-	flush = NAPI_GRO_CB(p)->flush | NAPI_GRO_CB(p)->flush_id;
+	flush = NAPI_GRO_CB(p)->flush;
 	flush |= (__force int)(flags & TCP_FLAG_CWR);
 	flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
 		  ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
@@ -248,6 +248,17 @@ found:
 		flush |= *(u32 *)((u8 *)th + i) ^
 			 *(u32 *)((u8 *)th2 + i);
 
+	/* When we receive our second frame we can make a decision on if we
+	 * continue this flow as an atomic flow with a fixed ID or if we use
+	 * an incrementing ID.
+	 */
+	if (NAPI_GRO_CB(p)->flush_id != 1 ||
+	    NAPI_GRO_CB(p)->count != 1 ||
+	    !NAPI_GRO_CB(p)->is_atomic)
+		flush |= NAPI_GRO_CB(p)->flush_id;
+	else
+		NAPI_GRO_CB(p)->is_atomic = false;
+
 	mss = skb_shinfo(p)->gso_size;
 
 	flush |= (len - 1) >= mss;
@@ -316,6 +327,9 @@ static int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 				  iph->daddr, 0);
 	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
 
+	if (NAPI_GRO_CB(skb)->is_atomic)
+		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
+
 	return tcp_gro_complete(skb);
 }
 
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index d7530b9a1d63..e9479499f58c 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -240,10 +240,14 @@ static struct sk_buff **ipv6_gro_receive(struct sk_buff **head,
 		NAPI_GRO_CB(p)->flush |= !!(first_word & htonl(0x0FF00000));
 		NAPI_GRO_CB(p)->flush |= flush;
 
-		/* Clear flush_id, there's really no concept of ID in IPv6. */
-		NAPI_GRO_CB(p)->flush_id = 0;
+		/* If the previous IP ID value was based on an atomic
+		 * datagram we can overwrite the value and ignore it.
+		 */
+		if (NAPI_GRO_CB(skb)->is_atomic)
+			NAPI_GRO_CB(p)->flush_id = 0;
 	}
 
+	NAPI_GRO_CB(skb)->is_atomic = true;
 	NAPI_GRO_CB(skb)->flush |= flush;
 
 	skb_gro_postpull_rcsum(skb, iph, nlen);


* [RFC PATCH 05/11] GSO: Support partial segmentation offload
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for something I am referring to as GSO partial.
The basic idea is that we can support a broader range of devices for
segmentation if we use fixed outer headers and have the hardware only
really deal with segmenting the inner header.  The name reflects the fact
that everything before csum_start will be fixed headers, and everything
after will be the region that is handled by hardware.

With the current implementation it allows us to add support for the
following GSO types with an inner TSO_FIXEDID or TSO6 offload:
NETIF_F_GSO_GRE
NETIF_F_GSO_GRE_CSUM
NETIF_F_GSO_IPIP
NETIF_F_GSO_SIT
NETIF_F_GSO_UDP_TUNNEL
NETIF_F_GSO_UDP_TUNNEL_CSUM

In the case of hardware that already supports tunneling we may be able to
extend this further to support TSO_TCPV4 without TSO_FIXEDID if the
hardware can support updating inner IPv4 headers.
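
As a rough worked example: with a gso_size of 1448 and 64000 bytes of TCP
payload, 64000 % 1448 = 288 bytes are trimmed off as a normal trailing
segment, while the leading 63712 bytes (44 x 1448) remain one "partial"
segment whose fixed outer headers are written once in software and whose
inner TCP payload is then split by the device.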

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 include/linux/netdev_features.h |    5 +++++
 include/linux/netdevice.h       |    2 ++
 include/linux/skbuff.h          |    9 +++++++--
 net/core/dev.c                  |   31 ++++++++++++++++++++++++++++++-
 net/core/ethtool.c              |    1 +
 net/core/skbuff.c               |   29 ++++++++++++++++++++++++++++-
 net/ipv4/af_inet.c              |   20 ++++++++++++++++----
 net/ipv4/gre_offload.c          |   26 +++++++++++++++++++++-----
 net/ipv4/tcp_offload.c          |   10 ++++++++--
 net/ipv4/udp_offload.c          |   27 +++++++++++++++++++++------
 net/ipv6/ip6_offload.c          |   10 +++++++++-
 11 files changed, 148 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 5d7da1ac6df5..6ef549ec5b13 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -48,6 +48,10 @@ enum {
 	NETIF_F_GSO_SIT_BIT,		/* ... SIT tunnel with TSO */
 	NETIF_F_GSO_UDP_TUNNEL_BIT,	/* ... UDP TUNNEL with TSO */
 	NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT,/* ... UDP TUNNEL with TSO & CSUM */
+	NETIF_F_GSO_PARTIAL_BIT,	/* ... Only segment inner-most L4
+					 *     in hardware and all other
+					 *     headers in software.
+					 */
 	NETIF_F_GSO_TUNNEL_REMCSUM_BIT, /* ... TUNNEL with TSO & REMCSUM */
 	/**/NETIF_F_GSO_LAST =		/* last bit, see GSO_MASK */
 		NETIF_F_GSO_TUNNEL_REMCSUM_BIT,
@@ -122,6 +126,7 @@ enum {
 #define NETIF_F_GSO_UDP_TUNNEL	__NETIF_F(GSO_UDP_TUNNEL)
 #define NETIF_F_GSO_UDP_TUNNEL_CSUM __NETIF_F(GSO_UDP_TUNNEL_CSUM)
 #define NETIF_F_TSO_FIXEDID	__NETIF_F(TSO_FIXEDID)
+#define NETIF_F_GSO_PARTIAL	 __NETIF_F(GSO_PARTIAL)
 #define NETIF_F_GSO_TUNNEL_REMCSUM __NETIF_F(GSO_TUNNEL_REMCSUM)
 #define NETIF_F_HW_VLAN_STAG_FILTER __NETIF_F(HW_VLAN_STAG_FILTER)
 #define NETIF_F_HW_VLAN_STAG_RX	__NETIF_F(HW_VLAN_STAG_RX)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index abf8cc2d9bfb..36a079598034 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1656,6 +1656,7 @@ struct net_device {
 	netdev_features_t	vlan_features;
 	netdev_features_t	hw_enc_features;
 	netdev_features_t	mpls_features;
+	netdev_features_t	gso_partial_features;
 
 	int			ifindex;
 	int			group;
@@ -4023,6 +4024,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 	BUILD_BUG_ON(SKB_GSO_SIT     != (NETIF_F_GSO_SIT >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_UDP_TUNNEL != (NETIF_F_GSO_UDP_TUNNEL >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_UDP_TUNNEL_CSUM != (NETIF_F_GSO_UDP_TUNNEL_CSUM >> NETIF_F_GSO_SHIFT));
+	BUILD_BUG_ON(SKB_GSO_PARTIAL != (NETIF_F_GSO_PARTIAL >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TUNNEL_REMCSUM != (NETIF_F_GSO_TUNNEL_REMCSUM >> NETIF_F_GSO_SHIFT));
 
 	return (features & feature) == feature;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5fba16658f9d..da0ace389fec 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -483,7 +483,9 @@ enum {
 
 	SKB_GSO_UDP_TUNNEL_CSUM = 1 << 12,
 
-	SKB_GSO_TUNNEL_REMCSUM = 1 << 13,
+	SKB_GSO_PARTIAL = 1 << 13,
+
+	SKB_GSO_TUNNEL_REMCSUM = 1 << 14,
 };
 
 #if BITS_PER_LONG > 32
@@ -3591,7 +3593,10 @@ static inline struct sec_path *skb_sec_path(struct sk_buff *skb)
  * Keeps track of level of encapsulation of network headers.
  */
 struct skb_gso_cb {
-	int	mac_offset;
+	union {
+		int	mac_offset;
+		int	data_offset;
+	};
 	int	encap_level;
 	__wsum	csum;
 	__u16	csum_start;
diff --git a/net/core/dev.c b/net/core/dev.c
index 4ed2852b3706..53b216b617c3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2711,6 +2711,19 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 			return ERR_PTR(err);
 	}
 
+	/* Only report GSO partial support if it will enable us to
+	 * support segmentation on this frame without needing additional
+	 * work.
+	 */
+	if (features & NETIF_F_GSO_PARTIAL) {
+		netdev_features_t partial_features = NETIF_F_GSO_ROBUST;
+		struct net_device *dev = skb->dev;
+
+		partial_features |= dev->features & dev->gso_partial_features;
+		if (!skb_gso_ok(skb, features | partial_features))
+			features &= ~NETIF_F_GSO_PARTIAL;
+	}
+
 	BUILD_BUG_ON(SKB_SGO_CB_OFFSET +
 		     sizeof(*SKB_GSO_CB(skb)) > sizeof(skb->cb));
 
@@ -2841,6 +2854,14 @@ netdev_features_t netif_skb_features(struct sk_buff *skb)
 	if (skb->encapsulation)
 		features &= dev->hw_enc_features;
 
+	/* Support for GSO partial features requires software intervention
+	 * before we can actually process the packets so we need to strip
+	 * support for any partial features now and we can pull them back
+	 * in after we have partially segmented the frame.
+	 */
+	if (skb_is_gso(skb) && !(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL))
+		features &= ~dev->gso_partial_features;
+
 	if (skb_vlan_tagged(skb))
 		features = netdev_intersect_features(features,
 						     dev->vlan_features |
@@ -6707,6 +6728,14 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
 		}
 	}
 
+	/* GSO partial features require GSO partial be set */
+	if ((features & dev->gso_partial_features) &&
+	    !(features & NETIF_F_GSO_PARTIAL)) {
+		netdev_dbg(dev,
+			   "Dropping partially supported GSO features since no GSO partial.\n");
+		features &= ~dev->gso_partial_features;
+	}
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	if (dev->netdev_ops->ndo_busy_poll)
 		features |= NETIF_F_BUSY_POLL;
@@ -6987,7 +7016,7 @@ int register_netdevice(struct net_device *dev)
 
 	/* Make NETIF_F_SG inheritable to tunnel devices.
 	 */
-	dev->hw_enc_features |= NETIF_F_SG;
+	dev->hw_enc_features |= NETIF_F_SG | NETIF_F_GSO_PARTIAL;
 
 	/* Make NETIF_F_SG inheritable to MPLS.
 	 */
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 5340c9dbc318..005478851118 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -88,6 +88,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
 	[NETIF_F_GSO_SIT_BIT] =		 "tx-sit-segmentation",
 	[NETIF_F_GSO_UDP_TUNNEL_BIT] =	 "tx-udp_tnl-segmentation",
 	[NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT] = "tx-udp_tnl-csum-segmentation",
+	[NETIF_F_GSO_PARTIAL_BIT] =	 "tx-gso-partial",
 
 	[NETIF_F_FCOE_CRC_BIT] =         "tx-checksum-fcoe-crc",
 	[NETIF_F_SCTP_CRC_BIT] =        "tx-checksum-sctp",
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d04c2d1c8c87..4cc594cdaada 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3076,8 +3076,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 	struct sk_buff *frag_skb = head_skb;
 	unsigned int offset = doffset;
 	unsigned int tnl_hlen = skb_tnl_header_len(head_skb);
+	unsigned int partial_segs = 0;
 	unsigned int headroom;
-	unsigned int len;
+	unsigned int len = head_skb->len;
 	__be16 proto;
 	bool csum;
 	int sg = !!(features & NETIF_F_SG);
@@ -3094,6 +3095,15 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 
 	csum = !!can_checksum_protocol(features, proto);
 
+	/* GSO partial only requires that we trim off any excess that
+	 * doesn't fit into an MSS sized block, so take care of that
+	 * now.
+	 */
+	if (features & NETIF_F_GSO_PARTIAL) {
+		partial_segs = len / mss;
+		mss *= partial_segs;
+	}
+
 	headroom = skb_headroom(head_skb);
 	pos = skb_headlen(head_skb);
 
@@ -3281,6 +3291,23 @@ perform_csum_check:
 	 */
 	segs->prev = tail;
 
+	/* Update GSO info on first skb in partial sequence. */
+	if (partial_segs) {
+		int type = skb_shinfo(head_skb)->gso_type;
+
+		/* Update type to add partial and then remove dodgy if set */
+		type |= SKB_GSO_PARTIAL;
+		type &= ~SKB_GSO_DODGY;
+
+		/* Update GSO info and prepare to start updating headers on
+		 * our way back down the stack of protocols.
+		 */
+		skb_shinfo(segs)->gso_size = skb_shinfo(head_skb)->gso_size;
+		skb_shinfo(segs)->gso_segs = partial_segs;
+		skb_shinfo(segs)->gso_type = type;
+		SKB_GSO_CB(segs)->data_offset = skb_headroom(segs) + doffset;
+	}
+
 	/* Following permits correct backpressure, for protocols
 	 * using skb_set_owner_w().
 	 * Idea is to tranfert ownership from head_skb to last segment.
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 98fe04b99e01..22b03eb07740 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1200,7 +1200,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 	const struct net_offload *ops;
 	unsigned int offset = 0;
 	struct iphdr *iph;
-	int proto;
+	int proto, tot_len;
 	int nhoff;
 	int ihl;
 	int id;
@@ -1219,6 +1219,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_UDP_TUNNEL_CSUM |
 		       SKB_GSO_TCP_FIXEDID |
 		       SKB_GSO_TUNNEL_REMCSUM |
+		       SKB_GSO_PARTIAL |
 		       0)))
 		goto out;
 
@@ -1273,10 +1274,21 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 			if (skb->next)
 				iph->frag_off |= htons(IP_MF);
 			offset += skb->len - nhoff - ihl;
-		} else if (!fixedid) {
-			iph->id = htons(id++);
+			tot_len = skb->len - nhoff;
+		} else if (skb_is_gso(skb)) {
+			if (!fixedid) {
+				iph->id = htons(id);
+				id += skb_shinfo(skb)->gso_segs;
+			}
+			tot_len = skb_shinfo(skb)->gso_size +
+				  SKB_GSO_CB(skb)->data_offset +
+				  skb->head - (unsigned char *)iph;
+		} else {
+			if (!fixedid)
+				iph->id = htons(id++);
+			tot_len = skb->len - nhoff;
 		}
-		iph->tot_len = htons(skb->len - nhoff);
+		iph->tot_len = htons(tot_len);
 		ip_send_check(iph);
 		if (encap)
 			skb_reset_inner_headers(skb);
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index 6376b0cdf693..20557f211408 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -36,7 +36,8 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 				  SKB_GSO_GRE |
 				  SKB_GSO_GRE_CSUM |
 				  SKB_GSO_IPIP |
-				  SKB_GSO_SIT)))
+				  SKB_GSO_SIT |
+				  SKB_GSO_PARTIAL)))
 		goto out;
 
 	if (!skb->encapsulation)
@@ -87,7 +88,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 	skb = segs;
 	do {
 		struct gre_base_hdr *greh;
-		__be32 *pcsum;
+		__sum16 *pcsum;
 
 		/* Set up inner headers if we are offloading inner checksum */
 		if (skb->ip_summed == CHECKSUM_PARTIAL) {
@@ -107,10 +108,25 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 			continue;
 
 		greh = (struct gre_base_hdr *)skb_transport_header(skb);
-		pcsum = (__be32 *)(greh + 1);
+		pcsum = (__sum16 *)(greh + 1);
+
+		if (skb_is_gso(skb)) {
+			unsigned int partial_adj;
+
+			/* Adjust checksum to account for the fact that
+			 * the partial checksum is based on actual size
+			 * whereas headers should be based on MSS size.
+			 */
+			partial_adj = skb->len + skb_headroom(skb) -
+				      SKB_GSO_CB(skb)->data_offset -
+				      skb_shinfo(skb)->gso_size;
+			*pcsum = ~csum_fold((__force __wsum)htonl(partial_adj));
+		} else {
+			*pcsum = 0;
+		}
 
-		*pcsum = 0;
-		*(__sum16 *)pcsum = gso_make_checksum(skb, 0);
+		*(pcsum + 1) = 0;
+		*pcsum = gso_make_checksum(skb, 0);
 	} while ((skb = skb->next));
 out:
 	return segs;
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index d1ffd55289bd..02737b607aa7 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -109,6 +109,12 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 		goto out;
 	}
 
+	/* GSO partial only requires splitting the frame into an MSS
+	 * multiple and possibly a remainder.  So update the mss now.
+	 */
+	if (features & NETIF_F_GSO_PARTIAL)
+		mss = skb->len - (skb->len % mss);
+
 	copy_destructor = gso_skb->destructor == tcp_wfree;
 	ooo_okay = gso_skb->ooo_okay;
 	/* All segments but the first should have ooo_okay cleared */
@@ -133,7 +139,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	newcheck = ~csum_fold((__force __wsum)((__force u32)th->check +
 					       (__force u32)delta));
 
-	do {
+	while (skb->next) {
 		th->fin = th->psh = 0;
 		th->check = newcheck;
 
@@ -153,7 +159,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 
 		th->seq = htonl(seq);
 		th->cwr = 0;
-	} while (skb->next);
+	}
 
 	/* Following permits TCP Small Queues to work well with GSO :
 	 * The callback to TCP stack will be called at the time last frag
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 0ed2dafb7cc4..454e9e70d460 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -51,8 +51,11 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
 	 * 16 bit length field due to the header being added outside of an
 	 * IP or IPv6 frame that was already limited to 64K - 1.
 	 */
-	partial = csum_sub(csum_unfold(uh->check),
-			   (__force __wsum)htonl(skb->len));
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL)
+		partial = (__force __wsum)uh->len;
+	else
+		partial = (__force __wsum)htonl(skb->len);
+	partial = csum_sub(csum_unfold(uh->check), partial);
 
 	/* setup inner skb. */
 	skb->encapsulation = 0;
@@ -101,7 +104,7 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
 	udp_offset = outer_hlen - tnl_hlen;
 	skb = segs;
 	do {
-		__be16 len;
+		unsigned int len;
 
 		if (remcsum)
 			skb->ip_summed = CHECKSUM_NONE;
@@ -119,14 +122,26 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
 		skb_reset_mac_header(skb);
 		skb_set_network_header(skb, mac_len);
 		skb_set_transport_header(skb, udp_offset);
-		len = htons(skb->len - udp_offset);
+		len = skb->len - udp_offset;
 		uh = udp_hdr(skb);
-		uh->len = len;
+
+		/* If we are only performing partial GSO the inner header
+		 * will be using a length value equal to only one MSS sized
+		 * segment instead of the entire frame.
+		 */
+		if (skb_is_gso(skb)) {
+			uh->len = htons(skb_shinfo(skb)->gso_size +
+					SKB_GSO_CB(skb)->data_offset +
+					skb->head - (unsigned char *)uh);
+		} else {
+			uh->len = htons(len);
+		}
 
 		if (!need_csum)
 			continue;
 
-		uh->check = ~csum_fold(csum_add(partial, (__force __wsum)len));
+		uh->check = ~csum_fold(csum_add(partial,
+				       (__force __wsum)htonl(len)));
 
 		if (skb->encapsulation || !offload_csum) {
 			uh->check = gso_make_checksum(skb, ~uh->check);
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index e9479499f58c..ad3a6bd4a7f7 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -63,6 +63,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 	int proto;
 	struct frag_hdr *fptr;
 	unsigned int unfrag_ip6hlen;
+	unsigned int payload_len;
 	u8 *prevhdr;
 	int offset = 0;
 	bool encap, udpfrag;
@@ -82,6 +83,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_UDP_TUNNEL |
 		       SKB_GSO_UDP_TUNNEL_CSUM |
 		       SKB_GSO_TUNNEL_REMCSUM |
+		       SKB_GSO_PARTIAL |
 		       0)))
 		goto out;
 
@@ -118,7 +120,13 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 
 	for (skb = segs; skb; skb = skb->next) {
 		ipv6h = (struct ipv6hdr *)(skb_mac_header(skb) + nhoff);
-		ipv6h->payload_len = htons(skb->len - nhoff - sizeof(*ipv6h));
+		if (skb_is_gso(skb))
+			payload_len = skb_shinfo(skb)->gso_size +
+				      SKB_GSO_CB(skb)->data_offset +
+				      skb->head - (unsigned char *)(ipv6h + 1);
+		else
+			payload_len = skb->len - nhoff - sizeof(*ipv6h);
+		ipv6h->payload_len = htons(payload_len);
 		skb->network_header = (u8 *)ipv6h - skb->head;
 
 		if (udpfrag) {


* [RFC PATCH 06/11] VXLAN: Add option to mangle IP IDs on inner headers when using TSO
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for a feature I am calling IP ID mangling.  It is
basically just another way of saying that the IP IDs transmitted by the
tunnel may not match up with what would normally be expected.
Specifically, in the case of TSO the inner IP ID will be a fixed value,
so a given TSO will repeat the same inner IP ID value gso_segs number of
times.
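
For testing, the flag can be set at device creation time over rtnetlink.
A minimal userspace sketch using libmnl (illustrative only; error
handling and the remaining mandatory VXLAN attributes are omitted):

#include <libmnl/libmnl.h>
#include <linux/if_link.h>
#include <linux/rtnetlink.h>
#include <time.h>

static void vxlan_newlink_with_ipid_mangle(struct mnl_socket *nl)
{
	char buf[MNL_SOCKET_BUFFER_SIZE];
	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
	struct ifinfomsg *ifm;
	struct nlattr *linkinfo, *data;

	nlh->nlmsg_type = RTM_NEWLINK;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL | NLM_F_ACK;
	nlh->nlmsg_seq = time(NULL);
	ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
	ifm->ifi_family = AF_UNSPEC;

	mnl_attr_put_strz(nlh, IFLA_IFNAME, "vxlan42");
	linkinfo = mnl_attr_nest_start(nlh, IFLA_LINKINFO);
	mnl_attr_put_strz(nlh, IFLA_INFO_KIND, "vxlan");
	data = mnl_attr_nest_start(nlh, IFLA_INFO_DATA);
	mnl_attr_put_u32(nlh, IFLA_VXLAN_ID, 42);
	/* NLA_FLAG: presence alone turns on IP ID mangling */
	mnl_attr_put(nlh, IFLA_VXLAN_IPID_MANGLE, 0, "");
	mnl_attr_nest_end(nlh, data);
	mnl_attr_nest_end(nlh, linkinfo);

	mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);
}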

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 drivers/net/vxlan.c          |   16 ++++++++++++++++
 include/net/vxlan.h          |    1 +
 include/uapi/linux/if_link.h |    1 +
 3 files changed, 18 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 51cccddfe403..cc903ab832c2 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1783,6 +1783,10 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
 			type |= SKB_GSO_TUNNEL_REMCSUM;
 	}
 
+	if ((vxflags & VXLAN_F_TCP_FIXEDID) && skb_is_gso(skb) &&
+	    (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+		type |= SKB_GSO_TCP_FIXEDID;
+
 	min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
 			+ VXLAN_HLEN + iphdr_len
 			+ (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
@@ -2635,6 +2639,7 @@ static const struct nla_policy vxlan_policy[IFLA_VXLAN_MAX + 1] = {
 	[IFLA_VXLAN_GBP]	= { .type = NLA_FLAG, },
 	[IFLA_VXLAN_GPE]	= { .type = NLA_FLAG, },
 	[IFLA_VXLAN_REMCSUM_NOPARTIAL]	= { .type = NLA_FLAG },
+	[IFLA_VXLAN_IPID_MANGLE]	= { .type = NLA_FLAG },
 };
 
 static int vxlan_validate(struct nlattr *tb[], struct nlattr *data[])
@@ -3092,6 +3097,9 @@ static int vxlan_newlink(struct net *src_net, struct net_device *dev,
 	if (data[IFLA_VXLAN_REMCSUM_NOPARTIAL])
 		conf.flags |= VXLAN_F_REMCSUM_NOPARTIAL;
 
+	if (data[IFLA_VXLAN_IPID_MANGLE])
+		conf.flags |= VXLAN_F_TCP_FIXEDID;
+
 	err = vxlan_dev_configure(src_net, dev, &conf);
 	switch (err) {
 	case -ENODEV:
@@ -3154,6 +3162,10 @@ static size_t vxlan_get_size(const struct net_device *dev)
 		nla_total_size(sizeof(__u8)) + /* IFLA_VXLAN_UDP_ZERO_CSUM6_RX */
 		nla_total_size(sizeof(__u8)) + /* IFLA_VXLAN_REMCSUM_TX */
 		nla_total_size(sizeof(__u8)) + /* IFLA_VXLAN_REMCSUM_RX */
+		nla_total_size(0) +	       /* IFLA_VXLAN_GBP */
+		nla_total_size(0) +	       /* IFLA_VXLAN_GPE */
+		nla_total_size(0) +	       /* IFLA_VXLAN_REMCSUM_NOPARTIAL */
+		nla_total_size(0) +	       /* IFLA_VXLAN_IPID_MANGLE */
 		0;
 }
 
@@ -3244,6 +3256,10 @@ static int vxlan_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_flag(skb, IFLA_VXLAN_REMCSUM_NOPARTIAL))
 		goto nla_put_failure;
 
+	if (vxlan->flags & VXLAN_F_TCP_FIXEDID &&
+	    nla_put_flag(skb, IFLA_VXLAN_IPID_MANGLE))
+		goto nla_put_failure;
+
 	return 0;
 
 nla_put_failure:
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index dcc6f4057115..5c2dc9ecea59 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -265,6 +265,7 @@ struct vxlan_dev {
 #define VXLAN_F_REMCSUM_NOPARTIAL	0x1000
 #define VXLAN_F_COLLECT_METADATA	0x2000
 #define VXLAN_F_GPE			0x4000
+#define VXLAN_F_TCP_FIXEDID		0x8000
 
 /* Flags that are used in the receive path. These flags must match in
  * order for a socket to be shareable
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 9427f17d06d6..a3bc3f2a63d3 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -489,6 +489,7 @@ enum {
 	IFLA_VXLAN_COLLECT_METADATA,
 	IFLA_VXLAN_LABEL,
 	IFLA_VXLAN_GPE,
+	IFLA_VXLAN_IPID_MANGLE,
 	__IFLA_VXLAN_MAX
 };
 #define IFLA_VXLAN_MAX	(__IFLA_VXLAN_MAX - 1)


* [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for a feature I am calling IP ID mangling.  It is
basically just another way of saying that the IP IDs transmitted by the
tunnel may not match up with what would normally be expected.
Specifically, in the case of TSO the inner IP ID will be a fixed value,
so a given TSO will repeat the same inner IP ID value gso_segs number of
times.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 drivers/net/geneve.c         |   24 ++++++++++++++++++++++--
 include/net/udp_tunnel.h     |    8 --------
 include/uapi/linux/if_link.h |    1 +
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index bc168894bda3..6352223d80c3 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -80,6 +80,7 @@ struct geneve_dev {
 #define GENEVE_F_UDP_ZERO_CSUM_TX	BIT(0)
 #define GENEVE_F_UDP_ZERO_CSUM6_TX	BIT(1)
 #define GENEVE_F_UDP_ZERO_CSUM6_RX	BIT(2)
+#define GENEVE_F_TCP_FIXEDID		BIT(3)
 
 struct geneve_sock {
 	bool			collect_md;
@@ -702,9 +703,14 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
 	int min_headroom;
 	int err;
 	bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM_TX);
+	int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
 
 	skb_scrub_packet(skb, xnet);
 
+	if ((flags & GENEVE_F_TCP_FIXEDID) && skb_is_gso(skb) &&
+	    (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+		type |= SKB_GSO_TCP_FIXEDID;
+
 	min_headroom = LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len
 			+ GENEVE_BASE_HLEN + opt_len + sizeof(struct iphdr);
 	err = skb_cow_head(skb, min_headroom);
@@ -713,7 +719,7 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
 		goto free_rt;
 	}
 
-	skb = udp_tunnel_handle_offloads(skb, udp_sum);
+	skb = iptunnel_handle_offloads(skb, type);
 	if (IS_ERR(skb)) {
 		err = PTR_ERR(skb);
 		goto free_rt;
@@ -739,9 +745,14 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
 	int min_headroom;
 	int err;
 	bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX);
+	int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
 
 	skb_scrub_packet(skb, xnet);
 
+	if ((flags & GENEVE_F_TCP_FIXEDID) && skb_is_gso(skb) &&
+	    (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+		type |= SKB_GSO_TCP_FIXEDID;
+
 	min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
 			+ GENEVE_BASE_HLEN + opt_len + sizeof(struct ipv6hdr);
 	err = skb_cow_head(skb, min_headroom);
@@ -750,7 +761,7 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
 		goto free_dst;
 	}
 
-	skb = udp_tunnel_handle_offloads(skb, udp_sum);
+	skb = iptunnel_handle_offloads(skb, type);
 	if (IS_ERR(skb)) {
 		err = PTR_ERR(skb);
 		goto free_dst;
@@ -1249,6 +1260,7 @@ static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
 	[IFLA_GENEVE_UDP_CSUM]		= { .type = NLA_U8 },
 	[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]	= { .type = NLA_U8 },
 	[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]	= { .type = NLA_U8 },
+	[IFLA_GENEVE_IPID_MANGLE]	= { .type = NLA_FLAG },
 };
 
 static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
@@ -1436,6 +1448,9 @@ static int geneve_newlink(struct net *net, struct net_device *dev,
 	    nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
 		flags |= GENEVE_F_UDP_ZERO_CSUM6_RX;
 
+	if (data[IFLA_GENEVE_IPID_MANGLE])
+		flags |= GENEVE_F_TCP_FIXEDID;
+
 	return geneve_configure(net, dev, &remote, vni, ttl, tos, label,
 				dst_port, metadata, flags);
 }
@@ -1460,6 +1475,7 @@ static size_t geneve_get_size(const struct net_device *dev)
 		nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_CSUM */
 		nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_TX */
 		nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_RX */
+		nla_total_size(0) +	 /* IFLA_GENEVE_IPID_MANGLE */
 		0;
 }
 
@@ -1505,6 +1521,10 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
 		       !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)))
 		goto nla_put_failure;
 
+	if ((geneve->flags & GENEVE_F_TCP_FIXEDID) &&
+	    nla_put_flag(skb, IFLA_GENEVE_IPID_MANGLE))
+		goto nla_put_failure;
+
 	return 0;
 
 nla_put_failure:
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index b83114077cee..c44d04259665 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -98,14 +98,6 @@ struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
 				    __be16 flags, __be64 tunnel_id,
 				    int md_size);
 
-static inline struct sk_buff *udp_tunnel_handle_offloads(struct sk_buff *skb,
-							 bool udp_csum)
-{
-	int type = udp_csum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
-
-	return iptunnel_handle_offloads(skb, type);
-}
-
 static inline void udp_tunnel_gro_complete(struct sk_buff *skb, int nhoff)
 {
 	struct udphdr *uh;
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index a3bc3f2a63d3..38acf49c818b 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -513,6 +513,7 @@ enum {
 	IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
 	IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
 	IFLA_GENEVE_LABEL,
+	IFLA_GENEVE_IPID_MANGLE,
 	__IFLA_GENEVE_MAX
 };
 #define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)


* [RFC PATCH 08/11] Documentation: Add documentation for TSO and GSO features
From: Alexander Duyck @ 2016-04-07 22:32 UTC
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This document is a starting point for defining the TSO and GSO features.
The whole thing is starting to get a bit messy, so I wanted to make sure
we have notes somewhere that start describing what does and doesn't work.
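
A minimal sketch (illustrative only, not part of this series) of the TSO
invariants the document describes, as a driver-side sanity check:

static bool tso_prereqs_ok(const struct sk_buff *skb)
{
	if (!skb_is_gso(skb) || !skb_shinfo(skb)->gso_size)
		return false;
	/* TSO depends on partial checksum offload... */
	if (skb->ip_summed != CHECKSUM_PARTIAL)
		return false;
	/* ...and csum_start must point at the TCP header */
	return skb_checksum_start(skb) == skb_transport_header(skb);
}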

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 Documentation/networking/segmentation-offloads.txt |  127 ++++++++++++++++++++
 1 file changed, 127 insertions(+)
 create mode 100644 Documentation/networking/segmentation-offloads.txt

diff --git a/Documentation/networking/segmentation-offloads.txt b/Documentation/networking/segmentation-offloads.txt
new file mode 100644
index 000000000000..b06dd9b65ab3
--- /dev/null
+++ b/Documentation/networking/segmentation-offloads.txt
@@ -0,0 +1,127 @@
+Segmentation Offloads in the Linux Networking Stack
+
+Introduction
+============
+
+This document describes a set of techniques in the Linux networking stack
+to take advantage of segmentation offload capabilities of various NICs.
+
+The following technologies are described:
+ * TCP Segmentation Offload - TSO
+ * UDP Fragmentation Offload - UFO
+ * IPIP, SIT, GRE, and UDP Tunnel Offloads
+ * Generic Segmentation Offload - GSO
+ * Generic Receive Offload - GRO
+ * Partial Generic Segmentation Offload - GSO_PARTIAL
+
+TCP Segmentation Offload
+========================
+
+TCP segmentation allows a device to segment a single frame into multiple
+frames with a data payload size specified in skb_shinfo()->gso_size.
+When TCP segmentation is requested the bit for either SKB_GSO_TCPV4 or
+SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
+skb_shinfo()->gso_size should be set to a non-zero value.
+
+TCP segmentation is dependent on support for the use of partial checksum
+offload.  For this reason TSO is normally disabled if the Tx checksum
+offload for a given device is disabled.
+
+In order to support TCP segmentation offload it is necessary to populate
+the network and transport header offsets of the skbuff so that the device
+drivers will be able to determine the offsets of the IP or IPv6 header and
+the TCP header.  In addition, as CHECKSUM_PARTIAL is required, csum_start
+should also point to the TCP header of the packet.
+
+For IPv4 segmentation we support two behaviors in terms of the IP ID.
+The default behavior is to increment the IP ID with every segment.  If the
+GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
+ID and all segments will use the same IP ID.
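+
+As an illustrative sketch (not a requirement on any specific driver),
+a transmit path might test for these cases roughly as follows:
+
+	if (skb_is_gso(skb)) {
+		unsigned int mss = skb_shinfo(skb)->gso_size;
+		bool fixed_id = skb_shinfo(skb)->gso_type &
+				SKB_GSO_TCP_FIXEDID;
+
+		/* segment the payload into mss-sized frames and
+		 * increment the IPv4 ID per segment unless fixed_id
+		 * was requested
+		 */
+	}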
+
+UDP Fragmentation Offload
+=========================
+
+UDP fragmentation offload allows a device to fragment an oversized UDP
+datagram into multiple IPv4 fragments.  Many of the requirements for UDP
+fragmentation offload are the same as TSO.  However, the IPv4 ID should
+not increment across fragments, as they all belong to a single IPv4
+datagram.
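+
+As a rough sketch (names here are illustrative), a UFO path differs
+from TSO mainly in how it treats the IPv4 ID:
+
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP) {
+		/* every fragment of the datagram shares one IPv4 ID */
+		__be16 shared_id = ip_hdr(skb)->id;
+
+		/* ... fragment at gso_size boundaries using shared_id ... */
+	}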
+
+IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
+========================================================
+
+In addition to the offloads described above it is possible for a frame to
+contain additional headers such as an outer tunnel.  In order to account
+for such instances an additional set of segmentation offload types was
+introduced, including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and
+SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
+cases where there is more than one set of headers.  For example, in the
+case of IPIP and SIT the network and transport headers are moved from the
+standard list of headers to the "inner" header offsets.
+
+Currently only two levels of headers are supported.  The convention is to
+refer to the tunnel headers as the outer headers, while the encapsulated
+data is normally referred to as the inner headers.  Below is the list of
+calls to access the given headers:
+
+IPIP/SIT Tunnel:
+		Outer			Inner
+MAC		skb_mac_header
+Network		skb_network_header	skb_inner_network_header
+Transport	skb_transport_header
+
+UDP/GRE Tunnel:
+		Outer			Inner
+MAC		skb_mac_header		skb_inner_mac_header
+Network		skb_network_header	skb_inner_network_header
+Transport	skb_transport_header	skb_inner_transport_header
+
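+For example, the total length of the outer headers of a UDP or GRE
+tunnel frame can be derived from the accessors above (an illustrative
+sketch, not required by any driver):
+
+	unsigned int outer_hlen = skb_inner_mac_header(skb) - skb->data;
+	unsigned int inner_nhlen = skb_inner_transport_header(skb) -
+				   skb_inner_network_header(skb);
+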
+In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
+SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types reflect the
+fact that a non-zero checksum has also been requested in the outer
+header.
+
+Finally, there is SKB_GSO_REMCSUM, which indicates that a given tunnel
+header has requested a remote checksum offload.  In this case the inner
+headers will be left with a partial checksum and only the outer header
+checksum will be computed.
+
+Generic Segmentation Offload
+============================
+
+Generic segmentation offload is a pure software offload that is meant to
+deal with cases where device drivers cannot perform the offloads described
+above.  What occurs in GSO is that a given skbuff will have its data broken
+out over multiple skbuffs that have been resized to match the MSS provided
+via skb_shinfo()->gso_size.
+
+Before enabling any hardware segmentation offload a corresponding software
+offload is required in GSO.  Otherwise it becomes possible for a frame to
+be re-routed between devices and end up unable to be transmitted.
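+
+A minimal sketch of how the software fallback is invoked (error
+handling omitted, illustrative only):
+
+	struct sk_buff *segs;
+
+	segs = skb_gso_segment(skb, features);
+	/* on success segs is a list of mss-sized skbs ready to transmit */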
+
+Generic Receive Offload
+=======================
+
+Generic receive offload is the complement to GSO.  Ideally any frame
+assembled by GRO should be segmented to create an identical sequence of
+frames using GSO, and any sequence of frames segmented by GSO should be
+able to be reassembled back to the original by GRO.  The only exception
+to this is the IPv4 ID field, in the case that the DF bit is set for a
+given IP header.  If the IPv4 ID is not sequentially incrementing, it
+will be altered so that it is when a frame assembled via GRO is
+segmented via GSO.
+
+Partial Generic Segmentation Offload
+====================================
+
+Partial generic segmentation offload is a hybrid between TSO and GSO.  It
+takes advantage of certain traits of TCP and tunnels so that, instead of
+having to rewrite the packet headers for each segment, only the inner-most
+transport header and possibly the outer-most network header need to be
+updated.  This allows devices that do not support tunnel offloads, or that
+do not support tunnel offloads with checksum, to still make use of
+segmentation.
+
+With the partial offload, all headers excluding the inner transport
+header are updated such that they will contain the correct values as if
+the header had simply been duplicated.  The one exception to this is the
+outer IPv4 ID field.  It is up to the device drivers to guarantee that
+the IPv4 ID field is incremented in the case that a given header does
+not have the DF bit set.
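+
+As an illustration (a sketch only, mirroring what the driver patches in
+this series do), a device advertises the offloads it can handle in
+partial form via gso_partial_features:
+
+	netdev->gso_partial_features = NETIF_F_GSO_GRE_CSUM |
+				       NETIF_F_GSO_UDP_TUNNEL_CSUM;
+	netdev->features |= NETIF_F_GSO_PARTIAL |
+			    netdev->gso_partial_features;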


* [RFC PATCH 09/11] i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM
  2016-04-07 22:31 [RFC PATCH 00/11] GSO partial and TSO FIXEDID support Alexander Duyck
                   ` (7 preceding siblings ...)
  2016-04-07 22:32 ` [RFC PATCH 08/11] Documentation: Add documentation for TSO and GSO features Alexander Duyck
@ 2016-04-07 22:32 ` Alexander Duyck
  2016-04-07 22:32 ` [RFC PATCH 10/11] ixgbe/ixgbevf: Add support for GSO partial Alexander Duyck
  2016-04-07 22:33 ` [RFC PATCH 11/11] igb/igbvf: " Alexander Duyck
  10 siblings, 0 replies; 20+ messages in thread
From: Alexander Duyck @ 2016-04-07 22:32 UTC (permalink / raw)
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch makes it so that i40e and i40evf can use GSO_PARTIAL to support
segmentation for frames with checksums enabled in outer headers.  As a
result we can now send data over these types of tunnels at over 20Gb/s
versus the 12Gb/s that was previously possible on my system.

The advantage with the i40e parts is that this offload is mostly
transparent: the hardware still deals with the inner and/or outer IPv4
headers, so the IP ID is still incrementing for both when this offload
is performed.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c     |    6 +++++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c     |    7 ++++++-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c   |    7 ++++++-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c |    6 +++++-
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 07a70c4ac49f..6c095b07ce82 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -9119,17 +9119,21 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 				   NETIF_F_TSO_ECN		|
 				   NETIF_F_TSO6			|
 				   NETIF_F_GSO_GRE		|
+				   NETIF_F_GSO_GRE_CSUM		|
 				   NETIF_F_GSO_IPIP		|
 				   NETIF_F_GSO_SIT		|
 				   NETIF_F_GSO_UDP_TUNNEL	|
 				   NETIF_F_GSO_UDP_TUNNEL_CSUM	|
+				   NETIF_F_GSO_PARTIAL		|
 				   NETIF_F_SCTP_CRC		|
 				   NETIF_F_RXHASH		|
 				   NETIF_F_RXCSUM		|
 				   0;
 
 	if (!(pf->flags & I40E_FLAG_OUTER_UDP_CSUM_CAPABLE))
-		netdev->hw_enc_features ^= NETIF_F_GSO_UDP_TUNNEL_CSUM;
+		netdev->gso_partial_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
+
+	netdev->gso_partial_features |= NETIF_F_GSO_GRE_CSUM;
 
 	/* record features VLANs can make use of */
 	netdev->vlan_features |= netdev->hw_enc_features;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 6e44cf118843..ede4183468b9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2300,11 +2300,15 @@ static int i40e_tso(struct sk_buff *skb, u8 *hdr_len, u64 *cd_type_cmd_tso_mss)
 	}
 
 	if (skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					 SKB_GSO_GRE_CSUM |
 					 SKB_GSO_IPIP |
 					 SKB_GSO_SIT |
 					 SKB_GSO_UDP_TUNNEL |
 					 SKB_GSO_UDP_TUNNEL_CSUM)) {
-		if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM) {
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
+		    (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
+			l4.udp->len = 0;
+
 			/* determine offset of outer transport header */
 			l4_offset = l4.hdr - skb->data;
 
@@ -2481,6 +2485,7 @@ static int i40e_tx_enable_csum(struct sk_buff *skb, u32 *tx_flags,
 
 		/* indicate if we need to offload outer UDP header */
 		if ((*tx_flags & I40E_TX_FLAGS_TSO) &&
+		    !(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
 		    (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM))
 			tunnel |= I40E_TXD_CTX_QW0_L4T_CS_MASK;
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index f101895ecf4a..6ce00547c13e 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1565,11 +1565,15 @@ static int i40e_tso(struct sk_buff *skb, u8 *hdr_len, u64 *cd_type_cmd_tso_mss)
 	}
 
 	if (skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					 SKB_GSO_GRE_CSUM |
 					 SKB_GSO_IPIP |
 					 SKB_GSO_SIT |
 					 SKB_GSO_UDP_TUNNEL |
 					 SKB_GSO_UDP_TUNNEL_CSUM)) {
-		if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM) {
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
+		    (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
+			l4.udp->len = 0;
+
 			/* determine offset of outer transport header */
 			l4_offset = l4.hdr - skb->data;
 
@@ -1704,6 +1708,7 @@ static int i40e_tx_enable_csum(struct sk_buff *skb, u32 *tx_flags,
 
 		/* indicate if we need to offload outer UDP header */
 		if ((*tx_flags & I40E_TX_FLAGS_TSO) &&
+		    !(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
 		    (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM))
 			tunnel |= I40E_TXD_CTX_QW0_L4T_CS_MASK;
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 806da2686623..9c78c6d6afcb 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2346,17 +2346,21 @@ int i40evf_process_config(struct i40evf_adapter *adapter)
 				   NETIF_F_TSO_ECN		|
 				   NETIF_F_TSO6			|
 				   NETIF_F_GSO_GRE		|
+				   NETIF_F_GSO_GRE_CSUM		|
 				   NETIF_F_GSO_IPIP		|
 				   NETIF_F_GSO_SIT		|
 				   NETIF_F_GSO_UDP_TUNNEL	|
 				   NETIF_F_GSO_UDP_TUNNEL_CSUM	|
+				   NETIF_F_GSO_PARTIAL		|
 				   NETIF_F_SCTP_CRC		|
 				   NETIF_F_RXHASH		|
 				   NETIF_F_RXCSUM		|
 				   0;
 
 	if (!(adapter->flags & I40EVF_FLAG_OUTER_UDP_CSUM_CAPABLE))
-		netdev->hw_enc_features ^= NETIF_F_GSO_UDP_TUNNEL_CSUM;
+		netdev->gso_partial_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
+
+	netdev->gso_partial_features |= NETIF_F_GSO_GRE_CSUM;
 
 	/* record features VLANs can make use of */
 	netdev->vlan_features |= netdev->hw_enc_features;


* [RFC PATCH 10/11] ixgbe/ixgbevf: Add support for GSO partial
  2016-04-07 22:31 [RFC PATCH 00/11] GSO partial and TSO FIXEDID support Alexander Duyck
                   ` (8 preceding siblings ...)
  2016-04-07 22:32 ` [RFC PATCH 09/11] i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM Alexander Duyck
@ 2016-04-07 22:32 ` Alexander Duyck
  2016-04-07 22:33 ` [RFC PATCH 11/11] igb/igbvf: " Alexander Duyck
  10 siblings, 0 replies; 20+ messages in thread
From: Alexander Duyck @ 2016-04-07 22:32 UTC (permalink / raw)
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for partial GSO segmentation in the case of
encapsulated frames.  Specifically with this change the driver can perform
segmentation as long as the type is either SKB_GSO_TCP_FIXEDID or
SKB_GSO_TCPV6.  If neither of these gso types is specified then tunnel
segmentation is not supported and we will default back to GSO.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |  115 +++++++++++++++-----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  122 ++++++++++++++++-----
 2 files changed, 180 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index c6bd3ae5f986..57e083f6c8a9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7195,9 +7195,18 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 		     struct ixgbe_tx_buffer *first,
 		     u8 *hdr_len)
 {
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
 	struct sk_buff *skb = first->skb;
-	u32 vlan_macip_lens, type_tucmd;
-	u32 mss_l4len_idx, l4len;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		unsigned char *hdr;
+	} l4;
+	u32 paylen, l4_offset;
 	int err;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL)
@@ -7210,46 +7219,52 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 	if (err < 0)
 		return err;
 
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_checksum_start(skb);
+
 	/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
 	type_tucmd = IXGBE_ADVTXD_TUCMD_L4T_TCP;
 
-	if (first->protocol == htons(ETH_P_IP)) {
-		struct iphdr *iph = ip_hdr(skb);
-		iph->tot_len = 0;
-		iph->check = 0;
-		tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr,
-							 iph->daddr, 0,
-							 IPPROTO_TCP,
-							 0);
+	/* initialize outer IP header fields */
+	if (ip.v4->version == 4) {
+		/* IP header will have to cancel out any data that
+		 * is not a part of the outer IP header
+		 */
+		ip.v4->check = csum_fold(csum_add(lco_csum(skb),
+						  csum_unfold(l4.tcp->check)));
 		type_tucmd |= IXGBE_ADVTXD_TUCMD_IPV4;
+
+		ip.v4->tot_len = 0;
 		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
 				   IXGBE_TX_FLAGS_CSUM |
 				   IXGBE_TX_FLAGS_IPV4;
-	} else if (skb_is_gso_v6(skb)) {
-		ipv6_hdr(skb)->payload_len = 0;
-		tcp_hdr(skb)->check =
-		    ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
-				     &ipv6_hdr(skb)->daddr,
-				     0, IPPROTO_TCP, 0);
+	} else {
+		ip.v6->payload_len = 0;
 		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
 				   IXGBE_TX_FLAGS_CSUM;
 	}
 
-	/* compute header lengths */
-	l4len = tcp_hdrlen(skb);
-	*hdr_len = skb_transport_offset(skb) + l4len;
+	/* determine offset of inner transport header */
+	l4_offset = l4.hdr - skb->data;
+
+	/* compute length of segmentation header */
+	*hdr_len = (l4.tcp->doff * 4) + l4_offset;
+
+	/* remove payload length from inner checksum */
+	paylen = skb->len - l4_offset;
+	csum_replace_by_diff(&l4.tcp->check, htonl(paylen));
 
 	/* update gso size and bytecount with header size */
 	first->gso_segs = skb_shinfo(skb)->gso_segs;
 	first->bytecount += (first->gso_segs - 1) * *hdr_len;
 
 	/* mss_l4len_id: use 0 as index for TSO */
-	mss_l4len_idx = l4len << IXGBE_ADVTXD_L4LEN_SHIFT;
+	mss_l4len_idx = (*hdr_len - l4_offset) << IXGBE_ADVTXD_L4LEN_SHIFT;
 	mss_l4len_idx |= skb_shinfo(skb)->gso_size << IXGBE_ADVTXD_MSS_SHIFT;
 
 	/* vlan_macip_lens: HEADLEN, MACLEN, VLAN tag */
-	vlan_macip_lens = skb_network_header_len(skb);
-	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens = l4.hdr - ip.hdr;
+	vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
 	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
@@ -8906,17 +8921,49 @@ static void ixgbe_fwd_del(struct net_device *pdev, void *priv)
 	kfree(fwd_adapter);
 }
 
-#define IXGBE_MAX_TUNNEL_HDR_LEN 80
+#define IXGBE_MAX_MAC_HDR_LEN		127
+#define IXGBE_MAX_NETWORK_HDR_LEN	511
+#define IXGBE_GSO_PARTIAL_FEATURES (NETIF_F_TSO_FIXEDID | \
+				    NETIF_F_GSO_GRE | \
+				    NETIF_F_GSO_GRE_CSUM | \
+				    NETIF_F_GSO_IPIP | \
+				    NETIF_F_GSO_SIT | \
+				    NETIF_F_GSO_UDP_TUNNEL | \
+				    NETIF_F_GSO_UDP_TUNNEL_CSUM)
+
 static netdev_features_t
 ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
 		     netdev_features_t features)
 {
-	if (!skb->encapsulation)
-		return features;
-
-	if (unlikely(skb_inner_mac_header(skb) - skb_transport_header(skb) >
-		     IXGBE_MAX_TUNNEL_HDR_LEN))
-		return features & ~NETIF_F_CSUM_MASK;
+	unsigned int network_hdr_len, mac_hdr_len;
+
+	/* Make certain the headers can be described by a context descriptor */
+	mac_hdr_len = skb_network_header(skb) - skb->data;
+	network_hdr_len = skb_checksum_start(skb) - skb_network_header(skb);
+	if (unlikely((mac_hdr_len > IXGBE_MAX_MAC_HDR_LEN) ||
+		     (network_hdr_len >  IXGBE_MAX_NETWORK_HDR_LEN)))
+		return features & ~(NETIF_F_HW_CSUM |
+				    NETIF_F_SCTP_CRC |
+				    NETIF_F_HW_VLAN_CTAG_TX |
+				    NETIF_F_TSO |
+				    NETIF_F_TSO_FIXEDID |
+				    NETIF_F_TSO6);
+
+	/* We can only support a fixed IPv4 ID or IPv6 header for TSO
+	 * with tunnels.  So if we aren't using a tunnel, or we aren't
+	 * performing TSO with a fixed ID we must strip the partial
+	 * features.
+	 */
+	if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					   SKB_GSO_GRE_CSUM |
+					   SKB_GSO_IPIP |
+					   SKB_GSO_SIT |
+					   SKB_GSO_UDP_TUNNEL |
+					   SKB_GSO_UDP_TUNNEL_CSUM)) ||
+	    !(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID |
+					   SKB_GSO_TCPV6)))
+		return features & ~(NETIF_F_GSO_PARTIAL |
+				    IXGBE_GSO_PARTIAL_FEATURES);
 
 	return features;
 }
@@ -9288,6 +9335,10 @@ skip_sriov:
 			   NETIF_F_HW_VLAN_CTAG_RX |
 			   NETIF_F_HW_VLAN_CTAG_FILTER;
 
+	netdev->gso_partial_features = IXGBE_GSO_PARTIAL_FEATURES;
+	netdev->features |= NETIF_F_GSO_PARTIAL |
+			    IXGBE_GSO_PARTIAL_FEATURES;
+
 	if (hw->mac.type >= ixgbe_mac_82599EB)
 		netdev->features |= NETIF_F_SCTP_CRC;
 
@@ -9307,7 +9358,11 @@ skip_sriov:
 				 NETIF_F_SCTP_CRC;
 
 	netdev->mpls_features |= NETIF_F_HW_CSUM;
-	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM |
+				   NETIF_F_TSO |
+				   NETIF_F_TSO6 |
+				   NETIF_F_GSO_PARTIAL |
+				   IXGBE_GSO_PARTIAL_FEATURES;
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 	netdev->priv_flags |= IFF_SUPP_NOFCS;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 007cbe094990..9c8ff4f69463 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3272,9 +3272,18 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 		       struct ixgbevf_tx_buffer *first,
 		       u8 *hdr_len)
 {
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
 	struct sk_buff *skb = first->skb;
-	u32 vlan_macip_lens, type_tucmd;
-	u32 mss_l4len_idx, l4len;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		unsigned char *hdr;
+	} l4;
+	u32 paylen, l4_offset;
 	int err;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL)
@@ -3287,49 +3296,53 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 	if (err < 0)
 		return err;
 
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_checksum_start(skb);
+
 	/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
 	type_tucmd = IXGBE_ADVTXD_TUCMD_L4T_TCP;
 
-	if (first->protocol == htons(ETH_P_IP)) {
-		struct iphdr *iph = ip_hdr(skb);
-
-		iph->tot_len = 0;
-		iph->check = 0;
-		tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr,
-							 iph->daddr, 0,
-							 IPPROTO_TCP,
-							 0);
+	/* initialize outer IP header fields */
+	if (ip.v4->version == 4) {
+		/* IP header will have to cancel out any data that
+		 * is not a part of the outer IP header
+		 */
+		ip.v4->check = csum_fold(csum_add(lco_csum(skb),
+						  csum_unfold(l4.tcp->check)));
 		type_tucmd |= IXGBE_ADVTXD_TUCMD_IPV4;
+
+		ip.v4->tot_len = 0;
 		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
 				   IXGBE_TX_FLAGS_CSUM |
 				   IXGBE_TX_FLAGS_IPV4;
-	} else if (skb_is_gso_v6(skb)) {
-		ipv6_hdr(skb)->payload_len = 0;
-		tcp_hdr(skb)->check =
-		    ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
-				     &ipv6_hdr(skb)->daddr,
-				     0, IPPROTO_TCP, 0);
+	} else {
+		ip.v6->payload_len = 0;
 		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
 				   IXGBE_TX_FLAGS_CSUM;
 	}
 
-	/* compute header lengths */
-	l4len = tcp_hdrlen(skb);
-	*hdr_len += l4len;
-	*hdr_len = skb_transport_offset(skb) + l4len;
+	/* determine offset of inner transport header */
+	l4_offset = l4.hdr - skb->data;
+
+	/* compute length of segmentation header */
+	*hdr_len = (l4.tcp->doff * 4) + l4_offset;
+
+	/* remove payload length from inner checksum */
+	paylen = skb->len - l4_offset;
+	csum_replace_by_diff(&l4.tcp->check, htonl(paylen));
 
-	/* update GSO size and bytecount with header size */
+	/* update gso size and bytecount with header size */
 	first->gso_segs = skb_shinfo(skb)->gso_segs;
 	first->bytecount += (first->gso_segs - 1) * *hdr_len;
 
 	/* mss_l4len_id: use 1 as index for TSO */
-	mss_l4len_idx = l4len << IXGBE_ADVTXD_L4LEN_SHIFT;
+	mss_l4len_idx = (*hdr_len - l4_offset) << IXGBE_ADVTXD_L4LEN_SHIFT;
 	mss_l4len_idx |= skb_shinfo(skb)->gso_size << IXGBE_ADVTXD_MSS_SHIFT;
 	mss_l4len_idx |= 1 << IXGBE_ADVTXD_IDX_SHIFT;
 
 	/* vlan_macip_lens: HEADLEN, MACLEN, VLAN tag */
-	vlan_macip_lens = skb_network_header_len(skb);
-	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens = l4.hdr - ip.hdr;
+	vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
 	ixgbevf_tx_ctxtdesc(tx_ring, vlan_macip_lens,
@@ -3870,6 +3883,53 @@ static struct rtnl_link_stats64 *ixgbevf_get_stats(struct net_device *netdev,
 	return stats;
 }
 
+#define IXGBEVF_MAX_MAC_HDR_LEN		127
+#define IXGBEVF_MAX_NETWORK_HDR_LEN	511
+#define IXGBEVF_GSO_PARTIAL_FEATURES (NETIF_F_TSO_FIXEDID | \
+				      NETIF_F_GSO_GRE | \
+				      NETIF_F_GSO_GRE_CSUM | \
+				      NETIF_F_GSO_IPIP | \
+				      NETIF_F_GSO_SIT | \
+				      NETIF_F_GSO_UDP_TUNNEL | \
+				      NETIF_F_GSO_UDP_TUNNEL_CSUM)
+
+static netdev_features_t
+ixgbevf_features_check(struct sk_buff *skb, struct net_device *dev,
+		       netdev_features_t features)
+{
+	unsigned int network_hdr_len, mac_hdr_len;
+
+	/* Make certain the headers can be described by a context descriptor */
+	mac_hdr_len = skb_network_header(skb) - skb->data;
+	network_hdr_len = skb_checksum_start(skb) - skb_network_header(skb);
+	if (unlikely((mac_hdr_len > IXGBEVF_MAX_MAC_HDR_LEN) ||
+		     (network_hdr_len >  IXGBEVF_MAX_NETWORK_HDR_LEN)))
+		return features & ~(NETIF_F_HW_CSUM |
+				    NETIF_F_SCTP_CRC |
+				    NETIF_F_HW_VLAN_CTAG_TX |
+				    NETIF_F_TSO |
+				    NETIF_F_TSO_FIXEDID |
+				    NETIF_F_TSO6);
+
+	/* We can only support a fixed IPv4 ID or IPv6 header for TSO
+	 * with tunnels.  So if we aren't using a tunnel, or we aren't
+	 * performing TSO with a fixed ID we must strip the partial
+	 * features.
+	 */
+	if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					   SKB_GSO_GRE_CSUM |
+					   SKB_GSO_IPIP |
+					   SKB_GSO_SIT |
+					   SKB_GSO_UDP_TUNNEL |
+					   SKB_GSO_UDP_TUNNEL_CSUM)) ||
+	    !(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID |
+					   SKB_GSO_TCPV6)))
+		return features & ~(NETIF_F_GSO_PARTIAL |
+				    IXGBEVF_GSO_PARTIAL_FEATURES);
+
+	return features;
+}
+
 static const struct net_device_ops ixgbevf_netdev_ops = {
 	.ndo_open		= ixgbevf_open,
 	.ndo_stop		= ixgbevf_close,
@@ -3888,7 +3948,7 @@ static const struct net_device_ops ixgbevf_netdev_ops = {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= ixgbevf_netpoll,
 #endif
-	.ndo_features_check	= passthru_features_check,
+	.ndo_features_check	= ixgbevf_features_check,
 };
 
 static void ixgbevf_assign_netdev_ops(struct net_device *dev)
@@ -3999,6 +4059,10 @@ static int ixgbevf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 			      NETIF_F_HW_CSUM |
 			      NETIF_F_SCTP_CRC;
 
+	netdev->gso_partial_features = IXGBEVF_GSO_PARTIAL_FEATURES;
+	netdev->hw_features |= NETIF_F_GSO_PARTIAL |
+			       IXGBEVF_GSO_PARTIAL_FEATURES;
+
 	netdev->features = netdev->hw_features |
 			   NETIF_F_HW_VLAN_CTAG_TX |
 			   NETIF_F_HW_VLAN_CTAG_RX |
@@ -4011,7 +4075,11 @@ static int ixgbevf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 				 NETIF_F_SCTP_CRC;
 
 	netdev->mpls_features |= NETIF_F_HW_CSUM;
-	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM |
+				   NETIF_F_TSO |
+				   NETIF_F_TSO6 |
+				   NETIF_F_GSO_PARTIAL |
+				   IXGBEVF_GSO_PARTIAL_FEATURES;
 
 	if (pci_using_dac)
 		netdev->features |= NETIF_F_HIGHDMA;


* [RFC PATCH 11/11] igb/igbvf: Add support for GSO partial
  2016-04-07 22:31 [RFC PATCH 00/11] GSO partial and TSO FIXEDID support Alexander Duyck
                   ` (9 preceding siblings ...)
  2016-04-07 22:32 ` [RFC PATCH 10/11] ixgbe/ixgbevf: Add support for GSO partial Alexander Duyck
@ 2016-04-07 22:33 ` Alexander Duyck
  10 siblings, 0 replies; 20+ messages in thread
From: Alexander Duyck @ 2016-04-07 22:33 UTC (permalink / raw)
  To: herbert, tom, jesse, alexander.duyck, edumazet, netdev, davem

This patch adds support for partial GSO segmentation in the case of
encapsulated frames.  Specifically with this change the driver can perform
segmentation as long as the type is either SKB_GSO_TCP_FIXEDID or
SKB_GSO_TCPV6.  If neither of these gso types is specified then tunnel
segmentation is not supported and we will default back to GSO.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |  119 +++++++++++++++----
 drivers/net/ethernet/intel/igbvf/netdev.c |  180 ++++++++++++++++++-----------
 2 files changed, 205 insertions(+), 94 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 8e96c35307fb..8204ebecd2a5 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2087,6 +2087,52 @@ static int igb_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	return ndo_dflt_fdb_add(ndm, tb, dev, addr, vid, flags);
 }
 
+#define IGB_MAX_MAC_HDR_LEN	127
+#define IGB_MAX_NETWORK_HDR_LEN	511
+#define IGB_GSO_PARTIAL_FEATURES (NETIF_F_TSO_FIXEDID | \
+				  NETIF_F_GSO_GRE | \
+				  NETIF_F_GSO_GRE_CSUM | \
+				  NETIF_F_GSO_IPIP | \
+				  NETIF_F_GSO_SIT | \
+				  NETIF_F_GSO_UDP_TUNNEL | \
+				  NETIF_F_GSO_UDP_TUNNEL_CSUM)
+static netdev_features_t
+igb_features_check(struct sk_buff *skb, struct net_device *dev,
+		   netdev_features_t features)
+{
+	unsigned int network_hdr_len, mac_hdr_len;
+
+	/* Make certain the headers can be described by a context descriptor */
+	mac_hdr_len = skb_network_header(skb) - skb->data;
+	network_hdr_len = skb_checksum_start(skb) - skb_network_header(skb);
+	if (unlikely((mac_hdr_len > IGB_MAX_MAC_HDR_LEN) ||
+		     (network_hdr_len >  IGB_MAX_NETWORK_HDR_LEN)))
+		return features & ~(NETIF_F_HW_CSUM |
+				    NETIF_F_SCTP_CRC |
+				    NETIF_F_HW_VLAN_CTAG_TX |
+				    NETIF_F_TSO |
+				    NETIF_F_TSO_FIXEDID |
+				    NETIF_F_TSO6);
+
+	/* We can only support a fixed IPv4 ID or IPv6 header for TSO
+	 * with tunnels.  So if we aren't using a tunnel, or we aren't
+	 * performing TSO with a fixed ID we must strip the partial
+	 * features.
+	 */
+	if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					   SKB_GSO_GRE_CSUM |
+					   SKB_GSO_IPIP |
+					   SKB_GSO_SIT |
+					   SKB_GSO_UDP_TUNNEL |
+					   SKB_GSO_UDP_TUNNEL_CSUM)) ||
+	    !(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID |
+					   SKB_GSO_TCPV6)))
+		return features & ~(NETIF_F_GSO_PARTIAL |
+				    IGB_GSO_PARTIAL_FEATURES);
+
+	return features;
+}
+
 static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
@@ -2111,7 +2157,7 @@ static const struct net_device_ops igb_netdev_ops = {
 	.ndo_fix_features	= igb_fix_features,
 	.ndo_set_features	= igb_set_features,
 	.ndo_fdb_add		= igb_ndo_fdb_add,
-	.ndo_features_check	= passthru_features_check,
+	.ndo_features_check	= igb_features_check,
 };
 
 /**
@@ -2384,6 +2430,9 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (hw->mac.type >= e1000_82576)
 		netdev->features |= NETIF_F_SCTP_CRC;
 
+	netdev->gso_partial_features = IGB_GSO_PARTIAL_FEATURES;
+	netdev->features |= NETIF_F_GSO_PARTIAL | IGB_GSO_PARTIAL_FEATURES;
+
 	/* copy netdev features into list of user selectable features */
 	netdev->hw_features |= netdev->features;
 	netdev->hw_features |= NETIF_F_RXALL;
@@ -2401,14 +2450,16 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 				 NETIF_F_SCTP_CRC;
 
 	netdev->mpls_features |= NETIF_F_HW_CSUM;
-	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM |
+				   NETIF_F_TSO |
+				   NETIF_F_TSO6 |
+				   NETIF_F_GSO_PARTIAL |
+				   IGB_GSO_PARTIAL_FEATURES;
 
 	netdev->priv_flags |= IFF_SUPP_NOFCS;
 
-	if (pci_using_dac) {
+	if (pci_using_dac)
 		netdev->features |= NETIF_F_HIGHDMA;
-		netdev->vlan_features |= NETIF_F_HIGHDMA;
-	}
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 
@@ -4842,9 +4893,18 @@ static int igb_tso(struct igb_ring *tx_ring,
 		   struct igb_tx_buffer *first,
 		   u8 *hdr_len)
 {
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
 	struct sk_buff *skb = first->skb;
-	u32 vlan_macip_lens, type_tucmd;
-	u32 mss_l4len_idx, l4len;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		unsigned char *hdr;
+	} l4;
+	u32 paylen, l4_offset;
 	int err;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL)
@@ -4857,45 +4917,52 @@ static int igb_tso(struct igb_ring *tx_ring,
 	if (err < 0)
 		return err;
 
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_checksum_start(skb);
+
 	/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
 	type_tucmd = E1000_ADVTXD_TUCMD_L4T_TCP;
 
-	if (first->protocol == htons(ETH_P_IP)) {
-		struct iphdr *iph = ip_hdr(skb);
-		iph->tot_len = 0;
-		iph->check = 0;
-		tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr,
-							 iph->daddr, 0,
-							 IPPROTO_TCP,
-							 0);
+	/* initialize outer IP header fields */
+	if (ip.v4->version == 4) {
+		/* IP header will have to cancel out any data that
+		 * is not a part of the outer IP header
+		 */
+		ip.v4->check = csum_fold(csum_add(lco_csum(skb),
+						  csum_unfold(l4.tcp->check)));
 		type_tucmd |= E1000_ADVTXD_TUCMD_IPV4;
+
+		ip.v4->tot_len = 0;
 		first->tx_flags |= IGB_TX_FLAGS_TSO |
 				   IGB_TX_FLAGS_CSUM |
 				   IGB_TX_FLAGS_IPV4;
-	} else if (skb_is_gso_v6(skb)) {
-		ipv6_hdr(skb)->payload_len = 0;
-		tcp_hdr(skb)->check = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
-						       &ipv6_hdr(skb)->daddr,
-						       0, IPPROTO_TCP, 0);
+	} else {
+		ip.v6->payload_len = 0;
 		first->tx_flags |= IGB_TX_FLAGS_TSO |
 				   IGB_TX_FLAGS_CSUM;
 	}
 
-	/* compute header lengths */
-	l4len = tcp_hdrlen(skb);
-	*hdr_len = skb_transport_offset(skb) + l4len;
+	/* determine offset of inner transport header */
+	l4_offset = l4.hdr - skb->data;
+
+	/* compute length of segmentation header */
+	*hdr_len = (l4.tcp->doff * 4) + l4_offset;
+
+	/* remove payload length from inner checksum */
+	paylen = skb->len - l4_offset;
+	csum_replace_by_diff(&l4.tcp->check, htonl(paylen));
 
 	/* update gso size and bytecount with header size */
 	first->gso_segs = skb_shinfo(skb)->gso_segs;
 	first->bytecount += (first->gso_segs - 1) * *hdr_len;
 
 	/* MSS L4LEN IDX */
-	mss_l4len_idx = l4len << E1000_ADVTXD_L4LEN_SHIFT;
+	mss_l4len_idx = (*hdr_len - l4_offset) << E1000_ADVTXD_L4LEN_SHIFT;
 	mss_l4len_idx |= skb_shinfo(skb)->gso_size << E1000_ADVTXD_MSS_SHIFT;
 
 	/* VLAN MACLEN IPLEN */
-	vlan_macip_lens = skb_network_header_len(skb);
-	vlan_macip_lens |= skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens = l4.hdr - ip.hdr;
+	vlan_macip_lens |= (ip.hdr - skb->data) << E1000_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IGB_TX_FLAGS_VLAN_MASK;
 
 	igb_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, mss_l4len_idx);
diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index c12442252adb..287fb6ca7670 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -1933,83 +1933,74 @@ static void igbvf_tx_ctxtdesc(struct igbvf_ring *tx_ring, u32 vlan_macip_lens,
 	buffer_info->dma = 0;
 }
 
-static int igbvf_tso(struct igbvf_adapter *adapter,
-		     struct igbvf_ring *tx_ring,
-		     struct sk_buff *skb, u32 tx_flags, u8 *hdr_len,
-		     __be16 protocol)
-{
-	struct e1000_adv_tx_context_desc *context_desc;
-	struct igbvf_buffer *buffer_info;
-	u32 info = 0, tu_cmd = 0;
-	u32 mss_l4len_idx, l4len;
-	unsigned int i;
+static int igbvf_tso(struct igbvf_ring *tx_ring,
+		     struct sk_buff *skb, u32 tx_flags, u8 *hdr_len)
+{
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		unsigned char *hdr;
+	} l4;
+	u32 paylen, l4_offset;
 	int err;
 
-	*hdr_len = 0;
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (!skb_is_gso(skb))
+		return 0;
 
 	err = skb_cow_head(skb, 0);
-	if (err < 0) {
-		dev_err(&adapter->pdev->dev, "igbvf_tso returning an error\n");
+	if (err < 0)
 		return err;
-	}
 
-	l4len = tcp_hdrlen(skb);
-	*hdr_len += l4len;
-
-	if (protocol == htons(ETH_P_IP)) {
-		struct iphdr *iph = ip_hdr(skb);
-
-		iph->tot_len = 0;
-		iph->check = 0;
-		tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr,
-							 iph->daddr, 0,
-							 IPPROTO_TCP,
-							 0);
-	} else if (skb_is_gso_v6(skb)) {
-		ipv6_hdr(skb)->payload_len = 0;
-		tcp_hdr(skb)->check = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
-						       &ipv6_hdr(skb)->daddr,
-						       0, IPPROTO_TCP, 0);
-	}
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_checksum_start(skb);
 
-	i = tx_ring->next_to_use;
+	/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
+	type_tucmd = E1000_ADVTXD_TUCMD_L4T_TCP;
 
-	buffer_info = &tx_ring->buffer_info[i];
-	context_desc = IGBVF_TX_CTXTDESC_ADV(*tx_ring, i);
-	/* VLAN MACLEN IPLEN */
-	if (tx_flags & IGBVF_TX_FLAGS_VLAN)
-		info |= (tx_flags & IGBVF_TX_FLAGS_VLAN_MASK);
-	info |= (skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT);
-	*hdr_len += skb_network_offset(skb);
-	info |= (skb_transport_header(skb) - skb_network_header(skb));
-	*hdr_len += (skb_transport_header(skb) - skb_network_header(skb));
-	context_desc->vlan_macip_lens = cpu_to_le32(info);
+	/* initialize outer IP header fields */
+	if (ip.v4->version == 4) {
+		/* IP header will have to cancel out any data that
+		 * is not a part of the outer IP header
+		 */
+		ip.v4->check = csum_fold(csum_add(lco_csum(skb),
+						  csum_unfold(l4.tcp->check)));
+		type_tucmd |= E1000_ADVTXD_TUCMD_IPV4;
 
-	/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
-	tu_cmd |= (E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT);
+		ip.v4->tot_len = 0;
+	} else {
+		ip.v6->payload_len = 0;
+	}
 
-	if (protocol == htons(ETH_P_IP))
-		tu_cmd |= E1000_ADVTXD_TUCMD_IPV4;
-	tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
+	/* determine offset of inner transport header */
+	l4_offset = l4.hdr - skb->data;
 
-	context_desc->type_tucmd_mlhl = cpu_to_le32(tu_cmd);
+	/* compute length of segmentation header */
+	*hdr_len = (l4.tcp->doff * 4) + l4_offset;
 
-	/* MSS L4LEN IDX */
-	mss_l4len_idx = (skb_shinfo(skb)->gso_size << E1000_ADVTXD_MSS_SHIFT);
-	mss_l4len_idx |= (l4len << E1000_ADVTXD_L4LEN_SHIFT);
+	/* remove payload length from inner checksum */
+	paylen = skb->len - l4_offset;
+	csum_replace_by_diff(&l4.tcp->check, htonl(paylen));
 
-	context_desc->mss_l4len_idx = cpu_to_le32(mss_l4len_idx);
-	context_desc->seqnum_seed = 0;
+	/* MSS L4LEN IDX */
+	mss_l4len_idx = (*hdr_len - l4_offset) << E1000_ADVTXD_L4LEN_SHIFT;
+	mss_l4len_idx |= skb_shinfo(skb)->gso_size << E1000_ADVTXD_MSS_SHIFT;
 
-	buffer_info->time_stamp = jiffies;
-	buffer_info->dma = 0;
-	i++;
-	if (i == tx_ring->count)
-		i = 0;
+	/* VLAN MACLEN IPLEN */
+	vlan_macip_lens = l4.hdr - ip.hdr;
+	vlan_macip_lens |= (ip.hdr - skb->data) << E1000_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens |= tx_flags & IGBVF_TX_FLAGS_VLAN_MASK;
 
-	tx_ring->next_to_use = i;
+	igbvf_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, mss_l4len_idx);
 
-	return true;
+	return 1;
 }
 
 static inline bool igbvf_ipv6_csum_is_sctp(struct sk_buff *skb)
@@ -2271,8 +2262,7 @@ static netdev_tx_t igbvf_xmit_frame_ring_adv(struct sk_buff *skb,
 
 	first = tx_ring->next_to_use;
 
-	tso = skb_is_gso(skb) ?
-		igbvf_tso(adapter, tx_ring, skb, tx_flags, &hdr_len, protocol) : 0;
+	tso = igbvf_tso(tx_ring, skb, tx_flags, &hdr_len);
 	if (unlikely(tso < 0)) {
 		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
@@ -2615,6 +2605,52 @@ static int igbvf_set_features(struct net_device *netdev,
 	return 0;
 }
 
+#define IGBVF_MAX_MAC_HDR_LEN		127
+#define IGBVF_MAX_NETWORK_HDR_LEN	511
+#define IGBVF_GSO_PARTIAL_FEATURES (NETIF_F_TSO_FIXEDID | \
+				    NETIF_F_GSO_GRE | \
+				    NETIF_F_GSO_GRE_CSUM | \
+				    NETIF_F_GSO_IPIP | \
+				    NETIF_F_GSO_SIT | \
+				    NETIF_F_GSO_UDP_TUNNEL | \
+				    NETIF_F_GSO_UDP_TUNNEL_CSUM)
+static netdev_features_t
+igbvf_features_check(struct sk_buff *skb, struct net_device *dev,
+		     netdev_features_t features)
+{
+	unsigned int network_hdr_len, mac_hdr_len;
+
+	/* Make certain the headers can be described by a context descriptor */
+	mac_hdr_len = skb_network_header(skb) - skb->data;
+	network_hdr_len = skb_checksum_start(skb) - skb_network_header(skb);
+	if (unlikely((mac_hdr_len > IGBVF_MAX_MAC_HDR_LEN) ||
+		     (network_hdr_len >  IGBVF_MAX_NETWORK_HDR_LEN)))
+		return features & ~(NETIF_F_HW_CSUM |
+				    NETIF_F_SCTP_CRC |
+				    NETIF_F_HW_VLAN_CTAG_TX |
+				    NETIF_F_TSO |
+				    NETIF_F_TSO_FIXEDID |
+				    NETIF_F_TSO6);
+
+	/* We can only support a fixed IPv4 ID or IPv6 header for TSO
+	 * with tunnels.  So if we aren't using a tunnel, or we aren't
+	 * performing TSO with a fixed ID we must strip the partial
+	 * features.
+	 */
+	if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_GRE |
+					   SKB_GSO_GRE_CSUM |
+					   SKB_GSO_IPIP |
+					   SKB_GSO_SIT |
+					   SKB_GSO_UDP_TUNNEL |
+					   SKB_GSO_UDP_TUNNEL_CSUM)) ||
+	    !(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID |
+					   SKB_GSO_TCPV6)))
+		return features & ~(NETIF_F_GSO_PARTIAL |
+				    IGBVF_GSO_PARTIAL_FEATURES);
+
+	return features;
+}
+
 static const struct net_device_ops igbvf_netdev_ops = {
 	.ndo_open		= igbvf_open,
 	.ndo_stop		= igbvf_close,
@@ -2631,7 +2667,7 @@ static const struct net_device_ops igbvf_netdev_ops = {
 	.ndo_poll_controller	= igbvf_netpoll,
 #endif
 	.ndo_set_features	= igbvf_set_features,
-	.ndo_features_check	= passthru_features_check,
+	.ndo_features_check	= igbvf_features_check,
 };
 
 /**
@@ -2739,14 +2775,15 @@ static int igbvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 			      NETIF_F_HW_CSUM |
 			      NETIF_F_SCTP_CRC;
 
+	netdev->gso_partial_features = IGBVF_GSO_PARTIAL_FEATURES;
+	netdev->hw_features |= NETIF_F_GSO_PARTIAL |
+			       IGBVF_GSO_PARTIAL_FEATURES;
+
 	netdev->features = netdev->hw_features |
 			   NETIF_F_HW_VLAN_CTAG_TX |
 			   NETIF_F_HW_VLAN_CTAG_RX |
 			   NETIF_F_HW_VLAN_CTAG_FILTER;
 
-	if (pci_using_dac)
-		netdev->features |= NETIF_F_HIGHDMA;
-
 	netdev->vlan_features |= NETIF_F_SG |
 				 NETIF_F_TSO |
 				 NETIF_F_TSO6 |
@@ -2754,7 +2791,14 @@ static int igbvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 				 NETIF_F_SCTP_CRC;
 
 	netdev->mpls_features |= NETIF_F_HW_CSUM;
-	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM |
+				   NETIF_F_TSO |
+				   NETIF_F_TSO6 |
+				   NETIF_F_GSO_PARTIAL |
+				   IGBVF_GSO_PARTIAL_FEATURES;
+
+	if (pci_using_dac)
+		netdev->features |= NETIF_F_HIGHDMA;
 
 	/*reset the controller to put the device in a known good state */
 	err = hw->mac.ops.reset_hw(hw);


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-07 22:32 ` [RFC PATCH 07/11] GENEVE: " Alexander Duyck
@ 2016-04-07 23:22   ` Jesse Gross
  2016-04-07 23:52     ` Alexander Duyck
  0 siblings, 1 reply; 20+ messages in thread
From: Jesse Gross @ 2016-04-07 23:22 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: herbert, Tom Herbert, Alexander Duyck, edumazet,
	Linux Kernel Network Developers, David Miller

On Thu, Apr 7, 2016 at 7:32 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch adds support for a feature I am calling IP ID mangling.  It is
> basically just another way of saying the IP IDs that are transmitted by the
> tunnel may not match up with what would normally be expected.  Specifically
> what will happen is in the case of TSO the IP IDs on the headers will be a
> fixed value so a given TSO will repeat the same inner IP ID value gso_segs
> number of times.
>
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>

If I'm understanding this correctly, enabling IP ID mangling will help
performance on ixgbe since it will allow it to do GSO partial instead
of plain GSO but it will hurt performance on i40e since it will drop
from TSO to plain GSO.

Assuming that's right, it seems like it will make it hard to choose the
right setting without knowledge of which hardware is in use. I guess
what we really want is "I care about nicely incrementing IP IDs" vs.
"I don't care as long as the DF bit is set". That second case is
really what this flag is trying to say but it seems like it is
enforcing too much in the i40e case - I don't think anyone wants to go
out of their way to make IP IDs jump around if incrementing is faster.


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-07 23:22   ` Jesse Gross
@ 2016-04-07 23:52     ` Alexander Duyck
  2016-04-08 21:40       ` Jesse Gross
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2016-04-07 23:52 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Alexander Duyck, Herbert Xu, Tom Herbert, Eric Dumazet,
	Linux Kernel Network Developers, David Miller

On Thu, Apr 7, 2016 at 4:22 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Thu, Apr 7, 2016 at 7:32 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>> This patch adds support for a feature I am calling IP ID mangling.  It is
>> basically just another way of saying the IP IDs that are transmitted by the
>> tunnel may not match up with what would normally be expected.  Specifically
>> what will happen is in the case of TSO the IP IDs on the headers will be a
>> fixed value so a given TSO will repeat the same inner IP ID value gso_segs
>> number of times.
>>
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>
> If I'm understanding this correctly, enabling IP ID mangling will help
> performance on ixgbe since it will allow it to do GSO partial instead
> of plain GSO but it will hurt performance on i40e since it will drop
> from TSO to plain GSO.

Right.  However, the option currently defaults to off, and can be
enabled per tunnel endpoint.  So if you had an ixgbe to i40e link you
could enable it on the end with the ixgbe and you should see good
performance in both directions.

> Assuming that's right, it seems like it will make it hard to choose the
> right setting without knowledge of which hardware is in use. I guess
> what we really want is "I care about nicely incrementing IP IDs" vs.
> "I don't care as long as the DF bit is set". That second case is
> really what this flag is trying to say but it seems like it is
> enforcing too much in the i40e case - I don't think anyone wants to go
> out of their way to make IP IDs jump around if incrementing is faster.

Right.  The problem is trying to sort out all the GRO/GSO bits.  I was
probably being a bit too conservative after the last few iterations
for the GRO fixes.

Just a thought.  What if I replaced NETIF_F_TSO_FIXEDID with something
that meant we could mangle the IP ID, like a NETIF_F_TSO_IPID_MANGLE
(advice for a better name welcome).  Instead of the feature flag meaning
we are going to transmit packets with a fixed ID it would mean we
don't care about the ID and are free to mangle it as we see fit.  The
GSO type can retain the same meaning as far as requiring the same
ID for all segments, but the feature would mean we will take fixed and
convert it to incrementing, or incrementing and convert it to fixed.

- Alex


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-07 23:52     ` Alexander Duyck
@ 2016-04-08 21:40       ` Jesse Gross
  2016-04-08 22:04         ` Alexander Duyck
  0 siblings, 1 reply; 20+ messages in thread
From: Jesse Gross @ 2016-04-08 21:40 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, Herbert Xu, Tom Herbert, Eric Dumazet,
	Linux Kernel Network Developers, David Miller

On Thu, Apr 7, 2016 at 8:52 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> Just a thought.  What if I replaced NETIF_F_TSO_FIXEDID with something
> that meant we could mangle the IP ID, like a NETIF_F_TSO_IPID_MANGLE
> (advice for a better name welcome).  Instead of the feature flag meaning
> we are going to transmit packets with a fixed ID it would mean we
> don't care about the ID and are free to mangle it as we see fit.  The
> GSO type can retain the same meaning as far as requiring the same
> ID for all segments, but the feature would mean we will take fixed and
> convert it to incrementing, or incrementing and convert it to fixed.

I saw the new version of the code that you posted with this idea and
now that I understand it better, it seems like a reasonable choice to
me - it's nice that it is consistent with GRO and not tunnel specific.
It also makes behavior consistent across drivers in regards to
incrementing IDs in the default case, which was one of my concerns
from before.

Maybe I missed it but I didn't see any checks for the DF bit being set
when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am
comfortable mangling my IDs in the DF case, I don't think this would
ever extend to non-DF packets. In the documentation you noted that it
is the driver's responsibility to do this check but I couldn't find it
in either ixgbe or igb. It would also be nice if the core stack could
enforce it somehow as well rather than each driver.


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-08 21:40       ` Jesse Gross
@ 2016-04-08 22:04         ` Alexander Duyck
  2016-04-09 15:52           ` Jesse Gross
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2016-04-08 22:04 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Alexander Duyck, Herbert Xu, Tom Herbert, Eric Dumazet,
	Linux Kernel Network Developers, David Miller

On Fri, Apr 8, 2016 at 2:40 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Thu, Apr 7, 2016 at 8:52 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> Just a thought.  What if I replaced NETIF_F_TSO_FIXEDID with something
>> that meant we could mangle the IP ID, like a NETIF_F_TSO_IPID_MANGLE
>> (advice for a better name welcome).  Instead of the feature flag meaning
>> we are going to transmit packets with a fixed ID it would mean we
>> don't care about the ID and are free to mangle it as we see fit.  The
>> GSO type can retain the same meaning as far as requiring the same
>> ID for all segments, but the feature would mean we will take fixed and
>> convert it to incrementing, or incrementing and convert it to fixed.
>
> I saw the new version of the code that you posted with this idea and
> now that I understand it better, it seems like a reasonable choice to
> me - it's nice that it is consistent with GRO and not tunnel specific.
> It also makes behavior consistent across drivers in regards to
> incrementing IDs in the default case, which was one of my concerns
> from before.
>
> Maybe I missed it but I didn't see any checks for the DF bit being set
> when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am
> comfortable mangling my IDs in the DF case, I don't think this would
> ever extend to non-DF packets. In the documentation you noted that it
> is the driver's responsibility to do this check but I couldn't find it
> in either ixgbe or igb. It would also be nice if the core stack could
> enforce it somehow as well rather than each driver.

Yeah, I had glossed over that in the igb and ixgbe patches.  A check is
only really needed for the incrementing to non-incrementing case and I
wasn't sure how common it was to have TCP with an IP header that
didn't set the DF bit.  In the case of the outer headers igb and ixgbe
will always increment the IP ID, so we don't have to worry about
whether DF is set there.  For the inner headers I had fudged it a bit
and didn't add the validation.  If needed I can see about adding that
shortly.

- Alex


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-08 22:04         ` Alexander Duyck
@ 2016-04-09 15:52           ` Jesse Gross
  2016-04-09 17:36             ` Alexander Duyck
  0 siblings, 1 reply; 20+ messages in thread
From: Jesse Gross @ 2016-04-09 15:52 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, Herbert Xu, Tom Herbert, Eric Dumazet,
	Linux Kernel Network Developers, David Miller

On Fri, Apr 8, 2016 at 7:04 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Fri, Apr 8, 2016 at 2:40 PM, Jesse Gross <jesse@kernel.org> wrote:
>> Maybe I missed it but I didn't see any checks for the DF bit being set
>> when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am
>> comfortable mangling my IDs in the DF case, I don't think this would
>> ever extend to non-DF packets. In the documentation you noted that it
>> is the driver's responsibility to do this check but I couldn't find it
>> in either ixgbe or igb. It would also be nice if the core stack could
>> enforce it somehow as well rather than each driver.
>
> Yeah, I had glossed over that in the igb and ixgbe patches.  A check is
> only really needed for the incrementing to non-incrementing case and I
> wasn't sure how common it was to have TCP with an IP header that
> didn't set the DF bit.  In the case of the outer headers igb and ixgbe
> will always increment the IP ID, so we don't have to worry about
> whether DF is set there.  For the inner headers I had fudged it a bit
> and didn't add the validation.  If needed I can see about adding that
> shortly.

TCP without the DF bit set is not the default but it is possible (it
can be enabled by setting /proc/sys/net/ipv4/ip_no_pmtu_disc). I also
did a quick check of some Internet services and at least some of them
seem to return TCP without DF, so it's not too rare.


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-09 15:52           ` Jesse Gross
@ 2016-04-09 17:36             ` Alexander Duyck
  2016-04-09 18:02               ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2016-04-09 17:36 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Alexander Duyck, Herbert Xu, Tom Herbert, Eric Dumazet,
	Linux Kernel Network Developers, David Miller

On Sat, Apr 9, 2016 at 8:52 AM, Jesse Gross <jesse@kernel.org> wrote:
> On Fri, Apr 8, 2016 at 7:04 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Fri, Apr 8, 2016 at 2:40 PM, Jesse Gross <jesse@kernel.org> wrote:
>>> Maybe I missed it but I didn't see any checks for the DF bit being set
>>> when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am
>>> comfortable mangling my IDs in the DF case, I don't think this would
>>> ever extend to non-DF packets. In the documentation you noted that it
>>> is the driver's responsibility to do this check but I couldn't find it
>>> in either ixgbe or igb. It would also be nice if the core stack could
>>> enforce it somehow as well rather than each driver.
>>
>> Yeah, I had glossed over that in the igb and ixgbe patches.  A check is
>> only really needed for the incrementing to non-incrementing case and I
>> wasn't sure how common it was to have TCP with an IP header that
>> didn't set the DF bit.  In the case of the outer headers igb and ixgbe
>> will always increment the IP ID, so we don't have to worry about
>> whether DF is set there.  For the inner headers I had fudged it a bit
>> and didn't add the validation.  If needed I can see about adding that
>> shortly.
>
> TCP without the DF bit set is not the default but it is possible (it
> can be enabled by setting /proc/sys/net/ipv4/ip_no_pmtu_disc). I also
> did a quick check of some Internet services and at least some of them
> seem to return TCP without DF, so it's not too rare.

For next version I plan to include a check in netif_skb_features that
will clear NETIF_F_TSO_MANGLEID if the DF bit is not set.

- Alex
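
A minimal sketch of such a check (illustrative only, not taken from the
posted series) might look like:

	if ((features & NETIF_F_TSO_MANGLEID) &&
	    skb->protocol == htons(ETH_P_IP) &&
	    !(ip_hdr(skb)->frag_off & htons(IP_DF)))
		features &= ~NETIF_F_TSO_MANGLEID;

i.e. the MANGLEID feature would only apply to frames that carry the DF
bit.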


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-09 17:36             ` Alexander Duyck
@ 2016-04-09 18:02               ` Eric Dumazet
  2016-04-09 18:32                 ` Alexander Duyck
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2016-04-09 18:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jesse Gross, Alexander Duyck, Herbert Xu, Tom Herbert,
	Eric Dumazet, Linux Kernel Network Developers, David Miller

On Sat, 2016-04-09 at 10:36 -0700, Alexander Duyck wrote:

> For next version I plan to include a check in netif_skb_features that
> will clear NETIF_F_TSO_MANGLEID if the DF bit is not set.

So it looks like we slowly but surely make the whole stack damn slow, to
support some extra features.

How many instructions are needed nowadays for netif_skb_features()
alone ?

More than 120 if I am not mistaken, if we count fast path in
skb_network_protocol()

Note: I have a patch to remove the gso_min_segs thing, currently unused.

2 instructions saved ! Woo-hoo !


* Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
  2016-04-09 18:02               ` Eric Dumazet
@ 2016-04-09 18:32                 ` Alexander Duyck
  0 siblings, 0 replies; 20+ messages in thread
From: Alexander Duyck @ 2016-04-09 18:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesse Gross, Alexander Duyck, Herbert Xu, Tom Herbert,
	Eric Dumazet, Linux Kernel Network Developers, David Miller

On Sat, Apr 9, 2016 at 11:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sat, 2016-04-09 at 10:36 -0700, Alexander Duyck wrote:
>
>> For next version I plan to include a check in netif_skb_features that
>> will clear NETIF_F_TSO_MANGLEID if the DF bit is not set.
>
> So it looks like we slowly but surely make the whole stack damn slow, to
> support some extra features.
>
> How many instructions are needed nowadays for netif_skb_features()
> alone ?
>
> More than 120 if I am not mistaken, if we count fast path in
> skb_network_protocol()
>
> Note: I have a patch to remove the gso_min_segs thing, currently unused.
>
> 2 instructions saved ! Woo-hoo !

I was planning to drop the checks for TSO_MANGLEID and GSO_PARTIAL into
an if statement based on skb_is_gso(), since I have GSO_PARTIAL to
check for as well.

- Alex


Thread overview: 20+ messages
2016-04-07 22:31 [RFC PATCH 00/11] GSO partial and TSO FIXEDID support Alexander Duyck
2016-04-07 22:31 ` [RFC PATCH 01/11] GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 02/11] ethtool: Add support for toggling any of the GSO offloads Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 03/11] GSO: Add GSO type for fixed IPv4 ID Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 04/11] GRO: Add support for TCP with fixed IPv4 ID field, limit tunnel IP ID values Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 05/11] GSO: Support partial segmentation offload Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 06/11] VXLAN: Add option to mangle IP IDs on inner headers when using TSO Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 07/11] GENEVE: " Alexander Duyck
2016-04-07 23:22   ` Jesse Gross
2016-04-07 23:52     ` Alexander Duyck
2016-04-08 21:40       ` Jesse Gross
2016-04-08 22:04         ` Alexander Duyck
2016-04-09 15:52           ` Jesse Gross
2016-04-09 17:36             ` Alexander Duyck
2016-04-09 18:02               ` Eric Dumazet
2016-04-09 18:32                 ` Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 08/11] Documentation: Add documentation for TSO and GSO features Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 09/11] i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM Alexander Duyck
2016-04-07 22:32 ` [RFC PATCH 10/11] ixgbe/ixgbevf: Add support for GSO partial Alexander Duyck
2016-04-07 22:33 ` [RFC PATCH 11/11] igb/igbvf: " Alexander Duyck
