* [PATCH net-next v3 0/5] xen-netback: IPv6 offload support
@ 2013-10-10 15:25 Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest Paul Durrant
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev
This patch series adds support for checksum and large packet offloads into
xen-netback.
Testing has mainly been done using the Microsoft network hardware
certification suite running in Server 2008R2 VMs with Citrix PV frontends.
v2:
- Fixed Wei's email address in Cc lines
v3:
- Responded to Wei's comments:
- netif.h now updated with comments and a definition of
XEN_NETIF_GSO_TYPE_NONE.
- limited number of pullups
- Responded to Annie's comments:
- New GSO_BIT macro
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
@ 2013-10-10 15:25 ` Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest Paul Durrant
` (3 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
Check the xenstore flag feature-ipv6-csum-offload to determine whether a
guest is happy to accept IPv6 packets with only a partial checksum.
Also check the analogous feature-ip-csum-offload to determine whether a
guest is happy to accept IPv4 packets with only a partial checksum,
as a replacement for a negated feature-no-csum-offload value, and
add a comment deprecating use of feature-no-csum-offload.
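The compatibility rule described above (the new per-protocol flags take precedence, but a set feature-no-csum-offload still disables IPv4 offload) can be sketched as a small userspace function. This is a hypothetical model of the negotiation, not driver code: the has_* parameters stand in for whether the xenstore key was present, and the *_val parameters for the value read.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the checksum-offload negotiation: feature-ip-csum-offload
 * defaults to on when absent, and a set feature-no-csum-offload
 * (deprecated) still clears IPv4 checksum offload for compatibility. */
static bool negotiate_ip_csum(bool has_ip_csum, int ip_csum_val,
                              bool has_no_csum, int no_csum_val)
{
    bool ip_csum = has_ip_csum ? (ip_csum_val != 0) : true;

    if (has_no_csum && no_csum_val)
        ip_csum = false;    /* deprecated flag is still honoured */

    return ip_csum;
}
```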
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/common.h | 3 ++-
drivers/net/xen-netback/interface.c | 10 +++++++---
drivers/net/xen-netback/xenbus.c | 24 +++++++++++++++++++++++-
include/xen/interface/io/netif.h | 10 ++++++++++
4 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5715318..b4a9a3c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -153,7 +153,8 @@ struct xenvif {
u8 can_sg:1;
u8 gso:1;
u8 gso_prefix:1;
- u8 csum:1;
+ u8 ip_csum:1;
+ u8 ipv6_csum:1;
/* Internal feature information. */
u8 can_queue:1; /* can queue packets for receiver? */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 01bb854..8e92783 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -216,8 +216,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
features &= ~NETIF_F_SG;
if (!vif->gso && !vif->gso_prefix)
features &= ~NETIF_F_TSO;
- if (!vif->csum)
+ if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
+ if (!vif->ipv6_csum)
+ features &= ~NETIF_F_IPV6_CSUM;
return features;
}
@@ -306,7 +308,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
vif->domid = domid;
vif->handle = handle;
vif->can_sg = 1;
- vif->csum = 1;
+ vif->ip_csum = 1;
vif->dev = dev;
vif->credit_bytes = vif->remaining_credit = ~0UL;
@@ -316,7 +318,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
vif->credit_timeout.expires = jiffies;
dev->netdev_ops = &xenvif_netdev_ops;
- dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+ dev->hw_features = NETIF_F_SG |
+ NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+ NETIF_F_TSO;
dev->features = dev->hw_features;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1b08d87..99343ca 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -571,10 +571,32 @@ static int connect_rings(struct backend_info *be)
val = 0;
vif->gso_prefix = !!val;
+ /* Before feature-ipv6-csum-offload was introduced, checksum offload
+ * was turned on by a zero value in (or lack of)
+ * feature-no-csum-offload. Therefore, for compatibility, assume
+ * that if feature-ip-csum-offload is missing that IP (v4) offload
+ * is turned on. If this is not the case then the flag will be
+ * cleared when we subsequently sample feature-no-csum-offload.
+ */
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ip-csum-offload",
+ "%d", &val) < 0)
+ val = 1;
+ vif->ip_csum = !!val;
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-offload",
+ "%d", &val) < 0)
+ val = 0;
+ vif->ipv6_csum = !!val;
+
+ /* This is deprecated. Frontends should set IP v4 and v6 checksum
+ * offload using feature-ip-csum-offload and
+ * feature-ipv6-csum-offload respectively.
+ */
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
"%d", &val) < 0)
val = 0;
- vif->csum = !val;
+ if (val)
+ vif->ip_csum = 0;
/* Map the shared frame, irq etc. */
err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref,
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index eb262e3..d9fb44739 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -51,6 +51,16 @@
*/
/*
+ * "feature-no-csum-offload" was used to turn off IPv4 TCP/UDP checksum
+ * offload but is now deprecated. Two new feature flags should now be used
+ * to control checksum offload:
+ * "feature-ip-csum-offload" should be used to turn IPv4 TCP/UDP checksum
+ * offload on or off. If it is missing then the feature is assumed to be on.
+ * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum
+ * offload on or off. If it is missing then the feature is assumed to be off.
+ */
+
+/*
* This is the 'wire' format for packets:
* Request 1: xen_netif_tx_request -- XEN_NETTXF_* (any flags)
* [Request 2: xen_netif_extra_info] (only if request 1 has XEN_NETTXF_extra_info)
--
1.7.10.4
* [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest Paul Durrant
@ 2013-10-10 15:25 ` Paul Durrant
2013-10-11 10:09 ` Wei Liu
2013-10-10 15:25 ` [PATCH net-next v3 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM Paul Durrant
` (2 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
For performance of VM-to-VM traffic on a single host it is better to avoid
calculating the TCP/UDP checksum in the sending frontend. To allow this,
this patch adds the code necessary to set up partial checksums for IPv6
packets and a xenstore flag, feature-ipv6-csum-offload, to advertise that
fact to frontends.
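The core of the IPv6 side of this patch is the walk over extension headers to locate the transport header before setting up the partial checksum. A minimal userspace sketch of that walk is below; it is an illustration, not the driver code (the real code also pulls data into the skb's linear area before each access, which is omitted here, and the constants are the standard IPv6 protocol numbers).

```c
#include <stddef.h>
#include <stdint.h>

/* Standard IPv6 next-header protocol numbers. */
#define NEXTHDR_HOP      0
#define NEXTHDR_TCP      6
#define NEXTHDR_UDP     17
#define NEXTHDR_ROUTING 43
#define NEXTHDR_FRAG    44
#define NEXTHDR_DEST    60

/* Returns the offset of the TCP/UDP header relative to the start of the
 * IPv6 header, or -1 if none is found.  pkt points at the IPv6 header;
 * nexthdr comes from the fixed header's nexthdr field. */
static int find_transport(const uint8_t *pkt, size_t len, uint8_t nexthdr)
{
    size_t off = 40;    /* fixed IPv6 header length */

    for (;;) {
        switch (nexthdr) {
        case NEXTHDR_TCP:
        case NEXTHDR_UDP:
            return (int)off;
        case NEXTHDR_FRAG:
            if (off + 8 > len)
                return -1;
            nexthdr = pkt[off];             /* frag_hdr.nexthdr */
            off += 8;                       /* fragment header is fixed size */
            break;
        case NEXTHDR_HOP:
        case NEXTHDR_ROUTING:
        case NEXTHDR_DEST:
            if (off + 2 > len)
                return -1;
            nexthdr = pkt[off];             /* opt_hdr.nexthdr */
            off += (pkt[off + 1] + 1) * 8;  /* ipv6_optlen() */
            break;
        default:
            return -1;                      /* unhandled protocol */
        }
    }
}
```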
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/netback.c | 243 +++++++++++++++++++++++++++++++------
drivers/net/xen-netback/xenbus.c | 9 ++
2 files changed, 213 insertions(+), 39 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..4c12030 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -109,15 +109,10 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
}
-/*
- * This is the amount of packet we copy rather than map, so that the
- * guest can't fiddle with the contents of the headers while we do
- * packet processing on them (netfilter, routing, etc).
+/* This is a minimum size for the linear area to avoid lots of
+ * calls to __pskb_pull_tail() as we set up checksum offsets.
*/
-#define PKT_PROT_LEN (ETH_HLEN + \
- VLAN_HLEN + \
- sizeof(struct iphdr) + MAX_IPOPTLEN + \
- sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
+#define PKT_PROT_LEN 128
static u16 frag_get_pending_idx(skb_frag_t *frag)
{
@@ -1118,61 +1113,74 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
return 0;
}
-static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
+{
+ if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
+ /* If we need to pullup then pullup to the max, so we
+ * won't need to do it again.
+ */
+ int target = min_t(int, skb->len, MAX_TCP_HEADER);
+ __pskb_pull_tail(skb, target - skb_headlen(skb));
+ }
+}
+
+static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
+ int recalculate_partial_csum)
{
- struct iphdr *iph;
+ struct iphdr *iph = (void *)skb->data;
+ unsigned int header_size;
+ unsigned int off;
int err = -EPROTO;
- int recalculate_partial_csum = 0;
- /*
- * A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
- * peers can fail to set NETRXF_csum_blank when sending a GSO
- * frame. In this case force the SKB to CHECKSUM_PARTIAL and
- * recalculate the partial checksum.
- */
- if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
- vif->rx_gso_checksum_fixup++;
- skb->ip_summed = CHECKSUM_PARTIAL;
- recalculate_partial_csum = 1;
- }
+ off = sizeof(struct iphdr);
- /* A non-CHECKSUM_PARTIAL SKB does not require setup. */
- if (skb->ip_summed != CHECKSUM_PARTIAL)
- return 0;
+ header_size = skb->network_header + off + MAX_IPOPTLEN;
+ maybe_pull_tail(skb, header_size);
- if (skb->protocol != htons(ETH_P_IP))
- goto out;
+ off = iph->ihl * 4;
- iph = (void *)skb->data;
switch (iph->protocol) {
case IPPROTO_TCP:
- if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+ if (!skb_partial_csum_set(skb, off,
offsetof(struct tcphdr, check)))
goto out;
if (recalculate_partial_csum) {
struct tcphdr *tcph = tcp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct tcphdr);
+ maybe_pull_tail(skb, header_size);
+
tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
- skb->len - iph->ihl*4,
+ skb->len - off,
IPPROTO_TCP, 0);
}
break;
case IPPROTO_UDP:
- if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+ if (!skb_partial_csum_set(skb, off,
offsetof(struct udphdr, check)))
goto out;
if (recalculate_partial_csum) {
struct udphdr *udph = udp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct udphdr);
+ maybe_pull_tail(skb, header_size);
+
udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
- skb->len - iph->ihl*4,
+ skb->len - off,
IPPROTO_UDP, 0);
}
break;
default:
if (net_ratelimit())
netdev_err(vif->dev,
- "Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n",
+ "Attempting to checksum a non-TCP/UDP packet, "
+ "dropping a protocol %d packet\n",
iph->protocol);
goto out;
}
@@ -1183,6 +1191,168 @@ out:
return err;
}
+static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
+ int recalculate_partial_csum)
+{
+ int err = -EPROTO;
+ struct ipv6hdr *ipv6h = (void *)skb->data;
+ u8 nexthdr;
+ unsigned int header_size;
+ unsigned int off;
+ bool done;
+ bool found;
+
+ done = false;
+ found = false;
+
+ off = sizeof(struct ipv6hdr);
+
+ header_size = skb->network_header + off;
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = ipv6h->nexthdr;
+
+ while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
+ !done) {
+ switch (nexthdr) {
+ case IPPROTO_FRAGMENT: {
+ struct frag_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct frag_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += 8;
+ break;
+ }
+ case IPPROTO_DSTOPTS:
+ case IPPROTO_HOPOPTS:
+ case IPPROTO_ROUTING: {
+ struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct ipv6_opt_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += ipv6_optlen(hp);
+ break;
+ }
+ case IPPROTO_AH: {
+ struct ip_auth_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct ip_auth_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += (hp->hdrlen+2)<<2;
+ break;
+ }
+ case IPPROTO_TCP:
+ case IPPROTO_UDP:
+ found = true;
+ /* Fall through */
+ default:
+ done = true;
+ break;
+ }
+ }
+
+ if (!done) {
+ if (net_ratelimit())
+ netdev_err(vif->dev, "Failed to parse packet header\n");
+ goto out;
+ }
+
+ if (!found) {
+ if (net_ratelimit())
+ netdev_err(vif->dev,
+ "Attempting to checksum a non-TCP/UDP packet, "
+ "dropping a protocol %d packet\n",
+ nexthdr);
+ goto out;
+ }
+
+ switch (nexthdr) {
+ case IPPROTO_TCP:
+ if (!skb_partial_csum_set(skb, off,
+ offsetof(struct tcphdr, check)))
+ goto out;
+
+ if (recalculate_partial_csum) {
+ struct tcphdr *tcph = tcp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct tcphdr);
+ maybe_pull_tail(skb, header_size);
+
+ tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+ &ipv6h->daddr,
+ skb->len - off,
+ IPPROTO_TCP, 0);
+ }
+ break;
+ case IPPROTO_UDP:
+ if (!skb_partial_csum_set(skb, off,
+ offsetof(struct udphdr, check)))
+ goto out;
+
+ if (recalculate_partial_csum) {
+ struct udphdr *udph = udp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct udphdr);
+ maybe_pull_tail(skb, header_size);
+
+ udph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+ &ipv6h->daddr,
+ skb->len - off,
+ IPPROTO_UDP, 0);
+ }
+ break;
+ }
+
+ err = 0;
+
+out:
+ return err;
+}
+
+static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+{
+ int err = -EPROTO;
+ int recalculate_partial_csum = 0;
+
+ /* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
+ * peers can fail to set NETRXF_csum_blank when sending a GSO
+ * frame. In this case force the SKB to CHECKSUM_PARTIAL and
+ * recalculate the partial checksum.
+ */
+ if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
+ vif->rx_gso_checksum_fixup++;
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ recalculate_partial_csum = 1;
+ }
+
+ /* A non-CHECKSUM_PARTIAL SKB does not require setup. */
+ if (skb->ip_summed != CHECKSUM_PARTIAL)
+ return 0;
+
+ if (skb->protocol == htons(ETH_P_IP))
+ err = checksum_setup_ip(vif, skb, recalculate_partial_csum);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum);
+
+ return err;
+}
+
static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
{
unsigned long now = jiffies;
@@ -1428,12 +1598,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
xenvif_fill_frags(vif, skb);
- /*
- * If the initial fragment was < PKT_PROT_LEN then
- * pull through some bytes from the other fragments to
- * increase the linear region to PKT_PROT_LEN bytes.
- */
- if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
+ if (skb_is_nonlinear(skb) && skb_headlen(skb) < PKT_PROT_LEN) {
int target = min_t(int, skb->len, PKT_PROT_LEN);
__pskb_pull_tail(skb, target - skb_headlen(skb));
}
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 99343ca..9c9b37d 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,15 @@ static int netback_probe(struct xenbus_device *dev,
goto abort_transaction;
}
+ /* We support partial checksum setup for IPv6 packets */
+ err = xenbus_printf(xbt, dev->nodename,
+ "feature-ipv6-csum-offload",
+ "%d", 1);
+ if (err) {
+ message = "writing feature-ipv6-csum-offload";
+ goto abort_transaction;
+ }
+
/* We support rx-copy path. */
err = xenbus_printf(xbt, dev->nodename,
"feature-rx-copy", "%d", 1);
--
1.7.10.4
* [PATCH net-next v3 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest Paul Durrant
@ 2013-10-10 15:25 ` Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to " Paul Durrant
4 siblings, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
There is no mechanism to insist that a guest always generates packets
with good checksums (at least for IPv4), so we must handle checksum
offloading from the guest and hence should set NETIF_F_RXCSUM.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/interface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8e92783..cb0d8ea 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -321,7 +321,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
NETIF_F_TSO;
- dev->features = dev->hw_features;
+ dev->features = dev->hw_features | NETIF_F_RXCSUM;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
dev->tx_queue_len = XENVIF_QUEUE_LENGTH;
--
1.7.10.4
* [PATCH net-next v3 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
` (2 preceding siblings ...)
2013-10-10 15:25 ` [PATCH net-next v3 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM Paul Durrant
@ 2013-10-10 15:25 ` Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to " Paul Durrant
4 siblings, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
This patch adds a xenstore feature flag, feature-gso-tcpv6, to advertise
that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs
if the frontend passes an extra segment with the new type
XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h.
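The mapping from the wire-format GSO type in the extra-info segment to the skb GSO type is the heart of this change. A standalone sketch of that switch follows; the XEN_NETIF_* values match netif.h, but the SKB_GSO_* bit values are inlined here purely for illustration.

```c
/* Wire-format GSO types from netif.h. */
#define XEN_NETIF_GSO_TYPE_TCPV4 1
#define XEN_NETIF_GSO_TYPE_TCPV6 2

/* Illustrative stand-ins for the kernel's SKB_GSO_* bits. */
#define SKB_GSO_TCPV4 (1 << 0)
#define SKB_GSO_TCPV6 (1 << 4)

/* Sketch of xenvif_set_skb_gso()'s type mapping: translate the guest's
 * extra-segment GSO type to an skb gso_type, or fail on an unknown type
 * (the driver calls xenvif_fatal_tx_err() in that case). */
static int xen_gso_to_skb(int xen_type)
{
    switch (xen_type) {
    case XEN_NETIF_GSO_TYPE_TCPV4:
        return SKB_GSO_TCPV4;
    case XEN_NETIF_GSO_TYPE_TCPV6:
        return SKB_GSO_TCPV6;
    default:
        return -1;
    }
}
```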
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/netback.c | 11 ++++++++---
drivers/net/xen-netback/xenbus.c | 7 +++++++
include/xen/interface/io/netif.h | 10 +++++++++-
3 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4c12030..afc055c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
return -EINVAL;
}
- /* Currently only TCPv4 S.O. is supported. */
- if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
+ switch (gso->u.gso.type) {
+ case XEN_NETIF_GSO_TYPE_TCPV4:
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+ break;
+ case XEN_NETIF_GSO_TYPE_TCPV6:
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+ break;
+ default:
netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
xenvif_fatal_tx_err(vif);
return -EINVAL;
}
skb_shinfo(skb)->gso_size = gso->u.gso.size;
- skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
/* Header must be checked, and gso_segs computed. */
skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 9c9b37d..7e4dcc9 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,13 @@ static int netback_probe(struct xenbus_device *dev,
goto abort_transaction;
}
+ err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
+ "%d", sg);
+ if (err) {
+ message = "writing feature-gso-tcpv6";
+ goto abort_transaction;
+ }
+
/* We support partial checksum setup for IPv6 packets */
err = xenbus_printf(xbt, dev->nodename,
"feature-ipv6-csum-offload",
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index d9fb44739..d7dd8d7 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -61,6 +61,13 @@
*/
/*
+ * "feature-gso-tcpv4" and "feature-gso-tcpv6" advertise the capability to
+ * handle large TCP packets (in IPv4 or IPv6 form respectively). Neither
+ * frontends nor backends are assumed to be capable unless the flags are
+ * present.
+ */
+
+/*
* This is the 'wire' format for packets:
* Request 1: xen_netif_tx_request -- XEN_NETTXF_* (any flags)
* [Request 2: xen_netif_extra_info] (only if request 1 has XEN_NETTXF_extra_info)
@@ -105,8 +112,9 @@ struct xen_netif_tx_request {
#define _XEN_NETIF_EXTRA_FLAG_MORE (0)
#define XEN_NETIF_EXTRA_FLAG_MORE (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
-/* GSO types - only TCPv4 currently supported. */
+/* GSO types */
#define XEN_NETIF_GSO_TYPE_TCPV4 (1)
+#define XEN_NETIF_GSO_TYPE_TCPV6 (2)
/*
* This structure needs to fit within both netif_tx_request and
--
1.7.10.4
* [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to the guest
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
` (3 preceding siblings ...)
2013-10-10 15:25 ` [PATCH net-next v3 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest Paul Durrant
@ 2013-10-10 15:25 ` Paul Durrant
2013-10-11 10:09 ` Wei Liu
4 siblings, 1 reply; 12+ messages in thread
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
extra or prefix segments to pass the large packet to the frontend. New
xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
to determine if the frontend is capable of handling such packets.
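With two GSO types, the patch replaces the single gso/gso_prefix booleans with per-type bit masks (via the new GSO_BIT macro) and rejects a frontend that advertises both the extra-segment and prefix variants of the same type. A compilable sketch of the mask layout and the mutual-exclusivity check:

```c
/* GSO types and mask helper as defined in netif.h / common.h. */
#define XEN_NETIF_GSO_TYPE_NONE  0
#define XEN_NETIF_GSO_TYPE_TCPV4 1
#define XEN_NETIF_GSO_TYPE_TCPV6 2

#define GSO_BIT(type) \
    (1 << XEN_NETIF_GSO_TYPE_ ## type)

/* connect_rings() fails with -EOPNOTSUPP if the same GSO type appears
 * in both masks; sketch of that validity check. */
static int gso_masks_valid(int gso_mask, int gso_prefix_mask)
{
    return (gso_mask & gso_prefix_mask) == 0;
}
```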
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/common.h | 9 +++++--
drivers/net/xen-netback/interface.c | 6 +++--
drivers/net/xen-netback/netback.c | 48 +++++++++++++++++++++++++++--------
drivers/net/xen-netback/xenbus.c | 29 +++++++++++++++++++--
include/xen/interface/io/netif.h | 1 +
5 files changed, 77 insertions(+), 16 deletions(-)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b4a9a3c..55b8dec 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -87,9 +87,13 @@ struct pending_tx_info {
struct xenvif_rx_meta {
int id;
int size;
+ int gso_type;
int gso_size;
};
+#define GSO_BIT(type) \
+ (1 << XEN_NETIF_GSO_TYPE_ ## type)
+
/* Discriminate from any valid pending_idx value. */
#define INVALID_PENDING_IDX 0xFFFF
@@ -150,9 +154,10 @@ struct xenvif {
u8 fe_dev_addr[6];
/* Frontend feature information. */
+ int gso_mask;
+ int gso_prefix_mask;
+
u8 can_sg:1;
- u8 gso:1;
- u8 gso_prefix:1;
u8 ip_csum:1;
u8 ipv6_csum:1;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index cb0d8ea..e4aa267 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -214,8 +214,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
if (!vif->can_sg)
features &= ~NETIF_F_SG;
- if (!vif->gso && !vif->gso_prefix)
+ if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
features &= ~NETIF_F_TSO;
+ if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
+ features &= ~NETIF_F_TSO6;
if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
if (!vif->ipv6_csum)
@@ -320,7 +322,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
dev->netdev_ops = &xenvif_netdev_ops;
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
- NETIF_F_TSO;
+ NETIF_F_TSO | NETIF_F_TSO6;
dev->features = dev->hw_features | NETIF_F_RXCSUM;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index afc055c..bb31921 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -140,7 +140,7 @@ static int max_required_rx_slots(struct xenvif *vif)
int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
- if (vif->can_sg || vif->gso || vif->gso_prefix)
+ if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
return max;
@@ -312,6 +312,7 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif,
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
+ meta->gso_type = 0;
meta->gso_size = 0;
meta->size = 0;
meta->id = req->id;
@@ -334,6 +335,7 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
struct gnttab_copy *copy_gop;
struct xenvif_rx_meta *meta;
unsigned long bytes;
+ int gso_type;
/* Data must not cross a page boundary. */
BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -392,7 +394,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
}
/* Leave a gap for the GSO descriptor. */
- if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+ else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+ else
+ gso_type = XEN_NETIF_GSO_TYPE_NONE;
+
+ if (*head && ((1 << gso_type) & vif->gso_mask))
vif->rx.req_cons++;
*head = 0; /* There must be something in this buffer now. */
@@ -423,14 +432,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
unsigned char *data;
int head = 1;
int old_meta_prod;
+ int gso_type;
+ int gso_size;
old_meta_prod = npo->meta_prod;
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+ gso_size = skb_shinfo(skb)->gso_size;
+ } else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+ gso_size = skb_shinfo(skb)->gso_size;
+ } else {
+ gso_type = 0;
+ gso_size = 0;
+ }
+
/* Set up a GSO prefix descriptor, if necessary */
- if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
+ if ((1 << skb_shinfo(skb)->gso_type) & vif->gso_prefix_mask) {
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
- meta->gso_size = skb_shinfo(skb)->gso_size;
+ meta->gso_type = gso_type;
+ meta->gso_size = gso_size;
meta->size = 0;
meta->id = req->id;
}
@@ -438,10 +461,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
- if (!vif->gso_prefix)
- meta->gso_size = skb_shinfo(skb)->gso_size;
- else
+ if ((1 << gso_type) & vif->gso_mask) {
+ meta->gso_type = gso_type;
+ meta->gso_size = gso_size;
+ } else {
+ meta->gso_type = 0;
meta->gso_size = 0;
+ }
meta->size = 0;
meta->id = req->id;
@@ -587,7 +613,8 @@ void xenvif_rx_action(struct xenvif *vif)
vif = netdev_priv(skb->dev);
- if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
+ if ((1 << vif->meta[npo.meta_cons].gso_type) &
+ vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&vif->rx,
vif->rx.rsp_prod_pvt++);
@@ -624,7 +651,8 @@ void xenvif_rx_action(struct xenvif *vif)
vif->meta[npo.meta_cons].size,
flags);
- if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
+ if ((1 << vif->meta[npo.meta_cons].gso_type) &
+ vif->gso_mask) {
struct xen_netif_extra_info *gso =
(struct xen_netif_extra_info *)
RING_GET_RESPONSE(&vif->rx,
@@ -632,8 +660,8 @@ void xenvif_rx_action(struct xenvif *vif)
resp->flags |= XEN_NETRXF_extra_info;
+ gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
- gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
gso->u.gso.pad = 0;
gso->u.gso.features = 0;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 7e4dcc9..6a005f1 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -577,15 +577,40 @@ static int connect_rings(struct backend_info *be)
val = 0;
vif->can_sg = !!val;
+ vif->gso_mask = 0;
+ vif->gso_prefix_mask = 0;
+
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
"%d", &val) < 0)
val = 0;
- vif->gso = !!val;
+ if (val)
+ vif->gso_mask |= GSO_BIT(TCPV4);
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
"%d", &val) < 0)
val = 0;
- vif->gso_prefix = !!val;
+ if (val)
+ vif->gso_prefix_mask |= GSO_BIT(TCPV4);
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
+ "%d", &val) < 0)
+ val = 0;
+ if (val)
+ vif->gso_mask |= GSO_BIT(TCPV6);
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-prefix",
+ "%d", &val) < 0)
+ val = 0;
+ if (val)
+ vif->gso_prefix_mask |= GSO_BIT(TCPV6);
+
+ if (vif->gso_mask & vif->gso_prefix_mask) {
+ xenbus_dev_fatal(dev, err,
+ "%s: gso and gso prefix flags are not "
+ "mutually exclusive",
+ dev->otherend);
+ return -EOPNOTSUPP;
+ }
/* Before feature-ipv6-csum-offload was introduced, checksum offload
* was turned on by a zero value in (or lack of)
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index d7dd8d7..41674bc 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -113,6 +113,7 @@ struct xen_netif_tx_request {
#define XEN_NETIF_EXTRA_FLAG_MORE (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
/* GSO types */
+#define XEN_NETIF_GSO_TYPE_NONE (0)
#define XEN_NETIF_GSO_TYPE_TCPV4 (1)
#define XEN_NETIF_GSO_TYPE_TCPV6 (2)
--
1.7.10.4
* Re: [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest
2013-10-10 15:25 ` [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest Paul Durrant
@ 2013-10-11 10:09 ` Wei Liu
2013-10-11 11:09 ` Paul Durrant
0 siblings, 1 reply; 12+ messages in thread
From: Wei Liu @ 2013-10-11 10:09 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel, netdev, Wei Liu, David Vrabel, Ian Campbell
On Thu, Oct 10, 2013 at 04:25:30PM +0100, Paul Durrant wrote:
[...]
> -#define PKT_PROT_LEN (ETH_HLEN + \
> - VLAN_HLEN + \
> - sizeof(struct iphdr) + MAX_IPOPTLEN + \
> - sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
> +#define PKT_PROT_LEN 128
I'm not against changing this, but could you please add a comment on why
128 was chosen.
>
[...]
> + while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
> + !done) {
> + switch (nexthdr) {
> + case IPPROTO_FRAGMENT: {
> + struct frag_hdr *hp = (void *)(skb->data + off);
> +
> + header_size = skb->network_header +
> + off +
> + sizeof(struct frag_hdr);
> + maybe_pull_tail(skb, header_size);
> +
> + nexthdr = hp->nexthdr;
> + off += 8;
Can you use sizeof(frag_hdr) instead of 8?
> + break;
> + }
> + case IPPROTO_DSTOPTS:
> + case IPPROTO_HOPOPTS:
> + case IPPROTO_ROUTING: {
> + struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
> +
> + header_size = skb->network_header +
> + off +
> + sizeof(struct ipv6_opt_hdr);
> + maybe_pull_tail(skb, header_size);
> +
> + nexthdr = hp->nexthdr;
> + off += ipv6_optlen(hp);
> + break;
> + }
> + case IPPROTO_AH: {
> + struct ip_auth_hdr *hp = (void *)(skb->data + off);
> +
> + header_size = skb->network_header +
> + off +
> + sizeof(struct ip_auth_hdr);
> + maybe_pull_tail(skb, header_size);
> +
> + nexthdr = hp->nexthdr;
> + off += (hp->hdrlen+2)<<2;
> + break;
> + }
> + case IPPROTO_TCP:
> + case IPPROTO_UDP:
> + found = true;
> + /* Fall through */
> + default:
> + done = true;
> + break;
The above 'switch' doesn't seem to cover all IPPROTO_* values. Will it cause
the loop to exit too early? In other words, does any IPPROTO_* not
listed above mark the end of parsing?
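Both points above (the magic 8 and the early-exit semantics of the default case) can be seen in a user-space sketch of the same next-header walk. The struct layouts here are simplified stand-ins for the kernel's, and only two extension headers are handled, purely for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's extension-header structs. */
struct frag_hdr {            /* RFC 2460 fragment header: fixed 8 bytes */
    uint8_t  nexthdr;
    uint8_t  reserved;
    uint16_t frag_off;
    uint32_t identification;
};

struct ipv6_opt_hdr {        /* hop-by-hop / destination options header */
    uint8_t nexthdr;
    uint8_t hdrlen;          /* length in 8-octet units, excluding first 8 */
};

#define IPPROTO_HOPOPTS  0
#define IPPROTO_TCP      6
#define IPPROTO_FRAGMENT 44

/* Walk the extension-header chain starting at nexthdr/offset *offp.
 * Returns 1 and updates *offp if a TCP header is found, 0 otherwise. */
static int find_tcp(const uint8_t *buf, size_t len,
                    uint8_t nexthdr, size_t *offp)
{
    size_t off = *offp;

    while (off < len) {
        switch (nexthdr) {
        case IPPROTO_FRAGMENT: {
            const struct frag_hdr *hp = (const void *)(buf + off);
            nexthdr = hp->nexthdr;
            off += sizeof(struct frag_hdr);   /* 8, but self-documenting */
            break;
        }
        case IPPROTO_HOPOPTS: {
            const struct ipv6_opt_hdr *hp = (const void *)(buf + off);
            nexthdr = hp->nexthdr;
            off += (hp->hdrlen + 1) * 8;      /* ipv6_optlen() equivalent */
            break;
        }
        case IPPROTO_TCP:
            *offp = off;
            return 1;
        default:
            /* Any unlisted protocol ends the walk: by then the chain has
             * reached an upper-layer header that is not TCP/UDP, or one
             * this parser cannot step over. */
            return 0;
        }
    }
    return 0;
}
```

The default case is the crux of the question: stopping there is only correct if every protocol number not listed is guaranteed to be a terminal (upper-layer) header rather than a skippable extension header.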
[...]
> unsigned long now = jiffies;
> @@ -1428,12 +1598,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
>
> xenvif_fill_frags(vif, skb);
>
> - /*
> - * If the initial fragment was < PKT_PROT_LEN then
> - * pull through some bytes from the other fragments to
> - * increase the linear region to PKT_PROT_LEN bytes.
> - */
> - if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
> + if (skb_is_nonlinear(skb) && skb_headlen(skb) < PKT_PROT_LEN) {
This change is not necessary: the comment is useful, and swapping the two
operands of && has no effect.
Wei.
* Re: [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to the guest
2013-10-10 15:25 ` [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to " Paul Durrant
@ 2013-10-11 10:09 ` Wei Liu
2013-10-11 11:11 ` Paul Durrant
2013-10-11 14:03 ` Paul Durrant
0 siblings, 2 replies; 12+ messages in thread
From: Wei Liu @ 2013-10-11 10:09 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel, netdev, Wei Liu, David Vrabel, Ian Campbell
On Thu, Oct 10, 2013 at 04:25:33PM +0100, Paul Durrant wrote:
[...]
> return max;
> @@ -312,6 +312,7 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif,
> req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
>
> meta = npo->meta + npo->meta_prod++;
> + meta->gso_type = 0;
Should use XEN_NETIF_GSO_TYPE_NONE here.
> meta->gso_size = 0;
> meta->size = 0;
[...]
> + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
> + gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
> + gso_size = skb_shinfo(skb)->gso_size;
> + } else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
> + gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> + gso_size = skb_shinfo(skb)->gso_size;
> + } else {
> + gso_type = 0;
Ditto.
> + gso_size = 0;
> + }
> +
[...]
> @@ -438,10 +461,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
> req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
> meta = npo->meta + npo->meta_prod++;
>
> - if (!vif->gso_prefix)
> - meta->gso_size = skb_shinfo(skb)->gso_size;
> - else
> + if ((1 << gso_type) & vif->gso_mask) {
> + meta->gso_type = gso_type;
> + meta->gso_size = gso_size;
> + } else {
> + meta->gso_type = 0;
Ditto.
> meta->gso_size = 0;
> + }
>
> meta->size = 0;
> meta->id = req->id;
> @@ -587,7 +613,8 @@ void xenvif_rx_action(struct xenvif *vif)
>
> vif = netdev_priv(skb->dev);
>
> - if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
> + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> + vif->gso_prefix_mask) {
The logic is changed here. Why do you remove the test on gso_size?
> resp = RING_GET_RESPONSE(&vif->rx,
> vif->rx.rsp_prod_pvt++);
>
> @@ -624,7 +651,8 @@ void xenvif_rx_action(struct xenvif *vif)
> vif->meta[npo.meta_cons].size,
> flags);
>
> - if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
> + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> + vif->gso_mask) {
Ditto.
> struct xen_netif_extra_info *gso =
> (struct xen_netif_extra_info *)
> RING_GET_RESPONSE(&vif->rx,
> @@ -632,8 +660,8 @@ void xenvif_rx_action(struct xenvif *vif)
>
[...]
> * was turned on by a zero value in (or lack of)
> diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
> index d7dd8d7..41674bc 100644
> --- a/include/xen/interface/io/netif.h
> +++ b/include/xen/interface/io/netif.h
> @@ -113,6 +113,7 @@ struct xen_netif_tx_request {
> #define XEN_NETIF_EXTRA_FLAG_MORE (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
>
> /* GSO types */
> +#define XEN_NETIF_GSO_TYPE_NONE (0)
Indentation.
Wei.
* RE: [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest
2013-10-11 10:09 ` Wei Liu
@ 2013-10-11 11:09 ` Paul Durrant
2013-10-11 12:08 ` [Xen-devel] " Paul Durrant
0 siblings, 1 reply; 12+ messages in thread
From: Paul Durrant @ 2013-10-11 11:09 UTC (permalink / raw)
To: Wei Liu
Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Wei Liu,
David Vrabel, Ian Campbell
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 11 October 2013 11:10
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David Vrabel;
> Ian Campbell
> Subject: Re: [PATCH net-next v3 2/5] xen-netback: add support for IPv6
> checksum offload from guest
>
> On Thu, Oct 10, 2013 at 04:25:30PM +0100, Paul Durrant wrote:
> [...]
> > -#define PKT_PROT_LEN (ETH_HLEN + \
> > - VLAN_HLEN + \
> > - sizeof(struct iphdr) + MAX_IPOPTLEN + \
> > - sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
> > +#define PKT_PROT_LEN 128
>
> I'm not against changing this, but could you please add a comment on why
> 128 was chosen.
>
Hmm. I did mod the comment, but that change seems to have got lost somewhere.
> >
> [...]
> > + while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len))
> &&
> > + !done) {
> > + switch (nexthdr) {
> > + case IPPROTO_FRAGMENT: {
> > + struct frag_hdr *hp = (void *)(skb->data + off);
> > +
> > + header_size = skb->network_header +
> > + off +
> > + sizeof(struct frag_hdr);
> > + maybe_pull_tail(skb, header_size);
> > +
> > + nexthdr = hp->nexthdr;
> > + off += 8;
>
> Can you use sizeof(struct frag_hdr) instead of 8?
>
> > + break;
> > + }
> > + case IPPROTO_DSTOPTS:
> > + case IPPROTO_HOPOPTS:
> > + case IPPROTO_ROUTING: {
> > + struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
> > +
> > + header_size = skb->network_header +
> > + off +
> > + sizeof(struct ipv6_opt_hdr);
> > + maybe_pull_tail(skb, header_size);
> > +
> > + nexthdr = hp->nexthdr;
> > + off += ipv6_optlen(hp);
> > + break;
> > + }
> > + case IPPROTO_AH: {
> > + struct ip_auth_hdr *hp = (void *)(skb->data + off);
> > +
> > + header_size = skb->network_header +
> > + off +
> > + sizeof(struct ip_auth_hdr);
> > + maybe_pull_tail(skb, header_size);
> > +
> > + nexthdr = hp->nexthdr;
> > + off += (hp->hdrlen+2)<<2;
> > + break;
> > + }
> > + case IPPROTO_TCP:
> > + case IPPROTO_UDP:
> > + found = true;
> > + /* Fall through */
> > + default:
> > + done = true;
> > + break;
>
> The above 'switch' doesn't seem to cover all IPPROTO_* values. Will it cause
> the loop to exit too early? In other words, does any IPPROTO_* not
> listed above mark the end of parsing?
>
AFAIK, hitting anything not in that switch would mean we don't have a TCP or UDP packet. I think the original code structure made that clearer, so I'm going to go back to that.
> [...]
> > unsigned long now = jiffies;
> > @@ -1428,12 +1598,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int
> budget)
> >
> > xenvif_fill_frags(vif, skb);
> >
> > - /*
> > - * If the initial fragment was < PKT_PROT_LEN then
> > - * pull through some bytes from the other fragments to
> > - * increase the linear region to PKT_PROT_LEN bytes.
> > - */
> > - if (skb_headlen(skb) < PKT_PROT_LEN &&
> skb_is_nonlinear(skb)) {
> > + if (skb_is_nonlinear(skb) && skb_headlen(skb) <
> PKT_PROT_LEN) {
>
> This change is not necessary: the comment is useful, and swapping two
> operands of && doesn't have any effect.
Sorry, that was too much cutting and pasting around. I'll just ditch that hunk.
Paul
>
> Wei.
* RE: [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to the guest
2013-10-11 10:09 ` Wei Liu
@ 2013-10-11 11:11 ` Paul Durrant
2013-10-11 14:03 ` Paul Durrant
1 sibling, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-11 11:11 UTC (permalink / raw)
To: Wei Liu
Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Wei Liu,
David Vrabel, Ian Campbell
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 11 October 2013 11:10
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David Vrabel;
> Ian Campbell
> Subject: Re: [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to
> the guest
>
> On Thu, Oct 10, 2013 at 04:25:33PM +0100, Paul Durrant wrote:
> [...]
> > return max;
> > @@ -312,6 +312,7 @@ static struct xenvif_rx_meta
> *get_next_rx_buffer(struct xenvif *vif,
> > req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
> >
> > meta = npo->meta + npo->meta_prod++;
> > + meta->gso_type = 0;
>
> Should use XEN_NETIF_GSO_TYPE_NONE here.
>
> > meta->gso_size = 0;
> > meta->size = 0;
> [...]
> > + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
> > + gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
> > + gso_size = skb_shinfo(skb)->gso_size;
> > + } else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
> > + gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> > + gso_size = skb_shinfo(skb)->gso_size;
> > + } else {
> > + gso_type = 0;
>
> Ditto.
>
> > + gso_size = 0;
> > + }
> > +
> [...]
> > @@ -438,10 +461,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
> > req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
> > meta = npo->meta + npo->meta_prod++;
> >
> > - if (!vif->gso_prefix)
> > - meta->gso_size = skb_shinfo(skb)->gso_size;
> > - else
> > + if ((1 << gso_type) & vif->gso_mask) {
> > + meta->gso_type = gso_type;
> > + meta->gso_size = gso_size;
> > + } else {
> > + meta->gso_type = 0;
>
> Ditto.
>
> > meta->gso_size = 0;
> > + }
> >
> > meta->size = 0;
> > meta->id = req->id;
> > @@ -587,7 +613,8 @@ void xenvif_rx_action(struct xenvif *vif)
> >
> > vif = netdev_priv(skb->dev);
> >
> > - if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
> > + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> > + vif->gso_prefix_mask) {
>
> The logic is changed here. Why do you remove the test on gso_size?
>
> > resp = RING_GET_RESPONSE(&vif->rx,
> > vif->rx.rsp_prod_pvt++);
> >
> > @@ -624,7 +651,8 @@ void xenvif_rx_action(struct xenvif *vif)
> > vif->meta[npo.meta_cons].size,
> > flags);
> >
> > - if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix)
> {
> > + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> > + vif->gso_mask) {
>
> Ditto.
>
> > struct xen_netif_extra_info *gso =
> > (struct xen_netif_extra_info *)
> > RING_GET_RESPONSE(&vif->rx,
> > @@ -632,8 +660,8 @@ void xenvif_rx_action(struct xenvif *vif)
> >
> [...]
> > * was turned on by a zero value in (or lack of)
> > diff --git a/include/xen/interface/io/netif.h
> b/include/xen/interface/io/netif.h
> > index d7dd8d7..41674bc 100644
> > --- a/include/xen/interface/io/netif.h
> > +++ b/include/xen/interface/io/netif.h
> > @@ -113,6 +113,7 @@ struct xen_netif_tx_request {
> > #define XEN_NETIF_EXTRA_FLAG_MORE
> (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
> >
> > /* GSO types */
> > +#define XEN_NETIF_GSO_TYPE_NONE (0)
>
> Indentation.
>
Agreed. I managed to lose a load of changes and this patch was one I had to reconstruct; I'll fix.
Paul
* RE: [Xen-devel] [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest
2013-10-11 11:09 ` Paul Durrant
@ 2013-10-11 12:08 ` Paul Durrant
0 siblings, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-11 12:08 UTC (permalink / raw)
To: Paul Durrant, Wei Liu
Cc: netdev@vger.kernel.org, Ian Campbell, Wei Liu, David Vrabel,
xen-devel@lists.xen.org
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Paul Durrant
> Sent: 11 October 2013 12:10
> To: Wei Liu
> Cc: netdev@vger.kernel.org; Ian Campbell; Wei Liu; David Vrabel; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net-next v3 2/5] xen-netback: add support
> for IPv6 checksum offload from guest
>
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > Sent: 11 October 2013 11:10
> > To: Paul Durrant
> > Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David
> Vrabel;
> > Ian Campbell
> > Subject: Re: [PATCH net-next v3 2/5] xen-netback: add support for IPv6
> > checksum offload from guest
> >
> > On Thu, Oct 10, 2013 at 04:25:30PM +0100, Paul Durrant wrote:
> > [...]
> > > -#define PKT_PROT_LEN (ETH_HLEN + \
> > > - VLAN_HLEN + \
> > > - sizeof(struct iphdr) + MAX_IPOPTLEN + \
> > > - sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
> > > +#define PKT_PROT_LEN 128
> >
> > I'm not against changing this, but could you please add a comment on why
> > 128 was chosen.
> >
>
> Hmm. I did mod the comment, but that change seems to have got lost
> somewhere.
>
> > >
> > [...]
> > > + while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len))
> > &&
> > > + !done) {
> > > + switch (nexthdr) {
> > > + case IPPROTO_FRAGMENT: {
> > > + struct frag_hdr *hp = (void *)(skb->data + off);
> > > +
> > > + header_size = skb->network_header +
> > > + off +
> > > + sizeof(struct frag_hdr);
> > > + maybe_pull_tail(skb, header_size);
> > > +
> > > + nexthdr = hp->nexthdr;
> > > + off += 8;
> >
> > Can you use sizeof(struct frag_hdr) instead of 8?
> >
> > > + break;
> > > + }
> > > + case IPPROTO_DSTOPTS:
> > > + case IPPROTO_HOPOPTS:
> > > + case IPPROTO_ROUTING: {
> > > + struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
> > > +
> > > + header_size = skb->network_header +
> > > + off +
> > > + sizeof(struct ipv6_opt_hdr);
> > > + maybe_pull_tail(skb, header_size);
> > > +
> > > + nexthdr = hp->nexthdr;
> > > + off += ipv6_optlen(hp);
> > > + break;
> > > + }
> > > + case IPPROTO_AH: {
> > > + struct ip_auth_hdr *hp = (void *)(skb->data + off);
> > > +
> > > + header_size = skb->network_header +
> > > + off +
> > > + sizeof(struct ip_auth_hdr);
> > > + maybe_pull_tail(skb, header_size);
> > > +
> > > + nexthdr = hp->nexthdr;
> > > + off += (hp->hdrlen+2)<<2;
> > > + break;
> > > + }
> > > + case IPPROTO_TCP:
> > > + case IPPROTO_UDP:
> > > + found = true;
> > > + /* Fall through */
> > > + default:
> > > + done = true;
> > > + break;
> >
> > The above 'switch' doesn't seem to cover all IPPROTO_* values. Will it cause
> > the loop to exit too early? In other words, does any IPPROTO_* not
> > listed above mark the end of parsing?
> >
>
> AFAIK, hitting anything not in that switch would mean we don't have a TCP or
> UDP packet. I think the original code structure made that clearer so I'm going
> to go back to that.
>
Looking at this again, I realize I should not parse the fragment header; there's no way anyone should be sending a fragment with a partial checksum.
Paul
> > [...]
> > > unsigned long now = jiffies;
> > > @@ -1428,12 +1598,7 @@ static int xenvif_tx_submit(struct xenvif *vif,
> int
> > budget)
> > >
> > > xenvif_fill_frags(vif, skb);
> > >
> > > - /*
> > > - * If the initial fragment was < PKT_PROT_LEN then
> > > - * pull through some bytes from the other fragments to
> > > - * increase the linear region to PKT_PROT_LEN bytes.
> > > - */
> > > - if (skb_headlen(skb) < PKT_PROT_LEN &&
> > skb_is_nonlinear(skb)) {
> > > + if (skb_is_nonlinear(skb) && skb_headlen(skb) <
> > PKT_PROT_LEN) {
> >
> > This change is not necessary: the comment is useful, and swapping two
> > operands of && doesn't have any effect.
>
> Sorry, that was too much cutting and pasting around. I'll just ditch that hunk.
>
> Paul
>
> >
> > Wei.
>
* RE: [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to the guest
2013-10-11 10:09 ` Wei Liu
2013-10-11 11:11 ` Paul Durrant
@ 2013-10-11 14:03 ` Paul Durrant
1 sibling, 0 replies; 12+ messages in thread
From: Paul Durrant @ 2013-10-11 14:03 UTC (permalink / raw)
To: Wei Liu
Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Wei Liu,
David Vrabel, Ian Campbell
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 11 October 2013 11:10
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David Vrabel;
> Ian Campbell
> Subject: Re: [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to
> the guest
>
> On Thu, Oct 10, 2013 at 04:25:33PM +0100, Paul Durrant wrote:
> [...]
> > return max;
> > @@ -312,6 +312,7 @@ static struct xenvif_rx_meta
> *get_next_rx_buffer(struct xenvif *vif,
> > req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
> >
> > meta = npo->meta + npo->meta_prod++;
> > + meta->gso_type = 0;
>
> Should use XEN_NETIF_GSO_TYPE_NONE here.
>
> > meta->gso_size = 0;
> > meta->size = 0;
> [...]
> > + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
> > + gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
> > + gso_size = skb_shinfo(skb)->gso_size;
> > + } else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
> > + gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> > + gso_size = skb_shinfo(skb)->gso_size;
> > + } else {
> > + gso_type = 0;
>
> Ditto.
>
> > + gso_size = 0;
> > + }
> > +
> [...]
> > @@ -438,10 +461,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
> > req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
> > meta = npo->meta + npo->meta_prod++;
> >
> > - if (!vif->gso_prefix)
> > - meta->gso_size = skb_shinfo(skb)->gso_size;
> > - else
> > + if ((1 << gso_type) & vif->gso_mask) {
> > + meta->gso_type = gso_type;
> > + meta->gso_size = gso_size;
> > + } else {
> > + meta->gso_type = 0;
>
> Ditto.
>
> > meta->gso_size = 0;
> > + }
> >
> > meta->size = 0;
> > meta->id = req->id;
> > @@ -587,7 +613,8 @@ void xenvif_rx_action(struct xenvif *vif)
> >
> > vif = netdev_priv(skb->dev);
> >
> > - if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
> > + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> > + vif->gso_prefix_mask) {
>
> The logic is changed here. Why do you remove the test on gso_size?
>
Sorry, I missed this one before so never responded...
The test on gso_size is no longer necessary: if gso_size is 0 then gso_type will be XEN_NETIF_GSO_TYPE_NONE, and that type never appears in the masks.
Paul
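The mask reasoning above can be sanity-checked in isolation. The constants mirror the patch's netif.h additions; the mask contents below are illustrative assumptions for a backend that negotiated both GSO types:

```c
#include <assert.h>

/* GSO type constants mirroring the patch's netif.h additions. */
#define XEN_NETIF_GSO_TYPE_NONE  0
#define XEN_NETIF_GSO_TYPE_TCPV4 1
#define XEN_NETIF_GSO_TYPE_TCPV6 2

/* Does this meta slot need a GSO extra-info response?  Mirrors the
 * patch's "(1 << gso_type) & mask" test. */
static int needs_gso_extra(unsigned gso_type, unsigned gso_mask)
{
    return ((1u << gso_type) & gso_mask) != 0;
}
```

Since bit 0 (for XEN_NETIF_GSO_TYPE_NONE) is never set in gso_mask or gso_prefix_mask, a non-GSO packet fails the mask test on its own, which is why the separate gso_size check becomes redundant.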
> > resp = RING_GET_RESPONSE(&vif->rx,
> > vif->rx.rsp_prod_pvt++);
> >
> > @@ -624,7 +651,8 @@ void xenvif_rx_action(struct xenvif *vif)
> > vif->meta[npo.meta_cons].size,
> > flags);
> >
> > - if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix)
> {
> > + if ((1 << vif->meta[npo.meta_cons].gso_type) &
> > + vif->gso_mask) {
>
> Ditto.
>
> > struct xen_netif_extra_info *gso =
> > (struct xen_netif_extra_info *)
> > RING_GET_RESPONSE(&vif->rx,
> > @@ -632,8 +660,8 @@ void xenvif_rx_action(struct xenvif *vif)
> >
> [...]
> > * was turned on by a zero value in (or lack of)
> > diff --git a/include/xen/interface/io/netif.h
> b/include/xen/interface/io/netif.h
> > index d7dd8d7..41674bc 100644
> > --- a/include/xen/interface/io/netif.h
> > +++ b/include/xen/interface/io/netif.h
> > @@ -113,6 +113,7 @@ struct xen_netif_tx_request {
> > #define XEN_NETIF_EXTRA_FLAG_MORE
> (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
> >
> > /* GSO types */
> > +#define XEN_NETIF_GSO_TYPE_NONE (0)
>
> Indentation.
>
> Wei.
end of thread, other threads: [~2013-10-11 14:03 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-10 15:25 [PATCH net-next v3 0/5] xen-netback: IPv6 offload support Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest Paul Durrant
2013-10-11 10:09 ` Wei Liu
2013-10-11 11:09 ` Paul Durrant
2013-10-11 12:08 ` [Xen-devel] " Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest Paul Durrant
2013-10-10 15:25 ` [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to " Paul Durrant
2013-10-11 10:09 ` Wei Liu
2013-10-11 11:11 ` Paul Durrant
2013-10-11 14:03 ` Paul Durrant