* [PATCH net-next 0/3] geneve: Add support for Remote Checksum Offload
@ 2015-12-08 18:27 Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 1/3] rco: Clean up casting errors Tom Herbert
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Tom Herbert @ 2015-12-08 18:27 UTC (permalink / raw)
To: davem, netdev, jesse, linville; +Cc: kernel-team
This patch set adds UDP checksum configuration via netlink and
Remote Checksum Offload for Geneve,
Testing (10Gbps mlx4):
Single connection TCP_STREAM in netperf
- No UDP checksums, no RCO
4371.9 Mbpos
- UDP checksums enabled, no RCO
7263.4 Mbps
- UDP checksums enabled, RCO enabled
7607.6 Mbps
200 TCP_RR streams
- No UDP checksums, no RCO
55.05% CPU utilization
879284.9 tps
184/231/742 50/90/99% latencies
- UDP checksums enabled, no RCO
55.46% CPU utilization
901785 tps
176/222/738 50/90/99% latencies
- UDP checksums enabled, RCO enabled
52.36% CPU utilization
910582 tps
174/218/706 50/90/99% latencies
Tom Herbert (3):
rco: Clean up casting errors
geneve: UDP checksum configuration via netlink
geneve: Remote Checksum Offload support
drivers/net/geneve.c | 249 ++++++++++++++++++++++++++++++++++++++-----
include/net/checksum.h | 3 +-
include/net/geneve.h | 22 +++-
include/uapi/linux/if_link.h | 6 ++
4 files changed, 246 insertions(+), 34 deletions(-)
--
2.4.6
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH net-next 1/3] rco: Clean up casting errors
2015-12-08 18:27 [PATCH net-next 0/3] geneve: Add support for Remote Checksum Offload Tom Herbert
@ 2015-12-08 18:27 ` Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
2 siblings, 0 replies; 13+ messages in thread
From: Tom Herbert @ 2015-12-08 18:27 UTC (permalink / raw)
To: davem, netdev, jesse, linville; +Cc: kernel-team
Fixe a couple of cast errors found by sparse.
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
include/net/checksum.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 9fcaedf..10a16b5 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -165,7 +165,8 @@ static inline __wsum remcsum_adjust(void *ptr, __wsum csum,
csum = csum_sub(csum, csum_partial(ptr, start, 0));
/* Set derived checksum in packet */
- delta = csum_sub(csum_fold(csum), *psum);
+ delta = csum_sub((__force __wsum)csum_fold(csum),
+ (__force __wsum)*psum);
*psum = csum_fold(csum);
return delta;
--
2.4.6
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink
2015-12-08 18:27 [PATCH net-next 0/3] geneve: Add support for Remote Checksum Offload Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 1/3] rco: Clean up casting errors Tom Herbert
@ 2015-12-08 18:27 ` Tom Herbert
2015-12-08 19:27 ` John W. Linville
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
2 siblings, 1 reply; 13+ messages in thread
From: Tom Herbert @ 2015-12-08 18:27 UTC (permalink / raw)
To: davem, netdev, jesse, linville; +Cc: kernel-team
Add support to enable and disable UDP checksums via netlink. This is
similar to how VXLAN and GUE allow this. This includes support for
enabling the UDP zero checksum (for both TX and RX).
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
drivers/net/geneve.c | 93 +++++++++++++++++++++++++++++++++-----------
include/uapi/linux/if_link.h | 3 ++
2 files changed, 73 insertions(+), 23 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index de5c30c..0750d7a 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -71,8 +71,14 @@ struct geneve_dev {
__be16 dst_port;
bool collect_md;
struct gro_cells gro_cells;
+ u32 flags;
};
+/* Geneve device flags */
+#define GENEVE_F_UDP_CSUM BIT(0)
+#define GENEVE_F_UDP_ZERO_CSUM6_TX BIT(1)
+#define GENEVE_F_UDP_ZERO_CSUM6_RX BIT(2)
+
struct geneve_sock {
bool collect_md;
struct list_head list;
@@ -81,6 +87,7 @@ struct geneve_sock {
int refcnt;
struct udp_offload udp_offloads;
struct hlist_head vni_list[VNI_HASH_SIZE];
+ u32 flags;
};
static inline __u32 geneve_net_vni_hash(u8 vni[3])
@@ -343,7 +350,7 @@ error:
}
static struct socket *geneve_create_sock(struct net *net, bool ipv6,
- __be16 port)
+ __be16 port, u32 flags)
{
struct socket *sock;
struct udp_port_cfg udp_conf;
@@ -354,6 +361,8 @@ static struct socket *geneve_create_sock(struct net *net, bool ipv6,
if (ipv6) {
udp_conf.family = AF_INET6;
udp_conf.ipv6_v6only = 1;
+ udp_conf.use_udp6_rx_checksums =
+ !(flags & GENEVE_F_UDP_ZERO_CSUM6_RX);
} else {
udp_conf.family = AF_INET;
udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
@@ -480,7 +489,7 @@ static int geneve_gro_complete(struct sk_buff *skb, int nhoff,
/* Create new listen socket if needed */
static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
- bool ipv6)
+ bool ipv6, u32 flags)
{
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_sock *gs;
@@ -492,7 +501,7 @@ static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
if (!gs)
return ERR_PTR(-ENOMEM);
- sock = geneve_create_sock(net, ipv6, port);
+ sock = geneve_create_sock(net, ipv6, port, flags);
if (IS_ERR(sock)) {
kfree(gs);
return ERR_CAST(sock);
@@ -575,12 +584,13 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
goto out;
}
- gs = geneve_socket_create(net, geneve->dst_port, ipv6);
+ gs = geneve_socket_create(net, geneve->dst_port, ipv6, geneve->flags);
if (IS_ERR(gs))
return PTR_ERR(gs);
out:
gs->collect_md = geneve->collect_md;
+ gs->flags = geneve->flags;
#if IS_ENABLED(CONFIG_IPV6)
if (ipv6)
geneve->sock6 = gs;
@@ -642,11 +652,12 @@ static void geneve_build_header(struct genevehdr *geneveh,
static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
__be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- bool csum, bool xnet)
+ u32 flags, bool xnet)
{
struct genevehdr *gnvh;
int min_headroom;
int err;
+ bool udp_sum = !!(flags & GENEVE_F_UDP_CSUM);
skb_scrub_packet(skb, xnet);
@@ -658,7 +669,7 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
goto free_rt;
}
- skb = udp_tunnel_handle_offloads(skb, csum);
+ skb = udp_tunnel_handle_offloads(skb, udp_sum);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
goto free_rt;
@@ -678,11 +689,12 @@ free_rt:
#if IS_ENABLED(CONFIG_IPV6)
static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
__be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- bool csum, bool xnet)
+ u32 flags, bool xnet)
{
struct genevehdr *gnvh;
int min_headroom;
int err;
+ bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX);
skb_scrub_packet(skb, xnet);
@@ -694,7 +706,7 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
goto free_dst;
}
- skb = udp_tunnel_handle_offloads(skb, csum);
+ skb = udp_tunnel_handle_offloads(skb, udp_sum);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
goto free_dst;
@@ -824,9 +836,9 @@ static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct flowi4 fl4;
__u8 tos, ttl;
__be16 sport;
- bool udp_csum;
__be16 df;
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+ u32 flags = geneve->flags;
if (geneve->collect_md) {
if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
@@ -857,9 +869,13 @@ static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (key->tun_flags & TUNNEL_GENEVE_OPT)
opts = ip_tunnel_info_opts(info);
- udp_csum = !!(key->tun_flags & TUNNEL_CSUM);
+ if (key->tun_flags & TUNNEL_CSUM)
+ flags |= GENEVE_F_UDP_CSUM;
+ else
+ flags &= ~GENEVE_F_UDP_CSUM;
+
err = geneve_build_skb(rt, skb, key->tun_flags, vni,
- info->options_len, opts, udp_csum, xnet);
+ info->options_len, opts, flags, xnet);
if (unlikely(err))
goto err;
@@ -867,9 +883,8 @@ static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
ttl = key->ttl;
df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
} else {
- udp_csum = false;
err = geneve_build_skb(rt, skb, 0, geneve->vni,
- 0, NULL, udp_csum, xnet);
+ 0, NULL, flags, xnet);
if (unlikely(err))
goto err;
@@ -883,7 +898,7 @@ static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
err = udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
tos, ttl, df, sport, geneve->dst_port,
!net_eq(geneve->net, dev_net(geneve->dev)),
- !udp_csum);
+ !(flags & GENEVE_F_UDP_CSUM));
iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
return NETDEV_TX_OK;
@@ -912,8 +927,8 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct flowi6 fl6;
__u8 prio, ttl;
__be16 sport;
- bool udp_csum;
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+ u32 flags = geneve->flags;
if (geneve->collect_md) {
if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
@@ -942,19 +957,22 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (key->tun_flags & TUNNEL_GENEVE_OPT)
opts = ip_tunnel_info_opts(info);
- udp_csum = !!(key->tun_flags & TUNNEL_CSUM);
+ if (key->tun_flags & TUNNEL_CSUM)
+ flags |= GENEVE_F_UDP_CSUM;
+ else
+ flags &= ~GENEVE_F_UDP_CSUM;
+
err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
info->options_len, opts,
- udp_csum, xnet);
+ flags, xnet);
if (unlikely(err))
goto err;
prio = ip_tunnel_ecn_encap(key->tos, iip, skb);
ttl = key->ttl;
} else {
- udp_csum = false;
err = geneve6_build_skb(dst, skb, 0, geneve->vni,
- 0, NULL, udp_csum, xnet);
+ 0, NULL, flags, xnet);
if (unlikely(err))
goto err;
@@ -966,7 +984,8 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
}
err = udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
&fl6.saddr, &fl6.daddr, prio, ttl,
- sport, geneve->dst_port, !udp_csum);
+ sport, geneve->dst_port,
+ !!(flags & GENEVE_F_UDP_ZERO_CSUM6_TX));
iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
return NETDEV_TX_OK;
@@ -1099,6 +1118,9 @@ static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
[IFLA_GENEVE_TOS] = { .type = NLA_U8 },
[IFLA_GENEVE_PORT] = { .type = NLA_U16 },
[IFLA_GENEVE_COLLECT_METADATA] = { .type = NLA_FLAG },
+ [IFLA_GENEVE_UDP_CSUM] = { .type = NLA_U8 },
+ [IFLA_GENEVE_UDP_ZERO_CSUM6_TX] = { .type = NLA_U8 },
+ [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NLA_U8 },
};
static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
@@ -1152,7 +1174,7 @@ static struct geneve_dev *geneve_find_dev(struct geneve_net *gn,
static int geneve_configure(struct net *net, struct net_device *dev,
union geneve_addr *remote,
__u32 vni, __u8 ttl, __u8 tos, __be16 dst_port,
- bool metadata)
+ bool metadata, u32 flags)
{
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_dev *t, *geneve = netdev_priv(dev);
@@ -1183,6 +1205,7 @@ static int geneve_configure(struct net *net, struct net_device *dev,
geneve->tos = tos;
geneve->dst_port = dst_port;
geneve->collect_md = metadata;
+ geneve->flags = flags;
t = geneve_find_dev(gn, dst_port, remote, geneve->vni,
&tun_on_same_port, &tun_collect_md);
@@ -1213,6 +1236,7 @@ static int geneve_newlink(struct net *net, struct net_device *dev,
bool metadata = false;
union geneve_addr remote = geneve_remote_unspec;
__u32 vni = 0;
+ u32 flags = 0;
if (data[IFLA_GENEVE_REMOTE] && data[IFLA_GENEVE_REMOTE6])
return -EINVAL;
@@ -1253,8 +1277,20 @@ static int geneve_newlink(struct net *net, struct net_device *dev,
if (data[IFLA_GENEVE_COLLECT_METADATA])
metadata = true;
+ if (data[IFLA_GENEVE_UDP_CSUM] &&
+ nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
+ flags |= GENEVE_F_UDP_CSUM;
+
+ if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] &&
+ nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
+ flags |= GENEVE_F_UDP_ZERO_CSUM6_TX;
+
+ if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] &&
+ nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
+ flags |= GENEVE_F_UDP_ZERO_CSUM6_RX;
+
return geneve_configure(net, dev, &remote, vni, ttl, tos, dst_port,
- metadata);
+ metadata, flags);
}
static void geneve_dellink(struct net_device *dev, struct list_head *head)
@@ -1273,6 +1309,9 @@ static size_t geneve_get_size(const struct net_device *dev)
nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_TOS */
nla_total_size(sizeof(__be16)) + /* IFLA_GENEVE_PORT */
nla_total_size(0) + /* IFLA_GENEVE_COLLECT_METADATA */
+ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_CSUM */
+ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_TX */
+ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_RX */
0;
}
@@ -1309,6 +1348,14 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
goto nla_put_failure;
}
+ if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
+ !!(geneve->flags & GENEVE_F_UDP_CSUM)) ||
+ nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
+ !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_TX)) ||
+ nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
+ !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)))
+ goto nla_put_failure;
+
return 0;
nla_put_failure:
@@ -1342,7 +1389,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
return dev;
err = geneve_configure(net, dev, &geneve_remote_unspec,
- 0, 0, 0, htons(dst_port), true);
+ 0, 0, 0, htons(dst_port), true, 0);
if (err) {
free_netdev(dev);
return ERR_PTR(err);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 5ad5737..2be1dd5 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -462,6 +462,9 @@ enum {
IFLA_GENEVE_PORT, /* destination port */
IFLA_GENEVE_COLLECT_METADATA,
IFLA_GENEVE_REMOTE6,
+ IFLA_GENEVE_UDP_CSUM,
+ IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
+ IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
__IFLA_GENEVE_MAX
};
#define IFLA_GENEVE_MAX (__IFLA_GENEVE_MAX - 1)
--
2.4.6
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 18:27 [PATCH net-next 0/3] geneve: Add support for Remote Checksum Offload Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 1/3] rco: Clean up casting errors Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink Tom Herbert
@ 2015-12-08 18:27 ` Tom Herbert
2015-12-08 19:31 ` John W. Linville
` (2 more replies)
2 siblings, 3 replies; 13+ messages in thread
From: Tom Herbert @ 2015-12-08 18:27 UTC (permalink / raw)
To: davem, netdev, jesse, linville; +Cc: kernel-team
Add support for remote checksum offload in both the normal and GRO
paths. netlink command are used to enable sending of the Remote
Checksum Data, and allow processing of it on receive. The Remote
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
drivers/net/geneve.c | 162 ++++++++++++++++++++++++++++++++++++++++---
include/net/geneve.h | 22 ++++--
include/uapi/linux/if_link.h | 3 +
3 files changed, 174 insertions(+), 13 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 0750d7a..68945a4 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -78,6 +78,9 @@ struct geneve_dev {
#define GENEVE_F_UDP_CSUM BIT(0)
#define GENEVE_F_UDP_ZERO_CSUM6_TX BIT(1)
#define GENEVE_F_UDP_ZERO_CSUM6_RX BIT(2)
+#define GENEVE_F_REMCSUM_TX BIT(3)
+#define GENEVE_F_REMCSUM_RX BIT(4)
+#define GENEVE_F_REMCSUM_NOPARTIAL BIT(5)
struct geneve_sock {
bool collect_md;
@@ -308,6 +311,33 @@ static void geneve_uninit(struct net_device *dev)
free_percpu(dev->tstats);
}
+static struct genevehdr *geneve_remcsum(struct sk_buff *skb,
+ struct genevehdr *gh,
+ size_t hdrlen, bool nopartial)
+{
+ size_t start, offset, plen;
+
+ if (skb->remcsum_offload)
+ return gh;
+
+ start = gh->rco_start << GENEVE_RCO_SHIFT;
+ offset = start + (gh->udp_rco ?
+ offsetof(struct udphdr, check) :
+ offsetof(struct tcphdr, check));
+
+ plen = hdrlen + offset + sizeof(u16);
+
+ if (!pskb_may_pull(skb, plen))
+ return NULL;
+
+ gh = (struct genevehdr *)(udp_hdr(skb) + 1);
+
+ skb_remcsum_process(skb, (void *)gh + hdrlen, start, offset,
+ nopartial);
+
+ return gh;
+}
+
/* Callback from net/ipv4/udp.c to receive packets */
static int geneve_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
{
@@ -336,6 +366,15 @@ static int geneve_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
if (!gs)
goto drop;
+ if (geneveh->rco && (gs->flags & GENEVE_F_REMCSUM_RX)) {
+ geneveh = geneve_remcsum(skb, geneveh,
+ sizeof(geneveh) + opts_len,
+ !!(gs->flags &
+ GENEVE_F_REMCSUM_NOPARTIAL));
+ if (unlikely(!geneveh))
+ goto drop;
+ }
+
geneve_rx(gs, skb);
return 0;
@@ -397,6 +436,32 @@ static int geneve_hlen(struct genevehdr *gh)
return sizeof(*gh) + gh->opt_len * 4;
}
+static struct genevehdr *geneve_gro_remcsum(struct sk_buff *skb,
+ unsigned int off,
+ struct genevehdr *gh, size_t hdrlen,
+ struct gro_remcsum *grc,
+ bool nopartial)
+{
+ size_t start, offset;
+
+ if (skb->remcsum_offload)
+ return gh;
+
+ if (!NAPI_GRO_CB(skb)->csum_valid)
+ return NULL;
+
+ start = gh->rco_start << GENEVE_RCO_SHIFT;
+ offset = start + (gh->udp_rco ?
+ offsetof(struct udphdr, check) :
+ offsetof(struct tcphdr, check));
+
+ gh = skb_gro_remcsum_process(skb, (void *)gh, off, hdrlen,
+ start, offset, grc, nopartial);
+
+ skb->remcsum_offload = 1;
+
+ return gh;
+}
static struct sk_buff **geneve_gro_receive(struct sk_buff **head,
struct sk_buff *skb,
struct udp_offload *uoff)
@@ -407,6 +472,11 @@ static struct sk_buff **geneve_gro_receive(struct sk_buff **head,
const struct packet_offload *ptype;
__be16 type;
int flush = 1;
+ struct gro_remcsum grc;
+ struct geneve_sock *gs = container_of(uoff, struct geneve_sock,
+ udp_offloads);
+
+ skb_gro_remcsum_init(&grc);
off_gnv = skb_gro_offset(skb);
hlen = off_gnv + sizeof(*gh);
@@ -421,6 +491,16 @@ static struct sk_buff **geneve_gro_receive(struct sk_buff **head,
goto out;
gh_len = geneve_hlen(gh);
+ skb_gro_postpull_rcsum(skb, gh, gh_len);
+
+ if (gh->rco && (gs->flags & GENEVE_F_REMCSUM_RX)) {
+ gh = geneve_gro_remcsum(skb, off_gnv, gh, gh_len, &grc,
+ !!(gs->flags &
+ GENEVE_F_REMCSUM_NOPARTIAL));
+ if (unlikely(!gh))
+ goto out;
+ }
+
hlen = off_gnv + gh_len;
if (skb_gro_header_hard(skb, hlen)) {
gh = skb_gro_header_slow(skb, hlen, off_gnv);
@@ -452,7 +532,6 @@ static struct sk_buff **geneve_gro_receive(struct sk_buff **head,
}
skb_gro_pull(skb, gh_len);
- skb_gro_postpull_rcsum(skb, gh, gh_len);
pp = ptype->callbacks.gro_receive(head, skb);
out_unlock:
@@ -636,7 +715,8 @@ static int geneve_stop(struct net_device *dev)
static void geneve_build_header(struct genevehdr *geneveh,
__be16 tun_flags, u8 vni[3],
- u8 options_len, u8 *options)
+ u8 options_len, u8 *options,
+ int type, struct sk_buff *skb)
{
geneveh->ver = GENEVE_VER;
geneveh->opt_len = options_len / 4;
@@ -645,7 +725,25 @@ static void geneve_build_header(struct genevehdr *geneveh,
geneveh->rsvd1 = 0;
memcpy(geneveh->vni, vni, 3);
geneveh->proto_type = htons(ETH_P_TEB);
- geneveh->rsvd2 = 0;
+
+ if (type & SKB_GSO_TUNNEL_REMCSUM) {
+ size_t hdrlen = sizeof(*geneveh) + options_len;
+
+ geneveh->rco = 1;
+ geneveh->rco_start = (skb_checksum_start_offset(skb) -
+ hdrlen) >> GENEVE_RCO_SHIFT;
+
+ if (skb->csum_offset == offsetof(struct udphdr, check))
+ geneveh->udp_rco = 1;
+
+ if (!skb_is_gso(skb)) {
+ skb->ip_summed = CHECKSUM_NONE;
+ skb->encapsulation = 0;
+ }
+ } else {
+ geneveh->udp_rco = 0;
+ geneveh->rco_start = 0;
+ }
memcpy(geneveh->options, options, options_len);
}
@@ -658,6 +756,20 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
int min_headroom;
int err;
bool udp_sum = !!(flags & GENEVE_F_UDP_CSUM);
+ int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
+
+ if ((flags & GENEVE_F_REMCSUM_TX) &&
+ skb->ip_summed == CHECKSUM_PARTIAL) {
+ int csum_start = skb_checksum_start_offset(skb);
+
+ if (csum_start <= GENEVE_MAX_REMCSUM_START &&
+ !(csum_start & GENEVE_RCO_SHIFT_MASK) &&
+ (skb->csum_offset == offsetof(struct udphdr, check) ||
+ skb->csum_offset == offsetof(struct tcphdr, check))) {
+ udp_sum = false;
+ type |= SKB_GSO_TUNNEL_REMCSUM;
+ }
+ }
skb_scrub_packet(skb, xnet);
@@ -669,14 +781,14 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
goto free_rt;
}
- skb = udp_tunnel_handle_offloads(skb, udp_sum);
+ skb = iptunnel_handle_offloads(skb, udp_sum, type);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
goto free_rt;
}
gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
- geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
+ geneve_build_header(gnvh, tun_flags, vni, opt_len, opt, type, skb);
skb_set_inner_protocol(skb, htons(ETH_P_TEB));
return 0;
@@ -695,6 +807,20 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
int min_headroom;
int err;
bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX);
+ int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
+
+ if ((flags & GENEVE_F_REMCSUM_TX) &&
+ skb->ip_summed == CHECKSUM_PARTIAL) {
+ int csum_start = skb_checksum_start_offset(skb);
+
+ if (csum_start <= GENEVE_MAX_REMCSUM_START &&
+ !(csum_start & GENEVE_RCO_SHIFT_MASK) &&
+ (skb->csum_offset == offsetof(struct udphdr, check) ||
+ skb->csum_offset == offsetof(struct tcphdr, check))) {
+ udp_sum = false;
+ type |= SKB_GSO_TUNNEL_REMCSUM;
+ }
+ }
skb_scrub_packet(skb, xnet);
@@ -706,14 +832,14 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
goto free_dst;
}
- skb = udp_tunnel_handle_offloads(skb, udp_sum);
+ skb = iptunnel_handle_offloads(skb, udp_sum, type);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
goto free_dst;
}
gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
- geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
+ geneve_build_header(gnvh, tun_flags, vni, opt_len, opt, type, skb);
skb_set_inner_protocol(skb, htons(ETH_P_TEB));
return 0;
@@ -1121,6 +1247,9 @@ static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
[IFLA_GENEVE_UDP_CSUM] = { .type = NLA_U8 },
[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] = { .type = NLA_U8 },
[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NLA_U8 },
+ [IFLA_GENEVE_REMCSUM_TX] = { .type = NLA_U8 },
+ [IFLA_GENEVE_REMCSUM_RX] = { .type = NLA_U8 },
+ [IFLA_GENEVE_REMCSUM_NOPARTIAL] = { .type = NLA_FLAG },
};
static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
@@ -1289,6 +1418,17 @@ static int geneve_newlink(struct net *net, struct net_device *dev,
nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
flags |= GENEVE_F_UDP_ZERO_CSUM6_RX;
+ if (data[IFLA_GENEVE_REMCSUM_TX] &&
+ nla_get_u8(data[IFLA_GENEVE_REMCSUM_TX]))
+ flags |= GENEVE_F_REMCSUM_TX;
+
+ if (data[IFLA_GENEVE_REMCSUM_RX] &&
+ nla_get_u8(data[IFLA_GENEVE_REMCSUM_RX]))
+ flags |= GENEVE_F_REMCSUM_RX;
+
+ if (data[IFLA_GENEVE_REMCSUM_NOPARTIAL])
+ flags |= GENEVE_F_REMCSUM_NOPARTIAL;
+
return geneve_configure(net, dev, &remote, vni, ttl, tos, dst_port,
metadata, flags);
}
@@ -1312,6 +1452,8 @@ static size_t geneve_get_size(const struct net_device *dev)
nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_CSUM */
nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_TX */
nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_RX */
+ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_REMCSUM_TX */
+ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_REMCSUM_RX */
0;
}
@@ -1353,7 +1495,11 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
!!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_TX)) ||
nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
- !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)))
+ !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)) ||
+ nla_put_u8(skb, IFLA_GENEVE_REMCSUM_TX,
+ !!(geneve->flags & GENEVE_F_REMCSUM_TX)) ||
+ nla_put_u8(skb, IFLA_GENEVE_REMCSUM_RX,
+ !!(geneve->flags & GENEVE_F_REMCSUM_RX)))
goto nla_put_failure;
return 0;
diff --git a/include/net/geneve.h b/include/net/geneve.h
index 3106ed6..b59ad0a 100644
--- a/include/net/geneve.h
+++ b/include/net/geneve.h
@@ -8,9 +8,9 @@
/* Geneve Header:
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- * |Ver| Opt Len |O|C| Rsvd. | Protocol Type |
+ * |Ver| Opt Len |O|C|X| Rsvd. | Protocol Type |
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- * | Virtual Network Identifier (VNI) | Reserved |
+ * | Virtual Network Identifier (VNI) |U| Csum start |
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
* | Variable Length Options |
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
@@ -46,7 +46,8 @@ struct genevehdr {
#ifdef __LITTLE_ENDIAN_BITFIELD
u8 opt_len:6;
u8 ver:2;
- u8 rsvd1:6;
+ u8 rsvd1:5;
+ u8 rco:1;
u8 critical:1;
u8 oam:1;
#else
@@ -54,14 +55,25 @@ struct genevehdr {
u8 opt_len:6;
u8 oam:1;
u8 critical:1;
- u8 rsvd1:6;
+ u8 rco:1;
+ u8 rsvd1:5;
#endif
__be16 proto_type;
u8 vni[3];
- u8 rsvd2;
+#ifdef __LITTLE_ENDIAN_BITFIELD
+ u8 rco_start : 7;
+ u8 udp_rco : 1;
+#else
+ u8 udp_rco : 1;
+ u8 rco_start : 7;
+#endif
struct geneve_opt options[];
};
+#define GENEVE_RCO_SHIFT 1 /* Left shift of start */
+#define GENEVE_RCO_SHIFT_MASK ((1 << GENEVE_RCO_SHIFT) - 1)
+#define GENEVE_MAX_REMCSUM_START (0x7f << GENEVE_RCO_SHIFT)
+
#ifdef CONFIG_INET
struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
u8 name_assign_type, u16 dst_port);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 2be1dd5..998eff0 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -465,6 +465,9 @@ enum {
IFLA_GENEVE_UDP_CSUM,
IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
+ IFLA_GENEVE_REMCSUM_RX,
+ IFLA_GENEVE_REMCSUM_TX,
+ IFLA_GENEVE_REMCSUM_NOPARTIAL,
__IFLA_GENEVE_MAX
};
#define IFLA_GENEVE_MAX (__IFLA_GENEVE_MAX - 1)
--
2.4.6
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink
2015-12-08 18:27 ` [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink Tom Herbert
@ 2015-12-08 19:27 ` John W. Linville
0 siblings, 0 replies; 13+ messages in thread
From: John W. Linville @ 2015-12-08 19:27 UTC (permalink / raw)
To: Tom Herbert; +Cc: davem, netdev, jesse, kernel-team
On Tue, Dec 08, 2015 at 10:27:17AM -0800, Tom Herbert wrote:
> Add support to enable and disable UDP checksums via netlink. This is
> similar to how VXLAN and GUE allow this. This includes support for
> enabling the UDP zero checksum (for both TX and RX).
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
Now I can drop this from my todo list... :-)
Acked-by: John W. Linville <linville@tuxdriver.com>
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
@ 2015-12-08 19:31 ` John W. Linville
2015-12-08 19:59 ` Tom Herbert
2015-12-08 20:45 ` kbuild test robot
2015-12-08 20:45 ` [PATCH] geneve: fix noderef.cocci warnings kbuild test robot
2 siblings, 1 reply; 13+ messages in thread
From: John W. Linville @ 2015-12-08 19:31 UTC (permalink / raw)
To: Tom Herbert; +Cc: davem, netdev, jesse, kernel-team
On Tue, Dec 08, 2015 at 10:27:18AM -0800, Tom Herbert wrote:
> Add support for remote checksum offload in both the normal and GRO
> paths. netlink command are used to enable sending of the Remote
> Checksum Data, and allow processing of it on receive. The Remote
afsO&^(*&5^a+++NO CARRIER
Did you mean to finish that sentence? ;-)
> Signed-off-by: Tom Herbert <tom@herbertland.com>
Jesse is going to have to comment on your (ab)use of the reserved
fields. I presume that an RFC would be forthcoming?
> diff --git a/include/net/geneve.h b/include/net/geneve.h
> index 3106ed6..b59ad0a 100644
> --- a/include/net/geneve.h
> +++ b/include/net/geneve.h
> @@ -8,9 +8,9 @@
>
> /* Geneve Header:
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> - * |Ver| Opt Len |O|C| Rsvd. | Protocol Type |
> + * |Ver| Opt Len |O|C|X| Rsvd. | Protocol Type |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> - * | Virtual Network Identifier (VNI) | Reserved |
> + * | Virtual Network Identifier (VNI) |U| Csum start |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * | Variable Length Options |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Otherwise, LGTM -- I'm really happy to see Geneve get this attention!
Acked-by: John W. Linville <linville@tuxdriver.com>
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 19:31 ` John W. Linville
@ 2015-12-08 19:59 ` Tom Herbert
2015-12-08 23:58 ` Jesse Gross
0 siblings, 1 reply; 13+ messages in thread
From: Tom Herbert @ 2015-12-08 19:59 UTC (permalink / raw)
To: John W. Linville
Cc: David S. Miller, Linux Kernel Network Developers, Jesse Gross,
Kernel Team
On Tue, Dec 8, 2015 at 11:31 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> On Tue, Dec 08, 2015 at 10:27:18AM -0800, Tom Herbert wrote:
>> Add support for remote checksum offload in both the normal and GRO
>> paths. netlink command are used to enable sending of the Remote
>> Checksum Data, and allow processing of it on receive. The Remote
>
Extraneous.
> afsO&^(*&5^a+++NO CARRIER
>
> Did you mean to finish that sentence? ;-)
>
>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>
> Jesse is going to have to comment on your (ab)use of the reserved
> fields. I presume that an RFC would be forthcoming?
>
I'll spin an I-D once there is agreement on the approach. This is
identical to VXLAN excepted the bit flag indicating csum_start is
present might be different.
https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
>> diff --git a/include/net/geneve.h b/include/net/geneve.h
>> index 3106ed6..b59ad0a 100644
>> --- a/include/net/geneve.h
>> +++ b/include/net/geneve.h
>> @@ -8,9 +8,9 @@
>>
>> /* Geneve Header:
>> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> - * |Ver| Opt Len |O|C| Rsvd. | Protocol Type |
>> + * |Ver| Opt Len |O|C|X| Rsvd. | Protocol Type |
>> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> - * | Virtual Network Identifier (VNI) | Reserved |
>> + * | Virtual Network Identifier (VNI) |U| Csum start |
>> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> * | Variable Length Options |
>> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
> Otherwise, LGTM -- I'm really happy to see Geneve get this attention!
>
> Acked-by: John W. Linville <linville@tuxdriver.com>
>
> --
> John W. Linville Someday the world will need a hero, and you
> linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
2015-12-08 19:31 ` John W. Linville
@ 2015-12-08 20:45 ` kbuild test robot
2015-12-08 20:45 ` [PATCH] geneve: fix noderef.cocci warnings kbuild test robot
2 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2015-12-08 20:45 UTC (permalink / raw)
To: Tom Herbert; +Cc: kbuild-all, davem, netdev, jesse, linville, kernel-team
Hi Tom,
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Tom-Herbert/rco-Clean-up-casting-errors/20151209-022902
coccinelle warnings: (new ones prefixed by >>)
>> drivers/net/geneve.c:371:6-12: ERROR: application of sizeof to pointer
Please review and possibly fold the followup patch.
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] geneve: fix noderef.cocci warnings
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
2015-12-08 19:31 ` John W. Linville
2015-12-08 20:45 ` kbuild test robot
@ 2015-12-08 20:45 ` kbuild test robot
2 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2015-12-08 20:45 UTC (permalink / raw)
To: Tom Herbert; +Cc: kbuild-all, davem, netdev, jesse, linville, kernel-team
drivers/net/geneve.c:371:6-12: ERROR: application of sizeof to pointer
sizeof when applied to a pointer typed expression gives the size of
the pointer
Generated by: scripts/coccinelle/misc/noderef.cocci
CC: Tom Herbert <tom@herbertland.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
geneve.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -368,7 +368,7 @@ static int geneve_udp_encap_recv(struct
if (geneveh->rco && (gs->flags & GENEVE_F_REMCSUM_RX)) {
geneveh = geneve_remcsum(skb, geneveh,
- sizeof(geneveh) + opts_len,
+ sizeof(*geneveh) + opts_len,
!!(gs->flags &
GENEVE_F_REMCSUM_NOPARTIAL));
if (unlikely(!geneveh))
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 19:59 ` Tom Herbert
@ 2015-12-08 23:58 ` Jesse Gross
2015-12-09 0:11 ` Tom Herbert
0 siblings, 1 reply; 13+ messages in thread
From: Jesse Gross @ 2015-12-08 23:58 UTC (permalink / raw)
To: Tom Herbert
Cc: John W. Linville, David S. Miller,
Linux Kernel Network Developers, Jesse Gross, Kernel Team
On Tue, Dec 8, 2015 at 11:59 AM, Tom Herbert <tom@herbertland.com> wrote:
> On Tue, Dec 8, 2015 at 11:31 AM, John W. Linville
> <linville@tuxdriver.com> wrote:
>> On Tue, Dec 08, 2015 at 10:27:18AM -0800, Tom Herbert wrote:
>>> Add support for remote checksum offload in both the normal and GRO
>>> paths. netlink command are used to enable sending of the Remote
>>> Checksum Data, and allow processing of it on receive. The Remote
>>
> Extraneous.
>
>> afsO&^(*&5^a+++NO CARRIER
>>
>> Did you mean to finish that sentence? ;-)
>>
>>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>>
>> Jesse is going to have to comment on your (ab)use of the reserved
>> fields. I presume that an RFC would be forthcoming?
>>
> I'll spin an I-D once there is agreement on the approach. This is
> identical to VXLAN excepted the bit flag indicating csum_start is
> present might be different.
Thanks for doing this - it was also on my to-do list.
We can definitely do better than VXLAN here since there is
extensibility built into the protocol. I think the right way to do
this is to define a TLV option and then use the same format as what
you defined for GUE (and it is also nice to reuse some of the option
formats to minimize differences across protocols).
I'd like to set up an IANA registry for classes but that's a little
slow going at this point, so for the next draft update I'm planning on
adding some initial assignments, including a Linux class. We could
then allocate remote checksum out of that. The new version of the
draft should be coming out pretty soon.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-08 23:58 ` Jesse Gross
@ 2015-12-09 0:11 ` Tom Herbert
2015-12-09 1:13 ` Jesse Gross
2015-12-09 2:20 ` David Miller
0 siblings, 2 replies; 13+ messages in thread
From: Tom Herbert @ 2015-12-09 0:11 UTC (permalink / raw)
To: Jesse Gross
Cc: John W. Linville, David S. Miller,
Linux Kernel Network Developers, Jesse Gross, Kernel Team
On Tue, Dec 8, 2015 at 3:58 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Tue, Dec 8, 2015 at 11:59 AM, Tom Herbert <tom@herbertland.com> wrote:
>> On Tue, Dec 8, 2015 at 11:31 AM, John W. Linville
>> <linville@tuxdriver.com> wrote:
>>> On Tue, Dec 08, 2015 at 10:27:18AM -0800, Tom Herbert wrote:
>>>> Add support for remote checksum offload in both the normal and GRO
>>>> paths. netlink command are used to enable sending of the Remote
>>>> Checksum Data, and allow processing of it on receive. The Remote
>>>
>> Extraneous.
>>
>>> afsO&^(*&5^a+++NO CARRIER
>>>
>>> Did you mean to finish that sentence? ;-)
>>>
>>>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>>>
>>> Jesse is going to have to comment on your (ab)use of the reserved
>>> fields. I presume that an RFC would be forthcoming?
>>>
>> I'll spin an I-D once there is agreement on the approach. This is
>> identical to VXLAN excepted the bit flag indicating csum_start is
>> present might be different.
>
> Thanks for doing this - it was also on my to-do list.
>
> We can definitely do better than VXLAN here since there is
> extensibility built into the protocol. I think the right way to do
> this is to define a TLV option and then use the same format as what
> you defined for GUE (and it is also nice to reuse some of the option
> formats to minimize differences across protocols).
>
> I'd like to set up an IANA registry for classes but that's a little
> slow going at this point, so for the next draft update I'm planning on
> adding some initial assignments, including a Linux class. We could
> then allocate remote checksum out of that. The new version of the
> draft should be coming out pretty soon.
I thought about a TLV implementation but there is no base support in
the kernel for it. It would be great if you could implement the TLV
loop. Honestly, though, I'm still very leery of the processing
overhead associated with TLVs in the critical data path. In the worst
case of RCO we would need to fish for the RCO option though the TLV
list in both the GRO and normal path for each packet. If the TLV lists
are long then performance may be negatively affected mitigating the
value of RCO (needs to be measured).
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-09 0:11 ` Tom Herbert
@ 2015-12-09 1:13 ` Jesse Gross
2015-12-09 2:20 ` David Miller
1 sibling, 0 replies; 13+ messages in thread
From: Jesse Gross @ 2015-12-09 1:13 UTC (permalink / raw)
To: Tom Herbert
Cc: John W. Linville, David S. Miller,
Linux Kernel Network Developers, Jesse Gross, Kernel Team
On Tue, Dec 8, 2015 at 4:11 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Tue, Dec 8, 2015 at 3:58 PM, Jesse Gross <jesse@kernel.org> wrote:
>> On Tue, Dec 8, 2015 at 11:59 AM, Tom Herbert <tom@herbertland.com> wrote:
>>> On Tue, Dec 8, 2015 at 11:31 AM, John W. Linville
>>>> Jesse is going to have to comment on your (ab)use of the reserved
>>>> fields. I presume that an RFC would be forthcoming?
>>>>
>>> I'll spin an I-D once there is agreement on the approach. This is
>>> identical to VXLAN excepted the bit flag indicating csum_start is
>>> present might be different.
>>
>> Thanks for doing this - it was also on my to-do list.
>>
>> We can definitely do better than VXLAN here since there is
>> extensibility built into the protocol. I think the right way to do
>> this is to define a TLV option and then use the same format as what
>> you defined for GUE (and it is also nice to reuse some of the option
>> formats to minimize differences across protocols).
>>
>> I'd like to set up an IANA registry for classes but that's a little
>> slow going at this point, so for the next draft update I'm planning on
>> adding some initial assignments, including a Linux class. We could
>> then allocate remote checksum out of that. The new version of the
>> draft should be coming out pretty soon.
>
> I thought about a TLV implementation but there is no base support in
> the kernel for it. It would be great if you could implement the TLV
> loop. Honestly, though, I'm still very leery of the processing
> overhead associated with TLVs in the critical data path. In the worst
> case of RCO we would need to fish for the RCO option though the TLV
> list in both the GRO and normal path for each packet. If the TLV lists
> are long then performance may be negatively affected mitigating the
> value of RCO (needs to be measured).
There is support for TLVs but only on a per-flow basis. In that case,
TLVs are cheap or at least scale with the amount of metadata that is
in use. I'm planning on implementing support for per-packet TLVs but
since that can't be done completely generically, I need the
registry/draft update that I mentioned before.
In practice, I don't expect the number of options for a single packet
to be particularly large, so the cost of going through the list isn't
high. The main source of variation for TLVs is from new ideas, various
control planes, and differences for control/data/OAM/etc. packets.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next 3/3] geneve: Remote Checksum Offload support
2015-12-09 0:11 ` Tom Herbert
2015-12-09 1:13 ` Jesse Gross
@ 2015-12-09 2:20 ` David Miller
1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2015-12-09 2:20 UTC (permalink / raw)
To: tom; +Cc: jesse, linville, netdev, jesse, kernel-team
From: Tom Herbert <tom@herbertland.com>
Date: Tue, 8 Dec 2015 16:11:56 -0800
> I thought about a TLV implementation but there is no base support in
> the kernel for it. It would be great if you could implement the TLV
> loop. Honestly, though, I'm still very leery of the processing
> overhead associated with TLVs in the critical data path. In the
> worst case of RCO we would need to fish for the RCO option though
> the TLV list in both the GRO and normal path for each packet. If the
> TLV lists are long then performance may be negatively affected
> mitigating the value of RCO (needs to be measured).
Open ended TLVs are to me likened to an attack vector waiting to
happen, for the reasons Tom listed.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2015-12-09 2:20 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-08 18:27 [PATCH net-next 0/3] geneve: Add support for Remote Checksum Offload Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 1/3] rco: Clean up casting errors Tom Herbert
2015-12-08 18:27 ` [PATCH net-next 2/3] geneve: UDP checksum configuration via netlink Tom Herbert
2015-12-08 19:27 ` John W. Linville
2015-12-08 18:27 ` [PATCH net-next 3/3] geneve: Remote Checksum Offload support Tom Herbert
2015-12-08 19:31 ` John W. Linville
2015-12-08 19:59 ` Tom Herbert
2015-12-08 23:58 ` Jesse Gross
2015-12-09 0:11 ` Tom Herbert
2015-12-09 1:13 ` Jesse Gross
2015-12-09 2:20 ` David Miller
2015-12-08 20:45 ` kbuild test robot
2015-12-08 20:45 ` [PATCH] geneve: fix noderef.cocci warnings kbuild test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).