* [PATCH net-next 0/3] Conntrack GRE offload
@ 2022-02-03 11:59 Toshiaki Makita
2022-02-03 11:59 ` [PATCH net-next 1/3] netfilter: flowtable: Support GRE Toshiaki Makita
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Toshiaki Makita @ 2022-02-03 11:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim,
Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik,
Florian Westphal
Cc: Toshiaki Makita, netdev, netfilter-devel, coreteam, Paul Blakey
Conntrack offload currently only supports TCP and UDP.
Thus TC/nftables/OVS cannot offload GRE packets.
However, GRE is widely used so some users create gre devices in VMs,
and in that case host OVS forwards GRE packets from/to VMs.
In order to offload GRE packets in OVS with stateful firewall support,
we need act_ct GRE offload support.
This patch set adds GRE offload support for act_ct and mlx5 conntrack.
Currently only GREv0 and no NAT support.
- Patch 1: flow_offload/flowtable GRE support.
- Patch 2: act_ct GRE offload support.
- Patch 3: mlx5 conntrack GRE offload support.
Tested with ConnectX-6 Dx 100G NIC and netperf TCP_STREAM.
+------------------------------------+
| +-----------+
| |(namespace)|
+---------+ | | netserver |
| | wire +----+ tc +--------+ +-------+ |
| netperf |-------->|mlx5|------>|mlx5 rep|--|mlx5 vf| |
| | +----+ +--------+ +-------+---+
+---------+ +------------------------------------+
- No offload (TC skip_hw): 8.5 Gbps
- Offload (act_ct) : 22 Gbps
Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Toshiaki Makita (3):
netfilter: flowtable: Support GRE
act_ct: Support GRE offload
net/mlx5: Support GRE conntrack offload
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 21 +++--
net/netfilter/nf_flow_table_core.c | 10 +-
net/netfilter/nf_flow_table_ip.c | 54 +++++++++--
net/netfilter/nf_flow_table_offload.c | 19 ++--
net/netfilter/nft_flow_offload.c | 13 +++
net/sched/act_ct.c | 101 ++++++++++++++++-----
6 files changed, 171 insertions(+), 47 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-03 11:59 [PATCH net-next 0/3] Conntrack GRE offload Toshiaki Makita @ 2022-02-03 11:59 ` Toshiaki Makita 2022-02-07 17:56 ` Pablo Neira Ayuso 2022-02-03 11:59 ` [PATCH net-next 2/3] act_ct: Support GRE offload Toshiaki Makita 2022-02-03 11:59 ` [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload Toshiaki Makita 2 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-03 11:59 UTC (permalink / raw) To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal Cc: Toshiaki Makita, netdev, netfilter-devel, coreteam, Paul Blakey Support GREv0 without NAT. Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> --- net/netfilter/nf_flow_table_core.c | 10 +++++-- net/netfilter/nf_flow_table_ip.c | 54 ++++++++++++++++++++++++++++------- net/netfilter/nf_flow_table_offload.c | 19 +++++++----- net/netfilter/nft_flow_offload.c | 13 +++++++++ 4 files changed, 77 insertions(+), 19 deletions(-) diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index b90eca7..e66a375 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -39,8 +39,14 @@ ft->l3proto = ctt->src.l3num; ft->l4proto = ctt->dst.protonum; - ft->src_port = ctt->src.u.tcp.port; - ft->dst_port = ctt->dst.u.tcp.port; + + switch (ctt->dst.protonum) { + case IPPROTO_TCP: + case IPPROTO_UDP: + ft->src_port = ctt->src.u.tcp.port; + ft->dst_port = ctt->dst.u.tcp.port; + break; + } } struct flow_offload *flow_offload_alloc(struct nf_conn *ct) diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index 889cf88..48e2f58 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -172,6 +172,7 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, struct flow_ports *ports; unsigned int thoff; struct iphdr *iph; + u8 ipproto; if (!pskb_may_pull(skb, sizeof(*iph) + offset)) return -1; @@ -185,13 +186,19 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, thoff += offset; - switch (iph->protocol) { + ipproto = iph->protocol; + switch (ipproto) { case IPPROTO_TCP: *hdrsize = sizeof(struct tcphdr); break; case IPPROTO_UDP: *hdrsize = sizeof(struct udphdr); break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: + *hdrsize = sizeof(struct gre_base_hdr); + break; +#endif default: return -1; } @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, if (!pskb_may_pull(skb, thoff + *hdrsize)) return -1; + if (ipproto == IPPROTO_GRE) { + struct gre_base_hdr *greh; + + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) + return -1; + } + iph = (struct iphdr *)(skb_network_header(skb) + offset); - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); tuple->src_v4.s_addr = iph->saddr; tuple->dst_v4.s_addr = iph->daddr; - tuple->src_port = ports->source; - tuple->dst_port = ports->dest; + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP) { + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + tuple->src_port = ports->source; + tuple->dst_port = ports->dest; + } tuple->l3proto = AF_INET; - tuple->l4proto = iph->protocol; + tuple->l4proto = ipproto; tuple->iifidx = dev->ifindex; nf_flow_tuple_encap(skb, tuple); @@ -521,6 +538,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, struct flow_ports *ports; struct ipv6hdr *ip6h; unsigned int thoff; + u8 nexthdr; thoff = sizeof(*ip6h) + offset; if (!pskb_may_pull(skb, thoff)) @@ -528,13 +546,19 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); - switch (ip6h->nexthdr) { + nexthdr = ip6h->nexthdr; + switch (nexthdr) { case IPPROTO_TCP: *hdrsize = sizeof(struct tcphdr); break; case IPPROTO_UDP: *hdrsize = sizeof(struct udphdr); break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: + *hdrsize = sizeof(struct gre_base_hdr); + break; +#endif default: return -1; } @@ -545,15 +569,25 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, if (!pskb_may_pull(skb, thoff + *hdrsize)) return -1; + if (nexthdr == IPPROTO_GRE) { + struct gre_base_hdr *greh; + + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) + return -1; + } + ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); tuple->src_v6 = ip6h->saddr; tuple->dst_v6 = ip6h->daddr; - tuple->src_port = ports->source; - tuple->dst_port = ports->dest; + if (nexthdr == IPPROTO_TCP || nexthdr == IPPROTO_UDP) { + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + tuple->src_port = ports->source; + tuple->dst_port = ports->dest; + } tuple->l3proto = AF_INET6; - tuple->l4proto = ip6h->nexthdr; + tuple->l4proto = nexthdr; tuple->iifidx = dev->ifindex; nf_flow_tuple_encap(skb, tuple); diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c index b561e0a..9b81080 100644 --- a/net/netfilter/nf_flow_table_offload.c +++ b/net/netfilter/nf_flow_table_offload.c @@ -170,6 +170,7 @@ static int nf_flow_rule_match(struct nf_flow_match *match, match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_TCP); break; case IPPROTO_UDP: + case IPPROTO_GRE: break; default: return -EOPNOTSUPP; @@ -178,15 +179,19 @@ static int nf_flow_rule_match(struct nf_flow_match *match, key->basic.ip_proto = tuple->l4proto; mask->basic.ip_proto = 0xff; - key->tp.src = tuple->src_port; - mask->tp.src = 0xffff; - key->tp.dst = tuple->dst_port; - mask->tp.dst = 0xffff; - match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_META) | BIT(FLOW_DISSECTOR_KEY_CONTROL) | - BIT(FLOW_DISSECTOR_KEY_BASIC) | - BIT(FLOW_DISSECTOR_KEY_PORTS); + BIT(FLOW_DISSECTOR_KEY_BASIC); + + if (tuple->l4proto == IPPROTO_TCP || tuple->l4proto == IPPROTO_UDP) { + key->tp.src = tuple->src_port; + mask->tp.src = 0xffff; + key->tp.dst = tuple->dst_port; + mask->tp.dst = 0xffff; + + match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_PORTS); + } + return 0; } diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index 0af34ad..731b5d8 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -298,6 +298,19 @@ static void nft_flow_offload_eval(const struct nft_expr *expr, break; case IPPROTO_UDP: break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: { + struct nf_conntrack_tuple *tuple; + + if (ct->status & IPS_NAT_MASK) + goto out; + tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; + /* No support for GRE v1 */ + if (tuple->src.u.gre.key || tuple->dst.u.gre.key) + goto out; + break; + } +#endif default: goto out; } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-03 11:59 ` [PATCH net-next 1/3] netfilter: flowtable: Support GRE Toshiaki Makita @ 2022-02-07 17:56 ` Pablo Neira Ayuso 2022-02-08 14:30 ` Toshiaki Makita 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2022-02-07 17:56 UTC (permalink / raw) To: Toshiaki Makita Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: > Support GREv0 without NAT. > > Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> > --- > net/netfilter/nf_flow_table_core.c | 10 +++++-- > net/netfilter/nf_flow_table_ip.c | 54 ++++++++++++++++++++++++++++------- > net/netfilter/nf_flow_table_offload.c | 19 +++++++----- > net/netfilter/nft_flow_offload.c | 13 +++++++++ > 4 files changed, 77 insertions(+), 19 deletions(-) > > diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c > index b90eca7..e66a375 100644 > --- a/net/netfilter/nf_flow_table_core.c > +++ b/net/netfilter/nf_flow_table_core.c > @@ -39,8 +39,14 @@ > > ft->l3proto = ctt->src.l3num; > ft->l4proto = ctt->dst.protonum; > - ft->src_port = ctt->src.u.tcp.port; > - ft->dst_port = ctt->dst.u.tcp.port; > + > + switch (ctt->dst.protonum) { > + case IPPROTO_TCP: > + case IPPROTO_UDP: > + ft->src_port = ctt->src.u.tcp.port; > + ft->dst_port = ctt->dst.u.tcp.port; > + break; > + } > } > > struct flow_offload *flow_offload_alloc(struct nf_conn *ct) > diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c > index 889cf88..48e2f58 100644 > --- a/net/netfilter/nf_flow_table_ip.c > +++ b/net/netfilter/nf_flow_table_ip.c > @@ -172,6 +172,7 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, > struct flow_ports *ports; > unsigned int thoff; > struct iphdr *iph; > + u8 ipproto; > > if (!pskb_may_pull(skb, sizeof(*iph) + offset)) > return -1; > @@ -185,13 +186,19 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, > > thoff += offset; > > - switch (iph->protocol) { > + ipproto = iph->protocol; > + switch (ipproto) { > case IPPROTO_TCP: > *hdrsize = sizeof(struct tcphdr); > break; > case IPPROTO_UDP: > *hdrsize = sizeof(struct udphdr); > break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: > + *hdrsize = sizeof(struct gre_base_hdr); > + break; > +#endif > default: > return -1; > } > @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, > if (!pskb_may_pull(skb, thoff + *hdrsize)) > return -1; > > + if (ipproto == IPPROTO_GRE) { No ifdef here? Maybe remove these ifdef everywhere? > + struct gre_base_hdr *greh; > + > + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); > + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) > + return -1; > + } > + > iph = (struct iphdr *)(skb_network_header(skb) + offset); > - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > > tuple->src_v4.s_addr = iph->saddr; > tuple->dst_v4.s_addr = iph->daddr; > - tuple->src_port = ports->source; > - tuple->dst_port = ports->dest; > + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP) { > + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > + tuple->src_port = ports->source; > + tuple->dst_port = ports->dest; > + } maybe: switch (ipproto) { case IPPROTO_TCP: case IPPROTO_UDP: ... break; case IPPROTO_GRE: break; } ? > tuple->l3proto = AF_INET; > - tuple->l4proto = iph->protocol; > + tuple->l4proto = ipproto; > tuple->iifidx = dev->ifindex; > nf_flow_tuple_encap(skb, tuple); > > @@ -521,6 +538,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, > struct flow_ports *ports; > struct ipv6hdr *ip6h; > unsigned int thoff; > + u8 nexthdr; > > thoff = sizeof(*ip6h) + offset; > if (!pskb_may_pull(skb, thoff)) > @@ -528,13 +546,19 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, > > ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); > > - switch (ip6h->nexthdr) { > + nexthdr = ip6h->nexthdr; > + switch (nexthdr) { > case IPPROTO_TCP: > *hdrsize = sizeof(struct tcphdr); > break; > case IPPROTO_UDP: > *hdrsize = sizeof(struct udphdr); > break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: > + *hdrsize = sizeof(struct gre_base_hdr); > + break; > +#endif > default: > return -1; > } > @@ -545,15 +569,25 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, > if (!pskb_may_pull(skb, thoff + *hdrsize)) > return -1; > > + if (nexthdr == IPPROTO_GRE) { > + struct gre_base_hdr *greh; > + > + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); > + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) > + return -1; > + } > + > ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); > - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > > tuple->src_v6 = ip6h->saddr; > tuple->dst_v6 = ip6h->daddr; > - tuple->src_port = ports->source; > - tuple->dst_port = ports->dest; > + if (nexthdr == IPPROTO_TCP || nexthdr == IPPROTO_UDP) { > + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > + tuple->src_port = ports->source; > + tuple->dst_port = ports->dest; > + } > tuple->l3proto = AF_INET6; > - tuple->l4proto = ip6h->nexthdr; > + tuple->l4proto = nexthdr; > tuple->iifidx = dev->ifindex; > nf_flow_tuple_encap(skb, tuple); > > diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c > index b561e0a..9b81080 100644 > --- a/net/netfilter/nf_flow_table_offload.c > +++ b/net/netfilter/nf_flow_table_offload.c > @@ -170,6 +170,7 @@ static int nf_flow_rule_match(struct nf_flow_match *match, > match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_TCP); > break; > case IPPROTO_UDP: > + case IPPROTO_GRE: > break; > default: > return -EOPNOTSUPP; > @@ -178,15 +179,19 @@ static int nf_flow_rule_match(struct nf_flow_match *match, > key->basic.ip_proto = tuple->l4proto; > mask->basic.ip_proto = 0xff; > > - key->tp.src = tuple->src_port; > - mask->tp.src = 0xffff; > - key->tp.dst = tuple->dst_port; > - mask->tp.dst = 0xffff; > - > match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_META) | > BIT(FLOW_DISSECTOR_KEY_CONTROL) | > - BIT(FLOW_DISSECTOR_KEY_BASIC) | > - BIT(FLOW_DISSECTOR_KEY_PORTS); > + BIT(FLOW_DISSECTOR_KEY_BASIC); > + > + if (tuple->l4proto == IPPROTO_TCP || tuple->l4proto == IPPROTO_UDP) { > + key->tp.src = tuple->src_port; > + mask->tp.src = 0xffff; > + key->tp.dst = tuple->dst_port; > + mask->tp.dst = 0xffff; > + > + match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_PORTS); > + } > + > return 0; > } > > diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c > index 0af34ad..731b5d8 100644 > --- a/net/netfilter/nft_flow_offload.c > +++ b/net/netfilter/nft_flow_offload.c > @@ -298,6 +298,19 @@ static void nft_flow_offload_eval(const struct nft_expr *expr, > break; > case IPPROTO_UDP: > break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: { > + struct nf_conntrack_tuple *tuple; > + > + if (ct->status & IPS_NAT_MASK) > + goto out; Why this NAT check? > + tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; > + /* No support for GRE v1 */ > + if (tuple->src.u.gre.key || tuple->dst.u.gre.key) > + goto out; > + break; > + } > +#endif > default: > goto out; > } > -- > 1.8.3.1 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-07 17:56 ` Pablo Neira Ayuso @ 2022-02-08 14:30 ` Toshiaki Makita 2022-02-09 10:01 ` Pablo Neira Ayuso 0 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-08 14:30 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey On 2022/02/08 2:56, Pablo Neira Ayuso wrote: > On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: >> Support GREv0 without NAT. >> >> Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> >> --- >> net/netfilter/nf_flow_table_core.c | 10 +++++-- >> net/netfilter/nf_flow_table_ip.c | 54 ++++++++++++++++++++++++++++------- >> net/netfilter/nf_flow_table_offload.c | 19 +++++++----- >> net/netfilter/nft_flow_offload.c | 13 +++++++++ >> 4 files changed, 77 insertions(+), 19 deletions(-) >> >> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c >> index b90eca7..e66a375 100644 >> --- a/net/netfilter/nf_flow_table_core.c >> +++ b/net/netfilter/nf_flow_table_core.c >> @@ -39,8 +39,14 @@ >> >> ft->l3proto = ctt->src.l3num; >> ft->l4proto = ctt->dst.protonum; >> - ft->src_port = ctt->src.u.tcp.port; >> - ft->dst_port = ctt->dst.u.tcp.port; >> + >> + switch (ctt->dst.protonum) { >> + case IPPROTO_TCP: >> + case IPPROTO_UDP: >> + ft->src_port = ctt->src.u.tcp.port; >> + ft->dst_port = ctt->dst.u.tcp.port; >> + break; >> + } >> } >> >> struct flow_offload *flow_offload_alloc(struct nf_conn *ct) >> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c >> index 889cf88..48e2f58 100644 >> --- a/net/netfilter/nf_flow_table_ip.c >> +++ b/net/netfilter/nf_flow_table_ip.c >> @@ -172,6 +172,7 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, >> struct flow_ports *ports; >> unsigned int thoff; >> struct iphdr *iph; >> + u8 ipproto; >> >> if (!pskb_may_pull(skb, sizeof(*iph) + offset)) >> return -1; >> @@ -185,13 +186,19 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, >> >> thoff += offset; >> >> - switch (iph->protocol) { >> + ipproto = iph->protocol; >> + switch (ipproto) { >> case IPPROTO_TCP: >> *hdrsize = sizeof(struct tcphdr); >> break; >> case IPPROTO_UDP: >> *hdrsize = sizeof(struct udphdr); >> break; >> +#ifdef CONFIG_NF_CT_PROTO_GRE >> + case IPPROTO_GRE: >> + *hdrsize = sizeof(struct gre_base_hdr); >> + break; >> +#endif >> default: >> return -1; >> } >> @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, >> if (!pskb_may_pull(skb, thoff + *hdrsize)) >> return -1; >> >> + if (ipproto == IPPROTO_GRE) { > > No ifdef here? Maybe remove these ifdef everywhere? I wanted to avoid adding many ifdefs and I expect this to be compiled out when CONFIG_NF_CT_PROTO_GRE=n as this block is unreachable anyway. It rather may have been unintuitive though. Removing all of these ifdefs will cause inconsistent behavior between CONFIG_NF_CT_PROTO_GRE=n/y. When CONFIG_NF_CT_PROTO_GRE=n, conntrack cannot determine GRE version, thus it will track GREv1 without key infomation, and the flow will be offloaded. When CONFIG_NF_CT_PROTO_GRE=y, GREv1 will have key information and will not be offloaded. I wanted to just refuse offloading of GRE to avoid this inconsistency. Anyway this kind of inconsistency seems to happen in software conntrack, so if you'd like to remove ifdefs, I will do. >> + struct gre_base_hdr *greh; >> + >> + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); >> + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) >> + return -1; >> + } >> + >> iph = (struct iphdr *)(skb_network_header(skb) + offset); >> - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); >> >> tuple->src_v4.s_addr = iph->saddr; >> tuple->dst_v4.s_addr = iph->daddr; >> - tuple->src_port = ports->source; >> - tuple->dst_port = ports->dest; >> + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP) { >> + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); >> + tuple->src_port = ports->source; >> + tuple->dst_port = ports->dest; >> + } > > maybe: > > switch (ipproto) { > case IPPROTO_TCP: > case IPPROTO_UDP: > ... > break; > case IPPROTO_GRE: > break; > } > > ? Sure, will fix. >> tuple->l3proto = AF_INET; >> - tuple->l4proto = iph->protocol; >> + tuple->l4proto = ipproto; >> tuple->iifidx = dev->ifindex; >> nf_flow_tuple_encap(skb, tuple); >> >> @@ -521,6 +538,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, >> struct flow_ports *ports; >> struct ipv6hdr *ip6h; >> unsigned int thoff; >> + u8 nexthdr; >> >> thoff = sizeof(*ip6h) + offset; >> if (!pskb_may_pull(skb, thoff)) >> @@ -528,13 +546,19 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, >> >> ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); >> >> - switch (ip6h->nexthdr) { >> + nexthdr = ip6h->nexthdr; >> + switch (nexthdr) { >> case IPPROTO_TCP: >> *hdrsize = sizeof(struct tcphdr); >> break; >> case IPPROTO_UDP: >> *hdrsize = sizeof(struct udphdr); >> break; >> +#ifdef CONFIG_NF_CT_PROTO_GRE >> + case IPPROTO_GRE: >> + *hdrsize = sizeof(struct gre_base_hdr); >> + break; >> +#endif >> default: >> return -1; >> } >> @@ -545,15 +569,25 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, >> if (!pskb_may_pull(skb, thoff + *hdrsize)) >> return -1; >> >> + if (nexthdr == IPPROTO_GRE) { >> + struct gre_base_hdr *greh; >> + >> + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); >> + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) >> + return -1; >> + } >> + >> ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); >> - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); >> >> tuple->src_v6 = ip6h->saddr; >> tuple->dst_v6 = ip6h->daddr; >> - tuple->src_port = ports->source; >> - tuple->dst_port = ports->dest; >> + if (nexthdr == IPPROTO_TCP || nexthdr == IPPROTO_UDP) { >> + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); >> + tuple->src_port = ports->source; >> + tuple->dst_port = ports->dest; >> + } >> tuple->l3proto = AF_INET6; >> - tuple->l4proto = ip6h->nexthdr; >> + tuple->l4proto = nexthdr; >> tuple->iifidx = dev->ifindex; >> nf_flow_tuple_encap(skb, tuple); >> >> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c >> index b561e0a..9b81080 100644 >> --- a/net/netfilter/nf_flow_table_offload.c >> +++ b/net/netfilter/nf_flow_table_offload.c >> @@ -170,6 +170,7 @@ static int nf_flow_rule_match(struct nf_flow_match *match, >> match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_TCP); >> break; >> case IPPROTO_UDP: >> + case IPPROTO_GRE: >> break; >> default: >> return -EOPNOTSUPP; >> @@ -178,15 +179,19 @@ static int nf_flow_rule_match(struct nf_flow_match *match, >> key->basic.ip_proto = tuple->l4proto; >> mask->basic.ip_proto = 0xff; >> >> - key->tp.src = tuple->src_port; >> - mask->tp.src = 0xffff; >> - key->tp.dst = tuple->dst_port; >> - mask->tp.dst = 0xffff; >> - >> match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_META) | >> BIT(FLOW_DISSECTOR_KEY_CONTROL) | >> - BIT(FLOW_DISSECTOR_KEY_BASIC) | >> - BIT(FLOW_DISSECTOR_KEY_PORTS); >> + BIT(FLOW_DISSECTOR_KEY_BASIC); >> + >> + if (tuple->l4proto == IPPROTO_TCP || tuple->l4proto == IPPROTO_UDP) { >> + key->tp.src = tuple->src_port; >> + mask->tp.src = 0xffff; >> + key->tp.dst = tuple->dst_port; >> + mask->tp.dst = 0xffff; >> + >> + match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_PORTS); >> + } >> + >> return 0; >> } >> >> diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c >> index 0af34ad..731b5d8 100644 >> --- a/net/netfilter/nft_flow_offload.c >> +++ b/net/netfilter/nft_flow_offload.c >> @@ -298,6 +298,19 @@ static void nft_flow_offload_eval(const struct nft_expr *expr, >> break; >> case IPPROTO_UDP: >> break; >> +#ifdef CONFIG_NF_CT_PROTO_GRE >> + case IPPROTO_GRE: { >> + struct nf_conntrack_tuple *tuple; >> + >> + if (ct->status & IPS_NAT_MASK) >> + goto out; > > Why this NAT check? NAT requires more work. I'd like to start with a minimal GRE support. Maybe we can add NAT support later. Toshiaki Makita >> + tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; >> + /* No support for GRE v1 */ >> + if (tuple->src.u.gre.key || tuple->dst.u.gre.key) >> + goto out; >> + break; >> + } >> +#endif >> default: >> goto out; >> } >> -- >> 1.8.3.1 >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-08 14:30 ` Toshiaki Makita @ 2022-02-09 10:01 ` Pablo Neira Ayuso 2022-02-12 1:54 ` Toshiaki Makita 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2022-02-09 10:01 UTC (permalink / raw) To: Toshiaki Makita Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey On Tue, Feb 08, 2022 at 11:30:03PM +0900, Toshiaki Makita wrote: > On 2022/02/08 2:56, Pablo Neira Ayuso wrote: > > On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: [...] > > > diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c > > > index 889cf88..48e2f58 100644 > > > --- a/net/netfilter/nf_flow_table_ip.c > > > +++ b/net/netfilter/nf_flow_table_ip.c [...] > > > @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, > > > if (!pskb_may_pull(skb, thoff + *hdrsize)) > > > return -1; > > > + if (ipproto == IPPROTO_GRE) { > > > > No ifdef here? Maybe remove these ifdef everywhere? > > I wanted to avoid adding many ifdefs and I expect this to be compiled out > when CONFIG_NF_CT_PROTO_GRE=n as this block is unreachable anyway. It rather > may have been unintuitive though. > > Removing all of these ifdefs will cause inconsistent behavior between > CONFIG_NF_CT_PROTO_GRE=n/y. > When CONFIG_NF_CT_PROTO_GRE=n, conntrack cannot determine GRE version, thus > it will track GREv1 without key infomation, and the flow will be offloaded. > When CONFIG_NF_CT_PROTO_GRE=y, GREv1 will have key information and will not > be offloaded. > I wanted to just refuse offloading of GRE to avoid this inconsistency. > Anyway this kind of inconsistency seems to happen in software conntrack, so > if you'd like to remove ifdefs, I will do. Good point, thanks for explaining. LGTM. [...] > > > diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c > > > index 0af34ad..731b5d8 100644 > > > --- a/net/netfilter/nft_flow_offload.c > > > +++ b/net/netfilter/nft_flow_offload.c > > > @@ -298,6 +298,19 @@ static void nft_flow_offload_eval(const struct nft_expr *expr, > > > break; > > > case IPPROTO_UDP: > > > break; > > > +#ifdef CONFIG_NF_CT_PROTO_GRE > > > + case IPPROTO_GRE: { > > > + struct nf_conntrack_tuple *tuple; > > > + > > > + if (ct->status & IPS_NAT_MASK) > > > + goto out; > > > > Why this NAT check? > > NAT requires more work. I'd like to start with a minimal GRE support. > Maybe we can add NAT support later. Oh well, yes, no problem. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-09 10:01 ` Pablo Neira Ayuso @ 2022-02-12 1:54 ` Toshiaki Makita 2022-02-19 12:51 ` Toshiaki Makita 0 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-12 1:54 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey On 2022/02/09 19:01, Pablo Neira Ayuso wrote: > On Tue, Feb 08, 2022 at 11:30:03PM +0900, Toshiaki Makita wrote: >> On 2022/02/08 2:56, Pablo Neira Ayuso wrote: >>> On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: > [...] >>>> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c >>>> index 889cf88..48e2f58 100644 >>>> --- a/net/netfilter/nf_flow_table_ip.c >>>> +++ b/net/netfilter/nf_flow_table_ip.c > [...] >>>> @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, >>>> if (!pskb_may_pull(skb, thoff + *hdrsize)) >>>> return -1; >>>> + if (ipproto == IPPROTO_GRE) { >>> >>> No ifdef here? Maybe remove these ifdef everywhere? >> >> I wanted to avoid adding many ifdefs and I expect this to be compiled out >> when CONFIG_NF_CT_PROTO_GRE=n as this block is unreachable anyway. It rather >> may have been unintuitive though. >> >> Removing all of these ifdefs will cause inconsistent behavior between >> CONFIG_NF_CT_PROTO_GRE=n/y. >> When CONFIG_NF_CT_PROTO_GRE=n, conntrack cannot determine GRE version, thus >> it will track GREv1 without key infomation, and the flow will be offloaded. >> When CONFIG_NF_CT_PROTO_GRE=y, GREv1 will have key information and will not >> be offloaded. >> I wanted to just refuse offloading of GRE to avoid this inconsistency. >> Anyway this kind of inconsistency seems to happen in software conntrack, so >> if you'd like to remove ifdefs, I will do. > > Good point, thanks for explaining. LGTM. Let me confirm, did you agree to keep ifdefs, or delete them? Toshiaki Makita ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-12 1:54 ` Toshiaki Makita @ 2022-02-19 12:51 ` Toshiaki Makita 2022-02-23 10:44 ` Pablo Neira Ayuso 0 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-19 12:51 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey Ping, Pablo? On 2022/02/12 10:54, Toshiaki Makita wrote: > On 2022/02/09 19:01, Pablo Neira Ayuso wrote: >> On Tue, Feb 08, 2022 at 11:30:03PM +0900, Toshiaki Makita wrote: >>> On 2022/02/08 2:56, Pablo Neira Ayuso wrote: >>>> On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: >> [...] >>>>> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c >>>>> index 889cf88..48e2f58 100644 >>>>> --- a/net/netfilter/nf_flow_table_ip.c >>>>> +++ b/net/netfilter/nf_flow_table_ip.c >> [...] >>>>> @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const >>>>> struct net_device *dev, >>>>> if (!pskb_may_pull(skb, thoff + *hdrsize)) >>>>> return -1; >>>>> + if (ipproto == IPPROTO_GRE) { >>>> >>>> No ifdef here? Maybe remove these ifdef everywhere? >>> >>> I wanted to avoid adding many ifdefs and I expect this to be compiled out >>> when CONFIG_NF_CT_PROTO_GRE=n as this block is unreachable anyway. It rather >>> may have been unintuitive though. >>> >>> Removing all of these ifdefs will cause inconsistent behavior between >>> CONFIG_NF_CT_PROTO_GRE=n/y. >>> When CONFIG_NF_CT_PROTO_GRE=n, conntrack cannot determine GRE version, thus >>> it will track GREv1 without key infomation, and the flow will be offloaded. >>> When CONFIG_NF_CT_PROTO_GRE=y, GREv1 will have key information and will not >>> be offloaded. >>> I wanted to just refuse offloading of GRE to avoid this inconsistency. >>> Anyway this kind of inconsistency seems to happen in software conntrack, so >>> if you'd like to remove ifdefs, I will do. >> >> Good point, thanks for explaining. LGTM. > > Let me confirm, did you agree to keep ifdefs, or delete them? > > Toshiaki Makita ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/3] netfilter: flowtable: Support GRE 2022-02-19 12:51 ` Toshiaki Makita @ 2022-02-23 10:44 ` Pablo Neira Ayuso 0 siblings, 0 replies; 12+ messages in thread From: Pablo Neira Ayuso @ 2022-02-23 10:44 UTC (permalink / raw) To: Toshiaki Makita Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam, Paul Blakey On Sat, Feb 19, 2022 at 09:51:37PM +0900, Toshiaki Makita wrote: > Ping, Pablo? Please go ahead, these #ifdef can go away later on. > On 2022/02/12 10:54, Toshiaki Makita wrote: > > On 2022/02/09 19:01, Pablo Neira Ayuso wrote: > > > On Tue, Feb 08, 2022 at 11:30:03PM +0900, Toshiaki Makita wrote: > > > > On 2022/02/08 2:56, Pablo Neira Ayuso wrote: > > > > > On Thu, Feb 03, 2022 at 08:59:39PM +0900, Toshiaki Makita wrote: > > > [...] > > > > > > diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c > > > > > > index 889cf88..48e2f58 100644 > > > > > > --- a/net/netfilter/nf_flow_table_ip.c > > > > > > +++ b/net/netfilter/nf_flow_table_ip.c > > > [...] > > > > > > @@ -202,15 +209,25 @@ static int nf_flow_tuple_ip(struct > > > > > > sk_buff *skb, const struct net_device *dev, > > > > > > if (!pskb_may_pull(skb, thoff + *hdrsize)) > > > > > > return -1; > > > > > > + if (ipproto == IPPROTO_GRE) { > > > > > > > > > > No ifdef here? Maybe remove these ifdef everywhere? > > > > > > > > I wanted to avoid adding many ifdefs and I expect this to be compiled out > > > > when CONFIG_NF_CT_PROTO_GRE=n as this block is unreachable anyway. It rather > > > > may have been unintuitive though. > > > > > > > > Removing all of these ifdefs will cause inconsistent behavior between > > > > CONFIG_NF_CT_PROTO_GRE=n/y. > > > > When CONFIG_NF_CT_PROTO_GRE=n, conntrack cannot determine GRE version, thus > > > > it will track GREv1 without key infomation, and the flow will be offloaded. > > > > When CONFIG_NF_CT_PROTO_GRE=y, GREv1 will have key information and will not > > > > be offloaded. > > > > I wanted to just refuse offloading of GRE to avoid this inconsistency. > > > > Anyway this kind of inconsistency seems to happen in software conntrack, so > > > > if you'd like to remove ifdefs, I will do. > > > > > > Good point, thanks for explaining. LGTM. > > > > Let me confirm, did you agree to keep ifdefs, or delete them? > > > > Toshiaki Makita ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 2/3] act_ct: Support GRE offload 2022-02-03 11:59 [PATCH net-next 0/3] Conntrack GRE offload Toshiaki Makita 2022-02-03 11:59 ` [PATCH net-next 1/3] netfilter: flowtable: Support GRE Toshiaki Makita @ 2022-02-03 11:59 ` Toshiaki Makita 2022-02-10 12:48 ` Paul Blakey 2022-02-03 11:59 ` [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload Toshiaki Makita 2 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-03 11:59 UTC (permalink / raw) To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal Cc: Toshiaki Makita, netdev, netfilter-devel, coreteam, Paul Blakey Support GREv0 without NAT. Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> --- net/sched/act_ct.c | 101 +++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 79 insertions(+), 22 deletions(-) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index f99247f..a5f47d5 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -421,6 +421,19 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, break; case IPPROTO_UDP: break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: { + struct nf_conntrack_tuple *tuple; + + if (ct->status & IPS_NAT_MASK) + return; + tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; + /* No support for GRE v1 */ + if (tuple->src.u.gre.key || tuple->dst.u.gre.key) + return; + break; + } +#endif default: return; } @@ -440,6 +453,8 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, struct flow_ports *ports; unsigned int thoff; struct iphdr *iph; + size_t hdrsize; + u8 ipproto; if (!pskb_network_may_pull(skb, sizeof(*iph))) return false; @@ -451,29 +466,49 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, unlikely(thoff != sizeof(struct iphdr))) return false; - if (iph->protocol != IPPROTO_TCP && - iph->protocol != IPPROTO_UDP) + ipproto = iph->protocol; + switch (ipproto) { + case IPPROTO_TCP: + hdrsize = sizeof(struct tcphdr); + break; + case IPPROTO_UDP: + hdrsize = sizeof(*ports); + break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: + hdrsize = sizeof(struct gre_base_hdr); + break; +#endif + default: return false; + } if (iph->ttl <= 1) return false; - if (!pskb_network_may_pull(skb, iph->protocol == IPPROTO_TCP ? - thoff + sizeof(struct tcphdr) : - thoff + sizeof(*ports))) + if (!pskb_network_may_pull(skb, thoff + hdrsize)) return false; iph = ip_hdr(skb); - if (iph->protocol == IPPROTO_TCP) + if (ipproto == IPPROTO_TCP) { *tcph = (void *)(skb_network_header(skb) + thoff); + } else if (ipproto == IPPROTO_GRE) { + struct gre_base_hdr *greh; + + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) + return false; + } - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); tuple->src_v4.s_addr = iph->saddr; tuple->dst_v4.s_addr = iph->daddr; - tuple->src_port = ports->source; - tuple->dst_port = ports->dest; + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP) { + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + tuple->src_port = ports->source; + tuple->dst_port = ports->dest; + } tuple->l3proto = AF_INET; - tuple->l4proto = iph->protocol; + tuple->l4proto = ipproto; return true; } @@ -486,36 +521,58 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, struct flow_ports *ports; struct ipv6hdr *ip6h; unsigned int thoff; + size_t hdrsize; + u8 nexthdr; if (!pskb_network_may_pull(skb, sizeof(*ip6h))) return false; ip6h = ipv6_hdr(skb); + thoff = sizeof(*ip6h); - if (ip6h->nexthdr != IPPROTO_TCP && - ip6h->nexthdr != IPPROTO_UDP) - return false; + nexthdr = ip6h->nexthdr; + switch (nexthdr) { + case IPPROTO_TCP: + hdrsize = sizeof(struct tcphdr); + break; + case IPPROTO_UDP: + hdrsize = sizeof(*ports); + break; +#ifdef CONFIG_NF_CT_PROTO_GRE + case IPPROTO_GRE: + hdrsize = sizeof(struct gre_base_hdr); + break; +#endif + default: + return -1; + } if (ip6h->hop_limit <= 1) return false; - thoff = sizeof(*ip6h); - if (!pskb_network_may_pull(skb, ip6h->nexthdr == IPPROTO_TCP ? - thoff + sizeof(struct tcphdr) : - thoff + sizeof(*ports))) + if (!pskb_network_may_pull(skb, thoff + hdrsize)) return false; ip6h = ipv6_hdr(skb); - if (ip6h->nexthdr == IPPROTO_TCP) + if (nexthdr == IPPROTO_TCP) { *tcph = (void *)(skb_network_header(skb) + thoff); + } else if (nexthdr == IPPROTO_GRE) { + struct gre_base_hdr *greh; + + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) + return false; + } - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); tuple->src_v6 = ip6h->saddr; tuple->dst_v6 = ip6h->daddr; - tuple->src_port = ports->source; - tuple->dst_port = ports->dest; + if (nexthdr == IPPROTO_TCP || nexthdr == IPPROTO_UDP) { + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + tuple->src_port = ports->source; + tuple->dst_port = ports->dest; + } tuple->l3proto = AF_INET6; - tuple->l4proto = ip6h->nexthdr; + tuple->l4proto = nexthdr; return true; } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/3] act_ct: Support GRE offload 2022-02-03 11:59 ` [PATCH net-next 2/3] act_ct: Support GRE offload Toshiaki Makita @ 2022-02-10 12:48 ` Paul Blakey 0 siblings, 0 replies; 12+ messages in thread From: Paul Blakey @ 2022-02-10 12:48 UTC (permalink / raw) To: Toshiaki Makita Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam On Thu, 3 Feb 2022, Toshiaki Makita wrote: > Support GREv0 without NAT. > > Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> > --- > net/sched/act_ct.c | 101 +++++++++++++++++++++++++++++++++++++++++------------ > 1 file changed, 79 insertions(+), 22 deletions(-) > > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c > index f99247f..a5f47d5 100644 > --- a/net/sched/act_ct.c > +++ b/net/sched/act_ct.c > @@ -421,6 +421,19 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, > break; > case IPPROTO_UDP: > break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: { > + struct nf_conntrack_tuple *tuple; > + > + if (ct->status & IPS_NAT_MASK) > + return; > + tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; > + /* No support for GRE v1 */ > + if (tuple->src.u.gre.key || tuple->dst.u.gre.key) > + return; > + break; > + } > +#endif > default: > return; > } > @@ -440,6 +453,8 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, > struct flow_ports *ports; > unsigned int thoff; > struct iphdr *iph; > + size_t hdrsize; > + u8 ipproto; > > if (!pskb_network_may_pull(skb, sizeof(*iph))) > return false; > @@ -451,29 +466,49 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, > unlikely(thoff != sizeof(struct iphdr))) > return false; > > - if (iph->protocol != IPPROTO_TCP && > - iph->protocol != IPPROTO_UDP) > + ipproto = iph->protocol; > + switch (ipproto) { > + case IPPROTO_TCP: > + hdrsize = sizeof(struct tcphdr); > + break; > + case IPPROTO_UDP: > + hdrsize = sizeof(*ports); > + break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: > + hdrsize = sizeof(struct gre_base_hdr); > + break; > +#endif > + default: > return false; > + } > > if (iph->ttl <= 1) > return false; > > - if (!pskb_network_may_pull(skb, iph->protocol == IPPROTO_TCP ? > - thoff + sizeof(struct tcphdr) : > - thoff + sizeof(*ports))) > + if (!pskb_network_may_pull(skb, thoff + hdrsize)) > return false; > > iph = ip_hdr(skb); > - if (iph->protocol == IPPROTO_TCP) > + if (ipproto == IPPROTO_TCP) { > *tcph = (void *)(skb_network_header(skb) + thoff); > + } else if (ipproto == IPPROTO_GRE) { > + struct gre_base_hdr *greh; > + > + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); > + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) > + return false; > + } > > - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > tuple->src_v4.s_addr = iph->saddr; > tuple->dst_v4.s_addr = iph->daddr; > - tuple->src_port = ports->source; > - tuple->dst_port = ports->dest; > + if (ipproto == IPPROTO_TCP || ipproto == IPPROTO_UDP) { > + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > + tuple->src_port = ports->source; > + tuple->dst_port = ports->dest; > + } > tuple->l3proto = AF_INET; > - tuple->l4proto = iph->protocol; > + tuple->l4proto = ipproto; > > return true; > } > @@ -486,36 +521,58 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft, > struct flow_ports *ports; > struct ipv6hdr *ip6h; > unsigned int thoff; > + size_t hdrsize; > + u8 nexthdr; > > if (!pskb_network_may_pull(skb, sizeof(*ip6h))) > return false; > > ip6h = ipv6_hdr(skb); > + thoff = sizeof(*ip6h); > > - if (ip6h->nexthdr != IPPROTO_TCP && > - ip6h->nexthdr != IPPROTO_UDP) > - return false; > + nexthdr = ip6h->nexthdr; > + switch (nexthdr) { > + case IPPROTO_TCP: > + hdrsize = sizeof(struct tcphdr); > + break; > + case IPPROTO_UDP: > + hdrsize = sizeof(*ports); > + break; > +#ifdef CONFIG_NF_CT_PROTO_GRE > + case IPPROTO_GRE: > + hdrsize = sizeof(struct gre_base_hdr); > + break; > +#endif > + default: > + return -1; > + } > > if (ip6h->hop_limit <= 1) > return false; > > - thoff = sizeof(*ip6h); > - if (!pskb_network_may_pull(skb, ip6h->nexthdr == IPPROTO_TCP ? > - thoff + sizeof(struct tcphdr) : > - thoff + sizeof(*ports))) > + if (!pskb_network_may_pull(skb, thoff + hdrsize)) > return false; > > ip6h = ipv6_hdr(skb); > - if (ip6h->nexthdr == IPPROTO_TCP) > + if (nexthdr == IPPROTO_TCP) { > *tcph = (void *)(skb_network_header(skb) + thoff); > + } else if (nexthdr == IPPROTO_GRE) { > + struct gre_base_hdr *greh; > + > + greh = (struct gre_base_hdr *)(skb_network_header(skb) + thoff); > + if ((greh->flags & GRE_VERSION) != GRE_VERSION_0) > + return false; > + } > > - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > tuple->src_v6 = ip6h->saddr; > tuple->dst_v6 = ip6h->daddr; > - tuple->src_port = ports->source; > - tuple->dst_port = ports->dest; > + if (nexthdr == IPPROTO_TCP || nexthdr == IPPROTO_UDP) { > + ports = (struct flow_ports *)(skb_network_header(skb) + thoff); > + tuple->src_port = ports->source; > + tuple->dst_port = ports->dest; > + } > tuple->l3proto = AF_INET6; > - tuple->l4proto = ip6h->nexthdr; > + tuple->l4proto = nexthdr; > > return true; > } > -- > 1.8.3.1 > > The GRE ifdefs I assume are for the same reason you mentioned in other patch, If so, looks good to me. Acked-by: Paul Blakey <paulb@nvidia.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload 2022-02-03 11:59 [PATCH net-next 0/3] Conntrack GRE offload Toshiaki Makita 2022-02-03 11:59 ` [PATCH net-next 1/3] netfilter: flowtable: Support GRE Toshiaki Makita 2022-02-03 11:59 ` [PATCH net-next 2/3] act_ct: Support GRE offload Toshiaki Makita @ 2022-02-03 11:59 ` Toshiaki Makita 2022-02-10 12:44 ` Paul Blakey 2 siblings, 1 reply; 12+ messages in thread From: Toshiaki Makita @ 2022-02-03 11:59 UTC (permalink / raw) To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal Cc: Toshiaki Makita, netdev, netfilter-devel, coreteam, Paul Blakey Support GREv0 without NAT. Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 0f4d3b9d..465643c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -258,7 +258,8 @@ struct mlx5_ct_entry { return -EOPNOTSUPP; } } else { - return -EOPNOTSUPP; + if (tuple->ip_proto != IPPROTO_GRE) + return -EOPNOTSUPP; } return 0; @@ -807,7 +808,11 @@ struct mlx5_ct_entry { attr->dest_chain = 0; attr->dest_ft = mlx5e_tc_post_act_get_ft(ct_priv->post_act); attr->ft = nat ? ct_priv->ct_nat : ct_priv->ct; - attr->outer_match_level = MLX5_MATCH_L4; + if (entry->tuple.ip_proto == IPPROTO_TCP || + entry->tuple.ip_proto == IPPROTO_UDP) + attr->outer_match_level = MLX5_MATCH_L4; + else + attr->outer_match_level = MLX5_MATCH_L3; attr->counter = entry->counter->counter; attr->flags |= MLX5_ATTR_FLAG_NO_IN_PORT; if (ct_priv->ns_type == MLX5_FLOW_NAMESPACE_FDB) @@ -1224,16 +1229,20 @@ static void mlx5_tc_ct_entry_del_work(struct work_struct *work) struct flow_keys flow_keys; skb_reset_network_header(skb); - skb_flow_dissect_flow_keys(skb, &flow_keys, 0); + skb_flow_dissect_flow_keys(skb, &flow_keys, FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP); tuple->zone = zone; if (flow_keys.basic.ip_proto != IPPROTO_TCP && - flow_keys.basic.ip_proto != IPPROTO_UDP) + flow_keys.basic.ip_proto != IPPROTO_UDP && + flow_keys.basic.ip_proto != IPPROTO_GRE) return false; - tuple->port.src = flow_keys.ports.src; - tuple->port.dst = flow_keys.ports.dst; + if (flow_keys.basic.ip_proto == IPPROTO_TCP || + flow_keys.basic.ip_proto == IPPROTO_UDP) { + tuple->port.src = flow_keys.ports.src; + tuple->port.dst = flow_keys.ports.dst; + } tuple->n_proto = flow_keys.basic.n_proto; tuple->ip_proto = flow_keys.basic.ip_proto; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload 2022-02-03 11:59 ` [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload Toshiaki Makita @ 2022-02-10 12:44 ` Paul Blakey 0 siblings, 0 replies; 12+ messages in thread From: Paul Blakey @ 2022-02-10 12:44 UTC (permalink / raw) To: Toshiaki Makita Cc: David S. Miller, Jakub Kicinski, Saeed Mahameed, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netdev, netfilter-devel, coreteam On Thu, 3 Feb 2022, Toshiaki Makita wrote: > Support GREv0 without NAT. > > Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com> > --- > drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c > index 0f4d3b9d..465643c 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c > @@ -258,7 +258,8 @@ struct mlx5_ct_entry { > return -EOPNOTSUPP; > } > } else { > - return -EOPNOTSUPP; > + if (tuple->ip_proto != IPPROTO_GRE) > + return -EOPNOTSUPP; > } > > return 0; > @@ -807,7 +808,11 @@ struct mlx5_ct_entry { > attr->dest_chain = 0; > attr->dest_ft = mlx5e_tc_post_act_get_ft(ct_priv->post_act); > attr->ft = nat ? ct_priv->ct_nat : ct_priv->ct; > - attr->outer_match_level = MLX5_MATCH_L4; > + if (entry->tuple.ip_proto == IPPROTO_TCP || > + entry->tuple.ip_proto == IPPROTO_UDP) > + attr->outer_match_level = MLX5_MATCH_L4; > + else > + attr->outer_match_level = MLX5_MATCH_L3; > attr->counter = entry->counter->counter; > attr->flags |= MLX5_ATTR_FLAG_NO_IN_PORT; > if (ct_priv->ns_type == MLX5_FLOW_NAMESPACE_FDB) > @@ -1224,16 +1229,20 @@ static void mlx5_tc_ct_entry_del_work(struct work_struct *work) > struct flow_keys flow_keys; > > skb_reset_network_header(skb); > - skb_flow_dissect_flow_keys(skb, &flow_keys, 0); > + skb_flow_dissect_flow_keys(skb, &flow_keys, FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP); > > tuple->zone = zone; > > if (flow_keys.basic.ip_proto != IPPROTO_TCP && > - flow_keys.basic.ip_proto != IPPROTO_UDP) > + flow_keys.basic.ip_proto != IPPROTO_UDP && > + flow_keys.basic.ip_proto != IPPROTO_GRE) > return false; > > - tuple->port.src = flow_keys.ports.src; > - tuple->port.dst = flow_keys.ports.dst; > + if (flow_keys.basic.ip_proto == IPPROTO_TCP || > + flow_keys.basic.ip_proto == IPPROTO_UDP) { > + tuple->port.src = flow_keys.ports.src; > + tuple->port.dst = flow_keys.ports.dst; > + } > tuple->n_proto = flow_keys.basic.n_proto; > tuple->ip_proto = flow_keys.basic.ip_proto; > > -- > 1.8.3.1 > > Acked-by: Paul Blakey <paulb@nvidia.com> Looks good to me. Thanks. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2022-02-23 10:44 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-02-03 11:59 [PATCH net-next 0/3] Conntrack GRE offload Toshiaki Makita 2022-02-03 11:59 ` [PATCH net-next 1/3] netfilter: flowtable: Support GRE Toshiaki Makita 2022-02-07 17:56 ` Pablo Neira Ayuso 2022-02-08 14:30 ` Toshiaki Makita 2022-02-09 10:01 ` Pablo Neira Ayuso 2022-02-12 1:54 ` Toshiaki Makita 2022-02-19 12:51 ` Toshiaki Makita 2022-02-23 10:44 ` Pablo Neira Ayuso 2022-02-03 11:59 ` [PATCH net-next 2/3] act_ct: Support GRE offload Toshiaki Makita 2022-02-10 12:48 ` Paul Blakey 2022-02-03 11:59 ` [PATCH net-next 3/3] net/mlx5: Support GRE conntrack offload Toshiaki Makita 2022-02-10 12:44 ` Paul Blakey
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).