* [PATCH net-next 01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-28 4:10 ` patchwork-bot+netdevbpf
2022-10-26 13:22 ` [PATCH net-next 02/10] netfilter: nf_tables: reduce nft_pktinfo by 8 bytes Pablo Neira Ayuso
` (8 subsequent siblings)
9 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Not required to expose this header in nf_tables_core.h, move it to where
it is used, ie. nft_payload.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables_core.h | 10 ----------
net/netfilter/nft_payload.c | 10 ++++++++++
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index 1223af68cd9a..990c3767a350 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -66,16 +66,6 @@ struct nft_payload {
u8 dreg;
};
-struct nft_payload_set {
- enum nft_payload_bases base:8;
- u8 offset;
- u8 len;
- u8 sreg;
- u8 csum_type;
- u8 csum_offset;
- u8 csum_flags;
-};
-
extern const struct nft_expr_ops nft_payload_fast_ops;
extern const struct nft_expr_ops nft_bitwise_fast_ops;
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 088244f9d838..07621d509a68 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -665,6 +665,16 @@ static int nft_payload_csum_inet(struct sk_buff *skb, const u32 *src,
return 0;
}
+struct nft_payload_set {
+ enum nft_payload_bases base:8;
+ u8 offset;
+ u8 len;
+ u8 sreg;
+ u8 csum_type;
+ u8 csum_offset;
+ u8 csum_flags;
+};
+
static void nft_payload_set_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs
2022-10-26 13:22 ` [PATCH net-next 01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs Pablo Neira Ayuso
@ 2022-10-28 4:10 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-10-28 4:10 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, kuba, pabeni, edumazet
Hello:
This series was applied to netdev/net-next.git (master)
by Pablo Neira Ayuso <pablo@netfilter.org>:
On Wed, 26 Oct 2022 15:22:18 +0200 you wrote:
> Not required to expose this header in nf_tables_core.h, move it to where
> it is used, ie. nft_payload.
>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
> include/net/netfilter/nf_tables_core.h | 10 ----------
> net/netfilter/nft_payload.c | 10 ++++++++++
> 2 files changed, 10 insertions(+), 10 deletions(-)
Here is the summary with links:
- [net-next,01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs
https://git.kernel.org/netdev/net-next/c/ac1f8c049319
- [net-next,02/10] netfilter: nf_tables: reduce nft_pktinfo by 8 bytes
https://git.kernel.org/netdev/net-next/c/e7a1caa67ce6
- [net-next,03/10] netfilter: nft_objref: make it builtin
https://git.kernel.org/netdev/net-next/c/d037abc2414b
- [net-next,04/10] netfilter: nft_payload: access GRE payload via inner offset
https://git.kernel.org/netdev/net-next/c/c247897d7c19
- [net-next,05/10] netfilter: nft_payload: access ipip payload for inner offset
https://git.kernel.org/netdev/net-next/c/3927ce8850ca
- [net-next,06/10] netfilter: nft_inner: support for inner tunnel header matching
https://git.kernel.org/netdev/net-next/c/3a07327d10a0
- [net-next,07/10] netfilter: nft_inner: add percpu inner context
https://git.kernel.org/netdev/net-next/c/0e795b37ba04
- [net-next,08/10] netfilter: nft_meta: add inner match support
https://git.kernel.org/netdev/net-next/c/a150d122b6bd
- [net-next,09/10] netfilter: nft_inner: add geneve support
https://git.kernel.org/netdev/net-next/c/0db14b95660b
- [net-next,10/10] netfilter: nft_inner: set tunnel offset to GRE header offset
https://git.kernel.org/netdev/net-next/c/91619eb60aec
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH net-next 02/10] netfilter: nf_tables: reduce nft_pktinfo by 8 bytes
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 03/10] netfilter: nft_objref: make it builtin Pablo Neira Ayuso
` (7 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
From: Florian Westphal <fw@strlen.de>
structure is reduced from 32 to 24 bytes. While at it, also check
that iphdrlen is sane, this is guaranteed for NFPROTO_IPV4 but not
for ingress or bridge, so add checks for this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables.h | 4 ++--
include/net/netfilter/nf_tables_ipv4.h | 4 ++++
include/net/netfilter/nf_tables_ipv6.h | 6 +++---
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index cdb7db9b0e25..f6db510689a8 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -32,8 +32,8 @@ struct nft_pktinfo {
u8 flags;
u8 tprot;
u16 fragoff;
- unsigned int thoff;
- unsigned int inneroff;
+ u16 thoff;
+ u16 inneroff;
};
static inline struct sock *nft_sk(const struct nft_pktinfo *pkt)
diff --git a/include/net/netfilter/nf_tables_ipv4.h b/include/net/netfilter/nf_tables_ipv4.h
index c4a6147b0ef8..112708f7a6b4 100644
--- a/include/net/netfilter/nf_tables_ipv4.h
+++ b/include/net/netfilter/nf_tables_ipv4.h
@@ -35,6 +35,8 @@ static inline int __nft_set_pktinfo_ipv4_validate(struct nft_pktinfo *pkt)
return -1;
else if (len < thoff)
return -1;
+ else if (thoff < sizeof(*iph))
+ return -1;
pkt->flags = NFT_PKTINFO_L4PROTO;
pkt->tprot = iph->protocol;
@@ -69,6 +71,8 @@ static inline int nft_set_pktinfo_ipv4_ingress(struct nft_pktinfo *pkt)
return -1;
} else if (len < thoff) {
goto inhdr_error;
+ } else if (thoff < sizeof(*iph)) {
+ return -1;
}
pkt->flags = NFT_PKTINFO_L4PROTO;
diff --git a/include/net/netfilter/nf_tables_ipv6.h b/include/net/netfilter/nf_tables_ipv6.h
index ec7eaeaf4f04..467d59b9e533 100644
--- a/include/net/netfilter/nf_tables_ipv6.h
+++ b/include/net/netfilter/nf_tables_ipv6.h
@@ -13,7 +13,7 @@ static inline void nft_set_pktinfo_ipv6(struct nft_pktinfo *pkt)
unsigned short frag_off;
protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, &flags);
- if (protohdr < 0) {
+ if (protohdr < 0 || thoff > U16_MAX) {
nft_set_pktinfo_unspec(pkt);
return;
}
@@ -47,7 +47,7 @@ static inline int __nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt)
return -1;
protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, &flags);
- if (protohdr < 0)
+ if (protohdr < 0 || thoff > U16_MAX)
return -1;
pkt->flags = NFT_PKTINFO_L4PROTO;
@@ -93,7 +93,7 @@ static inline int nft_set_pktinfo_ipv6_ingress(struct nft_pktinfo *pkt)
}
protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, &flags);
- if (protohdr < 0)
+ if (protohdr < 0 || thoff > U16_MAX)
goto inhdr_error;
pkt->flags = NFT_PKTINFO_L4PROTO;
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 03/10] netfilter: nft_objref: make it builtin
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 01/10] netfilter: nft_payload: move struct nft_payload_set definition where it belongs Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 02/10] netfilter: nf_tables: reduce nft_pktinfo by 8 bytes Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 04/10] netfilter: nft_payload: access GRE payload via inner offset Pablo Neira Ayuso
` (6 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
From: Florian Westphal <fw@strlen.de>
nft_objref is needed to reference named objects, it makes
no sense to disable it.
Before:
text data bss dec filename
4014 424 0 4438 nft_objref.o
4174 1128 0 5302 nft_objref.ko
359351 15276 864 375491 nf_tables.ko
After:
text data bss dec filename
3815 408 0 4223 nft_objref.o
363161 15692 864 379717 nf_tables.ko
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables_core.h | 1 +
net/netfilter/Kconfig | 6 ------
net/netfilter/Makefile | 4 ++--
net/netfilter/nf_tables_core.c | 1 +
net/netfilter/nft_objref.c | 22 +---------------------
5 files changed, 5 insertions(+), 29 deletions(-)
diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index 990c3767a350..83d763631f81 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -18,6 +18,7 @@ extern struct nft_expr_type nft_meta_type;
extern struct nft_expr_type nft_rt_type;
extern struct nft_expr_type nft_exthdr_type;
extern struct nft_expr_type nft_last_type;
+extern struct nft_expr_type nft_objref_type;
#ifdef CONFIG_NETWORK_SECMARK
extern struct nft_object_type nft_secmark_obj_type;
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 4b8d04640ff3..0846bd75b1da 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -568,12 +568,6 @@ config NFT_TUNNEL
This option adds the "tunnel" expression that you can use to set
tunneling policies.
-config NFT_OBJREF
- tristate "Netfilter nf_tables stateful object reference module"
- help
- This option adds the "objref" expression that allows you to refer to
- stateful objects, such as counters and quotas.
-
config NFT_QUEUE
depends on NETFILTER_NETLINK_QUEUE
tristate "Netfilter nf_tables queue module"
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 0f060d100880..7a6b518ba2b4 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -86,7 +86,8 @@ nf_tables-objs := nf_tables_core.o nf_tables_api.o nft_chain_filter.o \
nf_tables_trace.o nft_immediate.o nft_cmp.o nft_range.o \
nft_bitwise.o nft_byteorder.o nft_payload.o nft_lookup.o \
nft_dynset.o nft_meta.o nft_rt.o nft_exthdr.o nft_last.o \
- nft_counter.o nft_chain_route.o nf_tables_offload.o \
+ nft_counter.o nft_objref.o \
+ nft_chain_route.o nf_tables_offload.o \
nft_set_hash.o nft_set_bitmap.o nft_set_rbtree.o \
nft_set_pipapo.o
@@ -104,7 +105,6 @@ obj-$(CONFIG_NFT_CT) += nft_ct.o
obj-$(CONFIG_NFT_FLOW_OFFLOAD) += nft_flow_offload.o
obj-$(CONFIG_NFT_LIMIT) += nft_limit.o
obj-$(CONFIG_NFT_NAT) += nft_nat.o
-obj-$(CONFIG_NFT_OBJREF) += nft_objref.o
obj-$(CONFIG_NFT_QUEUE) += nft_queue.o
obj-$(CONFIG_NFT_QUOTA) += nft_quota.o
obj-$(CONFIG_NFT_REJECT) += nft_reject.o
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index cee3e4e905ec..6dcead50208c 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -340,6 +340,7 @@ static struct nft_expr_type *nft_basic_types[] = {
&nft_exthdr_type,
&nft_last_type,
&nft_counter_type,
+ &nft_objref_type,
};
static struct nft_object_type *nft_basic_objects[] = {
diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c
index 5d8d91b3904d..74e0eea4abac 100644
--- a/net/netfilter/nft_objref.c
+++ b/net/netfilter/nft_objref.c
@@ -82,7 +82,6 @@ static void nft_objref_activate(const struct nft_ctx *ctx,
obj->use++;
}
-static struct nft_expr_type nft_objref_type;
static const struct nft_expr_ops nft_objref_ops = {
.type = &nft_objref_type,
.size = NFT_EXPR_SIZE(sizeof(struct nft_object *)),
@@ -195,7 +194,6 @@ static void nft_objref_map_destroy(const struct nft_ctx *ctx,
nf_tables_destroy_set(ctx, priv->set);
}
-static struct nft_expr_type nft_objref_type;
static const struct nft_expr_ops nft_objref_map_ops = {
.type = &nft_objref_type,
.size = NFT_EXPR_SIZE(sizeof(struct nft_objref_map)),
@@ -233,28 +231,10 @@ static const struct nla_policy nft_objref_policy[NFTA_OBJREF_MAX + 1] = {
[NFTA_OBJREF_SET_ID] = { .type = NLA_U32 },
};
-static struct nft_expr_type nft_objref_type __read_mostly = {
+struct nft_expr_type nft_objref_type __read_mostly = {
.name = "objref",
.select_ops = nft_objref_select_ops,
.policy = nft_objref_policy,
.maxattr = NFTA_OBJREF_MAX,
.owner = THIS_MODULE,
};
-
-static int __init nft_objref_module_init(void)
-{
- return nft_register_expr(&nft_objref_type);
-}
-
-static void __exit nft_objref_module_exit(void)
-{
- nft_unregister_expr(&nft_objref_type);
-}
-
-module_init(nft_objref_module_init);
-module_exit(nft_objref_module_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Pablo Neira Ayuso <pablo@netfilter.org>");
-MODULE_ALIAS_NFT_EXPR("objref");
-MODULE_DESCRIPTION("nftables stateful object reference module");
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 04/10] netfilter: nft_payload: access GRE payload via inner offset
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (2 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 03/10] netfilter: nft_objref: make it builtin Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-28 3:35 ` Jakub Kicinski
2022-10-26 13:22 ` [PATCH net-next 05/10] netfilter: nft_payload: access ipip payload for " Pablo Neira Ayuso
` (5 subsequent siblings)
9 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Parse GRE v0 packets to properly set up inner offset, this allow for
matching on inner headers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_payload.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 07621d509a68..03a1f271bf4f 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -19,6 +19,7 @@
/* For layer 4 checksum field offset. */
#include <linux/tcp.h>
#include <linux/udp.h>
+#include <net/gre.h>
#include <linux/icmpv6.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
@@ -100,6 +101,37 @@ static int __nft_payload_inner_offset(struct nft_pktinfo *pkt)
pkt->inneroff = thoff + __tcp_hdrlen(th);
}
break;
+ case IPPROTO_GRE: {
+ u32 offset = sizeof(struct gre_base_hdr), version;
+ struct gre_base_hdr *gre, _gre;
+
+ gre = skb_header_pointer(pkt->skb, thoff, sizeof(_gre), &_gre);
+ if (!gre)
+ return -1;
+
+ version = gre->flags & GRE_VERSION;
+ switch (version) {
+ case GRE_VERSION_0:
+ if (gre->flags & GRE_ROUTING)
+ return -1;
+
+ if (gre->flags & GRE_CSUM) {
+ offset += sizeof_field(struct gre_full_hdr, csum) +
+ sizeof_field(struct gre_full_hdr, reserved1);
+ }
+ if (gre->flags & GRE_KEY)
+ offset += sizeof_field(struct gre_full_hdr, key);
+
+ if (gre->flags & GRE_SEQ)
+ offset += sizeof_field(struct gre_full_hdr, seq);
+ break;
+ default:
+ return -1;
+ }
+
+ pkt->inneroff = thoff + offset;
+ }
+ break;
default:
return -1;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 05/10] netfilter: nft_payload: access ipip payload for inner offset
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (3 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 04/10] netfilter: nft_payload: access GRE payload via inner offset Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 06/10] netfilter: nft_inner: support for inner tunnel header matching Pablo Neira Ayuso
` (4 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
ipip is an special case, transport and inner header offset are set to
the same offset to use the upcoming inner expression for matching on
inner tunnel headers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_payload.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 03a1f271bf4f..84b490d6cc75 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -132,6 +132,9 @@ static int __nft_payload_inner_offset(struct nft_pktinfo *pkt)
pkt->inneroff = thoff + offset;
}
break;
+ case IPPROTO_IPIP:
+ pkt->inneroff = thoff;
+ break;
default:
return -1;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 06/10] netfilter: nft_inner: support for inner tunnel header matching
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (4 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 05/10] netfilter: nft_payload: access ipip payload for " Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 07/10] netfilter: nft_inner: add percpu inner context Pablo Neira Ayuso
` (3 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
This new expression allows you to match on the inner headers that are
encapsulated by any of the existing tunneling protocols.
This expression parses the inner packet to set the link, network and
transport offsets, so the existing expressions (with a few updates) can
be reused to match on the inner headers.
The inner expression supports for different tunnel combinations such as:
- ethernet frame over IPv4/IPv6 packet, eg. VxLAN.
- IPv4/IPv6 packet over IPv4/IPv6 packet, eg. IPIP.
- IPv4/IPv6 packet over IPv4/IPv6 + transport header, eg. GRE.
- transport header (ESP or SCTP) over transport header (usually UDP)
The following fields are used to describe the tunnel protocol:
- flags, which describe how to parse the inner headers:
NFT_PAYLOAD_CTX_INNER_TUN, the tunnel provides its own header.
NFT_PAYLOAD_CTX_INNER_ETHER, the ethernet frame is available as inner header.
NFT_PAYLOAD_CTX_INNER_NH, the network header is available as inner header.
NFT_PAYLOAD_CTX_INNER_TH, the transport header is available as inner header.
For example, VxLAN sets on all of these flags. While GRE only sets on
NFT_PAYLOAD_CTX_INNER_NH and NFT_PAYLOAD_CTX_INNER_TH. Then, ESP over
UDP only sets on NFT_PAYLOAD_CTX_INNER_TH.
The tunnel description is composed of the following attributes:
- header size: in case the tunnel comes with its own header, eg. VxLAN.
- type: this provides a hint to userspace on how to delinearize the rule.
This is useful for VxLAN and Geneve since they run over UDP, since
transport does not provide a hint. This is also useful in case hardware
offload is ever supported. The type is not currently interpreted by the
kernel.
- expression: currently only payload supported. Follow up patch adds
also inner meta support which is required by autogenerated
dependencies. The exthdr expression should be supported too
at some point. There is a new inner_ops operation that needs to be
set on to allow to use an existing expression from the inner expression.
This patch adds a new NFT_PAYLOAD_TUN_HEADER base which allows to match
on the tunnel header fields, eg. vxlan vni.
The payload expression is embedded into nft_inner private area and this
private data area is passed to the payload inner eval function via
direct call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables.h | 5 +
include/net/netfilter/nf_tables_core.h | 24 ++
include/uapi/linux/netfilter/nf_tables.h | 26 ++
net/netfilter/Makefile | 2 +-
net/netfilter/nf_tables_api.c | 37 +++
net/netfilter/nf_tables_core.c | 1 +
net/netfilter/nft_inner.c | 336 +++++++++++++++++++++++
net/netfilter/nft_payload.c | 89 +++++-
8 files changed, 518 insertions(+), 2 deletions(-)
create mode 100644 net/netfilter/nft_inner.c
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index f6db510689a8..2dbfe7524a7e 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -375,6 +375,10 @@ static inline void *nft_expr_priv(const struct nft_expr *expr)
return (void *)expr->data;
}
+struct nft_expr_info;
+
+int nft_expr_inner_parse(const struct nft_ctx *ctx, const struct nlattr *nla,
+ struct nft_expr_info *info);
int nft_expr_clone(struct nft_expr *dst, struct nft_expr *src);
void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr);
int nft_expr_dump(struct sk_buff *skb, unsigned int attr,
@@ -864,6 +868,7 @@ struct nft_expr_type {
const struct nlattr * const tb[]);
void (*release_ops)(const struct nft_expr_ops *ops);
const struct nft_expr_ops *ops;
+ const struct nft_expr_ops *inner_ops;
struct list_head list;
const char *name;
struct module *owner;
diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index 83d763631f81..be2b2b5d0a52 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -19,6 +19,7 @@ extern struct nft_expr_type nft_rt_type;
extern struct nft_expr_type nft_exthdr_type;
extern struct nft_expr_type nft_last_type;
extern struct nft_expr_type nft_objref_type;
+extern struct nft_expr_type nft_inner_type;
#ifdef CONFIG_NETWORK_SECMARK
extern struct nft_object_type nft_secmark_obj_type;
@@ -139,4 +140,27 @@ void nft_rt_get_eval(const struct nft_expr *expr,
struct nft_regs *regs, const struct nft_pktinfo *pkt);
void nft_counter_eval(const struct nft_expr *expr, struct nft_regs *regs,
const struct nft_pktinfo *pkt);
+
+enum {
+ NFT_PAYLOAD_CTX_INNER_TUN = (1 << 0),
+ NFT_PAYLOAD_CTX_INNER_LL = (1 << 1),
+ NFT_PAYLOAD_CTX_INNER_NH = (1 << 2),
+ NFT_PAYLOAD_CTX_INNER_TH = (1 << 3),
+};
+
+struct nft_inner_tun_ctx {
+ u16 inner_tunoff;
+ u16 inner_lloff;
+ u16 inner_nhoff;
+ u16 inner_thoff;
+ __be16 llproto;
+ u8 l4proto;
+ u8 flags;
+};
+
+int nft_payload_inner_offset(const struct nft_pktinfo *pkt);
+void nft_payload_inner_eval(const struct nft_expr *expr, struct nft_regs *regs,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *ctx);
+
#endif /* _NET_NF_TABLES_CORE_H */
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 466fd3f4447c..05a15dce8271 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -760,6 +760,7 @@ enum nft_payload_bases {
NFT_PAYLOAD_NETWORK_HEADER,
NFT_PAYLOAD_TRANSPORT_HEADER,
NFT_PAYLOAD_INNER_HEADER,
+ NFT_PAYLOAD_TUN_HEADER,
};
/**
@@ -779,6 +780,31 @@ enum nft_payload_csum_flags {
NFT_PAYLOAD_L4CSUM_PSEUDOHDR = (1 << 0),
};
+enum nft_inner_type {
+ NFT_INNER_UNSPEC = 0,
+ NFT_INNER_VXLAN,
+};
+
+enum nft_inner_flags {
+ NFT_INNER_HDRSIZE = (1 << 0),
+ NFT_INNER_LL = (1 << 1),
+ NFT_INNER_NH = (1 << 2),
+ NFT_INNER_TH = (1 << 3),
+};
+#define NFT_INNER_MASK (NFT_INNER_HDRSIZE | NFT_INNER_LL | \
+ NFT_INNER_NH | NFT_INNER_TH)
+
+enum nft_inner_attributes {
+ NFTA_INNER_UNSPEC,
+ NFTA_INNER_NUM,
+ NFTA_INNER_TYPE,
+ NFTA_INNER_FLAGS,
+ NFTA_INNER_HDRSIZE,
+ NFTA_INNER_EXPR,
+ __NFTA_INNER_MAX
+};
+#define NFTA_INNER_MAX (__NFTA_INNER_MAX - 1)
+
/**
* enum nft_payload_attributes - nf_tables payload expression netlink attributes
*
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 7a6b518ba2b4..1d4db1943936 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -86,7 +86,7 @@ nf_tables-objs := nf_tables_core.o nf_tables_api.o nft_chain_filter.o \
nf_tables_trace.o nft_immediate.o nft_cmp.o nft_range.o \
nft_bitwise.o nft_byteorder.o nft_payload.o nft_lookup.o \
nft_dynset.o nft_meta.o nft_rt.o nft_exthdr.o nft_last.o \
- nft_counter.o nft_objref.o \
+ nft_counter.o nft_objref.o nft_inner.o \
nft_chain_route.o nf_tables_offload.o \
nft_set_hash.o nft_set_bitmap.o nft_set_rbtree.o \
nft_set_pipapo.o
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 58d9cbc9ccdc..6b79f5e18f08 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2857,6 +2857,43 @@ static int nf_tables_expr_parse(const struct nft_ctx *ctx,
return err;
}
+int nft_expr_inner_parse(const struct nft_ctx *ctx, const struct nlattr *nla,
+ struct nft_expr_info *info)
+{
+ struct nlattr *tb[NFTA_EXPR_MAX + 1];
+ const struct nft_expr_type *type;
+ int err;
+
+ err = nla_parse_nested_deprecated(tb, NFTA_EXPR_MAX, nla,
+ nft_expr_policy, NULL);
+ if (err < 0)
+ return err;
+
+ if (!tb[NFTA_EXPR_DATA])
+ return -EINVAL;
+
+ type = __nft_expr_type_get(ctx->family, tb[NFTA_EXPR_NAME]);
+ if (IS_ERR(type))
+ return PTR_ERR(type);
+
+ if (!type->inner_ops)
+ return -EOPNOTSUPP;
+
+ err = nla_parse_nested_deprecated(info->tb, type->maxattr,
+ tb[NFTA_EXPR_DATA],
+ type->policy, NULL);
+ if (err < 0)
+ goto err_nla_parse;
+
+ info->attr = nla;
+ info->ops = type->inner_ops;
+
+ return 0;
+
+err_nla_parse:
+ return err;
+}
+
static int nf_tables_newexpr(const struct nft_ctx *ctx,
const struct nft_expr_info *expr_info,
struct nft_expr *expr)
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 6dcead50208c..709a736c301c 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -341,6 +341,7 @@ static struct nft_expr_type *nft_basic_types[] = {
&nft_last_type,
&nft_counter_type,
&nft_objref_type,
+ &nft_inner_type,
};
static struct nft_object_type *nft_basic_objects[] = {
diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
new file mode 100644
index 000000000000..1e4079b5b431
--- /dev/null
+++ b/net/netfilter/nft_inner.c
@@ -0,0 +1,336 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022 Pablo Neira Ayuso <pablo@netfilter.org>
+ */
+
+#include <linux/kernel.h>
+#include <linux/if_vlan.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/netlink.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_core.h>
+#include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_offload.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include <net/gre.h>
+#include <net/ip.h>
+#include <linux/icmpv6.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+/* Same layout as nft_expr but it embeds the private expression data area. */
+struct __nft_expr {
+ const struct nft_expr_ops *ops;
+ union {
+ struct nft_payload payload;
+ } __attribute__((aligned(__alignof__(u64))));
+};
+
+enum {
+ NFT_INNER_EXPR_PAYLOAD,
+};
+
+struct nft_inner {
+ u8 flags;
+ u8 hdrsize;
+ u8 type;
+ u8 expr_type;
+
+ struct __nft_expr expr;
+};
+
+static int nft_inner_parse_l2l3(const struct nft_inner *priv,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *ctx, u32 off)
+{
+ __be16 llproto, outer_llproto;
+ u32 nhoff, thoff;
+
+ if (priv->flags & NFT_INNER_LL) {
+ struct vlan_ethhdr *veth, _veth;
+ struct ethhdr *eth, _eth;
+ u32 hdrsize;
+
+ eth = skb_header_pointer(pkt->skb, off, sizeof(_eth), &_eth);
+ if (!eth)
+ return -1;
+
+ switch (eth->h_proto) {
+ case htons(ETH_P_IP):
+ case htons(ETH_P_IPV6):
+ llproto = eth->h_proto;
+ hdrsize = sizeof(_eth);
+ break;
+ case htons(ETH_P_8021Q):
+ veth = skb_header_pointer(pkt->skb, off, sizeof(_veth), &_veth);
+ if (!eth)
+ return -1;
+
+ outer_llproto = veth->h_vlan_encapsulated_proto;
+ llproto = veth->h_vlan_proto;
+ hdrsize = sizeof(_veth);
+ break;
+ default:
+ return -1;
+ }
+
+ ctx->inner_lloff = off;
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_LL;
+ off += hdrsize;
+ } else {
+ struct iphdr *iph;
+ u32 _version;
+
+ iph = skb_header_pointer(pkt->skb, off, sizeof(_version), &_version);
+ if (!iph)
+ return -1;
+
+ switch (iph->version) {
+ case 4:
+ llproto = htons(ETH_P_IP);
+ break;
+ case 6:
+ llproto = htons(ETH_P_IPV6);
+ break;
+ default:
+ return -1;
+ }
+ }
+
+ ctx->llproto = llproto;
+ if (llproto == htons(ETH_P_8021Q))
+ llproto = outer_llproto;
+
+ nhoff = off;
+
+ switch (llproto) {
+ case htons(ETH_P_IP): {
+ struct iphdr *iph, _iph;
+
+ iph = skb_header_pointer(pkt->skb, nhoff, sizeof(_iph), &_iph);
+ if (!iph)
+ return -1;
+
+ if (iph->ihl < 5 || iph->version != 4)
+ return -1;
+
+ ctx->inner_nhoff = nhoff;
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_NH;
+
+ thoff = nhoff + (iph->ihl * 4);
+ if ((ntohs(iph->frag_off) & IP_OFFSET) == 0) {
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_TH;
+ ctx->inner_thoff = thoff;
+ ctx->l4proto = iph->protocol;
+ }
+ }
+ break;
+ case htons(ETH_P_IPV6): {
+ struct ipv6hdr *ip6h, _ip6h;
+ int fh_flags = IP6_FH_F_AUTH;
+ unsigned short fragoff;
+ int l4proto;
+
+ ip6h = skb_header_pointer(pkt->skb, nhoff, sizeof(_ip6h), &_ip6h);
+ if (!ip6h)
+ return -1;
+
+ if (ip6h->version != 6)
+ return -1;
+
+ ctx->inner_nhoff = nhoff;
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_NH;
+
+ thoff = nhoff;
+ l4proto = ipv6_find_hdr(pkt->skb, &thoff, -1, &fragoff, &fh_flags);
+ if (l4proto < 0 || thoff > U16_MAX)
+ return -1;
+
+ if (fragoff == 0) {
+ thoff = nhoff + sizeof(_ip6h);
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_TH;
+ ctx->inner_thoff = thoff;
+ ctx->l4proto = l4proto;
+ }
+ }
+ break;
+ default:
+ return -1;
+ }
+
+ return 0;
+}
+
+static int nft_inner_parse_tunhdr(const struct nft_inner *priv,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *ctx, u32 *off)
+{
+ if (pkt->tprot != IPPROTO_UDP ||
+ pkt->tprot != IPPROTO_GRE)
+ return -1;
+
+ ctx->inner_tunoff = *off;
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_TUN;
+ *off += priv->hdrsize;
+
+ return 0;
+}
+
+static int nft_inner_parse(const struct nft_inner *priv,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *tun_ctx)
+{
+ struct nft_inner_tun_ctx ctx = {};
+ u32 off = pkt->inneroff;
+
+ if (priv->flags & NFT_INNER_HDRSIZE &&
+ nft_inner_parse_tunhdr(priv, pkt, &ctx, &off) < 0)
+ return -1;
+
+ if (priv->flags & (NFT_INNER_LL | NFT_INNER_NH)) {
+ if (nft_inner_parse_l2l3(priv, pkt, &ctx, off) < 0)
+ return -1;
+ } else if (priv->flags & NFT_INNER_TH) {
+ ctx.inner_thoff = off;
+ ctx.flags |= NFT_PAYLOAD_CTX_INNER_TH;
+ }
+
+ *tun_ctx = ctx;
+
+ return 0;
+}
+
+static void nft_inner_eval(const struct nft_expr *expr, struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ const struct nft_inner *priv = nft_expr_priv(expr);
+ struct nft_inner_tun_ctx tun_ctx = {};
+
+ if (nft_payload_inner_offset(pkt) < 0)
+ goto err;
+
+ if (nft_inner_parse(priv, pkt, &tun_ctx) < 0)
+ goto err;
+
+ switch (priv->expr_type) {
+ case NFT_INNER_EXPR_PAYLOAD:
+ nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, &tun_ctx);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ goto err;
+ }
+ return;
+err:
+ regs->verdict.code = NFT_BREAK;
+}
+
+static const struct nla_policy nft_inner_policy[NFTA_INNER_MAX + 1] = {
+ [NFTA_INNER_NUM] = { .type = NLA_U32 },
+ [NFTA_INNER_FLAGS] = { .type = NLA_U32 },
+ [NFTA_INNER_HDRSIZE] = { .type = NLA_U32 },
+ [NFTA_INNER_TYPE] = { .type = NLA_U32 },
+ [NFTA_INNER_EXPR] = { .type = NLA_NESTED },
+};
+
+struct nft_expr_info {
+ const struct nft_expr_ops *ops;
+ const struct nlattr *attr;
+ struct nlattr *tb[NFT_EXPR_MAXATTR + 1];
+};
+
+static int nft_inner_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_inner *priv = nft_expr_priv(expr);
+ u32 flags, hdrsize, type, num;
+ struct nft_expr_info expr_info;
+ int err;
+
+ if (!tb[NFTA_INNER_FLAGS] ||
+ !tb[NFTA_INNER_HDRSIZE] ||
+ !tb[NFTA_INNER_TYPE] ||
+ !tb[NFTA_INNER_EXPR])
+ return -EINVAL;
+
+ flags = ntohl(nla_get_be32(tb[NFTA_INNER_FLAGS]));
+ if (flags & ~NFT_INNER_MASK)
+ return -EOPNOTSUPP;
+
+ num = ntohl(nla_get_be32(tb[NFTA_INNER_NUM]));
+ if (num != 0)
+ return -EOPNOTSUPP;
+
+ hdrsize = ntohl(nla_get_be32(tb[NFTA_INNER_HDRSIZE]));
+ type = ntohl(nla_get_be32(tb[NFTA_INNER_TYPE]));
+
+ if (type > U8_MAX)
+ return -EINVAL;
+
+ if (flags & NFT_INNER_HDRSIZE) {
+ if (hdrsize == 0 || hdrsize > 64)
+ return -EOPNOTSUPP;
+ }
+
+ priv->flags = flags;
+ priv->hdrsize = hdrsize;
+ priv->type = type;
+
+ err = nft_expr_inner_parse(ctx, tb[NFTA_INNER_EXPR], &expr_info);
+ if (err < 0)
+ return err;
+
+ priv->expr.ops = expr_info.ops;
+
+ if (!strcmp(expr_info.ops->type->name, "payload"))
+ priv->expr_type = NFT_INNER_EXPR_PAYLOAD;
+ else
+ return -EINVAL;
+
+ err = expr_info.ops->init(ctx, (struct nft_expr *)&priv->expr,
+ (const struct nlattr * const*)expr_info.tb);
+ if (err < 0)
+ return err;
+
+ return 0;
+}
+
+static int nft_inner_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ const struct nft_inner *priv = nft_expr_priv(expr);
+
+ if (nla_put_be32(skb, NFTA_INNER_NUM, htonl(0)) ||
+ nla_put_be32(skb, NFTA_INNER_TYPE, htonl(priv->type)) ||
+ nla_put_be32(skb, NFTA_INNER_FLAGS, htonl(priv->flags)) ||
+ nla_put_be32(skb, NFTA_INNER_HDRSIZE, htonl(priv->hdrsize)))
+ goto nla_put_failure;
+
+ if (nft_expr_dump(skb, NFTA_INNER_EXPR,
+ (struct nft_expr *)&priv->expr) < 0)
+ goto nla_put_failure;
+
+ return 0;
+
+nla_put_failure:
+ return -1;
+}
+
+static const struct nft_expr_ops nft_inner_ops = {
+ .type = &nft_inner_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_inner)),
+ .eval = nft_inner_eval,
+ .init = nft_inner_init,
+ .dump = nft_inner_dump,
+};
+
+struct nft_expr_type nft_inner_type __read_mostly = {
+ .name = "inner",
+ .ops = &nft_inner_ops,
+ .policy = nft_inner_policy,
+ .maxattr = NFTA_INNER_MAX,
+ .owner = THIS_MODULE,
+};
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 84b490d6cc75..9d2ac764a14c 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -144,7 +144,7 @@ static int __nft_payload_inner_offset(struct nft_pktinfo *pkt)
return 0;
}
-static int nft_payload_inner_offset(const struct nft_pktinfo *pkt)
+int nft_payload_inner_offset(const struct nft_pktinfo *pkt)
{
if (!(pkt->flags & NFT_PKTINFO_INNER) &&
__nft_payload_inner_offset((struct nft_pktinfo *)pkt) < 0)
@@ -587,6 +587,92 @@ const struct nft_expr_ops nft_payload_fast_ops = {
.offload = nft_payload_offload,
};
+void nft_payload_inner_eval(const struct nft_expr *expr, struct nft_regs *regs,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *tun_ctx)
+{
+ const struct nft_payload *priv = nft_expr_priv(expr);
+ const struct sk_buff *skb = pkt->skb;
+ u32 *dest = ®s->data[priv->dreg];
+ int offset;
+
+ if (priv->len % NFT_REG32_SIZE)
+ dest[priv->len / NFT_REG32_SIZE] = 0;
+
+ switch (priv->base) {
+ case NFT_PAYLOAD_TUN_HEADER:
+ if (!(tun_ctx->flags & NFT_PAYLOAD_CTX_INNER_TUN))
+ goto err;
+
+ offset = tun_ctx->inner_tunoff;
+ break;
+ case NFT_PAYLOAD_LL_HEADER:
+ if (!(tun_ctx->flags & NFT_PAYLOAD_CTX_INNER_LL))
+ goto err;
+
+ offset = tun_ctx->inner_lloff;
+ break;
+ case NFT_PAYLOAD_NETWORK_HEADER:
+ if (!(tun_ctx->flags & NFT_PAYLOAD_CTX_INNER_NH))
+ goto err;
+
+ offset = tun_ctx->inner_nhoff;
+ break;
+ case NFT_PAYLOAD_TRANSPORT_HEADER:
+ if (!(tun_ctx->flags & NFT_PAYLOAD_CTX_INNER_TH))
+ goto err;
+
+ offset = tun_ctx->inner_thoff;
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ goto err;
+ }
+ offset += priv->offset;
+
+ if (skb_copy_bits(skb, offset, dest, priv->len) < 0)
+ goto err;
+
+ return;
+err:
+ regs->verdict.code = NFT_BREAK;
+}
+
+static int nft_payload_inner_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_payload *priv = nft_expr_priv(expr);
+ u32 base;
+
+ base = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
+ switch (base) {
+ case NFT_PAYLOAD_TUN_HEADER:
+ case NFT_PAYLOAD_LL_HEADER:
+ case NFT_PAYLOAD_NETWORK_HEADER:
+ case NFT_PAYLOAD_TRANSPORT_HEADER:
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ priv->base = base;
+ priv->offset = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
+ priv->len = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
+
+ return nft_parse_register_store(ctx, tb[NFTA_PAYLOAD_DREG],
+ &priv->dreg, NULL, NFT_DATA_VALUE,
+ priv->len);
+}
+
+static const struct nft_expr_ops nft_payload_inner_ops = {
+ .type = &nft_payload_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_payload)),
+ .init = nft_payload_inner_init,
+ .dump = nft_payload_dump,
+ /* direct call to nft_payload_inner_eval(). */
+};
+
static inline void nft_csum_replace(__sum16 *sum, __wsum fsum, __wsum tsum)
{
*sum = csum_fold(csum_add(csum_sub(~csum_unfold(*sum), fsum), tsum));
@@ -930,6 +1016,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
struct nft_expr_type nft_payload_type __read_mostly = {
.name = "payload",
.select_ops = nft_payload_select_ops,
+ .inner_ops = &nft_payload_inner_ops,
.policy = nft_payload_policy,
.maxattr = NFTA_PAYLOAD_MAX,
.owner = THIS_MODULE,
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 07/10] netfilter: nft_inner: add percpu inner context
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (5 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 06/10] netfilter: nft_inner: support for inner tunnel header matching Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 08/10] netfilter: nft_meta: add inner match support Pablo Neira Ayuso
` (2 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Add NFT_PKTINFO_INNER_FULL flag to annotate that inner offsets are
available. Store nft_inner_tun_ctx object in percpu area to cache
existing inner offsets for this skbuff.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables.h | 1 +
include/net/netfilter/nf_tables_core.h | 1 +
net/netfilter/nft_inner.c | 26 ++++++++++++++++++++++----
3 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 2dbfe7524a7e..38e2b396e38a 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -24,6 +24,7 @@ struct module;
enum {
NFT_PKTINFO_L4PROTO = (1 << 0),
NFT_PKTINFO_INNER = (1 << 1),
+ NFT_PKTINFO_INNER_FULL = (1 << 2),
};
struct nft_pktinfo {
diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index be2b2b5d0a52..3e825381ac5c 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -149,6 +149,7 @@ enum {
};
struct nft_inner_tun_ctx {
+ u16 type;
u16 inner_tunoff;
u16 inner_lloff;
u16 inner_nhoff;
diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
index 1e4079b5b431..29f2eefe0357 100644
--- a/net/netfilter/nft_inner.c
+++ b/net/netfilter/nft_inner.c
@@ -21,6 +21,8 @@
#include <linux/ip.h>
#include <linux/ipv6.h>
+static DEFINE_PER_CPU(struct nft_inner_tun_ctx, nft_pcpu_tun_ctx);
+
/* Same layout as nft_expr but it embeds the private expression data area. */
struct __nft_expr {
const struct nft_expr_ops *ops;
@@ -180,7 +182,7 @@ static int nft_inner_parse_tunhdr(const struct nft_inner *priv,
}
static int nft_inner_parse(const struct nft_inner *priv,
- const struct nft_pktinfo *pkt,
+ struct nft_pktinfo *pkt,
struct nft_inner_tun_ctx *tun_ctx)
{
struct nft_inner_tun_ctx ctx = {};
@@ -199,25 +201,41 @@ static int nft_inner_parse(const struct nft_inner *priv,
}
*tun_ctx = ctx;
+ tun_ctx->type = priv->type;
+ pkt->flags |= NFT_PKTINFO_INNER_FULL;
return 0;
}
+static bool nft_inner_parse_needed(const struct nft_inner *priv,
+ const struct nft_pktinfo *pkt,
+ const struct nft_inner_tun_ctx *tun_ctx)
+{
+ if (!(pkt->flags & NFT_PKTINFO_INNER_FULL))
+ return true;
+
+ if (priv->type != tun_ctx->type)
+ return true;
+
+ return false;
+}
+
static void nft_inner_eval(const struct nft_expr *expr, struct nft_regs *regs,
const struct nft_pktinfo *pkt)
{
+ struct nft_inner_tun_ctx *tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx);
const struct nft_inner *priv = nft_expr_priv(expr);
- struct nft_inner_tun_ctx tun_ctx = {};
if (nft_payload_inner_offset(pkt) < 0)
goto err;
- if (nft_inner_parse(priv, pkt, &tun_ctx) < 0)
+ if (nft_inner_parse_needed(priv, pkt, tun_ctx) &&
+ nft_inner_parse(priv, (struct nft_pktinfo *)pkt, tun_ctx) < 0)
goto err;
switch (priv->expr_type) {
case NFT_INNER_EXPR_PAYLOAD:
- nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, &tun_ctx);
+ nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, tun_ctx);
break;
default:
WARN_ON_ONCE(1);
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 08/10] netfilter: nft_meta: add inner match support
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (6 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 07/10] netfilter: nft_inner: add percpu inner context Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 09/10] netfilter: nft_inner: add geneve support Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 10/10] netfilter: nft_inner: set tunnel offset to GRE header offset Pablo Neira Ayuso
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Add support for inner meta matching on:
- NFT_META_PROTOCOL: to match on the ethertype, this can be used
regardless tunnel protocol provides no link layer header, in that case
nft_inner sets on the ethertype based on the IP header version field.
- NFT_META_L4PROTO: to match on the layer 4 protocol.
These meta expression are usually autogenerated as dependencies by
userspace nftables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nft_meta.h | 6 ++++
net/netfilter/nft_inner.c | 8 +++++
net/netfilter/nft_meta.c | 62 ++++++++++++++++++++++++++++++++
3 files changed, 76 insertions(+)
diff --git a/include/net/netfilter/nft_meta.h b/include/net/netfilter/nft_meta.h
index 9b51cc67de54..f3a5285a511c 100644
--- a/include/net/netfilter/nft_meta.h
+++ b/include/net/netfilter/nft_meta.h
@@ -46,4 +46,10 @@ int nft_meta_set_validate(const struct nft_ctx *ctx,
bool nft_meta_get_reduce(struct nft_regs_track *track,
const struct nft_expr *expr);
+
+struct nft_inner_tun_ctx;
+void nft_meta_inner_eval(const struct nft_expr *expr,
+ struct nft_regs *regs, const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *tun_ctx);
+
#endif
diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
index 29f2eefe0357..c43a2fe0ceb7 100644
--- a/net/netfilter/nft_inner.c
+++ b/net/netfilter/nft_inner.c
@@ -12,6 +12,7 @@
#include <linux/netfilter/nf_tables.h>
#include <net/netfilter/nf_tables_core.h>
#include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nft_meta.h>
#include <net/netfilter/nf_tables_offload.h>
#include <linux/tcp.h>
#include <linux/udp.h>
@@ -28,11 +29,13 @@ struct __nft_expr {
const struct nft_expr_ops *ops;
union {
struct nft_payload payload;
+ struct nft_meta meta;
} __attribute__((aligned(__alignof__(u64))));
};
enum {
NFT_INNER_EXPR_PAYLOAD,
+ NFT_INNER_EXPR_META,
};
struct nft_inner {
@@ -237,6 +240,9 @@ static void nft_inner_eval(const struct nft_expr *expr, struct nft_regs *regs,
case NFT_INNER_EXPR_PAYLOAD:
nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, tun_ctx);
break;
+ case NFT_INNER_EXPR_META:
+ nft_meta_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, tun_ctx);
+ break;
default:
WARN_ON_ONCE(1);
goto err;
@@ -306,6 +312,8 @@ static int nft_inner_init(const struct nft_ctx *ctx,
if (!strcmp(expr_info.ops->type->name, "payload"))
priv->expr_type = NFT_INNER_EXPR_PAYLOAD;
+ else if (!strcmp(expr_info.ops->type->name, "meta"))
+ priv->expr_type = NFT_INNER_EXPR_META;
else
return -EINVAL;
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 55d2d49c3425..8c39adeebb5c 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -831,9 +831,71 @@ nft_meta_select_ops(const struct nft_ctx *ctx,
return ERR_PTR(-EINVAL);
}
+static int nft_meta_inner_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_meta *priv = nft_expr_priv(expr);
+ unsigned int len;
+
+ priv->key = ntohl(nla_get_be32(tb[NFTA_META_KEY]));
+ switch (priv->key) {
+ case NFT_META_PROTOCOL:
+ len = sizeof(u16);
+ break;
+ case NFT_META_L4PROTO:
+ len = sizeof(u32);
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+ priv->len = len;
+
+ return nft_parse_register_store(ctx, tb[NFTA_META_DREG], &priv->dreg,
+ NULL, NFT_DATA_VALUE, len);
+}
+
+void nft_meta_inner_eval(const struct nft_expr *expr,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt,
+ struct nft_inner_tun_ctx *tun_ctx)
+{
+ const struct nft_meta *priv = nft_expr_priv(expr);
+ u32 *dest = ®s->data[priv->dreg];
+
+ switch (priv->key) {
+ case NFT_META_PROTOCOL:
+ nft_reg_store16(dest, (__force u16)tun_ctx->llproto);
+ break;
+ case NFT_META_L4PROTO:
+ if (!(tun_ctx->flags & NFT_PAYLOAD_CTX_INNER_TH))
+ goto err;
+
+ nft_reg_store8(dest, tun_ctx->l4proto);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ goto err;
+ }
+ return;
+
+err:
+ regs->verdict.code = NFT_BREAK;
+}
+EXPORT_SYMBOL_GPL(nft_meta_inner_eval);
+
+static const struct nft_expr_ops nft_meta_inner_ops = {
+ .type = &nft_meta_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_meta)),
+ .init = nft_meta_inner_init,
+ .dump = nft_meta_get_dump,
+ /* direct call to nft_meta_inner_eval(). */
+};
+
struct nft_expr_type nft_meta_type __read_mostly = {
.name = "meta",
.select_ops = nft_meta_select_ops,
+ .inner_ops = &nft_meta_inner_ops,
.policy = nft_meta_policy,
.maxattr = NFTA_META_MAX,
.owner = THIS_MODULE,
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 09/10] netfilter: nft_inner: add geneve support
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (7 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 08/10] netfilter: nft_meta: add inner match support Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
2022-10-26 13:22 ` [PATCH net-next 10/10] netfilter: nft_inner: set tunnel offset to GRE header offset Pablo Neira Ayuso
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Geneve tunnel header may contain options, parse geneve header and update
offset to point to the link layer header according to the opt_len field.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/uapi/linux/netfilter/nf_tables.h | 1 +
net/netfilter/nft_inner.c | 17 +++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 05a15dce8271..e4b739d57480 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -783,6 +783,7 @@ enum nft_payload_csum_flags {
enum nft_inner_type {
NFT_INNER_UNSPEC = 0,
NFT_INNER_VXLAN,
+ NFT_INNER_GENEVE,
};
enum nft_inner_flags {
diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
index c43a2fe0ceb7..19fdc8c70cd1 100644
--- a/net/netfilter/nft_inner.c
+++ b/net/netfilter/nft_inner.c
@@ -17,6 +17,7 @@
#include <linux/tcp.h>
#include <linux/udp.h>
#include <net/gre.h>
+#include <net/geneve.h>
#include <net/ip.h>
#include <linux/icmpv6.h>
#include <linux/ip.h>
@@ -181,6 +182,22 @@ static int nft_inner_parse_tunhdr(const struct nft_inner *priv,
ctx->flags |= NFT_PAYLOAD_CTX_INNER_TUN;
*off += priv->hdrsize;
+ switch (priv->type) {
+ case NFT_INNER_GENEVE: {
+ struct genevehdr *gnvh, _gnvh;
+
+ gnvh = skb_header_pointer(pkt->skb, pkt->inneroff,
+ sizeof(_gnvh), &_gnvh);
+ if (!gnvh)
+ return -1;
+
+ *off += gnvh->opt_len * 4;
+ }
+ break;
+ default:
+ break;
+ }
+
return 0;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 10/10] netfilter: nft_inner: set tunnel offset to GRE header offset
2022-10-26 13:22 [PATCH net-next 00/10] Netfilter updates for net-next Pablo Neira Ayuso
` (8 preceding siblings ...)
2022-10-26 13:22 ` [PATCH net-next 09/10] netfilter: nft_inner: add geneve support Pablo Neira Ayuso
@ 2022-10-26 13:22 ` Pablo Neira Ayuso
9 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2022-10-26 13:22 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet
Set inner tunnel offset to the GRE header, this is redundant to existing
transport header offset, but this normalizes the handling of the tunnel
header regardless its location in the layering. GRE version 0 is overloaded
with RFCs, the type decorator in the inner expression might also be useful
to interpret matching fields from the netlink delinearize path in userspace.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_inner.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
index 19fdc8c70cd1..eae7caeff316 100644
--- a/net/netfilter/nft_inner.c
+++ b/net/netfilter/nft_inner.c
@@ -174,8 +174,13 @@ static int nft_inner_parse_tunhdr(const struct nft_inner *priv,
const struct nft_pktinfo *pkt,
struct nft_inner_tun_ctx *ctx, u32 *off)
{
- if (pkt->tprot != IPPROTO_UDP ||
- pkt->tprot != IPPROTO_GRE)
+ if (pkt->tprot == IPPROTO_GRE) {
+ ctx->inner_tunoff = pkt->thoff;
+ ctx->flags |= NFT_PAYLOAD_CTX_INNER_TUN;
+ return 0;
+ }
+
+ if (pkt->tprot != IPPROTO_UDP)
return -1;
ctx->inner_tunoff = *off;
--
2.30.2
^ permalink raw reply related [flat|nested] 16+ messages in thread