* [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup for non-first fragments
@ 2026-04-21 10:44 Fernando Fernandez Mancera
  2026-04-21 10:44 ` [PATCH 2/3 nf v3] netfilter: nf_tables: skip L4 header parsing " Fernando Fernandez Mancera
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-21 10:44 UTC (permalink / raw)
  To: netfilter-devel
  Cc: coreteam, ecklm94, phil, fw, pablo, Fernando Fernandez Mancera

Both nft_socket and xt_socket rely on L4 headers to perform socket
lookup in the slow path. For fragmented packets, while the IP protocol
remains constant across all fragments, only the first fragment contains
the actual L4 header.

As the expression/match could be attached to a chain with a priority
lower than -400, it could bypass defragmentation.

Add a check for fragmentation in the lookup functions directly so the
problem is handled for both nft_socket and xt_socket at the same time.
In addition, future users of the functions would not need to care about
this.

Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
v3: added this patch to the series, I split this out as the fix is
generic for both nft_socket and xt_socket
---
 net/ipv4/netfilter/nf_socket_ipv4.c | 3 +++
 net/ipv6/netfilter/nf_socket_ipv6.c | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c
index 5080fa5fbf6a..f9c6755f5ec5 100644
--- a/net/ipv4/netfilter/nf_socket_ipv4.c
+++ b/net/ipv4/netfilter/nf_socket_ipv4.c
@@ -94,6 +94,9 @@ struct sock *nf_sk_lookup_slow_v4(struct net *net, const struct sk_buff *skb,
 #endif
 	int doff = 0;
 
+	if (ntohs(iph->frag_off) & IP_OFFSET)
+		return NULL;
+
 	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
 		struct tcphdr _hdr;
 		struct udphdr *hp;
diff --git a/net/ipv6/netfilter/nf_socket_ipv6.c b/net/ipv6/netfilter/nf_socket_ipv6.c
index ced8bd44828e..893f2aeb4711 100644
--- a/net/ipv6/netfilter/nf_socket_ipv6.c
+++ b/net/ipv6/netfilter/nf_socket_ipv6.c
@@ -100,6 +100,7 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb,
 	const struct in6_addr *daddr = NULL, *saddr = NULL;
 	struct ipv6hdr *iph = ipv6_hdr(skb), ipv6_var;
 	struct sk_buff *data_skb = NULL;
+	unsigned short fragoff = 0;
 	int doff = 0;
 	int thoff = 0, tproto;
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
@@ -107,8 +108,8 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb,
 	struct nf_conn const *ct;
 #endif
 
-	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
-	if (tproto < 0) {
+	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
+	if (tproto < 0 || fragoff) {
 		pr_debug("unable to find transport header in IPv6 packet, dropping\n");
 		return NULL;
 	}
-- 
2.53.0
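
A minimal userspace sketch of the IPv4 test used in this patch, for readers who
want to see which bits are involved. It assumes the usual IPv4 header layout,
where the low 13 bits of frag_off hold the fragment offset. The SKETCH_* names
and the helper are illustrative only and are not part of the patch.

```c
#include <arpa/inet.h>  /* htons(), ntohs() */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the kernel's IP_MF and IP_OFFSET, renamed for this sketch. */
#define SKETCH_IP_MF     0x2000u /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1FFFu /* 13-bit fragment offset, in 8-byte units */

/*
 * frag_off_be is the header field as it appears on the wire (big endian).
 * A non-zero offset means this is not the first fragment, so no L4 header
 * is present at the start of the payload.
 */
static bool sketch_is_non_first_fragment(uint16_t frag_off_be)
{
	return (ntohs(frag_off_be) & SKETCH_IP_OFFSET) != 0;
}
```

Note that the first fragment (offset 0, more-fragments flag set) still passes
this test, which matches the intent of the patch: it carries the L4 header.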



* [PATCH 2/3 nf v3] netfilter: nf_tables: skip L4 header parsing for non-first fragments
  2026-04-21 10:44 [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup for non-first fragments Fernando Fernandez Mancera
@ 2026-04-21 10:44 ` Fernando Fernandez Mancera
  2026-04-21 10:44 ` [PATCH 3/3 nf v3] netfilter: xtables: fix " Fernando Fernandez Mancera
  2026-04-21 11:33 ` [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup " Pablo Neira Ayuso
  2 siblings, 0 replies; 7+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-21 10:44 UTC (permalink / raw)
  To: netfilter-devel
  Cc: coreteam, ecklm94, phil, fw, pablo, Fernando Fernandez Mancera

The tproxy, osf and exthdr (SCTP) expressions rely on the presence of
transport layer headers to perform socket lookups, fingerprint matching,
or chunk extraction. For fragmented packets, while the IP protocol
remains constant across all fragments, only the first fragment contains
the actual L4 header.

The expressions could be attached to a chain with a priority lower than
-400, bypassing defragmentation, or could be used in stateless
environments where defragmentation does not happen at all. This could
result in garbage data being used for the matching.

Add a check for pkt->fragoff so that only unfragmented packets or the
first fragment are processed.

Fixes: 133dc203d77d ("netfilter: nft_exthdr: Support SCTP chunks")
Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
v2: handled fragmented packets for socket expression too,
squashed nftables expression commits into this one.
v3: removed changes to nft_socket and created a generic solution for
xt/nft
---
 net/netfilter/nft_exthdr.c | 2 +-
 net/netfilter/nft_osf.c    | 2 +-
 net/netfilter/nft_tproxy.c | 8 ++++----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 0407d6f708ae..e6a07c0df207 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -376,7 +376,7 @@ static void nft_exthdr_sctp_eval(const struct nft_expr *expr,
 	const struct sctp_chunkhdr *sch;
 	struct sctp_chunkhdr _sch;
 
-	if (pkt->tprot != IPPROTO_SCTP)
+	if (pkt->tprot != IPPROTO_SCTP || pkt->fragoff)
 		goto err;
 
 	do {
diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c
index 18003433476c..966c7745c423 100644
--- a/net/netfilter/nft_osf.c
+++ b/net/netfilter/nft_osf.c
@@ -28,7 +28,7 @@ static void nft_osf_eval(const struct nft_expr *expr, struct nft_regs *regs,
 	struct nf_osf_data data;
 	struct tcphdr _tcph;
 
-	if (pkt->tprot != IPPROTO_TCP) {
+	if (pkt->tprot != IPPROTO_TCP || pkt->fragoff) {
 		regs->verdict.code = NFT_BREAK;
 		return;
 	}
diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c
index f2101af8c867..89be443734f6 100644
--- a/net/netfilter/nft_tproxy.c
+++ b/net/netfilter/nft_tproxy.c
@@ -30,8 +30,8 @@ static void nft_tproxy_eval_v4(const struct nft_expr *expr,
 	__be16 tport = 0;
 	struct sock *sk;
 
-	if (pkt->tprot != IPPROTO_TCP &&
-	    pkt->tprot != IPPROTO_UDP) {
+	if ((pkt->tprot != IPPROTO_TCP &&
+	     pkt->tprot != IPPROTO_UDP) || pkt->fragoff) {
 		regs->verdict.code = NFT_BREAK;
 		return;
 	}
@@ -97,8 +97,8 @@ static void nft_tproxy_eval_v6(const struct nft_expr *expr,
 
 	memset(&taddr, 0, sizeof(taddr));
 
-	if (pkt->tprot != IPPROTO_TCP &&
-	    pkt->tprot != IPPROTO_UDP) {
+	if ((pkt->tprot != IPPROTO_TCP &&
+	     pkt->tprot != IPPROTO_UDP) || pkt->fragoff) {
 		regs->verdict.code = NFT_BREAK;
 		return;
 	}
-- 
2.53.0
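
The condition added to the nft_tproxy evaluators can be modelled in userspace
as below. This is only a sketch: struct sketch_pktinfo is a hypothetical
stand-in for the two nft_pktinfo fields the check reads, and the helper name
is invented for illustration.

```c
#include <assert.h>
#include <netinet/in.h>  /* IPPROTO_TCP, IPPROTO_UDP */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of nft_pktinfo used by the check. */
struct sketch_pktinfo {
	uint8_t  tprot;   /* transport protocol number */
	uint16_t fragoff; /* non-zero for every fragment after the first */
};

/* Returns true only when an L4 header of the expected type can be present. */
static bool sketch_l4_header_usable(const struct sketch_pktinfo *pkt)
{
	if ((pkt->tprot != IPPROTO_TCP &&
	     pkt->tprot != IPPROTO_UDP) || pkt->fragoff)
		return false;
	return true;
}
```

The protocol test alone is not enough, because every fragment reports the
same protocol number; the fragoff term is what rejects later fragments.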



* [PATCH 3/3 nf v3] netfilter: xtables: fix L4 header parsing for non-first fragments
  2026-04-21 10:44 [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup for non-first fragments Fernando Fernandez Mancera
  2026-04-21 10:44 ` [PATCH 2/3 nf v3] netfilter: nf_tables: skip L4 header parsing " Fernando Fernandez Mancera
@ 2026-04-21 10:44 ` Fernando Fernandez Mancera
  2026-04-24 11:38   ` Pablo Neira Ayuso
  2026-04-21 11:33 ` [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup " Pablo Neira Ayuso
  2 siblings, 1 reply; 7+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-21 10:44 UTC (permalink / raw)
  To: netfilter-devel
  Cc: coreteam, ecklm94, phil, fw, pablo, Fernando Fernandez Mancera

Multiple targets and matches rely on the L4 header to operate. For
fragmented packets, every fragment carries the transport protocol
identifier, but only the first fragment contains the L4 header.

As the 'raw' table can be configured to run at priority -450 (before
defragmentation at -400), the target/match can be reached before
reassembly. In this case, non-first fragments have their payload
incorrectly parsed as a TCP/UDP header. This would of course be a
misconfiguration scenario. In most cases this just leads to
unreliable behavior for fragmented traffic.

Add a fragment check to ensure target/match only evaluates unfragmented
packets or the first fragment in the stream.

Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
v2: handled ecn, socket and tcpmss matches
v3: extracted socket to its own patch with a generic solution for
nft/xt, added a comment specifying that par->fragoff is fine for
ecn/tcpmss ipv6 as they enforce -p tcp. Keep in mind that osf only
supports ipv4.
---
 net/netfilter/xt_TPROXY.c | 11 +++++++++--
 net/netfilter/xt_ecn.c    |  4 ++++
 net/netfilter/xt_osf.c    |  3 +++
 net/netfilter/xt_tcpmss.c |  4 ++++
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index e4bea1d346cf..5f60e7298a1e 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -86,6 +86,9 @@ tproxy_tg4_v0(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct xt_tproxy_target_info *tgi = par->targinfo;
 
+	if (par->fragoff)
+		return NF_DROP;
+
 	return tproxy_tg4(xt_net(par), skb, tgi->laddr, tgi->lport,
 			  tgi->mark_mask, tgi->mark_value);
 }
@@ -95,6 +98,9 @@ tproxy_tg4_v1(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
 
+	if (par->fragoff)
+		return NF_DROP;
+
 	return tproxy_tg4(xt_net(par), skb, tgi->laddr.ip, tgi->lport,
 			  tgi->mark_mask, tgi->mark_value);
 }
@@ -106,6 +112,7 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
 	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
+	unsigned short fragoff = 0;
 	struct udphdr _hdr, *hp;
 	struct sock *sk;
 	const struct in6_addr *laddr;
@@ -113,8 +120,8 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
 	int thoff = 0;
 	int tproto;
 
-	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
-	if (tproto < 0)
+	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
+	if (tproto < 0 || fragoff)
 		return NF_DROP;
 
 	hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr);
diff --git a/net/netfilter/xt_ecn.c b/net/netfilter/xt_ecn.c
index b96e8203ac54..a8503f5d26bf 100644
--- a/net/netfilter/xt_ecn.c
+++ b/net/netfilter/xt_ecn.c
@@ -30,6 +30,10 @@ static bool match_tcp(const struct sk_buff *skb, struct xt_action_param *par)
 	struct tcphdr _tcph;
 	const struct tcphdr *th;
 
+	/* this is fine for IPv6 as ecn_mt_check6() enforces -p tcp */
+	if (par->fragoff)
+		return false;
+
 	/* In practice, TCP match does this, so can't fail.  But let's
 	 * be good citizens.
 	 */
diff --git a/net/netfilter/xt_osf.c b/net/netfilter/xt_osf.c
index dc9485854002..e8807caede68 100644
--- a/net/netfilter/xt_osf.c
+++ b/net/netfilter/xt_osf.c
@@ -27,6 +27,9 @@
 static bool
 xt_osf_match_packet(const struct sk_buff *skb, struct xt_action_param *p)
 {
+	if (p->fragoff)
+		return false;
+
 	return nf_osf_match(skb, xt_family(p), xt_hooknum(p), xt_in(p),
 			    xt_out(p), p->matchinfo, xt_net(p), nf_osf_fingers);
 }
diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c
index 0d32d4841cb3..b9da8269161d 100644
--- a/net/netfilter/xt_tcpmss.c
+++ b/net/netfilter/xt_tcpmss.c
@@ -32,6 +32,10 @@ tcpmss_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	u8 _opt[15 * 4 - sizeof(_tcph)];
 	unsigned int i, optlen;
 
+	/* this is fine for IPv6 as xt_tcpmss enforces -p tcp */
+	if (par->fragoff)
+		return false;
+
 	/* If we don't have the whole header, drop packet. */
 	th = skb_header_pointer(skb, par->thoff, sizeof(_tcph), &_tcph);
 	if (th == NULL)
-- 
2.53.0
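
To illustrate what "payload incorrectly parsed as a TCP/UDP header" means in
practice, here is a small userspace sketch. It reads the first four bytes at
the transport offset as source and destination ports, once without and once
with a fragoff guard. All names are hypothetical and the byte buffers are
made-up examples, not captured traffic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret the first four bytes as ports in network byte order. */
static void sketch_ports_unchecked(const uint8_t *l4,
				   uint16_t *sport, uint16_t *dport)
{
	*sport = (uint16_t)((l4[0] << 8) | l4[1]);
	*dport = (uint16_t)((l4[2] << 8) | l4[3]);
}

/* Guarded variant: refuse to parse non-first fragments or short buffers. */
static bool sketch_ports_checked(const uint8_t *l4, size_t len,
				 uint16_t fragoff,
				 uint16_t *sport, uint16_t *dport)
{
	if (fragoff || len < 4)
		return false;
	sketch_ports_unchecked(l4, sport, dport);
	return true;
}
```

With a later fragment whose payload happens to start with the bytes "HTTP",
the unchecked reader reports ports 0x4854 and 0x5450, which are just ASCII
text. The guarded reader returns false and leaves the outputs untouched.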



* Re: [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup for non-first fragments
  2026-04-21 10:44 [PATCH 1/3 nf v3] netfilter: nf_socket: skip socket lookup for non-first fragments Fernando Fernandez Mancera
  2026-04-21 10:44 ` [PATCH 2/3 nf v3] netfilter: nf_tables: skip L4 header parsing " Fernando Fernandez Mancera
  2026-04-21 10:44 ` [PATCH 3/3 nf v3] netfilter: xtables: fix " Fernando Fernandez Mancera
@ 2026-04-21 11:33 ` Pablo Neira Ayuso
  2 siblings, 0 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2026-04-21 11:33 UTC (permalink / raw)
  To: Fernando Fernandez Mancera; +Cc: netfilter-devel, coreteam, ecklm94, phil, fw

Hi Fernando,

This series LGTM, it is addressing the issues we have discussed.

On Tue, Apr 21, 2026 at 12:44:07PM +0200, Fernando Fernandez Mancera wrote:
> Both nft_socket and xt_socket rely on L4 headers to perform socket
> lookup in the slow path. For fragmented packets, while the IP protocol
> remains constant across all fragments, only the first fragment contains
> the actual L4 header.
> 
> As the expression/match could be attached to a chain with a priority
> lower than -400, it could bypass defragmentation.
> 
> Add a check for fragmentation in the lookup functions directly so the
> problem is handled for both nft_socket and xt_socket at the same time.
> In addition, future users of the functions would not need to care about
> this.
> 
> Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
> Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> ---
> v3: added this patch to the series, I split this out as the fix is
> generic for both nft_socket and xt_socket
> ---
>  net/ipv4/netfilter/nf_socket_ipv4.c | 3 +++
>  net/ipv6/netfilter/nf_socket_ipv6.c | 5 +++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c
> index 5080fa5fbf6a..f9c6755f5ec5 100644
> --- a/net/ipv4/netfilter/nf_socket_ipv4.c
> +++ b/net/ipv4/netfilter/nf_socket_ipv4.c
> @@ -94,6 +94,9 @@ struct sock *nf_sk_lookup_slow_v4(struct net *net, const struct sk_buff *skb,
>  #endif
>  	int doff = 0;
>  
> +	if (ntohs(iph->frag_off) & IP_OFFSET)
> +		return NULL;
> +
>  	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
>  		struct tcphdr _hdr;
>  		struct udphdr *hp;
> diff --git a/net/ipv6/netfilter/nf_socket_ipv6.c b/net/ipv6/netfilter/nf_socket_ipv6.c
> index ced8bd44828e..893f2aeb4711 100644
> --- a/net/ipv6/netfilter/nf_socket_ipv6.c
> +++ b/net/ipv6/netfilter/nf_socket_ipv6.c
> @@ -100,6 +100,7 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb,
>  	const struct in6_addr *daddr = NULL, *saddr = NULL;
>  	struct ipv6hdr *iph = ipv6_hdr(skb), ipv6_var;
>  	struct sk_buff *data_skb = NULL;
> +	unsigned short fragoff = 0;
>  	int doff = 0;
>  	int thoff = 0, tproto;
>  #if IS_ENABLED(CONFIG_NF_CONNTRACK)
> @@ -107,8 +108,8 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb,
>  	struct nf_conn const *ct;
>  #endif
>  
> -	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
> -	if (tproto < 0) {
> +	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
> +	if (tproto < 0 || fragoff) {
>  		pr_debug("unable to find transport header in IPv6 packet, dropping\n");
>  		return NULL;
>  	}
> -- 
> 2.53.0
> 
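
For the IPv6 half of the patch quoted above, the fragoff value comes from the
fragment extension header rather than from the fixed header. The sketch below
shows how a byte offset can be derived from that header, assuming the RFC 8200
layout (13-bit offset in the upper bits, M flag in the lowest bit). The struct
and helper are illustrative and do not reproduce ipv6_find_hdr().

```c
#include <arpa/inet.h>  /* htons(), ntohs() */
#include <assert.h>
#include <stdint.h>

/* Same values as the kernel's IP6_MF and IP6_OFFSET, renamed here. */
#define SKETCH_IP6_MF     0x0001u
#define SKETCH_IP6_OFFSET 0xFFF8u

/* Hypothetical mirror of the 8-byte IPv6 fragment extension header. */
struct sketch_frag_hdr {
	uint8_t  nexthdr;
	uint8_t  reserved;
	uint16_t frag_off;       /* big endian: offset, 2 reserved bits, M flag */
	uint32_t identification;
};

/* Byte offset of this fragment in the original packet; 0 for the first. */
static uint16_t sketch_ipv6_fragoff(const struct sketch_frag_hdr *fh)
{
	return (uint16_t)(ntohs(fh->frag_off) & SKETCH_IP6_OFFSET);
}
```

Because the offset is stored in 8-byte units in the upper 13 bits, masking
with 0xFFF8 yields the offset directly in bytes without a shift.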


* Re: [PATCH 3/3 nf v3] netfilter: xtables: fix L4 header parsing for non-first fragments
  2026-04-21 10:44 ` [PATCH 3/3 nf v3] netfilter: xtables: fix " Fernando Fernandez Mancera
@ 2026-04-24 11:38   ` Pablo Neira Ayuso
  2026-04-24 14:33     ` Fernando Fernandez Mancera
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2026-04-24 11:38 UTC (permalink / raw)
  To: Fernando Fernandez Mancera; +Cc: netfilter-devel, coreteam, ecklm94, phil, fw

On Tue, Apr 21, 2026 at 12:44:09PM +0200, Fernando Fernandez Mancera wrote:
> Multiple targets and matches rely on the L4 header to operate. For
> fragmented packets, every fragment carries the transport protocol
> identifier, but only the first fragment contains the L4 header.
> 
> As the 'raw' table can be configured to run at priority -450 (before
> defragmentation at -400), the target/match can be reached before
> reassembly. In this case, non-first fragments have their payload
> incorrectly parsed as a TCP/UDP header. This would of course be a
> misconfiguration scenario. In most cases this just leads to
> unreliable behavior for fragmented traffic.
> 
> Add a fragment check to ensure target/match only evaluates unfragmented
> packets or the first fragment in the stream.

AI reports xt_hashlimit could be a good candidate to check for
fragoff, I think it is, so I would suggest to expand it there to cover
this.

It also mentions synproxy as another candidate but IPv6 synproxy does
not do ipv6_find_hdr() on purpose I think (it assumed nexthdr is TCP)
for SYN and ACK packets, so checking for fragoff there is not
possible. Given this is to deal with flood, I think it is worth
the fragoff validation.

> Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> ---
> v2: handled ecn, socket and tcpmss matches
> v3: extracted socket to its own patch with a generic solution for
> nft/xt, added a comment specifying that par->fragoff is fine for
> ecn/tcpmss ipv6 as they enforce -p tcp. Keep in mind that osf only
> supports ipv4.
> ---
>  net/netfilter/xt_TPROXY.c | 11 +++++++++--
>  net/netfilter/xt_ecn.c    |  4 ++++
>  net/netfilter/xt_osf.c    |  3 +++
>  net/netfilter/xt_tcpmss.c |  4 ++++
>  4 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
> index e4bea1d346cf..5f60e7298a1e 100644
> --- a/net/netfilter/xt_TPROXY.c
> +++ b/net/netfilter/xt_TPROXY.c
> @@ -86,6 +86,9 @@ tproxy_tg4_v0(struct sk_buff *skb, const struct xt_action_param *par)
>  {
>  	const struct xt_tproxy_target_info *tgi = par->targinfo;
>  
> +	if (par->fragoff)
> +		return NF_DROP;
> +
>  	return tproxy_tg4(xt_net(par), skb, tgi->laddr, tgi->lport,
>  			  tgi->mark_mask, tgi->mark_value);
>  }
> @@ -95,6 +98,9 @@ tproxy_tg4_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  {
>  	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
>  
> +	if (par->fragoff)
> +		return NF_DROP;
> +
>  	return tproxy_tg4(xt_net(par), skb, tgi->laddr.ip, tgi->lport,
>  			  tgi->mark_mask, tgi->mark_value);
>  }
> @@ -106,6 +112,7 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  {
>  	const struct ipv6hdr *iph = ipv6_hdr(skb);
>  	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
> +	unsigned short fragoff = 0;
>  	struct udphdr _hdr, *hp;
>  	struct sock *sk;
>  	const struct in6_addr *laddr;
> @@ -113,8 +120,8 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  	int thoff = 0;
>  	int tproto;
>  
> -	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
> -	if (tproto < 0)
> +	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
> +	if (tproto < 0 || fragoff)
>  		return NF_DROP;
>  
>  	hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr);
> diff --git a/net/netfilter/xt_ecn.c b/net/netfilter/xt_ecn.c
> index b96e8203ac54..a8503f5d26bf 100644
> --- a/net/netfilter/xt_ecn.c
> +++ b/net/netfilter/xt_ecn.c
> @@ -30,6 +30,10 @@ static bool match_tcp(const struct sk_buff *skb, struct xt_action_param *par)
>  	struct tcphdr _tcph;
>  	const struct tcphdr *th;
>  
> +	/* this is fine for IPv6 as ecn_mt_check6() enforces -p tcp */
> +	if (par->fragoff)
> +		return false;
> +
>  	/* In practice, TCP match does this, so can't fail.  But let's
>  	 * be good citizens.
>  	 */
> diff --git a/net/netfilter/xt_osf.c b/net/netfilter/xt_osf.c
> index dc9485854002..e8807caede68 100644
> --- a/net/netfilter/xt_osf.c
> +++ b/net/netfilter/xt_osf.c
> @@ -27,6 +27,9 @@
>  static bool
>  xt_osf_match_packet(const struct sk_buff *skb, struct xt_action_param *p)
>  {
> +	if (p->fragoff)
> +		return false;
> +
>  	return nf_osf_match(skb, xt_family(p), xt_hooknum(p), xt_in(p),
>  			    xt_out(p), p->matchinfo, xt_net(p), nf_osf_fingers);
>  }
> diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c
> index 0d32d4841cb3..b9da8269161d 100644
> --- a/net/netfilter/xt_tcpmss.c
> +++ b/net/netfilter/xt_tcpmss.c
> @@ -32,6 +32,10 @@ tcpmss_mt(const struct sk_buff *skb, struct xt_action_param *par)
>  	u8 _opt[15 * 4 - sizeof(_tcph)];
>  	unsigned int i, optlen;
>  
> +	/* this is fine for IPv6 as xt_tcpmss enforces -p tcp */
> +	if (par->fragoff)
> +		return false;
> +
>  	/* If we don't have the whole header, drop packet. */
>  	th = skb_header_pointer(skb, par->thoff, sizeof(_tcph), &_tcph);
>  	if (th == NULL)
> -- 
> 2.53.0
> 


* Re: [PATCH 3/3 nf v3] netfilter: xtables: fix L4 header parsing for non-first fragments
  2026-04-24 11:38   ` Pablo Neira Ayuso
@ 2026-04-24 14:33     ` Fernando Fernandez Mancera
  2026-04-24 17:09       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 7+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-24 14:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, coreteam, ecklm94, phil, fw

On 4/24/26 1:38 PM, Pablo Neira Ayuso wrote:
> On Tue, Apr 21, 2026 at 12:44:09PM +0200, Fernando Fernandez Mancera wrote:
>> Multiple targets and matches rely on the L4 header to operate. For
>> fragmented packets, every fragment carries the transport protocol
>> identifier, but only the first fragment contains the L4 header.
>>
>> As the 'raw' table can be configured to run at priority -450 (before
>> defragmentation at -400), the target/match can be reached before
>> reassembly. In this case, non-first fragments have their payload
>> incorrectly parsed as a TCP/UDP header. This would of course be a
>> misconfiguration scenario. In most cases this just leads to
>> unreliable behavior for fragmented traffic.
>>
>> Add a fragment check to ensure target/match only evaluates unfragmented
>> packets or the first fragment in the stream.
> 

Hi Pablo,

> AI reports xt_hashlimit could be a good candidate to check for
> fragoff, I think it is, so I would suggest to expand it there to cover
> this.
> 

This seems like a good catch. I will work on this.

> It also mentions synproxy as another candidate but IPv6 synproxy does
> not do ipv6_find_hdr() on purpose I think (it assumed nexthdr is TCP)
> for SYN and ACK packets, so checking for fragoff there is not
> possible. Given this is to deal with flood, I think it is worth
> the fragoff validation.
> 

I don't think it makes sense for SYNPROXY. SYNPROXY (both ipt_SYNPROXY
and nft_synproxy) requires conntrack to work and only allows the
LOCAL_IN and FORWARD hooks, so it should be fine AFAIU. I don't
understand how someone could skip defragmentation here.

Maybe something extra to add to ipt_SYNPROXY is a restriction to the
TCP protocol only, as the code currently assumes the transport layer is
TCP, which isn't enforced.

But that would be kind of a different fix. What do you think?
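
The ordering argument used throughout this thread (a chain only sees raw
fragments if it runs before the defragmentation hook at priority -400) can be
sketched as follows. The values match the IPv4 hook priorities referred to in
the patches; the enum and helper names are invented for this sketch. It
deliberately ignores the other cases mentioned in the series, such as
stateless setups where defragmentation is not active at all.

```c
#include <assert.h>
#include <stdbool.h>

/* IPv4 netfilter hook priorities, renamed for this sketch. */
enum sketch_prio {
	SKETCH_PRI_RAW_BEFORE_DEFRAG = -450,
	SKETCH_PRI_CONNTRACK_DEFRAG  = -400,
	SKETCH_PRI_RAW               = -300,
	SKETCH_PRI_CONNTRACK         = -200,
	SKETCH_PRI_FILTER            = 0,
};

/*
 * Hooks on the same hook point run in ascending priority order, so a chain
 * with a priority below -400 is evaluated before reassembly.
 */
static bool sketch_runs_before_defrag(int chain_prio)
{
	return chain_prio < SKETCH_PRI_CONNTRACK_DEFRAG;
}
```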

>> Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
>> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
>> ---
>> v2: handled ecn, socket and tcpmss matches
>> v3: extracted socket to its own patch with a generic solution for
>> nft/xt, added a comment specifying that par->fragoff is fine for
>> ecn/tcpmss ipv6 as they enforce -p tcp. Keep in mind that osf only
>> supports ipv4.
>> ---
>>   net/netfilter/xt_TPROXY.c | 11 +++++++++--
>>   net/netfilter/xt_ecn.c    |  4 ++++
>>   net/netfilter/xt_osf.c    |  3 +++
>>   net/netfilter/xt_tcpmss.c |  4 ++++
>>   4 files changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
>> index e4bea1d346cf..5f60e7298a1e 100644
>> --- a/net/netfilter/xt_TPROXY.c
>> +++ b/net/netfilter/xt_TPROXY.c
>> @@ -86,6 +86,9 @@ tproxy_tg4_v0(struct sk_buff *skb, const struct xt_action_param *par)
>>   {
>>   	const struct xt_tproxy_target_info *tgi = par->targinfo;
>>   
>> +	if (par->fragoff)
>> +		return NF_DROP;
>> +
>>   	return tproxy_tg4(xt_net(par), skb, tgi->laddr, tgi->lport,
>>   			  tgi->mark_mask, tgi->mark_value);
>>   }
>> @@ -95,6 +98,9 @@ tproxy_tg4_v1(struct sk_buff *skb, const struct xt_action_param *par)
>>   {
>>   	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
>>   
>> +	if (par->fragoff)
>> +		return NF_DROP;
>> +
>>   	return tproxy_tg4(xt_net(par), skb, tgi->laddr.ip, tgi->lport,
>>   			  tgi->mark_mask, tgi->mark_value);
>>   }
>> @@ -106,6 +112,7 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
>>   {
>>   	const struct ipv6hdr *iph = ipv6_hdr(skb);
>>   	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
>> +	unsigned short fragoff = 0;
>>   	struct udphdr _hdr, *hp;
>>   	struct sock *sk;
>>   	const struct in6_addr *laddr;
>> @@ -113,8 +120,8 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
>>   	int thoff = 0;
>>   	int tproto;
>>   
>> -	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
>> -	if (tproto < 0)
>> +	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
>> +	if (tproto < 0 || fragoff)
>>   		return NF_DROP;
>>   
>>   	hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr);
>> diff --git a/net/netfilter/xt_ecn.c b/net/netfilter/xt_ecn.c
>> index b96e8203ac54..a8503f5d26bf 100644
>> --- a/net/netfilter/xt_ecn.c
>> +++ b/net/netfilter/xt_ecn.c
>> @@ -30,6 +30,10 @@ static bool match_tcp(const struct sk_buff *skb, struct xt_action_param *par)
>>   	struct tcphdr _tcph;
>>   	const struct tcphdr *th;
>>   
>> +	/* this is fine for IPv6 as ecn_mt_check6() enforces -p tcp */
>> +	if (par->fragoff)
>> +		return false;
>> +
>>   	/* In practice, TCP match does this, so can't fail.  But let's
>>   	 * be good citizens.
>>   	 */
>> diff --git a/net/netfilter/xt_osf.c b/net/netfilter/xt_osf.c
>> index dc9485854002..e8807caede68 100644
>> --- a/net/netfilter/xt_osf.c
>> +++ b/net/netfilter/xt_osf.c
>> @@ -27,6 +27,9 @@
>>   static bool
>>   xt_osf_match_packet(const struct sk_buff *skb, struct xt_action_param *p)
>>   {
>> +	if (p->fragoff)
>> +		return false;
>> +
>>   	return nf_osf_match(skb, xt_family(p), xt_hooknum(p), xt_in(p),
>>   			    xt_out(p), p->matchinfo, xt_net(p), nf_osf_fingers);
>>   }
>> diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c
>> index 0d32d4841cb3..b9da8269161d 100644
>> --- a/net/netfilter/xt_tcpmss.c
>> +++ b/net/netfilter/xt_tcpmss.c
>> @@ -32,6 +32,10 @@ tcpmss_mt(const struct sk_buff *skb, struct xt_action_param *par)
>>   	u8 _opt[15 * 4 - sizeof(_tcph)];
>>   	unsigned int i, optlen;
>>   
>> +	/* this is fine for IPv6 as xt_tcpmss enforces -p tcp */
>> +	if (par->fragoff)
>> +		return false;
>> +
>>   	/* If we don't have the whole header, drop packet. */
>>   	th = skb_header_pointer(skb, par->thoff, sizeof(_tcph), &_tcph);
>>   	if (th == NULL)
>> -- 
>> 2.53.0
>>
> 



* Re: [PATCH 3/3 nf v3] netfilter: xtables: fix L4 header parsing for non-first fragments
  2026-04-24 14:33     ` Fernando Fernandez Mancera
@ 2026-04-24 17:09       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2026-04-24 17:09 UTC (permalink / raw)
  To: Fernando Fernandez Mancera; +Cc: netfilter-devel, coreteam, ecklm94, phil, fw

On Fri, Apr 24, 2026 at 04:33:56PM +0200, Fernando Fernandez Mancera wrote:
> On 4/24/26 1:38 PM, Pablo Neira Ayuso wrote:
> > On Tue, Apr 21, 2026 at 12:44:09PM +0200, Fernando Fernandez Mancera wrote:
> > > Multiple targets and matches rely on the L4 header to operate. For
> > > fragmented packets, every fragment carries the transport protocol
> > > identifier, but only the first fragment contains the L4 header.
> > > 
> > > As the 'raw' table can be configured to run at priority -450 (before
> > > defragmentation at -400), the target/match can be reached before
> > > reassembly. In this case, non-first fragments have their payload
> > > incorrectly parsed as a TCP/UDP header. This would of course be a
> > > misconfiguration scenario. In most cases this just leads to
> > > unreliable behavior for fragmented traffic.
> > > 
> > > Add a fragment check to ensure target/match only evaluates unfragmented
> > > packets or the first fragment in the stream.
> > 
> 
> Hi Pablo,
> 
> > AI reports xt_hashlimit could be a good candidate to check for
> > fragoff, I think it is, so I would suggest to expand it there to cover
> > this.
> > 
> 
> This seems like a good catch. I will work on this.
> 
> > It also mentions synproxy as another candidate but IPv6 synproxy does
> > not do ipv6_find_hdr() on purpose I think (it assumed nexthdr is TCP)
> > for SYN and ACK packets, so checking for fragoff there is not
> > possible. Given this is to deal with flood, I think it is worth
> > the fragoff validation.
> > 
> 
> I don't think it makes sense for SYNPROXY. SYNPROXY (both ipt_SYNPROXY
> and nft_synproxy) requires conntrack to work and only allows the
> LOCAL_IN and FORWARD hooks, so it should be fine AFAIU. I don't
> understand how someone could skip defragmentation here.
> 
> Maybe something extra to add to ipt_SYNPROXY is a restriction to the
> TCP protocol only, as the code currently assumes the transport layer is
> TCP, which isn't enforced.
> 
> But that would be kind of a different fix. What do you think?

My suggestion is to update xt_hashlimit to handle fragoff.

Regarding synproxy, it is right indeed that this depends on
nf_defrag so no changes are needed there.

So please send a v4 including xt_hashlimit.

Thanks Fernando.
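
As a purely hypothetical illustration of why a port-based hash key is
sensitive to fragoff, the userspace sketch below builds a key and only trusts
the port bytes when the packet is not a later fragment. The struct layout and
the choice to zero the ports are assumptions made for this example. The thread
does not say how the requested v4 handles xt_hashlimit, and this is not the
key layout xt_hashlimit uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hash key, invented for this sketch. */
struct sketch_hash_key {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Build a key; ports are read only when an L4 header can be present. */
static void sketch_build_key(struct sketch_hash_key *key,
			     uint32_t saddr, uint32_t daddr,
			     const uint8_t *l4, uint16_t fragoff)
{
	memset(key, 0, sizeof(*key));
	key->saddr = saddr;
	key->daddr = daddr;
	if (fragoff)
		return; /* later fragment: no L4 header, leave ports zeroed */
	key->sport = (uint16_t)((l4[0] << 8) | l4[1]);
	key->dport = (uint16_t)((l4[2] << 8) | l4[3]);
}
```

Without such a guard, each later fragment could hash into a different bucket
depending on whatever payload bytes sit where the ports would be.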

> > > Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set")
> > > Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> > > ---
> > > v2: handled ecn, socket and tcpmss matches
> > > v3: extracted socket to its own patch with a generic solution for
> > > nft/xt, added a comment specifying that par->fragoff is fine for
> > > ecn/tcpmss ipv6 as they enforce -p tcp. Keep in mind that osf only
> > > supports ipv4.
> > > ---
> > >   net/netfilter/xt_TPROXY.c | 11 +++++++++--
> > >   net/netfilter/xt_ecn.c    |  4 ++++
> > >   net/netfilter/xt_osf.c    |  3 +++
> > >   net/netfilter/xt_tcpmss.c |  4 ++++
> > >   4 files changed, 20 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
> > > index e4bea1d346cf..5f60e7298a1e 100644
> > > --- a/net/netfilter/xt_TPROXY.c
> > > +++ b/net/netfilter/xt_TPROXY.c
> > > @@ -86,6 +86,9 @@ tproxy_tg4_v0(struct sk_buff *skb, const struct xt_action_param *par)
> > >   {
> > >   	const struct xt_tproxy_target_info *tgi = par->targinfo;
> > > +	if (par->fragoff)
> > > +		return NF_DROP;
> > > +
> > >   	return tproxy_tg4(xt_net(par), skb, tgi->laddr, tgi->lport,
> > >   			  tgi->mark_mask, tgi->mark_value);
> > >   }
> > > @@ -95,6 +98,9 @@ tproxy_tg4_v1(struct sk_buff *skb, const struct xt_action_param *par)
> > >   {
> > >   	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
> > > +	if (par->fragoff)
> > > +		return NF_DROP;
> > > +
> > >   	return tproxy_tg4(xt_net(par), skb, tgi->laddr.ip, tgi->lport,
> > >   			  tgi->mark_mask, tgi->mark_value);
> > >   }
> > > @@ -106,6 +112,7 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
> > >   {
> > >   	const struct ipv6hdr *iph = ipv6_hdr(skb);
> > >   	const struct xt_tproxy_target_info_v1 *tgi = par->targinfo;
> > > +	unsigned short fragoff = 0;
> > >   	struct udphdr _hdr, *hp;
> > >   	struct sock *sk;
> > >   	const struct in6_addr *laddr;
> > > @@ -113,8 +120,8 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
> > >   	int thoff = 0;
> > >   	int tproto;
> > > -	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
> > > -	if (tproto < 0)
> > > +	tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL);
> > > +	if (tproto < 0 || fragoff)
> > >   		return NF_DROP;
> > >   	hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr);
> > > diff --git a/net/netfilter/xt_ecn.c b/net/netfilter/xt_ecn.c
> > > index b96e8203ac54..a8503f5d26bf 100644
> > > --- a/net/netfilter/xt_ecn.c
> > > +++ b/net/netfilter/xt_ecn.c
> > > @@ -30,6 +30,10 @@ static bool match_tcp(const struct sk_buff *skb, struct xt_action_param *par)
> > >   	struct tcphdr _tcph;
> > >   	const struct tcphdr *th;
> > > +	/* this is fine for IPv6 as ecn_mt_check6() enforces -p tcp */
> > > +	if (par->fragoff)
> > > +		return false;
> > > +
> > >   	/* In practice, TCP match does this, so can't fail.  But let's
> > >   	 * be good citizens.
> > >   	 */
> > > diff --git a/net/netfilter/xt_osf.c b/net/netfilter/xt_osf.c
> > > index dc9485854002..e8807caede68 100644
> > > --- a/net/netfilter/xt_osf.c
> > > +++ b/net/netfilter/xt_osf.c
> > > @@ -27,6 +27,9 @@
> > >   static bool
> > >   xt_osf_match_packet(const struct sk_buff *skb, struct xt_action_param *p)
> > >   {
> > > +	if (p->fragoff)
> > > +		return false;
> > > +
> > >   	return nf_osf_match(skb, xt_family(p), xt_hooknum(p), xt_in(p),
> > >   			    xt_out(p), p->matchinfo, xt_net(p), nf_osf_fingers);
> > >   }
> > > diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c
> > > index 0d32d4841cb3..b9da8269161d 100644
> > > --- a/net/netfilter/xt_tcpmss.c
> > > +++ b/net/netfilter/xt_tcpmss.c
> > > @@ -32,6 +32,10 @@ tcpmss_mt(const struct sk_buff *skb, struct xt_action_param *par)
> > >   	u8 _opt[15 * 4 - sizeof(_tcph)];
> > >   	unsigned int i, optlen;
> > > +	/* this is fine for IPv6 as xt_tcpmss enforces -p tcp */
> > > +	if (par->fragoff)
> > > +		return false;
> > > +
> > >   	/* If we don't have the whole header, drop packet. */
> > >   	th = skb_header_pointer(skb, par->thoff, sizeof(_tcph), &_tcph);
> > >   	if (th == NULL)
> > > -- 
> > > 2.53.0
> > > 
> > 
> 

