netdev.vger.kernel.org archive mirror
* [PATCH net] inet: frags: drop fraglist conntrack references
@ 2026-01-02 14:00 Florian Westphal
  2026-01-03  9:55 ` Eric Dumazet
  2026-01-04 20:13 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Florian Westphal @ 2026-01-02 14:00 UTC
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	dsahern, Florian Westphal, syzbot+4393c47753b7808dac7d

Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
leaked skbs/conntrack references more obvious.

syzbot reports this as triggering, and I can also reproduce this via
ip_defrag.sh selftest:

 conntrack cleanup blocked for 60s
 WARNING: net/netfilter/nf_conntrack_core.c:2512
 [..]

conntrack cleanup gets stuck because there are skbs that still hold nf_conn
references via their frag_list.

   net.core.skb_defer_max=0 makes the hang disappear.

Eric Dumazet points out that skb_release_head_state() doesn't follow the
fraglist.

ip_defrag.sh can only reproduce this problem since
commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this
problem could happen with TCP as well if pmtu discovery is off.

The relevant problem path for udp is:
1. netns emits fragmented packets
2. nf_defrag_v6_hook reassembles them (in output hook)
3. reassembled skb is tracked (skb owns nf_conn reference)
4. ip6_output refragments
5. refragmented packets also own nf_conn reference (ip6_fragment
   calls ip6_copy_metadata())
6. on input path, nf_defrag_v6_hook skips defragmentation: the
   fragments already have skb->nf_conn attached
7. skbs are reassembled via ipv6_frag_rcv()
8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up
   in pcpu freelist, but still has nf_conn reference.
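
To make steps 5-8 concrete, here is a small userspace sketch of the
reference counting. It is a toy model, not kernel code: struct toy_conn,
struct toy_skb and every toy_* helper are hypothetical names, and the mapping
to the real skb free path is deliberately loose. It only shows that a
head-only release leaves the references of chained fragments in place until
the whole skb is finally freed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for nf_conn and sk_buff; all names here are hypothetical. */
struct toy_conn {
	int refcnt;
};

struct toy_skb {
	struct toy_conn *ct;       /* models the skb's nf_conn reference */
	struct toy_skb *frag_list; /* first chained fragment (head skb only) */
	struct toy_skb *next;      /* next fragment in the chain */
};

/* Take a reference on ct on behalf of skb. */
static void toy_ct_attach(struct toy_skb *skb, struct toy_conn *ct)
{
	skb->ct = ct;
	ct->refcnt++;
}

/* Head-only release: drops this skb's reference, never walks frag_list. */
static void toy_reset_ct_head(struct toy_skb *skb)
{
	if (skb->ct) {
		skb->ct->refcnt--;
		skb->ct = NULL;
	}
}

/* Final free: only here are the chained fragments released as well. */
static void toy_free_skb(struct toy_skb *skb)
{
	struct toy_skb *f;

	for (f = skb->frag_list; f; f = f->next)
		toy_reset_ct_head(f);
	toy_reset_ct_head(skb);
}

/*
 * Every fragment owns a reference (step 5), the fragments are chained
 * behind frags[0] (step 7), then only the head is released, as happens
 * while the skb is parked on the deferred list (step 8).
 * Returns the number of references still pinning ct.
 */
static int toy_leaked_refs(struct toy_skb *frags, int n, struct toy_conn *ct)
{
	int i;

	assert(n >= 1);
	for (i = 0; i < n; i++) {
		frags[i].ct = NULL;
		frags[i].frag_list = NULL;
		frags[i].next = NULL;
		toy_ct_attach(&frags[i], ct);
	}
	if (n > 1)
		frags[0].frag_list = &frags[1];
	for (i = 1; i + 1 < n; i++)
		frags[i].next = &frags[i + 1];

	toy_reset_ct_head(&frags[0]);
	return ct->refcnt;
}
```

With three fragments the model leaves two references behind after the
head-only release; they go away only once toy_free_skb() runs, which in this
sketch stands in for the deferred list being purged.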

Possible solutions:
 1 let defrag engine drop nf_conn entry, OR
 2 export kick_defer_list_purge() and call it from the conntrack
   netns exit callback, OR
 3 add skb_has_frag_list() check to skb_attempt_defer_free()

2 & 3 also solve the ip_defrag.sh hang but share the same drawback:

Such reassembled skbs, queued to socket, can prevent conntrack module
removal until userspace has consumed the packet. While both tcp and udp
stack do call nf_reset_ct() before placing skb on socket queue, that
function doesn't iterate frag_list skbs.

Therefore drop nf_conn entries when they are placed in defrag queue.
Keep the nf_conn entry of the first (offset 0) skb so that reassembled
skb retains nf_conn entry for sake of TX path.
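
The effect of that choice can be sketched in userspace as follows. Again
this is only an illustration under assumptions: struct demo_conn,
struct demo_frag and the demo_* helpers are made-up names, and
demo_queue_insert() mirrors only the offset check, not the real
inet_frag_queue_insert() logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel structures. */
struct demo_conn {
	int refcnt;
};

struct demo_frag {
	struct demo_conn *ct; /* models the fragment's nf_conn reference */
	int offset;           /* fragment offset within the datagram */
};

/* Take a reference on ct on behalf of frag. */
static void demo_ct_attach(struct demo_frag *frag, struct demo_conn *ct)
{
	frag->ct = ct;
	ct->refcnt++;
}

/* Drop the fragment's reference, if it holds one. */
static void demo_reset_ct(struct demo_frag *frag)
{
	if (frag->ct) {
		frag->ct->refcnt--;
		frag->ct = NULL;
	}
}

/*
 * Models the idea of the fix: when a fragment enters the defrag queue,
 * only the first one (offset 0) keeps its conntrack reference.
 */
static void demo_queue_insert(struct demo_frag *frag, int offset)
{
	frag->offset = offset;
	if (offset)
		demo_reset_ct(frag);
}

/*
 * Attach a reference to each of n fragments, queue them at offsets
 * 0, step, 2 * step, ... and return the references still held on ct.
 */
static int demo_refs_after_queueing(struct demo_frag *frags, int n,
				    int step, struct demo_conn *ct)
{
	int i;

	for (i = 0; i < n; i++) {
		frags[i].ct = NULL;
		demo_ct_attach(&frags[i], ct);
		demo_queue_insert(&frags[i], i * step);
	}
	return ct->refcnt;
}
```

In this model exactly one reference survives queueing, the one held by the
offset 0 fragment, so a later head-only reset is enough to release the
connection.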

Note that fixes tag is incorrect; it points to the commit introducing the
'ip_defrag.sh reproducible problem': no need to backport this patch to
every stable kernel.

Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/
Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/ipv4/inet_fragment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 001ee5c4d962..4e6d7467ed44 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -488,6 +488,8 @@ int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb,
 	}
 
 	FRAG_CB(skb)->ip_defrag_offset = offset;
+	if (offset)
+		nf_reset_ct(skb);
 
 	return IPFRAG_OK;
 }
-- 
2.52.0



* Re: [PATCH net] inet: frags: drop fraglist conntrack references
  2026-01-02 14:00 [PATCH net] inet: frags: drop fraglist conntrack references Florian Westphal
@ 2026-01-03  9:55 ` Eric Dumazet
  2026-01-04 20:13 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2026-01-03  9:55 UTC
  To: Florian Westphal
  Cc: netdev, Paolo Abeni, David S. Miller, Jakub Kicinski, dsahern,
	syzbot+4393c47753b7808dac7d

On Fri, Jan 2, 2026 at 3:00 PM Florian Westphal <fw@strlen.de> wrote:
>
> Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
> leaked skbs/conntrack references more obvious.
>
> syzbot reports this as triggering, and I can also reproduce this via
> ip_defrag.sh selftest:
>
>  conntrack cleanup blocked for 60s
>  WARNING: net/netfilter/nf_conntrack_core.c:2512
>  [..]
>
> conntrack cleanup gets stuck because there are skbs that still hold nf_conn
> references via their frag_list.
>
>    net.core.skb_defer_max=0 makes the hang disappear.
>
> Eric Dumazet points out that skb_release_head_state() doesn't follow the
> fraglist.
>
> ip_defrag.sh can only reproduce this problem since
> commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this
> problem could happen with TCP as well if pmtu discovery is off.
>
> The relevant problem path for udp is:
> 1. netns emits fragmented packets
> 2. nf_defrag_v6_hook reassembles them (in output hook)
> 3. reassembled skb is tracked (skb owns nf_conn reference)
> 4. ip6_output refragments
> 5. refragmented packets also own nf_conn reference (ip6_fragment
>    calls ip6_copy_metadata())
> 6. on input path, nf_defrag_v6_hook skips defragmentation: the
>    fragments already have skb->nf_conn attached
> 7. skbs are reassembled via ipv6_frag_rcv()
> 8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up
>    in pcpu freelist, but still has nf_conn reference.
>
> Possible solutions:
>  1 let defrag engine drop nf_conn entry, OR
>  2 export kick_defer_list_purge() and call it from the conntrack
>    netns exit callback, OR
>  3 add skb_has_frag_list() check to skb_attempt_defer_free()
>
> 2 & 3 also solve the ip_defrag.sh hang but share the same drawback:
>
> Such reassembled skbs, queued to socket, can prevent conntrack module
> removal until userspace has consumed the packet. While both tcp and udp
> stack do call nf_reset_ct() before placing skb on socket queue, that
> function doesn't iterate frag_list skbs.
>
> Therefore drop nf_conn entries when they are placed in defrag queue.
> Keep the nf_conn entry of the first (offset 0) skb so that reassembled
> skb retains nf_conn entry for sake of TX path.
>
> Note that fixes tag is incorrect; it points to the commit introducing the
> 'ip_defrag.sh reproducible problem': no need to backport this patch to
> every stable kernel.
>
> Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/
> Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---

Thanks a lot Florian for taking care of this.

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH net] inet: frags: drop fraglist conntrack references
  2026-01-02 14:00 [PATCH net] inet: frags: drop fraglist conntrack references Florian Westphal
  2026-01-03  9:55 ` Eric Dumazet
@ 2026-01-04 20:13 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-01-04 20:13 UTC
  To: Florian Westphal
  Cc: netdev, pabeni, davem, edumazet, kuba, dsahern,
	syzbot+4393c47753b7808dac7d

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri,  2 Jan 2026 15:00:07 +0100 you wrote:
> Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
> leaked skbs/conntrack references more obvious.
> 
> syzbot reports this as triggering, and I can also reproduce this via
> ip_defrag.sh selftest:
> 
>  conntrack cleanup blocked for 60s
>  WARNING: net/netfilter/nf_conntrack_core.c:2512
>  [..]
> 
> [...]

Here is the summary with links:
  - [net] inet: frags: drop fraglist conntrack references
    https://git.kernel.org/netdev/net/c/2ef02ac38d3c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


