From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Salvatore Bonaccorso <carnil@debian.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>, Phil Sutter <phil@nwl.cc>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	Alejandro Olivan Alvarez <alejandro.olivan.alvarez@gmail.com>
Cc: 1130336@bugs.debian.org, netfilter-devel@vger.kernel.org,
	coreteam@netfilter.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
	stable@vger.kernel.org
Subject: Re: [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
Date: Sat, 14 Mar 2026 20:00:06 +0100
Message-ID: <b3cbfd15-acd1-4500-ba30-eac6f48523fb@suse.de>
In-Reply-To: <c72a56ab-a16c-4866-9a44-a03393f074db@suse.de>

On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
> Hi,
> 
> On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
>> Control: forwarded -1 https://lore.kernel.org/regressions/177349610461.3071718.4083978280323144323@eldamar.lan
>> Control: tags -1 + upstream
>>
>> Hi
>>
>> In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
>> after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
>> update the count if add was skipped"), when the following rule is set
>>
>>     iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>
>> connections get stuck accordingly, it can be easily reproduced by:
>>
>> # iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>> # nft list ruleset
>> # Warning: table ip filter is managed by iptables-nft, do not touch!
>> table ip filter {
>>          chain INPUT {
>>                  type filter hook input priority filter; policy accept;
>>                  ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
>>          }
>> }
>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>> --2026-03-14 14:53:51--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
>> HTTP request sent, awaiting response... 301 Moved Permanently
>> Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
>> --2026-03-14 14:53:51--  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
>> Reusing existing connection to git.kernel.org:443.
>> HTTP request sent, awaiting response... 200 OK
>> Length: unspecified [application/x-gzip]
>> Saving to: ‘/dev/null’
>>
>> /dev/null                         [                         <=>                    ] 248.03M  51.9MB/s    in 5.0s
>>
>> 2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]
>>
>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>> --2026-03-14 14:53:58--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
>> Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.
>>
>> Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
>> if add was skipped") commit this worked.
>>
> 
> Thanks for the report. I have reproduced this on upstream kernel. I am 
> working on it.
> 

This is what is happening:

1. The first connection is established and tracked, all good. When it
finishes, it goes to the TIME_WAIT state.
2. The second connection is established. Its ct is confirmed from the
beginning, so tracking is skipped and only a GC is run.
3. The previously tracked connection is cleaned up during that GC, as
TIME_WAIT is considered closed.
4. The count is therefore 0 and xt performs a drop.
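
To make the sequence above easier to follow, here is a minimal userspace
sketch that replays it. It is only a toy model: every name in it
(toy_conn, toy_list, toy_gc, toy_add, toy_replay) is hypothetical and
none of it is kernel code. It assumes just what the steps state: a
confirmed ct is not inserted, a GC runs anyway, and the GC drops entries
in TIME_WAIT.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not kernel symbols. */
enum toy_state { TOY_SYN_SENT, TOY_ESTABLISHED, TOY_TIME_WAIT };

struct toy_conn {
	enum toy_state state;
	bool confirmed;		/* stands in for nf_ct_is_confirmed() */
};

#define TOY_MAX 8

struct toy_list {
	const struct toy_conn *tracked[TOY_MAX];
	size_t count;
};

/* Drop entries whose connection is considered closed (TIME_WAIT here). */
static void toy_gc(struct toy_list *l)
{
	size_t i, kept = 0;

	for (i = 0; i < l->count; i++)
		if (l->tracked[i]->state != TOY_TIME_WAIT)
			l->tracked[kept++] = l->tracked[i];
	l->count = kept;
}

/*
 * Models the behaviour described in the steps: the GC always runs, but
 * a confirmed ct is not inserted. Returns the resulting count.
 */
static size_t toy_add(struct toy_list *l, const struct toy_conn *c)
{
	toy_gc(l);
	if (!c->confirmed && l->count < TOY_MAX)
		l->tracked[l->count++] = c;
	return l->count;
}

/* Replays steps 1-4; returns the count seen by the second connection. */
static size_t toy_replay(void)
{
	struct toy_conn first = { TOY_ESTABLISHED, false };
	struct toy_conn second = { TOY_SYN_SENT, true };
	struct toy_list l = { { NULL }, 0 };

	l.tracked[l.count++] = &first;	/* step 1: tracked, count is 1 */
	first.state = TOY_TIME_WAIT;	/* step 1: connection finishes */
	return toy_add(&l, &second);	/* steps 2-4: GC only, nothing added */
}
```

In this model toy_replay() ends with an empty list, which corresponds to
the count of 0 in step 4. Why the first connection gets tracked in the
first place is deliberately not modelled.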

There are two different approaches to fix this IMHO.

The first one would be to stop considering TIME_WAIT as closed. But that
would only mask the issue rather than fix it.

The second one is to check the TCP state inside the
nf_ct_is_confirmed() check. If the state is SYN_SENT or SYN_RECV but the
ct is already confirmed, there are two possibilities: either it is a
retransmission, or the ct was confirmed even before we tracked it. In
both situations, perform an insert with a GC. That way we make sure no
duplicate tracking happens and the connection is tracked properly. The
following diff fixes it. What do you think? I can send a formal patch if
this solution is considered acceptable.

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 00eed5b4d1b1..ae94e5d7e00b 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -78,6 +78,15 @@ static inline bool already_closed(const struct nf_conn *conn)
  		return false;
  }

+static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
+{
+	if (nf_ct_protonum(conn) == IPPROTO_TCP)
+		return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
+		       conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
+	else
+		return false;
+}
+
  static int key_diff(const u32 *a, const u32 *b, unsigned int klen)
  {
  	return memcmp(a, b, klen * sizeof(u32));
@@ -183,6 +192,9 @@ static int __nf_conncount_add(struct net *net,
  		 * might have happened before hitting connlimit
  		 */
  		if (skb->skb_iif != LOOPBACK_IFINDEX) {
+			if (tcp_syn_sent_or_recv(ct))
+				goto check_connections;
+
  			err = -EEXIST;
  			goto out_put;
  		}
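
For reviewers who want to reason about the branch outside the kernel,
the decision the diff adds for a confirmed ct on a non-loopback
interface can be sketched in userspace roughly as follows. All names
here (toy_proto, toy_tcp_state, toy_action, toy_confirmed_action) are
hypothetical stand-ins. The sketch covers only the branch touched by the
diff, not the loopback or unconfirmed paths.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
enum toy_proto { TOY_PROTO_TCP = 6, TOY_PROTO_UDP = 17 };

enum toy_tcp_state {
	TOY_TCP_SYN_SENT,
	TOY_TCP_SYN_RECV,
	TOY_TCP_ESTABLISHED,
	TOY_TCP_TIME_WAIT,
};

enum toy_action {
	TOY_SKIP_EEXIST,	/* models: err = -EEXIST; goto out_put */
	TOY_INSERT_WITH_GC,	/* models: goto check_connections */
};

/* Mirrors the shape of tcp_syn_sent_or_recv() from the diff. */
static bool toy_tcp_syn_sent_or_recv(enum toy_proto proto,
				     enum toy_tcp_state state)
{
	if (proto != TOY_PROTO_TCP)
		return false;
	return state == TOY_TCP_SYN_SENT || state == TOY_TCP_SYN_RECV;
}

/* Decision for a confirmed ct seen on a non-loopback interface. */
static enum toy_action toy_confirmed_action(enum toy_proto proto,
					    enum toy_tcp_state state)
{
	if (toy_tcp_syn_sent_or_recv(proto, state))
		return TOY_INSERT_WITH_GC;
	return TOY_SKIP_EEXIST;
}
```

With this model, only a TCP ct still in SYN_SENT or SYN_RECV takes the
insert path. An established TCP ct, or any non-TCP ct, keeps the
existing -EEXIST behaviour.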

> Thanks,
> Fernando.

Thread overview: 10+ messages
2026-03-14 14:03 [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped") Salvatore Bonaccorso
2026-03-14 16:13 ` Fernando Fernandez Mancera
2026-03-14 19:00   ` Fernando Fernandez Mancera [this message]
2026-03-14 19:25     ` Florian Westphal
2026-03-15  1:09       ` Fernando Fernandez Mancera
2026-03-18 12:49         ` Bug#1130336: " Salvatore Bonaccorso
2026-03-19  8:44           ` Alejandro Oliván Alvarez
2026-03-19  8:59             ` Fernando Fernandez Mancera
2026-04-22  9:18               ` Thorsten Leemhuis
2026-04-22 10:32                 ` Fernando Fernandez Mancera
