From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Florian Westphal <fw@strlen.de>
Cc: Salvatore Bonaccorso <carnil@debian.org>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Phil Sutter <phil@nwl.cc>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Alejandro Olivan Alvarez <alejandro.olivan.alvarez@gmail.com>,
1130336@bugs.debian.org, netfilter-devel@vger.kernel.org,
coreteam@netfilter.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
stable@vger.kernel.org
Subject: Re: [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
Date: Sun, 15 Mar 2026 02:09:33 +0100
Message-ID: <4da571ab-fa1d-4ee6-b71c-24f4a28243ed@suse.de>
In-Reply-To: <abW2MAAqLnKZm3KF@strlen.de>

On 3/14/26 8:25 PM, Florian Westphal wrote:
> Fernando Fernandez Mancera <fmancera@suse.de> wrote:
>> On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
>>> Hi,
>>>
>>> On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
>>>> Control: forwarded -1
>>>> https://lore.kernel.org/regressions/177349610461.3071718.4083978280323144323@eldamar.lan
>>>> Control: tags -1 + upstream
>>>>
>>>> Hi
>>>>
>>>> In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
>>>> after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
>>>> update the count if add was skipped"), when the following rule is set
>>>>
>>>> iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>>
>>>> connections get stuck accordingly, it can be easily reproduced by:
>>>>
>>>> # iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>> # nft list ruleset
>>>> # Warning: table ip filter is managed by iptables-nft, do not touch!
>>>> table ip filter {
>>>>         chain INPUT {
>>>>                 type filter hook input priority filter; policy accept;
>>>>                 ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
>>>>         }
>>>> }
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:51--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
>>>> HTTP request sent, awaiting response... 301 Moved Permanently
>>>> Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
>>>> --2026-03-14 14:53:51--  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
>>>> Reusing existing connection to git.kernel.org:443.
>>>> HTTP request sent, awaiting response... 200 OK
>>>> Length: unspecified [application/x-gzip]
>>>> Saving to: ‘/dev/null’
>>>>
>>>> /dev/null [ <=> ] 248.03M 51.9MB/s in 5.0s
>>>>
>>>> 2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]
>>>>
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:58--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
>>>> Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.
>>>>
>>>> Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
>>>> if add was skipped") commit this worked.
>>>>
>>>
>>> Thanks for the report. I have reproduced
>>> this on upstream kernel. I am working on it.
>>>
>>
>> This is what is happening:
>>
>> 1. The first connection is established and
>> tracked, all good. When it finishes, it goes to
>> TIME_WAIT state
>> 2. The second connection is established, ct is
>> confirmed since the beginning, skipping the
>> tracking and calling a GC.
>> 3. The previously tracked connection is cleaned
>> up during GC as TIME_WAIT is considered closed.
>
> This is stupid. The fix is to add --syn or use OUTPUT.
> It's not even clear to me what the user wants to achieve with this rule.
>
Yes, the ruleset shown does not make sense. Having said this, it could
affect a soft-limit scenario such as the one described in the blamed
commit. xt_connlimit was designed with --syn in mind, but that was never
enforced and people have used it for many different things. At least we
are learning that many people ignored --syn completely.
>> +static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
>> +{
>> +	if (nf_ct_protonum(conn) == IPPROTO_TCP)
>> +		return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
>> +		       conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
>> +	else
>> +		return false;
>> +}
>
> We're adding ever more complex checks in the conncount backend.
> I don't like any of the solutions.
>
As we are already fetching the ct, would it be fine to go for a
protocol-agnostic solution instead, with:

	if (ctinfo == IP_CT_NEW)
		goto check_connections;

inside the confirmed if statement? If I am not wrong, it should be a
valid solution too and IMHO a better one.
> What about reverting the offending commit, at least for tree_count?
> That way it continues to work as it did in the past.
>
Before the fix, soft-limiting scenarios were broken and therefore this
specific ruleset was too. I hope this ruleset is not used in production
and is just for reproducing the issue.

P.S.: I have been investigating a way to improve the conncount backend
structure so that the GC is not that expensive. I don't have anything
relevant yet, but I plan to provide some updates.

Thread overview: 10+ messages
2026-03-14 14:03 [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped") Salvatore Bonaccorso
2026-03-14 16:13 ` Fernando Fernandez Mancera
2026-03-14 19:00 ` Fernando Fernandez Mancera
2026-03-14 19:25 ` Florian Westphal
2026-03-15 1:09 ` Fernando Fernandez Mancera [this message]
2026-03-18 12:49 ` Bug#1130336: " Salvatore Bonaccorso
2026-03-19 8:44 ` Alejandro Oliván Alvarez
2026-03-19 8:59 ` Fernando Fernandez Mancera
2026-04-22 9:18 ` Thorsten Leemhuis
2026-04-22 10:32 ` Fernando Fernandez Mancera