From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Florian Westphal <fw@strlen.de>
Cc: Salvatore Bonaccorso <carnil@debian.org>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Phil Sutter <phil@nwl.cc>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Alejandro Olivan Alvarez <alejandro.olivan.alvarez@gmail.com>,
1130336@bugs.debian.org, netfilter-devel@vger.kernel.org,
coreteam@netfilter.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
stable@vger.kernel.org
Subject: Re: [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
Date: Sun, 15 Mar 2026 02:09:33 +0100
Message-ID: <4da571ab-fa1d-4ee6-b71c-24f4a28243ed@suse.de>
In-Reply-To: <abW2MAAqLnKZm3KF@strlen.de>

On 3/14/26 8:25 PM, Florian Westphal wrote:
> Fernando Fernandez Mancera <fmancera@suse.de> wrote:
>> On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
>>> Hi,
>>>
>>> On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
>>>> Control: forwarded -1
>>>> https://lore.kernel.org/regressions/177349610461.3071718.4083978280323144323@eldamar.lan
>>>> Control: tags -1 + upstream
>>>>
>>>> Hi
>>>>
>>>> In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
>>>> after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
>>>> update the count if add was skipped"), when the following rule is set
>>>>
>>>> iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>>
>>>> connections get stuck accordingly, it can be easily reproduced by:
>>>>
>>>> # iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>> # nft list ruleset
>>>> # Warning: table ip filter is managed by iptables-nft, do not touch!
>>>> table ip filter {
>>>>         chain INPUT {
>>>>                 type filter hook input priority filter; policy accept;
>>>>                 ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
>>>>         }
>>>> }
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:51-- https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
>>>> HTTP request sent, awaiting response... 301 Moved Permanently
>>>> Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
>>>> --2026-03-14 14:53:51-- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
>>>> Reusing existing connection to git.kernel.org:443.
>>>> HTTP request sent, awaiting response... 200 OK
>>>> Length: unspecified [application/x-gzip]
>>>> Saving to: ‘/dev/null’
>>>>
>>>> /dev/null [ <=> ] 248.03M 51.9MB/s in 5.0s
>>>>
>>>> 2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]
>>>>
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:58-- https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
>>>> Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.
>>>>
>>>> Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
>>>> if add was skipped") commit this worked.
>>>>
>>>
>>> Thanks for the report. I have reproduced
>>> this on upstream kernel. I am working on it.
>>>
>>
>> This is what is happening:
>>
>> 1. The first connection is established and
>> tracked, all good. When it finishes, it goes to
>> TIME_WAIT state
>> 2. The second connection is established, ct is
>> confirmed since the beginning, skipping the
>> tracking and calling a GC.
>> 3. The previously tracked connection is cleaned
>> up during GC as TIME_WAIT is considered closed.
>
> This is stupid. The fix is to add --syn or use OUTPUT. It's not even
> clear to me what the user wants to achieve with this rule.
>
Yes, the ruleset shown does not make sense. Having said this, it could
affect a soft-limit scenario such as the one described in the blamed
commit. xt_connlimit was designed with --syn in mind, but that was never
enforced and people have used it for many different things. At least we
are learning that many people ignored --syn completely.
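
To make the sequence quoted above (steps 1 to 3) concrete, here is a
minimal, self-contained toy model in C. It is not the nf_conncount code:
all toy_* names, the fixed-size array and the two-state enum are
assumptions invented for illustration. It only mirrors the steps as
described: a confirmed connection is not added, a GC runs instead, and
the GC drops entries that are in TIME_WAIT.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the kernel's TCP_CONNTRACK_* enum. */
enum toy_state { TOY_ESTABLISHED, TOY_TIME_WAIT };

struct toy_conn {
	int id;
	enum toy_state state;
};

#define TOY_MAX 8

/* Toy stand-in for a per-key list of tracked connections. */
struct toy_list {
	struct toy_conn conns[TOY_MAX];
	size_t count;
};

/* Drop every entry the model treats as closed (here: TIME_WAIT only). */
static void toy_gc(struct toy_list *list)
{
	size_t kept = 0;

	for (size_t i = 0; i < list->count; i++) {
		if (list->conns[i].state != TOY_TIME_WAIT)
			list->conns[kept++] = list->conns[i];
	}
	list->count = kept;
}

/*
 * Assumed behaviour, following step 2: an already-confirmed connection is
 * not added and only triggers a GC. Returns true if the connection ended
 * up tracked in the list.
 */
static bool toy_add(struct toy_list *list, int id, bool confirmed)
{
	if (confirmed) {
		toy_gc(list);
		return false;
	}
	if (list->count == TOY_MAX)
		return false;

	list->conns[list->count].id = id;
	list->conns[list->count].state = TOY_ESTABLISHED;
	list->count++;
	return true;
}
```

In this model, tracking connection 1, moving it to TOY_TIME_WAIT and then
calling toy_add() for a confirmed connection 2 leaves list.count at 0: the
first entry is collected and the second is never added. How that count
feeds into the match verdict is deliberately not modeled here.
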
>> +static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
>> +{
>> +	if (nf_ct_protonum(conn) == IPPROTO_TCP)
>> +		return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
>> +		       conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
>> +	else
>> +		return false;
>> +}
>
> We're adding ever more complex checks in the conncount backend.
> I don't like any of the solutions.
>
As we are already fetching the ct, would it be fine if we instead went
for a protocol-agnostic solution with:

	if (ctinfo == IP_CT_NEW)
		goto check_connections;

inside the confirmed if statement? If I am not wrong, it should be a
valid solution too and IMHO a better one.
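
As a rough sketch of the control flow I have in mind (a standalone model,
not a patch): the sketch_* names and the two-value enum below are
hypothetical stand-ins for the real conntrack types, and I am assuming
that an unconfirmed ct simply continues to the normal connection check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's enum ip_conntrack_info. */
enum sketch_ctinfo { SKETCH_CT_ESTABLISHED, SKETCH_CT_NEW };

/* What the model decides to do with the packet's connection. */
enum sketch_action { SKETCH_SKIP_ADD, SKETCH_CHECK_CONNECTIONS };

/*
 * Protocol-agnostic decision: a confirmed ct is skipped unless the packet
 * still carries the NEW ctinfo, in which case it is checked and counted
 * like any other connection.
 */
static enum sketch_action sketch_decide(bool confirmed,
					enum sketch_ctinfo ctinfo)
{
	if (confirmed) {
		if (ctinfo == SKETCH_CT_NEW)
			return SKETCH_CHECK_CONNECTIONS; /* "goto check_connections" */
		return SKETCH_SKIP_ADD;
	}
	/* Assumption: unconfirmed connections always take the normal path. */
	return SKETCH_CHECK_CONNECTIONS;
}
```

The point of the sketch is only that the extra branch looks at ctinfo and
not at any TCP state, which is why it would not need a per-protocol
helper.
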
> What about reverting the offending commit, at least for tree_count?
> That way it continues to work as it did in the past.
>
Before the fix, soft-limiting scenarios were broken and therefore this
specific ruleset was too. I hope this is not a ruleset in production and
it is just for reproducing the issue.

P.S.: I have been investigating a way to improve the conncount backend
structure so the GC is not that expensive. I don't have anything
relevant yet, but I plan to provide some updates.