public inbox for netdev@vger.kernel.org
From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Florian Westphal <fw@strlen.de>
Cc: Salvatore Bonaccorso <carnil@debian.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Phil Sutter <phil@nwl.cc>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	Alejandro Olivan Alvarez <alejandro.olivan.alvarez@gmail.com>,
	1130336@bugs.debian.org, netfilter-devel@vger.kernel.org,
	coreteam@netfilter.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
	stable@vger.kernel.org
Subject: Re: [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
Date: Sun, 15 Mar 2026 02:09:33 +0100
Message-ID: <4da571ab-fa1d-4ee6-b71c-24f4a28243ed@suse.de>
In-Reply-To: <abW2MAAqLnKZm3KF@strlen.de>

On 3/14/26 8:25 PM, Florian Westphal wrote:
> Fernando Fernandez Mancera <fmancera@suse.de> wrote:
>> On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
>>> Hi,
>>>
>>> On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
>>>> Control: forwarded -1
>>>> https://lore.kernel.org/regressions/177349610461.3071718.4083978280323144323@eldamar.lan
>>>> Control: tags -1 + upstream
>>>>
>>>> Hi
>>>>
>>>> In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
>>>> after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
>>>> update the count if add was skipped"), when the following rule is set
>>>>
>>>>      iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>>
>>>> connections get stuck accordingly, it can be easily reproduced by:
>>>>
>>>> # iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
>>>> # nft list ruleset
>>>> # Warning: table ip filter is managed by iptables-nft, do not touch!
>>>> table ip filter {
>>>>           chain INPUT {
>>>>                   type filter hook input priority filter; policy accept;
>>>>                   ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
>>>>           }
>>>> }
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:51--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
>>>> HTTP request sent, awaiting response... 301 Moved Permanently
>>>> Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
>>>> --2026-03-14 14:53:51--  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
>>>> Reusing existing connection to git.kernel.org:443.
>>>> HTTP request sent, awaiting response... 200 OK
>>>> Length: unspecified [application/x-gzip]
>>>> Saving to: ‘/dev/null’
>>>>
>>>> /dev/null                         [ <=>                    ] 248.03M  51.9MB/s    in 5.0s
>>>>
>>>> 2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]
>>>>
>>>> # wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> --2026-03-14 14:53:58--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
>>>> Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
>>>> Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
>>>> Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.
>>>>
>>>> Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
>>>> if add was skipped") commit this worked.
>>>>
>>>
>>> Thanks for the report. I have reproduced this on an upstream kernel. I am
>>> working on it.
>>>
>>
>> This is what is happening:
>>
>> 1. The first connection is established and tracked, all good. When it
>> finishes, it goes to TIME_WAIT state.
>> 2. The second connection is established, ct is confirmed since the
>> beginning, skipping the tracking and calling a GC.
>> 3. The previously tracked connection is cleaned up during GC as TIME_WAIT
>> is considered closed.
> 
> This is stupid.  The fix is to add --syn or use OUTPUT.  It's not even
> clear to me what the user wants to achieve with this rule.
> 

Yes, the ruleset shown does not make sense. Having said that, it could 
affect a soft-limit scenario like the one described in the blamed commit.

xt_connlimit was designed with --syn in mind, but that was never enforced 
and people have used it for many different things. At least we are 
learning that many people ignored --syn completely.

>> +static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
>> +{
>> +	if (nf_ct_protonum(conn) == IPPROTO_TCP)
>> +		return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
>> +		       conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
>> +	else
>> +		return false;
>> +}
> 
> We're adding ever more complex checks in the conncount backend.
> I don't like any of the solutions.
> 

As we are already fetching the ct, would it be fine to go instead for a 
protocol-agnostic solution with:

if (ctinfo == IP_CT_NEW)
	goto check_connections;

inside the if block for a confirmed ct? If I am not mistaken, it should be 
a valid solution too and, IMHO, a better one.
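
To make the idea concrete, here is a minimal userspace sketch of the
decision that check would add. It is not the nf_conncount code: ct_model,
ctinfo_model, add_action and decide_add() are hypothetical stand-ins made
up for illustration, and the real add path does much more. The only thing
taken from the snippet above is the rule itself: a confirmed ct whose
ctinfo is IP_CT_NEW still goes through the connection check instead of
being skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's enum ip_conntrack_info. */
enum ctinfo_model {
	MODEL_CT_ESTABLISHED,
	MODEL_CT_RELATED,
	MODEL_CT_NEW,
};

/* Hypothetical, reduced view of what the add path knows about a ct. */
struct ct_model {
	bool confirmed;           /* models nf_ct_is_confirmed(ct) */
	enum ctinfo_model ctinfo; /* models the ctinfo of the current packet */
};

enum add_action {
	ADD_SKIP_AND_GC,       /* skip the tracking and only run the GC */
	ADD_CHECK_CONNECTIONS, /* continue to the normal lookup/add path */
};

/* Protocol-agnostic rule: skip only for confirmed, non-NEW packets. */
enum add_action decide_add(const struct ct_model *ct)
{
	if (ct->confirmed) {
		if (ct->ctinfo == MODEL_CT_NEW)
			return ADD_CHECK_CONNECTIONS;
		return ADD_SKIP_AND_GC;
	}
	return ADD_CHECK_CONNECTIONS;
}

/* Usage example covering the three cases the rule distinguishes. */
void decide_add_example(void)
{
	struct ct_model confirmed_new = { .confirmed = true, .ctinfo = MODEL_CT_NEW };
	struct ct_model confirmed_est = { .confirmed = true, .ctinfo = MODEL_CT_ESTABLISHED };
	struct ct_model unconfirmed = { .confirmed = false, .ctinfo = MODEL_CT_NEW };

	assert(decide_add(&confirmed_new) == ADD_CHECK_CONNECTIONS);
	assert(decide_add(&confirmed_est) == ADD_SKIP_AND_GC);
	assert(decide_add(&unconfirmed) == ADD_CHECK_CONNECTIONS);
}
```

Unlike tcp_syn_sent_or_recv(), the rule in this sketch never looks at the
TCP state, which is what would make it protocol agnostic.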

> What about reverting the offending commit, at least for tree_count?
> That way it continues to work as it did in the past.
> 

Before the fix, soft-limiting scenarios were broken and therefore this 
specific ruleset was too. I hope this is not a ruleset in production and 
it is just for reproducing the issue.

P.S.: I have been investigating a way to improve the conncount backend 
structure so that the GC is not that expensive. I don't have anything 
relevant yet, but I plan to provide some updates.

Thread overview: 8+ messages
2026-03-14 14:03 [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped") Salvatore Bonaccorso
2026-03-14 16:13 ` Fernando Fernandez Mancera
2026-03-14 19:00   ` Fernando Fernandez Mancera
2026-03-14 19:25     ` Florian Westphal
2026-03-15  1:09       ` Fernando Fernandez Mancera [this message]
2026-03-18 12:49         ` Bug#1130336: " Salvatore Bonaccorso
2026-03-19  8:44           ` Alejandro Oliván Alvarez
2026-03-19  8:59             ` Fernando Fernandez Mancera
