From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org, coreteam@netfilter.org, fw@strlen.de
Subject: Re: [PATCH nf v2] netfilter: nft_connlimit: fix duplicated tracking of a connection
Date: Thu, 30 Oct 2025 09:12:32 +0100 [thread overview]
Message-ID: <c58ae9ad-46f3-4853-bc61-ac725c860160@suse.de> (raw)
In-Reply-To: <aQJ6AysjCMTHLzsP@calendula>
On 10/29/25 9:33 PM, Pablo Neira Ayuso wrote:
> Hi Fernando,
>
> On Wed, Oct 29, 2025 at 02:23:18PM +0100, Fernando Fernandez Mancera wrote:
>> Connlimit expression can be used for all kind of packets and not only
>> for packets with connection state new. See this ruleset as example:
>>
>> table ip filter {
>> chain input {
>> type filter hook input priority filter; policy accept;
>> tcp dport 22 ct count over 4 counter
>> }
>> }
>>
>> Currently, if the connection count goes over the limit the counter will
>> count the packets. When a connection is closed, the connection count
>> won't decrement as it should because it is only updated for new
>> connections due to an optimization on __nf_conncount_add() that prevents
>> updating the list if the connection is duplicated.
>>
>> In addition, since commit d265929930e2 ("netfilter: nf_conncount: reduce
>> unnecessary GC") there can be situations where a duplicated connection
>> is added to the list. This is caused by two packets from the same
>> connection being processed during the same jiffy.
>>
>> To solve these problems, check whether this is a new connection and only
>> add the connection to the list if that is the case during connlimit
>> evaluation. Otherwise run a GC to update the count. This doesn't yield a
>> performance degradation.
>
> This is true is list is small, e.g. ct count over 4.
>
> But user could much larger value, then every packet could trigger a
> long list walk, because gc is bound to CONNCOUNT_GC_MAX_NODES which is
> the maximum number of nodes that is _collected_.
>
> And maybe the user selects:
>
> ct count over N mark set 0x1
>
> where N is high, the gc walk can be long.
>
> TBH, I added this expression mainly focusing on being used with
> dynset, I allowed it too in rules for parity. In the dynset case,
> there is a front-end datastructure (set) and this conncount list
> is per element. Maybe there high ct count is also possible.
>
> With this patch, gc is called more frequently, not only for each new
> packet.
>
How is it called more frequently? Before, it was calling
nf_conncount_add() for every packet which is indeed performing a GC
inside, both nf_conncount_add() and nf_conncount_gc_list() return
immediately if a GC was performed during the same jiffy.
I tried with a limit of 2000 connections and didn't notice any
performance change. I could try with CONNCOUNT_GC_MAX_NODES too.
>> Fixed in xt_connlimit too.
>>
>> Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC")
>> Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup")
>> Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/
>> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
>> ---
>> v2: use nf_ct_is_confirmed(), add comment about why the gc call is
>> needed and fix this in xt_connlimit too.
>> ---
>> net/netfilter/nft_connlimit.c | 17 ++++++++++++++---
>> net/netfilter/xt_connlimit.c | 14 ++++++++++++--
>> 2 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/netfilter/nft_connlimit.c b/net/netfilter/nft_connlimit.c
>> index fc35a11cdca2..dedea1681e73 100644
>> --- a/net/netfilter/nft_connlimit.c
>> +++ b/net/netfilter/nft_connlimit.c
>> @@ -43,9 +43,20 @@ static inline void nft_connlimit_do_eval(struct nft_connlimit *priv,
>> return;
>> }
>>
>> - if (nf_conncount_add(nft_net(pkt), priv->list, tuple_ptr, zone)) {
>> - regs->verdict.code = NF_DROP;
>> - return;
>> + if (!ct || !nf_ct_is_confirmed(ct)) {
>> + if (nf_conncount_add(nft_net(pkt), priv->list, tuple_ptr, zone)) {
>> + regs->verdict.code = NF_DROP;
>> + return;
>> + }
>> + } else {
>> + /* Call gc to update the list count if any connection has been
>> + * closed already. This is useful to softlimit connections
>> + * like limiting bandwidth based on a number of open
>> + * connections.
>> + */
>> + local_bh_disable();
>> + nf_conncount_gc_list(nft_net(pkt), priv->list);
>> + local_bh_enable();
>> }
>>
>> count = READ_ONCE(priv->list->count);
>> diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
>> index 0189f8b6b0bd..5c90e1929d86 100644
>> --- a/net/netfilter/xt_connlimit.c
>> +++ b/net/netfilter/xt_connlimit.c
>> @@ -69,8 +69,18 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
>> key[1] = zone->id;
>> }
>>
>> - connections = nf_conncount_count(net, info->data, key, tuple_ptr,
>> - zone);
>> + if (!ct || !nf_ct_is_confirmed(ct)) {
>> + connections = nf_conncount_count(net, info->data, key, tuple_ptr,
>> + zone);
>> + } else {
>> + /* Call nf_conncount_count() with NULL tuple and zone to update
>> + * the list if any connection has been closed already. This is
>> + * useful to softlimit connections like limiting bandwidth based
>> + * on a number of open connections.
>> + */
>> + connections = nf_conncount_count(net, info->data, key, NULL, NULL);
>> + }
>
> Maybe remove this from xt_connlimit?
>
Dropping this would leave xt_connlimit broken for the use-cases I
discussed with Florian on the v1.
>> +
>> if (connections == 0)
>> /* kmalloc failed, drop it entirely */
>> goto hotdrop;
>> --
>> 2.51.0
>>
next prev parent reply other threads:[~2025-10-30 8:12 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-29 13:23 [PATCH nf v2] netfilter: nft_connlimit: fix duplicated tracking of a connection Fernando Fernandez Mancera
2025-10-29 20:33 ` Pablo Neira Ayuso
2025-10-30 8:12 ` Fernando Fernandez Mancera [this message]
2025-10-30 23:04 ` Pablo Neira Ayuso
2025-10-30 23:40 ` Florian Westphal
2025-10-31 8:55 ` Fernando Fernandez Mancera
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c58ae9ad-46f3-4853-bc61-ac725c860160@suse.de \
--to=fmancera@suse.de \
--cc=coreteam@netfilter.org \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).