netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org, coreteam@netfilter.org, fw@strlen.de
Subject: Re: [PATCH nf v2] netfilter: nft_connlimit: fix duplicated tracking of a connection
Date: Thu, 30 Oct 2025 09:12:32 +0100	[thread overview]
Message-ID: <c58ae9ad-46f3-4853-bc61-ac725c860160@suse.de> (raw)
In-Reply-To: <aQJ6AysjCMTHLzsP@calendula>



On 10/29/25 9:33 PM, Pablo Neira Ayuso wrote:
> Hi Fernando,
> 
> On Wed, Oct 29, 2025 at 02:23:18PM +0100, Fernando Fernandez Mancera wrote:
>> Connlimit expression can be used for all kind of packets and not only
>> for packets with connection state new. See this ruleset as example:
>>
>> table ip filter {
>>          chain input {
>>                  type filter hook input priority filter; policy accept;
>>                  tcp dport 22 ct count over 4 counter
>>          }
>> }
>>
>> Currently, if the connection count goes over the limit the counter will
>> count the packets. When a connection is closed, the connection count
>> won't decrement as it should because it is only updated for new
>> connections due to an optimization on __nf_conncount_add() that prevents
>> updating the list if the connection is duplicated.
>>
>> In addition, since commit d265929930e2 ("netfilter: nf_conncount: reduce
>> unnecessary GC") there can be situations where a duplicated connection
>> is added to the list. This is caused by two packets from the same
>> connection being processed during the same jiffy.
>>
>> To solve these problems, check whether this is a new connection and only
>> add the connection to the list if that is the case during connlimit
>> evaluation. Otherwise run a GC to update the count. This doesn't yield a
>> performance degradation.
> 
> This is true is list is small, e.g. ct count over 4.
> 
> But user could much larger value, then every packet could trigger a
> long list walk, because gc is bound to CONNCOUNT_GC_MAX_NODES which is
> the maximum number of nodes that is _collected_.
> 
> And maybe the user selects:
> 
>    ct count over N mark set 0x1
> 
> where N is high, the gc walk can be long.
> 
> TBH, I added this expression mainly focusing on being used with
> dynset, I allowed it too in rules for parity. In the dynset case,
> there is a front-end datastructure (set) and this conncount list
> is per element. Maybe there high ct count is also possible.
> 
> With this patch, gc is called more frequently, not only for each new
> packet.
> 

How is it called more frequently? Before, it was calling 
nf_conncount_add() for every packet which is indeed performing a GC 
inside, both nf_conncount_add() and nf_conncount_gc_list() return 
immediately if a GC was performed during the same jiffy.

I tried with a limit of 2000 connections and didn't notice any 
performance change. I could try with CONNCOUNT_GC_MAX_NODES too.

>> Fixed in xt_connlimit too.
>>
>> Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC")
>> Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup")
>> Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/
>> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
>> ---
>> v2: use nf_ct_is_confirmed(), add comment about why the gc call is
>> needed and fix this in xt_connlimit too.
>> ---
>>   net/netfilter/nft_connlimit.c | 17 ++++++++++++++---
>>   net/netfilter/xt_connlimit.c  | 14 ++++++++++++--
>>   2 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/netfilter/nft_connlimit.c b/net/netfilter/nft_connlimit.c
>> index fc35a11cdca2..dedea1681e73 100644
>> --- a/net/netfilter/nft_connlimit.c
>> +++ b/net/netfilter/nft_connlimit.c
>> @@ -43,9 +43,20 @@ static inline void nft_connlimit_do_eval(struct nft_connlimit *priv,
>>   		return;
>>   	}
>>   
>> -	if (nf_conncount_add(nft_net(pkt), priv->list, tuple_ptr, zone)) {
>> -		regs->verdict.code = NF_DROP;
>> -		return;
>> +	if (!ct || !nf_ct_is_confirmed(ct)) {
>> +		if (nf_conncount_add(nft_net(pkt), priv->list, tuple_ptr, zone)) {
>> +			regs->verdict.code = NF_DROP;
>> +			return;
>> +		}
>> +	} else {
>> +		/* Call gc to update the list count if any connection has been
>> +		 * closed already. This is useful to softlimit connections
>> +		 * like limiting bandwidth based on a number of open
>> +		 * connections.
>> +		 */
>> +		local_bh_disable();
>> +		nf_conncount_gc_list(nft_net(pkt), priv->list);
>> +		local_bh_enable();
>>   	}
>>   
>>   	count = READ_ONCE(priv->list->count);
>> diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
>> index 0189f8b6b0bd..5c90e1929d86 100644
>> --- a/net/netfilter/xt_connlimit.c
>> +++ b/net/netfilter/xt_connlimit.c
>> @@ -69,8 +69,18 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
>>   		key[1] = zone->id;
>>   	}
>>   
>> -	connections = nf_conncount_count(net, info->data, key, tuple_ptr,
>> -					 zone);
>> +	if (!ct || !nf_ct_is_confirmed(ct)) {
>> +		connections = nf_conncount_count(net, info->data, key, tuple_ptr,
>> +						 zone);
>> +	} else {
>> +		/* Call nf_conncount_count() with NULL tuple and zone to update
>> +		 * the list if any connection has been closed already. This is
>> +		 * useful to softlimit connections like limiting bandwidth based
>> +		 * on a number of open connections.
>> +		 */
>> +		connections = nf_conncount_count(net, info->data, key, NULL, NULL);
>> +	}
> 
> Maybe remove this from xt_connlimit?
>

Dropping this would leave xt_connlimit broken for the use-cases I 
discussed with Florian on the v1.
>> +
>>   	if (connections == 0)
>>   		/* kmalloc failed, drop it entirely */
>>   		goto hotdrop;
>> -- 
>> 2.51.0
>>


  reply	other threads:[~2025-10-30  8:12 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-29 13:23 [PATCH nf v2] netfilter: nft_connlimit: fix duplicated tracking of a connection Fernando Fernandez Mancera
2025-10-29 20:33 ` Pablo Neira Ayuso
2025-10-30  8:12   ` Fernando Fernandez Mancera [this message]
2025-10-30 23:04     ` Pablo Neira Ayuso
2025-10-30 23:40       ` Florian Westphal
2025-10-31  8:55       ` Fernando Fernandez Mancera

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c58ae9ad-46f3-4853-bc61-ac725c860160@suse.de \
    --to=fmancera@suse.de \
    --cc=coreteam@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).