public inbox for netfilter@vger.kernel.org
 help / color / mirror / Atom feed
From: Fernando Fernandez Mancera <fmancera@suse.de>
To: Rukomoinikova Aleksandra <ARukomoinikova@k2.cloud>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Odintsov Vladislav <VlOdintsov@k2.cloud>,
	"netfilter@vger.kernel.org" <netfilter@vger.kernel.org>,
	"ovs-dev@openvswitch.org" <ovs-dev@openvswitch.org>,
	Kovalev Evgeniy <EvgKovalev@k2.cloud>,
	Vazhnetsov Anton <AVazhnetsov@k2.cloud>
Subject: Re: netfilter: nf_conncount: cpu soft lockup using limiting with Open vSwitch.
Date: Mon, 15 Dec 2025 12:00:05 +0100	[thread overview]
Message-ID: <46357175-2d8b-4d93-9c9b-20bbd51408d9@suse.de> (raw)
In-Reply-To: <b2064e7b-0776-4e14-adb6-c68080987471@k2.cloud>

On 12/12/25 10:27 PM, Rukomoinikova Aleksandra wrote:
> Hi one more time! I found another issue. I'll describe it below.
> 
> In my opinion, it's relevant after merging [1]  I saw that a fix for
> this commit was merged last week, but but it doesn't fix case I'll
> describe below.
> 

Just to be sure, you have tested this with the latest mainline kernel, 
right? Because as you mentioned we merged several fixes addressing 
outdated counts.

> I create limits via Open vSwitch and run a TCP flood this way:
hping3 -S
> -I host -p 10880 -i u5 10.255.41.101 -c 100 (Here, -i u5 is important,
> meaning with timers < jiffies; it's more likely the issue won't
> reproduce otherwise)
> 
And I set the following limit on the zone where these connections
> arrive:
ovs-dpctl ct-set-limits zone=9,limit=100
> 
> I start traffic, I see traffic on the interface: TCP SYN -> TCP SYN+ACK
> -> TCP RST. Zone 9 overflows, but connections immediately become CLOSED.
> I run hping3 again: hping3 -S -I host -p 10880 -i u5 10.255.41.101 -c
> 100 - I don't see anything on the interface and I see messages in dmesg
> from openvswitch saying the number of connections exceeds the limit. At
> this moment, if I call ovs-dpctl ct-get-limits, traffic will immediately
> start flowing again because the limit will be reset to zero.
> 

To me it seems that openvswitch should perform a GC somewhere similar to 
what we did on nft_connlimit/xt_connlimit.

> I think the problem is as follows: before commit [1], we called
> __nf_conncount_gc_list for every connection, and this function iterates
> over all connections and cleans up those already closed.
> 
> What we have now is that when trying to add a connection in
> __nf_conncount_add, if we don't find it, then while handling errors, we
> don't continue the iteration further and immediately exit the function
> with zero, which represents the current connection count - we will clean> connections in the list only until we find the connection we want to
> commit now - meaning the connection count will become outdated.
> 
> Furthermore, we then go to check already_closed found connections and
> iterate collect variable, which also doesn't allow connections to be
> fully cleaned up; we will clean up a maximum of 8
> (CONNCOUNT_GC_MAX_NODES) entries per one call to __nf_conncount_add.
> 
> Also, with such a TCP attack - when connections transition to CLOSED
> immediately - __nf_conncount_gc_list won't be called at all, because we
> will constantly be calling __nf_conncount_add and updating the last_gc
> status. That's why ovs-dpctl ct-get-limits helps; it simply calls
> __nf_conncount_gc_list and cleans up all closed connections.
> I propose the following behavior, which will be similar to what we had
> before [2]
> 
> diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
> index 19039a0802b8..e5224785f01e 100644
> --- a/net/netfilter/nf_conncount.c
> +++ b/net/netfilter/nf_conncount.c
> @@ -171,6 +171,7 @@ static int __nf_conncount_add(struct net *net,
>        struct nf_conn *found_ct;
>        unsigned int collect = 0;
>        bool refcounted = false;
> +    bool need_add = false;
> 
>        if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone,
> &refcounted))
>            return -ENOENT;
> @@ -196,7 +197,8 @@ static int __nf_conncount_add(struct net *net,
>                    if (nf_ct_tuple_equal(&conn->tuple, &tuple) &&
>                        nf_ct_zone_id(&conn->zone, conn->zone.dir) ==
>                        nf_ct_zone_id(zone, zone->dir))
> -                    goto out_put; /* already exists */
> +                    /* already exists */
> +                    need_add = false;

I don't see the point to continue here. If we reached this, it means the 
ct is already tracked. Sure, the count is not being updated but does it 
matter? This connection is already tracked.

>                } else {
>                    collect++;
>                }
> @@ -214,7 +216,7 @@ static int __nf_conncount_add(struct net *net,
>                 * Attempt to avoid a re-add in this case.
>                 */
>                nf_ct_put(found_ct);
> -            goto out_put;
> +            need_add = false;
>            } else if (already_closed(found_ct)) {
>                /*
>                 * we do not care about connections which are
> @@ -222,13 +224,16 @@ static int __nf_conncount_add(struct net *net,
>                 */
>                nf_ct_put(found_ct);
>                conn_free(list, conn);
> -            collect++;
>                continue;
>            }

This worries me a bit, it would mean that for every legit add operation 
the function will go through ALL the connections tracked which might be 
a really huge number. This would impact the performance.

IMO, this is the only relevant line on the patch for your use-case 
probably. I do not think the others have any impact. I am wondering if 
this can be fixed by handling it from openvswitch side by calling gc 
when needed.

I am going to try to reproduce this locally on a machine I have. Let's 
see what I can get.

Thanks,
Fernando.

> 
>            nf_ct_put(found_ct);
>        }
> 
> +    if (!need_add) {
> +        goto out_put;
> +    }
> +
>    add_new_node:
> 
> [1] netfilter: nf_conncount: reduce unnecessary GC commit
> https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee
> [2] netfilter: nf_conncount: merge lookup and add functions commit.
> https://github.com/torvalds/linux/commit/df4a902509766897f7371fdfa4c3bf8bc321b55d
> [3] netfilter: nft_connlimit: update the count if add was skipped
> https://github.com/torvalds/linux/commit/69894e5b4c5e28cda5f32af33d4a92b7a4b93b0e
> 
> Do you think I missed any cases, and how will this affect the function's
> performance? Thanks)
> 



  reply	other threads:[~2025-12-15 11:00 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <7a6872ce-8015-4397-bbe9-22108c65b7ec@k2.cloud>
2025-12-08 12:27 ` netfilter: nf_conncount: cpu soft lockup using limiting with Open vSwitch Odintsov Vladislav
2025-12-09  7:57   ` Fernando Fernandez Mancera
2025-12-09  8:42     ` Pablo Neira Ayuso
2025-12-09  8:49       ` Rukomoinikova Aleksandra
2025-12-10  8:03       ` Fernando Fernandez Mancera
2025-12-12 21:27         ` Rukomoinikova Aleksandra
2025-12-15 11:00           ` Fernando Fernandez Mancera [this message]
2025-12-15 11:07             ` Rukomoinikova Aleksandra
2025-12-15 15:38               ` Fernando Fernandez Mancera
2025-12-08 12:31 Rukomoinikova Aleksandra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46357175-2d8b-4d93-9c9b-20bbd51408d9@suse.de \
    --to=fmancera@suse.de \
    --cc=ARukomoinikova@k2.cloud \
    --cc=AVazhnetsov@k2.cloud \
    --cc=EvgKovalev@k2.cloud \
    --cc=VlOdintsov@k2.cloud \
    --cc=netfilter@vger.kernel.org \
    --cc=ovs-dev@openvswitch.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox