* [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
@ 2026-03-04 17:50 Chris Arges
2026-03-04 21:26 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Chris Arges @ 2026-03-04 17:50 UTC (permalink / raw)
To: Pablo Neira Ayuso, Florian Westphal, stable, linux-kernel,
Greg Kroah-Hartman
Cc: lwn, jslaby, kernel-team, netfilter-devel
Hello,
We've noticed a significant increase in slab unreclaimable memory after upgrading
from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
Our use case involves nft rules like the ones below, added to 1000s of
network namespaces. In essence, we run `nft -f` for all these
namespaces every minute.
```
table inet service_1234567 {
}
delete table inet service_1234567
table inet service_1234567 {
	chain input {
		type filter hook prerouting priority filter; policy accept;
		ip saddr @account.ip_list drop
	}
	set account.ip_list {
		type ipv4_addr
		flags interval
		auto-merge
	}
}
add element inet service_1234567 account.ip_list { /* add 1000s of CIDRs here */ }
```
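A minimal sketch of how such a per-namespace ruleset could be generated, in case it helps anyone build a reproducer. The helper name `render_ruleset` and its parameters are hypothetical, not part of our tooling; it only renders the text shown above for a given service ID and CIDR list, normalizing each CIDR with Python's `ipaddress` module.

```python
import ipaddress


def render_ruleset(service_id: int, cidrs: list[str]) -> str:
    """Render `nft -f` input for one namespace (hypothetical helper)."""
    table = f"inet service_{service_id}"
    # Normalize and validate each CIDR; raises ValueError on malformed input.
    normalized = [str(ipaddress.IPv4Network(c, strict=False)) for c in cidrs]
    elements = ", ".join(normalized)
    lines = [
        # Create-then-delete makes the delete succeed even on first run.
        f"table {table} {{",
        "}",
        f"delete table {table}",
        f"table {table} {{",
        "\tchain input {",
        "\t\ttype filter hook prerouting priority filter; policy accept;",
        "\t\tip saddr @account.ip_list drop",
        "\t}",
        "\tset account.ip_list {",
        "\t\ttype ipv4_addr",
        "\t\tflags interval",
        "\t\tauto-merge",
        "\t}",
        "}",
        f"add element {table} account.ip_list {{ {elements} }}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ruleset(1234567, ["10.0.0.0/8", "192.168.1.0/24"]), end="")
```

The rendered text would then be loaded with `nft -f` once per namespace; how the namespaces are entered is left out here because it depends on the deployment.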
I suspect this is related to:
- 36ed9b6e3961 (upstream 7e43e0a1141deec651a60109dab3690854107298)
- netfilter: nft_set_rbtree: translate rbtree to array for binary search
I'm still digging into this, and plan on reverting commits and seeing if memory
usage goes back to nominal in production. I don't have a trivial
reproducer unfortunately.
Happy to run some additional tests, and I can easily apply patches on top of
linux-6.18.y to run in a test environment.
We are using userspace nftables 1.1.3, but had to apply the patch mentioned
in this thread in order to solve the other regression we encountered:
https://lore.kernel.org/all/e6b43861cda6953cc7f8c259e663b890e53d7785.camel@sapience.com/
Thanks,
--chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-04 17:50 [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory Chris Arges
@ 2026-03-04 21:26 ` Pablo Neira Ayuso
2026-03-04 21:27 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-04 21:26 UTC (permalink / raw)
To: 2026022652-lyricist-washtub-eeb4
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
Hi,
On Wed, Mar 04, 2026 at 11:50:54AM -0600, Chris Arges wrote:
> Hello,
>
> We've noticed significant slab unreclaimable memory increase after upgrading
> from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
> testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
From where are you collecting these memory consumption numbers?
> Our use case is having nft rules like below, but adding them to 1000s of
> network namespaces. This is essentially running `nft -f` for all these
> namespaces every minute.
Those numbers for only 1000? That is too few entries for the
increase in memory usage that you report.
> ```
> table inet service_1234567 {
> }
> delete table inet service_1234567
> table inet service_1234567 {
> chain input {
> type filter hook prerouting priority filter; policy accept;
> ip saddr @account.ip_list drop
> }
> set account.ip_list {
> type ipv4_addr
> flags interval
> auto-merge
> }
> }
> add element inet service_1234567 account.ip_list { /* add 1000s of CIDRs here */ }
> ```
>
> I suspect this is related to:
> - 36ed9b6e3961 (upstream 7e43e0a1141deec651a60109dab3690854107298)
> - netfilter: nft_set_rbtree: translate rbtree to array for binary search
More memory consumption is expected indeed, but not so much as you are
reporting.
> I'm still digging into this, and plan on reverting commits and seeing if memory
> usage goes back to nominal in production. I don't have a trivial
> reproducer unfortunately.
The extra memory comes from the array allocation, the relevant code
is here:
#define NFT_ARRAY_EXTRA_SIZE	10240

/* Similar to nft_rbtree_{u,k}size to hide details to userspace, but consider
 * packed representation coming from userspace for anonymous sets too.
 */
static u32 nft_array_elems(const struct nft_set *set)
> Happy to run some additional tests, and I can easily apply patches on top of
> linux-6.18.y to run in a test environment.
I would need more info to propose a patch; I don't know where you
are pulling such numbers from. You also mention you have no reproducer.
> We are using userspace nftables 1.1.3, but had to apply the patch mentioned
> in this thread: https://lore.kernel.org/all/e6b43861cda6953cc7f8c259e663b890e53d7785.camel@sapience.com/
> In order to solve the other regression we encountered.
Yes, there are plans to revert a kernel patch that went in -stable to
address this.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-04 21:26 ` Pablo Neira Ayuso
@ 2026-03-04 21:27 ` Pablo Neira Ayuso
2026-03-05 16:28 ` Chris Arges
0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-04 21:27 UTC (permalink / raw)
To: Chris Arges
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
Resending, your Reply-To: is botched.
-o-
Hi,
On Wed, Mar 04, 2026 at 11:50:54AM -0600, Chris Arges wrote:
> Hello,
>
> We've noticed significant slab unreclaimable memory increase after upgrading
> from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
> testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
From where are you collecting these memory consumption numbers?
> Our use case is having nft rules like below, but adding them to 1000s of
> network namespaces. This is essentially running `nft -f` for all these
> namespaces every minute.
Those numbers for only 1000? That is too few entries for the
increase in memory usage that you report.
> ```
> table inet service_1234567 {
> }
> delete table inet service_1234567
> table inet service_1234567 {
> chain input {
> type filter hook prerouting priority filter; policy accept;
> ip saddr @account.ip_list drop
> }
> set account.ip_list {
> type ipv4_addr
> flags interval
> auto-merge
> }
> }
> add element inet service_1234567 account.ip_list { /* add 1000s of CIDRs here */ }
> ```
>
> I suspect this is related to:
> - 36ed9b6e3961 (upstream 7e43e0a1141deec651a60109dab3690854107298)
> - netfilter: nft_set_rbtree: translate rbtree to array for binary search
More memory consumption is expected indeed, but not so much as you are
reporting.
> I'm still digging into this, and plan on reverting commits and seeing if memory
> usage goes back to nominal in production. I don't have a trivial
> reproducer unfortunately.
The extra memory comes from the array allocation, the relevant code
is here:
#define NFT_ARRAY_EXTRA_SIZE	10240

/* Similar to nft_rbtree_{u,k}size to hide details to userspace, but consider
 * packed representation coming from userspace for anonymous sets too.
 */
static u32 nft_array_elems(const struct nft_set *set)
> Happy to run some additional tests, and I can easily apply patches on top of
> linux-6.18.y to run in a test environment.
I would need more info to propose a patch; I don't know where you
are pulling such numbers from. You also mention you have no reproducer.
> We are using userspace nftables 1.1.3, but had to apply the patch mentioned
> in this thread: https://lore.kernel.org/all/e6b43861cda6953cc7f8c259e663b890e53d7785.camel@sapience.com/
> In order to solve the other regression we encountered.
Yes, there are plans to revert a kernel patch that went in -stable to
address this.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-04 21:27 ` Pablo Neira Ayuso
@ 2026-03-05 16:28 ` Chris Arges
2026-03-06 12:22 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Chris Arges @ 2026-03-05 16:28 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
On 2026-03-04 22:27:45, Pablo Neira Ayuso wrote:
> Resending, your Reply-To: is botched.
>
> -o-
>
I noticed after I sent, thanks for fixing.
> Hi,
>
> On Wed, Mar 04, 2026 at 11:50:54AM -0600, Chris Arges wrote:
> > Hello,
> >
> > We've noticed significant slab unreclaimable memory increase after upgrading
> > from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
> > testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
>
> From where are you collecting these memory consumption numbers?
>
These numbers come from the cgroup's memory.stat:
```
$ cat /sys/fs/cgroup/path/to/service/memory.stat | grep slab
slab_reclaimable 35874232
slab_unreclaimable 5343553056
slab 5379427288
```
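For tracking this over time, a small sketch along these lines may be handy. The function names are hypothetical; it parses the `slab*` lines of a cgroup v2 `memory.stat` dump (values are in bytes) and converts them to GiB. The sample text is the output pasted above.

```python
def parse_slab_stats(memory_stat_text: str) -> dict[str, int]:
    """Return the slab* counters (in bytes) from memory.stat contents."""
    stats: dict[str, int] = {}
    for line in memory_stat_text.splitlines():
        parts = line.split()
        # memory.stat lines have the form "<key> <value>".
        if len(parts) == 2 and parts[0].startswith("slab"):
            stats[parts[0]] = int(parts[1])
    return stats


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


SAMPLE = """slab_reclaimable 35874232
slab_unreclaimable 5343553056
slab 5379427288
"""

if __name__ == "__main__":
    for key, value in parse_slab_stats(SAMPLE).items():
        print(f"{key}: {to_gib(value):.2f} GiB")
```

In the sample, `slab` is exactly the sum of `slab_reclaimable` and `slab_unreclaimable`, and the unreclaimable part is roughly 5 GiB.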
> > Our use case is having nft rules like below, but adding them to 1000s of
> > network namespaces. This is essentially running `nft -f` for all these
> > namespaces every minute.
>
> Those numbers for only 1000? That is too little number of entries for
> such increase in memory usage that you report.
>
The workload that I suspect (since it's in the cgroup) has the following
characteristics:
- 1000s of namespaces
- 1000s of CIDRs in ip list per namespace
- Updating everything frequently (<1m)
> > ```
> > table inet service_1234567 {
> > }
> > delete table inet service_1234567
> > table inet service_1234567 {
> > chain input {
> > type filter hook prerouting priority filter; policy accept;
> > ip saddr @account.ip_list drop
> > }
> > set account.ip_list {
> > type ipv4_addr
> > flags interval
> > auto-merge
> > }
> > }
> > add element inet service_1234567 account.ip_list { /* add 1000s of CIDRs here */ }
> > ```
> >
> > I suspect this is related to:
> > - 36ed9b6e3961 (upstream 7e43e0a1141deec651a60109dab3690854107298)
> > - netfilter: nft_set_rbtree: translate rbtree to array for binary search
>
> More memory consumption is expected indeed, but not so much as you are
> reporting.
>
> > I'm still digging into this, and plan on reverting commits and seeing if memory
> > usage goes back to nominal in production. I don't have a trivial
> > reproducer unfortunately.
>
> The extra memory comes from the array allocation, the relevant code
> is here:
>
> #define NFT_ARRAY_EXTRA_SIZE 10240
>
> /* Similar to nft_rbtree_{u,k}size to hide details to userspace, but consider
> * packed representation coming from userspace for anonymous sets too.
> */
> static u32 nft_array_elems(const struct nft_set *set)
>
> > Happy to run some additional tests, and I can easily apply patches on top of
> > linux-6.18.y to run in a test environment.
>
> I would need need more info to propose a patch, I don't know where you
> are pulling such numbers. You also mention you have no reproducer.
>
To clarify, this issue _is_ happening in our production environments, so I can
reproduce it there. It only happened when going from 6.18.12 to
6.18.15, and with a service inside a cgroup that is mostly applying large sets
of IPs via nft. I do not have a simple reproducer script or something I can
easily share yet, but am working on that.
I'm going to try to revert the rbtree patch series locally and see if it still
happens. I can also play with NFT_ARRAY_EXTRA_SIZE and see if that is a factor
here as well.
> > We are using userspace nftables 1.1.3, but had to apply the patch mentioned
> > in this thread: https://lore.kernel.org/all/e6b43861cda6953cc7f8c259e663b890e53d7785.camel@sapience.com/
> > In order to solve the other regression we encountered.
>
> Yes, there are plans to revert a kernel patch that went in -stable to
> address this.
Thanks.
--chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-05 16:28 ` Chris Arges
@ 2026-03-06 12:22 ` Pablo Neira Ayuso
2026-03-06 12:25 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-06 12:22 UTC (permalink / raw)
To: Chris Arges
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 1511 bytes --]
Hi Chris,
On Thu, Mar 05, 2026 at 10:28:49AM -0600, Chris Arges wrote:
> I noticed after I sent, thanks for fixing.
> > Hi,
> >
> > On Wed, Mar 04, 2026 at 11:50:54AM -0600, Chris Arges wrote:
> > > Hello,
> > >
> > > We've noticed significant slab unreclaimable memory increase after upgrading
> > > from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
> > > testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
> >
> > From where are you collecting these memory consumption numbers?
> >
>
> These numbers come from the cgroup's memory.stat:
> ```
> $ cat /sys/fs/cgroup/path/to/service/memory.stat | grep slab
> slab_reclaimable 35874232
> slab_unreclaimable 5343553056
> slab 5379427288
> ```
>
> > > Our use case is having nft rules like below, but adding them to 1000s of
> > > network namespaces. This is essentially running `nft -f` for all these
> > > namespaces every minute.
> >
> > Those numbers for only 1000? That is too little number of entries for
> > such increase in memory usage that you report.
> >
>
> For this workload that I suspect (since its in the cgroup) it has the following
> characteristics:
> - 1000s of namespaces
> - 1000s of CIDRs in ip list per namespace
> - Updating everything frequently (<1m)
I see what is going on: my resize logic is not correct. It
increases the size on each new transaction, so the array keeps
getting larger and larger with every transaction update.
Could you please give this patch a try?
Thanks.
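To make the arithmetic concrete, here is a toy model of the growth described above. It is not the kernel code: it assumes, purely for illustration, that every transaction adds `NFT_ARRAY_EXTRA_SIZE` slots of slack on top of the previous capacity, and compares that with reusing the existing capacity. The function name and parameters are hypothetical.

```python
NFT_ARRAY_EXTRA_SIZE = 10240  # value of the kernel #define quoted in this thread


def capacity_after(transactions: int, initial_capacity: int,
                   grow_each_transaction: bool) -> int:
    """Toy model: array capacity (in slots) after a number of transactions."""
    capacity = initial_capacity
    for _ in range(transactions):
        if grow_each_transaction:
            # Assumed buggy behavior: slack is added again on every update.
            capacity += NFT_ARRAY_EXTRA_SIZE
        # Otherwise the next array simply reuses the current capacity.
    return capacity


if __name__ == "__main__":
    # One update per minute for an hour, starting from 1000 slots.
    print(capacity_after(60, 1000, grow_each_transaction=True))
    print(capacity_after(60, 1000, grow_each_transaction=False))
```

Under this assumption, 60 updates turn 1000 slots into 615400, while the stable variant stays at 1000; multiplied across 1000s of namespaces, that kind of growth would be visible in slab usage.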
[-- Attachment #2: fix.patch --]
[-- Type: text/x-diff, Size: 458 bytes --]
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 853ff30a208c..4462ac48fdfa 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -646,7 +646,7 @@ static int nft_array_may_resize(const struct nft_set *set)
 	struct nft_array *array;
 
 	if (!priv->array_next) {
-		array = nft_array_alloc(nelems + NFT_ARRAY_EXTRA_SIZE);
+		array = nft_array_alloc(nelems);
 		if (!array)
 			return -ENOMEM;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-06 12:22 ` Pablo Neira Ayuso
@ 2026-03-06 12:25 ` Pablo Neira Ayuso
2026-03-06 18:20 ` Chris Arges
0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-06 12:25 UTC (permalink / raw)
To: Chris Arges
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 1714 bytes --]
On Fri, Mar 06, 2026 at 01:22:44PM +0100, Pablo Neira Ayuso wrote:
> Hi Chris,
>
> On Thu, Mar 05, 2026 at 10:28:49AM -0600, Chris Arges wrote:
> > I noticed after I sent, thanks for fixing.
> > > Hi,
> > >
> > > On Wed, Mar 04, 2026 at 11:50:54AM -0600, Chris Arges wrote:
> > > > Hello,
> > > >
> > > > We've noticed significant slab unreclaimable memory increase after upgrading
> > > > from 6.18.12 to 6.18.15. Other memory values look fairly close, but in my
> > > > testing slab unreclaimable goes from 1.7 GB to 4.9 GB on machines.
> > >
> > > From where are you collecting these memory consumption numbers?
> > >
> >
> > These numbers come from the cgroup's memory.stat:
> > ```
> > $ cat /sys/fs/cgroup/path/to/service/memory.stat | grep slab
> > slab_reclaimable 35874232
> > slab_unreclaimable 5343553056
> > slab 5379427288
> > ```
> >
> > > > Our use case is having nft rules like below, but adding them to 1000s of
> > > > network namespaces. This is essentially running `nft -f` for all these
> > > > namespaces every minute.
> > >
> > > Those numbers for only 1000? That is too little number of entries for
> > > such increase in memory usage that you report.
> > >
> >
> > For this workload that I suspect (since its in the cgroup) it has the following
> > characteristics:
> > - 1000s of namespaces
> > - 1000s of CIDRs in ip list per namespace
> > - Updating everything frequently (<1m)
>
> I see what is going on, my resize logic is not correct. This is
> increasing the size for each new transaction, then the array is
> getting larger and larger on each transaction update.
>
> Could you please give a try to this patch?
Scratch that.
Please give this patch a try instead.
Thanks.
[-- Attachment #2: fix-rbtree-array-resize.patch --]
[-- Type: text/x-diff, Size: 478 bytes --]
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 853ff30a208c..cffeb6f5c532 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -646,7 +646,7 @@ static int nft_array_may_resize(const struct nft_set *set)
 	struct nft_array *array;
 
 	if (!priv->array_next) {
-		array = nft_array_alloc(nelems + NFT_ARRAY_EXTRA_SIZE);
+		array = nft_array_alloc(priv->array->max_intervals);
 		if (!array)
 			return -ENOMEM;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-06 12:25 ` Pablo Neira Ayuso
@ 2026-03-06 18:20 ` Chris Arges
2026-03-07 0:15 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Chris Arges @ 2026-03-06 18:20 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
On 2026-03-06 13:25:44, Pablo Neira Ayuso wrote:
<snip>
> > I see what is going on, my resize logic is not correct. This is
> > increasing the size for each new transaction, then the array is
> > getting larger and larger on each transaction update.
> >
> > Could you please give a try to this patch?
>
> Scratch that.
>
> Please, give a try to this patch.
>
> Thanks.
Pablo,
Thanks, I'm getting this set up on a few machines. I will have:
- 6.18.15 (original kernel version that reproduced the issue for us)
- 6.18.15 + this patch
- 6.18.15 + revert of the rbtree patch series
I'll compare memory usage across those 3 variants and report back.
--chris
> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> index 853ff30a208c..cffeb6f5c532 100644
> --- a/net/netfilter/nft_set_rbtree.c
> +++ b/net/netfilter/nft_set_rbtree.c
> @@ -646,7 +646,7 @@ static int nft_array_may_resize(const struct nft_set *set)
> struct nft_array *array;
>
> if (!priv->array_next) {
> - array = nft_array_alloc(nelems + NFT_ARRAY_EXTRA_SIZE);
> + array = nft_array_alloc(priv->array->max_intervals);
> if (!array)
> return -ENOMEM;
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION] 6.18.14 netfilter/nftables consumes way more memory
2026-03-06 18:20 ` Chris Arges
@ 2026-03-07 0:15 ` Pablo Neira Ayuso
0 siblings, 0 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-07 0:15 UTC (permalink / raw)
To: Chris Arges
Cc: Florian Westphal, stable, linux-kernel, Greg Kroah-Hartman, lwn,
jslaby, kernel-team, netfilter-devel
On Fri, Mar 06, 2026 at 12:20:16PM -0600, Chris Arges wrote:
> On 2026-03-06 13:25:44, Pablo Neira Ayuso wrote:
> <snip>
> > > I see what is going on, my resize logic is not correct. This is
> > > increasing the size for each new transaction, then the array is
> > > getting larger and larger on each transaction update.
> > >
> > > Could you please give a try to this patch?
> >
> > Scratch that.
> >
> > Please, give a try to this patch.
> >
> > Thanks.
>
> Pablo,
>
> Thanks, I'm getting this set up on a few machines. I will have:
> - 6.18.15 (original kernel version that repo'd the issue for us)
> - 6.18.15 + this patch
> - 6.18.15 + revert rbtree patchseries
>
> I'll compare memory usage with those 3 variants and give a response.
I posted a new patch version, see:
https://patchwork.ozlabs.org/project/netfilter-devel/patch/20260307001124.2897063-1-pablo@netfilter.org/
^ permalink raw reply [flat|nested] 8+ messages in thread