From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
To: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Cc: <jdamato@fastly.com>, <intel-wired-lan@lists.osuosl.org>,
<netdev@vger.kernel.org>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
"Igor Raits" <igor@gooddata.com>,
Daniel Secik <daniel.secik@gooddata.com>,
"Zdenek Pesek" <zdenek.pesek@gooddata.com>,
Jakub Kicinski <kuba@kernel.org>,
"Eric Dumazet" <edumazet@google.com>,
Martin Karsten <mkarsten@uwaterloo.ca>,
"Ahmed Zaki" <ahmed.zaki@intel.com>,
"Czapnik, Lukasz" <lukasz.czapnik@intel.com>,
Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Subject: Re: Increased memory usage on NUMA nodes with ICE driver after upgrade to 6.13.y (regression in commit 492a044508ad)
Date: Tue, 15 Apr 2025 16:38:40 +0200
Message-ID: <4a061a51-8a6c-42b8-9957-66073b4bc65f@intel.com>
In-Reply-To: <CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com>
On 4/14/25 18:29, Jaroslav Pulchart wrote:
> Hello,
+CC to co-devs and reviewers of initial napi_config introduction
+CC Ahmed, who leverages napi_config for more stuff in 6.15
>
> While investigating increased memory usage after upgrading our
> host/hypervisor servers from Linux kernel 6.12.y to 6.13.y, I observed
> a regression in available memory per NUMA node. Our servers allocate
> 60GB of each NUMA node’s 64GB of RAM to HugePages for VMs, leaving 4GB
> for the host OS.
>
> After the upgrade, we noticed approximately 500MB less free RAM on
> NUMA nodes 0 and 2 compared to 6.12.y, even with no VMs running (just
> the host OS after reboot). These nodes host Intel E810-XXV NICs. Here's
> a snapshot of the NUMA stats on vanilla 6.13.y:
>
> NUMA nodes:     0     1     2     3     4     5     6     7
> HPFreeGiB:     60    60    60    60    60    60    60    60
> MemTotal:   64989 65470 65470 65470 65470 65470 65470 65453
> MemFree:     2793  3559  3150  3438  3616  3722  3520  3547
>
> NUMA nodes:     8     9    10    11    12    13    14    15
> HPFreeGiB:     60    60    60    60    60    60    60    60
> MemTotal:   65470 65470 65470 65470 65470 65470 65470 65462
> MemFree:     3547  3536  3506  3452  3440  3489  3607  3729
>
> We traced the issue to commit 492a044508ad13a490a24c66f311339bf891cb5f
> "ice: Add support for persistent NAPI config".
Thank you for the report and bisection.
This commit is ice's opt-in to using the persistent napi_config.
I have checked the code, and there is nothing obvious in the touched parts
that would inflate memory consumption in the driver or the core. I have not
yet looked into how much memory is eaten by the hash array of now-kept
configs.
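
For reference, the opt-in itself is a per-vector change at NAPI registration
time; a rough sketch of the relevant hunk (from memory, exact file and
function names may differ):

    /* before: plain registration, per-queue settings are lost whenever
     * the channels are torn down and re-created
     */
    netif_napi_add(vsi->netdev, &vsi->q_vectors[v_idx]->napi,
                   ice_napi_poll);

    /* after 492a044508ad: tie each vector to dev->napi_config[v_idx],
     * so its settings persist across queue/channel reconfiguration
     */
    netif_napi_add_config(vsi->netdev, &vsi->q_vectors[v_idx]->napi,
                          ice_napi_poll, v_idx);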
>
> We limit the number of channels on the NICs to match the local NUMA cores,
> or fewer for unused interfaces (down from the ridiculous default of 96),
> for example:
We will experiment with other defaults; it looks like the total number of
CPUs, instead of the local NUMA cores, might be better here. And even if
that resolved the issue, I would like to have a more direct fix for this.
> ethtool -L em1 combined 6 # active port; from 96
> ethtool -L p3p2 combined 2 # unused port; from 96
>
> This typically aligns memory use with local CPUs and keeps NUMA-local
> memory usage within expected limits. However, starting with kernel
> 6.13.y and this commit, the high memory usage by the ICE driver
> persists regardless of reduced channel configuration.
As a workaround, you could try a devlink reload (action driver_reinit);
that should flush all NAPI instances.
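Something along these lines (the PCI address is only a placeholder, use the
one of the affected port):

    devlink dev reload pci/0000:3b:00.0 action driver_reinit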
We will try to reproduce the issue locally and work on a fix.
>
> Reverting the commit restores expected memory availability on nodes 0
> and 2. Below are stats from 6.13.y with the commit reverted:
> NUMA nodes:     0     1     2     3     4     5     6     7
> HPFreeGiB:     60    60    60    60    60    60    60    60
> MemTotal:   64989 65470 65470 65470 65470 65470 65470 65453
> MemFree:     3208  3765  3668  3507  3811  3727  3812  3546
>
> NUMA nodes:     8     9    10    11    12    13    14    15
> HPFreeGiB:     60    60    60    60    60    60    60    60
> MemTotal:   65470 65470 65470 65470 65470 65470 65470 65462
> MemFree:     3676  3596   ...
>
> This brings nodes 0 and 2 back to ~3.5GB free RAM, similar to kernel
> 6.12.y, and avoids swap pressure and memory exhaustion when running
> services and VMs.
>
> I also do not see any practical benefit in persisting the channel
> memory allocation. After a fresh server reboot, channels are not
> explicitly configured, and the system will not automatically resize
> them back to a higher count unless manually set again. Therefore,
> retaining the previous memory footprint appears unnecessary and
> potentially harmful in memory-constrained environments.
In this particular case there is indeed no benefit; it was designed to keep
the config/stats for queues that were meaningfully used.
It is rather clunky anyway.
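
For context, what the core keeps per queue index is roughly the following
(field list from memory, as of 6.13; exact layout may differ):

    struct napi_config {
            u64 gro_flush_timeout;
            u64 irq_suspend_timeout;
            u32 defer_hard_irqs;
            unsigned int napi_id;
    };

    /* one entry per queue index, allocated in alloc_netdev_mqs() and
     * kept for the lifetime of the netdev
     */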
>
> Best regards,
> Jaroslav Pulchart
Thread overview: 46+ messages
2025-04-14 16:29 Increased memory usage on NUMA nodes with ICE driver after upgrade to 6.13.y (regression in commit 492a044508ad) Jaroslav Pulchart
2025-04-14 17:15 ` [Intel-wired-lan] " Paul Menzel
2025-04-15 14:38 ` Przemek Kitszel [this message]
2025-04-16 0:53 ` Jakub Kicinski
2025-04-16 7:13 ` Jaroslav Pulchart
2025-04-16 13:48 ` Jakub Kicinski
2025-04-16 16:03 ` Jaroslav Pulchart
2025-04-16 22:44 ` Jakub Kicinski
2025-04-16 22:57 ` [Intel-wired-lan] " Keller, Jacob E
2025-04-16 22:57 ` Keller, Jacob E
2025-04-17 0:13 ` Jakub Kicinski
2025-04-17 17:52 ` Keller, Jacob E
2025-05-21 10:50 ` Jaroslav Pulchart
2025-06-04 8:42 ` Jaroslav Pulchart
[not found] ` <CAK8fFZ5XTO9dGADuMSV0hJws-6cZE9equa3X6dfTBgDyzE1pEQ@mail.gmail.com>
2025-06-25 14:03 ` Przemek Kitszel
[not found] ` <CAK8fFZ7LREBEdhXjBAKuaqktOz1VwsBTxcCpLBsa+dkMj4Pyyw@mail.gmail.com>
2025-06-25 20:25 ` Jakub Kicinski
2025-06-26 7:42 ` Jaroslav Pulchart
2025-06-30 7:35 ` Jaroslav Pulchart
2025-06-30 16:02 ` Jacob Keller
2025-06-30 17:24 ` Jaroslav Pulchart
2025-06-30 18:59 ` Jacob Keller
2025-06-30 20:01 ` Jaroslav Pulchart
2025-06-30 20:42 ` Jacob Keller
2025-06-30 21:56 ` Jacob Keller
2025-06-30 23:16 ` Jacob Keller
2025-07-01 6:48 ` Jaroslav Pulchart
2025-07-01 20:48 ` Jacob Keller
2025-07-02 9:48 ` Jaroslav Pulchart
2025-07-02 18:01 ` Jacob Keller
2025-07-02 21:56 ` Jacob Keller
2025-07-03 6:46 ` Jaroslav Pulchart
2025-07-03 16:16 ` Jacob Keller
2025-07-04 19:30 ` Maciej Fijalkowski
2025-07-07 18:32 ` Jacob Keller
2025-07-07 22:03 ` Jacob Keller
2025-07-09 0:50 ` Jacob Keller
2025-07-09 19:11 ` Jacob Keller
2025-07-09 21:04 ` Jaroslav Pulchart
2025-07-09 21:15 ` Jacob Keller
2025-07-11 18:16 ` Jaroslav Pulchart
2025-07-11 22:30 ` Jacob Keller
2025-07-14 5:34 ` Jaroslav Pulchart
2025-06-25 14:53 ` Paul Menzel
2025-07-04 16:55 ` Michal Kubiak
2025-07-05 7:01 ` Jaroslav Pulchart
2025-07-07 15:37 ` Jaroslav Pulchart