From: Mal Haak <malcolm@haak.id.au>
To: linux-kernel@vger.kernel.org
Subject: Re: Possible memory leak in 6.17.7
Date: Sat, 6 Dec 2025 08:23:36 +1000
Message-ID: <20251206082336.6e04a1ac@xps15mal>
In-Reply-To: <20251120122351.231513e1@xps15mal>

I have a reproducer. It's slow but it works.

I kept rsync running for two days, moving 5TB of files.

smem -wp

Area                           Used      Cache   Noncache 
firmware/hardware             0.00%      0.00%      0.00% 
kernel image                  0.00%      0.00%      0.00% 
kernel dynamic memory        98.81%      1.69%     97.13% 
userspace memory              0.08%      0.05%      0.03% 
free memory                   1.11%      1.11%      0.00% 
[root@kerneltest ~]# uname -a
Linux kerneltest 6.18.0-1-mainline #1 SMP PREEMPT_DYNAMIC Tue, 11 Nov 2025 00:02:22 +0000 x86_64 GNU/Linux

The issue is present in 6.18 as well.
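
For anyone who wants to watch the same trend over a long run without
smem, here is a minimal sketch that samples /proc/meminfo. It is only an
approximation and not smem's exact accounting: the "other_noncache"
figure (MemTotal minus free, cache and anonymous pages) is my own rough
stand-in for non-cache kernel memory, and the function names and the
10-minute interval are arbitrary choices.

```python
#!/usr/bin/env python3
"""Periodically log a rough memory breakdown from /proc/meminfo."""
import time


def parse_meminfo(text):
    """Return {field_name: value} for every numeric line (values are kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Express a few meminfo fields as percentages of MemTotal.

    "other_noncache" is an assumption-laden estimate: whatever is left
    after removing free memory, Buffers + Cached, and AnonPages.
    """
    total = fields["MemTotal"]
    free = fields["MemFree"]
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    anon = fields.get("AnonPages", 0)
    other = total - free - cache - anon

    def pct(kb):
        return round(100.0 * kb / total, 2)

    return {
        "free": pct(free),
        "cache": pct(cache),
        "anon": pct(anon),
        "slab_reclaimable": pct(fields.get("SReclaimable", 0)),
        "slab_unreclaimable": pct(fields.get("SUnreclaim", 0)),
        "other_noncache": pct(other),
    }


def sample(path="/proc/meminfo"):
    """Read the given meminfo file and return the percentage summary."""
    with open(path) as f:
        return summarize(parse_meminfo(f.read()))


if __name__ == "__main__":
    # Log one line every 10 minutes until interrupted.
    while True:
        print(time.strftime("%Y-%m-%d %H:%M:%S"), sample(), flush=True)
        time.sleep(600)
```

Redirecting the output to a file while the reproducer runs gives a
timeline in which a steadily growing "other_noncache" share alongside a
shrinking "cache" share would match the smem numbers above.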

On Thu, 20 Nov 2025 12:23:51 +1000
Mal Haak <malcolm@haak.id.au> wrote:

> On Mon, 10 Nov 2025 18:20:08 +1000
> Mal Haak <malcolm@haak.id.au> wrote:
> 
> > Hello,
> > 
> > I have found a memory leak in 6.17.7 but I am unsure how to track it
> > down effectively.
> > 
> > I am running a server that has a heavy read/write workload to a
> > cephfs file system. It is a VM. 
> > 
> > Over time it appears that the non-cache usage of kernel dynamic
> > memory increases. The kernel seems to think the pages are
> > reclaimable; however, nothing appears to trigger the reclaim. This
> > leads to workloads getting killed by the OOM killer.
> > 
> > smem -wp output:
> > 
> > Area                           Used      Cache   Noncache 
> > firmware/hardware             0.00%      0.00%      0.00% 
> > kernel image                  0.00%      0.00%      0.00% 
> > kernel dynamic memory        88.21%     36.25%     51.96% 
> > userspace memory              9.49%      0.15%      9.34% 
> > free memory                   2.30%      2.30%      0.00% 
> > 
> > free -h output:
> > 
> >        total  used   free   shared  buff/cache available 
> > Mem:   31Gi   3.6Gi  500Mi  4.0Mi   11Gi      27Gi 
> > Swap:  4.0Gi  179Mi  3.8Gi
> > 
> > Reverting to the previous LTS fixes the issue.
> > 
> > smem -wp output:
> > Area                           Used      Cache   Noncache 
> > firmware/hardware             0.00%      0.00%      0.00% 
> > kernel image                  0.00%      0.00%      0.00% 
> > kernel dynamic memory        80.22%     79.32%      0.90% 
> > userspace memory             10.48%      0.20%     10.28% 
> > free memory                   9.30%      9.30%      0.00% 
> >   
> I have more information. The leaking of kernel memory only starts once
> there is a lot of data in buffers/cache, and only once it has been in
> that state for several hours.
> 
> Currently, in my search for a reproducer, I have found that
> downloading and then seeding multiple torrents of Linux
> distribution ISOs will replicate the issue. But it only begins
> leaking at around the 6-9 hour mark.
> 
> It does not appear to be dependent on cephfs, but I believe cephfs's
> use of sockets makes the situation worse.
> 
> I cannot replicate it at all with the LTS kernel release but it does
> look like the current RC releases do have this issue. 
> 
> I was looking at doing a kernel build with CONFIG_DEBUG_KMEMLEAK
> enabled, and will do so if it's thought this would find the issue.
> However, as the memory usage is still somewhat tracked and obviously
> marked as reclaimable, it feels more like something in the reclaim
> logic is getting broken.
> 
> Because it only happens after RAM is mostly consumed by cache, and
> even then only if it has been that way for hours, I do wonder whether
> the issue is related to memory fragmentation.
> 
> Regardless, I would appreciate some advice on how to narrow this down
> faster than a git bisect, as needing 9 hours to even confirm
> replication of the issue makes git bisect painfully slow.
> 
> Thanks in advance
> 
> Mal Haak
> 
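
To put a rough number on why bisecting is so slow here: git bisect
needs about log2(N) build-and-test rounds for N candidate commits, and
each round costs roughly 9 hours before the leak shows. A minimal sketch
of that arithmetic follows. The commit count is a made-up placeholder,
since the real number of commits between the LTS kernel and 6.17.7 is
not given in this thread, and the function name is hypothetical.

```python
import math


def bisect_estimate(commits, hours_per_test):
    """Estimate bisect rounds and total test hours.

    Assumes an ideal bisection that halves the candidate range each
    round, so about ceil(log2(commits)) rounds are needed. Real runs can
    differ slightly (merges, skipped commits, unbuildable revisions).
    """
    if commits < 1:
        raise ValueError("commits must be at least 1")
    rounds = math.ceil(math.log2(commits))
    return rounds, rounds * hours_per_test


if __name__ == "__main__":
    # 15000 is a hypothetical commit count used purely for illustration.
    rounds, hours = bisect_estimate(commits=15000, hours_per_test=9)
    print(f"{rounds} rounds, about {hours} hours ({hours / 24:.1f} days)")
```

With the placeholder of 15000 commits this comes to 14 rounds, or about
126 hours of testing, not counting build time.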

