From: Mal Haak <malcolm@haak.id.au>
To: linux-kernel@vger.kernel.org
Subject: Re: Possible memory leak in 6.17.7
Date: Sat, 6 Dec 2025 08:23:36 +1000
Message-ID: <20251206082336.6e04a1ac@xps15mal>
In-Reply-To: <20251120122351.231513e1@xps15mal>

I have a reproducer. It's slow but it works.

I kept rsync running for 2 days, moving 5TB of files.

smem -wp

Area                           Used      Cache   Noncache 
firmware/hardware             0.00%      0.00%      0.00% 
kernel image                  0.00%      0.00%      0.00% 
kernel dynamic memory        98.81%      1.69%     97.13% 
userspace memory              0.08%      0.05%      0.03% 
free memory                   1.11%      1.11%      0.00% 

[root@kerneltest ~]# uname -a
Linux kerneltest 6.18.0-1-mainline #1 SMP PREEMPT_DYNAMIC Tue, 11 Nov 2025 00:02:22 +0000 x86_64 GNU/Linux

The issue is also present in 6.18.
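
A minimal Python sketch for tracking the same trend without smem, by
deriving a similar "kernel dynamic memory" breakdown from /proc/meminfo.
The formula is an assumption about how smem -w arrives at its numbers
(total minus free minus userspace, with buffers, reclaimable slab and
unmapped page cache counted as cache), so the result is an approximation
and may not match smem exactly. The function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(parts[0])
    return fields


def kernel_dynamic(mem: dict) -> dict:
    """Approximate the 'kernel dynamic memory' row of smem -w, in kB.

    Assumed formula (not taken from the original report):
      userspace = AnonPages + Mapped
      used      = MemTotal - MemFree - userspace
      cache     = Buffers + SReclaimable + (Cached - Mapped)
      noncache  = used - cache
    """
    userspace = mem["AnonPages"] + mem["Mapped"]
    used = mem["MemTotal"] - mem["MemFree"] - userspace
    cache = mem["Buffers"] + mem["SReclaimable"] + mem["Cached"] - mem["Mapped"]
    return {"used": used, "cache": cache, "noncache": used - cache}


def noncache_percent(mem: dict) -> float:
    """Noncache kernel dynamic memory as a percentage of MemTotal."""
    return 100.0 * kernel_dynamic(mem)["noncache"] / mem["MemTotal"]
```

Reading /proc/meminfo every few minutes, passing the text through
parse_meminfo() and logging noncache_percent() may show the noncache
share climbing well before the OOM killer gets involved.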

On Thu, 20 Nov 2025 12:23:51 +1000
Mal Haak <malcolm@haak.id.au> wrote:

> On Mon, 10 Nov 2025 18:20:08 +1000
> Mal Haak <malcolm@haak.id.au> wrote:
> 
> > Hello,
> > 
> > I have found a memory leak in 6.17.7 but I am unsure how to track it
> > down effectively.
> > 
> > I am running a server that has a heavy read/write workload to a
> > cephfs file system. It is a VM. 
> > 
> > Over time it appears that the non-cache usage of kernel dynamic
> > memory increases. The kernel seems to think the pages are
> > reclaimable; however, nothing appears to trigger the reclaim. This
> > leads to workloads getting killed by the OOM killer.
> > 
> > smem -wp output:
> > 
> > Area                           Used      Cache   Noncache 
> > firmware/hardware             0.00%      0.00%      0.00% 
> > kernel image                  0.00%      0.00%      0.00% 
> > kernel dynamic memory        88.21%     36.25%     51.96% 
> > userspace memory              9.49%      0.15%      9.34% 
> > free memory                   2.30%      2.30%      0.00% 
> > 
> > free -h output:
> > 
> >        total  used   free   shared  buff/cache available 
> > Mem:   31Gi   3.6Gi  500Mi  4.0Mi   11Gi      27Gi 
> > Swap:  4.0Gi  179Mi  3.8Gi
> > 
> > Reverting to the previous LTS fixes the issue.
> > 
> > smem -wp output:
> > Area                           Used      Cache   Noncache 
> > firmware/hardware             0.00%      0.00%      0.00% 
> > kernel image                  0.00%      0.00%      0.00% 
> > kernel dynamic memory        80.22%     79.32%      0.90% 
> > userspace memory             10.48%      0.20%     10.28% 
> > free memory                   9.30%      9.30%      0.00% 
> >   
> I have more information. The leaking of kernel memory only starts once
> there is a lot of data in buffers/cache. And only once it's been in
> that state for several hours. 
> 
> Currently in my search for a reproducer I have found that
> downloading and then seeding multiple torrents of Linux
> distribution ISOs will replicate the issue. But it only begins
> leaking at around the 6-9 hour mark.
> 
> It does not appear to be dependent on cephfs, but due to its use of
> sockets I believe cephfs is making the situation worse.
> 
> I cannot replicate it at all with the LTS kernel release but it does
> look like the current RC releases do have this issue. 
> 
> I was looking at doing a kernel build with CONFIG_DEBUG_KMEMLEAK
> enabled, and will do so if it's thought this would find the issue.
> However, as the memory usage is still somewhat tracked and obviously
> marked as reclaimable, it feels more like something in the reclaim
> logic is broken.
> 
> Because it only happens after RAM is mostly consumed by cache, and
> even then only if it has been that way for hours, I do wonder if the
> issue is related to memory fragmentation.
> 
> Regardless, I would appreciate some advice on how to narrow this down
> faster than a git bisect, as needing 9 hours just to confirm
> replication of the issue makes bisecting painfully slow.
> 
> Thanks in advance
> 
> Mal Haak
> 
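
To put the bisect cost mentioned above into numbers, here is a rough
Python sketch. It assumes git bisect needs about ceil(log2(N)) test
builds for N candidate commits, which is the usual estimate for a mostly
linear history. The 9 hour figure comes from the report; the commit
count in the usage note is purely hypothetical and is not a measured
count for any kernel range. The function names are hypothetical too.

```python
def bisect_steps(num_commits: int) -> int:
    """Rough number of test builds to bisect num_commits candidates.

    Uses integer arithmetic: (n - 1).bit_length() == ceil(log2(n)) for n >= 1.
    """
    if num_commits < 1:
        raise ValueError("num_commits must be at least 1")
    return (num_commits - 1).bit_length()


def bisect_hours(num_commits: int, hours_per_test: float = 9.0) -> float:
    """Estimated wall-clock hours, assuming each step takes hours_per_test."""
    return bisect_steps(num_commits) * hours_per_test
```

With a hypothetical range of 16000 commits, bisect_steps(16000) gives 14
steps, so bisect_hours(16000) comes to 126 hours of reproduction time,
not counting build time or inconclusive runs.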

