Re: Possible memory leak in 6.17.7

From: "David Wang" <00107082@163.com>
To: "Mal Haak" <malcolm@haak.id.au>
Cc: linux-kernel@vger.kernel.org, surenb@google.com,
	xiubli@redhat.com, idryomov@gmail.com,
	ceph-devel@vger.kernel.org
Subject: Re: Possible memory leak in 6.17.7
Date: Thu, 11 Dec 2025 11:28:21 +0800 (CST)
Message-ID: <2a9ba88e.3aa6.19b0b73dd4e.Coremail.00107082@163.com>
In-Reply-To: <20251210234318.5d8c2d68@xps15mal>



At 2025-12-10 21:43:18, "Mal Haak" <malcolm@haak.id.au> wrote:
>On Tue, 9 Dec 2025 12:40:21 +0800 (CST)
>"David Wang" <00107082@163.com> wrote:
>
>> At 2025-12-09 07:08:31, "Mal Haak" <malcolm@haak.id.au> wrote:
>> >On Mon,  8 Dec 2025 19:08:29 +0800
>> >David Wang <00107082@163.com> wrote:
>> >  
>> >> On Mon, 10 Nov 2025 18:20:08 +1000
>> >> Mal Haak <malcolm@haak.id.au> wrote:  
>> >> > Hello,
>> >> > 
>> >> > I have found a memory leak in 6.17.7 but I am unsure how to
>> >> > track it down effectively.
>> >> > 
>> >> >     
>> >> 
>> >> I think the `memory allocation profiling` feature can help.
>> >> https://docs.kernel.org/mm/allocation-profiling.html
>> >> 
>> >> You would need to build a kernel with 
>> >> CONFIG_MEM_ALLOC_PROFILING=y
>> >> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
>> >> 
>> >> And check /proc/allocinfo for the suspicious allocations which take
>> >> more memory than expected.
>> >> 
>> >> (I once caught a nvidia driver memory leak.)
>> >> 
>> >> 
>> >> FYI
>> >> David
>> >>   
>> >
>> >Thank you for your suggestion. I have some results.
>> >
>> >Ran the rsync workload for about 9 hours. It started to look like it
>> >was happening.
>> ># smem -pw
>> >Area                           Used      Cache   Noncache 
>> >firmware/hardware             0.00%      0.00%      0.00% 
>> >kernel image                  0.00%      0.00%      0.00% 
>> >kernel dynamic memory        80.46%     65.80%     14.66% 
>> >userspace memory              0.35%      0.16%      0.19% 
>> >free memory                  19.19%     19.19%      0.00% 
>> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
>> >         22M     5609 mm/memory.c:1190 func:folio_prealloc
>> >         23M     1932 fs/xfs/xfs_buf.c:226 [xfs] func:xfs_buf_alloc_backing_mem
>> >         24M    24135 fs/xfs/xfs_icache.c:97 [xfs] func:xfs_inode_alloc
>> >         27M     6693 mm/memory.c:1192 func:folio_prealloc
>> >         58M    14784 mm/page_ext.c:271 func:alloc_page_ext
>> >        258M      129 mm/khugepaged.c:1069 func:alloc_charge_folio
>> >        430M   770788 lib/xarray.c:378 func:xas_alloc
>> >        545M    36444 mm/slub.c:3059 func:alloc_slab_page
>> >        9.8G  2563617 mm/readahead.c:189 func:ractl_alloc_folio
>> >         20G  5164004 mm/filemap.c:2012 func:__filemap_get_folio
>> >
>> >
>> >So I stopped the workload and dropped caches to confirm.
>> >
>> ># echo 3 > /proc/sys/vm/drop_caches
>> ># smem -pw
>> >Area                           Used      Cache   Noncache 
>> >firmware/hardware             0.00%      0.00%      0.00% 
>> >kernel image                  0.00%      0.00%      0.00% 
>> >kernel dynamic memory        33.45%      0.09%     33.36% 
>> >userspace memory              0.36%      0.16%      0.19% 
>> >free memory                  66.20%     66.20%      0.00% 
>> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
>> >         12M     2987 mm/execmem.c:41 func:execmem_vmalloc 
>> >         12M        3 kernel/dma/pool.c:96 func:atomic_pool_expand 
>> >         13M      751 mm/slub.c:3061 func:alloc_slab_page 
>> >         16M        8 mm/khugepaged.c:1069 func:alloc_charge_folio 
>> >         18M     4355 mm/memory.c:1190 func:folio_prealloc 
>> >         24M     6119 mm/memory.c:1192 func:folio_prealloc 
>> >         58M    14784 mm/page_ext.c:271 func:alloc_page_ext 
>> >         61M    15448 mm/readahead.c:189 func:ractl_alloc_folio 
>> >         79M     6726 mm/slub.c:3059 func:alloc_slab_page 
>> >         11G  2674488 mm/filemap.c:2012 func:__filemap_get_folio
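
Comparing the two listings above: ractl_alloc_folio fell from 9.8G to 61M
after drop_caches, while __filemap_get_folio only fell from 20G to 11G.
Below is a minimal Python sketch of doing that comparison automatically, so
it can be repeated on later snapshots. The names parse_allocinfo() and
survivors() and the 1 GiB threshold are my own choices for illustration,
not part of any kernel tool. The sample rows are copied from the listings
above, and the parser accepts both raw byte counts and numfmt --to=iec
output.

```python
import re

# Multipliers for the suffixes printed by `numfmt --to=iec`.
_IEC = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
# <size>[suffix] <calls> <tag info>
_ROW = re.compile(r"^\s*([0-9.]+)([KMGT]?)\s+(\d+)\s+(.+?)\s*$")


def parse_allocinfo(text):
    """Map each tag (source location plus func:) to (bytes, calls)."""
    rows = {}
    for line in text.splitlines():
        m = _ROW.match(line)
        if not m:  # header or wrapped line: skip it
            continue
        size, suffix, calls, tag = m.groups()
        rows[tag] = (int(float(size) * _IEC[suffix]), int(calls))
    return rows


def survivors(before, after, min_bytes=1 << 30):
    """Tags still holding at least min_bytes after the cache drop."""
    out = []
    for tag, (size_after, _calls) in after.items():
        if size_after >= min_bytes:
            size_before = before.get(tag, (0, 0))[0]
            out.append((tag, size_before, size_after))
    # Largest remaining consumer first.
    return sorted(out, key=lambda row: row[2], reverse=True)


# Rows copied from the two listings quoted above.
BEFORE = """\
        9.8G  2563617 mm/readahead.c:189 func:ractl_alloc_folio
         20G  5164004 mm/filemap.c:2012 func:__filemap_get_folio
"""
AFTER = """\
         61M    15448 mm/readahead.c:189 func:ractl_alloc_folio
         11G  2674488 mm/filemap.c:2012 func:__filemap_get_folio
"""

if __name__ == "__main__":
    for tag, b, a in survivors(parse_allocinfo(BEFORE), parse_allocinfo(AFTER)):
        print(f"{a / (1 << 30):6.1f}G of {b / (1 << 30):6.1f}G kept by {tag}")
```

On a live system one would save the output of cat /proc/allocinfo to a file
before and after the cache drop and pass both files' contents to
parse_allocinfo().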

Narrowing down which callers of __filemap_get_folio account for the "Noncache" memory might help clarify things.
(Just a guess: this could be by design, with the memory needing some route other than dropping caches to be released.)
If you want, you can modify the code to split the accounting for __filemap_get_folio by caller.

The following is a draft patch (based on v6.18):

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 09b581c1d878..ba8c659a6ae3 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -753,7 +753,11 @@ static inline fgf_t fgf_set_order(size_t size)
 }
 
 void *filemap_get_entry(struct address_space *mapping, pgoff_t index);
-struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
+
+#define __filemap_get_folio(...)			\
+	alloc_hooks(__filemap_get_folio_noprof(__VA_ARGS__))
+
+struct folio *__filemap_get_folio_noprof(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp);
 struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp);
diff --git a/mm/filemap.c b/mm/filemap.c
index 024b71da5224..e1c1c26d7cb3 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1938,7 +1938,7 @@ void *filemap_get_entry(struct address_space *mapping, pgoff_t index)
  *
  * Return: The found folio or an ERR_PTR() otherwise.
  */
-struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
+struct folio *__filemap_get_folio_noprof(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp)
 {
 	struct folio *folio;
@@ -2009,7 +2009,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			err = -ENOMEM;
 			if (order > min_order)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
-			folio = filemap_alloc_folio(alloc_gfp, order);
+			folio = filemap_alloc_folio_noprof(alloc_gfp, order);
 			if (!folio)
 				continue;
 
@@ -2056,7 +2056,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		folio_clear_dropbehind(folio);
 	return folio;
 }
-EXPORT_SYMBOL(__filemap_get_folio);
+EXPORT_SYMBOL(__filemap_get_folio_noprof);
 
 static inline struct folio *find_get_entry(struct xa_state *xas, pgoff_t max,
 		xa_mark_t mark)
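
With a patch like this applied, alloc_hooks() creates a separate tag
wherever the __filemap_get_folio() macro expands, so /proc/allocinfo should
report those sites individually instead of one mm/filemap.c line. A rough
Python sketch for summing such entries per owner (module name when present,
source directory otherwise) follows. The function name bytes_by_owner() is
mine, and every row in SAMPLE is invented purely to exercise the parser; it
is not output from a real system. The header lines are written as I recall
them and are skipped either way. The script expects raw /proc/allocinfo
(byte counts without numfmt suffixes).

```python
import posixpath
import re
from collections import defaultdict

# <size> <calls> <file>:<line> [module] func:<name>; the [module] part is optional.
_ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+):(\d+)\s+(?:\[([^\]]+)\]\s+)?func:(\S+)"
)


def bytes_by_owner(text):
    """Sum bytes per module, or per source directory for built-in code."""
    totals = defaultdict(int)
    for line in text.splitlines():
        m = _ROW.match(line)
        if not m:  # header lines do not start with a number
            continue
        size, _calls, path, _lineno, module, _func = m.groups()
        owner = module if module else posixpath.dirname(path)
        totals[owner] += int(size)
    return dict(totals)


# HYPOTHETICAL rows: every path, module and function name here is invented.
SAMPLE = """\
allocinfo - version: 1.0
#     <size>  <calls> <tag info>
        4096        1 fs/examplefs/file.c:10 [examplefs] func:examplefs_write_begin
        8192        2 fs/examplefs/dir.c:20 [examplefs] func:examplefs_readdir
       16384        4 mm/example.c:30 func:example_get_folio
"""

if __name__ == "__main__":
    ranked = sorted(bytes_by_owner(SAMPLE).items(), key=lambda kv: kv[1], reverse=True)
    for owner, size in ranked:
        print(f"{size:>12} {owner}")
```

Grouping by owner is only a first cut; once one module stands out, the
individual file:line entries for it are the ones worth reading.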




FYI
David

>> >
>> >So if I'm reading this correctly, something is causing folios to
>> >collect and they cannot be freed?  
>> 
>> CC cephfs; maybe someone there can easily make sense of this folio
>> usage.
>> 
>> 
>> >
>> >Also it's clear that some of the folios are counted as cache and
>> >some aren't. 
>> >
>> >Like I said, 6.17 and 6.18 both have the issue; 6.12 does not. I'm now
>> >going to walk through previous kernel releases manually to find where
>> >it first appears, purely because Rust and Python incompatibilities
>> >make building earlier kernels, and therefore a git-bisect, a bit fun.
>> >
>> >I'll do it the packages way until I get closer, then solve the build
>> >issues. 
>> >
>> >Thanks,
>> >Mal
>> >  
>Thanks David.
>
>I've contacted the ceph developers as well. 
>
>There was a suggestion it was due to the change from, to quote:
>folio.free() to folio.put() or something like this.
>
>The change happened around 6.14/6.15
>
>I've found an easier reproducer. 
>
>There has been a suggestion that perhaps the ceph team might not fix
>this as "you can just reboot before the machine becomes unstable" and
>"Nobody else has encountered this bug"
>
>I'll leave that to other people to make a call on but I'd assume the
>lack of reports is due to the fact that most stable distros are still
>on a far too early kernel and/or are using the fuse driver with k8s.
>
>Anyway, thanks for your assistance.
