* [syzbot] [cgroups?] [mm?] WARNING in folio_lruvec_lock
@ 2025-06-19 12:02 syzbot
From: syzbot @ 2025-06-19 12:02 UTC (permalink / raw)
To: akpm, cgroups, hannes, linux-kernel, linux-mm, mhocko,
muchun.song, roman.gushchin, shakeel.butt, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: bc6e0ba6c9ba Add linux-next specific files for 20250613
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1126090c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=2f7a2e4d17ed458f
dashboard link: https://syzkaller.appspot.com/bug?extid=a74a028d848147bc5931
compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2430bb0465cc/disk-bc6e0ba6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/436a39deef0a/vmlinux-bc6e0ba6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e314ca5b1eb3/bzImage-bc6e0ba6.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a74a028d848147bc5931@syzkaller.appspotmail.com
handle_mm_fault+0x740/0x8e0 mm/memory.c:6397
faultin_page mm/gup.c:1186 [inline]
__get_user_pages+0x1aef/0x30b0 mm/gup.c:1488
populate_vma_page_range+0x29f/0x3a0 mm/gup.c:1922
__mm_populate+0x24c/0x380 mm/gup.c:2025
mm_populate include/linux/mm.h:3354 [inline]
vm_mmap_pgoff+0x3f0/0x4c0 mm/util.c:584
ksys_mmap_pgoff+0x587/0x760 mm/mmap.c:607
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
page_owner free stack trace missing
------------[ cut here ]------------
WARNING: CPU: 0 PID: 38 at ./include/linux/memcontrol.h:732 folio_lruvec include/linux/memcontrol.h:732 [inline]
WARNING: CPU: 0 PID: 38 at ./include/linux/memcontrol.h:732 folio_lruvec_lock+0x150/0x1a0 mm/memcontrol.c:1211
Modules linked in:
CPU: 0 UID: 0 PID: 38 Comm: ksmd Not tainted 6.16.0-rc1-next-20250613-syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
RIP: 0010:folio_lruvec include/linux/memcontrol.h:732 [inline]
RIP: 0010:folio_lruvec_lock+0x150/0x1a0 mm/memcontrol.c:1211
Code: 7c 25 00 00 74 08 4c 89 ff e8 7c 66 f8 ff 4d 89 2f eb c4 48 89 df 48 c7 c6 60 4f 98 8b e8 58 9b dc ff c6 05 01 85 5f 0d 01 90 <0f> 0b 90 e9 d5 fe ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 4d
RSP: 0018:ffffc90000ae7660 EFLAGS: 00010046
RAX: b21d845e3554e000 RBX: ffffea0002108000 RCX: b21d845e3554e000
RDX: 0000000000000002 RSI: ffffffff8db792e4 RDI: ffff88801de83c00
RBP: ffffea0002108000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffffbfff1bfaa14 R12: ffffea0002108000
R13: ffffea0002108008 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff888125c41000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f475c15ef98 CR3: 000000005f95a000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__split_unmapped_folio+0x42e/0x2cb0 mm/huge_memory.c:3487
__folio_split+0xf78/0x1300 mm/huge_memory.c:3891
cmp_and_merge_page mm/ksm.c:2358 [inline]
ksm_do_scan+0x499b/0x6530 mm/ksm.c:2665
ksm_scan_thread+0x10b/0x4b0 mm/ksm.c:2687
kthread+0x711/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3f9/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [cgroups?] [mm?] WARNING in folio_lruvec_lock
@ 2025-06-20 11:50 Lorenzo Stoakes
From: Lorenzo Stoakes @ 2025-06-20 11:50 UTC (permalink / raw)
To: syzbot
Cc: akpm, cgroups, hannes, linux-kernel, linux-mm, mhocko,
muchun.song, roman.gushchin, shakeel.butt, syzkaller-bugs
OK, I think this might well be me - apologies. I definitely see a
suspicious-looking bug. TL;DR: I will fix it; it's not upstream yet.
Thanks to Andrew for forwarding this to me - that's some insight there!
So it looks like in [0] we are, mistakenly, doing the KSM flag update _before_
the .mmap() hook, which is... not good.
This results in the correct checks not being applied to the VMA, because
e.g. VM_HUGETLB will not be set until after the .mmap() hook has completed
(I'm working on converting the hooks to .mmap_prepare(), but we're not there
yet...).
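To make the ordering concrete, here is a rough, hand-written sketch of the
kind of flow being described (illustrative only - ksm_add_vma() is the real
KSM helper, but the rest is simplified and is not the actual code in [0]):

        /* Simplified mmap_region()-style flow, for illustration only. */
        vm_flags_set(vma, vm_flags);          /* generic flags only at this point  */
        ksm_add_vma(vma);                     /* KSM eligibility decided too early:
                                               * VM_HUGETLB is not yet set...       */
        error = file->f_op->mmap(file, vma);  /* ...because the .mmap() hook (e.g.
                                               * hugetlbfs) is what sets it         */

With that ordering, a hugetlb mapping can end up marked VM_MERGEABLE, which is
exactly the kind of mis-flagging discussed below.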
[0]: https://lore.kernel.org/all/3ba660af716d87a18ca5b4e635f2101edeb56340.1748537921.git.lorenzo.stoakes@oracle.com/
I will send a fix there.
Thanks, Lorenzo
On Thu, Jun 19, 2025 at 05:02:31AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: bc6e0ba6c9ba Add linux-next specific files for 20250613
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1126090c580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=2f7a2e4d17ed458f
> dashboard link: https://syzkaller.appspot.com/bug?extid=a74a028d848147bc5931
> compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/2430bb0465cc/disk-bc6e0ba6.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/436a39deef0a/vmlinux-bc6e0ba6.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/e314ca5b1eb3/bzImage-bc6e0ba6.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a74a028d848147bc5931@syzkaller.appspotmail.com
>
> handle_mm_fault+0x740/0x8e0 mm/memory.c:6397
I mean this is:
ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
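(For context - from memory, the branch in handle_mm_fault() that gets us here
looks roughly like the below; treat it as a paraphrase rather than an exact
quote:)

        if (unlikely(is_vm_hugetlb_page(vma)))
                ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
        else
                ret = __handle_mm_fault(vma, address, flags);

i.e. we only take the hugetlb path because VM_HUGETLB is set on the VMA.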
Interestingly, I see in mem_cgroup_charge_hugetlb():
        /*
         * Even memcg does not account for hugetlb, we still want to update
         * system-level stats via lruvec_stat_mod_folio. Return 0, and skip
         * charging the memcg.
         */
        if (mem_cgroup_disabled() || !memcg_accounts_hugetlb() ||
            !memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
                goto out;

        if (charge_memcg(folio, memcg, gfp))
                ret = -ENOMEM;
So maybe somehow KSM is touching hugetlb (it shouldn't do...) which has an
uncharged folio...?
This aligns with us having set KSM flags at the wrong time on a hugetlb mapping.
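For reference, KSM is only supposed to mark a VMA mergeable if it passes a
compatibility check that rejects hugetlb outright - roughly the following
(quoted from memory, flag list abbreviated):

        /* mm/ksm.c, abbreviated - not an exact quote */
        static bool vma_ksm_compatible(struct vm_area_struct *vma)
        {
                if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE | VM_PFNMAP |
                                     VM_IO | VM_DONTEXPAND | VM_HUGETLB |
                                     VM_MIXEDMAP))
                        return false;
                ...
                return true;
        }

If this runs before .mmap() has had a chance to set VM_HUGETLB, the check
passes and the VMA is wrongly treated as mergeable.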
> faultin_page mm/gup.c:1186 [inline]
> __get_user_pages+0x1aef/0x30b0 mm/gup.c:1488
> populate_vma_page_range+0x29f/0x3a0 mm/gup.c:1922
> __mm_populate+0x24c/0x380 mm/gup.c:2025
> mm_populate include/linux/mm.h:3354 [inline]
> vm_mmap_pgoff+0x3f0/0x4c0 mm/util.c:584
> ksys_mmap_pgoff+0x587/0x760 mm/mmap.c:607
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> page_owner free stack trace missing
I'm guessing this is the stack of the process that triggered the issue (even
though syzkaller can't repro :P).
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 38 at ./include/linux/memcontrol.h:732 folio_lruvec include/linux/memcontrol.h:732 [inline]
This is:
static inline struct lruvec *folio_lruvec(struct folio *folio)
{
        struct mem_cgroup *memcg = folio_memcg(folio);

        VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio); <---- here
        return mem_cgroup_lruvec(memcg, folio_pgdat(folio));
}
Meaning folio_memcg() is failing to find a memcg for the folio.
I'm not really that familiar with the cgroup implementation, but:
static inline struct mem_cgroup *folio_memcg(struct folio *folio)
{
        if (folio_memcg_kmem(folio))
                return obj_cgroup_memcg(__folio_objcg(folio));
        return __folio_memcg(folio); <--- seems this is what is returning NULL?
}
I guess it's __folio_memcg() that's returning NULL as apparently
obj_cgroup_memcg() should always return something non-NULL.
And this is:
static inline struct mem_cgroup *__folio_memcg(struct folio *folio)
{
        unsigned long memcg_data = folio->memcg_data;

        ...

        return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK);
}
So if folio->memcg_data is 0, or is 0 once masked, this will return NULL.
I see this is set to NULL (or rather 0) in mem_cgroup_migrate(), in
__memcg_kmem_uncharge_page() (but this isn't kmem, is it?), and in
uncharge_folio().
The memcg is also set via charge_memcg() -> commit_charge(), so perhaps a
charge was expected here that didn't happen somehow?
This again aligns with a mis-flagged hugetlb folio.
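For completeness, commit_charge() is what installs that pointer - roughly:

        /* mm/memcontrol.c, abbreviated */
        static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
        {
                ...
                folio->memcg_data = (unsigned long)memcg;
        }

So a folio that was never charged would simply never have had memcg_data set,
which would match the NULL we see here.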
> WARNING: CPU: 0 PID: 38 at ./include/linux/memcontrol.h:732 folio_lruvec_lock+0x150/0x1a0 mm/memcontrol.c:1211
This is:
struct lruvec *folio_lruvec_lock(struct folio *folio)
{
        struct lruvec *lruvec = folio_lruvec(folio); <---- here

        spin_lock(&lruvec->lru_lock);
        lruvec_memcg_debug(lruvec, folio);

        return lruvec;
}
> Modules linked in:
> CPU: 0 UID: 0 PID: 38 Comm: ksmd Not tainted 6.16.0-rc1-next-20250613-syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
> RIP: 0010:folio_lruvec include/linux/memcontrol.h:732 [inline]
> RIP: 0010:folio_lruvec_lock+0x150/0x1a0 mm/memcontrol.c:1211
> Code: 7c 25 00 00 74 08 4c 89 ff e8 7c 66 f8 ff 4d 89 2f eb c4 48 89 df 48 c7 c6 60 4f 98 8b e8 58 9b dc ff c6 05 01 85 5f 0d 01 90 <0f> 0b 90 e9 d5 fe ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 4d
> RSP: 0018:ffffc90000ae7660 EFLAGS: 00010046
> RAX: b21d845e3554e000 RBX: ffffea0002108000 RCX: b21d845e3554e000
> RDX: 0000000000000002 RSI: ffffffff8db792e4 RDI: ffff88801de83c00
> RBP: ffffea0002108000 R08: 0000000000000003 R09: 0000000000000004
> R10: dffffc0000000000 R11: fffffbfff1bfaa14 R12: ffffea0002108000
> R13: ffffea0002108008 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff888125c41000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f475c15ef98 CR3: 000000005f95a000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> __split_unmapped_folio+0x42e/0x2cb0 mm/huge_memory.c:3487
This is:
static int __split_unmapped_folio(struct folio *folio, int new_order,
                struct page *split_at, struct page *lock_at,
                struct list_head *list, pgoff_t end,
                struct xa_state *xas, struct address_space *mapping,
                bool uniform_split)
{
        ...
        /* lock lru list/PageCompound, ref frozen by page_ref_freeze */
        lruvec = folio_lruvec_lock(folio); <--- here
        ...
}
So we're splitting an unmapped folio that is locked, not on the LRU, and
frozen (refcount == 0).
Interestingly, __split_folio_to_order() sets (new_)folio->memcg_data, but this
is called _after_ this folio_lruvec_lock().
> __folio_split+0xf78/0x1300 mm/huge_memory.c:3891
This is:
        ret = __split_unmapped_folio(folio, new_order,
                        split_at, lock_at, list, end, &xas, mapping,
                        uniform_split);
> cmp_and_merge_page mm/ksm.c:2358 [inline]
So we have tried to merge two pages:
        kfolio = try_to_merge_two_pages(rmap_item, page,
                                        tree_rmap_item, tree_page);
But failed:
        /*
         * If both pages we tried to merge belong to the same compound
         * page, then we actually ended up increasing the reference
         * count of the same compound page twice, and split_huge_page
         * failed.
         * Here we set a flag if that happened, and we use it later to
         * try split_huge_page again. Since we call put_page right
         * afterwards, the reference count will be correct and
         * split_huge_page should succeed.
         */
        split = PageTransCompound(page)
                && compound_head(page) == compound_head(tree_page);

        if (kfolio) {
                ...
        } else if (split) {
                /*
                 * We are here if we tried to merge two pages and
                 * failed because they both belonged to the same
                 * compound page. We will split the page now, but no
                 * merging will take place.
                 * We do not want to add the cost of a full lock; if
                 * the page is locked, it is better to skip it and
                 * perhaps try again later.
                 */
                if (!trylock_page(page))
                        return;
                split_huge_page(page); <---- this is where the failure occurs.
                unlock_page(page);
        }
> ksm_do_scan+0x499b/0x6530 mm/ksm.c:2665
> ksm_scan_thread+0x10b/0x4b0 mm/ksm.c:2687
> kthread+0x711/0x8a0 kernel/kthread.c:464
> ret_from_fork+0x3f9/0x770 arch/x86/kernel/process.c:148
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>