* 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 @ 2024-09-24 22:28 Mikhail Gavrilov 2024-10-02 17:34 ` Mikhail Gavrilov 0 siblings, 1 reply; 8+ messages in thread From: Mikhail Gavrilov @ 2024-09-24 22:28 UTC (permalink / raw) To: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 7384 bytes --]

Hi,
I am testing kernel snapshots on Fedora Rawhide, and today, with a build of commit de5cb0dcb74c, I saw "KASAN: slab-use-after-free in m_next+0x13b" for the first time.
Unfortunately it is not clear what triggered this problem, because it happened after 21 hours of uptime.

Full trace looks like:

input: Noble FoKus Mystique (AVRCP) as /devices/virtual/input/input26 ================================================================== BUG: KASAN: slab-use-after-free in m_next+0x13b/0x170 Read of size 8 at addr ffff8885609b40f0 by task htop/3847 CPU: 14 UID: 1000 PID: 3847 Comm: htop Tainted: G W L ------- --- 6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64+debug #1 Tainted: [W]=WARN, [L]=SOFTLOCKUP Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 3040 09/12/2024 Call Trace: <TASK> dump_stack_lvl+0x84/0xd0 ? m_next+0x13b/0x170 print_report+0x174/0x505 ? m_next+0x13b/0x170 ? __virt_addr_valid+0x231/0x420 ? m_next+0x13b/0x170 kasan_report+0xab/0x180 ? m_next+0x13b/0x170 m_next+0x13b/0x170 seq_read_iter+0x8e5/0x1130 seq_read+0x2b4/0x3c0 ? __pfx_seq_read+0x10/0x10 ? inode_security+0x54/0xf0 ? rw_verify_area+0x3b2/0x5e0 vfs_read+0x165/0xa20 ? __pfx_vfs_read+0x10/0x10 ? ktime_get_coarse_real_ts64+0x41/0xd0 ? local_clock_noinstr+0xd/0x100 ? __pfx_lock_release+0x10/0x10 ksys_read+0xfb/0x1d0 ? __pfx_ksys_read+0x10/0x10 ? ktime_get_coarse_real_ts64+0x41/0xd0 do_syscall_64+0x97/0x190 ? __lock_acquire+0xdcd/0x62c0 ? __pfx___lock_acquire+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? audit_filter_inodes.part.0+0x12d/0x220 ? local_clock_noinstr+0xd/0x100 ? __pfx_lock_release+0x10/0x10 ? rcu_is_watching+0x12/0xc0 ? kfree+0x27c/0x4d0 ? audit_reset_context+0x8c5/0xee0 ? lockdep_hardirqs_on_prepare+0x171/0x400 ? do_syscall_64+0xa3/0x190 ? lockdep_hardirqs_on+0x7c/0x100 ? do_syscall_64+0xa3/0x190 ?
do_syscall_64+0xa3/0x190 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f4190dcac36 Code: 89 df e8 2d c1 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 15 83 e2 39 83 fa 08 75 0d e8 32 ff ff ff 66 90 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08 RSP: 002b:00007ffcde82b690 EFLAGS: 00000202 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 00007f4190ce3740 RCX: 00007f4190dcac36 RDX: 0000000000000400 RSI: 000055bf5e823a20 RDI: 0000000000000005 RBP: 00007ffcde82b6a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007f4190f44fd0 R13: 00007f4190f44e80 R14: 000055bf5e823e20 R15: 000055bf5ecc9160 </TASK> Allocated by task 176289: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x6e/0x70 kmem_cache_alloc_noprof+0x15a/0x3d0 vm_area_dup+0x23/0x190 __split_vma+0x137/0xd40 vms_gather_munmap_vmas+0x29d/0xfc0 mmap_region+0x35a/0x1f50 do_mmap+0x8e7/0x1020 vm_mmap_pgoff+0x178/0x2f0 __do_fast_syscall_32+0x86/0x110 do_fast_syscall_32+0x32/0x80 sysret32_from_system_call+0x0/0x4a Freed by task 0: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x37/0x50 kmem_cache_free+0x1a7/0x5a0 rcu_do_batch+0x3fd/0x1120 rcu_core+0x636/0x9b0 handle_softirqs+0x1e9/0x8d0 __irq_exit_rcu+0xbb/0x1c0 irq_exit_rcu+0xe/0x30 sysvec_apic_timer_interrupt+0xa1/0xd0 asm_sysvec_apic_timer_interrupt+0x1a/0x20 Last potentially related work creation: kasan_save_stack+0x30/0x50 __kasan_record_aux_stack+0x8e/0xa0 __call_rcu_common.constprop.0+0xf4/0x10d0 vma_complete+0x720/0x10b0 commit_merge+0x42a/0x1310 vma_expand+0x313/0xad0 vma_merge_new_range+0x2cd/0xec0 mmap_region+0x432/0x1f50 do_mmap+0x8e7/0x1020 vm_mmap_pgoff+0x178/0x2f0 __do_fast_syscall_32+0x86/0x110 do_fast_syscall_32+0x32/0x80 sysret32_from_system_call+0x0/0x4a The buggy address belongs to the object at ffff8885609b40f0 which belongs to the cache vm_area_struct of size 176 The buggy address is located 0 bytes inside of freed 176-byte region [ffff8885609b40f0, ffff8885609b41a0) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5609b4 head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff88814d36d001 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 head: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 head: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 head: 0017ffffc0000001 ffffea0015826d01 ffffffffffffffff 0000000000000000 head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8885609b3f80: 00 00 00 00 00 00 00 00 00 00 00 00task_mmu 00 00 00 00 ffff8885609b4000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8885609b4080: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fa fb ^ ffff8885609b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8885609b4180: fb fb fb fb fc fc fc fc fc fc fc fc 00 00 00 00 ================================================================== Disabling lock debugging due to kernel taint > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux m_next+0x13b m_next+0x13b/0x170: proc_get_vma at fs/proc/task_mmu.c:136 (inlined 
by) m_next at fs/proc/task_mmu.c:187 > cat -n /usr/src/debug/kernel-6.11-8833-gde5cb0dcb74c/linux-6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64/fs/proc/task_mmu.c | sed -n '182,192 p' 182 { 183 if (*ppos == -2UL) { 184 *ppos = -1UL; 185 return NULL; 186 } 187 return proc_get_vma(m->private, ppos); 188 } 189 190 static void m_stop(struct seq_file *m, void *v) 191 { 192 struct proc_maps_private *priv = m->private; > git blame fs/proc/task_mmu.c -L 182,192 Blaming lines: 100% (11/11), done. a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 182) { c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 183) if (*ppos == -2UL) { c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 184) *ppos = -1UL; c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 185) return NULL; c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 186) } c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 187) return proc_get_vma(m->private, ppos); a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 188) } a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 189) a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 190) static void m_stop(struct seq_file *m, void *v) a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 191) { a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 192) struct proc_maps_private *priv = m->private; Hmm this line hasn't changed for two years. Machine spec: https://linux-hardware.org/?probe=323b76ce48 I attached below full kernel log and build config. Can anyone figure out what happened or should we wait for the second manifestation of this issue? -- Best Regards, Mike Gavrilov. [-- Attachment #2: 6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42-BUG-KASAN-slab-use-after-free-in-m_next.zip --] [-- Type: application/zip, Size: 90276 bytes --] [-- Attachment #3: .config.zip --] [-- Type: application/zip, Size: 67403 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
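For reference, the faddr2line output above resolves to proc_get_vma(), which is inlined into m_next(). A minimal sketch of that helper as it appears in the 6.12-rc sources is below; it is reconstructed from memory rather than copied from the exact Fedora snapshot, so treat it as approximate. The report's read of size 8 at offset 0 of a freed 176-byte vm_area_struct is consistent with the *ppos = vma->vm_start dereference on a VMA pointer handed back by the maple tree iterator after the VMA had already been freed:

    static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
                                               loff_t *ppos)
    {
            /* Next VMA from the maple tree iterator set up by m_start(). */
            struct vm_area_struct *vma = vma_next(&priv->iter);

            if (vma) {
                    /* fs/proc/task_mmu.c:136 -- the 8-byte read KASAN flags
                     * when vma points at an already-freed vm_area_struct. */
                    *ppos = vma->vm_start;
            } else {
                    *ppos = -2UL;
                    vma = get_gate_vma(priv->mm);
            }

            return vma;
    }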
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-09-24 22:28 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 Mikhail Gavrilov @ 2024-10-02 17:34 ` Mikhail Gavrilov 2024-10-02 17:55 ` Lorenzo Stoakes 0 siblings, 1 reply; 8+ messages in thread From: Mikhail Gavrilov @ 2024-10-02 17:34 UTC (permalink / raw) To: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, lorenzo.stoakes, Andrew Morton, Linux Memory Management List On Wed, Sep 25, 2024 at 3:28 AM Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote: > > Hi, > I am testing kernel snapshots on Fedora Rawhide and Today with build > on commit de5cb0dcb74c I saw for the first time "KASAN: > slab-use-after-free in m_next+0x13b". > Unfortunately it is not clear what triggered this problem because it > happened after 21 hour uptime. > > Full trace looks like: > input: Noble FoKus Mystique (AVRCP) as /devices/virtual/input/input26 > ================================================================== > BUG: KASAN: slab-use-after-free in m_next+0x13b/0x170 > Read of size 8 at addr ffff8885609b40f0 by task htop/3847 > > CPU: 14 UID: 1000 PID: 3847 Comm: htop Tainted: G W L > ------- --- 6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64+debug > #1 > Tainted: [W]=WARN, [L]=SOFTLOCKUP > Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, > BIOS 3040 09/12/2024 > Call Trace: > <TASK> > dump_stack_lvl+0x84/0xd0 > ? m_next+0x13b/0x170 > print_report+0x174/0x505 > ? m_next+0x13b/0x170 > ? __virt_addr_valid+0x231/0x420 > ? m_next+0x13b/0x170 > kasan_report+0xab/0x180 > ? m_next+0x13b/0x170 > m_next+0x13b/0x170 > seq_read_iter+0x8e5/0x1130 > seq_read+0x2b4/0x3c0 > ? __pfx_seq_read+0x10/0x10 > ? inode_security+0x54/0xf0 > ? rw_verify_area+0x3b2/0x5e0 > vfs_read+0x165/0xa20 > ? __pfx_vfs_read+0x10/0x10 > ? ktime_get_coarse_real_ts64+0x41/0xd0 > ? local_clock_noinstr+0xd/0x100 > ? __pfx_lock_release+0x10/0x10 > ksys_read+0xfb/0x1d0 > ? __pfx_ksys_read+0x10/0x10 > ? ktime_get_coarse_real_ts64+0x41/0xd0 > do_syscall_64+0x97/0x190 > ? __lock_acquire+0xdcd/0x62c0 > ? __pfx___lock_acquire+0x10/0x10 > ? __pfx___lock_acquire+0x10/0x10 > ? __pfx___lock_acquire+0x10/0x10 > ? audit_filter_inodes.part.0+0x12d/0x220 > ? local_clock_noinstr+0xd/0x100 > ? __pfx_lock_release+0x10/0x10 > ? rcu_is_watching+0x12/0xc0 > ? kfree+0x27c/0x4d0 > ? audit_reset_context+0x8c5/0xee0 > ? lockdep_hardirqs_on_prepare+0x171/0x400 > ? do_syscall_64+0xa3/0x190 > ? lockdep_hardirqs_on+0x7c/0x100 > ? do_syscall_64+0xa3/0x190 > ? 
do_syscall_64+0xa3/0x190 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > RIP: 0033:0x7f4190dcac36 > Code: 89 df e8 2d c1 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 15 > 83 e2 39 83 fa 08 75 0d e8 32 ff ff ff 66 90 48 8b 45 10 0f 05 <48> 8b > 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08 > RSP: 002b:00007ffcde82b690 EFLAGS: 00000202 ORIG_RAX: 0000000000000000 > RAX: ffffffffffffffda RBX: 00007f4190ce3740 RCX: 00007f4190dcac36 > RDX: 0000000000000400 RSI: 000055bf5e823a20 RDI: 0000000000000005 > RBP: 00007ffcde82b6a0 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000202 R12: 00007f4190f44fd0 > R13: 00007f4190f44e80 R14: 000055bf5e823e20 R15: 000055bf5ecc9160 > </TASK> > > Allocated by task 176289: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > __kasan_slab_alloc+0x6e/0x70 > kmem_cache_alloc_noprof+0x15a/0x3d0 > vm_area_dup+0x23/0x190 > __split_vma+0x137/0xd40 > vms_gather_munmap_vmas+0x29d/0xfc0 > mmap_region+0x35a/0x1f50 > do_mmap+0x8e7/0x1020 > vm_mmap_pgoff+0x178/0x2f0 > __do_fast_syscall_32+0x86/0x110 > do_fast_syscall_32+0x32/0x80 > sysret32_from_system_call+0x0/0x4a > > Freed by task 0: > kasan_save_stack+0x30/0x50 > kasan_save_track+0x14/0x30 > kasan_save_free_info+0x3b/0x70 > __kasan_slab_free+0x37/0x50 > kmem_cache_free+0x1a7/0x5a0 > rcu_do_batch+0x3fd/0x1120 > rcu_core+0x636/0x9b0 > handle_softirqs+0x1e9/0x8d0 > __irq_exit_rcu+0xbb/0x1c0 > irq_exit_rcu+0xe/0x30 > sysvec_apic_timer_interrupt+0xa1/0xd0 > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > Last potentially related work creation: > kasan_save_stack+0x30/0x50 > __kasan_record_aux_stack+0x8e/0xa0 > __call_rcu_common.constprop.0+0xf4/0x10d0 > vma_complete+0x720/0x10b0 > commit_merge+0x42a/0x1310 > vma_expand+0x313/0xad0 > vma_merge_new_range+0x2cd/0xec0 > mmap_region+0x432/0x1f50 > do_mmap+0x8e7/0x1020 > vm_mmap_pgoff+0x178/0x2f0 > __do_fast_syscall_32+0x86/0x110 > do_fast_syscall_32+0x32/0x80 > sysret32_from_system_call+0x0/0x4a > > The buggy address belongs to the object at ffff8885609b40f0 > which belongs to the cache vm_area_struct of size 176 > The buggy address is located 0 bytes inside of > freed 176-byte region [ffff8885609b40f0, ffff8885609b41a0) > > The buggy address belongs to the physical page: > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5609b4 > head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > memcg:ffff88814d36d001 > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 > raw: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 > head: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 > head: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 > head: 0017ffffc0000001 ffffea0015826d01 ffffffffffffffff 0000000000000000 > head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > ffff8885609b3f80: 00 00 00 00 00 00 00 00 00 00 00 00task_mmu 00 00 00 00 > ffff8885609b4000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8885609b4080: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fa fb > ^ > ffff8885609b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff8885609b4180: fb fb fb fb fc fc fc fc fc fc fc fc 00 00 00 00 > ================================================================== > Disabling lock debugging due to kernel taint > > 
> > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux m_next+0x13b > m_next+0x13b/0x170: > proc_get_vma at fs/proc/task_mmu.c:136 > (inlined by) m_next at fs/proc/task_mmu.c:187 > > > cat -n /usr/src/debug/kernel-6.11-8833-gde5cb0dcb74c/linux-6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64/fs/proc/task_mmu.c | sed -n '182,192 p' > 182 { > 183 if (*ppos == -2UL) { > 184 *ppos = -1UL; > 185 return NULL; > 186 } > 187 return proc_get_vma(m->private, ppos); > 188 } > 189 > 190 static void m_stop(struct seq_file *m, void *v) > 191 { > 192 struct proc_maps_private *priv = m->private; > > > git blame fs/proc/task_mmu.c -L 182,192 > Blaming lines: 100% (11/11), done. > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 182) { > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 183) > if (*ppos == -2UL) { > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 184) > *ppos = -1UL; > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 185) > return NULL; > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 186) } > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 187) > return proc_get_vma(m->private, ppos); > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 188) } > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 189) > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 190) > static void m_stop(struct seq_file *m, void *v) > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 191) { > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 192) > struct proc_maps_private *priv = m->private; > > Hmm this line hasn't changed for two years. > > Machine spec: https://linux-hardware.org/?probe=323b76ce48 > I attached below full kernel log and build config. > > Can anyone figure out what happened or should we wait for the second > manifestation of this issue? >

Finally I spotted that this issue is caused by the Steam client. It usually happens after downloading game updates. It looks like the Steam client runs some post-update scripts which trigger the slab-use-after-free in m_next.

Git bisect found the first bad commit:

commit f8d112a4e657c65c888e6b8a8435ef61a66e4ab8 (HEAD)
Author: Liam R. Howlett <Liam.Howlett@Oracle.com>
Date: Fri Aug 30 00:00:54 2024 -0400

mm/mmap: avoid zeroing vma tree in mmap_region()

Instead of zeroing the vma tree and then overwriting the area, let the area be overwritten and then clean up the gathered vmas using vms_complete_munmap_vmas().

To ensure locking is downgraded correctly, the mm is set regardless of MAP_FIXED or not (NULL vma).

If a driver is mapping over an existing vma, then clear the ptes before the call_mmap() invocation. This is done using the vms_clean_up_area() helper. If there is a close vm_ops, that must also be called to ensure any cleanup is done before mapping over the area. This also means that calling open has been added to the abort of an unmap operation, for now.

Since vm_ops->open() and vm_ops->close() are not always undo each other (state cleanup may exist in ->close() that is lost forever), the code cannot be left in this way, but that change has been isolated to another commit to make this point very obvious for traceability.

Temporarily keep track of the number of pages that will be removed and reduce the charged amount.

This also drops the validate_mm() call in the vma_expand() function.
It is necessary to drop the validate as it would fail since the mm map_count would be incorrect during a vma expansion, prior to the cleanup from vms_complete_munmap_vmas(). Clean up the error handing of the vms_gather_munmap_vmas() by calling the verification within the function. Link: https://lkml.kernel.org/r/20240830040101.822209-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Bert Karwatzki <spasswolf@web.de> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Jiri Olsa <olsajiri@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> mm/mmap.c | 57 +++++++++++++++++++++++++++------------------------------ mm/vma.c | 54 ++++++++++++++++++++++++++++++++++++++++++------------ mm/vma.h | 22 ++++++++++++++++------ 3 files changed, 85 insertions(+), 48 deletions(-) -- Best Regards, Mike Gavrilov. ^ permalink raw reply [flat|nested] 8+ messages in thread
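The allocation and free stacks in the report both come from a 32-bit mmap() (note the __do_fast_syscall_32 frames) replacing existing VMAs via mmap_region() and vms_gather_munmap_vmas(), while the task that trips KASAN is htop walking /proc/<pid>/maps. The sketch below only illustrates that combination of code paths; it is written as plain 64-bit C for simplicity (the Steam binaries in the traces are 32-bit) and is not a confirmed reproducer from this thread:

    #include <err.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            const size_t page = 4096;
            char *base = mmap(NULL, 16 * page, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (base == MAP_FAILED)
                    err(1, "mmap");

            printf("pid %d: read /proc/%d/maps in a loop from another shell\n",
                   (int)getpid(), (int)getpid());

            for (;;) {
                    /* MAP_FIXED over the middle of an existing mapping makes
                     * mmap_region() split and replace the old VMAs -- the path
                     * touched by the bisected commit... */
                    if (mmap(base + page, page, PROT_READ,
                             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
                            err(1, "mmap fixed (ro)");
                    /* ...and mapping it back read-write lets the new VMA merge
                     * with its neighbours again, so every iteration splits and
                     * merges VMAs while any /proc/<pid>/maps reader walks the
                     * same maple tree. */
                    if (mmap(base + page, page, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
                            err(1, "mmap fixed (rw)");
            }
    }

Reading the sketch's /proc/<pid>/maps in a loop from another shell (for example: while true; do cat /proc/<pid>/maps > /dev/null; done) exercises the same m_next()/proc_get_vma() path shown in the report, although, as the rest of the thread shows, the corruption itself has proven hard to trigger outside the Steam workload.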
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-02 17:34 ` Mikhail Gavrilov @ 2024-10-02 17:55 ` Lorenzo Stoakes 2024-10-02 20:32 ` Lorenzo Stoakes 0 siblings, 1 reply; 8+ messages in thread From: Lorenzo Stoakes @ 2024-10-02 17:55 UTC (permalink / raw) To: Mikhail Gavrilov Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List Thanks for your report! On Wed, Oct 02, 2024 at 10:34:32PM GMT, Mikhail Gavrilov wrote: > On Wed, Sep 25, 2024 at 3:28 AM Mikhail Gavrilov > <mikhail.v.gavrilov@gmail.com> wrote: > > > > Hi, > > I am testing kernel snapshots on Fedora Rawhide and Today with build > > on commit de5cb0dcb74c I saw for the first time "KASAN: > > slab-use-after-free in m_next+0x13b". > > Unfortunately it is not clear what triggered this problem because it > > happened after 21 hour uptime. > > > > Full trace looks like: > > input: Noble FoKus Mystique (AVRCP) as /devices/virtual/input/input26 > > ================================================================== > > BUG: KASAN: slab-use-after-free in m_next+0x13b/0x170 > > Read of size 8 at addr ffff8885609b40f0 by task htop/3847 > > > > CPU: 14 UID: 1000 PID: 3847 Comm: htop Tainted: G W L > > ------- --- 6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64+debug > > #1 > > Tainted: [W]=WARN, [L]=SOFTLOCKUP > > Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, > > BIOS 3040 09/12/2024 > > Call Trace: > > <TASK> > > dump_stack_lvl+0x84/0xd0 > > ? m_next+0x13b/0x170 > > print_report+0x174/0x505 > > ? m_next+0x13b/0x170 > > ? __virt_addr_valid+0x231/0x420 > > ? m_next+0x13b/0x170 > > kasan_report+0xab/0x180 > > ? m_next+0x13b/0x170 > > m_next+0x13b/0x170 > > seq_read_iter+0x8e5/0x1130 > > seq_read+0x2b4/0x3c0 > > ? __pfx_seq_read+0x10/0x10 > > ? inode_security+0x54/0xf0 > > ? rw_verify_area+0x3b2/0x5e0 > > vfs_read+0x165/0xa20 > > ? __pfx_vfs_read+0x10/0x10 > > ? ktime_get_coarse_real_ts64+0x41/0xd0 > > ? local_clock_noinstr+0xd/0x100 > > ? __pfx_lock_release+0x10/0x10 > > ksys_read+0xfb/0x1d0 > > ? __pfx_ksys_read+0x10/0x10 > > ? ktime_get_coarse_real_ts64+0x41/0xd0 > > do_syscall_64+0x97/0x190 > > ? __lock_acquire+0xdcd/0x62c0 > > ? __pfx___lock_acquire+0x10/0x10 > > ? __pfx___lock_acquire+0x10/0x10 > > ? __pfx___lock_acquire+0x10/0x10 > > ? audit_filter_inodes.part.0+0x12d/0x220 > > ? local_clock_noinstr+0xd/0x100 > > ? __pfx_lock_release+0x10/0x10 > > ? rcu_is_watching+0x12/0xc0 > > ? kfree+0x27c/0x4d0 > > ? audit_reset_context+0x8c5/0xee0 > > ? lockdep_hardirqs_on_prepare+0x171/0x400 > > ? do_syscall_64+0xa3/0x190 > > ? lockdep_hardirqs_on+0x7c/0x100 > > ? do_syscall_64+0xa3/0x190 > > ? 
do_syscall_64+0xa3/0x190 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > RIP: 0033:0x7f4190dcac36 > > Code: 89 df e8 2d c1 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 15 > > 83 e2 39 83 fa 08 75 0d e8 32 ff ff ff 66 90 48 8b 45 10 0f 05 <48> 8b > > 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08 > > RSP: 002b:00007ffcde82b690 EFLAGS: 00000202 ORIG_RAX: 0000000000000000 > > RAX: ffffffffffffffda RBX: 00007f4190ce3740 RCX: 00007f4190dcac36 > > RDX: 0000000000000400 RSI: 000055bf5e823a20 RDI: 0000000000000005 > > RBP: 00007ffcde82b6a0 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000202 R12: 00007f4190f44fd0 > > R13: 00007f4190f44e80 R14: 000055bf5e823e20 R15: 000055bf5ecc9160 > > </TASK> > > > > Allocated by task 176289: > > kasan_save_stack+0x30/0x50 > > kasan_save_track+0x14/0x30 > > __kasan_slab_alloc+0x6e/0x70 > > kmem_cache_alloc_noprof+0x15a/0x3d0 > > vm_area_dup+0x23/0x190 > > __split_vma+0x137/0xd40 > > vms_gather_munmap_vmas+0x29d/0xfc0 > > mmap_region+0x35a/0x1f50 > > do_mmap+0x8e7/0x1020 > > vm_mmap_pgoff+0x178/0x2f0 > > __do_fast_syscall_32+0x86/0x110 > > do_fast_syscall_32+0x32/0x80 > > sysret32_from_system_call+0x0/0x4a > > > > Freed by task 0: > > kasan_save_stack+0x30/0x50 > > kasan_save_track+0x14/0x30 > > kasan_save_free_info+0x3b/0x70 > > __kasan_slab_free+0x37/0x50 > > kmem_cache_free+0x1a7/0x5a0 > > rcu_do_batch+0x3fd/0x1120 > > rcu_core+0x636/0x9b0 > > handle_softirqs+0x1e9/0x8d0 > > __irq_exit_rcu+0xbb/0x1c0 > > irq_exit_rcu+0xe/0x30 > > sysvec_apic_timer_interrupt+0xa1/0xd0 > > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > > > Last potentially related work creation: > > kasan_save_stack+0x30/0x50 > > __kasan_record_aux_stack+0x8e/0xa0 > > __call_rcu_common.constprop.0+0xf4/0x10d0 > > vma_complete+0x720/0x10b0 > > commit_merge+0x42a/0x1310 > > vma_expand+0x313/0xad0 > > vma_merge_new_range+0x2cd/0xec0 > > mmap_region+0x432/0x1f50 > > do_mmap+0x8e7/0x1020 > > vm_mmap_pgoff+0x178/0x2f0 > > __do_fast_syscall_32+0x86/0x110 > > do_fast_syscall_32+0x32/0x80 > > sysret32_from_system_call+0x0/0x4a > > > > The buggy address belongs to the object at ffff8885609b40f0 > > which belongs to the cache vm_area_struct of size 176 > > The buggy address is located 0 bytes inside of > > freed 176-byte region [ffff8885609b40f0, ffff8885609b41a0) > > > > The buggy address belongs to the physical page: > > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5609b4 > > head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > > memcg:ffff88814d36d001 > > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > > page_type: f5(slab) > > raw: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 > > raw: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 > > head: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122 > > head: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001 > > head: 0017ffffc0000001 ffffea0015826d01 ffffffffffffffff 0000000000000000 > > head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: kasan: bad access detected > > > > Memory state around the buggy address: > > ffff8885609b3f80: 00 00 00 00 00 00 00 00 00 00 00 00task_mmu 00 00 00 00 > > ffff8885609b4000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > >ffff8885609b4080: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fa fb > > ^ > > ffff8885609b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff8885609b4180: fb 
fb fb fb fc fc fc fc fc fc fc fc 00 00 00 00 > > ================================================================== > > Disabling lock debugging due to kernel taint > > > > > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux m_next+0x13b > > m_next+0x13b/0x170: > > proc_get_vma at fs/proc/task_mmu.c:136 > > (inlined by) m_next at fs/proc/task_mmu.c:187 > > > > > cat -n /usr/src/debug/kernel-6.11-8833-gde5cb0dcb74c/linux-6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64/fs/proc/task_mmu.c | sed -n '182,192 p' > > 182 { > > 183 if (*ppos == -2UL) { > > 184 *ppos = -1UL; > > 185 return NULL; > > 186 } > > 187 return proc_get_vma(m->private, ppos); > > 188 } > > 189 > > 190 static void m_stop(struct seq_file *m, void *v) > > 191 { > > 192 struct proc_maps_private *priv = m->private; > > > > > git blame fs/proc/task_mmu.c -L 182,192 > > Blaming lines: 100% (11/11), done. > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 182) { > > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 183) > > if (*ppos == -2UL) { > > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 184) > > *ppos = -1UL; > > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 185) > > return NULL; > > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 186) } > > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 187) > > return proc_get_vma(m->private, ppos); > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 188) } > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 189) > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 190) > > static void m_stop(struct seq_file *m, void *v) > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 191) { > > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 192) > > struct proc_maps_private *priv = m->private; > > > > Hmm this line hasn't changed for two years. > > > > Machine spec: https://linux-hardware.org/?probe=323b76ce48 > > I attached below full kernel log and build config. > > > > Can anyone figure out what happened or should we wait for the second > > manifestation of this issue? > > > > Finally I spotted that this issue is caused by the Steam client. > And usually happens after downloading game updates. > Looks like Steam client runs some post update scripts which cause > slab-use-after-free in m_next. Yeah similar issue being investigated elsewhere, See https://lore.kernel.org/all/c63a64a9-cdee-4586-85ba-800e8e1a8054@lucifer.local/ for latest update. This is ongoing, but also steam, also this commit and also related to steam update doing something strange, so strange I literally can't repro locally :) but Bert in that thread can. We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and CONFIG_DEBUG_MAPLE_TREE set, if you set these you should see a report more quickly (let us know if you do). Also note that there is a critical error handling fix in https://lore.kernel.org/linux-mm/20241002073932.13482-1-lorenzo.stoakes@oracle.com/ Which should get hotfixed soon. > > Git bisect found the first bad commit: > commit f8d112a4e657c65c888e6b8a8435ef61a66e4ab8 (HEAD) > Author: Liam R. Howlett <Liam.Howlett@Oracle.com> > Date: Fri Aug 30 00:00:54 2024 -0400 > > mm/mmap: avoid zeroing vma tree in mmap_region() > > Instead of zeroing the vma tree and then overwriting the area, let the > area be overwritten and then clean up the gathered vmas using > vms_complete_munmap_vmas(). 
> > To ensure locking is downgraded correctly, the mm is set regardless of > MAP_FIXED or not (NULL vma). > > If a driver is mapping over an existing vma, then clear the ptes before > the call_mmap() invocation. This is done using the vms_clean_up_area() > helper. If there is a close vm_ops, that must also be called to ensure > any cleanup is done before mapping over the area. This also means that > calling open has been added to the abort of an unmap operation, for now. > > Since vm_ops->open() and vm_ops->close() are not always undo each other > (state cleanup may exist in ->close() that is lost forever), the code > cannot be left in this way, but that change has been isolated to another > commit to make this point very obvious for traceability. > > Temporarily keep track of the number of pages that will be removed and > reduce the charged amount. > > This also drops the validate_mm() call in the vma_expand() function. It > is necessary to drop the validate as it would fail since the mm map_count > would be incorrect during a vma expansion, prior to the cleanup from > vms_complete_munmap_vmas(). > > Clean up the error handing of the vms_gather_munmap_vmas() by calling the > verification within the function. > > Link: https://lkml.kernel.org/r/20240830040101.822209-15-Liam.Howlett@oracle.com > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Cc: Bert Karwatzki <spasswolf@web.de> > Cc: Jeff Xu <jeffxu@chromium.org> > Cc: Jiri Olsa <olsajiri@gmail.com> > Cc: Kees Cook <kees@kernel.org> > Cc: Lorenzo Stoakes <lstoakes@gmail.com> > Cc: Mark Brown <broonie@kernel.org> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: "Paul E. McKenney" <paulmck@kernel.org> > Cc: Paul Moore <paul@paul-moore.com> > Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> > Cc: Suren Baghdasaryan <surenb@google.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > mm/mmap.c | 57 +++++++++++++++++++++++++++------------------------------ > mm/vma.c | 54 ++++++++++++++++++++++++++++++++++++++++++------------ > mm/vma.h | 22 ++++++++++++++++------ > 3 files changed, 85 insertions(+), 48 deletions(-) > > -- > Best Regards, > Mike Gavrilov. ^ permalink raw reply [flat|nested] 8+ messages in thread
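For anyone who wants to follow Lorenzo's suggestion above to enable CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and CONFIG_DEBUG_MAPLE_TREE, the corresponding kernel .config fragment would be along these lines (option names as given in the thread; the exact set of dependencies may differ between trees):

    CONFIG_DEBUG_VM=y
    CONFIG_DEBUG_VM_MAPLE_TREE=y
    CONFIG_DEBUG_MAPLE_TREE=y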
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-02 17:55 ` Lorenzo Stoakes @ 2024-10-02 20:32 ` Lorenzo Stoakes 2024-10-02 20:45 ` Mikhail Gavrilov 0 siblings, 1 reply; 8+ messages in thread From: Lorenzo Stoakes @ 2024-10-02 20:32 UTC (permalink / raw) To: Mikhail Gavrilov Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List On Wed, Oct 02, 2024 at 06:55:59PM GMT, Lorenzo Stoakes wrote: > Thanks for your report! Out of curiosity, what GPU are you using? :) [snip] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-02 20:32 ` Lorenzo Stoakes @ 2024-10-02 20:45 ` Mikhail Gavrilov 2024-10-03 21:25 ` Mikhail Gavrilov 0 siblings, 1 reply; 8+ messages in thread From: Mikhail Gavrilov @ 2024-10-02 20:45 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List

On Wed, Oct 2, 2024 at 10:56 PM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and
> CONFIG_DEBUG_MAPLE_TREE set, if you set these you should see a report more
> quickly (let us know if you do).

mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM_MAPLE_TREE'
# CONFIG_DEBUG_VM_MAPLE_TREE is not set
mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM'
CONFIG_DEBUG_VM_IRQSOFF=y
CONFIG_DEBUG_VM=y
# CONFIG_DEBUG_VM_MAPLE_TREE is not set
# CONFIG_DEBUG_VM_RB is not set
CONFIG_DEBUG_VM_PGFLAGS=y
CONFIG_DEBUG_VM_PGTABLE=y
mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_MAPLE_TREE'
# CONFIG_DEBUG_MAPLE_TREE is not set

Fedora's kernel build uses only CONFIG_DEBUG_VM, and that is enough to reproduce this issue. Anyway, I have enabled all three options. I'll try to live for a day without launching Steam, and in a day I'll report whether it reproduces without Steam or not.

On Thu, Oct 3, 2024 at 1:32 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> Out of curiosity, what GPU are you using? :)

The issue reproduces on all my machines. One has an AMD Radeon 6900 XT and the other an AMD Radeon 7900 XTX.

--
Best Regards,
Mike Gavrilov.

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-02 20:45 ` Mikhail Gavrilov @ 2024-10-03 21:25 ` Mikhail Gavrilov 2024-10-03 21:52 ` Lorenzo Stoakes 0 siblings, 1 reply; 8+ messages in thread From: Mikhail Gavrilov @ 2024-10-03 21:25 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List [-- Attachment #1: Type: text/plain, Size: 9240 bytes --] On Thu, Oct 3, 2024 at 1:45 AM Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote: > > On Wed, Oct 2, 2024 at 10:56 PM Lorenzo Stoakes > <lorenzo.stoakes@oracle.com> wrote: > > We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and > > CONFIG_DEBUG_MAPLE_TREE set, if you set these you should see a report more > > quickly (let us know if you do). > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM_MAPLE_TREE' > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM' > CONFIG_DEBUG_VM_IRQSOFF=y > CONFIG_DEBUG_VM=y > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > # CONFIG_DEBUG_VM_RB is not set > CONFIG_DEBUG_VM_PGFLAGS=y > CONFIG_DEBUG_VM_PGTABLE=y > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_MAPLE_TREE' > # CONFIG_DEBUG_MAPLE_TREE is not set > > Fedora's kernel build uses only CONFIG_DEBUG_VM and it's enough for > reproducing this issue. > Anyway I enabled all three options. I'll try to live for a day without > steam launching. In a day I'll write whether it is reproducing without > steam or not. A day passed, and as expected, the problem did not occur until I launch Steam. But with suggested options the stacktrace looks different. Instead of "KASAN: slab-use-after-free in m_next+0x13b" I see this: [88841.586167] node00000000b4c54d84: data_end 9 != the last slot offset 8 [88841.586315] BUG at mas_validate_limits:7523 (1) [88841.586320] maple_tree(0000000067811125) flags 30F, height 3 root 0000000040e0c786 [88841.586324] 0-ffffffffffffffff: node 000000009b462d47 depth 0 type 3 parent 00000000db18456d contents: 10000 11400000 1e000 1f000 1f000 75e15000 0 0 0 ffffffff00283000 | 09 09| 000000005518cec0 67FFFFFF 0000000085840a0a 79970FFF 00000000975349aa 79F50FFF 00000000afe6ddd8 7B140FFF 0000000083c903b1 7BB96FFF 00000000335e109c F605AFFF 000000007e7333d1 F6570FFF 00000000d8e9900e F6C92FFF 00000000250ada8a F76E1FFF 00000000e567baed [88841.586357] 0-67ffffff: node 000000005c64e204 depth 1 type 3 parent 0000000069e1180e contents: 10000 0 0 0 0 0 0 0 0 0 | 05 00| 000000000cfac463 16FFFF 00000000f0522fec 400FFF 00000000cd8938b8 94FFFF 00000000d2bcb2e3 E9FFFF 00000000ed8d307e 173FFFF 0000000056285bf1 67FFFFFF 0000000000000000 0 0000000000000000 0 0000000000000000 0 0000000000000000 [88841.586388] 0-16ffff: node 0000000037648f62 depth 2 type 1 parent 00000000978387fd contents: 0000000000000000 FFFF 000000000bc2e123 10FFFF 0000000049345b43 11FFFF 000000008940e7cb 126FFF 000000007c2365c0 12FFFF 00000000cfc1c890 142FFF 00000000b64ae6ea 14FFFF 00000000f8f8f6c9 165FFF 000000008460c3ec 16FFFF 0000000000000000 0 0000000000000000 0 0000000000000000 0 0000000000000000 0 0000000000000000 0 0000000000000000 0 000000009d394510 [88841.586413] 0-ffff: 0000000000000000 [88841.586417] 10000-10ffff: 000000000bc2e123 [88841.586420] 110000-11ffff: 0000000049345b43 [88841.586424] 120000-126fff: 000000008940e7cb [88841.586428] 127000-12ffff: 000000007c2365c0 [88841.586431] 130000-142fff: 00000000cfc1c890 
[88841.586435] 143000-14ffff: 00000000b64ae6ea [88841.586438] 150000-165fff: 00000000f8f8f6c9 [88841.586442] 166000-16ffff: 000000008460c3ec [88841.586445] 170000-400fff: node 0000000030a5de34 depth 2 type 1 parent 00000000161b9281 contents: 0000000090f8ff7b 171FFF 00000000a90cdf09 17FFFF 00000000ad657f59 190FFF 0000000026397ca7 19FFFF 000000003413c0f4 1B0FFF 000000000ca6dd7d 1BFFFF 00000000cf83b99b 1CEFFF 0000000096a06890 1CFFFF 00000000ed96cdbd 1E5FFF 00000000e6e9d2cb 1EFFFF 00000000bc54b9f4 1FFFFF 000000006e42b324 3DFFFF 00000000afd4728b 3FFFFF 0000000082572c0c 400FFF 0000000000000000 0 00000000e89e29fc [88841.586471] 170000-171fff: 0000000090f8ff7b [88841.586474] 172000-17ffff: 00000000a90cdf09 [88841.586478] 180000-190fff: 00000000ad657f59 [88841.586481] 191000-19ffff: 0000000026397ca7 [88841.586485] 1a0000-1b0fff: 000000003413c0f4 [88841.586511] 1b1000-1bffff: 000000000ca6dd7d [88841.586515] 1c0000-1cefff: 00000000cf83b99b [88841.586519] 1cf000-1cffff: 0000000096a06890 [88841.586522] 1d0000-1e5fff: 00000000ed96cdbd [88841.586526] 1e6000-1effff: 00000000e6e9d2cb [88841.586529] 1f0000-1fffff: 00000000bc54b9f4 [88841.586533] 200000-3dffff: 000000006e42b324 [88841.586537] 3e0000-3fffff: 00000000afd4728b [88841.586540] 400000-400fff: 0000000082572c0c [88841.586544] 401000-94ffff: node 00000000f4ffb374 depth 2 type 1 parent 000000005fb58d4e contents: 000000004eafabe6 403FFF 00000000104e2e73 404FFF 000000004dbe1ca9 406FFF 00000000ffb92c1b 407FFF 00000000cffd3517 409FFF 000000009ef45250 40FFFF 00000000373dd145 410FFF 00000000eaff67b3 50FFFF 000000002e632fe1 511FFF 000000001839285f 60FFFF 0000000043d54299 611FFF 00000000da2961ba 80FFFF 00000000155e68ba 8C9FFF 0000000010bfe63e 8CFFFF 00000000a4834cd3 94FFFF 000000000e628eae [88841.586569] 401000-403fff: 000000004eafabe6 [88841.586572] 404000-404fff: 00000000104e2e73 [88841.586576] 405000-406fff: 000000004dbe1ca9 [88841.586579] 407000-407fff: 00000000ffb92c1b [88841.586583] 408000-409fff: 00000000cffd3517 [88841.586586] 40a000-40ffff: 000000009ef45250 [88841.586590] 410000-410fff: 00000000373dd145 [88841.586594] 411000-50ffff: 00000000eaff67b3 [88841.586597] 510000-511fff: 000000002e632fe1 [88841.586601] 512000-60ffff: 000000001839285f [88841.586604] 610000-611fff: 0000000043d54299 [88841.586608] 612000-80ffff: 00000000da2961ba [88841.586611] 810000-8c9fff: 00000000155e68ba [88841.586615] 8ca000-8cffff: 0000000010bfe63e [88841.586618] 8d0000-94ffff: 00000000a4834cd3 *** [88841.592355] Pass: 3886705433 Run:3886705434 [88841.592359] CPU: 22 UID: 1000 PID: 273842 Comm: rundll32.exe Tainted: G W L 6.11.0-rc6-13b-f8d112a4e657c65c888e6b8a8435ef61a66e4ab8+ #720 [88841.592364] Tainted: [W]=WARN, [L]=SOFTLOCKUP [88841.592366] Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 3040 09/12/2024 [88841.592369] Call Trace: [88841.592372] <TASK> [88841.592376] dump_stack_lvl+0x84/0xd0 [88841.592384] mt_validate+0x2932/0x2980 [88841.592397] ? __pfx_mt_validate+0x10/0x10 [88841.592408] validate_mm+0xa5/0x310 [88841.592414] ? __pfx_validate_mm+0x10/0x10 [88841.592427] vms_complete_munmap_vmas+0x572/0x9b0 [88841.592431] ? __pfx_mas_prev+0x10/0x10 [88841.592438] mmap_region+0x10f9/0x24a0 [88841.592447] ? __pfx_mmap_region+0x10/0x10 [88841.592450] ? __pfx_mark_lock+0x10/0x10 [88841.592459] ? mark_lock+0xf5/0x16d0 [88841.592474] ? mm_get_unmapped_area_vmflags+0x48/0xc0 [88841.592482] ? security_mmap_addr+0x57/0x90 [88841.592487] ? __get_unmapped_area+0x191/0x2c0 [88841.592492] do_mmap+0x8cf/0xff0 [88841.592500] ? 
__pfx_do_mmap+0x10/0x10 [88841.592503] ? down_write_killable+0x19d/0x280 [88841.592506] ? __pfx_down_write_killable+0x10/0x10 [88841.592513] vm_mmap_pgoff+0x178/0x2f0 [88841.592521] ? __pfx_vm_mmap_pgoff+0x10/0x10 [88841.592524] ? lockdep_hardirqs_on+0x7c/0x100 [88841.592528] ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0 [88841.592537] __do_fast_syscall_32+0x86/0x110 [88841.592540] ? kfree+0x257/0x3a0 [88841.592547] ? audit_reset_context+0x8c5/0xee0 [88841.592555] ? lockdep_hardirqs_on_prepare+0x171/0x400 [88841.592558] ? __do_fast_syscall_32+0x92/0x110 [88841.592561] ? lockdep_hardirqs_on+0x7c/0x100 [88841.592564] ? __do_fast_syscall_32+0x92/0x110 [88841.592571] ? lockdep_hardirqs_on_prepare+0x171/0x400 [88841.592574] ? __do_fast_syscall_32+0x92/0x110 [88841.592577] ? lockdep_hardirqs_on+0x7c/0x100 [88841.592580] ? __do_fast_syscall_32+0x92/0x110 [88841.592583] ? audit_reset_context+0x8c5/0xee0 [88841.592590] ? lockdep_hardirqs_on_prepare+0x171/0x400 [88841.592593] ? __do_fast_syscall_32+0x92/0x110 [88841.592596] ? lockdep_hardirqs_on+0x7c/0x100 [88841.592600] ? rcu_is_watching+0x12/0xc0 [88841.592603] ? trace_irq_disable.constprop.0+0xce/0x110 [88841.592609] do_fast_syscall_32+0x32/0x80 [88841.592612] entry_SYSCALL_compat_after_hwframe+0x75/0x75 [88841.592616] RIP: 0023:0xf7f3e5a9 [88841.592632] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 cd 0f 05 cd 80 <5d> 5a 59 c3 cc 90 90 90 2e 8d b4 26 00 00 00 00 8d b4 26 00 00 00 [88841.592635] RSP: 002b:000000000050f450 EFLAGS: 00000256 ORIG_RAX: 00000000000000c0 [88841.592639] RAX: ffffffffffffffda RBX: 0000000001b90000 RCX: 000000000001f000 [88841.592641] RDX: 0000000000000000 RSI: 0000000000004032 RDI: 00000000ffffffff [88841.592644] RBP: 0000000000000000 R08: 000000000050f450 R09: 0000000000000000 [88841.592646] R10: 0000000000000000 R11: 0000000000000256 R12: 0000000000000000 [88841.592648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [88841.592658] </TASK> [88841.592668] 00000000b4c54d84[9] should not have entry 00000000f0273bd5 Full kernel log attached here below as archive. -- Best Regards, Mike Gavrilov. [-- Attachment #2: dmesg-6.11.0-rc6-13b-f8d112a4e657c65c888e6b8a8435ef61a66e4ab8.zip --] [-- Type: application/zip, Size: 169251 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-03 21:25 ` Mikhail Gavrilov @ 2024-10-03 21:52 ` Lorenzo Stoakes 2024-10-05 6:45 ` Lorenzo Stoakes 0 siblings, 1 reply; 8+ messages in thread From: Lorenzo Stoakes @ 2024-10-03 21:52 UTC (permalink / raw) To: Mikhail Gavrilov Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List On Fri, Oct 04, 2024 at 02:25:07AM +0500, Mikhail Gavrilov wrote: > On Thu, Oct 3, 2024 at 1:45 AM Mikhail Gavrilov > <mikhail.v.gavrilov@gmail.com> wrote: > > > > On Wed, Oct 2, 2024 at 10:56 PM Lorenzo Stoakes > > <lorenzo.stoakes@oracle.com> wrote: > > > We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and > > > CONFIG_DEBUG_MAPLE_TREE set, if you set these you should see a report more > > > quickly (let us know if you do). > > > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM_MAPLE_TREE' > > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM' > > CONFIG_DEBUG_VM_IRQSOFF=y > > CONFIG_DEBUG_VM=y > > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > > # CONFIG_DEBUG_VM_RB is not set > > CONFIG_DEBUG_VM_PGFLAGS=y > > CONFIG_DEBUG_VM_PGTABLE=y > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_MAPLE_TREE' > > # CONFIG_DEBUG_MAPLE_TREE is not set > > > > Fedora's kernel build uses only CONFIG_DEBUG_VM and it's enough for > > reproducing this issue. > > Anyway I enabled all three options. I'll try to live for a day without > > steam launching. In a day I'll write whether it is reproducing without > > steam or not. > > A day passed, and as expected, the problem did not occur until I launch Steam. > But with suggested options the stacktrace looks different. > Instead of "KASAN: slab-use-after-free in m_next+0x13b" I see this: > > [88841.586167] node00000000b4c54d84: data_end 9 != the last slot offset 8

Thanks, looking into the attached dmesg, this looks to be identical to the issue that Bert reported in the other thread.

The nature of it is that once the corruption happens, 'weird stuff' will happen after this; luckily this debug mode lets us pick up on the original corruption.

Bert is somehow, luckily, able to reproduce very repeatably, so we have been able to get a lot more information, but it's taking time to truly narrow it down.

Am working flat out to try to resolve the issue; we have before/after maple trees and it seems like a certain operation is resulting in a corrupted maple tree (duplicate 0x67ffffff entry).

It is proving very, very stubborn to reproduce locally, even in a controlled environment where the maple tree is manually set up, but am continuing my efforts to try to do so as best I can! :)

Will respond here once we have a viable fix.

Thanks again for taking the time to report and to grab the debug maple tree, it's very useful!
Cheers, Lorenzo > [88841.586315] BUG at mas_validate_limits:7523 (1) > [88841.586320] maple_tree(0000000067811125) flags 30F, height 3 root > 0000000040e0c786 > [88841.586324] 0-ffffffffffffffff: node 000000009b462d47 depth 0 type > 3 parent 00000000db18456d contents: 10000 11400000 1e000 1f000 1f000 > 75e15000 0 0 0 ffffffff00283000 | 09 09| 000000005518cec0 67FFFFFF > 0000000085840a0a 79970FFF 00000000975349aa 79F50FFF 00000000afe6ddd8 > 7B140FFF 0000000083c903b1 7BB96FFF 00000000335e109c F605AFFF > 000000007e7333d1 F6570FFF 00000000d8e9900e F6C92FFF 00000000250ada8a > F76E1FFF 00000000e567baed > [88841.586357] 0-67ffffff: node 000000005c64e204 depth 1 type 3 > parent 0000000069e1180e contents: 10000 0 0 0 0 0 0 0 0 0 | 05 00| > 000000000cfac463 16FFFF 00000000f0522fec 400FFF 00000000cd8938b8 > 94FFFF 00000000d2bcb2e3 E9FFFF 00000000ed8d307e 173FFFF > 0000000056285bf1 67FFFFFF 0000000000000000 0 0000000000000000 0 > 0000000000000000 0 0000000000000000 > [88841.586388] 0-16ffff: node 0000000037648f62 depth 2 type 1 > parent 00000000978387fd contents: 0000000000000000 FFFF > 000000000bc2e123 10FFFF 0000000049345b43 11FFFF 000000008940e7cb > 126FFF 000000007c2365c0 12FFFF 00000000cfc1c890 142FFF > 00000000b64ae6ea 14FFFF 00000000f8f8f6c9 165FFF 000000008460c3ec > 16FFFF 0000000000000000 0 0000000000000000 0 0000000000000000 0 > 0000000000000000 0 0000000000000000 0 0000000000000000 0 > 000000009d394510 > [88841.586413] 0-ffff: 0000000000000000 > [88841.586417] 10000-10ffff: 000000000bc2e123 > [88841.586420] 110000-11ffff: 0000000049345b43 > [88841.586424] 120000-126fff: 000000008940e7cb > [88841.586428] 127000-12ffff: 000000007c2365c0 > [88841.586431] 130000-142fff: 00000000cfc1c890 > [88841.586435] 143000-14ffff: 00000000b64ae6ea > [88841.586438] 150000-165fff: 00000000f8f8f6c9 > [88841.586442] 166000-16ffff: 000000008460c3ec > [88841.586445] 170000-400fff: node 0000000030a5de34 depth 2 type 1 > parent 00000000161b9281 contents: 0000000090f8ff7b 171FFF > 00000000a90cdf09 17FFFF 00000000ad657f59 190FFF 0000000026397ca7 > 19FFFF 000000003413c0f4 1B0FFF 000000000ca6dd7d 1BFFFF > 00000000cf83b99b 1CEFFF 0000000096a06890 1CFFFF 00000000ed96cdbd > 1E5FFF 00000000e6e9d2cb 1EFFFF 00000000bc54b9f4 1FFFFF > 000000006e42b324 3DFFFF 00000000afd4728b 3FFFFF 0000000082572c0c > 400FFF 0000000000000000 0 00000000e89e29fc > [88841.586471] 170000-171fff: 0000000090f8ff7b > [88841.586474] 172000-17ffff: 00000000a90cdf09 > [88841.586478] 180000-190fff: 00000000ad657f59 > [88841.586481] 191000-19ffff: 0000000026397ca7 > [88841.586485] 1a0000-1b0fff: 000000003413c0f4 > [88841.586511] 1b1000-1bffff: 000000000ca6dd7d > [88841.586515] 1c0000-1cefff: 00000000cf83b99b > [88841.586519] 1cf000-1cffff: 0000000096a06890 > [88841.586522] 1d0000-1e5fff: 00000000ed96cdbd > [88841.586526] 1e6000-1effff: 00000000e6e9d2cb > [88841.586529] 1f0000-1fffff: 00000000bc54b9f4 > [88841.586533] 200000-3dffff: 000000006e42b324 > [88841.586537] 3e0000-3fffff: 00000000afd4728b > [88841.586540] 400000-400fff: 0000000082572c0c > [88841.586544] 401000-94ffff: node 00000000f4ffb374 depth 2 type 1 > parent 000000005fb58d4e contents: 000000004eafabe6 403FFF > 00000000104e2e73 404FFF 000000004dbe1ca9 406FFF 00000000ffb92c1b > 407FFF 00000000cffd3517 409FFF 000000009ef45250 40FFFF > 00000000373dd145 410FFF 00000000eaff67b3 50FFFF 000000002e632fe1 > 511FFF 000000001839285f 60FFFF 0000000043d54299 611FFF > 00000000da2961ba 80FFFF 00000000155e68ba 8C9FFF 0000000010bfe63e > 8CFFFF 00000000a4834cd3 94FFFF 000000000e628eae > [88841.586569] 
401000-403fff: 000000004eafabe6 > [88841.586572] 404000-404fff: 00000000104e2e73 > [88841.586576] 405000-406fff: 000000004dbe1ca9 > [88841.586579] 407000-407fff: 00000000ffb92c1b > [88841.586583] 408000-409fff: 00000000cffd3517 > [88841.586586] 40a000-40ffff: 000000009ef45250 > [88841.586590] 410000-410fff: 00000000373dd145 > [88841.586594] 411000-50ffff: 00000000eaff67b3 > [88841.586597] 510000-511fff: 000000002e632fe1 > [88841.586601] 512000-60ffff: 000000001839285f > [88841.586604] 610000-611fff: 0000000043d54299 > [88841.586608] 612000-80ffff: 00000000da2961ba > [88841.586611] 810000-8c9fff: 00000000155e68ba > [88841.586615] 8ca000-8cffff: 0000000010bfe63e > [88841.586618] 8d0000-94ffff: 00000000a4834cd3 > *** > [88841.592355] Pass: 3886705433 Run:3886705434 > [88841.592359] CPU: 22 UID: 1000 PID: 273842 Comm: rundll32.exe > Tainted: G W L > 6.11.0-rc6-13b-f8d112a4e657c65c888e6b8a8435ef61a66e4ab8+ #720 > [88841.592364] Tainted: [W]=WARN, [L]=SOFTLOCKUP > [88841.592366] Hardware name: ASUS System Product Name/ROG STRIX > B650E-I GAMING WIFI, BIOS 3040 09/12/2024 > [88841.592369] Call Trace: > [88841.592372] <TASK> > [88841.592376] dump_stack_lvl+0x84/0xd0 > [88841.592384] mt_validate+0x2932/0x2980 > [88841.592397] ? __pfx_mt_validate+0x10/0x10 > [88841.592408] validate_mm+0xa5/0x310 > [88841.592414] ? __pfx_validate_mm+0x10/0x10 > [88841.592427] vms_complete_munmap_vmas+0x572/0x9b0 > [88841.592431] ? __pfx_mas_prev+0x10/0x10 > [88841.592438] mmap_region+0x10f9/0x24a0 > [88841.592447] ? __pfx_mmap_region+0x10/0x10 > [88841.592450] ? __pfx_mark_lock+0x10/0x10 > [88841.592459] ? mark_lock+0xf5/0x16d0 > [88841.592474] ? mm_get_unmapped_area_vmflags+0x48/0xc0 > [88841.592482] ? security_mmap_addr+0x57/0x90 > [88841.592487] ? __get_unmapped_area+0x191/0x2c0 > [88841.592492] do_mmap+0x8cf/0xff0 > [88841.592500] ? __pfx_do_mmap+0x10/0x10 > [88841.592503] ? down_write_killable+0x19d/0x280 > [88841.592506] ? __pfx_down_write_killable+0x10/0x10 > [88841.592513] vm_mmap_pgoff+0x178/0x2f0 > [88841.592521] ? __pfx_vm_mmap_pgoff+0x10/0x10 > [88841.592524] ? lockdep_hardirqs_on+0x7c/0x100 > [88841.592528] ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0 > [88841.592537] __do_fast_syscall_32+0x86/0x110 > [88841.592540] ? kfree+0x257/0x3a0 > [88841.592547] ? audit_reset_context+0x8c5/0xee0 > [88841.592555] ? lockdep_hardirqs_on_prepare+0x171/0x400 > [88841.592558] ? __do_fast_syscall_32+0x92/0x110 > [88841.592561] ? lockdep_hardirqs_on+0x7c/0x100 > [88841.592564] ? __do_fast_syscall_32+0x92/0x110 > [88841.592571] ? lockdep_hardirqs_on_prepare+0x171/0x400 > [88841.592574] ? __do_fast_syscall_32+0x92/0x110 > [88841.592577] ? lockdep_hardirqs_on+0x7c/0x100 > [88841.592580] ? __do_fast_syscall_32+0x92/0x110 > [88841.592583] ? audit_reset_context+0x8c5/0xee0 > [88841.592590] ? lockdep_hardirqs_on_prepare+0x171/0x400 > [88841.592593] ? __do_fast_syscall_32+0x92/0x110 > [88841.592596] ? lockdep_hardirqs_on+0x7c/0x100 > [88841.592600] ? rcu_is_watching+0x12/0xc0 > [88841.592603] ? 
trace_irq_disable.constprop.0+0xce/0x110 > [88841.592609] do_fast_syscall_32+0x32/0x80 > [88841.592612] entry_SYSCALL_compat_after_hwframe+0x75/0x75 > [88841.592616] RIP: 0023:0xf7f3e5a9 > [88841.592632] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 > 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 cd 0f > 05 cd 80 <5d> 5a 59 c3 cc 90 90 90 2e 8d b4 26 00 00 00 00 8d b4 26 00 > 00 00 > [88841.592635] RSP: 002b:000000000050f450 EFLAGS: 00000256 ORIG_RAX: > 00000000000000c0 > [88841.592639] RAX: ffffffffffffffda RBX: 0000000001b90000 RCX: 000000000001f000 > [88841.592641] RDX: 0000000000000000 RSI: 0000000000004032 RDI: 00000000ffffffff > [88841.592644] RBP: 0000000000000000 R08: 000000000050f450 R09: 0000000000000000 > [88841.592646] R10: 0000000000000000 R11: 0000000000000256 R12: 0000000000000000 > [88841.592648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [88841.592658] </TASK> > [88841.592668] 00000000b4c54d84[9] should not have entry 00000000f0273bd5 > > Full kernel log attached here below as archive. > > -- > Best Regards, > Mike Gavrilov. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187 2024-10-03 21:52 ` Lorenzo Stoakes @ 2024-10-05 6:45 ` Lorenzo Stoakes 0 siblings, 0 replies; 8+ messages in thread From: Lorenzo Stoakes @ 2024-10-05 6:45 UTC (permalink / raw) To: Mikhail Gavrilov Cc: Linux List Kernel Mailing, Linux regressions mailing list, linux-fsdevel, Liam.Howlett, Andrew Morton, Linux Memory Management List On Thu, Oct 03, 2024 at 10:52:03PM +0100, Lorenzo Stoakes wrote: > On Fri, Oct 04, 2024 at 02:25:07AM +0500, Mikhail Gavrilov wrote: > > On Thu, Oct 3, 2024 at 1:45 AM Mikhail Gavrilov > > <mikhail.v.gavrilov@gmail.com> wrote: > > > > > > On Wed, Oct 2, 2024 at 10:56 PM Lorenzo Stoakes > > > <lorenzo.stoakes@oracle.com> wrote: > > > > We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and > > > > CONFIG_DEBUG_MAPLE_TREE set, if you set these you should see a report more > > > > quickly (let us know if you do). > > > > > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM_MAPLE_TREE' > > > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_VM' > > > CONFIG_DEBUG_VM_IRQSOFF=y > > > CONFIG_DEBUG_VM=y > > > # CONFIG_DEBUG_VM_MAPLE_TREE is not set > > > # CONFIG_DEBUG_VM_RB is not set > > > CONFIG_DEBUG_VM_PGFLAGS=y > > > CONFIG_DEBUG_VM_PGTABLE=y > > > mikhail@primary-ws ~/dmesg> cat .config | grep 'CONFIG_DEBUG_MAPLE_TREE' > > > # CONFIG_DEBUG_MAPLE_TREE is not set > > > > > > Fedora's kernel build uses only CONFIG_DEBUG_VM and it's enough for > > > reproducing this issue. > > > Anyway I enabled all three options. I'll try to live for a day without > > > steam launching. In a day I'll write whether it is reproducing without > > > steam or not. > > > > A day passed, and as expected, the problem did not occur until I launch Steam. > > But with suggested options the stacktrace looks different. > > Instead of "KASAN: slab-use-after-free in m_next+0x13b" I see this: > > > > [88841.586167] node00000000b4c54d84: data_end 9 != the last slot offset 8 > > Thanks, looking into the attached dmesg this looks to be identical to the > issue that Bert reported in the other thread. > > The nature of it is that once the corruption happens 'weird stuff' will > happen after this, luckily this debug mode lets us pick up on the original > corruption. > > Bert is somehow luckily is able to reproduce very repeatably, so we have > been able to get a lot more information, but it's taking time to truly > narrow it down. > > Am working flat out to try to resolve the issue, we have before/after maple > trees and it seems like a certain operation is resulting in a corrupted > maple tree (duplicate 0x67ffffff entry). > > It is proving very very stubborn to be able to reproduce locally even in a > controlled environment where the maple tree is manually set up, but am > continuing my efforts to try to do so as best I can! :) > > Will respond here once we have a viable fix. I cc'd (and tagged) you over there, but I have a fix for this problem, do give it a try! [0] [0]: https://lore.kernel.org/linux-mm/20241005064114.42770-1-lorenzo.stoakes@oracle.com/ [snip] Cheers, Lorenzo ^ permalink raw reply [flat|nested] 8+ messages in thread