* [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file @ 2023-03-30 15:53 Ivan Orlov 2023-03-31 1:33 ` Zach O'Keefe 0 siblings, 1 reply; 11+ messages in thread From: Ivan Orlov @ 2023-03-30 15:53 UTC (permalink / raw) To: akpm Cc: Ivan Orlov, linux-mm, linux-kernel, himadrispandya, skhan, linux-kernel-mentees, shy828301, zokeefe, syzbot+9578faa5475acb35fa50 Syzkaller reported the following issue: kernel BUG at mm/khugepaged.c:1823! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted 6.2.0-syzkaller-13154-g857f1268a591 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline] RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233 Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89 RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093 RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80 RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1 RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3 R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000 R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000 FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693 madvise_vma_behavior mm/madvise.c:1086 [inline] madvise_walk_vmas mm/madvise.c:1260 [inline] do_madvise+0x9e5/0x4680 mm/madvise.c:1439 __do_sys_madvise mm/madvise.c:1452 [inline] __se_sys_madvise mm/madvise.c:1450 [inline] 
__x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The 'xas_store' call during page cache scanning can put 'xas' into an error state (with the reproducer provided by syzkaller, the error code is -ENOMEM). However, there is no error check after the 'xas_store', and the next call to 'xas_next' at the start of the scanning cycle doesn't advance xa_index while 'xas' is in the error state, so the BUG above is hit. This patch adds xarray state error checking after the 'xas_store' and a corresponding result error code. It also adds xarray state error checking via the WARN_ON_ONCE macro, to make sure that ENOMEM or other possible errors don't occur at places where they shouldn't. Tested via syzbot. Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> --- V1 -> V2: Add WARN_ON_ONCE error checking and comments mm/khugepaged.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 92e6f56a932d..8b6580b13339 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -55,6 +55,7 @@ enum scan_result { SCAN_CGROUP_CHARGE_FAIL, SCAN_TRUNCATED, SCAN_PAGE_HAS_PRIVATE, + SCAN_STORE_FAILED, }; #define CREATE_TRACE_POINTS @@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, goto xa_locked; } xas_store(&xas, hpage); + if (xas_error(&xas)) { + /* revert shmem_charge performed + * in the previous condition + */ + mapping->nrpages--; + shmem_uncharge(mapping->host, 1); + result = SCAN_STORE_FAILED; + goto xa_locked; + } nr_none++; continue; } @@ -1992,6 +2002,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, /* Finally, replace with the new page.
*/ xas_store(&xas, hpage); + /* We can't get an ENOMEM here (because the allocation happened before) + * but let's check for errors (XArray implementation can be + * changed in the future) + */ + WARN_ON_ONCE(xas_error(&xas)); continue; out_unlock: unlock_page(page); @@ -2029,6 +2044,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, /* Join all the small entries into a single multi-index entry */ xas_set_order(&xas, start, HPAGE_PMD_ORDER); xas_store(&xas, hpage); + /* Here we can't get an ENOMEM (because entries were + * previously allocated) But let's check for errors + * (XArray implementation can be changed in the future) + */ + WARN_ON_ONCE(xas_error(&xas)); xa_locked: xas_unlock_irq(&xas); xa_unlocked: -- 2.34.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file 2023-03-30 15:53 [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file Ivan Orlov @ 2023-03-31 1:33 ` Zach O'Keefe 2023-03-31 6:47 ` Ivan Orlov 2023-04-04 0:58 ` Yang Shi 0 siblings, 2 replies; 11+ messages in thread From: Zach O'Keefe @ 2023-03-31 1:33 UTC (permalink / raw) To: Ivan Orlov Cc: akpm, linux-mm, linux-kernel, himadrispandya, skhan, linux-kernel-mentees, shy828301, syzbot+9578faa5475acb35fa50 On Mar 30 19:53, Ivan Orlov wrote: > Syzkaller reported the following issue: > > kernel BUG at mm/khugepaged.c:1823! > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted 6.2.0-syzkaller-13154-g857f1268a591 #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 > RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline] > RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233 > Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89 > RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093 > RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80 > RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1 > RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3 > R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000 > R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000 > FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693 > madvise_vma_behavior 
mm/madvise.c:1086 [inline] > madvise_walk_vmas mm/madvise.c:1260 [inline] > do_madvise+0x9e5/0x4680 mm/madvise.c:1439 > __do_sys_madvise mm/madvise.c:1452 [inline] > __se_sys_madvise mm/madvise.c:1450 [inline] > __x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > Thanks, Ivan. In the process of reviewing this, I started thinking about whether the !shmem case was also susceptible to a similar race, and I *think* it might be. Unfortunately, my time has run out, and I haven't been able to validate; I'm less familiar with the file-side of things. The underlying problem is a race with truncation/hole-punch under OOM conditions. The nice do-while loop near the top of collapse_file() attempts to avoid this scenario by making sure enough slots are available. However, when we drop the xarray lock, we open ourselves up to concurrent removal + slot deletion. Those slots then need to be allocated again -- which under OOM conditions can fail. The syzbot reproducer picks on shmem, but I think this can occur for file as well. If we find a hole, we unlock the xarray and call page_cache_sync_readahead(), which, if it succeeds, IIUC, will have allocated a new slot in our mapping pointing to the new page. We *then* lock the page. Only after the page is locked are we protected from concurrent removal (Note: this is what provides us protection in many of the xas_store() cases; we've held the slot's contained page-lock since verifying the slot exists, protecting us from removal / reallocation races). Maybe I'm just low on caffeine at the end of the day, and am missing something, but if I had more time, I'd be looking into the file-side some more to verify. Apologies that this hasn't occurred to me until now; I was looking at one of your comments and double-checked why I *thought* we were safe.
Anyways, irrespective of that looming issue, some more notes to follow: > The 'xas_store' call during page cache scanning can potentially > translate 'xas' into the error state (with the reproducer provided > by the syzkaller the error code is -ENOMEM). However, there are no > further checks after the 'xas_store', and the next call of 'xas_next' > at the start of the scanning cycle doesn't increase the xa_index, > and the issue occurs. > > This patch will add the xarray state error checking after the > 'xas_store' and the corresponding result error code. It will > also add xarray state error checking via WARN_ON_ONCE macros, > to be sure that ENOMEM or other possible errors don't occur > at the places they shouldn't. Thanks for the additions here. I think it's worthwhile providing even more details about the specifics of the race we are fixing and/or guarding against to help people understand how that -ENOMEM comes about if the do-while loop has "Ensured" we have slots available (additionally, I think that comment can be augmented). > Tested via syzbot. > > Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com > Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc > Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> > --- > V1 -> V2: Add WARN_ON_ONCE error checking and comments > > mm/khugepaged.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 92e6f56a932d..8b6580b13339 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -55,6 +55,7 @@ enum scan_result { > SCAN_CGROUP_CHARGE_FAIL, > SCAN_TRUNCATED, > SCAN_PAGE_HAS_PRIVATE, > + SCAN_STORE_FAILED, > }; I'm still reluctant to add a new error code for this as this seems like quite a rare race that requires OOM to trigger. I'd be happier just reusing SCAN_FAIL, or something we might get some mileage out of later: SCAN_OOM.
Also, a reminder to update include/trace/events/huge_memory.h, if you go that route. > > #define CREATE_TRACE_POINTS > @@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > goto xa_locked; > } > xas_store(&xas, hpage); > + if (xas_error(&xas)) { > + /* revert shmem_charge performed > + * in the previous condition > + */ Nit: Here, and following, I think standard convention for multiline comment is to have an empty first and last line, eg: + /* + * revert shmem_charge performed + * in the previous condition + */ Though, checkpatch.pl --strict didn't seem to care. > + mapping->nrpages--; > + shmem_uncharge(mapping->host, 1); > + result = SCAN_STORE_FAILED; > + goto xa_locked; > + } > nr_none++; > continue; > } > @@ -1992,6 +2002,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > /* Finally, replace with the new page. */ > xas_store(&xas, hpage); > + /* We can't get an ENOMEM here (because the allocation happened before) > + * but let's check for errors (XArray implementation can be > + * changed in the future) > + */ > + WARN_ON_ONCE(xas_error(&xas)); Nit: it's not just that allocation happened before -- need some guarantee we've been protected from concurrent removal. This is what made me look at the file side. > continue; > out_unlock: > unlock_page(page); > @@ -2029,6 +2044,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > /* Join all the small entries into a single multi-index entry */ > xas_set_order(&xas, start, HPAGE_PMD_ORDER); > xas_store(&xas, hpage); > + /* Here we can't get an ENOMEM (because entries were > + * previously allocated) But let's check for errors > + * (XArray implementation can be changed in the future) > + */ > + WARN_ON_ONCE(xas_error(&xas)); Ditto. Apologies I won't be around to see this change through -- I'm just out of time, and will be shutting my computer down tomorrow for 3 months. Sorry for the poor timing, for raising issues, then disappearing. 
Hopefully I'm wrong and the file-side isn't a concern. Best, Zach > xa_locked: > xas_unlock_irq(&xas); > xa_unlocked: > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file 2023-03-31 1:33 ` Zach O'Keefe @ 2023-03-31 6:47 ` Ivan Orlov 2023-04-04 0:58 ` Yang Shi 1 sibling, 0 replies; 11+ messages in thread From: Ivan Orlov @ 2023-03-31 6:47 UTC (permalink / raw) To: Zach O'Keefe Cc: akpm, linux-mm, linux-kernel, himadrispandya, skhan, linux-kernel-mentees, shy828301, syzbot+9578faa5475acb35fa50 On 3/31/23 05:33, Zach O'Keefe wrote: > Thanks, Ivan. > > In the process of reviewing this, I starting thinking if the !shmem case was > also susceptible to a similar race, and I *think* it might be. Unfortunately, my > time has ran out, and I haven't been able to validate ; I'm less familiar with > the file-side of things. > > The underlying problem is race with truncation/hole-punch under OOM condition. > The nice do-while loop near the top of collapse_file() attempts to avoid this > scenario by making sure enough slots are available. However, when we drop xarray > lock, we open ourselves up to concurrent removal + slot deletion. Those slots > then need to be allocated again -- which under OOM condition is failable. > > The syzbot reproducer picks on shmem, but I think this can occur for file as > well. If we find a hole, we unlock the xarray and call > page_cache_sync_readahead(), which if it succeeds, IIUC, will have allocated a > new slot in our mapping pointing to the new page. We *then* locks the page. Only > after the page is locked are we protected from concurrent removal (Note: this is > what provides us protection in many of the xas_store() cases ; we've held the > slot's contained page-lock since verifying the slot exists, protecting us from > removal / reallocation races). > > Maybe I'm just low on caffeine at the end of the day, and am missing something, > but if I had more time, I'd be looking into the file-side some more to verify. 
> Apologies that hasn't occurred to me until now ; I was looking at one of your > comments and double-checked why I *thought* we were safe. > > Anyways, irrespective of that looming issues, some more notes to follow: > >> The 'xas_store' call during page cache scanning can potentially >> translate 'xas' into the error state (with the reproducer provided >> by the syzkaller the error code is -ENOMEM). However, there are no >> further checks after the 'xas_store', and the next call of 'xas_next' >> at the start of the scanning cycle doesn't increase the xa_index, >> and the issue occurs. >> >> This patch will add the xarray state error checking after the >> 'xas_store' and the corresponding result error code. It will >> also add xarray state error checking via WARN_ON_ONCE macros, >> to be sure that ENOMEM or other possible errors don't occur >> at the places they shouldn't. > > Thanks for the additions here. I think it's worthwhile providing even more > details about the specifics of the race we are fixing and/or guarding against to > help ppl understand how that -ENOMEM comes about if the do-while loop has > "Ensured" we have slots available (additionally, I think that comment can be > augmented). > >> Tested via syzbot. 
>> >> Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com >> Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc >> Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> >> --- >> V1 -> V2: Add WARN_ON_ONCE error checking and comments >> >> mm/khugepaged.c | 20 ++++++++++++++++++++ >> 1 file changed, 20 insertions(+) >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 92e6f56a932d..8b6580b13339 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -55,6 +55,7 @@ enum scan_result { >> SCAN_CGROUP_CHARGE_FAIL, >> SCAN_TRUNCATED, >> SCAN_PAGE_HAS_PRIVATE, >> + SCAN_STORE_FAILED, >> }; > > I'm still reluctant to add a new error code for this as this seems like quite a > rare race that requires OOM to trigger. I'd be happier just reusing SCAN_FAIL, > or, something we might get some millage out of later: SCAN_OOM. > > Also, a reminder to update include/trace/events/huge_memory.h, if you go that > route. > >> >> #define CREATE_TRACE_POINTS >> @@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, >> goto xa_locked; >> } >> xas_store(&xas, hpage); >> + if (xas_error(&xas)) { >> + /* revert shmem_charge performed >> + * in the previous condition >> + */ > > Nit: Here, and following, I think standard convention for multiline comment is > to have an empty first and last line, eg: > > + /* > + * revert shmem_charge performed > + * in the previous condition > + */ > > Though, checkpatch.pl --strict didn't seem to care. > >> + mapping->nrpages--; >> + shmem_uncharge(mapping->host, 1); >> + result = SCAN_STORE_FAILED; >> + goto xa_locked; >> + } >> nr_none++; >> continue; >> } >> @@ -1992,6 +2002,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, >> >> /* Finally, replace with the new page. 
*/ >> xas_store(&xas, hpage); >> + /* We can't get an ENOMEM here (because the allocation happened before) >> + * but let's check for errors (XArray implementation can be >> + * changed in the future) >> + */ >> + WARN_ON_ONCE(xas_error(&xas)); > > Nit: it's not just that allocation happened before -- need some guarantee we've > been protected from concurrent removal. This is what made me look at the file > side. > >> continue; >> out_unlock: >> unlock_page(page); >> @@ -2029,6 +2044,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, >> /* Join all the small entries into a single multi-index entry */ >> xas_set_order(&xas, start, HPAGE_PMD_ORDER); >> xas_store(&xas, hpage); >> + /* Here we can't get an ENOMEM (because entries were >> + * previously allocated) But let's check for errors >> + * (XArray implementation can be changed in the future) >> + */ >> + WARN_ON_ONCE(xas_error(&xas)); > > Ditto. > > Apologies I won't be around to see this change through -- I'm just out of time, > and will be shutting my computer down tomorrow for 3 months. Sorry for the poor > timing, for raising issues, then disappearing. Hopefully I'm wrong and the > file-side isn't a concern. > > Best, > Zach > >> xa_locked: >> xas_unlock_irq(&xas); >> xa_unlocked: >> -- >> 2.34.1 >> Hello, Zach! Thank you very much for the detailed review! I thought that locking was our guarantee that we wouldn't get an -ENOMEM, but I didn't have a deep understanding of the problem's cause, since I'm just starting my memory management journey :) In any case, I will research this problem, and hopefully when you get back from vacation you will see a new patch that fully fixes it. Have a nice vacation! Thanks again! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file 2023-03-31 1:33 ` Zach O'Keefe 2023-03-31 6:47 ` Ivan Orlov @ 2023-04-04 0:58 ` Yang Shi 2023-04-05 16:43 ` Zach O'Keefe 1 sibling, 1 reply; 11+ messages in thread From: Yang Shi @ 2023-04-04 0:58 UTC (permalink / raw) To: Zach O'Keefe Cc: Ivan Orlov, akpm, linux-mm, linux-kernel, himadrispandya, skhan, linux-kernel-mentees, syzbot+9578faa5475acb35fa50 On Thu, Mar 30, 2023 at 6:33 PM Zach O'Keefe <zokeefe@google.com> wrote: > > On Mar 30 19:53, Ivan Orlov wrote: > > Syzkaller reported the following issue: > > > > kernel BUG at mm/khugepaged.c:1823! > > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > > CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted 6.2.0-syzkaller-13154-g857f1268a591 #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 > > RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline] > > RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233 > > Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89 > > RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093 > > RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80 > > RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1 > > RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3 > > R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000 > > R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000 > > FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <TASK> > > 
madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693 > > madvise_vma_behavior mm/madvise.c:1086 [inline] > > madvise_walk_vmas mm/madvise.c:1260 [inline] > > do_madvise+0x9e5/0x4680 mm/madvise.c:1439 > > __do_sys_madvise mm/madvise.c:1452 [inline] > > __se_sys_madvise mm/madvise.c:1450 [inline] > > __x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > > > Thanks, Ivan. > > In the process of reviewing this, I starting thinking if the !shmem case was > also susceptible to a similar race, and I *think* it might be. Unfortunately, my > time has ran out, and I haven't been able to validate ; I'm less familiar with > the file-side of things. > > The underlying problem is race with truncation/hole-punch under OOM condition. > The nice do-while loop near the top of collapse_file() attempts to avoid this > scenario by making sure enough slots are available. However, when we drop xarray > lock, we open ourselves up to concurrent removal + slot deletion. Those slots > then need to be allocated again -- which under OOM condition is failable. > > The syzbot reproducer picks on shmem, but I think this can occur for file as > well. If we find a hole, we unlock the xarray and call > page_cache_sync_readahead(), which if it succeeds, IIUC, will have allocated a > new slot in our mapping pointing to the new page. We *then* locks the page. Only > after the page is locked are we protected from concurrent removal (Note: this is > what provides us protection in many of the xas_store() cases ; we've held the > slot's contained page-lock since verifying the slot exists, protecting us from > removal / reallocation races). IIUC, the find_lock_page() should be able to handle the race. It does check whether the page is truncated or not after locking the page. 
> > Maybe I'm just low on caffeine at the end of the day, and am missing something, > but if I had more time, I'd be looking into the file-side some more to verify. > Apologies that hasn't occurred to me until now ; I was looking at one of your > comments and double-checked why I *thought* we were safe. > > Anyways, irrespective of that looming issues, some more notes to follow: > > > The 'xas_store' call during page cache scanning can potentially > > translate 'xas' into the error state (with the reproducer provided > > by the syzkaller the error code is -ENOMEM). However, there are no > > further checks after the 'xas_store', and the next call of 'xas_next' > > at the start of the scanning cycle doesn't increase the xa_index, > > and the issue occurs. > > > > This patch will add the xarray state error checking after the > > 'xas_store' and the corresponding result error code. It will > > also add xarray state error checking via WARN_ON_ONCE macros, > > to be sure that ENOMEM or other possible errors don't occur > > at the places they shouldn't. > > Thanks for the additions here. I think it's worthwhile providing even more > details about the specifics of the race we are fixing and/or guarding against to > help ppl understand how that -ENOMEM comes about if the do-while loop has > "Ensured" we have slots available (additionally, I think that comment can be > augmented). > > > Tested via syzbot. 
> > > > Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com > > Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc > > Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> > > --- > > V1 -> V2: Add WARN_ON_ONCE error checking and comments > > > > mm/khugepaged.c | 20 ++++++++++++++++++++ > > 1 file changed, 20 insertions(+) > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > index 92e6f56a932d..8b6580b13339 100644 > > --- a/mm/khugepaged.c > > +++ b/mm/khugepaged.c > > @@ -55,6 +55,7 @@ enum scan_result { > > SCAN_CGROUP_CHARGE_FAIL, > > SCAN_TRUNCATED, > > SCAN_PAGE_HAS_PRIVATE, > > + SCAN_STORE_FAILED, > > }; > > I'm still reluctant to add a new error code for this as this seems like quite a > rare race that requires OOM to trigger. I'd be happier just reusing SCAN_FAIL, > or, something we might get some millage out of later: SCAN_OOM. > > Also, a reminder to update include/trace/events/huge_memory.h, if you go that > route. > > > > > #define CREATE_TRACE_POINTS > > @@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > goto xa_locked; > > } > > xas_store(&xas, hpage); > > + if (xas_error(&xas)) { > > + /* revert shmem_charge performed > > + * in the previous condition > > + */ > > Nit: Here, and following, I think standard convention for multiline comment is > to have an empty first and last line, eg: > > + /* > + * revert shmem_charge performed > + * in the previous condition > + */ > > Though, checkpatch.pl --strict didn't seem to care. > > > + mapping->nrpages--; > > + shmem_uncharge(mapping->host, 1); > > + result = SCAN_STORE_FAILED; > > + goto xa_locked; > > + } > > nr_none++; > > continue; > > } > > @@ -1992,6 +2002,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > > > /* Finally, replace with the new page. 
*/ > > xas_store(&xas, hpage); > > + /* We can't get an ENOMEM here (because the allocation happened before) > > + * but let's check for errors (XArray implementation can be > > + * changed in the future) > > + */ > > + WARN_ON_ONCE(xas_error(&xas)); > > Nit: it's not just that allocation happened before -- need some guarantee we've > been protected from concurrent removal. This is what made me look at the file > side. > > > continue; > > out_unlock: > > unlock_page(page); > > @@ -2029,6 +2044,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > /* Join all the small entries into a single multi-index entry */ > > xas_set_order(&xas, start, HPAGE_PMD_ORDER); > > xas_store(&xas, hpage); > > + /* Here we can't get an ENOMEM (because entries were > > + * previously allocated) But let's check for errors > > + * (XArray implementation can be changed in the future) > > + */ > > + WARN_ON_ONCE(xas_error(&xas)); > > Ditto. > > Apologies I won't be around to see this change through -- I'm just out of time, > and will be shutting my computer down tomorrow for 3 months. Sorry for the poor > timing, for raising issues, then disappearing. Hopefully I'm wrong and the > file-side isn't a concern. > > Best, > Zach > > > xa_locked: > > xas_unlock_irq(&xas); > > xa_unlocked: > > -- > > 2.34.1 > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file 2023-04-04 0:58 ` Yang Shi @ 2023-04-05 16:43 ` Zach O'Keefe 2023-04-16 18:33 ` Andrew Morton 0 siblings, 1 reply; 11+ messages in thread From: Zach O'Keefe @ 2023-04-05 16:43 UTC (permalink / raw) To: Yang Shi Cc: Ivan Orlov, akpm, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50 [-- Attachment #1: Type: text/plain, Size: 9958 bytes --] On Mon, Apr 3, 2023 at 5:58 PM Yang Shi <shy828301@gmail.com> wrote: > On Thu, Mar 30, 2023 at 6:33 PM Zach O'Keefe <zokeefe@google.com> wrote: > > > > On Mar 30 19:53, Ivan Orlov wrote: > > > Syzkaller reported the following issue: > > > > > > kernel BUG at mm/khugepaged.c:1823! > > > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > > > CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted > 6.2.0-syzkaller-13154-g857f1268a591 #0 > > > Hardware name: Google Google Compute Engine/Google Compute Engine, > BIOS Google 02/16/2023 > > > RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline] > > > RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233 > > > Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 > 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 > 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89 > > > RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093 > > > RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80 > > > RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1 > > > RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3 > > > R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000 > > > R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000 > > > FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) > knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0 > > > DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000 > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > Call Trace: > > > <TASK> > > > madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693 > > > madvise_vma_behavior mm/madvise.c:1086 [inline] > > > madvise_walk_vmas mm/madvise.c:1260 [inline] > > > do_madvise+0x9e5/0x4680 mm/madvise.c:1439 > > > __do_sys_madvise mm/madvise.c:1452 [inline] > > > __se_sys_madvise mm/madvise.c:1450 [inline] > > > __x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450 > > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > > do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > > > > > > Thanks, Ivan. > > > > In the process of reviewing this, I starting thinking if the !shmem case > was > > also susceptible to a similar race, and I *think* it might be. > Unfortunately, my > > time has ran out, and I haven't been able to validate ; I'm less > familiar with > > the file-side of things. > > > > The underlying problem is race with truncation/hole-punch under OOM > condition. > > The nice do-while loop near the top of collapse_file() attempts to avoid > this > > scenario by making sure enough slots are available. However, when we > drop xarray > > lock, we open ourselves up to concurrent removal + slot deletion. Those > slots > > then need to be allocated again -- which under OOM condition is failable. > > > > The syzbot reproducer picks on shmem, but I think this can occur for > file as > > well. If we find a hole, we unlock the xarray and call > > page_cache_sync_readahead(), which if it succeeds, IIUC, will have > allocated a > > new slot in our mapping pointing to the new page. We *then* locks the > page. 
Only > > after the page is locked are we protected from concurrent removal (Note: > this is > > what provides us protection in many of the xas_store() cases ; we've > held the > > slot's contained page-lock since verifying the slot exists, protecting > us from > > removal / reallocation races). > > IIUC, the find_lock_page() should be able to handle the race. It does > check whether the page is truncated or not after locking the page. > Ugh. Duh. You’re right, and I’m not sure how I missed that implication; we *did* succeed in locking the page in page cache. Well, the above rant just looks foolish now. Anyways, sorry for the false alarm, Ivan, and thanks for keeping me honest, Yang! > > > > Maybe I'm just low on caffeine at the end of the day, and am missing > something, > > but if I had more time, I'd be looking into the file-side some more to > verify. > > Apologies that hasn't occurred to me until now ; I was looking at one of > your > > comments and double-checked why I *thought* we were safe. > > > > Anyways, irrespective of that looming issues, some more notes to follow: > > > > > The 'xas_store' call during page cache scanning can potentially > > > translate 'xas' into the error state (with the reproducer provided > > > by the syzkaller the error code is -ENOMEM). However, there are no > > > further checks after the 'xas_store', and the next call of 'xas_next' > > > at the start of the scanning cycle doesn't increase the xa_index, > > > and the issue occurs. > > > > > > This patch will add the xarray state error checking after the > > > 'xas_store' and the corresponding result error code. It will > > > also add xarray state error checking via WARN_ON_ONCE macros, > > > to be sure that ENOMEM or other possible errors don't occur > > > at the places they shouldn't. > > > > Thanks for the additions here. 
> > I think it's worthwhile providing even more details about the specifics
> > of the race we are fixing and/or guarding against, to help people
> > understand how that -ENOMEM comes about if the do-while loop has
> > "Ensured" we have slots available (additionally, I think that comment
> > can be augmented).
> >
> > > Tested via syzbot.
> > >
> > > Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com
> > > Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc
> > > Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
> > > ---
> > > V1 -> V2: Add WARN_ON_ONCE error checking and comments
> > >
> > >  mm/khugepaged.c | 20 ++++++++++++++++++++
> > >  1 file changed, 20 insertions(+)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 92e6f56a932d..8b6580b13339 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -55,6 +55,7 @@ enum scan_result {
> > >  	SCAN_CGROUP_CHARGE_FAIL,
> > >  	SCAN_TRUNCATED,
> > >  	SCAN_PAGE_HAS_PRIVATE,
> > > +	SCAN_STORE_FAILED,
> > >  };
> >
> > I'm still reluctant to add a new error code for this, as this seems like
> > quite a rare race that requires OOM to trigger. I'd be happier just
> > reusing SCAN_FAIL, or something we might get some mileage out of later:
> > SCAN_OOM.
> >
> > Also, a reminder to update include/trace/events/huge_memory.h, if you go
> > that route.
> > >  #define CREATE_TRACE_POINTS
> > > @@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >  				goto xa_locked;
> > >  			}
> > >  			xas_store(&xas, hpage);
> > > +			if (xas_error(&xas)) {
> > > +				/* revert shmem_charge performed
> > > +				 * in the previous condition
> > > +				 */
> >
> > Nit: here, and in the following, I think the standard convention for a
> > multiline comment is to have an empty first and last line, e.g.:
> >
> > +				/*
> > +				 * revert shmem_charge performed
> > +				 * in the previous condition
> > +				 */
> >
> > Though checkpatch.pl --strict didn't seem to care.
> >
> > > +				mapping->nrpages--;
> > > +				shmem_uncharge(mapping->host, 1);
> > > +				result = SCAN_STORE_FAILED;
> > > +				goto xa_locked;
> > > +			}
> > >  			nr_none++;
> > >  			continue;
> > >  		}
> > > @@ -1992,6 +2002,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >
> > >  		/* Finally, replace with the new page. */
> > >  		xas_store(&xas, hpage);
> > > +		/* We can't get an ENOMEM here (because the allocation happened before)
> > > +		 * but let's check for errors (XArray implementation can be
> > > +		 * changed in the future)
> > > +		 */
> > > +		WARN_ON_ONCE(xas_error(&xas));
> >
> > Nit: it's not just that the allocation happened before -- we need some
> > guarantee that we've been protected from concurrent removal. This is
> > what made me look at the file side.
> >
> > >  		continue;
> > > out_unlock:
> > >  		unlock_page(page);
> > > @@ -2029,6 +2044,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >  	/* Join all the small entries into a single multi-index entry */
> > >  	xas_set_order(&xas, start, HPAGE_PMD_ORDER);
> > >  	xas_store(&xas, hpage);
> > > +	/* Here we can't get an ENOMEM (because entries were
> > > +	 * previously allocated) But let's check for errors
> > > +	 * (XArray implementation can be changed in the future)
> > > +	 */
> > > +	WARN_ON_ONCE(xas_error(&xas));
> >
> > Ditto.
> >
> > Apologies that I won't be around to see this change through -- I'm just
> > out of time, and will be shutting my computer down tomorrow for 3
> > months. Sorry for the poor timing, for raising issues and then
> > disappearing. Hopefully I'm wrong and the file side isn't a concern.
> >
> > Best,
> > Zach
> >
> > > xa_locked:
> > >  	xas_unlock_irq(&xas);
> > > xa_unlocked:
> > > --
> > > 2.34.1

^ permalink raw reply	[flat|nested] 11+ messages in thread
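The failure path the patch above guards can be modeled outside the kernel. The following is a minimal userspace C sketch of the charge / store / revert-on-error pattern the first hunk introduces; the names `mapping_state`, `fake_store`, and `insert_hole_slot` are invented stand-ins for the xarray slot store and the shmem accounting, and the store's -ENOMEM models the allocation failure that syzkaller injects.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the accounting state the patch touches. */
struct mapping_state {
	long nrpages;  /* page-cache accounting, like mapping->nrpages */
	long charged;  /* blocks charged, like via shmem_charge() */
};

/*
 * Stand-in for xas_store(): slot allocation can fail under OOM, which is
 * the case the syzbot reproducer provokes with fault injection.
 */
static int fake_store(int inject_oom)
{
	return inject_oom ? -ENOMEM : 0;
}

/*
 * The shape of the fix: charge optimistically, attempt the store, and on
 * error revert the accounting before bailing out with an error code
 * (SCAN_STORE_FAILED in the patch).
 */
static int insert_hole_slot(struct mapping_state *m, int inject_oom)
{
	m->nrpages++;
	m->charged++;
	if (fake_store(inject_oom)) {
		m->nrpages--;  /* mirrors mapping->nrpages-- */
		m->charged--;  /* mirrors shmem_uncharge(mapping->host, 1) */
		return -ENOMEM;
	}
	return 0;
}
```

The point of the revert is that the accounting stays balanced whether or not the store succeeds; without it, a failed store would leak one charged block per failure.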
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-05 16:43 ` Zach O'Keefe
@ 2023-04-16 18:33 ` Andrew Morton
  2023-04-16 20:39   ` Ivan Orlov
  ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Andrew Morton @ 2023-04-16 18:33 UTC (permalink / raw)
To: Zach O'Keefe
Cc: Yang Shi, Ivan Orlov, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

Circling back to this fix...

The BUG() is obviously real. We're unsure that Ivan's fix is the best one.
We haven't identified a Fixes:, and as this report is against the 6.2
kernel, a cc:stable will be needed.

According to the syzbot bisection
(https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc),
this is present in linux-5.19, so it might predate Zach's
58ac9a8993a13ebc changes. But that bisection claim might be misleading.

And Zach is offline for a few months. So can people please take a look and
see if we can get this wrapped up?

Matthew, the assertion failure is in the

	VM_BUG_ON(index != xas.xa_index);

which was added in 77da9389b9d5f, so perhaps you could take a look?

Thanks.
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-16 18:33 ` Andrew Morton
@ 2023-04-16 20:39 ` Ivan Orlov
  0 siblings, 0 replies; 11+ messages in thread
From: Ivan Orlov @ 2023-04-16 20:39 UTC (permalink / raw)
To: Andrew Morton, Zach O'Keefe
Cc: Yang Shi, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

On 4/16/23 22:33, Andrew Morton wrote:
> Circling back to this fix...
>
> The BUG() is obviously real. We're unsure that Ivan's fix is the best
> one. We haven't identified a Fixes:, and as this report is against the 6.2
> kernel, a cc:stable will be needed.
>
> According to the syzbot bisection
> (https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc),
> this is present in linux-5.19, so it might predate Zach's
> 58ac9a8993a13ebc changes. But that bisection claim might be misleading.
>
> And Zach is offline for a few months. So can people please take a look
> and see if we can get this wrapped up?
>
> Matthew, the assertion failure is in the
>
> 	VM_BUG_ON(index != xas.xa_index);
>
> which was added in 77da9389b9d5f, so perhaps you could take a look?
>
> Thanks.

Hello, Andrew!

I have been unable to reproduce the problem with any of the existing
reproducers on the 3d7cb6b04c3f3115719235cc6866b10326de34cd commit, which
was detected by the syzkaller bisection. I also tried to test whether the
problem is reproducible on this commit via syzbot, but it did not detect
the problem. It's possible that the bisection claim is misleading, as the
issue may not be consistently reproducible.

Why did you mention the 58ac9a8993a13ebc commit? I thought 99cb0dbd47a15
was considered as a "Fixes". 99cb0dbd47a15 is older than 3d7cb6b04c3f3,
and the problematic code appeared there, so the problem could probably
appear in 99cb0dbd47a15 as well.

Please correct me if I'm missing something.

Kind regards,
Ivan Orlov
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-16 18:33 ` Andrew Morton
  2023-04-16 20:39   ` Ivan Orlov
@ 2023-04-16 23:04 ` Ivan Orlov
  0 siblings, 0 replies; 11+ messages in thread
From: Ivan Orlov @ 2023-04-16 23:04 UTC (permalink / raw)
To: Andrew Morton, Zach O'Keefe
Cc: Yang Shi, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

On 4/16/23 22:33, Andrew Morton wrote:
> Circling back to this fix...
>
> The BUG() is obviously real. We're unsure that Ivan's fix is the best
> one. We haven't identified a Fixes:, and as this report is against the 6.2
> kernel, a cc:stable will be needed.
>
> According to the syzbot bisection
> (https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc),
> this is present in linux-5.19, so it might predate Zach's
> 58ac9a8993a13ebc changes. But that bisection claim might be misleading.
>
> And Zach is offline for a few months. So can people please take a look
> and see if we can get this wrapped up?
>
> Matthew, the assertion failure is in the
>
> 	VM_BUG_ON(index != xas.xa_index);
>
> which was added in 77da9389b9d5f, so perhaps you could take a look?
>
> Thanks.

I tested the reproducers on the 99cb0dbd47a15 commit, and they do not
trigger the problematic condition of shared memory truncation or
hole-punching. I will investigate further, as there have been many changes
in khugepaged since the 99cb0dbd47a15 commit that could potentially affect
its behavior.
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-16 18:33 ` Andrew Morton
  2023-04-16 20:39   ` Ivan Orlov
  2023-04-16 23:04   ` Ivan Orlov
@ 2023-04-17  0:45 ` Hugh Dickins
  0 siblings, 0 replies; 11+ messages in thread
From: Hugh Dickins @ 2023-04-17 0:45 UTC (permalink / raw)
To: Andrew Morton
Cc: Zach O'Keefe, Yang Shi, Ivan Orlov, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

On Sun, 16 Apr 2023, Andrew Morton wrote:
>
> Circling back to this fix...
>
> The BUG() is obviously real.

It's worth remembering that syzbot's reproducer involves artificially
injecting page allocation failures. So although the bug may be "real", it
is rather in the theoretical category, way down on my own list to look at,
and I'd say not at all urgent to fix.

Hugh

> We're unsure that Ivan's fix is the best
> one. We haven't identified a Fixes:, and as this report is against the 6.2
> kernel, a cc:stable will be needed.
>
> According to the syzbot bisection
> (https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc),
> this is present in linux-5.19, so it might predate Zach's
> 58ac9a8993a13ebc changes. But that bisection claim might be misleading.
>
> And Zach is offline for a few months. So can people please take a look
> and see if we can get this wrapped up?
>
> Matthew, the assertion failure is in the
>
> 	VM_BUG_ON(index != xas.xa_index);
>
> which was added in 77da9389b9d5f, so perhaps you could take a look?
>
> Thanks.
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-16 18:33 ` Andrew Morton
  ` (2 preceding siblings ...)
  2023-04-17  0:45 ` Hugh Dickins
@ 2023-04-17 18:28 ` Ivan Orlov
  2023-04-18 21:28   ` Andrew Morton
  3 siblings, 1 reply; 11+ messages in thread
From: Ivan Orlov @ 2023-04-17 18:28 UTC (permalink / raw)
To: Andrew Morton, Zach O'Keefe
Cc: Yang Shi, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

On 16.04.2023 22:33, Andrew Morton wrote:
> Circling back to this fix...
>
> The BUG() is obviously real. We're unsure that Ivan's fix is the best
> one. We haven't identified a Fixes:, and as this report is against the 6.2
> kernel, a cc:stable will be needed.
>
> According to the syzbot bisection
> (https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959fc),
> this is present in linux-5.19, so it might predate Zach's
> 58ac9a8993a13ebc changes. But that bisection claim might be misleading.
>
> And Zach is offline for a few months. So can people please take a look
> and see if we can get this wrapped up?
>
> Matthew, the assertion failure is in the
>
> 	VM_BUG_ON(index != xas.xa_index);
>
> which was added in 77da9389b9d5f, so perhaps you could take a look?
>
> Thanks.

Tested today on the latest rc7; the bug still exists.

Kind regards,
Ivan Orlov
* Re: [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file
  2023-04-17 18:28 ` Ivan Orlov
@ 2023-04-18 21:28 ` Andrew Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2023-04-18 21:28 UTC (permalink / raw)
To: Ivan Orlov
Cc: Zach O'Keefe, Yang Shi, himadrispandya, linux-kernel, linux-kernel-mentees, linux-mm, skhan, syzbot+9578faa5475acb35fa50, Mike Kravetz, Kirill A. Shutemov, Matthew Wilcox

On Mon, 17 Apr 2023 22:28:51 +0400 Ivan Orlov <ivan.orlov0322@gmail.com> wrote:

> Tested today on the latest rc7, the bug still exists.

OK, thanks. I'll send this patch upstream during the next merge window.

If you, Zach, or someone else decides to remove SCAN_STORE_FAILED and
perhaps those WARN_ON_ONCE()s, then we can address these things later.
end of thread, other threads: [~2023-04-18 21:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-30 15:53 [PATCH v2] mm: khugepaged: Fix kernel BUG in hpage_collapse_scan_file Ivan Orlov
2023-03-31  1:33 ` Zach O'Keefe
2023-03-31  6:47   ` Ivan Orlov
2023-04-04  0:58     ` Yang Shi
2023-04-05 16:43       ` Zach O'Keefe
2023-04-16 18:33         ` Andrew Morton
2023-04-16 20:39           ` Ivan Orlov
2023-04-16 23:04           ` Ivan Orlov
2023-04-17  0:45           ` Hugh Dickins
2023-04-17 18:28           ` Ivan Orlov
2023-04-18 21:28             ` Andrew Morton