* mm: BUG: Bad page state in process ksmd
@ 2014-03-26 15:13 Sasha Levin
  2014-03-26 19:55 ` Andrew Morton
  2014-03-27 15:21 ` Hugh Dickins
  0 siblings, 2 replies; 6+ messages in thread
From: Sasha Levin @ 2014-03-26 15:13 UTC (permalink / raw)
  To: linux-mm@kvack.org; +Cc: Andrew Morton, LKML

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following.

Out of curiosity, is there a reason not to do bad flag checks when actually
setting flag? Obviously it'll be slower but it'll be easier catching these
issues.

[ 3926.683948] BUG: Bad page state in process ksmd  pfn:5a6246
[ 3926.689336] page:ffffea0016989180 count:0 mapcount:0 mapping: (null) index:
[ 3926.696507] page flags: 0x56fffff8028001c(referenced|uptodate|dirty|swapbacked|mlock
[ 3926.709201] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 3926.711216] bad because of flags:
[ 3926.712136] page flags: 0x200000(mlocked)
[ 3926.713574] Modules linked in:
[ 3926.714466] CPU: 26 PID: 3864 Comm: ksmd Tainted: G W 3.14.0-rc7-next-201
[ 3926.720942]  ffffffff85688060 ffff8806ec7abc38 ffffffff844bd702 0000000000002fa0
[ 3926.728107]  ffffea0016989180 ffff8806ec7abc68 ffffffff844b158f 000fffff80000000
[ 3926.730563]  0000000000000000 000fffff80000000 ffffffff85688060 ffff8806ec7abcb8
[ 3926.737653] Call Trace:
[ 3926.738347] dump_stack (lib/dump_stack.c:52)
[ 3926.739841] bad_page (arch/x86/include/asm/atomic.h:38 include/linux/mm.h:432 mm/page_alloc.c:339)
[ 3926.741296] free_pages_prepare (mm/page_alloc.c:644 mm/page_alloc.c:738)
[ 3926.742818] free_hot_cold_page (mm/page_alloc.c:1371)
[ 3926.749425] __put_single_page (mm/swap.c:71)
[ 3926.751074] put_page (mm/swap.c:237)
[ 3926.752398] ksm_do_scan (mm/ksm.c:1480 mm/ksm.c:1704)
[ 3926.753957] ksm_scan_thread (mm/ksm.c:1723)
[ 3926.755940] ? bit_waitqueue (kernel/sched/wait.c:291)
[ 3926.758644] ? ksm_do_scan (mm/ksm.c:1715)
[ 3926.760420] kthread (kernel/kthread.c:219)
[ 3926.761605] ? kthread_create_on_node (kernel/kthread.c:185)
[ 3926.763149] ret_from_fork (arch/x86/kernel/entry_64.S:555)
[ 3926.764323] ? kthread_create_on_node (kernel/kthread.c:185)

Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 15:13 mm: BUG: Bad page state in process ksmd Sasha Levin
@ 2014-03-26 19:55 ` Andrew Morton
  2014-03-26 21:39   ` Sasha Levin
  2014-03-27 15:21 ` Hugh Dickins
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2014-03-26 19:55 UTC (permalink / raw)
  To: Sasha Levin; +Cc: linux-mm@kvack.org, LKML, Hugh Dickins

On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:

> Hi all,
>
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following.

(cc Hugh)

> Out of curiosity, is there a reason not to do bad flag checks when actually
> setting flag? Obviously it'll be slower but it'll be easier catching these
> issues.

Tricky.  Each code site must determine what are and are not valid page
states depending upon the current context.  The one place where we've
made that effort is at the point where a page is returned to the free
page pool.  Any other sites would require similar amounts of effort and
each one would be different from all the others.

We do this in a small way all over the place, against individual page
flags.  grep PageLocked */*.c.
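The free-time check described above comes down to masking the page's flags word against a set of flags that must be clear when the page goes back to the free pool. What follows is a minimal userspace sketch of that idea, not the kernel's code. The `demo_*` names are hypothetical, the mask holds only the one bit the log reports as `0x200000(mlocked)`, and the real `PAGE_FLAGS_CHECK_AT_FREE` covers more flags than this.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit value copied from the log line "page flags: 0x200000(mlocked)". */
#define DEMO_PG_MLOCKED UINT64_C(0x200000)

/*
 * Hypothetical stand-in for PAGE_FLAGS_CHECK_AT_FREE. The kernel's mask
 * covers more flags; only the one named in the log is modelled here.
 */
#define DEMO_CHECK_AT_FREE (DEMO_PG_MLOCKED)

/* Return the subset of flags that must not be set when a page is freed. */
static uint64_t demo_bad_flags_at_free(uint64_t flags)
{
	return flags & DEMO_CHECK_AT_FREE;
}

/*
 * Print a one-line report when a forbidden flag is still set.
 * Returns 1 if the page state is bad, 0 otherwise.
 */
static int demo_report_at_free(uint64_t flags)
{
	uint64_t bad = demo_bad_flags_at_free(flags);

	if (bad == 0)
		return 0;
	printf("bad because of flags: 0x%llx\n", (unsigned long long)bad);
	return 1;
}
```

Feeding in the flags word from the report, `0x56fffff8028001c`, leaves exactly the mlocked bit after masking, which matches the "bad because of flags" line in the log. Note that the check only sees the state at free time; it says nothing about which earlier code path left the bit set.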
* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 19:55 ` Andrew Morton
@ 2014-03-26 21:39   ` Sasha Levin
  2014-03-27 15:36     ` Hugh Dickins
  0 siblings, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2014-03-26 21:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm@kvack.org, LKML, Hugh Dickins

On 03/26/2014 03:55 PM, Andrew Morton wrote:
> On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:
>> Out of curiosity, is there a reason not to do bad flag checks when actually
>> setting flag? Obviously it'll be slower but it'll be easier catching these
>> issues.
>
> Tricky.  Each code site must determine what are and are not valid page
> states depending upon the current context.  The one place where we've
> made that effort is at the point where a page is returned to the free
> page pool.  Any other sites would require similar amounts of effort and
> each one would be different from all the others.
>
> We do this in a small way all over the place, against individual page
> flags.  grep PageLocked */*.c.

What if we define generic page types and group page flags under them?
It would be easier to put these checks in key sites around the code
and no need to fully customize them to each site.

For example, swap_readpage() is doing this:

	VM_BUG_ON_PAGE(!PageLocked(page), page);
	VM_BUG_ON_PAGE(PageUptodate(page), page);

But what if instead of that we'd do:

	VM_BUG_ON_PAGE(!PageSwap(page), page);

Where PageSwap would test "not locked", "uptodate", and in addition
a set of "sanity" flags which it didn't make sense to test individually
everywhere (PageError()? PageReclaim()?).

I can add the infrastructure if that sounds good (and people promise to
work with me on defining page types). I'd be happy to do all the testing
involved in getting this to work right.

Thanks,
Sasha
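To make the proposal concrete, here is a rough userspace sketch of what a grouped predicate might look like next to the two individual checks quoted from swap_readpage(). Everything in it is hypothetical: `struct demo_page` is a toy model rather than the kernel's `struct page`, the helper names are invented for illustration, and the choice of an error flag as the extra "sanity" check is only an assumption based on the PageError() mention above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a page; not the kernel's struct page. */
struct demo_page {
	bool locked;
	bool uptodate;
	bool error;
};

/*
 * The two conditions swap_readpage() asserts separately in the quoted
 * code: the page must be locked, and must not yet be uptodate.
 */
static bool demo_locked_ok(const struct demo_page *p)
{
	return p->locked;
}

static bool demo_not_uptodate_ok(const struct demo_page *p)
{
	return !p->uptodate;
}

/*
 * Hypothetical grouped predicate in the spirit of the proposed PageSwap():
 * the two conditions above plus one assumed "sanity" flag, folded into a
 * single test that a caller would assert once.
 */
static bool demo_page_swap_read_ok(const struct demo_page *p)
{
	return demo_locked_ok(p) && demo_not_uptodate_ok(p) && !p->error;
}
```

A caller would then assert `demo_page_swap_read_ok(page)` once instead of listing each flag. The cost of that convenience is that the individual flag names no longer appear at the call site, so which flags belong in the group has to be agreed on separately for every page type.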
* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 21:39   ` Sasha Levin
@ 2014-03-27 15:36     ` Hugh Dickins
  0 siblings, 0 replies; 6+ messages in thread
From: Hugh Dickins @ 2014-03-27 15:36 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Andrew Morton, linux-mm@kvack.org, LKML, Hugh Dickins

On Wed, 26 Mar 2014, Sasha Levin wrote:
> On 03/26/2014 03:55 PM, Andrew Morton wrote:
> > On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:
> > > Out of curiosity, is there a reason not to do bad flag checks when actually
> > > setting flag? Obviously it'll be slower but it'll be easier catching these
> > > issues.
> >
> > Tricky.  Each code site must determine what are and are not valid page
> > states depending upon the current context.  The one place where we've
> > made that effort is at the point where a page is returned to the free
> > page pool.  Any other sites would require similar amounts of effort and
> > each one would be different from all the others.
> >
> > We do this in a small way all over the place, against individual page
> > flags.  grep PageLocked */*.c.
>
> What if we define generic page types and group page flags under them?
> It would be easier to put these checks in key sites around the code
> and no need to fully customize them to each site.
>
> For example, swap_readpage() is doing this:
>
> 	VM_BUG_ON_PAGE(!PageLocked(page), page);
> 	VM_BUG_ON_PAGE(PageUptodate(page), page);
>
> But what if instead of that we'd do:
>
> 	VM_BUG_ON_PAGE(!PageSwap(page), page);
>
> Where PageSwap would test "not locked", "uptodate", and in addition
> a set of "sanity" flags which it didn't make sense to test individually
> everywhere (PageError()? PageReclaim()?).
>
> I can add the infrastructure if that sounds good (and people promise to
> work with me on defining page types). I'd be happy to do all the testing
> involved in getting this to work right.

Sorry, I don't understand how you see that as a good idea.  I wonder
if you have cleverly put that suggestion into the thread, to push me
into a more timely response to the BUG than you usually get ?-)

It seems a bad idea to me in at least three ways: expending more
developer time on establishing what set of page flags to test at
each site; expending more developer time on fixing all the false
positives that would result; and spoiling the greppability of the
source tree by hiding flag checks in obscure combinations.

Page flags are separate flags because they are largely independent.

Developers have inserted the VM_BUG_ONs they think are needed,
please leave them at that.  There may be a good case for removing
some of the older ones that have served their purpose (we rather
overused PageLocked checks in 2.4 for example), but not for putting
effort into adding more to what's there.

Hugh
* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 15:13 mm: BUG: Bad page state in process ksmd Sasha Levin
  2014-03-26 19:55 ` Andrew Morton
@ 2014-03-27 15:21 ` Hugh Dickins
  2014-03-27 15:31   ` Sasha Levin
  1 sibling, 1 reply; 6+ messages in thread
From: Hugh Dickins @ 2014-03-27 15:21 UTC (permalink / raw)
  To: Sasha Levin; +Cc: linux-mm@kvack.org, Andrew Morton, LKML

On Wed, 26 Mar 2014, Sasha Levin wrote:

> Hi all,
>
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following.
>
> Out of curiosity, is there a reason not to do bad flag checks when actually
> setting flag? Obviously it'll be slower but it'll be easier catching these
> issues.

I don't see how it would help here.

> [ 3926.683948] BUG: Bad page state in process ksmd  pfn:5a6246
> [ 3926.689336] page:ffffea0016989180 count:0 mapcount:0 mapping: (null) index:
> [ 3926.696507] page flags: 0x56fffff8028001c(referenced|uptodate|dirty|swapbacked|mlock
> [ 3926.709201] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 3926.711216] bad because of flags:
> [ 3926.712136] page flags: 0x200000(mlocked)
> [ 3926.713574] Modules linked in:
> [ 3926.714466] CPU: 26 PID: 3864 Comm: ksmd Tainted: G W 3.14.0-rc7-next-201
> [ 3926.720942]  ffffffff85688060 ffff8806ec7abc38 ffffffff844bd702 0000000000002fa0
> [ 3926.728107]  ffffea0016989180 ffff8806ec7abc68 ffffffff844b158f 000fffff80000000
> [ 3926.730563]  0000000000000000 000fffff80000000 ffffffff85688060 ffff8806ec7abcb8
> [ 3926.737653] Call Trace:
> [ 3926.738347] dump_stack (lib/dump_stack.c:52)
> [ 3926.739841] bad_page (arch/x86/include/asm/atomic.h:38 include/linux/mm.h:432 mm/page_alloc.c:339)
> [ 3926.741296] free_pages_prepare (mm/page_alloc.c:644 mm/page_alloc.c:738)
> [ 3926.742818] free_hot_cold_page (mm/page_alloc.c:1371)
> [ 3926.749425] __put_single_page (mm/swap.c:71)
> [ 3926.751074] put_page (mm/swap.c:237)
> [ 3926.752398] ksm_do_scan (mm/ksm.c:1480 mm/ksm.c:1704)
> [ 3926.753957] ksm_scan_thread (mm/ksm.c:1723)
> [ 3926.755940] ? bit_waitqueue (kernel/sched/wait.c:291)
> [ 3926.758644] ? ksm_do_scan (mm/ksm.c:1715)
> [ 3926.760420] kthread (kernel/kthread.c:219)
> [ 3926.761605] ? kthread_create_on_node (kernel/kthread.c:185)
> [ 3926.763149] ret_from_fork (arch/x86/kernel/entry_64.S:555)
> [ 3926.764323] ? kthread_create_on_node (kernel/kthread.c:185)

I've thought about this some, and slept on it, but don't yet see
how it comes about.  I'll have to come back to it later.

Was it a one-off, or do you find it fairly easy to reproduce?

If the latter, it would be interesting to know if it comes from
recent changes or not.  mm/mlock.c does appear to have been under
continuous revision for several releases (but barely changed in next).

Thanks,
Hugh
* Re: mm: BUG: Bad page state in process ksmd
  2014-03-27 15:21 ` Hugh Dickins
@ 2014-03-27 15:31   ` Sasha Levin
  0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2014-03-27 15:31 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm@kvack.org, Andrew Morton, LKML

On 03/27/2014 11:21 AM, Hugh Dickins wrote:
> I've thought about this some, and slept on it, but don't yet see
> how it comes about.  I'll have to come back to it later.
>
> Was it a one-off, or do you find it fairly easy to reproduce?
>
> If the latter, it would be interesting to know if it comes from
> recent changes or not.  mm/mlock.c does appear to have been under
> continuous revision for several releases (but barely changed in next).

I can't say it's easy to reproduce but it did happen 5-6 times at this point.

As far as I can tell there were no big changes in trinity for the last week
or so while we were in lsf/mm, and this issue being reproducible makes me
believe it has something to do with recent changes to mm code.

Thanks,
Sasha