linux-mm.kvack.org archive mirror
* mm: BUG: Bad page state in process ksmd
@ 2014-03-26 15:13 Sasha Levin
  2014-03-26 19:55 ` Andrew Morton
  2014-03-27 15:21 ` Hugh Dickins
  0 siblings, 2 replies; 6+ messages in thread
From: Sasha Levin @ 2014-03-26 15:13 UTC
  To: linux-mm@kvack.org; +Cc: Andrew Morton, LKML

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following.

Out of curiosity, is there a reason not to do bad flag checks when actually
setting a flag? Obviously it'll be slower, but it would make these issues
easier to catch.

[ 3926.683948] BUG: Bad page state in process ksmd  pfn:5a6246
[ 3926.689336] page:ffffea0016989180 count:0 mapcount:0 mapping:          (null) index:
[ 3926.696507] page flags: 0x56fffff8028001c(referenced|uptodate|dirty|swapbacked|mlock
[ 3926.709201] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 3926.711216] bad because of flags:
[ 3926.712136] page flags: 0x200000(mlocked)
[ 3926.713574] Modules linked in:
[ 3926.714466] CPU: 26 PID: 3864 Comm: ksmd Tainted: G        W     3.14.0-rc7-next-201
[ 3926.720942]  ffffffff85688060 ffff8806ec7abc38 ffffffff844bd702 0000000000002fa0
[ 3926.728107]  ffffea0016989180 ffff8806ec7abc68 ffffffff844b158f 000fffff80000000
[ 3926.730563]  0000000000000000 000fffff80000000 ffffffff85688060 ffff8806ec7abcb8
[ 3926.737653] Call Trace:
[ 3926.738347]  dump_stack (lib/dump_stack.c:52)
[ 3926.739841]  bad_page (arch/x86/include/asm/atomic.h:38 include/linux/mm.h:432 mm/page_alloc.c:339)
[ 3926.741296]  free_pages_prepare (mm/page_alloc.c:644 mm/page_alloc.c:738)
[ 3926.742818]  free_hot_cold_page (mm/page_alloc.c:1371)
[ 3926.749425]  __put_single_page (mm/swap.c:71)
[ 3926.751074]  put_page (mm/swap.c:237)
[ 3926.752398]  ksm_do_scan (mm/ksm.c:1480 mm/ksm.c:1704)
[ 3926.753957]  ksm_scan_thread (mm/ksm.c:1723)
[ 3926.755940]  ? bit_waitqueue (kernel/sched/wait.c:291)
[ 3926.758644]  ? ksm_do_scan (mm/ksm.c:1715)
[ 3926.760420]  kthread (kernel/kthread.c:219)
[ 3926.761605]  ? kthread_create_on_node (kernel/kthread.c:185)
[ 3926.763149]  ret_from_fork (arch/x86/kernel/entry_64.S:555)
[ 3926.764323]  ? kthread_create_on_node (kernel/kthread.c:185)


Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 15:13 mm: BUG: Bad page state in process ksmd Sasha Levin
@ 2014-03-26 19:55 ` Andrew Morton
  2014-03-26 21:39   ` Sasha Levin
  2014-03-27 15:21 ` Hugh Dickins
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2014-03-26 19:55 UTC
  To: Sasha Levin; +Cc: linux-mm@kvack.org, LKML, Hugh Dickins

On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:

> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following.

(cc Hugh)

> Out of curiosity, is there a reason not to do bad flag checks when actually
> setting flag? Obviously it'll be slower but it'll be easier catching these
> issues.

Tricky.  Each code site must determine what are and are not valid page
states depending upon the current context.  The one place where we've
made that effort is at the point where a page is returned to the free
page pool.  Any other sites would require similar amounts of effort and
each one would be different from all the others.

We do this in a small way all over the place, against individual page
flags.  grep PageLocked */*.c.
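
As a rough illustration of the free-time check described above, here is a
minimal user-space sketch in C. It is not the kernel's code:
CHECK_AT_FREE_MASK is a hypothetical stand-in for PAGE_FLAGS_CHECK_AT_FREE
that contains only the mlocked bit, whose value (0x200000) is taken from the
"bad because of flags" line in the report, and bad_flags_at_free() is a name
made up for this example.

```c
#include <stdint.h>

/* Bit value taken from the report: "page flags: 0x200000(mlocked)". */
#define PG_MLOCKED_MASK 0x200000ULL

/*
 * Hypothetical stand-in for PAGE_FLAGS_CHECK_AT_FREE. The real mask covers
 * more flags; only the one named in the report is modeled here.
 */
#define CHECK_AT_FREE_MASK (PG_MLOCKED_MASK)

/*
 * Return the subset of flags that must not be set when a page is freed.
 * A non-zero result corresponds to the "Bad page state" report.
 */
static uint64_t bad_flags_at_free(uint64_t flags)
{
	return flags & CHECK_AT_FREE_MASK;
}
```

Applied to the flags word from the report, 0x56fffff8028001c, this returns
0x200000, which matches the mlocked bit the kernel complained about. The point
of checking at free time is that the set of forbidden flags is the same for
every caller, so one mask is enough.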

> [ 3926.683948] BUG: Bad page state in process ksmd  pfn:5a6246
> [ 3926.689336] page:ffffea0016989180 count:0 mapcount:0 mapping:          (null) index:
> [ 3926.696507] page flags: 0x56fffff8028001c(referenced|uptodate|dirty|swapbacked|mlock
> [ 3926.709201] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 3926.711216] bad because of flags:
> [ 3926.712136] page flags: 0x200000(mlocked)
> [ 3926.713574] Modules linked in:
> [ 3926.714466] CPU: 26 PID: 3864 Comm: ksmd Tainted: G        W     3.14.0-rc7-next-201
> [ 3926.720942]  ffffffff85688060 ffff8806ec7abc38 ffffffff844bd702 0000000000002fa0
> [ 3926.728107]  ffffea0016989180 ffff8806ec7abc68 ffffffff844b158f 000fffff80000000
> [ 3926.730563]  0000000000000000 000fffff80000000 ffffffff85688060 ffff8806ec7abcb8
> [ 3926.737653] Call Trace:
> [ 3926.738347]  dump_stack (lib/dump_stack.c:52)
> [ 3926.739841]  bad_page (arch/x86/include/asm/atomic.h:38 include/linux/mm.h:432 mm/page_alloc.c:339)
> [ 3926.741296]  free_pages_prepare (mm/page_alloc.c:644 mm/page_alloc.c:738)
> [ 3926.742818]  free_hot_cold_page (mm/page_alloc.c:1371)
> [ 3926.749425]  __put_single_page (mm/swap.c:71)
> [ 3926.751074]  put_page (mm/swap.c:237)
> [ 3926.752398]  ksm_do_scan (mm/ksm.c:1480 mm/ksm.c:1704)
> [ 3926.753957]  ksm_scan_thread (mm/ksm.c:1723)
> [ 3926.755940]  ? bit_waitqueue (kernel/sched/wait.c:291)
> [ 3926.758644]  ? ksm_do_scan (mm/ksm.c:1715)
> [ 3926.760420]  kthread (kernel/kthread.c:219)
> [ 3926.761605]  ? kthread_create_on_node (kernel/kthread.c:185)
> [ 3926.763149]  ret_from_fork (arch/x86/kernel/entry_64.S:555)
> [ 3926.764323]  ? kthread_create_on_node (kernel/kthread.c:185)


* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 19:55 ` Andrew Morton
@ 2014-03-26 21:39   ` Sasha Levin
  2014-03-27 15:36     ` Hugh Dickins
  0 siblings, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2014-03-26 21:39 UTC
  To: Andrew Morton; +Cc: linux-mm@kvack.org, LKML, Hugh Dickins

On 03/26/2014 03:55 PM, Andrew Morton wrote:
> On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:
>> Out of curiosity, is there a reason not to do bad flag checks when actually
>> setting flag? Obviously it'll be slower but it'll be easier catching these
>> issues.
>
> Tricky.  Each code site must determine what are and are not valid page
> states depending upon the current context.  The one place where we've
> made that effort is at the point where a page is returned to the free
> page pool.  Any other sites would require similar amounts of effort and
> each one would be different from all the others.
>
> We do this in a small way all over the place, against individual page
> flags.  grep PageLocked */*.c.

What if we define generic page types and group page flags under them?
It would be easier to put these checks in key sites around the code,
and there would be no need to fully customize them for each site.

For example, swap_readpage() is doing this:

         VM_BUG_ON_PAGE(!PageLocked(page), page);
         VM_BUG_ON_PAGE(PageUptodate(page), page);

But what if instead of that we'd do:

	VM_BUG_ON_PAGE(!PageSwap(page), page);

Where PageSwap would test "not locked", "uptodate", and in addition
a set of "sanity" flags which it didn't make sense to test individually
everywhere (PageError()? PageReclaim()?).
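
To make the idea concrete, a grouped predicate of this kind might look roughly
like the user-space sketch below. Everything in it is an assumption for
illustration: the flag bit positions are invented, the helpers only mimic the
kernel's names, and PageSwap() is the hypothetical type check proposed above,
read here as "locked, not yet uptodate, and none of the sanity flags set" so
that it agrees with the two existing VM_BUG_ON_PAGE lines.

```c
#include <stdbool.h>

/* Toy flag bits; the positions are illustrative, not the kernel's. */
enum {
	PG_locked   = 1u << 0,
	PG_error    = 1u << 1,
	PG_uptodate = 1u << 2,
	PG_reclaim  = 1u << 3,
};

/* Toy page: only the flags word is modeled. */
struct page {
	unsigned long flags;
};

static bool PageLocked(const struct page *p)   { return p->flags & PG_locked; }
static bool PageError(const struct page *p)    { return p->flags & PG_error; }
static bool PageUptodate(const struct page *p) { return p->flags & PG_uptodate; }
static bool PageReclaim(const struct page *p)  { return p->flags & PG_reclaim; }

/*
 * Hypothetical grouped check: the page is in the state expected on entry
 * to a swap read, plus a few "sanity" flags that should not be set.
 */
static bool PageSwap(const struct page *p)
{
	return PageLocked(p) && !PageUptodate(p) &&
	       !PageError(p) && !PageReclaim(p);
}
```

A call site would then use VM_BUG_ON_PAGE(!PageSwap(page), page) in place of
the two separate checks. Which sanity flags belong in each group is exactly
the part that would need to be agreed on per page type.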

I can add the infrastructure if that sounds good (and people promise to
work with me on defining page types). I'd be happy to do all the testing
involved in getting this to work right.


Thanks,
Sasha


* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 15:13 mm: BUG: Bad page state in process ksmd Sasha Levin
  2014-03-26 19:55 ` Andrew Morton
@ 2014-03-27 15:21 ` Hugh Dickins
  2014-03-27 15:31   ` Sasha Levin
  1 sibling, 1 reply; 6+ messages in thread
From: Hugh Dickins @ 2014-03-27 15:21 UTC
  To: Sasha Levin; +Cc: linux-mm@kvack.org, Andrew Morton, LKML

On Wed, 26 Mar 2014, Sasha Levin wrote:
> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following.
> 
> Out of curiosity, is there a reason not to do bad flag checks when actually
> setting flag? Obviously it'll be slower but it'll be easier catching these
> issues.

I don't see how it would help here.

> 
> [ 3926.683948] BUG: Bad page state in process ksmd  pfn:5a6246
> [ 3926.689336] page:ffffea0016989180 count:0 mapcount:0 mapping:          (null) index:
> [ 3926.696507] page flags: 0x56fffff8028001c(referenced|uptodate|dirty|swapbacked|mlock
> [ 3926.709201] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 3926.711216] bad because of flags:
> [ 3926.712136] page flags: 0x200000(mlocked)
> [ 3926.713574] Modules linked in:
> [ 3926.714466] CPU: 26 PID: 3864 Comm: ksmd Tainted: G        W     3.14.0-rc7-next-201
> [ 3926.720942]  ffffffff85688060 ffff8806ec7abc38 ffffffff844bd702 0000000000002fa0
> [ 3926.728107]  ffffea0016989180 ffff8806ec7abc68 ffffffff844b158f 000fffff80000000
> [ 3926.730563]  0000000000000000 000fffff80000000 ffffffff85688060 ffff8806ec7abcb8
> [ 3926.737653] Call Trace:
> [ 3926.738347]  dump_stack (lib/dump_stack.c:52)
> [ 3926.739841]  bad_page (arch/x86/include/asm/atomic.h:38 include/linux/mm.h:432 mm/page_alloc.c:339)
> [ 3926.741296]  free_pages_prepare (mm/page_alloc.c:644 mm/page_alloc.c:738)
> [ 3926.742818]  free_hot_cold_page (mm/page_alloc.c:1371)
> [ 3926.749425]  __put_single_page (mm/swap.c:71)
> [ 3926.751074]  put_page (mm/swap.c:237)
> [ 3926.752398]  ksm_do_scan (mm/ksm.c:1480 mm/ksm.c:1704)
> [ 3926.753957]  ksm_scan_thread (mm/ksm.c:1723)
> [ 3926.755940]  ? bit_waitqueue (kernel/sched/wait.c:291)
> [ 3926.758644]  ? ksm_do_scan (mm/ksm.c:1715)
> [ 3926.760420]  kthread (kernel/kthread.c:219)
> [ 3926.761605]  ? kthread_create_on_node (kernel/kthread.c:185)
> [ 3926.763149]  ret_from_fork (arch/x86/kernel/entry_64.S:555)
> [ 3926.764323]  ? kthread_create_on_node (kernel/kthread.c:185)

I've thought about this some, and slept on it, but don't yet see
how it comes about.  I'll have to come back to it later.

Was it a one-off, or do you find it fairly easy to reproduce?

If the latter, it would be interesting to know if it comes from
recent changes or not.  mm/mlock.c does appear to have been under
continuous revision for several releases (but barely changed in next).

Thanks,
Hugh


* Re: mm: BUG: Bad page state in process ksmd
  2014-03-27 15:21 ` Hugh Dickins
@ 2014-03-27 15:31   ` Sasha Levin
  0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2014-03-27 15:31 UTC
  To: Hugh Dickins; +Cc: linux-mm@kvack.org, Andrew Morton, LKML

On 03/27/2014 11:21 AM, Hugh Dickins wrote:
> I've thought about this some, and slept on it, but don't yet see
> how it comes about.  I'll have to come back to it later.
>
> Was it a one-off, or do you find it fairly easy to reproduce?
>
> If the latter, it would be interesting to know if it comes from
> recent changes or not.  mm/mlock.c does appear to have been under
> continuous revision for several releases (but barely changed in next).

I can't say it's easy to reproduce, but it has happened 5-6 times at this point.

As far as I can tell there were no big changes in trinity for the last week
or so while we were in lsf/mm, and this issue being reproducible makes me
believe it has something to do with recent changes to mm code.


Thanks,
Sasha


* Re: mm: BUG: Bad page state in process ksmd
  2014-03-26 21:39   ` Sasha Levin
@ 2014-03-27 15:36     ` Hugh Dickins
  0 siblings, 0 replies; 6+ messages in thread
From: Hugh Dickins @ 2014-03-27 15:36 UTC
  To: Sasha Levin; +Cc: Andrew Morton, linux-mm@kvack.org, LKML, Hugh Dickins

On Wed, 26 Mar 2014, Sasha Levin wrote:
> On 03/26/2014 03:55 PM, Andrew Morton wrote:
> > On Wed, 26 Mar 2014 11:13:27 -0400 Sasha Levin <sasha.levin@oracle.com> wrote:
> > > Out of curiosity, is there a reason not to do bad flag checks when actually
> > > setting flag? Obviously it'll be slower but it'll be easier catching these
> > > issues.
> > 
> > Tricky.  Each code site must determine what are and are not valid page
> > states depending upon the current context.  The one place where we've
> > made that effort is at the point where a page is returned to the free
> > page pool.  Any other sites would require similar amounts of effort and
> > each one would be different from all the others.
> > 
> > We do this in a small way all over the place, against individual page
> > flags.  grep PageLocked */*.c.
> 
> What if we define generic page types and group page flags under them?
> It would be easier to put these checks in key sites around the code
> and no need to fully customize them to each site.
> 
> For example, swap_readpage() is doing this:
> 
>         VM_BUG_ON_PAGE(!PageLocked(page), page);
>         VM_BUG_ON_PAGE(PageUptodate(page), page);
> 
> But what if instead of that we'd do:
> 
> 	VM_BUG_ON_PAGE(!PageSwap(page), page);
> 
> Where PageSwap would test "not locked", "uptodate", and in addition
> a set of "sanity" flags which it didn't make sense to test individually
> everywhere (PageError()? PageReclaim()?).
> 
> I can add the infrastructure if that sounds good (and people promise to
> work with me on defining page types). I'd be happy to do all the testing
> involved in getting this to work right.

Sorry, I don't understand how you see that as a good idea.  I wonder
if you have cleverly put that suggestion into the thread, to push me
into a more timely response to the BUG than you usually get ?-)

It seems a bad idea to me in at least three ways: expending more
developer time on establishing what set of page flags to test at
each site; expending more developer time on fixing all the false
positives that would result; and spoiling the greppability of the
source tree by hiding flag checks in obscure combinations.

Page flags are separate flags because they are largely
independent.

Developers have inserted the VM_BUG_ONs they think are needed,
please leave them at that.  There may be a good case for removing
some of the older ones that have served their purpose (we rather
overused PageLocked checks in 2.4 for example), but not for
putting effort into adding more to what's there.

Hugh

