* Re: BUG: Bad page state in process (current git) [not found] ` <20101115123846.GA30047@arch.trippelsdorf.de> @ 2010-11-15 19:54 ` Markus Trippelsdorf 2010-11-15 21:23 ` Hugh Dickins 0 siblings, 1 reply; 4+ messages in thread From: Markus Trippelsdorf @ 2010-11-15 19:54 UTC (permalink / raw) To: Christoph Lameter; +Cc: linux-mm, linux-kernel, Theodore Ts'o, linux-ext4 On 2010.11.15 at 13:38 +0100, Markus Trippelsdorf wrote: > On 2010.11.12 at 13:20 +0100, Markus Trippelsdorf wrote: > > > > Yes. Fortunately the BUG is gone since I pulled the upcoming drm fixes > > No. I happend again today (with those fixes already applied): > > BUG: Bad page state in process knode pfn:7f0a8 > page:ffffea0001bca4c0 count:0 mapcount:0 mapping: (null) index:0x0 > page flags: 0x4000000000000008(uptodate) > Pid: 18310, comm: knode Not tainted 2.6.37-rc1-00549-gae712bf-dirty #16 > Call Trace: > [<ffffffff810a9022>] ? bad_page+0x92/0xe0 > [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 > [<ffffffff8102e50e>] ? apic_timer_interrupt+0xe/0x20 > [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 > [<ffffffff810a2dd4>] ? file_read_actor+0xc4/0x190 > [<ffffffff810a4a70>] ? generic_file_aio_read+0x560/0x6b0 > [<ffffffff810bdf8d>] ? handle_mm_fault+0x6bd/0x970 > [<ffffffff8104b1d0>] ? do_page_fault+0x120/0x410 > [<ffffffff810c3d85>] ? do_brk+0x275/0x360 > [<ffffffff81452d8f>] ? page_fault+0x1f/0x30 > Disabling lock debugging due to kernel taint And another one. But this time it seems to point to ext4: BUG: Bad page state in process rm pfn:52e54 page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 page flags: 0x4000000000000008(uptodate) Pid: 2084, comm: rm Not tainted 2.6.37-rc1-00549-gae712bf-dirty #23 Call Trace: [<ffffffff810a9022>] ? bad_page+0x92/0xe0 [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 [<ffffffff81142ae6>] ? ext4_ext_put_in_cache+0x46/0x90 [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 [<ffffffff8118f0c7>] ? number.clone.2+0x2b7/0x2f0 [<ffffffff810a38d5>] ? find_get_page+0x75/0xb0 [<ffffffff810a4011>] ? find_or_create_page+0x51/0xb0 [<ffffffff810ff4d7>] ? __getblk+0xd7/0x260 [<ffffffff8113158f>] ? ext4_getblk+0x8f/0x1e0 [<ffffffff811316ed>] ? ext4_bread+0xd/0x70 [<ffffffff811369f4>] ? htree_dirblock_to_tree+0x34/0x190 [<ffffffff8113870f>] ? ext4_htree_fill_tree+0x9f/0x250 [<ffffffff810e109d>] ? do_filp_open+0x12d/0x5e0 [<ffffffff811289ed>] ? ext4_readdir+0x14d/0x5a0 [<ffffffff810e4e80>] ? filldir+0x0/0xd0 [<ffffffff810e50a8>] ? vfs_readdir+0xa8/0xd0 [<ffffffff810e4e80>] ? filldir+0x0/0xd0 [<ffffffff810e51b1>] ? sys_getdents+0x81/0xf0 [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b Disabling lock debugging due to kernel taint I don't know. Could a possible bug in linux/fs/ext4/page-io.c be responsible for something like this? -- Markus ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: BUG: Bad page state in process (current git) 2010-11-15 19:54 ` BUG: Bad page state in process (current git) Markus Trippelsdorf @ 2010-11-15 21:23 ` Hugh Dickins 2010-11-16 11:13 ` [PATCH] ext4 Fix setting random pages PageUptodate Markus Trippelsdorf 0 siblings, 1 reply; 4+ messages in thread From: Hugh Dickins @ 2010-11-15 21:23 UTC (permalink / raw) To: Markus Trippelsdorf Cc: Christoph Lameter, linux-mm, linux-kernel, Theodore Ts'o, linux-ext4 On Mon, Nov 15, 2010 at 11:54 AM, Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > On 2010.11.15 at 13:38 +0100, Markus Trippelsdorf wrote: >> On 2010.11.12 at 13:20 +0100, Markus Trippelsdorf wrote: >> > >> > Yes. Fortunately the BUG is gone since I pulled the upcoming drm fixes >> >> No. I happend again today (with those fixes already applied): >> >> BUG: Bad page state in process knode pfn:7f0a8 >> page:ffffea0001bca4c0 count:0 mapcount:0 mapping: (null) index:0x0 >> page flags: 0x4000000000000008(uptodate) >> Pid: 18310, comm: knode Not tainted 2.6.37-rc1-00549-gae712bf-dirty #16 >> Call Trace: >> [<ffffffff810a9022>] ? bad_page+0x92/0xe0 >> [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 >> [<ffffffff8102e50e>] ? apic_timer_interrupt+0xe/0x20 >> [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 >> [<ffffffff810a2dd4>] ? file_read_actor+0xc4/0x190 >> [<ffffffff810a4a70>] ? generic_file_aio_read+0x560/0x6b0 >> [<ffffffff810bdf8d>] ? handle_mm_fault+0x6bd/0x970 >> [<ffffffff8104b1d0>] ? do_page_fault+0x120/0x410 >> [<ffffffff810c3d85>] ? do_brk+0x275/0x360 >> [<ffffffff81452d8f>] ? page_fault+0x1f/0x30 >> Disabling lock debugging due to kernel taint > > And another one. But this time it seems to point to ext4: > > BUG: Bad page state in process rm pfn:52e54 > page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 > page flags: 0x4000000000000008(uptodate) > Pid: 2084, comm: rm Not tainted 2.6.37-rc1-00549-gae712bf-dirty #23 > Call Trace: > [<ffffffff810a9022>] ? bad_page+0x92/0xe0 > [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 > [<ffffffff81142ae6>] ? ext4_ext_put_in_cache+0x46/0x90 > [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 > [<ffffffff8118f0c7>] ? number.clone.2+0x2b7/0x2f0 > [<ffffffff810a38d5>] ? find_get_page+0x75/0xb0 > [<ffffffff810a4011>] ? find_or_create_page+0x51/0xb0 > [<ffffffff810ff4d7>] ? __getblk+0xd7/0x260 > [<ffffffff8113158f>] ? ext4_getblk+0x8f/0x1e0 > [<ffffffff811316ed>] ? ext4_bread+0xd/0x70 > [<ffffffff811369f4>] ? htree_dirblock_to_tree+0x34/0x190 > [<ffffffff8113870f>] ? ext4_htree_fill_tree+0x9f/0x250 > [<ffffffff810e109d>] ? do_filp_open+0x12d/0x5e0 > [<ffffffff811289ed>] ? ext4_readdir+0x14d/0x5a0 > [<ffffffff810e4e80>] ? filldir+0x0/0xd0 > [<ffffffff810e50a8>] ? vfs_readdir+0xa8/0xd0 > [<ffffffff810e4e80>] ? filldir+0x0/0xd0 > [<ffffffff810e51b1>] ? sys_getdents+0x81/0xf0 > [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b > Disabling lock debugging due to kernel taint > > I don't know. Could a possible bug in linux/fs/ext4/page-io.c be > responsible for something like this? I do think you're right: every one of your "Bad page state" reports has been complaining only about the PageUptodate bit being set, and that SetPageUpdate() in ext4_end_bio() does look suspicious, coming after the put_page(). The more suspicious given that other races have been noticed in precisely that area, and fixed with put_io_page() in the current git tree. Perhaps that fixes your problem, but my guess would be not: I suspect the "if (!partial_write) SetPageUpdate(page);" should be done before the block (or put_io_page) which does the put_page(). Hugh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] ext4 Fix setting random pages PageUptodate 2010-11-15 21:23 ` Hugh Dickins @ 2010-11-16 11:13 ` Markus Trippelsdorf 2010-11-16 20:38 ` Ted Ts'o 0 siblings, 1 reply; 4+ messages in thread From: Markus Trippelsdorf @ 2010-11-16 11:13 UTC (permalink / raw) To: Hugh Dickins Cc: Christoph Lameter, linux-mm, linux-kernel, Theodore Ts'o, linux-ext4 On 2010.11.15 at 13:23 -0800, Hugh Dickins wrote: > On Mon, Nov 15, 2010 at 11:54 AM, Markus Trippelsdorf > <markus@trippelsdorf.de> wrote: > > On 2010.11.15 at 13:38 +0100, Markus Trippelsdorf wrote: > >> On 2010.11.12 at 13:20 +0100, Markus Trippelsdorf wrote: > >> > > >> > Yes. Fortunately the BUG is gone since I pulled the upcoming drm fixes > >> > >> No. I happend again today (with those fixes already applied): > >> > >> BUG: Bad page state in process knode pfn:7f0a8 > >> page:ffffea0001bca4c0 count:0 mapcount:0 mapping: (null) index:0x0 > >> page flags: 0x4000000000000008(uptodate) > >> Pid: 18310, comm: knode Not tainted 2.6.37-rc1-00549-gae712bf-dirty #16 > >> Call Trace: > >> [<ffffffff810a9022>] ? bad_page+0x92/0xe0 > >> [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 > >> [<ffffffff8102e50e>] ? apic_timer_interrupt+0xe/0x20 > >> [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 > >> [<ffffffff810a2dd4>] ? file_read_actor+0xc4/0x190 > >> [<ffffffff810a4a70>] ? generic_file_aio_read+0x560/0x6b0 > >> [<ffffffff810bdf8d>] ? handle_mm_fault+0x6bd/0x970 > >> [<ffffffff8104b1d0>] ? do_page_fault+0x120/0x410 > >> [<ffffffff810c3d85>] ? do_brk+0x275/0x360 > >> [<ffffffff81452d8f>] ? page_fault+0x1f/0x30 > >> Disabling lock debugging due to kernel taint > > > > And another one. But this time it seems to point to ext4: > > > > BUG: Bad page state in process rm pfn:52e54 > > page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 > > page flags: 0x4000000000000008(uptodate) > > Pid: 2084, comm: rm Not tainted 2.6.37-rc1-00549-gae712bf-dirty #23 > > Call Trace: > > [<ffffffff810a9022>] ? bad_page+0x92/0xe0 > > [<ffffffff810aa240>] ? get_page_from_freelist+0x4b0/0x570 > > [<ffffffff81142ae6>] ? ext4_ext_put_in_cache+0x46/0x90 > > [<ffffffff810aa413>] ? __alloc_pages_nodemask+0x113/0x6b0 > > [<ffffffff8118f0c7>] ? number.clone.2+0x2b7/0x2f0 > > [<ffffffff810a38d5>] ? find_get_page+0x75/0xb0 > > [<ffffffff810a4011>] ? find_or_create_page+0x51/0xb0 > > [<ffffffff810ff4d7>] ? __getblk+0xd7/0x260 > > [<ffffffff8113158f>] ? ext4_getblk+0x8f/0x1e0 > > [<ffffffff811316ed>] ? ext4_bread+0xd/0x70 > > [<ffffffff811369f4>] ? htree_dirblock_to_tree+0x34/0x190 > > [<ffffffff8113870f>] ? ext4_htree_fill_tree+0x9f/0x250 > > [<ffffffff810e109d>] ? do_filp_open+0x12d/0x5e0 > > [<ffffffff811289ed>] ? ext4_readdir+0x14d/0x5a0 > > [<ffffffff810e4e80>] ? filldir+0x0/0xd0 > > [<ffffffff810e50a8>] ? vfs_readdir+0xa8/0xd0 > > [<ffffffff810e4e80>] ? filldir+0x0/0xd0 > > [<ffffffff810e51b1>] ? sys_getdents+0x81/0xf0 > > [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b > > Disabling lock debugging due to kernel taint > > > > I don't know. Could a possible bug in linux/fs/ext4/page-io.c be > > responsible for something like this? > > I do think you're right: every one of your "Bad page state" reports > has been complaining only about the PageUptodate bit being set, and > that SetPageUpdate() in ext4_end_bio() does look suspicious, coming > after the put_page(). > > The more suspicious given that other races have been noticed in > precisely that area, and fixed with put_io_page() in the current git > tree. > > Perhaps that fixes your problem, but my guess would be not: I suspect > the "if (!partial_write) SetPageUpdate(page);" should be done before > the block (or put_io_page) which does the put_page(). ext4_end_bio calls put_page and kmem_cache_free before calling SetPageUpdate(). This can result in setting the PageUptodate bit on random pages and causes the following BUG: BUG: Bad page state in process rm pfn:52e54 page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 arch kernel: page flags: 0x4000000000000008(uptodate) Fix the problem by moving put_io_page() after the SetPageUpdate() call. Thanks to Hugh Dickins for analyzing this problem. Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> --- fs/ext4/page-io.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index 7f5451c..beacce1 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -237,8 +237,6 @@ static void ext4_end_bio(struct bio *bio, int error) } while (bh != head); } - put_io_page(io_end->pages[i]); ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] ext4 Fix setting random pages PageUptodate 2010-11-16 11:13 ` [PATCH] ext4 Fix setting random pages PageUptodate Markus Trippelsdorf @ 2010-11-16 20:38 ` Ted Ts'o 0 siblings, 0 replies; 4+ messages in thread From: Ted Ts'o @ 2010-11-16 20:38 UTC (permalink / raw) To: Markus Trippelsdorf Cc: Hugh Dickins, Christoph Lameter, linux-mm, linux-kernel, linux-ext4 On Tue, Nov 16, 2010 at 12:13:39PM +0100, Markus Trippelsdorf wrote: > ext4_end_bio calls put_page and kmem_cache_free before calling > SetPageUpdate(). This can result in setting the PageUptodate bit on > random pages and causes the following BUG: Oops. Thanks muchly to you and to Hugh for discovering this. I've added this to the ext4 patch queue and will be pushing to Linus this week, for -rc3. - Ted -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-11-16 20:38 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20101110152519.GA1626@arch.trippelsdorf.de> [not found] ` <20101110154057.GA2191@arch.trippelsdorf.de> [not found] ` <alpine.DEB.2.00.1011101534370.30164@router.home> [not found] ` <20101112122003.GA1572@arch.trippelsdorf.de> [not found] ` <20101115123846.GA30047@arch.trippelsdorf.de> 2010-11-15 19:54 ` BUG: Bad page state in process (current git) Markus Trippelsdorf 2010-11-15 21:23 ` Hugh Dickins 2010-11-16 11:13 ` [PATCH] ext4 Fix setting random pages PageUptodate Markus Trippelsdorf 2010-11-16 20:38 ` Ted Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).