* 3.15-rc8 mm/filemap.c:202 BUG @ 2014-06-03 4:21 Dave Jones 2014-06-03 9:11 ` Konstantin Khlebnikov 0 siblings, 1 reply; 10+ messages in thread From: Dave Jones @ 2014-06-03 4:21 UTC (permalink / raw) To: linux-mm; +Cc: Linux Kernel, Linus Torvalds I'm still seeing this one from time to time, though it takes me quite a while to hit it, despite my attempts at trying to narrow down the set of syscalls that cause it. kernel BUG at mm/filemap.c:202! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225 task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000 RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 RSP: 0018:ffff880055963b90 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388 RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17 RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530 R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880 R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8 FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0 DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568 0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000 0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae Call Trace: [<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70 [<ffffffffac16921b>] truncate_inode_page+0x5b/0x90 [<ffffffffac174493>] shmem_undo_range+0x363/0x790 [<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30 [<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340 [<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0 [<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0 [<ffffffffac182260>] SyS_madvise+0x3c0/0x870 [<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffffac74c41f>] tracesys+0xdd/0xe2 Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 There was also another variant of the same BUG with a slighty different stack trace. kernel BUG at mm/filemap.c:202! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208 task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000 RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320 RSP: 0018:ffff880186147b18 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002 RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0 RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668 R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0 R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678 FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0 DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0 000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000 0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d Call Trace: [<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70 [<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90 [<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780 [<ffffffff84176394>] shmem_truncate_range+0x14/0x30 [<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150 [<ffffffff841e4b17>] evict+0xa7/0x170 [<ffffffff841e5435>] iput+0xf5/0x180 [<ffffffff841df8a0>] dentry_kill+0x260/0x2d0 [<ffffffff841df97c>] dput+0x6c/0x110 [<ffffffff841c92a9>] __fput+0x189/0x200 [<ffffffff841c936e>] ____fput+0xe/0x10 [<ffffffff84090484>] task_work_run+0xb4/0xe0 [<ffffffff8406ee42>] do_exit+0x302/0xb80 [<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8407073c>] do_group_exit+0x4c/0xc0 [<ffffffff840707c4>] SyS_exit_group+0x14/0x20 [<ffffffff8475bf64>] tracesys+0xdd/0xe2 Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-03 4:21 3.15-rc8 mm/filemap.c:202 BUG Dave Jones @ 2014-06-03 9:11 ` Konstantin Khlebnikov 2014-06-03 23:11 ` Hugh Dickins 0 siblings, 1 reply; 10+ messages in thread From: Konstantin Khlebnikov @ 2014-06-03 9:11 UTC (permalink / raw) To: Dave Jones, linux-mm@kvack.org, Linux Kernel, Linus Torvalds, Hugh Dickins, Sasha Levin On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones <davej@redhat.com> wrote: > I'm still seeing this one from time to time, though it takes me quite a while to hit it, > despite my attempts at trying to narrow down the set of syscalls that cause it. > > kernel BUG at mm/filemap.c:202! > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225 > task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000 > RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > RSP: 0018:ffff880055963b90 EFLAGS: 00010046 > RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388 > RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17 > RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530 > R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880 > R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8 > FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0 > DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > Stack: > ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568 > 0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000 > 0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae > Call Trace: > [<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70 > [<ffffffffac16921b>] truncate_inode_page+0x5b/0x90 > [<ffffffffac174493>] shmem_undo_range+0x363/0x790 > [<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30 > [<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340 > [<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0 > [<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0 > [<ffffffffac182260>] SyS_madvise+0x3c0/0x870 > [<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20 > [<ffffffffac74c41f>] tracesys+0xdd/0xe2 > Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff > RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > > There was also another variant of the same BUG with a slighty different stack trace. > > kernel BUG at mm/filemap.c:202! > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208 > task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000 > RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320 > RSP: 0018:ffff880186147b18 EFLAGS: 00010046 > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002 > RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0 > RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668 > R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0 > R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678 > FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0 > DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > Stack: > ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0 > 000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000 > 0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d > Call Trace: > [<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70 > [<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90 > [<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780 > [<ffffffff84176394>] shmem_truncate_range+0x14/0x30 > [<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150 > [<ffffffff841e4b17>] evict+0xa7/0x170 > [<ffffffff841e5435>] iput+0xf5/0x180 > [<ffffffff841df8a0>] dentry_kill+0x260/0x2d0 > [<ffffffff841df97c>] dput+0x6c/0x110 > [<ffffffff841c92a9>] __fput+0x189/0x200 > [<ffffffff841c936e>] ____fput+0xe/0x10 > [<ffffffff84090484>] task_work_run+0xb4/0xe0 > [<ffffffff8406ee42>] do_exit+0x302/0xb80 > [<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20 > [<ffffffff8407073c>] do_group_exit+0x4c/0xc0 > [<ffffffff840707c4>] SyS_exit_group+0x14/0x20 > [<ffffffff8475bf64>] tracesys+0xdd/0xe2 > Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 > > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> This might shine some light, CONFIG_DEBUG_VM should be =y. --- a/mm/filemap.c +++ b/mm/filemap.c @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page, void *shadow) __dec_zone_page_state(page, NR_FILE_PAGES); if (PageSwapBacked(page)) __dec_zone_page_state(page, NR_SHMEM); - BUG_ON(page_mapped(page)); + VM_BUG_ON_PAGE(page_mapped(page), page); /* * Some filesystems seem to re-dirty the page even after Hugh, As I see shmem truncate/punch hole might race with shmem_getpage_gfp() (when it converts swap-entries into normal pages) and leave pages in truncated area. Am I right? Currently I don't see how exactly this could lead to this problem, but this looks suspicious. I don't like the way in which truncate silently skips page entries when they are changing under it. Completely untested patch follows. --- a/mm/shmem.c +++ b/mm/shmem.c @@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend, if (radix_tree_exceptional_entry(page)) { if (unfalloc) continue; - nr_swaps_freed += !shmem_free_swap(mapping, - index, page); + if (shmem_free_swap(mapping, index, page)) + goto retry; + nr_swaps_freed++; continue; } @@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend, } unlock_page(page); } + index++; +retry: pagevec_remove_exceptionals(&pvec); pagevec_release(&pvec); mem_cgroup_uncharge_end(); - index++; } spin_lock(&info->lock); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-03 9:11 ` Konstantin Khlebnikov @ 2014-06-03 23:11 ` Hugh Dickins 2014-06-03 23:41 ` Dave Jones 2014-06-04 12:33 ` Sasha Levin 0 siblings, 2 replies; 10+ messages in thread From: Hugh Dickins @ 2014-06-03 23:11 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Dave Jones, linux-mm@kvack.org, Linux Kernel, Linus Torvalds, Andrew Morton, Sasha Levin On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote: > On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones <davej@redhat.com> wrote: > > I'm still seeing this one from time to time, though it takes me quite a while to hit it, > > despite my attempts at trying to narrow down the set of syscalls that cause it. > > > > kernel BUG at mm/filemap.c:202! > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225 > > task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000 > > RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > > RSP: 0018:ffff880055963b90 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388 > > RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17 > > RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530 > > R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880 > > R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8 > > FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0 > > DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > > Stack: > > ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568 > > 0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000 > > 0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae > > Call Trace: > > [<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70 > > [<ffffffffac16921b>] truncate_inode_page+0x5b/0x90 > > [<ffffffffac174493>] shmem_undo_range+0x363/0x790 > > [<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30 > > [<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340 > > [<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0 > > [<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0 > > [<ffffffffac182260>] SyS_madvise+0x3c0/0x870 > > [<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20 > > [<ffffffffac74c41f>] tracesys+0xdd/0xe2 > > Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff > > RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > > > > There was also another variant of the same BUG with a slighty different stack trace. > > > > kernel BUG at mm/filemap.c:202! > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208 > > task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000 > > RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320 > > RSP: 0018:ffff880186147b18 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002 > > RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0 > > RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668 > > R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0 > > R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678 > > FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0 > > DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > > Stack: > > ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0 > > 000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000 > > 0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d > > Call Trace: > > [<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70 > > [<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90 > > [<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780 > > [<ffffffff84176394>] shmem_truncate_range+0x14/0x30 > > [<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150 > > [<ffffffff841e4b17>] evict+0xa7/0x170 > > [<ffffffff841e5435>] iput+0xf5/0x180 > > [<ffffffff841df8a0>] dentry_kill+0x260/0x2d0 > > [<ffffffff841df97c>] dput+0x6c/0x110 > > [<ffffffff841c92a9>] __fput+0x189/0x200 > > [<ffffffff841c936e>] ____fput+0xe/0x10 > > [<ffffffff84090484>] task_work_run+0xb4/0xe0 > > [<ffffffff8406ee42>] do_exit+0x302/0xb80 > > [<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20 > > [<ffffffff8407073c>] do_group_exit+0x4c/0xc0 > > [<ffffffff840707c4>] SyS_exit_group+0x14/0x20 > > [<ffffffff8475bf64>] tracesys+0xdd/0xe2 > > Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 > > > > > > -- > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > > the body to majordomo@kvack.org. For more info on Linux MM, > > see: http://www.linux-mm.org/ . > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > > This might shine some light, CONFIG_DEBUG_VM should be =y. > > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page, > void *shadow) > __dec_zone_page_state(page, NR_FILE_PAGES); > if (PageSwapBacked(page)) > __dec_zone_page_state(page, NR_SHMEM); > - BUG_ON(page_mapped(page)); > + VM_BUG_ON_PAGE(page_mapped(page), page); > > /* > * Some filesystems seem to re-dirty the page even after Yes, there's a chance that will tell us more (but I don't have high hopes of it). I'm still stumped by this issue, just as before. Sasha (or Dave), any update on whether you see this without THP? and whether you see the remove_migration_pte oops without THP? (I do have my own idea of the answers to those questions, but trying not to prejudice your answers.) > Hugh, As I see shmem truncate/punch hole might race with > shmem_getpage_gfp() (when it converts swap-entries into normal pages) > and leave pages in truncated area. Am I right? Certainly there can be that race, but (correct me if I'm wrong, of course) shmem_undo_range() already allows for it, and does not need your "goto retry" patch below. Observe how that second loop is a "for ( ; ; )" loop (like the one in truncate_inode_pages_range() that it was cloned from). The loop continues around and around until it has squeezed everything out of the hole or EOF, page or swap entry. So, yes, what looked like a swap entry just before we called shmem_free_swap(), may be a page pointer by the time we get to try to delete it from the radix_tree; but then a later find_get_entries() (after index is reset to start) will pick up the page (or swap entry if it's gone back) and deal with it. The retry is already built in. > Currently I don't see how exactly this could lead to this problem, but > this looks suspicious. page<->swap is definitely fertile ground for you to ponder on; though I think myself that the answer to the page_mapped() BUG will lie in a different direction. Mind you, I've probably given too little weight to the fact that every stacktrace shown has been a shmem one: originally I assumed that just reflected trinity running its tests on a tmpfs, now I wonder: Dave, Sasha, are you running similar tests on tmpfs and other filesystems, and find this only in the tmpfs case? Hugh > I don't like the way in which truncate silently skips page entries > when they are changing under it. > Completely untested patch follows. > > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode, > loff_t lstart, loff_t lend, > if (radix_tree_exceptional_entry(page)) { > if (unfalloc) > continue; > - nr_swaps_freed += !shmem_free_swap(mapping, > - index, page); > + if (shmem_free_swap(mapping, index, page)) > + goto retry; > + nr_swaps_freed++; > continue; > } > > @@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode > *inode, loff_t lstart, loff_t lend, > } > unlock_page(page); > } > + index++; > +retry: > pagevec_remove_exceptionals(&pvec); > pagevec_release(&pvec); > mem_cgroup_uncharge_end(); > - index++; > } > > spin_lock(&info->lock); > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-03 23:11 ` Hugh Dickins @ 2014-06-03 23:41 ` Dave Jones 2014-06-04 12:33 ` Sasha Levin 1 sibling, 0 replies; 10+ messages in thread From: Dave Jones @ 2014-06-03 23:41 UTC (permalink / raw) To: Hugh Dickins Cc: Konstantin Khlebnikov, linux-mm@kvack.org, Linux Kernel, Linus Torvalds, Andrew Morton, Sasha Levin On Tue, Jun 03, 2014 at 04:11:43PM -0700, Hugh Dickins wrote: > > - BUG_ON(page_mapped(page)); > > + VM_BUG_ON_PAGE(page_mapped(page), page); > > > > /* > > * Some filesystems seem to re-dirty the page even after > > Yes, there's a chance that will tell us more (but I don't have high > hopes of it). I'm still stumped by this issue, just as before. running with that applied now. > Sasha (or Dave), any update on whether you see this without THP? > and whether you see the remove_migration_pte oops without THP? haven't tried yet. I wish I had a better reproducer, because it can take up to a day to show up, and if disabling THP makes it go away, it's hard to judge if that's the case, or if I haven't been running long enough.. Sort of a Schrodinger's BUG_ON. After I get a trace with the above patch applied, I'll give it a shot though, just to see what happens. > Mind you, I've probably given too little weight to the fact that every > stacktrace shown has been a shmem one: originally I assumed that just > reflected trinity running its tests on a tmpfs, now I wonder: Dave, > Sasha, are you running similar tests on tmpfs and other filesystems, > and find this only in the tmpfs case? In my case, there's a tmpfs mounted, but it's extremely unlikely that trinity walked into it. Perhaps I should try that, to see if it happens faster. > > I don't like the way in which truncate silently skips page entries > > when they are changing under it. > > Completely untested patch follows. > > > > --- a/mm/shmem.c > > +++ b/mm/shmem.c > > @@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode, > > loff_t lstart, loff_t lend, > > if (radix_tree_exceptional_entry(page)) { > > if (unfalloc) > > continue; > > - nr_swaps_freed += !shmem_free_swap(mapping, > > - index, page); > > + if (shmem_free_swap(mapping, index, page)) > > + goto retry; > > + nr_swaps_freed++; > > continue; > > } > > > > @@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode > > *inode, loff_t lstart, loff_t lend, > > } > > unlock_page(page); > > } > > + index++; > > +retry: > > pagevec_remove_exceptionals(&pvec); > > pagevec_release(&pvec); > > mem_cgroup_uncharge_end(); > > - index++; > > } I'll add this to the queue of things to test, but that queue is now about two days deep already :) Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-03 23:11 ` Hugh Dickins 2014-06-03 23:41 ` Dave Jones @ 2014-06-04 12:33 ` Sasha Levin 2014-06-06 23:05 ` Hugh Dickins 1 sibling, 1 reply; 10+ messages in thread From: Sasha Levin @ 2014-06-04 12:33 UTC (permalink / raw) To: Hugh Dickins, Konstantin Khlebnikov Cc: Dave Jones, linux-mm@kvack.org, Linux Kernel, Linus Torvalds, Andrew Morton On 06/03/2014 07:11 PM, Hugh Dickins wrote: > On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote: >> > On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones <davej@redhat.com> wrote: >>> > > I'm still seeing this one from time to time, though it takes me quite a while to hit it, >>> > > despite my attempts at trying to narrow down the set of syscalls that cause it. >>> > > >>> > > kernel BUG at mm/filemap.c:202! >>> > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC >>> > > CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225 >>> > > task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000 >>> > > RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 >>> > > RSP: 0018:ffff880055963b90 EFLAGS: 00010046 >>> > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388 >>> > > RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17 >>> > > RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530 >>> > > R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880 >>> > > R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8 >>> > > FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000 >>> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> > > CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0 >>> > > DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000 >>> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 >>> > > Stack: >>> > > ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568 >>> > > 0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000 >>> > > 0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae >>> > > Call Trace: >>> > > [<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70 >>> > > [<ffffffffac16921b>] truncate_inode_page+0x5b/0x90 >>> > > [<ffffffffac174493>] shmem_undo_range+0x363/0x790 >>> > > [<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30 >>> > > [<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340 >>> > > [<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0 >>> > > [<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0 >>> > > [<ffffffffac182260>] SyS_madvise+0x3c0/0x870 >>> > > [<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20 >>> > > [<ffffffffac74c41f>] tracesys+0xdd/0xe2 >>> > > Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff >>> > > RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 >>> > > >>> > > There was also another variant of the same BUG with a slighty different stack trace. >>> > > >>> > > kernel BUG at mm/filemap.c:202! >>> > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC >>> > > CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208 >>> > > task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000 >>> > > RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320 >>> > > RSP: 0018:ffff880186147b18 EFLAGS: 00010046 >>> > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002 >>> > > RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0 >>> > > RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668 >>> > > R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0 >>> > > R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678 >>> > > FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 >>> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> > > CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0 >>> > > DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000 >>> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 >>> > > Stack: >>> > > ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0 >>> > > 000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000 >>> > > 0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d >>> > > Call Trace: >>> > > [<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70 >>> > > [<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90 >>> > > [<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780 >>> > > [<ffffffff84176394>] shmem_truncate_range+0x14/0x30 >>> > > [<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150 >>> > > [<ffffffff841e4b17>] evict+0xa7/0x170 >>> > > [<ffffffff841e5435>] iput+0xf5/0x180 >>> > > [<ffffffff841df8a0>] dentry_kill+0x260/0x2d0 >>> > > [<ffffffff841df97c>] dput+0x6c/0x110 >>> > > [<ffffffff841c92a9>] __fput+0x189/0x200 >>> > > [<ffffffff841c936e>] ____fput+0xe/0x10 >>> > > [<ffffffff84090484>] task_work_run+0xb4/0xe0 >>> > > [<ffffffff8406ee42>] do_exit+0x302/0xb80 >>> > > [<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20 >>> > > [<ffffffff8407073c>] do_group_exit+0x4c/0xc0 >>> > > [<ffffffff840707c4>] SyS_exit_group+0x14/0x20 >>> > > [<ffffffff8475bf64>] tracesys+0xdd/0xe2 >>> > > Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 >>> > > >>> > > >>> > > -- >>> > > To unsubscribe, send a message with 'unsubscribe linux-mm' in >>> > > the body to majordomo@kvack.org. For more info on Linux MM, >>> > > see: http://www.linux-mm.org/ . >>> > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> >> > >> > This might shine some light, CONFIG_DEBUG_VM should be =y. >> > >> > --- a/mm/filemap.c >> > +++ b/mm/filemap.c >> > @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page, >> > void *shadow) >> > __dec_zone_page_state(page, NR_FILE_PAGES); >> > if (PageSwapBacked(page)) >> > __dec_zone_page_state(page, NR_SHMEM); >> > - BUG_ON(page_mapped(page)); >> > + VM_BUG_ON_PAGE(page_mapped(page), page); >> > >> > /* >> > * Some filesystems seem to re-dirty the page even after > Yes, there's a chance that will tell us more (but I don't have high > hopes of it). I'm still stumped by this issue, just as before. > > Sasha (or Dave), any update on whether you see this without THP? > and whether you see the remove_migration_pte oops without THP? I'm pretty sure at this point that I only see both with THP enabled. I've started seeing much less of them during fuzzing. Timing changes? Thanks, Sasha -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-04 12:33 ` Sasha Levin @ 2014-06-06 23:05 ` Hugh Dickins 2014-06-06 23:16 ` Linus Torvalds 0 siblings, 1 reply; 10+ messages in thread From: Hugh Dickins @ 2014-06-06 23:05 UTC (permalink / raw) To: Sasha Levin Cc: Linus Torvalds, Andrew Morton, Kirill A. Shutemov, Konstantin Khlebnikov, Dave Jones, linux-mm@kvack.org, Linux Kernel On Wed, 4 Jun 2014, Sasha Levin wrote: > On 06/03/2014 07:11 PM, Hugh Dickins wrote: > > On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote: > >> > On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones <davej@redhat.com> wrote: > >>> > > I'm still seeing this one from time to time, though it takes me quite a while to hit it, > >>> > > despite my attempts at trying to narrow down the set of syscalls that cause it. > >>> > > > >>> > > kernel BUG at mm/filemap.c:202! > >>> > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > >>> > > CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225 > >>> > > task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000 > >>> > > RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > >>> > > RSP: 0018:ffff880055963b90 EFLAGS: 00010046 > >>> > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388 > >>> > > RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17 > >>> > > RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530 > >>> > > R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880 > >>> > > R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8 > >>> > > FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000 > >>> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>> > > CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0 > >>> > > DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000 > >>> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > >>> > > Stack: > >>> > > ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568 > >>> > > 0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000 > >>> > > 0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae > >>> > > Call Trace: > >>> > > [<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70 > >>> > > [<ffffffffac16921b>] truncate_inode_page+0x5b/0x90 > >>> > > [<ffffffffac174493>] shmem_undo_range+0x363/0x790 > >>> > > [<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30 > >>> > > [<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340 > >>> > > [<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0 > >>> > > [<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0 > >>> > > [<ffffffffac182260>] SyS_madvise+0x3c0/0x870 > >>> > > [<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20 > >>> > > [<ffffffffac74c41f>] tracesys+0xdd/0xe2 > >>> > > Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff > >>> > > RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360 > >>> > > > >>> > > There was also another variant of the same BUG with a slighty different stack trace. > >>> > > > >>> > > kernel BUG at mm/filemap.c:202! > >>> > > invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > >>> > > CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208 > >>> > > task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000 > >>> > > RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320 > >>> > > RSP: 0018:ffff880186147b18 EFLAGS: 00010046 > >>> > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002 > >>> > > RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0 > >>> > > RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668 > >>> > > R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0 > >>> > > R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678 > >>> > > FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 > >>> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>> > > CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0 > >>> > > DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000 > >>> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > >>> > > Stack: > >>> > > ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0 > >>> > > 000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000 > >>> > > 0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d > >>> > > Call Trace: > >>> > > [<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70 > >>> > > [<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90 > >>> > > [<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780 > >>> > > [<ffffffff84176394>] shmem_truncate_range+0x14/0x30 > >>> > > [<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150 > >>> > > [<ffffffff841e4b17>] evict+0xa7/0x170 > >>> > > [<ffffffff841e5435>] iput+0xf5/0x180 > >>> > > [<ffffffff841df8a0>] dentry_kill+0x260/0x2d0 > >>> > > [<ffffffff841df97c>] dput+0x6c/0x110 > >>> > > [<ffffffff841c92a9>] __fput+0x189/0x200 > >>> > > [<ffffffff841c936e>] ____fput+0xe/0x10 > >>> > > [<ffffffff84090484>] task_work_run+0xb4/0xe0 > >>> > > [<ffffffff8406ee42>] do_exit+0x302/0xb80 > >>> > > [<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20 > >>> > > [<ffffffff8407073c>] do_group_exit+0x4c/0xc0 > >>> > > [<ffffffff840707c4>] SyS_exit_group+0x14/0x20 > >>> > > [<ffffffff8475bf64>] tracesys+0xdd/0xe2 > >>> > > Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 > >>> > > > >>> > > > >>> > > -- > >>> > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > >>> > > the body to majordomo@kvack.org. For more info on Linux MM, > >>> > > see: http://www.linux-mm.org/ . > >>> > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > >> > > >> > This might shine some light, CONFIG_DEBUG_VM should be =y. > >> > > >> > --- a/mm/filemap.c > >> > +++ b/mm/filemap.c > >> > @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page, > >> > void *shadow) > >> > __dec_zone_page_state(page, NR_FILE_PAGES); > >> > if (PageSwapBacked(page)) > >> > __dec_zone_page_state(page, NR_SHMEM); > >> > - BUG_ON(page_mapped(page)); > >> > + VM_BUG_ON_PAGE(page_mapped(page), page); > >> > > >> > /* > >> > * Some filesystems seem to re-dirty the page even after > > Yes, there's a chance that will tell us more (but I don't have high > > hopes of it). I'm still stumped by this issue, just as before. > > > > Sasha (or Dave), any update on whether you see this without THP? > > and whether you see the remove_migration_pte oops without THP? > > I'm pretty sure at this point that I only see both with THP enabled. Thanks for the hint, though I've made nothing of it. Of course, some problems come from THP itself, and some from its pressure for migration. Though I'd wanted to see the remove_migration_pte oops as a key to the page_mapped bug, my guess is that they're actually independent. I might have a potential fix for the remove_migration_pte one, I just want to go back and look at my logic again, will reply in that thread if I'm still convinced. But a couple of hours ago had a thought on the page_mapped() bug, and would like to propose a patch which could be the answer to that - though frankly I remain pessimistic. See below. > > I've started seeing much less of them during fuzzing. Timing changes? Strange; I've no idea. Anyway, here's today's thought... [PATCH] mm: entry = ACCESS_ONCE(*pte) in handle_pte_fault Use ACCESS_ONCE() in handle_pte_fault() when getting the entry or orig_pte upon which all subsequent decisions and pte_same() tests will be made. I have no evidence that its lack is responsible for the mm/filemap.c:202 BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity, and I am not optimistic that it will fix it. But I have found no other explanation, and ACCESS_ONCE() here will surely not hurt. If gcc does re-access the pte before passing it down, then that would be disastrous for correct page fault handling, and certainly could explain the page_mapped() BUGs seen (concurrent fault causing page to be mapped in a second time on top of itself: mapcount 2 for a single pte). Signed-off-by: Hugh Dickins <hughd@google.com> --- mm/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- 3.15-rc8/mm/memory.c 2014-04-27 23:55:53.608801152 -0700 +++ linux/mm/memory.c 2014-06-06 15:02:35.044320183 -0700 @@ -3810,7 +3810,7 @@ static int handle_pte_fault(struct mm_st pte_t entry; spinlock_t *ptl; - entry = *pte; + entry = ACCESS_ONCE(*pte); if (!pte_present(entry)) { if (pte_none(entry)) { if (vma->vm_ops) { -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-06 23:05 ` Hugh Dickins @ 2014-06-06 23:16 ` Linus Torvalds 2014-06-06 23:21 ` Sasha Levin 0 siblings, 1 reply; 10+ messages in thread From: Linus Torvalds @ 2014-06-06 23:16 UTC (permalink / raw) To: Hugh Dickins Cc: Sasha Levin, Andrew Morton, Kirill A. Shutemov, Konstantin Khlebnikov, Dave Jones, linux-mm@kvack.org, Linux Kernel On Fri, Jun 6, 2014 at 4:05 PM, Hugh Dickins <hughd@google.com> wrote: > > [PATCH] mm: entry = ACCESS_ONCE(*pte) in handle_pte_fault > > Use ACCESS_ONCE() in handle_pte_fault() when getting the entry or orig_pte > upon which all subsequent decisions and pte_same() tests will be made. > > I have no evidence that its lack is responsible for the mm/filemap.c:202 > BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity, > and I am not optimistic that it will fix it. But I have found no other > explanation, and ACCESS_ONCE() here will surely not hurt. The patch looks obviously correct to me, although like you, I have no real reason to believe it really fixes anything. But we definitely should just load it once, since it's very much an optimistic load done before we take the real lock and re-compare. I'm somewhat dubious whether it actually would change code generation - it doesn't change anything with the test-configuration I tried with - but it's unquestionably a good patch. And hey, maybe some configurations have sufficiently different code generation that gcc actually _can_ sometimes do reloads, perhaps explaining why some people see problems. So it's certainly worth testing even if it doesn't make any change to code generation with *my* compiler and config.. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-06 23:16 ` Linus Torvalds @ 2014-06-06 23:21 ` Sasha Levin 0 siblings, 0 replies; 10+ messages in thread From: Sasha Levin @ 2014-06-06 23:21 UTC (permalink / raw) To: Linus Torvalds, Hugh Dickins Cc: Andrew Morton, Kirill A. Shutemov, Konstantin Khlebnikov, Dave Jones, linux-mm@kvack.org, Linux Kernel On 06/06/2014 07:16 PM, Linus Torvalds wrote: >> I have no evidence that its lack is responsible for the mm/filemap.c:202 >> > BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity, >> > and I am not optimistic that it will fix it. But I have found no other >> > explanation, and ACCESS_ONCE() here will surely not hurt. > The patch looks obviously correct to me, although like you, I have no > real reason to believe it really fixes anything. But we definitely > should just load it once, since it's very much an optimistic load done > before we take the real lock and re-compare. > > I'm somewhat dubious whether it actually would change code generation > - it doesn't change anything with the test-configuration I tried with > - but it's unquestionably a good patch. And hey, maybe some > configurations have sufficiently different code generation that gcc > actually _can_ sometimes do reloads, perhaps explaining why some > people see problems. So it's certainly worth testing even if it > doesn't make any change to code generation with *my* compiler and > config.. I'm seeing the same code generated here as well. I won't carry the patch unless Andrew/Linus take it so it won't hide possible bugs that trinity might stumble on. Thanks, Sasha -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG @ 2014-06-09 12:01 张静(长谷) 2014-06-09 18:52 ` Hugh Dickins 0 siblings, 1 reply; 10+ messages in thread From: 张静(长谷) @ 2014-06-09 12:01 UTC (permalink / raw) To: 'Hugh Dickins' Cc: 'Sasha Levin', 'Andrew Morton', 'Kirill A. Shutemov', 'Konstantin Khlebnikov', 'Dave Jones', linux-mm, 'Linus Torvalds' [-- Attachment #1: Type: text/plain, Size: 1818 bytes --] Hi Hugh On Fri, Jun 6, 2014 at 4:05 PM, Hugh Dickins <hughd@google.com> wrote: > > Though I'd wanted to see the remove_migration_pte oops as a key to the > page_mapped bug, my guess is that they're actually independent. > In the 3.15-rc8 tree, along the migration path /* * Corner case handling: * 1. When a new swap-cache page is read into, it is added to the LRU * and treated as swapcache but it has no rmap yet. * Calling try_to_unmap() against a page->mapping==NULL page will * trigger a BUG. So handle it here. * 2. An orphaned page (see truncate_complete_page) might have * fs-private metadata. The page can be picked up due to memory * offlining. Everywhere else except page reclaim, the page is * invisible to the vm, so the page can not be migrated. So try to * free the metadata, so the page can be freed. */ if (!page->mapping) { VM_BUG_ON_PAGE(PageAnon(page), page); if (page_has_private(page)) { try_to_free_buffers(page); goto uncharge; } goto skip_unmap; } /* Establish migration ptes or remove ptes */ try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); skip_unmap: if (!page_mapped(page)) rc = move_to_new_page(newpage, page, remap_swapcache, mode); Here a page is migrated even not mapped and with no mapping! mapping = page_mapping(page); if (!mapping) rc = migrate_page(mapping, newpage, page, mode); if (!mapping) { /* Anonymous page without mapping */ if (page_count(page) != expected_count) return -EAGAIN; return MIGRATEPAGE_SUCCESS; } And seems a file cache page is treated in the way of Anon. Is that right? Thanks Hillf [-- Attachment #2: Type: text/html, Size: 17097 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 3.15-rc8 mm/filemap.c:202 BUG 2014-06-09 12:01 张静(长谷) @ 2014-06-09 18:52 ` Hugh Dickins 0 siblings, 0 replies; 10+ messages in thread From: Hugh Dickins @ 2014-06-09 18:52 UTC (permalink / raw) To: hillf.zj Cc: Sasha Levin, Andrew Morton, Kirill A. Shutemov, Konstantin Khlebnikov, Dave Jones, linux-mm, Linus Torvalds On Mon, 9 Jun 2014, hillf wrote: > On Fri, Jun 6, 2014 at 4:05 PM, Hugh Dickins <hughd@google.com> wrote: > > > > Though I'd wanted to see the remove_migration_pte oops as a key to the > > page_mapped bug, my guess is that they're actually independent. > > > > In the 3.15-rc8 tree, along the migration path > > /* > * Corner case handling: > * 1. When a new swap-cache page is read into, it is added to the LRU > * and treated as swapcache but it has no rmap yet. > * Calling try_to_unmap() against a page->mapping==NULL page will > * trigger a BUG. So handle it here. > * 2. An orphaned page (see truncate_complete_page) might have > * fs-private metadata. The page can be picked up due to memory > * offlining. Everywhere else except page reclaim, the page is > * invisible to the vm, so the page can not be migrated. So try to > * free the metadata, so the page can be freed. I don't think I'd say that an orphaned page cannot be migrated; but I do agree that it's better just to try free the page than migrate it. > */ > if (!page->mapping) { > VM_BUG_ON_PAGE(PageAnon(page), page); > if (page_has_private(page)) { > try_to_free_buffers(page); > goto uncharge; > } > goto skip_unmap; > } > > /* Establish migration ptes or remove ptes */ > try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); (There is an inefficiency here: it would better check page_mapped(page) before calling try_to_unmap(), which would save getting i_mmap_mutex unnecessarily. But that's an aside, it's not wrong as it stands.) > > skip_unmap: > if (!page_mapped(page)) > rc = move_to_new_page(newpage, page, remap_swapcache, mode); > > Here a page is migrated even not mapped and with no mapping! Why the exclamation mark? We have just tried to unmap it, so no surprise that the page is now not mapped. As to "no mapping": we hold the page lock, so it's a bug if the state of "page->mapping" has changed since we tested it above. > > mapping = page_mapping(page); > if (!mapping) > rc = migrate_page(mapping, newpage, page, mode); You need to check the way page_mapping(page) works: it doesn't simply return page->mapping, but supplies swap_address_space if PageSwapCache is set, or otherwise NULL on an anonymous page. I think your "no mapping" above amounts to swapless anonymous. > > > if (!mapping) { > /* Anonymous page without mapping */ > if (page_count(page) != expected_count) > return -EAGAIN; > return MIGRATEPAGE_SUCCESS; > } > > And seems a file cache page is treated in the way of Anon. > > Is that right? Nothing wrong with it that I see: the truncated file !page->mapping case has already been skipped in the "Corner case handling" block, though it would not worry me if an orphan page did reach here - the page count check will still refuse to migrate "unexplainable" pages. Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-06-09 18:53 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-06-03 4:21 3.15-rc8 mm/filemap.c:202 BUG Dave Jones 2014-06-03 9:11 ` Konstantin Khlebnikov 2014-06-03 23:11 ` Hugh Dickins 2014-06-03 23:41 ` Dave Jones 2014-06-04 12:33 ` Sasha Levin 2014-06-06 23:05 ` Hugh Dickins 2014-06-06 23:16 ` Linus Torvalds 2014-06-06 23:21 ` Sasha Levin -- strict thread matches above, loose matches on Subject: below -- 2014-06-09 12:01 张静(长谷) 2014-06-09 18:52 ` Hugh Dickins
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).