* BUG at mm/memory.c:1489!
  From: Michael Ellerman @ 2014-05-28  8:32 UTC
  To: Linux MM, LKML, trinity

Hey folks,

Anyone seen this before? Trinity hit it just now:

Linux Blade312-5 3.15.0-rc7 #306 SMP Wed May 28 17:51:18 EST 2014 ppc64

[watchdog] 27853 iterations. [F:22642 S:5174 HI:1276]
------------[ cut here ]------------
kernel BUG at /home/michael/mmk-build/flow/mm/memory.c:1489!
cpu 0xc: Vector: 700 (Program Check) at [c000000384eaf960]
    pc: c0000000001ad6f0: .follow_page_mask+0x90/0x650
    lr: c0000000001ad6d8: .follow_page_mask+0x78/0x650
    sp: c000000384eafbe0
   msr: 8000000000029032
  current = 0xc0000003c27e1bc0
  paca    = 0xc000000001dc3000   softe: 0   irq_happened: 0x01
    pid   = 20800, comm = trinity-c12
kernel BUG at /home/michael/mmk-build/flow/mm/memory.c:1489!
enter ? for help
[c000000384eafcc0] c0000000001e5514 .SyS_move_pages+0x524/0x7d0
[c000000384eafe30] c00000000000a1d8 syscall_exit+0x0/0x98
--- Exception: c01 (System Call) at 00003fff795f30a8
SP (3ffff958f290) is in userspace

I've left it in the debugger, can dig into it a bit more tomorrow
if anyone has any clues.

cheers
* Re: BUG at mm/memory.c:1489!
  From: Hugh Dickins @ 2014-05-29  0:33 UTC
  To: Michael Ellerman
  Cc: Andrew Morton, Naoya Horiguchi, Benjamin Herrenschmidt, Tony Luck,
      linux-mm, linux-kernel, trinity

On Wed, 28 May 2014, Michael Ellerman wrote:
> Anyone seen this before? Trinity hit it just now:
>
> kernel BUG at /home/michael/mmk-build/flow/mm/memory.c:1489!
> [...]
> [c000000384eafcc0] c0000000001e5514 .SyS_move_pages+0x524/0x7d0
> [...]
> I've left it in the debugger, can dig into it a bit more tomorrow
> if anyone has any clues.

Thanks for leaving it overnight, but this one is quite obvious,
so go ahead and reboot whenever suits you.

Trinity didn't even need to do anything bizarre to get this: that
ordinary path simply didn't get tried on powerpc or ia64 before.

Here's a patch which should fix it for you, but I believe leaves
a race in common with other architectures. I must turn away to
other things, and hope Naoya-san can fix up the locking separately
(or point out why it's already safe).

[PATCH] mm: fix move_pages follow_page huge_addr BUG

v3.12's e632a938d914 ("mm: migrate: add hugepage migration code to
move_pages()") is okay on most arches, but on follow_huge_addr-style
arches ia64 and powerpc, it hits my old BUG_ON(flags & FOLL_GET)
from v2.6.15 deceb6cd17e6 ("mm: follow_page with inner ptlock").

The point of the BUG_ON was that nothing needed FOLL_GET there at
the time, and it was not clear that we have sufficient locking to
use get_page() safely here on the outside - maybe the page found has
already been freed and even reused when follow_huge_addr() returns.

I suspect that e632a938d914's use of get_page() after return from
follow_huge_pmd() has the same problem: what prevents a racing
instance of move_pages() from already migrating away and freeing
that page by then? A reference to the page should be taken while
holding suitable lock (huge_pte_lockptr?), to serialize against
concurrent migration.

But I'm not prepared to rework the hugetlb locking here myself;
so for now just supply a patch to copy e632a938d914's get_page()
after follow_huge_pmd() to after follow_huge_addr(): removing
the BUG_ON(flags & FOLL_GET), but probably leaving a race.

Fixes: e632a938d914 ("mm: migrate: add hugepage migration code to move_pages()")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org # 3.12+
---
Whether this is a patch that should go in without fixing the locking,
I don't know. An unlikely race is better than a triggerable BUG?
Or perhaps I'm just wrong about there being any such race.

 mm/memory.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- 3.15-rc7/mm/memory.c	2014-04-27 23:55:53.608801152 -0700
+++ linux/mm/memory.c	2014-05-28 13:05:48.340124615 -0700
@@ -1486,7 +1486,17 @@ struct page *follow_page_mask(struct vm_
 
 	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
 	if (!IS_ERR(page)) {
-		BUG_ON(flags & FOLL_GET);
+		if (page && (flags & FOLL_GET)) {
+			/*
+			 * Refcount on tail pages are not well-defined and
+			 * shouldn't be taken. The caller should handle a NULL
+			 * return when trying to follow tail pages.
+			 */
+			if (PageHead(page))
+				get_page(page);
+			else
+				page = NULL;
+		}
 		goto out;
 	}
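A minimal sketch of the serialization Hugh describes above, assuming a
pmd_lockptr()-style lock and an invented function name (hypothetical
code, not from any posted patch in this thread):

	/*
	 * Hypothetical sketch only -- not from any posted patch.  The
	 * idea: look up the page and take the reference while holding
	 * the huge-pte lock, so a concurrent migration cannot free and
	 * reuse the page between the lookup and the get_page().
	 */
	static struct page *follow_huge_pmd_locked(struct mm_struct *mm,
						   unsigned long address,
						   pmd_t *pmd, int flags)
	{
		spinlock_t *ptl = pmd_lockptr(mm, pmd);
		struct page *page = NULL;

		spin_lock(ptl);
		if (pmd_present(*pmd)) {	/* not yet a migration entry */
			page = pmd_page(*pmd) +
				((address & ~PMD_MASK) >> PAGE_SHIFT);
			if (flags & FOLL_GET)
				get_page(page);	/* safe: ptl still held */
		}
		spin_unlock(ptl);
		return page;	/* NULL: caller retries or fails the lookup */
	}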
* Re: BUG at mm/memory.c:1489!
  From: Naoya Horiguchi @ 2014-05-29  4:52 UTC
  To: Hugh Dickins
  Cc: mpe, Andrew Morton, benh, Tony Luck, linux-mm, linux-kernel, trinity

Hi Hugh,

On Wed, May 28, 2014 at 05:33:11PM -0700, Hugh Dickins wrote:
> [...]
> I suspect that e632a938d914's use of get_page() after return from
> follow_huge_pmd() has the same problem: what prevents a racing
> instance of move_pages() from already migrating away and freeing
> that page by then? A reference to the page should be taken while
> holding suitable lock (huge_pte_lockptr?), to serialize against
> concurrent migration.

Right, we need to take huge_pte_lockptr() here, I think.

> But I'm not prepared to rework the hugetlb locking here myself;
> so for now just supply a patch to copy e632a938d914's get_page()
> after follow_huge_pmd() to after follow_huge_addr(): removing
> the BUG_ON(flags & FOLL_GET), but probably leaving a race.

This bug was introduced by me, so I'll fix this.
Thank you for reporting.

> Fixes: e632a938d914 ("mm: migrate: add hugepage migration code to move_pages()")
> Reported-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: stable@vger.kernel.org # 3.12+

This patch looks good to me.

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

> [...]
* Re: BUG at mm/memory.c:1489!
  From: Hugh Dickins @ 2014-05-29 20:50 UTC
  To: Naoya Horiguchi
  Cc: Hugh Dickins, mpe, Andrew Morton, benh, Tony Luck, linux-mm,
      linux-kernel, trinity

On Thu, 29 May 2014, Naoya Horiguchi wrote:
> On Wed, May 28, 2014 at 05:33:11PM -0700, Hugh Dickins wrote:
> > [...]
> > Fixes: e632a938d914 ("mm: migrate: add hugepage migration code to move_pages()")
> > Reported-by: Michael Ellerman <mpe@ellerman.id.au>
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> > Cc: stable@vger.kernel.org # 3.12+
>
> This patch looks good to me.
>
> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

Thanks, but seeing Michael's further problems once this is in,
I think we are better off for now with your non-x86_64 disablement.
But that's not complete in itself, I'll comment there.

Hugh
* Re: BUG at mm/memory.c:1489!
  From: Michael Ellerman @ 2014-05-29  8:59 UTC
  To: Hugh Dickins
  Cc: Andrew Morton, Naoya Horiguchi, Benjamin Herrenschmidt, Tony Luck,
      linux-mm, linux-kernel, trinity

On Wed, 2014-05-28 at 17:33 -0700, Hugh Dickins wrote:
> [...]
> But I'm not prepared to rework the hugetlb locking here myself;
> so for now just supply a patch to copy e632a938d914's get_page()
> after follow_huge_pmd() to after follow_huge_addr(): removing
> the BUG_ON(flags & FOLL_GET), but probably leaving a race.

Thanks for the detailed explanation Hugh.

Unfortunately I don't know our mm/hugetlb code well enough to give you
a good answer. Ben had a quick look at our follow_huge_addr() and
thought it looked "fishy". He suggested something like what we do in
gup_pte_range() with page_cache_get_speculative() might be in order.

Applying your patch and running trinity pretty immediately results in
the following, which looks related (sys_move_pages() again)?

Unable to handle kernel paging request for data at address 0xf2000f80000000
Faulting instruction address: 0xc0000000001e29bc
cpu 0x1b: Vector: 300 (Data Access) at [c0000003c70f76f0]
    pc: c0000000001e29bc: .remove_migration_pte+0x9c/0x320
    lr: c0000000001e29b8: .remove_migration_pte+0x98/0x320
    sp: c0000003c70f7970
   msr: 8000000000009032
   dar: f2000f80000000
 dsisr: 40000000
  current = 0xc0000003f9045800
  paca    = 0xc000000001dc6c00   softe: 0   irq_happened: 0x01
    pid   = 3585, comm = trinity-c27
enter ? for help
[c0000003c70f7a20] c0000000001bce88 .rmap_walk+0x328/0x470
[c0000003c70f7ae0] c0000000001e2904 .remove_migration_ptes+0x44/0x60
[c0000003c70f7b80] c0000000001e4ce8 .migrate_pages+0x6d8/0xa00
[c0000003c70f7cc0] c0000000001e55ec .SyS_move_pages+0x5dc/0x7d0
[c0000003c70f7e30] c00000000000a1d8 syscall_exit+0x0/0x98
--- Exception: c01 (System Call) at 00003fff7b2b30a8
SP (3fffe09728a0) is in userspace
1b:mon>

I've hit it twice in two runs:

Unable to handle kernel paging request for data at address 0xf2400f00000000
Faulting instruction address: 0xc0000000001e2a3c
cpu 0xd: Vector: 300 (Data Access) at [c00000038a4bf6f0]
    pc: c0000000001e2a3c: .remove_migration_pte+0x9c/0x320
    lr: c0000000001e2a38: .remove_migration_pte+0x98/0x320
    sp: c00000038a4bf970
   msr: 8000000000009032
   dar: f2400f00000000
 dsisr: 40000000
  current = 0xc0000003acd9e680
  paca    = 0xc000000001dc3400   softe: 0   irq_happened: 0x01
    pid   = 13334, comm = trinity-c13
enter ? for help
[c00000038a4bfa20] c0000000001bcf08 .rmap_walk+0x328/0x470
[c00000038a4bfae0] c0000000001e2984 .remove_migration_ptes+0x44/0x60
[c00000038a4bfb80] c0000000001e4d68 .migrate_pages+0x6d8/0xa00
[c00000038a4bfcc0] c0000000001e566c .SyS_move_pages+0x5dc/0x7d0
[c00000038a4bfe30] c00000000000a1d8 syscall_exit+0x0/0x98
--- Exception: c01 (System Call) at 00003fff79df30a8
SP (3fffda95d500) is in userspace
d:mon>

If I tell trinity to skip sys_move_pages() it runs for hours.

cheers
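A rough sketch of the page_cache_get_speculative() pattern Ben refers
to, paraphrasing the lockless-GUP approach in gup_pte_range() (all
names here are illustrative; this is not a posted patch):

	/*
	 * Speculative-reference pattern from the lockless GUP path:
	 * take the reference first, then recheck that the pte still
	 * maps the same page, and back off if it changed underneath us.
	 */
	static int get_page_speculative(pte_t pte, pte_t *ptep,
					struct page *page)
	{
		struct page *head = compound_head(page);

		if (!page_cache_get_speculative(head))
			return 0;	/* page was being freed: bail out */

		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
			put_page(head);	/* lost a race with unmap/migration */
			return 0;
		}
		return 1;		/* reference is now stable */
	}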
* [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
  From: Naoya Horiguchi @ 2014-05-29 18:34 UTC
  To: mpe
  Cc: Hugh Dickins, Andrew Morton, benh, Tony Luck, linux-mm,
      linux-kernel, trinity

On Thu, May 29, 2014 at 06:59:43PM +1000, Michael Ellerman wrote:
> [...]
> Applying your patch and running trinity pretty immediately results in
> the following, which looks related (sys_move_pages() again)?
>
> Unable to handle kernel paging request for data at address 0xf2000f80000000
> Faulting instruction address: 0xc0000000001e29bc
> [...]
> [c0000003c70f7b80] c0000000001e4ce8 .migrate_pages+0x6d8/0xa00
> [c0000003c70f7cc0] c0000000001e55ec .SyS_move_pages+0x5dc/0x7d0
> [...]

Sorry for the inconvenience with your testing.

Hugepage migration is enabled for archs which have pmd-level hugepage
(including ppc64,) but not tested except for x86_64.
hugepage_migration_support() controls this, so the following patch
should help you avoid the problem, I believe.
Could you try to test with it?

Thanks,
Naoya Horiguchi
---
Date: Thu, 29 May 2014 12:51:37 -0400
Subject: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64

Currently hugepage migration is available for all archs which support
pmd-level hugepage, but testing is done only for x86_64 and there are
bugs for other archs. So to avoid breaking such archs, this patch
limits the availability strictly to x86_64 until developers of other
archs get interested in enabling this feature.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org # 3.12+
---
 arch/arm/mm/hugetlbpage.c     |  5 -----
 arch/arm64/mm/hugetlbpage.c   |  5 -----
 arch/ia64/mm/hugetlbpage.c    |  5 -----
 arch/metag/mm/hugetlbpage.c   |  5 -----
 arch/mips/mm/hugetlbpage.c    |  5 -----
 arch/powerpc/mm/hugetlbpage.c | 10 ----------
 arch/s390/mm/hugetlbpage.c    |  5 -----
 arch/sh/mm/hugetlbpage.c      |  5 -----
 arch/sparc/mm/hugetlbpage.c   |  5 -----
 arch/tile/mm/hugetlbpage.c    |  5 -----
 arch/x86/Kconfig              |  4 ++++
 arch/x86/mm/hugetlbpage.c     | 10 ----------
 include/linux/hugetlb.h       | 10 ++++++----
 mm/Kconfig                    |  3 +++
 14 files changed, 13 insertions(+), 69 deletions(-)

diff --git a/arch/arm/mm/hugetlbpage.c b/arch/arm/mm/hugetlbpage.c
index 54ee6163c181..66781bf34077 100644
--- a/arch/arm/mm/hugetlbpage.c
+++ b/arch/arm/mm/hugetlbpage.c
@@ -56,8 +56,3 @@ int pmd_huge(pmd_t pmd)
 {
 	return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 5e9aec358306..2fc8258bab2d 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -54,11 +54,6 @@ int pud_huge(pud_t pud)
 	return !(pud_val(pud) & PUD_TABLE_BIT);
 }
 
-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 static __init int setup_hugepagesz(char *opt)
 {
 	unsigned long ps = memparse(opt, &opt);
diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index 68232db98baa..76069c18ee42 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -114,11 +114,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address, pmd_t *pmd, int write)
 {
diff --git a/arch/metag/mm/hugetlbpage.c b/arch/metag/mm/hugetlbpage.c
index 042431509b56..3c52fa6d0f8e 100644
--- a/arch/metag/mm/hugetlbpage.c
+++ b/arch/metag/mm/hugetlbpage.c
@@ -110,11 +110,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git a/arch/mips/mm/hugetlbpage.c b/arch/mips/mm/hugetlbpage.c
index 77e0ae036e7c..4ec8ee10d371 100644
--- a/arch/mips/mm/hugetlbpage.c
+++ b/arch/mips/mm/hugetlbpage.c
@@ -84,11 +84,6 @@ int pud_huge(pud_t pud)
 	return (pud_val(pud) & _PAGE_HUGE) != 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 		pmd_t *pmd, int write)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index eb923654ba80..7e70ae968e5f 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -86,11 +86,6 @@ int pgd_huge(pgd_t pgd)
 	 */
 	return ((pgd_val(pgd) & 0x3) != 0x0);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
 #else
 int pmd_huge(pmd_t pmd)
 {
@@ -106,11 +101,6 @@ int pgd_huge(pgd_t pgd)
 {
 	return 0;
 }
-
-int pmd_huge_support(void)
-{
-	return 0;
-}
 #endif
 
 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index 0727a55d87d9..0ff66a7e29bb 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -220,11 +220,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmdp, int write)
 {
diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c
index 0d676a41081e..d7762349ea48 100644
--- a/arch/sh/mm/hugetlbpage.c
+++ b/arch/sh/mm/hugetlbpage.c
@@ -83,11 +83,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 9bd9ce80bf77..d329537739c6 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -231,11 +231,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }
 
-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git a/arch/tile/mm/hugetlbpage.c b/arch/tile/mm/hugetlbpage.c
index 0cb3bbaa580c..e514899e1100 100644
--- a/arch/tile/mm/hugetlbpage.c
+++ b/arch/tile/mm/hugetlbpage.c
@@ -166,11 +166,6 @@ int pud_huge(pud_t pud)
 	return !!(pud_val(pud) & _PAGE_HUGE_PAGE);
 }
 
-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f7325e..0cf6a7d0a93e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1871,6 +1871,10 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
 	def_bool y
 	depends on X86_64 || X86_PAE
 
+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	def_bool y
+	depends on X86_64 || MIGRATION
+
 menu "Power management and ACPI options"
 
 config ARCH_HIBERNATION_HEADER
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 8c9f647ff9e1..8b977ebf9388 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -58,11 +58,6 @@ follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 {
 	return NULL;
 }
-
-int pmd_huge_support(void)
-{
-	return 0;
-}
 #else
 
 struct page *
@@ -80,11 +75,6 @@ int pud_huge(pud_t pud)
 {
 	return !!(pud_val(pud) & _PAGE_PSE);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
 #endif
 
 #ifdef CONFIG_HUGETLB_PAGE
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 63214868c5b2..61c2e349af64 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -385,15 +385,18 @@ static inline pgoff_t basepage_index(struct page *page)
 
 extern void dissolve_free_huge_pages(unsigned long start_pfn,
 				     unsigned long end_pfn);
-int pmd_huge_support(void);
 /*
- * Currently hugepage migration is enabled only for pmd-based hugepage.
+ * Currently hugepage migration is enabled only for x86_64.
  * This function will be updated when hugepage migration is more widely
  * supported.
  */
 static inline int hugepage_migration_support(struct hstate *h)
 {
-	return pmd_huge_support() && (huge_page_shift(h) == PMD_SHIFT);
+#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+	return huge_page_shift(h) == PMD_SHIFT;
+#else
+	return 0;
+#endif
 }
 
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
@@ -443,7 +446,6 @@ static inline pgoff_t basepage_index(struct page *page)
 	return page->index;
 }
 #define dissolve_free_huge_pages(s, e)	do {} while (0)
-#define pmd_huge_support()	0
 #define hugepage_migration_support(h)	0
 
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
diff --git a/mm/Kconfig b/mm/Kconfig
index ebe5880c29d6..1e22701c972b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -264,6 +264,9 @@ config MIGRATION
 	  pages as migration can relocate pages to satisfy a huge page
 	  allocation instead of reclaiming.
 
+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	boolean
+
 config PHYS_ADDR_T_64BIT
 	def_bool 64BIT || ARCH_PHYS_ADDR_T_64BIT
-- 
1.9.3
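For context, the predicate being restricted here is what the migration
path consults before touching a hugepage; the existing caller in
mm/migrate.c does roughly the following (paraphrased, so treat the
exact shape as approximate):

	/*
	 * Roughly how unmap_and_move_huge_page() consults the predicate:
	 * architectures without hugepage migration support refuse the
	 * operation up front instead of corrupting state later.
	 */
	if (!hugepage_migration_support(page_hstate(hpage)))
		return -ENOSYS;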
* Re: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
  From: Hugh Dickins @ 2014-05-29 22:04 UTC
  To: Naoya Horiguchi
  Cc: mpe, Hugh Dickins, Andrew Morton, benh, Tony Luck, linux-mm,
      linux-kernel, trinity

On Thu, 29 May 2014, Naoya Horiguchi wrote:
>
> Currently hugepage migration is available for all archs which support
> pmd-level hugepage, but testing is done only for x86_64 and there are
> bugs for other archs.

And even for x86_64 I think: the follow_huge_pmd() locking issue I
mentioned. But I agree that's a different kind of bug, and probably
not cause to disable the feature even on x86_64 at this stage - but
cause to fix it in a different patch Cc stable when you have a moment.

> So to avoid breaking such archs, this patch limits the availability
> strictly to x86_64 until developers of other archs get interested in
> enabling this feature.

Hmm, I don't like the sound of "until developers of other archs get
interested in enabling this feature". Your choice, I suppose, but I
had been expecting you to give them a little more help than that, by
fixing up the follow_huge_addr() and locking as you expect it to be
(and whatever Michael's subsequent remove_migration_pte() crash comes
from - maybe obvious with a little thought, but I haven't), then
pinging those architectures to give it a try and enable if they wish.

Perhaps I'm expecting too much, and you haven't the time; doubt I have.

I believe your patch below is incomplete, or perhaps you were
expecting to layer it on top of my follow_huge_addr get_page one.
No, I think we should throw mine away if you're going to disable
the feature on most architectures for now (and once the locking is
corrected, my get_page after follow_huge_addr will be wrong anyway).

What I think you're missing is an adjustment to your 71ea2efb1e93
("mm: migrate: remove VM_HUGETLB from vma flag check in vma_migratable()"):
doesn't vma_migratable() need to test VM_HUGETLB when
!CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION? Then we are saved from
reaching the follow_huge_addr() BUG; and avoid the weird preparation
for migrating HUGETLB pages on architectures which do not support it.

But yes, I think your disablement approach is the right thing for 3.15.

> [...]
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 25d2c6f7325e..0cf6a7d0a93e 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1871,6 +1871,10 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
>  	def_bool y
>  	depends on X86_64 || X86_PAE
>  
> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	def_bool y
> +	depends on X86_64 || MIGRATION
> +

Should that be X86_64 && MIGRATION? X86_64 && HUGETLB_PAGE && MIGRATION?
Maybe it doesn't matter.

Yes, I agree a per-arch config option is better than all those
pmd_huge_support() functions, especially all the ones saying 0.

> [...]
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 63214868c5b2..61c2e349af64 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -385,15 +385,18 @@ static inline pgoff_t basepage_index(struct page *page)
>  
>  extern void dissolve_free_huge_pages(unsigned long start_pfn,
>  				     unsigned long end_pfn);
> -int pmd_huge_support(void);
>  /*
> - * Currently hugepage migration is enabled only for pmd-based hugepage.
> + * Currently hugepage migration is enabled only for x86_64.

You don't want to have to update that comment every time an architecture
opts in. No need for any comment here, I think, the name is good enough
(though hugepage_migration_supported() would be better).

>   * This function will be updated when hugepage migration is more widely
>   * supported.
>   */
> [...]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index ebe5880c29d6..1e22701c972b 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -264,6 +264,9 @@ config MIGRATION
>  	  pages as migration can relocate pages to satisfy a huge page
>  	  allocation instead of reclaiming.
>  
> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	boolean
> +

I don't remember how duplicated config entries work,
so cannot comment on that.

>  config PHYS_ADDR_T_64BIT
>  	def_bool 64BIT || ARCH_PHYS_ADDR_T_64BIT
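On the duplicated-entry question raised above: Kconfig merges multiple
definitions of the same symbol, so the two stanzas in the patch
describe one option. A sketch of how they combine, using the "&&"
spelling that Naoya confirms below (the comments here are
illustrative, not from the patch):

	# mm/Kconfig: declare the symbol generically; it stays 'n'
	# unless an arch provides a default or selects it.
	config ARCH_ENABLE_HUGEPAGE_MIGRATION
		boolean

	# arch/x86/Kconfig: the arch opts in with its own default.
	config ARCH_ENABLE_HUGEPAGE_MIGRATION
		def_bool y
		depends on X86_64 && HUGETLB_PAGE && MIGRATION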
* Re: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
  From: Naoya Horiguchi @ 2014-05-30  2:56 UTC
  To: Hugh Dickins
  Cc: mpe, Andrew Morton, benh, Tony Luck, linux-mm, linux-kernel, trinity

On Thu, May 29, 2014 at 03:04:57PM -0700, Hugh Dickins wrote:
> On Thu, 29 May 2014, Naoya Horiguchi wrote:
> >
> > Currently hugepage migration is available for all archs which support
> > pmd-level hugepage, but testing is done only for x86_64 and there are
> > bugs for other archs.
>
> And even for x86_64 I think: the follow_huge_pmd() locking issue I
> mentioned. But I agree that's a different kind of bug, and probably
> not cause to disable the feature even on x86_64 at this stage - but
> cause to fix it in a different patch Cc stable when you have a moment.

Yes, I promise to do it.

> > So to avoid breaking such archs, this patch limits the availability
> > strictly to x86_64 until developers of other archs get interested in
> > enabling this feature.
>
> Hmm, I don't like the sound of "until developers of other archs get
> interested in enabling this feature". Your choice, I suppose, but I
> had been expecting you to give them a little more help than that, by
> fixing up the follow_huge_addr() and locking as you expect it to be
> (and whatever Michael's subsequent remove_migration_pte() crash comes
> from - maybe obvious with a little thought, but I haven't), then
> pinging those architectures to give it a try and enable if they wish.

I agree, I'll do this (maybe starting by adding more comments to help
enable it on other archs) after fixing the locking problem.

> Perhaps I'm expecting too much, and you haven't the time; doubt I have.

I cannot say when, but I want to do it in a few months.

> I believe your patch below is incomplete, or perhaps you were
> expecting to layer it on top of my follow_huge_addr get_page one.
> No, I think we should throw mine away if you're going to disable
> the feature on most architectures for now (and once the locking is
> corrected, my get_page after follow_huge_addr will be wrong anyway).

OK, you dropped the patch, so follow_page(FOLL_GET) should not be
called for hugepages on non-x86_64 archs.

> What I think you're missing is an adjustment to your 71ea2efb1e93
> ("mm: migrate: remove VM_HUGETLB from vma flag check in vma_migratable()"):
> doesn't vma_migratable() need to test VM_HUGETLB when
> !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION?

Right, that looks to me a good precheck to skip hugepages.

> Then we are saved from
> reaching the follow_huge_addr() BUG; and avoid the weird preparation
> for migrating HUGETLB pages on architectures which do not support it.
>
> But yes, I think your disablement approach is the right thing for 3.15.

> [...]
> > +config ARCH_ENABLE_HUGEPAGE_MIGRATION
> > +	def_bool y
> > +	depends on X86_64 || MIGRATION
> > +
>
> Should that be X86_64 && MIGRATION? X86_64 && HUGETLB_PAGE && MIGRATION?
> Maybe it doesn't matter.

Ouch, "&&" is what I meant, and HUGETLB_PAGE is also fine.

> Yes, I agree a per-arch config option is better than all those
> pmd_huge_support() functions, especially all the ones saying 0.

> [...]
> >  /*
> > - * Currently hugepage migration is enabled only for pmd-based hugepage.
> > + * Currently hugepage migration is enabled only for x86_64.
>
> You don't want to have to update that comment every time an architecture
> opts in. No need for any comment here, I think, the name is good enough
> (though hugepage_migration_supported() would be better).

Ah, OK. I'll remove this comment.

Thanks,
Naoya Horiguchi
* Re: BUG at mm/memory.c:1489!
  2014-05-29  8:59   ` Michael Ellerman
  2014-05-29 18:34     ` [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!) Naoya Horiguchi
@ 2014-05-29 21:03     ` Hugh Dickins
  [not found]   ` <1401388474-mqnis5cp@n-horiguchi@ah.jp.nec.com>
  2 siblings, 0 replies; 16+ messages in thread
From: Hugh Dickins @ 2014-05-29 21:03 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Hugh Dickins, Andrew Morton, Naoya Horiguchi, Benjamin Herrenschmidt, Tony Luck, linux-mm, linux-kernel, trinity

On Thu, 29 May 2014, Michael Ellerman wrote:
>
> Unfortunately I don't know our mm/hugetlb code well enough to give you a good
> answer. Ben had a quick look at our follow_huge_addr() and thought it looked
> "fishy". He suggested something like what we do in gup_pte_range() with
> page_cache_get_speculative() might be in order.

Fishy indeed, ancient code that was only ever intended for stats-like
usage, not designed for actually getting a hold on the page.

But I don't think there's a big problem in getting the locking right:
just hope it doesn't require a different strategy on each architecture -
often an irritation with hugetlb.

Naoya-san will sort it out in due course (not 3.15) I expect,
but will probably need testing help.

>
> Applying your patch and running trinity pretty immediately results in the
> following, which looks related (sys_move_pages() again) ?
>
> Unable to handle kernel paging request for data at address 0xf2000f80000000
> Faulting instruction address: 0xc0000000001e29bc
> cpu 0x1b: Vector: 300 (Data Access) at [c0000003c70f76f0]
>     pc: c0000000001e29bc: .remove_migration_pte+0x9c/0x320
>     lr: c0000000001e29b8: .remove_migration_pte+0x98/0x320
>     sp: c0000003c70f7970
>    msr: 8000000000009032
>    dar: f2000f80000000
>  dsisr: 40000000
>   current = 0xc0000003f9045800
>   paca    = 0xc000000001dc6c00   softe: 0   irq_happened: 0x01
>     pid   = 3585, comm = trinity-c27
> enter ? for help
> [c0000003c70f7a20] c0000000001bce88 .rmap_walk+0x328/0x470
> [c0000003c70f7ae0] c0000000001e2904 .remove_migration_ptes+0x44/0x60
> [c0000003c70f7b80] c0000000001e4ce8 .migrate_pages+0x6d8/0xa00
> [c0000003c70f7cc0] c0000000001e55ec .SyS_move_pages+0x5dc/0x7d0
> [c0000003c70f7e30] c00000000000a1d8 syscall_exit+0x0/0x98
> --- Exception: c01 (System Call) at 00003fff7b2b30a8
> SP (3fffe09728a0) is in userspace
> 1b:mon>
>
> I've hit it twice in two runs:
>
> If I tell trinity to skip sys_move_pages() it runs for hours.

That's sad. Sorry for wasting your time with my patch; thank you for
trying it.

What you see might be a consequence of the locking deficiency I
mentioned, given trinity's deviousness; though if it's being clever
like that, I would expect it to have already found the equivalent
issue on x86-64. So probably not, probably another issue.

As I've said elsewhere, I think we need to go with disablement for now.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 16+ messages in thread
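The gup_pte_range()/page_cache_get_speculative() idea Ben raised is the standard lockless way to take a page reference when the page might be freed (or migrated) concurrently: grab the reference speculatively, then re-check that the pte still points at the same page, and back off if anything changed. A sketch of that shape, modelled loosely on mm/gup.c of this era -- a fragment only, where pte, ptep and page come from the surrounding walk, and not a drop-in fix for follow_huge_addr():

    struct page *page = pte_page(pte);

    if (!page_cache_get_speculative(page))
    	return 0;	/* refcount was zero: page is being freed, bail */

    /*
     * Re-read the pte; if it changed while we were taking the
     * reference (e.g. replaced by a migration entry), the ref we
     * got may belong to a freed-and-reused page, so drop it.
     */
    if (unlikely(pte_val(pte) != pte_val(*ptep))) {
    	put_page(page);
    	return 0;
    }
    /* safe: we now hold a reference to the page the pte pointed at */

The point of the recheck is that a bare get_page() after follow_huge_addr() returns can race with exactly the migration/free window described earlier in the thread.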
[parent not found: <1401388474-mqnis5cp@n-horiguchi@ah.jp.nec.com>]
* Re: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
       [not found] ` <1401388474-mqnis5cp@n-horiguchi@ah.jp.nec.com>
@ 2014-05-30  1:35   ` Michael Ellerman
  2014-05-30  1:52     ` Hugh Dickins
  2014-05-30  3:04     ` Naoya Horiguchi
  0 siblings, 2 replies; 16+ messages in thread
From: Michael Ellerman @ 2014-05-30  1:35 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Hugh Dickins, Andrew Morton, benh, Tony Luck, linux-mm, linux-kernel, trinity

On Thu, 2014-05-29 at 14:34 -0400, Naoya Horiguchi wrote:
> On Thu, May 29, 2014 at 06:59:43PM +1000, Michael Ellerman wrote:
> > Applying your patch and running trinity pretty immediately results in the
> > following, which looks related (sys_move_pages() again) ?
> >
> > Unable to handle kernel paging request for data at address 0xf2000f80000000
> > Faulting instruction address: 0xc0000000001e29bc
> > cpu 0x1b: Vector: 300 (Data Access) at [c0000003c70f76f0]
> >     pc: c0000000001e29bc: .remove_migration_pte+0x9c/0x320
> >     lr: c0000000001e29b8: .remove_migration_pte+0x98/0x320
> >     sp: c0000003c70f7970
> >    msr: 8000000000009032
> >    dar: f2000f80000000
> >  dsisr: 40000000
> >   current = 0xc0000003f9045800
> >   paca    = 0xc000000001dc6c00   softe: 0   irq_happened: 0x01
> >     pid   = 3585, comm = trinity-c27
> > enter ? for help
> > [c0000003c70f7a20] c0000000001bce88 .rmap_walk+0x328/0x470
> > [c0000003c70f7ae0] c0000000001e2904 .remove_migration_ptes+0x44/0x60
> > [c0000003c70f7b80] c0000000001e4ce8 .migrate_pages+0x6d8/0xa00
> > [c0000003c70f7cc0] c0000000001e55ec .SyS_move_pages+0x5dc/0x7d0
> > [c0000003c70f7e30] c00000000000a1d8 syscall_exit+0x0/0x98
> > --- Exception: c01 (System Call) at 00003fff7b2b30a8
> > SP (3fffe09728a0) is in userspace
> > 1b:mon>
>
> Sorry for inconvenience on your testing.

That's fine, it's good to find bugs :)

> Hugepage migration is enabled for archs which have pmd-level hugepage
> (including ppc64,) but not tested except for x86_64.
> hugepage_migration_support() controls this so the following patch should
> help you avoid the problem, I believe.
> Could you try to test with it?

Sure. So this patch, in addition to Hugh's patch to remove the BUG_ON(), does
avoid the crash above (remove_migration_pte()).

I dropped Hugh's patch, as he has decided he doesn't like it, and added the
following hunk instead:

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 3c1b968..f230a97 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -175,6 +175,12 @@ static inline int vma_migratable(struct vm_area_struct *vma)
 {
 	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
 		return 0;
+
+#ifndef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+	if (vma->vm_flags & VM_HUGETLB)
+		return 0;
+#endif
+
 	/*
 	 * Migration allocates pages in the highest zone. If we cannot
 	 * do so then migration (at least from node to node) is not

Which seems to be what Hugh was referring to in his mail - correct me if I'm
wrong, Hugh.

With your patch and the above hunk I can run trinity happily for a while,
whereas without it, it crashes almost immediately.

So with the above hunk you can add my tested-by.

cheers

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply related	[flat|nested] 16+ messages in thread
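For context on why this one hunk is enough for the move_pages() path: the syscall walks each requested address, looks the VMA up, and bails out per page as soon as vma_migratable() says no -- before follow_page() is ever asked to take a reference. Roughly, simplified from 3.15-era mm/migrate.c (the exact error value and flags are from memory, so treat this as an approximation rather than the authoritative source):

    /* do_move_page_to_node_array(), simplified: pp is the per-page request */
    err = -EFAULT;
    vma = find_vma(mm, pp->addr);
    if (!vma || pp->addr < vma->vm_start || !vma_migratable(vma))
    	goto set_status;	/* error is reported in the user status array */

    /* only reached for migratable VMAs */
    page = follow_page(vma, pp->addr, FOLL_GET | FOLL_SPLIT);

With the hunk above, a hugetlb VMA on an architecture without ARCH_ENABLE_HUGEPAGE_MIGRATION fails the vma_migratable() test, so the FOLL_GET lookup that used to trip the BUG_ON (and the later remove_migration_pte() crash) is never attempted.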
* Re: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
  2014-05-30  1:35 ` [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!) Michael Ellerman
@ 2014-05-30  1:52   ` Hugh Dickins
  2014-05-30  3:04   ` Naoya Horiguchi
  1 sibling, 0 replies; 16+ messages in thread
From: Hugh Dickins @ 2014-05-30  1:52 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Naoya Horiguchi, Hugh Dickins, Andrew Morton, benh, Tony Luck, linux-mm, linux-kernel, trinity

On Fri, 30 May 2014, Michael Ellerman wrote:
>
> diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
> index 3c1b968..f230a97 100644
> --- a/include/linux/mempolicy.h
> +++ b/include/linux/mempolicy.h
> @@ -175,6 +175,12 @@ static inline int vma_migratable(struct vm_area_struct *vma)
>  {
>  	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
>  		return 0;
> +
> +#ifndef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	if (vma->vm_flags & VM_HUGETLB)
> +		return 0;
> +#endif
> +
>  	/*
>  	 * Migration allocates pages in the highest zone. If we cannot
>  	 * do so then migration (at least from node to node) is not

That's right, thanks.

>
> Which seems to be what Hugh was referring to in his mail - correct me if I'm
> wrong, Hugh.
>
> With your patch and the above hunk I can run trinity happily for a while,
> whereas without it, it crashes almost immediately.
>
> So with the above hunk you can add my tested-by.
>
> cheers

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!)
  2014-05-30  1:35 ` [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!) Michael Ellerman
  2014-05-30  1:52   ` Hugh Dickins
@ 2014-05-30  3:04   ` Naoya Horiguchi
  2014-05-30  4:13     ` [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64 Naoya Horiguchi
  2014-05-30  4:13     ` [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported() Naoya Horiguchi
  1 sibling, 2 replies; 16+ messages in thread
From: Naoya Horiguchi @ 2014-05-30  3:04 UTC (permalink / raw)
  To: mpe
  Cc: Hugh Dickins, Andrew Morton, benh, Tony Luck, linux-mm, linux-kernel, trinity

On Fri, May 30, 2014 at 11:35:16AM +1000, Michael Ellerman wrote:
> On Thu, 2014-05-29 at 14:34 -0400, Naoya Horiguchi wrote:
> > On Thu, May 29, 2014 at 06:59:43PM +1000, Michael Ellerman wrote:
> > > Applying your patch and running trinity pretty immediately results in the
> > > following, which looks related (sys_move_pages() again) ?
> > >
> > > Unable to handle kernel paging request for data at address 0xf2000f80000000
> > > Faulting instruction address: 0xc0000000001e29bc
> > > cpu 0x1b: Vector: 300 (Data Access) at [c0000003c70f76f0]
> > >     pc: c0000000001e29bc: .remove_migration_pte+0x9c/0x320
> > >     lr: c0000000001e29b8: .remove_migration_pte+0x98/0x320
> > >     sp: c0000003c70f7970
> > >    msr: 8000000000009032
> > >    dar: f2000f80000000
> > >  dsisr: 40000000
> > >   current = 0xc0000003f9045800
> > >   paca    = 0xc000000001dc6c00   softe: 0   irq_happened: 0x01
> > >     pid   = 3585, comm = trinity-c27
> > > enter ? for help
> > > [c0000003c70f7a20] c0000000001bce88 .rmap_walk+0x328/0x470
> > > [c0000003c70f7ae0] c0000000001e2904 .remove_migration_ptes+0x44/0x60
> > > [c0000003c70f7b80] c0000000001e4ce8 .migrate_pages+0x6d8/0xa00
> > > [c0000003c70f7cc0] c0000000001e55ec .SyS_move_pages+0x5dc/0x7d0
> > > [c0000003c70f7e30] c00000000000a1d8 syscall_exit+0x0/0x98
> > > --- Exception: c01 (System Call) at 00003fff7b2b30a8
> > > SP (3fffe09728a0) is in userspace
> > > 1b:mon>
> >
> > Sorry for inconvenience on your testing.
>
> That's fine, it's good to find bugs :)
>
> > Hugepage migration is enabled for archs which have pmd-level hugepage
> > (including ppc64,) but not tested except for x86_64.
> > hugepage_migration_support() controls this so the following patch should
> > help you avoid the problem, I believe.
> > Could you try to test with it?
>
> Sure. So this patch, in addition to Hugh's patch to remove the BUG_ON(), does
> avoid the crash above (remove_migration_pte()).
>
> I dropped Hugh's patch, as he has decided he doesn't like it, and added the
> following hunk instead:
>
> diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
> index 3c1b968..f230a97 100644
> --- a/include/linux/mempolicy.h
> +++ b/include/linux/mempolicy.h
> @@ -175,6 +175,12 @@ static inline int vma_migratable(struct vm_area_struct *vma)
>  {
>  	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
>  		return 0;
> +
> +#ifndef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	if (vma->vm_flags & VM_HUGETLB)
> +		return 0;
> +#endif
> +
>  	/*
>  	 * Migration allocates pages in the highest zone. If we cannot
>  	 * do so then migration (at least from node to node) is not
>
> Which seems to be what Hugh was referring to in his mail - correct me if I'm
> wrong, Hugh.
>
> With your patch and the above hunk I can run trinity happily for a while,
> whereas without it, it crashes almost immediately.

Great.

> So with the above hunk you can add my tested-by.

OK, thank you for your help.
I'll post the revised patch later.

Thanks,
Naoya Horiguchi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64
  2014-05-30  3:04 ` Naoya Horiguchi
@ 2014-05-30  4:13   ` Naoya Horiguchi
  2014-05-30 12:00     ` Hugh Dickins
  2014-05-30  4:13   ` [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported() Naoya Horiguchi
  1 sibling, 1 reply; 16+ messages in thread
From: Naoya Horiguchi @ 2014-05-30  4:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Michael Ellerman, Benjamin Herrenschmidt, Tony Luck, linux-kernel, linux-mm, trinity

Currently hugepage migration is available for all archs which support pmd-level
hugepage, but testing is done only for x86_64 and there're bugs for other archs.
So to avoid breaking such archs, this patch limits the availability strictly to
x86_64 until developers of other archs get interested in enabling this feature.

Simply disabling hugepage migration on non-x86_64 archs is not enough to fix
the reported problem where sys_move_pages() hits the BUG_ON() in
follow_page(FOLL_GET), so let's fix this by checking if hugepage migration is
supported in vma_migratable().

ChangeLog:
- add VM_HUGETLB check in vma_migratable()
- fix dependency in config ARCH_ENABLE_HUGEPAGE_MIGRATION
- remove comment on hugepage_migration_support()

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org # 3.12+
---
 arch/arm/mm/hugetlbpage.c     |  5 -----
 arch/arm64/mm/hugetlbpage.c   |  5 -----
 arch/ia64/mm/hugetlbpage.c    |  5 -----
 arch/metag/mm/hugetlbpage.c   |  5 -----
 arch/mips/mm/hugetlbpage.c    |  5 -----
 arch/powerpc/mm/hugetlbpage.c | 10 ----------
 arch/s390/mm/hugetlbpage.c    |  5 -----
 arch/sh/mm/hugetlbpage.c      |  5 -----
 arch/sparc/mm/hugetlbpage.c   |  5 -----
 arch/tile/mm/hugetlbpage.c    |  5 -----
 arch/x86/Kconfig              |  4 ++++
 arch/x86/mm/hugetlbpage.c     | 10 ----------
 include/linux/hugetlb.h       | 13 +++++--------
 include/linux/mempolicy.h     |  6 ++++++
 mm/Kconfig                    |  3 +++
 15 files changed, 18 insertions(+), 73 deletions(-)

diff --git v3.15-rc5.orig/arch/arm/mm/hugetlbpage.c v3.15-rc5/arch/arm/mm/hugetlbpage.c
index 54ee6163c181..66781bf34077 100644
--- v3.15-rc5.orig/arch/arm/mm/hugetlbpage.c
+++ v3.15-rc5/arch/arm/mm/hugetlbpage.c
@@ -56,8 +56,3 @@ int pmd_huge(pmd_t pmd)
 {
 	return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
diff --git v3.15-rc5.orig/arch/arm64/mm/hugetlbpage.c v3.15-rc5/arch/arm64/mm/hugetlbpage.c
index 5e9aec358306..2fc8258bab2d 100644
--- v3.15-rc5.orig/arch/arm64/mm/hugetlbpage.c
+++ v3.15-rc5/arch/arm64/mm/hugetlbpage.c
@@ -54,11 +54,6 @@ int pud_huge(pud_t pud)
 	return !(pud_val(pud) & PUD_TABLE_BIT);
 }

-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 static __init int setup_hugepagesz(char *opt)
 {
 	unsigned long ps = memparse(opt, &opt);
diff --git v3.15-rc5.orig/arch/ia64/mm/hugetlbpage.c v3.15-rc5/arch/ia64/mm/hugetlbpage.c
index 68232db98baa..76069c18ee42 100644
--- v3.15-rc5.orig/arch/ia64/mm/hugetlbpage.c
+++ v3.15-rc5/arch/ia64/mm/hugetlbpage.c
@@ -114,11 +114,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }

-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address, pmd_t *pmd, int write)
 {
diff --git v3.15-rc5.orig/arch/metag/mm/hugetlbpage.c v3.15-rc5/arch/metag/mm/hugetlbpage.c
index 042431509b56..3c52fa6d0f8e 100644
--- v3.15-rc5.orig/arch/metag/mm/hugetlbpage.c
+++ v3.15-rc5/arch/metag/mm/hugetlbpage.c
@@ -110,11 +110,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }

-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git v3.15-rc5.orig/arch/mips/mm/hugetlbpage.c v3.15-rc5/arch/mips/mm/hugetlbpage.c
index 77e0ae036e7c..4ec8ee10d371 100644
--- v3.15-rc5.orig/arch/mips/mm/hugetlbpage.c
+++ v3.15-rc5/arch/mips/mm/hugetlbpage.c
@@ -84,11 +84,6 @@ int pud_huge(pud_t pud)
 	return (pud_val(pud) & _PAGE_HUGE) != 0;
 }

-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 		pmd_t *pmd, int write)
diff --git v3.15-rc5.orig/arch/powerpc/mm/hugetlbpage.c v3.15-rc5/arch/powerpc/mm/hugetlbpage.c
index eb923654ba80..7e70ae968e5f 100644
--- v3.15-rc5.orig/arch/powerpc/mm/hugetlbpage.c
+++ v3.15-rc5/arch/powerpc/mm/hugetlbpage.c
@@ -86,11 +86,6 @@ int pgd_huge(pgd_t pgd)
 	 */
 	return ((pgd_val(pgd) & 0x3) != 0x0);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
 #else
 int pmd_huge(pmd_t pmd)
 {
@@ -106,11 +101,6 @@ int pgd_huge(pgd_t pgd)
 {
 	return 0;
 }
-
-int pmd_huge_support(void)
-{
-	return 0;
-}
 #endif

 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
diff --git v3.15-rc5.orig/arch/s390/mm/hugetlbpage.c v3.15-rc5/arch/s390/mm/hugetlbpage.c
index 0727a55d87d9..0ff66a7e29bb 100644
--- v3.15-rc5.orig/arch/s390/mm/hugetlbpage.c
+++ v3.15-rc5/arch/s390/mm/hugetlbpage.c
@@ -220,11 +220,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }

-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmdp, int write)
 {
diff --git v3.15-rc5.orig/arch/sh/mm/hugetlbpage.c v3.15-rc5/arch/sh/mm/hugetlbpage.c
index 0d676a41081e..d7762349ea48 100644
--- v3.15-rc5.orig/arch/sh/mm/hugetlbpage.c
+++ v3.15-rc5/arch/sh/mm/hugetlbpage.c
@@ -83,11 +83,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }

-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git v3.15-rc5.orig/arch/sparc/mm/hugetlbpage.c v3.15-rc5/arch/sparc/mm/hugetlbpage.c
index 9bd9ce80bf77..d329537739c6 100644
--- v3.15-rc5.orig/arch/sparc/mm/hugetlbpage.c
+++ v3.15-rc5/arch/sparc/mm/hugetlbpage.c
@@ -231,11 +231,6 @@ int pud_huge(pud_t pud)
 	return 0;
 }

-int pmd_huge_support(void)
-{
-	return 0;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git v3.15-rc5.orig/arch/tile/mm/hugetlbpage.c v3.15-rc5/arch/tile/mm/hugetlbpage.c
index 0cb3bbaa580c..e514899e1100 100644
--- v3.15-rc5.orig/arch/tile/mm/hugetlbpage.c
+++ v3.15-rc5/arch/tile/mm/hugetlbpage.c
@@ -166,11 +166,6 @@ int pud_huge(pud_t pud)
 	return !!(pud_val(pud) & _PAGE_HUGE_PAGE);
 }

-int pmd_huge_support(void)
-{
-	return 1;
-}
-
 struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 			     pmd_t *pmd, int write)
 {
diff --git v3.15-rc5.orig/arch/x86/Kconfig v3.15-rc5/arch/x86/Kconfig
index 25d2c6f7325e..6b8b429c832f 100644
--- v3.15-rc5.orig/arch/x86/Kconfig
+++ v3.15-rc5/arch/x86/Kconfig
@@ -1871,6 +1871,10 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
 	def_bool y
 	depends on X86_64 || X86_PAE

+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	def_bool y
+	depends on X86_64 && HUGETLB_PAGE && MIGRATION
+
 menu "Power management and ACPI options"

 config ARCH_HIBERNATION_HEADER
diff --git v3.15-rc5.orig/arch/x86/mm/hugetlbpage.c v3.15-rc5/arch/x86/mm/hugetlbpage.c
index 8c9f647ff9e1..8b977ebf9388 100644
--- v3.15-rc5.orig/arch/x86/mm/hugetlbpage.c
+++ v3.15-rc5/arch/x86/mm/hugetlbpage.c
@@ -58,11 +58,6 @@ follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 {
 	return NULL;
 }
-
-int pmd_huge_support(void)
-{
-	return 0;
-}
 #else

 struct page *
@@ -80,11 +75,6 @@ int pud_huge(pud_t pud)
 {
 	return !!(pud_val(pud) & _PAGE_PSE);
 }
-
-int pmd_huge_support(void)
-{
-	return 1;
-}
 #endif

 #ifdef CONFIG_HUGETLB_PAGE
diff --git v3.15-rc5.orig/include/linux/hugetlb.h v3.15-rc5/include/linux/hugetlb.h
index 63214868c5b2..c9de64cf288d 100644
--- v3.15-rc5.orig/include/linux/hugetlb.h
+++ v3.15-rc5/include/linux/hugetlb.h
@@ -385,15 +385,13 @@ static inline pgoff_t basepage_index(struct page *page)

 extern void dissolve_free_huge_pages(unsigned long start_pfn,
 				     unsigned long end_pfn);
-int pmd_huge_support(void);
-/*
- * Currently hugepage migration is enabled only for pmd-based hugepage.
- * This function will be updated when hugepage migration is more widely
- * supported.
- */
 static inline int hugepage_migration_support(struct hstate *h)
 {
-	return pmd_huge_support() && (huge_page_shift(h) == PMD_SHIFT);
+#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+	return huge_page_shift(h) == PMD_SHIFT;
+#else
+	return 0;
+#endif
 }

 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
@@ -443,7 +441,6 @@ static inline pgoff_t basepage_index(struct page *page)
 	return page->index;
 }
 #define dissolve_free_huge_pages(s, e)	do {} while (0)
-#define pmd_huge_support()	0
 #define hugepage_migration_support(h)	0

 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
diff --git v3.15-rc5.orig/include/linux/mempolicy.h v3.15-rc5/include/linux/mempolicy.h
index 3c1b968da0ca..f230a978e6ba 100644
--- v3.15-rc5.orig/include/linux/mempolicy.h
+++ v3.15-rc5/include/linux/mempolicy.h
@@ -175,6 +175,12 @@ static inline int vma_migratable(struct vm_area_struct *vma)
 {
 	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
 		return 0;
+
+#ifndef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+	if (vma->vm_flags & VM_HUGETLB)
+		return 0;
+#endif
+
 	/*
 	 * Migration allocates pages in the highest zone. If we cannot
 	 * do so then migration (at least from node to node) is not
diff --git v3.15-rc5.orig/mm/Kconfig v3.15-rc5/mm/Kconfig
index ebe5880c29d6..1e22701c972b 100644
--- v3.15-rc5.orig/mm/Kconfig
+++ v3.15-rc5/mm/Kconfig
@@ -264,6 +264,9 @@ config MIGRATION
 	  pages as migration can relocate pages to satisfy a huge page
 	  allocation instead of reclaiming.

+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	boolean
+
 config PHYS_ADDR_T_64BIT
 	def_bool 64BIT || ARCH_PHYS_ADDR_T_64BIT

--
1.9.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply related	[flat|nested] 16+ messages in thread
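As a sanity check of the end result, the same path can be poked from userspace. The following is a hypothetical little probe, not from this thread; it assumes the default 2MB hugepage size, some hugepages reserved in the pool, and libnuma installed for the move_pages() wrapper. On an architecture without ARCH_ENABLE_HUGEPAGE_MIGRATION, the expectation after this patch is a per-page error in the status array rather than a kernel BUG or page-table corruption:

    /* build (assumed): gcc -O2 -o hugemove hugemove.c -lnuma */
    #define _GNU_SOURCE
    #include <numaif.h>
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
    	size_t sz = 2UL << 20;		/* one 2MB hugepage, assumed default */
    	void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
    		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    	if (p == MAP_FAILED) {
    		perror("mmap(MAP_HUGETLB)");	/* needs hugepages reserved */
    		return 1;
    	}
    	*(volatile char *)p = 1;		/* fault the hugepage in */

    	void *pages[1]  = { p };
    	int   nodes[1]  = { 0 };		/* ask to move it to node 0 */
    	int   status[1] = { -1 };
    	long rc = move_pages(0, 1, pages, nodes, status, MPOL_MF_MOVE);

    	/* expected on an unsupported arch: rc == 0 with status[0] set to
    	 * a negative errno, instead of a crash as reported upthread */
    	printf("move_pages: rc=%ld status[0]=%d\n", rc, status[0]);
    	return 0;
    }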
* Re: [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64
  2014-05-30  4:13 ` [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64 Naoya Horiguchi
@ 2014-05-30 12:00   ` Hugh Dickins
  0 siblings, 0 replies; 16+ messages in thread
From: Hugh Dickins @ 2014-05-30 12:00 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Hugh Dickins, Michael Ellerman, Benjamin Herrenschmidt, Tony Luck, linux-kernel, linux-mm, trinity

On Fri, 30 May 2014, Naoya Horiguchi wrote:

> Currently hugepage migration is available for all archs which support pmd-level
> hugepage, but testing is done only for x86_64 and there're bugs for other archs.
> So to avoid breaking such archs, this patch limits the availability strictly to
> x86_64 until developers of other archs get interested in enabling this feature.
>
> Simply disabling hugepage migration on non-x86_64 archs is not enough to fix
> the reported problem where sys_move_pages() hits the BUG_ON() in
> follow_page(FOLL_GET), so let's fix this by checking if hugepage migration is
> supported in vma_migratable().
>
> ChangeLog:
> - add VM_HUGETLB check in vma_migratable()
> - fix dependency in config ARCH_ENABLE_HUGEPAGE_MIGRATION
> - remove comment on hugepage_migration_support()
>
> Reported-by: Michael Ellerman <mpe@ellerman.id.au>
> Tested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: stable@vger.kernel.org # 3.12+

Acked-by: Hugh Dickins <hughd@google.com>

> [...]

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported()
  2014-05-30  3:04 ` Naoya Horiguchi
  2014-05-30  4:13   ` [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64 Naoya Horiguchi
@ 2014-05-30  4:13   ` Naoya Horiguchi
  2014-05-30 12:02     ` Hugh Dickins
  1 sibling, 1 reply; 16+ messages in thread
From: Naoya Horiguchi @ 2014-05-30  4:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Michael Ellerman, Benjamin Herrenschmidt, Tony Luck, linux-kernel, linux-mm, trinity

We already have a function named hugepage_supported(), and the similar
name hugepage_migration_support() is a bit uncomfortable, so let's rename
it hugepage_migration_supported().

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 include/linux/hugetlb.h | 4 ++--
 mm/hugetlb.c            | 2 +-
 mm/migrate.c            | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git v3.15-rc5.orig/include/linux/hugetlb.h v3.15-rc5/include/linux/hugetlb.h
index c9de64cf288d..9d35e514312b 100644
--- v3.15-rc5.orig/include/linux/hugetlb.h
+++ v3.15-rc5/include/linux/hugetlb.h
@@ -385,7 +385,7 @@ static inline pgoff_t basepage_index(struct page *page)

 extern void dissolve_free_huge_pages(unsigned long start_pfn,
 				     unsigned long end_pfn);
-static inline int hugepage_migration_support(struct hstate *h)
+static inline int hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
 	return huge_page_shift(h) == PMD_SHIFT;
@@ -441,7 +441,7 @@ static inline pgoff_t basepage_index(struct page *page)
 	return page->index;
 }
 #define dissolve_free_huge_pages(s, e)	do {} while (0)
-#define hugepage_migration_support(h)	0
+#define hugepage_migration_supported(h)	0

 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					   struct mm_struct *mm, pte_t *pte)
diff --git v3.15-rc5.orig/mm/hugetlb.c v3.15-rc5/mm/hugetlb.c
index ea42b584661a..83d936d12c1d 100644
--- v3.15-rc5.orig/mm/hugetlb.c
+++ v3.15-rc5/mm/hugetlb.c
@@ -545,7 +545,7 @@ static struct page *dequeue_huge_page_node(struct hstate *h, int nid)
 /* Movability of hugepages depends on migration support. */
 static inline gfp_t htlb_alloc_mask(struct hstate *h)
 {
-	if (hugepages_treat_as_movable || hugepage_migration_support(h))
+	if (hugepages_treat_as_movable || hugepage_migration_supported(h))
 		return GFP_HIGHUSER_MOVABLE;
 	else
 		return GFP_HIGHUSER;
diff --git v3.15-rc5.orig/mm/migrate.c v3.15-rc5/mm/migrate.c
index bed48809e5d0..15b589ae6aaf 100644
--- v3.15-rc5.orig/mm/migrate.c
+++ v3.15-rc5/mm/migrate.c
@@ -1031,7 +1031,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	 * tables or check whether the hugepage is pmd-based or not before
 	 * kicking migration.
 	 */
-	if (!hugepage_migration_support(page_hstate(hpage))) {
+	if (!hugepage_migration_supported(page_hstate(hpage))) {
 		putback_active_hugepage(hpage);
 		return -ENOSYS;
 	}
--
1.9.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported()
  2014-05-30  4:13 ` [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported() Naoya Horiguchi
@ 2014-05-30 12:02   ` Hugh Dickins
  0 siblings, 0 replies; 16+ messages in thread
From: Hugh Dickins @ 2014-05-30 12:02 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Hugh Dickins, Michael Ellerman, Benjamin Herrenschmidt, Tony Luck, linux-kernel, linux-mm, trinity

On Fri, 30 May 2014, Naoya Horiguchi wrote:

> We already have a function named hugepage_supported(), and the similar

hugepages_supported()

> name hugepage_migration_support() is a bit uncomfortable, so let's rename
> it hugepage_migration_supported().
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

Acked-by: Hugh Dickins <hughd@google.com>

> [...]

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 16+ messages in thread
end of thread, other threads: [~2014-05-30 12:03 UTC | newest]

Thread overview: 16+ messages
2014-05-28  8:32 BUG at mm/memory.c:1489! Michael Ellerman
2014-05-29  0:33 ` Hugh Dickins
2014-05-29  4:52   ` Naoya Horiguchi
2014-05-29 20:50     ` Hugh Dickins
2014-05-29  8:59   ` Michael Ellerman
2014-05-29 18:34     ` [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!) Naoya Horiguchi
2014-05-29 22:04       ` Hugh Dickins
2014-05-30  2:56         ` Naoya Horiguchi
2014-05-29 21:03     ` BUG at mm/memory.c:1489! Hugh Dickins
     [not found] ` <1401388474-mqnis5cp@n-horiguchi@ah.jp.nec.com>
2014-05-30  1:35   ` [PATCH] hugetlb: restrict hugepage_migration_support() to x86_64 (Re: BUG at mm/memory.c:1489!) Michael Ellerman
2014-05-30  1:52     ` Hugh Dickins
2014-05-30  3:04     ` Naoya Horiguchi
2014-05-30  4:13       ` [PATCH 1/2] hugetlb: restrict hugepage_migration_support() to x86_64 Naoya Horiguchi
2014-05-30 12:00         ` Hugh Dickins
2014-05-30  4:13       ` [PATCH 2/2] hugetlb: rename hugepage_migration_support() to ..._supported() Naoya Horiguchi
2014-05-30 12:02         ` Hugh Dickins