* [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() @ 2016-05-09 8:34 Xishi Qiu 2016-05-09 9:39 ` Vlastimil Babka 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2016-05-09 8:34 UTC (permalink / raw) To: Andrew Morton, Mel Gorman, Michal Hocko, Vlastimil Babka, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner Cc: Linux MM, LKML If the pfn is not aligned to pageblock, the check pfn may access a next pageblcok, and the next pageblock may belong to a next section. Because struct page has not been alloced in the next section, so kernel panic. I find the caller of has_unmovable_pages() has passed a aligned pfn, so it doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger it. The following log is from RHEL v7.1 [14111.611492] Stack: [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 [14111.611511] Call Trace: [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 [14111.611598] RSP <ffff880c393ebd80> [14111.611600] CR2: ffffea0062000000 [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- [14111.678586] Kernel panic - not syncing: Fatal exception Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> --- mm/page_alloc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 59de90d..9afc1bc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, return false; pfn = page_to_pfn(page); + pfn = pfn & ~(pageblock_nr_pages - 1); for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { unsigned long check = pfn + iter; -- 1.8.3.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-09 8:34 [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() Xishi Qiu @ 2016-05-09 9:39 ` Vlastimil Babka 2016-05-09 10:02 ` Xishi Qiu 0 siblings, 1 reply; 7+ messages in thread From: Vlastimil Babka @ 2016-05-09 9:39 UTC (permalink / raw) To: Xishi Qiu, Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner Cc: Linux MM, LKML On 05/09/2016 10:34 AM, Xishi Qiu wrote: > If the pfn is not aligned to pageblock, the check pfn may access a next > pageblcok, and the next pageblock may belong to a next section. Because > struct page has not been alloced in the next section, so kernel panic. > > I find the caller of has_unmovable_pages() has passed a aligned pfn, so it > doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. > e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger > it. The following log is from RHEL v7.1 I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below. > [14111.611492] Stack: > [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 > [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 > [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 > [14111.611511] Call Trace: > [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 > [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 > [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 > [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 > [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 > [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 > [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 > [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 > [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 > [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b > [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 > [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 > [14111.611598] RSP <ffff880c393ebd80> > [14111.611600] CR2: ffffea0062000000 > [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- > [14111.678586] Kernel panic - not syncing: Fatal exception > > Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative. Also is the current upstream broken or not? > --- > mm/page_alloc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 59de90d..9afc1bc 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > return false; > > pfn = page_to_pfn(page); > + pfn = pfn & ~(pageblock_nr_pages - 1); I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please). One danger of the self-correction is that the adjusted pfn might be of a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO. > for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { > unsigned long check = pfn + iter; > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-09 9:39 ` Vlastimil Babka @ 2016-05-09 10:02 ` Xishi Qiu 2016-05-12 9:27 ` Naoya Horiguchi 2016-05-12 11:36 ` Vlastimil Babka 0 siblings, 2 replies; 7+ messages in thread From: Xishi Qiu @ 2016-05-09 10:02 UTC (permalink / raw) To: Vlastimil Babka Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML On 2016/5/9 17:39, Vlastimil Babka wrote: > On 05/09/2016 10:34 AM, Xishi Qiu wrote: >> If the pfn is not aligned to pageblock, the check pfn may access a next >> pageblcok, and the next pageblock may belong to a next section. Because >> struct page has not been alloced in the next section, so kernel panic. >> >> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it >> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. >> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger >> it. The following log is from RHEL v7.1 > > I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below. > >> [14111.611492] Stack: >> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 >> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 >> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 >> [14111.611511] Call Trace: >> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 >> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 >> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 >> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 >> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 >> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 >> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 >> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 >> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 >> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b >> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 >> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 >> [14111.611598] RSP <ffff880c393ebd80> >> [14111.611600] CR2: ffffea0062000000 >> [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- >> [14111.678586] Kernel panic - not syncing: Fatal exception >> >> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> > > It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative. > Also is the current upstream broken or not? > OK, I'll resend it later. The current upstream has not this problem. >> --- >> mm/page_alloc.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 59de90d..9afc1bc 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, >> return false; >> >> pfn = page_to_pfn(page); >> + pfn = pfn & ~(pageblock_nr_pages - 1); > > I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please). > Yes, this path is correct. But the older kernel like the following path has the problem. soft_offline_page get_any_page __get_any_page set_migratetype_isolate has_unmovable_pages > One danger of the self-correction is that the adjusted pfn might be of > a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO. > How about change the pfn when calling set_migratetype_isolate()? e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true); Thanks, Xishi Qiu >> for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { >> unsigned long check = pfn + iter; >> >> > > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-09 10:02 ` Xishi Qiu @ 2016-05-12 9:27 ` Naoya Horiguchi 2016-05-12 11:36 ` Vlastimil Babka 1 sibling, 0 replies; 7+ messages in thread From: Naoya Horiguchi @ 2016-05-12 9:27 UTC (permalink / raw) To: Xishi Qiu Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML On Mon, May 09, 2016 at 06:02:32PM +0800, Xishi Qiu wrote: > On 2016/5/9 17:39, Vlastimil Babka wrote: > > > On 05/09/2016 10:34 AM, Xishi Qiu wrote: > >> If the pfn is not aligned to pageblock, the check pfn may access a next > >> pageblcok, and the next pageblock may belong to a next section. Because > >> struct page has not been alloced in the next section, so kernel panic. > >> > >> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it > >> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. > >> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger > >> it. The following log is from RHEL v7.1 > > > > I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below. > > > >> [14111.611492] Stack: > >> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 > >> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 > >> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 > >> [14111.611511] Call Trace: > >> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 > >> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 > >> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 > >> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 > >> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 > >> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 > >> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 > >> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 > >> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 > >> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b > >> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 > >> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 > >> [14111.611598] RSP <ffff880c393ebd80> > >> [14111.611600] CR2: ffffea0062000000 > >> [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- > >> [14111.678586] Kernel panic - not syncing: Fatal exception > >> > >> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> > > > > It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative. > > Also is the current upstream broken or not? > > > > OK, I'll resend it later. The current upstream has not this problem. This is because soft offline has stopped changing migratetype at commit add05cecef80 ("mm: soft-offline: don't free target page in successful page migration") (or v4.1-3344-gadd05cecef80). # And the fix is available in RHEL7.2, but unfortunately not in RHEL7.1. Thanks, Naoya Horiguchi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-09 10:02 ` Xishi Qiu 2016-05-12 9:27 ` Naoya Horiguchi @ 2016-05-12 11:36 ` Vlastimil Babka 2016-05-12 11:58 ` Xishi Qiu 1 sibling, 1 reply; 7+ messages in thread From: Vlastimil Babka @ 2016-05-12 11:36 UTC (permalink / raw) To: Xishi Qiu Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML On 05/09/2016 12:02 PM, Xishi Qiu wrote: > On 2016/5/9 17:39, Vlastimil Babka wrote: > >> On 05/09/2016 10:34 AM, Xishi Qiu wrote: >>> If the pfn is not aligned to pageblock, the check pfn may access a next >>> pageblcok, and the next pageblock may belong to a next section. Because >>> struct page has not been alloced in the next section, so kernel panic. >>> >>> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it >>> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. >>> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger >>> it. The following log is from RHEL v7.1 >> >> I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below. >> >>> [14111.611492] Stack: >>> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 >>> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 >>> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 >>> [14111.611511] Call Trace: >>> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 >>> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 >>> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 >>> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 >>> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 >>> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 >>> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 >>> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 >>> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 >>> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b >>> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 >>> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 >>> [14111.611598] RSP <ffff880c393ebd80> >>> [14111.611600] CR2: ffffea0062000000 >>> [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- >>> [14111.678586] Kernel panic - not syncing: Fatal exception >>> >>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> >> >> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative. >> Also is the current upstream broken or not? >> > > OK, I'll resend it later. The current upstream has not this problem. > >>> --- >>> mm/page_alloc.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index 59de90d..9afc1bc 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, >>> return false; >>> >>> pfn = page_to_pfn(page); >>> + pfn = pfn & ~(pageblock_nr_pages - 1); >> >> I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please). >> > > Yes, this path is correct. > > But the older kernel like the following path has the problem. > soft_offline_page > get_any_page > __get_any_page > set_migratetype_isolate > has_unmovable_pages > >> One danger of the self-correction is that the adjusted pfn might be of >> a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO. >> > > How about change the pfn when calling set_migratetype_isolate()? > e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true); Sounds ok, please try. > > Thanks, > Xishi Qiu > >>> for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { >>> unsigned long check = pfn + iter; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-12 11:36 ` Vlastimil Babka @ 2016-05-12 11:58 ` Xishi Qiu 2016-05-12 12:05 ` Vlastimil Babka 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2016-05-12 11:58 UTC (permalink / raw) To: Vlastimil Babka, Naoya Horiguchi Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML On 2016/5/12 19:36, Vlastimil Babka wrote: > On 05/09/2016 12:02 PM, Xishi Qiu wrote: >> On 2016/5/9 17:39, Vlastimil Babka wrote: >> >>> On 05/09/2016 10:34 AM, Xishi Qiu wrote: >>>> If the pfn is not aligned to pageblock, the check pfn may access a next >>>> pageblcok, and the next pageblock may belong to a next section. Because >>>> struct page has not been alloced in the next section, so kernel panic. >>>> >>>> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it >>>> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has. >>>> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger >>>> it. The following log is from RHEL v7.1 >>> >>> I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below. >>> >>>> [14111.611492] Stack: >>>> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000 >>>> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001 >>>> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200 >>>> [14111.611511] Call Trace: >>>> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130 >>>> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170 >>>> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590 >>>> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0 >>>> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0 >>>> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30 >>>> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140 >>>> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0 >>>> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0 >>>> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b >>>> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81 >>>> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40 >>>> [14111.611598] RSP <ffff880c393ebd80> >>>> [14111.611600] CR2: ffffea0062000000 >>>> [14111.611604] ---[ end trace 9f780ed1def334c6 ]--- >>>> [14111.678586] Kernel panic - not syncing: Fatal exception >>>> >>>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> >>> >>> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative. >>> Also is the current upstream broken or not? >>> >> >> OK, I'll resend it later. The current upstream has not this problem. >> >>>> --- >>>> mm/page_alloc.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>> index 59de90d..9afc1bc 100644 >>>> --- a/mm/page_alloc.c >>>> +++ b/mm/page_alloc.c >>>> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, >>>> return false; >>>> >>>> pfn = page_to_pfn(page); >>>> + pfn = pfn & ~(pageblock_nr_pages - 1); >>> >>> I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please). >>> >> >> Yes, this path is correct. >> >> But the older kernel like the following path has the problem. >> soft_offline_page >> get_any_page >> __get_any_page >> set_migratetype_isolate >> has_unmovable_pages >> >>> One danger of the self-correction is that the adjusted pfn might be of >>> a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO. >>> >> >> How about change the pfn when calling set_migratetype_isolate()? >> e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true); > > Sounds ok, please try. > Hi Vlastimil and Naoya, The mainline doesn't have this problem, because commit add05cecef80 ("mm: soft-offline: don't free target page in successful page migration") fixed it in v4.2. I guess the above patch can't be applied to older kernel directly. So shall we rewrite a new one or backport the whole patches which it depend? Thanks, Xishi Qiu >> >> Thanks, >> Xishi Qiu >> >>>> for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { >>>> unsigned long check = pfn + iter; > > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() 2016-05-12 11:58 ` Xishi Qiu @ 2016-05-12 12:05 ` Vlastimil Babka 0 siblings, 0 replies; 7+ messages in thread From: Vlastimil Babka @ 2016-05-12 12:05 UTC (permalink / raw) To: Xishi Qiu, Naoya Horiguchi Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML On 05/12/2016 01:58 PM, Xishi Qiu wrote: > On 2016/5/12 19:36, Vlastimil Babka wrote: > >> On 05/09/2016 12:02 PM, Xishi Qiu wrote: >> >> Sounds ok, please try. >> > > Hi Vlastimil and Naoya, > > The mainline doesn't have this problem, because commit > add05cecef80 ("mm: soft-offline: don't free target page in successful > page migration") fixed it in v4.2. > > I guess the above patch can't be applied to older kernel directly. > So shall we rewrite a new one or backport the whole patches which it depend? I think it makes most sense here to write a <4.2 specific patch and send it just to stable. If the alternative of backporting add05cecef80 would be disruptive, mention that in the changelog. Try to pinpoint the commit that introduced the bug so the fix can have a proper "Fixes:" header. > Thanks, > Xishi Qiu > >>> >>> Thanks, >>> Xishi Qiu >>> >>>>> for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) { >>>>> unsigned long check = pfn + iter; >> >> >> . >> > > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-05-12 12:05 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-05-09 8:34 [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() Xishi Qiu 2016-05-09 9:39 ` Vlastimil Babka 2016-05-09 10:02 ` Xishi Qiu 2016-05-12 9:27 ` Naoya Horiguchi 2016-05-12 11:36 ` Vlastimil Babka 2016-05-12 11:58 ` Xishi Qiu 2016-05-12 12:05 ` Vlastimil Babka
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).