linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
@ 2016-05-09  8:34 Xishi Qiu
  2016-05-09  9:39 ` Vlastimil Babka
  0 siblings, 1 reply; 7+ messages in thread
From: Xishi Qiu @ 2016-05-09  8:34 UTC (permalink / raw)
  To: Andrew Morton, Mel Gorman, Michal Hocko, Vlastimil Babka,
	David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim,
	Taku Izumi, Alexander Duyck, Johannes Weiner
  Cc: Linux MM, LKML

If the pfn is not aligned to pageblock, the check pfn may access a next
pageblcok, and the next pageblock may belong to a next section. Because
struct page has not been alloced in the next section, so kernel panic. 

I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
it. The following log is from RHEL v7.1

[14111.611492] Stack:
[14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
[14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
[14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
[14111.611511] Call Trace:
[14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
[14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
[14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
[14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
[14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
[14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
[14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
[14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
[14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
[14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
[14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
[14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
[14111.611598] RSP <ffff880c393ebd80>
[14111.611600] CR2: ffffea0062000000
[14111.611604] ---[ end trace 9f780ed1def334c6 ]---
[14111.678586] Kernel panic - not syncing: Fatal exception

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
---
 mm/page_alloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 59de90d..9afc1bc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		return false;
 
 	pfn = page_to_pfn(page);
+	pfn = pfn & ~(pageblock_nr_pages - 1);
 	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
 		unsigned long check = pfn + iter;
 
-- 
1.8.3.1


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-09  8:34 [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() Xishi Qiu
@ 2016-05-09  9:39 ` Vlastimil Babka
  2016-05-09 10:02   ` Xishi Qiu
  0 siblings, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2016-05-09  9:39 UTC (permalink / raw)
  To: Xishi Qiu, Andrew Morton, Mel Gorman, Michal Hocko,
	David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim,
	Taku Izumi, Alexander Duyck, Johannes Weiner
  Cc: Linux MM, LKML

On 05/09/2016 10:34 AM, Xishi Qiu wrote:
> If the pfn is not aligned to pageblock, the check pfn may access a next
> pageblcok, and the next pageblock may belong to a next section. Because
> struct page has not been alloced in the next section, so kernel panic.
>
> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
> it. The following log is from RHEL v7.1

I think has_unmovable_pages() is wrong layer where to fix such problem, 
as I'll explain below.

> [14111.611492] Stack:
> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
> [14111.611511] Call Trace:
> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
> [14111.611598] RSP <ffff880c393ebd80>
> [14111.611600] CR2: ffffea0062000000
> [14111.611604] ---[ end trace 9f780ed1def334c6 ]---
> [14111.678586] Kernel panic - not syncing: Fatal exception
>
> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>

It's not CC'd stable, so how will this patch fix the older kernels? Also 
you should determine which upstream kernel versions are affected, not a 
RHEL derivative.
Also is the current upstream broken or not?

> ---
>   mm/page_alloc.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 59de90d..9afc1bc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>   		return false;
>
>   	pfn = page_to_pfn(page);
> +	pfn = pfn & ~(pageblock_nr_pages - 1);

I think it's wrong that has_unmovable_pages() would silently correct 
wrong input. See e.g. the call path from start_isolate_page_range -> 
set_migratetype_isolate -> has_unmovable_pages. In 
start_isolate_page_range() there are BUG_ON's to check the alignment. 
That would be more appropriate here as well (but use VM_BUG_ON please).

One danger of the self-correction is that the adjusted pfn might be of
a different zone, so let's not go there. If there's a call stack that 
passes unaligned page, it has to be fixed higher in the stack IMHO.

>   	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
>   		unsigned long check = pfn + iter;
>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-09  9:39 ` Vlastimil Babka
@ 2016-05-09 10:02   ` Xishi Qiu
  2016-05-12  9:27     ` Naoya Horiguchi
  2016-05-12 11:36     ` Vlastimil Babka
  0 siblings, 2 replies; 7+ messages in thread
From: Xishi Qiu @ 2016-05-09 10:02 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes,
	'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi,
	Alexander Duyck, Johannes Weiner, Linux MM, LKML

On 2016/5/9 17:39, Vlastimil Babka wrote:

> On 05/09/2016 10:34 AM, Xishi Qiu wrote:
>> If the pfn is not aligned to pageblock, the check pfn may access a next
>> pageblcok, and the next pageblock may belong to a next section. Because
>> struct page has not been alloced in the next section, so kernel panic.
>>
>> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
>> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
>> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
>> it. The following log is from RHEL v7.1
> 
> I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below.
> 
>> [14111.611492] Stack:
>> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
>> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
>> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
>> [14111.611511] Call Trace:
>> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
>> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
>> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
>> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
>> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
>> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
>> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
>> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
>> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
>> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
>> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
>> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
>> [14111.611598] RSP <ffff880c393ebd80>
>> [14111.611600] CR2: ffffea0062000000
>> [14111.611604] ---[ end trace 9f780ed1def334c6 ]---
>> [14111.678586] Kernel panic - not syncing: Fatal exception
>>
>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
> 
> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative.
> Also is the current upstream broken or not?
> 

OK, I'll resend it later. The current upstream has not this problem.

>> ---
>>   mm/page_alloc.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 59de90d..9afc1bc 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>>           return false;
>>
>>       pfn = page_to_pfn(page);
>> +    pfn = pfn & ~(pageblock_nr_pages - 1);
> 
> I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please).
> 

Yes, this path is correct.

But the older kernel like the following path has the problem.
soft_offline_page
	get_any_page
		__get_any_page
			set_migratetype_isolate
				has_unmovable_pages

> One danger of the self-correction is that the adjusted pfn might be of
> a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO.
>

How about change the pfn when calling set_migratetype_isolate()?
e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true);

Thanks,
Xishi Qiu

>>       for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
>>           unsigned long check = pfn + iter;
>>
>>
> 
> 
> .
> 



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-09 10:02   ` Xishi Qiu
@ 2016-05-12  9:27     ` Naoya Horiguchi
  2016-05-12 11:36     ` Vlastimil Babka
  1 sibling, 0 replies; 7+ messages in thread
From: Naoya Horiguchi @ 2016-05-12  9:27 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Michal Hocko,
	David Rientjes, 'Kirill A . Shutemov', Joonsoo Kim,
	Taku Izumi, Alexander Duyck, Johannes Weiner, Linux MM, LKML

On Mon, May 09, 2016 at 06:02:32PM +0800, Xishi Qiu wrote:
> On 2016/5/9 17:39, Vlastimil Babka wrote:
> 
> > On 05/09/2016 10:34 AM, Xishi Qiu wrote:
> >> If the pfn is not aligned to pageblock, the check pfn may access a next
> >> pageblcok, and the next pageblock may belong to a next section. Because
> >> struct page has not been alloced in the next section, so kernel panic.
> >>
> >> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
> >> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
> >> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
> >> it. The following log is from RHEL v7.1
> > 
> > I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below.
> > 
> >> [14111.611492] Stack:
> >> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
> >> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
> >> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
> >> [14111.611511] Call Trace:
> >> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
> >> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
> >> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
> >> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
> >> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
> >> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
> >> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
> >> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
> >> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
> >> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
> >> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
> >> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
> >> [14111.611598] RSP <ffff880c393ebd80>
> >> [14111.611600] CR2: ffffea0062000000
> >> [14111.611604] ---[ end trace 9f780ed1def334c6 ]---
> >> [14111.678586] Kernel panic - not syncing: Fatal exception
> >>
> >> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
> > 
> > It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative.
> > Also is the current upstream broken or not?
> > 
> 
> OK, I'll resend it later. The current upstream has not this problem.

This is because soft offline has stopped changing migratetype at commit
add05cecef80 ("mm: soft-offline: don't free target page in successful
page migration") (or v4.1-3344-gadd05cecef80).
# And the fix is available in RHEL7.2, but unfortunately not in RHEL7.1.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-09 10:02   ` Xishi Qiu
  2016-05-12  9:27     ` Naoya Horiguchi
@ 2016-05-12 11:36     ` Vlastimil Babka
  2016-05-12 11:58       ` Xishi Qiu
  1 sibling, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2016-05-12 11:36 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes,
	'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi,
	Alexander Duyck, Johannes Weiner, Linux MM, LKML

On 05/09/2016 12:02 PM, Xishi Qiu wrote:
> On 2016/5/9 17:39, Vlastimil Babka wrote:
>
>> On 05/09/2016 10:34 AM, Xishi Qiu wrote:
>>> If the pfn is not aligned to pageblock, the check pfn may access a next
>>> pageblcok, and the next pageblock may belong to a next section. Because
>>> struct page has not been alloced in the next section, so kernel panic.
>>>
>>> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
>>> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
>>> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
>>> it. The following log is from RHEL v7.1
>>
>> I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below.
>>
>>> [14111.611492] Stack:
>>> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
>>> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
>>> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
>>> [14111.611511] Call Trace:
>>> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
>>> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
>>> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
>>> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
>>> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
>>> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
>>> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
>>> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
>>> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
>>> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
>>> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
>>> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
>>> [14111.611598] RSP <ffff880c393ebd80>
>>> [14111.611600] CR2: ffffea0062000000
>>> [14111.611604] ---[ end trace 9f780ed1def334c6 ]---
>>> [14111.678586] Kernel panic - not syncing: Fatal exception
>>>
>>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
>>
>> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative.
>> Also is the current upstream broken or not?
>>
>
> OK, I'll resend it later. The current upstream has not this problem.
>
>>> ---
>>>    mm/page_alloc.c | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 59de90d..9afc1bc 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>>>            return false;
>>>
>>>        pfn = page_to_pfn(page);
>>> +    pfn = pfn & ~(pageblock_nr_pages - 1);
>>
>> I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please).
>>
>
> Yes, this path is correct.
>
> But the older kernel like the following path has the problem.
> soft_offline_page
> 	get_any_page
> 		__get_any_page
> 			set_migratetype_isolate
> 				has_unmovable_pages
>
>> One danger of the self-correction is that the adjusted pfn might be of
>> a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO.
>>
>
> How about change the pfn when calling set_migratetype_isolate()?
> e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true);

Sounds ok, please try.

>
> Thanks,
> Xishi Qiu
>
>>>        for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
>>>            unsigned long check = pfn + iter;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-12 11:36     ` Vlastimil Babka
@ 2016-05-12 11:58       ` Xishi Qiu
  2016-05-12 12:05         ` Vlastimil Babka
  0 siblings, 1 reply; 7+ messages in thread
From: Xishi Qiu @ 2016-05-12 11:58 UTC (permalink / raw)
  To: Vlastimil Babka, Naoya Horiguchi
  Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes,
	'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi,
	Alexander Duyck, Johannes Weiner, Linux MM, LKML

On 2016/5/12 19:36, Vlastimil Babka wrote:

> On 05/09/2016 12:02 PM, Xishi Qiu wrote:
>> On 2016/5/9 17:39, Vlastimil Babka wrote:
>>
>>> On 05/09/2016 10:34 AM, Xishi Qiu wrote:
>>>> If the pfn is not aligned to pageblock, the check pfn may access a next
>>>> pageblcok, and the next pageblock may belong to a next section. Because
>>>> struct page has not been alloced in the next section, so kernel panic.
>>>>
>>>> I find the caller of has_unmovable_pages() has passed a aligned pfn, so it
>>>> doesn't have this problem. But the earlier kernel version(e.g. v3.10) has.
>>>> e.g. echo xxx > /sys/devices/system/memory/soft_offline_page could trigger
>>>> it. The following log is from RHEL v7.1
>>>
>>> I think has_unmovable_pages() is wrong layer where to fix such problem, as I'll explain below.
>>>
>>>> [14111.611492] Stack:
>>>> [14111.611494] ffffffff8115d952 0000000000000000 01ff880c393ebe40 ffff880c7ffd9000
>>>> [14111.611500] ffffea0061ffffc0 ffff880c7ffd9068 0000000000000286 0000000000000001
>>>> [14111.611505] ffff880c393ebe10 ffffffff811c265a 000000000187ffff 0000000000000200
>>>> [14111.611511] Call Trace:
>>>> [14111.611516] [<ffffffff8115d952>] ? has_unmovable_pages+0xd2/0x130
>>>> [14111.611521] [<ffffffff811c265a>] set_migratetype_isolate+0xda/0x170
>>>> [14111.611526] [<ffffffff811c187a>] soft_offline_page+0x9a/0x590
>>>> [14111.611530] [<ffffffff812e7cab>] ? _kstrtoull+0x3b/0xa0
>>>> [14111.611535] [<ffffffff813e158f>] store_soft_offline_page+0xaf/0xf0
>>>> [14111.611539] [<ffffffff813cae18>] dev_attr_store+0x18/0x30
>>>> [14111.611544] [<ffffffff8123c046>] sysfs_write_file+0xc6/0x140
>>>> [14111.611548] [<ffffffff811c5b5d>] vfs_write+0xbd/0x1e0
>>>> [14111.611551] [<ffffffff811c65a8>] SyS_write+0x58/0xb0
>>>> [14111.611556] [<ffffffff8160f509>] system_call_fastpath+0x16/0x1b
>>>> [14111.611559] Code: 66 66 66 90 48 83 e0 fd 0c a0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 66 66 66 90 48 83 c8 42 0c a0 5d c3 90 66 66 66 66 90 <8b> 07 25 00 c0 00 00 75 02 f3 c3 48 8b 07 f6 c4 80 75 0f 48 81
>>>> [14111.611594] RIP [<ffffffff81199fc5>] PageHuge+0x5/0x40
>>>> [14111.611598] RSP <ffff880c393ebd80>
>>>> [14111.611600] CR2: ffffea0062000000
>>>> [14111.611604] ---[ end trace 9f780ed1def334c6 ]---
>>>> [14111.678586] Kernel panic - not syncing: Fatal exception
>>>>
>>>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
>>>
>>> It's not CC'd stable, so how will this patch fix the older kernels? Also you should determine which upstream kernel versions are affected, not a RHEL derivative.
>>> Also is the current upstream broken or not?
>>>
>>
>> OK, I'll resend it later. The current upstream has not this problem.
>>
>>>> ---
>>>>    mm/page_alloc.c | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 59de90d..9afc1bc 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -6842,6 +6842,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>>>>            return false;
>>>>
>>>>        pfn = page_to_pfn(page);
>>>> +    pfn = pfn & ~(pageblock_nr_pages - 1);
>>>
>>> I think it's wrong that has_unmovable_pages() would silently correct wrong input. See e.g. the call path from start_isolate_page_range -> set_migratetype_isolate -> has_unmovable_pages. In start_isolate_page_range() there are BUG_ON's to check the alignment. That would be more appropriate here as well (but use VM_BUG_ON please).
>>>
>>
>> Yes, this path is correct.
>>
>> But the older kernel like the following path has the problem.
>> soft_offline_page
>>     get_any_page
>>         __get_any_page
>>             set_migratetype_isolate
>>                 has_unmovable_pages
>>
>>> One danger of the self-correction is that the adjusted pfn might be of
>>> a different zone, so let's not go there. If there's a call stack that passes unaligned page, it has to be fixed higher in the stack IMHO.
>>>
>>
>> How about change the pfn when calling set_migratetype_isolate()?
>> e.g. set_migratetype_isolate((p & ~(pageblock_nr_pages - 1)), true);
> 
> Sounds ok, please try.
> 

Hi Vlastimil and Naoya,

The mainline doesn't have this problem, because commit
add05cecef80 ("mm: soft-offline: don't free target page in successful
page migration") fixed it in v4.2.

I guess the above patch can't be applied to older kernel directly.
So shall we rewrite a new one or backport the whole patches which it depend?  

Thanks,
Xishi Qiu

>>
>> Thanks,
>> Xishi Qiu
>>
>>>>        for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
>>>>            unsigned long check = pfn + iter;
> 
> 
> .
> 



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix pfn spans two sections in has_unmovable_pages()
  2016-05-12 11:58       ` Xishi Qiu
@ 2016-05-12 12:05         ` Vlastimil Babka
  0 siblings, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2016-05-12 12:05 UTC (permalink / raw)
  To: Xishi Qiu, Naoya Horiguchi
  Cc: Andrew Morton, Mel Gorman, Michal Hocko, David Rientjes,
	'Kirill A . Shutemov', Joonsoo Kim, Taku Izumi,
	Alexander Duyck, Johannes Weiner, Linux MM, LKML

On 05/12/2016 01:58 PM, Xishi Qiu wrote:
> On 2016/5/12 19:36, Vlastimil Babka wrote:
>
>> On 05/09/2016 12:02 PM, Xishi Qiu wrote:
>>
>> Sounds ok, please try.
>>
>
> Hi Vlastimil and Naoya,
>
> The mainline doesn't have this problem, because commit
> add05cecef80 ("mm: soft-offline: don't free target page in successful
> page migration") fixed it in v4.2.
>
> I guess the above patch can't be applied to older kernel directly.
> So shall we rewrite a new one or backport the whole patches which it depend?

I think it makes most sense here to write a <4.2 specific patch and send 
it just to stable. If the alternative of backporting add05cecef80 would 
be disruptive, mention that in the changelog. Try to pinpoint the commit 
that introduced the bug so the fix can have a proper "Fixes:" header.

> Thanks,
> Xishi Qiu
>
>>>
>>> Thanks,
>>> Xishi Qiu
>>>
>>>>>         for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
>>>>>             unsigned long check = pfn + iter;
>>
>>
>> .
>>
>
>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-05-12 12:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-09  8:34 [PATCH] mm: fix pfn spans two sections in has_unmovable_pages() Xishi Qiu
2016-05-09  9:39 ` Vlastimil Babka
2016-05-09 10:02   ` Xishi Qiu
2016-05-12  9:27     ` Naoya Horiguchi
2016-05-12 11:36     ` Vlastimil Babka
2016-05-12 11:58       ` Xishi Qiu
2016-05-12 12:05         ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).