* [PATCH] mm: check pfn_valid first in zero_resv_unavail
@ 2017-11-30  6:04 Dave Young
  2017-11-30  9:35 ` Michal Hocko
  2017-11-30 17:25 ` Pavel Tatashin
  0 siblings, 2 replies; 9+ messages in thread
From: Dave Young @ 2017-11-30  6:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: pasha.tatashin, linux-mm, akpm
With latest kernel I get below bug while testing kdump:
[    0.000000] BUG: unable to handle kernel paging request at ffffea00034b1040
[    0.000000] IP: zero_resv_unavail+0xbd/0x126
[    0.000000] PGD 37b98067 P4D 37b98067 PUD 37b97067 PMD 0 
[    0.000000] Oops: 0002 [#1] SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1+ #316
[    0.000000] Hardware name: LENOVO 20ARS1BJ02/20ARS1BJ02, BIOS GJET92WW (2.42 ) 03/03/2017
[    0.000000] task: ffffffff81a0e4c0 task.stack: ffffffff81a00000
[    0.000000] RIP: 0010:zero_resv_unavail+0xbd/0x126
[    0.000000] RSP: 0000:ffffffff81a03d88 EFLAGS: 00010006
[    0.000000] RAX: 0000000000000000 RBX: ffffea00034b1040 RCX: 0000000000000010
[    0.000000] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffffea00034b1040
[    0.000000] RBP: 00000000000d2c41 R08: 00000000000000c0 R09: 0000000000000a0d
[    0.000000] R10: 0000000000000002 R11: 0000000000007f01 R12: ffffffff81a03d90
[    0.000000] R13: ffffea0000000000 R14: 0000000000000063 R15: 0000000000000062
[    0.000000] FS:  0000000000000000(0000) GS:ffffffff81c73000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: ffffea00034b1040 CR3: 0000000037609000 CR4: 00000000000606b0
[    0.000000] Call Trace:
[    0.000000]  ? free_area_init_nodes+0x640/0x664
[    0.000000]  ? zone_sizes_init+0x58/0x72
[    0.000000]  ? setup_arch+0xb50/0xc6c
[    0.000000]  ? start_kernel+0x64/0x43d
[    0.000000]  ? secondary_startup_64+0xa5/0xb0
[    0.000000] Code: c1 e8 0c 48 39 d8 76 27 48 89 de 48 c1 e3 06 48 c7 c7 7a 87 79 81 e8 b0 c0 3e ff 4c 01 eb b9 10 00 00 00 31 c0 48 89 df 49 ff c6 <f3> ab eb bc 6a 00 49 
c7 c0 f0 93 d1 81 31 d2 83 ce ff 41 54 49 
[    0.000000] RIP: zero_resv_unavail+0xbd/0x126 RSP: ffffffff81a03d88
[    0.000000] CR2: ffffea00034b1040
[    0.000000] ---[ end trace f5ba9e8f73c7ee26 ]---
This is introduced with below commit:
commit a4a3ede2132ae0863e2d43e06f9b5697c51a7a3b
Author: Pavel Tatashin <pasha.tatashin@oracle.com>
Date:   Wed Nov 15 17:36:31 2017 -0800
    mm: zero reserved and unavailable struct pages
The reason is some efi reserved boot ranges is not reported in E820 ram.
In my case it is a bgrt buffer:
efi: mem00: [Boot Data          |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x00000000d2c41000-0x00000000d2c85fff] (0MB)
Use "add_efi_memmap" can workaround the problem with another fix:
https://lkml.org/lkml/2017/11/30/5
In zero_resv_unavail it would be better to check pfn_valid first before zero
the page struct. This fixes the problem and potential other similar problems.
Signed-off-by: Dave Young <dyoung@redhat.com>
---
 mm/page_alloc.c |    2 ++
 1 file changed, 2 insertions(+)
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -6253,6 +6253,8 @@ void __paginginit zero_resv_unavail(void
 	pgcnt = 0;
 	for_each_resv_unavail_range(i, &start, &end) {
 		for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
+			if (!pfn_valid(pfn))
+				continue;
 			mm_zero_struct_page(pfn_to_page(pfn));
 			pgcnt++;
 		}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-11-30  6:04 [PATCH] mm: check pfn_valid first in zero_resv_unavail Dave Young
@ 2017-11-30  9:35 ` Michal Hocko
  2017-12-01  8:56   ` Dave Young
  2017-11-30 17:25 ` Pavel Tatashin
  1 sibling, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2017-11-30  9:35 UTC (permalink / raw)
  To: Dave Young; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On Thu 30-11-17 14:04:31, Dave Young wrote:
> With latest kernel I get below bug while testing kdump:
> 
> [    0.000000] BUG: unable to handle kernel paging request at ffffea00034b1040
> [    0.000000] IP: zero_resv_unavail+0xbd/0x126
> [    0.000000] PGD 37b98067 P4D 37b98067 PUD 37b97067 PMD 0 
> [    0.000000] Oops: 0002 [#1] SMP
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1+ #316
> [    0.000000] Hardware name: LENOVO 20ARS1BJ02/20ARS1BJ02, BIOS GJET92WW (2.42 ) 03/03/2017
> [    0.000000] task: ffffffff81a0e4c0 task.stack: ffffffff81a00000
> [    0.000000] RIP: 0010:zero_resv_unavail+0xbd/0x126
> [    0.000000] RSP: 0000:ffffffff81a03d88 EFLAGS: 00010006
> [    0.000000] RAX: 0000000000000000 RBX: ffffea00034b1040 RCX: 0000000000000010
> [    0.000000] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffffea00034b1040
> [    0.000000] RBP: 00000000000d2c41 R08: 00000000000000c0 R09: 0000000000000a0d
> [    0.000000] R10: 0000000000000002 R11: 0000000000007f01 R12: ffffffff81a03d90
> [    0.000000] R13: ffffea0000000000 R14: 0000000000000063 R15: 0000000000000062
> [    0.000000] FS:  0000000000000000(0000) GS:ffffffff81c73000(0000) knlGS:0000000000000000
> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.000000] CR2: ffffea00034b1040 CR3: 0000000037609000 CR4: 00000000000606b0
> [    0.000000] Call Trace:
> [    0.000000]  ? free_area_init_nodes+0x640/0x664
> [    0.000000]  ? zone_sizes_init+0x58/0x72
> [    0.000000]  ? setup_arch+0xb50/0xc6c
> [    0.000000]  ? start_kernel+0x64/0x43d
> [    0.000000]  ? secondary_startup_64+0xa5/0xb0
> [    0.000000] Code: c1 e8 0c 48 39 d8 76 27 48 89 de 48 c1 e3 06 48 c7 c7 7a 87 79 81 e8 b0 c0 3e ff 4c 01 eb b9 10 00 00 00 31 c0 48 89 df 49 ff c6 <f3> ab eb bc 6a 00 49 
> c7 c0 f0 93 d1 81 31 d2 83 ce ff 41 54 49 
> [    0.000000] RIP: zero_resv_unavail+0xbd/0x126 RSP: ffffffff81a03d88
> [    0.000000] CR2: ffffea00034b1040
> [    0.000000] ---[ end trace f5ba9e8f73c7ee26 ]---
> 
> This is introduced with below commit:
> commit a4a3ede2132ae0863e2d43e06f9b5697c51a7a3b
> Author: Pavel Tatashin <pasha.tatashin@oracle.com>
> Date:   Wed Nov 15 17:36:31 2017 -0800
> 
>     mm: zero reserved and unavailable struct pages
the usual format when mentioning a commit is a4a3ede2132a ("mm: zero
reserved and unavailable struct pages").
 
> The reason is some efi reserved boot ranges is not reported in E820 ram.
> In my case it is a bgrt buffer:
> efi: mem00: [Boot Data          |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x00000000d2c41000-0x00000000d2c85fff] (0MB)
> 
> Use "add_efi_memmap" can workaround the problem with another fix:
> https://lkml.org/lkml/2017/11/30/5
lkml.org tends to be broken a lot, please use
http://lkml.kernel.org/r/MSG_ID instead. It would be
http://lkml.kernel.org/r/20171130052327.GA3500@dhcp-128-65.nay.redhat.com
here
> In zero_resv_unavail it would be better to check pfn_valid first before zero
> the page struct. This fixes the problem and potential other similar problems.
Can we exclude that range from the memblock allocator instead? E.g. what
happens if somebody allocates from that range?
> Signed-off-by: Dave Young <dyoung@redhat.com>
> ---
>  mm/page_alloc.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> --- linux.orig/mm/page_alloc.c
> +++ linux/mm/page_alloc.c
> @@ -6253,6 +6253,8 @@ void __paginginit zero_resv_unavail(void
>  	pgcnt = 0;
>  	for_each_resv_unavail_range(i, &start, &end) {
>  		for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
> +			if (!pfn_valid(pfn))
> +				continue;
>  			mm_zero_struct_page(pfn_to_page(pfn));
>  			pgcnt++;
>  		}
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-11-30  9:35 ` Michal Hocko
@ 2017-12-01  8:56   ` Dave Young
  2017-12-01  9:19     ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Young @ 2017-12-01  8:56 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On 11/30/17 at 10:35am, Michal Hocko wrote:
> On Thu 30-11-17 14:04:31, Dave Young wrote:
> > With latest kernel I get below bug while testing kdump:
> > 
> > [    0.000000] BUG: unable to handle kernel paging request at ffffea00034b1040
> > [    0.000000] IP: zero_resv_unavail+0xbd/0x126
> > [    0.000000] PGD 37b98067 P4D 37b98067 PUD 37b97067 PMD 0 
> > [    0.000000] Oops: 0002 [#1] SMP
> > [    0.000000] Modules linked in:
> > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1+ #316
> > [    0.000000] Hardware name: LENOVO 20ARS1BJ02/20ARS1BJ02, BIOS GJET92WW (2.42 ) 03/03/2017
> > [    0.000000] task: ffffffff81a0e4c0 task.stack: ffffffff81a00000
> > [    0.000000] RIP: 0010:zero_resv_unavail+0xbd/0x126
> > [    0.000000] RSP: 0000:ffffffff81a03d88 EFLAGS: 00010006
> > [    0.000000] RAX: 0000000000000000 RBX: ffffea00034b1040 RCX: 0000000000000010
> > [    0.000000] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffffea00034b1040
> > [    0.000000] RBP: 00000000000d2c41 R08: 00000000000000c0 R09: 0000000000000a0d
> > [    0.000000] R10: 0000000000000002 R11: 0000000000007f01 R12: ffffffff81a03d90
> > [    0.000000] R13: ffffea0000000000 R14: 0000000000000063 R15: 0000000000000062
> > [    0.000000] FS:  0000000000000000(0000) GS:ffffffff81c73000(0000) knlGS:0000000000000000
> > [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    0.000000] CR2: ffffea00034b1040 CR3: 0000000037609000 CR4: 00000000000606b0
> > [    0.000000] Call Trace:
> > [    0.000000]  ? free_area_init_nodes+0x640/0x664
> > [    0.000000]  ? zone_sizes_init+0x58/0x72
> > [    0.000000]  ? setup_arch+0xb50/0xc6c
> > [    0.000000]  ? start_kernel+0x64/0x43d
> > [    0.000000]  ? secondary_startup_64+0xa5/0xb0
> > [    0.000000] Code: c1 e8 0c 48 39 d8 76 27 48 89 de 48 c1 e3 06 48 c7 c7 7a 87 79 81 e8 b0 c0 3e ff 4c 01 eb b9 10 00 00 00 31 c0 48 89 df 49 ff c6 <f3> ab eb bc 6a 00 49 
> > c7 c0 f0 93 d1 81 31 d2 83 ce ff 41 54 49 
> > [    0.000000] RIP: zero_resv_unavail+0xbd/0x126 RSP: ffffffff81a03d88
> > [    0.000000] CR2: ffffea00034b1040
> > [    0.000000] ---[ end trace f5ba9e8f73c7ee26 ]---
> > 
> > This is introduced with below commit:
> > commit a4a3ede2132ae0863e2d43e06f9b5697c51a7a3b
> > Author: Pavel Tatashin <pasha.tatashin@oracle.com>
> > Date:   Wed Nov 15 17:36:31 2017 -0800
> > 
> >     mm: zero reserved and unavailable struct pages
> 
> the usual format when mentioning a commit is a4a3ede2132a ("mm: zero
> reserved and unavailable struct pages").
Will use the short flavor commit
>  
> > The reason is some efi reserved boot ranges is not reported in E820 ram.
> > In my case it is a bgrt buffer:
> > efi: mem00: [Boot Data          |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x00000000d2c41000-0x00000000d2c85fff] (0MB)
> > 
> > Use "add_efi_memmap" can workaround the problem with another fix:
> > https://lkml.org/lkml/2017/11/30/5
> 
> lkml.org tends to be broken a lot, please use
> http://lkml.kernel.org/r/MSG_ID instead. It would be
> http://lkml.kernel.org/r/20171130052327.GA3500@dhcp-128-65.nay.redhat.com
> here
I also say unstable lkml, but still used to use it since it is simpler
but if you prefer lkml.kernel.org I can change it.
> 
> > In zero_resv_unavail it would be better to check pfn_valid first before zero
> > the page struct. This fixes the problem and potential other similar problems.
> 
> Can we exclude that range from the memblock allocator instead? E.g. what
> happens if somebody allocates from that range?
It is a EFI BGRT image buffer provided by firmware, they are reserved
always and can not be used to allocate memory.
> 
> > Signed-off-by: Dave Young <dyoung@redhat.com>
> > ---
> >  mm/page_alloc.c |    2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > --- linux.orig/mm/page_alloc.c
> > +++ linux/mm/page_alloc.c
> > @@ -6253,6 +6253,8 @@ void __paginginit zero_resv_unavail(void
> >  	pgcnt = 0;
> >  	for_each_resv_unavail_range(i, &start, &end) {
> >  		for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
> > +			if (!pfn_valid(pfn))
> > +				continue;
> >  			mm_zero_struct_page(pfn_to_page(pfn));
> >  			pgcnt++;
> >  		}
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> -- 
> Michal Hocko
> SUSE Labs
Thanks
Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-12-01  8:56   ` Dave Young
@ 2017-12-01  9:19     ` Michal Hocko
  2017-12-01  9:29       ` Dave Young
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2017-12-01  9:19 UTC (permalink / raw)
  To: Dave Young; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On Fri 01-12-17 16:56:57, Dave Young wrote:
> On 11/30/17 at 10:35am, Michal Hocko wrote:
[...]
> > Can we exclude that range from the memblock allocator instead? E.g. what
> > happens if somebody allocates from that range?
> 
> It is a EFI BGRT image buffer provided by firmware, they are reserved
> always and can not be used to allocate memory.
Hmm, I see but I was actually suggesting to remove this range from the
memblock allocator altogether (memblock_remove) as it shouldn't be there
in the first place.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-12-01  9:19     ` Michal Hocko
@ 2017-12-01  9:29       ` Dave Young
  2017-12-01  9:42         ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Young @ 2017-12-01  9:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On 12/01/17 at 10:19am, Michal Hocko wrote:
> On Fri 01-12-17 16:56:57, Dave Young wrote:
> > On 11/30/17 at 10:35am, Michal Hocko wrote:
> [...]
> > > Can we exclude that range from the memblock allocator instead? E.g. what
> > > happens if somebody allocates from that range?
> > 
> > It is a EFI BGRT image buffer provided by firmware, they are reserved
> > always and can not be used to allocate memory.
> 
> Hmm, I see but I was actually suggesting to remove this range from the
> memblock allocator altogether (memblock_remove) as it shouldn't be there
> in the first place.
Oh, I'm not sure because it is introduced as a way for efi to reserve
boot services areas to be persistent across kexec reboot. See
drivers/firmware/efi/efi.c: efi_mem_reserve(), BGRT is only one user
of it, there is esrt and maybe other users, I do not know if it is safe
:(
> -- 
> Michal Hocko
> SUSE Labs
Thanks
Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-12-01  9:29       ` Dave Young
@ 2017-12-01  9:42         ` Michal Hocko
  2017-12-01  9:49           ` Dave Young
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2017-12-01  9:42 UTC (permalink / raw)
  To: Dave Young; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On Fri 01-12-17 17:29:51, Dave Young wrote:
> On 12/01/17 at 10:19am, Michal Hocko wrote:
> > On Fri 01-12-17 16:56:57, Dave Young wrote:
> > > On 11/30/17 at 10:35am, Michal Hocko wrote:
> > [...]
> > > > Can we exclude that range from the memblock allocator instead? E.g. what
> > > > happens if somebody allocates from that range?
> > > 
> > > It is a EFI BGRT image buffer provided by firmware, they are reserved
> > > always and can not be used to allocate memory.
> > 
> > Hmm, I see but I was actually suggesting to remove this range from the
> > memblock allocator altogether (memblock_remove) as it shouldn't be there
> > in the first place.
> 
> Oh, I'm not sure because it is introduced as a way for efi to reserve
> boot services areas to be persistent across kexec reboot. See
> drivers/firmware/efi/efi.c: efi_mem_reserve(), BGRT is only one user
> of it, there is esrt and maybe other users, I do not know if it is safe
> :(
Hmm, so it this range ever backed by a valid pfn?
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-12-01  9:42         ` Michal Hocko
@ 2017-12-01  9:49           ` Dave Young
  0 siblings, 0 replies; 9+ messages in thread
From: Dave Young @ 2017-12-01  9:49 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, pasha.tatashin, linux-mm, akpm
On 12/01/17 at 10:42am, Michal Hocko wrote:
> On Fri 01-12-17 17:29:51, Dave Young wrote:
> > On 12/01/17 at 10:19am, Michal Hocko wrote:
> > > On Fri 01-12-17 16:56:57, Dave Young wrote:
> > > > On 11/30/17 at 10:35am, Michal Hocko wrote:
> > > [...]
> > > > > Can we exclude that range from the memblock allocator instead? E.g. what
> > > > > happens if somebody allocates from that range?
> > > > 
> > > > It is a EFI BGRT image buffer provided by firmware, they are reserved
> > > > always and can not be used to allocate memory.
> > > 
> > > Hmm, I see but I was actually suggesting to remove this range from the
> > > memblock allocator altogether (memblock_remove) as it shouldn't be there
> > > in the first place.
> > 
> > Oh, I'm not sure because it is introduced as a way for efi to reserve
> > boot services areas to be persistent across kexec reboot. See
> > drivers/firmware/efi/efi.c: efi_mem_reserve(), BGRT is only one user
> > of it, there is esrt and maybe other users, I do not know if it is safe
> > :(
> 
> Hmm, so it this range ever backed by a valid pfn?
I think it is in normal boot, just it does not appear in e820 across kdump
reboot. For kdump kexec_tools provided e820 the last pfn only covers the kdump
crashkernel ranges thus it is not mapped.
> -- 
> Michal Hocko
> SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
 
 
 
 
 
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-11-30  6:04 [PATCH] mm: check pfn_valid first in zero_resv_unavail Dave Young
  2017-11-30  9:35 ` Michal Hocko
@ 2017-11-30 17:25 ` Pavel Tatashin
  2017-12-01  8:57   ` Dave Young
  1 sibling, 1 reply; 9+ messages in thread
From: Pavel Tatashin @ 2017-11-30 17:25 UTC (permalink / raw)
  To: Dave Young; +Cc: linux-kernel, Linux Memory Management List, Andrew Morton
Hi Dave,
Because unavailable memory can be in the middle of a section, I think
a proper fix would be to do pfn_valid() check only at the beginning of
section. Otherwise, we might miss zeroing  a struct page is in the
middle of a section but pfn_valid() could potentially return false as
that page is indeed invalid.
So, I would do something like this:
+                       if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))
+                               continue;
Could you please test if this fix works?
We should really look into this memory that is reserved by memblock
but Linux is not aware of physical backing, so far I know that only
x86 can have such scenarios, so we should really see if the problem
can be addressed on x86 platform. It would be very nice if we could
enforce inside memblock to reserve only memory that has real physical
backing.
Thank you,
Pavel
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
- * Re: [PATCH] mm: check pfn_valid first in zero_resv_unavail
  2017-11-30 17:25 ` Pavel Tatashin
@ 2017-12-01  8:57   ` Dave Young
  0 siblings, 0 replies; 9+ messages in thread
From: Dave Young @ 2017-12-01  8:57 UTC (permalink / raw)
  To: Pavel Tatashin; +Cc: linux-kernel, Linux Memory Management List, Andrew Morton
On 11/30/17 at 12:25pm, Pavel Tatashin wrote:
> Hi Dave,
> 
> Because unavailable memory can be in the middle of a section, I think
> a proper fix would be to do pfn_valid() check only at the beginning of
> section. Otherwise, we might miss zeroing  a struct page is in the
> middle of a section but pfn_valid() could potentially return false as
> that page is indeed invalid.
> 
> So, I would do something like this:
> +                       if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))
> +                               continue;
> 
> Could you please test if this fix works?
It works.
> 
> We should really look into this memory that is reserved by memblock
> but Linux is not aware of physical backing, so far I know that only
> x86 can have such scenarios, so we should really see if the problem
> can be addressed on x86 platform. It would be very nice if we could
> enforce inside memblock to reserve only memory that has real physical
> backing.
Will resend with your suggestion along with patch log changes.
> 
> Thank you,
> Pavel
Thanks
Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply	[flat|nested] 9+ messages in thread 
 
end of thread, other threads:[~2017-12-01  9:49 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-30  6:04 [PATCH] mm: check pfn_valid first in zero_resv_unavail Dave Young
2017-11-30  9:35 ` Michal Hocko
2017-12-01  8:56   ` Dave Young
2017-12-01  9:19     ` Michal Hocko
2017-12-01  9:29       ` Dave Young
2017-12-01  9:42         ` Michal Hocko
2017-12-01  9:49           ` Dave Young
2017-11-30 17:25 ` Pavel Tatashin
2017-12-01  8:57   ` Dave Young
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).