* [PATCH 0/2] arm64: numa: fix spurious BUG() on NOMAP regions
From: Ard Biesheuvel @ 2016-12-14 9:11 UTC
To: linux-arm-kernel

This fixes the issue reported by Robert Richter where the node id of
struct pages covered by NOMAP regions is not initialized, triggering a
VM_BUG_ON() in the mm code.

I know that this approach is the least preferred option by Robert, but it
has been used successfully in the downstream Linaro Enterprise kernel,
running on HiSilicon D05, which suffered from the same issue as Cavium
ThunderX where it was originally reported.

Given that the other proposed solutions either fail to solve the issue
completely, or cause regressions in other code (hibernate), I think this
series is appropriate for merging now, and for backporting to -stable. If
there are performance concerns, we can try to improve on this solution,
which could include reverting patch #2 altogether, for all I care.

Patch #1 fixes a bug in the generic mm code where a struct page is
dereferenced before pfn_valid() is called. This should probably go to
stable regardless of where the arm64 discussion goes.

Patch #2 enables CONFIG_HOLES_IN_ZONE for arm64 numa, causing the kernel
to no longer assume that all pages in a zone have valid struct pages
associated with them.

Ard Biesheuvel (2):
  mm: don't dereference struct page fields of invalid pages
  arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA

 arch/arm64/Kconfig | 4 ++++
 mm/page_alloc.c    | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

--
2.7.4
* [PATCH 1/2] mm: don't dereference struct page fields of invalid pages
From: Ard Biesheuvel @ 2016-12-14 9:11 UTC
To: linux-arm-kernel

The VM_BUG_ON() check in move_freepages() checks whether the node
id of a page matches the node id of its zone. However, it does this
before having checked whether the struct page pointer refers to a
valid struct page to begin with. This is guaranteed in most cases,
but may not be the case if CONFIG_HOLES_IN_ZONE=y.

So reorder the VM_BUG_ON() with the pfn_valid_within() check.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 mm/page_alloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f64e7bcb43b7..4e298e31fa86 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1864,14 +1864,14 @@ int move_freepages(struct zone *zone,
 #endif
 
 	for (page = start_page; page <= end_page;) {
-		/* Make sure we are not inadvertently changing nodes */
-		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
-
 		if (!pfn_valid_within(page_to_pfn(page))) {
 			page++;
 			continue;
 		}
 
+		/* Make sure we are not inadvertently changing nodes */
+		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
+
 		if (!PageBuddy(page)) {
 			page++;
 			continue;
--
2.7.4
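
The reordering in patch 1/2 can be illustrated outside the kernel with a
small user-space sketch. Everything below is hypothetical and for
illustration only: struct toy_page, its valid flag (standing in for
pfn_valid_within()) and count_nid_mismatches() are not kernel interfaces.
The point is only the order of operations: the node id is read after the
validity check, never before.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not a kernel type. */
struct toy_page {
	bool valid;	/* models pfn_valid_within() for this pfn */
	int nid;	/* only meaningful when valid is true */
};

/*
 * Walk a range of pages and count node-id mismatches against the zone.
 * The validity check comes first, so the nid of a hole is never read.
 * This mirrors the order established by the patch; the kernel would
 * hit VM_BUG_ON_PAGE() where this sketch merely counts.
 */
static size_t count_nid_mismatches(const struct toy_page *pages, size_t n,
				   int zone_nid)
{
	size_t mismatches = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].valid)
			continue;	/* hole in the zone: skip it */

		if (pages[i].nid != zone_nid)
			mismatches++;
	}
	return mismatches;
}
```

With the checks in the original order, an uninitialized nid in a hole would
be compared against the zone's node id and reported as a mismatch, which is
the spurious BUG() the series is about.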
* [PATCH 1/2] mm: don't dereference struct page fields of invalid pages
From: Will Deacon @ 2017-01-04 12:16 UTC
To: linux-arm-kernel

On Wed, Dec 14, 2016 at 09:11:46AM +0000, Ard Biesheuvel wrote:
> The VM_BUG_ON() check in move_freepages() checks whether the node
> id of a page matches the node id of its zone. However, it does this
> before having checked whether the struct page pointer refers to a
> valid struct page to begin with. This is guaranteed in most cases,
> but may not be the case if CONFIG_HOLES_IN_ZONE=y.
>
> So reorder the VM_BUG_ON() with the pfn_valid_within() check.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  mm/page_alloc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f64e7bcb43b7..4e298e31fa86 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1864,14 +1864,14 @@ int move_freepages(struct zone *zone,
>  #endif
>
>  	for (page = start_page; page <= end_page;) {
> -		/* Make sure we are not inadvertently changing nodes */
> -		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> -
>  		if (!pfn_valid_within(page_to_pfn(page))) {
>  			page++;
>  			continue;
>  		}
>
> +		/* Make sure we are not inadvertently changing nodes */
> +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> +
>  		if (!PageBuddy(page)) {
>  			page++;
>  			continue;

Acked-by: Will Deacon <will.deacon@arm.com>

I'm guessing akpm can pick this up as a non-urgent fix.

Will
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
From: Ard Biesheuvel @ 2016-12-14 9:11 UTC
To: linux-arm-kernel

The NUMA code may get confused by the presence of NOMAP regions within
zones, resulting in spurious BUG() checks where the node id deviates
from the containing zone's node id.

Since the kernel has no business reasoning about node ids of pages it
does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
that such pages are disregarded.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 111742126897..0472afe64d55 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
+config HOLES_IN_ZONE
+	def_bool y
+	depends on NUMA
+
 source kernel/Kconfig.preempt
 
 source kernel/Kconfig.hz
--
2.7.4
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
From: Robert Richter @ 2016-12-15 15:39 UTC
To: linux-arm-kernel

I was going to do some measurements but my kernel crashes now with a
page fault in efi_rtc_probe():

[ 21.663393] Unable to handle kernel paging request at virtual address 20251000
[ 21.663396] pgd = ffff000009090000
[ 21.663401] [20251000] *pgd=0000010ffff90003
[ 21.663402] , *pud=0000010ffff90003
[ 21.663404] , *pmd=0000000fdc030003
[ 21.663405] , *pte=00e8832000250707

The sparsemem config requires the whole section to be initialized.
Your patches do not address this.

On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> +config HOLES_IN_ZONE
> +	def_bool y
> +	depends on NUMA

This enables pfn_valid_within() for arm64 and causes the check for
each page of a section. The arm64 implementation of pfn_valid() is
already expensive (traversing memblock areas). Now, this is increased
by a factor of 2^18 for 4k page size (16384 for 64k). We need to
initialize the whole section to avoid that.

-Robert

[ 21.663393] Unable to handle kernel paging request at virtual address 20251000
[ 21.663396] pgd = ffff000009090000
[ 21.663401] [20251000] *pgd=0000010ffff90003
[ 21.663402] , *pud=0000010ffff90003
[ 21.663404] , *pmd=0000000fdc030003
[ 21.663405] , *pte=00e8832000250707
[ 21.663405]
[ 21.663411] Internal error: Oops: 96000047 [#1] SMP
[ 21.663416] Modules linked in:
[ 21.663425] CPU: 49 PID: 1 Comm: swapper/0 Tainted: G W 4.9.0.0.vanilla10-00002-g429605e9ab0a #1
[ 21.663426] Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016
[ 21.663429] task: ffff800feee6bc00 task.stack: ffff800fec050000
[ 21.663433] PC is at 0x201ff820
[ 21.663434] LR is at 0x201fdfc0
[ 21.663435] pc : [<00000000201ff820>] lr : [<00000000201fdfc0>] pstate: 20000045
[ 21.663437] sp : ffff800fec053b70
[ 21.663440] x29: ffff800fec053bc0 x28: 0000000000000000
[ 21.663443] x27: ffff000008ce3e08 x26: ffff000008c52568
[ 21.663445] x25: ffff000008bf045c x24: ffff000008bdb828
[ 21.663448] x23: 0000000000000000 x22: 0000000000000040
[ 21.663451] x21: ffff800fec053bb8 x20: 0000000020251000
[ 21.663453] x19: ffff800fec053c20 x18: 0000000000000000
[ 21.663456] x17: 0000000000000000 x16: 00000000bbb67a65
[ 21.663459] x15: ffffffffffffffff x14: ffff810016ea291c
[ 21.663461] x13: ffff810016ea2181 x12: 0000000000000030
[ 21.663464] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 21.663467] x9 : feff716475687163 x8 : ffffffffffffffff
[ 21.663469] x7 : 83f0680000000000 x6 : 0000000000000000
[ 21.663472] x5 : ffff800fc187aab9 x4 : 0002000000000000
[ 21.663474] x3 : ffff800fec053bb8 x2 : 0000000000000000
[ 21.663477] x1 : 83f0680000000000 x0 : 0000000020251000
[ 21.663478]
[ 21.663479] Process swapper/0 (pid: 1, stack limit = 0xffff800fec050020)
...
[ 21.663605] [<00000000201ff820>] 0x201ff820
[ 21.663617] [<ffff000008c3eef4>] efi_rtc_probe+0x24/0x78
[ 21.663625] [<ffff000008586c88>] platform_drv_probe+0x60/0xc8
[ 21.663636] [<ffff0000085845d4>] driver_probe_device+0x26c/0x420
[ 21.663639] [<ffff0000085848ac>] __driver_attach+0x124/0x128
[ 21.663642] [<ffff000008581e08>] bus_for_each_dev+0x70/0xb0
[ 21.663644] [<ffff000008583c30>] driver_attach+0x30/0x40
[ 21.663647] [<ffff000008583668>] bus_add_driver+0x200/0x2b8
[ 21.663650] [<ffff000008585430>] driver_register+0x68/0x100
[ 21.663652] [<ffff000008586e3c>] __platform_driver_probe+0x84/0x128
[ 21.663654] [<ffff000008c3eec8>] efi_rtc_driver_init+0x20/0x28
[ 21.663658] [<ffff000008082d94>] do_one_initcall+0x44/0x138
[ 21.663665] [<ffff000008bf0d0c>] kernel_init_freeable+0x1ac/0x24c
[ 21.663673] [<ffff00000885e7a0>] kernel_init+0x18/0x110
[ 21.663675] [<ffff000008082b30>] ret_from_fork+0x10/0x20
[ 21.663679] Code: f9400000 d5033d9f d65f03c0 d5033e9f (f9000001)
[ 21.663688] ---[ end trace e420ef9636e3c9b2 ]---
[ 21.663711] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 21.663711]
[ 21.663713] SMP: stopping secondary CPUs
[ 21.670234] Kernel Offset: disabled
[ 21.670235] Memory Limit: none
[ 22.681333] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
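
The factors Robert quotes (2^18 for 4k pages, 16384 for 64k pages) are the
number of pages in one sparsemem section. A minimal sketch of that
arithmetic follows. It assumes a 1 GiB section (SECTION_SIZE_BITS == 30),
which is not stated in the thread but is the value that reproduces both
figures; the macro and function names are hypothetical.

```c
#include <stdint.h>

/*
 * Assumption: a sparsemem section spans 1 GiB, i.e. SECTION_SIZE_BITS
 * is 30. The thread does not state this; it is the value consistent
 * with the two factors quoted above.
 */
#define ASSUMED_SECTION_SIZE_BITS 30

/*
 * Number of pages in one section for a given page shift
 * (12 for 4k pages, 16 for 64k pages).
 */
static uint64_t pages_per_section(unsigned int page_shift)
{
	return UINT64_C(1) << (ASSUMED_SECTION_SIZE_BITS - page_shift);
}
```

Under that assumption, pages_per_section(12) is 262144 (2^18) and
pages_per_section(16) is 16384, so a walk that calls pfn_valid() once per
page instead of once per section does that many more lookups.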
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 15:39 ` Robert Richter
  (?)
@ 2016-12-15 16:07 ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2016-12-15 16:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 15 December 2016 at 15:39, Robert Richter <robert.richter@cavium.com> wrote:
> I was going to do some measurements but my kernel crashes now with a
> page fault in efi_rtc_probe():
>
> [ 21.663393] Unable to handle kernel paging request at virtual address 20251000
> [ 21.663396] pgd = ffff000009090000
> [ 21.663401] [20251000] *pgd=0000010ffff90003
> [ 21.663402] , *pud=0000010ffff90003
> [ 21.663404] , *pmd=0000000fdc030003
> [ 21.663405] , *pte=00e8832000250707
>
> The sparsemem config requires the whole section to be initialized.
> Your patches do not address this.
>

96000047 is a third level translation fault, and the PTE address has
RES0 bits set. I don't see how this is related to sparsemem, could you
explain?

> On 14.12.16 09:11:47, Ard Biesheuvel wrote:
>> +config HOLES_IN_ZONE
>> +	def_bool y
>> +	depends on NUMA
>
> This enables pfn_valid_within() for arm64 and causes the check for
> each page of a section. The arm64 implementation of pfn_valid() is
> already expensive (traversing memblock areas). Now, this is increased
> by a factor of 2^18 for 4k page size (16384 for 64k). We need to
> initialize the whole section to avoid that.
>

I know that. But if you want something for -stable, we should have
something that is correct first, and only then care about the
performance hit (if there is one)

^ permalink raw reply [flat|nested] 57+ messages in thread
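[Editorial sketch, not part of the thread] The reading of "Oops: 96000047" as a third level translation fault can be checked by pulling the syndrome value apart. The helper and macro names below are made up for illustration; the field positions (EC in bits 31:26, WnR in bit 6, fault status code in bits 5:0) are the ESR_ELx layout for data aborts as described in the ARMv8-A architecture manual.

```c
#include <stdint.h>

/* ESR_ELx field layout for a data abort (names here are illustrative). */
#define ESR_EC_SHIFT    26
#define ESR_EC_MASK     0x3fu
#define ESR_EC_DABT_CUR 0x25u      /* data abort, no change in exception level */
#define ESR_WNR         (1u << 6)  /* set when the faulting access was a write */
#define ESR_FSC_MASK    0x3fu

/* Exception class: which kind of exception was taken. */
static unsigned esr_ec(uint32_t esr)
{
	return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* 1 if the abort was caused by a write, 0 for a read. */
static unsigned esr_is_write(uint32_t esr)
{
	return (esr & ESR_WNR) ? 1u : 0u;
}

/*
 * Fault status codes 0b000100..0b000111 mean "translation fault,
 * level 0..3". Returns the level, or -1 for any other fault type.
 */
static int esr_translation_fault_level(uint32_t esr)
{
	unsigned fsc = esr & ESR_FSC_MASK;

	if ((fsc & 0x3cu) == 0x04u)
		return (int)(fsc & 0x3u);
	return -1;
}
```

For 0x96000047 this gives EC 0x25, a write access, and translation fault level 3, which matches the description in the reply above and the faulting store in the Code: line of the oops.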
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 16:07 ` Ard Biesheuvel
  (?)
@ 2016-12-16 17:10 ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2016-12-16 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 15.12.16 16:07:26, Ard Biesheuvel wrote:
> On 15 December 2016 at 15:39, Robert Richter <robert.richter@cavium.com> wrote:
> > I was going to do some measurements but my kernel crashes now with a
> > page fault in efi_rtc_probe():
> >
> > [ 21.663393] Unable to handle kernel paging request at virtual address 20251000
> > [ 21.663396] pgd = ffff000009090000
> > [ 21.663401] [20251000] *pgd=0000010ffff90003
> > [ 21.663402] , *pud=0000010ffff90003
> > [ 21.663404] , *pmd=0000000fdc030003
> > [ 21.663405] , *pte=00e8832000250707
> >
> > The sparsemem config requires the whole section to be initialized.
> > Your patches do not address this.
> >
>
> 96000047 is a third level translation fault, and the PTE address has
> RES0 bits set. I don't see how this is related to sparsemem, could you
> explain?

When initializing the whole section it works. Maybe it uncovers
another bug. Did not yet start debugging this.

>
> > On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> >> +config HOLES_IN_ZONE
> >> +	def_bool y
> >> +	depends on NUMA
> >
> > This enables pfn_valid_within() for arm64 and causes the check for
> > each page of a section. The arm64 implementation of pfn_valid() is
> > already expensive (traversing memblock areas). Now, this is increased
> > by a factor of 2^18 for 4k page size (16384 for 64k). We need to
> > initialize the whole section to avoid that.
> >
>
> I know that. But if you want something for -stable, we should have
> something that is correct first, and only then care about the
> performance hit (if there is one)

I would prefer to check for a performance penalty *before* we put it
into stable. There is no risk at all with the patch I am proposing.

See: https://lkml.org/lkml/2016/12/16/412

-Robert

^ permalink raw reply [flat|nested] 57+ messages in thread
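[Editorial sketch, not part of the thread] The "factor of 2^18 for 4k page size (16384 for 64k)" quoted above is the number of pages in one sparsemem section. The arithmetic below assumes 1 GiB sections (SECTION_SIZE_BITS of 30), which is believed to be the arm64 value in kernels of this period but is not stated in the thread; the function name is hypothetical.

```c
#include <stdint.h>

/* Assumption: arm64 sparsemem sections are 1 GiB, i.e. 2^30 bytes. */
#define SECTION_SIZE_BITS 30

/*
 * Number of pages in one section for a given page size,
 * where page_shift is log2(page size): 12 for 4k, 16 for 64k.
 */
static uint64_t pages_per_section(unsigned page_shift)
{
	return UINT64_C(1) << (SECTION_SIZE_BITS - page_shift);
}
```

Under that assumption a section holds 2^18 = 262144 pages with 4k pages and 2^14 = 16384 pages with 64k pages, so a per-page pfn_valid() check multiplies the number of lookups per section by exactly those factors.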
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 15:39 ` Robert Richter
  (?)
@ 2016-12-16 1:57 ` Hanjun Guo
  -1 siblings, 0 replies; 57+ messages in thread
From: Hanjun Guo @ 2016-12-16 1:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robert,

On 2016/12/15 23:39, Robert Richter wrote:
> I was going to do some measurements but my kernel crashes now with a
> page fault in efi_rtc_probe():
>
> [ 21.663393] Unable to handle kernel paging request at virtual address 20251000
> [ 21.663396] pgd = ffff000009090000
> [ 21.663401] [20251000] *pgd=0000010ffff90003
> [ 21.663402] , *pud=0000010ffff90003
> [ 21.663404] , *pmd=0000000fdc030003
> [ 21.663405] , *pte=00e8832000250707
>
> The sparsemem config requires the whole section to be initialized.
> Your patches do not address this.

This patch set is running properly on D05, both the boot and
LTP MM stress test are ok, seems it's a different configuration
of memory mappings in firmware, just a stupid question, which
part is related to this problem, is it only the Reserved memory?

Thanks
Hanjun

^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-16 1:57 ` Hanjun Guo
  (?)
@ 2016-12-16 17:14 ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2016-12-16 17:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 16.12.16 09:57:20, Hanjun Guo wrote:
> Hi Robert,
>
> On 2016/12/15 23:39, Robert Richter wrote:
> >I was going to do some measurements but my kernel crashes now with a
> >page fault in efi_rtc_probe():
> >
> >[ 21.663393] Unable to handle kernel paging request at virtual address 20251000
> >[ 21.663396] pgd = ffff000009090000
> >[ 21.663401] [20251000] *pgd=0000010ffff90003
> >[ 21.663402] , *pud=0000010ffff90003
> >[ 21.663404] , *pmd=0000000fdc030003
> >[ 21.663405] , *pte=00e8832000250707
> >
> >The sparsemem config requires the whole section to be initialized.
> >Your patches do not address this.
>
> This patch set is running properly on D05, both the boot and
> LTP MM stress test are ok, seems it's a different configuration
> of memory mappings in firmware, just a stupid question, which
> part is related to this problem, is it only the Reserved memory?

The problem is efi reserved regions that are no longer reserved but
marked as nomap pages. Those are excluded from page initialization,
causing parts of a memory section not being initialized.

-Robert

^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14 9:11 ` Ard Biesheuvel
  (?)
@ 2017-01-04 13:28 ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-04 13:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote:
> The NUMA code may get confused by the presence of NOMAP regions within
> zones, resulting in spurious BUG() checks where the node id deviates
> from the containing zone's node id.
>
> Since the kernel has no business reasoning about node ids of pages it
> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
> that such pages are disregarded.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/Kconfig | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 111742126897..0472afe64d55 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>  	def_bool y
>  	depends on NUMA
>
> +config HOLES_IN_ZONE
> +	def_bool y
> +	depends on NUMA
> +
>  source kernel/Kconfig.preempt
>  source kernel/Kconfig.hz

I'm happy to apply this, but I'll hold off until the first patch is queued
somewhere, since this doesn't help without the VM_BUG_ON being moved.

Alternatively, I can queue both if somebody from the mm camp acks the
first patch.

Will

^ permalink raw reply [flat|nested] 57+ messages in thread
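[Editorial sketch, not part of the thread] What the new Kconfig symbol changes can be modelled in userspace. In kernels of this period, pfn_valid_within() is understood to expand to pfn_valid() when CONFIG_HOLES_IN_ZONE is set and to a constant 1 otherwise, and arm64 pfn_valid() is understood to reject memblock regions marked NOMAP. The region table, PFN values and model_* names below are invented for illustration and are not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memblock region; all values are hypothetical. */
struct region {
	unsigned long start_pfn;  /* inclusive */
	unsigned long end_pfn;    /* exclusive */
	bool nomap;               /* true for firmware-owned NOMAP ranges */
};

static const struct region regions[] = {
	{ 0x1000, 0x2000, false },
	{ 0x2000, 0x2010, true  },  /* NOMAP hole inside the zone */
	{ 0x2010, 0x3000, false },
};

/* Model of arm64 pfn_valid(): memory the kernel maps and owns. */
static bool model_pfn_valid(unsigned long pfn)
{
	for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		if (pfn >= regions[i].start_pfn && pfn < regions[i].end_pfn)
			return !regions[i].nomap;
	}
	return false;
}

/*
 * Model of pfn_valid_within(): without HOLES_IN_ZONE every pfn inside
 * a zone is assumed valid; with it, each pfn is checked individually.
 */
static bool model_pfn_valid_within(unsigned long pfn, bool holes_in_zone)
{
	return holes_in_zone ? model_pfn_valid(pfn) : true;
}
```

In this model a pfn inside the NOMAP range is treated as valid when holes_in_zone is false, which is the situation that lets mm code look at an uninitialized struct page, and is skipped when it is true, at the price of one region walk per page.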
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-04 13:28 ` Will Deacon
  (?)
@ 2017-01-04 13:50 ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2017-01-04 13:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 January 2017 at 13:28, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote:
>> The NUMA code may get confused by the presence of NOMAP regions within
>> zones, resulting in spurious BUG() checks where the node id deviates
>> from the containing zone's node id.
>>
>> Since the kernel has no business reasoning about node ids of pages it
>> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
>> that such pages are disregarded.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/Kconfig | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 111742126897..0472afe64d55 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>  	def_bool y
>>  	depends on NUMA
>>
>> +config HOLES_IN_ZONE
>> +	def_bool y
>> +	depends on NUMA
>> +
>>  source kernel/Kconfig.preempt
>>  source kernel/Kconfig.hz
>
> I'm happy to apply this, but I'll hold off until the first patch is queued
> somewhere, since this doesn't help without the VM_BUG_ON being moved.
>
> Alternatively, I can queue both if somebody from the mm camp acks the
> first patch.
>

Actually, I am not convinced the discussion is finalized. These
patches do fix the issue, but Robert also suggested an alternative fix
which may be preferable.

http://marc.info/?l=linux-arm-kernel&m=148190753510107&w=2

I haven't responded to it yet, due to the holidays, but I'd like to
explore that solution a bit further before applying anything, if you
don't mind.

Thanks,
Ard.

^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA 2017-01-04 13:50 ` Ard Biesheuvel (?) @ 2017-01-04 14:02 ` Will Deacon -1 siblings, 0 replies; 57+ messages in thread From: Will Deacon @ 2017-01-04 14:02 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 04, 2017 at 01:50:20PM +0000, Ard Biesheuvel wrote: > On 4 January 2017 at 13:28, Will Deacon <will.deacon@arm.com> wrote: > > On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote: > >> The NUMA code may get confused by the presence of NOMAP regions within > >> zones, resulting in spurious BUG() checks where the node id deviates > >> from the containing zone's node id. > >> > >> Since the kernel has no business reasoning about node ids of pages it > >> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure > >> that such pages are disregarded. > >> > >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > >> --- > >> arch/arm64/Kconfig | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > >> index 111742126897..0472afe64d55 100644 > >> --- a/arch/arm64/Kconfig > >> +++ b/arch/arm64/Kconfig > >> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK > >> def_bool y > >> depends on NUMA > >> > >> +config HOLES_IN_ZONE > >> + def_bool y > >> + depends on NUMA > >> + > >> source kernel/Kconfig.preempt > >> source kernel/Kconfig.hz > > > > I'm happy to apply this, but I'll hold off until the first patch is queued > > somewhere, since this doesn't help without the VM_BUG_ON being moved. > > > > Alternatively, I can queue both if somebody from the mm camp acks the > > first patch. > > > > Actually, I am not convinced the discussion is finalized. These > patches do fix the issue, but Robert also suggested an alternative fix > which may be preferable. 
> > http://marc.info/?l=linux-arm-kernel&m=148190753510107&w=2 > > I haven't responded to it yet, due to the holidays, but I'd like to > explore that solution a bit further before applying anything, if you > don't mind. Using early_pfn_valid feels like a bodge to me, since having pfn_valid return false for something that early_pfn_valid says is valid (and is therefore initialised in the memmap) makes the NOMAP semantics even more confusing. But there's no rush, so I'll hold off for the moment. I was under the impression that things had stalled. Will ^ permalink raw reply [flat|nested] 57+ messages in thread
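The mismatch Will objects to can be made concrete with a small user-space sketch. Everything named `toy_` below, including the region table and the `nomap` flag, is a hypothetical stand-in and not the kernel's memblock API. The only thing taken from the thread is the behaviour under discussion: the alternative fix would let early_pfn_valid accept a pfn in a NOMAP region while pfn_valid rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region; not the kernel's struct. */
struct toy_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	bool nomap;		/* region is left out of the kernel's mapping */
};

/* Made-up layout: a small NOMAP region between two ordinary ones. */
static const struct toy_region toy_regions[] = {
	{ 0x1000, 0x2000, false },
	{ 0x2000, 0x2100, true },
	{ 0x2100, 0x4000, false },
};

/* Linear search for the region containing pfn, or NULL if none does. */
static const struct toy_region *toy_find(unsigned long pfn)
{
	size_t n = sizeof(toy_regions) / sizeof(toy_regions[0]);

	for (size_t i = 0; i < n; i++) {
		assert(toy_regions[i].start_pfn < toy_regions[i].end_pfn);
		if (pfn >= toy_regions[i].start_pfn && pfn < toy_regions[i].end_pfn)
			return &toy_regions[i];
	}
	return NULL;
}

/* Models pfn_valid(): only memory the kernel maps counts as valid. */
static bool toy_pfn_valid(unsigned long pfn)
{
	const struct toy_region *r = toy_find(pfn);

	return r != NULL && !r->nomap;
}

/* Models the proposed early_pfn_valid(): any described memory counts. */
static bool toy_early_pfn_valid(unsigned long pfn)
{
	return toy_find(pfn) != NULL;
}

/* True when the two predicates give different answers for pfn. */
static bool toy_predicates_disagree(unsigned long pfn)
{
	return toy_early_pfn_valid(pfn) != toy_pfn_valid(pfn);
}
```

In this toy layout the predicates disagree only inside the NOMAP region (for example pfn 0x2050), which is exactly the set of pages whose struct page would be initialised yet reported invalid later on.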
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA 2017-01-04 14:02 ` Will Deacon (?) @ 2017-01-05 11:24 ` Robert Richter -1 siblings, 0 replies; 57+ messages in thread From: Robert Richter @ 2017-01-05 11:24 UTC (permalink / raw) To: linux-arm-kernel On 04.01.17 14:02:23, Will Deacon wrote: > Using early_pfn_valid feels like a bodge to me, since having pfn_valid > return false for something that early_pfn_valid says is valid (and is > therefore initialised in the memmap) makes the NOMAP semantics even more > confusing. The concern I have had with HOLES_IN_ZONE is that it enables pfn_valid_within() for arm64. This means that every pfn of a section is checked, whereas otherwise the check is done only once per section. With up to 2^18 pages per section, we traverse the memblock list that many times more often. There could be a performance regression. I don't have numbers yet, since the fix causes another kernel crash. And this is the next problem I have: the crash doesn't happen otherwise. So either it uncovers another bug or the fix is incomplete, though the changes look like they should work. This needs more investigation. -Robert ^ permalink raw reply [flat|nested] 57+ messages in thread
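The factor Robert mentions can be worked through with plain arithmetic. The sketch below is not kernel code: the `TOY_` constants assume 1 GiB sections and 4 KiB pages (which is what yields 2^18 pages per section), and the two helpers only count how many validity lookups a walk over a pfn range would need under each scheme. How often such walks actually happen, and what a single memblock lookup costs, is not modelled.

```c
#include <assert.h>

/* Assumed geometry: 1 GiB sections and 4 KiB pages give 2^18 pages. */
#define TOY_SECTION_SIZE_BITS	30
#define TOY_PAGE_SHIFT		12
#define TOY_PAGES_PER_SECTION	(1UL << (TOY_SECTION_SIZE_BITS - TOY_PAGE_SHIFT))

/* Lookups for walking [start_pfn, end_pfn) with one check per section. */
static unsigned long toy_lookups_per_section(unsigned long start_pfn,
					     unsigned long end_pfn)
{
	unsigned long first, last;

	assert(start_pfn < end_pfn);
	first = start_pfn / TOY_PAGES_PER_SECTION;
	last = (end_pfn - 1) / TOY_PAGES_PER_SECTION;
	return last - first + 1;
}

/* Lookups for the same walk when every pfn is checked individually. */
static unsigned long toy_lookups_per_pfn(unsigned long start_pfn,
					 unsigned long end_pfn)
{
	assert(start_pfn < end_pfn);
	return end_pfn - start_pfn;
}
```

For a walk over exactly one aligned section, the first helper returns 1 and the second returns 262144, which is the worst-case ratio quoted above. Shorter walks narrow the gap accordingly.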
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA 2017-01-05 11:24 ` Robert Richter (?) @ 2017-01-05 12:08 ` Will Deacon -1 siblings, 0 replies; 57+ messages in thread From: Will Deacon @ 2017-01-05 12:08 UTC (permalink / raw) To: linux-arm-kernel On Thu, Jan 05, 2017 at 12:24:07PM +0100, Robert Richter wrote: > On 04.01.17 14:02:23, Will Deacon wrote: > > Using early_pfn_valid feels like a bodge to me, since having pfn_valid > > return false for something that early_pfn_valid says is valid (and is > > therefore initialised in the memmap) makes the NOMAP semantics even more > > confusing. > > The concern I have had with HOLES_IN_ZONE is that it enables > pfn_valid_within() for arm64. This means that each pfn of a section is > checked which is done only once for the section otherwise. With up to > 2^18 pages per section we traverse the memblock list by that factor > more often. There could be a performance regression. There could be, but we're trying to fix a bug here. I wouldn't have thought that walking over pfns like that is done very often. > I haven't numbers yet, since the fix causes another kernel crash. And, > this is the next problem I have. The crash doesn't happen otherwise. So, > either it uncovers another bug or the fix is incomplete. Though the > changes look like it should work. This needs more investigation. I really can't see how the fix causes a crash, and I couldn't reproduce it on any of my boards, nor could any of the Linaro folk afaik. Are you definitely running mainline with just these two patches from Ard? Will ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA 2017-01-05 12:08 ` Will Deacon (?) @ 2017-01-05 12:22 ` Robert Richter -1 siblings, 0 replies; 57+ messages in thread From: Robert Richter @ 2017-01-05 12:22 UTC (permalink / raw) To: linux-arm-kernel On 05.01.17 12:08:20, Will Deacon wrote: > On Thu, Jan 05, 2017 at 12:24:07PM +0100, Robert Richter wrote: > > On 04.01.17 14:02:23, Will Deacon wrote: > > > Using early_pfn_valid feels like a bodge to me, since having pfn_valid > > > return false for something that early_pfn_valid says is valid (and is > > > therefore initialised in the memmap) makes the NOMAP semantics even more > > > confusing. > > > > The concern I have had with HOLES_IN_ZONE is that it enables > > pfn_valid_within() for arm64. This means that each pfn of a section is > > checked which is done only once for the section otherwise. With up to > > 2^18 pages per section we traverse the memblock list by that factor > > more often. There could be a performance regression. > > There could be, but we're trying to fix a bug here. I wouldn't have > thought that walking over pfns like that is done very often. The bug happens on a small number of machines, depending on the memory layout. The fix affects all systems, and right now the impact is unclear. > > I haven't numbers yet, since the fix causes another kernel crash. And, > > this is the next problem I have. The crash doesn't happen otherwise. So, > > either it uncovers another bug or the fix is incomplete. Though the > > changes look like it should work. This needs more investigation. > > I really can't see how the fix causes a crash, and I couldn't reproduce > it on any of my boards, nor could any of the Linaro folk afaik. Are you > definitely running mainline with just these two patches from Ard? Yes, with just these two patches applied. Various other solutions were working. -Robert ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA 2017-01-05 12:22 ` Robert Richter (?) @ 2017-01-05 19:49 ` Robert Richter -1 siblings, 0 replies; 57+ messages in thread From: Robert Richter @ 2017-01-05 19:49 UTC (permalink / raw) To: linux-arm-kernel On 05.01.17 13:22:00, Robert Richter wrote: > On 05.01.17 12:08:20, Will Deacon wrote: > > I really can't see how the fix causes a crash, and I couldn't reproduce > > it on any of my boards, nor could any of the Linaro folk afaik. Are you > > definitely running mainline with just these two patches from Ard? > > Yes, just both patches applied. Various other solutions were working. I have retested the same kernel (v4.9 based) as before and now it boots fine, including rtc-efi device registration (it was crashing there): rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0 There could be a difference in firmware and memory setup; I also downgraded the firmware to test it, but can't reproduce the crash anymore. I could reliably trigger the crash the first time. FTR, the oops:
-Robert Unable to handle kernel paging request at virtual address 20251000 pgd = ffff000009090000 [20251000] *pgd=0000010ffff90003 , *pud=0000010ffff90003 , *pmd=0000000fdc030003 , *pte=00e8832000250707 Internal error: Oops: 96000047 [#1] SMP Modules linked in: CPU: 49 PID: 1 Comm: swapper/0 Tainted: G W 4.9.0.0.vanilla10-00002-g429605e9ab0a #1 Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016 task: ffff800feee6bc00 task.stack: ffff800fec050000 PC is at 0x201ff820 LR is at 0x201fdfc0 pc : [<00000000201ff820>] lr : [<00000000201fdfc0>] pstate: 20000045 sp : ffff800fec053b70 x29: ffff800fec053bc0 x28: 0000000000000000 x27: ffff000008ce3e08 x26: ffff000008c52568 x25: ffff000008bf045c x24: ffff000008bdb828 x23: 0000000000000000 x22: 0000000000000040 x21: ffff800fec053bb8 x20: 0000000020251000 x19: ffff800fec053c20 x18: 0000000000000000 x17: 0000000000000000 x16: 00000000bbb67a65 x15: ffffffffffffffff x14: ffff810016ea291c x13: ffff810016ea2181 x12: 0000000000000030 x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f x9 : feff716475687163 x8 : ffffffffffffffff x7 : 83f0680000000000 x6 : 0000000000000000 x5 : ffff800fc187aab9 x4 : 0002000000000000 x3 : ffff800fec053bb8 x2 : 0000000000000000 x1 : 83f0680000000000 x0 : 0000000020251000 Process swapper/0 (pid: 1, stack limit = 0xffff800fec050020) Stack: (0xffff800fec053b70 to 0xffff800fec054000) 3b60: ffff800fec053c20 ffff800fec053c20 3b80: ffff800fec053c10 00000000201fd500 ffff000008e660d0 ffff800fec053c20 3ba0: ffff0000086eb954 ffff0000086eb930 ffff800fec053bc0 ffff0000086eb934 3bc0: ffff800fec053bf0 ffff000008c3eef4 ffff000008e602a0 ffff000008e602b0 3be0: ffff000008e60740 ffff000008e60768 ffff800fec053c30 ffff000008586c88 3c00: 00000000ffffffed ffff00000858023c ffff800fec053c30 ffff000008586c68 3c20: 0000000000000000 ffff000008e602b0 ffff800fec053c60 ffff0000085845d4 3c40: ffff000008e602b0 ffff000009049000 0000000000000000 ffff000008e60768 3c60: ffff800fec053ca0 ffff0000085848ac ffff000008e602b0 
ffff000008e60310 3c80: ffff000008e60768 0000000000000000 ffff000008e4d000 ffff000008bdb828 3ca0: ffff800fec053cd0 ffff000008581e08 0000000000000000 ffff000008e60768 3cc0: ffff000008584788 0000000000000000 ffff800fec053d10 ffff000008583c30 3ce0: ffff000008e60768 ffff810fed477c00 ffff000008e4deb0 0000000000000000 3d00: ffff800fe54554a8 ffff810fed478e68 ffff800fec053d30 ffff000008583668 3d20: ffff000008e60768 ffff810fed477c00 ffff800fec053d70 ffff000008585430 3d40: ffff000008e60768 0000000000000000 ffff000008c3eed0 ffff000008e60768 3d60: ffff000008ef0000 0000000000000000 ffff800fec053d90 ffff000008586e3c 3d80: ffff000008e60740 0000000000000000 ffff800fec053dc0 ffff000008c3eec8 3da0: ffff000008c3eea8 ffff800fec050000 0000000000000000 0000000000000006 3dc0: ffff800fec053dd0 ffff000008082d94 ffff800fec053e40 ffff000008bf0d0c 3de0: 00000000000000f3 ffff000008ef0000 ffff000008c52578 0000000000000006 3e00: ffff000008ce3600 0000000000000000 ffff000008da2428 ffff000008ab2fa8 3e20: 0000000000000000 0000000600000006 ffff000008bf045c ffff000008bdb828 3e40: ffff800fec053ea0 ffff00000885e7a0 ffff00000885e788 0000000000000000 3e60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3e80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3ea0: 0000000000000000 ffff000008082b30 ffff00000885e788 0000000000000000 3ec0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3ee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3f20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3f40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3f80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3fa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 3fc0: 0000000000000000 0000000000000005 0000000000000000 
0000000000000000 3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Call trace: Exception stack(0xffff800fec0539a0 to 0xffff800fec053ad0) 39a0: ffff800fec053c20 0001000000000000 ffff800fec053b70 00000000201ff820 39c0: 0000000000000000 ffff810000412890 ffff800fec0539f0 ffff000008405534 39e0: ffff810000412890 ffff810016e90e30 ffff800fec053a20 ffff00000840682c 3a00: 0000000000000000 ffff800fc168f880 0000000000000000 ffff00000840668c 3a20: ffff800fec053ac0 ffff0000084069f8 ffff00000903e7b0 0000000000000001 3a40: 0000000020251000 83f0680000000000 0000000000000000 ffff800fec053bb8 3a60: 0002000000000000 ffff800fc187aab9 0000000000000000 83f0680000000000 3a80: ffffffffffffffff feff716475687163 7f7f7f7f7f7f7f7f 0101010101010101 3aa0: 0000000000000030 ffff810016ea2181 ffff810016ea291c ffffffffffffffff 3ac0: 00000000bbb67a65 0000000000000000 [<00000000201ff820>] 0x201ff820 [<ffff000008c3eef4>] efi_rtc_probe+0x24/0x78 [<ffff000008586c88>] platform_drv_probe+0x60/0xc8 [<ffff0000085845d4>] driver_probe_device+0x26c/0x420 [<ffff0000085848ac>] __driver_attach+0x124/0x128 [<ffff000008581e08>] bus_for_each_dev+0x70/0xb0 [<ffff000008583c30>] driver_attach+0x30/0x40 [<ffff000008583668>] bus_add_driver+0x200/0x2b8 [<ffff000008585430>] driver_register+0x68/0x100 [<ffff000008586e3c>] __platform_driver_probe+0x84/0x128 [<ffff000008c3eec8>] efi_rtc_driver_init+0x20/0x28 [<ffff000008082d94>] do_one_initcall+0x44/0x138 [<ffff000008bf0d0c>] kernel_init_freeable+0x1ac/0x24c [<ffff00000885e7a0>] kernel_init+0x18/0x110 [<ffff000008082b30>] ret_from_fork+0x10/0x20 Code: f9400000 d5033d9f d65f03c0 d5033e9f (f9000001) ---[ end trace e420ef9636e3c9b2 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ^ permalink raw reply [flat|nested] 57+ messages in thread
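As an aid to reading the oops above, the error code in "Internal error: Oops: 96000047" can be split into its architectural fields. The sketch below is a user-space helper with hypothetical `toy_` names, not kernel code. The bit positions follow the AArch64 ESR layout for data aborts as I understand it (EC in bits [31:26], IL in bit 25, WnR in bit 6, DFSC in bits [5:0]), so treat the decoding as an assumption to check against the architecture manual.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of interest in an ESR value for a data abort. */
struct toy_esr_fields {
	unsigned int ec;	/* exception class, bits [31:26] */
	unsigned int il;	/* instruction length bit, bit 25 */
	unsigned int wnr;	/* write-not-read, bit 6 */
	unsigned int dfsc;	/* data fault status code, bits [5:0] */
};

/* Extract the fields with shifts and masks; no validation is done. */
static struct toy_esr_fields toy_decode_esr(uint32_t esr)
{
	struct toy_esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.wnr = (esr >> 6) & 0x1;
	f.dfsc = esr & 0x3f;
	return f;
}

/* Lookup level of a translation fault; DFSC must be of the form 0b0001LL. */
static unsigned int toy_translation_fault_level(unsigned int dfsc)
{
	assert((dfsc & 0x3c) == 0x04);
	return dfsc & 0x3;
}
```

Applied to 0x96000047, this gives EC 0x25 (a data abort taken without a change in exception level), WnR 1 (a write) and DFSC 0x07 (a level 3 translation fault, under the encoding assumed here). That appears consistent with the rest of the dump: the faulting instruction word f9000001 looks like a 64-bit store through x0, and x0 holds the reported fault address 0x20251000.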
* Re: [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA @ 2017-01-05 19:49 ` Robert Richter 0 siblings, 0 replies; 57+ messages in thread From: Robert Richter @ 2017-01-05 19:49 UTC (permalink / raw) To: Will Deacon Cc: Ard Biesheuvel, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Catalin Marinas, Andrew Morton, Hanjun Guo, Yisheng Xie, James Morse On 05.01.17 13:22:00, Robert Richter wrote: > On 05.01.17 12:08:20, Will Deacon wrote: > > I really can't see how the fix causes a crash, and I couldn't reproduce > > it on any of my boards, nor could any of the Linaro folk afaik. Are you > > definitely running mainline with just these two patches from Ard? > > Yes, just both patches applied. Various other solutions were working. I have retested the same kernel (v4.9 based) as before and now it boots fine including rtc-efi device registration (it was crashing there): rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0 There could be a difference in firmware and mem setup, though I also downgraded the firmware to test it, but can't reproduce it anymore. I could reliable trigger the crash the first time. FTR the oops. 
-Robert

Unable to handle kernel paging request at virtual address 20251000
pgd = ffff000009090000
[20251000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000000fdc030003, *pte=00e8832000250707
Internal error: Oops: 96000047 [#1] SMP
Modules linked in:
CPU: 49 PID: 1 Comm: swapper/0 Tainted: G W 4.9.0.0.vanilla10-00002-g429605e9ab0a #1
Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016
task: ffff800feee6bc00 task.stack: ffff800fec050000
PC is at 0x201ff820
LR is at 0x201fdfc0
pc : [<00000000201ff820>] lr : [<00000000201fdfc0>] pstate: 20000045
sp : ffff800fec053b70
x29: ffff800fec053bc0 x28: 0000000000000000
x27: ffff000008ce3e08 x26: ffff000008c52568
x25: ffff000008bf045c x24: ffff000008bdb828
x23: 0000000000000000 x22: 0000000000000040
x21: ffff800fec053bb8 x20: 0000000020251000
x19: ffff800fec053c20 x18: 0000000000000000
x17: 0000000000000000 x16: 00000000bbb67a65
x15: ffffffffffffffff x14: ffff810016ea291c
x13: ffff810016ea2181 x12: 0000000000000030
x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
x9 : feff716475687163 x8 : ffffffffffffffff
x7 : 83f0680000000000 x6 : 0000000000000000
x5 : ffff800fc187aab9 x4 : 0002000000000000
x3 : ffff800fec053bb8 x2 : 0000000000000000
x1 : 83f0680000000000 x0 : 0000000020251000

Process swapper/0 (pid: 1, stack limit = 0xffff800fec050020)
Stack: (0xffff800fec053b70 to 0xffff800fec054000)
3b60:                                   ffff800fec053c20 ffff800fec053c20
3b80: ffff800fec053c10 00000000201fd500 ffff000008e660d0 ffff800fec053c20
3ba0: ffff0000086eb954 ffff0000086eb930 ffff800fec053bc0 ffff0000086eb934
3bc0: ffff800fec053bf0 ffff000008c3eef4 ffff000008e602a0 ffff000008e602b0
3be0: ffff000008e60740 ffff000008e60768 ffff800fec053c30 ffff000008586c88
3c00: 00000000ffffffed ffff00000858023c ffff800fec053c30 ffff000008586c68
3c20: 0000000000000000 ffff000008e602b0 ffff800fec053c60 ffff0000085845d4
3c40: ffff000008e602b0 ffff000009049000 0000000000000000 ffff000008e60768
3c60: ffff800fec053ca0 ffff0000085848ac ffff000008e602b0 ffff000008e60310
3c80: ffff000008e60768 0000000000000000 ffff000008e4d000 ffff000008bdb828
3ca0: ffff800fec053cd0 ffff000008581e08 0000000000000000 ffff000008e60768
3cc0: ffff000008584788 0000000000000000 ffff800fec053d10 ffff000008583c30
3ce0: ffff000008e60768 ffff810fed477c00 ffff000008e4deb0 0000000000000000
3d00: ffff800fe54554a8 ffff810fed478e68 ffff800fec053d30 ffff000008583668
3d20: ffff000008e60768 ffff810fed477c00 ffff800fec053d70 ffff000008585430
3d40: ffff000008e60768 0000000000000000 ffff000008c3eed0 ffff000008e60768
3d60: ffff000008ef0000 0000000000000000 ffff800fec053d90 ffff000008586e3c
3d80: ffff000008e60740 0000000000000000 ffff800fec053dc0 ffff000008c3eec8
3da0: ffff000008c3eea8 ffff800fec050000 0000000000000000 0000000000000006
3dc0: ffff800fec053dd0 ffff000008082d94 ffff800fec053e40 ffff000008bf0d0c
3de0: 00000000000000f3 ffff000008ef0000 ffff000008c52578 0000000000000006
3e00: ffff000008ce3600 0000000000000000 ffff000008da2428 ffff000008ab2fa8
3e20: 0000000000000000 0000000600000006 ffff000008bf045c ffff000008bdb828
3e40: ffff800fec053ea0 ffff00000885e7a0 ffff00000885e788 0000000000000000
3e60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3e80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ea0: 0000000000000000 ffff000008082b30 ffff00000885e788 0000000000000000
3ec0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fc0: 0000000000000000 0000000000000005 0000000000000000 0000000000000000
3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call trace:
Exception stack(0xffff800fec0539a0 to 0xffff800fec053ad0)
39a0: ffff800fec053c20 0001000000000000 ffff800fec053b70 00000000201ff820
39c0: 0000000000000000 ffff810000412890 ffff800fec0539f0 ffff000008405534
39e0: ffff810000412890 ffff810016e90e30 ffff800fec053a20 ffff00000840682c
3a00: 0000000000000000 ffff800fc168f880 0000000000000000 ffff00000840668c
3a20: ffff800fec053ac0 ffff0000084069f8 ffff00000903e7b0 0000000000000001
3a40: 0000000020251000 83f0680000000000 0000000000000000 ffff800fec053bb8
3a60: 0002000000000000 ffff800fc187aab9 0000000000000000 83f0680000000000
3a80: ffffffffffffffff feff716475687163 7f7f7f7f7f7f7f7f 0101010101010101
3aa0: 0000000000000030 ffff810016ea2181 ffff810016ea291c ffffffffffffffff
3ac0: 00000000bbb67a65 0000000000000000
[<00000000201ff820>] 0x201ff820
[<ffff000008c3eef4>] efi_rtc_probe+0x24/0x78
[<ffff000008586c88>] platform_drv_probe+0x60/0xc8
[<ffff0000085845d4>] driver_probe_device+0x26c/0x420
[<ffff0000085848ac>] __driver_attach+0x124/0x128
[<ffff000008581e08>] bus_for_each_dev+0x70/0xb0
[<ffff000008583c30>] driver_attach+0x30/0x40
[<ffff000008583668>] bus_add_driver+0x200/0x2b8
[<ffff000008585430>] driver_register+0x68/0x100
[<ffff000008586e3c>] __platform_driver_probe+0x84/0x128
[<ffff000008c3eec8>] efi_rtc_driver_init+0x20/0x28
[<ffff000008082d94>] do_one_initcall+0x44/0x138
[<ffff000008bf0d0c>] kernel_init_freeable+0x1ac/0x24c
[<ffff00000885e7a0>] kernel_init+0x18/0x110
[<ffff000008082b30>] ret_from_fork+0x10/0x20
Code: f9400000 d5033d9f d65f03c0 d5033e9f (f9000001)
---[ end trace e420ef9636e3c9b2 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

^ permalink raw reply	[flat|nested] 57+ messages in thread
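For readers following the log, the "Oops: 96000047" value is the exception syndrome register contents, and it can be unpacked by hand. The sketch below is not taken from the thread; it only assumes the ESR_ELx field layout from the ARMv8-A architecture (EC in bits [31:26], IL in bit 25, and for data aborts WnR in bit 6 and DFSC in bits [5:0]). The function names are made up for illustration.

```c
#include <stdint.h>

/*
 * Illustrative helpers (hypothetical names) that split an ESR_ELx value
 * into the fields relevant to a data abort. Field positions follow the
 * ARMv8-A architecture, not any code quoted in this thread.
 */

/* Exception class, bits [31:26]. 0x25 is a data abort taken at the same EL. */
unsigned esr_ec(uint32_t esr)
{
	return (esr >> 26) & 0x3f;
}

/* Instruction length bit, bit 25. Set for a 32-bit trapped instruction. */
unsigned esr_il(uint32_t esr)
{
	return (esr >> 25) & 0x1;
}

/* Write-not-read, bit 6 of the data abort ISS. Set when the access was a write. */
unsigned esr_wnr(uint32_t esr)
{
	return (esr >> 6) & 0x1;
}

/* Data fault status code, bits [5:0]. 0x07 is a level 3 translation fault. */
unsigned esr_dfsc(uint32_t esr)
{
	return esr & 0x3f;
}
```

Applied to 0x96000047, these give EC 0x25, WnR 1 and DFSC 0x07, that is, a write that took a level 3 translation fault in kernel mode. That reading is consistent with the faulting instruction in the Code: line, f9000001, which encodes str x1, [x0], with x0 holding the reported address 0x20251000.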
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-05 19:49 ` Robert Richter
  (?)
@ 2017-01-06 12:03 ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-06 12:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 05, 2017 at 08:49:44PM +0100, Robert Richter wrote:
> On 05.01.17 13:22:00, Robert Richter wrote:
> > On 05.01.17 12:08:20, Will Deacon wrote:
> > > I really can't see how the fix causes a crash, and I couldn't reproduce
> > > it on any of my boards, nor could any of the Linaro folk afaik. Are you
> > > definitely running mainline with just these two patches from Ard?
> >
> > Yes, just both patches applied. Various other solutions were working.
>
> I have retested the same kernel (v4.9 based) as before and now it
> boots fine including rtc-efi device registration (it was crashing
> there):
>
>  rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
>
> There could be a difference in firmware and mem setup, though I also
> downgraded the firmware to test it, but can't reproduce it anymore. I
> could reliable trigger the crash the first time.
>
> FTR the oops.

Hmm, I just can't help but think you were accidentally running with
additional patches when you saw this oops previously. For example,
your log looks very similar to this one:

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473666.html

but then again, these crashes probably often look alike.

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-06 12:03 ` Will Deacon
  (?)
@ 2017-01-06 12:22 ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2017-01-06 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 6 January 2017 at 12:03, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Jan 05, 2017 at 08:49:44PM +0100, Robert Richter wrote:
>> On 05.01.17 13:22:00, Robert Richter wrote:
>> > On 05.01.17 12:08:20, Will Deacon wrote:
>> > > I really can't see how the fix causes a crash, and I couldn't reproduce
>> > > it on any of my boards, nor could any of the Linaro folk afaik. Are you
>> > > definitely running mainline with just these two patches from Ard?
>> >
>> > Yes, just both patches applied. Various other solutions were working.
>>
>> I have retested the same kernel (v4.9 based) as before and now it
>> boots fine including rtc-efi device registration (it was crashing
>> there):
>>
>>  rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
>>
>> There could be a difference in firmware and mem setup, though I also
>> downgraded the firmware to test it, but can't reproduce it anymore. I
>> could reliable trigger the crash the first time.
>>
>> FTR the oops.
>
> Hmm, I just can't help but think you were accidentally running with
> additional patches when you saw this oops previously. For example,
> your log looks very similar to this one:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473666.html
>
> but then again, these crashes probably often look alike.
>

These are quite different, in fact. In James's case, the UEFI memory
map was missing some entries, so not all memory regions that the
firmware expected to be there were actually mapped, hence the all-zero
*pte. In Robert's case, it looks like the UEFI runtime services page
tables are corrupted, i.e., *pte has RES0 bits set.

^ permalink raw reply	[flat|nested] 57+ messages in thread
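The distinction drawn above between an all-zero *pte and a *pte with RES0 bits set can be checked against the value in Robert's log, *pte=00e8832000250707. The sketch below is an illustration only, with hypothetical names. It assumes an ARMv8.0 implementation without 52-bit physical addressing, where bits [51:48] of a stage 1 page descriptor are reserved (RES0); the thread does not say which RES0 bits were meant, so this is one plausible reading rather than a confirmed one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: ARMv8.0 stage 1 descriptor format, no 52-bit PA support,
 * so bits [51:48] must be zero. The macro and function names are
 * invented for this example.
 */
#define TOY_DESC_TYPE_MASK   UINT64_C(0x3)
#define TOY_DESC_TYPE_PAGE   UINT64_C(0x3)
#define TOY_DESC_RES0_51_48  (UINT64_C(0xf) << 48)

/* A last-level entry maps a page only when bits [1:0] are 0b11. */
bool toy_pte_is_valid_page(uint64_t pte)
{
	return (pte & TOY_DESC_TYPE_MASK) == TOY_DESC_TYPE_PAGE;
}

/* Return whichever of the assumed-reserved bits [51:48] are set. */
uint64_t toy_pte_res0_bits(uint64_t pte)
{
	return pte & TOY_DESC_RES0_51_48;
}
```

Under that assumption, an all-zero entry is simply invalid (nothing mapped, as in James's report), whereas 0x00e8832000250707 has the valid page type in bits [1:0] but also has bit 51 set, which is what a corrupted rather than missing entry would look like.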
* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14  9:11 ` Ard Biesheuvel
  (?)
@ 2017-02-06 13:36 ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2017-02-06 13:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> The NUMA code may get confused by the presence of NOMAP regions within
> zones, resulting in spurious BUG() checks where the node id deviates
> from the containing zone's node id.
>
> Since the kernel has no business reasoning about node ids of pages it
> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
> that such pages are disregarded.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

I would rather see a solution other than making pfn_valid checks more
fine grained, but this patch also fixes the issue. So:

Acked-by: Robert Richter <rrichter@cavium.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread
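The effect described in the quoted commit message can be modelled in a few lines of user-space C. This is a toy sketch, not kernel code: every name in it (toy_page, toy_zone, toy_make_zone, toy_first_nid_mismatch) is hypothetical, and the pfn_ok array merely stands in for pfn_valid(). The holes_in_zone flag mimics, as far as can be told from kernels of that era, how pfn_valid_within() performs a real pfn_valid() check when CONFIG_HOLES_IN_ZONE is set and is a constant 1 otherwise.

```c
#include <stdbool.h>

#define TOY_NR_PFNS 8

struct toy_page {
	int nid;                        /* node id; stale if never initialized */
};

struct toy_zone {
	int nid;                        /* node the zone belongs to */
	bool pfn_ok[TOY_NR_PFNS];       /* stand-in for pfn_valid() */
	struct toy_page pages[TOY_NR_PFNS];
};

/* Build a node 0 zone in which pfn 3 models a NOMAP hole with a stale node id. */
struct toy_zone toy_make_zone(void)
{
	struct toy_zone z = { .nid = 0 };

	for (int pfn = 0; pfn < TOY_NR_PFNS; pfn++) {
		z.pfn_ok[pfn] = true;
		z.pages[pfn].nid = 0;
	}
	z.pfn_ok[3] = false;
	z.pages[3].nid = 7;             /* arbitrary wrong value */
	return z;
}

/*
 * Return the first pfn whose node id disagrees with the zone, or -1 if
 * there is none. With holes_in_zone set, invalid pfns are skipped before
 * their struct page is looked at, so the stale node id is never compared.
 */
int toy_first_nid_mismatch(const struct toy_zone *z, bool holes_in_zone)
{
	for (int pfn = 0; pfn < TOY_NR_PFNS; pfn++) {
		if (holes_in_zone && !z->pfn_ok[pfn])
			continue;
		if (z->pages[pfn].nid != z->nid)
			return pfn;
	}
	return -1;
}
```

Without the validity check the walk trips over pfn 3, which is the analogue of the spurious BUG(); with it, the hole is disregarded. The cost Robert alludes to is the extra per-pfn check inside the loop.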