[PATCH 0/2] arm64: numa: fix spurious BUG() on NOMAP regions

* [PATCH 0/2] arm64: numa: fix spurious BUG() on NOMAP regions
@ 2016-12-14  9:11 ` Ard Biesheuvel
  0 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2016-12-14  9:11 UTC (permalink / raw)
  To: linux-arm-kernel

This fixes the issue reported by Robert Richter, where the node id of
struct pages covered by NOMAP regions is not initialized, triggering a
VM_BUG_ON() in the mm code.

I know that this approach is Robert's least preferred option, but it
has been used successfully in the downstream Linaro Enterprise kernel,
running on HiSilicon D05, which suffered from the same issue as Cavium
ThunderX where it was originally reported.

Given that the other proposed solutions either fail to solve the issue
completely, or cause regressions in other code (hibernate), I think this
series is appropriate for merging now, and for backporting to -stable. If
there are performance concerns, we can try to improve on this solution,
which could include reverting patch #2 altogether, for all I care.

Patch #1 fixes a bug in the generic mm code where a struct page is
dereferenced before pfn_valid() is called. This should probably go to
stable regardless of where the arm64 discussion goes.
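
As a rough illustration of the ordering problem (a userspace sketch, not
the kernel code: struct fake_page, its fields and count_pages_on_node()
are hypothetical stand-ins, and assert() models VM_BUG_ON_PAGE()), the
validity check has to come before any field of the page is read:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct fake_page {
	bool valid;	/* models pfn_valid_within() returning true */
	int nid;	/* models page_to_nid(); garbage when !valid */
};

/* Count the valid pages in the range, skipping holes before touching nid. */
static int count_pages_on_node(const struct fake_page *pages, size_t n,
			       int zone_nid)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		/* Validity check first, so holes are never inspected. */
		if (!pages[i].valid)
			continue;

		/* Only now is the node id meaningful (models VM_BUG_ON_PAGE). */
		assert(pages[i].nid == zone_nid);

		count++;
	}
	return count;
}
```

With the two checks in the opposite order, a hole carrying an
uninitialized node id would trip the assertion even though nothing is
actually wrong with the zone.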

Patch #2 enables CONFIG_HOLES_IN_ZONE for arm64 NUMA, causing the kernel
to no longer assume that all pages in a zone have valid struct pages
associated with them.
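
To make the effect of the option concrete, here is a small userspace
sketch. The #ifdef block mirrors the shape of the pfn_valid_within()
definition in include/linux/mmzone.h as I understand it; the pfn_valid()
stub, the 8-pfn memory map with a hole at pfns 3 and 4, and
count_walkable() are assumptions made up for illustration:

```c
#include <stdbool.h>

/* Hypothetical memory map: pfns 0..7, with pfns 3 and 4 forming a hole. */
static bool pfn_valid(unsigned long pfn)
{
	return pfn < 8 && pfn != 3 && pfn != 4;
}

/* Stand-in for the Kconfig symbol being enabled. */
#define CONFIG_HOLES_IN_ZONE 1

#ifdef CONFIG_HOLES_IN_ZONE
#define pfn_valid_within(pfn) pfn_valid(pfn)
#else
#define pfn_valid_within(pfn) (1)
#endif

/* Count the pfns in [start, end) that a zone walker would actually touch. */
static unsigned long count_walkable(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	for (unsigned long pfn = start; pfn < end; pfn++)
		if (pfn_valid_within(pfn))
			n++;
	return n;
}
```

Without the option, pfn_valid_within() collapses to a constant and the
walker visits all eight pfns, including the two in the hole.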

Ard Biesheuvel (2):
  mm: don't dereference struct page fields of invalid pages
  arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA

 arch/arm64/Kconfig | 4 ++++
 mm/page_alloc.c    | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

-- 
2.7.4

* [PATCH 1/2] mm: don't dereference struct page fields of invalid pages
  2016-12-14  9:11 ` Ard Biesheuvel
  (?)
@ 2016-12-14  9:11   ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2016-12-14  9:11 UTC (permalink / raw)
  To: linux-arm-kernel

The VM_BUG_ON() check in move_freepages() checks whether the node
id of a page matches the node id of its zone. However, it does this
before having checked whether the struct page pointer refers to a
valid struct page to begin with. This is guaranteed in most cases,
but may not be the case if CONFIG_HOLES_IN_ZONE=y.

So reorder the VM_BUG_ON() with the pfn_valid_within() check.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 mm/page_alloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f64e7bcb43b7..4e298e31fa86 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1864,14 +1864,14 @@ int move_freepages(struct zone *zone,
 #endif
 
 	for (page = start_page; page <= end_page;) {
-		/* Make sure we are not inadvertently changing nodes */
-		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
-
 		if (!pfn_valid_within(page_to_pfn(page))) {
 			page++;
 			continue;
 		}
 
+		/* Make sure we are not inadvertently changing nodes */
+		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
+
 		if (!PageBuddy(page)) {
 			page++;
 			continue;
-- 
2.7.4

* [PATCH 1/2] mm: don't dereference struct page fields of invalid pages
  2016-12-14  9:11   ` Ard Biesheuvel
  (?)
@ 2017-01-04 12:16     ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-04 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 14, 2016 at 09:11:46AM +0000, Ard Biesheuvel wrote:
> The VM_BUG_ON() check in move_freepages() checks whether the node
> id of a page matches the node id of its zone. However, it does this
> before having checked whether the struct page pointer refers to a
> valid struct page to begin with. This is guaranteed in most cases,
> but may not be the case if CONFIG_HOLES_IN_ZONE=y.
> 
> So reorder the VM_BUG_ON() with the pfn_valid_within() check.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  mm/page_alloc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f64e7bcb43b7..4e298e31fa86 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1864,14 +1864,14 @@ int move_freepages(struct zone *zone,
>  #endif
>  
>  	for (page = start_page; page <= end_page;) {
> -		/* Make sure we are not inadvertently changing nodes */
> -		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> -
>  		if (!pfn_valid_within(page_to_pfn(page))) {
>  			page++;
>  			continue;
>  		}
>  
> +		/* Make sure we are not inadvertently changing nodes */
> +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> +
>  		if (!PageBuddy(page)) {
>  			page++;
>  			continue;

Acked-by: Will Deacon <will.deacon@arm.com>

I'm guessing akpm can pick this up as a non-urgent fix.

Will

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14  9:11 ` Ard Biesheuvel
  (?)
@ 2016-12-14  9:11   ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2016-12-14  9:11 UTC (permalink / raw)
  To: linux-arm-kernel

The NUMA code may get confused by the presence of NOMAP regions within
zones, resulting in spurious BUG() checks where the node id deviates
from the containing zone's node id.

Since the kernel has no business reasoning about node ids of pages it
does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
that such pages are disregarded.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 111742126897..0472afe64d55 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
+config HOLES_IN_ZONE
+	def_bool y
+	depends on NUMA
+
 source kernel/Kconfig.preempt
 source kernel/Kconfig.hz
 
-- 
2.7.4

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14  9:11   ` Ard Biesheuvel
  (?)
@ 2016-12-15 15:39     ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2016-12-15 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

I was going to do some measurements but my kernel crashes now with a
page fault in efi_rtc_probe():

[   21.663393] Unable to handle kernel paging request at virtual address 20251000
[   21.663396] pgd = ffff000009090000
[   21.663401] [20251000] *pgd=0000010ffff90003
[   21.663402] , *pud=0000010ffff90003
[   21.663404] , *pmd=0000000fdc030003
[   21.663405] , *pte=00e8832000250707

The sparsemem config requires the whole section to be initialized.
Your patches do not address this.

On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> +config HOLES_IN_ZONE
> +	def_bool y
> +	depends on NUMA

This enables pfn_valid_within() for arm64 and causes the check to run
for each page of a section. The arm64 implementation of pfn_valid() is
already expensive (traversing memblock areas). Now, this cost is
multiplied by a factor of 2^18 for 4k page size (2^14 = 16384 for 64k).
We need to initialize the whole section to avoid that.
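
For reference, the factors quoted above follow from the section size. A
minimal sketch, assuming 1 GB sparsemem sections (a SECTION_SIZE_BITS of
30, which is consistent with the figures above but is an assumption
here); pages_per_section() is a hypothetical helper:

```c
/* Assumption: 1 GB sparsemem sections. */
#define SECTION_SIZE_BITS 30

/* Number of pages, and thus pfn_valid_within() calls, per section. */
static unsigned long pages_per_section(unsigned int page_shift)
{
	return 1UL << (SECTION_SIZE_BITS - page_shift);
}
```

pages_per_section(12) covers the 4k case and pages_per_section(16) the
64k case.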

-Robert

[   21.663393] Unable to handle kernel paging request at virtual address 20251000
[   21.663396] pgd = ffff000009090000
[   21.663401] [20251000] *pgd=0000010ffff90003
[   21.663402] , *pud=0000010ffff90003
[   21.663404] , *pmd=0000000fdc030003
[   21.663405] , *pte=00e8832000250707
[   21.663405] 
[   21.663411] Internal error: Oops: 96000047 [#1] SMP
[   21.663416] Modules linked in:
[   21.663425] CPU: 49 PID: 1 Comm: swapper/0 Tainted: G        W       4.9.0.0.vanilla10-00002-g429605e9ab0a #1
[   21.663426] Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016
[   21.663429] task: ffff800feee6bc00 task.stack: ffff800fec050000
[   21.663433] PC is at 0x201ff820
[   21.663434] LR is at 0x201fdfc0
[   21.663435] pc : [<00000000201ff820>] lr : [<00000000201fdfc0>] pstate: 20000045
[   21.663437] sp : ffff800fec053b70
[   21.663440] x29: ffff800fec053bc0 x28: 0000000000000000 
[   21.663443] x27: ffff000008ce3e08 x26: ffff000008c52568 
[   21.663445] x25: ffff000008bf045c x24: ffff000008bdb828 
[   21.663448] x23: 0000000000000000 x22: 0000000000000040 
[   21.663451] x21: ffff800fec053bb8 x20: 0000000020251000 
[   21.663453] x19: ffff800fec053c20 x18: 0000000000000000 
[   21.663456] x17: 0000000000000000 x16: 00000000bbb67a65 
[   21.663459] x15: ffffffffffffffff x14: ffff810016ea291c 
[   21.663461] x13: ffff810016ea2181 x12: 0000000000000030 
[   21.663464] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
[   21.663467] x9 : feff716475687163 x8 : ffffffffffffffff 
[   21.663469] x7 : 83f0680000000000 x6 : 0000000000000000 
[   21.663472] x5 : ffff800fc187aab9 x4 : 0002000000000000 
[   21.663474] x3 : ffff800fec053bb8 x2 : 0000000000000000 
[   21.663477] x1 : 83f0680000000000 x0 : 0000000020251000 
[   21.663478] 
[   21.663479] Process swapper/0 (pid: 1, stack limit = 0xffff800fec050020)
...
[   21.663605] [<00000000201ff820>] 0x201ff820
[   21.663617] [<ffff000008c3eef4>] efi_rtc_probe+0x24/0x78
[   21.663625] [<ffff000008586c88>] platform_drv_probe+0x60/0xc8
[   21.663636] [<ffff0000085845d4>] driver_probe_device+0x26c/0x420
[   21.663639] [<ffff0000085848ac>] __driver_attach+0x124/0x128
[   21.663642] [<ffff000008581e08>] bus_for_each_dev+0x70/0xb0
[   21.663644] [<ffff000008583c30>] driver_attach+0x30/0x40
[   21.663647] [<ffff000008583668>] bus_add_driver+0x200/0x2b8
[   21.663650] [<ffff000008585430>] driver_register+0x68/0x100
[   21.663652] [<ffff000008586e3c>] __platform_driver_probe+0x84/0x128
[   21.663654] [<ffff000008c3eec8>] efi_rtc_driver_init+0x20/0x28
[   21.663658] [<ffff000008082d94>] do_one_initcall+0x44/0x138
[   21.663665] [<ffff000008bf0d0c>] kernel_init_freeable+0x1ac/0x24c
[   21.663673] [<ffff00000885e7a0>] kernel_init+0x18/0x110
[   21.663675] [<ffff000008082b30>] ret_from_fork+0x10/0x20
[   21.663679] Code: f9400000 d5033d9f d65f03c0 d5033e9f (f9000001) 
[   21.663688] ---[ end trace e420ef9636e3c9b2 ]---
[   21.663711] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   21.663711] 
[   21.663713] SMP: stopping secondary CPUs
[   21.670234] Kernel Offset: disabled
[   21.670235] Memory Limit: none
[   22.681333] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 15:39     ` Robert Richter
  (?)
@ 2016-12-15 16:07       ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2016-12-15 16:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 15 December 2016 at 15:39, Robert Richter <robert.richter@cavium.com> wrote:
> I was going to do some measurements but my kernel crashes now with a
> page fault in efi_rtc_probe():
>
> [   21.663393] Unable to handle kernel paging request at virtual address 20251000
> [   21.663396] pgd = ffff000009090000
> [   21.663401] [20251000] *pgd=0000010ffff90003
> [   21.663402] , *pud=0000010ffff90003
> [   21.663404] , *pmd=0000000fdc030003
> [   21.663405] , *pte=00e8832000250707
>
> The sparsemem config requires the whole section to be initialized.
> Your patches do not address this.
>

96000047 is a third level translation fault, and the PTE address has
RES0 bits set. I don't see how this is related to sparsemem, could you
explain?
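
For reference, the syndrome value can be split into its fields as sketched below. The field positions follow the ESR_ELx layout in the Arm architecture manual; the struct and function names are hypothetical and only serve this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the ESR_ELx fields of interest. */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	unsigned int il;   /* instruction length bit, bit 25 */
	unsigned int wnr;  /* write-not-read, bit 6 (data aborts) */
	unsigned int dfsc; /* data fault status code, bits [5:0] */
};

/* Split a 32-bit ESR value into the fields above. */
static struct esr_fields decode_esr(uint32_t esr)
{
	struct esr_fields f;

	f.ec   = (esr >> 26) & 0x3f;
	f.il   = (esr >> 25) & 0x1;
	f.wnr  = (esr >> 6) & 0x1;
	f.dfsc = esr & 0x3f;
	return f;
}
```

For 0x96000047 this gives EC 0x25 (data abort taken without a change in exception level), WnR set (a write), and DFSC 0x07 (translation fault, level 3).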

> On 14.12.16 09:11:47, Ard Biesheuvel wrote:
>> +config HOLES_IN_ZONE
>> +     def_bool y
>> +     depends on NUMA
>
> This enables pfn_valid_within() for arm64 and causes the check for
> each page of a section. The arm64 implementation of pfn_valid() is
> already expensive (traversing memblock areas). Now, this is increased
> by a factor of 2^18 for 4k page size (16384 for 64k). We need to
> initialize the whole section to avoid that.
>

I know that. But if you want something for -stable, we should have
something that is correct first, and only then care about the
performance hit (if there is one).

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 16:07       ` Ard Biesheuvel
  (?)
@ 2016-12-16 17:10         ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2016-12-16 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 15.12.16 16:07:26, Ard Biesheuvel wrote:
> On 15 December 2016 at 15:39, Robert Richter <robert.richter@cavium.com> wrote:
> > I was going to do some measurements but my kernel crashes now with a
> > page fault in efi_rtc_probe():
> >
> > [   21.663393] Unable to handle kernel paging request at virtual address 20251000
> > [   21.663396] pgd = ffff000009090000
> > [   21.663401] [20251000] *pgd=0000010ffff90003
> > [   21.663402] , *pud=0000010ffff90003
> > [   21.663404] , *pmd=0000000fdc030003
> > [   21.663405] , *pte=00e8832000250707
> >
> > The sparsemem config requires the whole section to be initialized.
> > Your patches do not address this.
> >
> 
> 96000047 is a third level translation fault, and the PTE address has
> RES0 bits set. I don't see how this is related to sparsemem, could you
> explain?

It works when the whole section is initialized. Maybe this uncovers
another bug; I have not started debugging it yet.

> 
> > On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> >> +config HOLES_IN_ZONE
> >> +     def_bool y
> >> +     depends on NUMA
> >
> > This enables pfn_valid_within() for arm64 and causes the check for
> > each page of a section. The arm64 implementation of pfn_valid() is
> > already expensive (traversing memblock areas). Now, this is increased
> > by a factor of 2^18 for 4k page size (16384 for 64k). We need to
> > initialize the whole section to avoid that.
> >
> 
> I know that. But if you want something for -stable, we should have
> something that is correct first, and only then care about the
> performance hit (if there is one)

I would prefer to check for a performance penalty *before* we put it
into stable. There is no risk at all with the patch I am proposing.
See: https://lkml.org/lkml/2016/12/16/412

-Robert

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-15 15:39     ` Robert Richter
  (?)
@ 2016-12-16  1:57       ` Hanjun Guo
  -1 siblings, 0 replies; 57+ messages in thread
From: Hanjun Guo @ 2016-12-16  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robert,

On 2016/12/15 23:39, Robert Richter wrote:
> I was going to do some measurements but my kernel crashes now with a
> page fault in efi_rtc_probe():
>
> [   21.663393] Unable to handle kernel paging request at virtual address 20251000
> [   21.663396] pgd = ffff000009090000
> [   21.663401] [20251000] *pgd=0000010ffff90003
> [   21.663402] , *pud=0000010ffff90003
> [   21.663404] , *pmd=0000000fdc030003
> [   21.663405] , *pte=00e8832000250707
>
> The sparsemem config requires the whole section to be initialized.
> Your patches do not address this.

This patch set runs properly on D05: both the boot and the LTP MM
stress test are OK. It seems the firmware there configures the memory
mappings differently. Just a stupid question: which part of the memory
map is related to this problem? Is it only the Reserved memory?

Thanks
Hanjun

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-16  1:57       ` Hanjun Guo
  (?)
@ 2016-12-16 17:14         ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2016-12-16 17:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 16.12.16 09:57:20, Hanjun Guo wrote:
> Hi Robert,
> 
> On 2016/12/15 23:39, Robert Richter wrote:
> >I was going to do some measurements but my kernel crashes now with a
> >page fault in efi_rtc_probe():
> >
> >[   21.663393] Unable to handle kernel paging request at virtual address 20251000
> >[   21.663396] pgd = ffff000009090000
> >[   21.663401] [20251000] *pgd=0000010ffff90003
> >[   21.663402] , *pud=0000010ffff90003
> >[   21.663404] , *pmd=0000000fdc030003
> >[   21.663405] , *pte=00e8832000250707
> >
> >The sparsemem config requires the whole section to be initialized.
> >Your patches do not address this.
> 
> This patch set is running properly on D05, both the boot and
> LTP MM stress test are ok, seems it's a different configuration
> of memory mappings in firmware, just a stupid question, which
> part is related to this problem, is it only the Reserved memory?

The problem is EFI reserved regions that are no longer reserved but
marked as nomap pages. Those are excluded from page initialization,
which leaves parts of a memory section uninitialized.
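
A toy model may help to picture this. The sketch below is not kernel code: the region table, its pfn values and all toy_* names are hypothetical, and it only assumes that pfn_valid() rejects pfns inside NOMAP regions, so that their struct pages are skipped during initialization.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memblock region; not the kernel's structure. */
struct toy_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	bool nomap;
};

/* Hypothetical layout: one 1 GiB section (4k pages) with a NOMAP range. */
static const struct toy_region toy_regions[] = {
	{ 0x00000, 0x20000, false },
	{ 0x20000, 0x20300, true  },	/* firmware-owned, marked NOMAP */
	{ 0x20300, 0x40000, false },
};

/* Models a pfn_valid() that only accepts pfns in mapped memory. */
static bool toy_pfn_valid(unsigned long pfn)
{
	size_t i;

	for (i = 0; i < sizeof(toy_regions) / sizeof(toy_regions[0]); i++) {
		const struct toy_region *r = &toy_regions[i];

		if (pfn >= r->start_pfn && pfn < r->end_pfn)
			return !r->nomap;
	}
	return false;
}

/* Counts pfns in [start, end) whose struct page would stay uninitialized. */
static unsigned long toy_count_uninitialized(unsigned long start,
					     unsigned long end)
{
	unsigned long pfn, n = 0;

	for (pfn = start; pfn < end; pfn++)
		if (!toy_pfn_valid(pfn))
			n++;
	return n;
}
```

In this made-up layout the section is only partly initialized: 0x300 of its 0x40000 pages are skipped.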

-Robert

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14  9:11   ` Ard Biesheuvel
  (?)
@ 2017-01-04 13:28     ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-04 13:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote:
> The NUMA code may get confused by the presence of NOMAP regions within
> zones, resulting in spurious BUG() checks where the node id deviates
> from the containing zone's node id.
> 
> Since the kernel has no business reasoning about node ids of pages it
> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
> that such pages are disregarded.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/Kconfig | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 111742126897..0472afe64d55 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>  	def_bool y
>  	depends on NUMA
>  
> +config HOLES_IN_ZONE
> +	def_bool y
> +	depends on NUMA
> +
>  source kernel/Kconfig.preempt
>  source kernel/Kconfig.hz

I'm happy to apply this, but I'll hold off until the first patch is queued
somewhere, since this doesn't help without the VM_BUG_ON being moved.

Alternatively, I can queue both if somebody from the mm camp acks the
first patch.
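
The dependency between the two patches can be sketched with a toy model. This is not the mm code: all toy_* names are hypothetical, and the macro only mimics how pfn_valid_within() turns into a real check when CONFIG_HOLES_IN_ZONE is set. The point is the ordering: the hole check has to come before any field of the page is read.

```c
#include <stdbool.h>

#define TOY_HOLES_IN_ZONE 1	/* stands in for CONFIG_HOLES_IN_ZONE */

/* Toy page: 'valid' is false for a hole, whose 'nid' is then garbage. */
struct toy_page {
	bool valid;
	int nid;
};

#if TOY_HOLES_IN_ZONE
#define toy_pfn_valid_within(page) ((page)->valid)
#else
#define toy_pfn_valid_within(page) (1)
#endif

/* Hypothetical zone on node 1 with one uninitialized hole in the middle. */
static const struct toy_page toy_sample[3] = {
	{ true, 1 }, { false, 0 }, { true, 1 },
};

/*
 * Returns the number of pages handled, or -1 where the kernel would
 * trip the node id check.
 */
static int toy_move_pages(const struct toy_page *pages, int count,
			  int zone_nid)
{
	int i, moved = 0;

	for (i = 0; i < count; i++) {
		const struct toy_page *page = &pages[i];

		/* Skip holes first, so nid is never read from a hole. */
		if (!toy_pfn_valid_within(page))
			continue;

		/* Only now is it safe to compare node ids. */
		if (page->nid != zone_nid)
			return -1;
		moved++;
	}
	return moved;
}
```

With the checks in the opposite order, the hole in toy_sample would be compared against the zone's node id before being skipped, and the option alone would not help.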

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-04 13:28     ` Will Deacon
  (?)
@ 2017-01-04 13:50       ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2017-01-04 13:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 January 2017 at 13:28, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote:
>> The NUMA code may get confused by the presence of NOMAP regions within
>> zones, resulting in spurious BUG() checks where the node id deviates
>> from the containing zone's node id.
>>
>> Since the kernel has no business reasoning about node ids of pages it
>> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
>> that such pages are disregarded.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/Kconfig | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 111742126897..0472afe64d55 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>       def_bool y
>>       depends on NUMA
>>
>> +config HOLES_IN_ZONE
>> +     def_bool y
>> +     depends on NUMA
>> +
>>  source kernel/Kconfig.preempt
>>  source kernel/Kconfig.hz
>
> I'm happy to apply this, but I'll hold off until the first patch is queued
> somewhere, since this doesn't help without the VM_BUG_ON being moved.
>
> Alternatively, I can queue both if somebody from the mm camp acks the
> first patch.
>

Actually, I am not convinced the discussion is finalized. These
patches do fix the issue, but Robert also suggested an alternative fix
which may be preferable.

http://marc.info/?l=linux-arm-kernel&m=148190753510107&w=2

I haven't responded to it yet, due to the holidays, but I'd like to
explore that solution a bit further before applying anything, if you
don't mind.

Thanks,
Ard.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-04 13:50       ` Ard Biesheuvel
  (?)
@ 2017-01-04 14:02         ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-04 14:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 04, 2017 at 01:50:20PM +0000, Ard Biesheuvel wrote:
> On 4 January 2017 at 13:28, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Dec 14, 2016 at 09:11:47AM +0000, Ard Biesheuvel wrote:
> >> The NUMA code may get confused by the presence of NOMAP regions within
> >> zones, resulting in spurious BUG() checks where the node id deviates
> >> from the containing zone's node id.
> >>
> >> Since the kernel has no business reasoning about node ids of pages it
> >> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
> >> that such pages are disregarded.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  arch/arm64/Kconfig | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >> index 111742126897..0472afe64d55 100644
> >> --- a/arch/arm64/Kconfig
> >> +++ b/arch/arm64/Kconfig
> >> @@ -614,6 +614,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> >>       def_bool y
> >>       depends on NUMA
> >>
> >> +config HOLES_IN_ZONE
> >> +     def_bool y
> >> +     depends on NUMA
> >> +
> >>  source kernel/Kconfig.preempt
> >>  source kernel/Kconfig.hz
> >
> > I'm happy to apply this, but I'll hold off until the first patch is queued
> > somewhere, since this doesn't help without the VM_BUG_ON being moved.
> >
> > Alternatively, I can queue both if somebody from the mm camp acks the
> > first patch.
> >
> 
> Actually, I am not convinced the discussion is finalized. These
> patches do fix the issue, but Robert also suggested an alternative fix
> which may be preferable.
> 
> http://marc.info/?l=linux-arm-kernel&m=148190753510107&w=2
> 
> I haven't responded to it yet, due to the holidays, but I'd like to
> explore that solution a bit further before applying anything, if you
> don't mind.

Using early_pfn_valid feels like a bodge to me, since having pfn_valid
return false for something that early_pfn_valid says is valid (and is
therefore initialised in the memmap) makes the NOMAP semantics even more
confusing.
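
To make the mismatch concrete, here is a rough userspace sketch, not
kernel code. The struct, the region layout and the model_* names are
made up for illustration, and it assumes that pfn_valid() is backed by
a lookup that excludes NOMAP regions while the early check accepts any
pfn that is memory at all:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memblock region. */
struct region {
	unsigned long base_pfn;
	unsigned long nr_pages;
	bool nomap;		/* models the NOMAP attribute */
};

/* Illustrative layout: ordinary RAM with a NOMAP carve-out in the middle. */
static const struct region regions[] = {
	{ 0x1000, 0x100, false },
	{ 0x1100, 0x010, true  },
	{ 0x1110, 0x0f0, false },
};

static const struct region *find_region(unsigned long pfn)
{
	for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		if (pfn >= regions[i].base_pfn &&
		    pfn - regions[i].base_pfn < regions[i].nr_pages)
			return &regions[i];
	}
	return NULL;
}

/* Models pfn_valid(): only memory the kernel maps counts as valid. */
static bool model_pfn_valid(unsigned long pfn)
{
	const struct region *r = find_region(pfn);

	return r && !r->nomap;
}

/* Models an early check that accepts any memory, mapped or not. */
static bool model_early_pfn_valid(unsigned long pfn)
{
	return find_region(pfn) != NULL;
}
```

In this model, pfn 0x1100 is accepted by model_early_pfn_valid() but
rejected by model_pfn_valid(), which is exactly the combination that
makes the semantics hard to reason about.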

But there's no rush, so I'll hold off for the moment. I was under the
impression that things had stalled.

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-04 14:02         ` Will Deacon
  (?)
@ 2017-01-05 11:24           ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2017-01-05 11:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 04.01.17 14:02:23, Will Deacon wrote:
> Using early_pfn_valid feels like a bodge to me, since having pfn_valid
> return false for something that early_pfn_valid says is valid (and is
> therefore initialised in the memmap) makes the NOMAP semantics even more
> confusing.

The concern I have had with HOLES_IN_ZONE is that it enables
pfn_valid_within() for arm64. This means that every pfn of a section is
checked, whereas otherwise the check is done only once per section. With
up to 2^18 pages per section, we traverse the memblock list more often
by that factor. There could be a performance regression. I don't have
numbers yet, since the fix causes another kernel crash, which is the next
problem I have. The crash doesn't happen otherwise. So either the fix
uncovers another bug or it is incomplete, though the changes look like
they should work. This needs more investigation.
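
As a rough model of the cost I am worried about (a userspace sketch,
not kernel code): the MODEL_* constants assume 4 KiB pages and 1 GiB
sections, which is where the 2^18 comes from, and counting_pfn_valid()
and walk_section() are hypothetical names that only count how often a
memblock search would be triggered:

```c
#include <stdbool.h>

/* Assumed values: 4 KiB pages and 1 GiB sparsemem sections. */
#define MODEL_PAGE_SHIFT	12
#define MODEL_SECTION_SIZE_BITS	30
#define MODEL_PAGES_PER_SECTION	\
	(1UL << (MODEL_SECTION_SIZE_BITS - MODEL_PAGE_SHIFT))

static unsigned long lookups;	/* number of simulated memblock searches */

/* Stand-in for a pfn_valid() that has to search the memblock list. */
static bool counting_pfn_valid(unsigned long pfn)
{
	(void)pfn;
	lookups++;
	return true;
}

/*
 * Walk one section and return the number of lookups performed.
 * Without holes the section is checked once; with holes every
 * remaining pfn is checked as well, as pfn_valid_within() would do.
 */
static unsigned long walk_section(unsigned long start_pfn, bool holes_in_zone)
{
	lookups = 0;
	if (!counting_pfn_valid(start_pfn))
		return lookups;
	if (holes_in_zone) {
		for (unsigned long i = 1; i < MODEL_PAGES_PER_SECTION; i++)
			counting_pfn_valid(start_pfn + i);
	}
	return lookups;
}
```

With these assumptions, walk_section(pfn, false) does a single lookup
and walk_section(pfn, true) does 262144 of them. How often real code
walks pfns like this is a separate question that the model does not
answer.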

-Robert

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-05 11:24           ` Robert Richter
  (?)
@ 2017-01-05 12:08             ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-05 12:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 05, 2017 at 12:24:07PM +0100, Robert Richter wrote:
> On 04.01.17 14:02:23, Will Deacon wrote:
> > Using early_pfn_valid feels like a bodge to me, since having pfn_valid
> > return false for something that early_pfn_valid says is valid (and is
> > therefore initialised in the memmap) makes the NOMAP semantics even more
> > confusing.
> 
> The concern I have had with HOLES_IN_ZONE is that it enables
> pfn_valid_within() for arm64. This means that every pfn of a section is
> checked, whereas otherwise the check is done only once per section. With
> up to 2^18 pages per section, we traverse the memblock list more often
> by that factor. There could be a performance regression.

There could be, but we're trying to fix a bug here. I wouldn't have
thought that walking over pfns like that is done very often.

> I don't have numbers yet, since the fix causes another kernel crash,
> which is the next problem I have. The crash doesn't happen otherwise. So
> either the fix uncovers another bug or it is incomplete, though the
> changes look like they should work. This needs more investigation.

I really can't see how the fix causes a crash, and I couldn't reproduce
it on any of my boards, nor could any of the Linaro folk afaik. Are you
definitely running mainline with just these two patches from Ard?

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-05 12:08             ` Will Deacon
  (?)
@ 2017-01-05 12:22               ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2017-01-05 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 05.01.17 12:08:20, Will Deacon wrote:
> On Thu, Jan 05, 2017 at 12:24:07PM +0100, Robert Richter wrote:
> > On 04.01.17 14:02:23, Will Deacon wrote:
> > > Using early_pfn_valid feels like a bodge to me, since having pfn_valid
> > > return false for something that early_pfn_valid says is valid (and is
> > > therefore initialised in the memmap) makes the NOMAP semantics even more
> > > confusing.
> > 
> > The concern I have had with HOLES_IN_ZONE is that it enables
> > pfn_valid_within() for arm64. This means that every pfn of a section is
> > checked, whereas otherwise the check is done only once per section. With
> > up to 2^18 pages per section, we traverse the memblock list more often
> > by that factor. There could be a performance regression.
> 
> There could be, but we're trying to fix a bug here. I wouldn't have
> thought that walking over pfns like that is done very often.

The bug happens on a small number of machines depending on the memory
layout. The fix affects all systems. And right now the impact is
unclear.

> > I don't have numbers yet, since the fix causes another kernel crash,
> > which is the next problem I have. The crash doesn't happen otherwise. So
> > either the fix uncovers another bug or it is incomplete, though the
> > changes look like they should work. This needs more investigation.
> 
> I really can't see how the fix causes a crash, and I couldn't reproduce
> it on any of my boards, nor could any of the Linaro folk afaik. Are you
> definitely running mainline with just these two patches from Ard?

Yes, just both patches applied. Various other solutions were working.

-Robert

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-05 12:22               ` Robert Richter
  (?)
@ 2017-01-05 19:49                 ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2017-01-05 19:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 05.01.17 13:22:00, Robert Richter wrote:
> On 05.01.17 12:08:20, Will Deacon wrote:
> > I really can't see how the fix causes a crash, and I couldn't reproduce
> > it on any of my boards, nor could any of the Linaro folk afaik. Are you
> > definitely running mainline with just these two patches from Ard?
> 
> Yes, just both patches applied. Various other solutions were working.

I have retested the same kernel (v4.9 based) as before and now it
boots fine including rtc-efi device registration (it was crashing
there):

 rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0

There could be a difference in firmware and memory setup, though I also
downgraded the firmware to test it and can't reproduce the crash anymore.
I could reliably trigger it the first time.

For the record, here is the oops.

-Robert


Unable to handle kernel paging request at virtual address 20251000
pgd = ffff000009090000
[20251000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000000fdc030003, *pte=00e8832000250707

Internal error: Oops: 96000047 [#1] SMP
Modules linked in:
CPU: 49 PID: 1 Comm: swapper/0 Tainted: G        W       4.9.0.0.vanilla10-00002-g429605e9ab0a #1
Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016
task: ffff800feee6bc00 task.stack: ffff800fec050000
PC is at 0x201ff820
LR is at 0x201fdfc0
pc : [<00000000201ff820>] lr : [<00000000201fdfc0>] pstate: 20000045
sp : ffff800fec053b70
x29: ffff800fec053bc0 x28: 0000000000000000 
x27: ffff000008ce3e08 x26: ffff000008c52568 
x25: ffff000008bf045c x24: ffff000008bdb828 
x23: 0000000000000000 x22: 0000000000000040 
x21: ffff800fec053bb8 x20: 0000000020251000 
x19: ffff800fec053c20 x18: 0000000000000000 
x17: 0000000000000000 x16: 00000000bbb67a65 
x15: ffffffffffffffff x14: ffff810016ea291c 
x13: ffff810016ea2181 x12: 0000000000000030 
x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
x9 : feff716475687163 x8 : ffffffffffffffff 
x7 : 83f0680000000000 x6 : 0000000000000000 
x5 : ffff800fc187aab9 x4 : 0002000000000000 
x3 : ffff800fec053bb8 x2 : 0000000000000000 
x1 : 83f0680000000000 x0 : 0000000020251000 

Process swapper/0 (pid: 1, stack limit = 0xffff800fec050020)
Stack: (0xffff800fec053b70 to 0xffff800fec054000)
3b60:                                   ffff800fec053c20 ffff800fec053c20
3b80: ffff800fec053c10 00000000201fd500 ffff000008e660d0 ffff800fec053c20
3ba0: ffff0000086eb954 ffff0000086eb930 ffff800fec053bc0 ffff0000086eb934
3bc0: ffff800fec053bf0 ffff000008c3eef4 ffff000008e602a0 ffff000008e602b0
3be0: ffff000008e60740 ffff000008e60768 ffff800fec053c30 ffff000008586c88
3c00: 00000000ffffffed ffff00000858023c ffff800fec053c30 ffff000008586c68
3c20: 0000000000000000 ffff000008e602b0 ffff800fec053c60 ffff0000085845d4
3c40: ffff000008e602b0 ffff000009049000 0000000000000000 ffff000008e60768
3c60: ffff800fec053ca0 ffff0000085848ac ffff000008e602b0 ffff000008e60310
3c80: ffff000008e60768 0000000000000000 ffff000008e4d000 ffff000008bdb828
3ca0: ffff800fec053cd0 ffff000008581e08 0000000000000000 ffff000008e60768
3cc0: ffff000008584788 0000000000000000 ffff800fec053d10 ffff000008583c30
3ce0: ffff000008e60768 ffff810fed477c00 ffff000008e4deb0 0000000000000000
3d00: ffff800fe54554a8 ffff810fed478e68 ffff800fec053d30 ffff000008583668
3d20: ffff000008e60768 ffff810fed477c00 ffff800fec053d70 ffff000008585430
3d40: ffff000008e60768 0000000000000000 ffff000008c3eed0 ffff000008e60768
3d60: ffff000008ef0000 0000000000000000 ffff800fec053d90 ffff000008586e3c
3d80: ffff000008e60740 0000000000000000 ffff800fec053dc0 ffff000008c3eec8
3da0: ffff000008c3eea8 ffff800fec050000 0000000000000000 0000000000000006
3dc0: ffff800fec053dd0 ffff000008082d94 ffff800fec053e40 ffff000008bf0d0c
3de0: 00000000000000f3 ffff000008ef0000 ffff000008c52578 0000000000000006
3e00: ffff000008ce3600 0000000000000000 ffff000008da2428 ffff000008ab2fa8
3e20: 0000000000000000 0000000600000006 ffff000008bf045c ffff000008bdb828
3e40: ffff800fec053ea0 ffff00000885e7a0 ffff00000885e788 0000000000000000
3e60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3e80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ea0: 0000000000000000 ffff000008082b30 ffff00000885e788 0000000000000000
3ec0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fc0: 0000000000000000 0000000000000005 0000000000000000 0000000000000000
3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call trace:
Exception stack(0xffff800fec0539a0 to 0xffff800fec053ad0)
39a0: ffff800fec053c20 0001000000000000 ffff800fec053b70 00000000201ff820
39c0: 0000000000000000 ffff810000412890 ffff800fec0539f0 ffff000008405534
39e0: ffff810000412890 ffff810016e90e30 ffff800fec053a20 ffff00000840682c
3a00: 0000000000000000 ffff800fc168f880 0000000000000000 ffff00000840668c
3a20: ffff800fec053ac0 ffff0000084069f8 ffff00000903e7b0 0000000000000001
3a40: 0000000020251000 83f0680000000000 0000000000000000 ffff800fec053bb8
3a60: 0002000000000000 ffff800fc187aab9 0000000000000000 83f0680000000000
3a80: ffffffffffffffff feff716475687163 7f7f7f7f7f7f7f7f 0101010101010101
3aa0: 0000000000000030 ffff810016ea2181 ffff810016ea291c ffffffffffffffff
3ac0: 00000000bbb67a65 0000000000000000
[<00000000201ff820>] 0x201ff820
[<ffff000008c3eef4>] efi_rtc_probe+0x24/0x78
[<ffff000008586c88>] platform_drv_probe+0x60/0xc8
[<ffff0000085845d4>] driver_probe_device+0x26c/0x420
[<ffff0000085848ac>] __driver_attach+0x124/0x128
[<ffff000008581e08>] bus_for_each_dev+0x70/0xb0
[<ffff000008583c30>] driver_attach+0x30/0x40
[<ffff000008583668>] bus_add_driver+0x200/0x2b8
[<ffff000008585430>] driver_register+0x68/0x100
[<ffff000008586e3c>] __platform_driver_probe+0x84/0x128
[<ffff000008c3eec8>] efi_rtc_driver_init+0x20/0x28
[<ffff000008082d94>] do_one_initcall+0x44/0x138
[<ffff000008bf0d0c>] kernel_init_freeable+0x1ac/0x24c
[<ffff00000885e7a0>] kernel_init+0x18/0x110
[<ffff000008082b30>] ret_from_fork+0x10/0x20
Code: f9400000 d5033d9f d65f03c0 d5033e9f (f9000001) 
---[ end trace e420ef9636e3c9b2 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-05 19:49                 ` Robert Richter
  (?)
@ 2017-01-06 12:03                   ` Will Deacon
  -1 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2017-01-06 12:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 05, 2017 at 08:49:44PM +0100, Robert Richter wrote:
> On 05.01.17 13:22:00, Robert Richter wrote:
> > On 05.01.17 12:08:20, Will Deacon wrote:
> > > I really can't see how the fix causes a crash, and I couldn't reproduce
> > > it on any of my boards, nor could any of the Linaro folk afaik. Are you
> > > definitely running mainline with just these two patches from Ard?
> > 
> > Yes, just both patches applied. Various other solutions were working.
> 
> I have retested the same kernel (v4.9 based) as before and now it
> boots fine including rtc-efi device registration (it was crashing
> there):
> 
>  rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
> 
> There could be a difference in firmware and mem setup, though I also
> downgraded the firmware to test it, but can't reproduce it anymore. I
> could reliably trigger the crash the first time.
> 
> FTR the oops.

Hmm, I just can't help but think you were accidentally running with
additional patches when you saw this oops previously. For example,
your log looks very similar to this one:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473666.html

but then again, these crashes probably often look alike.

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2017-01-06 12:03                   ` Will Deacon
  (?)
@ 2017-01-06 12:22                     ` Ard Biesheuvel
  -1 siblings, 0 replies; 57+ messages in thread
From: Ard Biesheuvel @ 2017-01-06 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 6 January 2017 at 12:03, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Jan 05, 2017 at 08:49:44PM +0100, Robert Richter wrote:
>> On 05.01.17 13:22:00, Robert Richter wrote:
>> > On 05.01.17 12:08:20, Will Deacon wrote:
>> > > I really can't see how the fix causes a crash, and I couldn't reproduce
>> > > it on any of my boards, nor could any of the Linaro folk afaik. Are you
>> > > definitely running mainline with just these two patches from Ard?
>> >
>> > Yes, just both patches applied. Various other solutions were working.
>>
>> I have retested the same kernel (v4.9 based) as before and now it
>> boots fine including rtc-efi device registration (it was crashing
>> there):
>>
>>  rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
>>
>> There could be a difference in firmware and mem setup, though I also
>> downgraded the firmware to test it, but can't reproduce it anymore. I
>> could reliably trigger the crash the first time.
>>
>> FTR the oops.
>
> Hmm, I just can't help but think you were accidentally running with
> additional patches when you saw this oops previously. For example,
> your log looks very similar to this one:
>
>   http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473666.html
>
> but then again, these crashes probably often look alike.
>

These are quite different, in fact. In James's case, the UEFI memory
map was missing some entries, so not all memory regions that the
firmware expected to be there were actually mapped, hence the all-zero
*pte. In Robert's case, it looks like the UEFI runtime services page
tables are corrupted, i.e., *pte has RES0 bits set.
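
For readers who want to inspect the descriptor from Robert's oops
(*pte=00e8832000250707) by hand, here is a minimal Python sketch that is
not part of the original thread. The field positions are my reading of
the ARMv8-A stage 1 page descriptor for a 4KB granule with a 48-bit
output address, and the names FIELDS and decode are made up for this
illustration. The thread does not say which bits Ard considers reserved,
so the sketch only splits the raw value into fields and draws no
conclusion.

```python
# Assumed layout: ARMv8-A stage 1 page descriptor, 4KB granule, 48-bit OA.
# Each entry is (name, lowest bit, highest bit). Treat the names as a
# reading aid, not as an authoritative copy of the architecture manual.
FIELDS = [
    ("valid", 0, 0),
    ("type", 1, 1),
    ("AttrIndx", 2, 4),
    ("NS", 5, 5),
    ("AP", 6, 7),
    ("SH", 8, 9),
    ("AF", 10, 10),
    ("nG", 11, 11),
    ("OA", 12, 47),           # output address bits [47:12]
    ("bits_50_48", 48, 50),   # assumed reserved with a 48-bit OA
    ("bit_51", 51, 51),       # assumed DBM from ARMv8.1 on, reserved before
    ("Contiguous", 52, 52),
    ("PXN", 53, 53),
    ("UXN", 54, 54),
    ("software", 55, 58),
    ("bits_63_59", 59, 63),
]


def decode(desc):
    """Split a 64-bit descriptor into the fields listed in FIELDS."""
    out = {}
    for name, lo, hi in FIELDS:
        width = hi - lo + 1
        out[name] = (desc >> lo) & ((1 << width) - 1)
    return out


if __name__ == "__main__":
    # Descriptor value taken from the oops quoted in this thread.
    for name, value in decode(0x00E8832000250707).items():
        print(f"{name:>12}: {value:#x}")
```

Under these assumptions the value decodes as a valid page descriptor
with an output address of 0x832000250000, bits [50:48] clear and bit 51
set. An all-zero descriptor, as in James's report, decodes with the
valid bit clear.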

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
  2016-12-14  9:11   ` Ard Biesheuvel
  (?)
@ 2017-02-06 13:36     ` Robert Richter
  -1 siblings, 0 replies; 57+ messages in thread
From: Robert Richter @ 2017-02-06 13:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 14.12.16 09:11:47, Ard Biesheuvel wrote:
> The NUMA code may get confused by the presence of NOMAP regions within
> zones, resulting in spurious BUG() checks where the node id deviates
> from the containing zone's node id.
> 
> Since the kernel has no business reasoning about node ids of pages it
> does not own in the first place, enable CONFIG_HOLES_IN_ZONE to ensure
> that such pages are disregarded.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

I would rather see a solution other than making pfn_valid checks more
fine grained, but this patch also fixes the issue. So:

Acked-by: Robert Richter <rrichter@cavium.com>
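
As an illustration of the behaviour the patch description relies on,
here is a toy Python model that is not kernel code and not part of the
original thread. It assumes that pfn_valid() reports pages in NOMAP
regions as invalid, as the description implies, and that enabling
CONFIG_HOLES_IN_ZONE makes a zone walk consult pfn_valid() for every
page. The names Page, NodeMismatch and walk_zone are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


class NodeMismatch(Exception):
    """Stands in for the BUG() raised on a node id mismatch."""


@dataclass
class Page:
    nid: Optional[int]    # None models a node id that was never initialized
    nomap: bool = False   # True models a page inside a NOMAP region


def pfn_valid(zone: List[Page], pfn: int) -> bool:
    """Toy stand-in: a pfn is valid if it is in range and not NOMAP."""
    return 0 <= pfn < len(zone) and not zone[pfn].nomap


def walk_zone(zone: List[Page], zone_nid: int, holes_in_zone: bool) -> List[int]:
    """Return the pfns the walk accepts; raise on a node id mismatch."""
    accepted = []
    for pfn, page in enumerate(zone):
        # With holes allowed, pages that fail pfn_valid() are skipped
        # before any of their fields are looked at.
        if holes_in_zone and not pfn_valid(zone, pfn):
            continue
        if page.nid != zone_nid:
            raise NodeMismatch(f"pfn {pfn}: nid {page.nid} != {zone_nid}")
        accepted.append(pfn)
    return accepted
```

With a zone such as [Page(1), Page(None, nomap=True), Page(1)], the walk
raises NodeMismatch when holes_in_zone is False and skips the NOMAP page
when it is True. The extra per-page check is the finer-grained
pfn_valid testing that Robert would rather avoid.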

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2017-02-06 13:36 UTC | newest]

Thread overview: 57+ messages
2016-12-14  9:11 [PATCH 0/2] arm64: numa: fix spurious BUG() on NOMAP regions Ard Biesheuvel
2016-12-14  9:11 ` [PATCH 1/2] mm: don't dereference struct page fields of invalid pages Ard Biesheuvel
2017-01-04 12:16   ` Will Deacon
2016-12-14  9:11 ` [PATCH 2/2] arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA Ard Biesheuvel
2016-12-15 15:39   ` Robert Richter
2016-12-15 16:07     ` Ard Biesheuvel
2016-12-16 17:10       ` Robert Richter
2016-12-16  1:57     ` Hanjun Guo
2016-12-16 17:14       ` Robert Richter
2017-01-04 13:28   ` Will Deacon
2017-01-04 13:50     ` Ard Biesheuvel
2017-01-04 14:02       ` Will Deacon
2017-01-05 11:24         ` Robert Richter
2017-01-05 12:08           ` Will Deacon
2017-01-05 12:22             ` Robert Richter
2017-01-05 19:49               ` Robert Richter
2017-01-06 12:03                 ` Will Deacon
2017-01-06 12:22                   ` Ard Biesheuvel
2017-02-06 13:36   ` Robert Richter
