All of lore.kernel.org
 help / color / mirror / Atom feed
From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
Date: Thu, 1 Dec 2016 16:45:39 +0000	[thread overview]
Message-ID: <20161201164538.GB1236@arm.com> (raw)
In-Reply-To: <1480530091-1092-1-git-send-email-rrichter@cavium.com>

On Wed, Nov 30, 2016 at 07:21:31PM +0100, Robert Richter wrote:
> On ThunderX systems with certain memory configurations we see the
> following BUG_ON():
> 
>  kernel BUG at mm/page_alloc.c:1848!
> 
> This happens for some configs with 64k page size enabled. The BUG_ON()
> checks if start and end page of a memmap range belongs to the same
> zone.
> 
> The BUG_ON() check fails if a memory zone contains NOMAP regions. In
> this case the node information of those pages is not initialized. This
> causes an inconsistency of the page links with wrong zone and node
> information for that pages. NOMAP pages from node 1 still point to the
> mem zone from node 0 and have the wrong nid assigned.
> 
> The reason for the mis-configuration is a change in pfn_valid() which
> reports pages marked NOMAP as invalid:
> 
>  68709f45385a arm64: only consider memblocks with NOMAP cleared for linear mapping
> 
> This causes pages marked as nomap being no long reassigned to the new
> zone in memmap_init_zone() by calling __init_single_pfn().
> 
> Fixing this by restoring the old behavior of pfn_valid() to use
> memblock_is_memory(). Also changing users of pfn_valid() in arm64 code
> to use memblock_is_map_memory() where necessary. This only affects
> code in ioremap.c. The code in mmu.c still can use the new version of
> pfn_valid().
> 
> As a consequence, pfn_valid() can not be used to check if a physical
> page is RAM. It just checks if there is an underlying memmap with a
> valid struct page. Moreover, for performance reasons the whole memmap
> (with pageblock_nr_pages number of pages) has valid pfns (SPARSEMEM
> config). The memory range is extended to fit the alignment of the
> memmap. Thus, pfn_valid() may return true for pfns that do not map to
> physical memory. Those pages are simply not reported to the mm, they
> are not marked reserved nor added to the list of free pages. Other
> functions such a page_is_ram() or memblock_is_map_ memory() must be
> used to check for memory and if the page can be mapped with the linear
> mapping.
> 
> Since NOMAP mem ranges may need to be mapped with different mem
> attributes (e.g. read-only or non-caching) we can not use linear
> mapping here. The use of memblock_is_memory() in pfn_valid() may not
> break this behaviour. Since commit:
> 
>  e7cd190385d1 arm64: mark reserved memblock regions explicitly in iomem
> 
> NOMAP mem resources are no longer marked as system RAM (IORESOURCE_
> SYSTEM_RAM). Now page_is_ram() and region_intersects() (see
> memremap()) do not detect NOMAP mem as system ram and NOMAP mem is not
> added to the linear mapping as system RAM is.
> 
> v2:
> 
>  * Added Ack
>  * updated description to reflect the discussion
> 
> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>  arch/arm64/mm/init.c    | 2 +-
>  arch/arm64/mm/ioremap.c | 5 +++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 212c4d1e2f26..166911f4a2e6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -147,7 +147,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
>  {
> -	return memblock_is_map_memory(pfn << PAGE_SHIFT);
> +	return memblock_is_memory(pfn << PAGE_SHIFT);
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  #endif
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index 01e88c8bcab0..c17c220b0c48 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -21,6 +21,7 @@
>   */
>  
>  #include <linux/export.h>
> +#include <linux/memblock.h>
>  #include <linux/mm.h>
>  #include <linux/vmalloc.h>
>  #include <linux/io.h>
> @@ -55,7 +56,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>  	/*
>  	 * Don't allow RAM to be mapped.
>  	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(memblock_is_map_memory(phys_addr)))
>  		return NULL;
>  
>  	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -96,7 +97,7 @@ EXPORT_SYMBOL(__iounmap);
>  void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  {
>  	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (memblock_is_map_memory(phys_addr))
>  		return (void __iomem *)__phys_to_virt(phys_addr);

Thanks for sending out the new patch. Whilst I'm still a bit worried about
changing pfn_valid like this, I guess we'll just have to fix up any callers
which suffer from this change.

Acked-by: Will Deacon <will.deacon@arm.com>

I'd like to see this sit in -next for a bit before we send it further.

Will

WARNING: multiple messages have this Message-ID (diff)
From: Will Deacon <will.deacon@arm.com>
To: Robert Richter <rrichter@cavium.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	David Daney <david.daney@cavium.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Hanjun Guo <hanjun.guo@linaro.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
Date: Thu, 1 Dec 2016 16:45:39 +0000	[thread overview]
Message-ID: <20161201164538.GB1236@arm.com> (raw)
In-Reply-To: <1480530091-1092-1-git-send-email-rrichter@cavium.com>

On Wed, Nov 30, 2016 at 07:21:31PM +0100, Robert Richter wrote:
> On ThunderX systems with certain memory configurations we see the
> following BUG_ON():
> 
>  kernel BUG at mm/page_alloc.c:1848!
> 
> This happens for some configs with 64k page size enabled. The BUG_ON()
> checks if start and end page of a memmap range belongs to the same
> zone.
> 
> The BUG_ON() check fails if a memory zone contains NOMAP regions. In
> this case the node information of those pages is not initialized. This
> causes an inconsistency of the page links with wrong zone and node
> information for that pages. NOMAP pages from node 1 still point to the
> mem zone from node 0 and have the wrong nid assigned.
> 
> The reason for the mis-configuration is a change in pfn_valid() which
> reports pages marked NOMAP as invalid:
> 
>  68709f45385a arm64: only consider memblocks with NOMAP cleared for linear mapping
> 
> This causes pages marked as nomap being no long reassigned to the new
> zone in memmap_init_zone() by calling __init_single_pfn().
> 
> Fixing this by restoring the old behavior of pfn_valid() to use
> memblock_is_memory(). Also changing users of pfn_valid() in arm64 code
> to use memblock_is_map_memory() where necessary. This only affects
> code in ioremap.c. The code in mmu.c still can use the new version of
> pfn_valid().
> 
> As a consequence, pfn_valid() can not be used to check if a physical
> page is RAM. It just checks if there is an underlying memmap with a
> valid struct page. Moreover, for performance reasons the whole memmap
> (with pageblock_nr_pages number of pages) has valid pfns (SPARSEMEM
> config). The memory range is extended to fit the alignment of the
> memmap. Thus, pfn_valid() may return true for pfns that do not map to
> physical memory. Those pages are simply not reported to the mm, they
> are not marked reserved nor added to the list of free pages. Other
> functions such a page_is_ram() or memblock_is_map_ memory() must be
> used to check for memory and if the page can be mapped with the linear
> mapping.
> 
> Since NOMAP mem ranges may need to be mapped with different mem
> attributes (e.g. read-only or non-caching) we can not use linear
> mapping here. The use of memblock_is_memory() in pfn_valid() may not
> break this behaviour. Since commit:
> 
>  e7cd190385d1 arm64: mark reserved memblock regions explicitly in iomem
> 
> NOMAP mem resources are no longer marked as system RAM (IORESOURCE_
> SYSTEM_RAM). Now page_is_ram() and region_intersects() (see
> memremap()) do not detect NOMAP mem as system ram and NOMAP mem is not
> added to the linear mapping as system RAM is.
> 
> v2:
> 
>  * Added Ack
>  * updated description to reflect the discussion
> 
> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>  arch/arm64/mm/init.c    | 2 +-
>  arch/arm64/mm/ioremap.c | 5 +++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 212c4d1e2f26..166911f4a2e6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -147,7 +147,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
>  {
> -	return memblock_is_map_memory(pfn << PAGE_SHIFT);
> +	return memblock_is_memory(pfn << PAGE_SHIFT);
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  #endif
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index 01e88c8bcab0..c17c220b0c48 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -21,6 +21,7 @@
>   */
>  
>  #include <linux/export.h>
> +#include <linux/memblock.h>
>  #include <linux/mm.h>
>  #include <linux/vmalloc.h>
>  #include <linux/io.h>
> @@ -55,7 +56,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>  	/*
>  	 * Don't allow RAM to be mapped.
>  	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(memblock_is_map_memory(phys_addr)))
>  		return NULL;
>  
>  	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -96,7 +97,7 @@ EXPORT_SYMBOL(__iounmap);
>  void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  {
>  	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (memblock_is_map_memory(phys_addr))
>  		return (void __iomem *)__phys_to_virt(phys_addr);

Thanks for sending out the new patch. Whilst I'm still a bit worried about
changing pfn_valid like this, I guess we'll just have to fix up any callers
which suffer from this change.

Acked-by: Will Deacon <will.deacon@arm.com>

I'd like to see this sit in -next for a bit before we send it further.

Will

  reply	other threads:[~2016-12-01 16:45 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-30 18:21 [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section Robert Richter
2016-11-30 18:21 ` Robert Richter
2016-12-01 16:45 ` Will Deacon [this message]
2016-12-01 16:45   ` Will Deacon
2016-12-01 17:26   ` James Morse
2016-12-01 17:26     ` James Morse
2016-12-02  7:11     ` Robert Richter
2016-12-02  7:11       ` Robert Richter
2016-12-02 14:48       ` James Morse
2016-12-02 14:48         ` James Morse
2016-12-02 14:49 ` [PATCH 0/2] Hibernate fixes for 'Fix memmap to be initialized for the entire section' James Morse
2016-12-02 14:49   ` James Morse
2016-12-02 14:49   ` [PATCH 1/2] arm64: mm: Mark nomap regions with the PG_reserved flag James Morse
2016-12-02 14:49     ` James Morse
2016-12-02 14:49   ` [PATCH 2/2] arm64: hibernate: report nomap regions as being pfn_nosave James Morse
2016-12-02 14:49     ` James Morse
2016-12-05 15:42   ` [PATCH 0/2] Hibernate fixes for 'Fix memmap to be initialized for the entire section' Ard Biesheuvel
2016-12-05 15:42     ` Ard Biesheuvel
2016-12-06 17:38     ` Will Deacon
2016-12-06 17:38       ` Will Deacon
2016-12-07  9:06       ` Robert Richter
2016-12-07  9:06         ` Robert Richter
2016-12-07 14:32         ` Will Deacon
2016-12-07 14:32           ` Will Deacon
2016-12-09 12:14 ` [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section Yisheng Xie
2016-12-09 12:14   ` Yisheng Xie
2016-12-09 12:19   ` Ard Biesheuvel
2016-12-09 12:19     ` Ard Biesheuvel
2016-12-09 12:23     ` Hanjun Guo
2016-12-09 12:23       ` Hanjun Guo
2016-12-09 13:15       ` Yisheng Xie
2016-12-09 13:15         ` Yisheng Xie
2016-12-09 14:52         ` Robert Richter
2016-12-09 14:52           ` Robert Richter
2016-12-12 12:02           ` Ard Biesheuvel
2016-12-12 12:02             ` Ard Biesheuvel
2016-12-09 14:51   ` Robert Richter
2016-12-09 14:51     ` Robert Richter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161201164538.GB1236@arm.com \
    --to=will.deacon@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.