[PATCH] arm64: fix MAX_ORDER for 64K pagesize

All of lore.kernel.org
 help / color / mirror / Atom feed

From: msalter@redhat.com (Mark Salter)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: fix MAX_ORDER for 64K pagesize
Date: Thu, 19 Jun 2014 14:12:04 -0400	[thread overview]
Message-ID: <1403201524.32688.62.camel@deneb.redhat.com> (raw)
In-Reply-To: <xa1t61jzwfo6.fsf@mina86.com>

On Tue, 2014-06-17 at 20:32 +0200, Michal Nazarewicz wrote:
> On Wed, Jun 11 2014, David Rientjes wrote:
> > On Wed, 11 Jun 2014, Mark Salter wrote:
> >
> >> With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE
> >> I get this at early boot:
> >> 
> >>   SMP: Total of 8 processors activated.
> >>   devtmpfs: initialized
> >>   Unable to handle kernel NULL pointer dereference at virtual address 00000008
> >>   pgd = fffffe0000050000
> >>   [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
> >>   Internal error: Oops: 96000006 [#1] SMP
> >>   Modules linked in:
> >>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
> >>   task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
> >>   PC is at __list_add+0x10/0xd4
> >>   LR is at free_one_page+0x270/0x638
> >>   ...
> >>   Call trace:
> >>   [<fffffe00003ee970>] __list_add+0x10/0xd4
> >>   [<fffffe000019c478>] free_one_page+0x26c/0x638
> >>   [<fffffe000019c8c8>] __free_pages_ok.part.52+0x84/0xbc
> >>   [<fffffe000019d5e8>] __free_pages+0x74/0xbc
> >>   [<fffffe0000c01350>] init_cma_reserved_pageblock+0xe8/0x104
> >>   [<fffffe0000c24de0>] cma_init_reserved_areas+0x190/0x1e4
> >>   [<fffffe0000090418>] do_one_initcall+0xc4/0x154
> >>   [<fffffe0000bf0a50>] kernel_init_freeable+0x204/0x2a8
> >>   [<fffffe00007520a0>] kernel_init+0xc/0xd4
> >> 
> >> This happens in this configuration because __free_one_page() is called
> >> with an order greater than MAX_ORDER, accesses past zone->free_list[]
> >> and passes a bogus list_head to list_add().
> >> 
> >> arch/arm64/Kconfig has:
> >> 
> >>   config FORCE_MAX_ZONEORDER
> >> 	int
> >> 	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> >> 	default "11"
> >> 
> >> So with THP turned off MAX_ORDER == 11 but init_cma_reserved_pageblock()
> >> passes __free_pages() an order of pageblock_order which is based on
> >> (HPAGE_SHIFT - PAGE_SHIFT) which is 13 for 64K pages. I worked around
> >> this by removing the THP test so FORCE_MAX_ZONEORDER is always 14 for
> >> ARM64_64K_PAGES.
> >> 
> >> Signed-off-by: Mark Salter <msalter@redhat.com>
> >> ---
> >>  arch/arm64/Kconfig | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >> index 7295419..42a334e 100644
> >> --- a/arch/arm64/Kconfig
> >> +++ b/arch/arm64/Kconfig
> >> @@ -269,7 +269,7 @@ config XEN
> >>  
> >>  config FORCE_MAX_ZONEORDER
> >>  	int
> >> -	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> >> +	default "14" if ARM64_64K_PAGES
> >>  	default "11"
> >>  
> >>  endmenu
> >
> > Any reason to not switch this to
> >
> > 	ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE && CMA
> >
> > instead?  If pageblock_order > MAX_ORDER because of 
> > HPAGE_SHIFT > PAGE_SHIFT, then cma is always going to be passing a 
> > too-large-order to free_pages_prepare() via this path.
> >
> > Adding Michal and Marek to the cc.
> 
> The correct fix would be to change init_cma_reserved_pageblock such that
> it checks whether pageblock_order > MAX_ORDER and if so frees each max
> order page of the pageblock individually:
> 
> --------- >8 ---------------------------------------------------------
> From: Michal Nazarewicz <mina86@mina86.com>
> Subject: [PATCH] mm: cma: fix cases where pageblock is bigger then MAX_ORDER
> 
> With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE,
> the following is triggered at early boot:
> 
>   SMP: Total of 8 processors activated.
>   devtmpfs: initialized
>   Unable to handle kernel NULL pointer dereference at virtual address 00000008
>   pgd = fffffe0000050000
>   [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
>   Internal error: Oops: 96000006 [#1] SMP
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
>   task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
>   PC is at __list_add+0x10/0xd4
>   LR is at free_one_page+0x270/0x638
>   ...
>   Call trace:
>   [<fffffe00003ee970>] __list_add+0x10/0xd4
>   [<fffffe000019c478>] free_one_page+0x26c/0x638
>   [<fffffe000019c8c8>] __free_pages_ok.part.52+0x84/0xbc
>   [<fffffe000019d5e8>] __free_pages+0x74/0xbc
>   [<fffffe0000c01350>] init_cma_reserved_pageblock+0xe8/0x104
>   [<fffffe0000c24de0>] cma_init_reserved_areas+0x190/0x1e4
>   [<fffffe0000090418>] do_one_initcall+0xc4/0x154
>   [<fffffe0000bf0a50>] kernel_init_freeable+0x204/0x2a8
>   [<fffffe00007520a0>] kernel_init+0xc/0xd4
> 
> This happens in this configuration because __free_one_page() is called
> with an order greater than MAX_ORDER, accesses past zone->free_list[]
> and passes a bogus list_head to list_add().
> 
> arch/arm64/Kconfig has:
> 
>   config FORCE_MAX_ZONEORDER
> 	int
> 	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> 	default "11"
> 
> So with THP turned off MAX_ORDER == 11 but init_cma_reserved_pageblock()
> passes __free_pages() an order of pageblock_order which is based on
> (HPAGE_SHIFT - PAGE_SHIFT) which is 13 for 64K pages.
> 
> Fix the problem by changing init_cma_reserved_pageblock() such that it
> splits pageblock into individual MAX_ORDER pages if pageblock is
> bigger than a MAX_ORDER page.
> 
> Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
> Reported-by: Mark Salter <msalter@redhat.com>
> ---
>  mm/page_alloc.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5dba293..6e657ce 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -801,7 +801,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
>  
>  	set_page_refcounted(page);
>  	set_pageblock_migratetype(page, MIGRATE_CMA);
> -	__free_pages(page, pageblock_order);
> +	if (pageblock_order > MAX_ORDER) {
> +		struct page *subpage = p;
> +		unsigned count = 1 << (pageblock_order - MAX_ORDER);
> +		do {
> +			__free_pages(subpage, pageblock_order);
                                               ^^^^^^^
                                               MAX_ORDER

> +		} while (subpage += MAX_ORDER_NR_PAGES, --count);
> +	} else {
> +		__free_pages(page, pageblock_order);
> +	}
>  	adjust_managed_page_count(page, pageblock_nr_pages);
>  }
>  #endif
> --------- >8 ---------------------------------------------------------
> 
> Thoughts?  This has not been tested and I think it may cause performance
> degradation in some cases since pageblock_order is not always
> a constant, so the comparison may end up not being stripped away even on
> systems where it's always false.
> 

This works with the above tweak. So it fixes the problm here, but I was
not sure if we'd get bitten elsewhere by pageblock_order > MAX_ORDER.
It will be slower, but does it only gets called a few time at most at
boot time, right?

WARNING: multiple messages have this Message-ID (diff)

From: Mark Salter <msalter@redhat.com>
To: Michal Nazarewicz <mina86@mina86.com>
Cc: David Rientjes <rientjes@google.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64: fix MAX_ORDER for 64K pagesize
Date: Thu, 19 Jun 2014 14:12:04 -0400	[thread overview]
Message-ID: <1403201524.32688.62.camel@deneb.redhat.com> (raw)
In-Reply-To: <xa1t61jzwfo6.fsf@mina86.com>

On Tue, 2014-06-17 at 20:32 +0200, Michal Nazarewicz wrote:
> On Wed, Jun 11 2014, David Rientjes wrote:
> > On Wed, 11 Jun 2014, Mark Salter wrote:
> >
> >> With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE
> >> I get this at early boot:
> >> 
> >>   SMP: Total of 8 processors activated.
> >>   devtmpfs: initialized
> >>   Unable to handle kernel NULL pointer dereference at virtual address 00000008
> >>   pgd = fffffe0000050000
> >>   [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
> >>   Internal error: Oops: 96000006 [#1] SMP
> >>   Modules linked in:
> >>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
> >>   task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
> >>   PC is at __list_add+0x10/0xd4
> >>   LR is at free_one_page+0x270/0x638
> >>   ...
> >>   Call trace:
> >>   [<fffffe00003ee970>] __list_add+0x10/0xd4
> >>   [<fffffe000019c478>] free_one_page+0x26c/0x638
> >>   [<fffffe000019c8c8>] __free_pages_ok.part.52+0x84/0xbc
> >>   [<fffffe000019d5e8>] __free_pages+0x74/0xbc
> >>   [<fffffe0000c01350>] init_cma_reserved_pageblock+0xe8/0x104
> >>   [<fffffe0000c24de0>] cma_init_reserved_areas+0x190/0x1e4
> >>   [<fffffe0000090418>] do_one_initcall+0xc4/0x154
> >>   [<fffffe0000bf0a50>] kernel_init_freeable+0x204/0x2a8
> >>   [<fffffe00007520a0>] kernel_init+0xc/0xd4
> >> 
> >> This happens in this configuration because __free_one_page() is called
> >> with an order greater than MAX_ORDER, accesses past zone->free_list[]
> >> and passes a bogus list_head to list_add().
> >> 
> >> arch/arm64/Kconfig has:
> >> 
> >>   config FORCE_MAX_ZONEORDER
> >> 	int
> >> 	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> >> 	default "11"
> >> 
> >> So with THP turned off MAX_ORDER == 11 but init_cma_reserved_pageblock()
> >> passes __free_pages() an order of pageblock_order which is based on
> >> (HPAGE_SHIFT - PAGE_SHIFT) which is 13 for 64K pages. I worked around
> >> this by removing the THP test so FORCE_MAX_ZONEORDER is always 14 for
> >> ARM64_64K_PAGES.
> >> 
> >> Signed-off-by: Mark Salter <msalter@redhat.com>
> >> ---
> >>  arch/arm64/Kconfig | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >> index 7295419..42a334e 100644
> >> --- a/arch/arm64/Kconfig
> >> +++ b/arch/arm64/Kconfig
> >> @@ -269,7 +269,7 @@ config XEN
> >>  
> >>  config FORCE_MAX_ZONEORDER
> >>  	int
> >> -	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> >> +	default "14" if ARM64_64K_PAGES
> >>  	default "11"
> >>  
> >>  endmenu
> >
> > Any reason to not switch this to
> >
> > 	ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE && CMA
> >
> > instead?  If pageblock_order > MAX_ORDER because of 
> > HPAGE_SHIFT > PAGE_SHIFT, then cma is always going to be passing a 
> > too-large-order to free_pages_prepare() via this path.
> >
> > Adding Michal and Marek to the cc.
> 
> The correct fix would be to change init_cma_reserved_pageblock such that
> it checks whether pageblock_order > MAX_ORDER and if so frees each max
> order page of the pageblock individually:
> 
> --------- >8 ---------------------------------------------------------
> From: Michal Nazarewicz <mina86@mina86.com>
> Subject: [PATCH] mm: cma: fix cases where pageblock is bigger then MAX_ORDER
> 
> With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE,
> the following is triggered at early boot:
> 
>   SMP: Total of 8 processors activated.
>   devtmpfs: initialized
>   Unable to handle kernel NULL pointer dereference at virtual address 00000008
>   pgd = fffffe0000050000
>   [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
>   Internal error: Oops: 96000006 [#1] SMP
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
>   task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
>   PC is at __list_add+0x10/0xd4
>   LR is at free_one_page+0x270/0x638
>   ...
>   Call trace:
>   [<fffffe00003ee970>] __list_add+0x10/0xd4
>   [<fffffe000019c478>] free_one_page+0x26c/0x638
>   [<fffffe000019c8c8>] __free_pages_ok.part.52+0x84/0xbc
>   [<fffffe000019d5e8>] __free_pages+0x74/0xbc
>   [<fffffe0000c01350>] init_cma_reserved_pageblock+0xe8/0x104
>   [<fffffe0000c24de0>] cma_init_reserved_areas+0x190/0x1e4
>   [<fffffe0000090418>] do_one_initcall+0xc4/0x154
>   [<fffffe0000bf0a50>] kernel_init_freeable+0x204/0x2a8
>   [<fffffe00007520a0>] kernel_init+0xc/0xd4
> 
> This happens in this configuration because __free_one_page() is called
> with an order greater than MAX_ORDER, accesses past zone->free_list[]
> and passes a bogus list_head to list_add().
> 
> arch/arm64/Kconfig has:
> 
>   config FORCE_MAX_ZONEORDER
> 	int
> 	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
> 	default "11"
> 
> So with THP turned off MAX_ORDER == 11 but init_cma_reserved_pageblock()
> passes __free_pages() an order of pageblock_order which is based on
> (HPAGE_SHIFT - PAGE_SHIFT) which is 13 for 64K pages.
> 
> Fix the problem by changing init_cma_reserved_pageblock() such that it
> splits pageblock into individual MAX_ORDER pages if pageblock is
> bigger than a MAX_ORDER page.
> 
> Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
> Reported-by: Mark Salter <msalter@redhat.com>
> ---
>  mm/page_alloc.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5dba293..6e657ce 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -801,7 +801,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
>  
>  	set_page_refcounted(page);
>  	set_pageblock_migratetype(page, MIGRATE_CMA);
> -	__free_pages(page, pageblock_order);
> +	if (pageblock_order > MAX_ORDER) {
> +		struct page *subpage = p;
> +		unsigned count = 1 << (pageblock_order - MAX_ORDER);
> +		do {
> +			__free_pages(subpage, pageblock_order);
                                               ^^^^^^^
                                               MAX_ORDER

> +		} while (subpage += MAX_ORDER_NR_PAGES, --count);
> +	} else {
> +		__free_pages(page, pageblock_order);
> +	}
>  	adjust_managed_page_count(page, pageblock_nr_pages);
>  }
>  #endif
> --------- >8 ---------------------------------------------------------
> 
> Thoughts?  This has not been tested and I think it may cause performance
> degradation in some cases since pageblock_order is not always
> a constant, so the comparison may end up not being stripped away even on
> systems where it's always false.
> 

This works with the above tweak. So it fixes the problm here, but I was
not sure if we'd get bitten elsewhere by pageblock_order > MAX_ORDER.
It will be slower, but does it only gets called a few time at most at
boot time, right?

next prev parent reply	other threads:[~2014-06-19 18:12 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-11 21:33 [PATCH] arm64: fix MAX_ORDER for 64K pagesize Mark Salter
2014-06-11 21:33 ` Mark Salter
2014-06-11 23:03 ` David Rientjes
2014-06-11 23:03   ` David Rientjes
2014-06-11 23:04   ` David Rientjes
2014-06-11 23:04     ` David Rientjes
2014-06-12 13:57     ` Mark Salter
2014-06-12 13:57       ` Mark Salter
2014-06-17 18:32   ` Michal Nazarewicz
2014-06-17 18:32     ` Michal Nazarewicz
2014-06-19 18:12     ` Mark Salter [this message]
2014-06-19 18:12       ` Mark Salter
2014-06-19 19:24       ` Michal Nazarewicz
2014-06-19 19:24         ` Michal Nazarewicz
2014-06-20 17:37         ` Mark Salter
2014-06-20 17:37           ` Mark Salter
2014-06-23 19:40           ` [PATCHv3] mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER Michal Nazarewicz
2014-06-23 19:40             ` Michal Nazarewicz
2014-06-23 21:10             ` Mark Salter
2014-06-23 21:10               ` Mark Salter
2014-06-19 19:53       ` [PATCHv2] " Michal Nazarewicz
2014-06-19 19:53         ` Michal Nazarewicz
2014-06-20 13:54         ` Christopher Covington
2014-06-20 13:54           ` Christopher Covington
2014-06-20 15:48         ` Mark Salter
2014-06-20 15:48           ` Mark Salter
2014-06-20 16:36           ` Michal Nazarewicz
2014-06-20 16:36             ` Michal Nazarewicz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1403201524.32688.62.camel@deneb.redhat.com \
    --to=msalter@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.