* [PATCH v2 0/2] vmemmap fix for bug introduced by extending VA range
@ 2016-02-26 16:57 Ard Biesheuvel
2016-02-26 16:57 ` [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
2016-02-26 16:57 ` [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity Ard Biesheuvel
0 siblings, 2 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-26 16:57 UTC (permalink / raw)
To: linux-arm-kernel
This is a follow-up to patch [1] (part of series [2]), in which I fixed an issue
I identified relating to commit dd006da21646 ("arm64: mm: increase VA range of
identity map").
Patch #1 fixes the issue in a way that should be compatible with v4.5 and
-stable, although it may conflict trivially on hunk context.
Patch #2 is a followup that changes memstart_addr to a signed type, which is
necessary when combining patch #1 with the linear region randomization patch
that is currently queued in -next.
[1] http://article.gmane.org/gmane.linux.ports.arm.kernel/481330
[2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/481327
Ard Biesheuvel (2):
arm64: vmemmap: use virtual projection of linear region
arm64: mm: treat memstart_addr as a signed quantity
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/include/asm/pgtable.h | 7 ++++---
arch/arm64/mm/init.c | 6 +++---
3 files changed, 8 insertions(+), 7 deletions(-)
--
2.5.0
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-02-26 16:57 [PATCH v2 0/2] vmemmap fix for bug introduced by extending VA range Ard Biesheuvel
@ 2016-02-26 16:57 ` Ard Biesheuvel
2016-03-08 1:07 ` David Daney
2016-02-26 16:57 ` [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity Ard Biesheuvel
1 sibling, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-26 16:57 UTC (permalink / raw)
To: linux-arm-kernel
Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual mapping.
However, since the size of the vmemmap area is proportional to the size of
the VA area, but it is populated relative to the physical space, we may
end up with the struct page array being mapped outside of the vmemmap
region. For instance, on my Seattle A0 box, I can see the following output
in the dmesg log.
vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
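The numbers in that log can be reproduced with a little arithmetic. The Python
sketch below plugs in values implied by the log itself (4 KiB pages, VA_BITS=39,
a 64-byte struct page, RAM starting at physical 512 GiB); it shows the struct
page array for the first page of RAM landing exactly at the end of the 8 GB
vmemmap window, i.e. out of bounds:

```python
# Assumed config, consistent with the Seattle A0 log above:
VA_BITS, PAGE_SHIFT, STRUCT_PAGE_SIZE = 39, 12, 64

VMEMMAP_BASE = 0xffffffbdc0000000  # start of the vmemmap window in the log
# Old VMEMMAP_SIZE: one struct page per page of the whole VA space
VMEMMAP_SIZE = (1 << (VA_BITS - PAGE_SHIFT)) * STRUCT_PAGE_SIZE

PHYS_BASE = 0x80_0000_0000         # Seattle: RAM starts at 512 GiB
first_pfn = PHYS_BASE >> PAGE_SHIFT

# With page = vmemmap + pfn, the entry for the first page of RAM sits at:
first_page_addr = VMEMMAP_BASE + first_pfn * STRUCT_PAGE_SIZE

assert VMEMMAP_SIZE == 8 << 30     # the "8 GB maximum" in the log
# ... which is exactly the END of the window: the array is mapped outside it.
assert first_page_addr == VMEMMAP_BASE + VMEMMAP_SIZE == 0xffffffbfc0000000
```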
We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can even reduce the size by half.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
v2: simplify the expression for vmemmap, forward compatible with the patch that
changes the type of memstart_addr to s64
arch/arm64/include/asm/pgtable.h | 7 ++++---
arch/arm64/mm/init.c | 4 ++--
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 16438dd8916a..43abcbc30813 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -34,18 +34,19 @@
/*
* VMALLOC and SPARSEMEM_VMEMMAP ranges.
*
- * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
+ * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
* (rounded up to PUD_SIZE).
* VMALLOC_START: beginning of the kernel vmalloc space
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
-#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
+#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
-#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
+#define VMEMMAP_START (VMALLOC_END + SZ_64K)
+#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
#define FIRST_USER_ADDRESS 0UL
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index e1f425fe5a81..4ea7efc28e65 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -380,8 +380,8 @@ void __init mem_init(void)
MLK_ROUNDUP(_text, _etext),
MLK_ROUNDUP(_sdata, _edata),
#ifdef CONFIG_SPARSEMEM_VMEMMAP
- MLG((unsigned long)vmemmap,
- (unsigned long)vmemmap + VMEMMAP_SIZE),
+ MLG(VMEMMAP_START,
+ VMEMMAP_START + VMEMMAP_SIZE),
MLM((unsigned long)phys_to_page(memblock_start_of_DRAM()),
(unsigned long)virt_to_page(high_memory)),
#endif
--
2.5.0
* [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity
2016-02-26 16:57 [PATCH v2 0/2] vmemmap fix for bug introduced by extending VA range Ard Biesheuvel
2016-02-26 16:57 ` [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
@ 2016-02-26 16:57 ` Ard Biesheuvel
2016-02-29 12:39 ` Ard Biesheuvel
2016-02-29 18:36 ` Catalin Marinas
1 sibling, 2 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-26 16:57 UTC (permalink / raw)
To: linux-arm-kernel
Commit c031a4213c11 ("arm64: kaslr: randomize the linear region")
implements randomization of the linear region, by subtracting a random
multiple of PUD_SIZE from memstart_addr. This causes the virtual mapping
of system RAM to move upwards in the linear region, and at the same time
causes memstart_addr to assume a value which may be negative if the offset
of system RAM in the physical space is smaller than its offset relative to
PAGE_OFFSET in the virtual space.
Since memstart_addr is effectively an offset now, redefine its type as s64
so that expressions involving shifting or division preserve its sign.
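The sign-preservation point can be sketched numerically (the negative offset
below is an invented example, not a value from any real machine). Python's `>>`
on a negative integer behaves like an arithmetic (s64) shift, so it models both
interpretations:

```python
PAGE_SHIFT = 12
memstart_addr = -(16 << 30)   # s64: offset gone negative after randomization

# s64 semantics: an arithmetic shift keeps the value a small negative offset.
assert memstart_addr >> PAGE_SHIFT == -(16 << 30) >> 12 == -0x400000

# phys_addr_t (u64) semantics: the same bit pattern shifted logically
# becomes a huge bogus pfn offset instead.
u64 = memstart_addr & ((1 << 64) - 1)
assert u64 == 0xFFFFFFFC00000000
assert u64 >> PAGE_SHIFT == 0xFFFFFFFC00000
```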
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/mm/init.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 5f8667a99e41..12f8a00fb3f1 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -135,7 +135,7 @@
#include <linux/bitops.h>
#include <linux/mmdebug.h>
-extern phys_addr_t memstart_addr;
+extern s64 memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 4ea7efc28e65..a2977d33e0dc 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -54,7 +54,7 @@
* executes, which assigns it its actual value. So use a default value
* that cannot be mistaken for a real physical address.
*/
-phys_addr_t memstart_addr __read_mostly = ~0ULL;
+s64 memstart_addr __read_mostly = -1;
phys_addr_t arm64_dma_phys_limit __read_mostly;
#ifdef CONFIG_BLK_DEV_INITRD
--
2.5.0
* [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity
2016-02-26 16:57 ` [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity Ard Biesheuvel
@ 2016-02-29 12:39 ` Ard Biesheuvel
2016-02-29 18:36 ` Catalin Marinas
1 sibling, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-29 12:39 UTC (permalink / raw)
To: linux-arm-kernel
On 26 February 2016 at 17:57, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> Commit c031a4213c11 ("arm64: kaslr: randomize the linear region")
> implements randomization of the linear region, by subtracting a random
> multiple of PUD_SIZE from memstart_addr. This causes the virtual mapping
> of system RAM to move upwards in the linear region, and at the same time
> causes memstart_addr to assume a value which may be negative if the offset
> of system RAM in the physical space is smaller than its offset relative to
> PAGE_OFFSET in the virtual space.
>
> Since memstart_addr is effectively an offset now, redefine its type as s64
> so that expressions involving shifting or division preserve its sign.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
FYI, this results in a warning; please fold in the fix below:
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 67ce440cb702..9cfe94b41b54 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -181,7 +181,7 @@ void __init arm64_memblock_init(void)
* linear mapping. Take care not to clip the kernel which may be
* high in memory.
*/
- memblock_remove(max(memstart_addr + linear_region_size, __pa(_end)),
+ memblock_remove(max_t(u64, memstart_addr + linear_region_size, __pa(_end)),
ULLONG_MAX);
if (memblock_end_of_DRAM() > linear_region_size)
memblock_remove(0, memblock_end_of_DRAM() - linear_region_size);
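The warning comes from the kernel's type-checked max(), which now sees an s64
on one side and a phys_addr_t (u64) on the other; max_t(u64, ...) forces both
to a common type. That cast is safe because memstart_addr plus the linear
region size is never negative, as this sketch illustrates (all concrete values
are invented for illustration):

```python
PUD_SIZE = 1 << 30
memstart_addr = -(2 * PUD_SIZE)   # s64, negative after the KASLR shift
linear_region_size = 1 << 38      # assumed 256 GiB linear window
pa_end = 0x4200_0000              # assumed __pa(_end)

# Randomization only subtracts from memstart_addr what fits in the linear
# window, so this sum stays non-negative ...
top = memstart_addr + linear_region_size
assert top >= 0

# ... and reinterpreting non-negative s64 values as u64 (what max_t(u64, ...)
# does) cannot change the result of the comparison.
as_u64 = lambda v: v & ((1 << 64) - 1)
assert max(as_u64(top), as_u64(pa_end)) == as_u64(max(top, pa_end))
```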
> [...]
* [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity
2016-02-26 16:57 ` [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity Ard Biesheuvel
2016-02-29 12:39 ` Ard Biesheuvel
@ 2016-02-29 18:36 ` Catalin Marinas
1 sibling, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2016-02-29 18:36 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Feb 26, 2016 at 05:57:14PM +0100, Ard Biesheuvel wrote:
> Commit c031a4213c11 ("arm64: kaslr: randomize the linear region")
> implements randomization of the linear region, by subtracting a random
> multiple of PUD_SIZE from memstart_addr. This causes the virtual mapping
> of system RAM to move upwards in the linear region, and at the same time
> causes memstart_addr to assume a value which may be negative if the offset
> of system RAM in the physical space is smaller than its offset relative to
> PAGE_OFFSET in the virtual space.
>
> Since memstart_addr is effectively an offset now, redefine its type as s64
> so that expressions involving shifting or division preserve its sign.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Applied. Thanks.
--
Catalin
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-02-26 16:57 ` [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
@ 2016-03-08 1:07 ` David Daney
2016-03-08 2:15 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: David Daney @ 2016-03-08 1:07 UTC (permalink / raw)
To: linux-arm-kernel
On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
> some changes to the memory mapping code to allow physical memory to reside
> at an offset that exceeds the size of the virtual mapping.
>
> However, since the size of the vmemmap area is proportional to the size of
> the VA area, but it is populated relative to the physical space, we may
> end up with the struct page array being mapped outside of the vmemmap
> region. For instance, on my Seattle A0 box, I can see the following output
> in the dmesg log.
>
> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>
> We can fix this by deciding that the vmemmap region is not a projection of
> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
> linear region. This way, we are guaranteed that the vmemmap region is of
> sufficient size, and we can even reduce the size by half.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
I see this commit now in Linus' kernel.org tree in v4.5-rc7.
FYI: I am seeing a crash that goes away when I revert this. My kernel
has some other modifications (our NUMA patches) so I haven't yet fully
tracked this down on an unmodified kernel, but this is what I am getting:
.
.
.
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000001400000-0x00000000fffeffff]
[ 0.000000] node 0: [mem 0x00000000ffff0000-0x00000000ffffffff]
[ 0.000000] node 0: [mem 0x0000000100000000-0x00000003f51cffff]
[ 0.000000] node 0: [mem 0x00000003f51d0000-0x00000003f51dffff]
[ 0.000000] node 0: [mem 0x00000003f51e0000-0x00000003fa9bffff]
[ 0.000000] node 0: [mem 0x00000003fa9c0000-0x00000003faa8ffff]
[ 0.000000] node 0: [mem 0x00000003faa90000-0x00000003ffa3ffff]
[ 0.000000] node 0: [mem 0x00000003ffa40000-0x00000003ffa9ffff]
[ 0.000000] node 0: [mem 0x00000003ffaa0000-0x00000003ffffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000001400000-0x00000003ffffffff]
[ 0.000000] Unable to handle kernel paging request at virtual address fffffdff60ff0000
[ 0.000000] pgd = fffffe0000e00000
[ 0.000000] [fffffdff60ff0000] *pgd=00000003ffd90003, *pud=00000003ffd90003, *pmd=00000003ffd90003, *pte=0000000000000000
[ 0.000000] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.0-rc7-numa+ #123
[ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT)
[ 0.000000] task: fffffe0000b39880 ti: fffffe0000b00000 task.ti: fffffe0000b00000
[ 0.000000] PC is at memmap_init_zone+0xe0/0x130
[ 0.000000] LR is at memmap_init_zone+0xc0/0x130
[ 0.000000] pc : [<fffffe0000a85b28>] lr : [<fffffe0000a85b08>] pstate: 800002c5
[ 0.000000] sp : fffffe0000b03cf0
[ 0.000000] x29: fffffe0000b03cf0 x28: fffffe03febe1b80
[ 0.000000] x27: fffffe03febe2a08 x26: fffffe0000b30000
[ 0.000000] x25: 0000000000040000 x24: 0000000000000000
[ 0.000000] x23: 1000000000000000 x22: 0000000000000000
[ 0.000000] x21: 0000000000000001 x20: 00000000ffffffff
[ 0.000000] x19: 000000000003fd40 x18: fffffe0000d7c240
[ 0.000000] x17: 0000000000000009 x16: 0000000400000000
[ 0.000000] x15: 0000000000000008 x14: 0000000000000004
[ 0.000000] x13: 0000000000000000 x12: 000000000001c854
[ 0.000000] x11: 00000003fffe37a8 x10: 0000000000000004
[ 0.000000] x9 : 0000000000000000 x8 : fffffe03febc0000
[ 0.000000] x7 : 0000000000000000 x6 : fffffe0000d7c240
[ 0.000000] x5 : fffffdff60000000 x4 : 0000000000000007
[ 0.000000] x3 : fffffdff60000000 x2 : fffffe0000d7c300
[ 0.000000] x1 : 0000000000ff0000 x0 : 0000000000000001
[ 0.000000]
[ 0.000000] Process swapper (pid: 0, stack limit = 0xfffffe0000b00020)
[ 0.000000] Stack: (0xfffffe0000b03cf0 to 0xfffffe0000b04000)
[ 0.000000] 3ce0: fffffe0000b03d40 fffffe0000a85fd4
[ 0.000000] 3d00: fffffe03febe2400 fffffe0000aa3000 0000000000000000 0000000000030000
[ 0.000000] 3d20: fffffe0000b5eab4 fffffe0000b5eab8 0000000000000001 fffffe0000734d18
[ 0.000000] 3d40: fffffe0000b03df0 fffffe0000a56928 0000000000000000 fffffe0000b32e68
[ 0.000000] 3d60: 0000000000000004 fffffe0000b32e70 fffffe0000b30000 fffffe03febe1b80
[ 0.000000] 3d80: fffffe0000c40000 0000000002200000 fffffe0000081198 00000003f51eaa0c
[ 0.000000] 3da0: fffffe0000d7bc90 0000000000030000 fffffe000093f148 fffffe000093f008
[ 0.000000] 3dc0: fffffe000093efb0 fffffe000093efd8 0000000000010000 fffffe0000af6798
[ 0.000000] 3de0: 0000000000000140 0000000000040000 fffffe0000b03e90 fffffe0000a44e84
[ 0.000000] 3e00: fffffe0000b03ec8 0000000001400000 0000000000040000 fffffe0000b30000
[ 0.000000] 3e20: fffffe0000b30000 0000000001400000 fffffe0000c40000 0000000002200000
[ 0.000000] 3e40: fffffe0000081198 00000003f51eaa0c 0000000000040000 0000000000000000
[ 0.000000] 3e60: fffffe0000b30000 0000000001400000 ffffffff00c40000 00000000ffffffff
[ 0.000000] 3e80: 000000000003ffaa 0000000000040000 fffffe0000b03ee0 fffffe0000a452d4
[ 0.000000] 3ea0: fffffe03febd0000 fffffe0000b30b98 fffffe0000080000 fffffe0000a45168
[ 0.000000] 3ec0: fffffe0000b03ee0 0000000000010000 0000000000040000 0000000000000000
[ 0.000000] 3ee0: fffffe0000b03f00 fffffe0000a42f2c fffffdfffa800000 00000003ffaa0000
[ 0.000000] 3f00: fffffe0000b03fa0 fffffe0000a40680 0000000000000000 fffffe0000b30b98
[ 0.000000] 3f20: 0000000021200000 00000003f50de7c8 fffffe0000b30000 0000000001400000
[ 0.000000] 3f40: 00000000021d0000 0000000002200000 fffffe0000081198 00000000ffffffc8
[ 0.000000] 3f60: 00000003f50deb40 fffffe00007200a8 0000000000000001 0000000021200000
[ 0.000000] 3f80: ffffffffffffffff 0000000000000000 0000000080808080 fefefefefefefefe
[ 0.000000] 3fa0: 0000000000000000 fffffe00000811b4 00000003f50deb40 0000000000000e12
[ 0.000000] 3fc0: 0000000021200000 00000003f50de7c8 00000003f50de7dd 0000000001400000
[ 0.000000] 3fe0: 0000000000000000 fffffe0000a88a28 0000000000000000 0000000000000000
[ 0.000000] Call trace:
[ 0.000000] Exception stack(0xfffffe0000b03b30 to 0xfffffe0000b03c50)
[ 0.000000] 3b20: 000000000003fd40 00000000ffffffff
[ 0.000000] 3b40: fffffe0000b03cf0 fffffe0000a85b28 00000003fffba000 0000000000006000
[ 0.000000] 3b60: 0000000000000004 0000000000000000 fffffe0000b03be0 fffffe00001d253c
[ 0.000000] 3b80: 00000003fffba000 0000000000006000 fffffe0000a5abec 0000000000000080
[ 0.000000] 3ba0: 0000000000000000 0000000000000000 0000000000010000 fffffe0000b674f0
[ 0.000000] 3bc0: fffffe0000942288 00000003fffba000 0000000000000001 0000000000ff0000
[ 0.000000] 3be0: fffffe0000d7c300 fffffdff60000000 0000000000000007 fffffdff60000000
[ 0.000000] 3c00: fffffe0000d7c240 0000000000000000 fffffe03febc0000 0000000000000000
[ 0.000000] 3c20: 0000000000000004 00000003fffe37a8 000000000001c854 0000000000000000
[ 0.000000] 3c40: 0000000000000004 0000000000000008
[ 0.000000] [<fffffe0000a85b28>] memmap_init_zone+0xe0/0x130
[ 0.000000] [<fffffe0000a85fd4>] free_area_init_node+0x45c/0x4a4
[ 0.000000] [<fffffe0000a56928>] free_area_init_nodes+0x594/0x5ec
[ 0.000000] [<fffffe0000a44e84>] bootmem_init+0xc8/0xf8
[ 0.000000] [<fffffe0000a452d4>] paging_init_rest+0x1c/0xdc
[ 0.000000] [<fffffe0000a42f2c>] setup_arch+0x118/0x5a0
[ 0.000000] [<fffffe0000a40680>] start_kernel+0xa8/0x3e0
[ 0.000000] [<fffffe00000811b4>] 0xfffffe00000811b4
[ 0.000000] Code: cb414261 f2dfbfe5 d37ae421 f2ffffe5 (f8656820)
[ 0.000000] ---[ end trace cb88537fdc8fa200 ]---
[ 0.000000] Kernel panic - not syncing: Fatal exception
[ 0.000000] ---[ end Kernel panic - not syncing: Fatal exception
.
.
.
> [...]
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-08 1:07 ` David Daney
@ 2016-03-08 2:15 ` Ard Biesheuvel
2016-03-08 10:31 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-03-08 2:15 UTC (permalink / raw)
To: linux-arm-kernel
> On 8 mrt. 2016, at 08:07, David Daney <ddaney.cavm@gmail.com> wrote:
>
>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
>> [...]
>
> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
>
> FYI: I am seeing a crash that goes away when I revert this. My kernel has some other modifications (our NUMA patches) so I haven't yet fully tracked this down on an unmodified kernel, but this is what I am getting:
>
Hi David,
You are the second one to report this issue on a 64k pages kernel, but I haven't managed to reproduce it yet.
Any chance you could instrument vmemmap_populate_basepages to figure out whether the faulting address is populated, and if not, why?
Thanks,
Ard.
> [...]
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-08 2:15 ` Ard Biesheuvel
@ 2016-03-08 10:31 ` Ard Biesheuvel
2016-03-08 13:17 ` Mark Langsdorf
2016-03-09 11:32 ` Robert Richter
0 siblings, 2 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-03-08 10:31 UTC (permalink / raw)
To: linux-arm-kernel
On 8 March 2016 at 09:15, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
>
>> On 8 mrt. 2016, at 08:07, David Daney <ddaney.cavm@gmail.com> wrote:
>>
>>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
>>> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
>>> some changes to the memory mapping code to allow physical memory to reside
>>> at an offset that exceeds the size of the virtual mapping.
>>>
>>> However, since the size of the vmemmap area is proportional to the size of
>>> the VA area, but it is populated relative to the physical space, we may
>>> end up with the struct page array being mapped outside of the vmemmap
>>> region. For instance, on my Seattle A0 box, I can see the following output
>>> in the dmesg log.
>>>
>>> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
>>> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>>>
>>> We can fix this by deciding that the vmemmap region is not a projection of
>>> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
>>> linear region. This way, we are guaranteed that the vmemmap region is of
>>> sufficient size, and we can even reduce the size by half.
>>>
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
>>
>> FYI: I am seeing a crash that goes away when I revert this. My kernel has some other modifications (our NUMA patches) so I haven't yet fully tracked this down on an unmodified kernel, but this is what I am getting:
>>
>
I managed to reproduce and diagnose this. The problem is that vmemmap
is no longer zone aligned, which causes trouble in the zone based
rounding that occurs in memory_present. The below patch fixes this by
rounding down the subtracted offset. Since this implies that the
region could stick off the other end, it also reverts the halving of
the region size.
--------8<----------
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f50608674580..ed57c0865290 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -40,7 +40,7 @@
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
-#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
+#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
#ifndef CONFIG_KASAN
#define VMALLOC_START (VA_START)
@@ -52,7 +52,8 @@
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (VMALLOC_END + SZ_64K)
-#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
+#define vmemmap ((struct page *)VMEMMAP_START - \
+ ((memstart_addr >> PAGE_SHIFT) & PAGE_SECTION_MASK))
#define FIRST_USER_ADDRESS 0UL
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-08 10:31 ` Ard Biesheuvel
@ 2016-03-08 13:17 ` Mark Langsdorf
2016-03-08 15:21 ` Ard Biesheuvel
2016-03-09 11:32 ` Robert Richter
1 sibling, 1 reply; 12+ messages in thread
From: Mark Langsdorf @ 2016-03-08 13:17 UTC (permalink / raw)
To: linux-arm-kernel
On 03/08/2016 04:31 AM, Ard Biesheuvel wrote:
> On 8 March 2016 at 09:15, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>
>>
>>> On 8 mrt. 2016, at 08:07, David Daney <ddaney.cavm@gmail.com> wrote:
>>>
>>>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
>>>> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
>>>> some changes to the memory mapping code to allow physical memory to reside
>>>> at an offset that exceeds the size of the virtual mapping.
>>>>
>>>> However, since the size of the vmemmap area is proportional to the size of
>>>> the VA area, but it is populated relative to the physical space, we may
>>>> end up with the struct page array being mapped outside of the vmemmap
>>>> region. For instance, on my Seattle A0 box, I can see the following output
>>>> in the dmesg log.
>>>>
>>>> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
>>>> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>>>>
>>>> We can fix this by deciding that the vmemmap region is not a projection of
>>>> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
>>>> linear region. This way, we are guaranteed that the vmemmap region is of
>>>> sufficient size, and we can even reduce the size by half.
>>>>
>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>>
>>> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
>>>
>>> FYI: I am seeing a crash that goes away when I revert this. My kernel has some other modifications (our NUMA patches) so I haven't yet fully tracked this down on an unmodified kernel, but this is what I am getting:
>>>
>>
>
> I managed to reproduce and diagnose this. The problem is that vmemmap
> is no longer zone aligned, which causes trouble in the zone based
> rounding that occurs in memory_present. The below patch fixes this by
> rounding down the subtracted offset. Since this implies that the
> region could stick off the other end, it also reverts the halving of
> the region size.
This fixes the bug on my Seattle B0 system.
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
--Mark Langsdorf
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-08 13:17 ` Mark Langsdorf
@ 2016-03-08 15:21 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-03-08 15:21 UTC (permalink / raw)
To: linux-arm-kernel
On 8 March 2016 at 20:17, Mark Langsdorf <mlangsdo@redhat.com> wrote:
> On 03/08/2016 04:31 AM, Ard Biesheuvel wrote:
>>
>> On 8 March 2016 at 09:15, Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> wrote:
>>>
>>>
>>>
>>>> On 8 mrt. 2016, at 08:07, David Daney <ddaney.cavm@gmail.com> wrote:
>>>>
>>>>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
>>>>> Commit dd006da21646 ("arm64: mm: increase VA range of identity map")
>>>>> made
>>>>> some changes to the memory mapping code to allow physical memory to
>>>>> reside
>>>>> at an offset that exceeds the size of the virtual mapping.
>>>>>
>>>>> However, since the size of the vmemmap area is proportional to the size
>>>>> of
>>>>> the VA area, but it is populated relative to the physical space, we may
>>>>> end up with the struct page array being mapped outside of the vmemmap
>>>>> region. For instance, on my Seattle A0 box, I can see the following
>>>>> output
>>>>> in the dmesg log.
>>>>>
>>>>> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB
>>>>> maximum)
>>>>> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB
>>>>> actual)
>>>>>
>>>>> We can fix this by deciding that the vmemmap region is not a projection
>>>>> of
>>>>> the physical space, but of the virtual space above PAGE_OFFSET, i.e.,
>>>>> the
>>>>> linear region. This way, we are guaranteed that the vmemmap region is
>>>>> of
>>>>> sufficient size, and we can even reduce the size by half.
>>>>>
>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>>>
>>>>
>>>> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
>>>>
>>>> FYI: I am seeing a crash that goes away when I revert this. My kernel
>>>> has some other modifications (our NUMA patches) so I haven't yet fully
>>>> tracked this down on an unmodified kernel, but this is what I am getting:
>>>>
>>>
>>
>> I managed to reproduce and diagnose this. The problem is that vmemmap
>> is no longer zone aligned, which causes trouble in the zone based
>> rounding that occurs in memory_present. The below patch fixes this by
>> rounding down the subtracted offset. Since this implies that the
>> region could stick off the other end, it also reverts the halving of
>> the region size.
>
>
> This fixes the bug on my Seattle B0 system.
>
> Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
>
Thanks Mark
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-08 10:31 ` Ard Biesheuvel
2016-03-08 13:17 ` Mark Langsdorf
@ 2016-03-09 11:32 ` Robert Richter
2016-03-09 11:36 ` Robert Richter
1 sibling, 1 reply; 12+ messages in thread
From: Robert Richter @ 2016-03-09 11:32 UTC (permalink / raw)
To: linux-arm-kernel
On 08.03.16 17:31:05, Ard Biesheuvel wrote:
> On 8 March 2016 at 09:15, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> >
> >> On 8 mrt. 2016, at 08:07, David Daney <ddaney.cavm@gmail.com> wrote:
> >>
> >>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
> >>> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
> >>> some changes to the memory mapping code to allow physical memory to reside
> >>> at an offset that exceeds the size of the virtual mapping.
> >>>
> >>> However, since the size of the vmemmap area is proportional to the size of
> >>> the VA area, but it is populated relative to the physical space, we may
> >>> end up with the struct page array being mapped outside of the vmemmap
> >>> region. For instance, on my Seattle A0 box, I can see the following output
> >>> in the dmesg log.
> >>>
> >>> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
> >>> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
> >>>
> >>> We can fix this by deciding that the vmemmap region is not a projection of
> >>> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
> >>> linear region. This way, we are guaranteed that the vmemmap region is of
> >>> sufficient size, and we can even reduce the size by half.
> >>>
> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >>
> >> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
> >>
> >> FYI: I am seeing a crash that goes away when I revert this. My kernel has some other modifications (our NUMA patches) so I haven't yet fully tracked this down on an unmodified kernel, but this is what I am getting:
> >>
> >
>
> I managed to reproduce and diagnose this. The problem is that vmemmap
> is no longer zone aligned, which causes trouble in the zone based
> rounding that occurs in memory_present. The below patch fixes this by
> rounding down the subtracted offset. Since this implies that the
> region could stick off the other end, it also reverts the halving of
> the region size.
I have seen the same panic. The fix solves the problem. See the enclosed
diff for reference, as the original patch was corrupted in transit.
Thanks,
-Robert
From 562760cc30905748cb851cc9aee2bb9d88c67d47 Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date: Tue, 8 Mar 2016 17:31:05 +0700
Subject: [PATCH] arm64: vmemmap: Fix use virtual projection of linear region
Signed-off-by: Robert Richter <rrichter@cavium.com>
---
arch/arm64/include/asm/pgtable.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index d9de87354869..98697488650f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -40,7 +40,7 @@
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
-#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
+#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
#ifndef CONFIG_KASAN
#define VMALLOC_START (VA_START)
@@ -52,7 +52,7 @@
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (VMALLOC_END + SZ_64K)
-#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
+#define vmemmap ((struct page *)VMEMMAP_START - ((memstart_addr >> PAGE_SHIFT) & PAGE_SECTION_MASK))
#define FIRST_USER_ADDRESS 0UL
--
2.7.0.rc3
>
>
> --------8<----------
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index f50608674580..ed57c0865290 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -40,7 +40,7 @@
> * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
> * fixed mappings and modules
> */
> -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT -
> 1)) * sizeof(struct page), PUD_SIZE)
> +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT))
> * sizeof(struct page), PUD_SIZE)
>
> #ifndef CONFIG_KASAN
> #define VMALLOC_START (VA_START)
> @@ -52,7 +52,8 @@
> #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> #define VMEMMAP_START (VMALLOC_END + SZ_64K)
> -#define vmemmap ((struct page *)VMEMMAP_START
> - (memstart_addr >> PAGE_SHIFT))
> +#define vmemmap ((struct page *)VMEMMAP_START - \
> + ((memstart_addr >> PAGE_SHIFT) &
> PAGE_SECTION_MASK))
>
> #define FIRST_USER_ADDRESS 0UL
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
2016-03-09 11:32 ` Robert Richter
@ 2016-03-09 11:36 ` Robert Richter
0 siblings, 0 replies; 12+ messages in thread
From: Robert Richter @ 2016-03-09 11:36 UTC (permalink / raw)
To: linux-arm-kernel
On 09.03.16 12:32:14, Robert Richter wrote:
> On 08.03.16 17:31:05, Ard Biesheuvel wrote:
> > On 8 March 2016 at 09:15, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > I managed to reproduce and diagnose this. The problem is that vmemmap
> > is no longer zone aligned, which causes trouble in the zone based
> > rounding that occurs in memory_present. The below patch fixes this by
> > rounding down the subtracted offset. Since this implies that the
> > region could stick off the other end, it also reverts the halving of
> > the region size.
>
> I have seen the same panic. The fix solves the problem. See enclosed
> diff for reference as there was some patch corruption of the original.
So this is:
Tested-by: Robert Richter <rrichter@cavium.com>
-Robert
^ permalink raw reply [flat|nested] 12+ messages in thread
Thread overview: 12+ messages
2016-02-26 16:57 [PATCH v2 0/2] vmemmap fix for bug introduced by extending VA range Ard Biesheuvel
2016-02-26 16:57 ` [PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
2016-03-08 1:07 ` David Daney
2016-03-08 2:15 ` Ard Biesheuvel
2016-03-08 10:31 ` Ard Biesheuvel
2016-03-08 13:17 ` Mark Langsdorf
2016-03-08 15:21 ` Ard Biesheuvel
2016-03-09 11:32 ` Robert Richter
2016-03-09 11:36 ` Robert Richter
2016-02-26 16:57 ` [PATCH v2 2/2] arm64: mm: treat memstart_addr as a signed quantity Ard Biesheuvel
2016-02-29 12:39 ` Ard Biesheuvel
2016-02-29 18:36 ` Catalin Marinas