* [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa)
@ 2016-02-24 16:21 Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
` (5 more replies)
0 siblings, 6 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
Another approach, and another bugfix in patch #1; this series supersedes the
__pa replacement series I sent out two days ago.
While looking into the [alleged] performance hit we are taking due to the
virt_to_phys() changes that are queued up, I noticed two things:
(a) I broke vmemmap in commit dd006da21646 ("arm64: mm: increase VA range of
identity map"), since it results in the struct page array corresponding
with the memory outside of the VA range to be mapped outside the vmemmap
range as well. This can be worked around fairly easily by making the
vmemmap range a projection of the virtual linear range rather than the
physical range (patch #1). This is a bugfix, and should probably go to
-stable?
(b) Once we have the fix for (a) in place, the relation between a page in the
linear region and its struct page in the vmemmap region is no longer based
on the placement of physical RAM, and we can reimplement virt_to_page()
without regard for PHYS_OFFSET, and base it entirely on arithmetic involving
build time constants only, which hopefully helps regain some performance we
[allegedly] lost (patch #6)
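For reference, the net effect on virt_to_page() is roughly the following (these
are the two definitions from patch #6, shown side by side; treat this as a
summary, the patch itself is authoritative):

  /* today: goes through __pa(), i.e. a runtime load of PHYS_OFFSET */
  #define virt_to_page(kaddr)  pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)

  /* with this series: the offset above PAGE_OFFSET indexes vmemmap directly */
  #define virt_to_page(kaddr)  ((struct page *)VMEMMAP_START + \
                                (((u64)(kaddr) & ~PAGE_OFFSET) >> PAGE_SHIFT))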
In a couple of cases (#2 - #5), a fixup is needed similar to the fixups in my
__pa() replacement series, to prevent virt_to_page() being used on kernel
symbols. Other than that, the code does look somewhat cleaner, and it is
arguably more reasonable to restrict virt_to_page() to linear addresses than it
is to restrict __pa().
As far as performance is concerned, I wonder how many __pa() translations
remain on hot paths after eliminating it from virt_to_page(). Suggestions
for how to measure the performance gain/loss are appreciated (hackbench?).
Ard Biesheuvel (6):
arm64: vmemmap: use virtual projection of linear region
arm64: vdso: avoid virt_to_page() translations on kernel symbols
arm64: mm: free __init memory via the linear mapping
arm64: mm: avoid virt_to_page() translation for the zero page
kernel: insn: avoid virt_to_page() translations on core kernel symbols
arm64: mm: restrict virt_to_page() to the linear mapping
arch/arm64/include/asm/memory.h | 9 ++++++++-
arch/arm64/include/asm/pgtable.h | 9 +++++----
arch/arm64/kernel/insn.c | 2 +-
arch/arm64/kernel/vdso.c | 7 ++++---
arch/arm64/mm/init.c | 7 ++++---
5 files changed, 22 insertions(+), 12 deletions(-)
--
2.5.0
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
2016-02-25 7:02 ` Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 2/6] arm64: vdso: avoid virt_to_page() translations on kernel symbols Ard Biesheuvel
` (4 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual address space.
However, since the size of the vmemmap area is proportional to the size of
the VA area, but it is populated relative to the physical space, we may
end up with the struct page array being mapped outside of the vmemmap
region. For instance, on my Seattle A0 box, I can see the following output
in the dmesg log.
vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
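Working backwards from those numbers (assuming the 39-bit VA / 4 KB page
configuration of that box and a 64-byte struct page), it is easy to see why
the array lands outside the window:

  vmemmap window  = 2^(39 - 12) entries * 64 bytes      =  8 GB   ("maximum")
  start of RAM    = 0x80_0000_0000                         (512 GB)
  array offset    = (0x80_0000_0000 >> 12) * 64 bytes    =  8 GB

i.e. the struct page array for the first page of RAM begins exactly where the
window ends, which is what the "actual" line shows.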
We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can also reduce its size by half.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/pgtable.h | 7 ++++---
arch/arm64/mm/init.c | 4 ++--
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a440f5a85d08..8e6baea0ff61 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -34,18 +34,19 @@
/*
* VMALLOC and SPARSEMEM_VMEMMAP ranges.
*
- * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
+ * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
* (rounded up to PUD_SIZE).
* VMALLOC_START: beginning of the kernel vmalloc space
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
-#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
+#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
-#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
+#define VMEMMAP_START (VMALLOC_END + SZ_64K)
+#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
#define FIRST_USER_ADDRESS 0UL
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index c0ea54bd9995..88046b94fa87 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -363,8 +363,8 @@ void __init mem_init(void)
MLK_ROUNDUP(_text, _etext),
MLK_ROUNDUP(_sdata, _edata),
#ifdef CONFIG_SPARSEMEM_VMEMMAP
- MLG((unsigned long)vmemmap,
- (unsigned long)vmemmap + VMEMMAP_SIZE),
+ MLG(VMEMMAP_START,
+ VMEMMAP_START + VMEMMAP_SIZE),
MLM((unsigned long)virt_to_page(PAGE_OFFSET),
(unsigned long)virt_to_page(high_memory)),
#endif
--
2.5.0
* [RFC PATCH 2/6] arm64: vdso: avoid virt_to_page() translations on kernel symbols
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 3/6] arm64: mm: free __init memory via the linear mapping Ard Biesheuvel
` (3 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
The translation performed by virt_to_page() is only valid for linear
addresses, and kernel symbols are no longer in the linear mapping.
So perform the __pa() translation explicitly, which does the right
thing in either case, and only then translate to a struct page offset.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/vdso.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 97bc68f4c689..fb3c17f031aa 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -131,11 +131,12 @@ static int __init vdso_init(void)
return -ENOMEM;
/* Grab the vDSO data page. */
- vdso_pagelist[0] = virt_to_page(vdso_data);
+ vdso_pagelist[0] = pfn_to_page(__pa(vdso_data) >> PAGE_SHIFT);
/* Grab the vDSO code pages. */
- for (i = 0; i < vdso_pages; i++)
- vdso_pagelist[i + 1] = virt_to_page(&vdso_start + i * PAGE_SIZE);
+ vdso_pagelist[1] = pfn_to_page(__pa(&vdso_start) >> PAGE_SHIFT);
+ for (i = 1; i < vdso_pages; i++)
+ vdso_pagelist[i + 1] = vdso_pagelist[1] + i;
/* Populate the special mapping structures */
vdso_spec[0] = (struct vm_special_mapping) {
--
2.5.0
* [RFC PATCH 3/6] arm64: mm: free __init memory via the linear mapping
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 2/6] arm64: vdso: avoid virt_to_page() translations on kernel symbols Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 4/6] arm64: mm: avoid virt_to_page() translation for the zero page Ard Biesheuvel
` (2 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
The implementation of free_initmem_default() expects __init_begin
and __init_end to be covered by the linear mapping, which is no
longer the case. So open code it instead, using addresses that are
explicitly translated from kernel virtual to linear virtual.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/init.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 88046b94fa87..e286716848af 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -398,7 +398,8 @@ void __init mem_init(void)
void free_initmem(void)
{
- free_initmem_default(0);
+ free_reserved_area(__va(__pa(__init_begin)), __va(__pa(__init_end)),
+ 0, "unused kernel");
fixup_init();
}
--
2.5.0
* [RFC PATCH 4/6] arm64: mm: avoid virt_to_page() translation for the zero page
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
` (2 preceding siblings ...)
2016-02-24 16:21 ` [RFC PATCH 3/6] arm64: mm: free __init memory via the linear mapping Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 5/6] kernel: insn: avoid virt_to_page() translations on core kernel symbols Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 6/6] arm64: mm: disregard PHYS_OFFSET in virt_to_page() Ard Biesheuvel
5 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
The zero page is statically allocated, so grab its struct page pointer
without using virt_to_page(), which will be restricted to the linear
mapping later.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 8e6baea0ff61..ab84cafde3c9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -118,7 +118,7 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
* for zero-mapped memory areas etc..
*/
extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
-#define ZERO_PAGE(vaddr) virt_to_page(empty_zero_page)
+#define ZERO_PAGE(vaddr) pfn_to_page(__pa(empty_zero_page) >> PAGE_SHIFT)
#define pte_ERROR(pte) __pte_error(__FILE__, __LINE__, pte_val(pte))
--
2.5.0
* [RFC PATCH 5/6] kernel: insn: avoid virt_to_page() translations on core kernel symbols
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
` (3 preceding siblings ...)
2016-02-24 16:21 ` [RFC PATCH 4/6] arm64: mm: avoid virt_to_page() translation for the zero page Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
2016-02-24 16:21 ` [RFC PATCH 6/6] arm64: mm: disregard PHYS_OFFSET in virt_to_page() Ard Biesheuvel
5 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
Before restricting virt_to_page() to the linear mapping, ensure that
the text patching code does not use it to resolve references into the
core kernel text, which is mapped in the vmalloc area.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/insn.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 7371455160e5..219d3a5df142 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -96,7 +96,7 @@ static void __kprobes *patch_map(void *addr, int fixmap)
if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
page = vmalloc_to_page(addr);
else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
- page = virt_to_page(addr);
+ page = pfn_to_page(__pa(addr) >> PAGE_SHIFT);
else
return addr;
--
2.5.0
* [RFC PATCH 6/6] arm64: mm: disregard PHYS_OFFSET in virt_to_page()
2016-02-24 16:21 [RFC PATCH 0/6] restrict virt_to_page to linear region (instead of __pa) Ard Biesheuvel
` (4 preceding siblings ...)
2016-02-24 16:21 ` [RFC PATCH 5/6] kernel: insn: avoid virt_to_page() translations on core kernel symbols Ard Biesheuvel
@ 2016-02-24 16:21 ` Ard Biesheuvel
5 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-24 16:21 UTC (permalink / raw)
To: linux-arm-kernel
The mm layer makes heavy use of virt_to_page(), which translates from
virtual addresses to offsets in the struct page array using an intermediate
translation to physical addresses. However, these physical translations
are based on the actual placement of physical memory, which can only be
discovered at runtime. This means virt_to_page() translations involve a
global PHYS_OFFSET variable, and hence a memory access.
Now that the vmemmap region has been redefined to cover the linear region
rather than the entire physical address space, we no longer need to perform
a virtual-to-physical translation in the implementation of virt_to_page(),
which means we can get rid of the memory access.
This restricts virt_to_page() translations to the linear region, so
redefine virt_addr_valid() as well.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/memory.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index eb798156cf56..9d4b7733caa3 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -189,8 +189,15 @@ static inline void *phys_to_virt(phys_addr_t x)
*/
#define ARCH_PFN_OFFSET ((unsigned long)PHYS_PFN_OFFSET)
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
-#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
+#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
+#else
+#define virt_to_page(kaddr) ((struct page *)VMEMMAP_START + \
+ (((u64)(kaddr) & ~PAGE_OFFSET) >> PAGE_SHIFT))
+#define virt_addr_valid(kaddr) pfn_valid((((u64)(kaddr) & ~PAGE_OFFSET) \
+ + PHYS_OFFSET) >> PAGE_SHIFT)
+#endif
#endif
--
2.5.0
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-24 16:21 ` [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region Ard Biesheuvel
@ 2016-02-25 7:02 ` Ard Biesheuvel
2016-02-26 15:15 ` Will Deacon
0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-25 7:02 UTC (permalink / raw)
To: linux-arm-kernel
On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
> some changes to the memory mapping code to allow physical memory to reside
> at an offset that exceeds the size of the virtual address space.
>
> However, since the size of the vmemmap area is proportional to the size of
> the VA area, but it is populated relative to the physical space, we may
> end up with the struct page array being mapped outside of the vmemmap
> region. For instance, on my Seattle A0 box, I can see the following output
> in the dmesg log.
>
> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
> 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>
> We can fix this by deciding that the vmemmap region is not a projection of
> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
> linear region. This way, we are guaranteed that the vmemmap region is of
> sufficient size, and we can also reduce its size by half.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
> arch/arm64/include/asm/pgtable.h | 7 ++++---
> arch/arm64/mm/init.c | 4 ++--
> 2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index a440f5a85d08..8e6baea0ff61 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -34,18 +34,19 @@
> /*
> * VMALLOC and SPARSEMEM_VMEMMAP ranges.
> *
> - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
> + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
> * (rounded up to PUD_SIZE).
> * VMALLOC_START: beginning of the kernel vmalloc space
> * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
> * fixed mappings and modules
> */
> -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
> +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
>
> #define VMALLOC_START (MODULES_END)
> #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
> +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
> +#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
>
Note that with the linear region randomization which is now in -next,
this division needs to be signed (since memstart_addr can wrap).
So I should either update the definition of memstart_addr to s64 in
this patch, or cast to (s64) in the expression above
> #define FIRST_USER_ADDRESS 0UL
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index c0ea54bd9995..88046b94fa87 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -363,8 +363,8 @@ void __init mem_init(void)
> MLK_ROUNDUP(_text, _etext),
> MLK_ROUNDUP(_sdata, _edata),
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
> - MLG((unsigned long)vmemmap,
> - (unsigned long)vmemmap + VMEMMAP_SIZE),
> + MLG(VMEMMAP_START,
> + VMEMMAP_START + VMEMMAP_SIZE),
> MLM((unsigned long)virt_to_page(PAGE_OFFSET),
> (unsigned long)virt_to_page(high_memory)),
> #endif
> --
> 2.5.0
>
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-25 7:02 ` Ard Biesheuvel
@ 2016-02-26 15:15 ` Will Deacon
2016-02-26 15:39 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2016-02-26 15:15 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
> > some changes to the memory mapping code to allow physical memory to reside
> > at an offset that exceeds the size of the virtual address space.
> >
> > However, since the size of the vmemmap area is proportional to the size of
> > the VA area, but it is populated relative to the physical space, we may
> > end up with the struct page array being mapped outside of the vmemmap
> > region. For instance, on my Seattle A0 box, I can see the following output
> > in the dmesg log.
> >
> > vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
> > 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
> >
> > We can fix this by deciding that the vmemmap region is not a projection of
> > the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
> > linear region. This way, we are guaranteed that the vmemmap region is of
> > sufficient size, and we can also reduce its size by half.
> >
> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > ---
> > arch/arm64/include/asm/pgtable.h | 7 ++++---
> > arch/arm64/mm/init.c | 4 ++--
> > 2 files changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index a440f5a85d08..8e6baea0ff61 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -34,18 +34,19 @@
> > /*
> > * VMALLOC and SPARSEMEM_VMEMMAP ranges.
> > *
> > - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
> > + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
> > * (rounded up to PUD_SIZE).
> > * VMALLOC_START: beginning of the kernel vmalloc space
> > * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
> > * fixed mappings and modules
> > */
> > -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
> > +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
> >
> > #define VMALLOC_START (MODULES_END)
> > #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
> >
> > -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
> > +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
> > +#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
> >
>
> Note that with the linear region randomization which is now in -next,
> this division needs to be signed (since memstart_addr can wrap).
>
> So I should either update the definition of memstart_addr to s64 in
> this patch, or cast to (s64) in the expression above
Can you avoid the division altogether by doing something like:
(struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
or have I misunderstood how this works?
Will
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-26 15:15 ` Will Deacon
@ 2016-02-26 15:39 ` Ard Biesheuvel
2016-02-26 16:24 ` Will Deacon
0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-26 15:39 UTC (permalink / raw)
To: linux-arm-kernel
On 26 February 2016 at 16:15, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
>> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> > Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
>> > some changes to the memory mapping code to allow physical memory to reside
>> > at an offset that exceeds the size of the virtual address space.
>> >
>> > However, since the size of the vmemmap area is proportional to the size of
>> > the VA area, but it is populated relative to the physical space, we may
>> > end up with the struct page array being mapped outside of the vmemmap
>> > region. For instance, on my Seattle A0 box, I can see the following output
>> > in the dmesg log.
>> >
>> > vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
>> > 0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>> >
>> > We can fix this by deciding that the vmemmap region is not a projection of
>> > the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
>> > linear region. This way, we are guaranteed that the vmemmap region is of
>> > sufficient size, and we can also reduce its size by half.
>> >
>> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> > ---
>> > arch/arm64/include/asm/pgtable.h | 7 ++++---
>> > arch/arm64/mm/init.c | 4 ++--
>> > 2 files changed, 6 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> > index a440f5a85d08..8e6baea0ff61 100644
>> > --- a/arch/arm64/include/asm/pgtable.h
>> > +++ b/arch/arm64/include/asm/pgtable.h
>> > @@ -34,18 +34,19 @@
>> > /*
>> > * VMALLOC and SPARSEMEM_VMEMMAP ranges.
>> > *
>> > - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
>> > + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
>> > * (rounded up to PUD_SIZE).
>> > * VMALLOC_START: beginning of the kernel vmalloc space
>> > * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
>> > * fixed mappings and modules
>> > */
>> > -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
>> > +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
>> >
>> > #define VMALLOC_START (MODULES_END)
>> > #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>> >
>> > -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
>> > +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
>> > +#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
>> >
>>
>> Note that with the linear region randomization which is now in -next,
>> this division needs to be signed (since memstart_addr can wrap).
>>
>> So I should either update the definition of memstart_addr to s64 in
>> this patch, or cast to (s64) in the expression above
>
> Can you avoid the division altogether by doing something like:
>
> (struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
>
> or have I misunderstood how this works?
>
It needs to be a signed shift, since the RHS of the subtraction must
remain negative if memstart_addr is 'negative'
This works as well:
(struct page *)VMEMMAP_START - ((s64)memstart_addr >> PAGE_SHIFT)
It may be appropriate to change the definition of memstart_addr to
s64, to reflect that, under randomization of the linear region, the
start of physical memory may be 'below zero' so that the actual
populated RAM region is high up in the linear region.
That way, we can lose the cast here.
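To illustrate the signedness issue with some hypothetical values (not taken
from any real configuration; 4 KB pages assumed):

  /* randomization placed the start of RAM "below zero", e.g. at -16 GB */
  u64 memstart_addr = -(16UL << 30);

  u64 bad  = memstart_addr / sizeof(struct page);  /* wraps to a huge positive value */
  s64 good = (s64)memstart_addr >> PAGE_SHIFT;     /* -4194304: stays negative, as   */
                                                   /* the subtraction above requires */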
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-26 15:39 ` Ard Biesheuvel
@ 2016-02-26 16:24 ` Will Deacon
2016-02-26 16:26 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2016-02-26 16:24 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Feb 26, 2016 at 04:39:55PM +0100, Ard Biesheuvel wrote:
> On 26 February 2016 at 16:15, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
> >> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> >> > index a440f5a85d08..8e6baea0ff61 100644
> >> > --- a/arch/arm64/include/asm/pgtable.h
> >> > +++ b/arch/arm64/include/asm/pgtable.h
> >> > @@ -34,18 +34,19 @@
> >> > /*
> >> > * VMALLOC and SPARSEMEM_VMEMMAP ranges.
> >> > *
> >> > - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
> >> > + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
> >> > * (rounded up to PUD_SIZE).
> >> > * VMALLOC_START: beginning of the kernel vmalloc space
> >> > * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
> >> > * fixed mappings and modules
> >> > */
> >> > -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
> >> > +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
> >> >
> >> > #define VMALLOC_START (MODULES_END)
> >> > #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
> >> >
> >> > -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
> >> > +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
> >> > +#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
> >> >
> >>
> >> Note that with the linear region randomization which is now in -next,
> >> this division needs to be signed (since memstart_addr can wrap).
> >>
> >> So I should either update the definition of memstart_addr to s64 in
> >> this patch, or cast to (s64) in the expression above
> >
> > Can you avoid the division altogether by doing something like:
> >
> > (struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
> >
> > or have I misunderstood how this works?
> >
>
> It needs to be a signed shift, since the RHS of the subtraction must
> remain negative if memstart_addr is 'negative'
>
> This works as well:
> (struct page *)VMEMMAP_START - ((s64)memstart_addr >> PAGE_SHIFT)
Ah yeah, even better.
> It may be appropriate to change the definition of memstart_addr to
> s64, to reflect that, under randomization of the linear region, the
> start of physical memory may be 'below zero' so that the actual
> populated RAM region is high up in the linear region.
> That way, we can lose the cast here.
That sounds like a good idea.
Will
* [RFC PATCH 1/6] arm64: vmemmap: use virtual projection of linear region
2016-02-26 16:24 ` Will Deacon
@ 2016-02-26 16:26 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-02-26 16:26 UTC (permalink / raw)
To: linux-arm-kernel
On 26 February 2016 at 17:24, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Feb 26, 2016 at 04:39:55PM +0100, Ard Biesheuvel wrote:
>> On 26 February 2016 at 16:15, Will Deacon <will.deacon@arm.com> wrote:
>> > On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
>> >> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> >> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> >> > index a440f5a85d08..8e6baea0ff61 100644
>> >> > --- a/arch/arm64/include/asm/pgtable.h
>> >> > +++ b/arch/arm64/include/asm/pgtable.h
>> >> > @@ -34,18 +34,19 @@
>> >> > /*
>> >> > * VMALLOC and SPARSEMEM_VMEMMAP ranges.
>> >> > *
>> >> > - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
>> >> > + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
>> >> > * (rounded up to PUD_SIZE).
>> >> > * VMALLOC_START: beginning of the kernel vmalloc space
>> >> > * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
>> >> > * fixed mappings and modules
>> >> > */
>> >> > -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
>> >> > +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
>> >> >
>> >> > #define VMALLOC_START (MODULES_END)
>> >> > #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>> >> >
>> >> > -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
>> >> > +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
>> >> > +#define vmemmap ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
>> >> >
>> >>
>> >> Note that with the linear region randomization which is now in -next,
>> >> this division needs to be signed (since memstart_addr can wrap).
>> >>
>> >> So I should either update the definition of memstart_addr to s64 in
>> >> this patch, or cast to (s64) in the expression above
>> >
>> > Can you avoid the division altogether by doing something like:
>> >
>> > (struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
>> >
>> > or have I misunderstood how this works?
>> >
>>
>> It needs to be a signed shift, since the RHS of the subtraction must
>> remain negative if memstart_addr is 'negative'
>>
>> This works as well:
>> (struct page *)VMEMMAP_START - ((s64)memstart_addr >> PAGE_SHIFT)
>
> Ah yeah, even better.
>
>> It may be appropriate to change the definition of memstart_addr to
>> s64, to reflect that, under randomization of the linear region, the
>> start of physical memory may be 'below zero' so that the actual
>> populated RAM region is high up in the linear region.
>> That way, we can lose the cast here.
>
> That sounds like a good idea.
>
OK, I will respin the first patch. As far as the remaining patches are
concerned, I wonder if you have any suggestions as to how to measure
the performance impact of making virt_to_page() disregard PHYS_OFFSET
(as I did in 6/6) before I respin/resend them.