* [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel
@ 2017-03-09 20:51 Ard Biesheuvel
2017-03-09 20:51 ` [PATCH v6 01/11] arm: kvm: move kvm_vgic_global_state out of .text section Ard Biesheuvel
` (11 more replies)
0 siblings, 12 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:51 UTC (permalink / raw)
To: linux-arm-kernel
Having memory that is writable and executable at the same time is a
security hazard, so we tend to avoid such mappings when we can. However,
at boot time, we keep .text mapped writable during the entire init
phase, and the init region itself is mapped rwx as well.
Let's improve the situation by:
- making the alternatives patching use the linear mapping
- splitting the init region into separate text and data regions
This removes all RWX mappings except the really early one created
in head.S (which we could perhaps fix in the future as well). Due to
these changes, it is also possible to make another attempt at re-enabling
the use of contiguous mappings at the PMD and PTE levels.
Changes since v5:
- add patch to remove pointless mapping/unmapping of translation table pages
when the higher-level table entry pointing to them is created
- fix uninitialized flags var in #9
- allow contiguous ranges for mappings created via create_pgd_mapping (i.e., EFI)
- refactor pud/pmd/pte recursion logic so we don't iterate over pmds and ptes
at two separate levels concurrently (i.e., cont and non-cont)
- use macros rather than static inlines for pmd_/pte_cont_addr_end()
- add Mark's R-b to #6 - #9
Changes since v4:
- the PTE_CONT patch has now spawned four more preparatory patches that clean
up some of the page table creation code before reintroducing the contiguous
attribute management
- add Mark's R-b to #4 and #5
Changes since v3:
- use linear alias only when patching the core kernel, and not for modules
- add patch to reintroduce the use of PTE_CONT for kernel mappings, except
for regions that are remapped read-only later on (i.e., .rodata and the
linear alias of .text+.rodata)
Changes since v2:
- ensure that text mappings remain writable under rodata=off
- rename create_mapping_late() to update_mapping_prot()
- clarify commit log of #2
- add acks
Ard Biesheuvel (11):
arm: kvm: move kvm_vgic_global_state out of .text section
arm64: mmu: move TLB maintenance from callers to create_mapping_late()
arm64: alternatives: apply boot time fixups via the linear mapping
arm64: mmu: map .text as read-only from the outset
arm64: mmu: apply strict permissions to .init.text and .init.data
arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
arm64/mmu: ignore debug_pagealloc for kernel segments
arm64/mmu: add contiguous bit to sanity bug check
arm64/mmu: replace 'page_mappings_only' parameter with flags argument
arm64/mm: remove pointless map/unmap sequences when creating page
tables
arm64: mm: set the contiguous bit for kernel mappings where
appropriate
arch/arm64/include/asm/mmu.h | 1 +
arch/arm64/include/asm/pgtable.h | 10 +
arch/arm64/include/asm/sections.h | 2 +
arch/arm64/kernel/alternative.c | 11 +-
arch/arm64/kernel/smp.c | 1 +
arch/arm64/kernel/vmlinux.lds.S | 25 +-
arch/arm64/mm/mmu.c | 246 ++++++++++++++------
virt/kvm/arm/vgic/vgic.c | 4 +-
8 files changed, 208 insertions(+), 92 deletions(-)
--
2.7.4
* [PATCH v6 01/11] arm: kvm: move kvm_vgic_global_state out of .text section
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
@ 2017-03-09 20:51 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 02/11] arm64: mmu: move TLB maintenance from callers to create_mapping_late() Ard Biesheuvel
` (10 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:51 UTC (permalink / raw)
To: linux-arm-kernel
The kvm_vgic_global_state struct contains a static key which is
written to by jump_label_init() at boot time. So, in preparation for
making .text regions truly (well, almost truly) read-only, mark
kvm_vgic_global_state __ro_after_init so it moves to the .rodata
section instead.
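As an aside, here is a minimal sketch of what __ro_after_init buys us
(illustrative only, not part of the patch; the struct and field names
are made up):
#include <linux/cache.h>	/* __ro_after_init */
#include <linux/jump_label.h>	/* struct static_key_false */
struct example_state {
	struct static_key_false has_feature;
	int threshold;
};
/*
 * Written normally during boot (e.g. by jump_label_init()); the whole
 * section is then made read-only once init completes, so the object no
 * longer forces any text or data it shares a mapping with to stay
 * writable.
 */
static struct example_state example_state __ro_after_init = {
	.has_feature	= STATIC_KEY_FALSE_INIT,
	.threshold	= 16,
};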
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
virt/kvm/arm/vgic/vgic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 654dfd40e449..7713d96e85b7 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -29,7 +29,9 @@
#define DEBUG_SPINLOCK_BUG_ON(p)
#endif
-struct vgic_global __section(.hyp.text) kvm_vgic_global_state = {.gicv3_cpuif = STATIC_KEY_FALSE_INIT,};
+struct vgic_global kvm_vgic_global_state __ro_after_init = {
+ .gicv3_cpuif = STATIC_KEY_FALSE_INIT,
+};
/*
* Locking order is always:
--
2.7.4
* [PATCH v6 02/11] arm64: mmu: move TLB maintenance from callers to create_mapping_late()
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
2017-03-09 20:51 ` [PATCH v6 01/11] arm: kvm: move kvm_vgic_global_state out of .text section Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 03/11] arm64: alternatives: apply boot time fixups via the linear mapping Ard Biesheuvel
` (9 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
In preparation for refactoring the kernel mapping logic so that text regions
are never mapped writable, which would require adding explicit TLB
maintenance to new call sites of create_mapping_late() (which is currently
invoked twice from the same function), move the TLB maintenance from the
call site into create_mapping_late() itself, and change it from a full
TLB flush into a flush by VA, which is more appropriate here.
Also, given that create_mapping_late() has evolved into a routine that only
updates protection bits on existing mappings, rename it to
update_mapping_prot().
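The resulting calling pattern, as a rough sketch (names taken from this
patch; the real code is in the diff below):
static void update_mapping_prot_sketch(phys_addr_t phys, unsigned long virt,
				       phys_addr_t size, pgprot_t prot)
{
	/* change the attributes of an existing kernel mapping ... */
	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
			     NULL, debug_pagealloc_enabled());
	/*
	 * ... and invalidate only the VA range that was touched, rather
	 * than relying on each caller to issue a full flush_tlb_all().
	 */
	flush_tlb_kernel_range(virt, virt + size);
}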
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d28dbcf596b6..6cafd8723d1a 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -319,17 +319,20 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
pgd_pgtable_alloc, page_mappings_only);
}
-static void create_mapping_late(phys_addr_t phys, unsigned long virt,
- phys_addr_t size, pgprot_t prot)
+static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
+ phys_addr_t size, pgprot_t prot)
{
if (virt < VMALLOC_START) {
- pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
+ pr_warn("BUG: not updating mapping for %pa@0x%016lx - outside kernel range\n",
&phys, virt);
return;
}
__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
NULL, debug_pagealloc_enabled());
+
+ /* flush the TLBs after updating live kernel mappings */
+ flush_tlb_kernel_range(virt, virt + size);
}
static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
@@ -402,19 +405,16 @@ void mark_rodata_ro(void)
unsigned long section_size;
section_size = (unsigned long)_etext - (unsigned long)_text;
- create_mapping_late(__pa_symbol(_text), (unsigned long)_text,
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
section_size, PAGE_KERNEL_ROX);
/*
* mark .rodata as read only. Use __init_begin rather than __end_rodata
* to cover NOTES and EXCEPTION_TABLE.
*/
section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
- create_mapping_late(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
+ update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
section_size, PAGE_KERNEL_RO);
- /* flush the TLBs after updating live kernel mappings */
- flush_tlb_all();
-
debug_checkwx();
}
--
2.7.4
* [PATCH v6 03/11] arm64: alternatives: apply boot time fixups via the linear mapping
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
2017-03-09 20:51 ` [PATCH v6 01/11] arm: kvm: move kvm_vgic_global_state out of .text section Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 02/11] arm64: mmu: move TLB maintenance from callers to create_mapping_late() Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 04/11] arm64: mmu: map .text as read-only from the outset Ard Biesheuvel
` (8 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
One important rule of thumb when designing a secure software system is
that memory should never be writable and executable at the same time.
We mostly adhere to this rule in the kernel, except at boot time, when
regions may be mapped RWX until after we are done applying alternatives
or making other one-off changes.
For the alternative patching, we can improve the situation by applying
the fixups via the linear mapping, which is never mapped with executable
permissions. So map the linear alias of .text with RW- permissions
initially, and remove the write permissions as soon as alternative
patching has completed.
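In other words, the write goes through the linear alias while the cache
maintenance still targets the VA that gets executed. A hedged sketch of
the idea (the actual code threads a 'use_linear_alias' flag through
__apply_alternatives(), see the diff below):
static void patch_insn_via_linear_alias(u32 *origptr, u32 insn)
{
	/* the linear map covers the same physical page: writable, but NX */
	u32 *updptr = (u32 *)lm_alias(origptr);
	*updptr = cpu_to_le32(insn);		/* write via the linear alias */
	flush_icache_range((uintptr_t)origptr,	/* sync the executable VA */
			   (uintptr_t)(origptr + 1));
}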
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/mmu.h | 1 +
arch/arm64/kernel/alternative.c | 11 +++++-----
arch/arm64/kernel/smp.c | 1 +
arch/arm64/mm/mmu.c | 22 +++++++++++++++-----
4 files changed, 25 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 47619411f0ff..5468c834b072 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -37,5 +37,6 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot, bool page_mappings_only);
extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
+extern void mark_linear_text_alias_ro(void);
#endif
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 06d650f61da7..8840c109c5d6 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -105,11 +105,11 @@ static u32 get_alt_insn(struct alt_instr *alt, u32 *insnptr, u32 *altinsnptr)
return insn;
}
-static void __apply_alternatives(void *alt_region)
+static void __apply_alternatives(void *alt_region, bool use_linear_alias)
{
struct alt_instr *alt;
struct alt_region *region = alt_region;
- u32 *origptr, *replptr;
+ u32 *origptr, *replptr, *updptr;
for (alt = region->begin; alt < region->end; alt++) {
u32 insn;
@@ -124,11 +124,12 @@ static void __apply_alternatives(void *alt_region)
origptr = ALT_ORIG_PTR(alt);
replptr = ALT_REPL_PTR(alt);
+ updptr = use_linear_alias ? (u32 *)lm_alias(origptr) : origptr;
nr_inst = alt->alt_len / sizeof(insn);
for (i = 0; i < nr_inst; i++) {
insn = get_alt_insn(alt, origptr + i, replptr + i);
- *(origptr + i) = cpu_to_le32(insn);
+ updptr[i] = cpu_to_le32(insn);
}
flush_icache_range((uintptr_t)origptr,
@@ -155,7 +156,7 @@ static int __apply_alternatives_multi_stop(void *unused)
isb();
} else {
BUG_ON(patched);
- __apply_alternatives(&region);
+ __apply_alternatives(&region, true);
/* Barriers provided by the cache flushing */
WRITE_ONCE(patched, 1);
}
@@ -176,5 +177,5 @@ void apply_alternatives(void *start, size_t length)
.end = start + length,
};
- __apply_alternatives(&region);
+ __apply_alternatives(&region, false);
}
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ef1caae02110..d4739552da28 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -434,6 +434,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
setup_cpu_features();
hyp_mode_check();
apply_alternatives_all();
+ mark_linear_text_alias_ro();
}
void __init smp_prepare_boot_cpu(void)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6cafd8723d1a..df377fbe464e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -372,16 +372,28 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
debug_pagealloc_enabled());
/*
- * Map the linear alias of the [_text, __init_begin) interval as
- * read-only/non-executable. This makes the contents of the
- * region accessible to subsystems such as hibernate, but
- * protects it from inadvertent modification or execution.
+ * Map the linear alias of the [_text, __init_begin) interval
+ * as non-executable now, and remove the write permission in
+ * mark_linear_text_alias_ro() below (which will be called after
+ * alternative patching has completed). This makes the contents
+ * of the region accessible to subsystems such as hibernate,
+ * but protects it from inadvertent modification or execution.
*/
__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
- kernel_end - kernel_start, PAGE_KERNEL_RO,
+ kernel_end - kernel_start, PAGE_KERNEL,
early_pgtable_alloc, debug_pagealloc_enabled());
}
+void __init mark_linear_text_alias_ro(void)
+{
+ /*
+ * Remove the write permissions from the linear alias of .text/.rodata
+ */
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text),
+ (unsigned long)__init_begin - (unsigned long)_text,
+ PAGE_KERNEL_RO);
+}
+
static void __init map_mem(pgd_t *pgd)
{
struct memblock_region *reg;
--
2.7.4
* [PATCH v6 04/11] arm64: mmu: map .text as read-only from the outset
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (2 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 03/11] arm64: alternatives: apply boot time fixups via the linear mapping Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 05/11] arm64: mmu: apply strict permissions to .init.text and .init.data Ard Biesheuvel
` (7 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
Now that the alternatives patching code no longer relies on the primary
mapping of .text being writable, we can remove the code that strips the
write permission again after init, and map .text read-only from the
outset.
To preserve the existing behavior under rodata=off, which is relied
upon by external debuggers to manage software breakpoints (as pointed
out by Mark), add an early_param() check for rodata=, and use RWX
permissions if it is set to 'off'.
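The hook itself follows the usual early_param()/strtobool() pattern; a
generic sketch with hypothetical names (the real rodata= handler is in
the diff below), which runs from parse_early_param() and therefore
early enough for map_kernel() to honor it:
#include <linux/init.h>		/* early_param() */
#include <linux/string.h>	/* strtobool() */
static bool example_feature = true;
/* "example=off" on the command line clears the flag, like rodata=off */
static int __init parse_example(char *arg)
{
	return strtobool(arg, &example_feature);
}
early_param("example", parse_example);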
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index df377fbe464e..300e98e8cd63 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -416,9 +416,6 @@ void mark_rodata_ro(void)
{
unsigned long section_size;
- section_size = (unsigned long)_etext - (unsigned long)_text;
- update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
- section_size, PAGE_KERNEL_ROX);
/*
* mark .rodata as read only. Use __init_begin rather than __end_rodata
* to cover NOTES and EXCEPTION_TABLE.
@@ -451,6 +448,12 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
vm_area_add_early(vma);
}
+static int __init parse_rodata(char *arg)
+{
+ return strtobool(arg, &rodata_enabled);
+}
+early_param("rodata", parse_rodata);
+
/*
* Create fine-grained mappings for the kernel.
*/
@@ -458,7 +461,14 @@ static void __init map_kernel(pgd_t *pgd)
{
static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
- map_kernel_segment(pgd, _text, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
+ /*
+ * External debuggers may need to write directly to the text
+ * mapping to install SW breakpoints. Allow this (only) when
+ * explicitly requested with rodata=off.
+ */
+ pgprot_t text_prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
+
+ map_kernel_segment(pgd, _text, _etext, text_prot, &vmlinux_text);
map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, &vmlinux_rodata);
map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
&vmlinux_init);
--
2.7.4
* [PATCH v6 05/11] arm64: mmu: apply strict permissions to .init.text and .init.data
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (3 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 04/11] arm64: mmu: map .text as read-only from the outset Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 06/11] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions Ard Biesheuvel
` (6 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
To avoid having mappings that are writable and executable at the same
time, split the init region into a .init.text region that is mapped
read-only, and a .init.data region that is mapped non-executable.
This is possible now that the alternative patching occurs via the linear
mapping, and the linear alias of the init region is always mapped writable
(but never executable).
Since the alternatives descriptions themselves are read-only data, move
those into the .init.text region.
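For clarity, the resulting segment boundaries and the protections that
map_kernel() ends up applying to them (an illustrative summary, not code
taken from the patch):
/*
 *   [_text, _etext)                     text_prot (ROX, or RWX if rodata=off)
 *   [__start_rodata, __inittext_begin)  PAGE_KERNEL, remapped RO in mark_rodata_ro()
 *   [__inittext_begin, __inittext_end)  text_prot (.init.text, .exit.text,
 *                                       .altinstructions and replacements)
 *   [__initdata_begin, __initdata_end)  PAGE_KERNEL (.init.data, percpu, .rela)
 *   [_data, _end)                       PAGE_KERNEL (.data, .bss)
 */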
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/sections.h | 2 ++
arch/arm64/kernel/vmlinux.lds.S | 25 +++++++++++++-------
arch/arm64/mm/mmu.c | 12 ++++++----
3 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 4e7e7067afdb..941267caa39c 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -24,6 +24,8 @@ extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
extern char __hyp_text_start[], __hyp_text_end[];
extern char __idmap_text_start[], __idmap_text_end[];
+extern char __initdata_begin[], __initdata_end[];
+extern char __inittext_begin[], __inittext_end[];
extern char __irqentry_text_start[], __irqentry_text_end[];
extern char __mmuoff_data_start[], __mmuoff_data_end[];
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b8deffa9e1bf..2c93d259046c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -143,12 +143,27 @@ SECTIONS
. = ALIGN(SEGMENT_ALIGN);
__init_begin = .;
+ __inittext_begin = .;
INIT_TEXT_SECTION(8)
.exit.text : {
ARM_EXIT_KEEP(EXIT_TEXT)
}
+ . = ALIGN(4);
+ .altinstructions : {
+ __alt_instructions = .;
+ *(.altinstructions)
+ __alt_instructions_end = .;
+ }
+ .altinstr_replacement : {
+ *(.altinstr_replacement)
+ }
+
+ . = ALIGN(PAGE_SIZE);
+ __inittext_end = .;
+ __initdata_begin = .;
+
.init.data : {
INIT_DATA
INIT_SETUP(16)
@@ -164,15 +179,6 @@ SECTIONS
PERCPU_SECTION(L1_CACHE_BYTES)
- . = ALIGN(4);
- .altinstructions : {
- __alt_instructions = .;
- *(.altinstructions)
- __alt_instructions_end = .;
- }
- .altinstr_replacement : {
- *(.altinstr_replacement)
- }
.rela : ALIGN(8) {
*(.rela .rela*)
}
@@ -181,6 +187,7 @@ SECTIONS
__rela_size = SIZEOF(.rela);
. = ALIGN(SEGMENT_ALIGN);
+ __initdata_end = .;
__init_end = .;
_data = .;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 300e98e8cd63..75e21c33caff 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -459,7 +459,8 @@ early_param("rodata", parse_rodata);
*/
static void __init map_kernel(pgd_t *pgd)
{
- static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
+ static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_inittext,
+ vmlinux_initdata, vmlinux_data;
/*
* External debuggers may need to write directly to the text
@@ -469,9 +470,12 @@ static void __init map_kernel(pgd_t *pgd)
pgprot_t text_prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
map_kernel_segment(pgd, _text, _etext, text_prot, &vmlinux_text);
- map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, &vmlinux_rodata);
- map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
- &vmlinux_init);
+ map_kernel_segment(pgd, __start_rodata, __inittext_begin, PAGE_KERNEL,
+ &vmlinux_rodata);
+ map_kernel_segment(pgd, __inittext_begin, __inittext_end, text_prot,
+ &vmlinux_inittext);
+ map_kernel_segment(pgd, __initdata_begin, __initdata_end, PAGE_KERNEL,
+ &vmlinux_initdata);
map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
--
2.7.4
* [PATCH v6 06/11] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (4 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 05/11] arm64: mmu: apply strict permissions to .init.text and .init.data Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 07/11] arm64/mmu: ignore debug_pagealloc for kernel segments Ard Biesheuvel
` (5 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
Align the function prototype of alloc_init_pte() with its pmd and pud
counterparts by replacing the pfn parameter with the equivalent physical
address.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75e21c33caff..bb9179084217 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -107,7 +107,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
}
static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
- unsigned long end, unsigned long pfn,
+ unsigned long end, phys_addr_t phys,
pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void))
{
@@ -128,8 +128,7 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
do {
pte_t old_pte = *pte;
- set_pte(pte, pfn_pte(pfn, prot));
- pfn++;
+ set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot));
/*
* After the PTE entry has been populated once, we
@@ -137,6 +136,7 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
*/
BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), pte_val(*pte)));
+ phys += PAGE_SIZE;
} while (pte++, addr += PAGE_SIZE, addr != end);
pte_clear_fixmap();
@@ -182,7 +182,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
pmd_val(*pmd)));
} else {
- alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
+ alloc_init_pte(pmd, addr, next, phys,
prot, pgtable_alloc);
BUG_ON(pmd_val(old_pmd) != 0 &&
--
2.7.4
* [PATCH v6 07/11] arm64/mmu: ignore debug_pagealloc for kernel segments
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (5 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 06/11] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 08/11] arm64/mmu: add contiguous bit to sanity bug check Ard Biesheuvel
` (4 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
The debug_pagealloc facility manipulates kernel mappings in the linear
region at page granularity to detect out-of-bounds or use-after-free
accesses. Since the kernel segments are not allocated dynamically,
there is no point in taking the debug_pagealloc_enabled flag into
account for them, and we can use block mappings unconditionally.
Note that this applies equally to the linear alias of text/rodata:
we will never have dynamic allocations there given that the same
memory is statically in use by the kernel image.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index bb9179084217..ec23aec6433f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -328,8 +328,7 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
return;
}
- __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
- NULL, debug_pagealloc_enabled());
+ __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false);
/* flush the TLBs after updating live kernel mappings */
flush_tlb_kernel_range(virt, virt + size);
@@ -381,7 +380,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
*/
__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
kernel_end - kernel_start, PAGE_KERNEL,
- early_pgtable_alloc, debug_pagealloc_enabled());
+ early_pgtable_alloc, false);
}
void __init mark_linear_text_alias_ro(void)
@@ -437,7 +436,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
BUG_ON(!PAGE_ALIGNED(size));
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
- early_pgtable_alloc, debug_pagealloc_enabled());
+ early_pgtable_alloc, false);
vma->addr = va_start;
vma->phys_addr = pa_start;
--
2.7.4
* [PATCH v6 08/11] arm64/mmu: add contiguous bit to sanity bug check
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (6 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 07/11] arm64/mmu: ignore debug_pagealloc for kernel segments Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 09/11] arm64/mmu: replace 'page_mappings_only' parameter with flags argument Ard Biesheuvel
` (3 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
A mapping with the contiguous bit cannot be safely manipulated while
live, regardless of whether the bit changes between the old and new
mapping. So take this into account when deciding whether the change
is safe.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ec23aec6433f..382ebd6ef46f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -103,7 +103,15 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
*/
static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE;
- return old == 0 || new == 0 || ((old ^ new) & ~mask) == 0;
+ /* creating or taking down mappings is always safe */
+ if (old == 0 || new == 0)
+ return true;
+
+ /* live contiguous mappings may not be manipulated at all */
+ if ((old | new) & PTE_CONT)
+ return false;
+
+ return ((old ^ new) & ~mask) == 0;
}
static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
--
2.7.4
* [PATCH v6 09/11] arm64/mmu: replace 'page_mappings_only' parameter with flags argument
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (7 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 08/11] arm64/mmu: add contiguous bit to sanity bug check Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables Ard Biesheuvel
` (2 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
In preparation for extending the policy for manipulating kernel mappings
to also cover whether or not contiguous hints may be used in the page
tables, replace the bool 'page_mappings_only' parameter with a flags
argument and a NO_BLOCK_MAPPINGS flag.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 45 ++++++++++++--------
1 file changed, 27 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 382ebd6ef46f..bc9d5eb7bfa2 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -43,6 +43,8 @@
#include <asm/mmu_context.h>
#include <asm/ptdump.h>
+#define NO_BLOCK_MAPPINGS BIT(0)
+
u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
u64 kimage_voffset __ro_after_init;
@@ -153,7 +155,7 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void),
- bool page_mappings_only)
+ int flags)
{
pmd_t *pmd;
unsigned long next;
@@ -180,7 +182,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
/* try section mapping first */
if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
- !page_mappings_only) {
+ (flags & NO_BLOCK_MAPPINGS) == 0) {
pmd_set_huge(pmd, phys, prot);
/*
@@ -217,7 +219,7 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void),
- bool page_mappings_only)
+ int flags)
{
pud_t *pud;
unsigned long next;
@@ -239,7 +241,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
/*
* For 4K granule only, attempt to put down a 1GB block
*/
- if (use_1G_block(addr, next, phys) && !page_mappings_only) {
+ if (use_1G_block(addr, next, phys) &&
+ (flags & NO_BLOCK_MAPPINGS) == 0) {
pud_set_huge(pud, phys, prot);
/*
@@ -250,7 +253,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
pud_val(*pud)));
} else {
alloc_init_pmd(pud, addr, next, phys, prot,
- pgtable_alloc, page_mappings_only);
+ pgtable_alloc, flags);
BUG_ON(pud_val(old_pud) != 0 &&
pud_val(old_pud) != pud_val(*pud));
@@ -265,7 +268,7 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void),
- bool page_mappings_only)
+ int flags)
{
unsigned long addr, length, end, next;
pgd_t *pgd = pgd_offset_raw(pgdir, virt);
@@ -285,7 +288,7 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
do {
next = pgd_addr_end(addr, end);
alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc,
- page_mappings_only);
+ flags);
phys += next - addr;
} while (pgd++, addr = next, addr != end);
}
@@ -314,17 +317,22 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
&phys, virt);
return;
}
- __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false);
+ __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, 0);
}
void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot, bool page_mappings_only)
{
+ int flags = 0;
+
BUG_ON(mm == &init_mm);
+ if (page_mappings_only)
+ flags = NO_BLOCK_MAPPINGS;
+
__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
- pgd_pgtable_alloc, page_mappings_only);
+ pgd_pgtable_alloc, flags);
}
static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
@@ -336,7 +344,7 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
return;
}
- __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false);
+ __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, 0);
/* flush the TLBs after updating live kernel mappings */
flush_tlb_kernel_range(virt, virt + size);
@@ -346,6 +354,10 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
{
phys_addr_t kernel_start = __pa_symbol(_text);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
+ int flags = 0;
+
+ if (debug_pagealloc_enabled())
+ flags = NO_BLOCK_MAPPINGS;
/*
* Take care not to create a writable alias for the
@@ -356,8 +368,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
if (end < kernel_start || start >= kernel_end) {
__create_pgd_mapping(pgd, start, __phys_to_virt(start),
end - start, PAGE_KERNEL,
- early_pgtable_alloc,
- debug_pagealloc_enabled());
+ early_pgtable_alloc, flags);
return;
}
@@ -369,14 +380,12 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
__create_pgd_mapping(pgd, start,
__phys_to_virt(start),
kernel_start - start, PAGE_KERNEL,
- early_pgtable_alloc,
- debug_pagealloc_enabled());
+ early_pgtable_alloc, flags);
if (kernel_end < end)
__create_pgd_mapping(pgd, kernel_end,
__phys_to_virt(kernel_end),
end - kernel_end, PAGE_KERNEL,
- early_pgtable_alloc,
- debug_pagealloc_enabled());
+ early_pgtable_alloc, flags);
/*
* Map the linear alias of the [_text, __init_begin) interval
@@ -388,7 +397,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
*/
__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
kernel_end - kernel_start, PAGE_KERNEL,
- early_pgtable_alloc, false);
+ early_pgtable_alloc, 0);
}
void __init mark_linear_text_alias_ro(void)
@@ -444,7 +453,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
BUG_ON(!PAGE_ALIGNED(size));
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
- early_pgtable_alloc, false);
+ early_pgtable_alloc, 0);
vma->addr = va_start;
vma->phys_addr = pa_start;
--
2.7.4
* [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (8 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 09/11] arm64/mmu: replace 'page_mappings_only' parameter with flags argument Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-10 11:25 ` Mark Rutland
2017-03-09 20:52 ` [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate Ard Biesheuvel
2017-03-23 14:12 ` [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Catalin Marinas
11 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
The routines __pud_populate and __pmd_populate only create a table
entry at their respective level which refers to the next level page
by its physical address, so there is no reason to map this page and
then unmap it immediately after.
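For context, the populate helpers of that era boil down to a single
descriptor write that takes a physical address; roughly (a from-memory
sketch of arch/arm64/include/asm/pgalloc.h, not part of the diff):
static inline void __pmd_populate_sketch(pmd_t *pmdp, phys_addr_t ptep,
					 pmdval_t prot)
{
	/*
	 * Only the physical address of the new PTE page (plus the type
	 * bits) is stored in the table entry; the PTE page itself is
	 * never dereferenced here, so mapping it via the fixmap first
	 * was pure overhead.
	 */
	set_pmd(pmdp, __pmd(ptep | prot));
}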
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index bc9d5eb7bfa2..85ab82f5a0bc 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -128,9 +128,7 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
phys_addr_t pte_phys;
BUG_ON(!pgtable_alloc);
pte_phys = pgtable_alloc();
- pte = pte_set_fixmap(pte_phys);
__pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
- pte_clear_fixmap();
}
BUG_ON(pmd_bad(*pmd));
@@ -168,9 +166,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
phys_addr_t pmd_phys;
BUG_ON(!pgtable_alloc);
pmd_phys = pgtable_alloc();
- pmd = pmd_set_fixmap(pmd_phys);
__pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
- pmd_clear_fixmap();
}
BUG_ON(pud_bad(*pud));
--
2.7.4
* [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (9 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables Ard Biesheuvel
@ 2017-03-09 20:52 ` Ard Biesheuvel
2017-03-10 12:30 ` Mark Rutland
2017-03-23 14:12 ` [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Catalin Marinas
11 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2017-03-09 20:52 UTC (permalink / raw)
To: linux-arm-kernel
This is the third attempt at enabling the use of contiguous hints for
kernel mappings. The most recent attempt 0bfc445dec9d was reverted after
it turned out that updating permission attributes on live contiguous ranges
may result in TLB conflicts. So this time, the contiguous hint is not set
for .rodata or for the linear alias of .text/.rodata, both of which are
mapped read-write initially, and remapped read-only at a later stage.
(Note that the latter region could also be unmapped and remapped again
with updated permission attributes, given that the region, while live, is
only mapped for the convenience of the hibernation code, but that also
means the TLB footprint is negligible anyway, so why bother)
This enables the following contiguous range sizes for the virtual mapping
of the kernel image, and for the linear mapping:
granule size | cont PTE | cont PMD |
-------------+------------+------------+
4 KB | 64 KB | 32 MB |
16 KB | 2 MB | 1 GB* |
64 KB | 2 MB | 16 GB* |
* Only when built for 3 or more levels of translation. This is due to the
fact that a 2 level configuration only consists of PGDs and PTEs, and the
added complexity of dealing with folded PMDs is not justified considering
that 16 GB contiguous ranges are likely to be ignored by the hardware (and
16k/2 levels is a niche configuration)
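For reference, these sizes follow directly from the contiguous-range
counts arm64 uses for each granule (a worked breakdown, not part of the
patch):
4 KB granule: 16 contiguous PTEs x 4 KB = 64 KB, 16 contiguous PMDs x 2 MB = 32 MB
16 KB granule: 128 contiguous PTEs x 16 KB = 2 MB, 32 contiguous PMDs x 32 MB = 1 GB
64 KB granule: 32 contiguous PTEs x 64 KB = 2 MB, 32 contiguous PMDs x 512 MB = 16 GB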
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/pgtable.h | 10 ++
arch/arm64/mm/mmu.c | 140 ++++++++++++++------
2 files changed, 107 insertions(+), 43 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0eef6064bf3b..c213fdbd056c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -74,6 +74,16 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
#define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN))
#define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT))
+#define pte_cont_addr_end(addr, end) \
+({ unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK; \
+ (__boundary - 1 < (end) - 1) ? __boundary : (end); \
+})
+
+#define pmd_cont_addr_end(addr, end) \
+({ unsigned long __boundary = ((addr) + CONT_PMD_SIZE) & CONT_PMD_MASK; \
+ (__boundary - 1 < (end) - 1) ? __boundary : (end); \
+})
+
#ifdef CONFIG_ARM64_HW_AFDBM
#define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
#else
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 85ab82f5a0bc..91502e36e6d9 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -44,6 +44,7 @@
#include <asm/ptdump.h>
#define NO_BLOCK_MAPPINGS BIT(0)
+#define NO_CONT_MAPPINGS BIT(1)
u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
@@ -116,22 +117,11 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
return ((old ^ new) & ~mask) == 0;
}
-static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
- unsigned long end, phys_addr_t phys,
- pgprot_t prot,
- phys_addr_t (*pgtable_alloc)(void))
+static void init_pte(pmd_t *pmd, unsigned long addr, unsigned long end,
+ phys_addr_t phys, pgprot_t prot)
{
pte_t *pte;
- BUG_ON(pmd_sect(*pmd));
- if (pmd_none(*pmd)) {
- phys_addr_t pte_phys;
- BUG_ON(!pgtable_alloc);
- pte_phys = pgtable_alloc();
- __pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
- }
- BUG_ON(pmd_bad(*pmd));
-
pte = pte_set_fixmap_offset(pmd, addr);
do {
pte_t old_pte = *pte;
@@ -150,25 +140,45 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
pte_clear_fixmap();
}
-static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
- phys_addr_t phys, pgprot_t prot,
- phys_addr_t (*pgtable_alloc)(void),
- int flags)
+static void alloc_init_cont_pte(pmd_t *pmd, unsigned long addr,
+ unsigned long end, phys_addr_t phys,
+ pgprot_t prot,
+ phys_addr_t (*pgtable_alloc)(void),
+ int flags)
{
- pmd_t *pmd;
unsigned long next;
- /*
- * Check for initial section mappings in the pgd/pud and remove them.
- */
- BUG_ON(pud_sect(*pud));
- if (pud_none(*pud)) {
- phys_addr_t pmd_phys;
+ BUG_ON(pmd_sect(*pmd));
+ if (pmd_none(*pmd)) {
+ phys_addr_t pte_phys;
BUG_ON(!pgtable_alloc);
- pmd_phys = pgtable_alloc();
- __pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
+ pte_phys = pgtable_alloc();
+ __pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
}
- BUG_ON(pud_bad(*pud));
+ BUG_ON(pmd_bad(*pmd));
+
+ do {
+ pgprot_t __prot = prot;
+
+ next = pte_cont_addr_end(addr, end);
+
+ /* use a contiguous mapping if the range is suitably aligned */
+ if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
+ (flags & NO_CONT_MAPPINGS) == 0)
+ __prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+
+ init_pte(pmd, addr, next, phys, __prot);
+
+ phys += next - addr;
+ } while (addr = next, addr != end);
+}
+
+static void init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
+ phys_addr_t phys, pgprot_t prot,
+ phys_addr_t (*pgtable_alloc)(void), int flags)
+{
+ unsigned long next;
+ pmd_t *pmd;
pmd = pmd_set_fixmap_offset(pud, addr);
do {
@@ -188,8 +198,8 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
pmd_val(*pmd)));
} else {
- alloc_init_pte(pmd, addr, next, phys,
- prot, pgtable_alloc);
+ alloc_init_cont_pte(pmd, addr, next, phys, prot,
+ pgtable_alloc, flags);
BUG_ON(pmd_val(old_pmd) != 0 &&
pmd_val(old_pmd) != pmd_val(*pmd));
@@ -200,6 +210,41 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
pmd_clear_fixmap();
}
+static void alloc_init_cont_pmd(pud_t *pud, unsigned long addr,
+ unsigned long end, phys_addr_t phys,
+ pgprot_t prot,
+ phys_addr_t (*pgtable_alloc)(void), int flags)
+{
+ unsigned long next;
+
+ /*
+ * Check for initial section mappings in the pgd/pud.
+ */
+ BUG_ON(pud_sect(*pud));
+ if (pud_none(*pud)) {
+ phys_addr_t pmd_phys;
+ BUG_ON(!pgtable_alloc);
+ pmd_phys = pgtable_alloc();
+ __pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
+ }
+ BUG_ON(pud_bad(*pud));
+
+ do {
+ pgprot_t __prot = prot;
+
+ next = pmd_cont_addr_end(addr, end);
+
+ /* use a contiguous mapping if the range is suitably aligned */
+ if ((((addr | next | phys) & ~CONT_PMD_MASK) == 0) &&
+ (flags & NO_CONT_MAPPINGS) == 0)
+ __prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+
+ init_pmd(pud, addr, next, phys, __prot, pgtable_alloc, flags);
+
+ phys += next - addr;
+ } while (addr = next, addr != end);
+}
+
static inline bool use_1G_block(unsigned long addr, unsigned long next,
unsigned long phys)
{
@@ -248,8 +293,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
BUG_ON(!pgattr_change_is_safe(pud_val(old_pud),
pud_val(*pud)));
} else {
- alloc_init_pmd(pud, addr, next, phys, prot,
- pgtable_alloc, flags);
+ alloc_init_cont_pmd(pud, addr, next, phys, prot,
+ pgtable_alloc, flags);
BUG_ON(pud_val(old_pud) != 0 &&
pud_val(old_pud) != pud_val(*pud));
@@ -313,7 +358,8 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
&phys, virt);
return;
}
- __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, 0);
+ __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
+ NO_CONT_MAPPINGS);
}
void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
@@ -325,7 +371,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
BUG_ON(mm == &init_mm);
if (page_mappings_only)
- flags = NO_BLOCK_MAPPINGS;
+ flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
pgd_pgtable_alloc, flags);
@@ -340,7 +386,8 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
return;
}
- __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, 0);
+ __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
+ NO_CONT_MAPPINGS);
/* flush the TLBs after updating live kernel mappings */
flush_tlb_kernel_range(virt, virt + size);
@@ -353,7 +400,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
int flags = 0;
if (debug_pagealloc_enabled())
- flags = NO_BLOCK_MAPPINGS;
+ flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
/*
* Take care not to create a writable alias for the
@@ -390,10 +437,12 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
* alternative patching has completed). This makes the contents
* of the region accessible to subsystems such as hibernate,
* but protects it from inadvertent modification or execution.
+ * Note that contiguous mappings cannot be remapped in this way,
+ * so we should avoid them here.
*/
__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
kernel_end - kernel_start, PAGE_KERNEL,
- early_pgtable_alloc, 0);
+ early_pgtable_alloc, NO_CONT_MAPPINGS);
}
void __init mark_linear_text_alias_ro(void)
@@ -440,7 +489,8 @@ void mark_rodata_ro(void)
}
static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
- pgprot_t prot, struct vm_struct *vma)
+ pgprot_t prot, struct vm_struct *vma,
+ int flags)
{
phys_addr_t pa_start = __pa_symbol(va_start);
unsigned long size = va_end - va_start;
@@ -449,7 +499,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
BUG_ON(!PAGE_ALIGNED(size));
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
- early_pgtable_alloc, 0);
+ early_pgtable_alloc, flags);
vma->addr = va_start;
vma->phys_addr = pa_start;
@@ -481,14 +531,18 @@ static void __init map_kernel(pgd_t *pgd)
*/
pgprot_t text_prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
- map_kernel_segment(pgd, _text, _etext, text_prot, &vmlinux_text);
+ /*
+ * Only rodata will be remapped with different permissions later on,
+ * all other segments are allowed to use contiguous mappings.
+ */
+ map_kernel_segment(pgd, _text, _etext, text_prot, &vmlinux_text, 0);
map_kernel_segment(pgd, __start_rodata, __inittext_begin, PAGE_KERNEL,
- &vmlinux_rodata);
+ &vmlinux_rodata, NO_CONT_MAPPINGS);
map_kernel_segment(pgd, __inittext_begin, __inittext_end, text_prot,
- &vmlinux_inittext);
+ &vmlinux_inittext, 0);
map_kernel_segment(pgd, __initdata_begin, __initdata_end, PAGE_KERNEL,
- &vmlinux_initdata);
- map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
+ &vmlinux_initdata, 0);
+ map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data, 0);
if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
/*
--
2.7.4
* [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables
2017-03-09 20:52 ` [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables Ard Biesheuvel
@ 2017-03-10 11:25 ` Mark Rutland
0 siblings, 0 replies; 15+ messages in thread
From: Mark Rutland @ 2017-03-10 11:25 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 09:52:08PM +0100, Ard Biesheuvel wrote:
> The routines __pud_populate and __pmd_populate only create a table
> entry at their respective level which refers to the next level page
> by its physical address, so there is no reason to map this page and
> then unmap it immediately after.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Mark.
* [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate
2017-03-09 20:52 ` [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate Ard Biesheuvel
@ 2017-03-10 12:30 ` Mark Rutland
0 siblings, 0 replies; 15+ messages in thread
From: Mark Rutland @ 2017-03-10 12:30 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 09:52:09PM +0100, Ard Biesheuvel wrote:
> This is the third attempt at enabling the use of contiguous hints for
> kernel mappings. The most recent attempt 0bfc445dec9d was reverted after
> it turned out that updating permission attributes on live contiguous ranges
> may result in TLB conflicts. So this time, the contiguous hint is not set
> for .rodata or for the linear alias of .text/.rodata, both of which are
> mapped read-write initially, and remapped read-only at a later stage.
> (Note that the latter region could also be unmapped and remapped again
> with updated permission attributes, given that the region, while live, is
> only mapped for the convenience of the hibernation code, but that also
> means the TLB footprint is negligible anyway, so why bother)
>
> This enables the following contiguous range sizes for the virtual mapping
> of the kernel image, and for the linear mapping:
>
> granule size | cont PTE | cont PMD |
> -------------+------------+------------+
> 4 KB | 64 KB | 32 MB |
> 16 KB | 2 MB | 1 GB* |
> 64 KB | 2 MB | 16 GB* |
>
> * Only when built for 3 or more levels of translation. This is due to the
> fact that a 2 level configuration only consists of PGDs and PTEs, and the
> added complexity of dealing with folded PMDs is not justified considering
> that 16 GB contiguous ranges are likely to be ignored by the hardware (and
> 16k/2 levels is a niche configuration)
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This looks great; thanks for slogging through this!
To the best of my understanding this should avoid the issue seen by
Jean-Philippe with the last attempt.
As with the rest of the series:
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
I've given this a spin with 4K and 64K pages (2 and 3 level for the
latter) on Juno, with debug_pagealloc dynamically disabled/enabled. The
page tables look as expected in all cases, and memtest can happily poke
all memory without issue regardless.
FWIW, for the series:
Tested-by: Mark Rutland <mark.rutland@arm.com>
Hopefully Catalin/Will are happy to pick this up soon!
Thanks,
Mark.
> + /*
> + * Only rodata will be remapped with different permissions later on,
> + * all other segments are allowed to use contiguous mappings.
> + */
> + map_kernel_segment(pgd, _text, _etext, text_prot, &vmlinux_text, 0);
> map_kernel_segment(pgd, __start_rodata, __inittext_begin, PAGE_KERNEL,
> - &vmlinux_rodata);
> + &vmlinux_rodata, NO_CONT_MAPPINGS);
> map_kernel_segment(pgd, __inittext_begin, __inittext_end, text_prot,
> - &vmlinux_inittext);
> + &vmlinux_inittext, 0);
> map_kernel_segment(pgd, __initdata_begin, __initdata_end, PAGE_KERNEL,
> - &vmlinux_initdata);
> - map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
> + &vmlinux_initdata, 0);
> + map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data, 0);
>
> if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
> /*
> --
> 2.7.4
>
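[Not part of the patch quoted above: a stand-alone, user-space sketch of the
two pieces the contiguous-bit hunks rely on, namely the pte_cont_addr_end()
stepping and the alignment check that gates PTE_CONT. The CONT_PTE_SIZE value
assumes a 4k granule (16 pages per contiguous range) and the NO_CONT_MAPPINGS
bit is an illustrative stand-in for the flag defined in mmu.c, so treat the
constants as assumptions rather than the kernel's definitions.]

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Illustrative values: 16 contiguous 4k pages per PTE_CONT range. */
  #define CONT_PTE_SIZE    (16UL * 4096)
  #define CONT_PTE_MASK    (~(CONT_PTE_SIZE - 1))
  #define NO_CONT_MAPPINGS (1 << 1)   /* assumed flag bit for illustration */

  /*
   * Mirrors the pte_cont_addr_end() pattern: step to the next CONT_PTE
   * boundary, or to 'end' if that boundary lies beyond it.
   */
  static uint64_t cont_addr_end(uint64_t addr, uint64_t end)
  {
          uint64_t boundary = (addr + CONT_PTE_SIZE) & CONT_PTE_MASK;

          return (boundary - 1 < end - 1) ? boundary : end;
  }

  /*
   * A run [addr, next) backed by 'phys' may carry PTE_CONT only if all
   * three values are aligned to the contiguous range size and the caller
   * did not pass NO_CONT_MAPPINGS.
   */
  static bool may_use_cont(uint64_t addr, uint64_t next, uint64_t phys,
                           int flags)
  {
          if (flags & NO_CONT_MAPPINGS)
                  return false;
          return ((addr | next | phys) & ~CONT_PTE_MASK) == 0;
  }

  int main(void)
  {
          uint64_t addr = 0xffff000000200000ULL;  /* 64k-aligned VA */
          uint64_t phys = 0x40200000ULL;          /* 64k-aligned PA */
          uint64_t end  = addr + 2 * CONT_PTE_SIZE;

          while (addr != end) {
                  uint64_t next = cont_addr_end(addr, end);

                  printf("VA %#llx-%#llx -> PTE_CONT: %d\n",
                         (unsigned long long)addr, (unsigned long long)next,
                         may_use_cont(addr, next, phys, 0));
                  phys += next - addr;
                  addr = next;
          }
          return 0;
  }

[With the aligned addresses above, both iterations report PTE_CONT: 1;
shifting addr or phys by a single page makes the check fail, which is why
the patch clears the contiguous bit for ranges that are not suitably
aligned or that may be remapped later.]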
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
` (10 preceding siblings ...)
2017-03-09 20:52 ` [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate Ard Biesheuvel
@ 2017-03-23 14:12 ` Catalin Marinas
11 siblings, 0 replies; 15+ messages in thread
From: Catalin Marinas @ 2017-03-23 14:12 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 09:51:58PM +0100, Ard Biesheuvel wrote:
> Ard Biesheuvel (11):
> arm: kvm: move kvm_vgic_global_state out of .text section
> arm64: mmu: move TLB maintenance from callers to create_mapping_late()
> arm64: alternatives: apply boot time fixups via the linear mapping
> arm64: mmu: map .text as read-only from the outset
> arm64: mmu: apply strict permissions to .init.text and .init.data
> arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
> arm64/mmu: ignore debug_pagealloc for kernel segments
> arm64/mmu: add contiguous bit to sanity bug check
> arm64/mmu: replace 'page_mappings_only' parameter with flags argument
> arm64/mm: remove pointless map/unmap sequences when creating page
> tables
> arm64: mm: set the contiguous bit for kernel mappings where
> appropriate
Queued for 4.12. Thanks.
--
Catalin
^ permalink raw reply [flat|nested] 15+ messages in thread
Thread overview: 15+ messages
2017-03-09 20:51 [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Ard Biesheuvel
2017-03-09 20:51 ` [PATCH v6 01/11] arm: kvm: move kvm_vgic_global_state out of .text section Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 02/11] arm64: mmu: move TLB maintenance from callers to create_mapping_late() Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 03/11] arm64: alternatives: apply boot time fixups via the linear mapping Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 04/11] arm64: mmu: map .text as read-only from the outset Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 05/11] arm64: mmu: apply strict permissions to .init.text and .init.data Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 06/11] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 07/11] arm64/mmu: ignore debug_pagealloc for kernel segments Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 08/11] arm64/mmu: add contiguous bit to sanity bug check Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 09/11] arm64/mmu: replace 'page_mappings_only' parameter with flags argument Ard Biesheuvel
2017-03-09 20:52 ` [PATCH v6 10/11] arm64/mm: remove pointless map/unmap sequences when creating page tables Ard Biesheuvel
2017-03-10 11:25 ` Mark Rutland
2017-03-09 20:52 ` [PATCH v6 11/11] arm64: mm: set the contiguous bit for kernel mappings where appropriate Ard Biesheuvel
2017-03-10 12:30 ` Mark Rutland
2017-03-23 14:12 ` [PATCH v6 00/11] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel Catalin Marinas