* [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2
@ 2023-11-24 10:18 Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 01/39] arm64: kernel: Disable latent_entropy GCC plugin in early C runtime Ard Biesheuvel
` (39 more replies)
0 siblings, 40 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
At the request of Catalin, this series was split off from my LPA2 series
[0] in order to make the changes a bit more manageable.
This series reorganizes the kernel VA space, and refactors/replaces the
early mapping code so that:
- everything is done only once, in the appropriate order;
- everything is done with the MMU and caches enabled (*);
- everything is done from C code (notably, 100s of lines of
incomprehensible asm code are removed from head.S).
(*) the initial ID map will be populated with the MMU and caches
disabled if that is how we entered from the bootloader.
This is important for LPA2, but also for other future extensions to the
page table format, as managing this entirely in early asm code as we do
today would become intractable. This also applies to things such as
copying the KASAN shadow or the fixmap from the early page tables into
the permanent ones - this is all being removed by this series.
Another notable difference implemented by this series is the fact that
the permanent ID map always covers 48 bits of VA space, and is no longer
tied to the size of the kernel VA space. This removes awkward logic to
add a translation level above PGD level, and will be beneficial for
other reasons too (it permits future changes in the EFI logic to get rid
of SetVirtualAddressMap() entirely).
Changes since v4:
- merge a couple of followup tweaks for issues that were reported while
the v4 was briefly queued up and pulled into -next
- rebase onto v6.7-rc1
- omit LVA/LPA2 and WXN related changes
[0] https://lore.kernel.org/all/20230912141549.278777-63-ardb@google.com/
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Ard Biesheuvel (39):
arm64: kernel: Disable latent_entropy GCC plugin in early C runtime
arm64: mm: Take potential load offset into account when KASLR is off
arm64: mm: get rid of kimage_vaddr global variable
arm64: mm: Move PCI I/O emulation region above the vmemmap region
arm64: mm: Move fixmap region above vmemmap region
arm64: ptdump: Allow all region boundaries to be defined at boot time
arm64: ptdump: Discover start of vmemmap region at runtime
arm64: vmemmap: Avoid base2 order of struct page size to dimension
region
arm64: mm: Reclaim unused vmemmap region for vmalloc use
arm64: kaslr: Adjust randomization range dynamically
arm64: kernel: Manage absolute relocations in code built under pi/
arm64: kernel: Don't rely on objcopy to make code under pi/ __init
arm64: head: move relocation handling to C code
arm64: idreg-override: Omit non-NULL checks for override pointer
arm64: idreg-override: Prepare for place relative reloc patching
arm64: idreg-override: Avoid parameq() and parameqn()
arm64: idreg-override: avoid strlen() to check for empty strings
arm64: idreg-override: Avoid sprintf() for simple string concatenation
arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit
arm64: idreg-override: Move to early mini C runtime
arm64: kernel: Remove early fdt remap code
arm64: head: Clear BSS and the kernel page tables in one go
arm64: Move feature overrides into the BSS section
arm64: head: Run feature override detection before mapping the kernel
arm64: head: move dynamic shadow call stack patching into early C
runtime
arm64: kaslr: Use feature override instead of parsing the cmdline
again
arm64/kernel: Move 'nokaslr' parsing out of early idreg code
arm64: idreg-override: Create a pseudo feature for rodata=off
arm64: Add helpers to probe local CPU for PAC and BTI support
arm64: head: allocate more pages for the kernel mapping
arm64: head: move memstart_offset_seed handling to C code
arm64: mm: Make kaslr_requires_kpti() a static inline
arm64: head: Move early kernel mapping routines into C code
arm64: mm: Use 48-bit virtual addressing for the permanent ID map
arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels
arm64: kernel: Create initial ID map from C code
arm64: mm: avoid fixmap for early swapper_pg_dir updates
arm64: mm: omit redundant remap of kernel image
arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()"
arch/arm64/include/asm/archrandom.h | 2 -
arch/arm64/include/asm/assembler.h | 14 -
arch/arm64/include/asm/cpufeature.h | 53 +++
arch/arm64/include/asm/fixmap.h | 1 -
arch/arm64/include/asm/kasan.h | 2 -
arch/arm64/include/asm/kernel-pgtable.h | 128 +++---
arch/arm64/include/asm/memory.h | 20 +-
arch/arm64/include/asm/mmu.h | 40 +-
arch/arm64/include/asm/mmu_context.h | 25 +-
arch/arm64/include/asm/pgtable.h | 10 +-
arch/arm64/include/asm/scs.h | 36 +-
arch/arm64/include/asm/setup.h | 3 -
arch/arm64/kernel/Makefile | 7 +-
arch/arm64/kernel/cpufeature.c | 56 +--
arch/arm64/kernel/head.S | 428 ++------------------
arch/arm64/kernel/image-vars.h | 33 ++
arch/arm64/kernel/kaslr.c | 11 +-
arch/arm64/kernel/module.c | 2 +-
arch/arm64/kernel/pi/Makefile | 28 +-
arch/arm64/kernel/{ => pi}/idreg-override.c | 188 +++++----
arch/arm64/kernel/pi/kaslr_early.c | 78 +---
arch/arm64/kernel/pi/map_kernel.c | 187 +++++++++
arch/arm64/kernel/pi/map_range.c | 100 +++++
arch/arm64/kernel/{ => pi}/patch-scs.c | 36 +-
arch/arm64/kernel/pi/pi.h | 32 ++
arch/arm64/kernel/pi/relacheck.c | 130 ++++++
arch/arm64/kernel/pi/relocate.c | 64 +++
arch/arm64/kernel/setup.c | 22 -
arch/arm64/kernel/vmlinux.lds.S | 17 +-
arch/arm64/kvm/mmu.c | 15 +-
arch/arm64/mm/fixmap.c | 34 --
arch/arm64/mm/kasan_init.c | 19 +-
arch/arm64/mm/mmu.c | 135 ++----
arch/arm64/mm/proc.S | 13 +-
arch/arm64/mm/ptdump.c | 56 ++-
35 files changed, 1017 insertions(+), 1008 deletions(-)
rename arch/arm64/kernel/{ => pi}/idreg-override.c (58%)
create mode 100644 arch/arm64/kernel/pi/map_kernel.c
create mode 100644 arch/arm64/kernel/pi/map_range.c
rename arch/arm64/kernel/{ => pi}/patch-scs.c (89%)
create mode 100644 arch/arm64/kernel/pi/pi.h
create mode 100644 arch/arm64/kernel/pi/relacheck.c
create mode 100644 arch/arm64/kernel/pi/relocate.c
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 01/39] arm64: kernel: Disable latent_entropy GCC plugin in early C runtime
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 02/39] arm64: mm: Take potential load offset into account when KASLR is off Ard Biesheuvel
` (38 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
In subsequent patches, portions of the early C code will be marked as
__init. Unfortunately, __init implies __latent_entropy, and this
would result in the early C code being instrumented in an unsafe manner.
Disable the latent entropy plugin for the early C code.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/pi/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index 4c0ea3cd4ea4..c844a0546d7f 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -3,6 +3,7 @@
KBUILD_CFLAGS := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) -fpie \
-Os -DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN) \
+ $(DISABLE_LATENT_ENTROPY_PLUGIN) \
$(call cc-option,-mbranch-protection=none) \
-I$(srctree)/scripts/dtc/libfdt -fno-stack-protector \
-include $(srctree)/include/linux/hidden.h \
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 02/39] arm64: mm: Take potential load offset into account when KASLR is off
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 01/39] arm64: kernel: Disable latent_entropy GCC plugin in early C runtime Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 03/39] arm64: mm: get rid of kimage_vaddr global variable Ard Biesheuvel
` (37 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We enable CONFIG_RELOCATABLE even when CONFIG_RANDOMIZE_BASE is
disabled, and this permits the loader (i.e., EFI) to place the kernel
anywhere in physical memory as long as the base address is 64k aligned.
This means that the 'KASLR' case described in the header that defines
the size of the statically allocated page tables could take effect even
when CONFIG_RANDOMIZE_BASE=n. So check for CONFIG_RELOCATABLE instead.
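To illustrate why one extra entry per level is enough (and sometimes
needed), here is a minimal userspace sketch, not part of the patch; the
image size, link address and load offset are made-up values, and
span_nr_entries() simply mirrors the SPAN_NR_ENTRIES() macro:

/* Standalone illustration only - compile with any C compiler */
#include <stdio.h>
#include <stdint.h>

/* same formula as SPAN_NR_ENTRIES() in kernel-pgtable.h */
static uint64_t span_nr_entries(uint64_t vstart, uint64_t vend, int shift)
{
        return (((vend - 1) >> shift) - (vstart >> shift)) + 1;
}

int main(void)
{
        const int pmd_shift = 21;               /* 2 MiB blocks, 4k granule */
        uint64_t size  = 48 << 20;              /* hypothetical 48 MiB image */
        uint64_t link  = 0xffff800080000000;    /* example link-time address */
        uint64_t moved = link + 0x1f0000;       /* 64k aligned load offset */

        printf("PMD entries at link address : %llu\n",
               (unsigned long long)span_nr_entries(link, link + size, pmd_shift));
        printf("PMD entries after relocation: %llu\n",
               (unsigned long long)span_nr_entries(moved, moved + size, pmd_shift));
        return 0;
}

The second count is one higher than the first, because the load offset
pushes the end of the image across one more 2 MiB boundary while the
start stays on the same side of its boundary. The same reasoning applies
at PUD and PGD level, which is why the EARLY_PAGES() calculation allows
for one extra page per level when the kernel is relocatable.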
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/kernel-pgtable.h | 27 +++++---------------
1 file changed, 6 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 85d26143faa5..83ddb14b95a5 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -37,27 +37,12 @@
/*
- * If KASLR is enabled, then an offset K is added to the kernel address
- * space. The bottom 21 bits of this offset are zero to guarantee 2MB
- * alignment for PA and VA.
- *
- * For each pagetable level of the swapper, we know that the shift will
- * be larger than 21 (for the 4KB granule case we use section maps thus
- * the smallest shift is actually 30) thus there is the possibility that
- * KASLR can increase the number of pagetable entries by 1, so we make
- * room for this extra entry.
- *
- * Note KASLR cannot increase the number of required entries for a level
- * by more than one because it increments both the virtual start and end
- * addresses equally (the extra entry comes from the case where the end
- * address is just pushed over a boundary and the start address isn't).
+ * A relocatable kernel may execute from an address that differs from the one at
+ * which it was linked. In the worst case, its runtime placement may intersect
+ * with two adjacent PGDIR entries, which means that an additional page table
+ * may be needed at each subordinate level.
*/
-
-#ifdef CONFIG_RANDOMIZE_BASE
-#define EARLY_KASLR (1)
-#else
-#define EARLY_KASLR (0)
-#endif
+#define EXTRA_PAGE __is_defined(CONFIG_RELOCATABLE)
#define SPAN_NR_ENTRIES(vstart, vend, shift) \
((((vend) - 1) >> (shift)) - ((vstart) >> (shift)) + 1)
@@ -83,7 +68,7 @@
+ EARLY_PGDS((vstart), (vend), add) /* each PGDIR needs a next level page table */ \
+ EARLY_PUDS((vstart), (vend), add) /* each PUD needs a next level page table */ \
+ EARLY_PMDS((vstart), (vend), add)) /* each PMD needs a next level page table */
-#define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end, EARLY_KASLR))
+#define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE))
/* the initial ID map may need two extra pages if it needs to be extended */
#if VA_BITS < 48
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 03/39] arm64: mm: get rid of kimage_vaddr global variable
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 01/39] arm64: kernel: Disable latent_entropy GCC plugin in early C runtime Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 02/39] arm64: mm: Take potential load offset into account when KASLR is off Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 04/39] arm64: mm: Move PCI I/O emulation region above the vmemmap region Ard Biesheuvel
` (36 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We store the address of _text in kimage_vaddr, but since commit
09e3c22a86f6889d ("arm64: Use a variable to store non-global mappings
decision"), we no longer reference this variable from modules so we no
longer need to export it.
In fact, we don't need it at all so let's just get rid of it.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/memory.h | 6 ++----
arch/arm64/kernel/head.S | 2 +-
arch/arm64/mm/mmu.c | 3 ---
3 files changed, 3 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index fde4186cc387..b8d726f951ae 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -182,6 +182,7 @@
#include <linux/types.h>
#include <asm/boot.h>
#include <asm/bug.h>
+#include <asm/sections.h>
#if VA_BITS > 48
extern u64 vabits_actual;
@@ -193,15 +194,12 @@ extern s64 memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
-/* the virtual base of the kernel image */
-extern u64 kimage_vaddr;
-
/* the offset between the kernel virtual and physical mappings */
extern u64 kimage_voffset;
static inline unsigned long kaslr_offset(void)
{
- return kimage_vaddr - KIMAGE_VADDR;
+ return (u64)&_text - KIMAGE_VADDR;
}
#ifdef CONFIG_RANDOMIZE_BASE
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 7b236994f0e1..cab7f91949d8 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -482,7 +482,7 @@ SYM_FUNC_START_LOCAL(__primary_switched)
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- ldr_l x4, kimage_vaddr // Save the offset between
+ adrp x4, _text // Save the offset between
sub x4, x4, x0 // the kernel virtual and
str_l x4, kimage_voffset, x5 // physical mappings
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 15f6347d23b6..03c73e9197ac 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -52,9 +52,6 @@ u64 vabits_actual __ro_after_init = VA_BITS_MIN;
EXPORT_SYMBOL(vabits_actual);
#endif
-u64 kimage_vaddr __ro_after_init = (u64)&_text;
-EXPORT_SYMBOL(kimage_vaddr);
-
u64 kimage_voffset __ro_after_init;
EXPORT_SYMBOL(kimage_voffset);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 04/39] arm64: mm: Move PCI I/O emulation region above the vmemmap region
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (2 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 03/39] arm64: mm: get rid of kimage_vaddr global variable Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 05/39] arm64: mm: Move fixmap region above " Ard Biesheuvel
` (35 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Move the PCI I/O region above the vmemmap region in the kernel's VA
space. This will permit us to reclaim the lower part of the vmemmap
region for vmalloc/vmap allocations when running a 52-bit VA capable
build on a 48-bit VA capable system.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/memory.h | 4 ++--
arch/arm64/mm/ptdump.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b8d726f951ae..99caeff78e1a 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -49,8 +49,8 @@
#define MODULES_VSIZE (SZ_2G)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
-#define PCI_IO_END (VMEMMAP_START - SZ_8M)
-#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
+#define PCI_IO_START (VMEMMAP_END + SZ_8M)
+#define PCI_IO_END (PCI_IO_START + PCI_IO_SIZE)
#define FIXADDR_TOP (VMEMMAP_START - SZ_32M)
#if VA_BITS > 48
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index e305b6593c4e..d1df56d44f8a 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -47,10 +47,10 @@ static struct addr_marker address_markers[] = {
{ VMALLOC_END, "vmalloc() end" },
{ FIXADDR_TOT_START, "Fixmap start" },
{ FIXADDR_TOP, "Fixmap end" },
- { PCI_IO_START, "PCI I/O start" },
- { PCI_IO_END, "PCI I/O end" },
{ VMEMMAP_START, "vmemmap start" },
{ VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" },
+ { PCI_IO_START, "PCI I/O start" },
+ { PCI_IO_END, "PCI I/O end" },
{ -1, NULL },
};
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 05/39] arm64: mm: Move fixmap region above vmemmap region
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (3 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 04/39] arm64: mm: Move PCI I/O emulation region above the vmemmap region Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 06/39] arm64: ptdump: Allow all region boundaries to be defined at boot time Ard Biesheuvel
` (34 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Move the fixmap region above the vmemmap region, so that the start of
the vmemmap delineates the end of the region available for vmalloc and
vmap allocations and the randomized placement of the kernel and modules.
In a subsequent patch, we will take advantage of this to reclaim most of
the vmemmap area when running a 52-bit VA capable build with 52-bit
virtual addressing disabled at runtime.
Note that the existing guard region of 256 MiB covers the fixmap and PCI
I/O regions as well, so we can reduce it to 8 MiB, which is what we use in
other places too.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/include/asm/pgtable.h | 2 +-
arch/arm64/mm/ptdump.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 99caeff78e1a..2745bed8ae5b 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,7 +51,7 @@
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
#define PCI_IO_START (VMEMMAP_END + SZ_8M)
#define PCI_IO_END (PCI_IO_START + PCI_IO_SIZE)
-#define FIXADDR_TOP (VMEMMAP_START - SZ_32M)
+#define FIXADDR_TOP (-UL(SZ_8M))
#if VA_BITS > 48
#define VA_BITS_MIN (48)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b19a8aee684c..8d30e2787b1f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -22,7 +22,7 @@
* and fixed mappings
*/
#define VMALLOC_START (MODULES_END)
-#define VMALLOC_END (VMEMMAP_START - SZ_256M)
+#define VMALLOC_END (VMEMMAP_START - SZ_8M)
#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index d1df56d44f8a..3958b008f908 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -45,12 +45,12 @@ static struct addr_marker address_markers[] = {
{ MODULES_END, "Modules end" },
{ VMALLOC_START, "vmalloc() area" },
{ VMALLOC_END, "vmalloc() end" },
- { FIXADDR_TOT_START, "Fixmap start" },
- { FIXADDR_TOP, "Fixmap end" },
{ VMEMMAP_START, "vmemmap start" },
{ VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" },
{ PCI_IO_START, "PCI I/O start" },
{ PCI_IO_END, "PCI I/O end" },
+ { FIXADDR_TOT_START, "Fixmap start" },
+ { FIXADDR_TOP, "Fixmap end" },
{ -1, NULL },
};
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 06/39] arm64: ptdump: Allow all region boundaries to be defined at boot time
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (4 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 05/39] arm64: mm: Move fixmap region above " Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 07/39] arm64: ptdump: Discover start of vmemmap region at runtime Ard Biesheuvel
` (33 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Rework the way the address_markers array is populated so that we can
generally tolerate values that are not compile time constants, rather
than manually keeping track of the array indexes in question and poking
new values into them. This will be needed for VMALLOC_END,
which will cease to be a compile time constant after a subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/mm/ptdump.c | 54 ++++++++------------
1 file changed, 22 insertions(+), 32 deletions(-)
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 3958b008f908..bfc307890344 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -26,34 +26,6 @@
#include <asm/ptdump.h>
-enum address_markers_idx {
- PAGE_OFFSET_NR = 0,
- PAGE_END_NR,
-#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
- KASAN_START_NR,
-#endif
-};
-
-static struct addr_marker address_markers[] = {
- { PAGE_OFFSET, "Linear Mapping start" },
- { 0 /* PAGE_END */, "Linear Mapping end" },
-#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
- { 0 /* KASAN_SHADOW_START */, "Kasan shadow start" },
- { KASAN_SHADOW_END, "Kasan shadow end" },
-#endif
- { MODULES_VADDR, "Modules start" },
- { MODULES_END, "Modules end" },
- { VMALLOC_START, "vmalloc() area" },
- { VMALLOC_END, "vmalloc() end" },
- { VMEMMAP_START, "vmemmap start" },
- { VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" },
- { PCI_IO_START, "PCI I/O start" },
- { PCI_IO_END, "PCI I/O end" },
- { FIXADDR_TOT_START, "Fixmap start" },
- { FIXADDR_TOP, "Fixmap end" },
- { -1, NULL },
-};
-
#define pt_dump_seq_printf(m, fmt, args...) \
({ \
if (m) \
@@ -339,9 +311,8 @@ static void __init ptdump_initialize(void)
pg_level[i].mask |= pg_level[i].bits[j].mask;
}
-static struct ptdump_info kernel_ptdump_info = {
+static struct ptdump_info kernel_ptdump_info __ro_after_init = {
.mm = &init_mm,
- .markers = address_markers,
.base_addr = PAGE_OFFSET,
};
@@ -375,10 +346,29 @@ void ptdump_check_wx(void)
static int __init ptdump_init(void)
{
- address_markers[PAGE_END_NR].start_address = PAGE_END;
+ struct addr_marker m[] = {
+ { PAGE_OFFSET, "Linear Mapping start" },
+ { PAGE_END, "Linear Mapping end" },
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
- address_markers[KASAN_START_NR].start_address = KASAN_SHADOW_START;
+ { KASAN_SHADOW_START, "Kasan shadow start" },
+ { KASAN_SHADOW_END, "Kasan shadow end" },
#endif
+ { MODULES_VADDR, "Modules start" },
+ { MODULES_END, "Modules end" },
+ { VMALLOC_START, "vmalloc() area" },
+ { VMALLOC_END, "vmalloc() end" },
+ { VMEMMAP_START, "vmemmap start" },
+ { VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" },
+ { PCI_IO_START, "PCI I/O start" },
+ { PCI_IO_END, "PCI I/O end" },
+ { FIXADDR_TOT_START, "Fixmap start" },
+ { FIXADDR_TOP, "Fixmap end" },
+ { -1, NULL },
+ };
+ static struct addr_marker address_markers[ARRAY_SIZE(m)] __ro_after_init;
+
+ kernel_ptdump_info.markers = memcpy(address_markers, m, sizeof(m));
+
ptdump_initialize();
ptdump_debugfs_register(&kernel_ptdump_info, "kernel_page_tables");
return 0;
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 07/39] arm64: ptdump: Discover start of vmemmap region at runtime
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (5 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 06/39] arm64: ptdump: Allow all region boundaries to be defined at boot time Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 08/39] arm64: vmemmap: Avoid base2 order of struct page size to dimension region Ard Biesheuvel
` (32 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We will soon reclaim the part of the vmemmap region that covers VA space
that is not addressable by the hardware. To avoid confusion, ensure that
the 'vmemmap start' marker points at the start of the region that is
actually being used for the struct page array, rather than the start of
the region we set aside for it at build time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/mm/ptdump.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index bfc307890344..f3fdbf3bb6ad 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -346,6 +346,8 @@ void ptdump_check_wx(void)
static int __init ptdump_init(void)
{
+ u64 page_offset = _PAGE_OFFSET(vabits_actual);
+ u64 vmemmap_start = (u64)virt_to_page((void *)page_offset);
struct addr_marker m[] = {
{ PAGE_OFFSET, "Linear Mapping start" },
{ PAGE_END, "Linear Mapping end" },
@@ -357,7 +359,7 @@ static int __init ptdump_init(void)
{ MODULES_END, "Modules end" },
{ VMALLOC_START, "vmalloc() area" },
{ VMALLOC_END, "vmalloc() end" },
- { VMEMMAP_START, "vmemmap start" },
+ { vmemmap_start, "vmemmap start" },
{ VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" },
{ PCI_IO_START, "PCI I/O start" },
{ PCI_IO_END, "PCI I/O end" },
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 08/39] arm64: vmemmap: Avoid base2 order of struct page size to dimension region
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (6 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 07/39] arm64: ptdump: Discover start of vmemmap region at runtime Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 09/39] arm64: mm: Reclaim unused vmemmap region for vmalloc use Ard Biesheuvel
` (31 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The placement and size of the vmemmap region in the kernel virtual
address space is currently derived from the base2 order of the size of a
struct page. This makes for nicely aligned constants with lots of
leading 0xf and trailing 0x0 digits, but given that the actual struct
pages are indexed as an ordinary array, the resulting region is
severely overdimensioned when the size of a struct page is just over a
power of 2.
This doesn't matter today, but once we enable 52-bit virtual addressing
for 4k page configurations, the vmemmap region may take up almost half
of the upper VA region with the current struct page upper bound at 64
bytes. And once we enable KMSAN or other features that push the size of
a struct page over 64 bytes, we will run out of VMALLOC space entirely.
So instead, let's derive the region size from the actual size of a
struct page, and place the entire region 1 GB from the top of the VA
space, where it still doesn't share any lower level translation table
entries with the fixmap.
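As a rough illustration of the difference (not part of the patch; assume,
for round numbers, that the struct page array has to cover a 2^51-byte
linear region with 4k pages), the sketch below compares the old
power-of-2 based sizing with the new exact sizing for a struct page of
64 and of 72 bytes:

/* Standalone arithmetic only - not kernel code */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        const int page_shift = 12;              /* 4k pages */
        const uint64_t range = 1ULL << 51;      /* assumed linear region size */

        for (uint64_t page_sz = 64; page_sz <= 72; page_sz += 8) {
                /* old scheme: shift by the base-2 order of sizeof(struct page) */
                int order = 64 - __builtin_clzll(page_sz - 1);  /* ceil(log2(page_sz)) */
                uint64_t old_size = range >> (page_shift - order);
                /* new scheme: one struct page per page in the range */
                uint64_t new_size = (range >> page_shift) * page_sz;

                printf("sizeof(struct page) == %2llu: old %llu GiB, new %llu GiB\n",
                       (unsigned long long)page_sz,
                       (unsigned long long)(old_size >> 30),
                       (unsigned long long)(new_size >> 30));
        }
        return 0;
}

For a 64-byte struct page the two schemes agree, but as soon as the size
grows to 72 bytes, the power-of-2 based region doubles to 64 TiB while
the exact sizing only grows to 36 TiB.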
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/memory.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 2745bed8ae5b..b49575a92afc 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -30,8 +30,8 @@
* keep a constant PAGE_OFFSET and "fallback" to using the higher end
* of the VMEMMAP where 52-bit support is not available in hardware.
*/
-#define VMEMMAP_SHIFT (PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT)
-#define VMEMMAP_SIZE ((_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) >> VMEMMAP_SHIFT)
+#define VMEMMAP_RANGE (_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET)
+#define VMEMMAP_SIZE ((VMEMMAP_RANGE >> PAGE_SHIFT) * sizeof(struct page))
/*
* PAGE_OFFSET - the virtual address of the start of the linear map, at the
@@ -47,8 +47,8 @@
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_VSIZE (SZ_2G)
-#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
-#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
+#define VMEMMAP_START (VMEMMAP_END - VMEMMAP_SIZE)
+#define VMEMMAP_END (-UL(SZ_1G))
#define PCI_IO_START (VMEMMAP_END + SZ_8M)
#define PCI_IO_END (PCI_IO_START + PCI_IO_SIZE)
#define FIXADDR_TOP (-UL(SZ_8M))
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 09/39] arm64: mm: Reclaim unused vmemmap region for vmalloc use
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (7 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 08/39] arm64: vmemmap: Avoid base2 order of struct page size to dimension region Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 10/39] arm64: kaslr: Adjust randomization range dynamically Ard Biesheuvel
` (30 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The vmemmap array is statically sized based on the maximum supported
size of the virtual address space, but it is located inside the upper VA
region, which is statically sized based on the *minimum* supported size
of the VA space. This doesn't matter much when using 64k pages, which is
the only configuration that currently supports 52-bit virtual
addressing.
However, upcoming LPA2 support will change this picture somewhat, as in
that case, the vmemmap array will take up more than 25% of the upper VA
region when using 4k pages. Given that most of this space is never used
when running on a system that does not support 52-bit virtual
addressing, let's reclaim the unused vmemmap area in that case.
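To get a feel for the numbers involved, here is a back-of-the-envelope
sketch (not part of the patch; it assumes 4k pages and a 64-byte struct
page) of how much of the statically sized vmemmap window becomes
available to vmalloc when a 52-bit VA build runs on 48-bit capable
hardware:

/* Standalone arithmetic only - not kernel code */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        const int page_shift = 12, struct_page_size = 64;     /* assumptions */
        const int va_bits = 52, va_bits_min = 48, vabits_actual = 48;

        uint64_t page_offset     = -(1ULL << va_bits);            /* PAGE_OFFSET      */
        uint64_t page_offset_act = -(1ULL << vabits_actual);      /* _PAGE_OFFSET(48) */
        uint64_t page_end_min    = -(1ULL << (va_bits_min - 1));  /* _PAGE_END(48)    */

        uint64_t vmemmap_size = ((page_end_min - page_offset) >> page_shift) * struct_page_size;
        uint64_t unused       = ((page_offset_act - page_offset) >> page_shift) * struct_page_size;

        printf("vmemmap window: %llu TiB, reclaimable for vmalloc: %llu TiB\n",
               (unsigned long long)(vmemmap_size >> 40),
               (unsigned long long)(unused >> 40));
        return 0;
}

Under those assumptions, only the trailing part of the window, which
describes linear addresses the hardware can actually generate, needs to
stay reserved; the leading ~60 TiB can be handed to vmalloc, which is
what the new VMALLOC_END definition expresses.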
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/pgtable.h | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 8d30e2787b1f..5b5bedfa320e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -18,11 +18,15 @@
* VMALLOC range.
*
* VMALLOC_START: beginning of the kernel vmalloc space
- * VMALLOC_END: extends to the available space below vmemmap, PCI I/O space
- * and fixed mappings
+ * VMALLOC_END: extends to the available space below vmemmap
*/
#define VMALLOC_START (MODULES_END)
+#if VA_BITS == VA_BITS_MIN
#define VMALLOC_END (VMEMMAP_START - SZ_8M)
+#else
+#define VMEMMAP_UNUSED_NPAGES ((_PAGE_OFFSET(vabits_actual) - PAGE_OFFSET) >> PAGE_SHIFT)
+#define VMALLOC_END (VMEMMAP_START + VMEMMAP_UNUSED_NPAGES * sizeof(struct page) - SZ_8M)
+#endif
#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 10/39] arm64: kaslr: Adjust randomization range dynamically
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (8 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 09/39] arm64: mm: Reclaim unused vmemmap region for vmalloc use Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 11/39] arm64: kernel: Manage absolute relocations in code built under pi/ Ard Biesheuvel
` (29 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Currently, we base the KASLR randomization range on a rough estimate of
the available space in the upper VA region: the lower 1/4th has the
module region and the upper 1/4th has the fixmap, vmemmap and PCI I/O
ranges, and so we pick a random location in the remaining space in the
middle.
Once we enable support for 5-level paging with 4k pages, this no longer
works: the vmemmap region, being dimensioned to cover a 52-bit linear
region, takes up so much space in the upper VA region (the size of which
is based on a 48-bit VA space for compatibility with non-LVA hardware)
that the region above the vmalloc region takes up more than a quarter of
the available space.
So instead of a heuristic, let's derive the randomization range from the
actual boundaries of the vmalloc region.
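The offset computation in the diff below relies on a small fixed-point
trick: multiplying the 64-bit seed by the range size in 128-bit
arithmetic and taking the top half scales the seed into [0, range)
without a modulo, and adding range/2 then keeps the result in the middle
half of the window. A minimal userspace sketch of the same arithmetic
(illustrative only; the window size is made up):

/* Standalone illustration only - not kernel code */
#include <stdio.h>
#include <stdint.h>

/* (seed / 2^64) * range, using a 128-bit intermediate (GCC/Clang extension) */
static uint64_t scale_seed(uint64_t seed, uint64_t range)
{
        return (uint64_t)(((__uint128_t)range * seed) >> 64);
}

int main(void)
{
        uint64_t window = 1ULL << 45;   /* hypothetical KIMAGE_VADDR..VMALLOC_END span */
        uint64_t range  = window / 2;   /* as in kaslr_early_init() */
        uint64_t seeds[] = { 0, UINT64_MAX / 2, UINT64_MAX };

        for (int i = 0; i < 3; i++) {
                uint64_t off = range / 2 + scale_seed(seeds[i], range);
                printf("seed=%#018llx -> offset at %4.1f%% of the window\n",
                       (unsigned long long)seeds[i], 100.0 * off / window);
        }
        return 0;
}

The three sample seeds land at roughly 25%, 50% and 75% of the window,
i.e. the randomized placement stays in the middle half of the
[KIMAGE_VADDR, VMALLOC_END) range, clear of the lower and upper
quarters, as the comment in the code describes.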
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/image-vars.h | 2 ++
arch/arm64/kernel/pi/kaslr_early.c | 11 ++++++-----
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 5e4dc72ab1bd..e931ce078a00 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -36,6 +36,8 @@ PROVIDE(__pi___memcpy = __pi_memcpy);
PROVIDE(__pi___memmove = __pi_memmove);
PROVIDE(__pi___memset = __pi_memset);
+PROVIDE(__pi_vabits_actual = vabits_actual);
+
#ifdef CONFIG_KVM
/*
diff --git a/arch/arm64/kernel/pi/kaslr_early.c b/arch/arm64/kernel/pi/kaslr_early.c
index 17bff6e399e4..b9e0bb4bc6a9 100644
--- a/arch/arm64/kernel/pi/kaslr_early.c
+++ b/arch/arm64/kernel/pi/kaslr_early.c
@@ -14,6 +14,7 @@
#include <asm/archrandom.h>
#include <asm/memory.h>
+#include <asm/pgtable.h>
/* taken from lib/string.c */
static char *__strstr(const char *s1, const char *s2)
@@ -87,7 +88,7 @@ static u64 get_kaslr_seed(void *fdt)
asmlinkage u64 kaslr_early_init(void *fdt)
{
- u64 seed;
+ u64 seed, range;
if (is_kaslr_disabled_cmdline(fdt))
return 0;
@@ -102,9 +103,9 @@ asmlinkage u64 kaslr_early_init(void *fdt)
/*
* OK, so we are proceeding with KASLR enabled. Calculate a suitable
* kernel image offset from the seed. Let's place the kernel in the
- * middle half of the VMALLOC area (VA_BITS_MIN - 2), and stay clear of
- * the lower and upper quarters to avoid colliding with other
- * allocations.
+ * 'middle' half of the VMALLOC area, and stay clear of the lower and
+ * upper quarters to avoid colliding with other allocations.
*/
- return BIT(VA_BITS_MIN - 3) + (seed & GENMASK(VA_BITS_MIN - 3, 0));
+ range = (VMALLOC_END - KIMAGE_VADDR) / 2;
+ return range / 2 + (((__uint128_t)range * seed) >> 64);
}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 11/39] arm64: kernel: Manage absolute relocations in code built under pi/
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (9 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 10/39] arm64: kaslr: Adjust randomization range dynamically Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 12/39] arm64: kernel: Don't rely on objcopy to make code under pi/ __init Ard Biesheuvel
` (28 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The mini C runtime runs before relocations are processed, and so it
cannot rely on statically initialized pointer variables.
Add a check to ensure that such code does not get introduced by
accident, by going over the relocations in each object, identifying the
ones that operate on data sections that are part of the executable
image, and raising an error if any relocations of type R_AARCH64_ABS64
exist. Note that such relocations are permitted in other places (e.g.,
debug sections) and will never occur in compiler generated code sections
when using the small code model, so only check sections that have
SHF_ALLOC set and SHF_EXECINSTR cleared.
To accommodate cases where statically initialized symbol references are
unavoidable, introduce a special case for ELF input data sections that
have ".rodata.prel64" in their names, and in these cases, instead of
rejecting any encountered ABS64 relocations, convert them into PREL64
relocations, which don't require any runtime fixups. Note that the code
in question must still be modified to deal with this, as it needs to
convert the 64-bit signed offsets into absolute addresses before use.
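As a rough illustration of what the referring code has to do, here is a
userspace-only sketch (not the kernel implementation; the variable names
are made up, and the offset is computed by hand here, whereas in the
kernel it is produced by the converted PREL64 relocation at link time):

/* Standalone illustration only - not kernel code */
#include <stdio.h>
#include <stdint.h>

typedef int64_t prel64_t;       /* stands in for the prel64_t in pi.h */

/* same idea as prel64_to_pointer(): the slot holds an offset from itself */
static void *prel64_to_pointer(const prel64_t *offset)
{
        if (!*offset)
                return NULL;
        return (void *)((uintptr_t)offset + *offset);
}

static int the_target = 42;     /* hypothetical object being referenced */

int main(void)
{
        /* in the kernel this value comes from the PREL64 relocation */
        prel64_t slot = 0;
        slot = (intptr_t)&the_target - (intptr_t)&slot;

        int *p = prel64_to_pointer(&slot);
        printf("recovered %d via a place-relative reference\n", *p);
        return 0;
}

Because the slot stores a self-relative offset rather than an absolute
address, it remains valid wherever the image happens to be loaded, which
is why such references are safe to use before relocation processing has
run.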
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/pi/Makefile | 9 +-
arch/arm64/kernel/pi/pi.h | 14 +++
arch/arm64/kernel/pi/relacheck.c | 130 ++++++++++++++++++++
3 files changed, 151 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index c844a0546d7f..bc32a431fe35 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -22,11 +22,16 @@ KCSAN_SANITIZE := n
UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
+hostprogs := relacheck
+
+quiet_cmd_piobjcopy = $(quiet_cmd_objcopy)
+ cmd_piobjcopy = $(cmd_objcopy) && $(obj)/relacheck $(@) $(<)
+
$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \
--remove-section=.note.gnu.property \
--prefix-alloc-sections=.init
-$(obj)/%.pi.o: $(obj)/%.o FORCE
- $(call if_changed,objcopy)
+$(obj)/%.pi.o: $(obj)/%.o $(obj)/relacheck FORCE
+ $(call if_changed,piobjcopy)
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
diff --git a/arch/arm64/kernel/pi/pi.h b/arch/arm64/kernel/pi/pi.h
new file mode 100644
index 000000000000..f455ad385976
--- /dev/null
+++ b/arch/arm64/kernel/pi/pi.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2023 Google LLC
+// Author: Ard Biesheuvel <ardb@google.com>
+
+#define __prel64_initconst __section(".init.rodata.prel64")
+
+typedef volatile signed long prel64_t;
+
+static inline void *prel64_to_pointer(const prel64_t *offset)
+{
+ if (!*offset)
+ return NULL;
+ return (void *)offset + *offset;
+}
diff --git a/arch/arm64/kernel/pi/relacheck.c b/arch/arm64/kernel/pi/relacheck.c
new file mode 100644
index 000000000000..b0cd4d0d275b
--- /dev/null
+++ b/arch/arm64/kernel/pi/relacheck.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 - Google LLC
+ * Author: Ard Biesheuvel <ardb@google.com>
+ */
+
+#include <elf.h>
+#include <fcntl.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define HOST_ORDER ELFDATA2LSB
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#define HOST_ORDER ELFDATA2MSB
+#endif
+
+static Elf64_Ehdr *ehdr;
+static Elf64_Shdr *shdr;
+static const char *strtab;
+static bool swap;
+
+static uint64_t swab_elfxword(uint64_t val)
+{
+ return swap ? __builtin_bswap64(val) : val;
+}
+
+static uint32_t swab_elfword(uint32_t val)
+{
+ return swap ? __builtin_bswap32(val) : val;
+}
+
+static uint16_t swab_elfhword(uint16_t val)
+{
+ return swap ? __builtin_bswap16(val) : val;
+}
+
+int main(int argc, char *argv[])
+{
+ struct stat stat;
+ int fd, ret;
+
+ if (argc < 3) {
+ fprintf(stderr, "file arguments missing\n");
+ exit(EXIT_FAILURE);
+ }
+
+ fd = open(argv[1], O_RDWR);
+ if (fd < 0) {
+ fprintf(stderr, "failed to open %s\n", argv[1]);
+ exit(EXIT_FAILURE);
+ }
+
+ ret = fstat(fd, &stat);
+ if (ret < 0) {
+ fprintf(stderr, "failed to stat() %s\n", argv[1]);
+ exit(EXIT_FAILURE);
+ }
+
+ ehdr = mmap(0, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if (ehdr == MAP_FAILED) {
+ fprintf(stderr, "failed to mmap() %s\n", argv[1]);
+ exit(EXIT_FAILURE);
+ }
+
+ swap = ehdr->e_ident[EI_DATA] != HOST_ORDER;
+ shdr = (void *)ehdr + swab_elfxword(ehdr->e_shoff);
+ strtab = (void *)ehdr +
+ swab_elfxword(shdr[swab_elfhword(ehdr->e_shstrndx)].sh_offset);
+
+ for (int i = 0; i < swab_elfhword(ehdr->e_shnum); i++) {
+ unsigned long info, flags;
+ bool prel64 = false;
+ Elf64_Rela *rela;
+ int numrela;
+
+ if (swab_elfword(shdr[i].sh_type) != SHT_RELA)
+ continue;
+
+ /* only consider RELA sections operating on data */
+ info = swab_elfword(shdr[i].sh_info);
+ flags = swab_elfxword(shdr[info].sh_flags);
+ if ((flags & (SHF_ALLOC | SHF_EXECINSTR)) != SHF_ALLOC)
+ continue;
+
+ /*
+ * We generally don't permit ABS64 relocations in the code that
+ * runs before relocation processing occurs. If statically
+ * initialized absolute symbol references are unavoidable, they
+ * may be emitted into a *.rodata.prel64 section and they will
+ * be converted to place-relative 64-bit references. This
+ * requires special handling in the referring code.
+ */
+ if (strstr(strtab + swab_elfword(shdr[info].sh_name),
+ ".rodata.prel64")) {
+ prel64 = true;
+ }
+
+ rela = (void *)ehdr + swab_elfxword(shdr[i].sh_offset);
+ numrela = swab_elfxword(shdr[i].sh_size) / sizeof(*rela);
+
+ for (int j = 0; j < numrela; j++) {
+ uint64_t info = swab_elfxword(rela[j].r_info);
+
+ if (ELF64_R_TYPE(info) != R_AARCH64_ABS64)
+ continue;
+
+ if (prel64) {
+ /* convert ABS64 into PREL64 */
+ info ^= R_AARCH64_ABS64 ^ R_AARCH64_PREL64;
+ rela[j].r_info = swab_elfxword(info);
+ } else {
+ fprintf(stderr,
+ "Unexpected absolute relocations detected in %s\n",
+ argv[2]);
+ close(fd);
+ unlink(argv[1]);
+ exit(EXIT_FAILURE);
+ }
+ }
+ }
+ close(fd);
+ return 0;
+}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 12/39] arm64: kernel: Don't rely on objcopy to make code under pi/ __init
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (10 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 11/39] arm64: kernel: Manage absolute relocations in code built under pi/ Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 13/39] arm64: head: move relocation handling to C code Ard Biesheuvel
` (27 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We will add some code under pi/ that contains global variables that
should not end up in __initdata, as they will not be writable via the
initial ID map. So only rely on objcopy for making the libfdt code
__init, and use explicit annotations for the rest.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/pi/Makefile | 6 ++++--
arch/arm64/kernel/pi/kaslr_early.c | 16 +++++++++-------
2 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index bc32a431fe35..2bbe866417d4 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -28,11 +28,13 @@ quiet_cmd_piobjcopy = $(quiet_cmd_objcopy)
cmd_piobjcopy = $(cmd_objcopy) && $(obj)/relacheck $(@) $(<)
$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \
- --remove-section=.note.gnu.property \
- --prefix-alloc-sections=.init
+ --remove-section=.note.gnu.property
$(obj)/%.pi.o: $(obj)/%.o $(obj)/relacheck FORCE
$(call if_changed,piobjcopy)
+# ensure that all the lib- code ends up as __init code and data
+$(obj)/lib-%.pi.o: OBJCOPYFLAGS += --prefix-alloc-sections=.init
+
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
diff --git a/arch/arm64/kernel/pi/kaslr_early.c b/arch/arm64/kernel/pi/kaslr_early.c
index b9e0bb4bc6a9..167081b30a15 100644
--- a/arch/arm64/kernel/pi/kaslr_early.c
+++ b/arch/arm64/kernel/pi/kaslr_early.c
@@ -17,7 +17,7 @@
#include <asm/pgtable.h>
/* taken from lib/string.c */
-static char *__strstr(const char *s1, const char *s2)
+static char *__init __strstr(const char *s1, const char *s2)
{
size_t l1, l2;
@@ -33,7 +33,7 @@ static char *__strstr(const char *s1, const char *s2)
}
return NULL;
}
-static bool cmdline_contains_nokaslr(const u8 *cmdline)
+static bool __init cmdline_contains_nokaslr(const u8 *cmdline)
{
const u8 *str;
@@ -41,7 +41,7 @@ static bool cmdline_contains_nokaslr(const u8 *cmdline)
return str == cmdline || (str > cmdline && *(str - 1) == ' ');
}
-static bool is_kaslr_disabled_cmdline(void *fdt)
+static bool __init is_kaslr_disabled_cmdline(void *fdt)
{
if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
int node;
@@ -67,17 +67,19 @@ static bool is_kaslr_disabled_cmdline(void *fdt)
return cmdline_contains_nokaslr(CONFIG_CMDLINE);
}
-static u64 get_kaslr_seed(void *fdt)
+static u64 __init get_kaslr_seed(void *fdt)
{
+ static char const chosen_str[] __initconst = "chosen";
+ static char const seed_str[] __initconst = "kaslr-seed";
int node, len;
fdt64_t *prop;
u64 ret;
- node = fdt_path_offset(fdt, "/chosen");
+ node = fdt_path_offset(fdt, chosen_str);
if (node < 0)
return 0;
- prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+ prop = fdt_getprop_w(fdt, node, seed_str, &len);
if (!prop || len != sizeof(u64))
return 0;
@@ -86,7 +88,7 @@ static u64 get_kaslr_seed(void *fdt)
return ret;
}
-asmlinkage u64 kaslr_early_init(void *fdt)
+asmlinkage u64 __init kaslr_early_init(void *fdt)
{
u64 seed, range;
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 13/39] arm64: head: move relocation handling to C code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (11 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 12/39] arm64: kernel: Don't rely on objcopy to make code under pi/ __init Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 14/39] arm64: idreg-override: Omit non-NULL checks for override pointer Ard Biesheuvel
` (26 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Now that we have a mini C runtime before the kernel mapping is up, we
can move the non-trivial relocation processing code out of head.S and
reimplement it in C.
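For reference, here is a small userspace sketch (not part of the patch)
that decodes a hand-written two-entry RELR stream using the same loop
structure as the new relocate_kernel(). Unlike the kernel case, the
'image' array below already lives at its runtime address, so the load
offset is only added to the contents, not to the locations:

/* Standalone illustration only - not kernel code */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        uint64_t image[8] = { 100, 0, 200, 300, 0, 0, 400, 0 };
        uint64_t offset = 0x1000;       /* pretend load offset */

        /*
         * Entry 0 (even): address entry, relocates image[0].
         * Entry 1 (odd): bitmap; bits 2, 3 and 6 select image[2], image[3]
         * and image[6], i.e. the 2nd, 3rd and 6th words after image[0].
         */
        uint64_t relr[2] = {
                (uint64_t)(uintptr_t)&image[0],
                (1ULL << 2) | (1ULL << 3) | (1ULL << 6) | 1,
        };

        uint64_t *place = NULL;
        for (int i = 0; i < 2; i++) {
                if ((relr[i] & 1) == 0) {
                        place = (uint64_t *)(uintptr_t)relr[i];
                        *place++ += offset;     /* the address entry itself */
                } else {
                        for (uint64_t *p = place, r = relr[i] >> 1; r; p++, r >>= 1)
                                if (r & 1)
                                        *p += offset;
                        place += 63;
                }
        }

        for (int i = 0; i < 8; i++)
                printf("image[%d] = %llu\n", i, (unsigned long long)image[i]);
        return 0;
}

After the loop, image[0], image[2], image[3] and image[6] have each grown
by 0x1000 and everything else is untouched, matching the encoding
described in the comment in relocate.c.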
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/head.S | 104 ++------------------
arch/arm64/kernel/pi/Makefile | 5 +-
arch/arm64/kernel/pi/relocate.c | 62 ++++++++++++
arch/arm64/kernel/vmlinux.lds.S | 12 ++-
5 files changed, 82 insertions(+), 104 deletions(-)
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index d95b3d6b471a..a8487749cabd 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -57,7 +57,8 @@ obj-$(CONFIG_ACPI) += acpi.o
obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
obj-$(CONFIG_PARAVIRT) += paravirt.o
-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/
+obj-$(CONFIG_RELOCATABLE) += pi/
+obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
obj-$(CONFIG_ELF_CORE) += elfcore.o
obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index cab7f91949d8..a8fa64fc30d7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -81,7 +81,7 @@
* x20 primary_entry() .. __primary_switch() CPU boot mode
* x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0
* x22 create_idmap() .. start_kernel() ID map VA of the DT blob
- * x23 primary_entry() .. start_kernel() physical misalignment/KASLR offset
+ * x23 __primary_switch() physical misalignment/KASLR offset
* x24 __primary_switch() linear map KASLR seed
* x25 primary_entry() .. start_kernel() supported VA size
* x28 create_idmap() callee preserved temp register
@@ -389,7 +389,7 @@ SYM_FUNC_START_LOCAL(create_idmap)
/* Remap the kernel page tables r/w in the ID map */
adrp x1, _text
adrp x2, init_pg_dir
- adrp x3, init_pg_end
+ adrp x3, _end
bic x4, x2, #SWAPPER_BLOCK_SIZE - 1
mov_q x5, SWAPPER_RW_MMUFLAGS
mov x6, #SWAPPER_BLOCK_SHIFT
@@ -779,97 +779,6 @@ SYM_FUNC_START_LOCAL(__no_granule_support)
b 1b
SYM_FUNC_END(__no_granule_support)
-#ifdef CONFIG_RELOCATABLE
-SYM_FUNC_START_LOCAL(__relocate_kernel)
- /*
- * Iterate over each entry in the relocation table, and apply the
- * relocations in place.
- */
- adr_l x9, __rela_start
- adr_l x10, __rela_end
- mov_q x11, KIMAGE_VADDR // default virtual offset
- add x11, x11, x23 // actual virtual offset
-
-0: cmp x9, x10
- b.hs 1f
- ldp x12, x13, [x9], #24
- ldr x14, [x9, #-8]
- cmp w13, #R_AARCH64_RELATIVE
- b.ne 0b
- add x14, x14, x23 // relocate
- str x14, [x12, x23]
- b 0b
-
-1:
-#ifdef CONFIG_RELR
- /*
- * Apply RELR relocations.
- *
- * RELR is a compressed format for storing relative relocations. The
- * encoded sequence of entries looks like:
- * [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
- *
- * i.e. start with an address, followed by any number of bitmaps. The
- * address entry encodes 1 relocation. The subsequent bitmap entries
- * encode up to 63 relocations each, at subsequent offsets following
- * the last address entry.
- *
- * The bitmap entries must have 1 in the least significant bit. The
- * assumption here is that an address cannot have 1 in lsb. Odd
- * addresses are not supported. Any odd addresses are stored in the RELA
- * section, which is handled above.
- *
- * Excluding the least significant bit in the bitmap, each non-zero
- * bit in the bitmap represents a relocation to be applied to
- * a corresponding machine word that follows the base address
- * word. The second least significant bit represents the machine
- * word immediately following the initial address, and each bit
- * that follows represents the next word, in linear order. As such,
- * a single bitmap can encode up to 63 relocations in a 64-bit object.
- *
- * In this implementation we store the address of the next RELR table
- * entry in x9, the address being relocated by the current address or
- * bitmap entry in x13 and the address being relocated by the current
- * bit in x14.
- */
- adr_l x9, __relr_start
- adr_l x10, __relr_end
-
-2: cmp x9, x10
- b.hs 7f
- ldr x11, [x9], #8
- tbnz x11, #0, 3f // branch to handle bitmaps
- add x13, x11, x23
- ldr x12, [x13] // relocate address entry
- add x12, x12, x23
- str x12, [x13], #8 // adjust to start of bitmap
- b 2b
-
-3: mov x14, x13
-4: lsr x11, x11, #1
- cbz x11, 6f
- tbz x11, #0, 5f // skip bit if not set
- ldr x12, [x14] // relocate bit
- add x12, x12, x23
- str x12, [x14]
-
-5: add x14, x14, #8 // move to next bit's address
- b 4b
-
-6: /*
- * Move to the next bitmap's address. 8 is the word size, and 63 is the
- * number of significant bits in a bitmap entry.
- */
- add x13, x13, #(8 * 63)
- b 2b
-
-7:
-#endif
- ret
-
-SYM_FUNC_END(__relocate_kernel)
-#endif
-
SYM_FUNC_START_LOCAL(__primary_switch)
adrp x1, reserved_pg_dir
adrp x2, init_idmap_pg_dir
@@ -877,11 +786,11 @@ SYM_FUNC_START_LOCAL(__primary_switch)
#ifdef CONFIG_RELOCATABLE
adrp x23, KERNEL_START
and x23, x23, MIN_KIMG_ALIGN - 1
-#ifdef CONFIG_RANDOMIZE_BASE
- mov x0, x22
- adrp x1, init_pg_end
+ adrp x1, early_init_stack
mov sp, x1
mov x29, xzr
+#ifdef CONFIG_RANDOMIZE_BASE
+ mov x0, x22
bl __pi_kaslr_early_init
and x24, x0, #SZ_2M - 1 // capture memstart offset seed
bic x0, x0, #SZ_2M - 1
@@ -894,7 +803,8 @@ SYM_FUNC_START_LOCAL(__primary_switch)
adrp x1, init_pg_dir
load_ttbr1 x1, x1, x2
#ifdef CONFIG_RELOCATABLE
- bl __relocate_kernel
+ mov x0, x23
+ bl __pi_relocate_kernel
#endif
ldr x8, =__primary_switched
adrp x0, KERNEL_START // __pa(KERNEL_START)
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index 2bbe866417d4..d084c1dcf416 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -38,5 +38,6 @@ $(obj)/lib-%.pi.o: OBJCOPYFLAGS += --prefix-alloc-sections=.init
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
-obj-y := kaslr_early.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
-extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
+obj-y := relocate.pi.o
+obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
+extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
diff --git a/arch/arm64/kernel/pi/relocate.c b/arch/arm64/kernel/pi/relocate.c
new file mode 100644
index 000000000000..1853408ea76b
--- /dev/null
+++ b/arch/arm64/kernel/pi/relocate.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2023 Google LLC
+// Authors: Ard Biesheuvel <ardb@google.com>
+// Peter Collingbourne <pcc@google.com>
+
+#include <linux/elf.h>
+#include <linux/init.h>
+#include <linux/types.h>
+
+extern const Elf64_Rela rela_start[], rela_end[];
+extern const u64 relr_start[], relr_end[];
+
+void __init relocate_kernel(u64 offset)
+{
+ u64 *place = NULL;
+
+ for (const Elf64_Rela *rela = rela_start; rela < rela_end; rela++) {
+ if (ELF64_R_TYPE(rela->r_info) != R_AARCH64_RELATIVE)
+ continue;
+ *(u64 *)(rela->r_offset + offset) = rela->r_addend + offset;
+ }
+
+ if (!IS_ENABLED(CONFIG_RELR) || !offset)
+ return;
+
+ /*
+ * Apply RELR relocations.
+ *
+ * RELR is a compressed format for storing relative relocations. The
+ * encoded sequence of entries looks like:
+ * [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
+ *
+ * i.e. start with an address, followed by any number of bitmaps. The
+ * address entry encodes 1 relocation. The subsequent bitmap entries
+ * encode up to 63 relocations each, at subsequent offsets following
+ * the last address entry.
+ *
+ * The bitmap entries must have 1 in the least significant bit. The
+ * assumption here is that an address cannot have 1 in lsb. Odd
+ * addresses are not supported. Any odd addresses are stored in the
+ * RELA section, which is handled above.
+ *
+ * With the exception of the least significant bit, each bit in the
+ * bitmap corresponds with a machine word that follows the base address
+ * word, and the bit value indicates whether or not a relocation needs
+ * to be applied to it. The second least significant bit represents the
+ * machine word immediately following the initial address, and each bit
+ * that follows represents the next word, in linear order. As such, a
+ * single bitmap can encode up to 63 relocations in a 64-bit object.
+ */
+ for (const u64 *relr = relr_start; relr < relr_end; relr++) {
+ if ((*relr & 1) == 0) {
+ place = (u64 *)(*relr + offset);
+ *place++ += offset;
+ } else {
+ for (u64 *p = place, r = *relr >> 1; r; p++, r >>= 1)
+ if (r & 1)
+ *p += offset;
+ place += 63;
+ }
+ }
+}
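As a concrete illustration of the encoding described in the comment above
(a sketch only — the real section contents are emitted by the linker, not
hand-written): three relative relocations at the unrelocated addresses
0x1000, 0x1008 and 0x1010 could be encoded as one address entry followed
by one bitmap entry:

	static const u64 example_relr[] = {
		0x1000,		/* even: relocate the word at 0x1000     */
		(1 << 1) |	/* bit 1: relocate the word at 0x1008    */
		(1 << 2) |	/* bit 2: relocate the word at 0x1010    */
		1,		/* lsb set: bitmap entry, not an address */
	};

Feeding this through relocate_kernel() adds 'offset' to the three words in
question, and then advances 'place' past the 63 words covered by the
bitmap.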
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 3cd7e76cc562..8dd5dda66f7c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -270,15 +270,15 @@ SECTIONS
HYPERVISOR_RELOC_SECTION
.rela.dyn : ALIGN(8) {
- __rela_start = .;
+ __pi_rela_start = .;
*(.rela .rela*)
- __rela_end = .;
+ __pi_rela_end = .;
}
.relr.dyn : ALIGN(8) {
- __relr_start = .;
+ __pi_relr_start = .;
*(.relr.dyn)
- __relr_end = .;
+ __pi_relr_end = .;
}
. = ALIGN(SEGMENT_ALIGN);
@@ -317,6 +317,10 @@ SECTIONS
init_pg_dir = .;
. += INIT_DIR_SIZE;
init_pg_end = .;
+#ifdef CONFIG_RELOCATABLE
+ . += SZ_4K; /* stack for the early relocation code */
+ early_init_stack = .;
+#endif
. = ALIGN(SEGMENT_ALIGN);
__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 14/39] arm64: idreg-override: Omit non-NULL checks for override pointer
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (12 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 13/39] arm64: head: move relocation handling to C code Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching Ard Biesheuvel
` (25 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Now that override pointers are always set, we can drop the various
non-NULL checks that we have in the code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 3addc09f8746..536bc33859bc 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -216,9 +216,6 @@ static void __init match_options(const char *cmdline)
for (i = 0; i < ARRAY_SIZE(regs); i++) {
int f;
- if (!regs[i]->override)
- continue;
-
for (f = 0; strlen(regs[i]->fields[f].name); f++) {
u64 shift = regs[i]->fields[f].shift;
u64 width = regs[i]->fields[f].width ?: 4;
@@ -319,10 +316,8 @@ asmlinkage void __init init_feature_override(u64 boot_status)
int i;
for (i = 0; i < ARRAY_SIZE(regs); i++) {
- if (regs[i]->override) {
- regs[i]->override->val = 0;
- regs[i]->override->mask = 0;
- }
+ regs[i]->override->val = 0;
+ regs[i]->override->mask = 0;
}
__boot_status = boot_status;
@@ -330,9 +325,8 @@ asmlinkage void __init init_feature_override(u64 boot_status)
parse_cmdline();
for (i = 0; i < ARRAY_SIZE(regs); i++) {
- if (regs[i]->override)
- dcache_clean_inval_poc((unsigned long)regs[i]->override,
- (unsigned long)regs[i]->override +
- sizeof(*regs[i]->override));
+ dcache_clean_inval_poc((unsigned long)regs[i]->override,
+ (unsigned long)regs[i]->override +
+ sizeof(*regs[i]->override));
}
}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (13 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 14/39] arm64: idreg-override: Omit non-NULL checks for override pointer Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-27 12:53 ` Marc Zyngier
2023-11-24 10:18 ` [PATCH v5 16/39] arm64: idreg-override: Avoid parameq() and parameqn() Ard Biesheuvel
` (24 subsequent siblings)
39 siblings, 1 reply; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The ID reg override handling code uses a rather elaborate data structure
that relies on statically initialized absolute address values in pointer
fields. This means that this code cannot run until relocation fixups
have been applied, which is unfortunate, because it prevents us from
discovering overrides for KASLR or LVA/LPA before creating the kernel
mapping and performing the relocations.
This can be solved by switching to place-relative relocations, which can
be applied by the linker at build time. This means some additional
arithmetic is required when dereferencing these pointers, as we can no
longer dereference the pointer members directly.
So let's implement this for idreg-override.c in a preliminary way, i.e.,
convert all the references in the code to use a special accessor that
produces the correct absolute value at runtime.
To preserve the strong type checking for the static initializers, use
union types for representing the hybrid quantities.
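As background, a minimal sketch of what the true place-relative accessor
might look like once this code moves under pi/ (the signed offset
representation and the NULL convention are assumptions for illustration,
not taken from this patch):

	typedef s64 prel64_t;	/* offset relative to the field itself */

	static void *prel64_to_pointer(const prel64_t *offset)
	{
		u64 address = (u64)offset + *offset;

		/* treat a zero offset as NULL rather than 'points at itself' */
		return *offset ? (void *)address : NULL;
	}

The linker can resolve such place-relative 64-bit quantities at build
time, so the static initializers no longer depend on runtime relocation
before their first use.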
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 98 +++++++++++++-------
1 file changed, 65 insertions(+), 33 deletions(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 536bc33859bc..4e32a44560bf 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -21,14 +21,32 @@
static u64 __boot_status __initdata;
+// temporary __prel64 related definitions
+// to be removed when this code is moved under pi/
+
+#define __prel64_initconst __initconst
+
+typedef void *prel64_t;
+
+static void *prel64_to_pointer(const prel64_t *p)
+{
+ return *p;
+}
+
struct ftr_set_desc {
char name[FTR_DESC_NAME_LEN];
- struct arm64_ftr_override *override;
+ union {
+ struct arm64_ftr_override *override;
+ prel64_t override_prel;
+ };
struct {
char name[FTR_DESC_FIELD_LEN];
u8 shift;
u8 width;
- bool (*filter)(u64 val);
+ union {
+ bool (*filter)(u64 val);
+ prel64_t filter_prel;
+ };
} fields[];
};
@@ -46,7 +64,7 @@ static bool __init mmfr1_vh_filter(u64 val)
val == 0);
}
-static const struct ftr_set_desc mmfr1 __initconst = {
+static const struct ftr_set_desc mmfr1 __prel64_initconst = {
.name = "id_aa64mmfr1",
.override = &id_aa64mmfr1_override,
.fields = {
@@ -70,7 +88,7 @@ static bool __init pfr0_sve_filter(u64 val)
return true;
}
-static const struct ftr_set_desc pfr0 __initconst = {
+static const struct ftr_set_desc pfr0 __prel64_initconst = {
.name = "id_aa64pfr0",
.override = &id_aa64pfr0_override,
.fields = {
@@ -94,7 +112,7 @@ static bool __init pfr1_sme_filter(u64 val)
return true;
}
-static const struct ftr_set_desc pfr1 __initconst = {
+static const struct ftr_set_desc pfr1 __prel64_initconst = {
.name = "id_aa64pfr1",
.override = &id_aa64pfr1_override,
.fields = {
@@ -105,7 +123,7 @@ static const struct ftr_set_desc pfr1 __initconst = {
},
};
-static const struct ftr_set_desc isar1 __initconst = {
+static const struct ftr_set_desc isar1 __prel64_initconst = {
.name = "id_aa64isar1",
.override = &id_aa64isar1_override,
.fields = {
@@ -117,7 +135,7 @@ static const struct ftr_set_desc isar1 __initconst = {
},
};
-static const struct ftr_set_desc isar2 __initconst = {
+static const struct ftr_set_desc isar2 __prel64_initconst = {
.name = "id_aa64isar2",
.override = &id_aa64isar2_override,
.fields = {
@@ -128,7 +146,7 @@ static const struct ftr_set_desc isar2 __initconst = {
},
};
-static const struct ftr_set_desc smfr0 __initconst = {
+static const struct ftr_set_desc smfr0 __prel64_initconst = {
.name = "id_aa64smfr0",
.override = &id_aa64smfr0_override,
.fields = {
@@ -149,7 +167,7 @@ static bool __init hvhe_filter(u64 val)
ID_AA64MMFR1_EL1_VH_SHIFT));
}
-static const struct ftr_set_desc sw_features __initconst = {
+static const struct ftr_set_desc sw_features __prel64_initconst = {
.name = "arm64_sw",
.override = &arm64_sw_feature_override,
.fields = {
@@ -159,14 +177,17 @@ static const struct ftr_set_desc sw_features __initconst = {
},
};
-static const struct ftr_set_desc * const regs[] __initconst = {
- &mmfr1,
- &pfr0,
- &pfr1,
- &isar1,
- &isar2,
- &smfr0,
- &sw_features,
+static const union {
+ const struct ftr_set_desc *reg;
+ prel64_t reg_prel;
+} regs[] __prel64_initconst = {
+ { .reg = &mmfr1 },
+ { .reg = &pfr0 },
+ { .reg = &pfr1 },
+ { .reg = &isar1 },
+ { .reg = &isar2 },
+ { .reg = &smfr0 },
+ { .reg = &sw_features },
};
static const struct {
@@ -214,15 +235,20 @@ static void __init match_options(const char *cmdline)
int i;
for (i = 0; i < ARRAY_SIZE(regs); i++) {
+ const struct ftr_set_desc *reg = prel64_to_pointer(®s[i].reg_prel);
+ struct arm64_ftr_override *override;
int f;
- for (f = 0; strlen(regs[i]->fields[f].name); f++) {
- u64 shift = regs[i]->fields[f].shift;
- u64 width = regs[i]->fields[f].width ?: 4;
+ override = prel64_to_pointer(®->override_prel);
+
+ for (f = 0; strlen(reg->fields[f].name); f++) {
+ u64 shift = reg->fields[f].shift;
+ u64 width = reg->fields[f].width ?: 4;
u64 mask = GENMASK_ULL(shift + width - 1, shift);
+ bool (*filter)(u64 val);
u64 v;
- if (find_field(cmdline, regs[i], f, &v))
+ if (find_field(cmdline, reg, f, &v))
continue;
/*
@@ -230,16 +256,16 @@ static void __init match_options(const char *cmdline)
* it by setting the value to the all-ones while
* clearing the mask... Yes, this is fragile.
*/
- if (regs[i]->fields[f].filter &&
- !regs[i]->fields[f].filter(v)) {
- regs[i]->override->val |= mask;
- regs[i]->override->mask &= ~mask;
+ filter = prel64_to_pointer(®->fields[f].filter_prel);
+ if (filter && !filter(v)) {
+ override->val |= mask;
+ override->mask &= ~mask;
continue;
}
- regs[i]->override->val &= ~mask;
- regs[i]->override->val |= (v << shift) & mask;
- regs[i]->override->mask |= mask;
+ override->val &= ~mask;
+ override->val |= (v << shift) & mask;
+ override->mask |= mask;
return;
}
@@ -313,11 +339,16 @@ void init_feature_override(u64 boot_status);
asmlinkage void __init init_feature_override(u64 boot_status)
{
+ struct arm64_ftr_override *override;
+ const struct ftr_set_desc *reg;
int i;
for (i = 0; i < ARRAY_SIZE(regs); i++) {
- regs[i]->override->val = 0;
- regs[i]->override->mask = 0;
+ reg = prel64_to_pointer(®s[i].reg_prel);
+ override = prel64_to_pointer(®->override_prel);
+
+ override->val = 0;
+ override->mask = 0;
}
__boot_status = boot_status;
@@ -325,8 +356,9 @@ asmlinkage void __init init_feature_override(u64 boot_status)
parse_cmdline();
for (i = 0; i < ARRAY_SIZE(regs); i++) {
- dcache_clean_inval_poc((unsigned long)regs[i]->override,
- (unsigned long)regs[i]->override +
- sizeof(*regs[i]->override));
+ reg = prel64_to_pointer(®s[i].reg_prel);
+ override = prel64_to_pointer(®->override_prel);
+ dcache_clean_inval_poc((unsigned long)override,
+ (unsigned long)(override + 1));
}
}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 16/39] arm64: idreg-override: Avoid parameq() and parameqn()
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (14 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 17/39] arm64: idreg-override: avoid strlen() to check for empty strings Ard Biesheuvel
` (23 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The only way parameq() and parameqn() deviate from the ordinary string
and memory routines is that they ignore the difference between dashes
and underscores.
Since we copy each command line argument into a buffer before passing it
to parameq() and parameqn() numerous times, let's just convert all dashes
to underscores once, and update the alias array accordingly.
This also helps reduce the dependency on kernel APIs that are no longer
available once we move this code into the early mini C runtime.
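As a hedged illustration of the effect (not part of the patch), both
spellings of an alias now collapse to a single canonical form in buf[]
before any comparison is made:

	/* command line token     ->  contents of buf[]      */
	/* "kvm-arm.mode=nvhe"    ->  "kvm_arm.mode=nvhe"     */
	/* "kvm_arm.mode=nvhe"    ->  "kvm_arm.mode=nvhe"     */

which is why the alias table switches to underscores, and why a plain
memcmp() over len + 1 bytes (i.e., including the terminating NUL) is now
sufficient to match an alias exactly.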
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 28 ++++++++++++--------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 4e32a44560bf..cb4de31071d4 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -194,8 +194,8 @@ static const struct {
char alias[FTR_ALIAS_NAME_LEN];
char feature[FTR_ALIAS_OPTION_LEN];
} aliases[] __initconst = {
- { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" },
- { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" },
+ { "kvm_arm.mode=nvhe", "id_aa64mmfr1.vh=0" },
+ { "kvm_arm.mode=protected", "id_aa64mmfr1.vh=0" },
{ "arm64.nosve", "id_aa64pfr0.sve=0" },
{ "arm64.nosme", "id_aa64pfr1.sme=0" },
{ "arm64.nobti", "id_aa64pfr1.bt=0" },
@@ -224,7 +224,7 @@ static int __init find_field(const char *cmdline,
len = snprintf(opt, ARRAY_SIZE(opt), "%s.%s=",
reg->name, reg->fields[f].name);
- if (!parameqn(cmdline, opt, len))
+ if (memcmp(cmdline, opt, len))
return -1;
return kstrtou64(cmdline + len, 0, v);
@@ -281,23 +281,29 @@ static __init void __parse_cmdline(const char *cmdline, bool parse_aliases)
cmdline = skip_spaces(cmdline);
- for (len = 0; cmdline[len] && !isspace(cmdline[len]); len++);
- if (!len)
+ /* terminate on "--" appearing on the command line by itself */
+ if (cmdline[0] == '-' && cmdline[1] == '-' && isspace(cmdline[2]))
return;
- len = min(len, ARRAY_SIZE(buf) - 1);
- memcpy(buf, cmdline, len);
- buf[len] = '\0';
-
- if (strcmp(buf, "--") == 0)
+ for (len = 0; cmdline[len] && !isspace(cmdline[len]); len++) {
+ if (len >= sizeof(buf) - 1)
+ break;
+ if (cmdline[len] == '-')
+ buf[len] = '_';
+ else
+ buf[len] = cmdline[len];
+ }
+ if (!len)
return;
+ buf[len] = 0;
+
cmdline += len;
match_options(buf);
for (i = 0; parse_aliases && i < ARRAY_SIZE(aliases); i++)
- if (parameq(buf, aliases[i].alias))
+ if (!memcmp(buf, aliases[i].alias, len + 1))
__parse_cmdline(aliases[i].feature, false);
} while (1);
}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 17/39] arm64: idreg-override: avoid strlen() to check for empty strings
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (15 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 16/39] arm64: idreg-override: Avoid parameq() and parameqn() Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 18/39] arm64: idreg-override: Avoid sprintf() for simple string concatenation Ard Biesheuvel
` (22 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
strlen() is a costly way to decide whether a string is empty: in that
case, the first character will be NUL, so we can check for that directly.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index cb4de31071d4..96fec5b7a65e 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -241,7 +241,7 @@ static void __init match_options(const char *cmdline)
override = prel64_to_pointer(®->override_prel);
- for (f = 0; strlen(reg->fields[f].name); f++) {
+ for (f = 0; reg->fields[f].name[0] != '\0'; f++) {
u64 shift = reg->fields[f].shift;
u64 width = reg->fields[f].width ?: 4;
u64 mask = GENMASK_ULL(shift + width - 1, shift);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 18/39] arm64: idreg-override: Avoid sprintf() for simple string concatenation
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (16 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 17/39] arm64: idreg-override: avoid strlen() to check for empty strings Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 19/39] arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit Ard Biesheuvel
` (21 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Instead of using sprintf() with the "%s.%s=" format, where the first
string argument is always the same in the inner loop of match_options(),
use simple memcpy() for string concatenation, and move the first copy to
the outer loop. This removes the dependency on sprintf(), which will be
difficult to satisfy when we move this code into the early mini C
runtime.
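A fragment sketch of the resulting construction, using the
id_aa64mmfr1.vh override as an example (offsets are illustrative and
assume a sufficiently large opt[] buffer):

	/* outer loop, once per register description: */
	memcpy(opt, "id_aa64mmfr1", 12);
	opt[12] = '.';			/* "id_aa64mmfr1."    */

	/* inner loop, once per field:                */
	memcpy(opt + 13, "vh", 2);
	opt[15] = '=';			/* "id_aa64mmfr1.vh=" */

find_field() can then match this prefix against the command line token
with a single memcmp() over the accumulated length.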
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 96fec5b7a65e..41eeca857b32 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -215,14 +215,15 @@ static int __init parse_nokaslr(char *unused)
}
early_param("nokaslr", parse_nokaslr);
-static int __init find_field(const char *cmdline,
+static int __init find_field(const char *cmdline, char *opt, int len,
const struct ftr_set_desc *reg, int f, u64 *v)
{
- char opt[FTR_DESC_NAME_LEN + FTR_DESC_FIELD_LEN + 2];
- int len;
+ int flen = strlen(reg->fields[f].name);
- len = snprintf(opt, ARRAY_SIZE(opt), "%s.%s=",
- reg->name, reg->fields[f].name);
+ // append '<fieldname>=' to obtain '<name>.<fieldname>='
+ memcpy(opt + len, reg->fields[f].name, flen);
+ len += flen;
+ opt[len++] = '=';
if (memcmp(cmdline, opt, len))
return -1;
@@ -232,15 +233,21 @@ static int __init find_field(const char *cmdline,
static void __init match_options(const char *cmdline)
{
+ char opt[FTR_DESC_NAME_LEN + FTR_DESC_FIELD_LEN + 2];
int i;
for (i = 0; i < ARRAY_SIZE(regs); i++) {
const struct ftr_set_desc *reg = prel64_to_pointer(®s[i].reg_prel);
struct arm64_ftr_override *override;
+ int len = strlen(reg->name);
int f;
override = prel64_to_pointer(®->override_prel);
+ // set opt[] to '<name>.'
+ memcpy(opt, reg->name, len);
+ opt[len++] = '.';
+
for (f = 0; reg->fields[f].name[0] != '\0'; f++) {
u64 shift = reg->fields[f].shift;
u64 width = reg->fields[f].width ?: 4;
@@ -248,7 +255,7 @@ static void __init match_options(const char *cmdline)
bool (*filter)(u64 val);
u64 v;
- if (find_field(cmdline, reg, f, &v))
+ if (find_field(cmdline, opt, len, reg, f, &v))
continue;
/*
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 19/39] arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (17 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 18/39] arm64: idreg-override: Avoid sprintf() for simple string concatenation Ard Biesheuvel
@ 2023-11-24 10:18 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 20/39] arm64: idreg-override: Move to early mini C runtime Ard Biesheuvel
` (20 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:18 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
All ID register value overrides are =0, with the exception of the nokaslr
pseudo feature, which uses =1. In order to remove the dependency on
kstrtou64(), which is part of the core kernel and no longer usable once
we move idreg-override into the early mini C runtime, let's just parse a
single hex digit (with optional leading 0x) and set the output value
accordingly.
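A hedged sketch of the forms the new parser accepts (return values shown
as comments, based on the parse_hexdigit() logic introduced below):

	u64 v;

	parse_hexdigit("1", &v);	/* 0, v == 0x1                  */
	parse_hexdigit("0xf", &v);	/* 0, "0x" is skipped, v == 0xf */
	parse_hexdigit("1f", &v);	/* -EINVAL: more than one digit */
	parse_hexdigit("g", &v);	/* -EINVAL: not a hex digit     */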
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/idreg-override.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 41eeca857b32..2eb81795934a 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -215,6 +215,20 @@ static int __init parse_nokaslr(char *unused)
}
early_param("nokaslr", parse_nokaslr);
+static int __init parse_hexdigit(const char *p, u64 *v)
+{
+ // skip "0x" if it comes next
+ if (p[0] == '0' && tolower(p[1]) == 'x')
+ p += 2;
+
+ // check whether the RHS is a single hex digit
+ if (!isxdigit(p[0]) || (p[1] && !isspace(p[1])))
+ return -EINVAL;
+
+ *v = tolower(*p) - (isdigit(*p) ? '0' : 'a' - 10);
+ return 0;
+}
+
static int __init find_field(const char *cmdline, char *opt, int len,
const struct ftr_set_desc *reg, int f, u64 *v)
{
@@ -228,7 +242,7 @@ static int __init find_field(const char *cmdline, char *opt, int len,
if (memcmp(cmdline, opt, len))
return -1;
- return kstrtou64(cmdline + len, 0, v);
+ return parse_hexdigit(cmdline + len, v);
}
static void __init match_options(const char *cmdline)
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 20/39] arm64: idreg-override: Move to early mini C runtime
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (18 preceding siblings ...)
2023-11-24 10:18 ` [PATCH v5 19/39] arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 21/39] arm64: kernel: Remove early fdt remap code Ard Biesheuvel
` (19 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We will want to parse the ID register overrides even earlier, so that we
can take them into account before creating the kernel mapping. So
migrate the code and make it work in the context of the early C runtime.
We will move the invocation to an earlier stage in a subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/Makefile | 4 +--
arch/arm64/kernel/head.S | 5 ++-
arch/arm64/kernel/image-vars.h | 9 +++++
arch/arm64/kernel/pi/Makefile | 5 +--
arch/arm64/kernel/{ => pi}/idreg-override.c | 38 ++++++++------------
5 files changed, 30 insertions(+), 31 deletions(-)
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index a8487749cabd..dc85dc2ee4ed 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -33,8 +33,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
return_address.o cpuinfo.o cpu_errata.o \
cpufeature.o alternative.o cacheinfo.o \
smp.o smp_spin_table.o topology.o smccc-call.o \
- syscall.o proton-pack.o idreg-override.o idle.o \
- patching.o
+ syscall.o proton-pack.o idle.o patching.o pi/
obj-$(CONFIG_COMPAT) += sys32.o signal32.o \
sys_compat.o
@@ -57,7 +56,6 @@ obj-$(CONFIG_ACPI) += acpi.o
obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
obj-$(CONFIG_PARAVIRT) += paravirt.o
-obj-$(CONFIG_RELOCATABLE) += pi/
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
obj-$(CONFIG_ELF_CORE) += elfcore.o
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a8fa64fc30d7..ca5e5fbefcd3 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -510,10 +510,9 @@ SYM_FUNC_START_LOCAL(__primary_switched)
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
bl kasan_early_init
#endif
- mov x0, x21 // pass FDT address in x0
- bl early_fdt_map // Try mapping the FDT early
mov x0, x20 // pass the full boot status
- bl init_feature_override // Parse cpu feature overrides
+ mov x1, x22 // pass the low FDT mapping
+ bl __pi_init_feature_override // Parse cpu feature overrides
#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
bl scs_patch_vmlinux
#endif
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index e931ce078a00..eacc3d167733 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -37,6 +37,15 @@ PROVIDE(__pi___memmove = __pi_memmove);
PROVIDE(__pi___memset = __pi_memset);
PROVIDE(__pi_vabits_actual = vabits_actual);
+PROVIDE(__pi_id_aa64isar1_override = id_aa64isar1_override);
+PROVIDE(__pi_id_aa64isar2_override = id_aa64isar2_override);
+PROVIDE(__pi_id_aa64mmfr1_override = id_aa64mmfr1_override);
+PROVIDE(__pi_id_aa64pfr0_override = id_aa64pfr0_override);
+PROVIDE(__pi_id_aa64pfr1_override = id_aa64pfr1_override);
+PROVIDE(__pi_id_aa64smfr0_override = id_aa64smfr0_override);
+PROVIDE(__pi_id_aa64zfr0_override = id_aa64zfr0_override);
+PROVIDE(__pi_arm64_sw_feature_override = arm64_sw_feature_override);
+PROVIDE(__pi__ctype = _ctype);
#ifdef CONFIG_KVM
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index d084c1dcf416..7f6dfce893c3 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -38,6 +38,7 @@ $(obj)/lib-%.pi.o: OBJCOPYFLAGS += --prefix-alloc-sections=.init
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
-obj-y := relocate.pi.o
-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
+obj-y := idreg-override.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
+obj-$(CONFIG_RELOCATABLE) += relocate.pi.o
+obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o
extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
similarity index 94%
rename from arch/arm64/kernel/idreg-override.c
rename to arch/arm64/kernel/pi/idreg-override.c
index 2eb81795934a..bcba0ce71af0 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -14,6 +14,8 @@
#include <asm/cpufeature.h>
#include <asm/setup.h>
+#include "pi.h"
+
#define FTR_DESC_NAME_LEN 20
#define FTR_DESC_FIELD_LEN 10
#define FTR_ALIAS_NAME_LEN 30
@@ -21,18 +23,6 @@
static u64 __boot_status __initdata;
-// temporary __prel64 related definitions
-// to be removed when this code is moved under pi/
-
-#define __prel64_initconst __initconst
-
-typedef void *prel64_t;
-
-static void *prel64_to_pointer(const prel64_t *p)
-{
- return *p;
-}
-
struct ftr_set_desc {
char name[FTR_DESC_NAME_LEN];
union {
@@ -329,16 +319,11 @@ static __init void __parse_cmdline(const char *cmdline, bool parse_aliases)
} while (1);
}
-static __init const u8 *get_bootargs_cmdline(void)
+static __init const u8 *get_bootargs_cmdline(const void *fdt)
{
const u8 *prop;
- void *fdt;
int node;
- fdt = get_early_fdt_ptr();
- if (!fdt)
- return NULL;
-
node = fdt_path_offset(fdt, "/chosen");
if (node < 0)
return NULL;
@@ -350,9 +335,9 @@ static __init const u8 *get_bootargs_cmdline(void)
return strlen(prop) ? prop : NULL;
}
-static __init void parse_cmdline(void)
+static __init void parse_cmdline(const void *fdt)
{
- const u8 *prop = get_bootargs_cmdline();
+ const u8 *prop = get_bootargs_cmdline(fdt);
if (IS_ENABLED(CONFIG_CMDLINE_FORCE) || !prop)
__parse_cmdline(CONFIG_CMDLINE, true);
@@ -362,9 +347,9 @@ static __init void parse_cmdline(void)
}
/* Keep checkers quiet */
-void init_feature_override(u64 boot_status);
+void init_feature_override(u64 boot_status, const void *fdt);
-asmlinkage void __init init_feature_override(u64 boot_status)
+asmlinkage void __init init_feature_override(u64 boot_status, const void *fdt)
{
struct arm64_ftr_override *override;
const struct ftr_set_desc *reg;
@@ -380,7 +365,7 @@ asmlinkage void __init init_feature_override(u64 boot_status)
__boot_status = boot_status;
- parse_cmdline();
+ parse_cmdline(fdt);
for (i = 0; i < ARRAY_SIZE(regs); i++) {
reg = prel64_to_pointer(®s[i].reg_prel);
@@ -389,3 +374,10 @@ asmlinkage void __init init_feature_override(u64 boot_status)
(unsigned long)(override + 1));
}
}
+
+char * __init skip_spaces(const char *str)
+{
+ while (isspace(*str))
+ ++str;
+ return (char *)str;
+}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 21/39] arm64: kernel: Remove early fdt remap code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (19 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 20/39] arm64: idreg-override: Move to early mini C runtime Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 22/39] arm64: head: Clear BSS and the kernel page tables in one go Ard Biesheuvel
` (18 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The early FDT remap code is no longer used, so let's drop it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/setup.h | 3 ---
arch/arm64/kernel/setup.c | 15 ---------------
2 files changed, 18 deletions(-)
diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h
index f4af547ef54c..acc5e00bf3b0 100644
--- a/arch/arm64/include/asm/setup.h
+++ b/arch/arm64/include/asm/setup.h
@@ -7,9 +7,6 @@
#include <uapi/asm/setup.h>
-void *get_early_fdt_ptr(void);
-void early_fdt_map(u64 dt_phys);
-
/*
* These two variables are used in the head.S file.
*/
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 417a8a86b2db..4b0b3515ee20 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -166,21 +166,6 @@ static void __init smp_build_mpidr_hash(void)
pr_warn("Large number of MPIDR hash buckets detected\n");
}
-static void *early_fdt_ptr __initdata;
-
-void __init *get_early_fdt_ptr(void)
-{
- return early_fdt_ptr;
-}
-
-asmlinkage void __init early_fdt_map(u64 dt_phys)
-{
- int fdt_size;
-
- early_fixmap_init();
- early_fdt_ptr = fixmap_remap_fdt(dt_phys, &fdt_size, PAGE_KERNEL);
-}
-
static void __init setup_machine_fdt(phys_addr_t dt_phys)
{
int size;
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 22/39] arm64: head: Clear BSS and the kernel page tables in one go
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (20 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 21/39] arm64: kernel: Remove early fdt remap code Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 23/39] arm64: Move feature overrides into the BSS section Ard Biesheuvel
` (17 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
We will move the CPU feature overrides into BSS in a subsequent patch,
and this requires that BSS is zeroed before the feature override
detection code runs. So let's map BSS read-write in the ID map, and zero
it via this mapping.
Since the kernel page tables are right next to it, and are also zeroed
via the ID map, let's drop the separate clear_page_tables() function, and
just zero everything in one go.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 33 +++++++-------------
arch/arm64/kernel/vmlinux.lds.S | 3 ++
2 files changed, 14 insertions(+), 22 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ca5e5fbefcd3..2af518161f3a 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -177,17 +177,6 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
ret
SYM_CODE_END(preserve_boot_args)
-SYM_FUNC_START_LOCAL(clear_page_tables)
- /*
- * Clear the init page tables.
- */
- adrp x0, init_pg_dir
- adrp x1, init_pg_end
- sub x2, x1, x0
- mov x1, xzr
- b __pi_memset // tail call
-SYM_FUNC_END(clear_page_tables)
-
/*
* Macro to populate page table entries, these entries can be pointers to the next level
* or last level entries pointing to physical memory.
@@ -386,9 +375,9 @@ SYM_FUNC_START_LOCAL(create_idmap)
map_memory x0, x1, x3, x6, x7, x3, IDMAP_PGD_ORDER, x10, x11, x12, x13, x14, EXTRA_SHIFT
- /* Remap the kernel page tables r/w in the ID map */
+ /* Remap BSS and the kernel page tables r/w in the ID map */
adrp x1, _text
- adrp x2, init_pg_dir
+ adrp x2, __bss_start
adrp x3, _end
bic x4, x2, #SWAPPER_BLOCK_SIZE - 1
mov_q x5, SWAPPER_RW_MMUFLAGS
@@ -489,14 +478,6 @@ SYM_FUNC_START_LOCAL(__primary_switched)
mov x0, x20
bl set_cpu_boot_mode_flag
- // Clear BSS
- adr_l x0, __bss_start
- mov x1, xzr
- adr_l x2, __bss_stop
- sub x2, x2, x0
- bl __pi_memset
- dsb ishst // Make zero page visible to PTW
-
#if VA_BITS > 48
adr_l x8, vabits_actual // Set this early so KASAN early init
str x25, [x8] // ... observes the correct value
@@ -782,6 +763,15 @@ SYM_FUNC_START_LOCAL(__primary_switch)
adrp x1, reserved_pg_dir
adrp x2, init_idmap_pg_dir
bl __enable_mmu
+
+ // Clear BSS
+ adrp x0, __bss_start
+ mov x1, xzr
+ adrp x2, init_pg_end
+ sub x2, x2, x0
+ bl __pi_memset
+ dsb ishst // Make zero page visible to PTW
+
#ifdef CONFIG_RELOCATABLE
adrp x23, KERNEL_START
and x23, x23, MIN_KIMG_ALIGN - 1
@@ -796,7 +786,6 @@ SYM_FUNC_START_LOCAL(__primary_switch)
orr x23, x23, x0 // record kernel offset
#endif
#endif
- bl clear_page_tables
bl create_kernel_mapping
adrp x1, init_pg_dir
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 8dd5dda66f7c..8a3c6aacc355 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -311,12 +311,15 @@ SECTIONS
__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
_edata = .;
+ /* start of zero-init region */
BSS_SECTION(SBSS_ALIGN, 0, 0)
. = ALIGN(PAGE_SIZE);
init_pg_dir = .;
. += INIT_DIR_SIZE;
init_pg_end = .;
+ /* end of zero-init region */
+
#ifdef CONFIG_RELOCATABLE
. += SZ_4K; /* stack for the early relocation code */
early_init_stack = .;
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 23/39] arm64: Move feature overrides into the BSS section
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (21 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 22/39] arm64: head: Clear BSS and the kernel page tables in one go Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 24/39] arm64: head: Run feature override detection before mapping the kernel Ard Biesheuvel
` (16 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
In order to allow the CPU feature override detection code to run even
earlier, move the feature override global variables into BSS, which is
the only part of the static kernel image that is mapped read-write in
the initial ID map.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/cpufeature.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 646591c67e7a..b95374387946 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -655,13 +655,13 @@ static const struct arm64_ftr_bits ftr_raz[] = {
#define ARM64_FTR_REG(id, table) \
__ARM64_FTR_REG_OVERRIDE(#id, id, table, &no_override)
-struct arm64_ftr_override __ro_after_init id_aa64mmfr1_override;
-struct arm64_ftr_override __ro_after_init id_aa64pfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64pfr1_override;
-struct arm64_ftr_override __ro_after_init id_aa64zfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64smfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64isar1_override;
-struct arm64_ftr_override __ro_after_init id_aa64isar2_override;
+struct arm64_ftr_override id_aa64mmfr1_override;
+struct arm64_ftr_override id_aa64pfr0_override;
+struct arm64_ftr_override id_aa64pfr1_override;
+struct arm64_ftr_override id_aa64zfr0_override;
+struct arm64_ftr_override id_aa64smfr0_override;
+struct arm64_ftr_override id_aa64isar1_override;
+struct arm64_ftr_override id_aa64isar2_override;
struct arm64_ftr_override arm64_sw_feature_override;
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 24/39] arm64: head: Run feature override detection before mapping the kernel
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (22 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 23/39] arm64: Move feature overrides into the BSS section Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 25/39] arm64: head: move dynamic shadow call stack patching into early C runtime Ard Biesheuvel
` (15 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
To permit the feature overrides to be taken into account before the
KASLR init code runs and the kernel mapping is created, move the
detection code to an earlier stage in the boot.
In a subsequent patch, this will be taken advantage of by merging the
preliminary and permanent mappings of the kernel text and data into a
single one that gets created and relocated before start_kernel() is
called.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 17 +++++++++--------
arch/arm64/kernel/vmlinux.lds.S | 4 +---
2 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2af518161f3a..865ecc1f8255 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -375,9 +375,9 @@ SYM_FUNC_START_LOCAL(create_idmap)
map_memory x0, x1, x3, x6, x7, x3, IDMAP_PGD_ORDER, x10, x11, x12, x13, x14, EXTRA_SHIFT
- /* Remap BSS and the kernel page tables r/w in the ID map */
+ /* Remap [.init].data, BSS and the kernel page tables r/w in the ID map */
adrp x1, _text
- adrp x2, __bss_start
+ adrp x2, __initdata_begin
adrp x3, _end
bic x4, x2, #SWAPPER_BLOCK_SIZE - 1
mov_q x5, SWAPPER_RW_MMUFLAGS
@@ -491,9 +491,6 @@ SYM_FUNC_START_LOCAL(__primary_switched)
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
bl kasan_early_init
#endif
- mov x0, x20 // pass the full boot status
- mov x1, x22 // pass the low FDT mapping
- bl __pi_init_feature_override // Parse cpu feature overrides
#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
bl scs_patch_vmlinux
#endif
@@ -772,12 +769,16 @@ SYM_FUNC_START_LOCAL(__primary_switch)
bl __pi_memset
dsb ishst // Make zero page visible to PTW
-#ifdef CONFIG_RELOCATABLE
- adrp x23, KERNEL_START
- and x23, x23, MIN_KIMG_ALIGN - 1
adrp x1, early_init_stack
mov sp, x1
mov x29, xzr
+ mov x0, x20 // pass the full boot status
+ mov x1, x22 // pass the low FDT mapping
+ bl __pi_init_feature_override // Parse cpu feature overrides
+
+#ifdef CONFIG_RELOCATABLE
+ adrp x23, KERNEL_START
+ and x23, x23, MIN_KIMG_ALIGN - 1
#ifdef CONFIG_RANDOMIZE_BASE
mov x0, x22
bl __pi_kaslr_early_init
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 8a3c6aacc355..3afb4223a5e8 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -320,10 +320,8 @@ SECTIONS
init_pg_end = .;
/* end of zero-init region */
-#ifdef CONFIG_RELOCATABLE
- . += SZ_4K; /* stack for the early relocation code */
+ . += SZ_4K; /* stack for the early C runtime */
early_init_stack = .;
-#endif
. = ALIGN(SEGMENT_ALIGN);
__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 25/39] arm64: head: move dynamic shadow call stack patching into early C runtime
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (23 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 24/39] arm64: head: Run feature override detection before mapping the kernel Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 26/39] arm64: kaslr: Use feature override instead of parsing the cmdline again Ard Biesheuvel
` (14 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Once we update the early kernel mapping code to only map the kernel once
with the right permissions, we can no longer perform code patching via
this mapping.
So move this code to an earlier stage of the boot, right after applying
the relocations.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/scs.h | 4 +--
arch/arm64/kernel/Makefile | 2 --
arch/arm64/kernel/head.S | 8 +++---
arch/arm64/kernel/module.c | 2 +-
arch/arm64/kernel/pi/Makefile | 10 +++++---
arch/arm64/kernel/{ => pi}/patch-scs.c | 26 ++++++++++----------
6 files changed, 27 insertions(+), 25 deletions(-)
diff --git a/arch/arm64/include/asm/scs.h b/arch/arm64/include/asm/scs.h
index 3fdae5fe3142..eca2ba5a6276 100644
--- a/arch/arm64/include/asm/scs.h
+++ b/arch/arm64/include/asm/scs.h
@@ -72,8 +72,8 @@ static inline void dynamic_scs_init(void)
static inline void dynamic_scs_init(void) {}
#endif
-int scs_patch(const u8 eh_frame[], int size);
-asmlinkage void scs_patch_vmlinux(void);
+int __pi_scs_patch(const u8 eh_frame[], int size);
+asmlinkage void __pi_scs_patch_vmlinux(void);
#endif /* __ASSEMBLY __ */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index dc85dc2ee4ed..14b4a179bad3 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -71,8 +71,6 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-y += vdso-wrap.o
obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
-obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS) += patch-scs.o
-CFLAGS_patch-scs.o += -mbranch-protection=none
# Force dependency (vdso*-wrap.S includes vdso.so through incbin)
$(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 865ecc1f8255..b320702032a7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -490,9 +490,6 @@ SYM_FUNC_START_LOCAL(__primary_switched)
#endif
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
bl kasan_early_init
-#endif
-#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
- bl scs_patch_vmlinux
#endif
mov x0, x20
bl finalise_el2 // Prefer VHE if possible
@@ -794,6 +791,11 @@ SYM_FUNC_START_LOCAL(__primary_switch)
#ifdef CONFIG_RELOCATABLE
mov x0, x23
bl __pi_relocate_kernel
+#endif
+#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
+ ldr x0, =__eh_frame_start
+ ldr x1, =__eh_frame_end
+ bl __pi_scs_patch_vmlinux
#endif
ldr x8, =__primary_switched
adrp x0, KERNEL_START // __pa(KERNEL_START)
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd851297596e..47e0be610bb6 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -595,7 +595,7 @@ int module_finalize(const Elf_Ehdr *hdr,
if (scs_is_dynamic()) {
s = find_section(hdr, sechdrs, ".init.eh_frame");
if (s)
- scs_patch((void *)s->sh_addr, s->sh_size);
+ __pi_scs_patch((void *)s->sh_addr, s->sh_size);
}
return module_init_ftrace_plt(hdr, sechdrs, me);
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index 7f6dfce893c3..a8b302245f15 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -38,7 +38,9 @@ $(obj)/lib-%.pi.o: OBJCOPYFLAGS += --prefix-alloc-sections=.init
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
-obj-y := idreg-override.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
-obj-$(CONFIG_RELOCATABLE) += relocate.pi.o
-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o
-extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
+obj-y := idreg-override.pi.o \
+ lib-fdt.pi.o lib-fdt_ro.pi.o
+obj-$(CONFIG_RELOCATABLE) += relocate.pi.o
+obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o
+obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS) += patch-scs.pi.o
+extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
diff --git a/arch/arm64/kernel/patch-scs.c b/arch/arm64/kernel/pi/patch-scs.c
similarity index 91%
rename from arch/arm64/kernel/patch-scs.c
rename to arch/arm64/kernel/pi/patch-scs.c
index a1fe4b4ff591..c65ef40d1e6b 100644
--- a/arch/arm64/kernel/patch-scs.c
+++ b/arch/arm64/kernel/pi/patch-scs.c
@@ -4,14 +4,11 @@
* Author: Ard Biesheuvel <ardb@google.com>
*/
-#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/linkage.h>
-#include <linux/printk.h>
#include <linux/types.h>
-#include <asm/cacheflush.h>
#include <asm/scs.h>
//
@@ -81,7 +78,11 @@ static void __always_inline scs_patch_loc(u64 loc)
*/
return;
}
- dcache_clean_pou(loc, loc + sizeof(u32));
+ if (IS_ENABLED(CONFIG_ARM64_WORKAROUND_CLEAN_CACHE))
+ asm("dc civac, %0" :: "r"(loc));
+ else
+ asm(ALTERNATIVE("dc cvau, %0", "nop", ARM64_HAS_CACHE_IDC)
+ :: "r"(loc));
}
/*
@@ -128,10 +129,10 @@ struct eh_frame {
};
};
-static int noinstr scs_handle_fde_frame(const struct eh_frame *frame,
- bool fde_has_augmentation_data,
- int code_alignment_factor,
- bool dry_run)
+static int scs_handle_fde_frame(const struct eh_frame *frame,
+ bool fde_has_augmentation_data,
+ int code_alignment_factor,
+ bool dry_run)
{
int size = frame->size - offsetof(struct eh_frame, opcodes) + 4;
u64 loc = (u64)offset_to_ptr(&frame->initial_loc);
@@ -198,14 +199,13 @@ static int noinstr scs_handle_fde_frame(const struct eh_frame *frame,
break;
default:
- pr_err("unhandled opcode: %02x in FDE frame %lx\n", opcode[-1], (uintptr_t)frame);
return -ENOEXEC;
}
}
return 0;
}
-int noinstr scs_patch(const u8 eh_frame[], int size)
+int scs_patch(const u8 eh_frame[], int size)
{
const u8 *p = eh_frame;
@@ -251,12 +251,12 @@ int noinstr scs_patch(const u8 eh_frame[], int size)
return 0;
}
-asmlinkage void __init scs_patch_vmlinux(void)
+asmlinkage void __init scs_patch_vmlinux(const u8 start[], const u8 end[])
{
if (!should_patch_pac_into_scs())
return;
- WARN_ON(scs_patch(__eh_frame_start, __eh_frame_end - __eh_frame_start));
- icache_inval_all_pou();
+ scs_patch(start, end - start);
+ asm("ic ialluis");
isb();
}
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 26/39] arm64: kaslr: Use feature override instead of parsing the cmdline again
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (24 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 25/39] arm64: head: move dynamic shadow call stack patching into early C runtime Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 27/39] arm64/kernel: Move 'nokaslr' parsing out of early idreg code Ard Biesheuvel
` (13 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The early kaslr code open-codes the detection of 'nokaslr' on the kernel
command line, and this is no longer necessary now that the feature
detection code, which also looks for the same string, executes before
this code.
Note that the pseudo-feature's mask can be disregarded: it is used for
true CPU features to mask the CPU feature register, not the value of the
override.
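To make that distinction concrete, a rough sketch (illustrative, not a
quote of the cpufeature code):

	/*
	 * For a real ID register, an override is applied roughly as
	 *
	 *	val = (reg & ~override->mask) | (override->val & override->mask);
	 *
	 * arm64_sw_feature_override has no backing hardware register, so
	 * the nokaslr field can simply be read from override->val, which
	 * is what the new kaslr_disabled_cmdline() helper does.
	 */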
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/cpufeature.h | 8 +++
arch/arm64/kernel/kaslr.c | 4 +-
arch/arm64/kernel/pi/kaslr_early.c | 53 +-------------------
3 files changed, 10 insertions(+), 55 deletions(-)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f6d416fe49b0..77ed3b28cbc6 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -909,6 +909,14 @@ extern struct arm64_ftr_override id_aa64isar2_override;
extern struct arm64_ftr_override arm64_sw_feature_override;
+static inline bool kaslr_disabled_cmdline(void)
+{
+ if (cpuid_feature_extract_unsigned_field(arm64_sw_feature_override.val,
+ ARM64_SW_FEATURE_OVERRIDE_NOKASLR))
+ return true;
+ return false;
+}
+
u32 get_kvm_ipa_limit(void);
void dump_cpu_features(void);
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index 94a269cd1f07..efbeb8356769 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -16,9 +16,7 @@ bool __ro_after_init __kaslr_is_enabled = false;
void __init kaslr_init(void)
{
- if (cpuid_feature_extract_unsigned_field(arm64_sw_feature_override.val &
- arm64_sw_feature_override.mask,
- ARM64_SW_FEATURE_OVERRIDE_NOKASLR)) {
+ if (kaslr_disabled_cmdline()) {
pr_info("KASLR disabled on command line\n");
return;
}
diff --git a/arch/arm64/kernel/pi/kaslr_early.c b/arch/arm64/kernel/pi/kaslr_early.c
index 167081b30a15..f2305e276ec3 100644
--- a/arch/arm64/kernel/pi/kaslr_early.c
+++ b/arch/arm64/kernel/pi/kaslr_early.c
@@ -16,57 +16,6 @@
#include <asm/memory.h>
#include <asm/pgtable.h>
-/* taken from lib/string.c */
-static char *__init __strstr(const char *s1, const char *s2)
-{
- size_t l1, l2;
-
- l2 = strlen(s2);
- if (!l2)
- return (char *)s1;
- l1 = strlen(s1);
- while (l1 >= l2) {
- l1--;
- if (!memcmp(s1, s2, l2))
- return (char *)s1;
- s1++;
- }
- return NULL;
-}
-static bool __init cmdline_contains_nokaslr(const u8 *cmdline)
-{
- const u8 *str;
-
- str = __strstr(cmdline, "nokaslr");
- return str == cmdline || (str > cmdline && *(str - 1) == ' ');
-}
-
-static bool __init is_kaslr_disabled_cmdline(void *fdt)
-{
- if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
- int node;
- const u8 *prop;
-
- node = fdt_path_offset(fdt, "/chosen");
- if (node < 0)
- goto out;
-
- prop = fdt_getprop(fdt, node, "bootargs", NULL);
- if (!prop)
- goto out;
-
- if (cmdline_contains_nokaslr(prop))
- return true;
-
- if (IS_ENABLED(CONFIG_CMDLINE_EXTEND))
- goto out;
-
- return false;
- }
-out:
- return cmdline_contains_nokaslr(CONFIG_CMDLINE);
-}
-
static u64 __init get_kaslr_seed(void *fdt)
{
static char const chosen_str[] __initconst = "chosen";
@@ -92,7 +41,7 @@ asmlinkage u64 __init kaslr_early_init(void *fdt)
{
u64 seed, range;
- if (is_kaslr_disabled_cmdline(fdt))
+ if (kaslr_disabled_cmdline())
return 0;
seed = get_kaslr_seed(fdt);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 27/39] arm64/kernel: Move 'nokaslr' parsing out of early idreg code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (25 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 26/39] arm64: kaslr: Use feature override instead of parsing the cmdline again Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 28/39] arm64: idreg-override: Create a pseudo feature for rodata=off Ard Biesheuvel
` (12 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Parsing and ignoring 'nokaslr' can be done from anywhere, except from
the code that runs very early and is therefore built with limitations on
the kind of relocations it is permitted to use.
So move it to a source file that is part of the ordinary kernel build.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/kaslr.c | 7 +++++++
arch/arm64/kernel/pi/idreg-override.c | 7 -------
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index efbeb8356769..1da3e25f9d9e 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -34,3 +34,10 @@ void __init kaslr_init(void)
pr_info("KASLR enabled\n");
__kaslr_is_enabled = true;
}
+
+static int __init parse_nokaslr(char *unused)
+{
+ /* nokaslr param handling is done by early cpufeature code */
+ return 0;
+}
+early_param("nokaslr", parse_nokaslr);
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index bcba0ce71af0..26961e0f94b7 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -198,13 +198,6 @@ static const struct {
{ "nokaslr", "arm64_sw.nokaslr=1" },
};
-static int __init parse_nokaslr(char *unused)
-{
- /* nokaslr param handling is done by early cpufeature code */
- return 0;
-}
-early_param("nokaslr", parse_nokaslr);
-
static int __init parse_hexdigit(const char *p, u64 *v)
{
// skip "0x" if it comes next
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 28/39] arm64: idreg-override: Create a pseudo feature for rodata=off
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (26 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 27/39] arm64/kernel: Move 'nokaslr' parsing out of early idreg code Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support Ard Biesheuvel
` (11 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Add rodata=off to the set of kernel command line options that are parsed
early using the CPU feature override detection code, so that we can easily
refer to it when creating the kernel mapping.
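As an illustration (a sketch only, not part of this patch), later mapping
code can consult the recorded pseudo feature like any other SW feature
override field:

    /* illustrative helper; the real consumer appears later in this series */
    static bool __init rodata_off_requested(void)
    {
        return cpuid_feature_extract_unsigned_field(arm64_sw_feature_override.val,
                                                    ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF);
    }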
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/cpufeature.h | 1 +
arch/arm64/kernel/pi/idreg-override.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 77ed3b28cbc6..99c01417e544 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -17,6 +17,7 @@
#define ARM64_SW_FEATURE_OVERRIDE_NOKASLR 0
#define ARM64_SW_FEATURE_OVERRIDE_HVHE 4
+#define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF 8
#ifndef __ASSEMBLY__
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index 26961e0f94b7..1aa59c01ab33 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -163,6 +163,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
.fields = {
FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
+ FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
{}
},
};
@@ -196,6 +197,7 @@ static const struct {
{ "arm64.nomops", "id_aa64isar2.mops=0" },
{ "arm64.nomte", "id_aa64pfr1.mte=0" },
{ "nokaslr", "arm64_sw.nokaslr=1" },
+ { "rodata=off", "arm64_sw.rodataoff=1" },
};
static int __init parse_hexdigit(const char *p, u64 *v)
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (27 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 28/39] arm64: idreg-override: Create a pseudo feature for rodata=off Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 12:37 ` Marc Zyngier
2023-11-24 10:19 ` [PATCH v5 30/39] arm64: head: allocate more pages for the kernel mapping Ard Biesheuvel
` (10 subsequent siblings)
39 siblings, 1 reply; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Add some helpers that will be used by the early kernel mapping code to
check feature support on the local CPU. This permits the early kernel
mapping to be created with the right attributes from the start, removing
the need to tear it down and recreate it later.
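To give an idea of the intended use (a sketch; the actual consumer is added
later in this series when the kernel mapping code moves to C), the early
mapping code can pick the text permissions based on these helpers:

    pgprot_t text_prot = PAGE_KERNEL_ROX;

    /* mark executable text as guarded pages up front if BTI is usable */
    if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && cpu_has_bti())
        text_prot = __pgprot_modify(text_prot, PTE_GP, PTE_GP);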
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/cpufeature.h | 44 ++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 99c01417e544..301d4d2211d5 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -921,6 +921,50 @@ static inline bool kaslr_disabled_cmdline(void)
u32 get_kvm_ipa_limit(void);
void dump_cpu_features(void);
+static inline bool cpu_has_bti(void)
+{
+ u64 pfr1;
+
+ if (!IS_ENABLED(CONFIG_ARM64_BTI))
+ return false;
+
+ pfr1 = read_cpuid(ID_AA64PFR1_EL1);
+ pfr1 &= ~id_aa64pfr1_override.mask;
+ pfr1 |= id_aa64pfr1_override.val;
+
+ return cpuid_feature_extract_unsigned_field(pfr1,
+ ID_AA64PFR1_EL1_BT_SHIFT);
+}
+
+static inline bool cpu_has_pac(void)
+{
+ u64 isar1, isar2;
+ u8 feat;
+
+ if (!IS_ENABLED(CONFIG_ARM64_PTR_AUTH))
+ return false;
+
+ isar1 = read_cpuid(ID_AA64ISAR1_EL1);
+ isar1 &= ~id_aa64isar1_override.mask;
+ isar1 |= id_aa64isar1_override.val;
+ feat = cpuid_feature_extract_unsigned_field(isar1,
+ ID_AA64ISAR1_EL1_APA_SHIFT);
+ if (feat)
+ return true;
+
+ feat = cpuid_feature_extract_unsigned_field(isar1,
+ ID_AA64ISAR1_EL1_API_SHIFT);
+ if (feat)
+ return true;
+
+ isar2 = read_sysreg_s(SYS_ID_AA64ISAR2_EL1);
+ isar2 &= ~id_aa64isar2_override.mask;
+ isar2 |= id_aa64isar2_override.val;
+ feat = cpuid_feature_extract_unsigned_field(isar2,
+ ID_AA64ISAR2_EL1_APA3_SHIFT);
+ return feat;
+}
+
#endif /* __ASSEMBLY__ */
#endif
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 30/39] arm64: head: allocate more pages for the kernel mapping
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (28 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 31/39] arm64: head: move memstart_offset_seed handling to C code Ard Biesheuvel
` (9 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
In preparation for switching to an early kernel mapping routine that
maps each segment according to its precise boundaries, and with the
correct attributes, let's allocate some extra pages for page tables for
the 4k page size configuration. This is necessary because the start and
end of each segment may not be aligned to the block size, and so we'll
need an extra page table at each segment boundary.
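To make the arithmetic concrete (a worked example, not part of the diff
below):

    /*
     * 4k pages: SWAPPER_BLOCK_SIZE (2 MiB) > SEGMENT_ALIGN, so
     *   KERNEL_SEGMENT_COUNT      = 5  (text, rodata, inittext, initdata, data+bss)
     *   EARLY_SEGMENT_EXTRA_PAGES = 5 + 1 = 6 extra page table pages
     * 16k/64k pages: SWAPPER_BLOCK_SIZE == PAGE_SIZE <= SEGMENT_ALIGN, so the
     * kernel is already mapped at page granularity and no extra pages are
     * needed.
     */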
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/kernel-pgtable.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 83ddb14b95a5..0631604995ee 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -68,7 +68,7 @@
+ EARLY_PGDS((vstart), (vend), add) /* each PGDIR needs a next level page table */ \
+ EARLY_PUDS((vstart), (vend), add) /* each PUD needs a next level page table */ \
+ EARLY_PMDS((vstart), (vend), add)) /* each PMD needs a next level page table */
-#define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE))
+#define INIT_DIR_SIZE (PAGE_SIZE * (EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE) + EARLY_SEGMENT_EXTRA_PAGES))
/* the initial ID map may need two extra pages if it needs to be extended */
#if VA_BITS < 48
@@ -89,6 +89,15 @@
#define SWAPPER_TABLE_SHIFT PMD_SHIFT
#endif
+/* The number of segments in the kernel image (text, rodata, inittext, initdata, data+bss) */
+#define KERNEL_SEGMENT_COUNT 5
+
+#if SWAPPER_BLOCK_SIZE > SEGMENT_ALIGN
+#define EARLY_SEGMENT_EXTRA_PAGES (KERNEL_SEGMENT_COUNT + 1)
+#else
+#define EARLY_SEGMENT_EXTRA_PAGES 0
+#endif
+
/*
* Initial memory map attributes.
*/
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 31/39] arm64: head: move memstart_offset_seed handling to C code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (29 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 30/39] arm64: head: allocate more pages for the kernel mapping Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 32/39] arm64: mm: Make kaslr_requires_kpti() a static inline Ard Biesheuvel
` (8 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Now that we can set BSS variables from the early code running from the
ID map, we can set memstart_offset_seed directly from the C code that
derives the value instead of passing it back and forth between C and asm
code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 7 -------
arch/arm64/kernel/image-vars.h | 1 +
arch/arm64/kernel/pi/kaslr_early.c | 4 ++++
3 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b320702032a7..aa7766dc64d9 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -82,7 +82,6 @@
* x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0
* x22 create_idmap() .. start_kernel() ID map VA of the DT blob
* x23 __primary_switch() physical misalignment/KASLR offset
- * x24 __primary_switch() linear map KASLR seed
* x25 primary_entry() .. start_kernel() supported VA size
* x28 create_idmap() callee preserved temp register
*/
@@ -483,11 +482,6 @@ SYM_FUNC_START_LOCAL(__primary_switched)
str x25, [x8] // ... observes the correct value
dc civac, x8 // Make visible to booting secondaries
#endif
-
-#ifdef CONFIG_RANDOMIZE_BASE
- adrp x5, memstart_offset_seed // Save KASLR linear map seed
- strh w24, [x5, :lo12:memstart_offset_seed]
-#endif
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
bl kasan_early_init
#endif
@@ -779,7 +773,6 @@ SYM_FUNC_START_LOCAL(__primary_switch)
#ifdef CONFIG_RANDOMIZE_BASE
mov x0, x22
bl __pi_kaslr_early_init
- and x24, x0, #SZ_2M - 1 // capture memstart offset seed
bic x0, x0, #SZ_2M - 1
orr x23, x23, x0 // record kernel offset
#endif
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index eacc3d167733..8d96052079e8 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -46,6 +46,7 @@ PROVIDE(__pi_id_aa64smfr0_override = id_aa64smfr0_override);
PROVIDE(__pi_id_aa64zfr0_override = id_aa64zfr0_override);
PROVIDE(__pi_arm64_sw_feature_override = arm64_sw_feature_override);
PROVIDE(__pi__ctype = _ctype);
+PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
#ifdef CONFIG_KVM
diff --git a/arch/arm64/kernel/pi/kaslr_early.c b/arch/arm64/kernel/pi/kaslr_early.c
index f2305e276ec3..eeecee7ffd6f 100644
--- a/arch/arm64/kernel/pi/kaslr_early.c
+++ b/arch/arm64/kernel/pi/kaslr_early.c
@@ -16,6 +16,8 @@
#include <asm/memory.h>
#include <asm/pgtable.h>
+extern u16 memstart_offset_seed;
+
static u64 __init get_kaslr_seed(void *fdt)
{
static char const chosen_str[] __initconst = "chosen";
@@ -51,6 +53,8 @@ asmlinkage u64 __init kaslr_early_init(void *fdt)
return 0;
}
+ memstart_offset_seed = seed & U16_MAX;
+
/*
* OK, so we are proceeding with KASLR enabled. Calculate a suitable
* kernel image offset from the seed. Let's place the kernel in the
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 32/39] arm64: mm: Make kaslr_requires_kpti() a static inline
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (30 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 31/39] arm64: head: move memstart_offset_seed handling to C code Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 33/39] arm64: head: Move early kernel mapping routines into C code Ard Biesheuvel
` (7 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
In preparation for moving the first assignment of arm64_use_ng_mappings
to an earlier stage in the boot, ensure that kaslr_requires_kpti() is
accessible without relying on the core kernel's view on whether or not
KASLR is enabled. So make it a static inline, and move the
kaslr_enabled() check out of it and into the callers, one of which will
disappear in a subsequent patch.
Once support for the obsolete ThunderX 1 platform is dropped, this
check reduces to an E0PD feature check on the local CPU.
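As the hunks below show, the caller-side pattern simply becomes:

    arm64_use_ng_mappings = kaslr_enabled() && kaslr_requires_kpti();

with cpufeature.c performing the equivalent combined check when deciding
whether to force KPTI.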
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/mmu.h | 38 +++++++++++++++++-
arch/arm64/kernel/cpufeature.c | 42 +-------------------
arch/arm64/kernel/setup.c | 2 +-
3 files changed, 39 insertions(+), 43 deletions(-)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 2fcf51231d6e..d0b8b4b413b6 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -71,7 +71,43 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
pgprot_t prot, bool page_mappings_only);
extern void *fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot);
extern void mark_linear_text_alias_ro(void);
-extern bool kaslr_requires_kpti(void);
+
+/*
+ * This check is triggered during the early boot before the cpufeature
+ * is initialised. Checking the status on the local CPU allows the boot
+ * CPU to detect the need for non-global mappings and thus avoiding a
+ * pagetable re-write after all the CPUs are booted. This check will be
+ * anyway run on individual CPUs, allowing us to get the consistent
+ * state once the SMP CPUs are up and thus make the switch to non-global
+ * mappings if required.
+ */
+static inline bool kaslr_requires_kpti(void)
+{
+ /*
+ * E0PD does a similar job to KPTI so can be used instead
+ * where available.
+ */
+ if (IS_ENABLED(CONFIG_ARM64_E0PD)) {
+ u64 mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
+ if (cpuid_feature_extract_unsigned_field(mmfr2,
+ ID_AA64MMFR2_EL1_E0PD_SHIFT))
+ return false;
+ }
+
+ /*
+ * Systems affected by Cavium erratum 27456 are incompatible
+ * with KPTI.
+ */
+ if (IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) {
+ extern const struct midr_range cavium_erratum_27456_cpus[];
+
+ if (is_midr_in_range_list(read_cpuid_id(),
+ cavium_erratum_27456_cpus))
+ return false;
+ }
+
+ return true;
+}
#define INIT_MM_CONTEXT(name) \
.pgd = init_pg_dir,
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index b95374387946..7222f5a14794 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1649,46 +1649,6 @@ has_useable_cnp(const struct arm64_cpu_capabilities *entry, int scope)
return has_cpuid_feature(entry, scope);
}
-/*
- * This check is triggered during the early boot before the cpufeature
- * is initialised. Checking the status on the local CPU allows the boot
- * CPU to detect the need for non-global mappings and thus avoiding a
- * pagetable re-write after all the CPUs are booted. This check will be
- * anyway run on individual CPUs, allowing us to get the consistent
- * state once the SMP CPUs are up and thus make the switch to non-global
- * mappings if required.
- */
-bool kaslr_requires_kpti(void)
-{
- if (!IS_ENABLED(CONFIG_RANDOMIZE_BASE))
- return false;
-
- /*
- * E0PD does a similar job to KPTI so can be used instead
- * where available.
- */
- if (IS_ENABLED(CONFIG_ARM64_E0PD)) {
- u64 mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
- if (cpuid_feature_extract_unsigned_field(mmfr2,
- ID_AA64MMFR2_EL1_E0PD_SHIFT))
- return false;
- }
-
- /*
- * Systems affected by Cavium erratum 24756 are incompatible
- * with KPTI.
- */
- if (IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) {
- extern const struct midr_range cavium_erratum_27456_cpus[];
-
- if (is_midr_in_range_list(read_cpuid_id(),
- cavium_erratum_27456_cpus))
- return false;
- }
-
- return kaslr_enabled();
-}
-
static bool __meltdown_safe = true;
static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
@@ -1741,7 +1701,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
}
/* Useful for KASLR robustness */
- if (kaslr_requires_kpti()) {
+ if (kaslr_enabled() && kaslr_requires_kpti()) {
if (!__kpti_forced) {
str = "KASLR";
__kpti_forced = 1;
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 4b0b3515ee20..c2d6852a4e0c 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -288,7 +288,7 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
* mappings from the start, avoiding the cost of rewriting
* everything later.
*/
- arm64_use_ng_mappings = kaslr_requires_kpti();
+ arm64_use_ng_mappings = kaslr_enabled() && kaslr_requires_kpti();
early_fixmap_init();
early_ioremap_init();
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 33/39] arm64: head: Move early kernel mapping routines into C code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (31 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 32/39] arm64: mm: Make kaslr_requires_kpti() a static inline Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 34/39] arm64: mm: Use 48-bit virtual addressing for the permanent ID map Ard Biesheuvel
` (6 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The asm version of the kernel mapping code works fine for creating a
coarse-grained identity map, but for mapping the kernel down to its
exact boundaries with the right attributes, it is not suitable. This is
why we create a preliminary RWX kernel mapping first, and then rebuild
it from scratch later on.
So let's reimplement this in C, in a way that will make it unnecessary
to create the kernel page tables yet another time in paging_init().
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/archrandom.h | 2 -
arch/arm64/include/asm/scs.h | 32 +---
arch/arm64/kernel/head.S | 52 +-----
arch/arm64/kernel/image-vars.h | 19 +++
arch/arm64/kernel/pi/Makefile | 1 +
arch/arm64/kernel/pi/idreg-override.c | 22 ++-
arch/arm64/kernel/pi/kaslr_early.c | 12 +-
arch/arm64/kernel/pi/map_kernel.c | 165 ++++++++++++++++++++
arch/arm64/kernel/pi/map_range.c | 88 +++++++++++
arch/arm64/kernel/pi/patch-scs.c | 16 +-
arch/arm64/kernel/pi/pi.h | 14 ++
arch/arm64/kernel/pi/relocate.c | 2 +
arch/arm64/kernel/setup.c | 7 -
arch/arm64/kernel/vmlinux.lds.S | 4 +-
arch/arm64/mm/proc.S | 1 +
15 files changed, 316 insertions(+), 121 deletions(-)
diff --git a/arch/arm64/include/asm/archrandom.h b/arch/arm64/include/asm/archrandom.h
index ecdb3cfcd0f8..8babfbe31f95 100644
--- a/arch/arm64/include/asm/archrandom.h
+++ b/arch/arm64/include/asm/archrandom.h
@@ -129,6 +129,4 @@ static inline bool __init __early_cpu_has_rndr(void)
return (ftr >> ID_AA64ISAR0_EL1_RNDR_SHIFT) & 0xf;
}
-u64 kaslr_early_init(void *fdt);
-
#endif /* _ASM_ARCHRANDOM_H */
diff --git a/arch/arm64/include/asm/scs.h b/arch/arm64/include/asm/scs.h
index eca2ba5a6276..2e010ea76be2 100644
--- a/arch/arm64/include/asm/scs.h
+++ b/arch/arm64/include/asm/scs.h
@@ -33,37 +33,11 @@
#include <asm/cpufeature.h>
#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
-static inline bool should_patch_pac_into_scs(void)
-{
- u64 reg;
-
- /*
- * We only enable the shadow call stack dynamically if we are running
- * on a system that does not implement PAC or BTI. PAC and SCS provide
- * roughly the same level of protection, and BTI relies on the PACIASP
- * instructions serving as landing pads, preventing us from patching
- * those instructions into something else.
- */
- reg = read_sysreg_s(SYS_ID_AA64ISAR1_EL1);
- if (SYS_FIELD_GET(ID_AA64ISAR1_EL1, APA, reg) |
- SYS_FIELD_GET(ID_AA64ISAR1_EL1, API, reg))
- return false;
-
- reg = read_sysreg_s(SYS_ID_AA64ISAR2_EL1);
- if (SYS_FIELD_GET(ID_AA64ISAR2_EL1, APA3, reg))
- return false;
-
- if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) {
- reg = read_sysreg_s(SYS_ID_AA64PFR1_EL1);
- if (reg & (0xf << ID_AA64PFR1_EL1_BT_SHIFT))
- return false;
- }
- return true;
-}
-
static inline void dynamic_scs_init(void)
{
- if (should_patch_pac_into_scs()) {
+ extern bool __pi_dynamic_scs_is_enabled;
+
+ if (__pi_dynamic_scs_is_enabled) {
pr_info("Enabling dynamic shadow call stack\n");
static_branch_enable(&dynamic_scs_enabled);
}
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index aa7766dc64d9..ffacce7b5a02 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -81,7 +81,6 @@
* x20 primary_entry() .. __primary_switch() CPU boot mode
* x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0
* x22 create_idmap() .. start_kernel() ID map VA of the DT blob
- * x23 __primary_switch() physical misalignment/KASLR offset
* x25 primary_entry() .. start_kernel() supported VA size
* x28 create_idmap() callee preserved temp register
*/
@@ -408,24 +407,6 @@ SYM_FUNC_START_LOCAL(create_idmap)
0: ret x28
SYM_FUNC_END(create_idmap)
-SYM_FUNC_START_LOCAL(create_kernel_mapping)
- adrp x0, init_pg_dir
- mov_q x5, KIMAGE_VADDR // compile time __va(_text)
-#ifdef CONFIG_RELOCATABLE
- add x5, x5, x23 // add KASLR displacement
-#endif
- adrp x6, _end // runtime __pa(_end)
- adrp x3, _text // runtime __pa(_text)
- sub x6, x6, x3 // _end - _text
- add x6, x6, x5 // runtime __va(_end)
- mov_q x7, SWAPPER_RW_MMUFLAGS
-
- map_memory x0, x1, x5, x6, x7, x3, (VA_BITS - PGDIR_SHIFT), x10, x11, x12, x13, x14
-
- dsb ishst // sync with page table walker
- ret
-SYM_FUNC_END(create_kernel_mapping)
-
/*
* Initialize CPU registers with task-specific and cpu-specific context.
*
@@ -752,44 +733,13 @@ SYM_FUNC_START_LOCAL(__primary_switch)
adrp x2, init_idmap_pg_dir
bl __enable_mmu
- // Clear BSS
- adrp x0, __bss_start
- mov x1, xzr
- adrp x2, init_pg_end
- sub x2, x2, x0
- bl __pi_memset
- dsb ishst // Make zero page visible to PTW
-
adrp x1, early_init_stack
mov sp, x1
mov x29, xzr
mov x0, x20 // pass the full boot status
mov x1, x22 // pass the low FDT mapping
- bl __pi_init_feature_override // Parse cpu feature overrides
-
-#ifdef CONFIG_RELOCATABLE
- adrp x23, KERNEL_START
- and x23, x23, MIN_KIMG_ALIGN - 1
-#ifdef CONFIG_RANDOMIZE_BASE
- mov x0, x22
- bl __pi_kaslr_early_init
- bic x0, x0, #SZ_2M - 1
- orr x23, x23, x0 // record kernel offset
-#endif
-#endif
- bl create_kernel_mapping
+ bl __pi_early_map_kernel // Map and relocate the kernel
- adrp x1, init_pg_dir
- load_ttbr1 x1, x1, x2
-#ifdef CONFIG_RELOCATABLE
- mov x0, x23
- bl __pi_relocate_kernel
-#endif
-#ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
- ldr x0, =__eh_frame_start
- ldr x1, =__eh_frame_end
- bl __pi_scs_patch_vmlinux
-#endif
ldr x8, =__primary_switched
adrp x0, KERNEL_START // __pa(KERNEL_START)
br x8
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 8d96052079e8..e566b32f9c22 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -45,9 +45,28 @@ PROVIDE(__pi_id_aa64pfr1_override = id_aa64pfr1_override);
PROVIDE(__pi_id_aa64smfr0_override = id_aa64smfr0_override);
PROVIDE(__pi_id_aa64zfr0_override = id_aa64zfr0_override);
PROVIDE(__pi_arm64_sw_feature_override = arm64_sw_feature_override);
+PROVIDE(__pi_arm64_use_ng_mappings = arm64_use_ng_mappings);
+#ifdef CONFIG_CAVIUM_ERRATUM_27456
+PROVIDE(__pi_cavium_erratum_27456_cpus = cavium_erratum_27456_cpus);
+#endif
PROVIDE(__pi__ctype = _ctype);
PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
+PROVIDE(__pi_init_pg_dir = init_pg_dir);
+PROVIDE(__pi_init_pg_end = init_pg_end);
+
+PROVIDE(__pi__text = _text);
+PROVIDE(__pi__stext = _stext);
+PROVIDE(__pi__etext = _etext);
+PROVIDE(__pi___start_rodata = __start_rodata);
+PROVIDE(__pi___inittext_begin = __inittext_begin);
+PROVIDE(__pi___inittext_end = __inittext_end);
+PROVIDE(__pi___initdata_begin = __initdata_begin);
+PROVIDE(__pi___initdata_end = __initdata_end);
+PROVIDE(__pi__data = _data);
+PROVIDE(__pi___bss_start = __bss_start);
+PROVIDE(__pi__end = _end);
+
#ifdef CONFIG_KVM
/*
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index a8b302245f15..8c2f80a46b93 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -39,6 +39,7 @@ $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
obj-y := idreg-override.pi.o \
+ map_kernel.pi.o map_range.pi.o \
lib-fdt.pi.o lib-fdt_ro.pi.o
obj-$(CONFIG_RELOCATABLE) += relocate.pi.o
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index 1aa59c01ab33..5857e4efad59 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -314,37 +314,35 @@ static __init void __parse_cmdline(const char *cmdline, bool parse_aliases)
} while (1);
}
-static __init const u8 *get_bootargs_cmdline(const void *fdt)
+static __init const u8 *get_bootargs_cmdline(const void *fdt, int node)
{
+ static char const bootargs[] __initconst = "bootargs";
const u8 *prop;
- int node;
- node = fdt_path_offset(fdt, "/chosen");
if (node < 0)
return NULL;
- prop = fdt_getprop(fdt, node, "bootargs", NULL);
+ prop = fdt_getprop(fdt, node, bootargs, NULL);
if (!prop)
return NULL;
return strlen(prop) ? prop : NULL;
}
-static __init void parse_cmdline(const void *fdt)
+static __init void parse_cmdline(const void *fdt, int chosen)
{
- const u8 *prop = get_bootargs_cmdline(fdt);
+ static char const cmdline[] __initconst = CONFIG_CMDLINE;
+ const u8 *prop = get_bootargs_cmdline(fdt, chosen);
if (IS_ENABLED(CONFIG_CMDLINE_FORCE) || !prop)
- __parse_cmdline(CONFIG_CMDLINE, true);
+ __parse_cmdline(cmdline, true);
if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && prop)
__parse_cmdline(prop, true);
}
-/* Keep checkers quiet */
-void init_feature_override(u64 boot_status, const void *fdt);
-
-asmlinkage void __init init_feature_override(u64 boot_status, const void *fdt)
+void __init init_feature_override(u64 boot_status, const void *fdt,
+ int chosen)
{
struct arm64_ftr_override *override;
const struct ftr_set_desc *reg;
@@ -360,7 +358,7 @@ asmlinkage void __init init_feature_override(u64 boot_status, const void *fdt)
__boot_status = boot_status;
- parse_cmdline(fdt);
+ parse_cmdline(fdt, chosen);
for (i = 0; i < ARRAY_SIZE(regs); i++) {
reg = prel64_to_pointer(®s[i].reg_prel);
diff --git a/arch/arm64/kernel/pi/kaslr_early.c b/arch/arm64/kernel/pi/kaslr_early.c
index eeecee7ffd6f..0257b43819db 100644
--- a/arch/arm64/kernel/pi/kaslr_early.c
+++ b/arch/arm64/kernel/pi/kaslr_early.c
@@ -16,17 +16,17 @@
#include <asm/memory.h>
#include <asm/pgtable.h>
+#include "pi.h"
+
extern u16 memstart_offset_seed;
-static u64 __init get_kaslr_seed(void *fdt)
+static u64 __init get_kaslr_seed(void *fdt, int node)
{
- static char const chosen_str[] __initconst = "chosen";
static char const seed_str[] __initconst = "kaslr-seed";
- int node, len;
fdt64_t *prop;
u64 ret;
+ int len;
- node = fdt_path_offset(fdt, chosen_str);
if (node < 0)
return 0;
@@ -39,14 +39,14 @@ static u64 __init get_kaslr_seed(void *fdt)
return ret;
}
-asmlinkage u64 __init kaslr_early_init(void *fdt)
+u64 __init kaslr_early_init(void *fdt, int chosen)
{
u64 seed, range;
if (kaslr_disabled_cmdline())
return 0;
- seed = get_kaslr_seed(fdt);
+ seed = get_kaslr_seed(fdt, chosen);
if (!seed) {
if (!__early_cpu_has_rndr() ||
!__arm64_rndr((unsigned long *)&seed))
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
new file mode 100644
index 000000000000..afb6a337fa48
--- /dev/null
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2023 Google LLC
+// Author: Ard Biesheuvel <ardb@google.com>
+
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/sizes.h>
+#include <linux/string.h>
+
+#include <asm/memory.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+
+#include "pi.h"
+
+extern const u8 __eh_frame_start[], __eh_frame_end[];
+
+extern void idmap_cpu_replace_ttbr1(void *pgdir);
+
+static void __init map_segment(pgd_t *pg_dir, u64 *pgd, u64 va_offset,
+ void *start, void *end, pgprot_t prot,
+ bool may_use_cont, int root_level)
+{
+ map_range(pgd, ((u64)start + va_offset) & ~PAGE_OFFSET,
+ ((u64)end + va_offset) & ~PAGE_OFFSET, (u64)start,
+ prot, root_level, (pte_t *)pg_dir, may_use_cont, 0);
+}
+
+static void __init unmap_segment(pgd_t *pg_dir, u64 va_offset, void *start,
+ void *end, int root_level)
+{
+ map_segment(pg_dir, NULL, va_offset, start, end, __pgprot(0),
+ false, root_level);
+}
+
+static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
+{
+ bool enable_scs = IS_ENABLED(CONFIG_UNWIND_PATCH_PAC_INTO_SCS);
+ bool twopass = IS_ENABLED(CONFIG_RELOCATABLE);
+ u64 pgdp = (u64)init_pg_dir + PAGE_SIZE;
+ pgprot_t text_prot = PAGE_KERNEL_ROX;
+ pgprot_t data_prot = PAGE_KERNEL;
+ pgprot_t prot;
+
+ /*
+ * External debuggers may need to write directly to the text mapping to
+ * install SW breakpoints. Allow this (only) when explicitly requested
+ * with rodata=off.
+ */
+ if (cpuid_feature_extract_unsigned_field(arm64_sw_feature_override.val,
+ ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF))
+ text_prot = PAGE_KERNEL_EXEC;
+
+ /*
+ * We only enable the shadow call stack dynamically if we are running
+ * on a system that does not implement PAC or BTI. PAC and SCS provide
+ * roughly the same level of protection, and BTI relies on the PACIASP
+ * instructions serving as landing pads, preventing us from patching
+ * those instructions into something else.
+ */
+ if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) && cpu_has_pac())
+ enable_scs = false;
+
+ if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && cpu_has_bti()) {
+ enable_scs = false;
+
+ /*
+ * If we have a CPU that supports BTI and a kernel built for
+ * BTI then mark the kernel executable text as guarded pages
+ * now so we don't have to rewrite the page tables later.
+ */
+ text_prot = __pgprot_modify(text_prot, PTE_GP, PTE_GP);
+ }
+
+ /* Map all code read-write on the first pass if needed */
+ twopass |= enable_scs;
+ prot = twopass ? data_prot : text_prot;
+
+ map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
+ !twopass, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
+ __inittext_begin, data_prot, false, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, __inittext_begin,
+ __inittext_end, prot, false, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, __initdata_begin,
+ __initdata_end, data_prot, false, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, _data, _end, data_prot,
+ true, root_level);
+ dsb(ishst);
+
+ idmap_cpu_replace_ttbr1(init_pg_dir);
+
+ if (twopass) {
+ if (IS_ENABLED(CONFIG_RELOCATABLE))
+ relocate_kernel(kaslr_offset);
+
+ if (enable_scs) {
+ scs_patch(__eh_frame_start + va_offset,
+ __eh_frame_end - __eh_frame_start);
+ asm("ic ialluis");
+
+ dynamic_scs_is_enabled = true;
+ }
+
+ /*
+ * Unmap the text region before remapping it, to avoid
+ * potential TLB conflicts when creating the contiguous
+ * descriptors.
+ */
+ unmap_segment(init_pg_dir, va_offset, _stext, _etext,
+ root_level);
+ dsb(ishst);
+ isb();
+ __tlbi(vmalle1);
+ isb();
+
+ /*
+ * Remap these segments with different permissions
+ * No new page table allocations should be needed
+ */
+ map_segment(init_pg_dir, NULL, va_offset, _stext, _etext,
+ text_prot, true, root_level);
+ map_segment(init_pg_dir, NULL, va_offset, __inittext_begin,
+ __inittext_end, text_prot, false, root_level);
+ dsb(ishst);
+ }
+}
+
+asmlinkage void __init early_map_kernel(u64 boot_status, void *fdt)
+{
+ static char const chosen_str[] __initconst = "/chosen";
+ u64 va_base, pa_base = (u64)&_text;
+ u64 kaslr_offset = pa_base % MIN_KIMG_ALIGN;
+ int root_level = 4 - CONFIG_PGTABLE_LEVELS;
+ int chosen;
+
+ /* Clear BSS and the initial page tables */
+ memset(__bss_start, 0, (u64)init_pg_end - (u64)__bss_start);
+
+ /* Parse the command line for CPU feature overrides */
+ chosen = fdt_path_offset(fdt, chosen_str);
+ init_feature_override(boot_status, fdt, chosen);
+
+ /*
+ * The virtual KASLR displacement modulo 2MiB is decided by the
+ * physical placement of the image, as otherwise, we might not be able
+ * to create the early kernel mapping using 2 MiB block descriptors. So
+ * take the low bits of the KASLR offset from the physical address, and
+ * fill in the high bits from the seed.
+ */
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ u64 kaslr_seed = kaslr_early_init(fdt, chosen);
+
+ if (kaslr_seed && kaslr_requires_kpti())
+ arm64_use_ng_mappings = true;
+
+ kaslr_offset |= kaslr_seed & ~(MIN_KIMG_ALIGN - 1);
+ }
+
+ va_base = KIMAGE_VADDR + kaslr_offset;
+ map_kernel(kaslr_offset, va_base - pa_base, root_level);
+}
diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
new file mode 100644
index 000000000000..c31feda18f47
--- /dev/null
+++ b/arch/arm64/kernel/pi/map_range.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2023 Google LLC
+// Author: Ard Biesheuvel <ardb@google.com>
+
+#include <linux/types.h>
+#include <linux/sizes.h>
+
+#include <asm/memory.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+
+#include "pi.h"
+
+/**
+ * map_range - Map a contiguous range of physical pages into virtual memory
+ *
+ * @pte: Address of physical pointer to array of pages to
+ * allocate page tables from
+ * @start: Virtual address of the start of the range
+ * @end: Virtual address of the end of the range (exclusive)
+ * @pa: Physical address of the start of the range
+ * @prot: Access permissions of the range
+ * @level: Translation level for the mapping
+ * @tbl: The level @level page table to create the mappings in
+ * @may_use_cont: Whether the use of the contiguous attribute is allowed
+ * @va_offset: Offset between a physical page and its current mapping
+ * in the VA space
+ */
+void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
+ int level, pte_t *tbl, bool may_use_cont, u64 va_offset)
+{
+ u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 : U64_MAX;
+ u64 protval = pgprot_val(prot) & ~PTE_TYPE_MASK;
+ int lshift = (3 - level) * (PAGE_SHIFT - 3);
+ u64 lmask = (PAGE_SIZE << lshift) - 1;
+
+ start &= PAGE_MASK;
+ pa &= PAGE_MASK;
+
+ /* Advance tbl to the entry that covers start */
+ tbl += (start >> (lshift + PAGE_SHIFT)) % PTRS_PER_PTE;
+
+ /*
+ * Set the right block/page bits for this level unless we are
+ * clearing the mapping
+ */
+ if (protval)
+ protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
+
+ while (start < end) {
+ u64 next = min((start | lmask) + 1, PAGE_ALIGN(end));
+
+ if (level < 3 && (start | next | pa) & lmask) {
+ /*
+ * This chunk needs a finer grained mapping. Create a
+ * table mapping if necessary and recurse.
+ */
+ if (pte_none(*tbl)) {
+ *tbl = __pte(__phys_to_pte_val(*pte) |
+ PMD_TYPE_TABLE | PMD_TABLE_UXN);
+ *pte += PTRS_PER_PTE * sizeof(pte_t);
+ }
+ map_range(pte, start, next, pa, prot, level + 1,
+ (pte_t *)(__pte_to_phys(*tbl) + va_offset),
+ may_use_cont, va_offset);
+ } else {
+ /*
+ * Start a contiguous range if start and pa are
+ * suitably aligned
+ */
+ if (((start | pa) & cmask) == 0 && may_use_cont)
+ protval |= PTE_CONT;
+
+ /*
+ * Clear the contiguous attribute if the remaining
+ * range does not cover a contiguous block
+ */
+ if ((end & ~cmask) <= start)
+ protval &= ~PTE_CONT;
+
+ /* Put down a block or page mapping */
+ *tbl = __pte(__phys_to_pte_val(pa) | protval);
+ }
+ pa += next - start;
+ start = next;
+ tbl++;
+ }
+}
diff --git a/arch/arm64/kernel/pi/patch-scs.c b/arch/arm64/kernel/pi/patch-scs.c
index c65ef40d1e6b..49d8b40e61bc 100644
--- a/arch/arm64/kernel/pi/patch-scs.c
+++ b/arch/arm64/kernel/pi/patch-scs.c
@@ -11,6 +11,10 @@
#include <asm/scs.h>
+#include "pi.h"
+
+bool dynamic_scs_is_enabled;
+
//
// This minimal DWARF CFI parser is partially based on the code in
// arch/arc/kernel/unwind.c, and on the document below:
@@ -46,8 +50,6 @@
#define DW_CFA_GNU_negative_offset_extended 0x2f
#define DW_CFA_hi_user 0x3f
-extern const u8 __eh_frame_start[], __eh_frame_end[];
-
enum {
PACIASP = 0xd503233f,
AUTIASP = 0xd50323bf,
@@ -250,13 +252,3 @@ int scs_patch(const u8 eh_frame[], int size)
}
return 0;
}
-
-asmlinkage void __init scs_patch_vmlinux(const u8 start[], const u8 end[])
-{
- if (!should_patch_pac_into_scs())
- return;
-
- scs_patch(start, end - start);
- asm("ic ialluis");
- isb();
-}
diff --git a/arch/arm64/kernel/pi/pi.h b/arch/arm64/kernel/pi/pi.h
index f455ad385976..8d3d3ba44c27 100644
--- a/arch/arm64/kernel/pi/pi.h
+++ b/arch/arm64/kernel/pi/pi.h
@@ -2,6 +2,8 @@
// Copyright 2023 Google LLC
// Author: Ard Biesheuvel <ardb@google.com>
+#include <linux/types.h>
+
#define __prel64_initconst __section(".init.rodata.prel64")
typedef volatile signed long prel64_t;
@@ -12,3 +14,15 @@ static inline void *prel64_to_pointer(const prel64_t *offset)
return NULL;
return (void *)offset + *offset;
}
+
+extern bool dynamic_scs_is_enabled;
+
+void init_feature_override(u64 boot_status, const void *fdt, int chosen);
+u64 kaslr_early_init(void *fdt, int chosen);
+void relocate_kernel(u64 offset);
+int scs_patch(const u8 eh_frame[], int size);
+
+void map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot,
+ int level, pte_t *tbl, bool may_use_cont, u64 va_offset);
+
+asmlinkage void early_map_kernel(u64 boot_status, void *fdt);
diff --git a/arch/arm64/kernel/pi/relocate.c b/arch/arm64/kernel/pi/relocate.c
index 1853408ea76b..2407d2696398 100644
--- a/arch/arm64/kernel/pi/relocate.c
+++ b/arch/arm64/kernel/pi/relocate.c
@@ -7,6 +7,8 @@
#include <linux/init.h>
#include <linux/types.h>
+#include "pi.h"
+
extern const Elf64_Rela rela_start[], rela_end[];
extern const u64 relr_start[], relr_end[];
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index c2d6852a4e0c..f75e03b9ad28 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -283,13 +283,6 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
kaslr_init();
- /*
- * If know now we are going to need KPTI then use non-global
- * mappings from the start, avoiding the cost of rewriting
- * everything later.
- */
- arm64_use_ng_mappings = kaslr_enabled() && kaslr_requires_kpti();
-
early_fixmap_init();
early_ioremap_init();
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 3afb4223a5e8..755a22d4f840 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -126,9 +126,9 @@ jiffies = jiffies_64;
#ifdef CONFIG_UNWIND_TABLES
#define UNWIND_DATA_SECTIONS \
.eh_frame : { \
- __eh_frame_start = .; \
+ __pi___eh_frame_start = .; \
*(.eh_frame) \
- __eh_frame_end = .; \
+ __pi___eh_frame_end = .; \
}
#else
#define UNWIND_DATA_SECTIONS
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index f66c37a1610e..7c1bdaf25408 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -195,6 +195,7 @@ SYM_TYPED_FUNC_START(idmap_cpu_replace_ttbr1)
ret
SYM_FUNC_END(idmap_cpu_replace_ttbr1)
+SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
.popsection
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 34/39] arm64: mm: Use 48-bit virtual addressing for the permanent ID map
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (32 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 33/39] arm64: head: Move early kernel mapping routines into C code Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 35/39] arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels Ard Biesheuvel
` (5 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Even though we support loading kernels anywhere in 48-bit addressable
physical memory, we create the ID maps based on the number of levels
that we happened to configure for the kernel VA and user VA spaces.
The reason for this is that the PGD/PUD/PMD based classification of
translation levels, along with the associated folding when the number of
levels is less than 5, does not permit creating a page table hierarchy
of a set number of levels. This means that, for instance, on 39-bit VA
kernels we need to configure an additional level above PGD level on the
fly, and 36-bit VA kernels still only support 47-bit virtual addressing
with this trick applied.
Now that we have a separate helper to populate page table hierarchies
that does not define the levels in terms of PUDS/PMDS/etc at all, let's
reuse it to create the permanent ID map with a fixed VA size of 48 bits.
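For concreteness, plugging IDMAP_VA_BITS = 48 into the standard
ARM64_HW_PGTABLE_LEVELS() formula gives the following (worked example; the
values are implied by, rather than spelled out in, the patch):

    /*
     * IDMAP_LEVELS = ARM64_HW_PGTABLE_LEVELS(48) = (48 - 4) / (PAGE_SHIFT - 3)
     *
     *   4k pages:  44 / 9  = 4 levels, IDMAP_ROOT_LEVEL = 0
     *   16k pages: 44 / 11 = 4 levels, IDMAP_ROOT_LEVEL = 0
     *   64k pages: 44 / 13 = 3 levels, IDMAP_ROOT_LEVEL = 1
     */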
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/kernel-pgtable.h | 3 ++
arch/arm64/kernel/head.S | 5 +++
arch/arm64/kvm/mmu.c | 15 +++------
arch/arm64/mm/mmu.c | 32 +++++++++++---------
arch/arm64/mm/proc.S | 9 ++----
5 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 0631604995ee..742a4b2778f7 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -35,6 +35,9 @@
#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
#endif
+#define IDMAP_VA_BITS 48
+#define IDMAP_LEVELS ARM64_HW_PGTABLE_LEVELS(IDMAP_VA_BITS)
+#define IDMAP_ROOT_LEVEL (4 - IDMAP_LEVELS)
/*
* A relocatable kernel may execute from an address that differs from the one at
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ffacce7b5a02..a1c29d64e875 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -729,6 +729,11 @@ SYM_FUNC_START_LOCAL(__no_granule_support)
SYM_FUNC_END(__no_granule_support)
SYM_FUNC_START_LOCAL(__primary_switch)
+ mrs x1, tcr_el1
+ mov x2, #64 - VA_BITS
+ tcr_set_t0sz x1, x2
+ msr tcr_el1, x1
+
adrp x1, reserved_pg_dir
adrp x2, init_idmap_pg_dir
bl __enable_mmu
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d87c8fcc4c24..234b03e18ff7 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1873,16 +1873,9 @@ int __init kvm_mmu_init(u32 *hyp_va_bits)
BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
/*
- * The ID map may be configured to use an extended virtual address
- * range. This is only the case if system RAM is out of range for the
- * currently configured page size and VA_BITS_MIN, in which case we will
- * also need the extended virtual range for the HYP ID map, or we won't
- * be able to enable the EL2 MMU.
- *
- * However, in some cases the ID map may be configured for fewer than
- * the number of VA bits used by the regular kernel stage 1. This
- * happens when VA_BITS=52 and the kernel image is placed in PA space
- * below 48 bits.
+ * The ID map is always configured for 48 bits of translation, which
+ * may be fewer than the number of VA bits used by the regular kernel
+ * stage 1, when VA_BITS=52.
*
* At EL2, there is only one TTBR register, and we can't switch between
* translation tables *and* update TCR_EL2.T0SZ at the same time. Bottom
@@ -1893,7 +1886,7 @@ int __init kvm_mmu_init(u32 *hyp_va_bits)
* 1 VA bits to assure that the hypervisor can both ID map its code page
* and map any kernel memory.
*/
- idmap_bits = 64 - ((idmap_t0sz & TCR_T0SZ_MASK) >> TCR_T0SZ_OFFSET);
+ idmap_bits = IDMAP_VA_BITS;
kernel_bits = vabits_actual;
*hyp_va_bits = max(idmap_bits, kernel_bits);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 03c73e9197ac..94847dbd31cd 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -754,22 +754,21 @@ static void __init map_kernel(pgd_t *pgdp)
kasan_copy_shadow(pgdp);
}
+void __pi_map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot,
+ int level, pte_t *tbl, bool may_use_cont, u64 va_offset);
+
+static u8 idmap_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init,
+ kpti_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init;
+
static void __init create_idmap(void)
{
u64 start = __pa_symbol(__idmap_text_start);
- u64 size = __pa_symbol(__idmap_text_end) - start;
- pgd_t *pgd = idmap_pg_dir;
- u64 pgd_phys;
-
- /* check if we need an additional level of translation */
- if (VA_BITS < 48 && idmap_t0sz < (64 - VA_BITS_MIN)) {
- pgd_phys = early_pgtable_alloc(PAGE_SHIFT);
- set_pgd(&idmap_pg_dir[start >> VA_BITS],
- __pgd(pgd_phys | P4D_TYPE_TABLE));
- pgd = __va(pgd_phys);
- }
- __create_pgd_mapping(pgd, start, start, size, PAGE_KERNEL_ROX,
- early_pgtable_alloc, 0);
+ u64 end = __pa_symbol(__idmap_text_end);
+ u64 ptep = __pa_symbol(idmap_ptes);
+
+ __pi_map_range(&ptep, start, end, start, PAGE_KERNEL_ROX,
+ IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false,
+ __phys_to_virt(ptep) - ptep);
if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)) {
extern u32 __idmap_kpti_flag;
@@ -779,8 +778,10 @@ static void __init create_idmap(void)
* The KPTI G-to-nG conversion code needs a read-write mapping
* of its synchronization flag in the ID map.
*/
- __create_pgd_mapping(pgd, pa, pa, sizeof(u32), PAGE_KERNEL,
- early_pgtable_alloc, 0);
+ ptep = __pa_symbol(kpti_ptes);
+ __pi_map_range(&ptep, pa, pa + sizeof(u32), pa, PAGE_KERNEL,
+ IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false,
+ __phys_to_virt(ptep) - ptep);
}
}
@@ -805,6 +806,7 @@ void __init paging_init(void)
memblock_allow_resize();
create_idmap();
+ idmap_t0sz = TCR_T0SZ(IDMAP_VA_BITS);
}
#ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 7c1bdaf25408..47ede52bb900 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -421,9 +421,9 @@ SYM_FUNC_START(__cpu_setup)
mair .req x17
tcr .req x16
mov_q mair, MAIR_EL1_SET
- mov_q tcr, TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
- TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
- TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
+ mov_q tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS) | TCR_CACHE_FLAGS | \
+ TCR_SMP_FLAGS | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
+ TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
tcr_clear_errata_bits tcr, x9, x5
@@ -431,10 +431,7 @@ SYM_FUNC_START(__cpu_setup)
sub x9, xzr, x0
add x9, x9, #64
tcr_set_t1sz tcr, x9
-#else
- idmap_get_t0sz x9
#endif
- tcr_set_t0sz tcr, x9
/*
* Set the IPS bits in TCR_EL1.
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 35/39] arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (33 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 34/39] arm64: mm: Use 48-bit virtual addressing for the permanent ID map Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 36/39] arm64: kernel: Create initial ID map from C code Ard Biesheuvel
` (4 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The mapping from PGD/PUD/PMD to levels and shifts is very confusing,
given that, due to folding, the shifts may be equal for different
levels, if the macros are even #define'd to begin with.
In a subsequent patch, we will modify the ID mapping code to decouple
the number of levels from the kernel's view of how these types are
folded, so prepare for this by reformulating the macros without the use
of these types.
Instead, use SWAPPER_BLOCK_SHIFT as the base quantity, and derive it
from either PAGE_SHIFT or PMD_SHIFT, which (if defined at all) are
defined unambiguously for a given page size, regardless of the number of
configured levels.
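Spelled out per page size (illustrative values derived from the macros in
the hunk below):

    /*
     * 4k pages:  PMD_SIZE (2 MiB) <= MIN_KIMG_ALIGN (2 MiB)
     *   SWAPPER_BLOCK_SHIFT = PMD_SHIFT = 21,  SWAPPER_SKIP_LEVEL = 1
     *   SWAPPER_TABLE_SHIFT = 21 + 12 - 3 = 30 (the former PUD_SHIFT)
     *
     * 16k/64k pages: PMD_SIZE (32 MiB / 512 MiB) > MIN_KIMG_ALIGN
     *   SWAPPER_BLOCK_SHIFT = PAGE_SHIFT,       SWAPPER_SKIP_LEVEL = 0
     *   SWAPPER_TABLE_SHIFT = 2 * PAGE_SHIFT - 3 (the former PMD_SHIFT)
     */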
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/kernel-pgtable.h | 65 ++++++--------------
1 file changed, 19 insertions(+), 46 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 742a4b2778f7..f1fc98a233d5 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -13,27 +13,22 @@
#include <asm/sparsemem.h>
/*
- * The linear mapping and the start of memory are both 2M aligned (per
- * the arm64 booting.txt requirements). Hence we can use section mapping
- * with 4K (section size = 2M) but not with 16K (section size = 32M) or
- * 64K (section size = 512M).
+ * The physical and virtual addresses of the start of the kernel image are
+ * equal modulo 2 MiB (per the arm64 booting.txt requirements). Hence we can
+ * use section mapping with 4K (section size = 2M) but not with 16K (section
+ * size = 32M) or 64K (section size = 512M).
*/
-
-/*
- * The idmap and swapper page tables need some space reserved in the kernel
- * image. Both require pgd, pud (4 levels only) and pmd tables to (section)
- * map the kernel. With the 64K page configuration, swapper and idmap need to
- * map to pte level. The swapper also maps the FDT (see __create_page_tables
- * for more information). Note that the number of ID map translation levels
- * could be increased on the fly if system RAM is out of reach for the default
- * VA range, so pages required to map highest possible PA are reserved in all
- * cases.
- */
-#ifdef CONFIG_ARM64_4K_PAGES
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#if defined(PMD_SIZE) && PMD_SIZE <= MIN_KIMG_ALIGN
+#define SWAPPER_BLOCK_SHIFT PMD_SHIFT
+#define SWAPPER_SKIP_LEVEL 1
#else
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
+#define SWAPPER_BLOCK_SHIFT PAGE_SHIFT
+#define SWAPPER_SKIP_LEVEL 0
#endif
+#define SWAPPER_BLOCK_SIZE (UL(1) << SWAPPER_BLOCK_SHIFT)
+#define SWAPPER_TABLE_SHIFT (SWAPPER_BLOCK_SHIFT + PAGE_SHIFT - 3)
+
+#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - SWAPPER_SKIP_LEVEL)
#define IDMAP_VA_BITS 48
#define IDMAP_LEVELS ARM64_HW_PGTABLE_LEVELS(IDMAP_VA_BITS)
@@ -53,24 +48,13 @@
#define EARLY_ENTRIES(vstart, vend, shift, add) \
(SPAN_NR_ENTRIES(vstart, vend, shift) + (add))
-#define EARLY_PGDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, PGDIR_SHIFT, add))
-
-#if SWAPPER_PGTABLE_LEVELS > 3
-#define EARLY_PUDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, PUD_SHIFT, add))
-#else
-#define EARLY_PUDS(vstart, vend, add) (0)
-#endif
+#define EARLY_LEVEL(lvl, vstart, vend, add) \
+ (SWAPPER_PGTABLE_LEVELS > lvl ? EARLY_ENTRIES(vstart, vend, SWAPPER_BLOCK_SHIFT + lvl * (PAGE_SHIFT - 3), add) : 0)
-#if SWAPPER_PGTABLE_LEVELS > 2
-#define EARLY_PMDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, SWAPPER_TABLE_SHIFT, add))
-#else
-#define EARLY_PMDS(vstart, vend, add) (0)
-#endif
-
-#define EARLY_PAGES(vstart, vend, add) ( 1 /* PGDIR page */ \
- + EARLY_PGDS((vstart), (vend), add) /* each PGDIR needs a next level page table */ \
- + EARLY_PUDS((vstart), (vend), add) /* each PUD needs a next level page table */ \
- + EARLY_PMDS((vstart), (vend), add)) /* each PMD needs a next level page table */
+#define EARLY_PAGES(vstart, vend, add) (1 /* PGDIR page */ \
+ + EARLY_LEVEL(3, (vstart), (vend), add) /* each entry needs a next level page table */ \
+ + EARLY_LEVEL(2, (vstart), (vend), add) /* each entry needs a next level page table */ \
+ + EARLY_LEVEL(1, (vstart), (vend), add))/* each entry needs a next level page table */
#define INIT_DIR_SIZE (PAGE_SIZE * (EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE) + EARLY_SEGMENT_EXTRA_PAGES))
/* the initial ID map may need two extra pages if it needs to be extended */
@@ -81,17 +65,6 @@
#endif
#define INIT_IDMAP_DIR_PAGES EARLY_PAGES(KIMAGE_VADDR, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE, 1)
-/* Initial memory map size */
-#ifdef CONFIG_ARM64_4K_PAGES
-#define SWAPPER_BLOCK_SHIFT PMD_SHIFT
-#define SWAPPER_BLOCK_SIZE PMD_SIZE
-#define SWAPPER_TABLE_SHIFT PUD_SHIFT
-#else
-#define SWAPPER_BLOCK_SHIFT PAGE_SHIFT
-#define SWAPPER_BLOCK_SIZE PAGE_SIZE
-#define SWAPPER_TABLE_SHIFT PMD_SHIFT
-#endif
-
/* The number of segments in the kernel image (text, rodata, inittext, initdata, data+bss) */
#define KERNEL_SEGMENT_COUNT 5
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 36/39] arm64: kernel: Create initial ID map from C code
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (34 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 35/39] arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 37/39] arm64: mm: avoid fixmap for early swapper_pg_dir updates Ard Biesheuvel
` (3 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
The asm code that creates the initial ID map is rather intricate and
hard to follow. This is problematic because it makes adding support for
things like LPA2 or WXN more difficult than necessary. Also, it is
parameterized like the rest of the MM code to run with a configurable
number of levels, which is rather pointless: all AArch64 CPUs implement
support for 48-bit virtual addressing, and many systems exist with DRAM
located outside of the 39-bit addressable range, which is the only
smaller VA size that is widely used, so additional tricks are needed to
make things work in that combination.
So let's bite the bullet, rip out all the asm macros and fiddly code,
and replace it with a C implementation based on the newly added
routines for creating the early kernel VA mappings. And while at it,
create the initial ID map based on 48-bit virtual addressing as well,
regardless of the number of configured levels for the kernel proper.
Note that this code may execute with the MMU and caches disabled, and
is therefore not permitted to make unaligned accesses. The algorithm as
implemented shouldn't produce any in practice, but let's pass
-mstrict-align to the compiler to be sure.
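For orientation before wading into the diff: the core of the C
replacement boils down to two map_range() calls. This is the
create_init_idmap() routine this patch adds to
arch/arm64/kernel/pi/map_range.c, reproduced here with a few extra
comments:

asmlinkage u64 __init create_init_idmap(pgd_t *pg_dir)
{
	/* page table pages are carved off sequentially after the root table */
	u64 ptep = (u64)pg_dir + PAGE_SIZE;

	/* ID map text/rodata read-only and executable ... */
	map_range(&ptep, (u64)_stext, (u64)__initdata_begin, (u64)_stext,
		  PAGE_KERNEL_ROX, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);
	/* ... and everything from __initdata_begin up to _end read-write */
	map_range(&ptep, (u64)__initdata_begin, (u64)_end, (u64)__initdata_begin,
		  PAGE_KERNEL, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);

	/* end of the page table pages actually used */
	return ptep;
}

The returned value is what primary_entry() passes as the end address to
dcache_inval_poc() when the tables were populated with the MMU off.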
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/assembler.h | 14 -
arch/arm64/include/asm/kernel-pgtable.h | 50 ++--
arch/arm64/include/asm/mmu_context.h | 6 +-
arch/arm64/kernel/head.S | 267 ++------------------
arch/arm64/kernel/image-vars.h | 1 +
arch/arm64/kernel/pi/Makefile | 3 +
arch/arm64/kernel/pi/map_kernel.c | 18 ++
arch/arm64/kernel/pi/map_range.c | 12 +
arch/arm64/kernel/pi/pi.h | 4 +
arch/arm64/mm/mmu.c | 5 -
arch/arm64/mm/proc.S | 3 +-
11 files changed, 88 insertions(+), 295 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 376a980f2bad..beb53bbd8c19 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -345,20 +345,6 @@ alternative_cb_end
bfi \valreg, \t1sz, #TCR_T1SZ_OFFSET, #TCR_TxSZ_WIDTH
.endm
-/*
- * idmap_get_t0sz - get the T0SZ value needed to cover the ID map
- *
- * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
- * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
- * this number conveniently equals the number of leading zeroes in
- * the physical address of _end.
- */
- .macro idmap_get_t0sz, reg
- adrp \reg, _end
- orr \reg, \reg, #(1 << VA_BITS_MIN) - 1
- clz \reg, \reg
- .endm
-
/*
* tcr_compute_pa_size - set TCR.(I)PS to the highest supported
* ID_AA64MMFR0_EL1.PARange value
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index f1fc98a233d5..bf05a77873a4 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -29,6 +29,7 @@
#define SWAPPER_TABLE_SHIFT (SWAPPER_BLOCK_SHIFT + PAGE_SHIFT - 3)
#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - SWAPPER_SKIP_LEVEL)
+#define INIT_IDMAP_PGTABLE_LEVELS (IDMAP_LEVELS - SWAPPER_SKIP_LEVEL)
#define IDMAP_VA_BITS 48
#define IDMAP_LEVELS ARM64_HW_PGTABLE_LEVELS(IDMAP_VA_BITS)
@@ -48,44 +49,39 @@
#define EARLY_ENTRIES(vstart, vend, shift, add) \
(SPAN_NR_ENTRIES(vstart, vend, shift) + (add))
-#define EARLY_LEVEL(lvl, vstart, vend, add) \
- (SWAPPER_PGTABLE_LEVELS > lvl ? EARLY_ENTRIES(vstart, vend, SWAPPER_BLOCK_SHIFT + lvl * (PAGE_SHIFT - 3), add) : 0)
+#define EARLY_LEVEL(lvl, lvls, vstart, vend, add) \
+ (lvls > lvl ? EARLY_ENTRIES(vstart, vend, SWAPPER_BLOCK_SHIFT + lvl * (PAGE_SHIFT - 3), add) : 0)
-#define EARLY_PAGES(vstart, vend, add) (1 /* PGDIR page */ \
- + EARLY_LEVEL(3, (vstart), (vend), add) /* each entry needs a next level page table */ \
- + EARLY_LEVEL(2, (vstart), (vend), add) /* each entry needs a next level page table */ \
- + EARLY_LEVEL(1, (vstart), (vend), add))/* each entry needs a next level page table */
-#define INIT_DIR_SIZE (PAGE_SIZE * (EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE) + EARLY_SEGMENT_EXTRA_PAGES))
+#define EARLY_PAGES(lvls, vstart, vend, add) (1 /* PGDIR page */ \
+ + EARLY_LEVEL(3, (lvls), (vstart), (vend), add) /* each entry needs a next level page table */ \
+ + EARLY_LEVEL(2, (lvls), (vstart), (vend), add) /* each entry needs a next level page table */ \
+ + EARLY_LEVEL(1, (lvls), (vstart), (vend), add))/* each entry needs a next level page table */
+#define INIT_DIR_SIZE (PAGE_SIZE * (EARLY_PAGES(SWAPPER_PGTABLE_LEVELS, KIMAGE_VADDR, _end, EXTRA_PAGE) \
+ + EARLY_SEGMENT_EXTRA_PAGES))
-/* the initial ID map may need two extra pages if it needs to be extended */
-#if VA_BITS < 48
-#define INIT_IDMAP_DIR_SIZE ((INIT_IDMAP_DIR_PAGES + 2) * PAGE_SIZE)
-#else
-#define INIT_IDMAP_DIR_SIZE (INIT_IDMAP_DIR_PAGES * PAGE_SIZE)
-#endif
-#define INIT_IDMAP_DIR_PAGES EARLY_PAGES(KIMAGE_VADDR, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE, 1)
+#define INIT_IDMAP_DIR_PAGES (EARLY_PAGES(INIT_IDMAP_PGTABLE_LEVELS, KIMAGE_VADDR, _end, 1))
+#define INIT_IDMAP_DIR_SIZE ((INIT_IDMAP_DIR_PAGES + EARLY_IDMAP_EXTRA_PAGES) * PAGE_SIZE)
+
+#define INIT_IDMAP_FDT_PAGES (EARLY_PAGES(INIT_IDMAP_PGTABLE_LEVELS, 0UL, UL(MAX_FDT_SIZE), 1) - 1)
+#define INIT_IDMAP_FDT_SIZE ((INIT_IDMAP_FDT_PAGES + EARLY_IDMAP_EXTRA_FDT_PAGES) * PAGE_SIZE)
/* The number of segments in the kernel image (text, rodata, inittext, initdata, data+bss) */
#define KERNEL_SEGMENT_COUNT 5
#if SWAPPER_BLOCK_SIZE > SEGMENT_ALIGN
#define EARLY_SEGMENT_EXTRA_PAGES (KERNEL_SEGMENT_COUNT + 1)
-#else
-#define EARLY_SEGMENT_EXTRA_PAGES 0
-#endif
-
/*
- * Initial memory map attributes.
+ * The initial ID map consists of the kernel image, mapped as two separate
+ * segments, and may appear misaligned wrt the swapper block size. This means
+ * we need 3 additional pages. The DT could straddle a swapper block boundary,
+ * so it may need 2.
*/
-#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_UXN)
-#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S | PTE_UXN)
-
-#ifdef CONFIG_ARM64_4K_PAGES
-#define SWAPPER_RW_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS | PTE_WRITE)
-#define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PMD_SECT_RDONLY)
+#define EARLY_IDMAP_EXTRA_PAGES 3
+#define EARLY_IDMAP_EXTRA_FDT_PAGES 2
#else
-#define SWAPPER_RW_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS | PTE_WRITE)
-#define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PTE_RDONLY)
+#define EARLY_SEGMENT_EXTRA_PAGES 0
+#define EARLY_IDMAP_EXTRA_PAGES 0
+#define EARLY_IDMAP_EXTRA_FDT_PAGES 0
#endif
#endif /* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 9ce4200508b1..3c487948b94e 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -61,11 +61,9 @@ static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
}
/*
- * TCR.T0SZ value to use when the ID map is active. Usually equals
- * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
- * physical memory, in which case it will be smaller.
+ * TCR.T0SZ value to use when the ID map is active.
*/
-extern int idmap_t0sz;
+#define idmap_t0sz TCR_T0SZ(IDMAP_VA_BITS)
/*
* Ensure TCR.T0SZ is set to the provided value.
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a1c29d64e875..545b5d8976f4 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -80,26 +80,42 @@
* x19 primary_entry() .. start_kernel() whether we entered with the MMU on
* x20 primary_entry() .. __primary_switch() CPU boot mode
* x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0
- * x22 create_idmap() .. start_kernel() ID map VA of the DT blob
* x25 primary_entry() .. start_kernel() supported VA size
- * x28 create_idmap() callee preserved temp register
*/
SYM_CODE_START(primary_entry)
bl record_mmu_state
bl preserve_boot_args
- bl create_idmap
+
+ adrp x1, early_init_stack
+ mov sp, x1
+ mov x29, xzr
+ adrp x0, init_idmap_pg_dir
+ bl __pi_create_init_idmap
+
+ /*
+ * If the page tables have been populated with non-cacheable
+ * accesses (MMU disabled), invalidate those tables again to
+ * remove any speculatively loaded cache lines.
+ */
+ cbnz x19, 0f
+ dmb sy
+ mov x1, x0 // end of used region
+ adrp x0, init_idmap_pg_dir
+ adr_l x2, dcache_inval_poc
+ blr x2
+ b 1f
/*
* If we entered with the MMU and caches on, clean the ID mapped part
* of the primary boot code to the PoC so we can safely execute it with
* the MMU off.
*/
- cbz x19, 0f
- adrp x0, __idmap_text_start
+0: adrp x0, __idmap_text_start
adr_l x1, __idmap_text_end
adr_l x2, dcache_clean_poc
blr x2
-0: mov x0, x19
+
+1: mov x0, x19
bl init_kernel_el // w0=cpu_boot_mode
mov x20, x0
@@ -175,238 +191,6 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
ret
SYM_CODE_END(preserve_boot_args)
-/*
- * Macro to populate page table entries, these entries can be pointers to the next level
- * or last level entries pointing to physical memory.
- *
- * tbl: page table address
- * rtbl: pointer to page table or physical memory
- * index: start index to write
- * eindex: end index to write - [index, eindex] written to
- * flags: flags for pagetable entry to or in
- * inc: increment to rtbl between each entry
- * tmp1: temporary variable
- *
- * Preserves: tbl, eindex, flags, inc
- * Corrupts: index, tmp1
- * Returns: rtbl
- */
- .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
-.Lpe\@: phys_to_pte \tmp1, \rtbl
- orr \tmp1, \tmp1, \flags // tmp1 = table entry
- str \tmp1, [\tbl, \index, lsl #3]
- add \rtbl, \rtbl, \inc // rtbl = pa next level
- add \index, \index, #1
- cmp \index, \eindex
- b.ls .Lpe\@
- .endm
-
-/*
- * Compute indices of table entries from virtual address range. If multiple entries
- * were needed in the previous page table level then the next page table level is assumed
- * to be composed of multiple pages. (This effectively scales the end index).
- *
- * vstart: virtual address of start of range
- * vend: virtual address of end of range - we map [vstart, vend]
- * shift: shift used to transform virtual address into index
- * order: #imm 2log(number of entries in page table)
- * istart: index in table corresponding to vstart
- * iend: index in table corresponding to vend
- * count: On entry: how many extra entries were required in previous level, scales
- * our end index.
- * On exit: returns how many extra entries required for next page table level
- *
- * Preserves: vstart, vend
- * Returns: istart, iend, count
- */
- .macro compute_indices, vstart, vend, shift, order, istart, iend, count
- ubfx \istart, \vstart, \shift, \order
- ubfx \iend, \vend, \shift, \order
- add \iend, \iend, \count, lsl \order
- sub \count, \iend, \istart
- .endm
-
-/*
- * Map memory for specified virtual address range. Each level of page table needed supports
- * multiple entries. If a level requires n entries the next page table level is assumed to be
- * formed from n pages.
- *
- * tbl: location of page table
- * rtbl: address to be used for first level page table entry (typically tbl + PAGE_SIZE)
- * vstart: virtual address of start of range
- * vend: virtual address of end of range - we map [vstart, vend - 1]
- * flags: flags to use to map last level entries
- * phys: physical address corresponding to vstart - physical memory is contiguous
- * order: #imm 2log(number of entries in PGD table)
- *
- * If extra_shift is set, an extra level will be populated if the end address does
- * not fit in 'extra_shift' bits. This assumes vend is in the TTBR0 range.
- *
- * Temporaries: istart, iend, tmp, count, sv - these need to be different registers
- * Preserves: vstart, flags
- * Corrupts: tbl, rtbl, vend, istart, iend, tmp, count, sv
- */
- .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, order, istart, iend, tmp, count, sv, extra_shift
- sub \vend, \vend, #1
- add \rtbl, \tbl, #PAGE_SIZE
- mov \count, #0
-
- .ifnb \extra_shift
- tst \vend, #~((1 << (\extra_shift)) - 1)
- b.eq .L_\@
- compute_indices \vstart, \vend, #\extra_shift, #(PAGE_SHIFT - 3), \istart, \iend, \count
- mov \sv, \rtbl
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
- mov \tbl, \sv
- .endif
-.L_\@:
- compute_indices \vstart, \vend, #PGDIR_SHIFT, #\order, \istart, \iend, \count
- mov \sv, \rtbl
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
- mov \tbl, \sv
-
-#if SWAPPER_PGTABLE_LEVELS > 3
- compute_indices \vstart, \vend, #PUD_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count
- mov \sv, \rtbl
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
- mov \tbl, \sv
-#endif
-
-#if SWAPPER_PGTABLE_LEVELS > 2
- compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count
- mov \sv, \rtbl
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
- mov \tbl, \sv
-#endif
-
- compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count
- bic \rtbl, \phys, #SWAPPER_BLOCK_SIZE - 1
- populate_entries \tbl, \rtbl, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
- .endm
-
-/*
- * Remap a subregion created with the map_memory macro with modified attributes
- * or output address. The entire remapped region must have been covered in the
- * invocation of map_memory.
- *
- * x0: last level table address (returned in first argument to map_memory)
- * x1: start VA of the existing mapping
- * x2: start VA of the region to update
- * x3: end VA of the region to update (exclusive)
- * x4: start PA associated with the region to update
- * x5: attributes to set on the updated region
- * x6: order of the last level mappings
- */
-SYM_FUNC_START_LOCAL(remap_region)
- sub x3, x3, #1 // make end inclusive
-
- // Get the index offset for the start of the last level table
- lsr x1, x1, x6
- bfi x1, xzr, #0, #PAGE_SHIFT - 3
-
- // Derive the start and end indexes into the last level table
- // associated with the provided region
- lsr x2, x2, x6
- lsr x3, x3, x6
- sub x2, x2, x1
- sub x3, x3, x1
-
- mov x1, #1
- lsl x6, x1, x6 // block size at this level
-
- populate_entries x0, x4, x2, x3, x5, x6, x7
- ret
-SYM_FUNC_END(remap_region)
-
-SYM_FUNC_START_LOCAL(create_idmap)
- mov x28, lr
- /*
- * The ID map carries a 1:1 mapping of the physical address range
- * covered by the loaded image, which could be anywhere in DRAM. This
- * means that the required size of the VA (== PA) space is decided at
- * boot time, and could be more than the configured size of the VA
- * space for ordinary kernel and user space mappings.
- *
- * There are three cases to consider here:
- * - 39 <= VA_BITS < 48, and the ID map needs up to 48 VA bits to cover
- * the placement of the image. In this case, we configure one extra
- * level of translation on the fly for the ID map only. (This case
- * also covers 42-bit VA/52-bit PA on 64k pages).
- *
- * - VA_BITS == 48, and the ID map needs more than 48 VA bits. This can
- * only happen when using 64k pages, in which case we need to extend
- * the root level table rather than add a level. Note that we can
- * treat this case as 'always extended' as long as we take care not
- * to program an unsupported T0SZ value into the TCR register.
- *
- * - Combinations that would require two additional levels of
- * translation are not supported, e.g., VA_BITS==36 on 16k pages, or
- * VA_BITS==39/4k pages with 5-level paging, where the input address
- * requires more than 47 or 48 bits, respectively.
- */
-#if (VA_BITS < 48)
-#define IDMAP_PGD_ORDER (VA_BITS - PGDIR_SHIFT)
-#define EXTRA_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3)
-
- /*
- * If VA_BITS < 48, we have to configure an additional table level.
- * First, we have to verify our assumption that the current value of
- * VA_BITS was chosen such that all translation levels are fully
- * utilised, and that lowering T0SZ will always result in an additional
- * translation level to be configured.
- */
-#if VA_BITS != EXTRA_SHIFT
-#error "Mismatch between VA_BITS and page size/number of translation levels"
-#endif
-#else
-#define IDMAP_PGD_ORDER (PHYS_MASK_SHIFT - PGDIR_SHIFT)
-#define EXTRA_SHIFT
- /*
- * If VA_BITS == 48, we don't have to configure an additional
- * translation level, but the top-level table has more entries.
- */
-#endif
- adrp x0, init_idmap_pg_dir
- adrp x3, _text
- adrp x6, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE
- mov_q x7, SWAPPER_RX_MMUFLAGS
-
- map_memory x0, x1, x3, x6, x7, x3, IDMAP_PGD_ORDER, x10, x11, x12, x13, x14, EXTRA_SHIFT
-
- /* Remap [.init].data, BSS and the kernel page tables r/w in the ID map */
- adrp x1, _text
- adrp x2, __initdata_begin
- adrp x3, _end
- bic x4, x2, #SWAPPER_BLOCK_SIZE - 1
- mov_q x5, SWAPPER_RW_MMUFLAGS
- mov x6, #SWAPPER_BLOCK_SHIFT
- bl remap_region
-
- /* Remap the FDT after the kernel image */
- adrp x1, _text
- adrp x22, _end + SWAPPER_BLOCK_SIZE
- bic x2, x22, #SWAPPER_BLOCK_SIZE - 1
- bfi x22, x21, #0, #SWAPPER_BLOCK_SHIFT // remapped FDT address
- add x3, x2, #MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE
- bic x4, x21, #SWAPPER_BLOCK_SIZE - 1
- mov_q x5, SWAPPER_RW_MMUFLAGS
- mov x6, #SWAPPER_BLOCK_SHIFT
- bl remap_region
-
- /*
- * Since the page tables have been populated with non-cacheable
- * accesses (MMU disabled), invalidate those tables again to
- * remove any speculatively loaded cache lines.
- */
- cbnz x19, 0f // skip cache invalidation if MMU is on
- dmb sy
-
- adrp x0, init_idmap_pg_dir
- adrp x1, init_idmap_pg_end
- bl dcache_inval_poc
-0: ret x28
-SYM_FUNC_END(create_idmap)
-
/*
* Initialize CPU registers with task-specific and cpu-specific context.
*
@@ -729,11 +513,6 @@ SYM_FUNC_START_LOCAL(__no_granule_support)
SYM_FUNC_END(__no_granule_support)
SYM_FUNC_START_LOCAL(__primary_switch)
- mrs x1, tcr_el1
- mov x2, #64 - VA_BITS
- tcr_set_t0sz x1, x2
- msr tcr_el1, x1
-
adrp x1, reserved_pg_dir
adrp x2, init_idmap_pg_dir
bl __enable_mmu
@@ -742,7 +521,7 @@ SYM_FUNC_START_LOCAL(__primary_switch)
mov sp, x1
mov x29, xzr
mov x0, x20 // pass the full boot status
- mov x1, x22 // pass the low FDT mapping
+ mov x1, x21 // pass the FDT
bl __pi_early_map_kernel // Map and relocate the kernel
ldr x8, =__primary_switched
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index e566b32f9c22..941a14c05184 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -52,6 +52,7 @@ PROVIDE(__pi_cavium_erratum_27456_cpus = cavium_erratum_27456_cpus);
PROVIDE(__pi__ctype = _ctype);
PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
+PROVIDE(__pi_init_idmap_pg_dir = init_idmap_pg_dir);
PROVIDE(__pi_init_pg_dir = init_pg_dir);
PROVIDE(__pi_init_pg_end = init_pg_end);
diff --git a/arch/arm64/kernel/pi/Makefile b/arch/arm64/kernel/pi/Makefile
index 8c2f80a46b93..4393b41f0b71 100644
--- a/arch/arm64/kernel/pi/Makefile
+++ b/arch/arm64/kernel/pi/Makefile
@@ -11,6 +11,9 @@ KBUILD_CFLAGS := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) -fpie \
-fno-asynchronous-unwind-tables -fno-unwind-tables \
$(call cc-option,-fno-addrsig)
+# this code may run with the MMU off so disable unaligned accesses
+CFLAGS_map_range.o += -mstrict-align
+
# remove SCS flags from all objects in this directory
KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
# disable LTO
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
index afb6a337fa48..a9aad3a8b575 100644
--- a/arch/arm64/kernel/pi/map_kernel.c
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -129,6 +129,22 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
}
}
+static void __init map_fdt(u64 fdt)
+{
+ static u8 ptes[INIT_IDMAP_FDT_SIZE] __initdata __aligned(PAGE_SIZE);
+ u64 efdt = fdt + MAX_FDT_SIZE;
+ u64 ptep = (u64)ptes;
+
+ /*
+ * Map up to MAX_FDT_SIZE bytes, but avoid overlap with
+ * the kernel image.
+ */
+ map_range(&ptep, fdt, (u64)_text > fdt ? min((u64)_text, efdt) : efdt,
+ fdt, PAGE_KERNEL, IDMAP_ROOT_LEVEL,
+ (pte_t *)init_idmap_pg_dir, false, 0);
+ dsb(ishst);
+}
+
asmlinkage void __init early_map_kernel(u64 boot_status, void *fdt)
{
static char const chosen_str[] __initconst = "/chosen";
@@ -137,6 +153,8 @@ asmlinkage void __init early_map_kernel(u64 boot_status, void *fdt)
int root_level = 4 - CONFIG_PGTABLE_LEVELS;
int chosen;
+ map_fdt((u64)fdt);
+
/* Clear BSS and the initial page tables */
memset(__bss_start, 0, (u64)init_pg_end - (u64)__bss_start);
diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
index c31feda18f47..79e4f6a2efe1 100644
--- a/arch/arm64/kernel/pi/map_range.c
+++ b/arch/arm64/kernel/pi/map_range.c
@@ -86,3 +86,15 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
tbl++;
}
}
+
+asmlinkage u64 __init create_init_idmap(pgd_t *pg_dir)
+{
+ u64 ptep = (u64)pg_dir + PAGE_SIZE;
+
+ map_range(&ptep, (u64)_stext, (u64)__initdata_begin, (u64)_stext,
+ PAGE_KERNEL_ROX, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);
+ map_range(&ptep, (u64)__initdata_begin, (u64)_end, (u64)__initdata_begin,
+ PAGE_KERNEL, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);
+
+ return ptep;
+}
diff --git a/arch/arm64/kernel/pi/pi.h b/arch/arm64/kernel/pi/pi.h
index 8d3d3ba44c27..04a1f576baee 100644
--- a/arch/arm64/kernel/pi/pi.h
+++ b/arch/arm64/kernel/pi/pi.h
@@ -17,6 +17,8 @@ static inline void *prel64_to_pointer(const prel64_t *offset)
extern bool dynamic_scs_is_enabled;
+extern pgd_t init_idmap_pg_dir[];
+
void init_feature_override(u64 boot_status, const void *fdt, int chosen);
u64 kaslr_early_init(void *fdt, int chosen);
void relocate_kernel(u64 offset);
@@ -26,3 +28,5 @@ void map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot,
int level, pte_t *tbl, bool may_use_cont, u64 va_offset);
asmlinkage void early_map_kernel(u64 boot_status, void *fdt);
+
+asmlinkage u64 create_init_idmap(pgd_t *pgd);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 94847dbd31cd..cb6b756dbb56 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -45,8 +45,6 @@
#define NO_CONT_MAPPINGS BIT(1)
#define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
-int idmap_t0sz __ro_after_init;
-
#if VA_BITS > 48
u64 vabits_actual __ro_after_init = VA_BITS_MIN;
EXPORT_SYMBOL(vabits_actual);
@@ -790,8 +788,6 @@ void __init paging_init(void)
pgd_t *pgdp = pgd_set_fixmap(__pa_symbol(swapper_pg_dir));
extern pgd_t init_idmap_pg_dir[];
- idmap_t0sz = 63UL - __fls(__pa_symbol(_end) | GENMASK(VA_BITS_MIN - 1, 0));
-
map_kernel(pgdp);
map_mem(pgdp);
@@ -806,7 +802,6 @@ void __init paging_init(void)
memblock_allow_resize();
create_idmap();
- idmap_t0sz = TCR_T0SZ(IDMAP_VA_BITS);
}
#ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 47ede52bb900..55c366dbda8f 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -200,7 +200,8 @@ SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define KPTI_NG_PTE_FLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS | PTE_WRITE)
+#define KPTI_NG_PTE_FLAGS (PTE_ATTRINDX(MT_NORMAL) | PTE_TYPE_PAGE | \
+ PTE_AF | PTE_SHARED | PTE_UXN | PTE_WRITE)
.pushsection ".idmap.text", "a"
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 37/39] arm64: mm: avoid fixmap for early swapper_pg_dir updates
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (35 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 36/39] arm64: kernel: Create initial ID map from C code Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 38/39] arm64: mm: omit redundant remap of kernel image Ard Biesheuvel
` (2 subsequent siblings)
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Early in the boot, when .rodata is still writable, we can poke
swapper_pg_dir entries directly, and there is no need to go through the
fixmap. After a future patch, we will enter the kernel with
swapper_pg_dir already active, and early swapper_pg_dir updates for
creating the fixmap page table hierarchy itself cannot go through the
fixmap for obvious reasons. So let's keep track of whether rodata is
writable, and update the descriptor directly in that case.
As the same reasoning applies to early KASAN init, make the function
noinstr as well.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/mm/mmu.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index cb6b756dbb56..d36ba0811f5e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -55,6 +55,8 @@ EXPORT_SYMBOL(kimage_voffset);
u32 __boot_cpu_mode[] = { BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1 };
+static bool rodata_is_rw __ro_after_init = true;
+
/*
* The booting CPU updates the failed status @__early_cpu_boot_status,
* with MMU turned off.
@@ -71,10 +73,21 @@ EXPORT_SYMBOL(empty_zero_page);
static DEFINE_SPINLOCK(swapper_pgdir_lock);
static DEFINE_MUTEX(fixmap_lock);
-void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
+void noinstr set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
{
pgd_t *fixmap_pgdp;
+ /*
+ * Don't bother with the fixmap if swapper_pg_dir is still mapped
+ * writable in the kernel mapping.
+ */
+ if (rodata_is_rw) {
+ WRITE_ONCE(*pgdp, pgd);
+ dsb(ishst);
+ isb();
+ return;
+ }
+
spin_lock(&swapper_pgdir_lock);
fixmap_pgdp = pgd_set_fixmap(__pa_symbol(pgdp));
WRITE_ONCE(*fixmap_pgdp, pgd);
@@ -628,6 +641,7 @@ void mark_rodata_ro(void)
* to cover NOTES and EXCEPTION_TABLE.
*/
section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
+ WRITE_ONCE(rodata_is_rw, false);
update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
section_size, PAGE_KERNEL_RO);
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 38/39] arm64: mm: omit redundant remap of kernel image
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (36 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 37/39] arm64: mm: avoid fixmap for early swapper_pg_dir updates Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 39/39] arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()" Ard Biesheuvel
2023-11-24 16:22 ` [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
Now that the early kernel mapping is created with all the right
attributes and segment boundaries, there is no longer a need to recreate
it and switch to it. This also means we no longer have to copy the kasan
shadow or some parts of the fixmap from one set of page tables to the
other.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/fixmap.h | 1 -
arch/arm64/include/asm/kasan.h | 2 -
arch/arm64/include/asm/mmu.h | 2 +-
arch/arm64/kernel/image-vars.h | 1 +
arch/arm64/kernel/pi/map_kernel.c | 6 +-
arch/arm64/mm/fixmap.c | 34 --------
arch/arm64/mm/kasan_init.c | 15 ----
arch/arm64/mm/mmu.c | 85 ++++----------------
8 files changed, 21 insertions(+), 125 deletions(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 58c294a96676..8aabd45e9a13 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -100,7 +100,6 @@ enum fixed_addresses {
#define FIXMAP_PAGE_IO __pgprot(PROT_DEVICE_nGnRE)
void __init early_fixmap_init(void);
-void __init fixmap_copy(pgd_t *pgdir);
#define __early_set_fixmap __set_fixmap
diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index 12d5f47f7dbe..ab52688ac4bd 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -36,12 +36,10 @@ void kasan_init(void);
#define _KASAN_SHADOW_START(va) (KASAN_SHADOW_END - (1UL << ((va) - KASAN_SHADOW_SCALE_SHIFT)))
#define KASAN_SHADOW_START _KASAN_SHADOW_START(vabits_actual)
-void kasan_copy_shadow(pgd_t *pgdir);
asmlinkage void kasan_early_init(void);
#else
static inline void kasan_init(void) { }
-static inline void kasan_copy_shadow(pgd_t *pgdir) { }
#endif
#endif
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index d0b8b4b413b6..65977c7783c5 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -110,7 +110,7 @@ static inline bool kaslr_requires_kpti(void)
}
#define INIT_MM_CONTEXT(name) \
- .pgd = init_pg_dir,
+ .pgd = swapper_pg_dir,
#endif /* !__ASSEMBLY__ */
#endif
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 941a14c05184..e140c5bda90b 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -55,6 +55,7 @@ PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
PROVIDE(__pi_init_idmap_pg_dir = init_idmap_pg_dir);
PROVIDE(__pi_init_pg_dir = init_pg_dir);
PROVIDE(__pi_init_pg_end = init_pg_end);
+PROVIDE(__pi_swapper_pg_dir = swapper_pg_dir);
PROVIDE(__pi__text = _text);
PROVIDE(__pi__stext = _stext);
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
index a9aad3a8b575..d5b345d68f07 100644
--- a/arch/arm64/kernel/pi/map_kernel.c
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -125,8 +125,12 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
text_prot, true, root_level);
map_segment(init_pg_dir, NULL, va_offset, __inittext_begin,
__inittext_end, text_prot, false, root_level);
- dsb(ishst);
}
+
+ /* Copy the root page table to its final location */
+ memcpy((void *)swapper_pg_dir + va_offset, init_pg_dir, PGD_SIZE);
+ dsb(ishst);
+ idmap_cpu_replace_ttbr1(swapper_pg_dir);
}
static void __init map_fdt(u64 fdt)
diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c
index c0a3301203bd..9436a12e1882 100644
--- a/arch/arm64/mm/fixmap.c
+++ b/arch/arm64/mm/fixmap.c
@@ -167,37 +167,3 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
return dt_virt;
}
-
-/*
- * Copy the fixmap region into a new pgdir.
- */
-void __init fixmap_copy(pgd_t *pgdir)
-{
- if (!READ_ONCE(pgd_val(*pgd_offset_pgd(pgdir, FIXADDR_TOT_START)))) {
- /*
- * The fixmap falls in a separate pgd to the kernel, and doesn't
- * live in the carveout for the swapper_pg_dir. We can simply
- * re-use the existing dir for the fixmap.
- */
- set_pgd(pgd_offset_pgd(pgdir, FIXADDR_TOT_START),
- READ_ONCE(*pgd_offset_k(FIXADDR_TOT_START)));
- } else if (CONFIG_PGTABLE_LEVELS > 3) {
- pgd_t *bm_pgdp;
- p4d_t *bm_p4dp;
- pud_t *bm_pudp;
- /*
- * The fixmap shares its top level pgd entry with the kernel
- * mapping. This can really only occur when we are running
- * with 16k/4 levels, so we can simply reuse the pud level
- * entry instead.
- */
- BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
- bm_pgdp = pgd_offset_pgd(pgdir, FIXADDR_TOT_START);
- bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_TOT_START);
- bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_TOT_START);
- pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
- pud_clear_fixmap();
- } else {
- BUG();
- }
-}
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 555285ebd5af..dd91f5942fd2 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -184,21 +184,6 @@ static void __init kasan_map_populate(unsigned long start, unsigned long end,
kasan_pgd_populate(start & PAGE_MASK, PAGE_ALIGN(end), node, false);
}
-/*
- * Copy the current shadow region into a new pgdir.
- */
-void __init kasan_copy_shadow(pgd_t *pgdir)
-{
- pgd_t *pgdp, *pgdp_new, *pgdp_end;
-
- pgdp = pgd_offset_k(KASAN_SHADOW_START);
- pgdp_end = pgd_offset_k(KASAN_SHADOW_END);
- pgdp_new = pgd_offset_pgd(pgdir, KASAN_SHADOW_START);
- do {
- set_pgd(pgdp_new, READ_ONCE(*pgdp));
- } while (pgdp++, pgdp_new++, pgdp != pgdp_end);
-}
-
static void __init clear_pgds(unsigned long start,
unsigned long end)
{
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d36ba0811f5e..bb9e13d54a5d 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -648,9 +648,9 @@ void mark_rodata_ro(void)
debug_checkwx();
}
-static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end,
- pgprot_t prot, struct vm_struct *vma,
- int flags, unsigned long vm_flags)
+static void __init declare_vma(struct vm_struct *vma,
+ void *va_start, void *va_end,
+ unsigned long vm_flags)
{
phys_addr_t pa_start = __pa_symbol(va_start);
unsigned long size = va_end - va_start;
@@ -658,9 +658,6 @@ static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end,
BUG_ON(!PAGE_ALIGNED(pa_start));
BUG_ON(!PAGE_ALIGNED(size));
- __create_pgd_mapping(pgdp, pa_start, (unsigned long)va_start, size, prot,
- early_pgtable_alloc, flags);
-
if (!(vm_flags & VM_NO_GUARD))
size += PAGE_SIZE;
@@ -673,12 +670,12 @@ static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end,
vm_area_add_early(vma);
}
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
static pgprot_t kernel_exec_prot(void)
{
return rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
}
-#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
static int __init map_entry_trampoline(void)
{
int i;
@@ -710,60 +707,17 @@ core_initcall(map_entry_trampoline);
#endif
/*
- * Open coded check for BTI, only for use to determine configuration
- * for early mappings for before the cpufeature code has run.
- */
-static bool arm64_early_this_cpu_has_bti(void)
-{
- u64 pfr1;
-
- if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
- return false;
-
- pfr1 = __read_sysreg_by_encoding(SYS_ID_AA64PFR1_EL1);
- return cpuid_feature_extract_unsigned_field(pfr1,
- ID_AA64PFR1_EL1_BT_SHIFT);
-}
-
-/*
- * Create fine-grained mappings for the kernel.
+ * Declare the VMA areas for the kernel
*/
-static void __init map_kernel(pgd_t *pgdp)
+static void __init declare_kernel_vmas(void)
{
- static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_inittext,
- vmlinux_initdata, vmlinux_data;
-
- /*
- * External debuggers may need to write directly to the text
- * mapping to install SW breakpoints. Allow this (only) when
- * explicitly requested with rodata=off.
- */
- pgprot_t text_prot = kernel_exec_prot();
-
- /*
- * If we have a CPU that supports BTI and a kernel built for
- * BTI then mark the kernel executable text as guarded pages
- * now so we don't have to rewrite the page tables later.
- */
- if (arm64_early_this_cpu_has_bti())
- text_prot = __pgprot_modify(text_prot, PTE_GP, PTE_GP);
+ static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
- /*
- * Only rodata will be remapped with different permissions later on,
- * all other segments are allowed to use contiguous mappings.
- */
- map_kernel_segment(pgdp, _stext, _etext, text_prot, &vmlinux_text, 0,
- VM_NO_GUARD);
- map_kernel_segment(pgdp, __start_rodata, __inittext_begin, PAGE_KERNEL,
- &vmlinux_rodata, NO_CONT_MAPPINGS, VM_NO_GUARD);
- map_kernel_segment(pgdp, __inittext_begin, __inittext_end, text_prot,
- &vmlinux_inittext, 0, VM_NO_GUARD);
- map_kernel_segment(pgdp, __initdata_begin, __initdata_end, PAGE_KERNEL,
- &vmlinux_initdata, 0, VM_NO_GUARD);
- map_kernel_segment(pgdp, _data, _end, PAGE_KERNEL, &vmlinux_data, 0, 0);
-
- fixmap_copy(pgdp);
- kasan_copy_shadow(pgdp);
+ declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[4], _data, _end, 0);
}
void __pi_map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot,
@@ -799,23 +753,12 @@ static void __init create_idmap(void)
void __init paging_init(void)
{
- pgd_t *pgdp = pgd_set_fixmap(__pa_symbol(swapper_pg_dir));
- extern pgd_t init_idmap_pg_dir[];
-
- map_kernel(pgdp);
- map_mem(pgdp);
-
- pgd_clear_fixmap();
-
- cpu_replace_ttbr1(lm_alias(swapper_pg_dir), init_idmap_pg_dir);
- init_mm.pgd = swapper_pg_dir;
-
- memblock_phys_free(__pa_symbol(init_pg_dir),
- __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir));
+ map_mem(swapper_pg_dir);
memblock_allow_resize();
create_idmap();
+ declare_kernel_vmas();
}
#ifdef CONFIG_MEMORY_HOTPLUG
--
2.43.0.rc1.413.gea7ed67945-goog
* [PATCH v5 39/39] arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()"
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (37 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 38/39] arm64: mm: omit redundant remap of kernel image Ard Biesheuvel
@ 2023-11-24 10:19 ` Ard Biesheuvel
2023-11-24 16:22 ` [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 10:19 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
From: Ard Biesheuvel <ardb@kernel.org>
This reverts commit 1682c45b920643c, which is no longer needed now that
we create the permanent kernel mapping directly during early boot.
This is not a clean revert, but the changes are straightforward.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/mmu_context.h | 19 +++++++------------
arch/arm64/mm/kasan_init.c | 4 ++--
2 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 3c487948b94e..d740e2ddd272 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -108,18 +108,13 @@ static inline void cpu_uninstall_idmap(void)
cpu_switch_mm(mm->pgd, mm);
}
-static inline void __cpu_install_idmap(pgd_t *idmap)
+static inline void cpu_install_idmap(void)
{
cpu_set_reserved_ttbr0();
local_flush_tlb_all();
cpu_set_idmap_tcr_t0sz();
- cpu_switch_mm(lm_alias(idmap), &init_mm);
-}
-
-static inline void cpu_install_idmap(void)
-{
- __cpu_install_idmap(idmap_pg_dir);
+ cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm);
}
/*
@@ -150,7 +145,7 @@ static inline void cpu_install_ttbr0(phys_addr_t ttbr0, unsigned long t0sz)
* Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
* avoiding the possibility of conflicting TLB entries being allocated.
*/
-static inline void __cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap, bool cnp)
+static inline void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
{
typedef void (ttbr_replace_func)(phys_addr_t);
extern ttbr_replace_func idmap_cpu_replace_ttbr1;
@@ -165,7 +160,7 @@ static inline void __cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap, bool cnp)
replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1);
- __cpu_install_idmap(idmap);
+ cpu_install_idmap();
/*
* We really don't want to take *any* exceptions while TTBR1 is
@@ -180,17 +175,17 @@ static inline void __cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap, bool cnp)
static inline void cpu_enable_swapper_cnp(void)
{
- __cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir, true);
+ __cpu_replace_ttbr1(lm_alias(swapper_pg_dir), true);
}
-static inline void cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap)
+static inline void cpu_replace_ttbr1(pgd_t *pgdp)
{
/*
* Only for early TTBR1 replacement before cpucaps are finalized and
* before we've decided whether to use CNP.
*/
WARN_ON(system_capabilities_finalized());
- __cpu_replace_ttbr1(pgdp, idmap, false);
+ __cpu_replace_ttbr1(pgdp, false);
}
/*
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index dd91f5942fd2..2ea0de4a8486 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -220,7 +220,7 @@ static void __init kasan_init_shadow(void)
*/
memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
dsb(ishst);
- cpu_replace_ttbr1(lm_alias(tmp_pg_dir), idmap_pg_dir);
+ cpu_replace_ttbr1(lm_alias(tmp_pg_dir));
clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
@@ -256,7 +256,7 @@ static void __init kasan_init_shadow(void)
PAGE_KERNEL_RO));
memset(kasan_early_shadow_page, KASAN_SHADOW_INIT, PAGE_SIZE);
- cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir);
+ cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
}
static void __init kasan_init_depth(void)
--
2.43.0.rc1.413.gea7ed67945-goog
* Re: [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support
2023-11-24 10:19 ` [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support Ard Biesheuvel
@ 2023-11-24 12:37 ` Marc Zyngier
2023-11-24 13:08 ` Ard Biesheuvel
0 siblings, 1 reply; 47+ messages in thread
From: Marc Zyngier @ 2023-11-24 12:37 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 10:19:09 +0000,
Ard Biesheuvel <ardb@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Add some helpers that will be used by the early kernel mapping code to
> check feature support on the local CPU. This permits the early kernel
> mapping to be created with the right attributes, removing the need for
> tearing it down and recreating it.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/arm64/include/asm/cpufeature.h | 44 ++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 99c01417e544..301d4d2211d5 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -921,6 +921,50 @@ static inline bool kaslr_disabled_cmdline(void)
> u32 get_kvm_ipa_limit(void);
> void dump_cpu_features(void);
>
> +static inline bool cpu_has_bti(void)
> +{
> + u64 pfr1;
> +
> + if (!IS_ENABLED(CONFIG_ARM64_BTI))
> + return false;
> +
> + pfr1 = read_cpuid(ID_AA64PFR1_EL1);
> + pfr1 &= ~id_aa64pfr1_override.mask;
> + pfr1 |= id_aa64pfr1_override.val;
There is a potential gotcha here. If the override failed because a
filter rejected it, val will be set to full ones for the size of the
failed override field, and the corresponding mask will be set to 0
(see match_options()).
A safer pattern would be:
pfr1 |= id_aa64pfr1_override.val & id_aa64pfr1_override.mask;
although we currently don't have such a filter for BTI nor PAC, so the
code is OK for now.
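A minimal standalone sketch of the difference, assuming the
failed-override encoding described above and, for simplicity, that the
rejected field lives in bits [3:0]:

#include <stdint.h>
#include <stdio.h>

struct ftr_override { uint64_t val, mask; };

int main(void)
{
	uint64_t pfr1 = 0;	/* CPU reports the feature as absent */
	/* failed override: field set to all ones in val, mask cleared */
	struct ftr_override ov = { .val = 0xf, .mask = 0x0 };

	uint64_t unsafe = (pfr1 & ~ov.mask) | ov.val;		   /* field becomes 0xf */
	uint64_t safe   = (pfr1 & ~ov.mask) | (ov.val & ov.mask); /* field stays 0    */

	printf("unsafe=%#llx safe=%#llx\n",
	       (unsigned long long)unsafe, (unsigned long long)safe);
	return 0;
}

With the first form a helper like cpu_has_bti() would see a nonzero
field and report the feature as present even though the override was
rejected; the second form leaves the CPU's value untouched.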
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support
2023-11-24 12:37 ` Marc Zyngier
@ 2023-11-24 13:08 ` Ard Biesheuvel
2023-11-24 13:48 ` Marc Zyngier
0 siblings, 1 reply; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 13:08 UTC (permalink / raw)
To: Marc Zyngier
Cc: Ard Biesheuvel, linux-arm-kernel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 at 13:37, Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 24 Nov 2023 10:19:09 +0000,
> Ard Biesheuvel <ardb@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Add some helpers that will be used by the early kernel mapping code to
> > check feature support on the local CPU. This permits the early kernel
> > mapping to be created with the right attributes, removing the need for
> > tearing it down and recreating it.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > arch/arm64/include/asm/cpufeature.h | 44 ++++++++++++++++++++
> > 1 file changed, 44 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index 99c01417e544..301d4d2211d5 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -921,6 +921,50 @@ static inline bool kaslr_disabled_cmdline(void)
> > u32 get_kvm_ipa_limit(void);
> > void dump_cpu_features(void);
> >
> > +static inline bool cpu_has_bti(void)
> > +{
> > + u64 pfr1;
> > +
> > + if (!IS_ENABLED(CONFIG_ARM64_BTI))
> > + return false;
> > +
> > + pfr1 = read_cpuid(ID_AA64PFR1_EL1);
> > + pfr1 &= ~id_aa64pfr1_override.mask;
> > + pfr1 |= id_aa64pfr1_override.val;
>
> There is a potential gotcha here. If the override failed because a
> filter rejected it, val will be set to full ones for the size of the
> failed override field, and the corresponding mask will be set to 0
> (see match_options()).
>
> A safer pattern would be:
>
> pfr1 |= id_aa64pfr1_override.val & id_aa64pfr1_override.mask;
>
> although we currently don't have such a filter for BTI nor PAC, so the
> code is OK for now.
>
This confuses me.
What is the point of a value/mask pair if the mask is applied to the
value, and not to the quantity that is being overridden? Surely, not
setting those bits in 'value' to begin with makes the whole mask
redundant, no?
If the filter is supposed to prevent the override from taking effect,
wouldn't it be better to use 0/0 for the mask/value pair?
(/me likely misses some context here)
* Re: [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support
2023-11-24 13:08 ` Ard Biesheuvel
@ 2023-11-24 13:48 ` Marc Zyngier
2023-11-25 8:59 ` Ard Biesheuvel
0 siblings, 1 reply; 47+ messages in thread
From: Marc Zyngier @ 2023-11-24 13:48 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Ard Biesheuvel, linux-arm-kernel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 13:08:33 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Fri, 24 Nov 2023 at 13:37, Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Fri, 24 Nov 2023 10:19:09 +0000,
> > Ard Biesheuvel <ardb@google.com> wrote:
> > >
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Add some helpers that will be used by the early kernel mapping code to
> > > check feature support on the local CPU. This permits the early kernel
> > > mapping to be created with the right attributes, removing the need for
> > > tearing it down and recreating it.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > > arch/arm64/include/asm/cpufeature.h | 44 ++++++++++++++++++++
> > > 1 file changed, 44 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > > index 99c01417e544..301d4d2211d5 100644
> > > --- a/arch/arm64/include/asm/cpufeature.h
> > > +++ b/arch/arm64/include/asm/cpufeature.h
> > > @@ -921,6 +921,50 @@ static inline bool kaslr_disabled_cmdline(void)
> > > u32 get_kvm_ipa_limit(void);
> > > void dump_cpu_features(void);
> > >
> > > +static inline bool cpu_has_bti(void)
> > > +{
> > > + u64 pfr1;
> > > +
> > > + if (!IS_ENABLED(CONFIG_ARM64_BTI))
> > > + return false;
> > > +
> > > + pfr1 = read_cpuid(ID_AA64PFR1_EL1);
> > > + pfr1 &= ~id_aa64pfr1_override.mask;
> > > + pfr1 |= id_aa64pfr1_override.val;
> >
> > There is a potential gotcha here. If the override failed because a
> > filter rejected it, val will be set to full ones for the size of the
> > failed override field, and the corresponding mask will be set to 0
> > (see match_options()).
> >
> > A safer pattern would be:
> >
> > pfr1 |= id_aa64pfr1_override.val & id_aa64pfr1_override.mask;
> >
> > although we currently don't have such a filter for BTI nor PAC, so the
> > code is OK for now.
> >
>
> This confuses me.
>
> What is the point of a value/mask pair if the mask is applied to the
> value, and not to the quantity that is being overridden? Surely, not
> setting those bits in 'value' to begin with makes the whole mask
> redundant, no?
>
> If the filter is supposed to prevent the override from taking effect,
> wouldn't it be better to use 0/0 for the mask/value pair?
>
> (/me likely misses some context here)
0/0 would be fine for the purpose of not applying the override.
However, this hack is used to report the failed override at boot time
instead of silently ignoring it (see [1]).
Is it crap? Absolutely. But I didn't have a better idea at the time
(and still don't). Alternatively, we can drop the reporting and be
done with it.
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/cpufeature.c#n923
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
` (38 preceding siblings ...)
2023-11-24 10:19 ` [PATCH v5 39/39] arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()" Ard Biesheuvel
@ 2023-11-24 16:22 ` Ard Biesheuvel
39 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-24 16:22 UTC (permalink / raw)
Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Marc Zyngier,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 at 11:20, Ard Biesheuvel <ardb@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> At the request of Catalin, this series was split off from my LPA2 series
> [0] in order to make the changes a bit more manageable.
>
Note: it appears I need another patch from the second batch in order
to address a kCFI build issue with this series in some configurations:
[PATCH v4 40/61] arm64: mmu: Make cpu_replace_ttbr1() out of line
https://lore.kernel.org/all/20230912141549.278777-103-ardb@google.com/
* Re: [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support
2023-11-24 13:48 ` Marc Zyngier
@ 2023-11-25 8:59 ` Ard Biesheuvel
0 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-25 8:59 UTC (permalink / raw)
To: Marc Zyngier
Cc: Ard Biesheuvel, linux-arm-kernel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 at 14:48, Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 24 Nov 2023 13:08:33 +0000,
> Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Fri, 24 Nov 2023 at 13:37, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Fri, 24 Nov 2023 10:19:09 +0000,
> > > Ard Biesheuvel <ardb@google.com> wrote:
> > > >
> > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > >
> > > > Add some helpers that will be used by the early kernel mapping code to
> > > > check feature support on the local CPU. This permits the early kernel
> > > > mapping to be created with the right attributes, removing the need for
> > > > tearing it down and recreating it.
> > > >
> > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > > ---
> > > > arch/arm64/include/asm/cpufeature.h | 44 ++++++++++++++++++++
> > > > 1 file changed, 44 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > > > index 99c01417e544..301d4d2211d5 100644
> > > > --- a/arch/arm64/include/asm/cpufeature.h
> > > > +++ b/arch/arm64/include/asm/cpufeature.h
> > > > @@ -921,6 +921,50 @@ static inline bool kaslr_disabled_cmdline(void)
> > > > u32 get_kvm_ipa_limit(void);
> > > > void dump_cpu_features(void);
> > > >
> > > > +static inline bool cpu_has_bti(void)
> > > > +{
> > > > + u64 pfr1;
> > > > +
> > > > + if (!IS_ENABLED(CONFIG_ARM64_BTI))
> > > > + return false;
> > > > +
> > > > + pfr1 = read_cpuid(ID_AA64PFR1_EL1);
> > > > + pfr1 &= ~id_aa64pfr1_override.mask;
> > > > + pfr1 |= id_aa64pfr1_override.val;
> > >
> > > There is a potential gotcha here. If the override failed because a
> > > filter rejected it, val will be set to full ones for the size of the
> > > failed override field, and the corresponding mask will be set to 0
> > > (see match_options()).
> > >
> > > A safer pattern would be:
> > >
> > > pfr1 |= id_aa64pfr1_override.val & id_aa64pfr1_override.mask;
> > >
> > > although we currently don't have such a filter for BTI nor PAC, so the
> > > code is OK for now.
> > >
> >
> > This confuses me.
> >
> > What is the point of a value/mask pair if the mask is applied to the
> > value, and not to the quantity that is being overridden? Surely, not
> > setting those bits in 'value' to begin with makes the whole mask
> > redundant, no?
> >
> > If the filter is supposed to prevent the override from taking effect,
> > wouldn't it be better to use 0/0 for the mask/value pair?
> >
> > (/me likely misses some context here)
>
> 0/0 would be fine for the purpose of not applying the override.
> However, this hack is used to report the failed override at boot time
> instead of silently ignoring it (see [1]).
>
> Is it crap? Absolutely. But I didn't have a better idea at the time
> (and still don't). Alternatively, we can drop the reporting and be
> done with it.
>
No I think it is fine, actually. A valid mask should cover at least
the 1 bits in value, so signalling an invalid override this way is
perfectly reasonable.
But applying the mask to value still does not make sense imo. Instead,
I should just check that the override is valid before applying it.
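Something along these lines, i.e. a rough sketch (the tail reuses the
field extraction from the arm64_early_this_cpu_has_bti() helper that
this series removes):

/*
 * A coherent override has every bit of 'val' covered by 'mask'; the
 * failed-override encoding (val all ones, mask zero) fails this test
 * and is simply ignored.
 */
static inline bool ftr_override_is_valid(const struct arm64_ftr_override *ov)
{
	return (ov->val & ~ov->mask) == 0;
}

static inline bool cpu_has_bti(void)
{
	u64 pfr1;

	if (!IS_ENABLED(CONFIG_ARM64_BTI))
		return false;

	pfr1 = read_cpuid(ID_AA64PFR1_EL1);
	if (ftr_override_is_valid(&id_aa64pfr1_override)) {
		pfr1 &= ~id_aa64pfr1_override.mask;
		pfr1 |= id_aa64pfr1_override.val;
	}
	return cpuid_feature_extract_unsigned_field(pfr1,
						    ID_AA64PFR1_EL1_BT_SHIFT);
}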
* Re: [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching
2023-11-24 10:18 ` [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching Ard Biesheuvel
@ 2023-11-27 12:53 ` Marc Zyngier
2023-11-27 12:58 ` Ard Biesheuvel
0 siblings, 1 reply; 47+ messages in thread
From: Marc Zyngier @ 2023-11-27 12:53 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Fri, 24 Nov 2023 10:18:55 +0000,
Ard Biesheuvel <ardb@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The ID reg override handling code uses a rather elaborate data structure
> that relies on statically initialized absolute address values in pointer
> fields. This means that this code cannot run until relocation fixups
> have been applied, and this is unfortunate, because it means we cannot
> discover overrides for KASLR or LVA/LPA without creating the kernel
> mapping and performing the relocations first.
>
> This can be solved by switching to place-relative relocations, which can
> be applied by the linker at build time. This means some additional
> arithmetic is required when dereferencing these pointers, as we can no
> longer dereference the pointer members directly.
>
> So let's implement this for idreg-override.c in a preliminary way, i.e.,
> convert all the references in code to use a special accessor that
> produces the correct absolute value at runtime.
>
> To preserve the strong type checking for the static initializers, use
> union types for representing the hybrid quantities.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/arm64/kernel/idreg-override.c | 98 +++++++++++++-------
> 1 file changed, 65 insertions(+), 33 deletions(-)
>
> diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
> index 536bc33859bc..4e32a44560bf 100644
> --- a/arch/arm64/kernel/idreg-override.c
> +++ b/arch/arm64/kernel/idreg-override.c
> @@ -21,14 +21,32 @@
>
> static u64 __boot_status __initdata;
>
> +// temporary __prel64 related definitions
> +// to be removed when this code is moved under pi/
> +
> +#define __prel64_initconst __initconst
> +
> +typedef void *prel64_t;
> +
> +static void *prel64_to_pointer(const prel64_t *p)
> +{
> + return *p;
> +}
> +
Having played with this series a bit to see how the E2H0 override
support would fit in, I found that this cast to a void* could hide
stupid bugs that should normally be caught at compile time.
Having hacked on it a bit, I came up with this (partial patch on top
of the full series):
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index 84647ebb87ee..d60322477e44 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -246,12 +246,12 @@ static void __init match_options(const char *cmdline)
int i;
for (i = 0; i < ARRAY_SIZE(regs); i++) {
- const struct ftr_set_desc *reg = prel64_to_pointer(®s[i].reg_prel);
+ const struct ftr_set_desc *reg = prel64_pointer(regs[i].reg);
struct arm64_ftr_override *override;
int len = strlen(reg->name);
int f;
- override = prel64_to_pointer(®->override_prel);
+ override = prel64_pointer(reg->override);
// set opt[] to '<name>.'
memcpy(opt, reg->name, len);
diff --git a/arch/arm64/kernel/pi/pi.h b/arch/arm64/kernel/pi/pi.h
index 04a1f576baee..9d922322999b 100644
--- a/arch/arm64/kernel/pi/pi.h
+++ b/arch/arm64/kernel/pi/pi.h
@@ -15,6 +15,8 @@ static inline void *prel64_to_pointer(const prel64_t *offset)
return (void *)offset + *offset;
}
+#define prel64_pointer(__d) (typeof(__d))prel64_to_pointer(&__d##_prel)
+
extern bool dynamic_scs_is_enabled;
extern pgd_t init_idmap_pg_dir[];
which preserves the type-checking, at the expense of the implicit
*_prel field access. Not sure if that's something you have considered,
but I thought I'd raise it here.
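For completeness, here is a stand-alone sketch of how the place-relative
scheme hangs together (plain GNU C, compilable in user space; everything
except the two accessors is made up, and the offset that the linker would
normally emit at build time is computed by hand):

#include <stdint.h>
#include <stdio.h>

typedef int64_t prel64_t;

static inline void *prel64_to_pointer(const prel64_t *offset)
{
	/* the stored value is relative to where it is stored */
	return (void *)((uintptr_t)offset + *offset);
}

/* typed accessor: expands &<expr>_prel, casts back to the pointer type */
#define prel64_pointer(__d)	(typeof(__d))prel64_to_pointer(&__d##_prel)

struct target {
	int x;
};

struct desc {
	union {				/* hybrid quantity */
		struct target	*tgt;		/* for type checking only    */
		prel64_t	tgt_prel;	/* what the code dereferences */
	};
};

static struct target t = { .x = 42 };
static struct desc d;

int main(void)
{
	/* stand-in for the build-time PREL64 relocation */
	d.tgt_prel = (intptr_t)&t - (intptr_t)&d.tgt_prel;

	struct target *p = prel64_pointer(d.tgt);	/* typed, no void * */

	printf("%d\n", p->x);	/* prints 42 */
	return 0;
}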
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching
2023-11-27 12:53 ` Marc Zyngier
@ 2023-11-27 12:58 ` Ard Biesheuvel
0 siblings, 0 replies; 47+ messages in thread
From: Ard Biesheuvel @ 2023-11-27 12:58 UTC (permalink / raw)
To: Marc Zyngier
Cc: Ard Biesheuvel, linux-arm-kernel, Catalin Marinas, Will Deacon,
Mark Rutland, Ryan Roberts, Anshuman Khandual, Kees Cook
On Mon, 27 Nov 2023 at 13:53, Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 24 Nov 2023 10:18:55 +0000,
> Ard Biesheuvel <ardb@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The ID reg override handling code uses a rather elaborate data structure
> > that relies on statically initialized absolute address values in pointer
> > fields. This means that this code cannot run until relocation fixups
> > have been applied, and this is unfortunate, because it means we cannot
> > discover overrides for KASLR or LVA/LPA without creating the kernel
> > mapping and performing the relocations first.
> >
> > This can be solved by switching to place-relative relocations, which can
> > be applied by the linker at build time. This means some additional
> > arithmetic is required when dereferencing these pointers, as we can no
> > longer dereference the pointer members directly.
> >
> > So let's implement this for idreg-override.c in a preliminary way, i.e.,
> > convert all the references in code to use a special accessor that
> > produces the correct absolute value at runtime.
> >
> > To preserve the strong type checking for the static initializers, use
> > union types for representing the hybrid quantities.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > arch/arm64/kernel/idreg-override.c | 98 +++++++++++++-------
> > 1 file changed, 65 insertions(+), 33 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
> > index 536bc33859bc..4e32a44560bf 100644
> > --- a/arch/arm64/kernel/idreg-override.c
> > +++ b/arch/arm64/kernel/idreg-override.c
> > @@ -21,14 +21,32 @@
> >
> > static u64 __boot_status __initdata;
> >
> > +// temporary __prel64 related definitions
> > +// to be removed when this code is moved under pi/
> > +
> > +#define __prel64_initconst __initconst
> > +
> > +typedef void *prel64_t;
> > +
> > +static void *prel64_to_pointer(const prel64_t *p)
> > +{
> > + return *p;
> > +}
> > +
>
> Having played with this series a bit to see how the E2H0 override
> support would fit in, I found that this cast to a void* could hide
> stupid bugs that should normally be caught at compile time.
>
> Having hacked on it a bit, I came up with this (partial patch on top
> of the full series):
>
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index 84647ebb87ee..d60322477e44 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -246,12 +246,12 @@ static void __init match_options(const char *cmdline)
> int i;
>
> for (i = 0; i < ARRAY_SIZE(regs); i++) {
> - const struct ftr_set_desc *reg = prel64_to_pointer(®s[i].reg_prel);
> + const struct ftr_set_desc *reg = prel64_pointer(regs[i].reg);
> struct arm64_ftr_override *override;
> int len = strlen(reg->name);
> int f;
>
> - override = prel64_to_pointer(®->override_prel);
> + override = prel64_pointer(reg->override);
>
> // set opt[] to '<name>.'
> memcpy(opt, reg->name, len);
> diff --git a/arch/arm64/kernel/pi/pi.h b/arch/arm64/kernel/pi/pi.h
> index 04a1f576baee..9d922322999b 100644
> --- a/arch/arm64/kernel/pi/pi.h
> +++ b/arch/arm64/kernel/pi/pi.h
> @@ -15,6 +15,8 @@ static inline void *prel64_to_pointer(const prel64_t *offset)
> return (void *)offset + *offset;
> }
>
> +#define prel64_pointer(__d) (typeof(__d))prel64_to_pointer(&__d##_prel)
> +
> extern bool dynamic_scs_is_enabled;
>
> extern pgd_t init_idmap_pg_dir[];
>
>
> which preserves the type-checking, at the expense of the implicit
> *_prel field access. Not sure if that's something you have considered,
> but I thought I'd raise it here.
>
Thanks!
I agree the strict type checking is desirable, and I don't think the
implicit _prel concatenation is an issue, given that it is only used
in a single source file (and if that changes, we can revisit).
I'll fold this in for the next rev.
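To make the benefit concrete: with the typed accessor, a hypothetical
mix-up like the one below now draws an incompatible-pointer-types
warning instead of compiling silently through void *:

	struct arm64_ftr_override *override;

	/* wrong member for this variable */
	override = prel64_to_pointer(&regs[i].reg_prel); /* void *: accepted silently */
	override = prel64_pointer(regs[i].reg);          /* typed: the compiler warns */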
end of thread, other threads:[~2023-11-27 12:59 UTC | newest]
Thread overview: 47+ messages
2023-11-24 10:18 [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 01/39] arm64: kernel: Disable latent_entropy GCC plugin in early C runtime Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 02/39] arm64: mm: Take potential load offset into account when KASLR is off Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 03/39] arm64: mm: get rid of kimage_vaddr global variable Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 04/39] arm64: mm: Move PCI I/O emulation region above the vmemmap region Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 05/39] arm64: mm: Move fixmap region above " Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 06/39] arm64: ptdump: Allow all region boundaries to be defined at boot time Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 07/39] arm64: ptdump: Discover start of vmemmap region at runtime Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 08/39] arm64: vmemmap: Avoid base2 order of struct page size to dimension region Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 09/39] arm64: mm: Reclaim unused vmemmap region for vmalloc use Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 10/39] arm64: kaslr: Adjust randomization range dynamically Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 11/39] arm64: kernel: Manage absolute relocations in code built under pi/ Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 12/39] arm64: kernel: Don't rely on objcopy to make code under pi/ __init Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 13/39] arm64: head: move relocation handling to C code Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 14/39] arm64: idreg-override: Omit non-NULL checks for override pointer Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 15/39] arm64: idreg-override: Prepare for place relative reloc patching Ard Biesheuvel
2023-11-27 12:53 ` Marc Zyngier
2023-11-27 12:58 ` Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 16/39] arm64: idreg-override: Avoid parameq() and parameqn() Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 17/39] arm64: idreg-override: avoid strlen() to check for empty strings Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 18/39] arm64: idreg-override: Avoid sprintf() for simple string concatenation Ard Biesheuvel
2023-11-24 10:18 ` [PATCH v5 19/39] arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 20/39] arm64: idreg-override: Move to early mini C runtime Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 21/39] arm64: kernel: Remove early fdt remap code Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 22/39] arm64: head: Clear BSS and the kernel page tables in one go Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 23/39] arm64: Move feature overrides into the BSS section Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 24/39] arm64: head: Run feature override detection before mapping the kernel Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 25/39] arm64: head: move dynamic shadow call stack patching into early C runtime Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 26/39] arm64: kaslr: Use feature override instead of parsing the cmdline again Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 27/39] arm64/kernel: Move 'nokaslr' parsing out of early idreg code Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 28/39] arm64: idreg-override: Create a pseudo feature for rodata=off Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 29/39] arm64: Add helpers to probe local CPU for PAC and BTI support Ard Biesheuvel
2023-11-24 12:37 ` Marc Zyngier
2023-11-24 13:08 ` Ard Biesheuvel
2023-11-24 13:48 ` Marc Zyngier
2023-11-25 8:59 ` Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 30/39] arm64: head: allocate more pages for the kernel mapping Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 31/39] arm64: head: move memstart_offset_seed handling to C code Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 32/39] arm64: mm: Make kaslr_requires_kpti() a static inline Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 33/39] arm64: head: Move early kernel mapping routines into C code Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 34/39] arm64: mm: Use 48-bit virtual addressing for the permanent ID map Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 35/39] arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 36/39] arm64: kernel: Create initial ID map from C code Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 37/39] arm64: mm: avoid fixmap for early swapper_pg_dir updates Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 38/39] arm64: mm: omit redundant remap of kernel image Ard Biesheuvel
2023-11-24 10:19 ` [PATCH v5 39/39] arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()" Ard Biesheuvel
2023-11-24 16:22 ` [PATCH v5 00/39] arm64: Reorganize kernel VA space for LPA2 Ard Biesheuvel