* [PATCH 00/12] 52-bit kernel VAs for arm64
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

This patch series brings 52-bit kernel VA support to arm64, enabled at boot
time if the hardware supports it. A new kernel option, CONFIG_ARM64_VA_BITS_48_52,
is available when the kernel is configured with a 64KB PAGE_SIZE (with
ARMv8.2-LVA, 52-bit VAs are only allowed when running with a 64KB granule).

Switching between 48-bit and 52-bit VAs does not involve any change to the
number of page table levels; only the number of PGDIR entries increases when
running with a 52-bit kernel VA.

In order to allow the kernel to switch between VA spaces at boot time, we
need to re-arrange the current kernel VA space. If we anchor the kernel
image to the end of the address space (the highest virtual addresses, just
below the PTR_ERR guard region), then symbol addresses are independent of
the address space size. (We can't assume relocation will be enabled.)

The new kernel VA space looks like this (running with 48/52-bit kernel VA):

0xffff000000000000/0xfff0000000000000	ttbr1 start - linear mapping start
0xffff800000000000/0xfff8000000000000	midpoint of address space
0xffff800000000000/0xfffda00000000000	KASAN start
0xffffa00000000000/0xffffa00000000000	vmalloc start
---
0xfffffbdfd67b0000			vmemmap start
0xfffffbffd6800000			Fixed map start
0xffffffffd6c00000			PCI IO start
0xffffffffd7e00000			kernel modules start
0xffffffffdfe00000			kernel image start
0xffffffffffe00000			guard region for PTR_ERR

The KASAN end address is the same for both 48 and 52-bit kernel VA spaces,
meaning that the KASAN region grows into the extra VA space available when
running with 52-bit VAs (its start address moves down while its end stays
fixed). This allows us to run with inline KASAN, where the offset is baked
into the code in many places, on both VA spaces using the same binary.

To allow for this, the KASAN_SHADOW_OFFSET handling was altered to match
the approach used for x86; namely, KASAN_SHADOW_OFFSET is now a Kconfig
constant rather than a derived quantity. To simplify future VA work, the
code to compute the KASAN shadow offsets is supplied as a script in the
Documentation folder.
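
For illustration, using the formulas from patch 8 of this series and the
48-bit Kconfig default of KASAN_SHADOW_OFFSET = 0xdfffa00000000000:

	KASAN_SHADOW_END       = (1UL << 61) + KASAN_SHADOW_OFFSET
	                       = 0xffffa00000000000  (same for 48 and 52-bit)
	KASAN_SHADOW_START(48) = KASAN_SHADOW_END - (1UL << (48 - 3))
	                       = 0xffff800000000000
	KASAN_SHADOW_START(52) = KASAN_SHADOW_END - (1UL << (52 - 3))
	                       = 0xfffda00000000000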

If KASAN is not enabled, then the VMALLOC start position is variable
depending on VA_BITS. Thus VMALLOC_START cannot always be considered
constant.

This patch series modifies VA_BITS from a constant pre-processor macro to
a runtime variable, and this requires changes to other parts of the arm64
code, such as the page table dumper. Some parts of the code require
pre-processor constants derived from VA_BITS, so two new pre-processor
constants have been introduced:
 VA_BITS_MIN	the minimum number of VA_BITS used; this can be used to bound
		addresses conservatively s.t. mappings work for both address
		space sizes. One example use case is the EFI stub code in
		efi_get_max_initrd_addr(); another is determining whether or
		not we need an extra page table level for the identity mapping
		(on 64KB PAGE_SIZE we already have 3 levels for both 48-bit
		and 52-bit VA spaces).

 VA_BITS_ALT	if running with a higher kernel VA space, this is the number
		of bits available. VA_BITS_MIN and VA_BITS_ALT can be used
		together to generate constants (or test compile time asserts)
		which are then chosen at runtime.
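
As a rough sketch of the intended usage (a hypothetical example; only
VA_BITS_MIN, VA_BITS_ALT and the vabits_actual variable from later in the
series are real), a constant can be prepared for both sizes and the right
one picked at runtime:

	/* Hypothetical: choose a linear-map size for the detected VA size. */
	#define LINEAR_SIZE_MIN		(UL(1) << (VA_BITS_MIN - 1))
	#define LINEAR_SIZE_ALT		(UL(1) << (VA_BITS_ALT - 1))

	u64 linear_size = (vabits_actual == VA_BITS_ALT) ? LINEAR_SIZE_ALT
							 : LINEAR_SIZE_MIN;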

I am mindful of how disruptive the change to VA_BITS is, but I have so far
been unable to simplify this further. One idea I had was to remove VA_BITS
completely, but that just makes the series a lot more complicated.

Another idea was to fix VA_BITS to either 48 or 52, but this doesn't really
help things. Either choice breaks derived addresses and also, less obviously,
breaks the meaning behind the constant; for example, efi_get_max_initrd_addr()
will give the wrong result if we choose 52.

Though disruptive, having VA_BITS de-constified will help us pick out, at
compile time, bugs that one might not otherwise notice if it were set to the
wrong constant. That said, I'm receptive to any ideas that can simplify this.

One can't assume that ARMv8.2-LVA implies ARMv8.1-VHE, so changes are also made
to the KVM HYP mapping code to support both 48 and 52-bit VAs. In writing this
support, I unearthed some subtle bugs in the KVM HYP mapping logic, and these
are addressed in the first two patches of this series.

This patch series applies to 4.15-rc2, and the early pagetable patches I
posted earlier:
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/543494.html

Steve Capper (12):
  KVM: arm/arm64: vgic: Remove spurious call to kern_hyp_va
  arm64: KVM: Enforce injective kern_hyp_va mappings
  arm/arm64: KVM: Formalise end of direct linear map
  arm64: Initialise high_memory global variable earlier
  arm64: mm: Remove VMALLOC checks from update_mapping_prot(.)
  arm64: mm: Flip kernel VA space
  arm64: mm: Place kImage at bottom of VA space
  arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
  arm64: dump: Make kernel page table dumper dynamic again
  arm64: mm: Make VA_BITS variable, introduce VA_BITS_MIN
  arm64: KVM: Add support for an alternative VA space
  arm64: mm: Add 48/52-bit kernel VA support

 Documentation/arm64/kasan-offsets.sh | 17 ++++++++
 arch/arm/include/asm/kvm_hyp.h       |  2 +
 arch/arm/include/asm/memory.h        |  1 +
 arch/arm64/Kconfig                   | 22 ++++++++++
 arch/arm64/Makefile                  |  7 ---
 arch/arm64/include/asm/assembler.h   |  2 +-
 arch/arm64/include/asm/cpucaps.h     |  6 ++-
 arch/arm64/include/asm/efi.h         |  4 +-
 arch/arm64/include/asm/kasan.h       | 24 +++++------
 arch/arm64/include/asm/kvm_hyp.h     | 10 +++++
 arch/arm64/include/asm/kvm_mmu.h     | 83 ++++++++++++++++++++++++++++--------
 arch/arm64/include/asm/memory.h      | 42 +++++++++++-------
 arch/arm64/include/asm/mmu_context.h |  2 +-
 arch/arm64/include/asm/pgtable.h     | 13 ++++--
 arch/arm64/include/asm/processor.h   |  2 +-
 arch/arm64/kernel/cpufeature.c       | 47 +++++++++++++++++---
 arch/arm64/kernel/head.S             | 13 +++---
 arch/arm64/kvm/hyp-init.S            |  2 +-
 arch/arm64/mm/dump.c                 | 59 ++++++++++++++++++++-----
 arch/arm64/mm/fault.c                |  2 +-
 arch/arm64/mm/init.c                 | 16 +++----
 arch/arm64/mm/kasan_init.c           | 14 +++---
 arch/arm64/mm/mmu.c                  | 11 ++---
 arch/arm64/mm/proc.S                 | 42 +++++++++++++++++-
 virt/kvm/arm/hyp/vgic-v2-sr.c        |  4 +-
 virt/kvm/arm/mmu.c                   |  4 +-
 26 files changed, 336 insertions(+), 115 deletions(-)
 create mode 100644 Documentation/arm64/kasan-offsets.sh

-- 
2.11.0

* [PATCH 01/12] KVM: arm/arm64: vgic: Remove spurious call to kern_hyp_va
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

In save_elrsr(.), we use the following technique to ascertain the
address of the vgic global state:
	(kern_hyp_va(&kvm_vgic_global_state))->nr_lr

For arm, kern_hyp_va(va) == va, and this call effectively compiles out.

For arm64, this call can be spurious, as the address of kvm_vgic_global_state
will usually be resolved via page-relative (ADRP) plus page-offset (ADD)
relocations at link time. As the function is idempotent, having the call on
arm64 does not cause any problems.

Unfortunately, this is about to change for arm64 as we need to change
the logic of kern_hyp_va to allow for kernel addresses that are outside
the direct linear map.

This patch removes the call to kern_hyp_va, and ensures that correct
HYP addresses are computed via relative page offset addressing on arm64.
This is achieved by a custom accessor, hyp_address(.), which on arm is a
simple reference operator.

Cc: James Morse <james.morse@arm.com>
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm/include/asm/kvm_hyp.h   |  2 ++
 arch/arm64/include/asm/kvm_hyp.h | 10 ++++++++++
 virt/kvm/arm/hyp/vgic-v2-sr.c    |  4 ++--
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index ab20ffa8b9e7..1864a9bdd160 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -26,6 +26,8 @@
 
 #define __hyp_text __section(.hyp.text) notrace
 
+#define hyp_address(symbol)	(&(symbol))
+
 #define __ACCESS_VFP(CRn)			\
 	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
 
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 08d3bb66c8b7..34a4ae906a97 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -25,6 +25,16 @@
 
 #define __hyp_text __section(.hyp.text) notrace
 
+#define hyp_address(symbol)				\
+({							\
+	typeof(&symbol) __ret;				\
+	asm volatile(					\
+	"adrp %[ptr], " #symbol	"\n"			\
+	"add %[ptr], %[ptr], :lo12:" #symbol "\n"	\
+	: [ptr] "=r"(__ret));				\
+	__ret;						\
+})
+
 #define read_sysreg_elx(r,nvh,vh)					\
 	({								\
 		u64 reg;						\
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index a3f18d362366..330fd4637708 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -25,7 +25,7 @@
 static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
+	int nr_lr = hyp_address(kvm_vgic_global_state)->nr_lr;
 	u32 elrsr0, elrsr1;
 
 	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
@@ -143,7 +143,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
 		return -1;
 
 	rd = kvm_vcpu_dabt_get_rd(vcpu);
-	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
+	addr  = kern_hyp_va(hyp_address(kvm_vgic_global_state)->vcpu_base_va);
 	addr += fault_ipa - vgic->vgic_cpu_base;
 
 	if (kvm_vcpu_dabt_iswrite(vcpu)) {
-- 
2.11.0

* [PATCH 02/12] arm64: KVM: Enforce injective kern_hyp_va mappings
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

For systems that are not executing with VHE, we need to create page
tables for HYP/EL2 mode in order to access data from the kernel running
at EL1.

In addition to parts of the kernel address space being mapped to EL2, we
also need to make space for an identity mapping of the __hyp_idmap_text
area (as this code is responsible for activating the EL2 MMU).

In order to create these pagetables, we need a mechanism to map from the
address space pointed to by TTBR1_EL1 (addresses prefixed with 0xFF...)
to the one addressed by TTBR0_EL2 (addresses prefixed with 0x00...).

There are two ways of performing this mapping depending upon the
physical address of __hyp_idmap_text_start.

If PA[VA_BITS - 2] == 0b:
1) HYP_VA = KERN_VA & GENMASK(VA_BITS - 2, 0) - so we mask in the lower
bits of the kernel address. This is a bijective mapping.

If PA[VA_BITS - 2] == 1b:
2) HYP_VA = KERN_VA & GENMASK(VA_BITS - 3, 0) - so the top bit of our HYP
VA will always be zero. This mapping is no longer injective: each HYP VA
can be obtained from two different kernel VAs.

These mappings guarantee that kernel addresses in the direct linear
mapping will not give a HYP VA that collides with the identity mapping
for __hyp_idmap_text.

Unfortunately, with the second mapping we run the risk of hyp VAs
derived from kernel addresses in the direct linear map colliding with
those derived from kernel addresses from ioremap.

This patch addresses this issue by switching to the following logic:
If PA[VA_BITS - 2] == 0b:
3) HYP_VA = KERN_VA XOR GENMASK(63, VA_BITS - 1) - we toggle off the top
bits of the kernel address rather than ANDing in the bottom bits.

If PA[VA_BITS - 2] == 1b:
4) HYP_VA = KERN_VA XOR GENMASK(63, VA_BITS - 2) - this no longer maps to a
reduced address space; we have a bijective mapping.

Now there is no possibility of collision between HYP VAs obtained from
kernel addresses.

Note that the new mappings are no longer idempotent, so the following
code sequence will behave differently after this patch is applied:
	testva = kern_hyp_va(kern_hyp_va(sourceva));
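
For a concrete picture with VA_BITS = 48, HYP_MAP_KERNEL_BITS works out to
0xffff000000000000 and HYP_MAP_HIGH_BIT to 0x0000800000000000, and the
effective computation is roughly the below sketch (the real code uses
runtime-patched alternatives, as in the hunks that follow):

	static unsigned long sketch_kern_hyp_va(unsigned long kern_va, bool flip)
	{
		/* Toggle bits 63..48, which are set for every kernel VA. */
		unsigned long hyp_va = kern_va ^ HYP_MAP_KERNEL_BITS;

		/* flip == true when the ARM64_HYP_MAP_FLIP capability is set. */
		if (flip)
			hyp_va ^= HYP_MAP_HIGH_BIT;	/* also toggle bit VA_BITS - 1 */
		return hyp_va;
	}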

Cc: James Morse <james.morse@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/include/asm/cpucaps.h |  2 +-
 arch/arm64/include/asm/kvm_mmu.h | 36 +++++++++++++++++-------------------
 arch/arm64/kernel/cpufeature.c   |  8 ++++----
 3 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 2ff7c5e8efab..3de31a1010ee 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -32,7 +32,7 @@
 #define ARM64_HAS_VIRT_HOST_EXTN		11
 #define ARM64_WORKAROUND_CAVIUM_27456		12
 #define ARM64_HAS_32BIT_EL0			13
-#define ARM64_HYP_OFFSET_LOW			14
+#define ARM64_HYP_MAP_FLIP			14
 #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
 #define ARM64_HAS_NO_FPSIMD			16
 #define ARM64_WORKAROUND_REPEAT_TLBI		17
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 672c8684d5c2..d74d5236c26c 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -69,8 +69,8 @@
  * mappings, and none of this applies in that case.
  */
 
-#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
-#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
+#define HYP_MAP_KERNEL_BITS	(UL(0xffffffffffffffff) << VA_BITS)
+#define HYP_MAP_HIGH_BIT	(UL(1) << (VA_BITS - 1))
 
 #ifdef __ASSEMBLY__
 
@@ -82,26 +82,24 @@
  * reg: VA to be converted.
  *
  * This generates the following sequences:
- * - High mask:
- *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
+ *
+ * - Flip the kernel bits:
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS
  *		nop
- * - Low mask:
- *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
- *		and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
+ *
+ * - Flip the kernel bits and upper HYP bit:
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS
+ *		eor x0, x0, #HYP_MAP_HIGH_BIT
  * - VHE:
  *		nop
  *		nop
- *
- * The "low mask" version works because the mask is a strict subset of
- * the "high mask", hence performing the first mask for nothing.
- * Should be completely invisible on any viable CPU.
  */
 .macro kern_hyp_va	reg
 alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
-	and     \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK
+	eor     \reg, \reg, #HYP_MAP_KERNEL_BITS
 alternative_else_nop_endif
-alternative_if ARM64_HYP_OFFSET_LOW
-	and     \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK
+alternative_if ARM64_HYP_MAP_FLIP
+	eor     \reg, \reg, #HYP_MAP_HIGH_BIT
 alternative_else_nop_endif
 .endm
 
@@ -115,16 +113,16 @@ alternative_else_nop_endif
 
 static inline unsigned long __kern_hyp_va(unsigned long v)
 {
-	asm volatile(ALTERNATIVE("and %0, %0, %1",
+	asm volatile(ALTERNATIVE("eor %0, %0, %1",
 				 "nop",
 				 ARM64_HAS_VIRT_HOST_EXTN)
 		     : "+r" (v)
-		     : "i" (HYP_PAGE_OFFSET_HIGH_MASK));
+		     : "i" (HYP_MAP_KERNEL_BITS));
 	asm volatile(ALTERNATIVE("nop",
-				 "and %0, %0, %1",
-				 ARM64_HYP_OFFSET_LOW)
+				 "eor %0, %0, %1",
+				 ARM64_HYP_MAP_FLIP)
 		     : "+r" (v)
-		     : "i" (HYP_PAGE_OFFSET_LOW_MASK));
+		     : "i" (HYP_MAP_HIGH_BIT));
 	return v;
 }
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index c5ba0097887f..5a6e1f3611eb 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -824,7 +824,7 @@ static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused
 	return is_kernel_in_hyp_mode();
 }
 
-static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry,
+static bool hyp_flip_space(const struct arm64_cpu_capabilities *entry,
 			   int __unused)
 {
 	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
@@ -926,10 +926,10 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
 	},
 	{
-		.desc = "Reduced HYP mapping offset",
-		.capability = ARM64_HYP_OFFSET_LOW,
+		.desc = "HYP mapping flipped",
+		.capability = ARM64_HYP_MAP_FLIP,
 		.def_scope = SCOPE_SYSTEM,
-		.matches = hyp_offset_low,
+		.matches = hyp_flip_space,
 	},
 	{
 		/* FP/SIMD is not implemented */
-- 
2.11.0

* [PATCH 03/12] arm/arm64: KVM: Formalise end of direct linear map
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

We assume that the direct linear map ends at ~0 in the KVM HYP map
intersection checking code. This assumption will become invalid later on
for arm64 when the address space of the kernel is re-arranged.

This patch introduces a new constant, PAGE_OFFSET_END, for both arm and
arm64, and defines it to be ~0UL.

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm/include/asm/memory.h   | 1 +
 arch/arm64/include/asm/memory.h | 1 +
 virt/kvm/arm/mmu.c              | 4 ++--
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 1f54e4e98c1e..e223a945c361 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -30,6 +30,7 @@
 
 /* PAGE_OFFSET - the virtual address of the start of the kernel image */
 #define PAGE_OFFSET		UL(CONFIG_PAGE_OFFSET)
+#define PAGE_OFFSET_END		(~0UL)
 
 #ifdef CONFIG_MMU
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index d4bae7d6e0d8..2dedc775d151 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -67,6 +67,7 @@
 	(UL(1) << VA_BITS) + 1)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
 	(UL(1) << (VA_BITS - 1)) + 1)
+#define PAGE_OFFSET_END		(~0UL)
 #define KIMAGE_VADDR		(MODULES_END)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
 #define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index b36945d49986..4ff903409c65 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1764,10 +1764,10 @@ int kvm_mmu_init(void)
 
 	kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
 	kvm_info("HYP VA range: %lx:%lx\n",
-		 kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
+		 kern_hyp_va(PAGE_OFFSET), kern_hyp_va(PAGE_OFFSET_END));
 
 	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
-	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
+	    hyp_idmap_start <  kern_hyp_va(PAGE_OFFSET_END) &&
 	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
 		/*
 		 * The idmap page is intersecting with the VA space,
-- 
2.11.0

* [PATCH 04/12] arm64: Initialise high_memory global variable earlier
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

The high_memory global variable is used by
cma_declare_contiguous(.) before it has been initialised.

We don't notice this as we compute __pa(high_memory - 1), and it looks
like we're processing a VA from the direct linear map.

This problem becomes apparent when we flip the kernel virtual address
space and the linear map is moved to the bottom of the kernel VA space.

This patch moves the initialisation of high_memory before it is used.

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/mm/init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5960bef0170d..00e7b900ca41 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -476,6 +476,8 @@ void __init arm64_memblock_init(void)
 
 	reserve_elfcorehdr();
 
+	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
+
 	dma_contiguous_reserve(arm64_dma_phys_limit);
 
 	memblock_allow_resize();
@@ -502,7 +504,6 @@ void __init bootmem_init(void)
 	sparse_init();
 	zone_sizes_init(min, max);
 
-	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
 	memblock_dump_all();
 }
 
-- 
2.11.0

* [PATCH 05/12] arm64: mm: Remove VMALLOC checks from update_mapping_prot(.)
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

update_mapping_prot() assumes that it will be used on VAs belonging to the
kernel .text section (via the check virt >= VMALLOC_START).

Recent kdump patches employ this function to modify the protection of
the direct linear mapping (which is strictly speaking outside of this
area), via mark_linear_text_alias_ro(.).

We "get away" with this as the direct linear mapping currently follows
the VA for the kernel text, so the check passes.

This patch removes the check in update_mapping_prot allowing us to move
the kernel VA layout without spuriously firing the warning.

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/mm/mmu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 58b1ed6fd7ec..c8f486384fe3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -383,12 +383,6 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
 				phys_addr_t size, pgprot_t prot)
 {
-	if (virt < VMALLOC_START) {
-		pr_warn("BUG: not updating mapping for %pa@0x%016lx - outside kernel range\n",
-			&phys, virt);
-		return;
-	}
-
 	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
 			     NO_CONT_MAPPINGS);
 
-- 
2.11.0

* [PATCH 06/12] arm64: mm: Flip kernel VA space
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

Put the direct linear map in the lower half of the kernel VA space (the
lower addresses) and the kernel + everything else in the upper half.

We need to adjust:
 *) KASAN shadow region placement logic,
 *) KASAN_SHADOW_OFFSET computation logic,
 *) virt_to_phys, phys_to_virt checks
 *) page table dumper
 *) KVM hyp map flip logic

These are all small changes that need to take place atomically, so they
are bundled into this commit.
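
Concretely, for VA_BITS = 48 the flip swaps the two halves of the kernel VA
range (values follow from the definitions in this patch):

			before			after
	PAGE_OFFSET	0xffff800000000000	0xffff000000000000
	VA_START	0xffff000000000000	0xffff800000000000
	PAGE_OFFSET_END	~0UL			VA_START (0xffff800000000000)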

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/Makefile              |  2 +-
 arch/arm64/include/asm/memory.h  | 10 +++++-----
 arch/arm64/include/asm/pgtable.h |  2 +-
 arch/arm64/kernel/cpufeature.c   |  2 +-
 arch/arm64/mm/dump.c             |  8 ++++----
 arch/arm64/mm/init.c             |  9 +--------
 arch/arm64/mm/kasan_init.c       |  4 ++--
 7 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index b481b4a7c011..7eaff48d2a39 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -100,7 +100,7 @@ endif
 # KASAN_SHADOW_OFFSET = VA_START + (1 << (VA_BITS - 3)) - (1 << 61)
 # in 32-bit arithmetic
 KASAN_SHADOW_OFFSET := $(shell printf "0x%08x00000000\n" $$(( \
-			(0xffffffff & (-1 << ($(CONFIG_ARM64_VA_BITS) - 32))) \
+			(0xffffffff & (-1 << ($(CONFIG_ARM64_VA_BITS) - 1 - 32))) \
 			+ (1 << ($(CONFIG_ARM64_VA_BITS) - 32 - 3)) \
 			- (1 << (64 - 32 - 3)) )) )
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 2dedc775d151..0a912eb3d74f 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -64,15 +64,15 @@
  */
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) - \
-	(UL(1) << VA_BITS) + 1)
-#define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
 	(UL(1) << (VA_BITS - 1)) + 1)
-#define PAGE_OFFSET_END		(~0UL)
+#define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
+	(UL(1) << VA_BITS) + 1)
+#define PAGE_OFFSET_END		(VA_START)
 #define KIMAGE_VADDR		(MODULES_END)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
 #define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
 #define MODULES_VSIZE		(SZ_128M)
-#define VMEMMAP_START		(PAGE_OFFSET - VMEMMAP_SIZE)
+#define VMEMMAP_START		(-VMEMMAP_SIZE)
 #define PCI_IO_END		(VMEMMAP_START - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
@@ -223,7 +223,7 @@ static inline unsigned long kaslr_offset(void)
  * space. Testing the top bit for the start of the region is a
  * sufficient check.
  */
-#define __is_lm_address(addr)	(!!((addr) & BIT(VA_BITS - 1)))
+#define __is_lm_address(addr)	(!((addr) & BIT(VA_BITS - 1)))
 
 #define __lm_to_phys(addr)	(((addr) & ~PAGE_OFFSET) + PHYS_OFFSET)
 #define __kimg_to_phys(addr)	((addr) - kimage_voffset)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 72cde7268cad..054b37143a50 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -31,7 +31,7 @@
  *	and fixed mappings
  */
 #define VMALLOC_START		(MODULES_END)
-#define VMALLOC_END		(PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
+#define VMALLOC_END		(- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
 
 #define vmemmap			((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 5a6e1f3611eb..99b1d1ebe551 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -834,7 +834,7 @@ static bool hyp_flip_space(const struct arm64_cpu_capabilities *entry,
 	 * - the idmap doesn't clash with it,
 	 * - the kernel is not running at EL2.
 	 */
-	return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
+	return idmap_addr <= GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
 }
 
 static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index ca74a2aace42..b7b09c0fc50d 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -30,6 +30,8 @@
 #include <asm/ptdump.h>
 
 static const struct addr_marker address_markers[] = {
+	{ PAGE_OFFSET,			"Linear Mapping start" },
+	{ VA_START,			"Linear Mapping end" },
 #ifdef CONFIG_KASAN
 	{ KASAN_SHADOW_START,		"Kasan shadow start" },
 	{ KASAN_SHADOW_END,		"Kasan shadow end" },
@@ -43,10 +45,8 @@ static const struct addr_marker address_markers[] = {
 	{ PCI_IO_START,			"PCI I/O start" },
 	{ PCI_IO_END,			"PCI I/O end" },
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-	{ VMEMMAP_START,		"vmemmap start" },
-	{ VMEMMAP_START + VMEMMAP_SIZE,	"vmemmap end" },
+	{ VMEMMAP_START,		"vmemmap" },
 #endif
-	{ PAGE_OFFSET,			"Linear Mapping" },
 	{ -1,				NULL },
 };
 
@@ -375,7 +375,7 @@ static void ptdump_initialize(void)
 static struct ptdump_info kernel_ptdump_info = {
 	.mm		= &init_mm,
 	.markers	= address_markers,
-	.base_addr	= VA_START,
+	.base_addr	= PAGE_OFFSET,
 };
 
 void ptdump_check_wx(void)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b900ca41..230d78b75831 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -361,19 +361,12 @@ static void __init fdt_enforce_memory_region(void)
 
 void __init arm64_memblock_init(void)
 {
-	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+	const s64 linear_region_size = BIT(VA_BITS - 1);
 
 	/* Handle linux,usable-memory-range property */
 	fdt_enforce_memory_region();
 
 	/*
-	 * Ensure that the linear region takes up exactly half of the kernel
-	 * virtual address space. This way, we can distinguish a linear address
-	 * from a kernel/module/vmalloc address by testing a single bit.
-	 */
-	BUILD_BUG_ON(linear_region_size != BIT(VA_BITS - 1));
-
-	/*
 	 * Select a suitable value for the base of physical memory.
 	 */
 	memstart_addr = round_down(memblock_start_of_DRAM(),
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index acba49fb5aac..5aef679e61c6 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -205,10 +205,10 @@ void __init kasan_init(void)
 	kasan_map_populate(kimg_shadow_start, kimg_shadow_end,
 			   pfn_to_nid(virt_to_pfn(lm_alias(_text))));
 
-	kasan_populate_zero_shadow((void *)KASAN_SHADOW_START,
+	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *) VA_START),
 				   (void *)mod_shadow_start);
 	kasan_populate_zero_shadow((void *)kimg_shadow_end,
-				   kasan_mem_to_shadow((void *)PAGE_OFFSET));
+				   (void *)KASAN_SHADOW_END);
 
 	if (kimg_shadow_start > mod_shadow_end)
 		kasan_populate_zero_shadow((void *)mod_shadow_end,
-- 
2.11.0

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

Re-arrange the kernel memory map s.t. the kernel image resides in a 512MB
region at the very end of the VA space, with the modules, fixed map and
PCI I/O space placed immediately below it. The last 2MB of the address
space is set aside as a guard region to prevent ambiguity with
PTR_ERR/ERR_PTR values.

Dynamically sized objects such as the KASAN shadow and the sparsemem
vmemmap are placed below these fixed-size objects, at lower addresses.

This means that kernel addresses are now no longer directly dependent on
VA space size.
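
For reference, the fixed anchors implied by the new definitions work out as
follows (matching the cover letter layout):

	KIMAGE_VADDR  = 0 - SZ_2M - SZ_512M    = 0xffffffffdfe00000
	MODULES_VADDR = KIMAGE_VADDR - SZ_128M = 0xffffffffd7e00000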

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/include/asm/memory.h  | 17 +++++++++--------
 arch/arm64/include/asm/pgtable.h |  4 ++--
 arch/arm64/mm/dump.c             | 12 +++++++-----
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0a912eb3d74f..ba80561c6ed8 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -68,14 +68,15 @@
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
 	(UL(1) << VA_BITS) + 1)
 #define PAGE_OFFSET_END		(VA_START)
-#define KIMAGE_VADDR		(MODULES_END)
-#define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
-#define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
+#define KIMAGE_VSIZE		(SZ_512M)
+#define KIMAGE_VADDR		(UL(0) - SZ_2M - KIMAGE_VSIZE)
 #define MODULES_VSIZE		(SZ_128M)
-#define VMEMMAP_START		(-VMEMMAP_SIZE)
-#define PCI_IO_END		(VMEMMAP_START - SZ_2M)
+#define MODULES_END		(KIMAGE_VADDR)
+#define MODULES_VADDR		(MODULES_END - MODULES_VSIZE)
+#define PCI_IO_END		(MODULES_VADDR - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
-#define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
+#define FIXADDR_TOP		(PCI_IO_START - PGDIR_SIZE)
+#define VMEMMAP_START		(FIXADDR_START - VMEMMAP_SIZE)
 
 #define KERNEL_START      _text
 #define KERNEL_END        _end
@@ -292,10 +293,10 @@ static inline void *phys_to_virt(phys_addr_t x)
 #define _virt_addr_valid(kaddr)	pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
 #else
 #define __virt_to_pgoff(kaddr)	(((u64)(kaddr) & ~PAGE_OFFSET) / PAGE_SIZE * sizeof(struct page))
-#define __page_to_voff(kaddr)	(((u64)(kaddr) & ~VMEMMAP_START) * PAGE_SIZE / sizeof(struct page))
+#define __page_to_voff(kaddr)	(((u64)(kaddr) - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page))
 
 #define page_to_virt(page)	((void *)((__page_to_voff(page)) | PAGE_OFFSET))
-#define virt_to_page(vaddr)	((struct page *)((__virt_to_pgoff(vaddr)) | VMEMMAP_START))
+#define virt_to_page(vaddr)	((struct page *)((__virt_to_pgoff(vaddr)) + VMEMMAP_START))
 
 #define _virt_addr_valid(kaddr)	pfn_valid((((u64)(kaddr) & ~PAGE_OFFSET) \
 					   + PHYS_OFFSET) >> PAGE_SHIFT)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 054b37143a50..e8b4dcc11fed 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -30,8 +30,8 @@
  * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space
  *	and fixed mappings
  */
-#define VMALLOC_START		(MODULES_END)
-#define VMALLOC_END		(- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
+#define VMALLOC_START		(VA_START + KASAN_SHADOW_SIZE)
+#define VMALLOC_END		(FIXADDR_TOP - PUD_SIZE)
 
 #define vmemmap			((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
 
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index b7b09c0fc50d..e5d1b5f432fe 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -36,17 +36,19 @@ static const struct addr_marker address_markers[] = {
 	{ KASAN_SHADOW_START,		"Kasan shadow start" },
 	{ KASAN_SHADOW_END,		"Kasan shadow end" },
 #endif
-	{ MODULES_VADDR,		"Modules start" },
-	{ MODULES_END,			"Modules end" },
 	{ VMALLOC_START,		"vmalloc() Area" },
 	{ VMALLOC_END,			"vmalloc() End" },
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+	{ VMEMMAP_START,		"vmemmap start" },
+	{ VMEMMAP_START + VMEMMAP_SIZE,	"vmemmap end"},
+#endif
 	{ FIXADDR_START,		"Fixmap start" },
 	{ FIXADDR_TOP,			"Fixmap end" },
 	{ PCI_IO_START,			"PCI I/O start" },
 	{ PCI_IO_END,			"PCI I/O end" },
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-	{ VMEMMAP_START,		"vmemmap" },
-#endif
+	{ MODULES_VADDR,		"Modules start" },
+	{ MODULES_END,			"Modules end" },
+	{ KIMAGE_VADDR,			"kImage start"},
 	{ -1,				NULL },
 };
 
-- 
2.11.0

* [PATCH 08/12] arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command
line argument and affects the codegen of the inline address sanitiser.

Essentially, for an example memory access:
	*ptr1 = val;
the compiler will insert logic similar to the below:
	shadowValue = *((ptr1 >> 3) + KASAN_SHADOW_OFFSET);
	if (somethingWrong(shadowValue))
		flagAnError();

As this code sequence is inserted into many places, and
KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel
.text, the only sane thing we can do at compile time is to check that
KASAN_SHADOW_OFFSET gives us a valid memory region, and BUILD_BUG on
any discrepancy.

In other words, if we want to run a single kernel binary with multiple
address spaces, then KASAN_SHADOW_OFFSET needs to be fixed.

Thankfully, due to the way KASAN_SHADOW_OFFSET is used to provide
shadow addresses, we know that the end of the shadow region is constant
w.r.t. VA space size:
	KASAN_SHADOW_END = (1UL << 61) + KASAN_SHADOW_OFFSET

This means that if we increase the size of the VA space, the KASAN
region simply expands into the new space that is provided.

This patch removes the logic to compute KASAN_SHADOW_OFFSET in the
arm64 Makefile; instead we adopt the approach used by x86 and supply the
offset values in Kconfig. To help debug/develop future VA space changes,
the Makefile logic has been preserved in a script file in the arm64
Documentation folder.
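
As a sanity check of the 48-bit default below: the desired shadow end is
0xffffa00000000000 (the 48-bit shadow start of 0xffff800000000000 plus a
1/8th-of-VA-space shadow of 1 << 45), so:

	KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1UL << 61)
	                    = 0xffffa00000000000 - 0x2000000000000000
	                    = 0xdfffa00000000000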

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 Documentation/arm64/kasan-offsets.sh | 17 +++++++++++++++++
 arch/arm64/Kconfig                   | 10 ++++++++++
 arch/arm64/Makefile                  |  7 -------
 arch/arm64/include/asm/kasan.h       | 24 +++++++++++-------------
 arch/arm64/include/asm/pgtable.h     |  7 ++++++-
 arch/arm64/mm/kasan_init.c           |  1 -
 6 files changed, 44 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/arm64/kasan-offsets.sh

diff --git a/Documentation/arm64/kasan-offsets.sh b/Documentation/arm64/kasan-offsets.sh
new file mode 100644
index 000000000000..d07a95518770
--- /dev/null
+++ b/Documentation/arm64/kasan-offsets.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+# Print out the KASAN_SHADOW_OFFSETS required to place the KASAN SHADOW
+# start address at the mid-point of the kernel VA space
+
+print_kasan_offset () {
+	printf "%02d\t" $1
+	printf "0x%08x00000000\n" $(( (0xffffffff & (-1 << ($1 - 1 - 32))) \
+			+ (1 << ($1 - 32 - 3)) \
+			- (1 << (64 - 32 - 3)) ))
+}
+
+printf "VABITS\tKASAN_SHADOW_OFFSET\n"
+print_kasan_offset 48
+print_kasan_offset 42
+print_kasan_offset 39
+print_kasan_offset 36
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a93339f5178f..0fa430326825 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -272,6 +272,16 @@ config ARCH_SUPPORTS_UPROBES
 config ARCH_PROC_KCORE_TEXT
 	def_bool y
 
+config KASAN_SHADOW_OFFSET
+	hex
+	depends on KASAN
+	default 0xdfffa00000000000 if ARM64_VA_BITS_48
+	default 0xdfffd00000000000 if ARM64_VA_BITS_47
+	default 0xdffffe8000000000 if ARM64_VA_BITS_42
+	default 0xdfffffd000000000 if ARM64_VA_BITS_39
+	default 0xdffffffa00000000 if ARM64_VA_BITS_36
+	default 0xffffffffffffffff
+
 source "init/Kconfig"
 
 source "kernel/Kconfig.freezer"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 7eaff48d2a39..13cc9311ef7d 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -97,13 +97,6 @@ else
 TEXT_OFFSET := 0x00080000
 endif
 
-# KASAN_SHADOW_OFFSET = VA_START + (1 << (VA_BITS - 3)) - (1 << 61)
-# in 32-bit arithmetic
-KASAN_SHADOW_OFFSET := $(shell printf "0x%08x00000000\n" $$(( \
-			(0xffffffff & (-1 << ($(CONFIG_ARM64_VA_BITS) - 1 - 32))) \
-			+ (1 << ($(CONFIG_ARM64_VA_BITS) - 32 - 3)) \
-			- (1 << (64 - 32 - 3)) )) )
-
 export	TEXT_OFFSET GZFLAGS
 
 core-y		+= arch/arm64/kernel/ arch/arm64/mm/
diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index e266f80e45b7..28b9d9cb7795 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -10,24 +10,22 @@
 #include <asm/memory.h>
 #include <asm/pgtable-types.h>
 
+#define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
+
 /*
  * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
  * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
- */
-#define KASAN_SHADOW_START      (VA_START)
-#define KASAN_SHADOW_END        (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
-
-/*
- * This value is used to map an address to the corresponding shadow
- * address by the following formula:
- *     shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET;
  *
- * (1 << 61) shadow addresses - [KASAN_SHADOW_OFFSET,KASAN_SHADOW_END]
- * cover all 64-bits of virtual addresses. So KASAN_SHADOW_OFFSET
- * should satisfy the following equation:
- *      KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1ULL << 61)
+ * We derive these values from KASAN_SHADOW_OFFSET and the size of the VA
+ * space.
+ *
+ * KASAN shadow addresses are derived from the following formula:
+ *	shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET;
+ *
  */
-#define KASAN_SHADOW_OFFSET     (KASAN_SHADOW_END - (1ULL << (64 - 3)))
+#define KASAN_SHADOW_END	((1UL << 61) + KASAN_SHADOW_OFFSET)
+#define _KASAN_SHADOW_START(va)	(KASAN_SHADOW_END - (1UL << ((va) - 3)))
+#define KASAN_SHADOW_START      _KASAN_SHADOW_START(VA_BITS)
 
 void kasan_init(void);
 void kasan_copy_shadow(pgd_t *pgdir);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e8b4dcc11fed..5506f7d66bfa 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -19,6 +19,7 @@
 #include <asm/bug.h>
 #include <asm/proc-fns.h>
 
+#include <asm/kasan.h>
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable-prot.h>
@@ -30,7 +31,11 @@
  * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space
  *	and fixed mappings
  */
-#define VMALLOC_START		(VA_START + KASAN_SHADOW_SIZE)
+#ifdef CONFIG_KASAN
+#define VMALLOC_START		(KASAN_SHADOW_END)
+#else
+#define VMALLOC_START		(VA_START)
+#endif
 #define VMALLOC_END		(FIXADDR_TOP - PUD_SIZE)
 
 #define vmemmap			((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 5aef679e61c6..968535789d13 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -135,7 +135,6 @@ static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 /* The early shadow maps everything to a single page of zeroes */
 asmlinkage void __init kasan_early_init(void)
 {
-	BUILD_BUG_ON(KASAN_SHADOW_OFFSET != KASAN_SHADOW_END - (1UL << 61));
 	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_START, PGDIR_SIZE));
 	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
 	kasan_pgd_populate(KASAN_SHADOW_START, KASAN_SHADOW_END, NUMA_NO_NODE,
-- 
2.11.0

* [PATCH 09/12] arm64: dump: Make kernel page table dumper dynamic again
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

The kernel page table dumper assumes that the placement of VA regions is
constant and determined at compile time. As we are about to introduce
variable VA logic, we need to be able to determine certain regions at
boot time.

This patch adds logic to the kernel page table dumper s.t. these regions
can be computed at boot time.

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/mm/dump.c | 51 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 43 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index e5d1b5f432fe..c50af666c407 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -29,18 +29,42 @@
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptdump.h>
 
-static const struct addr_marker address_markers[] = {
-	{ PAGE_OFFSET,			"Linear Mapping start" },
-	{ VA_START,			"Linear Mapping end" },
+
+enum address_markers_idx {
+	PAGE_OFFSET_NR = 0,
+	VA_START_NR,
 #ifdef CONFIG_KASAN
-	{ KASAN_SHADOW_START,		"Kasan shadow start" },
+	KASAN_START_NR,
+	KASAN_END_NR,
+#endif
+	VMALLOC_START_NR,
+	VMALLOC_END_NR,
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+	VMEMMAP_START_NR,
+	VMEMMAP_END_NR,
+#endif
+	FIXADDR_START_NR,
+	FIXADDR_END_NR,
+	PCI_START_NR,
+	PCI_END_NR,
+	MODULES_START_NR,
+	MODULES_END_NR,
+	KIMAGE_NR,
+	END_NR
+};
+
+static struct addr_marker address_markers[] = {
+	{ 0 /* PAGE_OFFSET */,		"Linear Mapping start" },
+	{ 0 /* VA_START */,		"Linear Mapping end" },
+#ifdef CONFIG_KASAN
+	{ 0 /* KASAN_SHADOW_START */,	"Kasan shadow start" },
 	{ KASAN_SHADOW_END,		"Kasan shadow end" },
 #endif
-	{ VMALLOC_START,		"vmalloc() Area" },
+	{ 0 /* VMALLOC_START */,	"vmalloc() Area" },
 	{ VMALLOC_END,			"vmalloc() End" },
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-	{ VMEMMAP_START,		"vmemmap start" },
-	{ VMEMMAP_START + VMEMMAP_SIZE,	"vmemmap end"},
+	{ 0 /* VMEMMAP_START */,		"vmemmap start" },
+	{ 0 /*VMEMMAP_START + VMEMMAP_SIZE */,	"vmemmap end"},
 #endif
 	{ FIXADDR_START,		"Fixmap start" },
 	{ FIXADDR_TOP,			"Fixmap end" },
@@ -377,7 +401,6 @@ static void ptdump_initialize(void)
 static struct ptdump_info kernel_ptdump_info = {
 	.mm		= &init_mm,
 	.markers	= address_markers,
-	.base_addr	= PAGE_OFFSET,
 };
 
 void ptdump_check_wx(void)
@@ -403,6 +426,18 @@ void ptdump_check_wx(void)
 static int ptdump_init(void)
 {
 	ptdump_initialize();
+	kernel_ptdump_info.base_addr = PAGE_OFFSET;
+	address_markers[PAGE_OFFSET_NR].start_address = PAGE_OFFSET;
+	address_markers[VA_START_NR].start_address = VA_START;
+#ifdef CONFIG_KASAN
+	address_markers[KASAN_START_NR].start_address = KASAN_SHADOW_START;
+#endif
+	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+	address_markers[VMEMMAP_END_NR].start_address = VMEMMAP_START
+							+ VMEMMAP_SIZE;
+#endif
 	return ptdump_debugfs_register(&kernel_ptdump_info,
 					"kernel_page_tables");
 }
-- 
2.11.0

* [PATCH 10/12] arm64: mm: Make VA_BITS variable, introduce VA_BITS_MIN
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

In order to allow the kernel to select different virtual address sizes
at boot we need to "de-constify" VA_BITS. This patch introduces
vabits_actual, a variable which is set very early during boot, and
VA_BITS is then re-defined to reference this variable.

Having VA_BITS variable can potentially break a lot of code that makes
compile-time deductions from it. To prevent future code changes from
breaking variable VA, this patch makes VA_BITS variable unconditionally
(i.e. no CONFIG option changes this).

A new constant, VA_BITS_MIN, is defined; it gives the minimum address
space size the kernel is compiled for. This is used, for example, in the
EFI stub code to choose the furthest addressable distance at which the
initrd can be placed. Increasing the VA space size at boot does not
invalidate this logic.

Also, VA_BITS_MIN is now used to detect whether or not additional page
table levels are required for the idmap. We used to check for
 #ifdef CONFIG_ARM64_VA_BITS_48
which does not work when moving up to 52-bits.
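
The shape of the change, in brief (the exact VA_BITS_MIN definition is not
visible in the hunks below, so the last line here is an assumption):

	extern u64 vabits_actual;			/* set very early in boot */
	#define VA_BITS		({vabits_actual;})	/* now a runtime value */
	#define VA_BITS_MIN	(CONFIG_ARM64_VA_BITS)	/* assumed compile-time minimum */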

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/include/asm/assembler.h   |  2 +-
 arch/arm64/include/asm/efi.h         |  4 ++--
 arch/arm64/include/asm/kvm_mmu.h     |  6 ++++--
 arch/arm64/include/asm/memory.h      | 17 ++++++++++-------
 arch/arm64/include/asm/mmu_context.h |  2 +-
 arch/arm64/include/asm/pgtable.h     |  4 ++--
 arch/arm64/include/asm/processor.h   |  2 +-
 arch/arm64/kernel/cpufeature.c       |  2 +-
 arch/arm64/kernel/head.S             | 13 ++++++++-----
 arch/arm64/kvm/hyp-init.S            |  2 +-
 arch/arm64/mm/fault.c                |  2 +-
 arch/arm64/mm/init.c                 |  4 ++++
 arch/arm64/mm/kasan_init.c           |  6 +++---
 arch/arm64/mm/mmu.c                  |  5 ++++-
 arch/arm64/mm/proc.S                 | 21 ++++++++++++++++++++-
 15 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index aef72d886677..b59c04caae44 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -344,7 +344,7 @@ alternative_endif
  * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
  */
 	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
+#if VA_BITS_MIN < 48
 	ldr_l	\tmpreg, idmap_t0sz
 	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
 #endif
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 650344d01124..57d0b6b13231 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -66,7 +66,7 @@ static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
 
 /*
  * On arm64, we have to ensure that the initrd ends up in the linear region,
- * which is a 1 GB aligned region of size '1UL << (VA_BITS - 1)' that is
+ * which is a 1 GB aligned region of size '1UL << (VA_BITS_MIN - 1)' that is
  * guaranteed to cover the kernel Image.
  *
  * Since the EFI stub is part of the kernel Image, we can relax the
@@ -77,7 +77,7 @@ static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
 static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base,
 						    unsigned long image_addr)
 {
-	return (image_addr & ~(SZ_1G - 1UL)) + (1UL << (VA_BITS - 1));
+	return (image_addr & ~(SZ_1G - 1UL)) + (1UL << (VA_BITS_MIN - 1));
 }
 
 #define efi_call_early(f, ...)		sys_table_arg->boottime->f(__VA_ARGS__)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index d74d5236c26c..5174fd7e5196 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -69,8 +69,10 @@
  * mappings, and none of this applies in that case.
  */
 
-#define HYP_MAP_KERNEL_BITS	(UL(0xffffffffffffffff) << VA_BITS)
-#define HYP_MAP_HIGH_BIT	(UL(1) << (VA_BITS - 1))
+#define _HYP_MAP_KERNEL_BITS(va)	(UL(0xffffffffffffffff) << (va))
+#define _HYP_MAP_HIGH_BIT(va)		(UL(1) << ((va) - 1))
+#define HYP_MAP_KERNEL_BITS	_HYP_MAP_KERNEL_BITS(VA_BITS_MIN)
+#define HYP_MAP_HIGH_BIT	_HYP_MAP_HIGH_BIT(VA_BITS_MIN)
 
 #ifdef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index ba80561c6ed8..652ae83468b6 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -62,11 +62,6 @@
  * VA_BITS - the maximum number of bits for virtual addresses.
  * VA_START - the first kernel virtual address.
  */
-#define VA_BITS			(CONFIG_ARM64_VA_BITS)
-#define VA_START		(UL(0xffffffffffffffff) - \
-	(UL(1) << (VA_BITS - 1)) + 1)
-#define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
-	(UL(1) << VA_BITS) + 1)
 #define PAGE_OFFSET_END		(VA_START)
 #define KIMAGE_VSIZE		(SZ_512M)
 #define KIMAGE_VADDR		(UL(0) - SZ_2M - KIMAGE_VSIZE)
@@ -177,10 +172,18 @@
 #endif
 
 #ifndef __ASSEMBLY__
+extern u64			vabits_actual;
+#define VA_BITS			({vabits_actual;})
+#define VA_START		(UL(0xffffffffffffffff) - \
+	(UL(1) << (VA_BITS - 1)) + 1)
+#define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
+	(UL(1) << VA_BITS) + 1)
+#define PAGE_OFFSET_END		(VA_START)
 
 #include <linux/bitops.h>
 #include <linux/mmdebug.h>
 
+extern s64			physvirt_offset;
 extern s64			memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
@@ -226,7 +229,7 @@ static inline unsigned long kaslr_offset(void)
  */
 #define __is_lm_address(addr)	(!((addr) & BIT(VA_BITS - 1)))
 
-#define __lm_to_phys(addr)	(((addr) & ~PAGE_OFFSET) + PHYS_OFFSET)
+#define __lm_to_phys(addr)	(((addr) + physvirt_offset))
 #define __kimg_to_phys(addr)	((addr) - kimage_voffset)
 
 #define __virt_to_phys_nodebug(x) ({					\
@@ -245,7 +248,7 @@ extern phys_addr_t __phys_addr_symbol(unsigned long x);
 #define __phys_addr_symbol(x)	__pa_symbol_nodebug(x)
 #endif
 
-#define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
+#define __phys_to_virt(x)	((unsigned long)((x) - physvirt_offset))
 #define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 3257895a9b5e..e57ed28ed360 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -66,7 +66,7 @@ extern u64 idmap_t0sz;
 
 static inline bool __cpu_uses_extended_idmap(void)
 {
-	return (!IS_ENABLED(CONFIG_ARM64_VA_BITS_48) &&
+	return ((VA_BITS_MIN < 48) &&
 		unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)));
 }
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5506f7d66bfa..2cfcb406f0dd 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -683,8 +683,8 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 }
 #endif
 
-extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
-extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+extern pgd_t swapper_pg_dir[];
+extern pgd_t idmap_pg_dir[];
 extern pgd_t swapper_pg_end[];
 /*
  * Encode and decode a swap entry:
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 023cacb946c3..aa294d1ddea8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -19,7 +19,7 @@
 #ifndef __ASM_PROCESSOR_H
 #define __ASM_PROCESSOR_H
 
-#define TASK_SIZE_64		(UL(1) << VA_BITS)
+#define TASK_SIZE_64		(UL(1) << VA_BITS_MIN)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 99b1d1ebe551..31cfffa79fee 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -834,7 +834,7 @@ static bool hyp_flip_space(const struct arm64_cpu_capabilities *entry,
 	 * - the idmap doesn't clash with it,
 	 * - the kernel is not running at EL2.
 	 */
-	return idmap_addr <= GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
+	return idmap_addr <= GENMASK(VA_BITS_MIN - 2, 0) && !is_kernel_in_hyp_mode();
 }
 
 static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 71f1722685fe..6a637f91edfc 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -119,6 +119,7 @@ ENTRY(stext)
 	adrp	x23, __PHYS_OFFSET
 	and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
 	bl	set_cpu_boot_mode_flag
+	bl	__setup_va_constants
 	bl	__create_page_tables
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -250,7 +251,9 @@ ENDPROC(preserve_boot_args)
 	add \rtbl, \tbl, #PAGE_SIZE
 	mov \sv, \rtbl
 	mov \count, #1
-	compute_indices \vstart, \vend, #PGDIR_SHIFT, #PTRS_PER_PGD, \istart, \iend, \count
+
+	ldr_l \tmp, ptrs_per_pgd
+	compute_indices \vstart, \vend, #PGDIR_SHIFT, \tmp, \istart, \iend, \count
 	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
 	mov \tbl, \sv
 	mov \sv, \rtbl
@@ -314,7 +317,7 @@ __create_page_tables:
 	adrp	x3, __idmap_text_start		// __pa(__idmap_text_start)
 	adrp	x4, __idmap_text_end		// __pa(__idmap_text_end)
 
-#ifndef CONFIG_ARM64_VA_BITS_48
+#if (VA_BITS_MIN < 48)
 #define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
 #define EXTRA_PTRS	(1 << (48 - EXTRA_SHIFT))
 
@@ -329,7 +332,7 @@ __create_page_tables:
 	 * utilised, and that lowering T0SZ will always result in an additional
 	 * translation level to be configured.
 	 */
-#if VA_BITS != EXTRA_SHIFT
+#if VA_BITS_MIN != EXTRA_SHIFT
 #error "Mismatch between VA_BITS and page size/number of translation levels"
 #endif
 
@@ -340,8 +343,8 @@ __create_page_tables:
 	 * the physical address of __idmap_text_end.
 	 */
 	clz	x5, x4
-	cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
-	b.ge	1f			// .. then skip additional level
+	cmp	x5, TCR_T0SZ(VA_BITS_MIN)	// default T0SZ small enough?
+	b.ge	1f				// .. then skip additional level
 
 	adr_l	x6, idmap_t0sz
 	str	x5, [x6]
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 3f9615582377..729c49d9f574 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -71,7 +71,7 @@ __do_hyp_init:
 	mov	x5, #TCR_EL2_RES1
 	orr	x4, x4, x5
 
-#ifndef CONFIG_ARM64_VA_BITS_48
+#if VA_BITS_MIN < 48
 	/*
 	 * If we are running with VA_BITS < 48, we may be running with an extra
 	 * level of translation in the ID map. This is only the case if system
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 22168cd0dde7..c0e0f8638bee 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -149,7 +149,7 @@ void show_pte(unsigned long addr)
 		return;
 	}
 
-	pr_alert("%s pgtable: %luk pages, %u-bit VAs, pgd = %p\n",
+	pr_alert("%s pgtable: %luk pages, %llu-bit VAs, pgd = %p\n",
 		 mm == &init_mm ? "swapper" : "user", PAGE_SIZE / SZ_1K,
 		 VA_BITS, mm->pgd);
 	pgd = pgd_offset(mm, addr);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 230d78b75831..7e130252edce 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -62,6 +62,8 @@
 s64 memstart_addr __ro_after_init = -1;
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
+s64 physvirt_offset __ro_after_init = -1;
+
 #ifdef CONFIG_BLK_DEV_INITRD
 static int __init early_initrd(char *p)
 {
@@ -372,6 +374,8 @@ void __init arm64_memblock_init(void)
 	memstart_addr = round_down(memblock_start_of_DRAM(),
 				   ARM64_MEMSTART_ALIGN);
 
+	physvirt_offset = PHYS_OFFSET - PAGE_OFFSET;
+
 	/*
 	 * Remove the memory that we will not be able to cover with the
 	 * linear mapping. Take care not to clip the kernel which may be
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 968535789d13..33f99e18e4ff 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -27,7 +27,7 @@
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
 
-static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
+static pgd_t tmp_pg_dir[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
 
 /*
  * The p*d_populate functions call virt_to_phys implicitly so they can't be used
@@ -135,7 +135,7 @@ static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 /* The early shadow maps everything to a single page of zeroes */
 asmlinkage void __init kasan_early_init(void)
 {
-	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_START, PGDIR_SIZE));
+	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), PGDIR_SIZE));
 	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
 	kasan_pgd_populate(KASAN_SHADOW_START, KASAN_SHADOW_END, NUMA_NO_NODE,
 			   true);
@@ -195,7 +195,7 @@ void __init kasan_init(void)
 	 * tmp_pg_dir used to keep early shadow mapped until full shadow
 	 * setup will be finished.
 	 */
-	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
+	memcpy(tmp_pg_dir, swapper_pg_dir, PTRS_PER_PGD * sizeof(pgd_t));
 	dsb(ishst);
 	cpu_replace_ttbr1(lm_alias(tmp_pg_dir));
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c8f486384fe3..8a063f22de88 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -49,7 +49,10 @@
 #define NO_BLOCK_MAPPINGS	BIT(0)
 #define NO_CONT_MAPPINGS	BIT(1)
 
-u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
+u64 idmap_t0sz __ro_after_init;
+u64 ptrs_per_pgd __ro_after_init;
+u64 vabits_actual __ro_after_init;
+EXPORT_SYMBOL(vabits_actual);
 
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233dfc4c39..607a6ff0e205 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -223,7 +223,7 @@ ENTRY(__cpu_setup)
 	 * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
 	 * both user and kernel.
 	 */
-	ldr	x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
+	ldr	x10, =TCR_TxSZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
 			TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0
 	tcr_set_idmap_t0sz	x10, x9
 
@@ -250,6 +250,25 @@ ENTRY(__cpu_setup)
 	ret					// return to head.S
 ENDPROC(__cpu_setup)
 
+ENTRY(__setup_va_constants)
+	mov	x0, #VA_BITS_MIN
+	mov	x1, TCR_T0SZ(VA_BITS_MIN)
+	mov	x2, #1 << (VA_BITS_MIN - PGDIR_SHIFT)
+	str_l	x0, vabits_actual, x5
+	str_l	x1, idmap_t0sz, x5
+	str_l	x2, ptrs_per_pgd, x5
+
+	adr_l	x0, vabits_actual
+	adr_l	x1, idmap_t0sz
+	adr_l	x2, ptrs_per_pgd
+	dmb	sy
+	dc	ivac, x0	// Invalidate potentially stale cache
+	dc	ivac, x1
+	dc	ivac, x2
+
+	ret
+ENDPROC(__setup_va_constants)
+
 	/*
 	 * We set the desired value explicitly, including those of the
 	 * reserved bits. The values of bits EE & E0E were set early in
-- 
2.11.0
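
A rough user-space model of the linear-map translation above (a sketch only:
the helpers mirror __lm_to_phys/__phys_to_virt, the PAGE_OFFSET/PHYS_OFFSET
values are made-up examples, and the real physvirt_offset is set once in
arm64_memblock_init()):

#include <stdint.h>
#include <stdio.h>

/* Example values only: a hypothetical kernel VA start and RAM base. */
#define EXAMPLE_PAGE_OFFSET	0xffff000000000000ULL
#define EXAMPLE_PHYS_OFFSET	0x0000000080000000ULL

static int64_t physvirt_offset;

static uint64_t lm_to_phys(uint64_t addr) { return addr + physvirt_offset; }
static uint64_t phys_to_virt(uint64_t pa) { return pa - physvirt_offset; }

int main(void)
{
	/* Mirrors the assignment added to arm64_memblock_init(). */
	physvirt_offset = EXAMPLE_PHYS_OFFSET - EXAMPLE_PAGE_OFFSET;

	uint64_t va = phys_to_virt(EXAMPLE_PHYS_OFFSET + 0x1000);

	printf("va = 0x%llx\n", (unsigned long long)va);
	printf("pa = 0x%llx\n", (unsigned long long)lm_to_phys(va));
	return 0;
}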

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 11/12] arm64: KVM: Add support for an alternative VA space
  2017-12-04 14:13 [PATCH 00/12] 52-bit kernel VAs for arm64 Steve Capper
                   ` (9 preceding siblings ...)
  2017-12-04 14:13 ` [PATCH 10/12] arm64: mm: Make VA_BITS variable, introduce VA_BITS_MIN Steve Capper
@ 2017-12-04 14:13 ` Steve Capper
  2017-12-04 14:13 ` [PATCH 12/12] arm64: mm: Add 48/52-bit kernel VA support Steve Capper
  11 siblings, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adjusts the alternative patching logic for kern_hyp_va to
take into account a change in virtual address space size on boot.

Because the instructions in the alternatives regions have to be fixed at
compile time, the predicates have to be adjusted in order to make the logic
depend on a VA size that is only known at boot.

The predicates used follow this logic:
 - ARM64_HAS_VIRT_HOST_EXTN, true if running with VHE
 - ARM64_HYP_MAP_FLIP, true if !VHE and idmap is high and VA size is small.
 - ARM64_HYP_RUNNING_ALT_VA, true if !VHE and VA size is big.
 - ARM64_HYP_MAP_FLIP_ALT, true if !VHE and idmap is high and VA size is big.

Using the above predicates means we have to add two instructions to
kern_hyp_va.
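
A minimal C model of the resulting selection (a sketch only: the real code is
alternative-patched assembly, the mask fields stand in for the HYP_MAP_*
constants defined below, and the booleans stand in for the cpucaps above):

#include <stdbool.h>
#include <stdint.h>

struct hyp_masks {
	uint64_t kernel_bits;		/* HYP_MAP_KERNEL_BITS */
	uint64_t kernel_bits_alt;	/* HYP_MAP_KERNEL_BITS_ALT */
	uint64_t high_bit;		/* HYP_MAP_HIGH_BIT */
	uint64_t high_bit_alt;		/* HYP_MAP_HIGH_BIT_ALT */
};

uint64_t model_kern_hyp_va(uint64_t va, const struct hyp_masks *m,
			   bool vhe, bool flip, bool alt_va, bool flip_alt)
{
	if (vhe)			/* ARM64_HAS_VIRT_HOST_EXTN: all slots nop */
		return va;

	va ^= m->kernel_bits;		/* first slot, always taken when !VHE */
	if (flip)			/* ARM64_HYP_MAP_FLIP */
		va ^= m->high_bit;
	if (alt_va)			/* ARM64_HYP_RUNNING_ALT_VA (1st extra eor) */
		va ^= m->kernel_bits_alt;
	if (flip_alt)			/* ARM64_HYP_MAP_FLIP_ALT (2nd extra eor) */
		va ^= m->high_bit_alt;
	return va;
}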

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/Kconfig               |  4 ++++
 arch/arm64/include/asm/cpucaps.h |  4 +++-
 arch/arm64/include/asm/kvm_mmu.h | 47 ++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpufeature.c   | 39 ++++++++++++++++++++++++++++++++-
 4 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0fa430326825..143c453b06f1 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -656,6 +656,10 @@ config ARM64_VA_BITS
 	default 47 if ARM64_VA_BITS_47
 	default 48 if ARM64_VA_BITS_48
 
+config ARM64_VA_BITS_ALT
+	bool
+	default n
+
 config CPU_BIG_ENDIAN
        bool "Build big-endian kernel"
        help
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 3de31a1010ee..955936adcf7a 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -41,7 +41,9 @@
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
 #define ARM64_SVE				22
+#define ARM64_HYP_RUNNING_ALT_VA		23
+#define ARM64_HYP_MAP_FLIP_ALT			24
 
-#define ARM64_NCAPS				23
+#define ARM64_NCAPS				25
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 5174fd7e5196..8de396764a11 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -73,6 +73,11 @@
 #define _HYP_MAP_HIGH_BIT(va)		(UL(1) << ((va) - 1))
 #define HYP_MAP_KERNEL_BITS	_HYP_MAP_KERNEL_BITS(VA_BITS_MIN)
 #define HYP_MAP_HIGH_BIT	_HYP_MAP_HIGH_BIT(VA_BITS_MIN)
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+#define HYP_MAP_KERNEL_BITS_ALT	(_HYP_MAP_KERNEL_BITS(VA_BITS_ALT) \
+				 ^ _HYP_MAP_KERNEL_BITS(VA_BITS_MIN))
+#define HYP_MAP_HIGH_BIT_ALT	_HYP_MAP_HIGH_BIT(VA_BITS_ALT)
+#endif
 
 #ifdef __ASSEMBLY__
 
@@ -95,6 +100,27 @@
  * - VHE:
  *		nop
  *		nop
+ *
+ * For cases where we are running with a variable address space size,
+ * two extra instructions are added, and the logic changes thusly:
+ *
+ * - Flip the kernel bits for the new VA:
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS
+ *		nop
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS_ALT
+ *		nop
+ *
+ * - Flip the kernel bits and upper HYP bit for new VA:
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS
+ *		nop
+ *		eor x0, x0, #HYP_MAP_KERNEL_BITS_ALT
+ *		eor x0, x0, #HYP_MAP_HIGH_BIT_ALT
+ *
+ * - VHE:
+ *		nop
+ *		nop
+ *		nop
+ *		nop
  */
 .macro kern_hyp_va	reg
 alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
@@ -103,6 +129,14 @@ alternative_else_nop_endif
 alternative_if ARM64_HYP_MAP_FLIP
 	eor     \reg, \reg, #HYP_MAP_HIGH_BIT
 alternative_else_nop_endif
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+alternative_if ARM64_HYP_RUNNING_ALT_VA
+	eor	\reg, \reg, #HYP_MAP_KERNEL_BITS_ALT
+alternative_else_nop_endif
+alternative_if ARM64_HYP_MAP_FLIP_ALT
+	eor     \reg, \reg, #HYP_MAP_HIGH_BIT_ALT
+alternative_else_nop_endif
+#endif
 .endm
 
 #else
@@ -125,6 +159,19 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
 				 ARM64_HYP_MAP_FLIP)
 		     : "+r" (v)
 		     : "i" (HYP_MAP_HIGH_BIT));
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+	asm volatile(ALTERNATIVE("nop",
+				 "eor %0, %0, %1",
+				 ARM64_HYP_RUNNING_ALT_VA)
+		     : "+r" (v)
+		     : "i" (HYP_MAP_KERNEL_BITS_ALT));
+	asm volatile(ALTERNATIVE("nop",
+				 "eor %0, %0, %1",
+				 ARM64_HYP_MAP_FLIP_ALT)
+		     : "+r" (v)
+		     : "i" (HYP_MAP_HIGH_BIT_ALT));
+#endif
+
 	return v;
 }
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 31cfffa79fee..cd4bcd2d0942 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -834,7 +834,8 @@ static bool hyp_flip_space(const struct arm64_cpu_capabilities *entry,
 	 * - the idmap doesn't clash with it,
 	 * - the kernel is not running at EL2.
 	 */
-	return idmap_addr <= GENMASK(VA_BITS_MIN - 2, 0) && !is_kernel_in_hyp_mode();
+	return (VA_BITS == VA_BITS_MIN) &&
+		idmap_addr <= GENMASK(VA_BITS_MIN - 2, 0) && !is_kernel_in_hyp_mode();
 }
 
 static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
@@ -845,6 +846,28 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 					ID_AA64PFR0_FP_SHIFT) < 0;
 }
 
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+static bool hyp_using_large_va(const struct arm64_cpu_capabilities *entry,
+				int __unused)
+{
+	return (VA_BITS > VA_BITS_MIN) && !is_kernel_in_hyp_mode();
+}
+
+static bool hyp_flip_space_alt(const struct arm64_cpu_capabilities *entry,
+			   int __unused)
+{
+	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
+
+	/*
+	 * Activate the lower HYP offset only if:
+	 * - the idmap doesn't clash with it,
+	 * - the kernel is not running at EL2.
+	 */
+	return (VA_BITS > VA_BITS_MIN) &&
+		idmap_addr <= GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
+}
+#endif
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -931,6 +954,20 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.def_scope = SCOPE_SYSTEM,
 		.matches = hyp_flip_space,
 	},
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+	{
+		.desc = "HYP mapping using larger VA space",
+		.capability = ARM64_HYP_RUNNING_ALT_VA,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = hyp_using_large_va,
+	},
+	{
+		.desc = "HYP mapping using flipped, larger VA space",
+		.capability = ARM64_HYP_MAP_FLIP_ALT,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = hyp_flip_space_alt,
+	},
+#endif
 	{
 		/* FP/SIMD is not implemented */
 		.capability = ARM64_HAS_NO_FPSIMD,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 12/12] arm64: mm: Add 48/52-bit kernel VA support
  2017-12-04 14:13 [PATCH 00/12] 52-bit kernel VAs for arm64 Steve Capper
                   ` (10 preceding siblings ...)
  2017-12-04 14:13 ` [PATCH 11/12] arm64: KVM: Add support for an alternative VA space Steve Capper
@ 2017-12-04 14:13 ` Steve Capper
  11 siblings, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-04 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

Add the option to use a 52-bit kernel VA space when support for it is detected
at boot. We use the same KASAN_SHADOW_OFFSET for both the 48 and 52-bit VA spaces as in
both cases the start and end of the KASAN shadow region are PGD aligned.
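
A quick way to check that claim is to compute the shadow bounds directly from
the 0xdfffa00000000000 offset in the Kconfig change below (a sketch, assuming
the usual (addr >> 3) + KASAN_SHADOW_OFFSET shadow mapping and the 64KB-granule
PGDIR_SHIFT of 42):

#include <stdio.h>

int main(void)
{
	const unsigned long long offset = 0xdfffa00000000000ULL; /* KASAN_SHADOW_OFFSET */
	const unsigned long long pgdir_size = 1ULL << 42;

	for (int va_bits = 48; va_bits <= 52; va_bits += 4) {
		/* start = ((2^64 - 2^va_bits) >> 3) + offset, end = 2^61 + offset */
		unsigned long long end   = (1ULL << 61) + offset;
		unsigned long long start = end - (1ULL << (va_bits - 3));

		printf("VA_BITS=%d: shadow 0x%llx - 0x%llx, PGD aligned: %d\n",
		       va_bits, start, end,
		       !(start % pgdir_size) && !(end % pgdir_size));
	}
	return 0;
}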

From ID_AA64MMFR2, we check the LVA field on very early boot and set the
VA size, PGDIR_SHIFT and TCR.T[01]SZ values which then influence how the
rest of the memory system behaves.
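
A rough C rendering of that boot-time decision (a sketch only: the real check
runs in early assembly in __setup_va_constants below, the LVA field shift is
taken from the sysreg layout, and the PGDIR_SHIFT/T0SZ arithmetic assumes the
64KB-granule, 3-level configuration this option depends on):

#include <stdio.h>

#define LVA_SHIFT	16	/* ID_AA64MMFR2_EL1 VARange field */
#define PGDIR_SHIFT	42	/* 64KB pages, 3 levels of table */

struct va_constants {
	unsigned int vabits_actual;
	unsigned int t0sz;
	unsigned long ptrs_per_pgd;
};

static struct va_constants setup_va_constants(unsigned long long id_aa64mmfr2)
{
	struct va_constants c;
	unsigned int lva = (id_aa64mmfr2 >> LVA_SHIFT) & 0xf;

	c.vabits_actual = (lva == 1) ? 52 : 48;	/* VA_BITS_ALT : VA_BITS_MIN */
	c.t0sz          = 64 - c.vabits_actual;
	c.ptrs_per_pgd  = 1UL << (c.vabits_actual - PGDIR_SHIFT);
	return c;
}

int main(void)
{
	/* LVA == 1: 52-bit VAs -> T0SZ 12 and 1024 PGD entries;
	 * otherwise the 48-bit fallback -> T0SZ 16 and 64 entries. */
	struct va_constants c = setup_va_constants(1ULL << LVA_SHIFT);

	printf("%u %u %lu\n", c.vabits_actual, c.t0sz, c.ptrs_per_pgd);
	return 0;
}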

Note that userspace addresses will still be capped at 48 bits. More
patches are needed to deal with scenarios where the user provides a
MAP_FIXED hint and a high address to mmap.

Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/Kconfig              |  8 ++++++++
 arch/arm64/include/asm/memory.h |  5 +++++
 arch/arm64/mm/kasan_init.c      |  3 +++
 arch/arm64/mm/proc.S            | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 143c453b06f1..be65dd4bb109 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -262,6 +262,7 @@ config PGTABLE_LEVELS
 	default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
 	default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
 	default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
+	default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48_52
 	default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
 	default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
 	default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
@@ -275,6 +276,7 @@ config ARCH_PROC_KCORE_TEXT
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
+	default 0xdfffa00000000000 if ARM64_VA_BITS_48_52
 	default 0xdfffa00000000000 if ARM64_VA_BITS_48
 	default 0xdfffd00000000000 if ARM64_VA_BITS_47
 	default 0xdffffe8000000000 if ARM64_VA_BITS_42
@@ -646,6 +648,10 @@ config ARM64_VA_BITS_47
 config ARM64_VA_BITS_48
 	bool "48-bit"
 
+config ARM64_VA_BITS_48_52
+	bool "48 or 52-bit (decided at boot time)"
+	depends on ARM64_64K_PAGES
+
 endchoice
 
 config ARM64_VA_BITS
@@ -655,9 +661,11 @@ config ARM64_VA_BITS
 	default 42 if ARM64_VA_BITS_42
 	default 47 if ARM64_VA_BITS_47
 	default 48 if ARM64_VA_BITS_48
+	default 48 if ARM64_VA_BITS_48_52
 
 config ARM64_VA_BITS_ALT
 	bool
+	default y if ARM64_VA_BITS_48_52
 	default n
 
 config CPU_BIG_ENDIAN
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 652ae83468b6..8530f8eb77da 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -72,6 +72,11 @@
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - PGDIR_SIZE)
 #define VMEMMAP_START		(FIXADDR_START - VMEMMAP_SIZE)
+#define VA_BITS_MIN		(CONFIG_ARM64_VA_BITS)
+
+#ifdef CONFIG_ARM64_VA_BITS_48_52
+#define VA_BITS_ALT		(52)
+#endif
 
 #define KERNEL_START      _text
 #define KERNEL_END        _end
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 33f99e18e4ff..38c933c17f82 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -135,6 +135,9 @@ static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 /* The early shadow maps everything to a single page of zeroes */
 asmlinkage void __init kasan_early_init(void)
 {
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_ALT), PGDIR_SIZE));
+#endif
 	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), PGDIR_SIZE));
 	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
 	kasan_pgd_populate(KASAN_SHADOW_START, KASAN_SHADOW_END, NUMA_NO_NODE,
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 607a6ff0e205..42a91a4a1126 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -225,6 +225,14 @@ ENTRY(__cpu_setup)
 	 */
 	ldr	x10, =TCR_TxSZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
 			TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0
+#ifdef CONFIG_ARM64_VA_BITS_ALT
+	ldr_l	x9, vabits_actual
+	cmp	x9, #VA_BITS_ALT
+	b.ne	1f
+	ldr	x10, =TCR_TxSZ(VA_BITS_ALT) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
+			TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0
+1:
+#endif
 	tcr_set_idmap_t0sz	x10, x9
 
 	/*
@@ -251,9 +259,22 @@ ENTRY(__cpu_setup)
 ENDPROC(__cpu_setup)
 
 ENTRY(__setup_va_constants)
+#ifdef CONFIG_ARM64_VA_BITS_48_52
+	mrs_s	x5, SYS_ID_AA64MMFR2_EL1
+	and	x5, x5, #0xf << ID_AA64MMFR2_LVA_SHIFT
+	cmp	x5, #1 << ID_AA64MMFR2_LVA_SHIFT
+	b.ne	1f
+	mov	x0, #VA_BITS_ALT
+	mov	x1, TCR_T0SZ(VA_BITS_ALT)
+	mov	x2, #1 << (VA_BITS_ALT - PGDIR_SHIFT)
+	b	2f
+#endif
+
+1:
 	mov	x0, #VA_BITS_MIN
 	mov	x1, TCR_T0SZ(VA_BITS_MIN)
 	mov	x2, #1 << (VA_BITS_MIN - PGDIR_SHIFT)
+2:
 	str_l	x0, vabits_actual, x5
 	str_l	x1, idmap_t0sz, x5
 	str_l	x2, ptrs_per_pgd, x5
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 01/12] KVM: arm/arm64: vgic: Remove spurious call to kern_hyp_va
  2017-12-04 14:13 ` [PATCH 01/12] KVM: arm/arm64: vgic: Remove spurious call to kern_hyp_va Steve Capper
@ 2017-12-04 14:30   ` Suzuki K Poulose
  2017-12-12 11:53     ` Steve Capper
  0 siblings, 1 reply; 26+ messages in thread
From: Suzuki K Poulose @ 2017-12-04 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/12/17 14:13, Steve Capper wrote:
> In save_elrsr(.), we use the following technique to ascertain the
> address of the vgic global state:
> 	(kern_hyp_va(&kvm_vgic_global_state))->nr_lr
> 
> For arm, kern_hyp_va(va) == va, and this call effectively compiles out.
> 
> For arm64, this call can be spurious as the address of kvm_vgic_global_state
> will usually be determined by relative page/absolute page offset relocation
> at link time. As the function is idempotent, having the call for arm64 does
> not cause any problems.
> 
> Unfortunately, this is about to change for arm64 as we need to change
> the logic of kern_hyp_va to allow for kernel addresses that are outside
> the direct linear map.
> 
> This patch removes the call to kern_hyp_va, and ensures that correct
> HYP addresses are computed via relative page offset addressing on arm64.
> This is achieved by a custom accessor, hyp_address(.), which on arm is a
> simple reference operator.

minor nit: I somehow feel that the word "symbol" should be part of the name of
the macro, to make it explicit that it can only be used on a symbol and not on
any generic variable.

> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 08d3bb66c8b7..34a4ae906a97 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -25,6 +25,16 @@
>   
>   #define __hyp_text __section(.hyp.text) notrace
>   
> +#define hyp_address(symbol)				\
> +({							\
> +	typeof(&symbol) __ret;				\
> +	asm volatile(					\
> +	"adrp %[ptr], " #symbol	"\n"			\
> +	"add %[ptr], %[ptr], :lo12:" #symbol "\n"	\
> +	: [ptr] "=r"(__ret));				\
> +	__ret;						\
> +})
> +

> -	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
> +	addr  = kern_hyp_va(hyp_address(kvm_vgic_global_state)->vcpu_base_va);

e.g., like here, why do we use hyp_address() only for kvm_vgic_global_state and not
for the dereferenced value? Having a name, say, hyp_symbol_address() makes it clear.
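
A toy user-space rendering of that distinction (everything below is a
stand-in; the real hyp_address() quoted above expands to an adrp/add pair on
the symbol name, which is exactly why an arbitrary expression cannot be passed
to it):

#include <stdio.h>

struct vgic_global_toy {
	void *vcpu_base_va;
};

/* Stand-ins just to make the shape of the call visible. */
#define hyp_address_toy(sym)	(&(sym))	/* real one: "adrp %0, sym", symbol only */
#define kern_hyp_va_toy(v)	(v)		/* real one: bit-mask conversion */

static struct vgic_global_toy kvm_vgic_global_state_toy = {
	.vcpu_base_va = (void *)0x8000,
};

int main(void)
{
	/* The symbol is resolved PC-relatively; the pointer loaded from it
	 * is runtime data, so that still goes through kern_hyp_va(). */
	void *addr = kern_hyp_va_toy(hyp_address_toy(kvm_vgic_global_state_toy)->vcpu_base_va);

	printf("%p\n", addr);
	return 0;
}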

Otherwise, looks good to me.

Suzuki

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 05/12] arm64: mm: Remove VMALLOC checks from update_mapping_prot(.)
  2017-12-04 14:13 ` [PATCH 05/12] arm64: mm: Remove VMALLOC checks from update_mapping_prot(.) Steve Capper
@ 2017-12-04 16:01   ` Ard Biesheuvel
  2017-12-12 15:39     ` Steve Capper
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2017-12-04 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> update_mapping_prot assumes that it will be used on the VA for the
> kernel .text section. (Via the check virt >= VMALLOC_START)
>
> Recent kdump patches employ this function to modify the protection of
> the direct linear mapping (which is strictly speaking outside of this
> area), via mark_linear_text_alias_ro(.).
>

Isn't that a bug? Is it guaranteed that those protection attributes
can be modified without splitting live page tables, and the resulting
risk of TLB conflicts?

> We "get away" with this as the direct linear mapping currently follows
> the VA for the kernel text, so the check passes.
>
> This patch removes the check in update_mapping_prot allowing us to move
> the kernel VA layout without spuriously firing the warning.
>
> Signed-off-by: Steve Capper <steve.capper@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 6 ------
>  1 file changed, 6 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 58b1ed6fd7ec..c8f486384fe3 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -383,12 +383,6 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>  static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
>                                 phys_addr_t size, pgprot_t prot)
>  {
> -       if (virt < VMALLOC_START) {
> -               pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
> -                       &phys, virt);
> -               return;
> -       }
> -
>         __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
>                              NO_CONT_MAPPINGS);
>
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 14:13 ` [PATCH 07/12] arm64: mm: Place kImage at bottom of " Steve Capper
@ 2017-12-04 16:25   ` Ard Biesheuvel
  2017-12-04 17:18     ` Steve Capper
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2017-12-04 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> Re-arrange the kernel memory map s.t. the kernel image resides in the
> bottom 514MB of memory.

I guess this breaks KASLR entirely, no? Given that it adds an offset
in the range [0 ... sizeof(VMALLOC_SPACE) /4 ].

In any case, it makes sense to keep the kernel VA space adjacent to
the VMALLOC space, rather than put stuff like PCI I/O and the fixmap
in between.

> With the modules, fixed map, PCI IO space placed
> above it. At the very bottom of the memory map we set aside a 2MB guard
> region to prevent ambiguity with PTR_ERR/ERR_PTR.
>

Interesting. In another thread, we discussed whether it is necessary
to prevent the linear map randomization code from allocating at the
very top [bottom in Steve-speak] of the kernel virtual address space,
and this is a thing I did not consider.

> Dynamically resizable objects such as KASAN shadow and sparsemem map
> are placed above the fixed size objects.
>

The current placement of the sparsemem map was carefully chosen so
that virt_to_page/page_to_virt translations are extremely cheap. Is
that still the case?
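
A quick user-space check of the property the old placement relied on (both
base addresses below are made-up; only their bit patterns matter):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t off = 0x10000000ULL;	/* some offset into the vmemmap region */

	/* Old arrangement: the region ended at the top of the address space,
	 * so the base was "all ones above, all zeroes below" and '|' (and
	 * '& ~base') matched '+' (and '- base') for every in-range offset. */
	uint64_t old_style = 0xffffffe000000000ULL;

	/* New arrangement: the base sits below the fixmap and is only
	 * guaranteed a smaller alignment, so only the additive forms used
	 * in the patch stay correct for all offsets. */
	uint64_t new_style = 0xfffffbdfd0000000ULL;

	printf("old: or=0x%llx add=0x%llx\n",
	       (unsigned long long)(old_style | off),
	       (unsigned long long)(old_style + off));
	printf("new: or=0x%llx add=0x%llx\n",
	       (unsigned long long)(new_style | off),
	       (unsigned long long)(new_style + off));
	return 0;
}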

> This means that kernel addresses are now no longer directly dependent on
> VA space size.
>
> Signed-off-by: Steve Capper <steve.capper@arm.com>
> ---
>  arch/arm64/include/asm/memory.h  | 17 +++++++++--------
>  arch/arm64/include/asm/pgtable.h |  4 ++--
>  arch/arm64/mm/dump.c             | 12 +++++++-----
>  3 files changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0a912eb3d74f..ba80561c6ed8 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -68,14 +68,15 @@
>  #define PAGE_OFFSET            (UL(0xffffffffffffffff) - \
>         (UL(1) << VA_BITS) + 1)
>  #define PAGE_OFFSET_END                (VA_START)
> -#define KIMAGE_VADDR           (MODULES_END)
> -#define MODULES_END            (MODULES_VADDR + MODULES_VSIZE)
> -#define MODULES_VADDR          (VA_START + KASAN_SHADOW_SIZE)
> +#define KIMAGE_VSIZE           (SZ_512M)
> +#define KIMAGE_VADDR           (UL(0) - SZ_2M - KIMAGE_VSIZE)
>  #define MODULES_VSIZE          (SZ_128M)
> -#define VMEMMAP_START          (-VMEMMAP_SIZE)
> -#define PCI_IO_END             (VMEMMAP_START - SZ_2M)
> +#define MODULES_END            (KIMAGE_VADDR)
> +#define MODULES_VADDR          (MODULES_END - MODULES_VSIZE)
> +#define PCI_IO_END             (MODULES_VADDR - SZ_2M)
>  #define PCI_IO_START           (PCI_IO_END - PCI_IO_SIZE)
> -#define FIXADDR_TOP            (PCI_IO_START - SZ_2M)
> +#define FIXADDR_TOP            (PCI_IO_START - PGDIR_SIZE)
> +#define VMEMMAP_START          (FIXADDR_START - VMEMMAP_SIZE)
>
>  #define KERNEL_START      _text
>  #define KERNEL_END        _end
> @@ -292,10 +293,10 @@ static inline void *phys_to_virt(phys_addr_t x)
>  #define _virt_addr_valid(kaddr)        pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
>  #else
>  #define __virt_to_pgoff(kaddr) (((u64)(kaddr) & ~PAGE_OFFSET) / PAGE_SIZE * sizeof(struct page))
> -#define __page_to_voff(kaddr)  (((u64)(kaddr) & ~VMEMMAP_START) * PAGE_SIZE / sizeof(struct page))
> +#define __page_to_voff(kaddr)  (((u64)(kaddr) - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page))
>
>  #define page_to_virt(page)     ((void *)((__page_to_voff(page)) | PAGE_OFFSET))
> -#define virt_to_page(vaddr)    ((struct page *)((__virt_to_pgoff(vaddr)) | VMEMMAP_START))
> +#define virt_to_page(vaddr)    ((struct page *)((__virt_to_pgoff(vaddr)) + VMEMMAP_START))
>
>  #define _virt_addr_valid(kaddr)        pfn_valid((((u64)(kaddr) & ~PAGE_OFFSET) \
>                                            + PHYS_OFFSET) >> PAGE_SHIFT)
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 054b37143a50..e8b4dcc11fed 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -30,8 +30,8 @@
>   * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space
>   *     and fixed mappings
>   */
> -#define VMALLOC_START          (MODULES_END)
> -#define VMALLOC_END            (- PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
> +#define VMALLOC_START          (VA_START + KASAN_SHADOW_SIZE)
> +#define VMALLOC_END            (FIXADDR_TOP - PUD_SIZE)
>
>  #define vmemmap                        ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
>
> diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
> index b7b09c0fc50d..e5d1b5f432fe 100644
> --- a/arch/arm64/mm/dump.c
> +++ b/arch/arm64/mm/dump.c
> @@ -36,17 +36,19 @@ static const struct addr_marker address_markers[] = {
>         { KASAN_SHADOW_START,           "Kasan shadow start" },
>         { KASAN_SHADOW_END,             "Kasan shadow end" },
>  #endif
> -       { MODULES_VADDR,                "Modules start" },
> -       { MODULES_END,                  "Modules end" },
>         { VMALLOC_START,                "vmalloc() Area" },
>         { VMALLOC_END,                  "vmalloc() End" },
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +       { VMEMMAP_START,                "vmemmap start" },
> +       { VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end"},
> +#endif
>         { FIXADDR_START,                "Fixmap start" },
>         { FIXADDR_TOP,                  "Fixmap end" },
>         { PCI_IO_START,                 "PCI I/O start" },
>         { PCI_IO_END,                   "PCI I/O end" },
> -#ifdef CONFIG_SPARSEMEM_VMEMMAP
> -       { VMEMMAP_START,                "vmemmap" },
> -#endif
> +       { MODULES_VADDR,                "Modules start" },
> +       { MODULES_END,                  "Modules end" },
> +       { KIMAGE_VADDR,                 "kImage start"},
>         { -1,                           NULL },
>  };
>
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 16:25   ` Ard Biesheuvel
@ 2017-12-04 17:18     ` Steve Capper
  2017-12-04 17:21       ` Steve Capper
  2017-12-04 17:27       ` Ard Biesheuvel
  0 siblings, 2 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-04 17:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ard,

On Mon, Dec 04, 2017 at 04:25:18PM +0000, Ard Biesheuvel wrote:
> On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> > Re-arrange the kernel memory map s.t. the kernel image resides in the
> > bottom 514MB of memory.
>
> I guess this breaks KASLR entirely, no? Given that it adds an offset
> in the range [0 ... sizeof(VMALLOC_SPACE) /4 ].

Yes, yes it does. Sorry about this. I had very carefully tested KASLR
with custom offsets... on my early page table code. I will have a think
about this.

From a KASLR side, my (renewed) understanding is that a virtual address
as low as possible is desired for the kimage start as that affords the
most wiggle room?

>
> In any case, it makes sense to keep the kernel VA space adjacent to
> the VMALLOC space, rather than put stuff like PCI I/O and the fixmap
> in between.
>
> > With the modules, fixed map, PCI IO space placed
> > above it. At the very bottom of the memory map we set aside a 2MB guard
> > region to prevent ambiguity with PTR_ERR/ERR_PTR.
> >
>
> Interesting. In another thread, we discussed whether it is necessary
> to prevent the linear map randomization code from allocating at the
> very top [bottom in Steve-speak] of the kernel virtual address space,
> and this is a thing I did not consider.
>

I'll adjust my nomenclature to be less confusing.

> > Dynamically resizable objects such as KASAN shadow and sparsemem map
> > are placed above the fixed size objects.
> >
>
> The current placement of the sparsemem map was carefully chosen so
> that virt_to_page/page_to_virt translations are extremely cheap. Is
> that still the case?

I will double check the virt_to_page/page_to_virt. I had adjusted virt_to_phys
and phys_to_virt and I think this one escaped me.

Cheers,
--
Steve
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 17:18     ` Steve Capper
@ 2017-12-04 17:21       ` Steve Capper
  2017-12-04 17:27       ` Ard Biesheuvel
  1 sibling, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-04 17:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 05:18:09PM +0000, Steve Capper wrote:
> Hi Ard,
> 

[...]

> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Apologies for this email disclaimer, it was sent erroneously
and can be ignored.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 17:18     ` Steve Capper
  2017-12-04 17:21       ` Steve Capper
@ 2017-12-04 17:27       ` Ard Biesheuvel
  2017-12-04 18:12         ` Steve Capper
  1 sibling, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2017-12-04 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 December 2017 at 17:18, Steve Capper <steve.capper@arm.com> wrote:
> Hi Ard,
>
> On Mon, Dec 04, 2017 at 04:25:18PM +0000, Ard Biesheuvel wrote:
>> On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
>> > Re-arrange the kernel memory map s.t. the kernel image resides in the
>> > bottom 514MB of memory.
>>
>> I guess this breaks KASLR entirely, no? Given that it adds an offset
>> in the range [0 ... sizeof(VMALLOC_SPACE) /4 ].
>
> Yes, yes it does. Sorry about this. I had very carefully tested KASLR
> with custom offsets... on my early page table code. I will have a think
> about this.
>
> From a KASLR side, my (renewed) understanding is that a virtual address
> as low as possible is desired for the kimage start as that affords the
> most wiggle room?
>

Well, the nice thing about the current arrangement is that the default
is adjacent to the vmalloc space so any non-zero [bounded] offset
produces a valid placement. Addition with subtraction is easy, so
which side the default placement happens to be at does not really
matter. Having to implement additional bounds checking in the early
KASLR init code to stay clear of the PCI I/O or fixmap regions sounds
a bit more cumbersome.

>>
>> In any case, it makes sense to keep the kernel VA space adjacent to
>> the VMALLOC space, rather than put stuff like PCI I/O and the fixmap
>> in between.
>>
>> > With the modules, fixed map, PCI IO space placed
>> > above it. At the very bottom of the memory map we set aside a 2MB guard
>> > region to prevent ambiguity with PTR_ERR/ERR_PTR.
>> >
>>
>> Interesting. In another thread, we discussed whether it is necessary
>> to prevent the linear map randomization code from allocating at the
>> very top [bottom in Steve-speak] of the kernel virtual address space,
>> and this is a thing I did not consider.
>>
>
> I'll adjust my nomenclature to be less confusing.
>

Thanks :-)

>> > Dynamically resizable objects such as KASAN shadow and sparsemem map
>> > are placed above the fixed size objects.
>> >
>>
>> The current placement of the sparsemem map was carefully chosen so
>> that virt_to_page/page_to_virt translations are extremely cheap. Is
>> that still the case?
>
> I will double check the virt_to_page/page_to_virt. I had adjusted virt_to_phys
> and phys_to_virt and I think this one escaped me.
>
> Cheers,
> --
> Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 17:27       ` Ard Biesheuvel
@ 2017-12-04 18:12         ` Steve Capper
  2017-12-12 11:03           ` Steve Capper
  0 siblings, 1 reply; 26+ messages in thread
From: Steve Capper @ 2017-12-04 18:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 05:27:10PM +0000, Ard Biesheuvel wrote:
> On 4 December 2017 at 17:18, Steve Capper <steve.capper@arm.com> wrote:
> > Hi Ard,
> >
> > On Mon, Dec 04, 2017 at 04:25:18PM +0000, Ard Biesheuvel wrote:
> >> On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> >> > Re-arrange the kernel memory map s.t. the kernel image resides in the
> >> > bottom 514MB of memory.
> >>
> >> I guess this breaks KASLR entirely, no? Given that it adds an offset
> >> in the range [0 ... sizeof(VMALLOC_SPACE) /4 ].
> >
> > Yes, yes it does. Sorry about this. I had very carefully tested KASLR
> > with custom offsets... on my early page table code. I will have a think
> > about this.
> >
> > From a KASLR side, my (renewed) understanding is that a virtual address
> > as low as possible is desired for the kimage start as that affords the
> > most wiggle room?
> >
> 
> Well, the nice thing about the current arrangement is that the default
> is adjacent to the vmalloc space so any non-zero [bounded] offset
> produces a valid placement. Addition with subtraction is easy, so
> which side the default placement happens to be at does not really
> matter. Having to implement additional bounds checking in the early
> KASLR init code to stay clear of the PCI I/O or fixmap regions sounds
> a bit more cumbersome.
> 

I *think* I can fix KASAN_SHADOW_END to be 0xFFFF200000000000 on both 48-bit
and 52-bit VA configurations. Thus I may be able to enable 52-bit VA with
minimal disruption to the layout of the VA space (i.e. no need to change
the layout) if I also depend on CONFIG_RELOCATABLE.

I'll investigate...

Cheers,
--
Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 04/12] arm64: Initialise high_memory global variable earlier
  2017-12-04 14:13 ` [PATCH 04/12] arm64: Initialise high_memory global variable earlier Steve Capper
@ 2017-12-11 12:00   ` Catalin Marinas
  2017-12-12 10:56     ` Steve Capper
  0 siblings, 1 reply; 26+ messages in thread
From: Catalin Marinas @ 2017-12-11 12:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 02:13:05PM +0000, Steve Capper wrote:
> The high_memory global variable is used by
> cma_declare_contiguous(.) before it is defined.
> 
> We don't notice this as we compute __pa(high_memory - 1), and it looks
> like we're processing a VA from the direct linear map.
> 
> This problem becomes apparent when we flip the kernel virtual address
> space and the linear map is moved to the bottom of the kernel VA space.
> 
> This patch moves the initialisation of high_memory to before it is used.
> 
> Signed-off-by: Steve Capper <steve.capper@arm.com>

It looks like we've had this bug since 3.18 (f7426b983a6a, "mm: cma:
adjust address limit to avoid hitting low/high memory boundary"). It may
be worth adding a cc stable on this patch.

-- 
Catalin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 04/12] arm64: Initialise high_memory global variable earlier
  2017-12-11 12:00   ` Catalin Marinas
@ 2017-12-12 10:56     ` Steve Capper
  0 siblings, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-12 10:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 11, 2017 at 12:00:22PM +0000, Catalin Marinas wrote:
> On Mon, Dec 04, 2017 at 02:13:05PM +0000, Steve Capper wrote:
> > The high_memory global variable is used by
> > cma_declare_contiguous(.) before it is defined.
> > 
> > We don't notice this as we compute __pa(high_memory - 1), and it looks
> > like we're processing a VA from the direct linear map.
> > 
> > This problem becomes apparent when we flip the kernel virtual address
> > space and the linear map is moved to the bottom of the kernel VA space.
> > 
> > This patch moves the initialisation of high_memory to before it is used.
> > 
> > Signed-off-by: Steve Capper <steve.capper@arm.com>
> 
> It looks like we've had this bug since 3.18 (f7426b983a6a, "mm: cma:
> adjust address limit to avoid hitting low/high memory boundary"). It may
> be worth adding a cc stable on this patch.

Thanks Catalin,
Will add a Fixes tag and cc stable.

Cheers,
-- 
Steve

> 
> -- 
> Catalin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 07/12] arm64: mm: Place kImage at bottom of VA space
  2017-12-04 18:12         ` Steve Capper
@ 2017-12-12 11:03           ` Steve Capper
  0 siblings, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-12 11:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 06:12:13PM +0000, Steve Capper wrote:
> On Mon, Dec 04, 2017 at 05:27:10PM +0000, Ard Biesheuvel wrote:
> > On 4 December 2017 at 17:18, Steve Capper <steve.capper@arm.com> wrote:
> > > Hi Ard,
> > >
> > > On Mon, Dec 04, 2017 at 04:25:18PM +0000, Ard Biesheuvel wrote:
> > >> On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> > >> > Re-arrange the kernel memory map s.t. the kernel image resides in the
> > >> > bottom 514MB of memory.
> > >>
> > >> I guess this breaks KASLR entirely, no? Given that it adds an offset
> > >> in the range [0 ... sizeof(VMALLOC_SPACE) /4 ].
> > >
> > > Yes, yes it does. Sorry about this. I had very carefully tested KASLR
> > > with custom offsets... on my early page table code. I will have a think
> > > about this.
> > >
> > > From a KASLR side, my (renewed) understanding is that a virtual address
> > > as low as possible is desired for the kimage start as that affords the
> > > most wiggle room?
> > >
> > 
> > Well, the nice thing about the current arrangement is that the default
> > is adjacent to the vmalloc space so any non-zero [bounded] offset
> > produces a valid placement. Addition with subtraction is easy, so
> > which side the default placement happens to be at does not really
> > matter. Having to implement additional bounds checking in the early
> > KASLR init code to stay clear of the PCI I/O or fixmap regions sounds
> > a bit more cumbersome.
> > 
> 
> I *think* I can fix KASAN_SHADOW_END to be 0xFFFF200000000000 on both 48-bit
> and 52-bit VA configurations. Thus I may be able to enable 52-bit VA with
> minimal disruption to the layout of the VA space (i.e. no need to change
> the layout) if I also depend on CONFIG_RELOCATABLE.
> 

Unfortunately, having KASAN_SHADOW_END at 0xFFFF200000000000 doesn't
work with 52-bit VAs as this would place it in the direct linear map
area.

So I think we need to flip the two halves of the kernel address space in
order to accommodate inline KASAN that operates under multiple VA space
sizes (I couldn't figure out any way to patch the inline KASAN instrumentation).
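
A quick check of the clash (a sketch: the candidate shadow end is the value
above, and the linear map start assumes the current, unflipped convention of
PAGE_OFFSET = ~0UL << (VA_BITS - 1)):

#include <stdio.h>

int main(void)
{
	unsigned long long kasan_end = 0xffff200000000000ULL;	/* candidate shared value */
	unsigned long long linear_52 = ~0ULL << 51;		/* 0xfff8000000000000 */

	/* With 52-bit VAs and the halves left as they are, everything from
	 * linear_52 upwards is linear map, so the candidate shadow end (and
	 * the shadow region below it) lands inside it. */
	printf("clash: %d (52-bit linear map starts at 0x%llx)\n",
	       kasan_end >= linear_52, linear_52);
	return 0;
}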

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/12] KVM: arm/arm64: vgic: Remove spurious call to kern_hyp_va
  2017-12-04 14:30   ` Suzuki K Poulose
@ 2017-12-12 11:53     ` Steve Capper
  0 siblings, 0 replies; 26+ messages in thread
From: Steve Capper @ 2017-12-12 11:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 02:30:28PM +0000, Suzuki K Poulose wrote:
> On 04/12/17 14:13, Steve Capper wrote:
> > In save_elrsr(.), we use the following technique to ascertain the
> > address of the vgic global state:
> > 	(kern_hyp_va(&kvm_vgic_global_state))->nr_lr
> > 
> > For arm, kern_hyp_va(va) == va, and this call effectively compiles out.
> > 
> > For arm64, this call can be spurious as the address of kvm_vgic_global_state
> > will usually be determined by relative page/absolute page offset relocation
> > at link time. As the function is idempotent, having the call for arm64 does
> > not cause any problems.
> > 
> > Unfortunately, this is about to change for arm64 as we need to change
> > the logic of kern_hyp_va to allow for kernel addresses that are outside
> > the direct linear map.
> > 
> > This patch removes the call to kern_hyp_va, and ensures that correct
> > HYP addresses are computed via relative page offset addressing on arm64.
> > This is achieved by a custom accessor, hyp_address(.), which on arm is a
> > simple reference operator.
> 
> minor nit: I somehow feel that there word "symbol" should be part of the name of
> the macro, to make it implicit that it can only be used on a symbol and not any
> generic variable.
> 
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 08d3bb66c8b7..34a4ae906a97 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -25,6 +25,16 @@
> >   #define __hyp_text __section(.hyp.text) notrace
> > +#define hyp_address(symbol)				\
> > +({							\
> > +	typeof(&symbol) __ret;				\
> > +	asm volatile(					\
> > +	"adrp %[ptr], " #symbol	"\n"			\
> > +	"add %[ptr], %[ptr], :lo12:" #symbol "\n"	\
> > +	: [ptr] "=r"(__ret));				\
> > +	__ret;						\
> > +})
> > +
> 
> > -	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
> > +	addr  = kern_hyp_va(hyp_address(kvm_vgic_global_state)->vcpu_base_va);
> 
> e.g, Like here, why do we use hyp_address only for the kvm_vgic_global_state and not
> the dereferenced value. Having a name, say, hyp_symbol_address() makes it clear.
> 
> Otherwise, looks good to me.
> 

Thanks Suzuki,
Marc Zyngier has a similar patch in his series:
KVM/arm64: Randomise EL2 mappings

I'll refactor my series to apply on top of Marc's
(and take advantage of the simplified HYP logic)

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 05/12] arm64: mm: Remove VMALLOC checks from update_mapping_prot(.)
  2017-12-04 16:01   ` Ard Biesheuvel
@ 2017-12-12 15:39     ` Steve Capper
  2017-12-13 16:04       ` Catalin Marinas
  0 siblings, 1 reply; 26+ messages in thread
From: Steve Capper @ 2017-12-12 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ard,

On Mon, Dec 04, 2017 at 04:01:09PM +0000, Ard Biesheuvel wrote:
> On 4 December 2017 at 14:13, Steve Capper <steve.capper@arm.com> wrote:
> > update_mapping_prot assumes that it will be used on the VA for the
> > kernel .text section. (Via the check virt >= VMALLOC_START)
> >
> > Recent kdump patches employ this function to modify the protection of
> > the direct linear mapping (which is strictly speaking outside of this
> > area), via mark_linear_text_alias_ro(.).
> >
> 
> Isn't that a bug? Is it guaranteed that those protection attributes
> can be modified without splitting live page tables, and the resulting
> risk of TLB conflicts?

IIUC there is a bug in my earlier patch to flip the kernel VA space round:
it should allow addresses from the linear mapping to be processed by
update_mapping_prot. I'll update the logic in the VA flip patch, so this
patch can be removed from the series.

It is not apparent to me how mark_linear_text_alias_ro(.) guarantees
that no page table entries for the linear map are split, though.

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 05/12] arm64: mm: Remove VMALLOC checks from update_mapping_prot(.)
  2017-12-12 15:39     ` Steve Capper
@ 2017-12-13 16:04       ` Catalin Marinas
  0 siblings, 0 replies; 26+ messages in thread
From: Catalin Marinas @ 2017-12-13 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 03:39:23PM +0000, Steve Capper wrote:
> It is not apparent to me how mark_linear_text_alias_ro(.) guarantees
> that no page table entries for the linear map are split, though.

map_mem() ensures that, when the region is mapped via __map_memblock(), no
contiguous entries are created. Also, both ends of the _text..__init_begin
region are mapped at the granularity permitted by their alignment, so that
no later splitting is necessary.

-- 
Catalin

^ permalink raw reply	[flat|nested] 26+ messages in thread
