From: James Morse <james.morse@arm.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvm@vger.kernel.org, Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v5 10/23] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
Date: Fri, 09 Mar 2018 18:59:08 +0000
Message-ID: <5AA2D97C.70503@arm.com>
In-Reply-To: <20180301155538.26860-11-marc.zyngier@arm.com>

Hi Marc,

On 01/03/18 15:55, Marc Zyngier wrote:
> We so far mapped our HYP IO (which is essencially the GICv2 control
(Nit: essentially)
> registers) using the same method as for memory. It recently appeared
> that is a bit unsafe:
>
> We compute the HYP VA using the kern_hyp_va helper, but that helper
> is only designed to deal with kernel VAs coming from the linear map,
> and not from the vmalloc region... This could in turn cause some bad
> aliasing between the two, amplified by the upcoming VA randomisation.
>
> A solution is to come up with our very own basic VA allocator for
> MMIO. Since half of the HYP address space only contains a single
> page (the idmap), we have plenty to borrow from. Let's use the idmap
> as a base, and allocate downwards from it. GICv2 now lives on the
> other side of the great VA barrier.
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 0e5cfffb4c21..3074544940dc 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
> *
> * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
> * therefore contains either mappings in the kernel memory area (above
> - * PAGE_OFFSET), or device mappings in the vmalloc range (from
> - * VMALLOC_START to VMALLOC_END).
> + * PAGE_OFFSET), or device mappings in the idmap range.
> *
> - * boot_hyp_pgd should only map two pages for the init code.
> + * boot_hyp_pgd should only map the idmap range, and is only used in
> + * the extended idmap case.
> */
> void free_hyp_pgds(void)
> {
> + pgd_t *id_pgd;
> +
> mutex_lock(&kvm_hyp_pgd_mutex);
>
> + id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
> +
> + if (id_pgd)
> + unmap_hyp_range(id_pgd, io_map_base,
> + hyp_idmap_start + PAGE_SIZE - io_map_base);

Even if kvm_mmu_init() fails before it sets io_map_base, this will still unmap
the idmap. It just starts from 0, so it may take out the flipped PAGE_OFFSET
range too...

(using io_map_base without taking io_map_lock makes me nervous ... in practice,
it's fine)

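To make the io_map_base == 0 concern concrete, here is a minimal userspace
sketch of the range arithmetic. Only the size expression comes from the quoted
hunk; the helper names, the 64K PAGE_SIZE and the sample idmap address are
assumptions made up for illustration, and the guard is just one possible
approach, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x10000ULL	/* 64K pages, chosen for illustration */

/* Size argument as computed in the quoted hunk (hypothetical helper). */
static uint64_t io_unmap_size(uint64_t io_map_base, uint64_t hyp_idmap_start)
{
	assert(io_map_base <= hyp_idmap_start + PAGE_SIZE);
	return hyp_idmap_start + PAGE_SIZE - io_map_base;
}

/*
 * One possible guard (an assumption, not part of the patch): treat an
 * unset io_map_base as "nothing allocated below the idmap".
 */
static uint64_t io_unmap_start(uint64_t io_map_base, uint64_t hyp_idmap_start)
{
	return io_map_base ? io_map_base : hyp_idmap_start;
}
```

With a made-up hyp_idmap_start of 0x81000000, an unset base yields a range of
0x81010000 bytes starting at VA 0, while the guarded start shrinks it to the
single idmap page.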
> +
> if (boot_hyp_pgd) {
> - unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
> free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
> boot_hyp_pgd = NULL;
> }
>
> if (hyp_pgd) {
> - unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
> unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
> (uintptr_t)high_memory - PAGE_OFFSET);
> - unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
> - VMALLOC_END - VMALLOC_START);
>
> free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
> hyp_pgd = NULL;
> @@ -719,7 +726,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> void __iomem **kaddr,
> void __iomem **haddr)
> {
> - unsigned long start, end;
> + pgd_t *pgd = hyp_pgd;
> + unsigned long base;
> int ret;
>
> *kaddr = ioremap(phys_addr, size);
> @@ -731,11 +739,42 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> return 0;
> }
>
> + mutex_lock(&io_map_lock);
> +
> + /*
> + * This assumes that we we have enough space below the idmap
> + * page to allocate our VAs. If not, the check below will
> + * kick. A potential alternative would be to detect that
> + * overflow and switch to an allocation above the idmap.
> + */
> + size = max(PAGE_SIZE, roundup_pow_of_two(size));
> + base = io_map_base - size;
> + base &= ~(size - 1);
>
> - start = kern_hyp_va((unsigned long)*kaddr);
> - end = kern_hyp_va((unsigned long)*kaddr + size);
> - ret = __create_hyp_mappings(hyp_pgd, PTRS_PER_PGD, start, end,
> - __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
> + /*
> + * Verify that BIT(VA_BITS - 1) hasn't been flipped by
> + * allocating the new area, as it would indicate we've
> + * overflowed the idmap/IO address range.
> + */
> + if ((base ^ io_map_base) & BIT(VA_BITS - 1)) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + if (__kvm_cpu_uses_extended_idmap())
> + pgd = boot_hyp_pgd;
> +
> + ret = __create_hyp_mappings(pgd, __kvm_idmap_ptrs_per_pgd(),
> + base, base + size,
> + __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
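
For readers following the arithmetic in the hunk above, this is a rough
userspace model of the downward allocation and the BIT(VA_BITS - 1) check.
The names io_va_alloc() and roundup_pow2(), the 4K PAGE_SIZE and VA_BITS = 48
are assumptions for illustration. The sketch also assumes the new base is
committed to io_map_base on success, which the quoted part of the diff does
not show.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000ULL	/* assumed 4K pages */
#define VA_BITS   48		/* assumed VA size */
#define BIT(n)    (1ULL << (n))

/* Userspace stand-in for the kernel's roundup_pow_of_two(). */
static uint64_t roundup_pow2(uint64_t x)
{
	uint64_t p = 1;

	assert(x != 0 && x <= BIT(62));
	while (p < x)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the quoted arithmetic. Returns 0 and
 * stores the new base in *out, or -1 if BIT(VA_BITS - 1) was flipped.
 */
static int io_va_alloc(uint64_t *io_map_base, uint64_t size, uint64_t *out)
{
	uint64_t base;

	size = roundup_pow2(size);
	if (size < PAGE_SIZE)
		size = PAGE_SIZE;

	/* Allocate downwards, naturally aligned to the rounded size. */
	base = *io_map_base - size;
	base &= ~(size - 1);

	/* Crossing into the other half of the VA space means overflow. */
	if ((base ^ *io_map_base) & BIT(VA_BITS - 1))
		return -1;

	*io_map_base = base;	/* assumed: commit only on success */
	*out = base;
	return 0;
}
```

Starting from a made-up base of 0x800000100000, an 8K request lands at
0x8000000fe000. A base only one page above the half-way point cannot satisfy
the same request and is rejected with the base left unchanged.
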

(/me winces, that's subtle....)

This __kvm_idmap_ptrs_per_pgd() change is because the hyp_pgd
extended-idmap top-level page may be a pgd bigger than the 64 entries that Linux
believes it has for 64K/48bit VA?
Doesn't unmap_hyp_range() need to know about this too? Otherwise its
pgd_index(hyp_idmap_start) is going to mask out too much of the address, and
pgd_addr_end() will never reach the end address we provided...
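
A small sketch of the masking I have in mind, with the table size passed in
explicitly. pgd_index_n() is a hypothetical stand-in for pgd_index(), and the
shift of 42 and the 1024-entry table are illustrative assumptions rather than
values read off the Juno.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pgd_index() with the table size made explicit. */
static uint64_t pgd_index_n(uint64_t addr, unsigned int pgdir_shift,
			    uint64_t ptrs_per_pgd)
{
	/* The mask below only works for power-of-two table sizes. */
	assert(ptrs_per_pgd && !(ptrs_per_pgd & (ptrs_per_pgd - 1)));
	return (addr >> pgdir_shift) & (ptrs_per_pgd - 1);
}
```

For an address with bit 48 set, a 64-entry mask folds the index back to 0,
whereas a 1024-entry mask keeps index 64. A walker using the smaller mask
would therefore look at the wrong top-level slot.
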
...
Trying to boot a 64K config, and forcing it into teardown_hyp_mode() leads to
some fireworks: It looks like an unaligned end address is getting into
unmap_hyp_ptes() and it's escaping the loop to kvm_set_pte() other kernel data...
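
A toy model of why an unaligned end is fatal, assuming the PTE walk
terminates on an "addr != end" test after stepping by PAGE_SIZE. walk_ptes()
and its max_steps cap are hypothetical, added only so the demonstration
stops. The real loop has no such cap.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x10000ULL	/* 64K pages, as in the failing config */

/*
 * Hypothetical model of a "do { ... } while (addr += PAGE_SIZE, addr != end)"
 * walk. Returns the number of entries visited, capped at max_steps.
 */
static unsigned long walk_ptes(uint64_t addr, uint64_t end,
			       unsigned long max_steps)
{
	unsigned long steps = 0;

	assert(max_steps > 0);
	do {
		steps++;	/* a real walk would clear one PTE here */
		if (steps == max_steps)
			break;	/* safety cap for the demonstration only */
	} while (addr += PAGE_SIZE, addr != end);

	return steps;
}
```

An aligned range of three pages visits three entries. With end off by 0x800
the comparison never matches, so the walk only stops at the cap.
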
My local changes are below [0], the config is defconfig + 64K pages, this is on
Juno. 4K pages is quite happy with this 'force teardown_hyp_mode()' debug hack.
Bisects to patch 4: "arm64: KVM: Dynamically patch the kernel/hyp VA mask"

I'll keep digging on Monday,

Thanks,

James

[0] Local changes to this series on v4.16-rc4
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 433d13d0c271..5a132180119d 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -418,7 +418,7 @@ static inline void *kvm_get_hyp_vector(void)
/* This is only called on a !VHE system */
static inline int kvm_map_vectors(void)
{
- phys_addr_t vect_pa = virt_to_phys(__bp_harden_hyp_vecs_start);
+ phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs_start);
unsigned long size = __bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start;
if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) {
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 86941f6181bb..8f4ec0cc269f 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1494,7 +1494,7 @@ static int init_hyp_mode(void)
}
}
- return 0;
+ err = -EINVAL;
out_err:
teardown_hyp_mode();