* [PATCH 1/3] KVM: arm64: Fix memory leak on stage2 update of a valid PTE
2020-12-04 18:19 [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Marc Zyngier
@ 2020-12-04 18:19 ` Marc Zyngier
2020-12-04 18:19 ` [PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry Marc Zyngier
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2020-12-04 18:19 UTC (permalink / raw)
To: Paolo Bonzini
Cc: kvm, Suzuki K Poulose, kernel-team, Keqian Zhu, Yanan Wang,
James Morse, linux-arm-kernel, Will Deacon, kvmarm,
Julien Thierry
From: Yanan Wang <wangyanan55@huawei.com>
When installing a new leaf PTE onto an invalid ptep, we need to
get_page(ptep) to account for the new mapping.
However, simply updating a valid PTE shouldn't result in any
additional refcounting, as there is no new mapping. Getting this
wrong results in a page being leaked and forever wasted.
Address this by fixing-up the refcount in stage2_map_walker_try_leaf()
if the PTE was already valid, balancing out the later get_page()
in stage2_map_walk_leaf().
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
[maz: update commit message, add comment in the code]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-2-wangyanan55@huawei.com
---
arch/arm64/kvm/hyp/pgtable.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 0271b4a3b9fe..2beba1dc40ec 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -470,6 +470,15 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
if (!kvm_block_mapping_supported(addr, end, phys, level))
return false;
+ /*
+ * If the PTE was already valid, drop the refcount on the table
+ * early, as it will be bumped-up again in stage2_map_walk_leaf().
+ * This ensures that the refcount stays constant across a valid to
+ * valid PTE update.
+ */
+ if (kvm_pte_valid(*ptep))
+ put_page(virt_to_page(ptep));
+
if (kvm_set_valid_leaf_pte(ptep, phys, data->attr, level))
goto out;
--
2.28.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry
2020-12-04 18:19 [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Marc Zyngier
2020-12-04 18:19 ` [PATCH 1/3] KVM: arm64: Fix memory leak on stage2 update of a valid PTE Marc Zyngier
@ 2020-12-04 18:19 ` Marc Zyngier
2020-12-04 18:19 ` [PATCH 3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() Marc Zyngier
2020-12-06 10:56 ` [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2020-12-04 18:19 UTC (permalink / raw)
To: Paolo Bonzini
Cc: kvm, Suzuki K Poulose, kernel-team, Keqian Zhu, Yanan Wang,
James Morse, linux-arm-kernel, Will Deacon, kvmarm,
Julien Thierry
From: Yanan Wang <wangyanan55@huawei.com>
When dirty logging is enabled, we collapse block entries into tables
as necessary. If dirty logging gets canceled, we can end-up merging
tables back into block entries.
When this happens, we must not only free the non-huge page-table
pages but also invalidate all the TLB entries that can potentially
cover the block. Otherwise, we end-up with multiple possible translations
for the same physical page, which can legitimately result in a TLB
conflict.
To address this, replace the bogus invalidation by IPA with a full
VM invalidation. Although this is pretty heavy-handed, it happens
very infrequently and saves a bunch of invalidations by IPA.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
[maz: fixup commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-3-wangyanan55@huawei.com
---
arch/arm64/kvm/hyp/pgtable.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 2beba1dc40ec..bdf8e55ed308 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -502,7 +502,13 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
return 0;
kvm_set_invalid_pte(ptep);
- kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
+
+ /*
+ * Invalidate the whole stage-2, as we may have numerous leaf
+ * entries below us which would otherwise need invalidating
+ * individually.
+ */
+ kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu);
data->anchor = ptep;
return 0;
}
--
2.28.0
* [PATCH 3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort()
2020-12-04 18:19 [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Marc Zyngier
2020-12-04 18:19 ` [PATCH 1/3] KVM: arm64: Fix memory leak on stage2 update of a valid PTE Marc Zyngier
2020-12-04 18:19 ` [PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry Marc Zyngier
@ 2020-12-04 18:19 ` Marc Zyngier
2020-12-06 10:56 ` [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2020-12-04 18:19 UTC (permalink / raw)
To: Paolo Bonzini
Cc: kvm, Suzuki K Poulose, kernel-team, Keqian Zhu, Yanan Wang,
James Morse, linux-arm-kernel, Will Deacon, kvmarm,
Julien Thierry
From: Yanan Wang <wangyanan55@huawei.com>
If we get a FSC_PERM fault, we currently use only (logging_active &&
writable) to decide whether to call kvm_pgtable_stage2_map(). There
are two more cases we should consider.
(1) After logging_active is configured back to false from true: when
we get a FSC_PERM fault with write_fault and an adjustment of the
hugepage mapping is needed, we should merge the tables back into a
block entry. This case is currently ignored by still calling
kvm_pgtable_stage2_relax_perms(), which leads to an endless loop and
a guest panic due to a soft lockup.
(2) We use (FSC_PERM && logging_active && writable) to determine
whether to collapse a block entry into a table by calling
kvm_pgtable_stage2_map(). But sometimes we may only need to relax
permissions when trying to write to a page rather than a block.
In this condition, using kvm_pgtable_stage2_relax_perms() will be fine.
The ISS field bits [1:0] in the ESR_EL2 register indicate the stage 2
lookup level at which a D-abort or I-abort occurred. By comparing the
granule of the fault lookup level with vma_pagesize, we can strictly
distinguish when to call kvm_pgtable_stage2_relax_perms() and when to
call kvm_pgtable_stage2_map(), covering both cases above.
Suggested-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com
---
arch/arm64/include/asm/esr.h | 1 +
arch/arm64/include/asm/kvm_emulate.h | 5 +++++
arch/arm64/kvm/mmu.c | 11 +++++++++--
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 22c81f1edda2..85a3e49f92f4 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -104,6 +104,7 @@
/* Shared ISS fault status code(IFSC/DFSC) for Data/Instruction aborts */
#define ESR_ELx_FSC (0x3F)
#define ESR_ELx_FSC_TYPE (0x3C)
+#define ESR_ELx_FSC_LEVEL (0x03)
#define ESR_ELx_FSC_EXTABT (0x10)
#define ESR_ELx_FSC_SERROR (0x11)
#define ESR_ELx_FSC_ACCESS (0x08)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5ef2669ccd6c..00bc6f1234ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -350,6 +350,11 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE;
}
+static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL;
+}
+
static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu)
{
switch (kvm_vcpu_trap_get_fault(vcpu)) {
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1a01da9fdc99..75814a02d189 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -754,10 +754,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
gfn_t gfn;
kvm_pfn_t pfn;
bool logging_active = memslot_is_logging(memslot);
- unsigned long vma_pagesize;
+ unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
+ unsigned long vma_pagesize, fault_granule;
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
struct kvm_pgtable *pgt;
+ fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
@@ -896,7 +898,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
prot |= KVM_PGTABLE_PROT_X;
- if (fault_status == FSC_PERM && !(logging_active && writable)) {
+ /*
+ * Under the premise of getting a FSC_PERM fault, we just need to relax
+ * permissions only if vma_pagesize equals fault_granule. Otherwise,
+ * kvm_pgtable_stage2_map() should be called to change block size.
+ */
+ if (fault_status == FSC_PERM && vma_pagesize == fault_granule) {
ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
} else {
ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
--
2.28.0
* Re: [GIT PULL] KVM/arm64 fixes for 5.10, take #5
2020-12-04 18:19 [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Marc Zyngier
` (2 preceding siblings ...)
2020-12-04 18:19 ` [PATCH 3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() Marc Zyngier
@ 2020-12-06 10:56 ` Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2020-12-06 10:56 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvm, Suzuki K Poulose, kernel-team, Keqian Zhu, Yanan Wang,
James Morse, linux-arm-kernel, Will Deacon, kvmarm,
Julien Thierry
On 04/12/20 19:19, Marc Zyngier wrote:
> Hi Paolo,
>
> A week ago, I was hoping being done with the 5.10 fixes. I should
> obviously know better.
>
> Thanks to Yanan's excellent work, we have another set of page table
> fixes, all plugging issues introduced with our new page table code.
> The problems range from memory leak to TLB conflicts, all of which are
> serious enough to be squashed right away.
>
> Are we done yet? Fingers crossed.
Pulled, thanks. I am not sure I'll get my own pull request to Linus
today, though.
Paolo
> Please pull,
>
> M.
>
> The following changes since commit 23bde34771f1ea92fb5e6682c0d8c04304d34b3b:
>
> KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace (2020-11-17 18:51:09 +0000)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.10-5
>
> for you to fetch changes up to 7d894834a305568a0168c55d4729216f5f8cb4e6:
>
> KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() (2020-12-02 09:53:29 +0000)
>
> ----------------------------------------------------------------
> kvm/arm64 fixes for 5.10, take #5
>
> - Don't leak page tables on PTE update
> - Correctly invalidate TLBs on table to block transition
> - Only update permissions if the fault level matches the
> expected mapping size
>
> ----------------------------------------------------------------
> Yanan Wang (3):
> KVM: arm64: Fix memory leak on stage2 update of a valid PTE
> KVM: arm64: Fix handling of merging tables into a block entry
> KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort()
>
> arch/arm64/include/asm/esr.h | 1 +
> arch/arm64/include/asm/kvm_emulate.h | 5 +++++
> arch/arm64/kvm/hyp/pgtable.c | 17 ++++++++++++++++-
> arch/arm64/kvm/mmu.c | 11 +++++++++--
> 4 files changed, 31 insertions(+), 3 deletions(-)
>