* [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
@ 2017-10-20 15:48 ` Marc Zyngier
2017-10-23 12:05 ` Will Deacon
2017-10-20 15:48 ` [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing Marc Zyngier
` (8 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:48 UTC (permalink / raw)
To: linux-arm-kernel
We currently tightly couple dcache clean with icache invalidation,
but KVM could do without the initial flush to PoU, as we've
already flushed things to PoC.
Let's introduce invalidate_icache_range which is limited to
invalidating the icache from the linear mapping (and thus
has none of the userspace fault handling complexity), and
wire it in KVM instead of flush_icache_range.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
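For reference, a minimal sketch of the intended call pattern (illustrative
only, not part of the patch): by the time we invalidate the icache, the data
has already been cleaned to the PoC, so the clean to the PoU performed by
flush_icache_range is redundant.

	void *va = page_address(pfn_to_page(pfn));

	/* data already made visible to the PoC by the dcache clean */
	__clean_dcache_area_poc(va, size);

	/* invalidate the icache only; no redundant clean to the PoU */
	invalidate_icache_range((unsigned long)va,
				(unsigned long)va + size);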
arch/arm64/include/asm/assembler.h | 26 ++++++++++++++++++++++++++
arch/arm64/include/asm/cacheflush.h | 8 ++++++++
arch/arm64/include/asm/kvm_mmu.h | 4 ++--
arch/arm64/mm/cache.S | 26 ++++++++++++++++----------
4 files changed, 52 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d58a6253c6ab..efdbecc8f2ef 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -374,6 +374,32 @@ alternative_endif
dsb \domain
.endm
+/*
+ * Macro to perform an instruction cache maintenance for the interval
+ * [start, end)
+ *
+ * start, end: virtual addresses describing the region
+ * label: A label to branch to on user fault, none if
+ * only used on a kernel address
+ * Corrupts: tmp1, tmp2
+ */
+ .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
+ icache_line_size \tmp1, \tmp2
+ sub \tmp2, \tmp1, #1
+ bic \tmp2, \start, \tmp2
+9997:
+ .if (\label == none)
+ ic ivau, \tmp2 // invalidate I line PoU
+ .else
+USER(\label, ic ivau, \tmp2) // invalidate I line PoU
+ .endif
+ add \tmp2, \tmp2, \tmp1
+ cmp \tmp2, \end
+ b.lo 9997b
+ dsb ish
+ isb
+ .endm
+
/*
* reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
*/
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 76d1cc85d5b1..ad56406944c6 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -52,6 +52,13 @@
* - start - virtual start address
* - end - virtual end address
*
+ * invalidate_icache_range(start, end)
+ *
+ * Invalidate the I-cache in the region described by start, end.
+ * Linear mapping only!
+ * - start - virtual start address
+ * - end - virtual end address
+ *
* __flush_cache_user_range(start, end)
*
* Ensure coherency between the I-cache and the D-cache in the
@@ -66,6 +73,7 @@
* - size - region size
*/
extern void flush_icache_range(unsigned long start, unsigned long end);
+extern void invalidate_icache_range(unsigned long start, unsigned long end);
extern void __flush_dcache_area(void *addr, size_t len);
extern void __inval_dcache_area(void *addr, size_t len);
extern void __clean_dcache_area_poc(void *addr, size_t len);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 8034b96fb3a4..56b3e03c85e7 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -250,8 +250,8 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
void *va = page_address(pfn_to_page(pfn));
- flush_icache_range((unsigned long)va,
- (unsigned long)va + size);
+ invalidate_icache_range((unsigned long)va,
+ (unsigned long)va + size);
}
}
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 7f1dbe962cf5..8e71d237f85e 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -60,16 +60,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
b.lo 1b
dsb ish
- icache_line_size x2, x3
- sub x3, x2, #1
- bic x4, x0, x3
-1:
-USER(9f, ic ivau, x4 ) // invalidate I line PoU
- add x4, x4, x2
- cmp x4, x1
- b.lo 1b
- dsb ish
- isb
+ invalidate_icache_by_line x0, x1, x2, x3, 9f
mov x0, #0
1:
uaccess_ttbr0_disable x1
@@ -80,6 +71,21 @@ USER(9f, ic ivau, x4 ) // invalidate I line PoU
ENDPROC(flush_icache_range)
ENDPROC(__flush_cache_user_range)
+/*
+ * invalidate_icache_range(start,end)
+ *
+ * Ensure that the I cache is invalid within specified region. This
+ * assumes that this is done on the linear mapping. Do not use it
+ * on a userspace range, as this may fault horribly.
+ *
+ * - start - virtual start address of region
+ * - end - virtual end address of region
+ */
+ENTRY(invalidate_icache_range)
+ invalidate_icache_by_line x0, x1, x2, x3
+ ret
+ENDPROC(invalidate_icache_range)
+
/*
* __flush_dcache_area(kaddr, size)
*
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper
2017-10-20 15:48 ` [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper Marc Zyngier
@ 2017-10-23 12:05 ` Will Deacon
2017-10-23 12:37 ` Marc Zyngier
0 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2017-10-23 12:05 UTC (permalink / raw)
To: linux-arm-kernel
Hi Marc,
On Fri, Oct 20, 2017 at 04:48:57PM +0100, Marc Zyngier wrote:
> We currently tightly couple dcache clean with icache invalidation,
> but KVM could do without the initial flush to PoU, as we've
> already flushed things to PoC.
>
> Let's introduce invalidate_icache_range which is limited to
> invalidating the icache from the linear mapping (and thus
> has none of the userspace fault handling complexity), and
> wire it in KVM instead of flush_icache_range.
>
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> arch/arm64/include/asm/assembler.h | 26 ++++++++++++++++++++++++++
> arch/arm64/include/asm/cacheflush.h | 8 ++++++++
> arch/arm64/include/asm/kvm_mmu.h | 4 ++--
> arch/arm64/mm/cache.S | 26 ++++++++++++++++----------
> 4 files changed, 52 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index d58a6253c6ab..efdbecc8f2ef 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -374,6 +374,32 @@ alternative_endif
> dsb \domain
> .endm
>
> +/*
> + * Macro to perform an instruction cache maintenance for the interval
> + * [start, end)
> + *
> + * start, end: virtual addresses describing the region
> + * label: A label to branch to on user fault, none if
> + * only used on a kernel address
> + * Corrupts: tmp1, tmp2
> + */
> + .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
> + icache_line_size \tmp1, \tmp2
> + sub \tmp2, \tmp1, #1
> + bic \tmp2, \start, \tmp2
> +9997:
> + .if (\label == none)
> + ic ivau, \tmp2 // invalidate I line PoU
> + .else
> +USER(\label, ic ivau, \tmp2) // invalidate I line PoU
> + .endif
> + add \tmp2, \tmp2, \tmp1
> + cmp \tmp2, \end
> + b.lo 9997b
> + dsb ish
> + isb
> + .endm
Code looks fine to me, but I'd like to drop the \label == none case. See
below.
> /*
> * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
> */
> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> index 76d1cc85d5b1..ad56406944c6 100644
> --- a/arch/arm64/include/asm/cacheflush.h
> +++ b/arch/arm64/include/asm/cacheflush.h
> @@ -52,6 +52,13 @@
> * - start - virtual start address
> * - end - virtual end address
> *
> + * invalidate_icache_range(start, end)
> + *
> + * Invalidate the I-cache in the region described by start, end.
> + * Linear mapping only!
> + * - start - virtual start address
> + * - end - virtual end address
> + *
> * __flush_cache_user_range(start, end)
> *
> * Ensure coherency between the I-cache and the D-cache in the
> @@ -66,6 +73,7 @@
> * - size - region size
> */
> extern void flush_icache_range(unsigned long start, unsigned long end);
> +extern void invalidate_icache_range(unsigned long start, unsigned long end);
> extern void __flush_dcache_area(void *addr, size_t len);
> extern void __inval_dcache_area(void *addr, size_t len);
> extern void __clean_dcache_area_poc(void *addr, size_t len);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 8034b96fb3a4..56b3e03c85e7 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -250,8 +250,8 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
> void *va = page_address(pfn_to_page(pfn));
>
> - flush_icache_range((unsigned long)va,
> - (unsigned long)va + size);
> + invalidate_icache_range((unsigned long)va,
> + (unsigned long)va + size);
> }
> }
>
> diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
> index 7f1dbe962cf5..8e71d237f85e 100644
> --- a/arch/arm64/mm/cache.S
> +++ b/arch/arm64/mm/cache.S
> @@ -60,16 +60,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
> b.lo 1b
> dsb ish
>
> - icache_line_size x2, x3
> - sub x3, x2, #1
> - bic x4, x0, x3
> -1:
> -USER(9f, ic ivau, x4 ) // invalidate I line PoU
> - add x4, x4, x2
> - cmp x4, x1
> - b.lo 1b
> - dsb ish
> - isb
> + invalidate_icache_by_line x0, x1, x2, x3, 9f
> mov x0, #0
> 1:
> uaccess_ttbr0_disable x1
> @@ -80,6 +71,21 @@ USER(9f, ic ivau, x4 ) // invalidate I line PoU
> ENDPROC(flush_icache_range)
> ENDPROC(__flush_cache_user_range)
>
> +/*
> + * invalidate_icache_range(start,end)
> + *
> + * Ensure that the I cache is invalid within specified region. This
> + * assumes that this is done on the linear mapping. Do not use it
> + * on a userspace range, as this may fault horribly.
I'm still not sure why we're not hooking up the user fault handling here.
In the worst case, we'd end up returning -EFAULT if we got passed a
faulting user address, and there's already a precedent for this with e.g.
flush_icache_range.
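Something like this on the caller side (assuming the helper is changed to
return an int):

	int ret;

	ret = invalidate_icache_range(start, end);
	if (ret)
		return ret;	/* -EFAULT on a faulting user address */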
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper
2017-10-23 12:05 ` Will Deacon
@ 2017-10-23 12:37 ` Marc Zyngier
2017-10-23 13:07 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2017-10-23 12:37 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 23 2017 at 1:05:23 pm BST, Will Deacon <will.deacon@arm.com> wrote:
> Hi Marc,
>
> On Fri, Oct 20, 2017 at 04:48:57PM +0100, Marc Zyngier wrote:
>> We currently tightly couple dcache clean with icache invalidation,
>> but KVM could do without the initial flush to PoU, as we've
>> already flushed things to PoC.
>>
>> Let's introduce invalidate_icache_range which is limited to
>> invalidating the icache from the linear mapping (and thus
>> has none of the userspace fault handling complexity), and
>> wire it in KVM instead of flush_icache_range.
>>
>> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>> arch/arm64/include/asm/assembler.h | 26 ++++++++++++++++++++++++++
>> arch/arm64/include/asm/cacheflush.h | 8 ++++++++
>> arch/arm64/include/asm/kvm_mmu.h | 4 ++--
>> arch/arm64/mm/cache.S | 26 ++++++++++++++++----------
>> 4 files changed, 52 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> index d58a6253c6ab..efdbecc8f2ef 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -374,6 +374,32 @@ alternative_endif
>> dsb \domain
>> .endm
>>
>> +/*
>> + * Macro to perform an instruction cache maintenance for the interval
>> + * [start, end)
>> + *
>> + * start, end: virtual addresses describing the region
>> + * label: A label to branch to on user fault, none if
>> + * only used on a kernel address
>> + * Corrupts: tmp1, tmp2
>> + */
>> + .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
>> + icache_line_size \tmp1, \tmp2
>> + sub \tmp2, \tmp1, #1
>> + bic \tmp2, \start, \tmp2
>> +9997:
>> + .if (\label == none)
>> + ic ivau, \tmp2 // invalidate I line PoU
>> + .else
>> +USER(\label, ic ivau, \tmp2) // invalidate I line PoU
>> + .endif
>> + add \tmp2, \tmp2, \tmp1
>> + cmp \tmp2, \end
>> + b.lo 9997b
>> + dsb ish
>> + isb
>> + .endm
>
> Code looks fine to me, but I'd like to drop the \label == none case. See
> below.
>
>> /*
>> * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
>> */
>> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
>> index 76d1cc85d5b1..ad56406944c6 100644
>> --- a/arch/arm64/include/asm/cacheflush.h
>> +++ b/arch/arm64/include/asm/cacheflush.h
>> @@ -52,6 +52,13 @@
>> * - start - virtual start address
>> * - end - virtual end address
>> *
>> + * invalidate_icache_range(start, end)
>> + *
>> + * Invalidate the I-cache in the region described by start, end.
>> + * Linear mapping only!
>> + * - start - virtual start address
>> + * - end - virtual end address
>> + *
>> * __flush_cache_user_range(start, end)
>> *
>> * Ensure coherency between the I-cache and the D-cache in the
>> @@ -66,6 +73,7 @@
>> * - size - region size
>> */
>> extern void flush_icache_range(unsigned long start, unsigned long end);
>> +extern void invalidate_icache_range(unsigned long start, unsigned long end);
>> extern void __flush_dcache_area(void *addr, size_t len);
>> extern void __inval_dcache_area(void *addr, size_t len);
>> extern void __clean_dcache_area_poc(void *addr, size_t len);
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 8034b96fb3a4..56b3e03c85e7 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -250,8 +250,8 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
>> /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
>> void *va = page_address(pfn_to_page(pfn));
>>
>> - flush_icache_range((unsigned long)va,
>> - (unsigned long)va + size);
>> + invalidate_icache_range((unsigned long)va,
>> + (unsigned long)va + size);
>> }
>> }
>>
>> diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
>> index 7f1dbe962cf5..8e71d237f85e 100644
>> --- a/arch/arm64/mm/cache.S
>> +++ b/arch/arm64/mm/cache.S
>> @@ -60,16 +60,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
>> b.lo 1b
>> dsb ish
>>
>> - icache_line_size x2, x3
>> - sub x3, x2, #1
>> - bic x4, x0, x3
>> -1:
>> -USER(9f, ic ivau, x4 ) // invalidate I line PoU
>> - add x4, x4, x2
>> - cmp x4, x1
>> - b.lo 1b
>> - dsb ish
>> - isb
>> + invalidate_icache_by_line x0, x1, x2, x3, 9f
>> mov x0, #0
>> 1:
>> uaccess_ttbr0_disable x1
>> @@ -80,6 +71,21 @@ USER(9f, ic ivau, x4 ) // invalidate I line PoU
>> ENDPROC(flush_icache_range)
>> ENDPROC(__flush_cache_user_range)
>>
>> +/*
>> + * invalidate_icache_range(start,end)
>> + *
>> + * Ensure that the I cache is invalid within specified region. This
>> + * assumes that this is done on the linear mapping. Do not use it
>> + * on a userspace range, as this may fault horribly.
>
> I'm still not sure why we're not hooking up the user fault handling here.
> In the worst case, we'd end up returning -EFAULT if we got passed a
> faulting user address, and there's already a precedent for this with e.g.
> flush_icache_range.
Fair enough. Confusingly, flush_icache_range is declared as returning
void... I'll cook a separate patch addressing this. How about the
following on top of that patch:
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index efdbecc8f2ef..27093cf2b91f 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -379,20 +379,15 @@ alternative_endif
* [start, end)
*
* start, end: virtual addresses describing the region
- * label: A label to branch to on user fault, none if
- * only used on a kernel address
+ * label: A label to branch to on user fault.
* Corrupts: tmp1, tmp2
*/
- .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
+ .macro invalidate_icache_by_line start, end, tmp1, tmp2, label
icache_line_size \tmp1, \tmp2
sub \tmp2, \tmp1, #1
bic \tmp2, \start, \tmp2
9997:
- .if (\label == none)
- ic ivau, \tmp2 // invalidate I line PoU
- .else
USER(\label, ic ivau, \tmp2) // invalidate I line PoU
- .endif
add \tmp2, \tmp2, \tmp1
cmp \tmp2, \end
b.lo 9997b
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index ad56406944c6..e671e73c6453 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -55,7 +55,6 @@
* invalidate_icache_range(start, end)
*
* Invalidate the I-cache in the region described by start, end.
- * Linear mapping only!
* - start - virtual start address
* - end - virtual end address
*
@@ -73,7 +72,7 @@
* - size - region size
*/
extern void flush_icache_range(unsigned long start, unsigned long end);
-extern void invalidate_icache_range(unsigned long start, unsigned long end);
+extern int invalidate_icache_range(unsigned long start, unsigned long end);
extern void __flush_dcache_area(void *addr, size_t len);
extern void __inval_dcache_area(void *addr, size_t len);
extern void __clean_dcache_area_poc(void *addr, size_t len);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 8e71d237f85e..a89f23610b7e 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -74,15 +74,17 @@ ENDPROC(__flush_cache_user_range)
/*
* invalidate_icache_range(start,end)
*
- * Ensure that the I cache is invalid within specified region. This
- * assumes that this is done on the linear mapping. Do not use it
- * on a userspace range, as this may fault horribly.
+ * Ensure that the I cache is invalid within specified region.
*
* - start - virtual start address of region
* - end - virtual end address of region
*/
ENTRY(invalidate_icache_range)
- invalidate_icache_by_line x0, x1, x2, x3
+ invalidate_icache_by_line x0, x1, x2, x3, 1f
+ mov x0, xzr
+ ret
+1:
+ mov x0, #-EFAULT
ret
ENDPROC(invalidate_icache_range)
Does this address your concerns?
Thanks,
M.
--
Jazz is not dead. It just smells funny.
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper
2017-10-23 12:37 ` Marc Zyngier
@ 2017-10-23 13:07 ` Will Deacon
0 siblings, 0 replies; 20+ messages in thread
From: Will Deacon @ 2017-10-23 13:07 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 23, 2017 at 01:37:09PM +0100, Marc Zyngier wrote:
> On Mon, Oct 23 2017 at 1:05:23 pm BST, Will Deacon <will.deacon@arm.com> wrote:
> > Hi Marc,
> >
> > On Fri, Oct 20, 2017 at 04:48:57PM +0100, Marc Zyngier wrote:
> >> We currently tightly couple dcache clean with icache invalidation,
> >> but KVM could do without the initial flush to PoU, as we've
> >> already flushed things to PoC.
> >>
> >> Let's introduce invalidate_icache_range which is limited to
> >> invalidating the icache from the linear mapping (and thus
> >> has none of the userspace fault handling complexity), and
> >> wire it in KVM instead of flush_icache_range.
> >>
> >> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >> arch/arm64/include/asm/assembler.h | 26 ++++++++++++++++++++++++++
> >> arch/arm64/include/asm/cacheflush.h | 8 ++++++++
> >> arch/arm64/include/asm/kvm_mmu.h | 4 ++--
> >> arch/arm64/mm/cache.S | 26 ++++++++++++++++----------
> >> 4 files changed, 52 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> >> index d58a6253c6ab..efdbecc8f2ef 100644
> >> --- a/arch/arm64/include/asm/assembler.h
> >> +++ b/arch/arm64/include/asm/assembler.h
> >> @@ -374,6 +374,32 @@ alternative_endif
> >> dsb \domain
> >> .endm
> >>
> >> +/*
> >> + * Macro to perform an instruction cache maintenance for the interval
> >> + * [start, end)
> >> + *
> >> + * start, end: virtual addresses describing the region
> >> + * label: A label to branch to on user fault, none if
> >> + * only used on a kernel address
> >> + * Corrupts: tmp1, tmp2
> >> + */
> >> + .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
> >> + icache_line_size \tmp1, \tmp2
> >> + sub \tmp2, \tmp1, #1
> >> + bic \tmp2, \start, \tmp2
> >> +9997:
> >> + .if (\label == none)
> >> + ic ivau, \tmp2 // invalidate I line PoU
> >> + .else
> >> +USER(\label, ic ivau, \tmp2) // invalidate I line PoU
> >> + .endif
> >> + add \tmp2, \tmp2, \tmp1
> >> + cmp \tmp2, \end
> >> + b.lo 9997b
> >> + dsb ish
> >> + isb
> >> + .endm
> >
> > Code looks fine to me, but I'd like to drop the \label == none case. See
> > below.
> >
> >> /*
> >> * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
> >> */
> >> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> >> index 76d1cc85d5b1..ad56406944c6 100644
> >> --- a/arch/arm64/include/asm/cacheflush.h
> >> +++ b/arch/arm64/include/asm/cacheflush.h
> >> @@ -52,6 +52,13 @@
> >> * - start - virtual start address
> >> * - end - virtual end address
> >> *
> >> + * invalidate_icache_range(start, end)
> >> + *
> >> + * Invalidate the I-cache in the region described by start, end.
> >> + * Linear mapping only!
> >> + * - start - virtual start address
> >> + * - end - virtual end address
> >> + *
> >> * __flush_cache_user_range(start, end)
> >> *
> >> * Ensure coherency between the I-cache and the D-cache in the
> >> @@ -66,6 +73,7 @@
> >> * - size - region size
> >> */
> >> extern void flush_icache_range(unsigned long start, unsigned long end);
> >> +extern void invalidate_icache_range(unsigned long start, unsigned long end);
> >> extern void __flush_dcache_area(void *addr, size_t len);
> >> extern void __inval_dcache_area(void *addr, size_t len);
> >> extern void __clean_dcache_area_poc(void *addr, size_t len);
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 8034b96fb3a4..56b3e03c85e7 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -250,8 +250,8 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> >> /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
> >> void *va = page_address(pfn_to_page(pfn));
> >>
> >> - flush_icache_range((unsigned long)va,
> >> - (unsigned long)va + size);
> >> + invalidate_icache_range((unsigned long)va,
> >> + (unsigned long)va + size);
> >> }
> >> }
> >>
> >> diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
> >> index 7f1dbe962cf5..8e71d237f85e 100644
> >> --- a/arch/arm64/mm/cache.S
> >> +++ b/arch/arm64/mm/cache.S
> >> @@ -60,16 +60,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
> >> b.lo 1b
> >> dsb ish
> >>
> >> - icache_line_size x2, x3
> >> - sub x3, x2, #1
> >> - bic x4, x0, x3
> >> -1:
> >> -USER(9f, ic ivau, x4 ) // invalidate I line PoU
> >> - add x4, x4, x2
> >> - cmp x4, x1
> >> - b.lo 1b
> >> - dsb ish
> >> - isb
> >> + invalidate_icache_by_line x0, x1, x2, x3, 9f
> >> mov x0, #0
> >> 1:
> >> uaccess_ttbr0_disable x1
> >> @@ -80,6 +71,21 @@ USER(9f, ic ivau, x4 ) // invalidate I line PoU
> >> ENDPROC(flush_icache_range)
> >> ENDPROC(__flush_cache_user_range)
> >>
> >> +/*
> >> + * invalidate_icache_range(start,end)
> >> + *
> >> + * Ensure that the I cache is invalid within specified region. This
> >> + * assumes that this is done on the linear mapping. Do not use it
> >> + * on a userspace range, as this may fault horribly.
> >
> > I'm still not sure why we're not hooking up the user fault handling here.
> > In the worst case, we'd end up returning -EFAULT if we got passed a
> > faulting user address, and there's already a precedent for this with e.g.
> > flush_icache_range.
>
> Fair enough. Confusingly, flush_icache_range is declared as returning
> void... I'll cook a separate patch addressing this. How about the
> following on top of that patch:
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index efdbecc8f2ef..27093cf2b91f 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -379,20 +379,15 @@ alternative_endif
> * [start, end)
> *
> * start, end: virtual addresses describing the region
> - * label: A label to branch to on user fault, none if
> - * only used on a kernel address
> + * label: A label to branch to on user fault.
> * Corrupts: tmp1, tmp2
> */
> - .macro invalidate_icache_by_line start, end, tmp1, tmp2, label=none
> + .macro invalidate_icache_by_line start, end, tmp1, tmp2, label
> icache_line_size \tmp1, \tmp2
> sub \tmp2, \tmp1, #1
> bic \tmp2, \start, \tmp2
> 9997:
> - .if (\label == none)
> - ic ivau, \tmp2 // invalidate I line PoU
> - .else
> USER(\label, ic ivau, \tmp2) // invalidate I line PoU
> - .endif
> add \tmp2, \tmp2, \tmp1
> cmp \tmp2, \end
> b.lo 9997b
> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> index ad56406944c6..e671e73c6453 100644
> --- a/arch/arm64/include/asm/cacheflush.h
> +++ b/arch/arm64/include/asm/cacheflush.h
> @@ -55,7 +55,6 @@
> * invalidate_icache_range(start, end)
> *
> * Invalidate the I-cache in the region described by start, end.
> - * Linear mapping only!
> * - start - virtual start address
> * - end - virtual end address
> *
> @@ -73,7 +72,7 @@
> * - size - region size
> */
> extern void flush_icache_range(unsigned long start, unsigned long end);
> -extern void invalidate_icache_range(unsigned long start, unsigned long end);
> +extern int invalidate_icache_range(unsigned long start, unsigned long end);
> extern void __flush_dcache_area(void *addr, size_t len);
> extern void __inval_dcache_area(void *addr, size_t len);
> extern void __clean_dcache_area_poc(void *addr, size_t len);
> diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
> index 8e71d237f85e..a89f23610b7e 100644
> --- a/arch/arm64/mm/cache.S
> +++ b/arch/arm64/mm/cache.S
> @@ -74,15 +74,17 @@ ENDPROC(__flush_cache_user_range)
> /*
> * invalidate_icache_range(start,end)
> *
> - * Ensure that the I cache is invalid within specified region. This
> - * assumes that this is done on the linear mapping. Do not use it
> - * on a userspace range, as this may fault horribly.
> + * Ensure that the I cache is invalid within specified region.
> *
> * - start - virtual start address of region
> * - end - virtual end address of region
> */
> ENTRY(invalidate_icache_range)
> - invalidate_icache_by_line x0, x1, x2, x3
> + invalidate_icache_by_line x0, x1, x2, x3, 1f
> + mov x0, xzr
> + ret
> +1:
> + mov x0, #-EFAULT
You need to add the uaccess_ttbr0_{enable,disable} calls here, but with
that then I think this is good.
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
2017-10-20 15:48 ` [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper Marc Zyngier
@ 2017-10-20 15:48 ` Marc Zyngier
2017-10-20 16:27 ` Mark Rutland
2017-10-20 15:48 ` [PATCH v2 3/8] arm64: KVM: PTE/PMD S2 XN bit definition Marc Zyngier
` (7 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:48 UTC (permalink / raw)
To: linux-arm-kernel
Calling __cpuc_coherent_user_range to invalidate the icache on
a PIPT icache machine has some pointless overhead, as it starts
by cleaning the dcache to the PoU, while we're guaranteed to
have already cleaned it to the PoC.
As KVM is the only user of such a feature, let's implement some
ad-hoc cache flushing in kvm_mmu.h. Should it become useful to
other subsystems, it can be moved to a more global location.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
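For reference, a sketch of the line-size computation the ad-hoc flush relies
on, using an illustrative CTR value (not part of the patch):

	u32 ctr      = read_cpuid(CPUID_CACHETYPE);
	u32 iminline = ctr & 0xf;		/* say IminLine == 4 */
	u32 words    = 2 << (iminline - 1);	/* 16 words per I-cache line */
	u32 iclsz    = 4 << iminline;		/* 64 bytes per I-cache line */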
arch/arm/include/asm/kvm_hyp.h | 2 ++
arch/arm/include/asm/kvm_mmu.h | 32 +++++++++++++++++++++++++++++---
2 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 14b5903f0224..ad541f9ecc78 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -69,6 +69,8 @@
#define HIFAR __ACCESS_CP15(c6, 4, c0, 2)
#define HPFAR __ACCESS_CP15(c6, 4, c0, 4)
#define ICIALLUIS __ACCESS_CP15(c7, 0, c1, 0)
+#define BPIALLIS __ACCESS_CP15(c7, 0, c1, 6)
+#define ICIMVAU __ACCESS_CP15(c7, 0, c5, 1)
#define ATS1CPR __ACCESS_CP15(c7, 0, c8, 0)
#define TLBIALLIS __ACCESS_CP15(c8, 0, c3, 0)
#define TLBIALL __ACCESS_CP15(c8, 0, c7, 0)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 9fa4b2520974..bc8d21e76637 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -37,6 +37,8 @@
#include <linux/highmem.h>
#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/kvm_hyp.h>
#include <asm/pgalloc.h>
#include <asm/stage2_pgtable.h>
@@ -157,6 +159,8 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
kvm_pfn_t pfn,
unsigned long size)
{
+ u32 iclsz;
+
/*
* If we are going to insert an instruction page and the icache is
* either VIPT or PIPT, there is a potential problem where the host
@@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
return;
}
- /* PIPT cache. As for the d-side, use a temporary kernel mapping. */
+ /*
+ * CTR IminLine contains Log2 of the number of words in the
+ * cache line, so we can get the number of words as
+ * 2 << (IminLine - 1). To get the number of bytes, we
+ * multiply by 4 (the number of bytes in a 32-bit word), and
+ * get 4 << (IminLine).
+ */
+ iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
+
while (size) {
void *va = kmap_atomic_pfn(pfn);
+ void *end = va + PAGE_SIZE;
+ void *addr = va;
- __cpuc_coherent_user_range((unsigned long)va,
- (unsigned long)va + PAGE_SIZE);
+ do {
+ write_sysreg(addr, ICIMVAU);
+ addr += iclsz;
+ } while (addr < end);
+
+ dsb(ishst);
+ isb();
size -= PAGE_SIZE;
pfn++;
kunmap_atomic(va);
}
+
+ /* Check if we need to invalidate the BTB */
+ if ((read_cpuid_ext(CPUID_EXT_MMFR1) >> 28) != 4) {
+ write_sysreg(0, BPIALLIS);
+ dsb(ishst);
+ isb();
+ }
}
static inline void __kvm_flush_dcache_pte(pte_t pte)
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-20 15:48 ` [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing Marc Zyngier
@ 2017-10-20 16:27 ` Mark Rutland
2017-10-20 16:53 ` Marc Zyngier
0 siblings, 1 reply; 20+ messages in thread
From: Mark Rutland @ 2017-10-20 16:27 UTC (permalink / raw)
To: linux-arm-kernel
Hi Marc,
On Fri, Oct 20, 2017 at 04:48:58PM +0100, Marc Zyngier wrote:
> @@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> return;
> }
>
> - /* PIPT cache. As for the d-side, use a temporary kernel mapping. */
> + /*
> + * CTR IminLine contains Log2 of the number of words in the
> + * cache line, so we can get the number of words as
> + * 2 << (IminLine - 1). To get the number of bytes, we
> + * multiply by 4 (the number of bytes in a 32-bit word), and
> + * get 4 << (IminLine).
> + */
> + iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
> +
> while (size) {
> void *va = kmap_atomic_pfn(pfn);
> + void *end = va + PAGE_SIZE;
> + void *addr = va;
>
> - __cpuc_coherent_user_range((unsigned long)va,
> - (unsigned long)va + PAGE_SIZE);
> + do {
> + write_sysreg(addr, ICIMVAU);
> + addr += iclsz;
> + } while (addr < end);
> +
> + dsb(ishst);
I believe this needs to be ISH rather than ISHST.
Per, ARM DDI 0487B.b, page G3-4701, "G3.4 AArch32 cache and branch
predictor support":
A DSB or DMB instruction intended to ensure the completion of cache
maintenance instructions or branch predictor instructions must have
an access type of both loads and stores.
> + isb();
>
> size -= PAGE_SIZE;
> pfn++;
>
> kunmap_atomic(va);
> }
> +
> + /* Check if we need to invalidate the BTB */
> + if ((read_cpuid_ext(CPUID_EXT_MMFR1) >> 28) != 4) {
> + write_sysreg(0, BPIALLIS);
> + dsb(ishst);
Likewise here.
Otherwise, looks good!
Thanks,
Mark.
> + isb();
> + }
> }
>
> static inline void __kvm_flush_dcache_pte(pte_t pte)
> --
> 2.14.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-20 16:27 ` Mark Rutland
@ 2017-10-20 16:53 ` Marc Zyngier
2017-10-20 16:54 ` Mark Rutland
0 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 16:53 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mark,
On 20/10/17 17:27, Mark Rutland wrote:
> Hi Marc,
>
> On Fri, Oct 20, 2017 at 04:48:58PM +0100, Marc Zyngier wrote:
>> @@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
>> return;
>> }
>>
>> - /* PIPT cache. As for the d-side, use a temporary kernel mapping. */
>> + /*
>> + * CTR IminLine contains Log2 of the number of words in the
>> + * cache line, so we can get the number of words as
>> + * 2 << (IminLine - 1). To get the number of bytes, we
>> + * multiply by 4 (the number of bytes in a 32-bit word), and
>> + * get 4 << (IminLine).
>> + */
>> + iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
>> +
>> while (size) {
>> void *va = kmap_atomic_pfn(pfn);
>> + void *end = va + PAGE_SIZE;
>> + void *addr = va;
>>
>> - __cpuc_coherent_user_range((unsigned long)va,
>> - (unsigned long)va + PAGE_SIZE);
>> + do {
>> + write_sysreg(addr, ICIMVAU);
>> + addr += iclsz;
>> + } while (addr < end);
>> +
>> + dsb(ishst);
>
> I believe this needs to be ISH rather than ISHST.
>
> Per, ARM DDI 0487B.b, page G3-4701, "G3.4 AArch32 cache and branch
> predictor support":
>
> A DSB or DMB instruction intended to ensure the completion of cache
> maintenance instructions or branch predictor instructions must have
> an access type of both loads and stores.
Right. This actually comes from 6abdd491698a ("ARM: mm: use
inner-shareable barriers for TLB and user cache operations"), and the
ARMv7 ARM doesn't mention any of this.
My take is that we want to be consistent. Given that KVM/ARM on 32bit is
basically ARMv7 only, I'd rather keep the ST version of the barrier
here, and change it everywhere if/when someone decides to support a
32bit kernel on ARMv8 (yes, we already do as a guest, but it doesn't
seem to really matter so far).
Thoughts?
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-20 16:53 ` Marc Zyngier
@ 2017-10-20 16:54 ` Mark Rutland
2017-10-21 15:18 ` Christoffer Dall
0 siblings, 1 reply; 20+ messages in thread
From: Mark Rutland @ 2017-10-20 16:54 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 20, 2017 at 05:53:39PM +0100, Marc Zyngier wrote:
> Hi Mark,
>
> On 20/10/17 17:27, Mark Rutland wrote:
> > Hi Marc,
> >
> > On Fri, Oct 20, 2017 at 04:48:58PM +0100, Marc Zyngier wrote:
> >> @@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> >> return;
> >> }
> >>
> >> - /* PIPT cache. As for the d-side, use a temporary kernel mapping. */
> >> + /*
> >> + * CTR IminLine contains Log2 of the number of words in the
> >> + * cache line, so we can get the number of words as
> >> + * 2 << (IminLine - 1). To get the number of bytes, we
> >> + * multiply by 4 (the number of bytes in a 32-bit word), and
> >> + * get 4 << (IminLine).
> >> + */
> >> + iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
> >> +
> >> while (size) {
> >> void *va = kmap_atomic_pfn(pfn);
> >> + void *end = va + PAGE_SIZE;
> >> + void *addr = va;
> >>
> >> - __cpuc_coherent_user_range((unsigned long)va,
> >> - (unsigned long)va + PAGE_SIZE);
> >> + do {
> >> + write_sysreg(addr, ICIMVAU);
> >> + addr += iclsz;
> >> + } while (addr < end);
> >> +
> >> + dsb(ishst);
> >
> > I believe this needs to be ISH rather than ISHST.
> >
> > Per, ARM DDI 0487B.b, page G3-4701, "G3.4 AArch32 cache and branch
> > predictor support":
> >
> > A DSB or DMB instruction intended to ensure the completion of cache
> > maintenance instructions or branch predictor instructions must have
> > an access type of both loads and stores.
>
> Right. This actually comes from 6abdd491698a ("ARM: mm: use
> inner-shareable barriers for TLB and user cache operations"), and the
> ARMv7 ARM doesn't mention any of this.
Ah; so it doesn't. :/
> My take is that we want to be consistent. Given that KVM/ARM on 32bit is
> basically ARMv7 only, I'd rather keep the ST version of the barrier
> here, and change it everywhere if/when someone decides to support a
> 32bit kernel on ARMv8 (yes, we already do as a guest, but it doesn't
> seem to really matter so far).
Keeping things consistent makes sense to me.
Another one for the backlog...
Thanks,
Mark.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-20 16:54 ` Mark Rutland
@ 2017-10-21 15:18 ` Christoffer Dall
2017-10-31 13:51 ` Mark Rutland
0 siblings, 1 reply; 20+ messages in thread
From: Christoffer Dall @ 2017-10-21 15:18 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 20, 2017 at 05:54:40PM +0100, Mark Rutland wrote:
> On Fri, Oct 20, 2017 at 05:53:39PM +0100, Marc Zyngier wrote:
> > Hi Mark,
> >
> > On 20/10/17 17:27, Mark Rutland wrote:
> > > Hi Marc,
> > >
> > > On Fri, Oct 20, 2017 at 04:48:58PM +0100, Marc Zyngier wrote:
> > >> @@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> > >> return;
> > >> }
> > >>
> > >> - /* PIPT cache. As for the d-side, use a temporary kernel mapping. */
> > >> + /*
> > >> + * CTR IminLine contains Log2 of the number of words in the
> > >> + * cache line, so we can get the number of words as
> > >> + * 2 << (IminLine - 1). To get the number of bytes, we
> > >> + * multiply by 4 (the number of bytes in a 32-bit word), and
> > >> + * get 4 << (IminLine).
> > >> + */
> > >> + iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
> > >> +
> > >> while (size) {
> > >> void *va = kmap_atomic_pfn(pfn);
> > >> + void *end = va + PAGE_SIZE;
> > >> + void *addr = va;
> > >>
> > >> - __cpuc_coherent_user_range((unsigned long)va,
> > >> - (unsigned long)va + PAGE_SIZE);
> > >> + do {
> > >> + write_sysreg(addr, ICIMVAU);
> > >> + addr += iclsz;
> > >> + } while (addr < end);
> > >> +
> > >> + dsb(ishst);
> > >
> > > I believe this needs to be ISH rather than ISHST.
> > >
> > > Per, ARM DDI 0487B.b, page G3-4701, "G3.4 AArch32 cache and branch
> > > predictor support":
> > >
> > > A DSB or DMB instruction intended to ensure the completion of cache
> > > maintenance instructions or branch predictor instructions must have
> > > an access type of both loads and stores.
> >
> > Right. This actually comes from 6abdd491698a ("ARM: mm: use
> > inner-shareable barriers for TLB and user cache operations"), and the
> > ARMv7 ARM doesn't mention any of this.
>
> Ah; so it doesn't. :/
>
> > My take is that we want to be consistent. Given that KVM/ARM on 32bit is
> > basically ARMv7 only, I'd rather keep the ST version of the barrier
> > here, and change it everywhere if/when someone decides to support a
> > 32bit kernel on ARMv8 (yes, we already do as a guest, but it doesn't
> > seem to really matter so far).
>
> Keeping things consistent makes sense to me.
>
Does that mean this is an ack/reviewed-by?
Thanks,
-Christoffer
^ permalink raw reply [flat|nested] 20+ messages in thread* [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing
2017-10-21 15:18 ` Christoffer Dall
@ 2017-10-31 13:51 ` Mark Rutland
0 siblings, 0 replies; 20+ messages in thread
From: Mark Rutland @ 2017-10-31 13:51 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Oct 21, 2017 at 05:18:17PM +0200, Christoffer Dall wrote:
> On Fri, Oct 20, 2017 at 05:54:40PM +0100, Mark Rutland wrote:
> > On Fri, Oct 20, 2017 at 05:53:39PM +0100, Marc Zyngier wrote:
> > > On 20/10/17 17:27, Mark Rutland wrote:
> > > > On Fri, Oct 20, 2017 at 04:48:58PM +0100, Marc Zyngier wrote:
> > > >> @@ -181,18 +185,40 @@ static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
> > > >> + do {
> > > >> + write_sysreg(addr, ICIMVAU);
> > > >> + addr += iclsz;
> > > >> + } while (addr < end);
> > > >> +
> > > >> + dsb(ishst);
> > > >
> > > > I believe this needs to be ISH rather than ISHST.
> > > >
> > > > Per, ARM DDI 0487B.b, page G3-4701, "G3.4 AArch32 cache and branch
> > > > predictor support":
> > > >
> > > > A DSB or DMB instruction intended to ensure the completion of cache
> > > > maintenance instructions or branch predictor instructions must have
> > > > an access type of both loads and stores.
> > >
> > > Right. This actually comes from 6abdd491698a ("ARM: mm: use
> > > inner-shareable barriers for TLB and user cache operations"), and the
> > > ARMv7 ARM doesn't mention any of this.
> >
> > Ah; so it doesn't. :/
> >
> > > My take is that we want to be consistent. Given that KVM/ARM on 32bit is
> > > basically ARMv7 only, I'd rather keep the ST version of the barrier
> > > here, and change it everywhere if/when someone decides to support a
> > > 32bit kernel on ARMv8 (yes, we already do as a guest, but it doesn't
> > > seem to really matter so far).
> >
> > Keeping things consistent makes sense to me.
>
> Does that mean this is an ack/reviewed-by?
Sure. Feel free to take that as an Acked-by if you haven't already
queued the patches!
Thanks,
Mark.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 3/8] arm64: KVM: PTE/PMD S2 XN bit definition
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
2017-10-20 15:48 ` [PATCH v2 1/8] arm64: KVM: Add invalidate_icache_range helper Marc Zyngier
2017-10-20 15:48 ` [PATCH v2 2/8] arm: KVM: Add optimized PIPT icache flushing Marc Zyngier
@ 2017-10-20 15:48 ` Marc Zyngier
2017-10-20 15:49 ` [PATCH v2 4/8] KVM: arm/arm64: Limit icache invalidation to prefetch aborts Marc Zyngier
` (6 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:48 UTC (permalink / raw)
To: linux-arm-kernel
As we're about to make S2 page-tables eXecute Never by default,
add the required bits for both PMDs and PTEs.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index eb0c2bd90de9..af035331fb09 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -177,9 +177,11 @@
*/
#define PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[2:1] */
#define PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
+#define PTE_S2_XN (_AT(pteval_t, 2) << 53) /* XN[1:0] */
#define PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[2:1] */
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
+#define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
/*
* Memory Attribute override for Stage-2 (MemAttr[3:0])
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 4/8] KVM: arm/arm64: Limit icache invalidation to prefetch aborts
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (2 preceding siblings ...)
2017-10-20 15:48 ` [PATCH v2 3/8] arm64: KVM: PTE/PMD S2 XN bit definition Marc Zyngier
@ 2017-10-20 15:49 ` Marc Zyngier
2017-10-20 15:49 ` [PATCH v2 5/8] KVM: arm/arm64: Only clean the dcache on translation fault Marc Zyngier
` (5 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:49 UTC (permalink / raw)
To: linux-arm-kernel
We've so far eagerly invalidated the icache, no matter how
the page was faulted in (data or prefetch abort).
But we can easily track execution by setting the XN bits
in the S2 page tables, get the prefetch abort at HYP and
perform the icache invalidation at that time only.
Since, for most VMs, the instruction working set is pretty
small compared to the data set, this is likely to save
some traffic (especially as the invalidation is broadcast).
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
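Condensed, the PAGE_SIZE path below becomes the following (the PMD_SIZE path
is analogous; illustrative only):

	/* pages are now mapped XN by default via PAGE_S2 */
	exec_fault = kvm_vcpu_trap_is_iabt(vcpu);

	clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);
	if (exec_fault) {
		/* prefetch abort: make executable, invalidate the icache */
		new_pte = kvm_s2pte_mkexec(new_pte);
		invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
	}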
arch/arm/include/asm/kvm_mmu.h | 12 ++++++++++++
arch/arm/include/asm/pgtable.h | 4 ++--
arch/arm64/include/asm/kvm_mmu.h | 12 ++++++++++++
arch/arm64/include/asm/pgtable-prot.h | 4 ++--
virt/kvm/arm/mmu.c | 19 +++++++++++++++----
5 files changed, 43 insertions(+), 8 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index bc8d21e76637..4d7a54cbb3ab 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -85,6 +85,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
}
+static inline pte_t kvm_s2pte_mkexec(pte_t pte)
+{
+ pte_val(pte) &= ~L_PTE_XN;
+ return pte;
+}
+
+static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
+{
+ pmd_val(pmd) &= ~PMD_SECT_XN;
+ return pmd;
+}
+
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 1c462381c225..9b6e77b9ab7e 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -102,8 +102,8 @@ extern pgprot_t pgprot_s2_device;
#define PAGE_HYP_EXEC _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY)
#define PAGE_HYP_RO _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY | L_PTE_XN)
#define PAGE_HYP_DEVICE _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
-#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
-#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY)
+#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY | L_PTE_XN)
+#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY | L_PTE_XN)
#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
#define __PAGE_SHARED __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 56b3e03c85e7..1e1b20cb348f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -173,6 +173,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
}
+static inline pte_t kvm_s2pte_mkexec(pte_t pte)
+{
+ pte_val(pte) &= ~PTE_S2_XN;
+ return pte;
+}
+
+static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
+{
+ pmd_val(pmd) &= ~PMD_S2_XN;
+ return pmd;
+}
+
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 0a5635fb0ef9..4e12dabd342b 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -60,8 +60,8 @@
#define PAGE_HYP_RO __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
-#define PAGE_S2 __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
-#define PAGE_S2_DEVICE __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
+#define PAGE_S2 __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
+#define PAGE_S2_DEVICE __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_PXN | PTE_UXN)
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 2174244f6317..0417c8e2a81c 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1292,7 +1292,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
unsigned long fault_status)
{
int ret;
- bool write_fault, writable, hugetlb = false, force_pte = false;
+ bool write_fault, exec_fault, writable, hugetlb = false, force_pte = false;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1304,7 +1304,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
unsigned long flags = 0;
write_fault = kvm_is_write_fault(vcpu);
- if (fault_status == FSC_PERM && !write_fault) {
+ exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ VM_BUG_ON(write_fault && exec_fault);
+
+ if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
kvm_err("Unexpected L2 read permission error\n");
return -EFAULT;
}
@@ -1398,7 +1401,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_set_pfn_dirty(pfn);
}
clean_dcache_guest_page(vcpu, pfn, PMD_SIZE);
- invalidate_icache_guest_page(vcpu, pfn, PMD_SIZE);
+
+ if (exec_fault) {
+ new_pmd = kvm_s2pmd_mkexec(new_pmd);
+ invalidate_icache_guest_page(vcpu, pfn, PMD_SIZE);
+ }
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1410,7 +1417,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
mark_page_dirty(kvm, gfn);
}
clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);
- invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
+
+ if (exec_fault) {
+ new_pte = kvm_s2pte_mkexec(new_pte);
+ invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
+ }
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 5/8] KVM: arm/arm64: Only clean the dcache on translation fault
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (3 preceding siblings ...)
2017-10-20 15:49 ` [PATCH v2 4/8] KVM: arm/arm64: Limit icache invalidation to prefetch aborts Marc Zyngier
@ 2017-10-20 15:49 ` Marc Zyngier
2017-10-20 15:49 ` [PATCH v2 6/8] KVM: arm/arm64: Preserve Exec permission across R/W permission faults Marc Zyngier
` (4 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:49 UTC (permalink / raw)
To: linux-arm-kernel
The only case where we actually need to perform dcache maintenance
is when we map the page for the first time; subsequent permission
faults do not require cache maintenance. Let's make it conditional
on the fault not being a permission fault (and thus being a translation fault).
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
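In other words (PTE path shown; the PMD path is identical):

	/* only the first mapping (translation fault) needs the dcache clean */
	if (fault_status != FSC_PERM)
		clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);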
virt/kvm/arm/mmu.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 0417c8e2a81c..f956efbd933d 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1400,7 +1400,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
kvm_set_pfn_dirty(pfn);
}
- clean_dcache_guest_page(vcpu, pfn, PMD_SIZE);
+
+ if (fault_status != FSC_PERM)
+ clean_dcache_guest_page(vcpu, pfn, PMD_SIZE);
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
@@ -1416,7 +1418,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
- clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);
+
+ if (fault_status != FSC_PERM)
+ clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 6/8] KVM: arm/arm64: Preserve Exec permission across R/W permission faults
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (4 preceding siblings ...)
2017-10-20 15:49 ` [PATCH v2 5/8] KVM: arm/arm64: Only clean the dcache on translation fault Marc Zyngier
@ 2017-10-20 15:49 ` Marc Zyngier
2017-10-21 15:17 ` Christoffer Dall
2017-10-20 15:49 ` [PATCH v2 7/8] KVM: arm/arm64: Drop vcpu parameter from guest cache maintenance operartions Marc Zyngier
` (3 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:49 UTC (permalink / raw)
To: linux-arm-kernel
So far, we lose the Exec property whenever we take permission
faults, as we always reconstruct the PTE/PMD from scratch. This
can be counterproductive, as we can end up with the following
fault sequence:
X -> RO -> ROX -> RW -> RWX
Instead, we can look up the existing PTE/PMD and clear the XN bit in the
new entry if it was already cleared in the old one, leading to a much
nicer fault sequence:
X -> ROX -> RWX
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
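Condensed from the hunks below: on an R/W permission fault, look up the
existing entry and carry the Exec permission over if XN was already clear
(illustrative, PTE case):

	if (!exec_fault && fault_status == FSC_PERM &&
	    stage2_is_exec(kvm, fault_ipa))
		new_pte = kvm_s2pte_mkexec(new_pte);	/* preserve Exec */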
arch/arm/include/asm/kvm_mmu.h | 10 ++++++++++
arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
virt/kvm/arm/mmu.c | 27 +++++++++++++++++++++++++++
3 files changed, 47 insertions(+)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 4d7a54cbb3ab..aab64fe52146 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -107,6 +107,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
}
+static inline bool kvm_s2pte_exec(pte_t *pte)
+{
+ return !(pte_val(*pte) & L_PTE_XN);
+}
+
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
@@ -117,6 +122,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
}
+static inline bool kvm_s2pmd_exec(pmd_t *pmd)
+{
+ return !(pmd_val(*pmd) & PMD_SECT_XN);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 1e1b20cb348f..126abefffe7f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -203,6 +203,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
}
+static inline bool kvm_s2pte_exec(pte_t *pte)
+{
+ return !(pte_val(*pte) & PTE_S2_XN);
+}
+
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
kvm_set_s2pte_readonly((pte_t *)pmd);
@@ -213,6 +218,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
return kvm_s2pte_readonly((pte_t *)pmd);
}
+static inline bool kvm_s2pmd_exec(pmd_t *pmd)
+{
+ return !(pmd_val(*pmd) & PMD_S2_XN);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index f956efbd933d..b83b5a8442bb 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -926,6 +926,25 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
return 0;
}
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+ pmd_t *pmdp;
+ pte_t *ptep;
+
+ pmdp = stage2_get_pmd(kvm, NULL, addr);
+ if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
+ return false;
+
+ if (pmd_thp_or_huge(*pmdp))
+ return kvm_s2pmd_exec(pmdp);
+
+ ptep = pte_offset_kernel(pmdp, addr);
+ if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
+ return false;
+
+ return kvm_s2pte_exec(ptep);
+}
+
static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
phys_addr_t addr, const pte_t *new_pte,
unsigned long flags)
@@ -1407,6 +1426,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
invalidate_icache_guest_page(vcpu, pfn, PMD_SIZE);
+ } else if (fault_status == FSC_PERM) {
+ /* Preserve execute if XN was already cleared */
+ if (stage2_is_exec(kvm, fault_ipa))
+ new_pmd = kvm_s2pmd_mkexec(new_pmd);
}
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
@@ -1425,6 +1448,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
+ } else if (fault_status == FSC_PERM) {
+ /* Preserve execute if XN was already cleared */
+ if (stage2_is_exec(kvm, fault_ipa))
+ new_pte = kvm_s2pte_mkexec(new_pte);
}
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 6/8] KVM: arm/arm64: Preserve Exec permission across R/W permission faults
2017-10-20 15:49 ` [PATCH v2 6/8] KVM: arm/arm64: Preserve Exec permission across R/W permission faults Marc Zyngier
@ 2017-10-21 15:17 ` Christoffer Dall
0 siblings, 0 replies; 20+ messages in thread
From: Christoffer Dall @ 2017-10-21 15:17 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 20, 2017 at 04:49:02PM +0100, Marc Zyngier wrote:
> So far, we lose the Exec property whenever we take permission
> faults, as we always reconstruct the PTE/PMD from scratch. This
> can be counterproductive, as we can end up with the following
> fault sequence:
>
> X -> RO -> ROX -> RW -> RWX
>
> Instead, we can look up the existing PTE/PMD and clear the XN bit in the
> new entry if it was already cleared in the old one, leading to a much
> nicer fault sequence:
>
> X -> ROX -> RWX
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> arch/arm/include/asm/kvm_mmu.h | 10 ++++++++++
> arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
> virt/kvm/arm/mmu.c | 27 +++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 4d7a54cbb3ab..aab64fe52146 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -107,6 +107,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
> return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
> }
>
> +static inline bool kvm_s2pte_exec(pte_t *pte)
> +{
> + return !(pte_val(*pte) & L_PTE_XN);
> +}
> +
> static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
> {
> pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
> @@ -117,6 +122,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
> }
>
> +static inline bool kvm_s2pmd_exec(pmd_t *pmd)
> +{
> + return !(pmd_val(*pmd) & PMD_SECT_XN);
> +}
> +
> static inline bool kvm_page_empty(void *ptr)
> {
> struct page *ptr_page = virt_to_page(ptr);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 1e1b20cb348f..126abefffe7f 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -203,6 +203,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
> return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
> }
>
> +static inline bool kvm_s2pte_exec(pte_t *pte)
> +{
> + return !(pte_val(*pte) & PTE_S2_XN);
> +}
> +
> static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
> {
> kvm_set_s2pte_readonly((pte_t *)pmd);
> @@ -213,6 +218,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> return kvm_s2pte_readonly((pte_t *)pmd);
> }
>
> +static inline bool kvm_s2pmd_exec(pmd_t *pmd)
> +{
> + return !(pmd_val(*pmd) & PMD_S2_XN);
> +}
> +
> static inline bool kvm_page_empty(void *ptr)
> {
> struct page *ptr_page = virt_to_page(ptr);
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index f956efbd933d..b83b5a8442bb 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -926,6 +926,25 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
> return 0;
> }
>
> +static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
> +{
> + pmd_t *pmdp;
> + pte_t *ptep;
> +
> + pmdp = stage2_get_pmd(kvm, NULL, addr);
> + if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
> + return false;
> +
> + if (pmd_thp_or_huge(*pmdp))
> + return kvm_s2pmd_exec(pmdp);
> +
> + ptep = pte_offset_kernel(pmdp, addr);
> + if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
> + return false;
> +
> + return kvm_s2pte_exec(ptep);
> +}
> +
> static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> phys_addr_t addr, const pte_t *new_pte,
> unsigned long flags)
> @@ -1407,6 +1426,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> if (exec_fault) {
> new_pmd = kvm_s2pmd_mkexec(new_pmd);
> invalidate_icache_guest_page(vcpu, pfn, PMD_SIZE);
> + } else if (fault_status == FSC_PERM) {
> + /* Preserve execute if XN was already cleared */
> + if (stage2_is_exec(kvm, fault_ipa))
> + new_pmd = kvm_s2pmd_mkexec(new_pmd);
> }
>
> ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
> @@ -1425,6 +1448,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> if (exec_fault) {
> new_pte = kvm_s2pte_mkexec(new_pte);
> invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
> + } else if (fault_status == FSC_PERM) {
> + /* Preserve execute if XN was already cleared */
> + if (stage2_is_exec(kvm, fault_ipa))
> + new_pte = kvm_s2pte_mkexec(new_pte);
> }
>
> ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
> --
> 2.14.1
>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 7/8] KVM: arm/arm64: Drop vcpu parameter from guest cache maintenance operations
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (5 preceding siblings ...)
2017-10-20 15:49 ` [PATCH v2 6/8] KVM: arm/arm64: Preserve Exec permission across R/W permission faults Marc Zyngier
@ 2017-10-20 15:49 ` Marc Zyngier
2017-10-20 15:49 ` [PATCH v2 8/8] KVM: arm/arm64: Detangle kvm_mmu.h from kvm_hyp.h Marc Zyngier
` (2 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:49 UTC (permalink / raw)
To: linux-arm-kernel
The vcpu parameter isn't used for anything, and gets in the way of
further cleanups. Let's get rid of it.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
arch/arm/include/asm/kvm_mmu.h | 7 ++-----
arch/arm64/include/asm/kvm_mmu.h | 7 ++-----
virt/kvm/arm/mmu.c | 18 ++++++++----------
3 files changed, 12 insertions(+), 20 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index aab64fe52146..bc70a1f0f42d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -150,9 +150,7 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
return (vcpu_cp15(vcpu, c1_SCTLR) & 0b101) == 0b101;
}
-static inline void __clean_dcache_guest_page(struct kvm_vcpu *vcpu,
- kvm_pfn_t pfn,
- unsigned long size)
+static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
/*
* Clean the dcache to the Point of Coherency.
@@ -177,8 +175,7 @@ static inline void __clean_dcache_guest_page(struct kvm_vcpu *vcpu,
}
}
-static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
- kvm_pfn_t pfn,
+static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
unsigned long size)
{
u32 iclsz;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 126abefffe7f..06f1f9794679 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -252,17 +252,14 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
}
-static inline void __clean_dcache_guest_page(struct kvm_vcpu *vcpu,
- kvm_pfn_t pfn,
- unsigned long size)
+static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
void *va = page_address(pfn_to_page(pfn));
kvm_flush_dcache_to_poc(va, size);
}
-static inline void __invalidate_icache_guest_page(struct kvm_vcpu *vcpu,
- kvm_pfn_t pfn,
+static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
unsigned long size)
{
if (icache_is_aliasing()) {
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index b83b5a8442bb..a1ea43fa75cf 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1276,16 +1276,14 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
}
-static void clean_dcache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
- unsigned long size)
+static void clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
- __clean_dcache_guest_page(vcpu, pfn, size);
+ __clean_dcache_guest_page(pfn, size);
}
-static void invalidate_icache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
- unsigned long size)
+static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
- __invalidate_icache_guest_page(vcpu, pfn, size);
+ __invalidate_icache_guest_page(pfn, size);
}
static void kvm_send_hwpoison_signal(unsigned long address,
@@ -1421,11 +1419,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
}
if (fault_status != FSC_PERM)
- clean_dcache_guest_page(vcpu, pfn, PMD_SIZE);
+ clean_dcache_guest_page(pfn, PMD_SIZE);
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
- invalidate_icache_guest_page(vcpu, pfn, PMD_SIZE);
+ invalidate_icache_guest_page(pfn, PMD_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
@@ -1443,11 +1441,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
}
if (fault_status != FSC_PERM)
- clean_dcache_guest_page(vcpu, pfn, PAGE_SIZE);
+ clean_dcache_guest_page(pfn, PAGE_SIZE);
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
- invalidate_icache_guest_page(vcpu, pfn, PAGE_SIZE);
+ invalidate_icache_guest_page(pfn, PAGE_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 8/8] KVM: arm/arm64: Detangle kvm_mmu.h from kvm_hyp.h
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (6 preceding siblings ...)
2017-10-20 15:49 ` [PATCH v2 7/8] KVM: arm/arm64: Drop vcpu parameter from guest cache maintenance operations Marc Zyngier
@ 2017-10-20 15:49 ` Marc Zyngier
2017-10-21 15:24 ` [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Christoffer Dall
2017-10-22 9:20 ` Marc Zyngier
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-20 15:49 UTC (permalink / raw)
To: linux-arm-kernel
kvm_hyp.h has an odd dependency on kvm_mmu.h, which makes the
opposite inclusion impossible. Let's start with breaking that
useless dependency.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
arch/arm/include/asm/kvm_hyp.h | 1 -
arch/arm/kvm/hyp/switch.c | 1 +
arch/arm/kvm/hyp/tlb.c | 1 +
arch/arm64/include/asm/kvm_hyp.h | 1 -
arch/arm64/kvm/hyp/debug-sr.c | 1 +
arch/arm64/kvm/hyp/switch.c | 1 +
arch/arm64/kvm/hyp/tlb.c | 1 +
virt/kvm/arm/hyp/vgic-v2-sr.c | 1 +
8 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index ad541f9ecc78..8b29faa119ba 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -21,7 +21,6 @@
#include <linux/compiler.h>
#include <linux/kvm_host.h>
#include <asm/cp15.h>
-#include <asm/kvm_mmu.h>
#include <asm/vfp.h>
#define __hyp_text __section(.hyp.text) notrace
diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
index ebd2dd46adf7..67e0a689c4b5 100644
--- a/arch/arm/kvm/hyp/switch.c
+++ b/arch/arm/kvm/hyp/switch.c
@@ -18,6 +18,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
__asm__(".arch_extension virt");
diff --git a/arch/arm/kvm/hyp/tlb.c b/arch/arm/kvm/hyp/tlb.c
index 6d810af2d9fd..c0edd450e104 100644
--- a/arch/arm/kvm/hyp/tlb.c
+++ b/arch/arm/kvm/hyp/tlb.c
@@ -19,6 +19,7 @@
*/
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
/**
* Flush per-VMID TLBs
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 4572a9b560fa..afbfbe0c12c5 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -20,7 +20,6 @@
#include <linux/compiler.h>
#include <linux/kvm_host.h>
-#include <asm/kvm_mmu.h>
#include <asm/sysreg.h>
#define __hyp_text __section(.hyp.text) notrace
diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index f5154ed3da6c..d3a13d57f2c5 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -21,6 +21,7 @@
#include <asm/debug-monitors.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
#define read_debug(r,n) read_sysreg(r##n##_el1)
#define write_debug(v,r,n) write_sysreg(v, r##n##_el1)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c641c4..c52f7094122f 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -21,6 +21,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
#include <asm/fpsimd.h>
static bool __hyp_text __fpsimd_enabled_nvhe(void)
diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
index 73464a96c365..131c7772703c 100644
--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -16,6 +16,7 @@
*/
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
#include <asm/tlbflush.h>
static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm)
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index a3f18d362366..77ccd8e2090b 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -21,6 +21,7 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
{
--
2.14.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (7 preceding siblings ...)
2017-10-20 15:49 ` [PATCH v2 8/8] KVM: arm/arm64: Detangle kvm_mmu.h from kvm_hyp.h Marc Zyngier
@ 2017-10-21 15:24 ` Christoffer Dall
2017-10-22 9:20 ` Marc Zyngier
9 siblings, 0 replies; 20+ messages in thread
From: Christoffer Dall @ 2017-10-21 15:24 UTC (permalink / raw)
To: linux-arm-kernel
Hi Marc, Will, Catalin, and Russell,
On Fri, Oct 20, 2017 at 04:48:56PM +0100, Marc Zyngier wrote:
> It was recently reported that on a VM restore, we seem to spend a
> disproportionate amount of time invalidating the icache. This is
> partially due to some HW behaviour, but also because we're being a bit
> dumb and are invalidating the icache for every page we map at S2, even
> if that is on a data access.
>
> The slightly better way of doing this is to mark the pages XN at S2,
> and wait for the guest to execute something in that page, at which
> point we perform the invalidation. As it is likely that there are a lot
> fewer instructions than data, we win (or so we hope).
>
> We also take this opportunity to drop the extra dcache clean to the
> PoU, which is pretty useless, as we already clean all the way to the
> PoC...
>
> Running a bare metal test that touches 1GB of memory (using a 4kB
> stride) leads to the following results on Seattle:
>
> 4.13:
> do_fault_read.bin: 0.565885992 seconds time elapsed
> do_fault_write.bin: 0.738296337 seconds time elapsed
> do_fault_read_write.bin: 1.241812231 seconds time elapsed
>
> 4.14-rc3+patches:
> do_fault_read.bin: 0.244961803 seconds time elapsed
> do_fault_write.bin: 0.422740092 seconds time elapsed
> do_fault_read_write.bin: 0.643402470 seconds time elapsed
>
> We're almost halving the time of something that more or less looks
> like a restore operation. Some larger systems will show much bigger
> benefits as they become less impacted by the icache invalidation
> (which is broadcast in the inner shareable domain). I've tried to
> measure the impact on a VM boot in order to assess the cost of
> taking an extra permission fault, but found that any difference was
> simply noise.
>
> I've also given it a test run on both Cubietruck and Jetson-TK1.
>
> Tests are archived here:
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/kvm-ws-tests.git/
>
> I'd value some additional test results on HW I don't have access to.
Since these patches are mostly KVM patches, I think taking them via the
KVM tree makes the most sense, but they do touch architecture-specific parts
of both arm and arm64. Are you OK with us merging these via the KVM tree, or
do you prefer some more advanced merge strategy?
The 32-bit arm change is really tiny, but it would be good to have an
ack on patch 1 from the arm64 maintainer.
Thanks,
-Christoffer
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts
2017-10-20 15:48 [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Marc Zyngier
` (8 preceding siblings ...)
2017-10-21 15:24 ` [PATCH v2 0/8] arm/arm64: KVM: limit icache invalidation to prefetch aborts Christoffer Dall
@ 2017-10-22 9:20 ` Marc Zyngier
9 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2017-10-22 9:20 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 20 2017 at 4:48:56 pm BST, Marc Zyngier <marc.zyngier@arm.com> wrote:
[...]
> * From v1:
[...]
> - Dropped patch #10 which was both useless and broken, and patch #9
> that thus became useless
Tuuut!!!!! patch #9[1] is not useless at all, as it allows patch #2 to
compile. I failed to notice it because I still had #9 in my tree as I
pushed things out. Duh.
I've pushed out a branch[2] with that patch as a prerequisite. Let me
know if I should repost the whole thing.
Sorry for the fsckup,
M.
[1] https://patchwork.kernel.org/patch/9993695/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/icache
--
Jazz is not dead. It just smells funny.
^ permalink raw reply [flat|nested] 20+ messages in thread