* [PATCH v2 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits
From: Mario Smarduch @ 2015-06-16 21:50 UTC
To: linux-arm-kernel

Currently we save/restore fp/simd on each exit. The first patch optimizes
arm64 save/restore so that we only do so on Guest access. hackbench and
several lmbench tests show the save/restore being skipped on anywhere from
30% to above 50% of exits.

In the second patch the 32-bit handler is updated to keep exit handling
consistent with the 64-bit code.

Changes since v1:
- Addressed Marc's comments
- Verified optimization improvements with lmbench and hackbench,
  updated commit message

Mario Smarduch (2):
  Optimize arm64 skip 30-50% vfp/simd save/restore on exits
  keep arm vfp/simd exit handling in sync with arm64

 arch/arm/kvm/interrupts.S        | 12 +++++-----
 arch/arm64/include/asm/kvm_arm.h |  5 ++++-
 arch/arm64/kvm/hyp.S             | 46 +++++++++++++++++++++++++++++++++++---
 3 files changed, 54 insertions(+), 9 deletions(-)

-- 
1.7.9.5

* [PATCH v2 1/2] arm64: KVM: Optimize arm64 fp/simd save/restore
From: Mario Smarduch @ 2015-06-16 21:50 UTC
To: linux-arm-kernel

This patch only saves and restores FP/SIMD registers on Guest access. To do
this, the cptr_el2 FP/SIMD trap is set on Guest entry and later checked on
exit. lmbench and hackbench show significant improvements: for 30-50% of
exits the FP/SIMD context is not saved/restored.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/include/asm/kvm_arm.h |  5 ++++-
 arch/arm64/kvm/hyp.S             | 46 +++++++++++++++++++++++++++++++++++---
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index ac6fafb..7605e09 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -171,10 +171,13 @@
 #define HSTR_EL2_TTEE	(1 << 16)
 #define HSTR_EL2_T(x)	(1 << x)
 
+/* Hyp Coprocessor Trap Register Shifts */
+#define CPTR_EL2_TFP_SHIFT 10
+
 /* Hyp Coprocessor Trap Register */
 #define CPTR_EL2_TCPAC	(1 << 31)
 #define CPTR_EL2_TTA	(1 << 20)
-#define CPTR_EL2_TFP	(1 << 10)
+#define CPTR_EL2_TFP	(1 << CPTR_EL2_TFP_SHIFT)
 
 /* Hyp Debug Configuration Register bits */
 #define MDCR_EL2_TDRA		(1 << 11)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 5befd01..de0788f 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -673,6 +673,15 @@
 	tbz	\tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
 .endm
 
+/*
+ * Check cptr VFP/SIMD accessed bit, if set VFP/SIMD not accessed by guest.
+ */
+.macro skip_fpsimd_state tmp, target
+	mrs	\tmp, cptr_el2
+	tbnz	\tmp, #CPTR_EL2_TFP_SHIFT, \target
+.endm
+
+
 .macro compute_debug_state target
 	// Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
 	// is set, we do a full save/restore cycle and disable trapping.
@@ -763,6 +772,7 @@
 	ldr	x2, [x0, #VCPU_HCR_EL2]
 	msr	hcr_el2, x2
 	mov	x2, #CPTR_EL2_TTA
+	orr	x2, x2, #CPTR_EL2_TFP
 	msr	cptr_el2, x2
 
 	mov	x2, #(1 << 15)	// Trap CP15 Cr=15
@@ -785,7 +795,6 @@
 .macro deactivate_traps
 	mov	x2, #HCR_RW
 	msr	hcr_el2, x2
-	msr	cptr_el2, xzr
 	msr	hstr_el2, xzr
 
 	mrs	x2, mdcr_el2
@@ -912,6 +921,28 @@ __restore_fpsimd:
 	restore_fpsimd
 	ret
 
+switch_to_guest_fpsimd:
+	push	x4, lr
+
+	mrs	x2, cptr_el2
+	bic	x2, x2, #CPTR_EL2_TFP
+	msr	cptr_el2, x2
+
+	mrs	x0, tpidr_el2
+
+	ldr	x2, [x0, #VCPU_HOST_CONTEXT]
+	kern_hyp_va x2
+	bl __save_fpsimd
+
+	add	x2, x0, #VCPU_CONTEXT
+	bl __restore_fpsimd
+
+	pop	x4, lr
+	pop	x2, x3
+	pop	x0, x1
+
+	eret
+
 /*
  * u64 __kvm_vcpu_run(struct kvm_vcpu *vcpu);
  *
@@ -932,7 +963,6 @@ ENTRY(__kvm_vcpu_run)
 	kern_hyp_va x2
 
 	save_host_regs
-	bl __save_fpsimd
 	bl __save_sysregs
 
 	compute_debug_state 1f
@@ -948,7 +978,6 @@ ENTRY(__kvm_vcpu_run)
 	add	x2, x0, #VCPU_CONTEXT
 
 	bl __restore_sysregs
-	bl __restore_fpsimd
 
 	skip_debug_state x3, 1f
 	bl	__restore_debug
@@ -967,7 +996,9 @@ __kvm_vcpu_return:
 	add	x2, x0, #VCPU_CONTEXT
 
 	save_guest_regs
+	skip_fpsimd_state x3, 1f
 	bl __save_fpsimd
+1:
 	bl __save_sysregs
 
 	skip_debug_state x3, 1f
@@ -986,7 +1017,11 @@ __kvm_vcpu_return:
 	kern_hyp_va x2
 
 	bl __restore_sysregs
+	skip_fpsimd_state x3, 1f
 	bl __restore_fpsimd
+1:
+	/* Clear FPSIMD and Trace trapping */
+	msr	cptr_el2, xzr
 
 	skip_debug_state x3, 1f
 	// Clear the dirty flag for the next run, as all the state has
@@ -1201,6 +1236,11 @@ el1_trap:
 	 * x1: ESR
 	 * x2: ESR_EC
 	 */
+
+	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
+	cmp	x2, #ESR_ELx_EC_FP_ASIMD
+	b.eq	switch_to_guest_fpsimd
+
 	cmp	x2, #ESR_ELx_EC_DABT_LOW
 	mov	x0, #ESR_ELx_EC_IABT_LOW
 	ccmp	x2, x0, #4, ne
-- 
1.7.9.5
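
The lazy switch described in this patch can be modelled in plain C. The
sketch below is only an illustration, not kernel code: the bit positions
come from the kvm_arm.h hunk above, while struct model, struct fp_regs and
the helper names guest_enter(), fpsimd_trap(), guest_fp_write() and
guest_exit() are hypothetical stand-ins for the hyp.S paths. It assumes a
single vcpu and shrinks the FP/SIMD register file to four words.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in the kvm_arm.h hunk of this patch. */
#define CPTR_EL2_TFP_SHIFT 10
#define CPTR_EL2_TFP       (1u << CPTR_EL2_TFP_SHIFT)
#define CPTR_EL2_TTA       (1u << 20)

/* Hypothetical, heavily reduced stand-in for the FP/SIMD register file. */
struct fp_regs {
	uint64_t v[4];
};

/* Hypothetical model of the state touched by the world switch. */
struct model {
	uint32_t cptr_el2;     /* modelled trap control register */
	struct fp_regs hw;     /* contents of the "hardware" registers */
	struct fp_regs host;   /* saved host FP/SIMD context */
	struct fp_regs guest;  /* saved guest FP/SIMD context */
	unsigned int switches; /* number of save/restore pairs performed */
};

/* Guest entry: arm the FP/SIMD trap instead of switching eagerly. */
static void guest_enter(struct model *m)
{
	m->cptr_el2 = CPTR_EL2_TTA | CPTR_EL2_TFP;
}

/* Models switch_to_guest_fpsimd: runs on the first guest FP/SIMD access. */
static void fpsimd_trap(struct model *m)
{
	m->cptr_el2 &= ~CPTR_EL2_TFP; /* later accesses no longer trap */
	m->host = m->hw;              /* save host state */
	m->hw = m->guest;             /* restore guest state */
	m->switches++;
}

/* A guest FP/SIMD write: traps only while the TFP bit is still set. */
static void guest_fp_write(struct model *m, uint64_t val)
{
	if (m->cptr_el2 & CPTR_EL2_TFP)
		fpsimd_trap(m);
	m->hw.v[0] = val;
}

/* Guest exit: switch back only if the trap bit was cleared (skip_fpsimd_state). */
static void guest_exit(struct model *m)
{
	if (!(m->cptr_el2 & CPTR_EL2_TFP)) {
		m->guest = m->hw; /* save guest state */
		m->hw = m->host;  /* restore host state */
		m->switches++;
	}
	m->cptr_el2 = 0; /* clear FP/SIMD and trace trapping for the host */
}
```

In this model an entry/exit pair with no guest FP/SIMD access performs no
copies at all, which corresponds to the exits the patch description says
are spared the save/restore.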

* [PATCH v2 1/2] arm64: KVM: Optimize arm64 fp/simd save/restore
From: Marc Zyngier @ 2015-06-18 17:04 UTC
To: linux-arm-kernel

On 16/06/15 22:50, Mario Smarduch wrote:
> This patch only saves and restores FP/SIMD registers on Guest access. To do
> this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
> lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD
> context is not saved/restored
>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>

Looks nice and clean.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64
From: Mario Smarduch @ 2015-06-16 21:50 UTC
To: linux-arm-kernel

After enhancing arm64 FP/SIMD exit handling, the FP/SIMD exit branch is
moved to guest trap handling. This keeps the exit handling flow consistent
between both architectures.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/interrupts.S | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 79caf79..fca2c56 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -363,10 +363,6 @@ hyp_hvc:
 	@ Check syndrome register
 	mrc	p15, 4, r1, c5, c2, 0	@ HSR
 	lsr	r0, r1, #HSR_EC_SHIFT
-#ifdef CONFIG_VFPv3
-	cmp	r0, #HSR_EC_CP_0_13
-	beq	switch_to_guest_vfp
-#endif
 	cmp	r0, #HSR_EC_HVC
 	bne	guest_trap		@ Not HVC instr.
 
@@ -406,6 +402,12 @@ THUMB(	orr	lr, #1)
 1:	eret
 
 guest_trap:
+#ifdef CONFIG_VFPv3
+	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
+	cmp	r0, #HSR_EC_CP_0_13
+	beq	switch_to_guest_fpsimd
+#endif
+
 	load_vcpu			@ Load VCPU pointer to r0
 	str	r1, [vcpu, #VCPU_HSR]
 
@@ -478,7 +480,7 @@ guest_trap:
  * inject an undefined exception to the guest.
  */
 #ifdef CONFIG_VFPv3
-switch_to_guest_vfp:
+switch_to_guest_fpsimd:
 	load_vcpu			@ Load VCPU pointer to r0
 	push	{r3-r7}
 
-- 
1.7.9.5

* [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64
From: Marc Zyngier @ 2015-06-18 17:27 UTC
To: linux-arm-kernel

On 16/06/15 22:50, Mario Smarduch wrote:
> After enhancing arm64 FP/SIMD exit handling, FP/SIMD exit branch is moved
> to guest trap handling. This keeps exiting handling flow between both
> architectures consistent.
>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/interrupts.S | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 79caf79..fca2c56 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -363,10 +363,6 @@ hyp_hvc:
>  	@ Check syndrome register
>  	mrc	p15, 4, r1, c5, c2, 0	@ HSR
>  	lsr	r0, r1, #HSR_EC_SHIFT
> -#ifdef CONFIG_VFPv3
> -	cmp	r0, #HSR_EC_CP_0_13
> -	beq	switch_to_guest_vfp
> -#endif
>  	cmp	r0, #HSR_EC_HVC
>  	bne	guest_trap		@ Not HVC instr.
>
> @@ -406,6 +402,12 @@ THUMB(	orr	lr, #1)
>  1:	eret
>
>  guest_trap:
> +#ifdef CONFIG_VFPv3
> +	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
> +	cmp	r0, #HSR_EC_CP_0_13
> +	beq	switch_to_guest_fpsimd
> +#endif
> +
>  	load_vcpu			@ Load VCPU pointer to r0
>  	str	r1, [vcpu, #VCPU_HSR]
>
> @@ -478,7 +480,7 @@ guest_trap:
>   * inject an undefined exception to the guest.
>   */
>  #ifdef CONFIG_VFPv3
> -switch_to_guest_vfp:
> +switch_to_guest_fpsimd:

Ah, I think I managed to confuse you in my previous comment.
On ARMv7, we call the floating point stuff VFP.
On ARMv8, we call it FP/SIMD.

Not very consistent, I know...

>  	load_vcpu			@ Load VCPU pointer to r0

It would be interesting to find out if we can make this load_vcpu part
of the common sequence (without spilling another register, of course).
Probably involves moving the exception class to r2.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64
From: Mario Smarduch @ 2015-06-18 23:49 UTC
To: linux-arm-kernel

On 06/18/2015 10:27 AM, Marc Zyngier wrote:
> On 16/06/15 22:50, Mario Smarduch wrote:
>> After enhancing arm64 FP/SIMD exit handling, FP/SIMD exit branch is moved
>> to guest trap handling. This keeps exiting handling flow between both
>> architectures consistent.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/kvm/interrupts.S | 12 +++++++-----
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index 79caf79..fca2c56 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -363,10 +363,6 @@ hyp_hvc:
>>  	@ Check syndrome register
>>  	mrc	p15, 4, r1, c5, c2, 0	@ HSR
>>  	lsr	r0, r1, #HSR_EC_SHIFT
>> -#ifdef CONFIG_VFPv3
>> -	cmp	r0, #HSR_EC_CP_0_13
>> -	beq	switch_to_guest_vfp
>> -#endif
>>  	cmp	r0, #HSR_EC_HVC
>>  	bne	guest_trap		@ Not HVC instr.
>>
>> @@ -406,6 +402,12 @@ THUMB(	orr	lr, #1)
>>  1:	eret
>>
>>  guest_trap:
>> +#ifdef CONFIG_VFPv3
>> +	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
>> +	cmp	r0, #HSR_EC_CP_0_13
>> +	beq	switch_to_guest_fpsimd
>> +#endif
>> +
>>  	load_vcpu			@ Load VCPU pointer to r0
>>  	str	r1, [vcpu, #VCPU_HSR]
>>
>> @@ -478,7 +480,7 @@ guest_trap:
>>   * inject an undefined exception to the guest.
>>   */
>>  #ifdef CONFIG_VFPv3
>> -switch_to_guest_vfp:
>> +switch_to_guest_fpsimd:
>
> Ah, I think I managed to confuse you in my previous comment.
> On ARMv7, we call the floating point stuff VFP.
> On ARMv8, we call it FP/SIMD.

Ah I see, I'll update.

>
> Not very consistent, I know...
>
>>  	load_vcpu			@ Load VCPU pointer to r0

How about moving it here? Then it does not stick out like
before.

guest_trap:
	load_vcpu			@ Load VCPU pointer to r0
	str	r1, [vcpu, #VCPU_HSR]

	@ Check if we need the fault information
	lsr	r1, r1, #HSR_EC_SHIFT
#ifdef CONFIG_VFPv3
	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
	cmp	r1, #HSR_EC_CP_0_13
	beq	switch_to_guest_vfp
#endif

Regarding "host_switch_to_hyp:", it has no reference but appears
like a clean separator. Is that on purpose?

Thanks

>
> It would be interesting to find out if we can make this load_vcpu part
> of the common sequence (without spilling another register, of course).
> Probably involves moving the exception class to r2.
>
> Thanks,
>
> 	M.
>

* [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64
From: Marc Zyngier @ 2015-06-19 13:50 UTC
To: linux-arm-kernel

On 19/06/15 00:49, Mario Smarduch wrote:
> On 06/18/2015 10:27 AM, Marc Zyngier wrote:
>> On 16/06/15 22:50, Mario Smarduch wrote:
>>> After enhancing arm64 FP/SIMD exit handling, FP/SIMD exit branch is moved
>>> to guest trap handling. This keeps exiting handling flow between both
>>> architectures consistent.
>>>
>>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>> ---
>>>  arch/arm/kvm/interrupts.S | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>>> index 79caf79..fca2c56 100644
>>> --- a/arch/arm/kvm/interrupts.S
>>> +++ b/arch/arm/kvm/interrupts.S
>>> @@ -363,10 +363,6 @@ hyp_hvc:
>>>  	@ Check syndrome register
>>>  	mrc	p15, 4, r1, c5, c2, 0	@ HSR
>>>  	lsr	r0, r1, #HSR_EC_SHIFT
>>> -#ifdef CONFIG_VFPv3
>>> -	cmp	r0, #HSR_EC_CP_0_13
>>> -	beq	switch_to_guest_vfp
>>> -#endif
>>>  	cmp	r0, #HSR_EC_HVC
>>>  	bne	guest_trap		@ Not HVC instr.
>>>
>>> @@ -406,6 +402,12 @@ THUMB(	orr	lr, #1)
>>>  1:	eret
>>>
>>>  guest_trap:
>>> +#ifdef CONFIG_VFPv3
>>> +	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
>>> +	cmp	r0, #HSR_EC_CP_0_13
>>> +	beq	switch_to_guest_fpsimd
>>> +#endif
>>> +
>>>  	load_vcpu			@ Load VCPU pointer to r0
>>>  	str	r1, [vcpu, #VCPU_HSR]
>>>
>>> @@ -478,7 +480,7 @@ guest_trap:
>>>   * inject an undefined exception to the guest.
>>>   */
>>>  #ifdef CONFIG_VFPv3
>>> -switch_to_guest_vfp:
>>> +switch_to_guest_fpsimd:
>>
>> Ah, I think I managed to confuse you in my previous comment.
>> On ARMv7, we call the floating point stuff VFP.
>> On ARMv8, we call it FP/SIMD.
>
> Ah I see, I'll update.
>>
>> Not very consistent, I know...
>>
>>>  	load_vcpu			@ Load VCPU pointer to r0
>
> How about move it here - then it does not stick out like
> before.
>
> guest_trap:
> 	load_vcpu			@ Load VCPU pointer to r0
> 	str	r1, [vcpu, #VCPU_HSR]
>
> 	@ Check if we need the fault information
> 	lsr	r1, r1, #HSR_EC_SHIFT
> #ifdef CONFIG_VFPv3
> 	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
> 	cmp	r1, #HSR_EC_CP_0_13
> 	beq	switch_to_guest_vfp
> #endif

That would work.

> Regarding "host_switch_to_hyp:" it has no reference but appears
> like a clean separator, that's on purpose?

Not really. It looks like a leftover from the original HYP calling
method that we used to have, before the code got merged. You could
replace it by a simple comment saying that from this point on, we're
dealing with a HVC call from the host.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
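
The ordering agreed on above (store the HSR in the vcpu first, then
extract the exception class and branch to the VFP switch) can be
sketched in C. This is a simplified model rather than the interrupts.S
code: the numeric values given for HSR_EC_SHIFT, HSR_EC_CP_0_13 and
HSR_EC_HVC are assumptions added for illustration, since the thread only
uses the names, and struct vcpu_model, enum trap_path and
classify_trap() are hypothetical. The extra checks hyp_hvc performs on
the HVC path are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the thread itself only uses the names. */
#define HSR_EC_SHIFT   26
#define HSR_EC_CP_0_13 0x07
#define HSR_EC_HVC     0x12

/* Hypothetical labels for the three paths discussed in the thread. */
enum trap_path {
	PATH_HVC,         /* stays on the hyp_hvc path */
	PATH_GUEST_VFP,   /* guest_trap, then the VFP switch */
	PATH_GUEST_OTHER  /* guest_trap, remaining fault handling */
};

/* Hypothetical stand-in for the vcpu field addressed by VCPU_HSR. */
struct vcpu_model {
	uint32_t hsr;
};

static enum trap_path classify_trap(struct vcpu_model *vcpu, uint32_t hsr)
{
	/* Exception class is the top field of the syndrome: lsr #HSR_EC_SHIFT */
	uint32_t ec = hsr >> HSR_EC_SHIFT;

	if (ec == HSR_EC_HVC)
		return PATH_HVC;

	/* guest_trap: record the syndrome before any further dispatch. */
	vcpu->hsr = hsr;

	/* A VFP/SIMD access is checked for first among the guest traps. */
	if (ec == HSR_EC_CP_0_13)
		return PATH_GUEST_VFP;

	return PATH_GUEST_OTHER;
}
```

With this ordering the vcpu pointer load and the HSR store are shared by
every guest trap, so the VFP check no longer needs its own early exit
ahead of the HVC test.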