From: Marc Zyngier <marc.zyngier@arm.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: "linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH v4 1/2] arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore on exits
Date: Thu, 06 Aug 2015 13:46:31 +0100 [thread overview]
Message-ID: <55C35727.6060208@arm.com> (raw)
In-Reply-To: <20150806115436.GC4657@cbox>
Hi Christoffer,
On 06/08/15 12:54, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 05:11:37PM +0100, Marc Zyngier wrote:
>> On 16/07/15 22:29, Mario Smarduch wrote:
>>> This patch only saves and restores FP/SIMD registers on Guest access. To do
>>> this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
>>> lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD
>>> context is not saved/restored
>>>
>>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>
>> So this patch seems to break 32bit guests on arm64. I've had a look,
>> squashed a few bugs that I dangerously overlooked during the review, but
>> it still doesn't work (it doesn't crash anymore, but I get random
>> illegal VFP instructions in 32bit guests).
>>
>> I'd be glad if someone could eyeball the following patch and tell me
>> what's going wrong. If we don't find the root cause quickly enough, I'll
>> have to drop the series from -next, and that'd be a real shame.
>>
>> Thanks,
>>
>> M.
>>
>> commit 5777dc55fbc170426a85e00c26002dd5a795cfa5
>> Author: Marc Zyngier <marc.zyngier@arm.com>
>> Date: Wed Aug 5 16:53:01 2015 +0100
>>
>> KVM: arm64: NOTAFIX: Prevent crash when 32bit guest uses VFP
>>
>> Since we switch FPSIMD in a lazy way, access to FPEXC32_EL2
>> must be guarded by skip_fpsimd_state. Otherwise, all hell
>> break loose.
>>
>> Also, FPEXC32_EL2 must be restored when we trap to EL2 to
>> enable floating point.
>>
>> Note that while it prevents the host from catching fire, the
>> guest still doesn't work properly, and I don't understand why just
>> yet.
>>
>> Not-really-signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index c8e0c70..b53ec5d 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -431,10 +431,12 @@
>> add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>> mrs x4, dacr32_el2
>> mrs x5, ifsr32_el2
>> - mrs x6, fpexc32_el2
>> stp x4, x5, [x3]
>> - str x6, [x3, #16]
>>
>> + skip_fpsimd_state x8, 3f
>> + mrs x6, fpexc32_el2
>> + str x6, [x3, #16]
>> +3:
>> skip_debug_state x8, 2f
>> mrs x7, dbgvcr32_el2
>> str x7, [x3, #24]
>> @@ -461,10 +463,8 @@
>>
>> add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>> ldp x4, x5, [x3]
>> - ldr x6, [x3, #16]
>> msr dacr32_el2, x4
>> msr ifsr32_el2, x5
>> - msr fpexc32_el2, x6
>>
>> skip_debug_state x8, 2f
>> ldr x7, [x3, #24]
>> @@ -669,12 +669,14 @@ __restore_debug:
>> ret
>>
>> __save_fpsimd:
>> + skip_fpsimd_state x3, 1f
>> save_fpsimd
>> - ret
>> +1: ret
>>
>> __restore_fpsimd:
>> + skip_fpsimd_state x3, 1f
>> restore_fpsimd
>> - ret
>> +1: ret
>>
>> switch_to_guest_fpsimd:
>> push x4, lr
>> @@ -682,6 +684,7 @@ switch_to_guest_fpsimd:
>> mrs x2, cptr_el2
>> bic x2, x2, #CPTR_EL2_TFP
>> msr cptr_el2, x2
>> + isb
>
> ah, EL2 accesses themselves trap too, ouch.
Yeah, the FP architecture has all kind of nasty tricks like this.
>>
>> mrs x0, tpidr_el2
>>
>> @@ -692,6 +695,10 @@ switch_to_guest_fpsimd:
>> add x2, x0, #VCPU_CONTEXT
>> bl __restore_fpsimd
>>
>> + skip_32bit_state x3, 1f
>> + ldr x4, [x2, #CPU_SYSREG_OFFSET(FPEXC32_EL2)]
>> + msr fpexc32_el2, x4
>> +1:
>
> wait, it looks like you're missing a store of the host fpsimd state in
> the switch_to_guest_fpsimd situation.
A store of FPEXC32_EL2 for the host? No, this register has no
significance whatsoever for a 64bit host. Its sole purpose is to
accommodate a 32bit guest (see D7.2.31). Or are you thinking of
something else?
> It think this would be easier to follow if the aarch32 FP registers were
> handled as part of the __save_fpsimd and __restore_fpsimd routines
> instead.
Maybe. This code needs some rework...
> Does this help anything?
Not really, sorry... :-(
> Otherwise, I assume you've tested thoroughly between something like
> v4.2-rc2 and this patch, so that you're sure we didn't have the 32-bit
> processed crash before?
Without the lazy switching, we seem to be rock solid. With lazy
switching, I get these illegal instructions, always on VFP ops. My
current thinking is that it has something to do with CPACR_EL1, and the
fact that the 32bit kernel turns it VFP on/off all the time.
/me puzzled...
M.
--
Jazz is not dead. It just smells funny...
WARNING: multiple messages have this Message-ID (diff)
From: marc.zyngier@arm.com (Marc Zyngier)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 1/2] arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore on exits
Date: Thu, 06 Aug 2015 13:46:31 +0100 [thread overview]
Message-ID: <55C35727.6060208@arm.com> (raw)
In-Reply-To: <20150806115436.GC4657@cbox>
Hi Christoffer,
On 06/08/15 12:54, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 05:11:37PM +0100, Marc Zyngier wrote:
>> On 16/07/15 22:29, Mario Smarduch wrote:
>>> This patch only saves and restores FP/SIMD registers on Guest access. To do
>>> this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
>>> lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD
>>> context is not saved/restored
>>>
>>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>
>> So this patch seems to break 32bit guests on arm64. I've had a look,
>> squashed a few bugs that I dangerously overlooked during the review, but
>> it still doesn't work (it doesn't crash anymore, but I get random
>> illegal VFP instructions in 32bit guests).
>>
>> I'd be glad if someone could eyeball the following patch and tell me
>> what's going wrong. If we don't find the root cause quickly enough, I'll
>> have to drop the series from -next, and that'd be a real shame.
>>
>> Thanks,
>>
>> M.
>>
>> commit 5777dc55fbc170426a85e00c26002dd5a795cfa5
>> Author: Marc Zyngier <marc.zyngier@arm.com>
>> Date: Wed Aug 5 16:53:01 2015 +0100
>>
>> KVM: arm64: NOTAFIX: Prevent crash when 32bit guest uses VFP
>>
>> Since we switch FPSIMD in a lazy way, access to FPEXC32_EL2
>> must be guarded by skip_fpsimd_state. Otherwise, all hell
>> break loose.
>>
>> Also, FPEXC32_EL2 must be restored when we trap to EL2 to
>> enable floating point.
>>
>> Note that while it prevents the host from catching fire, the
>> guest still doesn't work properly, and I don't understand why just
>> yet.
>>
>> Not-really-signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index c8e0c70..b53ec5d 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -431,10 +431,12 @@
>> add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>> mrs x4, dacr32_el2
>> mrs x5, ifsr32_el2
>> - mrs x6, fpexc32_el2
>> stp x4, x5, [x3]
>> - str x6, [x3, #16]
>>
>> + skip_fpsimd_state x8, 3f
>> + mrs x6, fpexc32_el2
>> + str x6, [x3, #16]
>> +3:
>> skip_debug_state x8, 2f
>> mrs x7, dbgvcr32_el2
>> str x7, [x3, #24]
>> @@ -461,10 +463,8 @@
>>
>> add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>> ldp x4, x5, [x3]
>> - ldr x6, [x3, #16]
>> msr dacr32_el2, x4
>> msr ifsr32_el2, x5
>> - msr fpexc32_el2, x6
>>
>> skip_debug_state x8, 2f
>> ldr x7, [x3, #24]
>> @@ -669,12 +669,14 @@ __restore_debug:
>> ret
>>
>> __save_fpsimd:
>> + skip_fpsimd_state x3, 1f
>> save_fpsimd
>> - ret
>> +1: ret
>>
>> __restore_fpsimd:
>> + skip_fpsimd_state x3, 1f
>> restore_fpsimd
>> - ret
>> +1: ret
>>
>> switch_to_guest_fpsimd:
>> push x4, lr
>> @@ -682,6 +684,7 @@ switch_to_guest_fpsimd:
>> mrs x2, cptr_el2
>> bic x2, x2, #CPTR_EL2_TFP
>> msr cptr_el2, x2
>> + isb
>
> ah, EL2 accesses themselves trap too, ouch.
Yeah, the FP architecture has all kind of nasty tricks like this.
>>
>> mrs x0, tpidr_el2
>>
>> @@ -692,6 +695,10 @@ switch_to_guest_fpsimd:
>> add x2, x0, #VCPU_CONTEXT
>> bl __restore_fpsimd
>>
>> + skip_32bit_state x3, 1f
>> + ldr x4, [x2, #CPU_SYSREG_OFFSET(FPEXC32_EL2)]
>> + msr fpexc32_el2, x4
>> +1:
>
> wait, it looks like you're missing a store of the host fpsimd state in
> the switch_to_guest_fpsimd situation.
A store of FPEXC32_EL2 for the host? No, this register has no
significance whatsoever for a 64bit host. Its sole purpose is to
accommodate a 32bit guest (see D7.2.31). Or are you thinking of
something else?
> It think this would be easier to follow if the aarch32 FP registers were
> handled as part of the __save_fpsimd and __restore_fpsimd routines
> instead.
Maybe. This code needs some rework...
> Does this help anything?
Not really, sorry... :-(
> Otherwise, I assume you've tested thoroughly between something like
> v4.2-rc2 and this patch, so that you're sure we didn't have the 32-bit
> processed crash before?
Without the lazy switching, we seem to be rock solid. With lazy
switching, I get these illegal instructions, always on VFP ops. My
current thinking is that it has something to do with CPACR_EL1, and the
fact that the 32bit kernel turns it VFP on/off all the time.
/me puzzled...
M.
--
Jazz is not dead. It just smells funny...
next prev parent reply other threads:[~2015-08-06 12:33 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-16 21:29 [PATCH v4 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits Mario Smarduch
2015-07-16 21:29 ` Mario Smarduch
2015-07-16 21:29 ` [PATCH v4 1/2] arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore " Mario Smarduch
2015-07-16 21:29 ` Mario Smarduch
2015-08-05 16:11 ` Marc Zyngier
2015-08-05 16:11 ` Marc Zyngier
2015-08-06 11:54 ` Christoffer Dall
2015-08-06 11:54 ` Christoffer Dall
2015-08-06 12:46 ` Marc Zyngier [this message]
2015-08-06 12:46 ` Marc Zyngier
2015-08-19 17:49 ` Christoffer Dall
2015-08-19 17:49 ` Christoffer Dall
2015-08-19 21:52 ` Mario Smarduch
2015-08-19 21:52 ` Mario Smarduch
2015-08-19 22:28 ` Marc Zyngier
2015-08-19 22:28 ` Marc Zyngier
2015-08-19 23:21 ` Mario Smarduch
2015-08-19 23:21 ` Mario Smarduch
2015-07-16 21:29 ` [PATCH v4 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64 Mario Smarduch
2015-07-16 21:29 ` Mario Smarduch
2015-07-17 9:28 ` [PATCH v4 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits Christoffer Dall
2015-07-17 9:28 ` Christoffer Dall
2015-07-17 10:28 ` Marc Zyngier
2015-07-17 10:28 ` Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55C35727.6060208@arm.com \
--to=marc.zyngier@arm.com \
--cc=christoffer.dall@linaro.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.