From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit
Date: Thu, 03 Aug 2017 18:03:33 +0100 [thread overview]
Message-ID: <59835765.7020604@arm.com> (raw)
In-Reply-To: <20170801131855.GH5176@cbox>
Hi Christoffer,
On 01/08/17 14:18, Christoffer Dall wrote:
> On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
>> On systems with VHE, the RAS extensions and IESB support, KVM gets an
>> implicit ESB whenever it enters/exits a guest, because the host sets
>> SCTLR_EL1.IESB.
>>
>> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
>> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
>> process this deferred SError. This data is in addition to the reason the
>> guest exitted.
>
> Two questions:
>
> First, am I reading the spec incorrectly when it says "The implicit form
> of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
> VDISR_EL2" and I understand this as we wouldn't actually read anything
> from DISR_EL1 if we rely on the IESB?
(This is from section 2.4.5 Extension for barrier at exception entry and exit of
DDI 0587A.)
Well spotted ... that's embarrassing!
The DISR write is in the pseudocode's ESBOperation() which is not the same as
ErrorSynchronizationBarrier(). Running an 'ESB' does both, but an IESB only does
ErrorSynchronizationBarrier().
I think this distinction is because the CPU may know about RAS errors it hasn't
yet made pending SErrors. (they must have to have a severity for the ESR by this
point).
So IESB makes hidden RAS errors pending SErrors, it doesn't do what ESB does.
Yes, this means the DISR_EL1 check on kernel-entry and guest exit is useless.
Given this the host kernel entry/exit can be simplified, probably getting rid of
the SError over eret horror. I will need to re-think the KVM changes, (we may
just need the ESR from the existing vaxorcism code).
> Second, what if we have several SErrors, and one happens upon entering
> the guest and another one happens when returning from the guest - do we
> end up overwriting the DISR_EL1 by only looking at it during exit and
> potentially miss errors?
There can only be one pending SError at a time, but if we have PSTATE.A set, a
pending SError and a hidden RAS error, then ESB must have to pick one to defer,
and IESB must have to discard one. I suspect the answer is 'implementation
defined', but I will ask!
>> Future patches may add a firmware-first callout from
>> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
>> or call some arm64 arch code to process the RAS 'ERR' registers for
>> kernel-first handling. Without either of these, we just make a judgement
>> on the severity: corrected and restartable errors are ignored, all others
>> result it an SError being given to the guest.
>
> *in an* ?
> Why do we give the remaining types of SErrors to the guest?
Just because that is what KVM does today.
> What would the kernel normally do for any other workload than running a VM when
> discovering this type of error?
I'm trying to make that clearer! Today we 'kill the running task', if its the
kernel, we would panic(). But because the CPU masks SError on exception entry,
and we never touch PSTATE.A, its always masked in the kernel, so we take the
SError and kill the next user space task that gets run.
We should panic() like we do in the early boot code if an SError was pending
from firmware.
Should the host panic because of an SError taken during a guest?, not
necessarily. All the system registers are save/restored by world-switch, and the
host doesn't depend on anything in guest memory. The host should be immune to
any corruption that occurs while a guest was running.
Gengdongjiu's example of device pass-through is the exception to this reasoning,
I think we need a way for the host to contain/reset pass-through devices that
trigger an SError.
Thanks!
James
next prev parent reply other threads:[~2017-08-03 17:03 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-28 14:10 [PATCH v2 00/16] SError rework + v8.2 RAS and IESB cpufeature support James Morse
2017-07-28 14:10 ` [PATCH v2 01/16] arm64: explicitly mask all exceptions James Morse
2017-07-28 14:10 ` [PATCH v2 02/16] arm64: introduce an order for exceptions James Morse
2017-07-28 14:10 ` [PATCH v2 03/16] arm64: unmask all exceptions from C code on CPU startup James Morse
2017-07-28 14:10 ` [PATCH v2 04/16] arm64: entry.S: mask all exceptions during kernel_exit James Morse
2017-07-28 14:10 ` [PATCH v2 05/16] arm64: entry.S: move enable_step_tsk into kernel_exit James Morse
2017-07-28 14:10 ` [PATCH v2 06/16] arm64: entry.S: convert elX_sync James Morse
2017-08-09 17:25 ` Catalin Marinas
2017-08-10 16:57 ` James Morse
2017-08-11 17:24 ` James Morse
2017-07-28 14:10 ` [PATCH v2 07/16] arm64: entry.S: convert elX_irq James Morse
2017-07-28 14:10 ` [PATCH v2 08/16] arm64: entry.S: move SError handling into a C function for future expansion James Morse
2017-07-28 14:10 ` [PATCH v2 09/16] arm64: cpufeature: Detect CPU RAS Extentions James Morse
2017-07-28 14:10 ` [PATCH v2 10/16] arm64: kernel: Survive corrected RAS errors notified by SError James Morse
2017-09-13 20:52 ` Baicar, Tyler
2017-09-14 12:58 ` James Morse
2017-07-28 14:10 ` [PATCH v2 11/16] arm64: kernel: Handle deferred SError on kernel entry James Morse
2017-08-03 17:03 ` James Morse
2017-07-28 14:10 ` [PATCH v2 12/16] arm64: entry.S: Make eret restartable James Morse
2017-07-28 14:10 ` [PATCH v2 13/16] arm64: cpufeature: Enable Implicit ESB on entry/return-from EL1 James Morse
2017-07-28 14:10 ` [PATCH v2 14/16] KVM: arm64: Take pending SErrors on entry to the guest James Morse
2017-08-01 12:53 ` Christoffer Dall
2017-07-28 14:10 ` [PATCH v2 15/16] KVM: arm64: Save ESR_EL2 on guest SError James Morse
2017-08-01 13:25 ` Christoffer Dall
2017-07-28 14:10 ` [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit James Morse
2017-08-01 13:18 ` Christoffer Dall
2017-08-03 17:03 ` James Morse [this message]
2017-08-04 13:12 ` Christoffer Dall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=59835765.7020604@arm.com \
--to=james.morse@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).