From: James Morse <james.morse@arm.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jonathan.Zhang@cavium.com,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
wangxiongfeng2@huawei.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
Date: Thu, 12 Oct 2017 13:28:05 +0100
Message-ID: <59DF5FD5.7020208@arm.com>
In-Reply-To: <87h8v6j7h1.fsf@on-the-bus.cambridge.arm.com>
Hi Marc,
On 11/10/17 11:37, Marc Zyngier wrote:
> On Thu, Oct 05 2017 at 8:18:11 pm BST, James Morse <james.morse@arm.com> wrote:
>> We expect to have firmware-first handling of RAS SErrors, with errors
>> notified via an APEI method. For systems without firmware-first, add
>> some minimal handling to KVM.
>>
>> There are two ways KVM can take an SError due to a guest, either may be a
>> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
>> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
>>
>> The current SError from EL2 code unmasks SError and tries to fence any
>> pending SError into a single instruction window. It then leaves SError
>> unmasked.
>>
>> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
>> error, but KVM is only able to handle SError from EL2 if they occur
>> during this single instruction window...
>>
>> The RAS Extensions give us a new instruction to synchronise and
>> consume SErrors. The RAS Extensions document (ARM DDI0587),
>> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
>> SError interrupts generated by 'instructions, translation table walks,
>> hardware updates to the translation tables, and instruction fetches on
>> the same PE'. This makes ESB equivalent to KVM's existing
>> 'dsb, msr-daifclr, isb' sequence.
>>
>> Use the alternatives to synchronise and consume any SError using ESB
>> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
>> in the exit_code so that we can restart the vcpu if it turns out this
>> SError has no impact on the vcpu.
>> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
>> index 12ee62d6d410..96caa5328b3a 100644
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -124,6 +124,17 @@ ENTRY(__guest_exit)
>> // Now restore the host regs
>> restore_callee_saved_regs x2
>>
>> +alternative_if ARM64_HAS_RAS_EXTN
>> + // If we have the RAS extensions we can consume a pending error
>> + // without an unmask-SError and isb.
>> + esb
>> + mrs_s x2, SYS_DISR_EL1
>> + str x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
>> + cbz x2, 1f
>> + msr_s SYS_DISR_EL1, xzr
>> + orr x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
>> +1: ret
>> +alternative_else
>> // If we have a pending asynchronous abort, now is the
>> // time to find out. From your VAXorcist book, page 666:
>> // "Threaten me not, oh Evil one! For I speak with
>> @@ -135,6 +146,8 @@ ENTRY(__guest_exit)
>>
>> dsb sy // Synchronize against in-flight ld/st
>> msr daifclr, #4 // Unmask aborts
>> + nop
>
> Oops. You've now introduced an instruction in what was supposed to be a
> single instruction window (the isb). It means that we may fail to
> identify the SError as having been generated by our synchronisation
> mechanism, and we'll panic for no good reason.
Gah! and I pointed this thing out in the commit message,
>> +alternative_endif
>>
>> // This is our single instruction exception window. A pending
>> // SError is guaranteed to occur at the earliest when we unmask
and here it is in the diff.
> Moving the nop up will solve this.
Yes.
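To spell out why the placement matters, here is a rough C model (not kernel code) of the check. It assumes the EL2 SError handler recognises our deliberate exception by comparing the reported return address against the bounds of the window around the isb, and that a pending SError is taken, at the earliest, immediately after the unmask. The function names and addresses are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Every A64 instruction is 4 bytes long. */
#define INSN_SIZE 4u

/*
 * Hypothetical helper: true if an SError reported with return address
 * 'elr' lies inside the window [window_start, window_end] that the
 * handler treats as "caused by our own synchronisation".
 */
bool serror_in_exit_window(uint64_t elr, uint64_t window_start,
			   uint64_t window_end)
{
	return elr >= window_start && elr <= window_end;
}

/*
 * Simplified model of the earliest case: a pending SError is taken as
 * soon as it is unmasked, so the return address is the instruction
 * that follows 'msr daifclr, #4'.
 */
uint64_t elr_after_unmask(uint64_t unmask_addr)
{
	return unmask_addr + INSN_SIZE;
}
```

With the v3 ordering (msr, nop, isb) the earliest return address is the nop, which sits just before the window, so the handler would not recognise the SError as ours. With the nop moved above the msr, the instruction after the unmask is the isb itself and the check passes.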
I've fixed this for v4, along with Julien's suggestions.
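For anyone reading the RAS alternative in the diff above, its logic can be sketched in C as follows. This is only a model under stated assumptions: the structs stand in for the CPU's DISR_EL1 and the vcpu's saved fault state, the esb is assumed to have already deferred any pending SError into DISR_EL1, and the bit number given to EXIT_WITH_SERROR_BIT is a placeholder rather than the kernel's definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit number, chosen for illustration only. */
#define EXIT_WITH_SERROR_BIT 31

/* Hypothetical stand-ins for the hardware register and vcpu state. */
struct cpu_model {
	uint64_t disr_el1;	/* deferred SError syndrome, 0 if none */
};

struct vcpu_model {
	uint64_t fault_disr;	/* copy saved for later inspection */
};

/*
 * Mirrors the sequence after 'esb': read DISR_EL1, save it in the
 * vcpu, and if it is non-zero clear the register and flag the exit.
 */
uint64_t guest_exit_consume_serror(struct cpu_model *cpu,
				   struct vcpu_model *vcpu,
				   uint64_t exit_code)
{
	assert(cpu != NULL && vcpu != NULL);

	uint64_t disr = cpu->disr_el1;		/* mrs_s x2, SYS_DISR_EL1 */
	vcpu->fault_disr = disr;		/* str x2, [x1, ...] */

	if (disr != 0) {			/* cbz x2, 1f */
		cpu->disr_el1 = 0;		/* msr_s SYS_DISR_EL1, xzr */
		exit_code |= UINT64_C(1) << EXIT_WITH_SERROR_BIT;
	}
	return exit_code;			/* 1: ret */
}
```

The exit code is left untouched when nothing was deferred, which is what lets the caller restart the vcpu if the SError turns out not to affect it.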
Thanks!
James