linux-arm-kernel.lists.infradead.org archive mirror
From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 00/21] SError rework + RAS&IESB for firmware first support
Date: Tue, 14 Nov 2017 16:03:01 +0000
Message-ID: <5A0B13B5.3000205@arm.com>
In-Reply-To: <20171113112946.GK14144@cbox>

Hi Christoffer,

On 13/11/17 11:29, Christoffer Dall wrote:
> On Thu, Nov 09, 2017 at 06:14:56PM +0000, James Morse wrote:
>> On 19/10/17 15:57, James Morse wrote:
>>> Known issues:
>> [...]
>>>  * KVM-Migration: VDISR_EL2 is exposed to userspace as DISR_EL1, but how should
>>>    HCR_EL2.VSE or VSESR_EL2 be migrated when the guest has an SError pending but
>>>    hasn't taken it yet...?
>>
>> I've been trying to work out how this pending-SError-migration could work.
>>
>> If HCR_EL2.VSE is set then the guest will take a virtual SError when it next
>> unmasks SError. Today this doesn't get migrated, but only KVM sets this bit as
>> an attempt to kill the guest.
>>
>> This will be more of a problem with GengDongjiu's SError CAP for triggering
>> guest SError from user-space, which will also allow the VSESR_EL2 to be
>> specified. (this register becomes the guest ESR_EL1 when the virtual SError is
>> taken and is used to emulate firmware-first's NOTIFY_SEI and eventually
>> kernel-first RAS). These errors are likely to be handled by the guest.
>>
>>
>> We don't want to expose VSESR_EL2 to user-space, and for migration it isn't
>> enough as a value of '0' doesn't tell us if HCR_EL2.VSE is set.
>>
>> To get out of this corner: why not declare pending-SError-migration an invalid
>> thing to do?

> To answer that question we'd have to know if that is generally a valid
> thing to require.  How will higher-level tools in the stack (e.g. libvirt
> and OpenStack) deal with this?  Is it really valid to tell them
> "nope, can't migrate right now"?  I'm thinking if you have a failing
> host and want to signal some error to the guest, that's probably a
> really good time to migrate your mission-critical VM away to a different
> host, and being told, "sorry, cannot do this" would be painful.  I'm
> cc'ing Drew for his insight into libvirt and how this is done on x86,

Thanks,


> but I'm not really crazy about this idea.

Excellent, so at the other extreme we could have an API to query all of this
state, and another to set it. On systems without the RAS extensions this just
moves the HCR_EL2.VSE bit. On systems with the RAS extensions it moves VSESR_EL2
too.
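
A minimal sketch of what such a query/set pair could look like, assuming a
made-up 'struct mock_vcpu' and a made-up migration record. None of these names
are existing KVM structures or ABI; only the position of HCR_EL2.VSE (bit 8)
is architectural. The 'has_esr' flag is there so that an ESR of 0 stays
distinguishable from "nothing pending".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2.VSE is bit 8 of HCR_EL2. */
#define HCR_VSE (UINT64_C(1) << 8)

/* Hypothetical stand-in for per-vcpu state; not a kernel structure. */
struct mock_vcpu {
    uint64_t hcr_el2;
    uint64_t vsesr_el2;
    bool has_ras;      /* CPU implements the RAS extensions */
};

/* Hypothetical migration record; not an existing KVM ABI. */
struct serror_state {
    bool pending;      /* mirrors HCR_EL2.VSE */
    bool has_esr;      /* esr is only meaningful when this is set */
    uint64_t esr;      /* mirrors VSESR_EL2 */
};

/* Query: without RAS only the pending bit moves, with RAS the ESR moves too. */
static void serror_state_get(const struct mock_vcpu *vcpu,
                             struct serror_state *out)
{
    assert(vcpu && out);
    out->pending = (vcpu->hcr_el2 & HCR_VSE) != 0;
    out->has_esr = vcpu->has_ras && out->pending;
    out->esr = out->has_esr ? vcpu->vsesr_el2 : 0;
}

/* Set: returns 0 on success, -1 if the destination cannot hold the ESR. */
static int serror_state_set(struct mock_vcpu *vcpu,
                            const struct serror_state *in)
{
    assert(vcpu && in);
    if (in->has_esr && !vcpu->has_ras)
        return -1;

    if (in->pending)
        vcpu->hcr_el2 |= HCR_VSE;
    else
        vcpu->hcr_el2 &= ~HCR_VSE;

    if (vcpu->has_ras)
        vcpu->vsesr_el2 = in->has_esr ? in->esr : 0;
    return 0;
}
```

Refusing the ESR on a destination without the RAS extensions is just one
possible policy for the sketch; dropping it silently would be another.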

I was hoping to avoid exposing different information. I need to look into how
that works. (and this is all while avoiding adding an EL2 register to
vcpu_sysreg [0])


>> We can give Qemu a way to query if a virtual SError is (still) pending. Qemu
>> would need to check this on each vcpu after migration, just before it throws the
>> switch and the guest runs on the new host. This way the VSESR_EL2 value doesn't
>> need migrating at all.
>>
>> In the ideal world, Qemu could re-inject the last SError it triggered if there
>> is still one pending when it migrates... but because KVM injects errors too, it
>> would need to block migration until this flag is cleared.

> I don't understand your conclusion here.

I was trying to reduce it to exposing just HCR_EL2.VSE as 'bool
serror_still_pending()', then let Qemu re-inject whatever SError it injected
last. This then behaves the same regardless of RAS support.
But KVM's kvm_inject_vabt() breaks this: Qemu can't know whether the pending
SError came from Qemu or from KVM.

... So we need VSESR_EL2 on systems which have that register ...

(or, get rid of kvm_inject_vabt(), but that would involve a new exit type, and
some trickery for existing user-space)
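
To illustrate the ambiguity, here is a toy user-space model. All types and
function names are hypothetical, apart from 'serror_still_pending()' which is
the name used above; 'mock_kernel_inject_serror()' only stands in for the
effect of kvm_inject_vabt(). It shows that once the kernel can set the pending
bit on its own, "re-inject the last thing Qemu injected" can produce the wrong
syndrome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one vcpu; field names are illustrative only. */
struct toy_vcpu {
    bool vse;        /* stands in for HCR_EL2.VSE */
    uint64_t vsesr;  /* stands in for VSESR_EL2 */
};

/* What the VMM (e.g. Qemu) can remember on its own. */
struct toy_vmm {
    bool has_injected;
    uint64_t last_esr;
};

/* User-space injects an SError with a syndrome of its choosing. */
static void vmm_inject_serror(struct toy_vmm *vmm, struct toy_vcpu *vcpu,
                              uint64_t esr)
{
    assert(vmm && vcpu);
    vmm->has_injected = true;
    vmm->last_esr = esr;
    vcpu->vse = true;
    vcpu->vsesr = esr;
}

/* Models the in-kernel injection path: the VMM is not told about it. */
static void mock_kernel_inject_serror(struct toy_vcpu *vcpu, uint64_t esr)
{
    vcpu->vse = true;
    vcpu->vsesr = esr;
}

/* The guest unmasks SError and takes the pending virtual SError. */
static void mock_guest_take_serror(struct toy_vcpu *vcpu)
{
    vcpu->vse = false;
}

/* The only thing the reduced API would expose. */
static bool serror_still_pending(const struct toy_vcpu *vcpu)
{
    return vcpu->vse;
}

/* Would re-injecting the VMM's last syndrome reproduce the pending one? */
static bool reinject_matches(const struct toy_vmm *vmm,
                             const struct toy_vcpu *vcpu)
{
    if (!serror_still_pending(vcpu))
        return true; /* nothing pending, nothing to get wrong */
    return vmm->has_injected && vmm->last_esr == vcpu->vsesr;
}
```

In this model, an SError injected by the VMM and still pending is reproduced
correctly, but one injected by the kernel after the guest took the VMM's
SError is not: the pending bit looks identical in both cases.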

> If QEMU can query the virtual SError pending state, it can also inject
> that before running the VM after a restore, and we should have preserved
> the same state.

[..]

>> Can anyone suggest a better way?

> I'm thinking this is analogous to migrating a VM that uses an irqchip in
> userspace and has set the IRQ or FIQ lines using KVM_IRQ_LINE.  My
> feeling is that this is also not supported today.

Does KVM change/update these values behind Qemu's back? It's kvm_inject_vabt()
that is making this tricky (or at least confusing me).


> My suggestion would be to add some set of VCPU exception state,
> potentially as flags, which can be migrated along with the VM, or at
> least used by userspace to query the state of the VM, if there exists a
> reliable mechanism to restore the state again without any side effects.
> 
> I think we have to comb through Documentation/virtual/kvm/api.txt to see
> if we can reuse anything, and if not, add something.  We could also
> consider adding something to Documentation/virtual/kvm/devices/vcpu.txt,
> where I think we have a large number space to use from.
> 
> Hope this helps?

Yes, I'll go looking for a way to expose VSESR_EL2 to user-space.


Thanks!

James


[0] https://patchwork.kernel.org/patch/9886019/
