public inbox for kvmarm@lists.cs.columbia.edu
 help / color / mirror / Atom feed
* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-05 19:17 [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support James Morse
@ 2017-10-05 19:18 ` James Morse
  2017-10-13  9:25   ` gengdongjiu
  0 siblings, 1 reply; 8+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, Dongjiu Geng, kvmarm

Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
generated an SError with an implementation defined ESR_EL1.ISS, because we
had no mechanism to specify the ESR value.

On Juno this generates an all-zero ESR, the most significant bit 'ISV'
is clear indicating the remainder of the ISS field is invalid.

With the RAS Extensions we have a mechanism to specify this value, and the
most significant bit has a new meaning: 'IDS - Implementation Defined
Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
instead of 'no valid ISS'.

Add KVM support for the VSESR_EL2 register to specify an ESR value when
HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
specify an implementation-defined value.

We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM
save/restores this bit during __deactivate_traps() and hardware clears the
bit once the guest has consumed the virtual-SError.

Future patches may add an API (or KVM CAP) to pend a virtual SError with
a specified ESR.

Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  3 +++
 arch/arm64/include/asm/sysreg.h      |  1 +
 arch/arm64/kvm/hyp/switch.c          |  4 ++++
 arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
 5 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index e5df3fce0008..8a7a838eb17a 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
 	vcpu->arch.hcr_el2 = hcr;
 }
 
+static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
+{
+	vcpu->arch.vsesr_el2 = vsesr;
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
 	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d3eb79a9ed6b..0af35e71fedb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -277,6 +277,9 @@ struct kvm_vcpu_arch {
 
 	/* Detect first run of a vcpu */
 	bool has_run_once;
+
+	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
+	u64 vsesr_el2;
 };
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 427c36bc5dd6..a493e93de296 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -253,6 +253,7 @@
 
 #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
 #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
+#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c641c4..af37658223a0 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 		isb();
 	}
 	write_sysreg(val, hcr_el2);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
+		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
 	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
 	write_sysreg(1 << 15, hstr_el2);
 	/*
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index da6a8cfa54a0..52f7f66f1356 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
+static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+{
+	vcpu_set_vsesr(vcpu, esr);
+	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+}
+
 /**
  * kvm_inject_vabt - inject an async abort / SError into the guest
  * @vcpu: The VCPU to receive the exception
  *
  * It is assumed that this code is called from the VCPU thread and that the
  * VCPU therefore is not currently executing guest code.
+ *
+ * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
+ * the remaining ISS all-zeros so that this error is not interpreted as an
+ * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR
+ * value, so the CPU generates an imp-def value.
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+	pend_guest_serror(vcpu, ESR_ELx_ISV);
 }
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
@ 2017-10-13  9:25   ` gengdongjiu
  2017-10-13 16:53     ` James Morse
  0 siblings, 1 reply; 8+ messages in thread
From: gengdongjiu @ 2017-10-13  9:25 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Hi James,

   After checking this patch, I think my patch[1] already include this logic(only a little difference). In my first version patch [2], It sets the virtual ESR in the KVM, but Marc and
other people disagree that[3][4],and propose to set its value and injection by userspace(when RAS is enabled). If you think we also need to support injection by KVM, I can extend my
patch to support that(but I think we should not support according previous review comments). So I think we no need to submit another patch, it will be duplicated, and waste our review
time. thank you very much. I will combine that.

[1] https://lkml.org/lkml/2017/8/28/497
[2] https://patchwork.kernel.org/patch/9633105/
[3]https://lkml.org/lkml/2017/3/20/441
[4]https://lkml.org/lkml/2017/3/20/516


On 2017/10/6 3:18, James Morse wrote:
> Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
> generated an SError with an implementation defined ESR_EL1.ISS, because we
> had no mechanism to specify the ESR value.
> 
> On Juno this generates an all-zero ESR, the most significant bit 'ISV'
> is clear indicating the remainder of the ISS field is invalid.
> 
> With the RAS Extensions we have a mechanism to specify this value, and the
> most significant bit has a new meaning: 'IDS - Implementation Defined
> Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
> instead of 'no valid ISS'.
> 
> Add KVM support for the VSESR_EL2 register to specify an ESR value when
> HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
> specify an implementation-defined value.
> 
> We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM
> save/restores this bit during __deactivate_traps() and hardware clears the
> bit once the guest has consumed the virtual-SError.
> 
> Future patches may add an API (or KVM CAP) to pend a virtual SError with
> a specified ESR.
> 
> Cc: Dongjiu Geng <gengdongjiu@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++++
>  arch/arm64/include/asm/kvm_host.h    |  3 +++
>  arch/arm64/include/asm/sysreg.h      |  1 +
>  arch/arm64/kvm/hyp/switch.c          |  4 ++++
>  arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
>  5 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index e5df3fce0008..8a7a838eb17a 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
>  	vcpu->arch.hcr_el2 = hcr;
>  }
>  
> +static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
> +{
> +	vcpu->arch.vsesr_el2 = vsesr;
> +}
> +
>  static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
>  {
>  	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d3eb79a9ed6b..0af35e71fedb 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -277,6 +277,9 @@ struct kvm_vcpu_arch {
>  
>  	/* Detect first run of a vcpu */
>  	bool has_run_once;
> +
> +	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
> +	u64 vsesr_el2;
>  };
>  
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 427c36bc5dd6..a493e93de296 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -253,6 +253,7 @@
>  
>  #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
>  #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
> +#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
>  #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
>  
>  #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 945e79c641c4..af37658223a0 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
>  		isb();
>  	}
>  	write_sysreg(val, hcr_el2);
> +
> +	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
> +		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
> +
>  	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
>  	write_sysreg(1 << 15, hstr_el2);
>  	/*
> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index da6a8cfa54a0..52f7f66f1356 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>  		inject_undef64(vcpu);
>  }
>  
> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +{
> +	vcpu_set_vsesr(vcpu, esr);
> +	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +}
> +
>  /**
>   * kvm_inject_vabt - inject an async abort / SError into the guest
>   * @vcpu: The VCPU to receive the exception
>   *
>   * It is assumed that this code is called from the VCPU thread and that the
>   * VCPU therefore is not currently executing guest code.
> + *
> + * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
> + * the remaining ISS all-zeros so that this error is not interpreted as an
> + * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR
> + * value, so the CPU generates an imp-def value.
>   */
>  void kvm_inject_vabt(struct kvm_vcpu *vcpu)
>  {
> -	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +	pend_guest_serror(vcpu, ESR_ELx_ISV);
>  }
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-13  9:25   ` gengdongjiu
@ 2017-10-13 16:53     ` James Morse
  0 siblings, 0 replies; 8+ messages in thread
From: James Morse @ 2017-10-13 16:53 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, linux-arm-kernel, kvmarm

Hi gengdongjiu,

On 13/10/17 10:25, gengdongjiu wrote:
> After checking this patch, I think my patch[1] already include this logic(only a little
> difference).

Your kvm_handle_guest_sei() is similar to where this series ends up, but the
purpose of this patch is to keep KVMs existing behaviour.

KVM already injects SError into the guest all by itself, now with the RAS
extensions it can specify and ESR, and because of the new ESR encoding it can't
use the reset value of all-zeroes.


> In my first version patch [2], It sets the virtual ESR in the KVM, but Marc and
> other people disagree that[3][4],and propose to set its value and injection by userspace(when
> RAS is enabled). 

Not quite: for RAS errors.
When we want to hand a RAS error to a guest, Qemu should be driving that.

What about impdef SError? Qemu should be able to drive that with the same API.

What about this nasty corner where KVM already injects an impdef SError
directly? This patch keeps that working.


I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But what do we
replace it with? It needs to be a guest exit type that existing software can't
ignore...

(The best I can suggest is: Once we have a mechanism to inject SError into a
guest from Qemu, KVM could make an impdef SError pending, then give Qemu the
opportunity to kill the guest, or set a different ESR. Existing software can
ignore the exit, and take the existing behaviour.)


> So I think we no need to submit another patch, it will be duplicated, and waste our review
> time. thank you very much. I will combine that.

I agree we're posting competing series, there was some off-list co-ordination on
this with Xie XiuQi and Xiongfeng Wang in ~may, it looks like you weren't
involved at that point.

In your last series touching all this:
https://lkml.org/lkml/2017/8/31/698

You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework
underneath it. Applied like this SError is still always masked in the kernel, so
any system without firmware-first will silently consume and discard an
uncontained-RAS-error using the esb() in __switch_to(). We can't do this, hence
the first half of this series.


James

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
@ 2017-10-15 16:09 gengdongjiu
  2017-10-16  3:17 ` gengdongjiu
  2017-10-16 17:09 ` James Morse
  0 siblings, 2 replies; 8+ messages in thread
From: gengdongjiu @ 2017-10-15 16:09 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang@cavium.com, Wuquanming, Marc Zyngier,
	Catalin Marinas, Will Deacon, wangxiongfeng (C), Huangshaoyu,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu

Hi James,
  Thanks for your mail.

> Hi gengdongjiu,
> 
> On 13/10/17 10:25, gengdongjiu wrote:
> > After checking this patch, I think my patch[1] already include this
> > logic(only a little difference).
> 
> Your kvm_handle_guest_sei() is similar to where this series ends up, but the purpose of this patch is to keep KVMs existing behaviour.
> 
> KVM already injects SError into the guest all by itself, now with the RAS extensions it can specify and ESR, and because of the new ESR
> encoding it can't use the reset value of all-zeroes.


Understand it, thanks for your explanation.

> 
> 
> > In my first version patch [2], It sets the virtual ESR in the KVM, but
> > Marc and other people disagree that[3][4],and propose to set its value
> > and injection by userspace(when RAS is enabled).
> 
> Not quite: for RAS errors.
> When we want to hand a RAS error to a guest, Qemu should be driving that.
> 
> What about impdef SError? Qemu should be able to drive that with the same API.
> 
> What about this nasty corner where KVM already injects an impdef SError directly? This patch keeps that working.
> 
> 
> I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But what do we replace it with? It needs to be a guest exit type that existing
> software can't ignore...
> 
> (The best I can suggest is: Once we have a mechanism to inject SError into a guest from Qemu, KVM could make an impdef SError pending,
> then give Qemu the opportunity to kill the guest, or set a different ESR. Existing software can ignore the exit, and take the existing
> behaviour.)

In fact I have below method for that, what do you think about that?

1.  If there is no RAS, old method, directly inject virtual SError, not need to specify ESR, as shown in the [1]
2.  If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle the SError abort
   A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user space specify an implementation-defined value, as shown [2]
   B. If ESR_EL2 is categorized and error not propagated,  the error come from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space specify a recoverable ESR. 
     Here one side calling memory failure, another side let user pace inject SError. Because usually SEI notification does not deliver SIGBUS signal to user space, so here inject virtual SEI to ensure that. As shown [3]
   C. If ESR_EL2 is categorized and error not propagated,  the error come from guest kernel, return "-1" to terminate guest. As shown [4]
   D. Otherwise, Panic host OS. As shown [5]


static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
       unsigned int esr = kvm_vcpu_get_hsr(vcpu);
       bool impdef_syndrome =  esr & ESR_ELx_ISV;      /* aka IDS */
       unsigned int aet = esr & ESR_ELx_AET;

      if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))                      [1]
               kvm_inject_vabt(vcpu);
               return 1;
       }

       kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
	   kvm_run->ex.exception = KVM_EXCEPTION_SERROR;

	   If (impdef_syndrome || ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
			kvm_run->ex.error_code = ESR_ELx_ISV;
		}

       switch (aet) {
       case ESR_ELx_AET_CE:    /* corrected error */                            [2]
       case ESR_ELx_AET_UEO:   /* restartable error, not yet consumed */
			   kvm_run->ex.error_code = (ESR_ELx_AET_UC | ESR_ELx_FSC_SERROR );   
               break;      /* continue processing the guest exit */


       case ESR_ELx_AET_UEU:  /* The error has not been propagated */            [3]
	   case ESR_ELx_AET_UER:
       /*
        * Only handle the guest user mode SEI if the error has not been propagated
        */
       if ((!vcpu_mode_priv(vcpu)) && !handle_guest_sei(kvm_vcpu_get_hsr(vcpu)))
			kvm_run->ex.error_code = (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR );
      		break;
       else
			return -1;                                                    [4]
       /* If SError handling is failed, continue run */
       default:                                                           [5]
               /*
                * Until now, the CPU supports RAS and SEI is fatal, or user space
                * does not support to handle the SError.
                */
               panic("This Asynchronous SError interrupt is dangerous, panic");
       }

	   return 0;
}

> 
> > So I think we no need to submit another patch, it will be duplicated,
> > and waste our review time. thank you very much. I will combine that.
> 
> I agree we're posting competing series, there was some off-list co-ordination on this with Xie XiuQi and Xiongfeng Wang in ~may, it looks
> like you weren't involved at that point.
  
Thanks very much for your agreement, I will add you to the off-list.

> 
> In your last series touching all this:
> https://lkml.org/lkml/2017/8/31/698
> 
> You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework underneath it. Applied like this SError is still always masked
> in the kernel, so any system without firmware-first will silently consume and discard an uncontained-RAS-error using the esb() in
> __switch_to(). We can't do this, hence the first half of this series.


Yes, seems I lost your SError rework series patches. When my patch update and modification almost done, hope we can combine to one series. thanks

> 
> 
> James

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-15 16:09 [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu
@ 2017-10-16  3:17 ` gengdongjiu
  2017-10-16 17:02   ` James Morse
  2017-10-16 17:09 ` James Morse
  1 sibling, 1 reply; 8+ messages in thread
From: gengdongjiu @ 2017-10-16  3:17 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang@cavium.com, Wuquanming, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm@lists.cs.columbia.edu,
	Huangshaoyu, wangxiongfeng (C),
	linux-arm-kernel@lists.infradead.org

> In fact I have below method for that, what do you think about that?
>
> 1.  If there is no RAS, old method, directly inject virtual SError, not need to specify ESR, as shown in the [1]
> 2.  If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle the SError abort
>    A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user space specify an implementation-defined value, as shown [2]
>    B. If ESR_EL2 is categorized and error not propagated,  the error come from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space specify a recoverable ESR.
>      Here one side calling memory failure, another side let user pace inject SError. Because usually SEI notification does not deliver SIGBUS signal to user space, so here inject virtual SEI to ensure that. As shown [3]
>    C. If ESR_EL2 is categorized and error not propagated,  the error come from guest kernel, return "-1" to terminate guest. As shown [4]
>    D. Otherwise, Panic host OS. As shown [5]
>

For the IMPLEMENTATION ESR, for the safety purposes, I have below suggestion

1. When there is no guest,  host OS receives IMPLEMENTATION ESR, I
hope host can be panic, because we do not know its implementation
meaning.
2. when guest received MPLEMENTATION  ESR, and the Error is isolated
to guest by "ESB" instruction, not propagate to host. I hope guest
will exit, but host not panic.
3. when guest received MPLEMENTATION  ESR, and the Error is propagate
to host, I hope host can be panic.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-16  3:17 ` gengdongjiu
@ 2017-10-16 17:02   ` James Morse
  0 siblings, 0 replies; 8+ messages in thread
From: James Morse @ 2017-10-16 17:02 UTC (permalink / raw)
  To: gengdongjiu, gengdongjiu
  Cc: Jonathan.Zhang@cavium.com, Wuquanming, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm@lists.cs.columbia.edu,
	Huangshaoyu, wangxiongfeng (C),
	linux-arm-kernel@lists.infradead.org

Hi gengdongjiu,

On 16/10/17 04:17, gengdongjiu wrote:
>> In fact I have below method for that, what do you think about that?
>>
>> 1.  If there is no RAS, old method, directly inject virtual SError, not need to specify ESR, as shown in the [1]
>> 2.  If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle the SError abort
>>    A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user space specify an implementation-defined value, as shown [2]
>>    B. If ESR_EL2 is categorized and error not propagated,  the error come from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space specify a recoverable ESR.
>>      Here one side calling memory failure, another side let user pace inject SError. Because usually SEI notification does not deliver SIGBUS signal to user space, so here inject virtual SEI to ensure that. As shown [3]
>>    C. If ESR_EL2 is categorized and error not propagated,  the error come from guest kernel, return "-1" to terminate guest. As shown [4]
>>    D. Otherwise, Panic host OS. As shown [5]
>>
> 
> For the IMPLEMENTATION ESR, for the safety purposes, I have below suggestion
> 
> 1. When there is no guest,  host OS receives IMPLEMENTATION ESR, I
> hope host can be panic, because we do not know its implementation
> meaning.

> 2. when guest received MPLEMENTATION  ESR, and the Error is isolated
> to guest by "ESB" instruction, not propagate to host. I hope guest
> will exit, but host not panic.

How do we know if impdef SError values are contained by ESB?

'2.4.3 ESB and other physical errors' of [0]:
> It is IMPLEMENTATION DEFINED whether IMPLEMENTATION DEFINED and uncategorized
> SError interrupts are containable or Uncontainable, and whether they can be
> synchronized by an Error Synchronization Barrier.


> 3. when guest received MPLEMENTATION  ESR, and the Error is propagate
> to host, I hope host can be panic.


I tried to keep this behaviour 'the same' because I don't think there is a
'right thing' to do here:
If the SError is due to some catastrophic failure, we should panic the host. If
the SError is due to the guest poking a device we don't want to panic the host.

We can't tell these two cases apart.

We can spot RAS errors and handle those.


Thanks,

James

[0]
https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-15 16:09 [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu
  2017-10-16  3:17 ` gengdongjiu
@ 2017-10-16 17:09 ` James Morse
  1 sibling, 0 replies; 8+ messages in thread
From: James Morse @ 2017-10-16 17:09 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang@cavium.com, Wuquanming, Marc Zyngier,
	Catalin Marinas, Will Deacon, wangxiongfeng (C), Huangshaoyu,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu

Hi gengdongjiu,

On 15/10/17 17:09, gengdongjiu wrote:
>> On 13/10/17 10:25, gengdongjiu wrote:
>>> In my first version patch [2], It sets the virtual ESR in the KVM, but
>>> Marc and other people disagree that[3][4],and propose to set its value
>>> and injection by userspace(when RAS is enabled).
>>
>> Not quite: for RAS errors.
>> When we want to hand a RAS error to a guest, Qemu should be driving that.
>>
>> What about impdef SError? Qemu should be able to drive that with the same API.
>>
>> What about this nasty corner where KVM already injects an impdef SError directly? This patch keeps that working.
>>
>>
>> I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But what do we replace it with? It needs to be a guest exit type that existing
>> software can't ignore...
>>
>> (The best I can suggest is: Once we have a mechanism to inject SError into a guest from Qemu, KVM could make an impdef SError pending,
>> then give Qemu the opportunity to kill the guest, or set a different ESR. Existing software can ignore the exit, and take the existing
>> behaviour.)

> In fact I have below method for that, what do you think about that?
> 
> 1.  If there is no RAS, old method, directly inject virtual SError, not need to specify
> ESR, as shown in the [1]
> 2.  If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle
> the SError abort

Nope. There should not be a RAS-specific kvm exit type. What information do you
need to convey that doesn't apply to another user-space process?

Qemu/kvmtool will always need to generate a new RAS error from the user-space
signals they get as a result of the host handling the error. This way KVMs
user-space isn't a special case, and the user-space code is portable across
architectures.

This does mean the host kernel has to be able to handle any and all RAS errors
before they could possibly be presented to a guest. Today we only handle memory
errors. Any other error types will need adding in a way that works on all
architectures.

Again, you're passing RAS SErrors out to user-space. The SError may have nothing
to do with the guest. The host has to handle any and all RAS errors.


The only case where the guest is involved is if if the guest treads in some
poisoned memory. (You know this:)
The host has to do its RAS work first, to check this wasn't an error in the
hosts page tables and that the host really can keep running.
memory_failure() will cause the faulty memory to be unmapped from stage 2 and
signal Qemu/kvmtool.
If the guest touches the faulty memory (even from a different CPU/vcpu),
Qemu/kvmtool will get a signal.

Qemu/kvmtool should take these signals and generate any applicable RAS error
from these.

We have to use these signals so that KVM's user-space is the same as all other
user-space.


>    A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user
> space specify an implementation-defined value, as shown [2]

I agree impdef SError are a different case. It doesn't look like you are passing
the impdef ESR to user-space, which I think is correct.

Passing uncategorized errors like this isn't correct, these are still RAS
errors. '2.4.3 ESB and other physical errors' in [0] says its
implementation-defined whether these Uncategorized RAS errors are uncontainable,
we should assume they weren't contained, and panic(). (If the CPU wants us to do
a better job, it needs to give us some information).

I don't think there is any point generating a fake ESR.
I need to think about using KVM_EXIT_EXCEPTION, but the exit code should mean
'run this again and I'll give it an SError'.  This lets Qemu/kvmtool set its own
impdef SError or emulate a machine that doesn't generate SError.

I think this needs more discussion, do we want to let Qemu/kvmtool reset an
affected vcpu so that its effectively running in firmware and can be brought
back online again?


> If ESR_EL2 is categorized and error not propagated,  the error come from guest user
> space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space
> specify a recoverable ESR.

No, these are RAS errors, they should be handled by the kernel, not passed to
user-space, and not both.

If user-space needs to know, it will be informed via the usual way we tell
user-space about this stuff.


>      Here one side calling memory failure, another side let user pace inject SError.
> Because usually SEI notification does not deliver SIGBUS signal to user space, so
> here inject virtual SEI to ensure that. As shown [3]

For a RAS error we process the CPER records, and if they describe a memory error
we send signals to user-space.

Is there another type of RAS error we need to support for the guest? We must
support this on the host first.


>    C. If ESR_EL2 is categorized and error not propagated,  the error come from guest kernel,
>  return "-1" to terminate guest. As shown [4]


>    D. Otherwise, Panic host OS. As shown [5]

I don't think KVM should go calling panic(), this needs to be wrapped up the
arch's RAS code.


> static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
>        unsigned int esr = kvm_vcpu_get_hsr(vcpu);
>        bool impdef_syndrome =  esr & ESR_ELx_ISV;      /* aka IDS */
>        unsigned int aet = esr & ESR_ELx_AET;
> 
>       if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))                      [1]
>                kvm_inject_vabt(vcpu);
>                return 1;
>        }
> 
>        kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
> 	   kvm_run->ex.exception = KVM_EXCEPTION_SERROR;
> 
> 	   If (impdef_syndrome || ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
> 			kvm_run->ex.error_code = ESR_ELx_ISV;
> 		}
> 
>        switch (aet) {
>        case ESR_ELx_AET_CE:    /* corrected error */                            [2]
>        case ESR_ELx_AET_UEO:   /* restartable error, not yet consumed */
> 			   kvm_run->ex.error_code = (ESR_ELx_AET_UC | ESR_ELx_FSC_SERROR );   
>                break;      /* continue processing the guest exit */
> 
> 
>        case ESR_ELx_AET_UEU:  /* The error has not been propagated */            [3]
> 	   case ESR_ELx_AET_UER:
>        /*
>         * Only handle the guest user mode SEI if the error has not been propagated
>         */
>        if ((!vcpu_mode_priv(vcpu)) && !handle_guest_sei(kvm_vcpu_get_hsr(vcpu)))
> 			kvm_run->ex.error_code = (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR );

Why should KVM care whether it was guest EL0 or guest EL1? Something is designed
wrong if you need to behave differently here.

(as far as I can see KVMs existing use of this vcpu_mode_priv() helper is to
emulate bits of the architecture that behave/undef differently at different
exception levels)


>       		break;
>        else
> 			return -1;                                                    [4]
>        /* If SError handling is failed, continue run */
>        default:                                                           [5]
>                /*
>                 * Until now, the CPU supports RAS and SEI is fatal, or user space
>                 * does not support to handle the SError.
>                 */
>                panic("This Asynchronous SError interrupt is dangerous, panic");
>        }
> 
> 	   return 0;
> }
> 
>>
>>> So I think we no need to submit another patch, it will be duplicated,
>>> and waste our review time. thank you very much. I will combine that.
>>
>> I agree we're posting competing series, there was some off-list co-ordination on this with Xie XiuQi and Xiongfeng Wang in ~may, it looks
>> like you weren't involved at that point.
>   
> Thanks very much for your agreement, I will add you to the off-list.

There are two sets of off-list co-ordination? That explains something...


>> In your last series touching all this:
>> https://lkml.org/lkml/2017/8/31/698
>>
>> You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework underneath it. Applied like this SError is still always masked
>> in the kernel, so any system without firmware-first will silently consume and discard an uncontained-RAS-error using the esb() in
>> __switch_to(). We can't do this, hence the first half of this series.
> 
> 
> Yes, seems I lost your SError rework series patches. When my patch update and modification
> almost done, hope we can combine to one series. thanks

I'd like to try and get this series merged as it is, then we can work on the KVM
and APEI bits as separate series on top. (otherwise we still depend on getting
this merged so we can have code that depends on ARM64_HAS_RAS_EXTENSIONS).


For KVM its:
* Allowing guests to inject SError.
* Allow Qemu/kvmtool to do something if the guest trips a non-RAS SError.
* Expose enough information to allow user-space to promote SIGBUS_MCEERR_AR to
  external-abort.
* (Handle SError during world-switch)

For APEI its:
* Allow multiple sources of NMI
* Abstract the estatus cache to use for SEI/SDEI
* Increase the priority of memory_failure_queue()s work,
* Provide a way to block ret_to-user to memory_failure work is done.

(These last two are the problem Xie XiuQi found).


Thanks,

James

[0]
https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
@ 2017-10-19  5:06 gengdongjiu
  0 siblings, 0 replies; 8+ messages in thread
From: gengdongjiu @ 2017-10-19  5:06 UTC (permalink / raw)
  To: 'James Morse'
  Cc: Jonathan.Zhang@cavium.com, Wuquanming, Marc Zyngier,
	Catalin Marinas, Will Deacon, wangxiongfeng (C), Huangshaoyu,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu

Hi James,
   Thanks for your mail. and very sorry for my late response.

> 
> Hi gengdongjiu,
> 
> On 15/10/17 17:09, gengdongjiu wrote:
> >> On 13/10/17 10:25, gengdongjiu wrote:
> >>> In my first version patch [2], It sets the virtual ESR in the KVM,
> >>> but Marc and other people disagree that[3][4],and propose to set its
> >>> value and injection by userspace(when RAS is enabled).
> >>
> >> Not quite: for RAS errors.
> >> When we want to hand a RAS error to a guest, Qemu should be driving that.
> >>
> >> What about impdef SError? Qemu should be able to drive that with the same API.
> >>
> >> What about this nasty corner where KVM already injects an impdef SError directly? This patch keeps that working.
> >>
> >>
> >> I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But
> >> what do we replace it with? It needs to be a guest exit type that existing software can't ignore...
> >>
> >> (The best I can suggest is: Once we have a mechanism to inject SError
> >> into a guest from Qemu, KVM could make an impdef SError pending, then
> >> give Qemu the opportunity to kill the guest, or set a different ESR.
> >> Existing software can ignore the exit, and take the existing
> >> behaviour.)
> 
> > In fact I have below method for that, what do you think about that?
> >
> > 1.  If there is no RAS, old method, directly inject virtual SError,
> > not need to specify ESR, as shown in the [1] 2.  If there is RAS, KVM
> > set "the kvm_run" guest exit type value to let user space handle the
> > SError abort
> 
> Nope. There should not be a RAS-specific kvm exit type. What information do you need to convey that doesn't apply to another user-space
> process?

	run->exit_reason = KVM_EXIT_EXCEPTION;
	run->ex.exception = ARM_EXCEPTION_EL1_SERROR;
	run->ex.error_code = xxxxxxx;
    
    How about above ones?  KVM_EXIT_EXCEPTION and ARM_EXCEPTION_EL1_SERROR are also used by system without RAS, which should
    Not be a RAS-specific kvm exit type.
    User space can choose to handle it or not handle it. If not, userspace will do nothing and not apply to another user-space process.

> 
> Qemu/kvmtool will always need to generate a new RAS error from the user-space signals they get as a result of the host handling the error.
> This way KVMs user-space isn't a special case, and the user-space code is portable across architectures.


Yes, when get user-space signals, I will record them to CPER and report it to guest OS.

> 
> This does mean the host kernel has to be able to handle any and all RAS errors before they could possibly be presented to a guest. Today we
> only handle memory errors. Any other error types will need adding in a way that works on all architectures.

Yes, only memory section error will deliver SIGBUS to guest from currently GHES driver code, not include PCIE AER.

> 
> Again, you're passing RAS SErrors out to user-space. The SError may have nothing to do with the guest. The host has to handle any and all
> RAS errors.

If the SError is generated by guest OS, it should be related with guest. If the SError is generated by host EL2, it have nothing to do with 
the guest.
For this path handle_exit(), it handles the low exception level trap(guest EL0 or guest EL1), what is your meaning that "the SError may have nothing to do with the guest"?

As you mentioned "you're passing RAS SErrors out to user-space", do you mean RAS SErrors is ESR value here?

If so, we can re-encoding the ESR to others, such as below.
enum {
	RECOVERABLE = 1,
	FATAL,
	CORRECTED,
};

Here return, only let userspace specify a valid ESR for guest, and driver to inject a virtual SError, nothing else.



> 
> 
> The only case where the guest is involved is if if the guest treads in some poisoned memory. (You know this:) The host has to do its RAS work
> first, to check this wasn't an error in the hosts page tables and that the host really can keep running.
> memory_failure() will cause the faulty memory to be unmapped from stage 2 and signal Qemu/kvmtool.
> If the guest touches the faulty memory (even from a different CPU/vcpu), Qemu/kvmtool will get a signal.

In my code, I have followed above design. As I said above, here I only let userspace specify a valid ESR for guest and driver to inject a virtual SError for some scene , nothing else, which does not violate above design.
How could handle the case that guest touch the fault memory without recording the address by firmware(because the address is not accurate)?
For this case, there is no chance to deliver the SIGBUS even though it is guest memory fault. 
So for this situation, I will let user space specify a valid ESR and drive to inject a virtual SError.

> 
> Qemu/kvmtool should take these signals and generate any applicable RAS error from these.
> 
> We have to use these signals so that KVM's user-space is the same as all other user-space.

Yes, we should.

> 
> 
> >    A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return "
> > ESR_ELx_ISV " to let user space specify an implementation-defined
> > value, as shown [2]
> 
> I agree impdef SError are a different case. It doesn't look like you are passing the impdef ESR to user-space, which I think is correct.
> 
> Passing uncategorized errors like this isn't correct, these are still RAS errors. '2.4.3 ESB and other physical errors' in [0] says its
> implementation-defined whether these Uncategorized RAS errors are uncontainable, we should assume they weren't contained, and
> panic(). (If the CPU wants us to do a better job, it needs to give us some information).

Ok, I think it is good to panic for implementation-defined ESR.


> 
> I don't think there is any point generating a fake ESR.
> I need to think about using KVM_EXIT_EXCEPTION, but the exit code should mean 'run this again and I'll give it an SError'.  This lets
> Qemu/kvmtool set its own impdef SError or emulate a machine that doesn't generate SError.
> 
> I think this needs more discussion, do we want to let Qemu/kvmtool reset an affected vcpu so that its effectively running in firmware and
> can be brought back online again?

How do we ensure the poison data does not consumed by other vcpu? Only reset CPU may be not enough, because we are not sure whether the error data have propagated to
Other VCPU.

> 
> 
> > If ESR_EL2 is categorized and error not propagated,  the error come
> > from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR
> > " to let user space specify a recoverable ESR.
> 
> No, these are RAS errors, they should be handled by the kernel, not passed to user-space, and not both.
> 
> If user-space needs to know, it will be informed via the usual way we tell user-space about this stuff.

Here only let user-space specify a valid ESR and drive to inject SEI, nothing else. maybe we can re-encoding the return value(such as RECOVERABLE)?


> 
> >      Here one side calling memory failure, another side let user pace inject SError.
> > Because usually SEI notification does not deliver SIGBUS signal to
> > user space, so here inject virtual SEI to ensure that. As shown [3]
> 
> For a RAS error we process the CPER records, and if they describe a memory error we send signals to user-space.

For the SEI, usually the recorded address is invalid, so it cannot send signal to user-space(now when only the address is valid it, it will call the memory_failure())

static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
{
		...................................
        if (!pfn_valid(pfn)) {
                pr_warn_ratelimited(FW_WARN GHES_PFX
                "Invalid address in generic error data: %#llx\n",
                mem_err->physical_addr);
                return;
        }
		...................
}


> 
> Is there another type of RAS error we need to support for the guest? We must support this on the host first.

I think the two types "ESR_ELx_AET_UEU and ESR_ELx_AET_UER" need to be supported by guest, because these types of errors are not propagated


> 
> 
> >    C. If ESR_EL2 is categorized and error not propagated,  the error
> > come from guest kernel,  return "-1" to terminate guest. As shown [4]
> 
> 
> >    D. Otherwise, Panic host OS. As shown [5]
> 
> I don't think KVM should go calling panic(), this needs to be wrapped up the arch's RAS code.

I think it is not important(both can OK), as we can see, in the current arch KVM code, it already calling the panic().

void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
{

        for (num = 1; num < NR_SYS_REGS; num++)
                if (vcpu_sys_reg(vcpu, num) == 0x4242424242424242)
                        panic("Didn't reset vcpu_sys_reg(%zi)", num);
}


> 
> > static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run
> > *run) {
> >        unsigned int esr = kvm_vcpu_get_hsr(vcpu);
> >        bool impdef_syndrome =  esr & ESR_ELx_ISV;      /* aka IDS */
> >        unsigned int aet = esr & ESR_ELx_AET;
> >
> >       if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))                      [1]
> >                kvm_inject_vabt(vcpu);
> >                return 1;
> >        }
> >
> >        kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
> > 	   kvm_run->ex.exception = KVM_EXCEPTION_SERROR;
> >
> > 	   If (impdef_syndrome || ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
> > 			kvm_run->ex.error_code = ESR_ELx_ISV;
> > 		}
> >
> >        switch (aet) {
> >        case ESR_ELx_AET_CE:    /* corrected error */                            [2]
> >        case ESR_ELx_AET_UEO:   /* restartable error, not yet consumed */
> > 			   kvm_run->ex.error_code = (ESR_ELx_AET_UC | ESR_ELx_FSC_SERROR );
> >                break;      /* continue processing the guest exit */
> >
> >
> >        case ESR_ELx_AET_UEU:  /* The error has not been propagated */            [3]
> > 	   case ESR_ELx_AET_UER:
> >        /*
> >         * Only handle the guest user mode SEI if the error has not been propagated
> >         */
> >        if ((!vcpu_mode_priv(vcpu)) && !handle_guest_sei(kvm_vcpu_get_hsr(vcpu)))
> > 			kvm_run->ex.error_code = (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR );
> 
> Why should KVM care whether it was guest EL0 or guest EL1? Something is designed wrong if you need to behave differently here.
> 
> (as far as I can see KVMs existing use of this vcpu_mode_priv() helper is to emulate bits of the architecture that behave/undef differently at
> different exception levels)

   Because For the guest EL1 SError, it should be panic even though guest handle it.

> 
> 
> >       		break;
> >        else
> > 			return -1;                                                    [4]
> >        /* If SError handling is failed, continue run */
> >        default:                                                           [5]
> >                /*
> >                 * Until now, the CPU supports RAS and SEI is fatal, or user space
> >                 * does not support to handle the SError.
> >                 */
> >                panic("This Asynchronous SError interrupt is dangerous, panic");
> >        }
> >
> > 	   return 0;
> > }
> >
> >>
> >>> So I think we no need to submit another patch, it will be
> >>> duplicated, and waste our review time. thank you very much. I will combine that.
> >>
> >> I agree we're posting competing series, there was some off-list
> >> co-ordination on this with Xie XiuQi and Xiongfeng Wang in ~may, it looks like you weren't involved at that point.
> >
> > Thanks very much for your agreement, I will add you to the off-list.
> 
> There are two sets of off-list co-ordination? That explains something...

I combined our patches to one to save reviewer's time, considering I have followed it from beginning for a long time. thanks a lot.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
[Set an impdef ESR for Virtual-SError]
Signed-off-by: James Morse <james.morse@arm.com>

> 
> 
> >> In your last series touching all this:
> >> https://lkml.org/lkml/2017/8/31/698
> >>
> >> You had Xie XiuQi's RAS-cpufeature patch in isolation, without the
> >> SError rework underneath it. Applied like this SError is still always
> >> masked in the kernel, so any system without firmware-first will silently consume and discard an uncontained-RAS-error using the esb() in
> __switch_to(). We can't do this, hence the first half of this series.
> >
> >
> > Yes, seems I lost your SError rework series patches. When my patch
> > update and modification almost done, hope we can combine to one
> > series. thanks
> 
> I'd like to try and get this series merged as it is, then we can work on the KVM and APEI bits as separate series on top. (otherwise we still
> depend on getting this merged so we can have code that depends on ARM64_HAS_RAS_EXTENSIONS).

OK.

> 
> 
> For KVM its:
> * Allowing guests to inject SError.
> * Allow Qemu/kvmtool to do something if the guest trips a non-RAS SError.
> * Expose enough information to allow user-space to promote SIGBUS_MCEERR_AR to
>   external-abort.
> * (Handle SError during world-switch)
> 
> For APEI its:
> * Allow multiple sources of NMI
> * Abstract the estatus cache to use for SEI/SDEI
> * Increase the priority of memory_failure_queue()s work,
> * Provide a way to block ret_to-user to memory_failure work is done.
> 
> (These last two are the problem Xie XiuQi found).
> 
> 
> Thanks,
> 
> James
> 
> [0]
> https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-10-19  5:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-15 16:09 [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu
2017-10-16  3:17 ` gengdongjiu
2017-10-16 17:02   ` James Morse
2017-10-16 17:09 ` James Morse
  -- strict thread matches above, loose matches on Subject: below --
2017-10-19  5:06 gengdongjiu
2017-10-05 19:17 [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support James Morse
2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
2017-10-13  9:25   ` gengdongjiu
2017-10-13 16:53     ` James Morse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox