* [PATCH 2/5] Xen/MCE: vMCE injection
@ 2012-09-18 13:14 Liu, Jinsong
  0 siblings, 0 replies; 7+ messages in thread
From: Liu, Jinsong @ 2012-09-18 13:14 UTC (permalink / raw)
  To: Jan Beulich, xen-devel@lists.xensource.com
  Cc: keir@xen.org, Ian.Campbell@citrix.com

Xen/MCE: vMCE injection

While testing MCE handling in a Win8 guest, we found a bug: whatever SRAO/SRAR
error Xen injects into the guest, the guest always reboots.

The root cause is that the current Xen vMCE logic injects vMCE# only into vcpu0,
which is not correct for Intel MCE (on Intel hardware, MCE# is delivered to all
CPUs).

This patch fixes the bug by injecting vMCE# into all vcpus.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>

diff -r 133664c6bfb4 xen/arch/x86/cpu/mcheck/vmce.c
--- a/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 22:39:11 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 23:46:38 2012 +0800
@@ -340,48 +340,27 @@
 
 int inject_vmce(struct domain *d)
 {
-    int cpu = smp_processor_id();
+    struct vcpu *v;
 
-    /* PV guest and HVM guest have different vMCE# injection methods. */
-    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
+    /* inject vMCE to all vcpus */
+    for_each_vcpu(d, v)
     {
-        if ( d->is_hvm )
+        if ( !test_and_set_bool(v->mce_pending) &&
+            ((d->is_hvm) ||
+                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
         {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
-                       d->domain_id);
-            vcpu_kick(d->vcpu[0]);
+            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
+                       d->domain_id, v->vcpu_id);
+            vcpu_kick(v);
         }
         else
         {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
-                       d->domain_id);
-            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
-            {
-                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
-                             d->vcpu[0]->cpu_affinity);
-                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
-                           cpu, d->vcpu[0]->processor);
-                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
-                vcpu_kick(d->vcpu[0]);
-            }
-            else
-            {
-                mce_printk(MCE_VERBOSE,
-                           "MCE: Kill PV guest with No MCE handler\n");
-                domain_crash(d);
-            }
+            mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
+                       d->domain_id, v->vcpu_id);
+            return -1;
         }
     }
-    else
-    {
-        /* new vMCE comes while first one has not been injected yet,
-         * in this case, inject fail. [We can't lose this vMCE for
-         * the mce node's consistency].
-         */
-        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
-                   " to this DOM%d!\n", d->domain_id);
-        return -1;
-    }
+
     return 0;
 }
 


* [PATCH 2/5] Xen/MCE: vMCE injection
@ 2012-09-19  8:03 Liu, Jinsong
  2012-09-20 15:23 ` Christoph Egger
  0 siblings, 1 reply; 7+ messages in thread
From: Liu, Jinsong @ 2012-09-19  8:03 UTC (permalink / raw)
  To: Jan Beulich, xen-devel@lists.xensource.com
  Cc: Christoph Egger, keir@xen.org, Ian.Campbell@citrix.com

Xen/MCE: vMCE injection

While testing MCE handling in a Win8 guest, we found a bug: whatever SRAO/SRAR
error Xen injects into the guest, the guest always reboots.

The root cause is that the current Xen vMCE logic injects vMCE# only into vcpu0,
which is not correct for Intel MCE (on Intel hardware, MCE# is delivered to all
CPUs).

This patch fixes the bug by injecting vMCE# into all vcpus.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>

diff -r 133664c6bfb4 xen/arch/x86/cpu/mcheck/vmce.c
--- a/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 22:39:11 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 23:46:38 2012 +0800
@@ -340,48 +340,27 @@
 
 int inject_vmce(struct domain *d)
 {
-    int cpu = smp_processor_id();
+    struct vcpu *v;
 
-    /* PV guest and HVM guest have different vMCE# injection methods. */
-    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
+    /* inject vMCE to all vcpus */
+    for_each_vcpu(d, v)
     {
-        if ( d->is_hvm )
+        if ( !test_and_set_bool(v->mce_pending) &&
+            ((d->is_hvm) ||
+                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
         {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
-                       d->domain_id);
-            vcpu_kick(d->vcpu[0]);
+            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
+                       d->domain_id, v->vcpu_id);
+            vcpu_kick(v);
         }
         else
         {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
-                       d->domain_id);
-            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
-            {
-                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
-                             d->vcpu[0]->cpu_affinity);
-                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
-                           cpu, d->vcpu[0]->processor);
-                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
-                vcpu_kick(d->vcpu[0]);
-            }
-            else
-            {
-                mce_printk(MCE_VERBOSE,
-                           "MCE: Kill PV guest with No MCE handler\n");
-                domain_crash(d);
-            }
+            mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
+                       d->domain_id, v->vcpu_id);
+            return -1;
         }
     }
-    else
-    {
-        /* new vMCE comes while first one has not been injected yet,
-         * in this case, inject fail. [We can't lose this vMCE for
-         * the mce node's consistency].
-         */
-        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
-                   " to this DOM%d!\n", d->domain_id);
-        return -1;
-    }
+
     return 0;
 }
 


* Re: [PATCH 2/5] Xen/MCE: vMCE injection
  2012-09-19  8:03 [PATCH 2/5] Xen/MCE: vMCE injection Liu, Jinsong
@ 2012-09-20 15:23 ` Christoph Egger
  2012-09-20 19:15   ` Liu, Jinsong
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Egger @ 2012-09-20 15:23 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: xen-devel@lists.xensource.com, keir@xen.org,
	Ian.Campbell@citrix.com, Jan Beulich

On 09/19/12 10:03, Liu, Jinsong wrote:

> Xen/MCE: vMCE injection
> 
> While testing MCE handling in a Win8 guest, we found a bug: whatever SRAO/SRAR
> error Xen injects into the guest, the guest always reboots.
> 
> The root cause is that the current Xen vMCE logic injects vMCE# only into vcpu0,
> which is not correct for Intel MCE (on Intel hardware, MCE# is delivered to all
> CPUs).
> 
> This patch fixes the bug by injecting vMCE# into all vcpus.


This breaks the AMD way. The AMD way is to only inject it to vcpu0.
I suggest to add a flag argument to inject_vmce() that says whether
to inject to all vcpus or just vcpu0.
Then set/clear that flag from the caller side depending on whether you
run on Intel or AMD.
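
A minimal sketch of what such a flag argument could look like, assuming
simplified stand-in structures rather than Xen's real struct domain and
struct vcpu. The names sketch_vcpu, sketch_domain and inject_vmce_flag are
hypothetical and do not appear in the patch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's struct vcpu and struct domain. */
struct sketch_vcpu {
    int  vcpu_id;
    bool mce_pending;
};

struct sketch_domain {
    struct sketch_vcpu *vcpus;   /* array of nr_vcpus entries */
    size_t nr_vcpus;
};

/* Mark one vcpu as having a pending vMCE; fail if one is already pending. */
static int mark_pending(struct sketch_vcpu *v)
{
    if (v->mce_pending)
        return -1;
    v->mce_pending = true;
    return 0;
}

/*
 * broadcast == true:  Intel-style, every vcpu gets the vMCE.
 * broadcast == false: AMD-style, only vcpu0 gets the vMCE.
 * Returns 0 on success, -1 if any targeted vcpu already has one pending.
 */
int inject_vmce_flag(struct sketch_domain *d, bool broadcast)
{
    size_t n;

    if (d->nr_vcpus == 0)
        return -1;

    n = broadcast ? d->nr_vcpus : 1;
    for (size_t i = 0; i < n; i++)
        if (mark_pending(&d->vcpus[i]) < 0)
            return -1;

    return 0;
}
```

The caller would then pass true on Intel and false on AMD; how the vendor is
detected is left out here on purpose.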

Christoph


> 
> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
> 
> diff -r 133664c6bfb4 xen/arch/x86/cpu/mcheck/vmce.c
> --- a/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 22:39:11 2012 +0800
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 23:46:38 2012 +0800
> @@ -340,48 +340,27 @@
>  
>  int inject_vmce(struct domain *d)
>  {
> -    int cpu = smp_processor_id();
> +    struct vcpu *v;
>  
> -    /* PV guest and HVM guest have different vMCE# injection methods. */
> -    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
> +    /* inject vMCE to all vcpus */
> +    for_each_vcpu(d, v)
>      {
> -        if ( d->is_hvm )
> +        if ( !test_and_set_bool(v->mce_pending) &&
> +            ((d->is_hvm) ||
> +                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
>          {
> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
> -                       d->domain_id);
> -            vcpu_kick(d->vcpu[0]);
> +            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
> +                       d->domain_id, v->vcpu_id);
> +            vcpu_kick(v);
>          }
>          else
>          {
> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
> -                       d->domain_id);
> -            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
> -            {
> -                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
> -                             d->vcpu[0]->cpu_affinity);
> -                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
> -                           cpu, d->vcpu[0]->processor);
> -                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
> -                vcpu_kick(d->vcpu[0]);
> -            }
> -            else
> -            {
> -                mce_printk(MCE_VERBOSE,
> -                           "MCE: Kill PV guest with No MCE handler\n");
> -                domain_crash(d);
> -            }
> +            mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
> +                       d->domain_id, v->vcpu_id);
> +            return -1;
>          }
>      }
> -    else
> -    {
> -        /* new vMCE comes while first one has not been injected yet,
> -         * in this case, inject fail. [We can't lose this vMCE for
> -         * the mce node's consistency].
> -         */
> -        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
> -                   " to this DOM%d!\n", d->domain_id);
> -        return -1;
> -    }
> +
>      return 0;
>  }
>  



-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632


* Re: [PATCH 2/5] Xen/MCE: vMCE injection
  2012-09-20 15:23 ` Christoph Egger
@ 2012-09-20 19:15   ` Liu, Jinsong
  2012-09-21  7:38     ` Christoph Egger
  0 siblings, 1 reply; 7+ messages in thread
From: Liu, Jinsong @ 2012-09-20 19:15 UTC (permalink / raw)
  To: Christoph Egger
  Cc: xen-devel@lists.xensource.com, keir@xen.org,
	Ian.Campbell@citrix.com, Jan Beulich

Christoph Egger wrote:
> On 09/19/12 10:03, Liu, Jinsong wrote:
> 
>> Xen/MCE: vMCE injection
>> 
>> While testing MCE handling in a Win8 guest, we found a bug: whatever
>> SRAO/SRAR error Xen injects into the guest, the guest always reboots.
>> 
>> The root cause is that the current Xen vMCE logic injects vMCE# only
>> into vcpu0, which is not correct for Intel MCE (on Intel hardware,
>> MCE# is delivered to all CPUs).
>> 
>> This patch fixes the bug by injecting vMCE# into all vcpus.
> 
> 
> This breaks the AMD way. The AMD way is to only inject it to vcpu0.
> I suggest to add a flag argument to inject_vmce() that says whether
> to inject to all vcpus or just vcpu0.
> Then set/clear that flag from the caller side depending on whether you
> run on Intel or AMD.
> 
> Christoph
> 

No, it doesn't break AMD, since inject_vmce() is only called by intel_memerr_dhandler().

Thanks,
Jinsong

> 
>> 
>> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
>> 
>> diff -r 133664c6bfb4 xen/arch/x86/cpu/mcheck/vmce.c
>> --- a/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 22:39:11 2012 +0800
>> +++ b/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 23:46:38 2012 +0800
>> @@ -340,48 +340,27 @@ 
>> 
>>  int inject_vmce(struct domain *d)
>>  {
>> -    int cpu = smp_processor_id();
>> +    struct vcpu *v;
>> 
>> -    /* PV guest and HVM guest have different vMCE# injection methods. */
>> -    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
>> +    /* inject vMCE to all vcpus */
>> +    for_each_vcpu(d, v)
>>      {
>> -        if ( d->is_hvm )
>> +        if ( !test_and_set_bool(v->mce_pending) &&
>> +            ((d->is_hvm) ||
>> +                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
>>          {
>> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
>> -                       d->domain_id);
>> -            vcpu_kick(d->vcpu[0]);
>> +            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
>> +                       d->domain_id, v->vcpu_id);
>> +            vcpu_kick(v);
>>          }
>>          else
>>          {
>> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
>> -                       d->domain_id);
>> -            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
>> -            {
>> -                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
>> -                             d->vcpu[0]->cpu_affinity);
>> -                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
>> -                           cpu, d->vcpu[0]->processor);
>> -                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
>> -                vcpu_kick(d->vcpu[0]);
>> -            }
>> -            else
>> -            {
>> -                mce_printk(MCE_VERBOSE,
>> -                           "MCE: Kill PV guest with No MCE handler\n");
>> -                domain_crash(d);
>> -            }
>> +            mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
>> +                       d->domain_id, v->vcpu_id);
>> +            return -1;
>>          }
>>      }
>> -    else
>> -    {
>> -        /* new vMCE comes while first one has not been injected yet,
>> -         * in this case, inject fail. [We can't lose this vMCE for
>> -         * the mce node's consistency].
>> -         */
>> -        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
>> -                   " to this DOM%d!\n", d->domain_id);
>> -        return -1;
>> -    }
>> +
>>      return 0;
>>  }


* Re: [PATCH 2/5] Xen/MCE: vMCE injection
  2012-09-20 19:15   ` Liu, Jinsong
@ 2012-09-21  7:38     ` Christoph Egger
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Egger @ 2012-09-21  7:38 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: xen-devel@lists.xensource.com, keir@xen.org,
	Ian.Campbell@citrix.com, Jan Beulich

On 09/20/12 21:15, Liu, Jinsong wrote:

> Christoph Egger wrote:
>> On 09/19/12 10:03, Liu, Jinsong wrote:
>>
>>> Xen/MCE: vMCE injection
>>>
>>> While testing MCE handling in a Win8 guest, we found a bug: whatever
>>> SRAO/SRAR error Xen injects into the guest, the guest always reboots.
>>>
>>> The root cause is that the current Xen vMCE logic injects vMCE# only
>>> into vcpu0, which is not correct for Intel MCE (on Intel hardware,
>>> MCE# is delivered to all CPUs).
>>>
>>> This patch fixes the bug by injecting vMCE# into all vcpus.
>>
>>
>> This breaks the AMD way. The AMD way is to only inject it to vcpu0.
>> I suggest to add a flag argument to inject_vmce() that says whether
>> to inject to all vcpus or just vcpu0.
>> Then set/clear that flag from the caller side depending on whether you
>> run on Intel or AMD.
>>
>> Christoph
>>
> 
> No, it doesn't break AMD, since inject_vmce() is only called by intel_memerr_dhandler().

But it will with the mce patches I still have in my queue.

Christoph


> Thanks,
> Jinsong
> 
>>
>>>
>>> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
>>>
>>> diff -r 133664c6bfb4 xen/arch/x86/cpu/mcheck/vmce.c
>>> --- a/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 22:39:11 2012 +0800
>>> +++ b/xen/arch/x86/cpu/mcheck/vmce.c	Tue Sep 18 23:46:38 2012 +0800
>>> @@ -340,48 +340,27 @@ 
>>>
>>>  int inject_vmce(struct domain *d)
>>>  {
>>> -    int cpu = smp_processor_id();
>>> +    struct vcpu *v;
>>>
>>> -    /* PV guest and HVM guest have different vMCE# injection methods. */
>>> -    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
>>> +    /* inject vMCE to all vcpus */
>>> +    for_each_vcpu(d, v)
>>>      {
>>> -        if ( d->is_hvm )
>>> +        if ( !test_and_set_bool(v->mce_pending) &&
>>> +            ((d->is_hvm) ||
>>> +                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
>>>          {
>>> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
>>> -                       d->domain_id);
>>> -            vcpu_kick(d->vcpu[0]);
>>> +            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
>>> +                       d->domain_id, v->vcpu_id);
>>> +            vcpu_kick(v);
>>>          }
>>>          else
>>>          {
>>> -            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
>>> -                       d->domain_id);
>>> -            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
>>> -            {
>>> -                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
>>> -                             d->vcpu[0]->cpu_affinity);
>>> -                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
>>> -                           cpu, d->vcpu[0]->processor);
>>> -                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
>>> -                vcpu_kick(d->vcpu[0]);
>>> -            }
>>> -            else
>>> -            {
>>> -                mce_printk(MCE_VERBOSE,
>>> -                           "MCE: Kill PV guest with No MCE handler\n");
>>> -                domain_crash(d);
>>> -            }
>>> +            mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
>>> +                       d->domain_id, v->vcpu_id);
>>> +            return -1;
>>>          }
>>>      }
>>> -    else
>>> -    {
>>> -        /* new vMCE comes while first one has not been injected yet,
>>> -         * in this case, inject fail. [We can't lose this vMCE for
>>> -         * the mce node's consistency].
>>> -         */
>>> -        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
>>> -                   " to this DOM%d!\n", d->domain_id);
>>> -        return -1;
>>> -    }
>>> +
>>>      return 0;
>>>  }
> 
> 




* [PATCH 2/5] Xen/MCE: vMCE injection
@ 2012-09-26  3:16 Liu, Jinsong
  2012-09-26 10:10 ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Liu, Jinsong @ 2012-09-26  3:16 UTC (permalink / raw)
  To: Jan Beulich, Christoph Egger, xen-devel@lists.xensource.com
  Cc: keir@xen.org, Ian.Campbell@citrix.com

Xen/MCE: vMCE injection

for Intel MCE, broadcast vMCE to all vcpus;
for AMD MCE, only inject vMCE to 1 vcpu, say, vcpu0

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Suggested-by: Christoph Egger <Christoph.Egger@amd.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>

diff -r 570d98e2f1cf xen/arch/x86/cpu/mcheck/mce.h
--- a/xen/arch/x86/cpu/mcheck/mce.h	Wed Sep 19 23:22:57 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/mce.h	Wed Sep 26 18:59:03 2012 +0800
@@ -168,7 +168,7 @@
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
     uint64_t gstatus);
-int inject_vmce(struct domain *d);
+int inject_vmce(struct domain *d, int vcpuid);
 
 static inline int mce_vendor_bank_msr(const struct vcpu *v, uint32_t msr)
 {
diff -r 570d98e2f1cf xen/arch/x86/cpu/mcheck/mce_intel.c
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c	Wed Sep 19 23:22:57 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c	Wed Sep 26 18:59:03 2012 +0800
@@ -359,7 +359,7 @@
                 }
 
                 /* We will inject vMCE to DOMU*/
-                if ( inject_vmce(d) < 0 )
+                if ( inject_vmce(d, -1) < 0 )
                 {
                     mce_printk(MCE_QUIET, "inject vMCE to DOM%d"
                       " failed\n", d->domain_id);
diff -r 570d98e2f1cf xen/arch/x86/cpu/mcheck/vmce.c
--- a/xen/arch/x86/cpu/mcheck/vmce.c	Wed Sep 19 23:22:57 2012 +0800
+++ b/xen/arch/x86/cpu/mcheck/vmce.c	Wed Sep 26 18:59:03 2012 +0800
@@ -338,51 +338,44 @@
 HVM_REGISTER_SAVE_RESTORE(VMCE_VCPU, vmce_save_vcpu_ctxt,
                           vmce_load_vcpu_ctxt, 1, HVMSR_PER_VCPU);
 
-int inject_vmce(struct domain *d)
+/*
+ * for Intel MCE, broadcast vMCE to all vcpus
+ * for AMD MCE, only inject vMCE to 1 vcpu, say, vcpu0
+ * @ d, domain to which would inject vmce
+ * @ vcpuid,
+ *   < 0, broadcast vMCE to all vcpus
+ *   >= 0, vcpu who would be injected vMCE
+ * return 0 for success injection, -1 for fail injection
+ */
+int inject_vmce(struct domain *d, int vcpuid)
 {
-    int cpu = smp_processor_id();
+    struct vcpu *v;
+    int ret = -1;
 
-    /* PV guest and HVM guest have different vMCE# injection methods. */
-    if ( !test_and_set_bool(d->vcpu[0]->mce_pending) )
+    for_each_vcpu(d, v)
     {
-        if ( d->is_hvm )
+        if ( (vcpuid < 0) || (vcpuid == v->vcpu_id) )
         {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to HVM DOM %d\n",
-                       d->domain_id);
-            vcpu_kick(d->vcpu[0]);
-        }
-        else
-        {
-            mce_printk(MCE_VERBOSE, "MCE: inject vMCE to PV DOM%d\n",
-                       d->domain_id);
-            if ( guest_has_trap_callback(d, 0, TRAP_machine_check) )
+            if ( !test_and_set_bool(v->mce_pending) &&
+                ((d->is_hvm) ||
+                guest_has_trap_callback(d, v->vcpu_id, TRAP_machine_check)) )
             {
-                cpumask_copy(d->vcpu[0]->cpu_affinity_tmp,
-                             d->vcpu[0]->cpu_affinity);
-                mce_printk(MCE_VERBOSE, "MCE: CPU%d set affinity, old %d\n",
-                           cpu, d->vcpu[0]->processor);
-                vcpu_set_affinity(d->vcpu[0], cpumask_of(cpu));
-                vcpu_kick(d->vcpu[0]);
+                mce_printk(MCE_VERBOSE, "MCE: inject vMCE to dom%d vcpu%d\n",
+                           d->domain_id, v->vcpu_id);
+                vcpu_kick(v);
+                ret = 0;
             }
             else
             {
-                mce_printk(MCE_VERBOSE,
-                           "MCE: Kill PV guest with No MCE handler\n");
-                domain_crash(d);
+                mce_printk(MCE_QUIET, "Fail to inject vMCE to dom%d vcpu%d\n",
+                           d->domain_id, v->vcpu_id);
+                ret = -1;
+                break;
             }
         }
     }
-    else
-    {
-        /* new vMCE comes while first one has not been injected yet,
-         * in this case, inject fail. [We can't lose this vMCE for
-         * the mce node's consistency].
-         */
-        mce_printk(MCE_QUIET, "There's a pending vMCE waiting to be injected "
-                   " to this DOM%d!\n", d->domain_id);
-        return -1;
-    }
-    return 0;
+
+    return ret;
 }
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
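
The vcpuid convention introduced above can be tried out in isolation. The
following is a rough user-space model only, assuming simplified stand-in types:
model_vcpu, model_domain, model_test_and_set and model_inject_vmce are
hypothetical names, the has_mce_callback field stands in for
guest_has_trap_callback(), and the flag update is not atomic as it would be
in Xen:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the Xen structures. */
struct model_vcpu {
    int  vcpu_id;
    bool mce_pending;
    bool has_mce_callback;   /* models guest_has_trap_callback() for PV */
};

struct model_domain {
    bool is_hvm;
    struct model_vcpu *vcpus;
    size_t nr_vcpus;
};

/* Non-atomic model of test_and_set_bool(): set the flag, return old value. */
static bool model_test_and_set(bool *flag)
{
    bool old = *flag;
    *flag = true;
    return old;
}

/* vcpuid < 0: broadcast to all vcpus; vcpuid >= 0: only that vcpu. */
int model_inject_vmce(struct model_domain *d, int vcpuid)
{
    int ret = -1;   /* stays -1 if no vcpu matches vcpuid */

    for (size_t i = 0; i < d->nr_vcpus; i++) {
        struct model_vcpu *v = &d->vcpus[i];

        if (vcpuid >= 0 && vcpuid != v->vcpu_id)
            continue;

        if (!model_test_and_set(&v->mce_pending) &&
            (d->is_hvm || v->has_mce_callback)) {
            ret = 0;    /* the patch calls vcpu_kick(v) at this point */
        } else {
            ret = -1;   /* stop at the first vcpu that cannot take it */
            break;
        }
    }

    return ret;
}
```

In this model, a vcpuid that matches no vcpu leaves ret at its initial -1, and
a second broadcast before the first has been consumed also returns -1 because
mce_pending is still set.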


* Re: [PATCH 2/5] Xen/MCE: vMCE injection
  2012-09-26  3:16 Liu, Jinsong
@ 2012-09-26 10:10 ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2012-09-26 10:10 UTC (permalink / raw)
  To: Christoph Egger, Jinsong Liu
  Cc: keir@xen.org, Ian.Campbell@citrix.com, xen-devel

>>> On 26.09.12 at 05:16, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> Xen/MCE: vMCE injection
> 
> for Intel MCE, broadcast vMCE to all vcpus;
> for AMD MCE, only inject vMCE to 1 vcpu, say, vcpu0

Please double check what got committed.

Jan
