kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
@ 2010-01-05  8:34 Huang Ying
  2010-01-05 10:44 ` Andi Kleen
  2010-01-05 10:50 ` Avi Kivity
  0 siblings, 2 replies; 8+ messages in thread
From: Huang Ying @ 2010-01-05  8:34 UTC (permalink / raw)
  To: Avi Kivity, Anthony Liguori, Andi Kleen; +Cc: kvm

Now, if we inject a fatal MCE into guest OS, for example Linux, Linux
will go panic and then reboot. But if we inject another MCE now,
system will reset directly instead of go panic firstly, because
MCG_STATUS.MCIP is set to 1 and not cleared after reboot. This is does
not follow the behavior in real hardware.

This patch fixes this via set IA32_MCG_STATUS to 0 during system reset.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 qemu-kvm-x86.c |    1 +
 1 file changed, 1 insertion(+)

--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
 #endif
     set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
     set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
 
     rc = kvm_set_msrs(env, msrs, n);
     if (rc == -1)



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-05  8:34 [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset Huang Ying
@ 2010-01-05 10:44 ` Andi Kleen
  2010-01-06  8:22   ` Huang Ying
  2010-01-05 10:50 ` Avi Kivity
  1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2010-01-05 10:44 UTC (permalink / raw)
  To: Huang Ying; +Cc: Avi Kivity, Anthony Liguori, Andi Kleen, kvm

> --- a/qemu-kvm-x86.c
> +++ b/qemu-kvm-x86.c
> @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
>  #endif
>      set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
>      set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);

Still need to keep EIPV and possibly RIPV, don't we?

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-05  8:34 [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset Huang Ying
  2010-01-05 10:44 ` Andi Kleen
@ 2010-01-05 10:50 ` Avi Kivity
  2010-01-06  7:05   ` Huang Ying
  1 sibling, 1 reply; 8+ messages in thread
From: Avi Kivity @ 2010-01-05 10:50 UTC (permalink / raw)
  To: Huang Ying; +Cc: Anthony Liguori, Andi Kleen, kvm

On 01/05/2010 10:34 AM, Huang Ying wrote:
> Now, if we inject a fatal MCE into guest OS, for example Linux, Linux
> will go panic and then reboot. But if we inject another MCE now,
> system will reset directly instead of go panic firstly, because
> MCG_STATUS.MCIP is set to 1 and not cleared after reboot. This is does
> not follow the behavior in real hardware.
>
> This patch fixes this via set IA32_MCG_STATUS to 0 during system reset.
>
> Signed-off-by: Huang Ying<ying.huang@intel.com>
> ---
>   qemu-kvm-x86.c |    1 +
>   1 file changed, 1 insertion(+)
>
> --- a/qemu-kvm-x86.c
> +++ b/qemu-kvm-x86.c
> @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
>   #endif
>       set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
>       set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
>
>    

Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be 
in the cpu reset code?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-05 10:50 ` Avi Kivity
@ 2010-01-06  7:05   ` Huang Ying
  2010-01-06  8:03     ` Avi Kivity
  0 siblings, 1 reply; 8+ messages in thread
From: Huang Ying @ 2010-01-06  7:05 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Anthony Liguori, Andi Kleen, kvm@vger.kernel.org

Hi, Avi,

On Tue, 2010-01-05 at 18:50 +0800, Avi Kivity wrote:
> On 01/05/2010 10:34 AM, Huang Ying wrote:
> > Now, if we inject a fatal MCE into guest OS, for example Linux, Linux
> > will go panic and then reboot. But if we inject another MCE now,
> > system will reset directly instead of go panic firstly, because
> > MCG_STATUS.MCIP is set to 1 and not cleared after reboot. This is does
> > not follow the behavior in real hardware.
> >
> > This patch fixes this via set IA32_MCG_STATUS to 0 during system reset.
> >
> > Signed-off-by: Huang Ying<ying.huang@intel.com>
> > ---
> >   qemu-kvm-x86.c |    1 +
> >   1 file changed, 1 insertion(+)
> >
> > --- a/qemu-kvm-x86.c
> > +++ b/qemu-kvm-x86.c
> > @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
> >   #endif
> >       set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
> >       set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> > +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
> >
> >    
> 
> Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be 
> in the cpu reset code?

I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
called by qemu_kvm_system_reset(). It is not in cpu reset path?

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-06  7:05   ` Huang Ying
@ 2010-01-06  8:03     ` Avi Kivity
  2010-01-06  8:17       ` Huang Ying
  0 siblings, 1 reply; 8+ messages in thread
From: Avi Kivity @ 2010-01-06  8:03 UTC (permalink / raw)
  To: Huang Ying; +Cc: Anthony Liguori, Andi Kleen, kvm@vger.kernel.org

On 01/06/2010 09:05 AM, Huang Ying wrote:
> @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
>>>    #endif
>>>        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
>>>        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
>>> +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
>>>
>>>
>>>        
>> Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
>> in the cpu reset code?
>>      
> I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
> called by qemu_kvm_system_reset(). It is not in cpu reset path?
>    

It is, but it is also called from many other places, which could cause 
this msr to be zeroed.

A better solution is to allocate it a field in CPUState, load and save 
it in kvm_arch_*_regs, and zero it during reset.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-06  8:03     ` Avi Kivity
@ 2010-01-06  8:17       ` Huang Ying
  2010-01-06  8:38         ` Marcelo Tosatti
  0 siblings, 1 reply; 8+ messages in thread
From: Huang Ying @ 2010-01-06  8:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Anthony Liguori, Andi Kleen, kvm@vger.kernel.org

On Wed, 2010-01-06 at 16:03 +0800, Avi Kivity wrote:
> On 01/06/2010 09:05 AM, Huang Ying wrote:
> > @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
> >>>    #endif
> >>>        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
> >>>        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> >>> +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
> >>>
> >>>
> >>>        
> >> Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
> >> in the cpu reset code?
> >>      
> > I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
> > called by qemu_kvm_system_reset(). It is not in cpu reset path?
> >    
> 
> It is, but it is also called from many other places, which could cause 
> this msr to be zeroed.
> 
> A better solution is to allocate it a field in CPUState, load and save 
> it in kvm_arch_*_regs, and zero it during reset.

Yes. You are right. I will fix this.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-05 10:44 ` Andi Kleen
@ 2010-01-06  8:22   ` Huang Ying
  0 siblings, 0 replies; 8+ messages in thread
From: Huang Ying @ 2010-01-06  8:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Avi Kivity, Anthony Liguori, kvm@vger.kernel.org

On Tue, 2010-01-05 at 18:44 +0800, Andi Kleen wrote:
> > --- a/qemu-kvm-x86.c
> > +++ b/qemu-kvm-x86.c
> > @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
> >  #endif
> >      set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
> >      set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> > +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
> 
> Still need to keep EIPV and possibly RIPV, don't we?

It appears that EIPV and RIPV is meaningless outside of MCE exception
handler. But I will try to check the real hardware behavior.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset
  2010-01-06  8:17       ` Huang Ying
@ 2010-01-06  8:38         ` Marcelo Tosatti
  0 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2010-01-06  8:38 UTC (permalink / raw)
  To: Huang Ying; +Cc: Avi Kivity, Anthony Liguori, Andi Kleen, kvm@vger.kernel.org

On Wed, Jan 06, 2010 at 04:17:42PM +0800, Huang Ying wrote:
> On Wed, 2010-01-06 at 16:03 +0800, Avi Kivity wrote:
> > On 01/06/2010 09:05 AM, Huang Ying wrote:
> > > @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
> > >>>    #endif
> > >>>        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
> > >>>        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
> > >>> +    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
> > >>>
> > >>>
> > >>>        
> > >> Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
> > >> in the cpu reset code?
> > >>      
> > > I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
> > > called by qemu_kvm_system_reset(). It is not in cpu reset path?
> > >    
> > 
> > It is, but it is also called from many other places, which could cause 
> > this msr to be zeroed.
> > 
> > A better solution is to allocate it a field in CPUState, load and save 
> > it in kvm_arch_*_regs, and zero it during reset.
> 
> Yes. You are right. I will fix this.

BTW, the MCE MSRs are not being migrated. Perhaps you'd like to fix that
while at it.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-01-06  8:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-05  8:34 [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset Huang Ying
2010-01-05 10:44 ` Andi Kleen
2010-01-06  8:22   ` Huang Ying
2010-01-05 10:50 ` Avi Kivity
2010-01-06  7:05   ` Huang Ying
2010-01-06  8:03     ` Avi Kivity
2010-01-06  8:17       ` Huang Ying
2010-01-06  8:38         ` Marcelo Tosatti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).