public inbox for kvm@vger.kernel.org
* Windows guest reboot failure
@ 2007-09-26  7:15 He, Qing
       [not found] ` <37E52D09333DE2469A03574C88DBF40FA9C256-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: He, Qing @ 2007-09-26  7:15 UTC (permalink / raw)
  To: kvm-devel

Hi folks,
	I found that Windows guests often fail to reboot with current
kvm.git. Either the guest hangs, or it reports a double fault exception,
which causes qemu to abort. This does not happen with the
-no-kvm-irqchip option.

	After some investigation, it seems that this is caused by the
in-kernel components never being reset. Several Windows guests write a
RESET command to the 8042 keyboard controller. On receiving the
command, the qemu device model calls qemu_system_reset() to reset
everything to its initial state. However, this mechanism broke when
in-kernel components were introduced: qemu doesn't communicate the reset
request to kvm, so kvm uses stale device state on the next boot, and the
reboot fails.
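
The failure mode can be modelled in a few lines. This is only a sketch under
assumptions: `reset_entry`, `register_reset`, `system_reset`, `user_dev` and
`kernel_dev` are hypothetical stand-ins, not qemu or kvm code. It shows a
reset walk that only reaches handlers registered in userspace, so state held
by an in-kernel device survives the reset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from qemu or kvm. */
struct user_dev   { int state; };   /* device emulated in userspace */
struct kernel_dev { int state; };   /* device emulated in the kernel */

typedef void (*reset_fn)(void *opaque);

struct reset_entry {
    reset_fn fn;
    void *opaque;
};

#define MAX_RESET 8
static struct reset_entry reset_list[MAX_RESET];
static size_t reset_count;

/* Add a handler to the list walked at system reset. */
void register_reset(reset_fn fn, void *opaque)
{
    assert(reset_count < MAX_RESET);
    reset_list[reset_count].fn = fn;
    reset_list[reset_count].opaque = opaque;
    reset_count++;
}

/* Reset handler for a userspace device (0 is a placeholder reset value). */
void user_dev_reset(void *opaque)
{
    ((struct user_dev *)opaque)->state = 0;
}

/* Models the role of qemu_system_reset(): only registered handlers run. */
void system_reset(void)
{
    for (size_t i = 0; i < reset_count; i++)
        reset_list[i].fn(reset_list[i].opaque);
}
```

If a `kernel_dev` is never registered, `system_reset()` leaves its `state`
untouched, which is the stale state described above.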

	To resolve this, there are at least two options:
	1. Modify qemu_system_reset() so that qemu destroys the kvm context
and re-creates it.
	2. Introduce a new ioctl, say KVM_RESET, in kvm. On receiving
this ioctl, the kernel calls the reset handlers registered by each
component (vcpu, lapic, pic, ioapic, pit, etc.), either against the vm
or against each vcpu.
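
Option 2 could look roughly like the following. It is a sketch under
assumptions, not a proposed patch: `vm_model`, `reset_handler`,
`vm_register_reset`, `vm_reset` and `clear_int` are hypothetical names, and
the scope enum is just one way to express "against the vm or against each
vcpu".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of option 2; not actual kvm code. */
enum reset_scope { RESET_SCOPE_VM, RESET_SCOPE_VCPU };

struct reset_handler {
    const char *name;              /* e.g. "lapic", "ioapic" */
    enum reset_scope scope;        /* per-vm or per-vcpu component */
    void (*reset)(void *opaque);
    void *opaque;
};

#define MAX_HANDLERS 16
struct vm_model {
    struct reset_handler handlers[MAX_HANDLERS];
    size_t nr_handlers;
};

void vm_init(struct vm_model *vm)
{
    vm->nr_handlers = 0;
}

/* Each in-kernel component registers its own reset handler. */
void vm_register_reset(struct vm_model *vm, struct reset_handler h)
{
    assert(h.reset != NULL);
    assert(vm->nr_handlers < MAX_HANDLERS);
    vm->handlers[vm->nr_handlers++] = h;
}

/* What a KVM_RESET-style call might do: run every handler of one scope.
 * Returns the number of handlers that were called. */
size_t vm_reset(struct vm_model *vm, enum reset_scope scope)
{
    size_t called = 0;
    for (size_t i = 0; i < vm->nr_handlers; i++) {
        struct reset_handler *h = &vm->handlers[i];
        if (h->scope == scope) {
            h->reset(h->opaque);
            called++;
        }
    }
    return called;
}

/* Placeholder handler: the real reset values are device specific. */
void clear_int(void *opaque)
{
    *(int *)opaque = 0;
}
```

New in-kernel devices would only need to register a handler; the caller in
userspace would not change.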

	IMO, the first one is easier to implement and maintain,
but the second one is closer to qemu's logic. Which do you think is more
appropriate?

Any comments?

Thanks,
Qing


* Re: Windows guest reboot failure
       [not found] ` <37E52D09333DE2469A03574C88DBF40FA9C256-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-26  7:36   ` Dor Laor
  2007-09-26  8:55   ` Avi Kivity
  1 sibling, 0 replies; 7+ messages in thread
From: Dor Laor @ 2007-09-26  7:36 UTC (permalink / raw)
  To: He, Qing; +Cc: kvm-devel

I'm voting for option 2. As you said, it's closer to qemu's device model
approach, and it's also similar to the kernel components' (*pi*)
save/restore methods. It also minimizes qemu changes.
Dor.


* Re: Windows guest reboot failure
       [not found] ` <37E52D09333DE2469A03574C88DBF40FA9C256-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2007-09-26  7:36   ` Dor Laor
@ 2007-09-26  8:55   ` Avi Kivity
       [not found]     ` <46FA1E70.8080802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2007-09-26  8:55 UTC (permalink / raw)
  To: He, Qing; +Cc: kvm-devel

He, Qing wrote:
> Hi folks,
> 	I found that Windows guests often fail to reboot with current
> kvm.git. Either the guest hangs, or it reports a double fault exception,
> which causes qemu to abort. This does not happen with the
> -no-kvm-irqchip option.
>
> 	After some investigation, it seems that this is caused by the
> in-kernel components never being reset. Several Windows guests write a
> RESET command to the 8042 keyboard controller. On receiving the
> command, the qemu device model calls qemu_system_reset() to reset
> everything to its initial state. However, this mechanism broke when
> in-kernel components were introduced: qemu doesn't communicate the reset
> request to kvm, so kvm uses stale device state on the next boot, and the
> reboot fails.
>
> 	To resolve this, there are at least two options:
> 	1. Modify qemu_system_reset() so that qemu destroys the kvm context
> and re-creates it.
>   

That's too destructive: there's lots of state to recreate, and things like
dirty page logging would need to be reinitialized.

> 	2. Introduce a new ioctl, say KVM_RESET, in kvm. On receiving
> this ioctl, the kernel calls the reset handlers registered by each
> component (vcpu, lapic, pic, ioapic, pit, etc.), either against the vm
> or against each vcpu.
>
>   

There's a third option: let the regular qemu logic work, and after that
copy the state to kvm (just like after vm restore).  I tried that, but
there were problems, so I dropped it.  Still, the approach should work.
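
A minimal sketch of that third option, under assumptions: `user_model`,
`kernel_model`, `user_pic_reset`, `push_state_to_kernel` and `reset_then_sync`
are hypothetical names, and zeroing the registers is a placeholder for the
real reset values. In the documented KVM API the "push" step would go through
state-setting ioctls such as KVM_SET_IRQCHIP and KVM_SET_LAPIC; the sketch
does not call them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified PIC state; real state has more fields. */
struct pic_state { unsigned char irr, imr, isr; };

struct user_model   { struct pic_state pic; };  /* qemu's copy */
struct kernel_model { struct pic_state pic; };  /* in-kernel copy */

/* Step 1: the regular userspace reset logic runs on qemu's copy. */
void user_pic_reset(struct user_model *u)
{
    assert(u != NULL);
    memset(&u->pic, 0, sizeof u->pic);  /* placeholder reset values */
}

/* Step 2: copy the freshly reset state into the kernel, as after restore. */
void push_state_to_kernel(const struct user_model *u, struct kernel_model *k)
{
    assert(u != NULL && k != NULL);
    k->pic = u->pic;
}

/* The whole sequence: reset in userspace, then synchronize the kernel. */
void reset_then_sync(struct user_model *u, struct kernel_model *k)
{
    user_pic_reset(u);
    push_state_to_kernel(u, k);
}
```

The appeal is that the reset values live in one place (userspace); the cost
is that every in-kernel device needs a complete set-state path.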

> 	IMO, the first one is easier to implement and maintain,
> but the second one is closer to qemu's logic. Which do you think is more
> appropriate?
>
>   

I think #2.  Synchronization will be difficult; we'll need to send 
signals to all other vcpus so that they drop the vcpu mutex.
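
The ordering constraint can be shown with a single-threaded model. This is a
sketch under assumptions: `vcpu_model`, `vcpu_kick`, `vcpu_step` and
`try_reset_all` are hypothetical, a boolean flag stands in for a real signal,
and no actual threads or mutexes are involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one vcpu; not kvm's struct kvm_vcpu. */
struct vcpu_model {
    bool holds_mutex;   /* true while the vcpu is running the guest */
    bool kick_pending;  /* stands in for a pending signal */
    int  lapic_state;
};

/* Stand-in for sending a signal to the thread that runs this vcpu. */
void vcpu_kick(struct vcpu_model *v)
{
    v->kick_pending = true;
}

/* One step of a vcpu: a pending kick forces an exit and drops the mutex. */
void vcpu_step(struct vcpu_model *v)
{
    if (v->holds_mutex && v->kick_pending) {
        v->holds_mutex = false;
        v->kick_pending = false;
    }
}

/* Reset is only safe once no vcpu holds its mutex.
 * Returns true if the reset was applied, false if it had to be refused. */
bool try_reset_all(struct vcpu_model *vcpus, size_t n)
{
    assert(vcpus != NULL);
    for (size_t i = 0; i < n; i++)
        if (vcpus[i].holds_mutex)
            return false;
    for (size_t i = 0; i < n; i++)
        vcpus[i].lapic_state = 0;   /* placeholder reset value */
    return true;
}
```

In this model a reset attempted before every vcpu has been kicked and has
stepped out is refused, which is the synchronization cost mentioned above.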


-- 
error compiling committee.c: too many arguments to function



* Re: Windows guest reboot failure
       [not found]     ` <46FA1E70.8080802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-26 11:03       ` Dong, Eddie
       [not found]         ` <10EA09EFD8728347A513008B6B0DA77A02249DBF-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dong, Eddie @ 2007-09-26 11:03 UTC (permalink / raw)
  To: Avi Kivity, He, Qing; +Cc: kvm-devel

> 
> 
> I think #2.  Synchronization will be difficult; we'll need to send
> signals to all other vcpus so that they drop the vcpu mutex.
> 
How about adding a new ABI, KVM_RESET_KERNDEVS?
We don't want to implement RESET ABIs for each kernel device,
especially when we move the PIT down and then the RTC, pmtimer etc.
Eddie


* Re: Windows guest reboot failure
       [not found]             ` <46FB733B.4080102-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-27  9:02               ` Dong, Eddie
       [not found]                 ` <10EA09EFD8728347A513008B6B0DA77A022AE425-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dong, Eddie @ 2007-09-27  9:02 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, He, Qing

Avi Kivity wrote:
> Dong, Eddie wrote:
>>> I think #2.  Synchronization will be difficult; we'll need to send
>>> signals to all other vcpus so that they drop the vcpu mutex.
>>> 
>>> 
>> How about adding a new ABI, KVM_RESET_KERNDEVS?
>> We don't want to implement RESET ABIs for each kernel device,
>> especially when we move the PIT down and then the RTC, pmtimer etc.
>> 
> 
> I thought that was what Qing proposed :)
> 
> I think we want at least a separate reset for LAPIC and IOAPICs, that
> pushes the synchronization issues to userspace (though it may
> introduce other issues like lapic issuing eois to ioapic after the
> ioapic has been
> reset).
> 

Mmm, if we implement 2, then we will have many of them. How about pv
drivers in the future? They also need to be reset.
If we implement 1, then everything is left to the kernel itself. BTW, for
simplicity, this API may be per VCPU, since resetting LAPIC state across
CPUs is problematic. The BSP will issue the platform-wide device model
reset.

Eddie


* Re: Windows guest reboot failure
       [not found]         ` <10EA09EFD8728347A513008B6B0DA77A02249DBF-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-27  9:09           ` Avi Kivity
       [not found]             ` <46FB733B.4080102-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2007-09-27  9:09 UTC (permalink / raw)
  To: Dong, Eddie; +Cc: kvm-devel, He, Qing

Dong, Eddie wrote:
>> I think #2.  Synchronization will be difficult; we'll need to send
>> signals to all other vcpus so that they drop the vcpu mutex.
>>
>>     
> How about adding a new ABI, KVM_RESET_KERNDEVS?
> We don't want to implement RESET ABIs for each kernel device,
> especially when we move the PIT down and then the RTC, pmtimer etc.
>   

I thought that was what Qing proposed :)

I think we want at least a separate reset for LAPIC and IOAPICs, that
pushes the synchronization issues to userspace (though it may introduce
other issues like lapic issuing eois to ioapic after the ioapic has been
reset).


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: Windows guest reboot failure
       [not found]                 ` <10EA09EFD8728347A513008B6B0DA77A022AE425-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-27  9:22                   ` Avi Kivity
  0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2007-09-27  9:22 UTC (permalink / raw)
  To: Dong, Eddie; +Cc: kvm-devel, He, Qing

Dong, Eddie wrote:
> Avi Kivity wrote:
>   
>> Dong, Eddie wrote:
>>     
>>>> I think #2.  Synchronization will be difficult; we'll need to send
>>>> signals to all other vcpus so that they drop the vcpu mutex.
>>>>
>>>>
>>>>         
>>> How about adding a new ABI, KVM_RESET_KERNDEVS?
>>> We don't want to implement RESET ABIs for each kernel device,
>>> especially when we move the PIT down and then the RTC, pmtimer etc.
>>>
>>>       
>> I thought that was what Qing proposed :)
>>
>> I think we want at least a separate reset for LAPIC and IOAPICs, that
>> pushes the synchronization issues to userspace (though it may
>> introduce other issues like lapic issuing eois to ioapic after the
>> ioapic has been
>> reset).
>>
>>     
>
> Mmm, if we implement 2, then we will have many of them. How about pv
> drivers in the future? They also need to be reset.
> If we implement 1, then everything is left to the kernel itself. BTW, for
> simplicity, this API may be per VCPU, since resetting LAPIC state across
> CPUs is problematic. The BSP will issue the platform-wide device model
> reset.
>   

We can have two APIs: reset vcpu (per-vcpu) and reset platform
(everything else).
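
A rough sketch of that split, under assumptions: `machine`, `reset_vcpu`,
`reset_platform` and `reset_machine` are hypothetical names, placing the
LAPIC on the per-vcpu side is my reading of the discussion above, and the
zero values are placeholders for real reset state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 4

/* Hypothetical state layout; not kvm's data structures. */
struct vcpu_state     { int regs; int lapic; };       /* per-vcpu */
struct platform_state { int pic; int ioapic; int pit; }; /* everything else */

struct machine {
    struct vcpu_state vcpus[MAX_VCPUS];
    size_t nr_vcpus;
    struct platform_state platform;
};

/* API 1: reset a single vcpu, including its local APIC (assumed). */
void reset_vcpu(struct machine *m, size_t idx)
{
    assert(idx < m->nr_vcpus);
    m->vcpus[idx].regs = 0;
    m->vcpus[idx].lapic = 0;
}

/* API 2: reset the platform devices that are shared by all vcpus. */
void reset_platform(struct machine *m)
{
    m->platform.pic = 0;
    m->platform.ioapic = 0;
    m->platform.pit = 0;
}

/* How a caller might combine the two for a full machine reset. */
void reset_machine(struct machine *m)
{
    for (size_t i = 0; i < m->nr_vcpus; i++)
        reset_vcpu(m, i);
    reset_platform(m);
}
```

Keeping the per-vcpu call separate lets each vcpu be reset from its own
context, while the platform call is issued once.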


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


