From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:39504)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QnrtY-00050c-NY
    for qemu-devel@nongnu.org; Mon, 01 Aug 2011 08:46:57 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1QnrtX-00011m-B1
    for qemu-devel@nongnu.org; Mon, 01 Aug 2011 08:46:56 -0400
Received: from mx1.redhat.com ([209.132.183.28]:44481)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QnrtX-00011b-3u
    for qemu-devel@nongnu.org; Mon, 01 Aug 2011 08:46:55 -0400
Date: Mon, 1 Aug 2011 09:46:49 -0300
From: Luiz Capitulino
Message-ID: <20110801094649.4121169c@doriath>
In-Reply-To:
References: <20110727154229.656532d4@doriath> <4E310E6A.6000004@redhat.com>
    <20110728103143.548d7bb1@doriath> <4E3165FD.7040805@redhat.com>
    <4E316FE7.6030005@web.de> <20110728121811.11dc8434@doriath>
    <4E317E49.5090708@web.de> <20110728143923.3183d349@doriath>
    <20110728144831.766fa270@doriath> <4E31A1DF.2050702@web.de>
    <20110728150007.6d8bc9db@doriath> <4E31A4CA.80402@web.de>
    <20110728152252.14a33b20@doriath>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Blue Swirl
Cc: Jan Kiszka , Avi Kivity , qemu-devel

On Sat, 30 Jul 2011 10:41:48 +0300
Blue Swirl wrote:

> On Thu, Jul 28, 2011 at 9:22 PM, Luiz Capitulino wrote:
> > On Thu, 28 Jul 2011 20:04:58 +0200
> > Jan Kiszka wrote:
> >
> >> On 2011-07-28 20:00, Luiz Capitulino wrote:
> >> > On Thu, 28 Jul 2011 19:52:31 +0200
> >> > Jan Kiszka wrote:
> >> >
> >> >> On 2011-07-28 19:48, Luiz Capitulino wrote:
> >> >>> On Thu, 28 Jul 2011 14:39:23 -0300
> >> >>> Luiz Capitulino wrote:
> >> >>>
> >> >>>> On Thu, 28 Jul 2011 17:20:41 +0200
> >> >>>> Jan Kiszka wrote:
> >> >>>>
> >> >>>>> On 2011-07-28 17:18, Luiz Capitulino wrote:
> >> >>>>>> On Thu, 28 Jul 2011 16:19:19 +0200
> >> >>>>>> Jan Kiszka wrote:
> >> >>>>>>
> >> >>>>>>> On 2011-07-28 15:37, Avi Kivity wrote:
> >> >>>>>>>> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> >> >>>>>>>>> On Thu, 28 Jul 2011 10:23:22 +0300
> >> >>>>>>>>> Avi Kivity wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> On 07/28/2011 12:44 AM, Blue Swirl wrote:
> >> >>>>>>>>>> > On Wed, Jul 27, 2011 at 9:42 PM, Luiz Capitulino wrote:
> >> >>>>>>>>>> > > This function should be used when the VM is not supposed to resume
> >> >>>>>>>>>> > > execution (eg. by issuing 'cont' monitor command).
> >> >>>>>>>>>> > >
> >> >>>>>>>>>> > > Today, we allow the user to resume execution even when:
> >> >>>>>>>>>> > >
> >> >>>>>>>>>> > >   o the guest shuts down and -no-shutdown is used
> >> >>>>>>>>>> > >   o there's a kvm internal error
> >> >>>>>>>>>> > >   o loading the VM state with -loadvm or "loadvm" in the monitor fails
> >> >>>>>>>>>> > >
> >> >>>>>>>>>> > > I think only badness can happen from the cases above.
> >> >>>>>>>>>> >
> >> >>>>>>>>>> > I'd suppose a system_reset should bring the system back to sanity and
> >> >>>>>>>>>> > then clear vm_permanent_stopped (where's -ly?)
> >> >>>>>>>>>
> >> >>>>>>>>> What's -ly?
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> permanent-ly.
> >> >>>>>>>>
> >> >>>>>>>>>> > except maybe for KVM
> >> >>>>>>>>>> > internal error if that can't be recovered. Then it would not very
> >> >>>>>>>>>> > permanent anymore, so the name would need adjusting.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Currently, all kvm internal errors are recoverable by reset (and
> >> >>>>>>>>>> possibly by fiddling with memory/registers).
> >> >>>>>>>>>
> >> >>>>>>>>> Ok, but a poweroff in the guest isn't recoverable with system_reset
> >> >>>>>>>>> right?
Or does it depend on the guest?
> >> >>>>>>>>
> >> >>>>>>>> Right, it's not recoverable if you shut the power down where the tractor
> >> >>>>>>>> beam is coupled to the main reactor.
> >> >>>>>>>
> >> >>>>>>> system_reset will bring all emulated devices back into their power-on
> >> >>>>>>> state - unless we have remaining bugs to fix. Actually, one may consider
> >> >>>>>>> issuing this reset automatically on vm_start after "permant" vm_stop.
> >> >>>>
> >> >>>> The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.
> >> >>>>
> >> >>>> Are you saying that system_reset should be able to recover from that too?
> >> >>>
> >> >>> It already does, so we don't have permanent stops.
> >> >>
> >> >> Exactly. We just have stops over inconsistent states that require a
> >> >> reset to continue with anything useful.
> >> >
> >> > Yes. If I got you right, you suggest that we do the reset automatically.
> >> >
> >> > I think it's better to let the user do it, because s/he might want to
> >> > do something else before resetting. For example, for the kvm error the
> >> > user might want to save the vm state.
> >>
> >> Associating the reset with a cont means requesting an explicit action
> >> from the user. I'm not suggesting to do the reset when the stop state is
> >> entered.
> >
> > I see. But automatically resetting on cont might be unexpected to the
> > user, even on a bad state.
> >
> > Another option would be to add a force option to cont, where the reset is
> > done when the state is invalid (otherwise cont will return an error).
> >
> > I still prefer to let the user do it manually though.
> >
> >> > For the poweroff case with -no-shutdown it's probably fine, but I don't
> >> > want to hard code special cases. It's better and easier to treat them all
> >> > as "require system_reset to recover".
> >>
> >> In any case, we need to tag the current state as stopped-and-invalid or
> >> so vs. a normal stop.
That remains a valuable first step. How to deal
> >> with that information is the second one.
>
> I think the right way to fix this is to disable 'cont' until a
> system_reset is issued. 'cont' should not perform reset but print an
> error message about inconsistent state and suggest issuing a
> 'system_reset'.

That's exactly what the series I'm working on does, although the error
message is not explicit about issuing a system_reset (will change).
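
To make the behaviour discussed above concrete, here is a rough,
standalone sketch of the idea: a stop is tagged as either normal or
invalid, 'cont' refuses to resume from an invalid stop, and
system_reset clears the tag. Every name in it (VmState, vm_stop_sketch,
system_reset_sketch, cont_sketch) is made up for illustration. It is
not the code from the series and not QEMU's actual API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state tag; these names are illustrative, not QEMU's. */
typedef enum {
    VM_RUNNING,
    VM_STOPPED,          /* normal stop: 'cont' may resume */
    VM_STOPPED_INVALID   /* stop over an inconsistent state */
} VmState;

static VmState vm_state = VM_RUNNING;

/* Stop the VM, recording whether the state it stopped in is consistent. */
static void vm_stop_sketch(bool state_is_consistent)
{
    vm_state = state_is_consistent ? VM_STOPPED : VM_STOPPED_INVALID;
}

/* A reset brings the (imaginary) devices back to their power-on state,
 * so an invalid stop becomes a normal one. It does not resume the VM. */
static void system_reset_sketch(void)
{
    if (vm_state == VM_STOPPED_INVALID) {
        vm_state = VM_STOPPED;
    }
}

/* 'cont' never resets on its own: it fails with a hint when the state
 * is invalid. Returns 0 on success, -1 on error. */
static int cont_sketch(void)
{
    if (vm_state == VM_STOPPED_INVALID) {
        fprintf(stderr, "cont: VM state is inconsistent, "
                        "issue 'system_reset' first\n");
        return -1;
    }
    vm_state = VM_RUNNING;
    return 0;
}
```

With this, vm_stop_sketch(false) followed by cont_sketch() returns -1
and prints the hint; after system_reset_sketch(), cont_sketch() returns
0. Keeping the reset as a separate, explicit step leaves room for the
user to do something else first, such as saving the VM state after a
kvm internal error.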