From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [PATCH v3 0/1] Introduce VCPUOP_reset_vcpu_info
Date: Thu, 21 Aug 2014 22:27:41 -0400
Message-ID: <20140822022741.GG20329@laptop.dumpdata.com>
References: <1408442683-12125-1-git-send-email-vkuznets@redhat.com>
 <53F39EA8.5020602@cantab.net>
 <20140820133742.GI3120@laptop.dumpdata.com>
 <20140820215730.GB6411@laptop.dumpdata.com>
 <87mwayku4n.fsf@vitty.brq.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1XKeZz-00079J-KB for xen-devel@lists.xenproject.org;
 Fri, 22 Aug 2014 02:27:51 +0000
Content-Disposition: inline
In-Reply-To: <87mwayku4n.fsf@vitty.brq.redhat.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Vitaly Kuznetsov
Cc: David Vrabel, xen-devel@lists.xenproject.org, Andrew Jones,
 David Vrabel, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On Thu, Aug 21, 2014 at 12:35:36PM +0200, Vitaly Kuznetsov wrote:
> Konrad Rzeszutek Wilk writes:
>
> > On Wed, Aug 20, 2014 at 09:37:42AM -0400, Konrad Rzeszutek Wilk wrote:
> >> On Tue, Aug 19, 2014 at 07:59:52PM +0100, David Vrabel wrote:
> >> > On 19/08/14 11:04, Vitaly Kuznetsov wrote:
> >> > >The patch and guest code are based on the prototype by Konrad
> >> > >Rzeszutek Wilk.
> >> > >
> >> > >VCPUOP_reset_vcpu_info is required to support kexec performed by
> >> > >an SMP PVHVM guest. It was tested with the guest code listed below.
> >> >
> >> > Instead of having the guest tear down all these bits of setup, I
> >> > think it would be preferable to have the toolstack build a new
> >> > domain with the same memory contents from the original VM. The
> >> > toolstack would then start this new domain at the kexec entry point.
> >>
> >> What about the kdump/crash case? We might crash at any time and would
> >> want to start the kdump kernel, which hopefully can reset all of the
> >> VCPU information such that it can boot with more than one VCPU.
> >>
> >> >
> >> > The advantage of this is you don't need to add new hypercall sub-ops
> >> > to tear down all bits and pieces, both for existing stuff and for
> >> > anything new that might be added.
> >>
> >> Sure, except that having setup and teardown paths provides nice
> >> symmetrical states. Doing a 'kexec_guest' hypercall seems to be just
> >> a workaround for that and giving up on the symmetry.
> >>
> >> My feeling is that we really ought to have 'init' and 'teardown'
> >> for every hypercall. That would also be good to test the locking,
> >> memory leaks, etc.
> >
> > We had a big discussion today at the Xen Developer to talk about it.
>
> I regret I missed this one :-)

It would have been good to have had you here. Would it be possible
for you to be present at the future Xen Hackathons?

>
> > The one hypercall option has the appeal that it will reset the
> > guest to the initial boot state (minus whatever memory got ballooned
> > out). The semantics of it are similar to a SCHEDOP_shutdown hypercall,
> > but it would be a warm_reset type.
> >
> > I think that going that route and instead of chasing down different
> > states (event channels, grants, vcpu's, pagetables, etc) we would
> > wipe everything to a nice clean slate. Maybe the hypercall argument
> > should be called tabula_rasa :-)
> >
> > The reasons I like this instead of doing separate de-alloc hypercalls
> > are:
> > 1) Cool name (tabula_rasa!)
> > 2) It would not require complicated code paths to iterate for tearing
> > down grants, events, etc.
> > 3) It is one simple hypercall that could be used by kdump/kexec with an
> > understanding of its semantics: it would continue executing after this
> > hypercall, it might change the VCPU to a different one (so executing on
> > vCPU5 but now we are at VCPU0), IDT and GDTs are reset to their initial
> > states, ditto on callbacks, etc. And of course work on both PVHVM and
> > PV (and PVH).
> >
> > Thoughts?
>
> I think we don't necessarily need a new hypercall; a new reason code for
> SCHEDOP_shutdown should work (cool name can go there :-). The op we want
> to implement is very similar to rename-restart: we need to copy ram and
> vcpu contexts before destroying the old domain.
>
> Recreating the domain and copying all memory should work, but we'll
> require the host to have free memory, and this can be an issue for large
> guests. If we try implementing 'reassigning' of memory without making a
> copy, that can lead to the same issues we have now: mounted grants,
> shared info,...

That is a good point. Especially with 512GB guests. David, Jan, thoughts?

>
> I can try this approach but it's really hard for me to predict how long
> it's going to take me.. On the other hand resetting vcpu_info *should*
> be enough for the majority of use cases so we'll have 'full kexec
> support' relatively fast..

I think you are going to hit the grant usage as well, but perhaps not?

>
> --
> Vitaly
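
To make the grant-usage concern concrete, here is a toy C model of the
two approaches discussed above. It is not Xen code: every type and
function name in it (toy_domain, toy_reset_vcpu_info, toy_warm_reset,
toy_is_clean) is hypothetical, and the state it tracks is a deliberate
simplification. It only sketches why resetting vcpu_info alone can
leave event channels and grants behind, while a single warm reset
clears everything at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the Xen data structures. */
#define TOY_MAX_VCPUS 8

struct toy_domain {
    bool vcpu_info_registered[TOY_MAX_VCPUS]; /* per-VCPU vcpu_info set up */
    unsigned int open_event_channels;         /* event channels still bound */
    unsigned int mapped_grants;               /* grants still mapped */
    unsigned int current_vcpu;                /* VCPU the guest executes on */
};

/* Narrow approach: undo the vcpu_info registration of one VCPU only. */
static void toy_reset_vcpu_info(struct toy_domain *d, unsigned int vcpu)
{
    assert(d != NULL);
    if (vcpu < TOY_MAX_VCPUS)
        d->vcpu_info_registered[vcpu] = false;
}

/* "Clean slate" approach: one call wipes all tracked state and
 * continues on VCPU0, as described for the warm-reset idea. */
static void toy_warm_reset(struct toy_domain *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_MAX_VCPUS; i++)
        d->vcpu_info_registered[i] = false;
    d->open_event_channels = 0;
    d->mapped_grants = 0;
    d->current_vcpu = 0;
}

/* True when nothing from the previous kernel is left over. */
static bool toy_is_clean(const struct toy_domain *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_MAX_VCPUS; i++)
        if (d->vcpu_info_registered[i])
            return false;
    return d->open_event_channels == 0 && d->mapped_grants == 0;
}
```

In this model, calling toy_reset_vcpu_info() for every VCPU still
leaves toy_is_clean() false while any grant or event channel remains,
which is the case the per-state teardown route would need further
sub-ops for.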