From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: kexec -e in PVHVM guests (and in PV).
Date: Mon, 30 Jun 2014 12:21:37 -0400
Message-ID: <20140630162137.GB22781@laptop.dumpdata.com>
References: <20140630153600.GA19885@laptop.dumpdata.com> <53B18ABD.8070803@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <53B18ABD.8070803@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On Mon, Jun 30, 2014 at 05:05:17PM +0100, David Vrabel wrote:
> On 30/06/14 16:36, Konrad Rzeszutek Wilk wrote:
> > Hey,
> >
> > I had on my todo list a patch from Olaf that shuffles
> > the shared_page to be in the 0xFE700000 addr (in the "gap"
> > with newer QEMU's) which unfortunately did not work when
> > migrating on 32-bit PVHVM guests on Xen 4.1.
> >
> > The commit is 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f
> > "xen PVonHVM: use E820_Reserved area for shared_info" and it
> > ended up being reverted. I dusted it off and I think I found
> > the original bug (and fixed it), but the more I dug into this,
> > the more issues I discovered.
> >
> > A bit about the use case - the 'kexec -e' allows one to
> > restart the Linux kernel without a reboot. It is not a crash kernel
> > so it is just meant to restart and work, and then restart, etc.
> >
> > The 'kdump -c' (crash) is a different use case and I had not
> > thought much about it. But I think that all of the solutions
> > I am thinking of will make it also work. (so you could
> > do kexec-crash -> kexec-e -> kexec-e -> kexec-crash -> kexec-e, and
> > so on, if you wanted to).
>
> These are equivalent from your point of view -- the only difference is
> who does the relocation of the image to its final location. kexec -e
> does it at kexec time; kdump -c does it in advance and requires a region of
> memory to be reserved.
>
> > 7). Grants. Andrew Cooper hinted at this and a bit of experimentation
> >     shows that Xen hypervisor will indeed smack down any guest that
> >     tries to re-use its "old" grants. I am not even sure if the
> >     GNTTAB_setup call is returning the "old" grant frames.
> >     His suggestion was 'GNTTAB_reset' to, well, reset everything.
>
> You also need to consider grants that are in use (mapped or copied to) by
> the backend -- the backend might scribble all over your kexec'd state.

I don't know how to solve that. Especially as the backend might be
DMA-ing data at this point - and it is using the MFN value. The best
I could think of is that for in-use grants the hypervisor would
replace their GMFNs with a scratch page.

>
> > My thinking is that a lot of this code is shared with PV (and PVH),
> > so once this is fixed we could do full scale 'kexec -e' in a PV
> > (or PVH) type guest. Doing dom0 kexec -e would be an interesting
> > experiment :-(
>
> With some toolstack/Xen help you could probably destroy a domain without
> freeing its memory, create a new domain (reusing all the memory) and
> jump to the kexec image.

I was thinking of a potential 'snapshot' hypercall that the 'hvmloader'
(or SeaBIOS) would do. Then on kexec we would reset all of the state
back to that snapshot. But ..

>
> For kdump use cases, pause on crash and then have a helper domain with
> permission to rummage through the crashed domain perform the crash
> dump/analysis.
>
> I think something like this would be a lot easier than a purely in-guest
> kexec solution.

.. it goes against the symmetry of the hypercalls that would reset and/or
unbind. And it sounds much more complex than implementing each of
these individually.
>
> David