From mboxrd@z Thu Jan  1 00:00:00 1970
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: kexec -e in PVHVM guests (and in PV).
Date: Mon, 30 Jun 2014 12:21:37 -0400
Message-ID: <20140630162137.GB22781@laptop.dumpdata.com>
References: <20140630153600.GA19885@laptop.dumpdata.com>
	<53B18ABD.8070803@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Content-Disposition: inline
In-Reply-To: <53B18ABD.8070803@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On Mon, Jun 30, 2014 at 05:05:17PM +0100, David Vrabel wrote:
> On 30/06/14 16:36, Konrad Rzeszutek Wilk wrote:
> > Hey, 
> > 
> > I had on my todo list a patch from Olaf that moves
> > the shared_page to the 0xFE700000 address (in the "gap"
> > with newer QEMUs), which unfortunately did not work when
> > migrating 32-bit PVHVM guests on Xen 4.1.
> > 
> > The commit is 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f
> > "xen PVonHVM: use E820_Reserved area for shared_info" and it
> > ended up being reverted. I dusted it off and I think I found
> > the original bug (and fixed it), but while digging into this
> > I discovered a ton more issues.
> > 
> > A bit about the use case - the 'kexec -e' allows one to
> > restart the Linux kernel without a reboot. It is not a crash kernel
> > so it is just meant to restart and work, and then restart, etc.
> > 
> > The 'kdump -c' (crash) is a different use case and I had not
> > thought much about it. But I think that all of the solutions
> > I am thinking of will make it work as well (so you could do
> > kexec-crash -> kexec-e -> kexec-e -> kexec-crash -> kexec-e, and
> > so on, if you wanted to).
> 
> These are equivalent from your point of view -- the only difference is
> who does the relocation of the image to its final location.  kexec -e
> does it at kexec time; kdump -c does it in advance and requires a region
> of memory to be reserved.
> 
> >  7). Grants. Andrew Cooper hinted at this and a bit of experimentation
> >      shows that Xen hypervisor will indeed smack down any guest that
> >      tries to re-use its "old" grants. I am not even sure if the
> >      GNTTAB_setup call is returning the "old" grant frames.
> >      His suggestion was 'GNTTAB_reset' to well, reset everything.
> 
> You also need to consider grants that are in use (mapped or copied to) by
> the backend -- the backend might scribble all over your kexec'd state.

I don't know how to solve that, especially as the backend might
be DMA-ing data at this point, and it is using the MFN value. The
best I could think of is that, for in-use grants, the hypervisor
would replace their GMFNs with a scratch page.
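
As a rough illustration of the scratch-page idea, here is a toy model
in plain C. It is only a sketch: the struct, the field names and
toy_gnttab_reset() are all hypothetical and do not correspond to any
existing Xen structure or hypercall. It assumes the hypervisor can tell
which grants a backend still has mapped, and it repoints only those at
a scratch frame while clearing the rest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names are real Xen interfaces. */
#define TOY_INVALID_GMFN UINT64_MAX

struct toy_grant {
    uint64_t gmfn;          /* guest frame the grant currently refers to */
    int granted;            /* non-zero if the guest issued this grant */
    int mapped_by_backend;  /* non-zero if a backend still has it mapped */
};

/*
 * Reset every grant in the table.
 *
 * Grants that no backend holds are simply cleared. Grants that a backend
 * still has mapped are repointed at scratch_gmfn, so that late writes
 * (including DMA) land in a throwaway page instead of in memory the
 * kexec'd kernel is about to reuse.
 *
 * Returns the number of grants that were redirected to the scratch page.
 */
size_t toy_gnttab_reset(struct toy_grant *table, size_t n,
                        uint64_t scratch_gmfn)
{
    size_t redirected = 0;

    assert(table != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (table[i].granted && table[i].mapped_by_backend) {
            /* Still in use: keep the grant alive, swap its target. */
            table[i].gmfn = scratch_gmfn;
            redirected++;
        } else {
            /* Not in use: safe to drop it entirely. */
            table[i].gmfn = TOY_INVALID_GMFN;
            table[i].granted = 0;
            table[i].mapped_by_backend = 0;
        }
    }
    return redirected;
}
```

The return value is there so a caller could tell whether any backend
was still attached at reset time. What the model does not capture is
the hard part raised above: a real implementation would have to change
the backing frame underneath an already established mapping.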

> 
> > My thinking is that a lot of this code is shared with PV (and PVH),
> > so once this is fixed we could do full-scale 'kexec -e' in a PV
> > (or PVH) type guest. Doing dom0 kexec -e would be an interesting
> > experiment :-(
> 
> With some toolstack/Xen help you could probably destroy a domain without
> freeing its memory, create a new domain (reusing all the memory) and
> jump to the kexec image.

I was thinking of a potential 'snapshot' hypercall that the 'hvmloader'
(or SeaBIOS) would do. Then on kexec we would reset all of the states
back to this. But ..
> 
> For kdump use cases, pause on crash and then have a helper domain with
> permission to rummage through the crashed domain perform the crash
> dump/analysis.
> 
> I think something like this would be a lot easier than a purely in-guest
> kexec solution.

.. it goes against the symmetry of the hypercalls that would
reset and/or unbind.

And it sounds much more complex than implementing each of these
individually.
> 
> David