From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>,
Marc Zyngier <marc.zyngier@arm.com>,
Peter Maydell <peter.maydell@linaro.org>,
Wei Huang <wei@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Juan Quintela <quintela@redhat.com>,
Andrew Jones <drjones@redhat.com>
Subject: Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64
Date: Wed, 31 Jan 2018 15:23:56 +0000
Message-ID: <20180131152356.GH2521@work-vm>
In-Reply-To: <CAKv+Gu-NAcUTpEUXCLudHTCFAf=Yb28q3AET9d4ED9A9z7dMfQ@mail.gmail.com>
* Ard Biesheuvel (ard.biesheuvel@linaro.org) wrote:
> On 31 January 2018 at 09:53, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
> > On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> >> On 29/01/18 10:04, Peter Maydell wrote:
> >> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >> >> * Peter Maydell (peter.maydell@linaro.org) wrote:
> >> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >> >>>> * Peter Maydell (peter.maydell@linaro.org) wrote:
> >> >>>>> I think the correct fix here is that your test code should turn
> >> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> >> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
> >> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
> >> >>>>> Normal Cacheable, and then everything should work fine.
> >> >>>>
> >> >>>> Does this cause problems with migrating at just the wrong point during
> >> >>>> a VM boot?
> >> >>>
> >> >>> It wouldn't surprise me if it did, but I don't think I've ever
> >> >>> tried to provoke that problem...
> >> >>
> >> >> If you think it'll get the RAM contents wrong, it might be best to fail
> >> >> the migration if you can detect the cache is disabled in the guest.
> >> >
> >> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
> >> > in the guest's system registers, and refuse migration if it's off...
> >> >
> >> > (cc'd Marc, Christoffer to check that I don't have the wrong end
> >> > of the stick about how thin the ice is in the period before the
> >> > guest turns on its MMU...)
> >>
> >> Once MMU and caches are on, we should be in a reasonable place for QEMU
> >> to have a consistent view of the memory. The trick is to prevent the
> >> vcpus from changing that. A guest could perfectly turn off its MMU at
> >> any given time if it needs to (and it is actually required on some HW if
> >> you want to mitigate headlining CVEs), and KVM won't know about that.
> >>
> >
> > (Clarification: KVM can detect this if it bothers to check the VCPU's
> > system registers, but we don't trap to KVM when the VCPU turns off its
> > caches, right?)
> >
> >> You may have to pause the vcpus before starting the migration, or
> >> introduce a new KVM feature that would automatically pause a vcpu that
> >> is trying to disable its MMU while the migration is on. This would
> >> involve trapping all the virtual memory related system registers, with
> >> an obvious cost. But that cost would be limited to the time it takes to
> >> migrate the memory, so maybe that's acceptable.
> >>
> > Is that even sufficient?
> >
> > What if the following happened. (1) guest turns off MMU, (2) guest
> > writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
> > guest ram. QEMU's view of guest ram is now incorrect (stale,
> > incoherent, ...).
> >
> > I'm also not really sure if pausing one VCPU because it turned off its
> > MMU will go very well when trying to migrate a large VM (wouldn't this
> > ask for all the other VCPUs beginning to complain that the stopped VCPU
> > appears to be dead?). As a short-term 'fix' it's probably better to
> > refuse migration if you detect that a VCPU had begun turning off its
> > MMU.
> >
> > On the larger scale of things, this appears to me to be another case of
> > us really needing some way to coherently access memory between QEMU and
> > the VM, but in the case of the VCPU turning off the MMU prior to
> > migration, we don't even know where it may have written data, and I'm
> > therefore not really sure what the 'proper' solution would be.
> >
> > (cc'ing Ard who has thought about this problem before in the context
> > of UEFI and VGA.)
> >
>
> Actually, the VGA case is much simpler because the host is not
> expected to write to the framebuffer, only read from it, and the guest
> is not expected to create a cacheable mapping for it, so any
> incoherency can be trivially solved by cache invalidation on the host
> side. (Note that this has nothing to do with DMA coherency, but only
> with PCI MMIO BARs that are backed by DRAM in the host)
>
> In the migration case, it is much more complicated, and I think
> capturing the state of the VM in a way that takes incoherency between
> caches and main memory into account is simply infeasible (i.e., the
> act of recording the state of guest RAM via a cached mapping may evict
> clean cachelines that are out of sync, and so it is impossible to
> record both the cached *and* the delta with the uncached state)
>
> I wonder how difficult it would be to
> a) enable trapping of the MMU system register when a guest CPU is
> found to have its MMU off at migration time
> b) allow the guest CPU to complete whatever it thinks it needs to be
> doing with the MMU off
> c) once it re-enables the MMU, proceed with capturing the memory state
>
> Full disclosure: I know very little about KVM migration ...
The difficulty is that migration is 'live', i.e. the guest is running
while we're copying the data across. That means a guest might do any
of these MMU things multiple times, so if we wait for it to be right,
will it go back to being wrong? How long do you wait?
(It's not a bad hack if that's the best we can do though.)
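
To make the "how long do you wait?" question concrete, here is a toy
sketch in Python. It is not QEMU or KVM code. The function names and the
idea of feeding in pre-recorded register snapshots are assumptions made
purely for illustration; the only architectural facts used are that
SCTLR_EL1 bit 0 (M) enables the stage 1 MMU and bit 2 (C) enables data
caching.

```python
SCTLR_M = 1 << 0  # SCTLR_EL1.M: stage 1 MMU enable
SCTLR_C = 1 << 2  # SCTLR_EL1.C: data cacheability enable


def mmu_and_caches_on(sctlr):
    """Return True if both the MMU and data caching are enabled."""
    wanted = SCTLR_M | SCTLR_C
    return (sctlr & wanted) == wanted


def wait_for_mmu_on(samples, max_polls):
    """Poll per-vcpu SCTLR_EL1 snapshots with a bounded budget.

    `samples` is an iterable of lists, one SCTLR_EL1 value per vcpu.
    It stands in (hypothetically) for reading the registers from KVM.
    Returns the poll index at which every vcpu had MMU and caches on,
    or None if the budget ran out first.
    """
    for poll, sctlrs in enumerate(samples):
        if poll >= max_polls:
            break
        if all(mmu_and_caches_on(s) for s in sctlrs):
            return poll
    return None


# Two vcpus; the second only turns its MMU on at the second poll.
print(wait_for_mmu_on([[0x5, 0x0], [0x5, 0x5]], max_polls=4))
```

Even when this returns a poll index, nothing stops a vcpu from turning
its MMU off again immediately afterwards, which is exactly the race
described above; the bound only limits how long we are prepared to wait.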
Now of course 'live' itself sounds scary for consistency, but the only
thing we really require is that a page is marked dirty some time after
it's been written to, so that it gets sent again and we eventually
send a correct version. It's OK for us to send inconsistent versions
along the way as long as the final version we send is the right one.
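
The dirty-page requirement can be illustrated with a toy pre-copy loop.
This is a simplified model in Python, not how QEMU implements it: the
function name, the list-of-pages memory model, and the `threshold` and
`max_rounds` parameters are all assumptions for the sketch.

```python
def precopy_migrate(src, guest_writes_per_round, max_rounds=10, threshold=1):
    """Toy pre-copy loop over a list of page values.

    `guest_writes_per_round` holds, for each copy round, the (page, value)
    writes the still-running guest performs during that round. Every write
    marks its page dirty. Note that `src` is modified in place.
    """
    dest = [None] * len(src)
    dirty = set(range(len(src)))  # first pass: every page needs sending
    writes = iter(guest_writes_per_round)
    rounds = 0

    while len(dirty) > threshold and rounds < max_rounds:
        to_send, dirty = dirty, set()
        for page in sorted(to_send):
            dest[page] = src[page]  # may be stale by the end of the round
        # The guest keeps running: its writes land and are marked dirty.
        for page, value in next(writes, []):
            src[page] = value
            dirty.add(page)
        rounds += 1

    # Guest is stopped here; send whatever is still dirty.
    for page in dirty:
        dest[page] = src[page]
    return dest, rounds


src = [0, 0, 0, 0]
dest, rounds = precopy_migrate(src, [[(1, 7), (2, 8)], [(2, 9)]])
print(dest == src, rounds)
```

The destination holds stale copies of pages 1 and 2 after the first
round, and that is fine because both were marked dirty and resent. The
scheme only breaks if a write changes what the copier would read without
the page ever being marked dirty, or if the copier reads a different
view of memory than the one the guest wrote to.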
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK