From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>, David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Xen-devel List <xen-devel@lists.xen.org>,
	Paul Durrant <Paul.Durrant@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: Migration memory corruption - PV backends need to quiesce
Date: Fri, 27 Jun 2014 19:37:00 +0100
Message-ID: <53ADB9CC.4000309@citrix.com>
In-Reply-To: <20140627181500.GA26661@deinos.phlegethon.org>

On 27/06/14 19:15, Tim Deegan wrote:
> At 18:28 +0100 on 27 Jun (1403890088), David Vrabel wrote:
>> On 27/06/14 17:51, Andrew Cooper wrote:
>>> Overall, it would appear that there needs to be a hook for all PV
>>> drivers to force quiescence.  In particular, a backend must guarantee to
>>> unmap all active grant maps (so the frames get properly reflected in the
>>> dirty bitmap), and never process subsequent requests (so no new frames
>>> appear dirty in the bitmap after the guest has been paused).
>> I think this would be much too expensive for snapshots and things like
>> remus.  Waiting for all outstanding I/O could take seconds.
> The other option we talked about yesterday was a flag to the log-dirty
> operation that reports all grant-mapped frames as dirty.  Then the
> tools would add such frames to the final pass.  That could take a long
> time too, of course.
>
> I'm not sure how you would synchronize the final pass with backends
> that were doing grant copy operations -- you could exclude copies for
> the duration, but I'm not sure what that would look like for the
> backend.
>
> Tim.

Hmm - I have a crazy idea.

As identified by David, it is impractical to wait for backends to
complete any outstanding requests and unmap the grants, as this could
take seconds.

However, what the backend can do very quickly is guarantee that it will
never start processing any further requests, and never mark
subsequently-completed requests as complete in the ring.

This means that the backend will not submit any new grant copy
operations, or regular copies to/from persistent grants, and even if a
hardware device has a DMA mapping of an active grant, the request will
not be marked as completed in the ring. Even if the pages that are
eventually DMA'd end up dirty, the frontend will replay the uncompleted
requests in the ring and be mostly fine[1].
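
To make the proposed backend behaviour concrete, here is a minimal
sketch of the idea. It is a toy model and not code from any real
backend: struct backend, its fields and all the function names are
hypothetical, and it assumes requests complete in order. Quiescing only
flips a flag; in-flight I/O may still finish, but its completion is
never published, so the frontend sees those requests as outstanding and
replays them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of one backend's view of a shared ring.
 * None of these names come from the Xen headers. */
struct backend {
    uint32_t req_prod;  /* requests published by the frontend */
    uint32_t req_cons;  /* requests the backend has started processing */
    uint32_t done;      /* started requests whose I/O has finished */
    uint32_t rsp_prod;  /* completions made visible to the frontend */
    bool quiesced;
};

/* Cheap: sets a flag and waits for nothing. */
static void backend_quiesce(struct backend *b)
{
    b->quiesced = true;
}

/* Start at most one new request. Returns true if one was started. */
static bool backend_start_one(struct backend *b)
{
    if (b->quiesced || b->req_cons == b->req_prod)
        return false;
    b->req_cons++;
    return true;
}

/* The oldest in-flight request has finished (e.g. DMA completed). */
static void backend_io_done(struct backend *b)
{
    assert(b->done < b->req_cons);
    b->done++;
    if (!b->quiesced)
        b->rsp_prod = b->done;  /* publish the completion in the ring */
    /* When quiesced, the completion is deliberately never published. */
}

/* Requests the frontend still sees as outstanding and would replay. */
static uint32_t frontend_replay_count(const struct backend *b)
{
    return b->req_prod - b->rsp_prod;
}
```

With four requests in the ring, two started and one completed before
the quiesce, a second completion arriving afterwards leaves rsp_prod
at 1, so the frontend would replay three requests on resume.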

Combined with a XEN_DOMCTL_SHADOW_OP_PEEK_INCLUDING_ACTIVE_GRANTS (name
subject to improvement), the migration code can guarantee that there
will be no corruption of the ring, and no relevant corruption of guest
memory.
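
As a rough illustration of what such a peek operation would report,
the sketch below ORs the log-dirty bitmap with a bitmap of frames that
currently have active grant mappings. This is an assumption about the
semantics, not an existing interface: the domctl named above does not
exist yet, and struct frame_bitmap, NR_FRAMES and the helpers are all
made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_FRAMES     128  /* toy guest size, chosen for the example */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_FRAMES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical one-bit-per-frame bitmap. */
struct frame_bitmap {
    unsigned long w[NR_WORDS];
};

static void bitmap_mark(struct frame_bitmap *m, size_t pfn)
{
    assert(pfn < NR_FRAMES);
    m->w[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
}

static int bitmap_test(const struct frame_bitmap *m, size_t pfn)
{
    assert(pfn < NR_FRAMES);
    return (int)((m->w[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL);
}

/* Frames to resend in the final pass: anything logged as dirty, plus
 * anything a backend could still write to through an active grant. */
static void peek_including_active_grants(struct frame_bitmap *out,
                                         const struct frame_bitmap *dirty,
                                         const struct frame_bitmap *granted)
{
    for (size_t i = 0; i < NR_WORDS; i++)
        out->w[i] = dirty->w[i] | granted->w[i];
}
```

The point of the union is that a frame written by DMA through a grant
mapping never shows up in the log-dirty bitmap on its own, so it has to
be treated as dirty for as long as the mapping is active.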

I *believe* this covers all the cases, and doesn't depend on waiting for
the backends to fully complete all outstanding requests.

~Andrew

[1] The caveat is a pending read followed by a write of the same block
which, once replayed, might be out-of-order if the write did take effect
on the source side.  Any frontends which care about this must wait for
all write requests to complete before entering the suspend state.
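
For a frontend that does care about the ordering caveat in [1], the
check before entering the suspend state could look roughly like the
following. This is a sketch only: struct pending and
frontend_may_suspend() are hypothetical names, and a real frontend
would track its outstanding requests in its own way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_READ, OP_WRITE };

/* Hypothetical record of one request the frontend has issued. */
struct pending {
    enum op op;
    bool completed;  /* response seen in the ring */
};

/* Uncompleted reads are safe to replay; an uncompleted write is not,
 * because it may already have taken effect on the source side. */
static bool frontend_may_suspend(const struct pending *reqs, size_t n)
{
    assert(reqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (reqs[i].op == OP_WRITE && !reqs[i].completed)
            return false;
    return true;
}
```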
