Re: Postcopy with different page sizes on source and destination


From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Florian Schmidt <flosch@nutanix.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: Postcopy with different page sizes on source and destination
Date: Tue, 5 May 2020 18:57:09 +0100
Message-ID: <20200505175709.GB2813@work-vm>
In-Reply-To: <3fd5a3b0-5b1c-e123-b3d7-e8f91e3871c1@nutanix.com>

* Florian Schmidt (flosch@nutanix.com) wrote:
> Hi,

Hi Florian,

> with precopy live migration, change in page size on source and 
> destination is possible: using hugetlbfs memory backing for the VM on 
> the source and anonymous memory on the destination, and vice versa. For 
> postcopy migration, this is not allowed, and in fact checked during the 
> advise stage.
> 
> Is there any fundamental limitation in the design that prevents this, or 
> is it more that this is an additional complication that nobody has 
> implemented so far because there was no strong need for it?
> 
> It seems to me like this should be possible, and the comment in 
> loadvm_postcopy_handle_advise() (migration/savevm.c:1681) also seems to 
> suggest that; so I'll add a (very rough) first idea.

Yeh it was getting hairy enough at the time, so I kept that restriction.
Now let me just reload that from my brain from 3 years or so back....

> Please tell me if 
> I'm missing something important. The "background" copy is similar to 
> precopy, so the main difference is the userfaultfd page fault handling 
> on the destination, and requesting the correct memory from the source.

Right.

> 1. If the source has hugepages and the destination doesn't, then a page 
> fault would lead the destination to ask "I need these 4k of memory from 
> you to fill my page and handle the page fault". The source could then 
> answer "here you are, and here are these other 511 4k pages around it 
> (which form my 2M page; similarly for 1G pages), please deal with them 
> now". That way, even "release-ram" would still work on a (huge)page 
> granularity.

Yeh I think that's about right; you might have to watch out for cases
where the RAMBlock sizes are different because they've got rounded.
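
A minimal sketch of the address arithmetic for case 1, in Python rather
than QEMU's C. The function name, the `block_len` parameter, and the
clamping rule are assumptions made for illustration, not QEMU code: it
finds the source (huge)page that covers a faulting offset and clamps the
end to the block length, as one possible way to cope with RAMBlock sizes
that were rounded differently.

```python
KiB = 1024
MiB = 1024 * KiB


def source_page_span(fault_offset, src_page_size, dst_page_size, block_len):
    """Return (start, end) of the source page that covers fault_offset.

    Hypothetical helper: offsets are relative to the start of the RAMBlock,
    and `end` is clamped to block_len in case the block is shorter than a
    whole number of source pages.
    """
    if src_page_size % dst_page_size != 0:
        raise ValueError("source page size must be a multiple of the destination's")
    start = fault_offset - fault_offset % src_page_size
    end = min(start + src_page_size, block_len)
    return start, end


# Destination faults on one 4k page inside the source's second 2M page.
start, end = source_page_span(2 * MiB + 4 * KiB, 2 * MiB, 4 * KiB, block_len=8 * MiB)
extra_pages = (end - start) // (4 * KiB) - 1  # pages sent besides the faulting one
```

With these inputs the source would send the whole 2M page, i.e. the
faulting 4k page plus the 511 pages around it.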

> 2. If the destination has hugepages and the source doesn't, then the 
> above works similarly: now the destination, on a page fault, asks for a 
> larger memory area that corresponds to 512 (or more) pages on the 
> source. The only issue I could see here is during the initial phase, 
> when postcopy is switched on, to make sure that the source doesn't 
> release RAM that it has copied and thinks is clean, but is part of a 
> hugepage on the other side. That seems easy enough to solve though? And 
> indeed is probably already implemented for precopy migration to work 
> with different page sizes on source and destination and could be adapted 
> here.

Precopy doesn't have to worry about it because it doesn't have to clear
out previously partially sent pages.

You'd have a situation where the source thinks page p+8k is dirty so
sends a discard for that; the destination can't do that - so what does
it do?  You need to get the source to discard on the largest granularity
of source and destination.
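
As a rough sketch of that rounding (Python, with a hypothetical helper
name; it assumes both page sizes are powers of two, so the larger is a
multiple of the smaller), a discard range is widened outwards to the
larger of the two page sizes:

```python
KiB = 1024
MiB = 1024 * KiB


def widen_discard(start, length, src_page_size, dst_page_size):
    """Widen [start, start+length) to the larger of the two page sizes.

    Hypothetical helper, not QEMU code. Returns (new_start, new_length).
    """
    gran = max(src_page_size, dst_page_size)
    lo = start - start % gran          # round the start down
    end = start + length
    hi = -(-end // gran) * gran        # round the end up
    return lo, hi - lo


# Source (4k pages) sees one dirty page at p+8k, where p is 2M-aligned;
# the destination uses 2M pages, so the whole 2M page must be discarded.
p = 2 * MiB
discard = widen_discard(p + 8 * KiB, 4 * KiB, 4 * KiB, 2 * MiB)
```

Here `discard` covers the full 2M page starting at p, which the
destination can actually drop.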

The other problem you have here is making sure that the source really
does send all the pages continuously starting from the right point so
that they all end up in one chunk on the destination and it can perform
a place; for example, imagine the source is doing a background
page transfer and is currently at x+1MB. Now it gets a request
from the destination for page y, so it switches to transmitting 'y',
which, given the destination's request, it will probably transfer
in full - but x was only partially transmitted, which means
x won't have got placed on the destination.  I'd also worry about
whether the code on the source is OK if it gets a request for 'z'
while it's sending y, but it's probably ok because it has the
counter.
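
To make the partial-transfer problem concrete, here is a toy model in
Python. The `PlaceTracker` class is purely illustrative and is not how
QEMU tracks incoming pages: it records which source-sized pieces of each
destination page have arrived and reports a destination page as
placeable only once every piece is in. Page sizes are shrunk to 4k and
16k to keep the example short.

```python
KiB = 1024


class PlaceTracker:
    """Toy destination-side bookkeeping (hypothetical, not QEMU code)."""

    def __init__(self, src_page_size, dst_page_size):
        self.src = src_page_size
        self.dst = dst_page_size
        self.need = dst_page_size // src_page_size  # pieces per destination page
        self.pending = {}  # destination page base -> set of received piece indices

    def receive(self, offset):
        """Record one source page; return the base if the page is now complete."""
        base = offset - offset % self.dst
        got = self.pending.setdefault(base, set())
        got.add((offset - base) // self.src)
        if len(got) == self.need:
            del self.pending[base]
            return base
        return None


tracker = PlaceTracker(src_page_size=4 * KiB, dst_page_size=16 * KiB)

# Background transfer gets halfway through destination page x (base 0)...
x_results = [tracker.receive(off) for off in (0, 4 * KiB)]

# ...then a request for y (base 16k) arrives and y is sent in full.
y_results = [tracker.receive(16 * KiB + i * 4 * KiB) for i in range(4)]

x_still_pending = 0 in tracker.pending
```

In this run y becomes placeable when its last piece arrives, while x is
left half-received until the source goes back and sends the rest of it.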

(I'm not sure if there are any changes needed in postcopy recovery -
that was more recent).

Note it's not just hugepages either; you get aarch64 and POWER systems
that can have configurable base page sizes, but it's rare to mix
them.

Dave

> Cheers,
> Florian
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

