qemu-devel.nongnu.org archive mirror
From: Avi Kivity <avi@redhat.com>
To: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	t.hirofuchi@aist.go.jp, qemu-devel@nongnu.org,
	kvm@vger.kernel.org, satoshi.itoh@aist.go.jp
Subject: Re: [Qemu-devel] [RFC] postcopy livemigration proposal
Date: Mon, 08 Aug 2011 15:38:54 +0300
Message-ID: <4E3FD8DE.6060508@redhat.com>
In-Reply-To: <20110808032438.GC24764@valinux.co.jp>

On 08/08/2011 06:24 AM, Isaku Yamahata wrote:
> This mail is on "Yabusame: Postcopy Live Migration for Qemu/KVM"
> on which we'll give a talk at KVM-forum.
> The purpose of this mail is to let developers know about it in advance
> so that we can get better feedback on its design/implementation approach
> early, before we start implementing it.

Interesting; what is the impact of increased latency on memory reads?

>
>
> There are several design points.
>    - who takes care of pulling page contents.
>      an independent daemon vs a thread in qemu
>      The daemon approach is preferable because an independent daemon would
>      make it easy to debug the postcopy memory mechanism without qemu.
>      If required, it wouldn't be difficult to convert the daemon into
>      a thread in qemu.

Isn't this equivalent to touching each page in sequence?

Care must be taken that we don't post too many requests, or it could 
affect the latency of synchronous accesses by the guest.
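
One way to keep background pulls from delaying guest faults is to cap how
many background requests may be queued and to always serve demand requests
first. A minimal sketch of that idea; the `PageRequestQueue` class and the
cap of 4 are assumptions for illustration and do not come from the Yabusame
code:

```python
import heapq
import itertools

DEMAND, BACKGROUND = 0, 1  # lower value is served first


class PageRequestQueue:
    """Hypothetical request queue: demand faults outrank background pulls."""

    def __init__(self, max_background=4):
        self.max_background = max_background  # assumed cap, tune as needed
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within one priority
        self._background = 0           # background requests currently queued

    def post_demand(self, pfn):
        """Queue a request for a page the guest is blocked on."""
        heapq.heappush(self._heap, (DEMAND, next(self._seq), pfn))

    def post_background(self, pfn):
        """Queue a background pull; return False when the cap is reached."""
        if self._background >= self.max_background:
            return False
        self._background += 1
        heapq.heappush(self._heap, (BACKGROUND, next(self._seq), pfn))
        return True

    def pop(self):
        """Return (priority, pfn) of the next request to send to the source."""
        prio, _, pfn = heapq.heappop(self._heap)
        if prio == BACKGROUND:
            self._background -= 1
        return prio, pfn
```

With this shape a demand request posted after several background requests
is still popped first, and the background scanner backs off whenever
`post_background()` returns False.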

>
>    - connection between the source and the destination
>      The connection for live migration can be re-used after sending machine
>      state.
>
>    - transfer protocol
>      The existing protocol can be extended.
>
>    - hooking guest RAM access
>      Introduce a character device to handle page faults.
>      When a page fault occurs, it queues a page request to the user-space
>      daemon at the destination. The daemon pulls the page contents from the
>      source and serves them to the character device. Then the page fault is
>      resolved.

This doesn't play well with host swapping, transparent hugepages, or 
ksm, does it?

I see you note this later on.
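
The fault path in the quoted proposal (the fault queues a request, the
daemon pulls the page from the source, the fault resolves) can be modelled
in user space as follows. This is only a sketch: `DestinationRam` and
`fetch_page` are hypothetical names, a thread stands in for the daemon, and
a blocking `read()` stands in for the page fault.

```python
import queue
import threading

PAGE_SIZE = 4096


class DestinationRam:
    """Hypothetical stand-in for guest RAM behind the character device."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page    # callable that pulls a page from the source
        self._pages = {}                 # pfn -> bytes already present locally
        self._requests = queue.Queue()   # faults queued for the daemon
        self._lock = threading.Lock()
        threading.Thread(target=self._daemon, daemon=True).start()

    def _daemon(self):
        # Plays the user-space daemon: pull the page, serve it, wake the faulter.
        while True:
            pfn, done = self._requests.get()
            data = self._fetch_page(pfn)
            with self._lock:
                self._pages.setdefault(pfn, data)
            done.set()

    def read(self, pfn):
        with self._lock:
            page = self._pages.get(pfn)
        if page is None:                 # "page fault": block until served
            done = threading.Event()
            self._requests.put((pfn, done))
            done.wait()
            with self._lock:
                page = self._pages[pfn]
        return page
```

A page is pulled from the source only on first access; later reads of the
same page are served locally without going through the queue.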

> * More on hooking guest RAM access
> There are several candidates for the implementation. Our preference is
> the character device approach.
>
>    - inserting hooks everywhere in qemu/kvm
>      This is impractical.
>
>    - backing store for guest ram
>      A block device or a file can be used to back guest RAM,
>      thus hooking guest RAM accesses.
>
>      pros
>      - new device driver isn't needed.
>      cons
>      - future improvement would be difficult
>      - some KVM host features (KSM, THP) wouldn't work
>
>    - character device
>      qemu mmap()s the dedicated character device, whose driver then hooks
>      page faults.
>
>      pros
>      - straightforward approach
>      - future improvement would be easy
>      cons
>      - new driver is needed
>      - some KVM host features (KSM, THP) wouldn't work
>        They check whether a given VMA is anonymous. This can be fixed.
>
>    - swap device
>      When creating the guest, it is set up as if all the guest RAM is swapped
>      out to a dedicated swap device, which may be an nbd disk (or some kind of
>      user-space block device, BUSE?).
>      When the VM tries to access memory, swap-in is triggered and IO to the
>      swap device is issued. Then the IO to swap is routed to the daemon
>      in user space via the nbd protocol (or BUSE, AOE, iSCSI...). The daemon
>      pulls pages from the migration source and services the IO request.
>
>      pros
>      - After the page transfer is complete, everything is the same as the
>        normal case.
>      - no new device driver is needed
>      cons
>      - future improvement would be difficult
>      - administration: setting up nbd, swap device
>

Using a swap device would be my preference.  We'd still be using 
anonymous memory so thp/ksm/ordinary swap still work.

It would need to be a special kind of swap device since we only want to 
swap in, and never out, to that device.  We'd also need a special way of 
telling the kernel that memory comes from that device.  In that respect
it's similar to your second option.

Maybe we should use a backing file (using nbd) and have a madvise() call 
that converts the vma to anonymous memory once the migration is finished.
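
The madvise() conversion suggested above is only a proposal, so the sketch
below does not call it. It shows a rough user-space stand-in, assuming a
Unix host: map a backing file privately so pages are read from it on first
touch, then copy the contents into anonymous memory once everything has
arrived. The copy replaces the proposed in-place conversion, and
`load_from_backing_file` and `demo` are hypothetical names.

```python
import mmap
import tempfile

PAGE_SIZE = mmap.PAGESIZE


def load_from_backing_file(f, length):
    """Fault RAM contents in from an open backing file, then detach from it."""
    # MAP_PRIVATE: pages are read from the file on first touch, and writes
    # stay local instead of being written back to the file.
    backed = mmap.mmap(f.fileno(), length, flags=mmap.MAP_PRIVATE)
    # Stand-in for the proposed madvise(): copy into private anonymous memory.
    anon = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    anon[:] = backed[:]
    backed.close()
    return anon


def demo():
    # A temporary file plays the role of the nbd-backed file.
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * PAGE_SIZE + b"y" * PAGE_SIZE)
        f.flush()
        return load_from_backing_file(f, 2 * PAGE_SIZE)
```

Unlike the copy here, an in-kernel conversion would avoid touching every
page a second time.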

-- 
error compiling committee.c: too many arguments to function

