* Re: [Qemu-devel] Fwd: [PATCH v2 00/41] postcopy live migration
[not found] <4FCCA39A.1050300@hp.com>
@ 2012-06-04 13:13 ` Isaku Yamahata
2012-06-04 14:27 ` Chegu Vinod
0 siblings, 1 reply; 3+ messages in thread
From: Isaku Yamahata @ 2012-06-04 13:13 UTC (permalink / raw)
To: Chegu Vinod; +Cc: qemu-devel, kvm
On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
> Hello Isaku Yamahata,
Hi.
> I just saw your patches..Would it be possible to email me a tar bundle of these
> patches (makes it easier to apply the patches to a copy of the upstream qemu.git)
I uploaded them to github for those who are interested in it.
git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git linux-umem-june-04-2012
> BTW, I am also curious if you have considered using any kind of RDMA features for
> optimizing the page-faults during postcopy ?
Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?
Thus we can collaborate.
You may want to see Benoit's results. As far as I know, he has not published
his code yet.
thanks,
> Thanks
> Vinod
>
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 Jun 2012 18:57:02 +0900
> From: Isaku Yamahata<yamahata@valinux.co.jp>
> To: qemu-devel@nongnu.org, kvm@vger.kernel.org
> Cc: benoit.hudzia@gmail.com, aarcange@redhat.com, aliguori@us.ibm.com,
> quintela@redhat.com, stefanha@gmail.com, t.hirofuchi@aist.go.jp,
> dlaor@redhat.com, satoshi.itoh@aist.go.jp, mdroth@linux.vnet.ibm.com,
> yoshikawa.takuya@oss.ntt.co.jp, owasserm@redhat.com, avi@redhat.com,
> pbonzini@redhat.com
> Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
> Message-ID:<cover.1338802190.git.yamahata@valinux.co.jp>
>
> After a long time, we have v2. This is the qemu part.
> The Linux kernel part is sent separately.
>
> Changes v1 -> v2:
> - split up patches for review
> - buffered file refactored
> - many bug fixes
> Especially PV drivers can work with postcopy
> - optimization/heuristic
>
> Patches
> 1 - 30: refactoring existing code and preparation
> 31 - 37: implement postcopy itself (essential part)
> 38 - 41: some optimization/heuristic for postcopy
>
> Intro
> =====
> This patch series implements postcopy live migration.[1]
> As discussed at KVM Forum 2011, a dedicated character device is used for
> distributed shared memory between the migration source and destination.
> Now we can discuss/benchmark/compare with precopy. I believe there is
> much room for improvement.
>
> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>
>
> Usage
> =====
> You need to load the umem character device driver on the host before starting
> migration. Postcopy can be used with both the tcg and kvm accelerators. The
> implementation depends only on the Linux umem character device, and the
> driver-dependent code is split out into a separate file.
> I tested only the host page size == guest page size case, but the
> implementation allows the host page size != guest page size case.
>
> The following options are added with this patch series.
> - incoming part
> command line options
> -postcopy [-postcopy-flags <flags>]
> where flags changes the behavior for benchmarking/debugging.
> Currently the following flags are available:
> 0: default
> 1: enable touching page request
>
> example:
> qemu -postcopy -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
>
> - outgoing part
> options for migrate command
> migrate [-p [-n] [-m]] URI [<prefault forward> [<prefault backward>]]
> -p: indicate postcopy migration
> -n: disable background transferring pages: This is for benchmark/debugging
> -m: move background transfer of postcopy mode
> <prefault forward>: The number of forward pages which are sent along with
> the on-demand page
> <prefault backward>: The number of backward pages which are sent along with
> the on-demand page
> example:
> migrate -p -n tcp:<dest ip address>:4444
> migrate -p -n -m tcp:<dest ip address>:4444 32 0
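For illustration only (this is not code from the patch series): a minimal C sketch of how a prefault window around a faulted page might be computed from the <prefault forward> and <prefault backward> counts. The names prefault_window and struct prefault_range are hypothetical, and clamping the window to the bounds of a single RAM block is an assumption, not something the cover letter states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: inclusive range of page indexes to send. */
struct prefault_range {
    uint64_t first;
    uint64_t last;
};

/*
 * Hypothetical helper: given the index of the faulted page, return the
 * pages to send together with it. `forward` and `backward` play the role
 * of <prefault forward> and <prefault backward>. The window is clamped to
 * [0, nr_pages - 1] (assumption: a window never crosses a RAM block).
 */
static struct prefault_range prefault_window(uint64_t fault_page,
                                             uint64_t forward,
                                             uint64_t backward,
                                             uint64_t nr_pages)
{
    struct prefault_range r;
    uint64_t last_page;

    assert(nr_pages > 0 && fault_page < nr_pages);
    last_page = nr_pages - 1;

    /* Clamp at the start of the block without underflowing. */
    r.first = fault_page > backward ? fault_page - backward : 0;
    /* Clamp at the end of the block without overflowing. */
    r.last = (last_page - fault_page > forward) ? fault_page + forward
                                                : last_page;
    return r;
}
```

Under these assumptions, the "32 0" arguments in the second example would correspond to prefault_window(fault_page, 32, 0, nr_pages): the faulted page plus up to 32 pages after it and none before it.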
>
>
> TODO
> ====
> - benchmark/evaluation, especially how async page fault affects the result.
> - improvement/optimization
> At the moment, what I'm aware of is at least the following:
> - making the incoming socket non-blocking with a thread
> As page compression is coming, it is impractical to do non-blocking reads
> and check whether the necessary data has been read.
> - touching pages in the incoming qemu process by an fd handler seems suboptimal.
> creating a dedicated thread?
> - the outgoing handler seems suboptimal, causing latency.
> - consider the possibility of FUSE/CUSE
> - don't fork umemd, but create a thread?
>
> basic postcopy work flow
> ========================
> qemu on the destination
>       |
>       V
> open(/dev/umem)
>       |
>       V
> UMEM_INIT
>       |
>       V
> Here we have two file descriptors to
> umem device and shmem file
>       |
>       |                                   umemd
>       |                                   daemon on the destination
>       |
>       V    create pipe to communicate
> fork()---------------------------------------,
>       |                                      |
>       V                                      |
> close(socket)                                V
> close(shmem)                              mmap(shmem file)
>       |                                      |
>       V                                      V
> mmap(umem device) for guest RAM           close(shmem file)
>       |                                      |
> close(umem device)                           |
>       |                                      |
>       V                                      |
> wait for ready from daemon<----pipe-----send ready message
>       |                                      |
>       |                                   Here the daemon takes over
> send ok------------pipe---------------> the owner of the socket
>       |                                   to the source
>       V                                      |
> entering post copy stage                     |
> start guest execution                        |
>       |                                      |
>       V                                      V
> access guest RAM                          read() to get faulted pages
>       |                                      |
>       V                                      V
> page fault ------------------------------>page offset is returned
> block                                        |
>                                              V
>                                           pull page from the source
>                                           write the page contents
>                                           to the shmem.
>                                              |
>                                              V
> unblock<-----------------------------write() to tell served pages
> the fault handler returns the page
> page fault is resolved
>       |
>       |                                   pages can be sent
>       |                                   in the background
>       |                                      |
>       |                                      V
>       |                                   write()
>       |                                      |
>       V                                      V
> The specified pages<-----pipe------------request to touch pages
> are made present by                          |
> touching guest RAM.                          |
>       |                                      |
>       V                                      V
> reply-------------pipe-------------> release the cached page
>       |                                   madvise(MADV_REMOVE)
>       |                                      |
>       V                                      V
>
>          all the pages are pulled from the source
>
>       |                                      |
>       V                                      V
> the vma becomes anonymous<----------------UMEM_MAKE_VMA_ANONYMOUS
> (note: I'm not sure if this can be implemented or not)
>       |                                      |
>       V                                      V
> migration completes                        exit()
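The ready/ok exchange over the pipe in the work flow above can be sketched in plain C. This is only an illustrative stand-in, assuming two pipes and single-byte messages; it does not touch /dev/umem or the migration socket, and the names postcopy_handshake_demo, MSG_READY and MSG_OK are hypothetical rather than taken from the patches.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical one-byte messages exchanged over the pipes. */
enum { MSG_READY = 'R', MSG_OK = 'K' };

static int send_byte(int fd, char c)
{
    return write(fd, &c, 1) == 1 ? 0 : -1;
}

static int recv_byte(int fd, char expect)
{
    char c;
    return (read(fd, &c, 1) == 1 && c == expect) ? 0 : -1;
}

/* Returns 0 if parent ("qemu") and child ("umemd") complete the handshake. */
static int postcopy_handshake_demo(void)
{
    int to_daemon[2], from_daemon[2], status, rc = -1;
    pid_t pid;

    if (pipe(to_daemon) < 0) {
        return -1;
    }
    if (pipe(from_daemon) < 0) {
        close(to_daemon[0]);
        close(to_daemon[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(to_daemon[0]);
        close(to_daemon[1]);
        close(from_daemon[0]);
        close(from_daemon[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: stand-in for umemd. The real daemon would mmap the
         * shmem file before reporting that it is ready. */
        close(to_daemon[1]);
        close(from_daemon[0]);
        if (send_byte(from_daemon[1], MSG_READY) < 0 ||
            recv_byte(to_daemon[0], MSG_OK) < 0) {
            _exit(1);
        }
        /* From here the daemon would own the socket to the source. */
        _exit(0);
    }

    /* Parent: stand-in for qemu on the destination. */
    close(to_daemon[0]);
    close(from_daemon[1]);
    if (recv_byte(from_daemon[0], MSG_READY) == 0 &&
        send_byte(to_daemon[1], MSG_OK) == 0) {
        rc = 0; /* would now enter the postcopy stage and start the guest */
    }
    close(to_daemon[1]);
    close(from_daemon[0]);

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        rc = -1;
    }
    return rc;
}
```

Using one pipe per direction keeps each end strictly a reader or a writer, which mirrors the two arrows in the diagram. A thread-based variant, as raised in the TODO list, would need a different mechanism.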
>
>
>
>
> Isaku Yamahata (41):
> arch_init: export sort_ram_list() and ram_save_block()
> arch_init: export RAM_SAVE_xxx flags for postcopy
> arch_init/ram_save: introduce constant for ram save version = 4
> arch_init: refactor host_from_stream_offset()
> arch_init/ram_save_live: factor out RAM_SAVE_FLAG_MEM_SIZE case
> arch_init: refactor ram_save_block()
> arch_init/ram_save_live: factor out ram_save_limit
> arch_init/ram_load: refactor ram_load
> arch_init: introduce helper function to find ram block with id string
> arch_init: simplify a bit by ram_find_block()
> arch_init: factor out counting transferred bytes
> arch_init: factor out setting last_block, last_offset
> exec.c: factor out qemu_get_ram_ptr()
> exec.c: export last_ram_offset()
> savevm: export qemu_peek_buffer, qemu_peek_byte, qemu_file_skip
> savevm: qemu_pending_size() to return pending buffered size
> savevm, buffered_file: introduce method to drain buffer of buffered
> file
> QEMUFile: add qemu_file_fd() for later use
> savevm/QEMUFile: drop qemu_stdio_fd
> savevm/QEMUFileSocket: drop duplicated member fd
> savevm: rename QEMUFileSocket to QEMUFileFD, socket_close to fd_close
> savevm/QEMUFile: introduce qemu_fopen_fd
> migration.c: remove redundant line in migrate_init()
> migration: export migrate_fd_completed() and migrate_fd_cleanup()
> migration: factor out parameters into MigrationParams
> buffered_file: factor out buffer management logic
> buffered_file: Introduce QEMUFileNonblock for nonblock write
> buffered_file: add qemu_file to read/write to buffer in memory
> umem.h: import Linux umem.h
> update-linux-headers.sh: teach umem.h to update-linux-headers.sh
> configure: add CONFIG_POSTCOPY option
> savevm: add new section that is used by postcopy
> postcopy: introduce -postcopy and -postcopy-flags option
> postcopy outgoing: add -p and -n option to migrate command
> postcopy: introduce helper functions for postcopy
> postcopy: implement incoming part of postcopy live migration
> postcopy: implement outgoing part of postcopy live migration
> postcopy/outgoing: add forward, backward option to specify the size
> of prefault
> postcopy/outgoing: implement prefault
> migrate: add -m (movebg) option to migrate command
> migration/postcopy: add movebg mode
>
> Makefile.target | 5 +
> arch_init.c | 298 ++++---
> arch_init.h | 20 +
> block-migration.c | 8 +-
> buffered_file.c | 322 ++++++--
> buffered_file.h | 32 +
> configure | 12 +
> cpu-all.h | 9 +
> exec-obsolete.h | 1 +
> exec.c | 87 ++-
> hmp-commands.hx | 18 +-
> hmp.c | 10 +-
> linux-headers/linux/umem.h | 42 +
> migration-exec.c | 12 +-
> migration-fd.c | 25 +-
> migration-postcopy-stub.c | 77 ++
> migration-postcopy.c | 1771 +++++++++++++++++++++++++++++++++++++++
> migration-tcp.c | 25 +-
> migration-unix.c | 26 +-
> migration.c | 97 ++-
> migration.h | 47 +-
> qapi-schema.json | 4 +-
> qemu-common.h | 2 +
> qemu-file.h | 8 +-
> qemu-options.hx | 25 +
> qmp-commands.hx | 4 +-
> savevm.c | 177 ++++-
> scripts/update-linux-headers.sh | 2 +-
> sysemu.h | 4 +-
> umem.c | 364 ++++++++
> umem.h | 101 +++
> vl.c | 16 +-
> vmstate.h | 2 +-
> 33 files changed, 3373 insertions(+), 280 deletions(-)
> create mode 100644 linux-headers/linux/umem.h
> create mode 100644 migration-postcopy-stub.c
> create mode 100644 migration-postcopy.c
> create mode 100644 umem.c
> create mode 100644 umem.h
>
>
>
>
> ------------------------------
>
>
--
yamahata
* Re: [Qemu-devel] Fwd: [PATCH v2 00/41] postcopy live migration
2012-06-04 13:13 ` [Qemu-devel] Fwd: [PATCH v2 00/41] postcopy live migration Isaku Yamahata
@ 2012-06-04 14:27 ` Chegu Vinod
2012-06-04 15:13 ` Isaku Yamahata
0 siblings, 1 reply; 3+ messages in thread
From: Chegu Vinod @ 2012-06-04 14:27 UTC (permalink / raw)
To: Isaku Yamahata; +Cc: qemu-devel, kvm
On 6/4/2012 6:13 AM, Isaku Yamahata wrote:
> On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
>> Hello Isaku Yamahata,
> Hi.
>
>> I just saw your patches..Would it be possible to email me a tar bundle of these
>> patches (makes it easier to apply the patches to a copy of the upstream qemu.git)
> I uploaded them to github for those who are interested in it.
>
> git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
> git://github.com/yamahata/linux-umem.git linux-umem-june-04-2012
>
Thanks for the pointer...
>> BTW, I am also curious if you have considered using any kind of RDMA features for
>> optimizing the page-faults during postcopy ?
> Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?
I am looking at large guests (256GB and higher) running CPU/memory-intensive
enterprise workloads.
The concerns are the same, i.e. having a predictable total migration
time, minimal downtime/freeze time and, of course, minimal service
degradation to the workload(s) in the VM or the co-located VMs.
How large a guest have you tested your changes with, and what kind of
workloads have you used so far?
> Thus we can collaborate.
> You may want to see Benoit's results.
Yes, I have already seen some of Benoit's results.
Hence the question about use of RDMA techniques for post copy.
> As far as I know, he has not published
> his code yet.
Thanks
Vinod
* Re: [Qemu-devel] Fwd: [PATCH v2 00/41] postcopy live migration
2012-06-04 14:27 ` Chegu Vinod
@ 2012-06-04 15:13 ` Isaku Yamahata
0 siblings, 0 replies; 3+ messages in thread
From: Isaku Yamahata @ 2012-06-04 15:13 UTC (permalink / raw)
To: Chegu Vinod; +Cc: qemu-devel, kvm
On Mon, Jun 04, 2012 at 07:27:25AM -0700, Chegu Vinod wrote:
> On 6/4/2012 6:13 AM, Isaku Yamahata wrote:
>> On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
>>> Hello Isaku Yamahata,
>> Hi.
>>
>>> I just saw your patches..Would it be possible to email me a tar bundle of these
>>> patches (makes it easier to apply the patches to a copy of the upstream qemu.git)
>> I uploaded them to github for those who are interested in it.
>>
>> git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
>> git://github.com/yamahata/linux-umem.git linux-umem-june-04-2012
>>
>
> Thanks for the pointer...
>>> BTW, I am also curious if you have considered using any kind of RDMA features for
>>> optimizing the page-faults during postcopy ?
>> Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?
>
>
> Looking at large sized guests (256GB and higher) running cpu/memory
> intensive enterprise workloads.
> The concerns are the same...i.e. having a predictable total migration
> time, minimal downtime/freeze-time and of course minimal service
> degradation to the workload(s) in the VM or the co-located VM's...
>
> How large of a guest have you tested your changes with and what kind of
> workloads have you used so far ?
Only VMs of up to several GB. Of course we'd like to benchmark with a really
huge VM (several hundred GB), but that is somewhat difficult.
>> Thus we can collaborate.
>> You may want to see Benoit's results.
>
> Yes, I have already seen some of Benoit's results.
Great.
> Hence the question about use of RDMA techniques for post copy.
So far my implementation doesn't use RDMA.
>> As far as I know, he has not published
>> his code yet.
>
> Thanks
> Vinod
>
--
yamahata