References: <1464940366-9880-1-git-send-email-zhang.zhanghailiang@huawei.com> <20160728190704.GC2084@work-vm> <579AA62C.5000001@huawei.com> <579AA98C.4030002@cn.fujitsu.com>
From: Hailiang Zhang
Message-ID: <579AA985.10703@huawei.com>
Date: Fri, 29 Jul 2016 08:55:33 +0800
In-Reply-To: <579AA98C.4030002@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v17 00/34] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
To: Changlong Xie, "Dr. David Alan Gilbert"
Cc: peter.huangpeng@huawei.com, qemu-devel@nongnu.org, amit.shah@redhat.com, quintela@redhat.com, eddie.dong@intel.com, yunhong.jiang@intel.com, wency@cn.fujitsu.com, lizhijian@cn.fujitsu.com, arei.gonglei@huawei.com, stefanha@redhat.com, hongyang.yang@easystack.cn, zhangchen.fnst@cn.fujitsu.com, Jeff Cody, Kevin Wolf, Max Reitz, Jason Wang

On 2016/7/29 8:55, Changlong Xie wrote:
> On 07/29/2016 08:41 AM, Hailiang Zhang wrote:
>> On 2016/7/29 3:07, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> This is the 17th version of COLO FT feature.
>>>>
>>>> Here is only COLO frame part, you can get the whole codes from github:
>>>> https://github.com/coloft/qemu/commits/colo-v3.0-periodic-mode
>>>>
>>>> Migration now switches to use the new QIOChannel API. It only affects COLO's
>>>> patch 9 and patch 12, which we used the old qsb buffer before, and we updated
>>>> them with the new API. It's only involving tiny changes.
>>>
>>> I notice that the block code has nearly been accepted; so perhaps it's
>>> worth posting a version rebased on the current rc and then hopefully
>>> we can line this core code up for the start of 2.8 as soon as the block
>>> code lands.
>>>
>>
>> Yes, thanks for reminding, I'll post next version in the next few days.
>>
>
> Hi hailiang
>
> Just a warm reminder. Since colo framework bases on replication, and the
> replication feature is #optional. Maybe you need check
> CONFIG_REPLICATION in your new version.
>

OK, i noticed that, thanks.

> https://github.com//Pating/qemu/tree/block-replication-v24, see
> "configure: support replication"
>
> Thanks
> -Xie
>
>> Hailiang
>>
>>> Dave
>>>
>>>> Patch status:
>>>> Unreviewed: patch 32 ~ 35
>>>> Updated: patch 9, 12, 32
>>>>
>>>> Cc: Stefan Hajnoczi
>>>> Cc: Jeff Cody
>>>> Cc: Kevin Wolf
>>>> Cc: Max Reitz
>>>> Cc: Juan Quintela
>>>> Cc: Amit Shah
>>>> Cc: Dr. David Alan Gilbert
>>>> Cc: Jason Wang
>>>>
>>>> PS: These series has been in community for a long time, it depends on
>>>> Changlong's block-replicaton series, but that has been blocked for a long
>>>> time, we really need help on reviewing that and this series. Thanks.
>>>>
>>>> TODO:
>>>> 1. Checkpoint based on proxy in qemu
>>>> 2. The capability of continuous FT
>>>> 3. Optimize the VM's downtime during checkpoint
>>>>
>>>> v17:
>>>> - Rebase master to use the new QIOChannel API, only affect patch 9 and 12
>>>> - Reorganize some ugly comments
>>>> - Rename colo_sem to colo_exit_sem (patch 21)
>>>>
>>>> v16:
>>>> - Fix compile broken due to missing osdep.h
>>>> - Add reviewed-by tag for patch 27, 28, 29
>>>> - Rename the message send/receive helper function (patch 7, 13)
>>>> - Simplify the codes by using some notifier helpers in QEMU (patch 32)
>>>> - Remove the useless check in colo_add_buffer_filter() (patch 33)
>>>> - Remove the previous patch 36, 37 which export filter_buffer_flush()
>>>>   to release the buffered packets, we simplify it by stopping buffer
>>>>   filter while doing checkpoint, which will flush the buffered packets
>>>>   by default. (patch 34)
>>>> v15:
>>>> - Go on the shutdown process if encounter error while sending shutdown
>>>>   message to SVM. (patch 24)
>>>> - Rename qemu_need_skip_netfilter to qemu_netfilter_can_skip and Remove
>>>>   some useless comment. (patch 31, Jason)
>>>> - Call object_new_with_props() directly to add filter in
>>>>   colo_add_buffer_filter. (patch 34, Jason)
>>>> - Re-implement colo_set_filter_status() based on COLOBufferFilters
>>>>   list. (patch 35)
>>>> - Re-implement colo_flush_filter_packets() based on COLOBufferFilters
>>>>   list. (patch 37)
>>>> v14:
>>>> - Re-implement the network processing based on netfilter (Jason Wang)
>>>> - Rename 'COLOCommand' to 'COLOMessage'. (Markus's suggestion)
>>>> - Split two new patches (patch 27/28) from patch 29
>>>> - Fix some other comments from Dave and Markus.
>>>>
>>>> v13:
>>>> - Refactor colo_*_cmd helper functions to use 'Error **errp' parameter
>>>>   instead of return value to indicate success or failure. (patch 10)
>>>> - Remove the optional error message for COLO_EXIT event. (patch 25)
>>>> - Use semaphore to notify colo/colo incoming loop that failover work is
>>>>   finished. (patch 26)
>>>> - Move COLO shutdown related codes to colo.c file. (patch 28)
>>>> - Fix memory leak bug for colo incoming loop. (new patch 31)
>>>> - Re-use some existed helper functions to realize the process of
>>>>   saving/loading ram and device. (patch 32)
>>>> - Fix some other comments from Dave and Markus.
>>>>
>>>>
>>>> zhanghailiang (34):
>>>>   configure: Add parameter for configure to enable/disable COLO support
>>>>   migration: Introduce capability 'x-colo' to migration
>>>>   COLO: migrate colo related info to secondary node
>>>>   migration: Integrate COLO checkpoint process into migration
>>>>   migration: Integrate COLO checkpoint process into loadvm
>>>>   COLO/migration: Create a new communication path from destination to source
>>>>   COLO: Implement COLO checkpoint protocol
>>>>   COLO: Add a new RunState RUN_STATE_COLO
>>>>   COLO: Save PVM state to secondary side when do checkpoint
>>>>   COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
>>>>   ram/COLO: Record the dirty pages that SVM received
>>>>   COLO: Load VMState into buffer before restore it
>>>>   COLO: Flush PVM's cached RAM into SVM's memory
>>>>   COLO: Add checkpoint-delay parameter for migrate-set-parameters
>>>>   COLO: Synchronize PVM's state to SVM periodically
>>>>   COLO failover: Introduce a new command to trigger a failover
>>>>   COLO failover: Introduce state to record failover process
>>>>   COLO: Implement failover work for Primary VM
>>>>   COLO: Implement failover work for Secondary VM
>>>>   qmp event: Add COLO_EXIT event to notify users while exited from COLO
>>>>   COLO failover: Shutdown related socket fd when do failover
>>>>   COLO failover: Don't do failover during loading VM's state
>>>>   COLO: Process shutdown command for VM in COLO state
>>>>   COLO: Update the global runstate after going into colo state
>>>>   savevm: Introduce two helper functions for save/find loadvm_handlers entry
>>>>   migration/savevm: Add new helpers to process the different stages of loadvm
>>>>   migration/savevm: Export two helper functions for savevm process
>>>>   COLO: Separate the process of saving/loading ram and device state
>>>>   COLO: Split qemu_savevm_state_begin out of checkpoint process
>>>>   filter-buffer: Accept zero interval
>>>>   net: Add notifier/callback for netdev init
>>>>   COLO/filter: Add each netdev a buffer filter
>>>>   COLO: Control the status of buffer filters for PVM
>>>>   COLO: Add block replication into colo process
>>>>
>>>>  configure                     |  11 +
>>>>  docs/qmp-events.txt           |  16 +
>>>>  hmp-commands.hx               |  15 +
>>>>  hmp.c                         |  15 +
>>>>  hmp.h                         |   1 +
>>>>  include/exec/ram_addr.h       |   1 +
>>>>  include/migration/colo.h      |  43 +++
>>>>  include/migration/failover.h  |  33 ++
>>>>  include/migration/migration.h |  16 +
>>>>  include/net/filter.h          |   2 +
>>>>  include/net/net.h             |   3 +
>>>>  include/sysemu/sysemu.h       |   9 +
>>>>  migration/Makefile.objs       |   2 +
>>>>  migration/colo-comm.c         |  79 ++++
>>>>  migration/colo-failover.c     |  84 +++++
>>>>  migration/colo.c              | 844 ++++++++++++++++++++++++++++++++++++++++++
>>>>  migration/migration.c         |  84 ++++-
>>>>  migration/ram.c               | 175 ++++++++-
>>>>  migration/savevm.c            | 114 ++++--
>>>>  net/filter-buffer.c           |  12 -
>>>>  net/net.c                     |  12 +
>>>>  qapi-schema.json              | 102 ++++-
>>>>  qapi/event.json               |  15 +
>>>>  qmp-commands.hx               |  24 +-
>>>>  stubs/Makefile.objs           |   1 +
>>>>  stubs/migration-colo.c        |  55 +++
>>>>  trace-events                  |   8 +
>>>>  vl.c                          |  31 +-
>>>>  28 files changed, 1744 insertions(+), 63 deletions(-)
>>>>  create mode 100644 include/migration/colo.h
>>>>  create mode 100644 include/migration/failover.h
>>>>  create mode 100644 migration/colo-comm.c
>>>>  create mode 100644 migration/colo-failover.c
>>>>  create mode 100644 migration/colo.c
>>>>  create mode 100644 stubs/migration-colo.c
>>>>
>>>> --
>>>> 1.8.3.1
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK