From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37976)
	by lists.gnu.org with esmtp (Exim 4.71) id 1bKNaj-0007du-84
	for qemu-devel@nongnu.org; Tue, 05 Jul 2016 06:28:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1bKNae-0008Mq-8v
	for qemu-devel@nongnu.org; Tue, 05 Jul 2016 06:28:32 -0400
Received: from szxga03-in.huawei.com ([119.145.14.66]:56419)
	by eggs.gnu.org with esmtp (Exim 4.71) id 1bKNad-0008Lq-9t
	for qemu-devel@nongnu.org; Tue, 05 Jul 2016 06:28:28 -0400
References: <1452169208-840-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<577B1238.7040605@huawei.com>
From: Hailiang Zhang
Message-ID: <577B8BA7.6010001@huawei.com>
Date: Tue, 5 Jul 2016 18:27:51 +0800
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC 00/13] Live memory snapshot based on userfaultfd
To: Baptiste Reynal
Cc: peter.huangpeng@huawei.com, aarcange@redhat.com, qemu list,
	hanweidong@huawei.com, Juan Quintela, dgilbert@redhat.com, Amit Shah,
	Christian Pinto

On 2016/7/5 17:57, Baptiste Reynal wrote:
> On Tue, Jul 5, 2016 at 3:49 AM, Hailiang Zhang wrote:
>> On 2016/7/4 20:22, Baptiste Reynal wrote:
>>>
>>> On Thu, Jan 7, 2016 at 1:19 PM, zhanghailiang wrote:
>>>>
>>>> For now, we still don't support live memory snapshots. We discussed
>>>> a scheme based on userfaultfd a long time ago.
>>>> You can find the discussion at the following link:
>>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01779.html
>>>>
>>>> The scheme is based on userfaultfd's write-protect capability.
>>>> The userfaultfd write protection feature is available here:
>>>> http://www.spinics.net/lists/linux-mm/msg97422.html
>>>>
>>>> The process of this live memory snapshot scheme is as follows:
>>>> 1. Pause the VM.
>>>> 2. Enable write-protect fault notification by using userfaultfd to
>>>>    mark the VM's memory as write-protected (read-only).
>>>> 3. Save the VM's static state (here, the device state) to the snapshot file.
>>>> 4. Resume the VM; the VM continues to run.
>>>> 5. The snapshot thread begins to save the VM's live state (here, RAM) into
>>>>    the snapshot file.
>>>> 6. During this time, every write to the VM's memory is blocked
>>>>    by the kernel, and the kernel wakes up the fault-handling thread in
>>>>    QEMU to process the write-protect fault. The fault-handling thread
>>>>    delivers the page's address to the snapshot thread.
>>>> 7. The snapshot thread gets this address, saves the page into the snapshot
>>>>    file, and then removes the write protection by using the userfaultfd
>>>>    API. After that, the blocked write proceeds.
>>>> 8. Repeat steps 5-7 until all of the VM's memory is saved to the snapshot file.
>>>>
>>>> Compared with the 'migrate VM's state to file' feature,
>>>> the main difference is that a live memory snapshot captures the VM's
>>>> state with little delay. It captures the VM's state at the moment the
>>>> user issues the snapshot command, just like taking a photo of the VM's
>>>> state.
>>>>
>>>> For now, we only support the TCG accelerator, since userfaultfd does not
>>>> support tracking write faults for KVM.
>>>>
>>>> Usage:
>>>> 1.
Take a snapshot
>>>> #x86_64-softmmu/qemu-system-x86_64 -machine
>>>> pc-i440fx-2.5,accel=tcg,usb=off -drive
>>>> file=/mnt/windows/win7_install.qcow2.bak,if=none,id=drive-ide0-0-1,format=qcow2,cache=none
>>>> -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -vnc :7 -m
>>>> 8192 -smp 1 -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0
>>>> --monitor stdio
>>>> Issue the snapshot command:
>>>> (qemu)migrate -d file:/home/Snapshot
>>>> 2. Revert to the snapshot
>>>> #x86_64-softmmu/qemu-system-x86_64 -machine
>>>> pc-i440fx-2.5,accel=tcg,usb=off -drive
>>>> file=/mnt/windows/win7_install.qcow2.bak,if=none,id=drive-ide0-0-1,format=qcow2,cache=none
>>>> -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -vnc :7 -m
>>>> 8192 -smp 1 -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0
>>>> --monitor stdio -incoming file:/home/Snapshot
>>>>
>>>> NOTE:
>>>> The userfaultfd write protection feature does not support THP for now.
>>>> Before taking a snapshot, please disable THP with:
>>>> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>>>
>>>> TODO:
>>>> - Reduce the impact on the VM while taking a snapshot
>>>>
>>>> zhanghailiang (13):
>>>>   postcopy/migration: Split fault related state into struct
>>>>     UserfaultState
>>>>   migration: Allow the migrate command to work on file: urls
>>>>   migration: Allow -incoming to work on file: urls
>>>>   migration: Create a snapshot thread to realize saving memory snapshot
>>>>   migration: implement initialization work for snapshot
>>>>   QEMUSizedBuffer: Introduce two help functions for qsb
>>>>   savevm: Split qemu_savevm_state_complete_precopy() into two helper
>>>>     functions
>>>>   snapshot: Save VM's device state into snapshot file
>>>>   migration/postcopy-ram: fix some helper functions to support
>>>>     userfaultfd write-protect
>>>>   snapshot: Enable the write-protect notification capability for VM's
>>>>     RAM
>>>>   snapshot/migration: Save VM's RAM into snapshot file
>>>>   migration/ram:
Fix some helper functions' parameter to use
>>>>     PageSearchStatus
>>>>   snapshot: Remove page's write-protect and copy the content during
>>>>     setup stage
>>>>
>>>>  include/migration/migration.h     |  41 +++++--
>>>>  include/migration/postcopy-ram.h  |   9 +-
>>>>  include/migration/qemu-file.h     |   3 +-
>>>>  include/qemu/typedefs.h           |   1 +
>>>>  include/sysemu/sysemu.h           |   3 +
>>>>  linux-headers/linux/userfaultfd.h |  21 +++-
>>>>  migration/fd.c                    |  51 ++++++++-
>>>>  migration/migration.c             | 101 ++++++++++++++++-
>>>>  migration/postcopy-ram.c          | 229 ++++++++++++++++++++++++++++----------
>>>>  migration/qemu-file-buf.c         |  61 ++++++++++
>>>>  migration/ram.c                   | 104 ++++++++++++-----
>>>>  migration/savevm.c                |  90 ++++++++++++---
>>>>  trace-events                      |   1 +
>>>>  13 files changed, 587 insertions(+), 128 deletions(-)
>>>>
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>>>
>>>
>>
>> Hi,
>>
>>> Hi Hailiang,
>>>
>>> Can I get the status of this patch series? I cannot find a v2.
>>
>>
>> Yes, I haven't updated it for a long time. It is based on the userfault-wp
>> API in the kernel, and Andrea didn't update the related patches until
>> recently. I will update this series in the next one or two weeks. But it
>> will only support TCG until the userfault-wp API supports KVM.
>>
>
> May I have a pointer to those patches? The last I found is
> http://thread.gmane.org/gmane.linux.kernel.mm/141647 and it seems
> pretty old.
>

Yes, Andrea has updated it but has not released it publicly. I have
forwarded his email to you; please see that email.

>>> About the TCG limitation, is KVM support on a TODO list or is there a
>>> strong technical barrier?
>>>
>>
>> I don't think there are any technical hurdles. I would like to
>> try implementing the KVM part to support userfault-wp,
>> but I'm a little busy with other things now.
>> Andrea may have a plan to achieve it.
>>
>> To: Andrea Arcangeli
>>
>> Thanks,
>> Hailiang
>>
>
> Ok, if it is not on Andrea's schedule I am willing to take the action,
> at least for ARM/ARM64 support.
>

Hmm, great. If you can participate, we can speed up the development process.

Thanks,
Hailiang

> Regards,
> Baptiste
>
>>> Thanks,
>>> Baptiste
>>>
>>> .
>>>
>>
>
> .
>
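
The ordering in steps 5-8 of the quoted cover letter (save a page before
the first write to it is allowed through) can be modelled without
userfaultfd at all. What follows is a minimal sketch in Python, not code
from the patch series: SnapshotModel and its method names are hypothetical,
RAM is a plain list, and the write-protect fault is simulated by a set
lookup instead of a kernel notification.

```python
class SnapshotModel:
    """Toy model: pages are write-protected when the snapshot starts, and a
    write to a still-protected page forces that page to be saved first."""

    def __init__(self, pages):
        self.ram = list(pages)    # guest RAM, one entry per page
        self.protected = set()    # indexes of pages still write-protected
        self.snapshot = {}        # page index -> content saved to the "file"

    def start(self):
        # Steps 1-4: pause, write-protect all RAM, save device state, resume.
        self.protected = set(range(len(self.ram)))
        self.snapshot = {}

    def _save_page(self, idx):
        # Step 7: save the page, then remove its write protection.
        self.snapshot[idx] = self.ram[idx]
        self.protected.discard(idx)

    def guest_write(self, idx, value):
        # Step 6: a write to a protected page "faults"; the old content is
        # saved before the write is allowed to proceed.
        if idx in self.protected:
            self._save_page(idx)
        self.ram[idx] = value

    def background_step(self):
        # Step 5: the snapshot thread saves one page that is not yet saved.
        if not self.protected:
            return False
        self._save_page(min(self.protected))
        return True

    def finished(self):
        # Step 8 ends when no page is left write-protected.
        return not self.protected


vm = SnapshotModel(["a", "b", "c", "d"])
vm.start()
vm.background_step()      # background save of page 0
vm.guest_write(2, "C")    # fault: page 2 is saved with its old content
vm.guest_write(0, "A")    # page 0 already saved, the write goes straight through
while vm.background_step():
    pass
image = [vm.snapshot[i] for i in range(len(vm.ram))]
```

In this model the saved image holds the contents as of vm.start(), while
vm.ram carries the later guest writes, which is the "photo of the VM's
state" behaviour the cover letter describes.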