From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] Are there plans to achieve ram live Snapshot feature?
Date: Wed, 14 Aug 2013 09:54:21 +0800
Message-ID: <520AE34D.8000002@linux.vnet.ibm.com>
References: <33FB050264B7AD4DBD6583581F2E03104B764728@nkgeml511-mbx.china.huawei.com> <20130812095903.GF29880@stefanha-thinkpad.redhat.com> <232DEBC1058FA4A5BD76D16A@Ximines.local> <CAJSP0QU3HYdN+FiQY-RtM1N1kkWfL0Y=KbrWj-qYk+TtS-6+Rw@mail.gmail.com> <52099FA3.6010207@linux.vnet.ibm.com> <CAJSP0QW-aM7EyEtPuQfjp+FRp4aObZen3Pu2nF9TE5A4F7LRgw@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Anthony Liguori <aliguori@us.ibm.com>, kvm <kvm@vger.kernel.org>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Chijianchun <chijianchun@huawei.com>,
	Avi Kivity <avi@redhat.com>, Alex Bligh <alex@alex.org.uk>,
	fred.konrad@greensocs.com, Paul Brook <paul@codesourcery.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
In-Reply-To: <CAJSP0QW-aM7EyEtPuQfjp+FRp4aObZen3Pu2nF9TE5A4F7LRgw@mail.gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 2013-8-13 16:21, Stefan Hajnoczi wrote:
> On Tue, Aug 13, 2013 at 4:53 AM, Wenchao Xia <xiawenc@linux.vnet.ibm.com> wrote:
>> On 2013-8-12 19:33, Stefan Hajnoczi wrote:
>>
>>> On Mon, Aug 12, 2013 at 12:26 PM, Alex Bligh <alex@alex.org.uk> wrote:
>>>>
>>>> --On 12 August 2013 11:59:03 +0200 Stefan Hajnoczi <stefanha@gmail.com>
>>>> wrote:
>>>>
>>>>> The idea that was discussed on qemu-devel@nongnu.org uses fork(2) to
>>>>> capture the state of guest RAM and then send it back to the parent
>>>>> process.  The guest is only paused for a brief instant during fork(2)
>>>>> and can continue to run afterwards.
>>>>
>>>>
>>>>
>>>> How would you capture the state of emulated hardware which might not
>>>> be in the guest RAM?
>>>
>>>
>>> Exactly the same way vmsave works today.  It calls the device's save
>>> functions which serialize state to file.
>>>
>>> The difference between today's vmsave and the fork(2) approach is that
>>> QEMU does not need to wait for guest RAM to be written to file before
>>> resuming the guest.
>>>
>>> Stefan
>>>
>>    I have a worry about what glib says:
>>
>> "On Unix, the GLib mainloop is incompatible with fork(). Any program
>> using the mainloop must either exec() or exit() from the child without
>> returning to the mainloop. "
>
> This is fine, the child just writes out the memory pages and exits.
> It never returns to the glib mainloop.
>
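
To make the fork(2) idea above concrete, here is a minimal sketch, not QEMU code. It assumes guest RAM is an ordinary private buffer; snapshot_ram(), RAM_SIZE and the output path are names made up for illustration. The child sees the buffer as it was at fork() time through copy-on-write, writes it out, and leaves with _exit() so it never returns to the caller's main loop.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RAM_SIZE (4 * 4096)  /* hypothetical stand-in for guest RAM */

/*
 * Fork a child that writes `ram` to `path`.  The parent returns at once
 * with the child's pid (or -1 if fork() failed) and may keep modifying
 * `ram`; the child still sees the contents as of the fork().
 */
static pid_t snapshot_ram(const unsigned char *ram, size_t len,
                          const char *path)
{
    assert(ram != NULL && path != NULL);

    pid_t pid = fork();
    if (pid != 0) {
        return pid;          /* parent, or fork() failure */
    }

    /* Child: only plain system calls, then _exit(). */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        _exit(1);
    }
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, ram + off, len - off);
        if (n < 0) {
            _exit(1);        /* EINTR handling omitted for brevity */
        }
        off += (size_t)n;
    }
    if (close(fd) != 0) {
        _exit(1);
    }
    _exit(0);                /* never return to the parent's main loop */
}
```

The parent would later collect the child with waitpid() and check its exit status. Every page the parent writes while the child is still running gets duplicated by the kernel, which is where the extra memory use comes from.
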
>>    There is another way to do it: intercept the writes in kvm.ko (or other
>> kernel code). Since the key is intercepting memory changes, we can do
>> that in userspace in TCG mode and add the missing part for KVM mode.
>> Another benefit of this approach is that memory usage can be
>> controlled. For example, use ioctl() to set up a fixed-size buffer in
>> which the kernel code keeps the intercepted write data, which avoids
>> frequent switches back to the userspace QEMU code. When the buffer is
>> full, always return to the userspace QEMU code and let it save the data
>> to disk. I haven't checked exactly how Intel guest mode handles page
>> faults, so I can't estimate the performance cost of switching between
>> guest mode and root mode, but it should not be worse than fork().
>
> The fork(2) approach is portable, covers both KVM and TCG, and doesn't
> require kernel changes.  A kvm.ko kernel change also won't be
> supported on existing KVM hosts.  These are big drawbacks and the
> kernel approach would need to be significantly better than plain old
> fork(2) to make it worthwhile.
>
> Stefan
>
   I think the advantage is that memory usage is predictable, so a peak in
memory usage can be avoided by always saving the changed pages first.
fork() does not know which pages have changed. I am not sure whether this
would be a serious issue when most of the server's memory is already in
use, for example a 24G host running two 11G guests to provide powerful
virtual servers.
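
To illustrate what I mean by a fixed-size buffer, here is a small userspace simulation, not kvm.ko code. All names (struct snap, guest_write(), LOG_SLOTS and so on) are hypothetical, and the "snapshot file" is just an array. Before a page is modified for the first time, its old content goes into a bounded log; when the log is full it is flushed, so the extra memory never exceeds LOG_SLOTS pages. With copy-on-write after fork(), by contrast, the worst case for one 11G guest is another 11G of copied pages if the guest dirties everything before the child finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SNAP_PAGE_SIZE 4096
#define NUM_PAGES      8    /* toy guest RAM: 8 pages */
#define LOG_SLOTS      2    /* bounded buffer: at most 2 saved pages */

struct snap {
    unsigned char ram[NUM_PAGES][SNAP_PAGE_SIZE]; /* live guest RAM */
    unsigned char out[NUM_PAGES][SNAP_PAGE_SIZE]; /* stand-in for the file */
    bool saved[NUM_PAGES];                        /* already captured? */
    unsigned char log[LOG_SLOTS][SNAP_PAGE_SIZE]; /* old page contents */
    size_t log_page[LOG_SLOTS];                   /* page number per slot */
    size_t log_used;
    size_t flushes;                               /* "returns to userspace" */
};

static void snap_begin(struct snap *s)
{
    memset(s->saved, 0, sizeof(s->saved));
    s->log_used = 0;
    s->flushes = 0;
}

/* Drain the bounded log into the snapshot output. */
static void snap_flush(struct snap *s)
{
    if (s->log_used == 0) {
        return;
    }
    for (size_t i = 0; i < s->log_used; i++) {
        memcpy(s->out[s->log_page[i]], s->log[i], SNAP_PAGE_SIZE);
    }
    s->log_used = 0;
    s->flushes++;
}

/* Intercepted write: save the old page first, then apply the change. */
static void guest_write(struct snap *s, size_t page, size_t off,
                        unsigned char val)
{
    assert(page < NUM_PAGES && off < SNAP_PAGE_SIZE);
    if (!s->saved[page]) {
        if (s->log_used == LOG_SLOTS) {
            snap_flush(s);          /* buffer full: hand data over */
        }
        memcpy(s->log[s->log_used], s->ram[page], SNAP_PAGE_SIZE);
        s->log_page[s->log_used++] = page;
        s->saved[page] = true;
    }
    s->ram[page][off] = val;
}

/* Finish: drain the log, then copy the pages that were never written. */
static void snap_finish(struct snap *s)
{
    snap_flush(s);
    for (size_t p = 0; p < NUM_PAGES; p++) {
        if (!s->saved[p]) {
            memcpy(s->out[p], s->ram[p], SNAP_PAGE_SIZE);
            s->saved[p] = true;
        }
    }
}
```

The price is one interception for the first write to each page, and a flush each time the buffer fills up, which is the cost I cannot estimate yet for KVM mode.
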

-- 
Best Regards

Wenchao Xia