From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <50DBB52E.40708@linux.vnet.ibm.com>
Date: Thu, 27 Dec 2012 10:40:46 +0800
From: Wenchao Xia
References: <50D2CC22.1060900@linux.vnet.ibm.com> <50D3C8AF.7050506@linux.vnet.ibm.com> <87623vl0bb.fsf@elfo.mitica> <50D934A3.1030902@linux.vnet.ibm.com>
In-Reply-To: <50D934A3.1030902@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC] lively write vmstate with predictable size
To: quintela@redhat.com
Cc: Anthony Liguori, Pavel Hrdina, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

>
> Hi, Juan
>   Thank you for reviewing this; I have some questions below.
>
>> Wenchao Xia wrote:
>>> Resent the mail to the mailing list.
>>> -------------------
>>>
>>> Hi, Paolo and Juan
>>>   Currently savevm needs to pause the VM, and I am working on making
>>> it live. For flexibility I'd like to split the functionality as
>>> follows:
>>> 1) live snapshot, internal/external
>>> 2) live vmstate save, internal/external
>>> 3) assemble them at will
>>>
>>> 1) was sent at
>>> http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg02393.html
>>>
>>> but for 2), I think there is a problem because the file may grow to
>>> a size out of control. Considering the migration code, I'd like to
>>> propose a way to fix it as follows:
>>>
>>> Migration logic:
>>> Src send->dest recv->data analysis->copy data
>>> Savevm logic:
>>> Src send->write data to qcow2.
>>>
>>> My suggestion:
>>> Savevm logic:
>>> Src send->dest recv->data analysis->write data to qcow2/external with
>>> addr.
>>>
>>> The idea is to do the write after data analysis, and to overwrite
>>> old data if the address overlaps. So this will need qcow2 to support
>>> writing snapshot data at an "address", and also some changes to the
>>> savevm logic.
>>
>> We could change the code to do it, but it is not going to be
>> trivial. Furthermore we should disable xbzrle (in this case, it just
>> makes no sense).
>>
>> The easiest way that I can think of is changing arch_init.c:
>>
>> We can:
>> a- reserve a big enough area in the qcow2 image/external to store the
>>    whole ram
>> b- change ram_save_block() on that file to directly write to the "right"
>>    position inside the image.
>> c- change ram_load() to understand the new format.
>>
>   I agree with your way, since it saves the computation for data
> analysis. But why should a big area be reserved, and why is a new
> format needed? I thought everything would be the same as in the old
> vmstate save case with the VM stopped, if qemu can overwrite the
> contents in the right place and leave the remaining tags unchanged.
>

Hi, Juan
  After reconsideration, I think preallocation is indeed needed;
otherwise we need a file format in which a write to a position beyond
the end of the file extends the file to that position (maybe a
filesystem that supports holes can do it?). I suppose the following
would make things relatively easy:
1) At start, qemu makes sure the space is preallocated or that the file
   supports holes.
2) Calculate the vmstate size, and arrange the contents linearly
   together with the communication headers, that is, what is in
   save_block_hdr() and other key stamps.
3) Write the data live, with headers, to the arranged positions.

  This way the qcow2 internal snapshot format need not be changed, at
the cost of some space used to store the communication stamps, which I
think would not be much. And if the code is written carefully, with
later removal of the stamps in mind, it would be easy to drop the stamps
later as a new snapshot format, provided a flag is added in the code.
But before all of that, a good arrangement of the vmstate is needed.
What do you think about it? :)

>   How about writing a new function to replace save_block_hdr() and the
> buffer write that follows it, an abstract function used to write data
> with an address? It would still encapsulate the save_block_hdr() logic,
> but in the savevm case it would write the data directly to the right
> place. This way nothing changes at the vmstate content level; it still
> stores the same stream as if the VM were stopped.
>
>> Notice that this would have to be a different implementation than
>> sending data over tcp, as there we want (something) similar to the
>> current code.
>>
>   I think it would be a choice between performance and code unification;
> personally I guess performance is the first thing to consider for qemu.
>
>>
>>> Could you give some comments on this to see if it is workable?
>>
>> It needs lots of code rearrangement as far as I can think; it is doable
>> but it is quite a bit of work.
>>
>> Later, Juan.
>>
>
>

-- 
Best Regards

Wenchao Xia
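
P.S. To make the three steps proposed above (preallocate or rely on
holes, fix a linear layout, write each block to its arranged position)
a bit more concrete, here is a minimal sketch. It is only an
illustration under assumptions: it uses a plain POSIX file rather than
a qcow2 image, and slot_offset(), vmstate_file_open(),
write_block_at(), BLOCK_SIZE and HDR_SIZE are hypothetical names, not
existing qemu code. The 8-byte header merely stands in for whatever
save_block_hdr() would emit.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sizes: one RAM block plus a fixed per-block header
 * ("stamp"). Real vmstate headers are not fixed-size like this. */
#define BLOCK_SIZE 4096
#define HDR_SIZE   8
#define SLOT_SIZE  (HDR_SIZE + BLOCK_SIZE)

/* Step 2: linear layout. Block i always lives at the same offset. */
static off_t slot_offset(uint64_t index)
{
    return (off_t)index * SLOT_SIZE;
}

/* Step 1: create the file and size it up front. On filesystems that
 * support holes, ftruncate() leaves the unwritten range sparse.
 * Returns a file descriptor, or -1 on error. */
static int vmstate_file_open(const char *path, uint64_t nb_blocks)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, slot_offset(nb_blocks)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 3: write header + data into the block's slot. Writing the same
 * index again overwrites the old copy, so the file does not grow.
 * Returns 0 on success, -1 on error or short write. */
static int write_block_at(int fd, uint64_t index, uint64_t hdr,
                          const void *data)
{
    uint8_t buf[SLOT_SIZE];

    assert(data != NULL);
    memcpy(buf, &hdr, HDR_SIZE);
    memcpy(buf + HDR_SIZE, data, BLOCK_SIZE);

    ssize_t n = pwrite(fd, buf, SLOT_SIZE, slot_offset(index));
    return n == SLOT_SIZE ? 0 : -1;
}
```

A caller would open the file once with the number of blocks computed
from the vmstate size, then call write_block_at() every time a block is
dirtied again. The file size stays at nb_blocks * SLOT_SIZE no matter
how many times a block is rewritten, which is the "predictable size"
property discussed in this thread.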