From: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
To: quintela@redhat.com
Cc: Anthony Liguori <aliguori@us.ibm.com>,
Pavel Hrdina <phrdina@redhat.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
qemu-devel <qemu-devel@nongnu.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Dietmar Maurer <dietmar@proxmox.com>
Subject: Re: [Qemu-devel] [RFC] lively write vmstate with predictable size
Date: Thu, 27 Dec 2012 10:40:46 +0800
Message-ID: <50DBB52E.40708@linux.vnet.ibm.com>
In-Reply-To: <50D934A3.1030902@linux.vnet.ibm.com>
> Hi, Juan
> Thank you for reviewing this; I have some questions below.
>
>> Wenchao Xia <xiawenc@linux.vnet.ibm.com> wrote:
>>> Resent the mail to the mailing list.
>>> -------------------
>>>
>>> Hi, Paolo and Juan
>>> Currently savevm needs to pause the VM, and I am working on making
>>> it live. For flexibility, I'd like to split the functionality as
>>> follows:
>>> 1) take a snapshot live, internal/external
>>> 2) save the vmstate live, internal/external
>>> 3) assemble them at will
>>>
>>> 1) was sent at
>>> http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg02393.html
>>>
>>> but for 2), I think there is a problem because the file may grow to
>>> an unpredictable size. Considering the migration code, I'd like to
>>> propose a way to fix it, as follows:
>>>
>>> Migration logic:
>>> Src send->dest recv->data analysis->copy data
>>> Savevm logic:
>>> Src send->write data to qcow2.
>>>
>>> My suggestion:
>>> Savevm logic:
>>> Src send->dest recv->data analysis->write data to qcow2/external with
>>> addr.
>>>
>>> The idea is to do the write operation after data analysis, and to
>>> overwrite old data if the address overlaps. So this will need qcow2
>>> support for writing snapshot data at an "address", and also some
>>> changes to the savevm logic.
>>
>> We could change the code to do it, but it is not going to be
>> trivial. Furthermore we should disable xbzrle (in this case, it just
>> makes no sense)
>>
>> The easiest way that I can think of is changing the
>> arch_init.c:
>>
>> We can:
>> a- reserve a big enough area in the qcow2 image/external to store the
>> whole ram
>> b- change ram_save_block() on that file to directly write to the "right"
>> position inside the image.
>> c- change ram_load() to understand the new format.
>>
> I agree with your approach, since it saves the computation for data
> analysis. But why must a big area be reserved, and why is a new format
> needed? I thought everything would be the same as in the old vmstate
> save case with the VM stopped, if qemu can overwrite the contents in
> the right place and leave the remaining tags unchanged.
>
Hi, Juan
After reconsideration, I think preallocation is indeed needed.
Otherwise we need a file format in which a write to a position beyond
the end of the file extends the file to that position (maybe a
filesystem that supports holes can do this?).
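To illustrate the hole case: on POSIX systems, a pwrite() at an offset
past the end of a file extends the file to cover the written range, and
on filesystems that support sparse files the skipped range is a hole
that reads back as zeros. A minimal sketch, not QEMU code;
write_past_eof() and the temporary file path are made up for this
example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): write 'len' bytes at
 * 'offset' into a fresh temporary file and return the resulting file
 * size, or -1 on error.
 */
static off_t write_past_eof(off_t offset, const void *buf, size_t len)
{
    char path[] = "/tmp/vmstate-hole-XXXXXX";
    struct stat st;
    off_t size = -1;
    int fd;

    assert(buf != NULL && offset >= 0);

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);   /* the file goes away once fd is closed */

    /* Writing beyond EOF extends the file up to offset + len. */
    if (pwrite(fd, buf, len, offset) == (ssize_t)len &&
        fstat(fd, &st) == 0) {
        size = st.st_size;
    }
    close(fd);
    return size;
}
```

Calling write_past_eof(1 << 20, "abcd", 4) should report a file size of
1 MiB plus 4 bytes, although only the last 4 bytes were written.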
I suppose the following would make things relatively easy:
1) At start, qemu makes sure that the space is preallocated or that the
file supports holes.
2) Calculate the vmstate size, and arrange the contents linearly,
together with the communication headers, that is, what is written by
save_block_hdr() and the other key stamps.
3) Write the data live, with its headers, to the arranged positions.
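The three steps above could be sketched roughly as below. This is only
an illustration under assumptions, not QEMU code: the slot layout (an
8-byte header followed by a 4 KiB page), the constants and all function
names are hypothetical, and the header value is just a stand-in for
what save_block_hdr() would emit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum {
    SLOT_HDR_SIZE  = 8,      /* assumption: one 64-bit header word per page */
    SLOT_PAGE_SIZE = 4096,   /* assumption: 4 KiB guest pages */
    SLOT_SIZE      = SLOT_HDR_SIZE + SLOT_PAGE_SIZE
};

/* Step 2: size of the region to preallocate for 'num_pages' pages. */
static uint64_t ram_region_size(uint64_t num_pages)
{
    return num_pages * SLOT_SIZE;
}

/* Step 2: fixed file offset of the slot (header + data) of one page. */
static uint64_t page_slot_offset(uint64_t region_start, uint64_t page_index)
{
    return region_start + page_index * SLOT_SIZE;
}

/*
 * Step 3: write one page, with its header, into its slot. Saving the
 * same page again overwrites the same bytes, so the file never grows
 * past the precomputed size. Returns 0 on success, -1 on error.
 */
static int save_page_at(int fd, uint64_t region_start, uint64_t num_pages,
                        uint64_t page_index, const uint8_t *page)
{
    uint8_t slot[SLOT_SIZE];
    uint64_t hdr = page_index * SLOT_PAGE_SIZE;  /* stand-in header */
    ssize_t n;

    assert(page_index < num_pages);
    memcpy(slot, &hdr, SLOT_HDR_SIZE);
    memcpy(slot + SLOT_HDR_SIZE, page, SLOT_PAGE_SIZE);
    n = pwrite(fd, slot, sizeof(slot),
               (off_t)page_slot_offset(region_start, page_index));
    return n == (ssize_t)sizeof(slot) ? 0 : -1;
}
```

With this layout the total size is known before the first write, which
is what makes step 1 (preallocation, or relying on holes) possible.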
In this way the qcow2 internal snapshot format need not be changed, at
the cost of some space used to store the communication stamps, which I
think would not be much. And if this approach is coded carefully, with
the later removal of the stamps in mind, it would be easy to drop the
stamps in a new snapshot format later, if a flag is added in the code.
But before all of that, a good arrangement of the vmstate is needed.
What do you think about it? :)
> How about writing a new function to replace save_block_hdr() and the
> buffer write that follows it, an abstract function used to write data
> with an address? It would still encapsulate the save_block_hdr() logic,
> but in the savevm case it would write the data directly to the right
> place. In this way nothing changes at the vmstate contents level: it
> still stores the same stream as if the VM were stopped.
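One possible shape for such an abstract write function, as a rough
sketch only: PageSink, put_page and the two backends below are
hypothetical names, not existing QEMU interfaces, and an in-memory
buffer stands in for the socket or the image file. The header written
by the stream backend is simply the address, not the real
save_block_hdr() encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sink interface; none of these names exist in QEMU. */
typedef struct PageSink PageSink;
struct PageSink {
    /* Store 'len' bytes that belong to guest address 'addr'. */
    int (*put_page)(PageSink *s, uint64_t addr,
                    const uint8_t *buf, size_t len);
    uint8_t *mem;   /* buffer standing in for a socket or a file */
    size_t cap;     /* capacity of 'mem' */
    size_t used;    /* bytes occupied so far (high-water mark) */
};

/* Migration-style backend: append a header (the address), then the data. */
static int stream_put_page(PageSink *s, uint64_t addr,
                           const uint8_t *buf, size_t len)
{
    if (len > s->cap || s->used > s->cap - len ||
        s->cap - len - s->used < sizeof(addr)) {
        return -1;
    }
    memcpy(s->mem + s->used, &addr, sizeof(addr));
    memcpy(s->mem + s->used + sizeof(addr), buf, len);
    s->used += sizeof(addr) + len;
    return 0;
}

/* Savevm-style backend: place the data at 'addr', overwriting in place. */
static int addressed_put_page(PageSink *s, uint64_t addr,
                              const uint8_t *buf, size_t len)
{
    if (addr > s->cap || len > s->cap - addr) {
        return -1;
    }
    memcpy(s->mem + addr, buf, len);
    if (addr + len > s->used) {
        s->used = addr + len;
    }
    return 0;
}

/* The caller is identical for both backends. */
static int save_page(PageSink *s, uint64_t addr,
                     const uint8_t *buf, size_t len)
{
    assert(s != NULL && s->put_page != NULL);
    return s->put_page(s, addr, buf, len);
}
```

Saving the same page twice grows the stream backend by two records,
while the addressed backend keeps the same size, which is the property
wanted for a predictable vmstate size.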
>
>> Notice that this would have to be a different implementation than
>> sending data over tcp, as there we want (something) similar to the
>> current code.
>>
> I think it would be a choice between performance and code unification;
> personally I guess performance is the first thing to consider for qemu.
>
>>
>>> Could you give some comments on this to see if it is workable?
>>
>> It needs lots of code rearrangement as far as I can think, it is doable
>> but it is quite a bit of work.
>>
>> Later, Juan.
>>
>
>
--
Best Regards
Wenchao Xia