* [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Wenchao Xia @ 2013-03-15 7:24 UTC
To: Juan Quintela, Eric Blake, Dietmar Maurer, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf, qemu-devel

Hi, Juan and guys,
  I'd like to add a new way to save vmstate, which will be based on the
migration thread, but will write contents to block images, instead
of to an fd as a stream. Following are two ways to add the API:

1 add parameters to the migrate interface, and a new type of uri:
image:[VMSTATE_SAVE_IMAGE]

##
# @MigrateImageOptions:
#
# Options for migration to image.
#
# @path: the full path to the image to be used.
# @use-existing: #optional, whether to use the existing image in path. If
#                not specified, qemu will try to create a new image.
# @create-size: #optional, the image's virtual size at creation. Only
#               valid when use-existing is false or absent, unit is M.
# @fmt: #optional, the format of the image. If not specified, when
#       use-existing is true, qemu will try to detect the image format;
#       when use-existing is false or absent, qcow2 format will be
#       used.
# @stream: #optional, whether to save vmstate as a stream, in which case
#          small writes are reduced but the size may continue growing. If
#          not specified, vmstate will be saved with fixed size.
#
# Since: 1.5
##
{ 'type': 'MigrateImageOptions',
  'data': { 'path': 'str', '*use-existing': 'bool',
            '*create-size': 'int', '*fmt': 'str',
            '*stream': 'bool' } }

##
# @migrate
#
# Migrates the current running guest to another Virtual Machine.
#
# @uri: the Uniform Resource Identifier of the destination VM
#
# @blk: #optional do block migration (full disk copy)
#
# @inc: #optional incremental disk copy migration
#
# @detach: this argument exists only for compatibility reasons and
#          is ignored by QEMU
#
# @image-options: #optional, the options used in migration to image.
#                 Only valid in migration to image.
#
# Returns: nothing on success
#
# Since: 0.14.0
##
{ 'command': 'migrate',
  'data': {'uri': 'str', '*blk': 'bool', '*inc': 'bool',
           '*detach': 'bool', '*image-options': 'MigrateImageOptions'} }

  In this way query-migrate and migrate incoming could be naturally used
for querying and restoring, but it introduces some options that apply
only to image migration.

2 new command vmstate-save with the above options. Then use query-migrate
and migrate incoming to query/restore the states, which seems wild.

  I can't decide which is better, could you take a look and put some
comments on this?

-- 
Best Regards

Wenchao Xia

^ permalink raw reply [flat|nested] 19+ messages in thread
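For illustration, a request using option 1 might be built as in the sketch below. This only mirrors the schema proposed in this RFC: the image: URI and the image-options argument are a proposal, not part of any released QEMU, and the assumption that the image path follows "image:" in the URI, the helper name build_migrate_to_image and the example path are all made up for the example.

```python
import json

def build_migrate_to_image(path, use_existing=None, create_size_mb=None,
                           fmt=None, stream=None):
    """Build a QMP 'migrate' request using the options proposed in the RFC.

    Hypothetical helper: 'image-options' and the 'image:' URI exist only
    in this proposal.
    """
    options = {"path": path}
    # Optional members are left out when unset, matching the '*name'
    # (optional) markers in the proposed schema.
    if use_existing is not None:
        options["use-existing"] = use_existing
    if create_size_mb is not None:
        options["create-size"] = create_size_mb   # unit is M per the RFC
    if fmt is not None:
        options["fmt"] = fmt
    if stream is not None:
        options["stream"] = stream
    return {
        "execute": "migrate",
        "arguments": {"uri": "image:" + path, "image-options": options},
    }

if __name__ == "__main__":
    cmd = build_migrate_to_image("/tmp/vmstate.qcow2",
                                 create_size_mb=4096, fmt="qcow2")
    print(json.dumps(cmd, indent=2))
```

With option 2 the same options dictionary would presumably be passed to the new vmstate-save command instead.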
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Stefan Hajnoczi @ 2013-03-15 14:51 UTC
To: Wenchao Xia
Cc: Kevin Wolf, Juan Quintela, qemu-devel, Paolo Bonzini, Dietmar Maurer

On Fri, Mar 15, 2013 at 03:24:38PM +0800, Wenchao Xia wrote:
> I'd like to add a new way to save vmstate, which will be based on the
> migration thread, but will write contents to block images, instead
> of to an fd as a stream. Following are two ways to add the API:

Hi Wenchao,
What use cases are there besides saving vmstate to a raw image?

I'm curious if you're proposing this since there is no "file:" URI or
because you really want to do things like saving vmstate into a qcow2
file or over NBD.

Stefan
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Wenchao Xia @ 2013-03-18 6:40 UTC
To: Stefan Hajnoczi
Cc: Kevin Wolf, Juan Quintela, qemu-devel, Paolo Bonzini, Dietmar Maurer

On 2013-3-15 22:51, Stefan Hajnoczi wrote:
> On Fri, Mar 15, 2013 at 03:24:38PM +0800, Wenchao Xia wrote:
>> I'd like to add a new way to save vmstate, which will be based on the
>> migration thread, but will write contents to block images, instead
>> of to an fd as a stream. Following are two ways to add the API:
>
> Hi Wenchao,
> What use cases are there besides saving vmstate to a raw image?
>
> I'm curious if you're proposing this since there is no "file:" URI or
> because you really want to do things like saving vmstate into a qcow2
> file or over NBD.
>
> Stefan
>
Hi, Stefan
  The most common cases would be "raw" and "qcow2", which is flexible and
can be chosen by the user. In this way, existing block layer features in
qemu can be used, such as tagging zeros. I haven't checked the
buffer/cache status in the qemu block layer, but if there is one, it can
also benefit.
  The user can specify "raw" or "qcow2" according to the host
configuration. If there are dedicated underlying storage components, he
can use "raw" to skip qemu's block layer.

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Kevin Wolf @ 2013-03-18 9:04 UTC
To: Wenchao Xia
Cc: Juan Quintela, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

On 18.03.2013 at 07:40, Wenchao Xia wrote:
> On 2013-3-15 22:51, Stefan Hajnoczi wrote:
> > On Fri, Mar 15, 2013 at 03:24:38PM +0800, Wenchao Xia wrote:
> >> I'd like to add a new way to save vmstate, which will be based on the
> >> migration thread, but will write contents to block images, instead
> >> of to an fd as a stream. Following are two ways to add the API:
> >
> > Hi Wenchao,
> > What use cases are there besides saving vmstate to a raw image?
> >
> > I'm curious if you're proposing this since there is no "file:" URI or
> > because you really want to do things like saving vmstate into a qcow2
> > file or over NBD.
> >
> > Stefan
> >
> Hi, Stefan
>   The most common cases would be "raw" and "qcow2", which is flexible and
> can be chosen by the user. In this way, existing block layer features in
> qemu can be used, such as tagging zeros. I haven't checked the
> buffer/cache status in the qemu block layer, but if there is one, it can
> also benefit.
>   The user can specify "raw" or "qcow2" according to the host
> configuration. If there are dedicated underlying storage components, he
> can use "raw" to skip qemu's block layer.

Oh, seems I misread this then. I thought this was about internal live
snapshots, which is a feature that I consider really useful. I'm not so
sure if saving the VM state as the disk contents of a qcow2 image is
really helpful.

If zero clusters help a lot, then there's clearly something to improve
in the migration protocol, because it shouldn't send so many zeros in
the first place.

Kevin
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Paolo Bonzini @ 2013-03-18 10:08 UTC
To: Kevin Wolf
Cc: Juan Quintela, Stefan Hajnoczi, qemu-devel, Dietmar Maurer, Wenchao Xia

On 18/03/2013 10:04, Kevin Wolf wrote:
> Oh, seems I misread this then. I thought this was about internal live
> snapshots, which is a feature that I consider really useful. I'm not so
> sure if saving the VM state as the disk contents of a qcow2 image is
> really helpful.
>
> If zero clusters help a lot, then there's clearly something to improve
> in the migration protocol, because it shouldn't send so many zeros in
> the first place.

Zero pages are sent as a single 9-byte entry (64 bits for the address
and flags, 8 for the zero).

I don't expect the migration stream to have a single zero cluster, since
every page is prefixed by the 64 bits for the address and flags.
Furthermore, the RAM data would be horribly unaligned because of this.
15-20% of sectors or so would be read twice, since reading each page (4104
bytes including the address and flags) would span 10 sectors (5120 bytes).

Paolo
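The alignment arithmetic in Paolo's message can be checked with a short sketch. It rests on simplifying assumptions that are not in the thread: records of 8 + 4096 bytes are packed back to back with nothing else in between, sectors are 512 bytes, and each page is read on its own. Under those assumptions a record touches 9 or 10 sectors depending on where it starts, so the exact share of re-read sectors depends on the real stream layout.

```python
SECTOR = 512
PAGE = 4096
HEADER = 8                    # 64-bit address-and-flags word before each page
RECORD = HEADER + PAGE        # 4104 bytes, as in the message above

def sectors_touched(offset, length, sector=SECTOR):
    """Number of sectors overlapped by the byte range [offset, offset+length)."""
    first = offset // sector
    last = (offset + length - 1) // sector
    return last - first + 1

def read_overhead(n_records, start=0):
    """Return (sectors read page by page, distinct sectors holding the data)."""
    touched = sum(sectors_touched(start + i * RECORD, RECORD)
                  for i in range(n_records))
    distinct = sectors_touched(start, n_records * RECORD)
    return touched, distinct

if __name__ == "__main__":
    touched, distinct = read_overhead(64)
    extra = touched - distinct
    print(f"{touched} sector reads for {distinct} sectors of data "
          f"({100.0 * extra / distinct:.1f}% read twice)")
```

A record that starts in the last few bytes of a sector, for example at offset 505, touches 10 sectors, which is the 5120-byte case mentioned in the message.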
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Wenchao Xia @ 2013-03-18 10:50 UTC
To: Paolo Bonzini
Cc: Kevin Wolf, Juan Quintela, Stefan Hajnoczi, qemu-devel, Dietmar Maurer

On 2013-3-18 18:08, Paolo Bonzini wrote:
> On 18/03/2013 10:04, Kevin Wolf wrote:
>> Oh, seems I misread this then. I thought this was about internal live
>> snapshots, which is a feature that I consider really useful. I'm not so
>> sure if saving the VM state as the disk contents of a qcow2 image is
>> really helpful.
>>
>> If zero clusters help a lot, then there's clearly something to improve
>> in the migration protocol, because it shouldn't send so many zeros in
>> the first place.
>
> Zero pages are sent as a single 9-byte entry (64 bits for the address
> and flags, 8 for the zero).
>
> I don't expect the migration stream to have a single zero cluster, since
> every page is prefixed by the 64 bits for the address and flags.
> Furthermore, the RAM data would be horribly unaligned because of this.
> 15-20% of sectors or so would be read twice, since reading each page (4104
> bytes including the address and flags) would span 10 sectors (5120 bytes).
>
> Paolo
>
  I think in the streaming case, zero pages will be handled well. I use
qcow2 mainly for the fseek() case, which may have some zero holes inside.

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Wenchao Xia @ 2013-03-18 10:47 UTC
To: Kevin Wolf
Cc: Stefan Hajnoczi, Paolo Bonzini, qemu-devel, Dietmar Maurer, Juan Quintela

On 2013-3-18 17:04, Kevin Wolf wrote:
> On 18.03.2013 at 07:40, Wenchao Xia wrote:
>> On 2013-3-15 22:51, Stefan Hajnoczi wrote:
>>> On Fri, Mar 15, 2013 at 03:24:38PM +0800, Wenchao Xia wrote:
>>>> I'd like to add a new way to save vmstate, which will be based on the
>>>> migration thread, but will write contents to block images, instead
>>>> of to an fd as a stream. Following are two ways to add the API:
>>>
>>> Hi Wenchao,
>>> What use cases are there besides saving vmstate to a raw image?
>>>
>>> I'm curious if you're proposing this since there is no "file:" URI or
>>> because you really want to do things like saving vmstate into a qcow2
>>> file or over NBD.
>>>
>>> Stefan
>>>
>> Hi, Stefan
>>   The most common cases would be "raw" and "qcow2", which is flexible and
>> can be chosen by the user. In this way, existing block layer features in
>> qemu can be used, such as tagging zeros. I haven't checked the
>> buffer/cache status in the qemu block layer, but if there is one, it can
>> also benefit.
>>   The user can specify "raw" or "qcow2" according to the host
>> configuration. If there are dedicated underlying storage components, he
>> can use "raw" to skip qemu's block layer.
>
> Oh, seems I misread this then. I thought this was about internal live
> snapshots, which is a feature that I consider really useful. I'm not so
> sure if saving the VM state as the disk contents of a qcow2 image is
> really helpful.
>
  Actually I am leaving internal live snapshot as a 2nd step since there
is a bit more work to do when using the migration thread, since SPICE is
handled in migration but not in internal snapshot. The main purpose is
getting a standalone vmstate file with limited size, since internal
snapshot currently lacks an API to drop vmstate at any time (better to
have an API to export vmstate/delta block data).

> If zero clusters help a lot, then there's clearly something to improve
> in the migration protocol, because it shouldn't send so many zeros in
> the first place.
>
  In the streaming case, zeros are well encoded now I think, but when it
uses fseek(), there may be some zeros inside, and small writes. Handling
those is likely the block layer's job. By using an image I can directly
use qemu's block layer with qcow2 format, or use raw if there is an
underlying component, which makes it flexible.

> Kevin
>

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Stefan Hajnoczi @ 2013-03-18 10:09 UTC
To: Wenchao Xia
Cc: Kevin Wolf, Juan Quintela, qemu-devel, Paolo Bonzini, Dietmar Maurer

On Mon, Mar 18, 2013 at 02:40:50PM +0800, Wenchao Xia wrote:
> On 2013-3-15 22:51, Stefan Hajnoczi wrote:
> > On Fri, Mar 15, 2013 at 03:24:38PM +0800, Wenchao Xia wrote:
> >> I'd like to add a new way to save vmstate, which will be based on the
> >> migration thread, but will write contents to block images, instead
> >> of to an fd as a stream. Following are two ways to add the API:
> >
> > Hi Wenchao,
> > What use cases are there besides saving vmstate to a raw image?
> >
> > I'm curious if you're proposing this since there is no "file:" URI or
> > because you really want to do things like saving vmstate into a qcow2
> > file or over NBD.
> >
> > Stefan
> >
> Hi, Stefan
>   The most common cases would be "raw" and "qcow2", which is flexible and
> can be chosen by the user. In this way, existing block layer features in
> qemu can be used, such as tagging zeros. I haven't checked the
> buffer/cache status in the qemu block layer, but if there is one, it can
> also benefit.

Okay, thanks for explaining.

You can use caching with the BDRV_O_CACHE_WB option. Then you need to
call bdrv_co_flush() to ensure data reaches the disk. The advantage of
caching is that I/O patterns with many small unaligned writes may be
much faster when going through the host's page cache - and reads can
also be faster.

You can bypass the host page cache with BDRV_O_CACHE_WB |
BDRV_O_NOCACHE. Here bdrv_co_flush() calls are still necessary to ensure
data reaches the disk.

Stefan
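The write-back pattern Stefan describes (many small writes absorbed by a cache, followed by an explicit flush) has a rough user-space analogue, sketched below. This is only an analogy and does not use QEMU's block layer: plain buffered file I/O stands in for the cached image, os.fsync() plays the role that bdrv_co_flush() plays in QEMU, and the function name and record sizes are made up for the example.

```python
import os
import tempfile

def write_small_records(path, records):
    """Write many small records through the cache, then flush once.

    Analogy only: the host page cache absorbs the small unaligned writes,
    and nothing is guaranteed to be on disk until the explicit flush.
    """
    with open(path, "wb") as f:
        for record in records:
            f.write(record)          # lands in Python's buffer / page cache
        f.flush()                    # hand buffered bytes to the kernel
        os.fsync(f.fileno())         # ask the kernel to write them to disk
    return os.path.getsize(path)

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Ten 4104-byte records: a page plus an 8-byte header, unaligned.
        return write_small_records(path, [b"\x00" * 4104] * 10)
    finally:
        os.unlink(path)

if __name__ == "__main__":
    print(demo(), "bytes written and flushed")
```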
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Pavel Hrdina @ 2013-03-18 13:28 UTC
To: Wenchao Xia
Cc: Kevin Wolf, Juan Quintela, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

Hi Wenchao,

It seems that we are working on the same thing. You are trying to improve
the size of vmstate if you want to save it to file or as an internal
snapshot.

I'm also working on that issue and I think that my solution could be
also used for savevm to external file or for live backup.

Here is my proposal how to do it:

We will not have the fixed size of vmstate, we will have the minimal
possible size of the vmstate. I will also use the migration code to save
the vmstate.

In the qemu_savevm_state_begin we will create a bitmap for all ram pages.
Then we set all pages in the bitmap to "1" and it means we need to save
them all. Then we check all ram pages for duplicated pages and we will
unset all duplicated pages from "savevm_bitmap".

In the qemu_savevm_state_iterate we will start saving remaining ram
pages according to "savevm_bitmap". Because the guest is running, it
could change the data in ram pages which are not yet saved. For this
case we also have to create a priority queue. Into this priority queue
we will copy every ram page before it is changed and also remove
this ram page from savevm_bitmap. In the iterate cycle we will at first
handle the priority queue and then continue to other ram pages from the
savevm_bitmap.

In the qemu_savevm_state_complete we will save only non-live data.

This should reduce the vmstate size and also speed up the saving of
vmstate with minimal memory usage.

Pavel
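The bitmap-and-queue scheme in Pavel's message can be sketched as a small simulation. This is not QEMU code: the class LiveSaver and its methods are hypothetical, pages are tiny byte strings, and guest writes go through an explicit guest_write() call, whereas how to intercept writes of a real guest (especially under KVM) is exactly the open question in this thread.

```python
from collections import deque

class LiveSaver:
    """Toy model of the proposed savevm_bitmap plus priority queue."""

    def __init__(self, ram):
        self.ram = ram                    # list of page contents (bytes)
        self.saved = {}                   # page index -> ("data", bytes) or ("dup", index)
        self.queue = deque()              # old contents of pages the guest overwrote
        # "begin": every page needs saving, then duplicates are unset.
        self.bitmap = [True] * len(ram)
        first_seen = {}
        for i, page in enumerate(ram):
            if page in first_seen:
                self.bitmap[i] = False
                self.saved[i] = ("dup", first_seen[page])
            else:
                first_seen[page] = i

    def guest_write(self, i, data):
        """Guest changes page i: copy the old content first if not saved yet."""
        if self.bitmap[i]:
            self.queue.append((i, self.ram[i]))
            self.bitmap[i] = False
        self.ram[i] = data

    def iterate(self, budget):
        """One iterate cycle: drain the queue, then save up to `budget` pages."""
        while self.queue:
            i, old = self.queue.popleft()
            self.saved[i] = ("data", old)
        for i, pending in enumerate(self.bitmap):
            if budget == 0:
                break
            if pending:
                self.saved[i] = ("data", self.ram[i])
                self.bitmap[i] = False
                budget -= 1

    def done(self):
        return not self.queue and not any(self.bitmap)

    def restore(self):
        """Rebuild the RAM image as it was when saving began."""
        def content(i):
            kind, value = self.saved[i]
            return value if kind == "data" else content(value)
        return [content(i) for i in range(len(self.ram))]

def demo():
    ram = [b"A", b"B", b"A", b"C"]        # page 2 duplicates page 0
    at_start = list(ram)
    saver = LiveSaver(ram)
    saver.iterate(budget=1)               # saves page 0
    saver.guest_write(1, b"X")            # unsaved page: old "B" is queued
    saver.guest_write(2, b"Y")            # duplicate page: nothing to copy
    saver.guest_write(3, b"Z")            # unsaved page: old "C" is queued
    while not saver.done():
        saver.iterate(budget=1)
    stored = sum(1 for kind, _ in saver.saved.values() if kind == "data")
    return saver.restore() == at_start, stored
```

In this toy run the saved state matches RAM at the moment saving began and holds three page bodies for four pages, which is the "at most guest RAM size" property discussed later in the thread.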
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Wenchao Xia @ 2013-03-21 6:43 UTC
To: Pavel Hrdina
Cc: Kevin Wolf, Juan Quintela, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

Hi, Pavel
  Sorry for the late response.

> Hi Wenchao,
>
> It seems that we are working on the same thing. You are trying to improve
> the size of vmstate if you want to save it to file or as an internal
> snapshot.
>
> I'm also working on that issue and I think that my solution could be
> also used for savevm to external file or for live backup.
>
> Here is my proposal how to do it:
>
> We will not have the fixed size of vmstate, we will have the minimal
> possible size of the vmstate. I will also use the migration code to save
> the vmstate.
>
  It is good if speed and size can be improved, but IMHO the size will
be a problem. A predictable or fixed size lets the management stack
assess and decide, and reserve resources ahead of time. Personally I
do not like a process that continues to take resources without limit;
in most cases I'll turn it off.... By using qcow2, vmstate will have a
fixed MAX size, ideal for use as backup data.
  Above is my personal opinion, and I do want to know the maintainers'
opinion to decide whether to continue.

> In the qemu_savevm_state_begin we will create a bitmap for all ram pages.
> Then we set all pages in the bitmap to "1" and it means we need to save
> them all. Then we check all ram pages for duplicated pages and we will
> unset all duplicated pages from "savevm_bitmap".
>
> In the qemu_savevm_state_iterate we will start saving remaining ram
> pages according to "savevm_bitmap". Because the guest is running, it
> could change the data in ram pages which are not yet saved. For this
> case we also have to create a priority queue. Into this priority queue
> we will copy every ram page before it is changed and also remove
> this ram page from savevm_bitmap. In the iterate cycle we will at first
> handle the priority queue and then continue to other ram pages from the
> savevm_bitmap.
>
  OK, I got your idea: intercept the page write before the page changes.
I think this could reduce time in savevm. But some problems need to be
confirmed:
1 is it workable when KVM is used? In my understanding KVM will directly
change the ram page before qemu can take over.
2 the performance sacrifice of the running guest, which needs a test.
3 the total buffer size in the queue. If you plan to use it for
any migration then in the TCP case the buffer may grow to a large size
for speed reasons. If you use it only for local devices, I suggest
treating it as an improvement for migrate to block device, in contrast
to migrate to stream; then the performance optimizing infra such
as buffer/cache can be used much more easily, to reduce the performance
lost in page changing.
  I feel this is more of an algorithm improvement for block
migration, which can work together with my patch. My patch
is actually introducing migrate vmstate to block instead of stream.

> In the qemu_savevm_state_complete we will save only non-live data.
>
> This should reduce the vmstate size and also speed up the saving of
> vmstate with minimal memory usage.
>
> Pavel
>

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
From: Pavel Hrdina @ 2013-03-21 11:48 UTC
To: Wenchao Xia
Cc: Kevin Wolf, Juan Quintela, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

On 03/21/2013 07:43 AM, Wenchao Xia wrote:
> Hi, Pavel
>   Sorry for the late response.

np :)

>> Hi Wenchao,
>>
>> It seems that we are working on the same thing. You are trying to improve
>> the size of vmstate if you want to save it to file or as an internal
>> snapshot.
>>
>> I'm also working on that issue and I think that my solution could be
>> also used for savevm to external file or for live backup.
>>
>> Here is my proposal how to do it:
>>
>> We will not have the fixed size of vmstate, we will have the minimal
>> possible size of the vmstate. I will also use the migration code to save
>> the vmstate.
>>
>   It is good if speed and size can be improved, but IMHO the size will
> be a problem. A predictable or fixed size lets the management stack
> assess and decide, and reserve resources ahead of time. Personally I
> do not like a process that continues to take resources without limit;
> in most cases I'll turn it off.... By using qcow2, vmstate will have a
> fixed MAX size, ideal for use as backup data.
>   Above is my personal opinion, and I do want to know the maintainers'
> opinion to decide whether to continue.

I mean that the vmstate size would be at most the same as the guest ram
size, but could be smaller. I also dislike that the vmstate could
actually be much larger than the guest ram size.

>
>> In the qemu_savevm_state_begin we will create a bitmap for all ram pages.
>> Then we set all pages in the bitmap to "1" and it means we need to save
>> them all. Then we check all ram pages for duplicated pages and we will
>> unset all duplicated pages from "savevm_bitmap".
>>
>> In the qemu_savevm_state_iterate we will start saving remaining ram
>> pages according to "savevm_bitmap". Because the guest is running, it
>> could change the data in ram pages which are not yet saved. For this
>> case we also have to create a priority queue. Into this priority queue
>> we will copy every ram page before it is changed and also remove
>> this ram page from savevm_bitmap. In the iterate cycle we will at first
>> handle the priority queue and then continue to other ram pages from the
>> savevm_bitmap.
>>
>   OK, I got your idea: intercept the page write before the page changes.
> I think this could reduce time in savevm. But some problems need to be
> confirmed:
> 1 is it workable when KVM is used? In my understanding KVM will directly
> change the ram page before qemu can take over.

Yes, this is true. I'm now investigating ways to do this, but I'm
afraid that without support from the kvm kernel module it cannot be done.

> 2 the performance sacrifice of the running guest, which needs a test.

Surely I'll test it if there is some solution for copying the page
before it is changed when kvm is used.

> 3 the total buffer size in the queue. If you plan to use it for
> any migration then in the TCP case the buffer may grow to a large size
> for speed reasons. If you use it only for local devices, I suggest
> treating it as an improvement for migrate to block device, in contrast
> to migrate to stream; then the performance optimizing infra such
> as buffer/cache can be used much more easily, to reduce the performance
> lost in page changing.
>   I feel this is more of an algorithm improvement for block
> migration, which can work together with my patch. My patch
> is actually introducing migrate vmstate to block instead of stream.

Yes, this proposal is to improve migration to block device.

>
>> In the qemu_savevm_state_complete we will save only non-live data.
>>
>> This should reduce the vmstate size and also speed up the saving of
>> vmstate with minimal memory usage.
>>
>> Pavel
>>
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 11:48 ` Pavel Hrdina
@ 2013-03-21 13:38 ` Stefan Hajnoczi
  2013-03-21 13:42   ` Paolo Bonzini
  2013-03-21 13:43   ` Pavel Hrdina
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-03-21 13:38 UTC
To: Pavel Hrdina
Cc: Kevin Wolf, Juan Quintela, qemu-devel, Dietmar Maurer, Paolo Bonzini, Wenchao Xia

On Thu, Mar 21, 2013 at 12:48:35PM +0100, Pavel Hrdina wrote:
> On 03/21/2013 07:43 AM, Wenchao Xia wrote:
> > Hi, Pavel
> >    Sorry for late response.
>
> np :)
>
> >> Hi Wenchao,
> >>
> >> It seems that we are working on the same thing. You are trying to improve
> >> the size of vmstate if you want to save it to file or as an internal
> >> snapshot.
> >>
> >> I'm also working on that issue and I think that my solution could be
> >> also used for savevm to external file or for live backup.
> >>
> >> Here is my proposal how to do it:
> >>
> >> We will not have the fixed size of vmstate, we will have the possible
> >> minimal size of the vmstate. I will also use the migration code to save
> >> the vmstate.
> >>
> >    It is good if speed and size can be improved, but IMHO the size will
> > be a problem. Predictable or fixed size ensure management stack to
> > give assess and decision, preserve resource ahead, personally I
> > do not like a process continue to take resource without limit,
> > in most case I'll turn it off.... By using qcow2, vmstate will have a
> > fixed MAX size, ideal to be used to take it as a backup data.
> >    Above is my personal opinion, and I do want to know the maintainer's
> > opinion to decide whether to continue.
>
> I mean that the vmstate size would be at most the same as the guest ram
> size, but could be smaller. I also dislike that actually the vmstate
> could be much larger than the guest ram size.
>
> >
> >> In the qemu_savevm_state_begin we will create bitmap for all ram pages.
> >> Then we set all pages in bitmap to "1" and it means we need to save them
> >> all. Then we check all ram pages for duplicated pages and we will unset
> >> all duplicated pages from "savevm_bitmap".
> >>
> >> In the qemu_savevm_state_iterate we will start saving remaining ram
> >> pages according to "savevm_bitmap". Because the guest is running, it
> >> could change the data in ram pages which is still not saved. For this
> >> case we also have to create a priority queue. Into this priority queue
> >> we will copy every ram page before it will be changed and also remove
> >> this ram page from savevm_bitmap. In the iterate cycle we will at first
> >> handle the priority queue and then continue to other ram pages from the
> >> savevm_bitmap.
> >>
> >    OK, I got your idea: intercept the page writing before it changes.
> > I think this could reduce time in savevm. But some problems need to be
> > confirmed:
> > 1 is it workable when KVM is used? In my understanding KVM will directly
> > change the ram page before qemu can take over.
>
> Yes, this is true. I'm now investigating any way how to do this, but I'm
> afraid that without support from kvm kernel module it cannot be done.
>
> > 2 the performance sacrifice of running guest, need a test.
>
> Surely I'll test it if there will be some solution how to copy the page
> before it is changed when kvm is used.

There already is a guest RAM cloning mechanism: fork the QEMU process.
Then you have a copy-on-write guest RAM.

In a little more detail:

1. save non-RAM device state
2. quiesce QEMU to a state that is safe for forking
3. create an EventNotifier for live savevm completion signal
4. fork and pass completion EventNotifier to child
5. parent continues running VM
6. child performs vmsave of copy-on-write guest RAM
7. child signals completion EventNotifier and terminates
8. parent raises live savevm completion QMP event

Stefan
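[Editor's sketch] The fork-based steps Stefan lists above could look roughly like the following in plain POSIX C. This is not QEMU code: `LiveSnapshot`, `live_snapshot_start()` and `live_snapshot_wait()` are hypothetical names, a pipe stands in for QEMU's EventNotifier, and "guest RAM" is just a private buffer. It assumes the RAM is a private (copy-on-write) mapping; a shared mapping would not be frozen by fork().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for an in-flight snapshot (not a QEMU type). */
typedef struct {
    pid_t child;   /* process writing the snapshot */
    int done_fd;   /* read end of the completion pipe */
} LiveSnapshot;

/* write(2) until everything is written; returns 0 on success. */
static int write_all(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Steps 3-5: create the completion channel, fork, let the parent go on.
 * The child sees `ram` as it was at fork() time, so the caller may keep
 * modifying it after this returns. Returns 0 on success, -1 on error.
 */
int live_snapshot_start(LiveSnapshot *s, const uint8_t *ram, size_t len,
                        const char *path)
{
    int pfd[2];
    assert(s && ram && path);
    if (pipe(pfd) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child (steps 6-7): only plain syscalls, then _exit(). */
        char status = 1;   /* 0 means success */
        close(pfd[0]);
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd >= 0) {
            if (write_all(fd, ram, len) == 0) status = 0;
            close(fd);
        }
        (void)write_all(pfd[1], &status, 1);
        _exit(status);
    }
    close(pfd[1]);
    s->child = pid;
    s->done_fd = pfd[0];
    return 0;
}

/* Step 8: wait for the completion signal; returns 0 if the save worked. */
int live_snapshot_wait(LiveSnapshot *s)
{
    char status = 1;
    ssize_t n;
    int wstatus;
    do {
        n = read(s->done_fd, &status, 1);
    } while (n < 0 && errno == EINTR);
    close(s->done_fd);
    while (waitpid(s->child, &wstatus, 0) < 0 && errno == EINTR) { }
    return (n == 1 && status == 0) ? 0 : -1;
}
```

A caller would fill a buffer, call `live_snapshot_start()`, keep writing to the buffer, and later call `live_snapshot_wait()`; the file then holds the contents from the moment of the fork. In a real main loop the `done_fd` would be polled rather than read synchronously.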
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 13:38 ` Stefan Hajnoczi
@ 2013-03-21 13:42   ` Paolo Bonzini
  2013-03-21 13:53     ` Pavel Hrdina
  2013-03-21 14:56     ` Stefan Hajnoczi
  2013-03-21 13:43   ` Pavel Hrdina
  1 sibling, 2 replies; 19+ messages in thread
From: Paolo Bonzini @ 2013-03-21 13:42 UTC
To: Stefan Hajnoczi
Cc: Kevin Wolf, Pavel Hrdina, Juan Quintela, qemu-devel, Dietmar Maurer, Wenchao Xia

On 21/03/2013 14:38, Stefan Hajnoczi wrote:
> There already is a guest RAM cloning mechanism: fork the QEMU process.
> Then you have a copy-on-write guest RAM.
>
> In a little more detail:
>
> 1. save non-RAM device state
> 2. quiesce QEMU to a state that is safe for forking
> 3. create an EventNotifier for live savevm completion signal
> 4. fork and pass completion EventNotifier to child
> 5. parent continues running VM
> 6. child performs vmsave of copy-on-write guest RAM
> 7. child signals completion EventNotifier and terminates
> 8. parent raises live savevm completion QMP event

Forking a threaded program is not so easy, but it could be done if the
child is very simple and only uses syscalls to communicate back with the
parent:

1. save non-RAM device state
2. quiesce QEMU to a state that is safe for forking
3. create a memory map and a pipe
4. fork and pass the write end of the pipe to the child
5. parent continues running VM
6. child reads the memory map and writes data to the pipe
7. parent copies data from the pipe to the migration stream
8. child exits, parent raises live savevm completion QMP event

Paolo
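[Editor's sketch] Paolo's pipe variant above can be illustrated in the same spirit. Again this is only a sketch under assumptions: `ram_stream_fork()` and `ram_stream_drain()` are hypothetical names, the "memory map" is reduced to a single contiguous region, and `out_fd` stands in for the migration stream. The child does nothing except write(2) and _exit(2).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* write(2) until everything is written; returns 0 on success. */
static int write_all(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Steps 3-4: create the pipe and fork. The child streams its frozen,
 * copy-on-write view of `ram` into the pipe and exits. The read end is
 * returned in *rfd. Returns the child's pid, or -1 on error.
 */
pid_t ram_stream_fork(const uint8_t *ram, size_t len, int *rfd)
{
    int pfd[2];
    assert(ram && rfd);
    if (pipe(pfd) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child (step 6): write the region to the pipe, nothing else. */
        close(pfd[0]);
        _exit(write_all(pfd[1], ram, len) == 0 ? 0 : 1);
    }
    close(pfd[1]);
    *rfd = pfd[0];
    return pid;
}

/*
 * Steps 7-8, parent side: copy everything the child sends to `out_fd`,
 * then reap the child. Returns the number of bytes copied, or -1.
 */
ssize_t ram_stream_drain(pid_t child, int rfd, int out_fd)
{
    uint8_t buf[4096];
    ssize_t total = 0;
    int err = 0;
    int wstatus = 0;

    for (;;) {
        ssize_t n = read(rfd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR) continue;
            err = 1;
            break;
        }
        if (n == 0) break;   /* child closed its end: all data received */
        if (write_all(out_fd, buf, (size_t)n) < 0) {
            err = 1;
            break;
        }
        total += n;
    }
    close(rfd);
    while (waitpid(child, &wstatus, 0) < 0 && errno == EINTR) { }
    if (err || !WIFEXITED(wstatus) || WEXITSTATUS(wstatus) != 0) return -1;
    return total;
}
```

Compared with having the child write the image itself, this keeps all file and format handling in the parent, at the cost of copying every page through the pipe.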
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 13:42 ` Paolo Bonzini
@ 2013-03-21 13:53   ` Pavel Hrdina
  2013-03-21 14:56   ` Stefan Hajnoczi
  1 sibling, 0 replies; 19+ messages in thread
From: Pavel Hrdina @ 2013-03-21 13:53 UTC
To: Paolo Bonzini
Cc: Kevin Wolf, Juan Quintela, Stefan Hajnoczi, qemu-devel, Dietmar Maurer, Wenchao Xia

On 03/21/2013 02:42 PM, Paolo Bonzini wrote:
> On 21/03/2013 14:38, Stefan Hajnoczi wrote:
>> There already is a guest RAM cloning mechanism: fork the QEMU process.
>> Then you have a copy-on-write guest RAM.
>>
>> In a little more detail:
>>
>> 1. save non-RAM device state
>> 2. quiesce QEMU to a state that is safe for forking
>> 3. create an EventNotifier for live savevm completion signal
>> 4. fork and pass completion EventNotifier to child
>> 5. parent continues running VM
>> 6. child performs vmsave of copy-on-write guest RAM
>> 7. child signals completion EventNotifier and terminates
>> 8. parent raises live savevm completion QMP event
>
> Forking a threaded program is not so easy, but it could be done if the
> child is very simple and only uses syscalls to communicate back with the
> parent:
>
> 1. save non-RAM device state
> 2. quiesce QEMU to a state that is safe for forking
> 3. create a memory map and a pipe
> 4. fork and pass the write end of the pipe to the child
> 5. parent continues running VM
> 6. child reads the memory map and writes data to the pipe
> 7. parent copies data from the pipe to the migration stream
> 8. child exits, parent raises live savevm completion QMP event
>
> Paolo
>

As I just wrote to Stefan, I had already heard of the fork idea. I was
trying to do it without forking QEMU, but that would need support from
KVM and could be harder than forking QEMU. I'll start working on this.

Thanks Paolo,

Pavel
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 13:42 ` Paolo Bonzini
  2013-03-21 13:53   ` Pavel Hrdina
@ 2013-03-21 14:56   ` Stefan Hajnoczi
  2013-03-21 15:08     ` Eric Blake
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-03-21 14:56 UTC
To: Paolo Bonzini
Cc: Kevin Wolf, Pavel Hrdina, Juan Quintela, qemu-devel, Dietmar Maurer, Wenchao Xia

On Thu, Mar 21, 2013 at 02:42:23PM +0100, Paolo Bonzini wrote:
> On 21/03/2013 14:38, Stefan Hajnoczi wrote:
> > There already is a guest RAM cloning mechanism: fork the QEMU process.
> > Then you have a copy-on-write guest RAM.
> >
> > In a little more detail:
> >
> > 1. save non-RAM device state
> > 2. quiesce QEMU to a state that is safe for forking
> > 3. create an EventNotifier for live savevm completion signal
> > 4. fork and pass completion EventNotifier to child
> > 5. parent continues running VM
> > 6. child performs vmsave of copy-on-write guest RAM
> > 7. child signals completion EventNotifier and terminates
> > 8. parent raises live savevm completion QMP event
>
> Forking a threaded program is not so easy, but it could be done if the
> child is very simple and only uses syscalls to communicate back with the
> parent:

On Linux you should be able to use clone(2) to spawn a thread with
copy-on-write memory. Too bad it's not portable because it gets around
the messy fork issues.

Stefan
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 14:56 ` Stefan Hajnoczi
@ 2013-03-21 15:08   ` Eric Blake
  2013-03-23  4:36     ` Wenchao Xia
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2013-03-21 15:08 UTC
To: Stefan Hajnoczi
Cc: Kevin Wolf, Pavel Hrdina, Juan Quintela, qemu-devel, Dietmar Maurer, Paolo Bonzini, Wenchao Xia

On 03/21/2013 08:56 AM, Stefan Hajnoczi wrote:
> On Thu, Mar 21, 2013 at 02:42:23PM +0100, Paolo Bonzini wrote:
>> On 21/03/2013 14:38, Stefan Hajnoczi wrote:
>>> There already is a guest RAM cloning mechanism: fork the QEMU process.
>>> Then you have a copy-on-write guest RAM.
>>>
>>> In a little more detail:
>>>
>>> 1. save non-RAM device state
>>> 2. quiesce QEMU to a state that is safe for forking
>>> 3. create an EventNotifier for live savevm completion signal
>>> 4. fork and pass completion EventNotifier to child
>>> 5. parent continues running VM
>>> 6. child performs vmsave of copy-on-write guest RAM
>>> 7. child signals completion EventNotifier and terminates
>>> 8. parent raises live savevm completion QMP event
>>
>> Forking a threaded program is not so easy, but it could be done if the
>> child is very simple and only uses syscalls to communicate back with the
>> parent:
>
> On Linux you should be able to use clone(2) to spawn a thread with
> copy-on-write memory. Too bad it's not portable because it gets around
> the messy fork issues.

And introduces its own messy issues - once you clone() using different
flags than what fork() does, you have invalidated the use of a LOT of
libc interfaces in that child; in particular, any use of pthread is
liable to break.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 15:08 ` Eric Blake
@ 2013-03-23  4:36   ` Wenchao Xia
  2013-03-27  3:35     ` Wenchao Xia
  0 siblings, 1 reply; 19+ messages in thread
From: Wenchao Xia @ 2013-03-23 4:36 UTC
To: Eric Blake
Cc: Kevin Wolf, Pavel Hrdina, Juan Quintela, Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Dietmar Maurer

On 2013-3-21 23:08, Eric Blake wrote:
> On 03/21/2013 08:56 AM, Stefan Hajnoczi wrote:
>> On Thu, Mar 21, 2013 at 02:42:23PM +0100, Paolo Bonzini wrote:
>>> On 21/03/2013 14:38, Stefan Hajnoczi wrote:
>>>> There already is a guest RAM cloning mechanism: fork the QEMU process.
>>>> Then you have a copy-on-write guest RAM.
>>>>
>>>> In a little more detail:
>>>>
>>>> 1. save non-RAM device state
>>>> 2. quiesce QEMU to a state that is safe for forking
>>>> 3. create an EventNotifier for live savevm completion signal
>>>> 4. fork and pass completion EventNotifier to child
>>>> 5. parent continues running VM
>>>> 6. child performs vmsave of copy-on-write guest RAM
>>>> 7. child signals completion EventNotifier and terminates
>>>> 8. parent raises live savevm completion QMP event
>>>
>>> Forking a threaded program is not so easy, but it could be done if the
>>> child is very simple and only uses syscalls to communicate back with the
>>> parent:
>>
>> On Linux you should be able to use clone(2) to spawn a thread with
>> copy-on-write memory. Too bad it's not portable because it gets around
>> the messy fork issues.
>
> And introduces its own messy issues - once you clone() using different
> flags than what fork() does, you have invalidated the use of a LOT of
> libc interfaces in that child; in particular, any use of pthread is
> liable to break.
>

  I think the core of the fork() approach is snapshotting RAM pages in
RAM, just like LVM2's block snapshots. Very cool idea :).

  The problem is the implementation. An API like the following is needed:

    void *mem_snapshot(void *addr, uint64_t len);

From a brief look I haven't found it on Linux, and I am not sure whether
it is available in the upstream Linux kernel or C library. Making this
API available and then using it in qemu would be much nicer.

  Using the fork()/clone() approach in qemu is very challenging. I guess
there will be a lot of scattered code preparing for fork(), some resource
handling code after fork(), code to query progress, exception handling,
a child/parent communication mechanism, ah... it seems complex. But I am
looking forward to seeing how good it is.

  Compared with that, migration to image uses less memory with more I/O,
but it is much easier to implement and is portable. Maybe it can be used
as a simple improvement for "migrate to fd" until an underlying memory
snapshot API is available.

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-23  4:36 ` Wenchao Xia
@ 2013-03-27  3:35   ` Wenchao Xia
  0 siblings, 0 replies; 19+ messages in thread
From: Wenchao Xia @ 2013-03-27 3:35 UTC
To: Eric Blake
Cc: Kevin Wolf, Carsten Otte, Anthony Liguori, Pavel Hrdina, Heiko Carstens, Juan Quintela, Stefan Hajnoczi, Marcelo Tosatti, Sebastian Ott, qemu-devel, Alexander Graf, Christian Borntraeger, Cornelia Huck, Paolo Bonzini, Dietmar Maurer, Martin Schwidefsky

  After thinking about this more deeply, I'd like to share some further
analysis:

  Saving vmstate amounts to taking a memory snapshot. In theory the
methods can be summarized as:
1 get a mirror of the region at the moment the "snapshot" request is
sent: the kernel COWs that region.
2 get a mirror by gradually copying out the region, completing when the
clone is in sync with the original region; basically similar to
migration.

  Taking a closer look:

1 COW the memory region:
  Saves: block I/O and CPU, since there are no duplicated steps.
  Sacrifices: memory.
  Industry improvement solution: NUMA; price: expensive.
  Implementation: hard, needs quite some work.
  Qemu code maintenance: easy.
  Detail:
  This method is the closest to the meaning of "snapshot", but it has a
hidden requirement: reserved memory. On a server in real use today it is
not realistic to keep a huge amount of memory reserved for this. For
example, a server with 4G of memory may well run a guest with 3.5G of
memory, to get the benefits of easier deployment, hardware independence
and whole-machine backup/restore. In this case there is not enough
memory to do it. Another, more likely example: a server with 4G of
memory runs two 1.5G guests; in this case one guest needs to be migrated
out, which is obviously bad.
  So a much better solution is to add memory at the time of the
snapshot. To do that economically and without hardware hot-plug, it
needs NUMA plus memory sharing:

  Host1       Host2       Host3
    |           |           |
    |           |           |
    | mem       | mem       | mem
    |           |           |
    |-----------------------
                |
            shared mem

  Several hosts share a memory pool for snapshots: a host gets memory
from it when doing a snapshot and returns it to the cluster manager when
the snapshot completes. This is possible on expensive architectures, but
hard to do on the x86 architecture, which labels itself cheap.
  One unrelated thought: does qemu support migrating to a host device?
If not, it should support migrating to a block device of fixed size
(unlike a snapshot, the two mirrors need to sync). When shared memory is
present, guests could then be migrated quickly to a RAM block device.
  Implementation detail:
  It should be done by adding an API to the kernel: mem_snapshot(), with
which the kernel can COW a region and write the snapshotted pages to the
far slower shared memory (if this logic is added as an optimization).
fork() can do it, but brings a lot of trouble and would not benefit from
a NUMA architecture by moving snapshotted pages to slower memory.

2 gradually copy out and sync the memory region; there are two ways to
do it:

2.1 migrate to a block device (migrate to fd, or migrate to image):
  Saves: memory.
  Sacrifices: CPU, block I/O.
  Industry improvement solution: flash disk; cheap.
  Implementation: easy, based on migration.
  Qemu code maintenance: easy.
  Detail:
  This is a relatively easy case; we just need to make the size fixed.
And flash disks are available on the x86 architecture.

2.2 migrate to a stream, and use another process to receive and
rearrange the data:
  Saves: memory.
  Sacrifices: CPU (very high), block I/O (unless there is a big buffer).
  Industry improvement solution: let another host or CPU do it.
  Implementation: hard, needs a new qemu tool.
  Qemu code maintenance: hard; data needs to be encoded in qemu, then
decoded and rearranged in another process, so every change or newly
added device needs changes on both sides.
  Detail:
  It invokes a process to receive the data, or invokes a fake qemu to
receive and save it (which needs a lot of memory). Since the code is
hard to maintain, personally I think it is worse than 2.1.

Summary, suggestions:
1) support both method 1 and method 2.1, and treat 2.1 as an improvement
for migrate to fd. Add a new qmp interface such as "vmstate snapshot"
for method 1, to declare it as a true snapshot. This allows it to work
on different architectures.
2) push an API into Linux to do method 1, instead of using fork(). I'd
like to send an RFC to the Linux memory mailing list to get feedback.

-- 
Best Regards

Wenchao Xia
* Re: [Qemu-devel] [RFC] qmp interface for save vmstate to image
  2013-03-21 13:38 ` Stefan Hajnoczi
  2013-03-21 13:42   ` Paolo Bonzini
@ 2013-03-21 13:43   ` Pavel Hrdina
  1 sibling, 0 replies; 19+ messages in thread
From: Pavel Hrdina @ 2013-03-21 13:43 UTC
To: Stefan Hajnoczi
Cc: Kevin Wolf, Juan Quintela, qemu-devel, Dietmar Maurer, Paolo Bonzini, Wenchao Xia

On 03/21/2013 02:38 PM, Stefan Hajnoczi wrote:
> On Thu, Mar 21, 2013 at 12:48:35PM +0100, Pavel Hrdina wrote:
>> On 03/21/2013 07:43 AM, Wenchao Xia wrote:
>>> Hi, Pavel
>>>    Sorry for late response.
>>
>> np :)
>>
>>>> Hi Wenchao,
>>>>
>>>> It seems that we are working on the same thing. You are trying to improve
>>>> the size of vmstate if you want to save it to file or as an internal
>>>> snapshot.
>>>>
>>>> I'm also working on that issue and I think that my solution could be
>>>> also used for savevm to external file or for live backup.
>>>>
>>>> Here is my proposal how to do it:
>>>>
>>>> We will not have the fixed size of vmstate, we will have the possible
>>>> minimal size of the vmstate. I will also use the migration code to save
>>>> the vmstate.
>>>>
>>>    It is good if speed and size can be improved, but IMHO the size will
>>> be a problem. Predictable or fixed size ensure management stack to
>>> give assess and decision, preserve resource ahead, personally I
>>> do not like a process continue to take resource without limit,
>>> in most case I'll turn it off.... By using qcow2, vmstate will have a
>>> fixed MAX size, ideal to be used to take it as a backup data.
>>>    Above is my personal opinion, and I do want to know the maintainer's
>>> opinion to decide whether to continue.
>>
>> I mean that the vmstate size would be at most the same as the guest ram
>> size, but could be smaller. I also dislike that actually the vmstate
>> could be much larger than the guest ram size.
>>
>>>
>>>> In the qemu_savevm_state_begin we will create bitmap for all ram pages.
>>>> Then we set all pages in bitmap to "1" and it means we need to save them
>>>> all. Then we check all ram pages for duplicated pages and we will unset
>>>> all duplicated pages from "savevm_bitmap".
>>>>
>>>> In the qemu_savevm_state_iterate we will start saving remaining ram
>>>> pages according to "savevm_bitmap". Because the guest is running, it
>>>> could change the data in ram pages which is still not saved. For this
>>>> case we also have to create a priority queue. Into this priority queue
>>>> we will copy every ram page before it will be changed and also remove
>>>> this ram page from savevm_bitmap. In the iterate cycle we will at first
>>>> handle the priority queue and then continue to other ram pages from the
>>>> savevm_bitmap.
>>>>
>>>    OK, I got your idea: intercept the page writing before it changes.
>>> I think this could reduce time in savevm. But some problems need to be
>>> confirmed:
>>> 1 is it workable when KVM is used? In my understanding KVM will directly
>>> change the ram page before qemu can take over.
>>
>> Yes, this is true. I'm now investigating any way how to do this, but I'm
>> afraid that without support from kvm kernel module it cannot be done.
>>
>>> 2 the performance sacrifice of running guest, need a test.
>>
>> Surely I'll test it if there will be some solution how to copy the page
>> before it is changed when kvm is used.
>
> There already is a guest RAM cloning mechanism: fork the QEMU process.
> Then you have a copy-on-write guest RAM.
>
> In a little more detail:
>
> 1. save non-RAM device state
> 2. quiesce QEMU to a state that is safe for forking
> 3. create an EventNotifier for live savevm completion signal

Also, for internal live savevm, create a pipe to pass the vmstate data
to the parent. It wouldn't be a good idea to touch the qcow2 file from
the child.

> 4. fork and pass completion EventNotifier to child
> 5. parent continues running VM
> 6. child performs vmsave of copy-on-write guest RAM
> 7. child signals completion EventNotifier and terminates
> 8. parent raises live savevm completion QMP event
>
> Stefan
>

Yes, this is another way to do it. I had already considered it. Now that
someone else has suggested it too, I know it could be the right way.

Thanks Stefan,

Pavel
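[Editor's sketch] For comparison with the fork-based schemes, the bitmap plus priority-queue proposal quoted above can be modelled in a few lines of C. Everything here is hypothetical: `SaveState`, `save_begin()`, `save_before_write()` and `save_iterate()` are illustrative names, not QEMU's `qemu_savevm_state_*` functions, the duplicate-page pass is left out, and `save_before_write()` assumes exactly the pre-write interception that the thread notes is not available with KVM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { SK_PAGE_SIZE = 4096 };

/* Sink for saved pages (stands in for writing to the vmstate image). */
typedef void (*EmitFn)(void *opaque, size_t index, const uint8_t *data);

typedef struct QueuedPage {
    struct QueuedPage *next;
    size_t index;
    uint8_t data[SK_PAGE_SIZE];   /* copy taken before the guest write */
} QueuedPage;

typedef struct {
    uint8_t *ram;
    size_t npages;
    uint64_t *bitmap;             /* bit set = page still has to be saved */
    QueuedPage *head, *tail;      /* pre-write copies, handled first */
    size_t cursor;                /* next page to look at in the bitmap */
} SaveState;

static int bit_test(const SaveState *s, size_t i)
{
    return (int)((s->bitmap[i / 64] >> (i % 64)) & 1);
}

static void bit_clear(SaveState *s, size_t i)
{
    s->bitmap[i / 64] &= ~(UINT64_C(1) << (i % 64));
}

/* "begin": mark every page as needing to be saved. */
int save_begin(SaveState *s, uint8_t *ram, size_t npages)
{
    size_t words = (npages + 63) / 64;
    assert(s && ram && npages > 0);
    memset(s, 0, sizeof(*s));
    s->bitmap = malloc(words * sizeof(uint64_t));
    if (!s->bitmap) return -1;
    memset(s->bitmap, 0xff, words * sizeof(uint64_t));
    s->ram = ram;
    s->npages = npages;
    return 0;
}

/* Hypothetical hook: must run before the guest modifies page `i`. */
int save_before_write(SaveState *s, size_t i)
{
    if (i >= s->npages || !bit_test(s, i)) return 0;  /* saved or queued */
    QueuedPage *q = malloc(sizeof(*q));
    if (!q) return -1;
    q->next = NULL;
    q->index = i;
    memcpy(q->data, s->ram + i * SK_PAGE_SIZE, SK_PAGE_SIZE);
    if (s->tail) s->tail->next = q; else s->head = q;
    s->tail = q;
    bit_clear(s, i);
    return 0;
}

/* "iterate": drain the queue, then save one page. Returns 0 when done. */
int save_iterate(SaveState *s, EmitFn emit, void *opaque)
{
    while (s->head) {
        QueuedPage *q = s->head;
        s->head = q->next;
        if (!s->head) s->tail = NULL;
        emit(opaque, q->index, q->data);
        free(q);
    }
    while (s->cursor < s->npages && !bit_test(s, s->cursor)) s->cursor++;
    if (s->cursor == s->npages) return 0;
    emit(opaque, s->cursor, s->ram + s->cursor * SK_PAGE_SIZE);
    bit_clear(s, s->cursor++);
    return 1;
}

void save_end(SaveState *s)
{
    free(s->bitmap);
    s->bitmap = NULL;
}

/* Example sink: copy each page into a flat buffer passed as `opaque`. */
void emit_to_buffer(void *opaque, size_t index, const uint8_t *data)
{
    memcpy((uint8_t *)opaque + index * SK_PAGE_SIZE, data, SK_PAGE_SIZE);
}
```

Each page is emitted exactly once, either from the bitmap scan or from its pre-write copy, so the saved image is bounded by the guest RAM size, which is the property Pavel argues for.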