qemu-devel.nongnu.org archive mirror
* [Qemu-devel] About VM fork in QEMU
@ 2013-10-22 20:23 Xinyang Ge
  2013-10-22 22:10 ` Eric Blake
  0 siblings, 1 reply; 9+ messages in thread
From: Xinyang Ge @ 2013-10-22 20:23 UTC
  To: qemu-devel

Dear QEMU developers,

I am a Ph.D. student at Penn State, and we are currently working on a
project that needs to fork multiple instances of the same VM, each
with exactly the same state (e.g., memory layout, registers, etc.), in a
very efficient way. A snapshot is too heavy for us because it needs to
dump the memory state to the filesystem so that reverting is possible
sometime later. Our project does not need to revert a VM to a previous
snapshot; it needs to live-clone (or fork) multiple instances and have
them run at the same time. Do you happen to know if it's possible to do
this? What we envision is copy-on-write for both the disks (e.g.,
qcow2) and the memory state (e.g., physical pages).

Thanks,
Xinyang

-- 
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/


* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-22 20:23 [Qemu-devel] About VM fork in QEMU Xinyang Ge
@ 2013-10-22 22:10 ` Eric Blake
  2013-10-23 14:36   ` Xinyang Ge
  2013-10-24 19:10   ` Xinyang Ge
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Blake @ 2013-10-22 22:10 UTC
  To: Xinyang Ge, qemu-devel

On 10/22/2013 09:23 PM, Xinyang Ge wrote:
> Dear QEMU developers,
> 
> I am a Ph.D. student at Penn State, and we are currently working on a
> project that needs to fork multiple instances of the same VM, each
> with exactly the same state (e.g., memory layout, registers, etc.), in a
> very efficient way. A snapshot is too heavy for us because it needs to
> dump the memory state to the filesystem so that reverting is possible
> sometime later. Our project does not need to revert a VM to a previous
> snapshot; it needs to live-clone (or fork) multiple instances and have
> them run at the same time. Do you happen to know if it's possible to do
> this? What we envision is copy-on-write for both the disks (e.g.,
> qcow2) and the memory state (e.g., physical pages).

Live cloning is a disaster waiting to happen if not done in a very
carefully controlled environment (I could maybe see it useful across two
private networks for forensic analysis or running "what-if" scenarios,
but never for provisioning enterprise-quality public-facing servers).
Remember, if you ever expose both forks of a live clone to the same
network at the same time, you have a security vulnerability if you did
not manage to scrub the random pool of the two guests to be different,
where the crypto behavior of the second guest can be guessed by
observing the behavior of the first.  But scrubbing memory correctly
requires knowing EXACTLY where in memory the random pool is stored,
which is highly guest-dependent, and may be spread across multiple guest
locations.  With offline disk images, the set of information to scrub is
a bit easier, and in fact, 'virt-sysprep' from the libguestfs tools can
do it for a number of guests, but virt-sysprep (rightfully) refuses to
try to scrub a live image.  Do your forked guests really have to run in
parallel, or is it sufficient to serialize the running of one variation
followed by the other variation?

As far as I know, the only way to run two guests that diverge from the
same live state is to take a snapshot and then run two qemu instances
that both point to that common state as their starting point, and I
would personally never attempt it in parallel.  Meanwhile, although you
complained that snapshots are too heavyweight, it's really the only way
I know to even begin to attempt live cloning with current qemu.  Of
course, being open source, you're welcome to submit a patch to add
features to qemu to do a faster live clone.  But be prepared for an
uphill battle if you cannot prove that such a patch does not introduce
security implications running improperly scrubbed forks in parallel.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-22 22:10 ` Eric Blake
@ 2013-10-23 14:36   ` Xinyang Ge
  2013-10-26  4:15     ` Eric Blake
  2013-10-24 19:10   ` Xinyang Ge
  1 sibling, 1 reply; 9+ messages in thread
From: Xinyang Ge @ 2013-10-23 14:36 UTC
  To: Eric Blake; +Cc: Hayawardh Vijayakumar, qemu-devel

> Live cloning is a disaster waiting to happen if not done in a very
> carefully controlled environment (I could maybe see it useful across two
> private networks for forensic analysis or running "what-if" scenarios,
> but never for provisioning enterprise-quality public-facing servers).
> Remember, if you ever expose both forks of a live clone to the same
> network at the same time, you have a security vulnerability if you did
> not manage to scrub the random pool of the two guests to be different,
> where the crypto behavior of the second guest can be guessed by
> observing the behavior of the first. But scrubbing memory correctly
> requires knowing EXACTLY where in memory the random pool is stored,
> which is highly guest-dependent, and may be spread across multiple guest
> locations.  With offline disk images, the set of information to scrub is
> a bit easier, and in fact, 'virt-sysprep' from the libguestfs tools can
> do it for a number of guests, but virt-sysprep (rightfully) refuses to
> try to scrub a live image.  Do your forked guests really have to run in
> parallel, or is it sufficient to serialize the running of one variation
> followed by the other variation?

It's better to have them run in parallel since our project doesn't
have any network stuff. However, running each variation sequentially
is also sufficient for us. What concerns us most is whether
we can get a snapshot in milliseconds, because we don't really need to
save the memory state to disk for future reversion. Could you let me
know if this is possible in qemu or qemu-kvm with minor changes?

> As far as I know, the only way to run two guests that diverge from the
> same live state is to take a snapshot and then run two qemu instances
> that both point to that common state as their starting point, and I
> would personally never attempt it in parallel.  Meanwhile, although you
> complained that snapshots are too heavyweight, it's really the only way
> I know to even begin to attempt live cloning with current qemu.  Of
> course, being open source, you're welcome to submit a patch to add
> features to qemu to do a faster live clone.  But be prepared for an
> uphill battle if you cannot prove that such a patch does not introduce
> security implications running improperly scrubbed forks in parallel.

Thanks for letting me know about it. Yes, if there's
communication between the guest and the outside world, live cloning can
bring a bunch of security issues (e.g., IP address spoofing). But
since the VM in our scenario doesn't have any network stuff, we would be
happy if there were a quick-and-dirty way to implement it.


Xinyang

--
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/


* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-22 22:10 ` Eric Blake
  2013-10-23 14:36   ` Xinyang Ge
@ 2013-10-24 19:10   ` Xinyang Ge
  1 sibling, 0 replies; 9+ messages in thread
From: Xinyang Ge @ 2013-10-24 19:10 UTC
  To: Eric Blake; +Cc: qemu-devel

> Live cloning is a disaster waiting to happen if not done in a very
> carefully controlled environment (I could maybe see it useful across two
> private networks for forensic analysis or running "what-if" scenarios,
> but never for provisioning enterprise-quality public-facing servers).
> Remember, if you ever expose both forks of a live clone to the same
> network at the same time, you have a security vulnerability if you did
> not manage to scrub the random pool of the two guests to be different,
> where the crypto behavior of the second guest can be guessed by
> observing the behavior of the first. But scrubbing memory correctly
> requires knowing EXACTLY where in memory the random pool is stored,
> which is highly guest-dependent, and may be spread across multiple guest
> locations.  With offline disk images, the set of information to scrub is
> a bit easier, and in fact, 'virt-sysprep' from the libguestfs tools can
> do it for a number of guests, but virt-sysprep (rightfully) refuses to
> try to scrub a live image.  Do your forked guests really have to run in
> parallel, or is it sufficient to serialize the running of one variation
> followed by the other variation?

It's better to have them run in parallel since our project doesn't
have any network stuff. However, running each variation sequentially
is also sufficient for us. What concerns us most is whether
we can get a snapshot in milliseconds, because we don't really need to
save the memory state to disk for future reversion. Could you let me
know if this is possible in qemu or qemu-kvm with minor changes?

> As far as I know, the only way to run two guests that diverge from the
> same live state is to take a snapshot and then run two qemu instances
> that both point to that common state as their starting point, and I
> would personally never attempt it in parallel.  Meanwhile, although you
> complained that snapshots are too heavyweight, it's really the only way
> I know to even begin to attempt live cloning with current qemu.  Of
> course, being open source, you're welcome to submit a patch to add
> features to qemu to do a faster live clone.  But be prepared for an
> uphill battle if you cannot prove that such a patch does not introduce
> security implications running improperly scrubbed forks in parallel.

Thanks for letting me know about it. Yes, if there's
communication between the guest and the outside world, live cloning can
bring a bunch of security issues (e.g., IP address spoofing). But
since the VM in our scenario doesn't have any network stuff, we would be
happy if there were a quick-and-dirty way to implement it.

Xinyang

-- 
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/


* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-23 14:36   ` Xinyang Ge
@ 2013-10-26  4:15     ` Eric Blake
  2013-10-26 17:37       ` Xinyang Ge
  2013-10-26 19:12       ` Xinyang Ge
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Blake @ 2013-10-26  4:15 UTC
  To: Xinyang Ge; +Cc: Hayawardh Vijayakumar, qemu-devel

On 10/23/2013 03:36 PM, Xinyang Ge wrote:
>> Live cloning is a disaster waiting to happen if not done in a very
>> carefully controlled environment (I could maybe see it useful across two
>> private networks for forensic analysis or running "what-if" scenarios,
>> but never for provisioning enterprise-quality public-facing servers).
>> Remember, if you ever expose both forks of a live clone to the same
>> network at the same time, you have a security vulnerability if you did
>> not manage to scrub the random pool of the two guests to be different,
>> where the crypto behavior of the second guest can be guessed by
>> observing the behavior of the first. But scrubbing memory correctly
>> requires knowing EXACTLY where in memory the random pool is stored,
>> which is highly guest-dependent, and may be spread across multiple guest
>> locations.  With offline disk images, the set of information to scrub is
>> a bit easier, and in fact, 'virt-sysprep' from the libguestfs tools can
>> do it for a number of guests, but virt-sysprep (rightfully) refuses to
>> try to scrub a live image.  Do your forked guests really have to run in
>> parallel, or is it sufficient to serialize the running of one variation
>> followed by the other variation?
> 
> It's better to have them run in parallel since our project doesn't
> have any network stuff.

Good, then it sounds like you are being careful about avoiding the worst
aspect of live cloning (as long as two guests are never visible to the
same network, then you aren't exposing security risks over that network).

> However, running each variation sequentially
> is also sufficient for us. What concerns us most is whether
> we can get a snapshot in milliseconds, because we don't really need to
> save the memory state to disk for future reversion. Could you let me
> know if this is possible in qemu or qemu-kvm with minor changes?

External snapshots (via the blockdev-snapshot-sync QMP command) can be
taken in a matter of milliseconds if you only care about disk state.
Furthermore, if you want to take a snapshot of both memory and disk
state, such that the clone can be resumed from the same time, you can do
that with a guest downtime that only lasts as long as the
blockdev-snapshot-sync, by first doing a migrate to file then doing the
disk snapshot when the VM pauses at the end of migration.  Resuming the
original guest is fast; resuming from the migration file is a bit
longer, but it is still the fastest way possible to resume from a
memory+disk snapshot.  If you need anything faster, then yes, you would
have to write patches to qemu to attempt cloning via fork() that makes
sure to modify the active disk in use by the fork child so as not to
interfere with the fork parent.
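
The sequence described above can be sketched as a list of QMP messages.
This is a rough outline under stated assumptions, not a tested recipe: it
assumes the migration goes to a file through an "exec:" URI, that the
caller repeats query-migrate until the migration reports completion
before sending the disk snapshot, and that the original guest is then
resumed with "cont". The helper names, the file paths and the device name
are hypothetical; the sketch only builds the JSON and does not talk to a
monitor socket.

```python
import json
import shlex


def qmp(command, arguments=None):
    """Build one QMP message as a plain dict."""
    message = {"execute": command}
    if arguments:
        message["arguments"] = arguments
    return message


def memory_then_disk_snapshot(device, memory_file, overlay_file):
    """QMP messages for: migrate to file, then external disk snapshot."""
    return [
        # 1. Stream the memory/device state into a file.
        qmp("migrate", {"uri": "exec:cat > " + shlex.quote(memory_file)}),
        # 2. To be repeated by the caller until migration has completed;
        #    the guest is paused at that point.
        qmp("query-migrate"),
        # 3. Redirect further disk writes into a new qcow2 overlay.
        qmp("blockdev-snapshot-sync", {
            "device": device,
            "snapshot-file": overlay_file,
            "format": "qcow2",
        }),
        # 4. Resume the original guest.
        qmp("cont"),
    ]


if __name__ == "__main__":
    # All three values below are placeholders for illustration.
    for message in memory_then_disk_snapshot(
            "drive-ide0-0-0", "/tmp/vm-memory.img", "/tmp/vm-overlay.qcow2"):
        print(json.dumps(message))
```

A clone would then be started from the saved memory file with qemu's
-incoming option, pointing at its own copy-on-write disk overlay.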

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-26  4:15     ` Eric Blake
@ 2013-10-26 17:37       ` Xinyang Ge
  2013-10-28 14:33         ` Eric Blake
  2013-10-26 19:12       ` Xinyang Ge
  1 sibling, 1 reply; 9+ messages in thread
From: Xinyang Ge @ 2013-10-26 17:37 UTC
  To: Eric Blake; +Cc: Hayawardh Vijayakumar, qemu-devel

> External snapshots (via the blockdev-snapshot-sync QMP command) can be
> taken in a matter of milliseconds if you only care about disk state.
> Furthermore, if you want to take a snapshot of both memory and disk
> state, such that the clone can be resumed from the same time, you can do
> that with a guest downtime that only lasts as long as the
> blockdev-snapshot-sync, by first doing a migrate to file then doing the
> disk snapshot when the VM pauses at the end of migration.  Resuming the
> original guest is fast; resuming from the migration file is a bit
> longer, but it is still the fastest way possible to resume from a
> memory+disk snapshot.  If you need anything faster, then yes, you would
> have to write patches to qemu to attempt cloning via fork() that makes
> sure to modify the active disk in use by the fork child so as not to
> interfere with the fork parent.

I think migrating memory to a file and then doing an external disk
snapshot is exactly what we want. Since we are using libvirt to manage
different VMs, could you give us some specific guidance (or references)
on how we could migrate memory state to a file using virsh and take
external snapshots?

Thanks,
Xinyang

-- 
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/


* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-26  4:15     ` Eric Blake
  2013-10-26 17:37       ` Xinyang Ge
@ 2013-10-26 19:12       ` Xinyang Ge
  1 sibling, 0 replies; 9+ messages in thread
From: Xinyang Ge @ 2013-10-26 19:12 UTC
  To: Eric Blake; +Cc: Hayawardh Vijayakumar, qemu-devel

> External snapshots (via the blockdev-snapshot-sync QMP command) can be
> taken in a matter of milliseconds if you only care about disk state.
> Furthermore, if you want to take a snapshot of both memory and disk
> state, such that the clone can be resumed from the same time, you can do
> that with a guest downtime that only lasts as long as the
> blockdev-snapshot-sync, by first doing a migrate to file then doing the
> disk snapshot when the VM pauses at the end of migration.  Resuming the
> original guest is fast; resuming from the migration file is a bit
> longer, but it is still the fastest way possible to resume from a
> memory+disk snapshot.  If you need anything faster, then yes, you would
> have to write patches to qemu to attempt cloning via fork() that makes
> sure to modify the active disk in use by the fork child so as not to
> interfere with the fork parent.

I noticed there is a "qemu-monitor-command" option for virsh and maybe
I can use this to issue QMP commands. However, when I run
"blockdev-snapshot-sync", I cannot identify the correct value for the
"device" argument. Below is part of the VM's configuration file
(libvirt format):

<devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/u12.qcow2'/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' unit='0'/>
    </disk>
    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>

Can you help me find out which is the right value for "device" when I
run "blockdev-snapshot-sync"? Thanks in advance!
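
One way to narrow this down, sketched under assumptions: the code below
guesses a device name for each disk by prefixing the <alias> name with
"drive-". That prefix is an assumption about how libvirt names QEMU
drives, not something confirmed in this thread, so the sketch also builds
a virsh command that sends the QMP "query-block" command, whose reply
lists the device names QEMU actually knows. The XML sample is a trimmed,
well-formed stand-in for the fragment above, and the helper names and the
domain name "u12" are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the <devices> fragment quoted above.
SAMPLE_DEVICES_XML = """
<devices>
  <disk type='file' device='disk'>
    <target dev='hda' bus='ide'/>
    <alias name='ide0-0-0'/>
  </disk>
  <disk type='block' device='cdrom'>
    <target dev='hdc' bus='ide'/>
    <alias name='ide0-1-0'/>
  </disk>
</devices>
"""


def candidate_drive_ids(devices_xml):
    """Map each disk target (e.g. 'hda') to a guessed QEMU drive id.

    ASSUMPTION: the drive id is 'drive-' + the libvirt alias. Verify the
    guess against the output of query-block before relying on it.
    """
    root = ET.fromstring(devices_xml)
    candidates = {}
    for disk in root.iter("disk"):
        target = disk.find("target")
        alias = disk.find("alias")
        if target is None or alias is None:
            continue
        candidates[target.get("dev")] = "drive-" + alias.get("name")
    return candidates


def query_block_argv(domain):
    """virsh command that asks QEMU for its own list of block devices."""
    return ["virsh", "qemu-monitor-command", domain, "--pretty",
            json.dumps({"execute": "query-block"})]


if __name__ == "__main__":
    print(candidate_drive_ids(SAMPLE_DEVICES_XML))
    print(query_block_argv("u12"))  # "u12" is a placeholder domain name
```

The "device" field of each entry in the query-block reply is the value to
compare the guesses against.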

Xinyang

-- 
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/


* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-26 17:37       ` Xinyang Ge
@ 2013-10-28 14:33         ` Eric Blake
  2013-10-28 17:30           ` Xinyang Ge
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Blake @ 2013-10-28 14:33 UTC
  To: Xinyang Ge; +Cc: Hayawardh Vijayakumar, qemu-devel

On 10/26/2013 11:37 AM, Xinyang Ge wrote:
>> External snapshots (via the blockdev-snapshot-sync QMP command) can be
>> taken in a matter of milliseconds if you only care about disk state.
>> Furthermore, if you want to take a snapshot of both memory and disk
>> state, such that the clone can be resumed from the same time, you can do
>> that with a guest downtime that only lasts as long as the
>> blockdev-snapshot-sync, by first doing a migrate to file then doing the
>> disk snapshot when the VM pauses at the end of migration.  Resuming the
>> original guest is fast; resuming from the migration file is a bit
>> longer, but it is still the fastest way possible to resume from a
>> memory+disk snapshot.  If you need anything faster, then yes, you would
>> have to write patches to qemu to attempt cloning via fork() that makes
>> sure to modify the active disk in use by the fork child so as not to
>> interfere with the fork parent.
> 
> I think migrating memory to a file and then doing an external disk
> snapshot is exactly what we want. Since we are using libvirt to manage
> different VMs, could you give us some specific guidance (or references)
> on how we could migrate memory state to a file using virsh and take
> external snapshots?

virsh snapshot-create-as $dom $name --live --memspec /path/to/memoryfile

Libvirt usage questions might be better directed to the libvirt lists.
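
For scripting, the same command can be assembled as an argument list. A
minimal sketch, assuming the memory image and the per-disk overlays are
requested explicitly as external through --memspec and --diskspec; the
helper name, the domain name, the snapshot name and all paths are
hypothetical, and nothing is executed here.

```python
def snapshot_create_as_argv(domain, name, memory_file,
                            live=True, disk_specs=()):
    """Build a 'virsh snapshot-create-as' command line.

    disk_specs is an iterable of (target, overlay_path) pairs, one per
    disk that should get an external overlay. File names that contain a
    comma would need extra escaping, which this sketch does not do.
    """
    argv = ["virsh", "snapshot-create-as", domain, name]
    if live:
        argv.append("--live")
    argv += ["--memspec", f"file={memory_file},snapshot=external"]
    for target, overlay in disk_specs:
        argv += ["--diskspec", f"{target},snapshot=external,file={overlay}"]
    return argv


if __name__ == "__main__":
    # Placeholder names and paths, for illustration only.
    print(snapshot_create_as_argv(
        "u12", "fork-point", "/var/tmp/u12.mem",
        disk_specs=[("hda", "/var/tmp/u12-fork.qcow2")]))
```

Building the list first makes it easy to log the exact command before
passing it to something like subprocess.run.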

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] About VM fork in QEMU
  2013-10-28 14:33         ` Eric Blake
@ 2013-10-28 17:30           ` Xinyang Ge
  0 siblings, 0 replies; 9+ messages in thread
From: Xinyang Ge @ 2013-10-28 17:30 UTC
  To: Eric Blake; +Cc: Hayawardh Vijayakumar, libvirt-users, qemu-devel

>>> External snapshots (via the blockdev-snapshot-sync QMP command) can be
>>> taken in a matter of milliseconds if you only care about disk state.
>>> Furthermore, if you want to take a snapshot of both memory and disk
>>> state, such that the clone can be resumed from the same time, you can do
>>> that with a guest downtime that only lasts as long as the
>>> blockdev-snapshot-sync, by first doing a migrate to file then doing the
>>> disk snapshot when the VM pauses at the end of migration.  Resuming the
>>> original guest is fast; resuming from the migration file is a bit
>>> longer, but it is still the fastest way possible to resume from a
>>> memory+disk snapshot.  If you need anything faster, then yes, you would
>>> have to write patches to qemu to attempt cloning via fork() that makes
>>> sure to modify the active disk in use by the fork child so as not to
>>> interfere with the fork parent.
>>
>> I think migrating memory to a file and then doing an external disk
>> snapshot is exactly what we want. Since we are using libvirt to manage
>> different VMs, could you give us some specific guidance (or references)
>> on how we could migrate memory state to a file using virsh and take
>> external snapshots?
>
> virsh snapshot-create-as $dom $name --live --memspec /path/to/memoryfile

I have tried this command on libvirt v1.1.3 and it returns "error:
invalid argument: qemuDomainSnapshotCreateXML: unsupported flags
(0x100)". It looks like --live is not supported yet. Could you let us
know which version of libvirt we should use in order to use this
feature?
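
The value in that error can be decoded against the snapshot-creation flag
bits. The sketch below assumes the bit assignments of libvirt's public
virDomainSnapshotCreateFlags enum; they are reproduced here from memory
and should be checked against the headers of the libvirt version in use.
The helper name is hypothetical.

```python
# ASSUMPTION: bit positions taken from libvirt's
# virDomainSnapshotCreateFlags enum; verify against the installed headers.
SNAPSHOT_CREATE_FLAGS = {
    "REDEFINE": 1 << 0,
    "CURRENT": 1 << 1,
    "NO_METADATA": 1 << 2,
    "HALT": 1 << 3,
    "DISK_ONLY": 1 << 4,
    "REUSE_EXT": 1 << 5,
    "QUIESCE": 1 << 6,
    "ATOMIC": 1 << 7,
    "LIVE": 1 << 8,
}


def decode_snapshot_flags(mask):
    """Return the names of all known flag bits set in mask."""
    return [name for name, bit in SNAPSHOT_CREATE_FLAGS.items()
            if mask & bit]


if __name__ == "__main__":
    print(decode_snapshot_flags(0x100))
```

Under those assumed values, 0x100 is 1 << 8, which is the LIVE bit and is
consistent with the rejected flag being the one set by --live.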

Thanks,
Xinyang

-- 
Xinyang GE
Department of Computer Science & Engineering
The Pennsylvania State University
Homepage: http://www.cse.psu.edu/~xxg113/
