From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Jes.Sorensen@redhat.com, Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date: Mon, 28 Feb 2011 10:38:18 +0200
Message-ID: <4D6B5EFA.8060106@redhat.com>
In-Reply-To: <4D6A8CC9.4090304@codemonkey.ws>
On 02/27/2011 07:41 PM, Anthony Liguori wrote:
>>
>> I agree 100% the management tool cannot be the authoritative source
>> of state.
>>
>> My position is:
>> - the management tool should be 100% in control of configuration (how
>> the guest is put together from its components)
>> - qemu should be 100% in control of state (memory, disk state, NVRAM
>> in various components, cd-rom eject state, explosive bolts for
>> payload separation, self-destruct mechanism, etc.)
>
>
> There simply is not such a clean separation between the two because
> things that the guest does affect the configuration of the guest.
>
> Hot plug,
I don't think hotunplug works this way. When the guest ejects the pci
or usb device, it simply stops working with the device and disconnects
the power. There is nothing non-volatile going on, no spring-loaded
lever that pushes the device out. If the server reboots immediately
after hotunplug, but before the user physically removes the device, then
the server will see the device when it boots up.
> removable media eject,
Here, we do have a single bit of non-volatile storage.
> persistent device settings (whether it's CMOS or EEPROM) all disrupt
> this model.
These are just arrays of bits, most of them with no standard
interpretation. So a block device fits them perfectly.
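To illustrate the "array of bits" view, here is a minimal sketch, assuming a hypothetical NvramImage helper (not a qemu interface) that keeps a device's non-volatile bytes in a plain file. The backing store assigns no meaning to the contents; only the device model and the guest do.

```python
import os


class NvramImage:
    """Hypothetical file-backed NVRAM: an opaque, fixed-size array of bytes."""

    def __init__(self, path, size):
        self.path = path
        self.size = size
        # Create a zero-filled image on first use; reuse it afterwards.
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(bytes(size))

    def _check(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("access outside NVRAM image")

    def read(self, offset, length):
        self._check(offset, length)
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def write(self, offset, data):
        self._check(offset, len(data))
        with open(self.path, "r+b") as f:
            f.seek(offset)
            f.write(data)
            # Make the update durable so it survives a host restart.
            f.flush()
            os.fsync(f.fileno())
```

Bytes written by one instance are visible to a later instance opened on the same path, which is all that persistence across shutdown and restart requires here.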
>
> If you really wanted to have this separation, you'd have to be very
> strict about making all guest settings not be specified in config.
> You would need to do:
>
> qemu-img create -f e1000-eprom -o macaddr=12:23:45:67:78:90 e1000.0.rom
> qemu-img create -f e1000-eprom -o macaddr=12:23:45:67:78:91 e1000.1.rom
>
> qemu -device e1000,id=e1000.0,eeprom=e1000.0.rom -device
> e1000,id=e1000.1,eeprom=e1000.1.rom
>
> And now I need a tool that lets me modify e1000-eprom images if I want
> to change the mac address dynamically (say I'm trying to clone a VM).
>
> This type of model can be workable but as I said earlier, I think it's
> overengineering the problem.
In fact I don't think anyone wants this. Usually management wants the
assigned MAC to be used without the guest playing games with it. So
it's more or less pointless however it's implemented.
>
> We don't separate configuration from guest state today. Instead of
> setting ourselves up for failure by setting an unrealistic standard
> that we try to achieve and never do, let's embrace the system that is
> working for us today. We are authoritative for everything and guest
> state is intimately tied to the virtual machine configuration.
"we are authoritative for everything" is a clean break from everything
that's being done today. It's also a clean break from the model of
central management plus database. We can't force it on people.

Non-volatile state is not intimately tied to configuration. We store
block device state completely outside the configuration. What's left is
the CD-ROM tray, CMOS memory, and network card EEPROM. We could argue
back and forth about where exactly they belong, but they aren't really
worth the conversation since they are meaningless for real-life use.
>
>>>
>>> But beyond those races, QEMU is the only entity that knows with
>>> certainty what bits of information are important to persist in order
>>> to preserve a guest across shutdown/restart. The fact that we've
>>> punted this problem for so long has only ensured that management
>>> tools are either intrinsically broken or only support the most
>>> minimal subset of functionality we actually support.
>>
>> I'm not arguing about that. I just want to stress again the
>> difference between state and configuration. Qemu has no authority,
>> in my mind, as to configuration. Only state.
>
> Being the one that creates a guest based on configuration, I would say
> that we most certainly do.
That is not what being authoritative means.

In a virt-manager deployment, libvirt is the authoritative source of
guest configuration. In a RHEV-M deployment, the RHEV-M database is the
authoritative source of guest configuration. You can completely replace
the host machine and your guest will recreate just fine as long as the
authoritative source is intact.
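As a rough illustration of "recreate from the authoritative source", the sketch below builds a qemu invocation purely from a stored configuration record. The record layout and the build_qemu_argv name are assumptions made for the example; it is not how libvirt or RHEV-M actually generate command lines.

```python
def build_qemu_argv(guest):
    """Derive a qemu command line from a management-side config record.

    `guest` is a hypothetical dict as it might come out of the management
    database; nothing is read back from a running qemu.
    """
    argv = ["qemu", "-name", guest["name"], "-m", str(guest["memory_mb"])]
    for nic in guest["nics"]:
        # The MAC comes from the management record, not from guest-writable state.
        argv += ["-device", "e1000,id={id},mac={mac}".format(**nic)]
    return argv


guest_record = {
    "name": "web01",
    "memory_mb": 1024,
    "nics": [{"id": "e1000.0", "mac": "12:23:45:67:78:90"}],
}
argv = build_qemu_argv(guest_record)
```

Because the record is the only input, the same guest can be recreated on a replacement host as long as the record itself survives.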
>
>>>> Currently they contain the required guest configuration, a
>>>> representation of what's the current live configuration, and they
>>>> issue monitor commands to move the live configuration towards the
>>>> required configuration (or just generate a qemu command line).
>>>> What you're describing is completely different, I'm not even sure
>>>> what it is.
>>>
>>> Management tools shouldn't have to think about how the monitor
>>> commands they issue impact the invocation options of QEMU.
>>
>> They have to, when creating a guest from scratch.
>>
>> But I admit, this throws a new light (for me) on things. What are the
>> implications?
>> - must have a qemu instance running when editing configuration, even
>> when the guest is down
>
> QMP is an API. Whether a qemu instance is launched is an
> implementation detail. This could all be hidden completely with libqmp.
QMP is first and foremost a protocol.
>
>> - cannot add additional information to configuration; must store it
>> in an external database and cross-reference it with the qemu data
>> using the device ID
>
> Don't confuse a management tool's notion of configuration with QEMU's
> configuration.
>
> A management tool's config is used to initially create and then
> manipulate an existing guest. If the management tool supports
> out-of-band manipulation of a configuration file, then it needs to
> determine how the configuration file changed and execute the
> appropriate commands.
I wasn't talking about that. I was talking about data that is
meaningful to a user but not meaningful to qemu. That sort of data
doesn't store well if qemu is the authoritative source.
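A minimal sketch of the kind of data in question, assuming a hypothetical management-side table keyed by VM name and device ID. The subnet and owner columns mean nothing to qemu, yet they are exactly what a user queries on. The schema and function names are made up for the example.

```python
import sqlite3


def open_inventory():
    """Create an in-memory stand-in for the management database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE nics ("
        " vm TEXT, device_id TEXT, subnet TEXT, owner TEXT,"
        " PRIMARY KEY (vm, device_id))"
    )
    return db


def add_nic(db, vm, device_id, subnet, owner):
    # device_id is the only field that also exists on the qemu side.
    db.execute("INSERT INTO nics VALUES (?, ?, ?, ?)", (vm, device_id, subnet, owner))


def vms_on_subnet(db, subnet):
    """List all VMs that have a network card on the given subnet."""
    rows = db.execute(
        "SELECT DISTINCT vm FROM nics WHERE subnet = ? ORDER BY vm", (subnet,)
    )
    return [row[0] for row in rows]
```

A query such as vms_on_subnet(db, "production") is answered from one place, without contacting any host in the cluster.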
> Yes, it is. libvirt kind of cheats here and just deletes the old VM
> and creates a new one when editing the XML IIUC.
>
>> - no transactions/queries/etc except on non-authoritative source
>> - issues with shared-nothing design (well, can store the
>> configuration file using DRBD).
>
> In both cases, today a management tool races with QEMU so both of
> these points are currently true.
No, it doesn't. If the guest ejects a network card, the network card is
still there. Queries against the database still return correct results.
>
>>>> If you look at management tools, they believe they are the
>>>> authoritative source of configuration information (not guest state,
>>>> which is more or less ignored).
>>>
>>> It's because we've given them no other option.
>>
>> It's the natural way of doing it. You have a web interface that
>> talks to a database. When you want to list all VMs that have network
>> cards on the production subnet, you issue a database query and get a
>> recordset. How do you do that when the authoritative source of
>> information is spread across a cluster?
>
> This problem still exists today. A guest can eject a network card on
> its own (without the management tool issuing a device_del command).
> QEMU will delete the NIC when this happens.
I think that's a bug.
> The same is true with CDROM eject.
CDROM tray position is state, not configuration.
>
> Management tools are simply not authoritative today.
>
> Regards,
>
> Anthony Liguori
--
error compiling committee.c: too many arguments to function