From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Jes.Sorensen@redhat.com, Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date: Sun, 27 Feb 2011 17:31:04 +0200
Message-ID: <4D6A6E38.4030700@redhat.com>
In-Reply-To: <4D6A58E0.9020607@codemonkey.ws>

On 02/27/2011 04:00 PM, Anthony Liguori wrote:
> On 02/27/2011 03:10 AM, Avi Kivity wrote:
>> On 02/24/2011 07:58 PM, Anthony Liguori wrote:
>>>> If you move the cdrom to a different IDE channel, you have to
>>>> update the stateful non-config file.
>>>>
>>>> Whereas if you do
>>>>
>>>> $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img
>>>> $ qemu -cdrom ~/foo-media-tray.img
>>>>
>>>> the cd-rom tray state will be tracked in the image file.
>>>
>>>
>>> Yeah, but how do you move it?
>>
>> There is no need to move the file at all. Simply point the new drive
>> at the media tray.
>
> No, I was asking, how do you move the cdrom to a different IDE
> channel. Are you using QMP? Are you changing the command line
> arguments?

Yes.

If we're doing hot-move (not really relevant to ide-cd) then you'd use
QMP. If you're editing a virtual machine that is down, or scheduling a
change for the next reboot, then you're using command line arguments (or
cold-plugging into a stopped guest).

Requiring management to remember the old configuration and issue delta
commands to move the device for the cold-plug case is increased
complexity IMO.

>
>>
>>> If you do a remove/add through QMP, then the config file will
>>> reflect things just fine.
>>
>> If all access to the state file is through QMP then it becomes more
>> palatable. A bit on that later.
>
> As I think I've mentioned before, I hadn't really thought about an
> opaque state file but I'm not necessarily opposed to it. I don't see an
> obvious advantage to making it opaque but I agree it should be
> accessible via QMP.
The advantage is that we keep the management tool talking to one
interface (I don't think we should prevent users from interpreting it,
just make it unnecessary).
>>
>> I thought that's what I'm doing by separating the state out. It's
>> easy for management to assemble configuration from their database and
>> convert it into a centralized representation (like a qemu command
>> line). It's a lot harder to disassemble a central state
>> representation and move it back to the database.
>>
>> Using QMP is better than directly accessing the state file since qemu
>> does the disassembly for you (provided the command references the
>> device using its normal path, not some random key). The file just
>> becomes a way to survive a crash, and all management needs to know
>> about is to make it available and back it up. But it means that
>> everything must be done via QMP, including assembly of the machine,
>> otherwise the state file can become stale.
>>
>> Separating the state out to the device is even easier, since
>> management is already expected to take care of disk images. All
>> that's needed is to create the media tray image once, then you can
>> forget about it completely.
>
> Except that instead of having one state file, we might have a dozen
> additional "device state" files.
That is fine. We already have one state file per block device.
>>> QEMU. No question about it. At any point in time, we are the
>>> authoritative source of what the guest's configuration is. There's
>>> no doubt about it. A management tool can try to keep up with us,
>>> but ultimately we are the only ones that know for sure.
>>>
>>> We have all of this information internally. Just persisting it is
>>> not a major architectural change. It's something we should have
>>> been doing (arguably) from the very beginning.
>>
>> That's a huge divergence from how management tools are written.
>
> This is one of the reasons why management tooling around QEMU needs
> quite a bit of improving.
>
> There is simply no way a management tool can do a good job of being an
> authoritative source of configuration. The races we're discussing are
> a good example of why.

What we're discussing is not configuration. It is non-volatile state.
Configuration comes from the user; state comes from the guest (the
management tool may edit state; but the guest cannot edit the
configuration).

I agree 100% the management tool cannot be the authoritative source of
state.

My position is:

- the management tool should be 100% in control of configuration (how
  the guest is put together from its components)
- qemu should be 100% in control of state (memory, disk state, NVRAM in
  various components, cd-rom eject state, explosive bolts for payload
  separation, self-destruct mechanism, etc.)
- the management tool should have access to state using the same
  identifiers it used to create the devices that contain the state
- it is preferable to store state "in" the device so that when the
  configuration changes, state is maintained (like hot-unplug of a network
  card with NVRAM followed by hot-plug of the same card).
- the angular momentum of the planet we (presumably) are on won't
  change, whatever we do [1]
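
To make the split between configuration and state concrete, here is a
minimal Python sketch. It is not qemu code: the class names, the slot
names and the "nvram" key are all hypothetical. It only models the idea
that state lives "in" the device and so follows the device when
management changes where it is plugged in.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A component that carries its own non-volatile state."""
    device_id: str                              # identifier chosen by management
    state: dict = field(default_factory=dict)   # written by the emulator/guest


@dataclass
class Guest:
    """Configuration: which device sits in which slot (owned by management)."""
    slots: dict = field(default_factory=dict)   # slot name -> Device

    def plug(self, slot, device):
        self.slots[slot] = device

    def unplug(self, slot):
        return self.slots.pop(slot)


# Management assembles the guest (hypothetical slot names).
nic = Device("nic0")
guest = Guest()
guest.plug("pci.0/3", nic)

# The guest modifies state; management does not need to track this.
nic.state["nvram"] = b"\x01\x02"

# Management changes the configuration: same card, different slot.
card = guest.unplug("pci.0/3")
guest.plug("pci.0/5", card)

# The state moved with the device, with no delta bookkeeping.
print(guest.slots["pci.0/5"].state)
```

Because the state is reached through the device identifier ("nic0")
rather than through the slot, the configuration change needs no
separate state update.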
>
> But beyond those races, QEMU is the only entity that knows with
> certainty what bits of information are important to persist in order
> to preserve a guest across shutdown/restart. The fact that we've
> punted this problem for so long has only ensured that management tools
> are either intrinsically broken or only support the most minimal
> subset of functionality we actually support.
I'm not arguing about that. I just want to stress again the difference
between state and configuration. Qemu has no authority, in my mind, as
to configuration. Only state.
>> Currently they contain the required guest configuration, a
>> representation of what's the current live configuration, and they
>> issue monitor commands to move the live configuration towards the
>> required configuration (or just generate a qemu command line). What
>> you're describing is completely different, I'm not even sure what it is.
>
> Management tools shouldn't have to think about how the monitor
> commands they issue impact the invocation options of QEMU.
They have to, when creating a guest from scratch.
But I admit, this throws a new light (for me) on things. What are the
implications?
- must have a qemu instance running when editing configuration, even
when the guest is down
- cannot add additional information to configuration; must store it in
an external database and cross-reference it with the qemu data using the
device ID
- when editing non-hotpluggable configuration for the next boot, must
maintain old config somewhere, so we can issue delta commands later
(might be needed for current way of doing things)
- no transactions/queries/etc except on non-authoritative source
- issues with shared-nothing design (well, can store the configuration
file using DRBD).
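
The "delta commands" point can be sketched as follows. This is an
illustration under stated assumptions, not how any management tool
works today: device_add and device_del are existing QMP commands, but
the dictionaries below are only built and printed, never sent to a
monitor, and the example device, its properties and the helper name
delta_commands are hypothetical. The point is that the old
configuration has to be kept around to compute the diff at all.

```python
def delta_commands(old, new):
    """Return QMP-style commands that turn config `old` into config `new`.

    Both arguments map a device id to a dict of device properties.
    A changed device is modelled as a removal followed by an add.
    """
    cmds = []
    # Remove devices that disappeared or whose properties changed.
    for dev_id, props in old.items():
        if new.get(dev_id) != props:
            cmds.append({"execute": "device_del",
                         "arguments": {"id": dev_id}})
    # Add devices that are new or whose properties changed.
    for dev_id, props in new.items():
        if old.get(dev_id) != props:
            cmds.append({"execute": "device_add",
                         "arguments": {"id": dev_id, **props}})
    return cmds


# Hypothetical example: the same NIC moves to a different PCI address.
old_config = {"nic0": {"driver": "e1000", "bus": "pci.0", "addr": "0x3"}}
new_config = {"nic0": {"driver": "e1000", "bus": "pci.0", "addr": "0x5"}}

cmds = delta_commands(old_config, new_config)
for cmd in cmds:
    print(cmd)
```

With a command line generated from scratch, only new_config would be
needed; the delta approach needs both.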
>>
>> If you look at management tools, they believe they are the
>> authoritative source of configuration information (not guest state,
>> which is more or less ignored).
>
> It's because we've given them no other option.
It's the natural way of doing it. You have a web interface that talks
to a database. When you want to list all VMs that have network cards on
the production subnet, you issue a database query and get a recordset.
How do you do that when the authoritative source of information is
spread across a cluster?
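
As a rough illustration of that database query, here is a
self-contained sketch using Python's sqlite3 module with an in-memory
database. The schema, table names and sample rows are invented for the
example; a real management tool's database will look different.

```python
import sqlite3

# In-memory database standing in for the management tool's store.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE vm  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE nic (vm_id INTEGER REFERENCES vm(id), subnet TEXT);
INSERT INTO vm  VALUES (1, 'web1'), (2, 'db1'), (3, 'test1');
INSERT INTO nic VALUES (1, 'production'), (2, 'production'), (3, 'lab');
""")

# "List all VMs that have network cards on the production subnet."
rows = db.execute(
    "SELECT DISTINCT vm.name FROM vm "
    "JOIN nic ON nic.vm_id = vm.id "
    "WHERE nic.subnet = ? ORDER BY vm.name",
    ("production",),
).fetchall()

names = [row[0] for row in rows]
print(names)
```

One query against one store answers the question. If each qemu
instance held the authoritative copy, the same answer would require
asking every host in the cluster.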
>
>>>>
>>>> Right, but we should make it easy, not hard.
>>>
>>> Yeah, I fail to see how this makes it hard. We conveniently are
>>> saying, hey, this is all the state that needs to be persisted.
>>> We'll persist it for you if you want, otherwise, we'll expose it in
>>> a central location.
>>
>> The state-in-a-file is just a blob. Don't expect the tool to parse
>> it and reassociate the various bits to its own representation.
>> Exposing it via QMP commands is a lot better though.
>
> I don't really see this as being a major issue. There's no such thing
> as a "blob". If someone wants to manipulate the state, they will.
> We need to keep compatibility to support migrating from
> version-to-version.
>
> I agree that we want to provide QMP interfaces to work with the state
> file. But I don't think we should be hostile to manual manipulation.
No, not hostile. We should make QMP commands sufficient to deal with
it, that's all.
[1] in fact, it does change, due to tidal effects.
--
error compiling committee.c: too many arguments to function