From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=55301 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PsIMQ-00015t-LK
	for qemu-devel@nongnu.org; Wed, 23 Feb 2011 12:18:47 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <avi@redhat.com>) id 1PsIMP-0006Bs-6r
	for qemu-devel@nongnu.org; Wed, 23 Feb 2011 12:18:46 -0500
Received: from mx1.redhat.com ([209.132.183.28]:47336)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <avi@redhat.com>) id 1PsIMO-0006BT-TR
	for qemu-devel@nongnu.org; Wed, 23 Feb 2011 12:18:45 -0500
Message-ID: <4D65416D.8040803@redhat.com>
Date: Wed, 23 Feb 2011 19:18:37 +0200
From: Avi Kivity <avi@redhat.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
References: <20110222170004.808373778@redhat.com>	<20110222170115.710717278@redhat.com>	<4D642181.4080509@codemonkey.ws>	<20110222210735.GA9372@amt.cnet>	<4D64266A.3060106@codemonkey.ws>	<20110222230935.GA11082@amt.cnet>	<4D644343.4050800@codemonkey.ws>
	<4D65051A.6070707@redhat.com>	<4D651B20.70405@codemonkey.ws>
	<4D652852.60505@redhat.com>	<4D652F73.3000305@codemonkey.ws>
	<4D65324A.5080408@redhat.com> <4D65359E.3040008@codemonkey.ws>
In-Reply-To: <4D65359E.3040008@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Jes.Sorensen@redhat.com, Marcelo Tosatti <mtosatti@redhat.com>, qemu-devel@nongnu.org

On 02/23/2011 06:28 PM, Anthony Liguori wrote:
>
>>> Well specifically, it has to ask QEMU and QEMU can tell it the 
>>> current state via a nice structured data format over QMP.  It's a 
>>> hell of a lot easier than the management tool trying to do this 
>>> outside of QEMU.
>>
>> So, if qemu crashes, the management tool has to start it up to find 
>> out what the current state is.
>
> Depends on how opaque we make the state file.  I've been thinking a 
> simple ini syntax with a well supported set of keys.  In that case, a 
> management tool can read it without starting QEMU.

Then the management stack has to worry about yet another way of 
interacting with qemu.  I'd like to limit it to the monitor.

>>
>> Doesn't the stateful non-config file becomes a failure point?  It has 
>> to be on shared and redundant storage?
>
> It depends on what your availability model is and how frequently your 
> management tool backs up the config.  As of right now, we have a 
> pretty glaring reliability hole here so adding a stateful "non-config" 
> can only improve things.

I think the solutions I pointed out close the hole with the existing 
interfaces.

>
>> To me, it seems a lot easier to require management to replay any 
>> commands that hadn't been acknowledged (due to management failure), 
>> or to query qemu as to its current state (if it is alive). 
>
> You still have the race condition around guest initiated events like 
> eject.  Unless you have an acknowledged event from a management tool 
> (which we can't do in QMP today) whereas you don't complete the guest 
> initiated eject operation until management ack's it, we need to store 
> that state ourself.

I don't see why.

If management crashes, it queries the eject state when it reconnects to 
qemu.
If qemu crashes, the eject state is lost, but that is fine.  My CD-ROM 
drive tray pulls itself in when the machine is started.
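
A minimal sketch of the management-side reconnect path, assuming the 
reply to QMP's query-block lists each device with "device", "removable" 
and "tray_open" fields (the exact field names vary between qemu 
versions, so treat them as an assumption).  The helper names 
tray_states and resync_after_reconnect are hypothetical.  The point is 
only that management throws away its cached view and rebuilds it from 
what qemu reports:

```python
import json


def tray_states(query_block_reply):
    """Map device name -> tray-open flag from a query-block style reply."""
    states = {}
    for dev in query_block_reply.get("return", []):
        if dev.get("removable"):
            # A missing flag is treated as "closed", like a freshly started drive.
            states[dev["device"]] = bool(dev.get("tray_open", False))
    return states


def resync_after_reconnect(cached, query_block_reply):
    """Return (fresh_state, changed_devices); the old cache is discarded."""
    fresh = tray_states(query_block_reply)
    changed = sorted(d for d in fresh if cached.get(d) != fresh[d])
    return fresh, changed


# Example reply text; the shape is assumed, not captured from a real qemu.
reply = json.loads(
    '{"return": [{"device": "ide1-cd0", "removable": true, "tray_open": true},'
    ' {"device": "virtio0", "removable": false}]}'
)
fresh, changed = resync_after_reconnect({"ide1-cd0": False}, reply)
print(fresh, changed)
```

If qemu itself was restarted, the same code runs against the new 
instance and simply sees a closed tray.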

>
> I don't like the idea of making a management tool such an integral 
> part of the functional paths. 

I agree that we don't want qemu to wait on the management stack any more 
than necessary.

> Not having a stateful config file also means that this problem isn't 
> solved in any form without a really sophisticated management stack.  
> I'm a big fan of being robust in the face of not-so sophisticated 
> management tools.

You're introducing the need for additional code in the management layer, 
the care and feeding of the stateful non-config file.

>> If qemu crashes, these events are meaningless.  If management 
>> crashes, it has to query qemu for all state that it wants to keep 
>> track of via events.
>
> Think power failure, not qemu crash.  In the event of a power failure, 
> any hardware change initiated by the guest ought to be consistent with 
> when the guest has restarted.  If you eject the CDROM tray and then 
> lose power, its still ejected after the power comes back on.

Not on all machines.

Let's list guest state which is independent of power.  That would be 
either NVRAM of various types, or physical alterations.  CD-ROM eject is 
one.  Are there others?

>
>>>
>>> I think the nature of a posted event management interface is such 
>>> that we need a stateful config that persists across QEMU invocations.
>>
>> I'm not convinced, and I think making qemu manage even more state 
>> creates more problems.
>
> Well this patch series is making qemu management more state.  The only 
> question is whether we do this as a one-off mechanism or whether we 
> architect a general mechanism to do it.
>
> How much state we store can always be up for discussion but I think 
> it's undeniable that we need to store more state than we're storing 
> today (none).

I think my solution (multiplexing block format driver) fits the 
requirements for live-copy perfectly.  In fact it has a name - it's a 
RAID-1 driver started in degraded mode.  It could be useful for other use cases.
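
To illustrate what I mean, here is a toy model in Python rather than 
qemu's block layer.  The class name DegradedMirror and its methods are 
made up for the example, and images are just lists of blocks.  Guest 
writes go to both legs, reads are served from the source, and a 
background step copies blocks that have not been written yet:

```python
class DegradedMirror:
    """Toy RAID-1 mirror that starts with an empty (degraded) second leg."""

    def __init__(self, source, dest):
        assert len(source) == len(dest)
        self.source = source
        self.dest = dest
        # No block of the destination is valid when the mirror starts.
        self.synced = [False] * len(source)

    def write(self, block, data):
        # Guest writes hit both legs, so a written block is in sync at once.
        self.source[block] = data
        self.dest[block] = data
        self.synced[block] = True

    def read(self, block):
        # The source is always complete, so reads never touch the new leg.
        return self.source[block]

    def resync_step(self):
        """Copy one unsynced block; return False when nothing is left."""
        for block, done in enumerate(self.synced):
            if not done:
                self.dest[block] = self.source[block]
                self.synced[block] = True
                return True
        return False

    def in_sync(self):
        return all(self.synced)


mirror = DegradedMirror(list("abcd"), [None] * 4)
mirror.write(1, "X")          # guest write arrives during the copy
steps = 0
while mirror.resync_step():   # background copy of the remaining blocks
    steps += 1
print(mirror.dest, steps, mirror.in_sync())
```

Once in_sync() is true, switching to the destination is the same as 
dropping the old leg from the mirror.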

-- 
error compiling committee.c: too many arguments to function