From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=33911 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1Pu7aC-0002wX-0X
    for qemu-devel@nongnu.org; Mon, 28 Feb 2011 13:12:34 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Pu7Zx-0005Cr-56
    for qemu-devel@nongnu.org; Mon, 28 Feb 2011 13:12:22 -0500
Received: from mail-yx0-f173.google.com ([209.85.213.173]:51089)
    by eggs.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1Pu7Zw-00056L-W8
    for qemu-devel@nongnu.org; Mon, 28 Feb 2011 13:12:17 -0500
Received: by mail-yx0-f173.google.com with SMTP id 8so2005678yxk.4
    for ; Mon, 28 Feb 2011 10:12:16 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <4D6BDFA1.3000100@redhat.com>
References: <20110222170004.808373778@redhat.com>
    <20110222170115.710717278@redhat.com>
    <4D642181.4080509@codemonkey.ws>
    <20110222210735.GA9372@amt.cnet>
    <4D64266A.3060106@codemonkey.ws>
    <20110222230935.GA11082@amt.cnet>
    <4D644343.4050800@codemonkey.ws>
    <4D65051A.6070707@redhat.com>
    <4D651B20.70405@codemonkey.ws>
    <4D652852.60505@redhat.com>
    <4D652F73.3000305@codemonkey.ws>
    <4D65324A.5080408@redhat.com>
    <4D65359E.3040008@codemonkey.ws>
    <4D65416D.8040803@redhat.com>
    <4D656B97.5030301@codemonkey.ws>
    <4D661CB8.6010305@redhat.com>
    <4D667287.9010005@codemonkey.ws>
    <4D6677BE.2030009@redhat.com>
    <4D669C46.40909@codemonkey.ws>
    <4D6A150B.8030205@redhat.com>
    <4D6A58E0.9020607@codemonkey.ws>
    <4D6A6E38.4030700@redhat.com>
    <4D6A8CC9.4090304@codemonkey.ws>
    <4D6B5EFA.8060106@redhat.com>
    <4D6B98FD.7020103@codemonkey.ws>
    <4D6BA16A.2020204@redhat.com>
    <4D6BDFA1.3000100@redhat.com>
Date: Mon, 28 Feb 2011 12:12:16 -0600
Message-ID:
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
From: Anthony Liguori
Content-Type: multipart/alternative; boundary=20cf303f6496aaa22a049d5b9ebd
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Avi Kivity
Cc: Jes.Sorensen@redhat.com, Marcelo Tosatti , qemu-devel@nongnu.org

--20cf303f6496aaa22a049d5b9ebd
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable


On Feb 28, 2011 11:47 AM, "Avi Kivity" <avi@redhat.com> wrote:
>
> On 02/28/2011 07:33 PM, Anthony Liguori wrote:
>>
>>
>> >
>> > You're just ignoring what I've written.
>>
>> No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem.
>>
>> There are a lot of good, reasonably straightforward changes we can make that have a high return on investment.
>>
>
> Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low?

I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything.

Do a hot unplug of a network device with upstream libvirt while acpiphp is unloaded in the guest, then consult libvirt and consult the monitor to see which of them has the right view of the guest's config.

To me, that's the definition of authoritative.

> "No" to all three (ignoring for the moment whether it is good or not, which we were debating).
>
>
>> The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places.
>>
>
> No, you're suggesting a lot more than that.

That's exactly what I'm suggesting from a technical perspective.

>> I don't think this creates a fundamental break in how management tools interact with QEMU. I don't think introducing RAID support in the block layer is a reasonable alternative.
>>
>>
>
> Why not?

Because it's a lot of complexity and code that can go wrong while only solving the race for one specific case. Not to mention that we double the iop rate.

> Something that avoids the whole state thing altogether:
>
> - instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy
> - issue a notification to management
> - management receives the notification, and issues an atomic blockdev switch command

> this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap.

It still suffers from the two generals problem. You cannot solve this without making one node reliable and that takes us back to it being either QEMU (posted event and state file) or the management tool (sync event).
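
To make the failure window concrete, here is a toy Python model of the mirror-then-switch scheme quoted above. It is only a sketch under stated assumptions: it is not QEMU code, and MirroredDrive, switch() and run() are hypothetical names made up for illustration. The drive mirrors every write to both images until the switch command arrives; the "notification" is just a flag saying whether management ever saw the event and acted on it.

```python
from dataclasses import dataclass, field


@dataclass
class MirroredDrive:
    """Hypothetical toy drive: mirrors writes until told to switch."""
    origin: dict = field(default_factory=dict)
    copy: dict = field(default_factory=dict)
    active: str = "mirror"  # "mirror" = write both images, "copy" = switched

    def write(self, sector, data):
        # While mirroring, every write goes to both images.
        if self.active == "mirror":
            self.origin[sector] = data
        self.copy[sector] = data

    def switch(self):
        # Stand-in for the atomic blockdev switch issued by management.
        self.active = "copy"


def run(notification_delivered):
    """Return (active image, whether origin and copy still match)."""
    drive = MirroredDrive()
    drive.write(0, b"a")
    # The copy is complete here and QEMU notifies management.
    # If the event is lost, the switch command never arrives.
    if notification_delivered:
        drive.switch()
    drive.write(1, b"b")
    return drive.active, drive.origin == drive.copy


print(run(True))   # switched: the origin goes stale after the switch
print(run(False))  # event lost: still mirroring, images identical
```

In this model, losing the event is harmless for the data (both images stay in sync) but management never learns that it may switch. Once the switch has happened, the origin is stale, so if the reply to the switch command is lost, management cannot tell which image is current without asking a side that recorded the outcome durably.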

Regards,

Anthony Liguori

>
> --
> error compiling committee.c: too many arguments to function
>

--20cf303f6496aaa22a049d5b9ebd--