From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=60071 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OukE1-0004Oa-Nr
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:55:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1OukE0-0005lT-FM
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:55:57 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42029)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1OukE0-0005lH-87
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:55:56 -0400
Message-ID: <4C8CB1B0.9070203@redhat.com>
Date: Sun, 12 Sep 2010 12:55:44 +0200
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration
References: <4C864118.7070206@linux.vnet.ibm.com>
In-Reply-To: <4C864118.7070206@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: "libvir-list@redhat.com", qemu-devel, Stefan Hajnoczi

On 09/07/2010 04:41 PM, Anthony Liguori wrote:
> Hi,
>
> We've got copy-on-read and image streaming working in QED and before
> going much further, I wanted to bounce some interfaces off of the
> libvirt folks to make sure our final interface makes sense.
>
> Here's the basic idea:
>
> Today, you can create images based on base images that are copy on
> write. With QED, we also support copy on read which forces a copy
> from the backing image on read requests and write requests.

Is copy on read QED specific? It looks very similar to the commit
command, except with I/O directions reversed.
IIRC, commit looks like

    for each sector:
        if image.mapped(sector):
            backing_image.write(sector, image.read(sector))

whereas copy-on-read looks like:

    def copy_on_read():
        set_ioprio(idle)
        for each sector:
            if not image.mapped(sector):
                image.write(sector, backing_image.read(sector))

    run_in_thread(copy_on_read)

With appropriate locking.

>
> In additional to copy on read, we introduce a notion of streaming a
> block device which means that we search for an unallocated region of
> the leaf image and force a copy-on-read operation.
>
> The combination of copy-on-read and streaming means that you can start
> a guest based on slow storage (like over the network) and bring in
> blocks on demand while also having a deterministic mechanism to
> complete the transfer.
>
> The interface for copy-on-read is just an option within qemu-img
> create. Streaming, on the other hand, requires a bit more thought.
> Today, I have a monitor command that does the following:
>
> stream
>
> Which will try to stream the minimal amount of data for a single I/O
> operation and then return how many sectors were successfully streamed.
>
> The idea about how to drive this interface is a loop like:
>
> offset = 0;
> while offset < image_size:
>     wait_for_idle_time()
>     count = stream(device, offset)
>     offset += count
>

This is way too low level for the management stack.

Have you considered using the idle class I/O priority to implement this?
That would allow host-wide prioritization. Not sure how to do
cluster-wide, I don't think NFS has the concept of I/O priority.

-- 
error compiling committee.c: too many arguments to function
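
For illustration, the commit and copy-on-read pseudocode above can be made runnable as a toy in-memory model. This is only a sketch under stated assumptions, not how QEMU or QED implements either operation: the Image class, its dict of sectors, and the single per-image lock are hypothetical stand-ins, and the set_ioprio(idle) step is left out because the Python standard library has no portable call for it.

```python
import threading


class Image:
    """Hypothetical toy image: allocated sectors live in a dict, the
    rest fall through to an optional backing image."""

    def __init__(self, num_sectors, backing=None):
        self.num_sectors = num_sectors
        self.backing = backing
        self.sectors = {}
        self.lock = threading.Lock()  # stands in for "appropriate locking"

    def mapped(self, sector):
        # True if this image holds its own copy of the sector.
        return sector in self.sectors

    def read(self, sector):
        if sector in self.sectors:
            return self.sectors[sector]
        if self.backing is not None:
            return self.backing.read(sector)
        return b"\0"  # unallocated everywhere: read back as zero

    def write(self, sector, data):
        self.sectors[sector] = data


def commit(image):
    """Push every sector allocated in the leaf down into the backing image."""
    for sector in range(image.num_sectors):
        with image.lock:
            if image.mapped(sector):
                image.backing.write(sector, image.read(sector))


def copy_on_read(image):
    """Pull every sector the leaf lacks up from the backing image.
    I/O priority is not set here; see the assumptions above."""
    for sector in range(image.num_sectors):
        with image.lock:
            if not image.mapped(sector):
                image.write(sector, image.backing.read(sector))


def run_in_thread(fn, *args):
    """Run fn(*args) in a background thread and return the thread."""
    thread = threading.Thread(target=fn, args=args)
    thread.start()
    return thread


if __name__ == "__main__":
    base = Image(4)
    for s in range(4):
        base.write(s, f"base{s}".encode())
    leaf = Image(4, backing=base)
    leaf.write(1, b"guest")  # a guest write that must survive the copy
    run_in_thread(copy_on_read, leaf).join()
    print(sorted(leaf.sectors.items()))
```

Taking the lock once per sector, rather than around the whole loop, lets guest I/O interleave with the background copy. The mapped() check inside the lock is what keeps a sector the guest has already written from being overwritten with stale backing data.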