From: Anthony Liguori <aliguori@linux.vnet.ibm.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration
Date: Tue, 07 Sep 2010 09:57:15 -0500
Message-ID: <4C8652CB.9060801@linux.vnet.ibm.com>
In-Reply-To: <AANLkTim7UHH3r__3C_Ad3oB1rnXyRsH7bcuZw+rBQP6=@mail.gmail.com>

On 09/07/2010 09:49 AM, Stefan Hajnoczi wrote:
> On Tue, Sep 7, 2010 at 3:34 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>
>> On 07.09.2010 15:41, Anthony Liguori wrote:
>>
>>> Hi,
>>>
>>> We've got copy-on-read and image streaming working in QED and before
>>> going much further, I wanted to bounce some interfaces off of the
>>> libvirt folks to make sure our final interface makes sense.
>>>
>>> Here's the basic idea:
>>>
>>> Today, you can create images based on base images that are copy on
>>> write.  With QED, we also support copy on read which forces a copy from
>>> the backing image on read requests and write requests.
>>>
>>> In addition to copy on read, we introduce a notion of streaming a
>>> block device which means that we search for an unallocated region of the
>>> leaf image and force a copy-on-read operation.
>>>
>>> The combination of copy-on-read and streaming means that you can start a
>>> guest based on slow storage (like over the network) and bring in blocks
>>> on demand while also having a deterministic mechanism to complete the
>>> transfer.
>>>
>>> The interface for copy-on-read is just an option within qemu-img
>>> create.
>>>        
>> Shouldn't it be a runtime option? You can use the very same image with
>> copy-on-read or copy-on-write and it will behave the same (except for
>> performance), so it's not an inherent feature of the image file.
>>
>> Doing it this way has the additional advantage that you need no image
>> format support for this, so we could implement copy-on-read for other
>> formats, too.
>>      
> I agree that streaming should be generic, like block migration.  The
> trivial generic implementation is:
>
> void bdrv_stream(BlockDriverState *bs)
> {
>      for (sector = 0; sector < bdrv_getlength(bs); sector += n) {
>          if (!bdrv_is_allocated(bs, sector, &n)) {
>    

There are three problems here.  The first is that bdrv_is_allocated() is
synchronous.  The second is that streaming makes the most sense when each
step is the smallest useful piece of work, whereas bdrv_is_allocated()
may return a very large range.

You could cap the range here, but you then need to make sure that the cap
is at least cluster_size to avoid a lot of unnecessary I/O.
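
To make the capping point concrete, here is a minimal sketch.  The helper
name stream_step_len() and its parameters are hypothetical and not part of
the QEMU block layer; it only illustrates raising a too-small cap to
cluster_size before trimming the (possibly very large) unallocated range
to one streaming step.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU API.
 *
 * unallocated_len: length of the unallocated range reported by an
 *                  allocation query (may be very large)
 * cap:             requested upper bound for one streaming step
 * cluster_size:    cluster size of the image format
 *
 * All three values are in the same unit (for example bytes).
 * Returns the amount of data one streaming step should copy.
 */
static uint64_t stream_step_len(uint64_t unallocated_len, uint64_t cap,
                                uint64_t cluster_size)
{
    assert(cluster_size > 0);

    /* A cap below one cluster would split clusters across steps and
     * cause extra I/O, so raise it to cluster_size. */
    if (cap < cluster_size) {
        cap = cluster_size;
    }

    /* Never copy more than the range that is actually unallocated. */
    return unallocated_len < cap ? unallocated_len : cap;
}
```

With a 64 KB cluster size, a requested cap of 512 is raised to 65536, while
a short unallocated range of 1000 is returned unchanged.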

The QED streaming implementation is only 140 lines of code, so you quickly
end up adding more code to the block formats to support these new
interfaces than it takes to just implement streaming in the block format.

The third problem is that streaming really requires being able to do
zero-write detection in a meaningful way.  You don't want to do zero-write
detection on every write, so you need another interface to mark a specific
write as one that should be checked for zeros.
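
As a rough illustration of what "marking a specific write" could look
like, here is a sketch.  The flag WRITE_DETECT_ZEROES, the enum, and both
functions are made up for this example and are not existing QEMU
interfaces; the point is only that the zero scan runs when the caller asks
for it, and ordinary guest writes skip it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-request flag: only writes carrying it are scanned. */
#define WRITE_DETECT_ZEROES 0x1u

enum write_action {
    WRITE_DATA,        /* allocate and write the data as usual */
    WRITE_SKIP_ZEROES  /* buffer is all zeros, no allocation needed */
};

/* Return true if every byte in buf[0..len) is zero. */
static bool buf_is_zero(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Decide how to handle one write.  The zero scan costs a pass over the
 * buffer, so it is done only when the request is flagged, for example
 * for writes issued by streaming.
 */
static enum write_action classify_write(const uint8_t *buf, size_t len,
                                        unsigned flags)
{
    if ((flags & WRITE_DETECT_ZEROES) && buf_is_zero(buf, len)) {
        return WRITE_SKIP_ZEROES;
    }
    return WRITE_DATA;
}
```

A streaming write would pass WRITE_DETECT_ZEROES, while a normal guest
write would pass 0 and never pay for the scan.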

Regards,

Anthony Liguori
