Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

From: Kaveh Razavi <kaveh@cs.vu.nl>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format
Date: Wed, 14 Aug 2013 16:20:27 +0200
Message-ID: <520B922B.6030806@cs.vu.nl>
In-Reply-To: <20130814092912.GC14914@stefanha-thinkpad.redhat.com>

Hi,

On 08/14/2013 11:29 AM, Stefan Hajnoczi wrote:
> 100 MB is small enough for RAM.  Did you try enabling the host kernel
> page cache for the backing file?  That way all guests running on this
> host share a single RAM-cached version of the backing file.
>

Yes, indeed. That is why we think it makes sense to store many of these
cache images in memory at the storage node, to avoid hot-spotting its
disk(s). Relying on the page cache at the storage node may not be
enough, since there is no guarantee about what stays there.

The VM host page cache can be evicted at any time, forcing the host to
go over the network again to read from the backing file. Since these
cache images are small, many of them can be stored on the hosts, instead
of caching many complete backing images, which are usually on the order
of gigabytes.

> The other existing solution is to use the image streaming feature, which
> was designed to speed up deployment of image files over the network.  It
> copies the contents of the image from a remote server onto the host
> while allowing immediate random access from the guest.  This isn't a
> cache, this is a full copy of the image.
> 

Streaming the complete image may work well for some cases, but streaming
at scale to many hosts at the same time can easily create a bottleneck
in the network. In most scenarios, only a fraction of the backing file
is needed during the lifetime of a VM.

> I share an idea of how to turn this into a cache in a second, but first
> how to deploy this safely.  Since multiple QEMU processes can share a
> backing file and the cache must not suffer from corruptions due to
> races, you can use one qemu-nbd per backing image.  The QEMU processes
> connect to the local read-only qemu-nbd server.
> 
> If you want a cache you could enable copy-on-read without the image
> streaming feature (block_stream command) and evict old data using
> discard commands.  No qcow2 image format changes are necessary to do
> this.

This is an interesting alternative. I may be wrong, but I think it has
two limitations: 1) it is not persistent and 2) you cannot enforce a
quota.

(1) is important if you would like to have a pool of these cache images
that survives a reboot. (2) is important if the caching medium is a
scarce resource such as memory, and also if you want to make sure that
only important data blocks get cached (i.e., data blocks needed for
booting).
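
To make point (2) concrete, here is a rough sketch of a quota-bounded
cache. It is a toy model, not the code from the patch: the class name
QuotaCache, the dict-based backing store, and the "first reads fill the
quota" policy are all assumptions made for illustration. The idea is
that the earliest reads (typically the boot-time blocks) are kept, and
anything past the quota is served from the backing file without being
cached.

```python
class QuotaCache:
    """Toy model of a cache image with a fixed quota (hypothetical)."""

    def __init__(self, backing, quota_bytes, cluster_size=65536):
        self.backing = backing            # assumed: dict of cluster index -> bytes
        self.quota_bytes = quota_bytes    # maximum bytes the cache may hold
        self.cluster_size = cluster_size
        self.clusters = {}                # cached clusters
        self.backing_reads = 0            # reads that went to the backing file

    def read(self, index):
        # Cache hit: no access to the backing file.
        if index in self.clusters:
            return self.clusters[index]

        # Cache miss: fetch the cluster from the backing file.
        data = self.backing[index]
        self.backing_reads += 1

        # Copy into the cache only while the quota is not exhausted.
        used = len(self.clusters) * self.cluster_size
        if used + self.cluster_size <= self.quota_bytes:
            self.clusters[index] = data
        return data
```

With a quota of two clusters, reading clusters 0, 1 and 2 caches only 0
and 1; later reads of cluster 2 keep going to the backing file.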

> This is unsafe since other QEMU processes on the host are not
> synchronizing with each other.  The image file will be corrupted.

That is true. My earlier solution allows only a single qemu process to
write to the cache at a time. Other qemu processes can only read from
it once it is ready and no longer modified.
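
The single-writer rule can be sketched as a small state machine. Again
this is only an illustration under my own assumptions: the class
SharedCacheImage and its method names are hypothetical, and a real
implementation would need a lock that works across processes rather
than in-memory state.

```python
class SharedCacheImage:
    """Toy model: one writer populates the cache, then it becomes read-only."""

    def __init__(self):
        self.writer = None    # owner currently allowed to write, if any
        self.sealed = False   # True once the cache is ready and immutable
        self.clusters = {}

    def acquire_writer(self, owner):
        # Only one process may populate the cache, and only before sealing.
        if self.sealed or self.writer is not None:
            return False
        self.writer = owner
        return True

    def write(self, owner, index, data):
        if self.sealed or owner != self.writer:
            raise PermissionError("only the single writer may modify the cache")
        self.clusters[index] = data

    def seal(self, owner):
        # The writer declares the cache ready; no further modification.
        if owner != self.writer:
            raise PermissionError("only the writer may seal the cache")
        self.writer = None
        self.sealed = True

    def read(self, owner, index):
        # Other processes may read only once the cache is sealed.
        if not self.sealed and owner != self.writer:
            raise PermissionError("cache is still being populated")
        return self.clusters.get(index)
```

A second process calling acquire_writer() while the first is still
populating the cache gets False and has to fall back to the backing
file until seal() has been called.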

Kaveh
