From: Kevin Wolf <kwolf@redhat.com>
To: Taisuke Yamada <tai@rakugaki.org>
Cc: jcody@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] CoW image commit+shrink(= make_empty) support
Date: Fri, 08 Jun 2012 12:39:12 +0200
Message-ID: <4FD1D650.6020207@redhat.com>
In-Reply-To: <CAM8qCPK4_WeOVg=C+Lu95PwjTWbxUS3j+PYkqSAB-t0vOTRHZw@mail.gmail.com>

On 07.06.2012 08:19, Taisuke Yamada wrote:
> I attended Paolo Bonzini's qemu session ("Live Disk Operations: Juggling
> Data and Trying to go Unnoticed") at LinuxCon Japan, and he advised me
> to post the bits I have regarding my question about qemu's support for
> shrinking CoW images.
>
> Here's my problem description.
>
> I recently designed an experimental system which holds VM master images
> on an HDD and CoW snapshots on an SSD. VMs run on CoW snapshots only.
> This split-image configuration is done to keep VM I/Os on an SSD

This is an interesting use case that I wasn't aware of yet. So you're
not really interested in a snapshot here, but what you're trying to do
is to use the SSD as some sort of cache, right?

> As SSD capacity is rather limited, I need to do a writeback commit from SSD to
> HDD from time to time, and that is done on weekends or at midnight. The problem
> is that although a commit is made, that alone won't shrink the CoW image - all
> unused blocks are still kept in the snapshot and use up space.
>
> The attached patch is a workaround I added to cope with the problem,
> but the basic problem I faced was that neither the QCOW2 nor the QED format
> supports the "bdrv_make_empty" API yet.
>
> Implementing the API (say, by hole punching) seemed like a lot of effort, so
> I ended up creating a new CoW image and then replacing the current CoW
> snapshot with the new (empty) one. But I find the code ugly.

It's kind of a hack indeed, but if it works...? :-)

I agree that the real solution would be hole punching. We already
support this for raw images on XFS and we want to extend it (I think
there are even patches floating around for it). Once you have this,
implementing bdrv_make_empty() for qcow2 shouldn't be too hard; it might
actually just take calling qcow2_co_discard() and adding another discard
call in qcow2_free_cluster() that passes the request to the image file.
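
To make the idea above concrete, here is a minimal toy model in Python. It
is not QEMU code: the classes and the method names (Overlay, free_cluster,
make_empty) are hypothetical stand-ins, the dict plays the role of the qcow2
cluster tables, and the "discarded" list only records where a real
implementation would pass a discard down to the image file.

```python
class Backing:
    """Toy backing image: cluster index -> data."""

    def __init__(self):
        self.clusters = {}

    def read(self, idx):
        # Unwritten clusters read as zero.
        return self.clusters.get(idx, b"\0")


class Overlay:
    """Toy CoW overlay on top of a Backing image (hypothetical model)."""

    def __init__(self, backing):
        self.backing = backing
        self.allocated = {}   # stands in for the overlay's cluster tables
        self.discarded = []   # clusters handed back to the underlying file

    def write(self, idx, data):
        self.allocated[idx] = data

    def read(self, idx):
        # Allocated clusters win; everything else falls through.
        if idx in self.allocated:
            return self.allocated[idx]
        return self.backing.read(idx)

    def commit(self):
        # Copy every allocated cluster into the backing image.
        self.backing.clusters.update(self.allocated)

    def free_cluster(self, idx):
        # Drop the mapping, then "discard" the space in the image file.
        del self.allocated[idx]
        self.discarded.append(idx)

    def make_empty(self):
        # Discard the whole overlay, cluster by cluster.
        for idx in list(self.allocated):
            self.free_cluster(idx)
```

In this model, commit() followed by make_empty() leaves guest-visible data
unchanged while the overlay no longer holds any clusters.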

> In his talk, Paolo suggested the possibility of using the new "live op" API for
> this task, but I'm not aware of the actual API. Is there any documentation or
> source code I can look at to re-implement the above feature?

The problem that a live block operation could solve sounds unrelated to
me: while you perform your 'commit' monitor command, the VM doesn't run.
Some kind of live commit would surely be helpful there (that's what Jeff
Cody is working on, iirc).

Maybe you could actually use a live commit mode where the commit stays
active all the time in the background so that you write to your SSD and
signal completion to the guest while the background job starts copying
the request to your backing file on the slow disk. Or actually, it
sounds quite similar to the "block mirror" approaches that were
discussed recently, where guest requests are duplicated to the current
(SSD) image and a secondary (HDD) image.
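
As a rough illustration of the always-active background commit described
above, here is a toy Python sketch. Everything in it is hypothetical
(WritebackOverlay, guest_write, background_step); it only models the
ordering of events, not any existing or planned QEMU interface.

```python
from collections import deque


class WritebackOverlay:
    """Toy model: writes complete on the SSD, a job drains them to the HDD."""

    def __init__(self):
        self.ssd = {}            # fast image the guest writes to
        self.hdd = {}            # slow backing file
        self.pending = deque()   # clusters not yet copied to the HDD

    def guest_write(self, idx, data):
        # The write lands on the SSD and is queued for the background job.
        self.ssd[idx] = data
        self.pending.append(idx)
        # Completion is signalled before the HDD has been touched.
        return "completed"

    def background_step(self):
        # Copy one queued cluster to the HDD; return False when idle.
        if not self.pending:
            return False
        idx = self.pending.popleft()
        self.hdd[idx] = self.ssd[idx]
        return True
```

The guest only ever waits for the SSD; how far the HDD lags behind depends
on how often background_step() gets to run.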

One other thing to consider that complicates everything is that
committing to the backing file when bdrv_make_empty is implemented
obviously also means that reads go to the slow disk now. So I guess you
really want to commit only part of the image on the SSD...
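
One possible reading of "commit only part of the image" is to push only
the least recently used clusters to the HDD and keep the rest on the SSD.
The toy Python sketch below assumes that policy; the LRU choice, the
capacity parameter and all names are assumptions made for illustration,
not something proposed in this thread.

```python
from collections import OrderedDict


class CachedOverlay:
    """Toy model: SSD overlay that commits only its coldest clusters."""

    def __init__(self, capacity):
        self.capacity = capacity   # clusters allowed to stay on the SSD
        self.ssd = OrderedDict()   # least recently used cluster first
        self.hdd = {}

    def write(self, idx, data):
        self.ssd[idx] = data
        self.ssd.move_to_end(idx)  # mark as most recently used

    def read(self, idx):
        # Returns (data, device that served the read).
        if idx in self.ssd:
            self.ssd.move_to_end(idx)
            return self.ssd[idx], "ssd"
        return self.hdd.get(idx, b"\0"), "hdd"

    def partial_commit(self):
        # Move cold clusters to the HDD until the SSD fits its budget.
        while len(self.ssd) > self.capacity:
            idx, data = self.ssd.popitem(last=False)
            self.hdd[idx] = data
```

After partial_commit(), recently touched clusters are still served from the
SSD and only the evicted ones pay the cost of the slow disk.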

Kevin