qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: benoit.canet@irqsave.net, jcody@redhat.com, famz@redhat.com,
	qemu-devel@nongnu.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] [RFC] qcow2 journalling draft
Date: Thu, 5 Sep 2013 11:35:43 +0200
Message-ID: <20130905093543.GC12293@stefanha-thinkpad.redhat.com>
In-Reply-To: <1378215952-7151-1-git-send-email-kwolf@redhat.com>

On Tue, Sep 03, 2013 at 03:45:52PM +0200, Kevin Wolf wrote:
> This contains an extension of the qcow2 spec that introduces journalling
> to the image format, plus some preliminary type definitions and
> function prototypes in the qcow2 code.
> 
> Journalling functionality is a crucial feature for the design of data
> deduplication, and it will improve the core part of qcow2 by avoiding
> cluster leaks on crashes as well as provide an easier way to get a
> reliable implementation of performance features like Delayed COW.
> 
> At this point of the RFC, it would be most important to review the
> on-disk structure. Once we're confident that it can do everything we
> want, we can start going into more detail on the qemu side of things.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/Makefile.objs   |   2 +-
>  block/qcow2-journal.c |  55 ++++++++++++++
>  block/qcow2.h         |  78 +++++++++++++++++++
>  docs/specs/qcow2.txt  | 204 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 337 insertions(+), 2 deletions(-)
>  create mode 100644 block/qcow2-journal.c

Although we are still discussing details of the on-disk layout, the
general design is clear enough to discuss how the journal will be used.

Today qcow2 uses Qcow2Cache to do lazy, ordered metadata updates.  The
performance is pretty good with two exceptions that I can think of:

1. The delayed CoW problem that Kevin has been working on.  Guests
   perform sequential writes that are smaller than a qcow2 cluster.  The
   first write triggers a copy-on-write of the full cluster.  Later
   writes then overwrite the copied data.  It would be more efficient to
   anticipate sequential writes and hold off on CoW where possible.

2. Lazy metadata updates lead to bursty behavior and expensive flushes.
   We do not take advantage of disk bandwidth since metadata updates
   stay in the Qcow2Cache until the last possible second.  When the
   guest issues a flush we must write out dirty Qcow2Cache entries and
   possibly fsync between them if dependencies have been set (e.g.
   refcount before L2).
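
To put rough numbers on the two points above, here is a toy model. It is
only a sketch: the 64 KiB cluster size, the ToyCache class, and the
function names are assumptions made for illustration and are not the
Qcow2Cache code. The first function assumes the guest starts at the
beginning of a cluster and eventually writes all of it; the second part
logs the writes and fsyncs that a guest flush causes when one cache
depends on another.

```python
CLUSTER_SIZE = 64 * 1024  # assumed cluster size, for illustration only


def cow_bytes_wasted(write_size, cluster_size=CLUSTER_SIZE):
    """Bytes copied by the first sub-cluster write that later sequential
    writes to the same cluster overwrite anyway (point 1)."""
    # The first write fills the rest of the cluster by copy-on-write;
    # if the guest then writes the whole cluster, all of that is redundant.
    return cluster_size - write_size


class ToyCache:
    """Hypothetical stand-in for a metadata cache with one dependency."""

    def __init__(self, name):
        self.name = name
        self.dirty = []         # offsets of dirty tables held in memory
        self.depends_on = None  # cache that must reach the disk first

    def flush(self, log):
        if self.depends_on is not None:
            self.depends_on.flush(log)
            log.append("fsync")  # ordering barrier between the two caches
            self.depends_on = None
        for offset in self.dirty:
            log.append(f"write {self.name} {offset:#x}")
        self.dirty = []


def guest_flush(caches, log):
    """Write out every dirty entry, then make it all durable (point 2)."""
    for cache in caches:
        cache.flush(log)
    log.append("fsync")
    return log
```

With 4 KiB sequential writes, cow_bytes_wasted(4096) gives 61440, i.e.
60 KiB copied only to be overwritten. With a dirty refcount block and two
dirty L2 tables where L2 depends on refcount, guest_flush() issues three
writes and two fsyncs, all at the moment the guest asks for a flush.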

How will the journal change this situation?  Writes that go through the
journal are doubled: they must first be written to the journal and
fsynced, and only then can they be applied to the actual image.
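
The doubling can be counted in a minimal sketch. ToyDisk and both update
functions are hypothetical names, and the model ignores any per-entry
journal header, so it understates the journal's real overhead.

```python
class ToyDisk:
    """Hypothetical disk that only counts bytes written and fsyncs."""

    def __init__(self):
        self.bytes_written = 0
        self.fsyncs = 0

    def write(self, nbytes):
        self.bytes_written += nbytes

    def fsync(self):
        self.fsyncs += 1


def update_direct(disk, nbytes):
    """Metadata update written straight to its place in the image."""
    disk.write(nbytes)


def update_journalled(disk, nbytes):
    """Same update routed through a journal first."""
    disk.write(nbytes)  # 1. append the update to the journal
    disk.fsync()        # 2. make the journal entry durable
    disk.write(nbytes)  # 3. apply the update to the actual image
```

For a 512-byte update, update_direct() writes 512 bytes with no fsync,
while update_journalled() writes 1024 bytes and needs one fsync before
the in-place write may start.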

How do we benefit by using the journal?

Stefan
