From: Kevin Wolf <kwolf@redhat.com>
To: Xingbo Wu <wuxb45@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] disk image: self-organized format or raw file
Date: Wed, 13 Aug 2014 20:32:36 +0200
Message-ID: <20140813183236.GA2469@noname.redhat.com>
In-Reply-To: <CABPa+v2af6qLk5kFLQhcgbhOQJnT9_f97BgdfHavJq3Xpcnf0A@mail.gmail.com>

On 13.08.2014 at 18:38, Xingbo Wu wrote:
> On Wed, Aug 13, 2014 at 11:54 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> > On 12.08.2014 at 01:38, 吴兴博 wrote:
> >> Hello,
> >>
> >>   The introduction on the wiki page presents several advantages of qcow2 [1].
> >> But I'm a little confused. I would really appreciate it if anyone could give
> >> me some help on this :).
> >>
> >>  (1) Currently the raw format doesn't support COW. In other words, a raw image
> >> cannot have a backing file. COW depends on a mapping table from which it
> >> knows whether each block/cluster is present (has been modified) in the current
> >> image file. Modern file-systems like xfs/ext4/etc. provide extent/block
> >> allocation information to user-level, like what 'filefrag' does with the ioctls
> >> 'FIBMAP' and 'FIEMAP'. I guess the raw file driver (maybe block/raw-posix.c)
> >> may obtain correct 'present' information about blocks. However this information
> >> may be limited to being aligned with the file allocation unit size. Maybe it's
> >> just because a raw file has no space to store the "backing file name"? I don't
> >> think this should hinder such a useful feature.
> >>
> >>  (2) As most popular filesystems support delayed allocation/on-demand
> >> allocation/holes, whatever, a raw image is also thin provisioned like other
> >> formats. It doesn't consume much disk space by storing useless zeros. However,
> >> I don't know if there is any concern about whether fragmented extents would
> >> become a burden on the host filesystem.
> >>
> >>  (3) For compression and encryption, I'm not an expert on these topics at all
> >> but I think these features may not be vital to an image format as both the
> >> guest's and the host's filesystem can also provide similar functionality.
> >>
> >>  (4) I don't have much understanding of how snapshots work, but I think
> >> theoretically they would use no more than the techniques used for COW
> >> and backing files.
> >>
> >> After all these thoughts, I still found no reason not to use a 'raw' file
> >> image (engineering effort in Qemu should not count, as we don't ask for more
> >> features from the outside world).
> >> I would be very sorry if my ignorance wasted your time.
> >
> > Even if it did work (that it's problematic is already discussed in other
> > subthreads) what advantage would you get from using an extended raw
> > driver compared to simply using qcow2, which supports all of this today?
> >
> > Kevin
> 
> 
> I read several messages from this thread: "[RFC] qed: Add QEMU
> Enhanced Disk format". To my understanding, for a new format to be
> acceptable to the community:
>   It needs to retain all the key features provided by qcow2,
> especially compression, encryption, and internal snapshots, as
> mentioned in that thread.
>   And, needless to say, it must run faster.
> 
> Yes, I agree it's at least a subset of the homework one needs to do
> before selling the new format to the community.

So your goal is improved performance?

Why do you think that a raw driver with backing file support would run
much faster than qcow2? It would have to solve the same problems, like
doing efficient COW.
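
To make "the same problems" concrete, here is a minimal sketch of what any
copy-on-write layer has to do on a partial-cluster write, whether the mapping
lives in a qcow2 table or is derived from filesystem extents. It is a toy
in-memory model, not qemu code: the CowImage class, the 4-byte CLUSTER size
and the dict-based cluster map are all assumptions made up for illustration.

```python
CLUSTER = 4  # tiny cluster size in bytes, for illustration only


class CowImage:
    """Toy overlay: an in-memory cluster map on top of a read-only backing buffer."""

    def __init__(self, backing: bytes):
        self.backing = backing
        self.clusters = {}  # cluster index -> bytearray, present only once written

    def _backing_cluster(self, idx: int) -> bytearray:
        # Fetch one cluster from the backing data, zero-padded past its end.
        start = idx * CLUSTER
        data = self.backing[start:start + CLUSTER]
        return bytearray(data.ljust(CLUSTER, b"\0"))

    def write(self, offset: int, data: bytes) -> None:
        pos = 0
        while pos < len(data):
            idx, within = divmod(offset + pos, CLUSTER)
            n = min(CLUSTER - within, len(data) - pos)
            if idx not in self.clusters:
                # Copy-on-write: the untouched part of the cluster has to be
                # filled from the backing data before the cluster is "present".
                self.clusters[idx] = self._backing_cluster(idx)
            self.clusters[idx][within:within + n] = data[pos:pos + n]
            pos += n

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        for o in range(offset, offset + length):
            idx, within = divmod(o, CLUSTER)
            cluster = self.clusters.get(idx)
            if cluster is None:
                # Not present in the overlay: fall through to the backing data.
                cluster = self._backing_cluster(idx)
            out.append(cluster[within])
        return bytes(out)
```

The extra read from the backing data on first write is the cost that has to
be made efficient, regardless of where the "present" information is stored.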

> Thanks and another question:
> What's the magic that makes QED run faster than QCOW2?

During cluster allocation (which is the really critical part), QED is a
lot slower than today's qcow2. And by that I mean not just a few
percent, but something like half the performance. After that, when
accessing already allocated data, both perform similarly. Mailing list
discussions from four years ago don't accurately reflect how qemu works
today.

The main trick of QED was to introduce a dirty flag, which allowed it to
call fdatasync() less often because it was okay for image metadata to
become inconsistent. After a crash, you then have to repair the image.
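
The effect of such a dirty flag can be sketched with a toy model that only
counts flushes. This is not QED's actual on-disk logic; the ToyImage class
and its method names are hypothetical, and _sync() merely stands in for a
real fdatasync() call.

```python
class ToyImage:
    """Toy model comparing strict metadata flushing with a dirty flag."""

    def __init__(self):
        self.dirty = False    # stands in for a dirty bit in the image header
        self.metadata = {}    # stands in for the allocation metadata
        self.sync_calls = 0   # number of simulated fdatasync() calls

    def _sync(self):
        # Placeholder for fdatasync(); here it only counts invocations.
        self.sync_calls += 1

    def allocate_strict(self, cluster, location):
        # Metadata must be consistent on disk at all times:
        # flush after every allocation.
        self.metadata[cluster] = location
        self._sync()

    def allocate_lazy(self, cluster, location):
        # The flag has to be stable on disk once, before metadata is
        # allowed to go stale; later allocations need no flush.
        if not self.dirty:
            self.dirty = True
            self._sync()
        self.metadata[cluster] = location

    def close(self):
        # Clean shutdown: flush the metadata, then clear and flush the flag.
        if self.dirty:
            self._sync()
            self.dirty = False
            self._sync()

    def needs_repair_on_open(self):
        # A flag that is still set means the image was not closed cleanly.
        return self.dirty
```

In this model, 100 strict allocations cost 100 flushes, while 100 lazy
allocations cost one flush plus two at clean shutdown. The price is that an
image whose flag is still set when it is opened has to be checked and
repaired first.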

qcow2 supports the same with lazy_refcounts=on, but it's really only
useful in rare cases, mostly with cache=writethrough.

> In some simple
> parallel IO tests QED can run an order of magnitude faster than QCOW2.  I saw
> differences in simple/complex metadata organization, and coroutine/aio
> (however the "bdrv_co_"s finally call "bdrv_aio_"s via "_em"). If you can
> provide some insight on this I would really appreciate it.

Today, all operations are coroutine-based internally, so every request
goes through bdrv_co_do_preadv/pwritev. The aio_* versions are just
wrappers around it for callers and block drivers that prefer a
callback-based interface.
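
As an analogy only, the layering looks roughly like the following asyncio
sketch, in which a callback-style function is a thin wrapper that starts the
coroutine and delivers its result to a callback. None of this is qemu's C
code; co_read, aio_read and demo are names invented for the illustration.

```python
import asyncio


async def co_read(store: bytes, offset: int, length: int) -> bytes:
    """Coroutine-style request: this is where the actual work happens."""
    await asyncio.sleep(0)  # yield point standing in for waiting on I/O
    return store[offset:offset + length]


def aio_read(store: bytes, offset: int, length: int, callback):
    """Callback-style wrapper around co_read.

    Must be called while an event loop is running. It starts the coroutine
    and passes the result to callback once the coroutine has finished.
    """
    task = asyncio.create_task(co_read(store, offset, length))
    task.add_done_callback(lambda t: callback(t.result()))
    return task


async def demo():
    results = []
    task = aio_read(b"hello world", 6, 5, results.append)
    await task
    await asyncio.sleep(0)  # give the done-callback a chance to run
    return results
```

Running asyncio.run(demo()) collects the bytes delivered through the callback.
Only the coroutine contains any logic; the callback interface adds nothing
but the notification mechanism.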

Kevin
