All of lore.kernel.org
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
Date: Wed, 08 Sep 2010 11:23:46 +0300	[thread overview]
Message-ID: <4C874812.9090807@redhat.com> (raw)
In-Reply-To: <4C86BC6B.5010809@codemonkey.ws>

  On 09/08/2010 01:27 AM, Anthony Liguori wrote:
>> FWIW, L2s are 256K at the moment and with a two level table, it can 
>> support 5PB of data.
>
>
> I clearly suck at basic math today.  The image supports 64TB today.  
> Dropping to 128K tables would reduce it to 16TB and 64k tables would 
> be 4TB.

Maybe we should do three levels then.  Some users are bound to complain 
about 64TB.

>
> BTW, I don't think your checksumming idea is sound.  If you store a 
> 64-bit checksum along side each point, it becomes necessary to update 
> the parent pointer every time the table changes.   This introduces an 
> ordering requirement which means you need to sync() the file every 
> time you update and L2 entry.

Even worse, if the crash happens between an L2 update and an L1 checksum 
update, the entire cluster goes away.  You really want allocate-on-write 
for this.

> Today, we only need to sync() when we first allocate an L2 entry 
> (because their locations never change).  From a performance 
> perspective, it's the difference between an fsync() every 64k vs. 
> every 2GB.

Yup.  From a correctness perspective, it's the difference between a 
corrupted filesystem on almost every crash and a corrupted filesystem in 
some very rare cases.

>
> Plus, doesn't btrfs do block level checksumming?  IOW, if you run a 
> workload where you care about this level of data integrity validation, 
> if you did btrfs + qed, you would be fine.

Or just btrfs by itself (use btrfs for snapshots and base images, use 
qemu-img convert for shipping).

>
> Since the majority of file systems don't do metadata checksumming, 
> it's not obvious to me that we should be. 

The logic is that as data sizes increase, the probablity of error increases.

> I think one of the critical flaws in qcow2 was trying to invent a 
> better filesystem within qemu instead of just sticking to a very 
> simple and obviously correct format and letting the FS folks do the 
> really fancy stuff.

Well, if we introduce a minimal format, we need to make sure it isn't 
too minimal.

I'm still not sold on the idea.  What we're doing now is pushing the 
qcow2 complexity to users.  We don't have to worry about refcounts now, 
but users have to worry whether they're the machine they're copying the 
image to supports qed or not.

The performance problems with qcow2 are solvable.  If we preallocate 
clusters, the performance characteristics become essentially the same as 
qed.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

  reply	other threads:[~2010-09-08  8:23 UTC|newest]

Thread overview: 132+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-06 10:04 [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format Stefan Hajnoczi
2010-09-06 10:25 ` Alexander Graf
2010-09-06 10:31   ` Stefan Hajnoczi
2010-09-06 14:21   ` Luca Tettamanti
2010-09-06 14:24     ` Alexander Graf
2010-09-06 16:27       ` Anthony Liguori
2010-09-06 10:27 ` [Qemu-devel] " Kevin Wolf
2010-09-06 12:40   ` Stefan Hajnoczi
2010-09-06 12:57     ` Anthony Liguori
2010-09-06 13:02       ` Stefan Hajnoczi
2010-09-06 14:10       ` Kevin Wolf
2010-09-06 16:45         ` Anthony Liguori
2010-09-06 12:45   ` Anthony Liguori
2010-09-10 23:49     ` H. Peter Anvin
2010-09-06 11:18 ` [Qemu-devel] " Daniel P. Berrange
2010-09-06 12:52   ` Anthony Liguori
2010-09-06 13:35     ` Daniel P. Berrange
2010-09-06 16:38       ` Anthony Liguori
2010-09-06 13:06 ` Anthony Liguori
2010-09-07 14:51   ` Avi Kivity
2010-09-07 15:40     ` Anthony Liguori
2010-09-07 16:09       ` Avi Kivity
2010-09-07 16:25         ` Anthony Liguori
2010-09-07 22:27           ` Anthony Liguori
2010-09-08  8:23             ` Avi Kivity [this message]
2010-09-08  8:41               ` Alexander Graf
2010-09-08  8:53                 ` Avi Kivity
2010-09-08 11:15                   ` Stefan Hajnoczi
2010-09-08 15:38                     ` Christoph Hellwig
2010-09-08 16:30                       ` Anthony Liguori
2010-09-08 20:23                         ` Christoph Hellwig
2010-09-08 20:28                           ` Anthony Liguori
2010-09-09  2:35                             ` Christoph Hellwig
2010-09-09  6:24                               ` Avi Kivity
2010-09-09 21:01                                 ` Christoph Hellwig
2010-09-10 11:15                                   ` Avi Kivity
2010-09-09  6:53                     ` Avi Kivity
2010-09-10 21:22                     ` Jamie Lokier
2010-09-14 10:46                       ` Stefan Hajnoczi
2010-09-14 11:08                         ` Stefan Hajnoczi
2010-09-14 12:54                         ` Anthony Liguori
2010-09-08 12:55                   ` Anthony Liguori
2010-09-09  6:30                     ` Avi Kivity
2010-09-08 12:48               ` Anthony Liguori
2010-09-08 13:20                 ` Kevin Wolf
2010-09-08 13:26                   ` Anthony Liguori
2010-09-08 13:46                     ` Kevin Wolf
2010-09-09  6:45                 ` Avi Kivity
2010-09-09  6:48                   ` Avi Kivity
2010-09-09 12:49                   ` Anthony Liguori
2010-09-09 16:48                     ` [Qemu-devel] " Paolo Bonzini
2010-09-09 17:02                       ` Anthony Liguori
2010-09-09 20:56                         ` Christoph Hellwig
2010-09-10 10:53                         ` Avi Kivity
2010-09-10 11:14                     ` [Qemu-devel] " Avi Kivity
2010-09-10 11:25                       ` Avi Kivity
2010-09-10 11:33                         ` Stefan Hajnoczi
2010-09-10 11:43                           ` Avi Kivity
2010-09-10 13:22                             ` Anthony Liguori
2010-09-10 13:48                               ` Christoph Hellwig
2010-09-10 15:02                                 ` Anthony Liguori
2010-09-10 15:18                                   ` Kevin Wolf
2010-09-10 15:53                                     ` Anthony Liguori
2010-09-10 16:05                                       ` Kevin Wolf
2010-09-10 17:10                                         ` Anthony Liguori
2010-09-10 17:44                                           ` Kevin Wolf
2010-09-10 17:46                                           ` Miguel Di Ciurcio Filho
2010-09-10 14:02                               ` Avi Kivity
2010-09-10 13:47                           ` Christoph Hellwig
2010-09-10 14:05                             ` Avi Kivity
2010-09-10 14:12                               ` Christoph Hellwig
2010-09-10 14:24                                 ` Avi Kivity
2010-09-10 13:16                         ` Anthony Liguori
2010-09-10 14:06                           ` Avi Kivity
2010-09-10 11:43                       ` Stefan Hajnoczi
2010-09-10 12:06                         ` Avi Kivity
2010-09-10 13:28                           ` Anthony Liguori
2010-09-10 12:12                         ` Kevin Wolf
2010-09-10 12:35                           ` Stefan Hajnoczi
2010-09-10 12:47                             ` Avi Kivity
2010-09-10 13:10                               ` Stefan Hajnoczi
2010-09-10 13:19                                 ` Avi Kivity
2010-09-10 13:39                               ` Anthony Liguori
2010-09-10 13:52                                 ` Christoph Hellwig
2010-09-10 13:56                                 ` Avi Kivity
2010-09-10 13:48                             ` Kevin Wolf
2010-09-10 13:14                       ` Anthony Liguori
2010-09-10 13:47                         ` Avi Kivity
2010-09-10 14:56                           ` Anthony Liguori
2010-09-10 15:49                             ` Avi Kivity
2010-09-10 17:07                               ` Anthony Liguori
2010-09-10 17:42                                 ` Kevin Wolf
2010-09-10 19:33                                   ` Anthony Liguori
2010-09-13 10:41                                     ` Kevin Wolf
2010-09-12 13:24                                 ` Avi Kivity
2010-09-12 15:13                                   ` Anthony Liguori
2010-09-12 15:56                                     ` Avi Kivity
2010-09-12 17:09                                       ` Anthony Liguori
2010-09-12 17:51                                         ` Avi Kivity
2010-09-12 20:18                                           ` Anthony Liguori
2010-09-13  9:24                                             ` Avi Kivity
2010-09-13 11:28                                         ` Kevin Wolf
2010-09-13 11:34                                           ` Avi Kivity
2010-09-13 11:48                                             ` Kevin Wolf
2010-09-13 13:19                                               ` Anthony Liguori
2010-09-13 13:12                                           ` Anthony Liguori
2010-09-13 11:03                                       ` Kevin Wolf
2010-09-13 13:07                                         ` Anthony Liguori
2010-09-13 13:24                                           ` Kevin Wolf
2010-09-07 16:12     ` Anthony Liguori
2010-09-07 21:35       ` Christoph Hellwig
2010-09-07 22:29         ` Anthony Liguori
2010-09-07 22:40           ` Christoph Hellwig
2010-09-08 15:07     ` Stefan Hajnoczi
2010-09-09  6:59       ` Avi Kivity
2010-09-09 17:43         ` Anthony Liguori
2010-09-09 20:46           ` Christoph Hellwig
2010-09-10 11:22           ` Avi Kivity
2010-09-10 11:29             ` Stefan Hajnoczi
2010-09-10 11:37               ` Avi Kivity
2010-09-07 13:58 ` Avi Kivity
2010-09-07 19:25 ` Blue Swirl
2010-09-07 20:41   ` Anthony Liguori
2010-09-08  7:48     ` Kevin Wolf
2010-09-08 15:37   ` Stefan Hajnoczi
2010-09-08 18:24     ` Blue Swirl
2010-09-08 18:35       ` Anthony Liguori
2010-09-08 18:56         ` Blue Swirl
2010-09-08 19:19           ` Anthony Liguori
2010-09-15 21:01 ` [Qemu-devel] " Michael S. Tsirkin
2010-09-15 21:12   ` Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2010-09-17  3:51 [Qemu-devel] " Khoa Huynh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C874812.9090807@redhat.com \
    --to=avi@redhat.com \
    --cc=anthony@codemonkey.ws \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.