Message-ID: <492ADB2A.7030700@codemonkey.ws>
Date: Mon, 24 Nov 2008 10:49:46 -0600
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 5/5] Change order of metadata update to prevent loosing guest data because of unexpected exit.
References: <20081123145248.22178.36228.stgit@dhcp-1-237.tlv.redhat.com> <20081123145326.22178.36990.stgit@dhcp-1-237.tlv.redhat.com>
In-Reply-To: <20081123145326.22178.36990.stgit@dhcp-1-237.tlv.redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Gleb Natapov wrote:
> Currently the order is this (during cow since it's the interesting case):
> 1. Decrement refcount of old clusters
> 2. Increment refcount for newly allocated clusters
> 3. Copy content of old sectors that will not be rewritten
> 4. Update L2 table with pointers to new clusters
> 5. Write guest data into new clusters (asynchronously)
>
> There are several problems with this order.
> The first one is that if qemu
> crashes (or is killed, or the host reboots) after new clusters are linked
> into the L2 table but before user data is written there, then on the next
> reboot the guest will find neither the old data nor the new data in those
> sectors, and this is not what the guest expects even when a journaling
> file system is in use. The other problem is that if qemu is killed between
> steps 1 and 4, then the refcount of the old cluster will be incorrect and
> may cause snapshot corruption.
>
> The patch changes the order to be like this:
> 1. Increment refcount for newly allocated clusters
> 2. Write guest data into new clusters (asynchronously)
> 3. Copy content of old sectors that were not rewritten
> 4. Update L2 table with pointers to new clusters
> 5. Decrement refcount of old clusters
>
> An unexpected crash may cause cluster leakage, but guest data should be safe.
>
> Signed-off-by: Gleb Natapov
> ---
>
>  block-qcow2.c |  155 +++++++++++++++++++++++++++++++++------------------------
>  1 files changed, 91 insertions(+), 64 deletions(-)
>
> diff --git a/block-qcow2.c b/block-qcow2.c
> index 0771281..c600517 100644
> --- a/block-qcow2.c
> +++ b/block-qcow2.c
> @@ -852,6 +852,69 @@ static uint64_t alloc_compressed_cluster_offset(BlockDriverState *bs,
>      return cluster_offset;
>  }
>
> +typedef struct QCowL2Meta
> +{
> +    uint64_t offset;
> +    int n_start;
> +    int nb_available;
> +    int nb_clusters;
> +} QCowL2Meta;
> +
> +static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset,
> +        QCowL2Meta *m)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    int i, j = 0, l2_index, ret;
> +    uint64_t *old_cluster, start_sect, l2_offset, *l2_table;
> +
> +    if (m->nb_clusters == 0)
> +        return 0;
> +
> +    if (!(old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t))))
> +        return -ENOMEM;
>

This memory is never freed.

Regards,

Anthony Liguori
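
To illustrate the leak noted above, here is a minimal sketch of one common way to release such a scratch buffer on every exit path: route all returns after the allocation through a single cleanup label. Only the QCowL2Meta layout is taken from the quoted patch. The function link_l2_sketch(), the simulate_error flag, and the tracked_malloc()/tracked_free() counters are hypothetical stand-ins so that the sketch compiles on its own; this is not the actual body of alloc_cluster_link_l2().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Same fields as the QCowL2Meta struct in the quoted patch. */
typedef struct QCowL2Meta {
    uint64_t offset;
    int n_start;
    int nb_available;
    int nb_clusters;
} QCowL2Meta;

/* Hypothetical bookkeeping so a leak shows up as a non-zero counter. */
static int live_allocations = 0;

static void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p)
        live_allocations++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        live_allocations--;
        free(p);
    }
}

/*
 * Hypothetical stand-in for the L2 linking step. simulate_error models
 * a failure partway through (for example a failed metadata write).
 */
static int link_l2_sketch(const QCowL2Meta *m, int simulate_error)
{
    uint64_t *old_cluster;
    int i, ret = 0;

    assert(m != NULL);

    if (m->nb_clusters == 0)
        return 0;               /* nothing allocated yet, plain return is fine */

    old_cluster = tracked_malloc((size_t)m->nb_clusters * sizeof(uint64_t));
    if (!old_cluster)
        return -ENOMEM;

    if (simulate_error) {
        ret = -EIO;
        goto out;               /* error path still reaches the free below */
    }

    /* Placeholder for: remember old offsets, update L2, drop old refcounts. */
    for (i = 0; i < m->nb_clusters; i++)
        old_cluster[i] = 0;

out:
    tracked_free(old_cluster);  /* single release point for every path */
    return ret;
}
```

With this shape, both the success path and the simulated error path leave live_allocations at zero, which is the property the original hunk appears to be missing.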