From: Gleb Natapov <gleb@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: markmc@redhat.com, Avi Kivity <avi@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Re: [PATCH] qcow2/virtio corruption: Don't allocate the same cluster twice
Date: Thu, 7 May 2009 10:32:37 +0300
Message-ID: <20090507073237.GY9795@redhat.com>
In-Reply-To: <4A01CE6C.3000901@redhat.com>
On Wed, May 06, 2009 at 07:52:44PM +0200, Kevin Wolf wrote:
> Avi Kivity wrote:
> > Kevin Wolf wrote:
> >> Avi Kivity wrote:
> >>> Also, the second request now depends on the first to update its
> >>> metadata. But if the first request fails, it will not update its
> >>> metadata, and the second request will complete without error and also
> >>> without updating its metadata.
> >>>
> >> Hm, right. Need to think about this...
> >>
> >
> > I suggest retaining the part where you use inflight l2metas to lay out
> > data contiguously, but change alloc_cluster_link_l2() not to rely on
> > n_start and nb_available but instead recompute them on completion.
> > m->nb_clusters should never be zeroed for this to work.
>
> Is there even a reason why we need to copy the unmodified sectors in
> alloc_cluster_link_l2() and cannot do that in alloc_cluster_offset()
> before we write the new data? Then the callback wouldn't need to mess
> around with figuring out which part must be overwritten and which one
> mustn't.
>
The reason we need to copy unmodified sectors in alloc_cluster_link_l2()
is precisely to handle concurrent writes into the same cluster. This is
essentially read-modify-write (RMW). I don't see why concurrent writes
should not work with the logic in place. There is a bug there currently,
of course :) Can somebody check this patch:
diff --git a/block-qcow2.c b/block-qcow2.c
index 7840634..801d26d 100644
--- a/block-qcow2.c
+++ b/block-qcow2.c
@@ -995,8 +995,8 @@ static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset,
         if(l2_table[l2_index + i] != 0)
             old_cluster[j++] = l2_table[l2_index + i];
 
-        l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
-                    (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
+        l2_table[l2_index + i] = cpu_to_be64(((cluster_offset +
+                    (i << s->cluster_bits)) | QCOW_OFLAG_COPIED));
     }
 
     if (bdrv_pwrite(s->hd, l2_offset + l2_index * sizeof(uint64_t),
@@ -1005,7 +1005,8 @@ static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset,
         goto err;
 
     for (i = 0; i < j; i++)
-        free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
+        free_any_clusters(bs, be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED,
+                          1);
 
     ret = 0;
 err:
--
Gleb.