* Re: [Qemu-devel] [RFC V6 27/33] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
[not found] ` <20130228161418.GA4214@irqsave.net>
@ 2013-03-01 8:59 ` Kevin Wolf
2013-03-01 9:34 ` Stefan Hajnoczi
0 siblings, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2013-03-01 8:59 UTC (permalink / raw)
To: Benoît Canet; +Cc: qemu-devel, Stefan Hajnoczi
Am 28.02.2013 um 17:14 hat Benoît Canet geschrieben:
> Le Thursday 28 Feb 2013 à 11:14:34 (+0100), Kevin Wolf a écrit :
> > Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben:
> > > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote:
> > > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
> > > > > > + if (!s->has_dedup &&
> > > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
> > > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
> > > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount);
> > > > > > + res->corruptions++;
> > > > > > + }
> > > > >
> > > > > Why is this warning suppressed when dedup is enabled? The meaning of
> > > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated
> > > > > then something is wrong.
> > > >
> > > > When deduplication is done refcount will be bigger than one and
> > > > QCOW_OFLAG_COPIED will be cleared.
> > > >
> > > > Then if enough logical clustere pointing to the same physical cluster are
> > > > rewritten with something else the refcount will goes down back to one.
> > > >
> > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true.
> > >
> > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED
> > > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with:
> > >
> > > /* must update the copied flag on the current cluster offsets */
> > > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
> > >
> > > Is dedup not restoring QCOW_OFLAG_COPIED?
> >
> > This is a very expensive operation. I don't think that you can do it for
> > each deduplicated cluster that is overwritten. Not doing it comes with
> > the cost of doing more COW than is actually needed. And we need to
> > mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters
> > with deduplication enabled.
>
> Also when two logical clusters point to the same physical cluster and one of the
> logical cluster get overwritten the deduplication code has no way to know the
> index of the last logical cluster entry.
Well, strictly speaking you can. The qcow2_update_snapshot_refcount()
call that Stefan mention does exactly that. It's just insanely expensive
because it has to look at the refcounts for all clusters.
Kevin
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [RFC V6 27/33] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
2013-03-01 8:59 ` [Qemu-devel] [RFC V6 27/33] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Kevin Wolf
@ 2013-03-01 9:34 ` Stefan Hajnoczi
0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2013-03-01 9:34 UTC (permalink / raw)
To: Kevin Wolf; +Cc: Benoît Canet, qemu-devel
On Fri, Mar 01, 2013 at 09:59:33AM +0100, Kevin Wolf wrote:
> Am 28.02.2013 um 17:14 hat Benoît Canet geschrieben:
> > Le Thursday 28 Feb 2013 à 11:14:34 (+0100), Kevin Wolf a écrit :
> > > Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben:
> > > > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote:
> > > > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
> > > > > > > + if (!s->has_dedup &&
> > > > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
> > > > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
> > > > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount);
> > > > > > > + res->corruptions++;
> > > > > > > + }
> > > > > >
> > > > > > Why is this warning suppressed when dedup is enabled? The meaning of
> > > > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated
> > > > > > then something is wrong.
> > > > >
> > > > > When deduplication is done refcount will be bigger than one and
> > > > > QCOW_OFLAG_COPIED will be cleared.
> > > > >
> > > > > Then if enough logical clustere pointing to the same physical cluster are
> > > > > rewritten with something else the refcount will goes down back to one.
> > > > >
> > > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true.
> > > >
> > > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED
> > > > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with:
> > > >
> > > > /* must update the copied flag on the current cluster offsets */
> > > > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
> > > >
> > > > Is dedup not restoring QCOW_OFLAG_COPIED?
> > >
> > > This is a very expensive operation. I don't think that you can do it for
> > > each deduplicated cluster that is overwritten. Not doing it comes with
> > > the cost of doing more COW than is actually needed. And we need to
> > > mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters
> > > with deduplication enabled.
> >
> > Also when two logical clusters point to the same physical cluster and one of the
> > logical cluster get overwritten the deduplication code has no way to know the
> > index of the last logical cluster entry.
>
> Well, strictly speaking you can. The qcow2_update_snapshot_refcount()
> call that Stefan mention does exactly that. It's just insanely expensive
> because it has to look at the refcounts for all clusters.
Okay, I agree that qcow2_update_snapshot_refcount() is too expensive.
Please add a comment explaining that QCOW_OFLAG_COPIED is not guaranteed
when dedup is enabled since it would be too expensive to do this
everything sharing breaks (refcount is decremented to 1).
Stefan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [RFC V6 13/33] qcow2: make the deduplication forget a cluster hash when a cluster is to dedupe
[not found] ` <20130207094814.GH1081@stefanha-thinkpad.redhat.com>
@ 2013-03-11 12:59 ` Benoît Canet
0 siblings, 0 replies; 5+ messages in thread
From: Benoît Canet @ 2013-03-11 12:59 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kwolf, qemu-devel, stefanha
Le Thursday 07 Feb 2013 à 10:48:14 (+0100), Stefan Hajnoczi a écrit :
> On Wed, Feb 06, 2013 at 01:31:46PM +0100, Benoît Canet wrote:
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index ef91216..5b1d20d 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -710,6 +710,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
> >
> > for (i = 0; i < m->nb_clusters; i++) {
> > uint64_t flags = 0;
> > + uint64_t offset = cluster_offset + (i << s->cluster_bits);
> > /* if two concurrent writes happen to the same unallocated cluster
> > * each write allocates separate cluster and writes data concurrently.
> > * The first one to complete updates l2 table with pointer to its
> > @@ -722,8 +723,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
> > flags = m->oflag_copied ? QCOW_OFLAG_COPIED : 0;
> > flags |= m->to_deduplicate ? QCOW_OFLAG_TO_DEDUP : 0;
> >
> > - l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
> > - (i << s->cluster_bits)) | flags);
> > + l2_table[l2_index + i] = cpu_to_be64(offset | flags);
> > +
> > + /* make the deduplication forget the cluster to avoid making
> > + * the dedup pointing to a cluster that has changed on it's back.
> > + */
> > + if (m->to_deduplicate) {
> > + qcow2_dedup_forget_cluster_by_sector(bs, offset >> 9);
> > + }
>
> This does not play well with internal snapshots. Imagine that an
> internal snapshot was taken, so refcount == 2.
>
> Now the cluster is overwritten by the guest but we still need to hang on
> to the original data since the snapshot refers to it.
>
> If dedup forgets about the cluster then dedup is only effective for the
> current disk image, but it ignores snapshots. Ideally dedup would take
> snapshots into account since they share the same data clusters as the
> current image.
>
> Stefan
Is checking the refcount with a qcow2_get_refcounts(index) a solution ?
Benoît
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [RFC V6 19/33] block: Add qcow2_dedup format and image creation code.
[not found] ` <20130207101621.GM1081@stefanha-thinkpad.redhat.com>
@ 2013-03-11 15:20 ` Benoît Canet
2013-03-12 9:33 ` Stefan Hajnoczi
0 siblings, 1 reply; 5+ messages in thread
From: Benoît Canet @ 2013-03-11 15:20 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kwolf, qemu-devel, stefanha
> > + if (dedup) {
> > + version = 3;
> > + }
>
> Lazy refcounts don't force the version. It would be consistent to
> refrain from forcing the version too.
I don't understand. Aren't the incompatible features bits implying version = 3 ?
When compat=1.1 is set the code force version to 3.
Benoît
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [RFC V6 19/33] block: Add qcow2_dedup format and image creation code.
2013-03-11 15:20 ` [Qemu-devel] [RFC V6 19/33] block: Add qcow2_dedup format and image creation code Benoît Canet
@ 2013-03-12 9:33 ` Stefan Hajnoczi
0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2013-03-12 9:33 UTC (permalink / raw)
To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha
On Mon, Mar 11, 2013 at 04:20:03PM +0100, Benoît Canet wrote:
> > > + if (dedup) {
> > > + version = 3;
> > > + }
> >
> > Lazy refcounts don't force the version. It would be consistent to
> > refrain from forcing the version too.
>
> I don't understand. Aren't the incompatible features bits implying version = 3 ?
> When compat=1.1 is set the code force version to 3.
Only compat= affects the version. Lazy refcounts and dedup should be
orthogonal.
If you specify lazy_refcounts=on without compat=1.1 you get an error.
Same should apply to deduplication. Otherwise the qemu-img create
behavior is inconsistent - you get an error for lazy_refcounts but no
error and silent version=3 for deduplication.
Stefan
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-03-12 9:33 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1360153926-9492-1-git-send-email-benoit@irqsave.net>
[not found] ` <1360153926-9492-28-git-send-email-benoit@irqsave.net>
[not found] ` <20130208105746.GD7835@stefanha-thinkpad.redhat.com>
[not found] ` <20130227150028.GC4010@irqsave.net>
[not found] ` <20130228094139.GB12748@stefanha-thinkpad.redhat.com>
[not found] ` <20130228101434.GD2429@dhcp-200-207.str.redhat.com>
[not found] ` <20130228161418.GA4214@irqsave.net>
2013-03-01 8:59 ` [Qemu-devel] [RFC V6 27/33] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Kevin Wolf
2013-03-01 9:34 ` Stefan Hajnoczi
[not found] ` <1360153926-9492-14-git-send-email-benoit@irqsave.net>
[not found] ` <20130207094814.GH1081@stefanha-thinkpad.redhat.com>
2013-03-11 12:59 ` [Qemu-devel] [RFC V6 13/33] qcow2: make the deduplication forget a cluster hash when a cluster is to dedupe Benoît Canet
[not found] ` <1360153926-9492-20-git-send-email-benoit@irqsave.net>
[not found] ` <20130207101621.GM1081@stefanha-thinkpad.redhat.com>
2013-03-11 15:20 ` [Qemu-devel] [RFC V6 19/33] block: Add qcow2_dedup format and image creation code Benoît Canet
2013-03-12 9:33 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).